Skip to content

Improve run_api dynamic schema creation #110

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

mpangrazzi
Copy link
Contributor

This will add better description of run_api params / return type to dynamically create request / response pydantic models using docstring_parser to automatically parse run_api docstring.

For example, if you have a run_api like:

def run_api(question: str, urls: List[str]) -> str:
    """
    Ask a question about one or more websites using a Haystack pipeline.

    Args:
        question: The question to ask.
        urls: List of URLs to fetch.
        
    Returns:
        The answer to the question.
    """
    ...

It will add proper description, defaults (if present) and type to each param on pydantic model schema for question, urls and the return type.

This improved description will improve Hayhooks usage as OpenAPI Tool or MCP Tool.

@mpangrazzi mpangrazzi self-assigned this Apr 28, 2025
@mpangrazzi mpangrazzi requested a review from vblagoje April 28, 2025 12:45
@mpangrazzi mpangrazzi marked this pull request as ready for review April 28, 2025 12:45
@mpangrazzi mpangrazzi requested a review from anakin87 April 28, 2025 15:26
@@ -270,8 +279,9 @@ async def run_endpoint_without_files(run_req: request_model) -> response_model:
def add_pipeline_api_route(app: FastAPI, pipeline_name: str, pipeline_wrapper: BasePipelineWrapper) -> None:
clog = log.bind(pipeline_name=pipeline_name)

RunRequest = create_request_model_from_callable(pipeline_wrapper.run_api, f'{pipeline_name}Run')
RunResponse = create_response_model_from_callable(pipeline_wrapper.run_api, f'{pipeline_name}Run')
docstring = docstring_parser.parse(inspect.getdoc(pipeline_wrapper.run_api) or "")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect more failures with docstring parsing than with the previous simple approach.

What do you think?

  • Do we want to force users to stick to this more rigid format?
  • Should we instead provide a safe fallback if docstring parsing fails?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think your suggestion makes sense, but honestly, I haven't been able to make docstring.parse() raise an exception in my tests 😅.

Even when mixing different docstring formats (or not strictly adhering to them), it doesn't raise any exceptions. Instead, it just skips certain parameters. If, for example, you type ((int), you might get a slightly malformed type_name value, but the function doesn't fail -it simply behaves unexpectedly. Evaluating this behaviour automatically could be tricky.

I also noticed that within Haystack, .parse() exceptions are not explicitly handled either (in ComponentTool).

Perhaps we can leave the code as it is for now unless a bug pops out in the future?

@mpangrazzi
Copy link
Contributor Author

I am also attaching an example: if you provide a run_api like the following:

def run_api(self, urls: List[str], question: str) -> str:
        """
        Ask a question to a list of websites.

        Args:
            urls (List[str]): List of URLs to ask the question to.
            question (str): Question to ask the websites.

        Returns:
            str: Answer to the question.
        """
        result = self.pipeline.run({"fetcher": {"urls": urls}, "prompt": {"query": question}})
        return result["llm"]["replies"][0]

You'll get a Tool description like the following (using MCP Inspector):
Screenshot 2025-04-28 at 21 54 54

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants