Update doc on tools

2024-12-31 19:08:18 +01:00 · 8a769904c9
parent 3b600dbfb8
commit 8a769904c9
2 changed files with 71 additions and 32 deletions
--- a/docs/source/en/guided_tour.md
+++ b/docs/source/en/guided_tour.md
@@ -155,7 +155,11 @@ Here are a few useful attributes to inspect what happened after a run:

 ## Tools

-A tool is an atomic function to be used by an agent.
+A tool is an atomic function to be used by an agent. To be used by an LLM, it also needs a few attributes that constitute its API and describe to the LLM how to call the tool:
+- A name
+- A description
+- Input types and descriptions
+- An output type

 You can for instance check the [`PythonInterpreterTool`]: it has a name, a description, input descriptions, an output type, and a `__call__` method to perform the action.

@@ -190,8 +194,8 @@ from huggingface_hub import list_models

 task = "text-classification"

-model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
-print(model.id)
+most_downloaded_model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
+print(most_downloaded_model.id)
 ```

 This code can quickly be converted into a tool, just by wrapping it in a function and adding the `tool` decorator:
@@ -209,8 +213,8 @@ def model_download_tool(task: str) -> str:
    Args:
        task: The task for which
    """
-    model = next(iter(list_models(filter="text-classification", sort="downloads", direction=-1)))
-    return model.id
+    most_downloaded_model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
+    return most_downloaded_model.id
 ```

 The function needs:
@@ -224,29 +228,48 @@ All these will be automatically baked into the agent's system prompt upon initia

 Then you can directly initialize your agent:
 ```py
-from smolagents import CodeAgent
-agent = CodeAgent(tools=[model_download_tool], model=model)
+from smolagents import CodeAgent, HfApiModel
+agent = CodeAgent(tools=[model_download_tool], model=HfApiModel())
 agent.run(
    "Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?"
 )
 ```

-You get the following:
+You get the following logs:
 ```text
-======== New task ========
-Can you give me the name of the model that has the most downloads in the 'text-to-video' task on the Hugging Face Hub?
-==== Agent is executing the code below:
-most_downloaded_model = model_download_tool(task="text-to-video")
-print(f"The most downloaded model for the 'text-to-video' task is {most_downloaded_model}.")
-====
+╭──────────────────────────────────────── New run ─────────────────────────────────────────╮
+│                                                                                          │
+│ Can you give me the name of the model that has the most downloads in the 'text-to-video' │
+│ task on the Hugging Face Hub?                                                            │
+│                                                                                          │
+╰─ HfApiModel - Qwen/Qwen2.5-Coder-32B-Instruct ───────────────────────────────────────────╯
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 0 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+╭─ Executing this code: ───────────────────────────────────────────────────────────────────╮
+│   1 model_name = model_download_tool(task="text-to-video")                               │
+│   2 print(model_name)                                                                    │
+╰──────────────────────────────────────────────────────────────────────────────────────────╯
+Execution logs:
+ByteDance/AnimateDiff-Lightning
+
+Out: None
+[Step 0: Duration 0.27 seconds| Input tokens: 2,069 | Output tokens: 60]
+━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ Step 1 ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
+╭─ Executing this code: ───────────────────────────────────────────────────────────────────╮
+│   1 final_answer("ByteDance/AnimateDiff-Lightning")                                      │
+╰──────────────────────────────────────────────────────────────────────────────────────────╯
+Out - Final answer: ByteDance/AnimateDiff-Lightning
+[Step 1: Duration 0.10 seconds| Input tokens: 4,288 | Output tokens: 148]
+Out[20]: 'ByteDance/AnimateDiff-Lightning'
 ```

-And the output:
-`"The most downloaded model for the 'text-to-video' task is ByteDance/AnimateDiff-Lightning."`
+This is not the only way to build the tool: you can directly define it as a subclass of [`Tool`], which gives you more flexibility, for instance the possibility to initialize heavy class attributes.
+
+Read more in the [dedicated tool tutorial](./tutorials/tools#what-is-a-tool-and-how-to-build-one).

 ## Multi-agents

-Multi-agent has been introduced in Microsoft's framework [Autogen](https://huggingface.co/papers/2308.08155).
+Multi-agent systems were introduced with Microsoft's framework [Autogen](https://huggingface.co/papers/2308.08155).
+
 In this type of framework, you have several agents working together to solve your task instead of only one.
 It empirically yields better performance on most benchmarks. The reason for this better performance is conceptually simple: for many tasks, rather than using a do-it-all system, you would prefer to specialize units on sub-tasks. Here, having agents with separate tool sets and memories allows to achieve efficient specialization. For instance, why fill the memory of the code generating agent with all the content of webpages visited by the web search agent? It's better to keep them separate.
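
The memory separation described above can be sketched in plain Python. This is only an illustration under stated assumptions: `SketchAgent`, `fake_visit_page`, and the `double` tool are hypothetical names invented here, not part of smolagents. The point it shows is that the web agent keeps the bulky page content in its own memory, while the code agent only ever sees the short extract handed to it.

```python
from dataclasses import dataclass, field


@dataclass
class SketchAgent:
    """Hypothetical stand-in for an agent: its own tools and its own memory."""
    name: str
    tools: dict
    memory: list = field(default_factory=list)

    def call_tool(self, tool_name, argument):
        # Run the tool and record the call in this agent's memory only.
        result = self.tools[tool_name](argument)
        self.memory.append((tool_name, argument, result))
        return result


def fake_visit_page(url):
    # Pretend web page: lots of filler text with a short answer at the end.
    return "lorem ipsum " * 50 + "answer: 42"


web_agent = SketchAgent("web_search", {"visit_page": fake_visit_page})
code_agent = SketchAgent("code", {"double": lambda x: 2 * x})

page = web_agent.call_tool("visit_page", "https://example.com")
answer = int(page.rsplit("answer: ", 1)[1])  # only this short extract is handed over
result = code_agent.call_tool("double", answer)
```

After this runs, the full page text sits only in `web_agent.memory`; `code_agent.memory` holds a single small entry for the `double` call.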

--- a/docs/source/en/tutorials/tools.md
+++ b/docs/source/en/tutorials/tools.md
@@ -22,22 +22,26 @@ Here, we're going to see advanced tool usage.
 > [!TIP]
 > If you're new to building agents, make sure to first read the [intro to agents](../conceptual_guides/intro_agents) and the [guided tour of smolagents](../guided_tour).

-### Directly define a tool by subclassing Tool
+- [Tools](#tools)
+    - [What is a tool, and how to build one?](#what-is-a-tool-and-how-to-build-one)
+    - [Share your tool to the Hub](#share-your-tool-to-the-hub)
+    - [Import a Space as a tool](#import-a-space-as-a-tool)
+    - [Use gradio-tools](#use-gradio-tools)
+    - [Use LangChain tools](#use-langchain-tools)
+    - [Manage your agent's toolbox](#manage-your-agents-toolbox)
+    - [Use a collection of tools](#use-a-collection-of-tools)

-Let's take again the tool example from the [quicktour](../quicktour), for which we had implemented a `@tool` decorator. The `tool` decorator is the standard format, but sometimes you need more: use several methods in a class for more clarity, or using additional class attributes.
+### What is a tool, and how to build one?

-In this case, you can build your tool following the fine-grained method: building a class that inherits from the [`Tool`] superclass.
+A tool is mostly a function that an LLM can use in an agentic system.

-The custom tool needs:
- An attribute `name`, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name it `model_download_counter`.
- An attribute `description` is used to populate the agent's system prompt.
- An `inputs` attribute, which is a dictionary with keys `"type"` and `"description"`. It contains information that helps the Python interpreter make educated choices about the input.
- An `output_type` attribute, which specifies the output type.
- A `forward` method which contains the inference code to be executed.
+But to use it, the LLM will need to be given an API: name, tool description, input types and descriptions, output type.

-The types for both `inputs` and `output_type` should be amongst [Pydantic formats](https://docs.pydantic.dev/latest/concepts/json_schema/#generating-json-schema), they can be either of these: [`~AUTHORIZED_TYPES`].
+So it cannot be only a function. It should be a class.

-Also, all imports should be put within the tool's forward function, else you will get an error.
+At its core, a tool is a class that wraps a function with metadata that helps the LLM understand how to use it.
+
+Here's how it looks:

 ```python
 from smolagents import Tool
@@ -47,7 +51,6 @@ class HFModelDownloadsTool(Tool):
    description = """
    This is a tool that returns the most downloaded model of a given task on the Hugging Face Hub.
    It returns the name of the checkpoint."""
-
    inputs = {
        "task": {
            "type": "string",
@@ -61,21 +64,34 @@ class HFModelDownloadsTool(Tool):

        model = next(iter(list_models(filter=task, sort="downloads", direction=-1)))
        return model.id
+
 tool = HFModelDownloadsTool()
 ```

-Now the custom `HfModelDownloadsTool` class is ready.
+The custom tool subclasses [`Tool`] to inherit useful methods. The child class also defines:
+- An attribute `name`, which corresponds to the name of the tool itself. The name usually describes what the tool does. Since the code returns the model with the most downloads for a task, let's name it `model_download_counter`.
+- An attribute `description`, which is used to populate the agent's system prompt.
+- An `inputs` attribute, which is a dictionary mapping each input name to a dictionary with keys `"type"` and `"description"`. It contains information that helps the Python interpreter make educated choices about the input.
+- An `output_type` attribute, which specifies the output type. The types for both `inputs` and `output_type` should be [Pydantic formats](https://docs.pydantic.dev/latest/concepts/json_schema/#generating-json-schema); they can be any of these: [`~AUTHORIZED_TYPES`].
+- A `forward` method which contains the inference code to be executed.
+
+And that's all it needs to be used in an agent!
+
+There's another way to build a tool. In the [guided_tour](../guided_tour), we implemented a tool using the `@tool` decorator. The [`tool`] decorator is the recommended way to define simple tools, but sometimes you need more than this: using several methods in a class for more clarity, or using additional class attributes.
+
+In this case, you can build your tool by subclassing [`Tool`] as described above.
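
To make the role of these attributes concrete, here is a minimal sketch of how such metadata could be turned into a text description for a prompt. It is an assumption for illustration only: `SketchTool` and `describe` are hypothetical and do not reproduce how smolagents builds its system prompt, and the input description text is made up.

```python
class SketchTool:
    """Hypothetical stand-in for a tool class; not the smolagents `Tool` API."""
    name = "model_download_counter"
    description = "Returns the most downloaded model of a given task on the Hugging Face Hub."
    # Each input name maps to its type and description (description text is invented here).
    inputs = {"task": {"type": "string", "description": "The task category."}}
    output_type = "string"


def describe(tool):
    # Build a signature-like first line, then the descriptions underneath.
    args = ", ".join(f"{name}: {spec['type']}" for name, spec in tool.inputs.items())
    lines = [f"- {tool.name}({args}) -> {tool.output_type}", f"  {tool.description}"]
    lines += [f"  {name}: {spec['description']}" for name, spec in tool.inputs.items()]
    return "\n".join(lines)


prompt_snippet = describe(SketchTool)
print(prompt_snippet)
```

The printed snippet is the kind of text an LLM would need in order to know the tool's name, what it does, and which arguments to pass.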

 ### Share your tool to the Hub

-You can also share your custom tool to the Hub by calling [`~Tool.push_to_hub`] on the tool. Make sure you've created a repository for it on the Hub and are using a token with read access.
+You can share your custom tool to the Hub by calling [`~Tool.push_to_hub`] on the tool. Make sure you've created a repository for it on the Hub and are using a token with write access.

 ```python
 tool.push_to_hub("{your_username}/hf-model-downloads", token="<YOUR_HUGGINGFACEHUB_API_TOKEN>")
 ```

 For the push to Hub to work, your tool will need to respect some rules:
- All method are self-contained, e.g. use variables that come either from their args, 
+- All methods are self-contained, i.e. they only use variables that come from their arguments.
+- As per the above point, **all imports should be defined directly within the tool's functions**, else you will get an error when trying to call [`~Tool.save`] or [`~Tool.push_to_hub`] with your custom tool.
 - If you subclass the `__init__` method, you can give it no other argument than `self`. This is because arguments set during a specific tool instance's initialization are hard to track, which prevents from sharing them properly to the hub. And anyway, the idea of making a specific class is that you can already set class attributes for anything you need to hard-code (just set `your_variable=(...)` directly under the `class YourTool(Tool):` line). And of course you can still create a class attribute anywhere in your code by assigning stuff to `self.your_variable`.

 Once your tool is pushed to Hub, you can load it with the [`~Tool.load_tool`] function and pass it to the `tools` parameter in your agent.
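
The rule about keeping imports inside the tool's functions can be illustrated with a small standalone sketch. It assumes, for illustration only, that a saved tool's method source is re-executed in a fresh namespace when it is loaded elsewhere. The names `reload_and_call`, `SELF_CONTAINED`, and `RELIES_ON_MODULE_IMPORT` are hypothetical, and this is not a description of smolagents internals.

```python
# Source of a method that carries its own import.
SELF_CONTAINED = '''
def forward(values):
    import statistics  # the import lives inside the function body
    return statistics.mean(values)
'''

# Source of a method that expects a module-level `import statistics`.
RELIES_ON_MODULE_IMPORT = '''
def forward(values):
    return statistics.mean(values)
'''


def reload_and_call(source, values):
    # Execute the source in an empty namespace, then call the function it defines.
    namespace = {}
    exec(source, namespace)
    return namespace["forward"](values)


ok = reload_and_call(SELF_CONTAINED, [1, 2, 3])

try:
    reload_and_call(RELIES_ON_MODULE_IMPORT, [1, 2, 3])
    failed = False
except NameError:
    # The module-level import did not travel with the function source.
    failed = True
```

The first version works wherever its source ends up; the second fails because the name `statistics` is not defined in the namespace it was reloaded into.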