
# Agents

> [!WARNING]
> Smolagents is an experimental API which is subject to change at any time. Results returned by the agents can vary as the APIs or underlying models are prone to change.

To learn more about agents and tools, make sure to read the introductory guide. This page contains the API docs for the underlying classes.

## Agents

Our agents inherit from [`MultiStepAgent`], which means they can act in multiple steps, each step consisting of one thought, then one tool call and execution. Read more in this conceptual guide.

We provide two types of agents, both based on the main [`MultiStepAgent`] class.

- [`CodeAgent`] is the default agent; it writes its tool calls in Python code.
- [`ToolCallingAgent`] writes its tool calls in JSON.

Both require a `model` and a list of tools `tools` at initialization.
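For instance, here's a minimal sketch of creating and running a [`CodeAgent`]; the choice of tool and model below is just illustrative:

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# Illustrative setup: a web-search tool (needs the duckduckgo-search extra)
# and the default HF Inference API model.
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())

# The agent thinks, writes its tool calls as Python code, executes them,
# and returns a final answer.
agent.run("How many seconds are there in a leap year?")
```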

### Classes of agents

[[autodoc]] MultiStepAgent

[[autodoc]] CodeAgent

[[autodoc]] ToolCallingAgent

### ManagedAgent

[[autodoc]] ManagedAgent

### stream_to_gradio

[[autodoc]] stream_to_gradio

### GradioUI

> [!TIP]
> You must have `gradio` installed to use the UI. Please run `pip install smolagents[gradio]` if it's not already installed.
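As a quick illustration, here's a minimal sketch of wrapping an agent in the Gradio UI; the bare agent setup is just a placeholder:

```python
from smolagents import CodeAgent, GradioUI, HfApiModel

# Placeholder agent: any agent instance works here.
agent = CodeAgent(tools=[], model=HfApiModel())

# Launches a local Gradio app for chatting with the agent.
GradioUI(agent).launch()
```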

[[autodoc]] GradioUI

## Models

You're free to create and use your own models to power your agent.

You could use any model callable for your agent, as long as:

1. It follows the messages format (`List[Dict[str, str]]`) for its input `messages`, and it returns a `str`.
2. It stops generating outputs before the sequences passed in the argument `stop_sequences`.

To define your LLM, you can write a `custom_model` callable that accepts a list of messages and returns an object with a `.content` attribute containing the text. This callable also needs to accept a `stop_sequences` argument that indicates when to stop generating.

```python
from huggingface_hub import login, InferenceClient

login("<YOUR_HUGGINGFACEHUB_API_TOKEN>")

model_id = "meta-llama/Llama-3.3-70B-Instruct"

client = InferenceClient(model=model_id)

def custom_model(messages, stop_sequences=["Task"]):
    response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1000)
    # The returned message object carries the generated text in its `.content` attribute.
    answer = response.choices[0].message
    return answer
```

Additionally, `custom_model` can take a `grammar` argument. If you specify a `grammar` at agent initialization, this argument will be passed along in calls to the model, with the grammar you defined, to allow constrained generation that forces properly formatted agent outputs.
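For example, here's one way `custom_model` could forward the grammar, reusing the `client` from the snippet above; mapping the grammar to `InferenceClient`'s `response_format` parameter is an assumption that depends on your client:

```python
def custom_model(messages, stop_sequences=["Task"], grammar=None):
    kwargs = {}
    if grammar is not None:
        # Assumption: the agent's grammar maps to the client's `response_format` parameter.
        kwargs["response_format"] = grammar
    response = client.chat_completion(
        messages, stop=stop_sequences, max_tokens=1000, **kwargs
    )
    return response.choices[0].message
```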

### TransformersModel

For convenience, we have added a `TransformersModel` that implements the points above by building a local `transformers` pipeline for the `model_id` given at initialization.

```python
from smolagents import TransformersModel

model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")

print(model([{"role": "user", "content": "Ok!"}], stop_sequences=["great"]))
```
```text
>>> What a
```

> [!TIP]
> You must have `transformers` and `torch` installed on your machine. Please run `pip install smolagents[transformers]` if they're not already installed.

[[autodoc]] TransformersModel

### HfApiModel

The `HfApiModel` wraps an HF Inference API client to execute the LLM.

```python
from smolagents import HfApiModel

messages = [
  {"role": "user", "content": "Hello, how are you?"},
  {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
  {"role": "user", "content": "No need to help, take it easy."},
]

model = HfApiModel()
print(model(messages))
```
```text
>>> Of course! If you change your mind, feel free to reach out. Take care!
```
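You can also point it at a specific model on the Inference API rather than relying on the default; the model choice below is just an illustrative assumption:

```python
# Hypothetical choice: any chat model served on the HF Inference API works here.
model = HfApiModel(model_id="meta-llama/Llama-3.3-70B-Instruct")
```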

[[autodoc]] HfApiModel

### LiteLLMModel

The `LiteLLMModel` leverages LiteLLM to support 100+ LLMs from various providers. You can pass kwargs at model initialization that will then be used whenever the model is called; for instance, below we pass `temperature` and `max_tokens`.

```python
from smolagents import LiteLLMModel

messages = [
  {"role": "user", "content": "Hello, how are you?"},
  {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
  {"role": "user", "content": "No need to help, take it easy."},
]

model = LiteLLMModel("anthropic/claude-3-5-sonnet-latest", temperature=0.2, max_tokens=10)
print(model(messages))
```

[[autodoc]] LiteLLMModel

### OpenAIServerModel

This class lets you call any model served through an OpenAI-compatible API. Here's how you can set it up (you can customise the `api_base` URL to point to a different server):

```python
import os

from smolagents import OpenAIServerModel

model = OpenAIServerModel(
    model_id="gpt-4o",
    api_base="https://api.openai.com/v1",
    api_key=os.environ["OPENAI_API_KEY"],
)
```
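Because the endpoint is configurable, the same class can target a self-hosted OpenAI-compatible server; the URL, model name, and dummy key below are hypothetical and depend entirely on your setup:

```python
# Hypothetical local setup: a server exposing the OpenAI API on localhost:8000.
local_model = OpenAIServerModel(
    model_id="my-local-model",            # whatever model name your server exposes
    api_base="http://localhost:8000/v1",  # assumption: your server's base URL
    api_key="not-needed",                 # many local servers ignore the key
)
```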

[[autodoc]] OpenAIServerModel

### AzureOpenAIServerModel

`AzureOpenAIServerModel` allows you to connect to any Azure OpenAI deployment.

Below you can find an example of how to set it up. Note that you can omit the `azure_endpoint`, `api_key`, and `api_version` arguments, provided you've set the corresponding environment variables: `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `OPENAI_API_VERSION`.

Pay attention to the lack of an `AZURE_` prefix for `OPENAI_API_VERSION`; this is due to the way the underlying openai package is designed.

```python
import os

from smolagents import AzureOpenAIServerModel

model = AzureOpenAIServerModel(
    model_id=os.environ.get("AZURE_OPENAI_MODEL"),
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
    api_version=os.environ.get("OPENAI_API_VERSION"),
)
```
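And, assuming the three environment variables mentioned above are already set, initialization reduces to just the deployment name (the name below is a placeholder):

```python
# Assumes AZURE_OPENAI_ENDPOINT, AZURE_OPENAI_API_KEY, and OPENAI_API_VERSION
# are set in the environment; only the deployment name is passed explicitly.
model = AzureOpenAIServerModel(model_id="my-deployment-name")
```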

[[autodoc]] AzureOpenAIServerModel