diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml
index 71faa4d..1a2c39a 100644
--- a/docs/source/en/_toctree.yml
+++ b/docs/source/en/_toctree.yml
@@ -32,5 +32,7 @@
   sections:
   - local: reference/agents
     title: Agent-related objects
+  - local: reference/models
+    title: Model-related objects
   - local: reference/tools
     title: Tool-related objects
diff --git a/docs/source/en/reference/agents.md b/docs/source/en/reference/agents.md
index 77a0df1..425ec39 100644
--- a/docs/source/en/reference/agents.md
+++ b/docs/source/en/reference/agents.md
@@ -57,130 +57,3 @@ Both require arguments `model` and list of tools `tools` at initialization.
 > You must have `gradio` installed to use the UI. Please run `pip install smolagents[gradio]` if it's not the case.
 
 [[autodoc]] GradioUI
-
-## Models
-
-You're free to create and use your own models to power your agent.
-
-You could use any `model` callable for your agent, as long as:
-1. It follows the [messages format](./chat_templating) (`List[Dict[str, str]]`) for its input `messages`, and it returns a `str`.
-2. It stops generating outputs *before* the sequences passed in the argument `stop_sequences`
-
-For defining your LLM, you can make a `custom_model` method which accepts a list of [messages](./chat_templating) and returns an object with a .content attribute containing the text. This callable also needs to accept a `stop_sequences` argument that indicates when to stop generating.
-
-```python
-from huggingface_hub import login, InferenceClient
-
-login("<YOUR_HUGGINGFACEHUB_API_TOKEN>")
-
-model_id = "meta-llama/Llama-3.3-70B-Instruct"
-
-client = InferenceClient(model=model_id)
-
-def custom_model(messages, stop_sequences=["Task"]):
-    response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1000)
-    answer = response.choices[0].message
-    return answer
-```
-
-Additionally, `custom_model` can also take a `grammar` argument. In the case where you specify a `grammar` upon agent initialization, this argument will be passed to the calls to model, with the `grammar` that you defined upon initialization, to allow [constrained generation](https://huggingface.co/docs/text-generation-inference/conceptual/guidance) in order to force properly-formatted agent outputs.
-
-### TransformersModel
-
-For convenience, we have added a `TransformersModel` that implements the points above by building a local `transformers` pipeline for the model_id given at initialization.
-
-```python
-from smolagents import TransformersModel
-
-model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")
-
-print(model([{"role": "user", "content": "Ok!"}], stop_sequences=["great"]))
-```
-```text
->>> What a
-```
-
-> [!TIP]
-> You must have `transformers` and `torch` installed on your machine. Please run `pip install smolagents[transformers]` if it's not the case.
-
-[[autodoc]] TransformersModel
-
-### HfApiModel
-
-The `HfApiModel` wraps an [HF Inference API](https://huggingface.co/docs/api-inference/index) client for the execution of the LLM.
-
-```python
-from smolagents import HfApiModel
-
-messages = [
-    {"role": "user", "content": "Hello, how are you?"},
-    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
-    {"role": "user", "content": "No need to help, take it easy."},
-]
-
-model = HfApiModel()
-print(model(messages))
-```
-```text
->>> Of course! If you change your mind, feel free to reach out. Take care!
-```
-[[autodoc]] HfApiModel
-
-### LiteLLMModel
-
-The `LiteLLMModel` leverages [LiteLLM](https://www.litellm.ai/) to support 100+ LLMs from various providers.
-You can pass kwargs upon model initialization that will then be used whenever using the model, for instance below we pass `temperature`.
-
-```python
-from smolagents import LiteLLMModel
-
-messages = [
-    {"role": "user", "content": "Hello, how are you?"},
-    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
-    {"role": "user", "content": "No need to help, take it easy."},
-]
-
-model = LiteLLMModel("anthropic/claude-3-5-sonnet-latest", temperature=0.2, max_tokens=10)
-print(model(messages))
-```
-
-[[autodoc]] LiteLLMModel
-
-### OpenAIServerModel
-
-This class lets you call any OpenAIServer compatible model.
-Here's how you can set it (you can customise the `api_base` url to point to another server):
-```py
-from smolagents import OpenAIServerModel
-
-model = OpenAIServerModel(
-    model_id="gpt-4o",
-    api_base="https://api.openai.com/v1",
-    api_key=os.environ["OPENAI_API_KEY"],
-)
-```
-
-[[autodoc]] OpenAIServerModel
-
-### AzureOpenAIServerModel
-
-`AzureOpenAIServerModel` allows you to connect to any Azure OpenAI deployment.
-
-Below you can find an example of how to set it up, note that you can omit the `azure_endpoint`, `api_key`, and `api_version` arguments, provided you've set the corresponding environment variables -- `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `OPENAI_API_VERSION`.
-
-Pay attention to the lack of an `AZURE_` prefix for `OPENAI_API_VERSION`, this is due to the way the underlying [openai](https://github.com/openai/openai-python) package is designed.
-
-```py
-import os
-
-from smolagents import AzureOpenAIServerModel
-
-model = AzureOpenAIServerModel(
-    model_id = os.environ.get("AZURE_OPENAI_MODEL"),
-    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
-    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
-    api_version=os.environ.get("OPENAI_API_VERSION")
-)
-```
-
-[[autodoc]] AzureOpenAIServerModel
\ No newline at end of file
diff --git a/docs/source/en/reference/models.md b/docs/source/en/reference/models.md
new file mode 100644
index 0000000..3c4297a
--- /dev/null
+++ b/docs/source/en/reference/models.md
@@ -0,0 +1,153 @@
+
+# Models
+
+<Tip warning={true}>
+
+Smolagents is an experimental API which is subject to change at any time. Results returned by the agents
+can vary as the APIs or underlying models are prone to change.
+
+</Tip>
+
+To learn more about agents and tools, make sure to read the [introductory guide](../index). This page
+contains the API docs for the underlying classes.
+
+## Models
+
+You're free to create and use your own models to power your agent.
+
+You can use any `model` callable for your agent, as long as:
+1. It follows the [messages format](./chat_templating) (`List[Dict[str, str]]`) for its input `messages`, and it returns an object with a `.content` attribute containing the generated text.
+2. It stops generating outputs *before* the sequences passed in the argument `stop_sequences`.
+
+To define your LLM, you can write a `custom_model` function that accepts a list of [messages](./chat_templating) and returns such an object. The callable also needs to accept a `stop_sequences` argument that indicates when to stop generating.
+
+```python
+from huggingface_hub import login, InferenceClient
+
+login("<YOUR_HUGGINGFACEHUB_API_TOKEN>")
+
+model_id = "meta-llama/Llama-3.3-70B-Instruct"
+
+client = InferenceClient(model=model_id)
+
+def custom_model(messages, stop_sequences=["Task"]):
+    response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1000)
+    answer = response.choices[0].message
+    return answer
+```
+
+Additionally, `custom_model` can take a `grammar` argument. If you specify a `grammar` upon agent initialization, this argument will be passed along on each call to the model, enabling [constrained generation](https://huggingface.co/docs/text-generation-inference/conceptual/guidance) to force properly-formatted agent outputs.
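+
+For instance, here is a sketch of how `custom_model` could honor that extra argument, reusing the `client` from the example above. It assumes you forward the grammar as the `response_format` parameter of `InferenceClient.chat_completion` (a TGI-style guidance spec); adapt this to whatever constrained-decoding mechanism your backend exposes:
+
+```python
+def custom_model(messages, stop_sequences=["Task"], grammar=None):
+    # `grammar` might look like {"type": "json", "value": your_json_schema};
+    # passing it as `response_format` constrains the generated tokens.
+    response = client.chat_completion(
+        messages,
+        stop=stop_sequences,
+        max_tokens=1000,
+        response_format=grammar,
+    )
+    return response.choices[0].message
+```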
+
+### TransformersModel
+
+For convenience, we have added a `TransformersModel` that implements the points above by building a local `transformers` pipeline for the `model_id` given at initialization.
+
+```python
+from smolagents import TransformersModel
+
+model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")
+
+print(model([{"role": "user", "content": "Ok!"}], stop_sequences=["great"]))
+```
+```text
+>>> What a
+```
+
+> [!TIP]
+> You must have `transformers` and `torch` installed on your machine. Please run `pip install smolagents[transformers]` if it's not the case.
+
+[[autodoc]] TransformersModel
+
+### HfApiModel
+
+The `HfApiModel` wraps an [HF Inference API](https://huggingface.co/docs/api-inference/index) client to execute the LLM.
+
+```python
+from smolagents import HfApiModel
+
+messages = [
+    {"role": "user", "content": "Hello, how are you?"},
+    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
+    {"role": "user", "content": "No need to help, take it easy."},
+]
+
+model = HfApiModel()
+print(model(messages))
+```
+```text
+>>> Of course! If you change your mind, feel free to reach out. Take care!
+```
+
+[[autodoc]] HfApiModel
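+
+Whichever model you pick, you can hand it straight to an agent. A minimal sketch, assuming the `CodeAgent` class documented in [Agents](./agents) (the task string is just an illustration):
+
+```python
+from smolagents import CodeAgent, HfApiModel
+
+model = HfApiModel()
+
+# The agent only needs a callable model; `add_base_tools=True` equips the default toolbox.
+agent = CodeAgent(tools=[], model=model, add_base_tools=True)
+agent.run("Could you give me the 118th number in the Fibonacci sequence?")
+```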
+
+### LiteLLMModel
+
+The `LiteLLMModel` leverages [LiteLLM](https://www.litellm.ai/) to support 100+ LLMs from various providers.
+You can pass kwargs upon model initialization that will then be used on every call to the model; for instance, below we pass `temperature` and `max_tokens`.
+
+```python
+from smolagents import LiteLLMModel
+
+messages = [
+    {"role": "user", "content": "Hello, how are you?"},
+    {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
+    {"role": "user", "content": "No need to help, take it easy."},
+]
+
+model = LiteLLMModel("anthropic/claude-3-5-sonnet-latest", temperature=0.2, max_tokens=10)
+print(model(messages))
+```
+
+[[autodoc]] LiteLLMModel
+
+### OpenAIServerModel
+
+This class lets you call any OpenAI-compatible server.
+Here's how you can set it up (you can customise the `api_base` URL to point to another server):
+
+```py
+import os
+
+from smolagents import OpenAIServerModel
+
+model = OpenAIServerModel(
+    model_id="gpt-4o",
+    api_base="https://api.openai.com/v1",
+    api_key=os.environ["OPENAI_API_KEY"],
+)
+```
+
+[[autodoc]] OpenAIServerModel
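+
+Since only `api_base` needs to change, the same class can also target a self-hosted OpenAI-compatible server. A sketch, assuming a local vLLM or similar endpoint (the URL, model name, and dummy key are placeholders):
+
+```py
+from smolagents import OpenAIServerModel
+
+model = OpenAIServerModel(
+    model_id="Qwen/Qwen2.5-Coder-32B-Instruct",  # whatever model your server serves
+    api_base="http://localhost:8000/v1",  # e.g. a local vLLM endpoint
+    api_key="not-needed",  # many local servers accept any placeholder key
+)
+```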
+
+### AzureOpenAIServerModel
+
+`AzureOpenAIServerModel` allows you to connect to any Azure OpenAI deployment.
+
+Below you can find an example of how to set it up; note that you can omit the `azure_endpoint`, `api_key`, and `api_version` arguments, provided you've set the corresponding environment variables -- `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `OPENAI_API_VERSION`.
+
+Pay attention to the lack of an `AZURE_` prefix for `OPENAI_API_VERSION`: this is due to the way the underlying [openai](https://github.com/openai/openai-python) package is designed.
+
+```py
+import os
+
+from smolagents import AzureOpenAIServerModel
+
+model = AzureOpenAIServerModel(
+    model_id=os.environ.get("AZURE_OPENAI_MODEL"),
+    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
+    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
+    api_version=os.environ.get("OPENAI_API_VERSION"),
+)
+```
+
+[[autodoc]] AzureOpenAIServerModel
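+
+As a side note, the same Azure deployment can also be reached through the `LiteLLMModel` class above. A sketch, assuming LiteLLM's `azure/<deployment>` provider prefix and the same environment variables:
+
+```py
+import os
+
+from smolagents import LiteLLMModel
+
+model = LiteLLMModel(
+    "azure/" + os.environ["AZURE_OPENAI_MODEL"],  # LiteLLM routes `azure/...` model ids to Azure
+    api_base=os.environ["AZURE_OPENAI_ENDPOINT"],
+    api_key=os.environ["AZURE_OPENAI_API_KEY"],
+)
+```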