Separate tree for Model docs is added (#382)
parent 4579a6f7cc
commit dca7081394
@@ -32,5 +32,7 @@
  sections:
  - local: reference/agents
    title: Agent-related objects
+ - local: reference/models
+   title: Model-related objects
  - local: reference/tools
    title: Tool-related objects
@@ -57,130 +57,3 @@ Both require arguments `model` and list of tools `tools` at initialization.
> You must have `gradio` installed to use the UI. Please run `pip install smolagents[gradio]` if it's not the case.

[[autodoc]] GradioUI

-## Models
-… (removed: the entire "Models" section, identical line for line to the new reference/models page added in the next hunk)
@@ -0,0 +1,153 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.

⚠️ Note that this file is in Markdown but contains specific syntax for our doc-builder (similar to MDX) that may not be
rendered properly in your Markdown viewer.

-->
# Models

<Tip warning={true}>

Smolagents is an experimental API which is subject to change at any time. Results returned by the agents
can vary as the APIs or underlying models are prone to change.

</Tip>

To learn more about agents and tools, make sure to read the [introductory guide](../index). This page
contains the API docs for the underlying classes.

## Models

You're free to create and use your own models to power your agent.

You can use any `model` callable for your agent, as long as:
1. It follows the [messages format](./chat_templating) (`List[Dict[str, str]]`) for its input `messages`, and it returns a `str`.
2. It stops generating outputs *before* the sequences passed in the argument `stop_sequences`.

To define your LLM, you can write a `custom_model` function that accepts a list of [messages](./chat_templating) and returns an object with a `.content` attribute containing the text. The callable also needs to accept a `stop_sequences` argument indicating when to stop generating.

```python
from huggingface_hub import login, InferenceClient

login("<YOUR_HUGGINGFACEHUB_API_TOKEN>")

model_id = "meta-llama/Llama-3.3-70B-Instruct"

client = InferenceClient(model=model_id)

def custom_model(messages, stop_sequences=["Task"]):
    response = client.chat_completion(messages, stop=stop_sequences, max_tokens=1000)
    # The returned message object exposes the generated text via `.content`
    answer = response.choices[0].message
    return answer
```
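
A callable like this can then power an agent directly. A minimal sketch, assuming `CodeAgent` as the agent class and an empty toolbox for brevity:

```python
from smolagents import CodeAgent

# Hand the custom callable to the agent; any callable meeting the two
# requirements above works here (CodeAgent and the empty toolbox are
# illustrative assumptions).
agent = CodeAgent(tools=[], model=custom_model)
agent.run("Explain the messages format in one sentence.")
```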

Additionally, `custom_model` can take a `grammar` argument. If you specify a `grammar` when initializing the agent, it is passed on each call to the model to allow [constrained generation](https://huggingface.co/docs/text-generation-inference/conceptual/guidance), forcing properly-formatted agent outputs.

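A minimal sketch of what that could look like, extending `custom_model` above (the mapping of `grammar` onto the client's `response_format` parameter is an assumption; check your inference backend for the exact field):

```python
def custom_model(messages, stop_sequences=["Task"], grammar=None):
    # Forward the agent-supplied grammar so generation is constrained
    # to the requested format (assumed to map to `response_format`).
    response = client.chat_completion(
        messages,
        stop=stop_sequences,
        max_tokens=1000,
        response_format=grammar,
    )
    return response.choices[0].message
```
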
### TransformersModel

For convenience, we have added a `TransformersModel` that implements the points above by building a local `transformers` pipeline for the `model_id` given at initialization.

```python
from smolagents import TransformersModel

model = TransformersModel(model_id="HuggingFaceTB/SmolLM-135M-Instruct")

print(model([{"role": "user", "content": "Ok!"}], stop_sequences=["great"]))
```
```text
>>> What a
```

> [!TIP]
> You must have `transformers` and `torch` installed on your machine. Please run `pip install smolagents[transformers]` if it's not the case.

[[autodoc]] TransformersModel

### HfApiModel

The `HfApiModel` wraps an [HF Inference API](https://huggingface.co/docs/api-inference/index) client to execute the LLM.

|  | ```python | ||||||
|  | from smolagents import HfApiModel | ||||||
|  | 
 | ||||||
|  | messages = [ | ||||||
|  |   {"role": "user", "content": "Hello, how are you?"}, | ||||||
|  |   {"role": "assistant", "content": "I'm doing great. How can I help you today?"}, | ||||||
|  |   {"role": "user", "content": "No need to help, take it easy."}, | ||||||
|  | ] | ||||||
|  | 
 | ||||||
|  | model = HfApiModel() | ||||||
|  | print(model(messages)) | ||||||
|  | ``` | ||||||
|  | ```text | ||||||
|  | >>> Of course! If you change your mind, feel free to reach out. Take care! | ||||||
|  | ``` | ||||||
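
If you want to target a specific Hub model rather than the default, a sketch (the `model_id` parameter name is an assumption, mirroring `TransformersModel` above):

```python
from smolagents import HfApiModel

# Assumption: model_id selects the Hub repository served by the Inference API.
model = HfApiModel(model_id="meta-llama/Llama-3.3-70B-Instruct")
```
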
[[autodoc]] HfApiModel

### LiteLLMModel

The `LiteLLMModel` leverages [LiteLLM](https://www.litellm.ai/) to support 100+ LLMs from various providers.
You can pass kwargs at model initialization that will then be used on every call to the model; for instance, below we pass `temperature` and `max_tokens`.

```python
from smolagents import LiteLLMModel

messages = [
  {"role": "user", "content": "Hello, how are you?"},
  {"role": "assistant", "content": "I'm doing great. How can I help you today?"},
  {"role": "user", "content": "No need to help, take it easy."},
]

model = LiteLLMModel("anthropic/claude-3-5-sonnet-latest", temperature=0.2, max_tokens=10)
print(model(messages))
```
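
LiteLLM resolves the provider from the model string and reads that provider's credentials from the environment; for the Anthropic model above, this means exporting `ANTHROPIC_API_KEY` (per LiteLLM's conventions) before the call:

```python
import os

# LiteLLM picks up provider keys from the environment;
# ANTHROPIC_API_KEY is the variable it documents for Anthropic models.
os.environ["ANTHROPIC_API_KEY"] = "<YOUR_ANTHROPIC_API_TOKEN>"
```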

[[autodoc]] LiteLLMModel

### OpenAIServerModel

This class lets you call any model served behind an OpenAI-compatible API.
Here's how you can set it up (you can customise the `api_base` url to point to another server):
```py
import os

from smolagents import OpenAIServerModel

model = OpenAIServerModel(
    model_id="gpt-4o",
    api_base="https://api.openai.com/v1",
    api_key=os.environ["OPENAI_API_KEY"],
)
```
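
Because `api_base` is configurable, the same class can target a locally hosted OpenAI-compatible server. A sketch, assuming an Ollama instance serving its OpenAI-compatible API on the default port:

```py
from smolagents import OpenAIServerModel

model = OpenAIServerModel(
    model_id="llama3.2",  # assumption: a model already pulled into Ollama
    api_base="http://localhost:11434/v1",
    api_key="not-needed",  # local servers typically ignore the key
)
```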

[[autodoc]] OpenAIServerModel

### AzureOpenAIServerModel

`AzureOpenAIServerModel` allows you to connect to any Azure OpenAI deployment.

Below you can find an example of how to set it up. Note that you can omit the `azure_endpoint`, `api_key`, and `api_version` arguments, provided you've set the corresponding environment variables -- `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `OPENAI_API_VERSION`.

Pay attention to the lack of an `AZURE_` prefix for `OPENAI_API_VERSION`; this is due to the way the underlying [openai](https://github.com/openai/openai-python) package is designed.

```py
import os

from smolagents import AzureOpenAIServerModel

model = AzureOpenAIServerModel(
    model_id=os.environ.get("AZURE_OPENAI_MODEL"),
    azure_endpoint=os.environ.get("AZURE_OPENAI_ENDPOINT"),
    api_key=os.environ.get("AZURE_OPENAI_API_KEY"),
    api_version=os.environ.get("OPENAI_API_VERSION"),
)
```
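
Since those three arguments fall back to environment variables, the setup can be shortened accordingly. A sketch, assuming `AZURE_OPENAI_ENDPOINT`, `AZURE_OPENAI_API_KEY`, and `OPENAI_API_VERSION` are all exported:

```py
import os

from smolagents import AzureOpenAIServerModel

# Endpoint, key, and API version are read from the environment as
# described above; only the deployment name is passed explicitly.
model = AzureOpenAIServerModel(model_id=os.environ["AZURE_OPENAI_MODEL"])
```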

[[autodoc]] AzureOpenAIServerModel