# smolagents - a smol library to build great agents!

`smolagents` is a library that enables you to run powerful agents in a few lines of code. It offers:
- ✨ **Simplicity**: the logic for agents fits in ~1,000 lines of code (see `agents.py`). We kept abstractions to their minimal shape above raw code!
- 🧑‍💻 **First-class support for Code Agents**, i.e. agents that write their actions in code (as opposed to "agents being used to write code"). To make this secure, we support executing code in sandboxed environments via E2B. On top of this `CodeAgent` class, we still support the standard `ToolCallingAgent` that writes actions as JSON/text blobs.
- 🤗 **Hub integrations**: you can share and load Gradio Spaces as tools to/from the Hub, and more is to come!
- 🌐 **Support for any LLM**: it supports models hosted on the Hub loaded in their `transformers` version or through our inference API, but also models from OpenAI, Anthropic and many others via our LiteLLM integration (see the sketch below).
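As an illustration of that last point, here is a minimal sketch of swapping model backends; the model IDs are placeholders, so check the docs for the exact constructor arguments of each model class:

```python
from smolagents import CodeAgent, HfApiModel, LiteLLMModel, TransformersModel

# Serverless inference through the Hugging Face Inference API
model = HfApiModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")

# ...or run a model locally via transformers:
# model = TransformersModel(model_id="Qwen/Qwen2.5-Coder-32B-Instruct")

# ...or call OpenAI, Anthropic, etc. through LiteLLM:
# model = LiteLLMModel(model_id="gpt-4o")

agent = CodeAgent(tools=[], model=model)
agent.run("What is the result of 2 ** 10?")
```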
Full documentation can be found here.

> [!NOTE]
> Check out our launch blog post to learn more about `smolagents`!
## Quick demo

First install the package:

```bash
pip install smolagents
```

Then define your agent, give it the tools it needs, and run it!

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run("How many seconds would it take for a leopard at full speed to run through Pont des Arts?")
```

https://github.com/user-attachments/assets/cd0226e2-7479-4102-aea0-57c22ca47884
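Beyond the built-in search tool, you can hand the agent your own tools. A minimal sketch using the `tool` decorator; the `get_weather` tool below is made up for illustration:

```python
from smolagents import CodeAgent, HfApiModel, tool

@tool
def get_weather(city: str) -> str:
    """Returns a (made-up) weather report for a city.

    Args:
        city: Name of the city to look up.
    """
    return f"The weather in {city} is sunny with a light breeze."

agent = CodeAgent(tools=[get_weather], model=HfApiModel())
agent.run("Should I pack an umbrella for Paris?")
```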
## Code agents?

In our `CodeAgent`, the LLM engine writes its actions in code. This approach has been shown to work better than the current industry practice of letting the LLM output a dictionary of the tools it wants to call: it uses 30% fewer steps (and thus 30% fewer LLM calls) and reaches higher performance on difficult benchmarks. Head to our high-level intro to agents to learn more about that.

In particular, since code execution can be a security concern (arbitrary code execution!), we provide options at runtime:
- a secure Python interpreter to run code more safely in your environment (more secure than raw code execution, but still risky);
- a sandboxed environment using E2B (this removes the risk to your own system); see the sketch below.
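For example, a minimal sketch of opting into the E2B sandbox; the `use_e2b_executor` flag is an assumption based on the current release and requires an E2B API key in your environment, so check the docs for the exact option name:

```python
from smolagents import CodeAgent, HfApiModel

# Assumes `use_e2b_executor=True` routes code execution to an E2B sandbox
# and that an E2B API key is configured in the environment.
agent = CodeAgent(tools=[], model=HfApiModel(), use_e2b_executor=True)
agent.run("What is the 10th Fibonacci number?")
```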
## How smol is it really?

We strived to keep abstractions to a strict minimum: the main code in `agents.py` is only ~1,000 lines of code.
Still, we implement several types of agents: `CodeAgent` writes its actions as Python code snippets, while the more classic `ToolCallingAgent` leverages built-in tool-calling methods (as sketched below).
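Both agents expose the same interface; a minimal sketch of the `ToolCallingAgent` variant:

```python
from smolagents import ToolCallingAgent, DuckDuckGoSearchTool, HfApiModel

# Same constructor as CodeAgent, but actions are emitted as JSON tool calls
agent = ToolCallingAgent(tools=[DuckDuckGoSearchTool()], model=HfApiModel())
agent.run("What is the tallest building in Paris?")
```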
By the way, why use a framework at all? Well, because a big part of this stuff is non-trivial. For instance, a code agent has to keep a consistent code format throughout its system prompt, its parser, and its execution. Our framework handles this complexity for you. But of course, we still encourage you to hack into the source code and use only the bits that you need, to the exclusion of everything else!
## How strong are open models for agentic workflows?

We've created `CodeAgent` instances with some leading models and compared them on this benchmark, which gathers questions from a few different benchmarks to propose a varied blend of challenges.
Find the benchmarking code here for more detail on the agentic setup used, and see a comparison of code agents versus vanilla LLM calls (spoiler: code agents work better).
This comparison shows that open-source models can now take on the best closed models!
## Contributing

To contribute, follow our contribution guide.
At any moment, feel welcome to open an issue, citing your exact error traces and package versions if it's a bug. It's often even better to open a PR with your proposed fixes/changes!

To install dev dependencies, run:

```bash
pip install -e ".[dev]"
```

When making changes to the codebase, please check that they follow the repo's code quality requirements by running:

```bash
make quality
```

If the checks fail, you can run the formatter with:

```bash
make style
```

and commit the changes.

To run tests locally, run this command:

```bash
make test
```
## Citing smolagents

If you use `smolagents` in your publication, please cite it by using the following BibTeX entry.

```bibtex
@Misc{smolagents,
  title =        {`smolagents`: a smol library to build great agentic systems.},
  author =       {Aymeric Roucher and Albert Villanova del Moral and Thomas Wolf and Leandro von Werra and Erik Kaunismäki},
  howpublished = {\url{https://github.com/huggingface/smolagents}},
  year =         {2025}
}
```