* feat: change ollama default model to llama3.1
* chore: bump versions
* feat: Change default model in local mode to llama3.1
* chore: make sure the latest Poetry version is used
* fix: mypy
* fix: do not add BOS (with the latest llama-cpp-python version)
* Extract optional dependencies
* Separate local mode into llms-llama-cpp and embeddings-huggingface for clarity
* Support Ollama embeddings
* Upgrade to LlamaIndex 0.10.14. Remove the legacy use of ServiceContext in ContextChatEngine (see the sketch after this list)
* Fix vector retriever filters
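To make the LlamaIndex 0.10 changes above more concrete, here is a minimal sketch of how the Ollama LLM (now defaulting to llama3.1), Ollama embeddings, the global `Settings` object that replaces `ServiceContext`, and metadata filters on the vector retriever can fit together. It assumes a running Ollama server plus the `llama-index-llms-ollama` and `llama-index-embeddings-ollama` packages; the embedding model name, base URL, and filter key/value are illustrative assumptions, not values taken from this PR.

```python
from llama_index.core import Document, Settings, VectorStoreIndex
from llama_index.core.chat_engine import ContextChatEngine
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters
from llama_index.embeddings.ollama import OllamaEmbedding
from llama_index.llms.ollama import Ollama

# The global Settings object replaces the legacy ServiceContext in 0.10.
# llama3.1 is the new default LLM; the embedding model and base URL are
# assumptions for illustration.
Settings.llm = Ollama(model="llama3.1", base_url="http://localhost:11434")
Settings.embed_model = OllamaEmbedding(
    model_name="nomic-embed-text", base_url="http://localhost:11434"
)

# Tiny in-memory index, just to have something to retrieve from.
index = VectorStoreIndex.from_documents(
    [
        Document(
            text="PrivateGPT can use Ollama for both LLMs and embeddings.",
            metadata={"file_name": "notes.txt"},
        )
    ]
)

# Metadata filters applied to the vector retriever (hypothetical key/value).
filters = MetadataFilters(
    filters=[ExactMatchFilter(key="file_name", value="notes.txt")]
)
retriever = index.as_retriever(similarity_top_k=2, filters=filters)

# ContextChatEngine now resolves the LLM from Settings; no ServiceContext needed.
chat_engine = ContextChatEngine.from_defaults(retriever=retriever)
print(chat_engine.chat("Which file mentions Ollama?"))
```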
This mode behaves the same as the openai mode, except that it allows setting custom models that are not supported by OpenAI. It can be used with any tool that serves models through an OpenAI-compatible API.
Implements #1424
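As a rough sketch of how such a mode can be pointed at a self-hosted, OpenAI-compatible server, LlamaIndex's `OpenAILike` wrapper accepts a custom `api_base` and an arbitrary model name. The endpoint, API key, and model name below are placeholders, not values from this PR.

```python
from llama_index.llms.openai_like import OpenAILike

# Placeholder endpoint and model: any server that speaks the OpenAI API
# (for example vLLM, LM Studio, or the llama.cpp server) can be used here.
llm = OpenAILike(
    api_base="http://localhost:8000/v1",
    api_key="not-needed",      # many local servers ignore the key
    model="my-custom-model",   # not restricted to OpenAI's model list
    is_chat_model=True,        # use the chat completions endpoint
)

print(llm.complete("Hello from an OpenAI-compatible backend!"))
```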