feat(settings): Update default model to TheBloke/Mistral-7B-Instruct-v0.2-GGUF (#1415)
* Update LlamaCPP dependency
* Default to TheBloke/Mistral-7B-Instruct-v0.2-GGUF
* Fix API docs
This commit is contained in:
parent c71ae7cee9
commit 8ec7cf49f4
```diff
@@ -1 +1,14 @@
 # API Reference
+
+The API is divided in two logical blocks:
+
+1. High-level API, abstracting all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation:
+    - Ingestion of documents: internally managing document parsing, splitting, metadata extraction,
+      embedding generation and storage.
+    - Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt
+      engineering and the response generation.
+
+2. Low-level API, allowing advanced users to implement their own complex pipelines:
+    - Embeddings generation: based on a piece of text.
+    - Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested
+      documents.
```
```diff
@@ -32,21 +32,6 @@ The installation guide will help you in the [Installation section](/installation
   />
 </Cards>
 
-## API Organization
-
-The API is divided in two logical blocks:
-
-1. High-level API, abstracting all the complexity of a RAG (Retrieval Augmented Generation) pipeline implementation:
-    - Ingestion of documents: internally managing document parsing, splitting, metadata extraction,
-      embedding generation and storage.
-    - Chat & Completions using context from ingested documents: abstracting the retrieval of context, the prompt
-      engineering and the response generation.
-
-2. Low-level API, allowing advanced users to implement their own complex pipelines:
-    - Embeddings generation: based on a piece of text.
-    - Contextual chunks retrieval: given a query, returns the most relevant chunks of text from the ingested
-      documents.
-
 <Callout intent = "info">
 A working **Gradio UI client** is provided to test the API, together with a set of useful tools such as bulk
 model download script, ingestion script, documents folder watch, etc.
```
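The two blocks described in the relocated docs section map naturally to HTTP calls against the server. A minimal sketch of building the request bodies for a high-level chat call and a low-level chunks-retrieval call; the base URL and the `/v1/chat/completions` and `/v1/chunks` route names are assumptions for illustration, not taken from this diff:

```python
import json

BASE_URL = "http://localhost:8001"  # assumed local server address

def chat_request(question: str, use_context: bool = True) -> dict:
    """Build the JSON body for a high-level chat call that lets the
    server retrieve context from ingested documents."""
    return {
        "messages": [{"role": "user", "content": question}],
        "use_context": use_context,
    }

def chunks_request(query: str, limit: int = 4) -> dict:
    """Build the JSON body for a low-level contextual-chunks retrieval call."""
    return {"text": query, "limit": limit}

# The actual calls would look something like this (requires a running server):
#   requests.post(f"{BASE_URL}/v1/chat/completions", json=chat_request("What is RAG?"))
#   requests.post(f"{BASE_URL}/v1/chunks", json=chunks_request("What is RAG?"))

body = chat_request("What does the high-level API abstract?")
print(json.dumps(body, indent=2))
```

The split mirrors the docs: the chat body only carries the question, because retrieval and prompt engineering happen server-side; the chunks body exposes retrieval directly for custom pipelines.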
										
											
(File diff suppressed because it is too large.)
```diff
@@ -36,7 +36,7 @@ gradio = "^4.4.1"
 [tool.poetry.group.local]
 optional = true
 [tool.poetry.group.local.dependencies]
-llama-cpp-python = "^0.2.11"
+llama-cpp-python = "^0.2.23"
 numpy = "1.26.0"
 sentence-transformers = "^2.2.2"
 # https://stackoverflow.com/questions/76327419/valueerror-libcublas-so-0-9-not-found-in-the-system-path
```
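Poetry's caret constraint `^0.2.23` means any version at least 0.2.23 but below 0.3.0 (for a 0.x constraint, the caret pins the second component). A small self-contained check of that rule; the parsing helper is illustrative, not Poetry's own resolver:

```python
def parse(version: str) -> tuple:
    """Turn a dotted version string into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))

def satisfies_caret(version: str, constraint: str = "0.2.23") -> bool:
    """Check a version against ^0.2.23, i.e. >= 0.2.23 and < 0.3.0."""
    low = parse(constraint)
    high = (low[0], low[1] + 1, 0)  # next minor for a 0.x caret range
    return low <= parse(version) < high

print(satisfies_caret("0.2.23"))  # True: the new floor matches itself
print(satisfies_caret("0.2.11"))  # the old default is now below the floor
print(satisfies_caret("0.3.0"))   # the next minor is excluded
```

So the bump from `^0.2.11` tightens the floor to pick up 0.2.23+ while still excluding 0.3.x.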
```diff
@@ -48,8 +48,8 @@ qdrant:
 
 local:
   prompt_style: "llama2"
-  llm_hf_repo_id: TheBloke/Mistral-7B-Instruct-v0.1-GGUF
-  llm_hf_model_file: mistral-7b-instruct-v0.1.Q4_K_M.gguf
+  llm_hf_repo_id: TheBloke/Mistral-7B-Instruct-v0.2-GGUF
+  llm_hf_model_file: mistral-7b-instruct-v0.2.Q4_K_M.gguf
   embedding_hf_model_name: BAAI/bge-small-en-v1.5
 
 sagemaker:
```
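The updated `local` block points at the v0.2 GGUF weights on the Hugging Face Hub, while the unchanged `prompt_style: "llama2"` implies `[INST]`-style prompt wrapping, which Mistral-Instruct models also expect. A sketch of how those two fields would drive a download, with the download itself commented out because it fetches a multi-gigabyte file; `hf_hub_download` is huggingface_hub's real API, the formatting helper is illustrative:

```python
# Values taken from the updated settings.yaml `local` block.
LLM_HF_REPO_ID = "TheBloke/Mistral-7B-Instruct-v0.2-GGUF"
LLM_HF_MODEL_FILE = "mistral-7b-instruct-v0.2.Q4_K_M.gguf"

def format_llama2_prompt(user_message: str) -> str:
    """Wrap a user message in the [INST] tags used by the llama2 prompt style."""
    return f"<s>[INST] {user_message.strip()} [/INST]"

# Fetching the quantized weights would look like this (commented out: ~4 GB):
# from huggingface_hub import hf_hub_download
# model_path = hf_hub_download(repo_id=LLM_HF_REPO_ID, filename=LLM_HF_MODEL_FILE)

print(format_llama2_prompt("Why switch to v0.2?"))
```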