diff --git a/README.md b/README.md index 54c7a01..32a7693 100644 --- a/README.md +++ b/README.md @@ -13,18 +13,20 @@ In order to set your environment up to run the code here, first install all requ pip install -r requirements.txt ``` -Rename example.env to .env and edit the variables appropriately. +Then, download the 2 models and place them in a directory of your choice. +- LLM: default to [ggml-gpt4all-j-v1.3-groovy.bin](https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin). If you prefer a different GPT4All-J compatible model, just download it and reference it in your `.env` file. +- Embedding: default to [ggml-model-q4_0.bin](https://huggingface.co/Pi3141/alpaca-native-7B-ggml/resolve/397e872bf4c83f4c642317a5bf65ce84a105786e/ggml-model-q4_0.bin). If you prefer a different compatible Embeddings model, just download it and reference it in your `.env` file. + +Rename `example.env` to `.env` and edit the variables appropriately. ``` MODEL_TYPE: supports LlamaCpp or GPT4All PERSIST_DIRECTORY: is the folder you want your vectorstore in -LLAMA_EMBEDDINGS_MODEL: Path to your LlamaCpp supported embeddings model +LLAMA_EMBEDDINGS_MODEL: (absolute) Path to your LlamaCpp supported embeddings model MODEL_PATH: Path to your GPT4All or LlamaCpp supported LLM MODEL_N_CTX: Maximum token limit for both embeddings and LLM models ``` -Then, download the 2 models and place them in a directory of your choice (Ensure to update your .env with the model paths): -- LLM: default to [ggml-gpt4all-j-v1.3-groovy.bin](https://gpt4all.io/models/ggml-gpt4all-j-v1.3-groovy.bin). If you prefer a different GPT4All-J compatible model, just download it and reference it in your `.env` file. -- Embedding: default to [ggml-model-q4_0.bin](https://huggingface.co/Pi3141/alpaca-native-7B-ggml/resolve/397e872bf4c83f4c642317a5bf65ce84a105786e/ggml-model-q4_0.bin). If you prefer a different compatible Embeddings model, just download it and reference it in your `.env` file. +Note: because of the way `langchain` loads the `LLAMMA` embeddings, you need to specify the absolute path of your embeddings model binary. This means it will not work if you use a home directory shortcut (eg. `~/` or `$HOME/`). ## Test dataset This repo uses a [state of the union transcript](https://github.com/imartinez/privateGPT/blob/main/source_documents/state_of_the_union.txt) as an example.