chore: add linux instructions and C++ guide (#1082)
* fix: add linux instructions (Co-authored-by: BW-Projects)
* chore: Add C++ as a base requirement in the docs
* chore: Add clang for OSX
* chore: Update docs for OSX and gcc
* chore: make docs
---------
Co-authored-by: Pablo Orgaz <pablo@Pablos-MacBook-Pro.local>
parent 97d860a7c9
commit b46c1087e2

@@ -21,6 +21,7 @@ The API is divided in two logical blocks:
> watch, etc.

## Quick Local Installation steps

The steps in the `Installation and Settings` section are explained in more detail and cover more
setup scenarios, but if you are looking for a quick setup guide, here it is:

@@ -53,16 +54,17 @@ being used
http://localhost:8001/
```

## Installation and Settings

### Base requirements to run PrivateGPT

* Git clone the PrivateGPT repository and navigate into it:

```
  git clone https://github.com/imartinez/privateGPT
  cd privateGPT
```

* Install Python 3.11, ideally through a Python version manager like `pyenv`.
  Python 3.12 should work too. Earlier Python versions are not supported.

@@ -73,8 +75,11 @@ http://localhost:8001/
pyenv install 3.11
pyenv local 3.11
```

* Install [Poetry](https://python-poetry.org/docs/#installing-with-the-official-installer) for dependency management:

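For example, with the official installer that the linked Poetry docs describe (a sketch; pinning a specific Poetry version is optional):

```
curl -sSL https://install.python-poetry.org | python3 -
```
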
* Have a valid C++ compiler such as gcc. See [Troubleshooting: C++ Compiler](#troubleshooting-c-compiler) for more details, and the quick check below.

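A quick way to confirm a working compiler (a sketch; on OSX `gcc` is often an alias for `clang`, which also works):

```
gcc --version
```
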
* Install `make` for scripts:
    * OSX (using Homebrew): `brew install make`
    * Windows (using Chocolatey): `choco install make`

@@ -178,9 +183,9 @@ metal support. To do that run:
CMAKE_ARGS="-DLLAMA_METAL=on" pip install --force-reinstall --no-cache-dir llama-cpp-python
```

#### Windows NVIDIA GPU support

Windows GPU support is done through CUDA.
Follow the instructions on the original [llama.cpp](https://github.com/ggerganov/llama.cpp) repo to install the required
dependencies.

@@ -188,6 +193,8 @@ Some tips to get it working with an NVIDIA card and CUDA (Tested on Windows 10 w

* Install the latest VS2022 (and build tools): https://visualstudio.microsoft.com/vs/community/
* Install the CUDA toolkit: https://developer.nvidia.com/cuda-downloads
* Verify your installation is correct by running `nvcc --version` and `nvidia-smi` (see the snippet after this list);
  ensure your CUDA version is up to date and your GPU is detected.
* [Optional] Install CMake to troubleshoot building issues by compiling llama.cpp directly: https://cmake.org/download/

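A minimal verification, assuming the CUDA toolkit install put both tools on your PATH:

```
nvcc --version
nvidia-smi
```
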
If you have all required dependencies properly configured, running the

@@ -209,9 +216,33 @@ Note that llama.cpp offloads matrix calculations to the GPU but the performance
is still hit heavily due to latency between CPU and GPU communication. You might need to tweak
batch sizes and other parameters to get the best performance for your particular system.

#### Linux NVIDIA GPU support and Windows-WSL

Linux GPU support is done through CUDA.
Follow the instructions on the original [llama.cpp](https://github.com/ggerganov/llama.cpp) repo to install the required
external dependencies.

Some tips:

* Make sure you have an up-to-date C++ compiler.
* Install the CUDA toolkit: https://developer.nvidia.com/cuda-downloads
* Verify your installation is correct by running `nvcc --version` and `nvidia-smi`;
  ensure your CUDA version is up to date and your GPU is detected.

After that, running the following command in the repository will install llama.cpp with GPU support:

```
CMAKE_ARGS='-DLLAMA_CUBLAS=on' poetry run pip install --force-reinstall --no-cache-dir llama-cpp-python
```

If your installation was correct, you should see a message similar to the following next
time you start the server, with `BLAS = 1` indicating GPU support:

```
llama_new_context_with_model: total VRAM used: 4857.93 MB (model: 4095.05 MB, context: 762.87 MB)
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 0 | VSX = 0 |
```

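If the startup log scrolls by too fast, one way to isolate that line (a sketch; it assumes the server is launched with `make run`, so adjust to however you start PrivateGPT):

```
make run 2>&1 | grep "BLAS"
```
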
#### Known issues and Troubleshooting

@@ -226,7 +257,9 @@ You might encounter several issues:
  If you encounter any of these issues, please open an issue and we'll try to help.

#### Troubleshooting: C++ Compiler

If you encounter an error while building a wheel during the `pip install` process, you may need to install a C++
compiler on your computer.

**For Windows 10/11**

@@ -239,8 +272,15 @@ To install a C++ compiler on Windows 10/11, follow these steps:
3. Download the MinGW installer from the [MinGW website](https://sourceforge.net/projects/mingw/).
4. Run the installer and select the `gcc` component.

**For OSX**

1. Check if you already have a C++ compiler installed; Xcode may have set one up for you. For example, try running `gcc` in a terminal.
2. If not, you can install clang or gcc with Homebrew: `brew install gcc` (see the sketch below).

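A minimal sketch of that check-then-install flow (assumes Homebrew is already installed; `xcode-select --install` is Apple's standard way to get the command line tools):

```
gcc --version || xcode-select --install
brew install gcc
```
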
#### Troubleshooting: Mac Running Intel

When running a Mac with Intel hardware (not M1), you may run into _clang: error: the clang compiler does not support '-march=native'_ during pip install.

If so, set your archflags during pip install. E.g.: _ARCHFLAGS="-arch x86_64" pip3 install -r requirements.txt_
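
The same flag can be combined with the reinstall command used earlier in this guide (a sketch; whether `--force-reinstall` is needed depends on what is already in your environment):

```
ARCHFLAGS="-arch x86_64" pip install --force-reinstall --no-cache-dir llama-cpp-python
```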

@@ -313,6 +353,7 @@ Gradio UI is a ready to use way of testing most of PrivateGPT API functionalities

### Execution Modes

It has 3 modes of execution (you can select in the top-left):

* Query Documents: uses the context from the
  ingested documents to answer the questions posted in the chat. It also takes
  into account previous chat messages as context.

@@ -360,6 +401,7 @@ basic logging (for example ingestion progress or LLM prompts and answers).
🚧 Document Update and Delete are still WIP. 🚧

The ingestion of documents can be done in different ways:

* Using the `/ingest` API (see the sketch after this list)
* Using the Gradio UI
* Using the Bulk Local Ingestion functionality (check next section)
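
For the API route, a hypothetical `curl` upload (the `/v1/ingest` path and multipart `file` field are assumptions; confirm the exact route in the interactive API docs the server exposes):

```
curl -F "file=@./my-document.pdf" http://localhost:8001/v1/ingest
```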

File diff suppressed because one or more lines are too long