private-gpt

Commit Graph

Author	SHA1	Message	Date
jiangzhuo	e3b769d33a	Optimize load_documents function with multiprocessing	2023-05-20 11:16:13 +02:00
MDW	04f6706bbb	Make scripts executable, add basic pre-commit setup	2023-05-20 11:15:58 +02:00
MDW	4cda348cf8	Fix #294 (tested)	2023-05-19 16:23:09 +02:00
MDW	a862ff2be6	Add fallback for plain elm #294 #290	2023-05-19 01:04:42 +02:00
Iván Martínez	b9f8dc312f	Merge pull request #254 from Fabio3rs/formatOffice97-2003: Add .doc .ppt (Word and PowerPoint 97/2003 formats)	2023-05-18 23:49:40 +02:00
Fabio Rossini Sluzala	ec126b51d8	Fix loader mapping order	2023-05-17 22:38:30 -03:00
vilaca	79a3c00313	remove duplicate	2023-05-17 23:45:27 +01:00
Fabio Rossini Sluzala	66a9f9cde0	Add .doc .ppt (Word and PowerPoint 97/2003 formats)	2023-05-17 12:04:16 -03:00
Iván Martínez	bf3bddfbb6	More loaders, generic method - Update the README with extra formats - Add Powerpoint, requested in #138 - Add ePub requested in #138 comment - https://github.com/imartinez/privateGPT/pull/138#issuecomment-1549564535 - Update requirements	2023-05-17 00:55:21 +02:00
Iván Martínez	23d24c88e9	Update code to use sentence-transformers through huggingfaceembeddings	2023-05-17 00:32:41 +02:00
Andrea Pinto	d0aa57178a	ingest unlimited number of documents	2023-05-12 15:36:20 +02:00
Andrea Pinto	01f55441e7	fix persist db directory at ingestion	2023-05-12 10:37:10 +02:00
Sorin Neacsu	544ddd9631	load .env	2023-05-11 15:34:17 -07:00
alxspiker	f60dbb520e	Merge branch 'main' into main	2023-05-11 14:34:13 -06:00
alxspiker	52ae6c0866	.env + LlamaCpp + PDF/CSV + Ingest All. .env: Added an env file to make configuration easier. LlamaCpp: Added support for LlamaCpp in .env (MODEL_TYPE=LlamaCpp). PDF/CSV: Added support for PDF and CSV files. Ingest All: All files in source_documents will automatically get stored in vector store based on their file type when running ingest, no longer need a path argument.	2023-05-11 14:24:39 -06:00
R-Y-M-R	f12ea568e5	Use constants.py file	2023-05-11 10:29:07 -04:00
R-Y-M-R	8c6a81a07f	Fix: Disable Chroma Telemetry. Opts out of anonymized telemetry being tracked in Chroma. See: https://docs.trychroma.com/telemetry	2023-05-11 10:17:18 -04:00
Iván Martínez	026b9f895c	Use RecursiveCharacterTextSplitter to avoid llama_tokenize: too many tokens error during ingestion	2023-05-09 00:21:02 +02:00
Iván Martínez	92244a90b4	Use a different text splitter to improve results. Ingest takes an argument pointing to the doc to ingest.	2023-05-05 17:32:31 +02:00
Iván martínez	55338b8f6e	End-to-end working version	2023-05-02 20:32:28 +02:00

20 Commits