discourse-ai/lib/tokenizer
Sam f6ac5cd0a8
FEATURE: allow tuning of RAG generation (#565)
* FEATURE: allow tuning of RAG generation

- change chunking to be token-based rather than character-based (which is more accurate)
- allow control over overlap, tokens per chunk, and the number of conversation snippets inserted
- UI to control new settings

* improve ui a bit

* fix various reindex issues

* reduce concurrency

* try ultra low queue ... concurrency 1 is too slow.
2024-04-12 10:32:46 -03:00
all_mpnet_base_v2_tokenizer.rb DEV: port directory structure to Zeitwerk (#319) 2023-11-29 15:17:46 +11:00
anthropic_tokenizer.rb DEV: port directory structure to Zeitwerk (#319) 2023-11-29 15:17:46 +11:00
basic_tokenizer.rb FEATURE: allow tuning of RAG generation (#565) 2024-04-12 10:32:46 -03:00
bert_tokenizer.rb DEV: port directory structure to Zeitwerk (#319) 2023-11-29 15:17:46 +11:00
bge_large_en_tokenizer.rb DEV: port directory structure to Zeitwerk (#319) 2023-11-29 15:17:46 +11:00
bge_m3_tokenizer.rb FEATURE: Add BGE-M3 embeddings support (#569) 2024-04-10 17:24:01 -03:00
llama2_tokenizer.rb DEV: port directory structure to Zeitwerk (#319) 2023-11-29 15:17:46 +11:00
mixtral_tokenizer.rb Mixtral (#376) 2023-12-26 14:49:55 -03:00
multilingual_e5_large_tokenizer.rb DEV: port directory structure to Zeitwerk (#319) 2023-11-29 15:17:46 +11:00
open_ai_tokenizer.rb FEATURE: allow tuning of RAG generation (#565) 2024-04-12 10:32:46 -03:00
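
The latest commit above (#565) describes moving RAG chunking from character counts to token counts, with tunable tokens per chunk and overlap. The following is a minimal sketch of that idea, not the plugin's actual implementation. `WhitespaceTokenizer`, `chunk_by_tokens`, and the `chunk_tokens` / `overlap_tokens` parameter names are hypothetical and used only for illustration. The real tokenizers in this directory wrap model-specific vocabularies, and their interfaces may differ.

```ruby
# Hypothetical stand-in tokenizer: one token per whitespace-separated word.
# It exists only so the sketch runs without external dependencies.
module WhitespaceTokenizer
  def self.encode(text)
    text.split
  end

  def self.decode(tokens)
    tokens.join(" ")
  end
end

# Split text into chunks of at most `chunk_tokens` tokens. Each chunk after
# the first starts with the last `overlap_tokens` tokens of the previous one.
def chunk_by_tokens(text, tokenizer:, chunk_tokens:, overlap_tokens:)
  raise ArgumentError, "chunk_tokens must be positive" unless chunk_tokens.positive?
  unless (0...chunk_tokens).cover?(overlap_tokens)
    raise ArgumentError, "overlap_tokens must be in 0...chunk_tokens"
  end

  tokens = tokenizer.encode(text)
  step = chunk_tokens - overlap_tokens # how far the window advances each time
  chunks = []
  start = 0

  while start < tokens.length
    chunks << tokenizer.decode(tokens[start, chunk_tokens])
    # Stop once the current window has reached the end of the token list,
    # so no trailing chunk consists purely of already-seen overlap.
    break if start + chunk_tokens >= tokens.length
    start += step
  end

  chunks
end

chunks = chunk_by_tokens(
  "a b c d e f g h i j",
  tokenizer: WhitespaceTokenizer,
  chunk_tokens: 4,
  overlap_tokens: 1
)
puts chunks.inspect
# => ["a b c d", "d e f g", "g h i j"]
```

Counting tokens rather than characters keeps each chunk within a model's token limit regardless of how many characters each token spans, and the overlap preserves some context across chunk boundaries.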