discourse-ai/spec/shared
Rafael dos Santos Silva 3b8f900486
FIX: Handle unicode on tokenizer (#515)
* FIX: Handle unicode on tokenizer

Our fast-track code broke when strings contained characters that are longer in tokens than
in UTF-8.

Admins can set `DISCOURSE_AI_STRICT_TOKEN_COUNTING: true` in app.yml to ensure token counting is strict, even if slower.
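
The failure mode described above can be illustrated with a minimal sketch. It is not the plugin's actual code: `ToyByteTokenizer`, `truncate`, and the `strict:` keyword are hypothetical names, and the toy tokenizer simply emits one token per UTF-8 byte so that a single character can cost several tokens. The fast track skips tokenization whenever the character count is below the token limit, which is only safe if no character ever needs more than one token.

```ruby
# Hypothetical stand-in for a real tokenizer: one token per UTF-8 byte.
module ToyByteTokenizer
  def self.encode(text)
    text.bytes
  end

  def self.decode(tokens)
    # Rebuild the string from bytes and drop any partially cut character.
    tokens.pack("C*").force_encoding(Encoding::UTF_8).scrub("")
  end
end

# Illustrative truncation helper (assumed name and signature).
# With strict: false it takes the fast track and trusts the character count.
def truncate(text, max_tokens, strict: false)
  return text if !strict && text.size < max_tokens

  ToyByteTokenizer.decode(ToyByteTokenizer.encode(text).take(max_tokens))
end

text = "\u{1F600}" * 3 # 3 characters, but 12 tokens for this toy tokenizer

fast = truncate(text, 4)                 # fast track returns the text untouched
strict = truncate(text, 4, strict: true) # always tokenizes before deciding

puts ToyByteTokenizer.encode(fast).size   # 12, well over the limit of 4
puts ToyByteTokenizer.encode(strict).size # 4
```

Strict counting pays for a full tokenization on every call, which is the speed trade-off the setting above refers to.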


Co-authored-by: wozulong <sidle.pax_0e@icloud.com>
2024-03-14 17:33:30 -03:00
inference FEATURE: add support for new OpenAI embedding models (#445) 2024-01-29 13:24:30 -03:00
chat_message_classificator_spec.rb DEV: Fix new Rubocop offenses 2024-03-06 15:23:29 +01:00
classificator_spec.rb DEV: DiscourseAI -> DiscourseAi rename to have consistent folders and files (#9) 2023-03-14 16:03:50 -03:00
post_classificator_spec.rb DEV: Fix new Rubocop offenses 2024-03-06 15:23:29 +01:00
tokenizer_spec.rb FIX: Handle unicode on tokenizer (#515) 2024-03-14 17:33:30 -03:00