Commit Graph

4 Commits

Author SHA1 Message Date
Rafael dos Santos Silva 739b314312
Fixes for embeddings and truncate (#67) 2023-05-18 09:21:28 +10:00
Rafael dos Santos Silva 3c9513e754
Refinements to embeddings and tokenizers (#61)
* Refinements to embeddings and tokenizers

* lint

* Truncate with tokenizers for summary

* fix
2023-05-15 15:10:42 -03:00
Sam e76fc77189
fixes (#53)
* Minor... use username suggester in case username already exists

* FIX: ensure we truncate long prompts

Previously we

1. Used raw length instead of token counts for counting length
2. We totally dropped a prompt if it was too long

New implementation will truncate "raw" if it gets too long maintaining
meaning.
2023-05-06 07:31:53 -03:00
Rafael dos Santos Silva 9783e3b025
FEATURE: Add a basic tokenizer API (#37)
* FEATURE: Add a basic tokenizer API

* Add tests

* lint
2023-04-19 11:55:59 -03:00