FIX: Disable truncation and padding in all-mpnet-base-v2 tokenizer (#105)
The tokenizer was truncating and padding its output to 128 tokens, while we try to append new post content until we hit 384 tokens. Because the reported token count could never exceed 128, the 384-token limit was never reached, so every post in a topic was accepted, wasting CPU and memory.
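The failure mode can be reproduced with a minimal sketch. Everything here is an assumption made for illustration: `tokenize` is a hypothetical whitespace tokenizer standing in for the real all-mpnet-base-v2 tokenizer, and `collect_posts` is a guess at the shape of the append loop, not the project's actual code.

```python
# Hypothetical stand-in for the real tokenizer: splits on whitespace.
def tokenize(text, fixed_length=None):
    tokens = text.split()
    if fixed_length is not None:
        # Truncate, then pad: the result always has exactly fixed_length tokens.
        tokens = tokens[:fixed_length]
        tokens += ["[PAD]"] * (fixed_length - len(tokens))
    return tokens


# Hypothetical append loop: keep adding posts until the token limit is hit.
def collect_posts(posts, max_tokens=384, fixed_length=None):
    text = ""
    accepted = 0
    for post in posts:
        candidate = (text + " " + post).strip()
        if len(tokenize(candidate, fixed_length)) >= max_tokens:
            break  # limit reached, stop appending
        text = candidate
        accepted += 1
    return accepted


posts = ["word " * 100] * 10  # ten posts of 100 tokens each

# With truncation and padding to 128, the count is always 128, so the
# 384-token check never fires and every post is accepted.
accepted_with_truncation = collect_posts(posts, fixed_length=128)

# Without truncation and padding, the loop stops once the limit is reached.
accepted_without = collect_posts(posts)

print(accepted_with_truncation, accepted_without)
```

In this sketch the fixed-length tokenizer accepts all ten posts, while the unrestricted one stops after three, which matches the behavior the commit message describes.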
parent 703762a7a9
commit d692ecc7de