Rafael dos Santos Silva 9783e3b025
FEATURE: Add a basic tokenizer API (#37)
* FEATURE: Add a basic tokenizer API

* Add tests

* lint
2023-04-19 11:55:59 -03:00

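# frozen_string_literal: true

module DiscourseAi
  class Tokenizer
    # Lazily loads the BERT tokenizer definition once and memoizes it.
    # The path is relative, so it resolves against the process's working directory.
    def self.tokenizer
      @@tokenizer ||=
        Tokenizers.from_file("./plugins/discourse-ai/tokenizers/bert-base-uncased.json")
    end

    # Returns the number of tokens the tokenizer produces for +text+.
    # The count includes any special tokens the tokenizer adds during encoding.
    def self.size(text)
      tokenizer.encode(text).tokens.size
    end
  end
end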
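
A minimal sketch of how a caller might use a `size`-style API to check text against a token budget. The names `WhitespaceTokenizer` and `fits_token_budget?` are hypothetical and not part of the plugin. So that the sketch runs without the tokenizers gem or the JSON file, it substitutes a whitespace-splitting stand-in for the real tokenizer, so its counts will differ from the BERT counts returned by `DiscourseAi::Tokenizer.size`.

```ruby
# Hypothetical stand-in for DiscourseAi::Tokenizer, used only so this sketch
# runs without the tokenizers gem. It splits on whitespace, so its counts
# will NOT match real BERT WordPiece token counts.
module WhitespaceTokenizer
  def self.size(text)
    text.split.size
  end
end

# Hypothetical helper: true when the text fits within max_tokens.
# Any object that responds to .size(text) can be passed as the tokenizer,
# which is the same interface DiscourseAi::Tokenizer exposes.
def fits_token_budget?(text, max_tokens, tokenizer: WhitespaceTokenizer)
  tokenizer.size(text) <= max_tokens
end

puts fits_token_budget?("hello tokenizer world", 3)
puts fits_token_budget?("hello tokenizer world", 2)
```

Inside the plugin, the same helper could presumably be called with `tokenizer: DiscourseAi::Tokenizer`, assuming the tokenizers gem and the JSON file are available at runtime.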