[DOCS] Fix awkward wording in kuromoji analyzer docs

James Rodewig 2020-07-30 08:53:15 -04:00
parent 970a0c8957
commit b3da548dab
1 changed file with 2 additions and 3 deletions

@@ -33,9 +33,8 @@ characters, such as `ｏ` and `ｆ`. If a text contains full-width characters,
 the tokenizer can produce unexpected tokens.
 For example, the `kuromoji_tokenizer` tokenizer converts the text
-`Ｃｕｌｔｕｒｅ　ｏｆ　Ｊａｐａｎ` to the tokens `[ culture, o, f, japan ]` by
-default. However, a user may expect the tokenizer to instead produce
-`[ culture, of, japan ]`.
+`Ｃｕｌｔｕｒｅ　ｏｆ　Ｊａｐａｎ` to the tokens `[ culture, o, f, japan ]`
+instead of `[ culture, of, japan ]`.
 To avoid this, add the <<analysis-icu-normalization-charfilter,`icu_normalizer`
 character filter>> to a custom analyzer based on the `kuromoji` analyzer. The