diff --git a/docs/reference/analysis/tokenfilters/cjk-bigram-tokenfilter.asciidoc b/docs/reference/analysis/tokenfilters/cjk-bigram-tokenfilter.asciidoc
index 4805d3dc950..f6f7d5794f2 100644
--- a/docs/reference/analysis/tokenfilters/cjk-bigram-tokenfilter.asciidoc
+++ b/docs/reference/analysis/tokenfilters/cjk-bigram-tokenfilter.asciidoc
@@ -3,7 +3,7 @@
 
 The `cjk_bigram` token filter forms bigrams out of the CJK
 terms that are generated by the <>
-or the `icu_tokenizer` (see <>).
+or the `icu_tokenizer` (see <>).
 
 By default, when a CJK character has no adjacent characters to form a bigram,
 it is output in unigram form. If you always want to output both unigrams and
diff --git a/docs/reference/analysis/tokenfilters/cjk-width-tokenfilter.asciidoc b/docs/reference/analysis/tokenfilters/cjk-width-tokenfilter.asciidoc
index 11bdf0f77dc..4f5d55d4de1 100644
--- a/docs/reference/analysis/tokenfilters/cjk-width-tokenfilter.asciidoc
+++ b/docs/reference/analysis/tokenfilters/cjk-width-tokenfilter.asciidoc
@@ -7,6 +7,6 @@ The `cjk_width` token filter normalizes CJK width differences:
 * Folds halfwidth Katakana variants into the equivalent Kana
 
 NOTE: This token filter can be viewed as a subset of NFKC/NFKD
-Unicode normalization. See the <>
+Unicode normalization. See the <>
 for full normalization support.