[DOCS] Updated ICU-Plugin docs from the repo README
This commit is contained in:
parent
b0fee6c01b
commit
ea05f4538c
|
@ -39,7 +39,7 @@ Here is a sample settings:
|
|||
=== ICU Folding
|
||||
|
||||
Folding of unicode characters based on `UTR#30`. It registers itself
|
||||
under `icu_folding` and `icuFolding` names.
|
||||
under `icu_folding` and `icuFolding` names.
|
||||
The filter also does lowercasing, which means the lowercase filter can
|
||||
normally be left out. Sample setting:
|
||||
|
||||
|
@ -70,7 +70,7 @@ primary letters in a specific language is wanted. See syntax for the
|
|||
UnicodeSet
|
||||
http://icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html[here].
|
||||
|
||||
The Following example excempt Swedish characters from the folding. Note
|
||||
The Following example exempts Swedish characters from the folding. Note
|
||||
that the filtered characters are NOT lowercased which is why we add that
|
||||
filter below.
|
||||
|
||||
|
@ -148,5 +148,73 @@ And here is a sample of custom collation:
|
|||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
||||
[float]
|
||||
==== Options
|
||||
|
||||
[horizontal]
|
||||
`strength`::
|
||||
The strength property determines the minimum level of difference considered significant during comparison.
|
||||
The default strength for the Collator is `tertiary`, unless specified otherwise by the locale used to create the Collator.
|
||||
Possible values: `primary`, `secondary`, `tertiary`, `quaternary` or `identical`.
|
||||
+
|
||||
See http://icu-project.org/apiref/icu4j/com/ibm/icu/text/Collator.html[ICU Collation] documentation for a more detailed
|
||||
explanation of the specific values.
|
||||
|
||||
`decomposition`::
|
||||
Possible values: `no` or `canonical`. Defaults to `no`. Setting this decomposition property to
|
||||
`canonical` allows the Collator to handle un-normalized text properly, producing the same results as if the text were
|
||||
normalized. If `no` is set, it is the user's responsibility to ensure that all text is already in the appropriate form
|
||||
before a comparison or before getting a CollationKey. Adjusting decomposition mode allows the user to select between
|
||||
faster and more complete collation behavior. Since a great many of the world's languages do not require text
|
||||
normalization, most locales set `no` as the default decomposition mode.
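A minimal sketch of how these two options might be set on a collation filter. The analyzer name `collation`, the filter name `myCollator`, the `keyword` tokenizer and the `language` value are illustrative assumptions, not values prescribed by this section. Only `strength` and `decomposition` come from the list above:

[source,js]
--------------------------------------------------
{
    "index" : {
        "analysis" : {
            "analyzer" : {
                "collation" : {
                    "tokenizer" : "keyword",
                    "filter" : ["myCollator"]
                }
            },
            "filter" : {
                "myCollator" : {
                    "type" : "icu_collation",
                    "language" : "en",
                    "strength" : "primary",
                    "decomposition" : "canonical"
                }
            }
        }
    }
}
--------------------------------------------------

With `primary` strength only base-letter differences are expected to matter, so this configuration trades precision for more lenient matching and sorting.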
|
||||
|
||||
[float]
|
||||
==== Expert options
|
||||
|
||||
[horizontal]
|
||||
`alternate`::
|
||||
Possible values: `shifted` or `non-ignorable`. Sets the alternate handling for strength `quaternary`
|
||||
to be either shifted or non-ignorable. With `shifted`, punctuation and whitespace are effectively ignored.
|
||||
|
||||
`caseLevel`::
|
||||
Possible values: `true` or `false`. Default is `false`. Whether case level sorting is required. When
|
||||
strength is set to `primary`, enabling this ignores accent differences but still distinguishes case.
|
||||
|
||||
`caseFirst`::
|
||||
Possible values: `lower` or `upper`. Useful to control which case is sorted first when case is not ignored
|
||||
for strength `tertiary`.
|
||||
|
||||
`numeric`::
|
||||
Possible values: `true` or `false`. Whether digits are sorted according to numeric representation. For
|
||||
example, with `true` the value `egg-9` is sorted before the value `egg-21`. Defaults to `false`.
|
||||
|
||||
`variableTop`::
|
||||
Single character or contraction. Controls what is variable for `alternate`.
|
||||
|
||||
`hiraganaQuaternaryMode`::
|
||||
Possible values: `true` or `false`. Defaults to `false`. Whether to distinguish between Katakana and
|
||||
Hiragana characters at `quaternary` strength.
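As a rough illustration, the expert options could be combined in a single filter definition as sketched here. The filter name `myNumericCollator` and the `language` value are hypothetical. The option names and values are the ones listed above:

[source,js]
--------------------------------------------------
{
    "index" : {
        "analysis" : {
            "filter" : {
                "myNumericCollator" : {
                    "type" : "icu_collation",
                    "language" : "en",
                    "strength" : "quaternary",
                    "alternate" : "shifted",
                    "numeric" : true
                }
            }
        }
    }
}
--------------------------------------------------

Under these assumptions, `egg-9` should sort before `egg-21` because of `numeric`, and punctuation and whitespace should only matter at the `quaternary` level because of `alternate`.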
|
||||
|
||||
[float]
|
||||
=== ICU Tokenizer
|
||||
|
||||
Breaks text into words according to http://www.unicode.org/reports/tr29/[UAX #29: Unicode Text Segmentation].
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
{
    "index" : {
        "analysis" : {
            "analyzer" : {
                "collation" : {
                    "tokenizer" : "icu_tokenizer"
                }
            }
        }
    }
}
|
||||
--------------------------------------------------
|
||||
|
||||
|
|