Use JS markdown formatter

(cherry picked from commit 3941016)
David Pilato 2014-05-28 15:24:39 +02:00
parent dafa7e764d
commit f068ef88a4
1 changed files with 70 additions and 58 deletions

README.md

@@ -24,36 +24,40 @@ ICU Normalization
 Normalizes characters as explained [here](http://userguide.icu-project.org/transforms/normalization). It registers itself by default under `icu_normalizer` or `icuNormalizer` using the default settings. Allows for the name parameter to be provided which can include the following values: `nfc`, `nfkc`, and `nfkc_cf`. Here is a sample settings:
-    {
-        "index" : {
-            "analysis" : {
-                "analyzer" : {
-                    "collation" : {
-                        "tokenizer" : "keyword",
-                        "filter" : ["icu_normalizer"]
-                    }
+```js
+{
+    "index" : {
+        "analysis" : {
+            "analyzer" : {
+                "collation" : {
+                    "tokenizer" : "keyword",
+                    "filter" : ["icu_normalizer"]
                 }
             }
         }
     }
+}
+```
 ICU Folding
 -----------
 Folding of unicode characters based on `UTR#30`. It registers itself under `icu_folding` and `icuFolding` names. Sample setting:
-    {
-        "index" : {
-            "analysis" : {
-                "analyzer" : {
-                    "collation" : {
-                        "tokenizer" : "keyword",
-                        "filter" : ["icu_folding"]
-                    }
+```js
+{
+    "index" : {
+        "analysis" : {
+            "analyzer" : {
+                "collation" : {
+                    "tokenizer" : "keyword",
+                    "filter" : ["icu_folding"]
                 }
             }
         }
     }
+}
+```
 ICU Filtering
 -------------
@@ -64,24 +68,26 @@ language is wanted. See syntax for the UnicodeSet [here](http://icu-project.org/
 The Following example exempts Swedish characters from the folding. Note that the filtered characters are NOT lowercased which is why we add that filter below.
-    {
-        "index" : {
-            "analysis" : {
-                "analyzer" : {
-                    "folding" : {
-                        "tokenizer" : "standard",
-                        "filter" : ["my_icu_folding", "lowercase"]
-                    }
+```js
+{
+    "index" : {
+        "analysis" : {
+            "analyzer" : {
+                "folding" : {
+                    "tokenizer" : "standard",
+                    "filter" : ["my_icu_folding", "lowercase"]
                 }
-                "filter" : {
-                    "my_icu_folding" : {
-                        "type" : "icu_folding"
-                        "unicodeSetFilter" : "[^åäöÅÄÖ]"
-                    }
+            }
+            "filter" : {
+                "my_icu_folding" : {
+                    "type" : "icu_folding"
+                    "unicodeSetFilter" : "[^åäöÅÄÖ]"
                 }
             }
         }
     }
+}
+```
 ICU Collation
 -------------
@@ -94,39 +100,43 @@ Uses collation token filter. Allows to either specify the rules for collation
 Here is a sample settings:
-    {
-        "index" : {
-            "analysis" : {
-                "analyzer" : {
-                    "collation" : {
-                        "tokenizer" : "keyword",
-                        "filter" : ["icu_collation"]
-                    }
+```js
+{
+    "index" : {
+        "analysis" : {
+            "analyzer" : {
+                "collation" : {
+                    "tokenizer" : "keyword",
+                    "filter" : ["icu_collation"]
                 }
             }
         }
     }
+}
+```
 And here is a sample of custom collation:
-    {
-        "index" : {
-            "analysis" : {
-                "analyzer" : {
-                    "collation" : {
-                        "tokenizer" : "keyword",
-                        "filter" : ["myCollator"]
-                    }
-                },
-                "filter" : {
-                    "myCollator" : {
-                        "type" : "icu_collation",
-                        "language" : "en"
-                    }
+```js
+{
+    "index" : {
+        "analysis" : {
+            "analyzer" : {
+                "collation" : {
+                    "tokenizer" : "keyword",
+                    "filter" : ["myCollator"]
+                }
+            },
+            "filter" : {
+                "myCollator" : {
+                    "type" : "icu_collation",
+                    "language" : "en"
                 }
             }
         }
     }
+}
+```
 Optional options:
 * `strength` - The strength property determines the minimum level of difference considered significant during comparison.
@@ -159,17 +169,19 @@ ICU Tokenizer
 Breaks text into words according to [UAX #29: Unicode Text Segmentation](http://www.unicode.org/reports/tr29/).
-    {
-        "index" : {
-            "analysis" : {
-                "analyzer" : {
-                    "collation" : {
-                        "tokenizer" : "icu_tokenizer",
-                    }
+```js
+{
+    "index" : {
+        "analysis" : {
+            "analyzer" : {
+                "collation" : {
+                    "tokenizer" : "icu_tokenizer",
                 }
             }
         }
     }
+}
+```
 License