Fix doc

2014-01-10 23:30:05 +01:00 · 2014-01-10 23:30:05 +01:00 · f88f5e2fb7
parent 4c935133e3
commit f88f5e2fb7
1 changed file with 11 additions and 5 deletions
--- a/README.md
+++ b/README.md
@@ -28,7 +28,7 @@ In order to install the plugin, simply run: `bin/plugin -install elasticsearch/e
 ICU Normalization
 -----------------

-Normalizes characters as explained "here":http://userguide.icu-project.org/transforms/normalization. It registers itself by default under `icu_normalizer` or `icuNormalizer` using the default settings. Allows for the name parameter to be provided which can include the following values: `nfc`, `nfkc`, and `nfkc_cf`. Here is a sample settings:
+Normalizes characters as explained [here](http://userguide.icu-project.org/transforms/normalization). It registers itself by default under `icu_normalizer` or `icuNormalizer` using the default settings. Allows for the name parameter to be provided which can include the following values: `nfc`, `nfkc`, and `nfkc_cf`. Here is a sample settings:

    {
        "index" : {
@@ -64,7 +64,9 @@ Folding of unicode characters based on `UTR#30`. It registers itself under `icu_
 ICU Filtering
 -------------

-The folding can be filtered by a set of unicode characters with the parameter `unicodeSetFilter`. This is useful for a non-internationalized search engine where retaining a set of national characters which are primary letters in a specific language is wanted. See syntax for the UnicodeSet "here":http://icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html.
+The folding can be filtered by a set of unicode characters with the parameter `unicodeSetFilter`. This is useful for a
+non-internationalized search engine where retaining a set of national characters which are primary letters in a specific
+language is wanted. See syntax for the UnicodeSet [here](http://icu-project.org/apiref/icu4j/com/ibm/icu/text/UnicodeSet.html).

 The Following example exempts Swedish characters from the folding. Note that the filtered characters are NOT lowercased which is why we add that filter below.

@@ -90,7 +92,11 @@ The Following example exempts Swedish characters from the folding. Note that the
 ICU Collation
 -------------

-Uses collation token filter. Allows to either specify the rules for collation (defined "here":http://www.icu-project.org/userguide/Collate_Customization.html) using the `rules` parameter (can point to a location or expressed in the settings, location can be relative to config location), or using the `language` parameter (further specialized by country and variant). By default registers under `icu_collation` or `icuCollation` and uses the default locale.
+Uses collation token filter. Allows to either specify the rules for collation
+(defined [here](http://www.icu-project.org/userguide/Collate_Customization.html)) using the `rules` parameter
+(can point to a location or expressed in the settings, location can be relative to config location), or using the
+`language` parameter (further specialized by country and variant). By default registers under `icu_collation` or
+`icuCollation` and uses the default locale.

 Here is a sample settings:

@@ -132,7 +138,7 @@ Optional options:
 * `strength` - The strength property determines the minimum level of difference considered significant during comparison.
 The default strength for the Collator is `tertiary`, unless specified otherwise by the locale used to create the Collator.
 Possible values: `primary`, `secondary`, `tertiary`, `quaternary` or `identical`.
- See ICU Collation:http://icu-project.org/apiref/icu4j/com/ibm/icu/text/Collator.html documentation for a more detailed
+ See [ICU Collation](http://icu-project.org/apiref/icu4j/com/ibm/icu/text/Collator.html) documentation for a more detailed
 explanation for the specific values.
 * `decomposition` - Possible values: `no` or `canonical`. Defaults to `no`. Setting this decomposition property with
 `canonical` allows the Collator to handle un-normalized text properly, producing the same results as if the text were
@@ -157,7 +163,7 @@ Expert options:
 ICU Tokenizer
 -------------

-Breaks text into words according to UAX #29: Unicode Text Segmentation ((http://www.unicode.org/reports/tr29/)).
+Breaks text into words according to [UAX #29: Unicode Text Segmentation](http://www.unicode.org/reports/tr29/).

    {
        "index" : {