Merge pull request #53 from Gasol/icu_transform_doc

Update documentation for ICU Transform
2015-06-01 09:08:25 +02:00 · 2015-06-01 09:08:25 +02:00 · 23b6847b5c
parent 97e6016137 2aea018feb
commit 23b6847b5c
1 changed files with 46 additions and 0 deletions
--- a/README.md
+++ b/README.md
@@ -224,6 +224,52 @@ Here is a sample settings:
 }
 ```
 ICU Transform
 -------------
 Transforms process Unicode text in many different ways. Examples include case mapping, normalization,
 transliteration, and bidirectional text handling.
 You can define the transliterator identifier with the `id` property and set the direction to `forward` or `reverse` with
 the `dir` property. The defaults are `Null` for `id` and `forward` for `dir`.
 For example:
 ```js
 {
    "index" : {
        "analysis" : {
            "analyzer" : {
                "latin" : {
                    "tokenizer" : "keyword",
                    "filter" : ["myLatinTransform"]
                }
            },
            "filter" : {
                "myLatinTransform" : {
                    "type" : "icu_transform",
                    "id" : "Any-Latin; NFD; [:Nonspacing Mark:] Remove; NFC"
                }
            }
        }
    }
 }
 ```
 This transform transliterates characters to Latin, separates accents from their base characters, removes the accents,
 and then recomposes the remaining text into an unaccented form.
 The results are:
 `你好` to `ni hao`
 `здравствуйте` to `zdravstvujte`
 `こんにちは` to `kon'nichiha`
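 The last three steps of the identifier above (`NFD; [:Nonspacing Mark:] Remove; NFC`) can be illustrated outside of
 Elasticsearch. The following is only a minimal sketch using Python's standard `unicodedata` module, not the plugin's
 implementation. It does not perform the `Any-Latin` step, and the helper name `strip_accents` is hypothetical:

```python
import unicodedata


def strip_accents(text: str) -> str:
    """Approximate 'NFD; [:Nonspacing Mark:] Remove; NFC' (hypothetical helper)."""
    # NFD: split each accented character into base character + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Remove: drop every character in the Nonspacing Mark (Mn) general category.
    without_marks = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # NFC: recompose whatever remains.
    return unicodedata.normalize("NFC", without_marks)


# "nǐ hǎo" is assumed here to be the intermediate output of Any-Latin for 你好.
print(strip_accents("nǐ hǎo"))  # prints "ni hao"
```

 This shows why `你好` ends up as `ni hao` rather than as accented pinyin: the tone marks are removed after
 transliteration.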
 Currently the filter supports only the identifier and direction; custom rulesets are not yet supported.
 For more documentation, see the [user guide of ICU Transform](http://userguide.icu-project.org/transforms/general).
 License
 -------