72 lines
2.2 KiB
Plaintext
72 lines
2.2 KiB
Plaintext
[[analysis]]
|
|
== Analysis Plugins
|
|
|
|
Analysis plugins extend Elasticsearch by adding new analyzers, tokenizers,
|
|
token filters, or character filters to Elasticsearch.
|
|
|
|
[float]
|
|
==== Core analysis plugins
|
|
|
|
The core analysis plugins are:
|
|
|
|
<<analysis-icu,ICU>>::
|
|
|
|
Adds extended Unicode support using the http://site.icu-project.org/[ICU]
|
|
libraries, including better analysis of Asian languages, Unicode
|
|
normalization, Unicode-aware case folding, collation support, and
|
|
transliteration.
|
|
|
|
<<analysis-kuromoji,Kuromoji>>::
|
|
|
|
Advanced analysis of Japanese using the http://www.atilika.org/[Kuromoji analyzer].
|
|
|
|
<<analysis-nori,Nori>>::
|
|
|
|
Morphological analysis of Korean using the Lucene Nori analyzer.
|
|
|
|
<<analysis-phonetic,Phonetic>>::
|
|
|
|
Analyzes tokens into their phonetic equivalent using Soundex, Metaphone,
|
|
Caverphone, and other codecs.
|
|
|
|
<<analysis-smartcn,SmartCN>>::
|
|
|
|
An analyzer for Chinese or mixed Chinese-English text. This analyzer uses
|
|
probabilistic knowledge to find the optimal word segmentation for Simplified
|
|
Chinese text. The text is first broken into sentences, then each sentence is
|
|
segmented into words.
|
|
|
|
<<analysis-stempel,Stempel>>::
|
|
|
|
Provides high quality stemming for Polish.
|
|
|
|
<<analysis-ukrainian,Ukrainian>>::
|
|
|
|
Provides stemming for Ukrainian.
|
|
|
|
[float]
|
|
==== Community contributed analysis plugins
|
|
|
|
A number of analysis plugins have been contributed by our community:
|
|
|
|
* https://github.com/medcl/elasticsearch-analysis-ik[IK Analysis Plugin] (by Medcl)
|
|
* https://github.com/medcl/elasticsearch-analysis-pinyin[Pinyin Analysis Plugin] (by Medcl)
|
|
* https://github.com/duydo/elasticsearch-analysis-vietnamese[Vietnamese Analysis Plugin] (by Duy Do)
|
|
* https://github.com/ofir123/elasticsearch-network-analysis[Network Addresses Analysis Plugin] (by Ofir123)
|
|
* https://github.com/ZarHenry96/elasticsearch-dandelion-plugin[Dandelion Analysis Plugin] (by ZarHenry96)
|
|
* https://github.com/medcl/elasticsearch-analysis-stconvert[STConvert Analysis Plugin] (by Medcl)
|
|
|
|
include::analysis-icu.asciidoc[]
|
|
|
|
include::analysis-kuromoji.asciidoc[]
|
|
|
|
include::analysis-nori.asciidoc[]
|
|
|
|
include::analysis-phonetic.asciidoc[]
|
|
|
|
include::analysis-smartcn.asciidoc[]
|
|
|
|
include::analysis-stempel.asciidoc[]
|
|
|
|
include::analysis-ukrainian.asciidoc[]
|