2015-08-15 12:00:55 -04:00
|
|
|
[[analysis]]
|
|
|
|
== Analysis Plugins
|
|
|
|
|
|
|
|
Analysis plugins extend Elasticsearch by adding new analyzers, tokenizers,
|
|
|
|
token filters, or character filters to Elasticsearch.
|
|
|
|
|
|
|
|
[float]
|
|
|
|
==== Core analysis plugins
|
|
|
|
|
|
|
|
The core analysis plugins are:
|
|
|
|
|
|
|
|
<<analysis-icu,ICU>>::
|
|
|
|
|
|
|
|
Adds extended Unicode support using the http://site.icu-project.org/[ICU]
|
|
|
|
libraries, including better analysis of Asian languages, Unicode
|
|
|
|
normalization, Unicode-aware case folding, collation support, and
|
|
|
|
transliteration.
|
|
|
|
|
|
|
|
<<analysis-kuromoji,Kuromoji>>::
|
|
|
|
|
|
|
|
Advanced analysis of Japanese using the http://www.atilika.org/[Kuromoji analyzer].
|
|
|
|
|
|
|
|
<<analysis-phonetic,Phonetic>>::
|
|
|
|
|
|
|
|
Analyzes tokens into their phonetic equivalent using Soundex, Metaphone,
|
|
|
|
Caverphone, and other codecs.
|
|
|
|
|
|
|
|
<<analysis-smartcn,SmartCN>>::
|
|
|
|
|
|
|
|
An analyzer for Chinese or mixed Chinese-English text. This analyzer uses
|
|
|
|
probabilistic knowledge to find the optimal word segmentation for Simplified
|
|
|
|
Chinese text. The text is first broken into sentences, then each sentence is
|
|
|
|
segmented into words.
|
|
|
|
|
|
|
|
<<analysis-stempel,Stempel>>::
|
|
|
|
|
|
|
|
Provides high quality stemming for Polish.
|
|
|
|
|
2016-10-28 11:19:33 -04:00
|
|
|
<<analysis-ukrainian,Ukrainian>>::
|
|
|
|
|
|
|
|
Provides stemming for Ukrainian.
|
|
|
|
|
2015-08-15 12:00:55 -04:00
|
|
|
[float]
|
|
|
|
==== Community contributed analysis plugins
|
|
|
|
|
|
|
|
A number of analysis plugins have been contributed by our community:
|
|
|
|
|
|
|
|
* https://github.com/synhershko/elasticsearch-analysis-hebrew[Hebrew Analysis Plugin] (by Itamar Syn-Hershko)
|
|
|
|
* https://github.com/medcl/elasticsearch-analysis-ik[IK Analysis Plugin] (by Medcl)
|
|
|
|
* https://github.com/medcl/elasticsearch-analysis-mmseg[Mmseg Analysis Plugin] (by Medcl)
|
|
|
|
* https://github.com/imotov/elasticsearch-analysis-morphology[Russian and English Morphological Analysis Plugin] (by Igor Motov)
|
|
|
|
* https://github.com/medcl/elasticsearch-analysis-pinyin[Pinyin Analysis Plugin] (by Medcl)
|
|
|
|
* https://github.com/duydo/elasticsearch-analysis-vietnamese[Vietnamese Analysis Plugin] (by Duy Do)
|
2016-01-28 05:16:28 -05:00
|
|
|
* https://github.com/ofir123/elasticsearch-network-analysis[Network Addresses Analysis Plugin] (by Ofir123)
|
2015-08-15 12:00:55 -04:00
|
|
|
* https://github.com/medcl/elasticsearch-analysis-string2int[String2Integer Analysis Plugin] (by Medcl)
|
2018-04-25 08:11:12 -04:00
|
|
|
* https://github.com/ZarHenry96/elasticsearch-dandelion-plugin[Dandelion Analysis Plugin] (by ZarHenry96)
|
2015-08-15 12:00:55 -04:00
|
|
|
|
|
|
|
include::analysis-icu.asciidoc[]
|
|
|
|
|
|
|
|
include::analysis-kuromoji.asciidoc[]
|
|
|
|
|
|
|
|
include::analysis-phonetic.asciidoc[]
|
|
|
|
|
|
|
|
include::analysis-smartcn.asciidoc[]
|
|
|
|
|
|
|
|
include::analysis-stempel.asciidoc[]
|
|
|
|
|
2016-10-28 11:19:33 -04:00
|
|
|
include::analysis-ukrainian.asciidoc[]
|