2015-08-15 12:00:55 -04:00
|
|
|
[[analysis-smartcn]]
|
|
|
|
=== Smart Chinese Analysis Plugin
|
|
|
|
|
|
|
|
The Smart Chinese Analysis plugin integrates Lucene's Smart Chinese analysis
|
|
|
|
module into elasticsearch.
|
|
|
|
|
|
|
|
It provides an analyzer for Chinese or mixed Chinese-English text. This
|
|
|
|
analyzer uses probabilistic knowledge to find the optimal word segmentation
|
|
|
|
for Simplified Chinese text. The text is first broken into sentences, then
|
|
|
|
each sentence is segmented into words.
|
|
|
|
|
|
|
|
|
|
|
|
[[analysis-smartcn-install]]
|
|
|
|
[float]
|
|
|
|
==== Installation
|
|
|
|
|
|
|
|
This plugin can be installed using the plugin manager:
|
|
|
|
|
|
|
|
[source,sh]
|
|
|
|
----------------------------------------------------------------
|
2016-02-04 10:00:55 -05:00
|
|
|
sudo bin/elasticsearch-plugin install analysis-smartcn
|
2015-08-15 12:00:55 -04:00
|
|
|
----------------------------------------------------------------
|
|
|
|
|
|
|
|
The plugin must be installed on every node in the cluster, and each node must
|
|
|
|
be restarted after installation.
|
|
|
|
|
2016-09-19 09:04:29 -04:00
|
|
|
This plugin can be downloaded for <<plugin-management-custom-url,offline install>> from
|
|
|
|
{plugin_url}/analysis-smartcn/{version}/analysis-smartcn-{version}.zip.
|
2016-09-12 09:34:44 -04:00
|
|
|
|
2015-08-15 12:00:55 -04:00
|
|
|
[[analysis-smartcn-remove]]
|
|
|
|
[float]
|
|
|
|
==== Removal
|
|
|
|
|
|
|
|
The plugin can be removed with the following command:
|
|
|
|
|
|
|
|
[source,sh]
|
|
|
|
----------------------------------------------------------------
|
2016-02-04 10:00:55 -05:00
|
|
|
sudo bin/elasticsearch-plugin remove analysis-smartcn
|
2015-08-15 12:00:55 -04:00
|
|
|
----------------------------------------------------------------
|
|
|
|
|
|
|
|
The node must be stopped before removing the plugin.
|
|
|
|
|
|
|
|
[[analysis-smartcn-tokenizer]]
|
|
|
|
[float]
|
|
|
|
==== `smartcn` tokenizer and token filter
|
|
|
|
|
|
|
|
The plugin provides the `smartcn` analyzer and `smartcn_tokenizer` tokenizer,
|
|
|
|
which are not configurable.
|
|
|
|
|
|
|
|
NOTE: The `smartcn_word` token filter and `smartcn_sentence` have been deprecated.
|