49 lines
1.4 KiB
Plaintext
49 lines
1.4 KiB
Plaintext
|
[[analysis-smartcn]]
|
||
|
=== Smart Chinese Analysis Plugin
|
||
|
|
||
|
The Smart Chinese Analysis plugin integrates Lucene's Smart Chinese analysis
|
||
|
module into elasticsearch.
|
||
|
|
||
|
It provides an analyzer for Chinese or mixed Chinese-English text. This
|
||
|
analyzer uses probabilistic knowledge to find the optimal word segmentation
|
||
|
for Simplified Chinese text. The text is first broken into sentences, then
|
||
|
each sentence is segmented into words.
|
||
|
|
||
|
|
||
|
[[analysis-smartcn-install]]
|
||
|
[float]
|
||
|
==== Installation
|
||
|
|
||
|
This plugin can be installed using the plugin manager:
|
||
|
|
||
|
[source,sh]
|
||
|
----------------------------------------------------------------
|
||
|
sudo bin/plugin install analysis-smartcn
|
||
|
----------------------------------------------------------------
|
||
|
|
||
|
The plugin must be installed on every node in the cluster, and each node must
|
||
|
be restarted after installation.
|
||
|
|
||
|
[[analysis-smartcn-remove]]
|
||
|
[float]
|
||
|
==== Removal
|
||
|
|
||
|
The plugin can be removed with the following command:
|
||
|
|
||
|
[source,sh]
|
||
|
----------------------------------------------------------------
|
||
|
sudo bin/plugin remove analysis-smartcn
|
||
|
----------------------------------------------------------------
|
||
|
|
||
|
The node must be stopped before removing the plugin.
|
||
|
|
||
|
[[analysis-smartcn-tokenizer]]
|
||
|
[float]
|
||
|
==== `smartcn` tokenizer and token filter
|
||
|
|
||
|
The plugin provides the `smartcn` analyzer and `smartcn_tokenizer` tokenizer,
|
||
|
which are not configurable.
|
||
|
|
||
|
NOTE: The `smartcn_word` token filter and `smartcn_sentence` have been deprecated.
|
||
|
|