2013-08-28 19:24:34 -04:00
|
|
|
[[analysis-lang-analyzer]]
|
|
|
|
=== Language Analyzers
|
|
|
|
|
|
|
|
A set of analyzers aimed at analyzing specific language text. The
|
|
|
|
following types are supported: `arabic`, `armenian`, `basque`,
|
|
|
|
`brazilian`, `bulgarian`, `catalan`, `chinese`, `cjk`, `czech`,
|
|
|
|
`danish`, `dutch`, `english`, `finnish`, `french`, `galician`, `german`,
|
|
|
|
`greek`, `hindi`, `hungarian`, `indonesian`, `italian`, `norwegian`,
|
|
|
|
`persian`, `portuguese`, `romanian`, `russian`, `spanish`, `swedish`,
|
|
|
|
`turkish`, `thai`.
|
|
|
|
|
|
|
|
All analyzers support setting custom `stopwords` either internally in
|
|
|
|
the config, or by using an external stopwords file by setting
|
2014-01-13 14:04:30 -05:00
|
|
|
`stopwords_path`. Check <<analysis-stop-analyzer,Stop Analyzer>> for
|
|
|
|
more details.
|
2013-08-28 19:24:34 -04:00
|
|
|
|
|
|
|
The following analyzers support setting custom `stem_exclusion` list:
|
|
|
|
`arabic`, `armenian`, `basque`, `brazilian`, `bulgarian`, `catalan`,
|
|
|
|
`czech`, `danish`, `dutch`, `english`, `finnish`, `french`, `galician`,
|
|
|
|
`german`, `hindi`, `hungarian`, `indonesian`, `italian`, `norwegian`,
|
|
|
|
`portuguese`, `romanian`, `russian`, `spanish`, `swedish`, `turkish`.
|