SOLR-13588: Document Estonian analyzer in Solr Ref Guide

Tomoko Uchida 2019-07-03 22:02:47 +09:00
parent 45ea46a425
commit 9d2f516357
2 changed files with 30 additions and 1 deletions

@@ -152,7 +152,8 @@ New Features
* SOLR-13589: Allow zplot to visualize 2D clusters and convex hulls (Joel Bernstein)
* SOLR-13602: Add a field type for Estonian language to default managed_schema (Tomoko Uchida)
* SOLR-13602 SOLR-13588: Add a field type for Estonian language to default managed_schema,
document Estonian language analysis in the Solr Ref Guide (Tomoko Uchida)
Bug Fixes
----------------------

@@ -587,6 +587,7 @@ These factories are each designed to work with specific languages. The languages
* <<Danish>>
* <<Dutch>>
* <<Estonian>>
* <<Finnish>>
* <<French>>
* <<Galician>>
@@ -916,6 +917,33 @@ Solr can stem Dutch using the Snowball Porter Stemmer with an argument of `langu
*Out:* "kanal", "kanal"
=== Estonian
Solr can stem Estonian using the Snowball Porter Stemmer with an argument of `language="Estonian"`.
*Factory class:* `solr.SnowballPorterFilterFactory`
*Arguments:*
`language`:: (required) stemmer language, "Estonian" in this case
*Example:*
[source,xml]
----
<analyzer type="index">
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.SnowballPorterFilterFactory" language="Estonian"/>
</analyzer>
----
*In:* "Taevani tõustes"
*Tokenizer to Filter:* "Taevani", "tõustes"
*Out:* "taevani", "tõus"
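As a minimal sketch, the analyzer above could be wired into a schema roughly as follows. The field type name `text_et_sketch` and the field name `body_et` are hypothetical and chosen only for illustration. The sketch also assumes a single `<analyzer>` element without a `type` attribute, so that the same chain is applied at both index and query time.
[source,xml]
----
<!-- Hypothetical field type name, used here for illustration only -->
<fieldType name="text_et_sketch" class="solr.TextField" positionIncrementGap="100">
  <!-- No type attribute: the chain is used for both indexing and querying -->
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Estonian"/>
  </analyzer>
</fieldType>
<!-- Hypothetical field that uses the field type above -->
<field name="body_et" type="text_et_sketch" indexed="true" stored="true"/>
----
Lowercasing is placed before the stemmer here, following the order used in the example above.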
=== Finnish
Solr includes support for stemming Finnish, and Lucene includes an example stopword list.