parent
5c8b0662df
commit
26a157da7b
|
@ -5,7 +5,7 @@
|
||||||
++++
|
++++
|
||||||
|
|
||||||
Provides <<algorithmic-stemmers,algorithmic stemming>> for the English language,
|
Provides <<algorithmic-stemmers,algorithmic stemming>> for the English language,
|
||||||
based on the http://snowball.tartarus.org/algorithms/porter/stemmer.html[Porter
|
based on the https://snowballstem.org/algorithms/porter/stemmer.html[Porter
|
||||||
stemming algorithm].
|
stemming algorithm].
|
||||||
|
|
||||||
This filter tends to stem more aggressively than other English
|
This filter tends to stem more aggressively than other English
|
||||||
|
|
|
@ -9,7 +9,7 @@ some with additional variants. For a list of supported languages, see the
|
||||||
<<analysis-stemmer-tokenfilter-language-parm,`language`>> parameter.
|
<<analysis-stemmer-tokenfilter-language-parm,`language`>> parameter.
|
||||||
|
|
||||||
When not customized, the filter uses the
|
When not customized, the filter uses the
|
||||||
http://snowball.tartarus.org/algorithms/porter/stemmer.html[porter stemming
|
https://snowballstem.org/algorithms/porter/stemmer.html[porter stemming
|
||||||
algorithm] for English.
|
algorithm] for English.
|
||||||
|
|
||||||
[[analysis-stemmer-tokenfilter-analyze-ex]]
|
[[analysis-stemmer-tokenfilter-analyze-ex]]
|
||||||
|
@ -112,17 +112,17 @@ Language-dependent stemming algorithm used to stem tokens. If both this and the
|
||||||
.Valid values for `language`
|
.Valid values for `language`
|
||||||
====
|
====
|
||||||
Valid values are sorted by language. Defaults to
|
Valid values are sorted by language. Defaults to
|
||||||
http://snowball.tartarus.org/algorithms/porter/stemmer.html[*`english`*].
|
https://snowballstem.org/algorithms/porter/stemmer.html[*`english`*].
|
||||||
Recommended algorithms are *bolded*.
|
Recommended algorithms are *bolded*.
|
||||||
|
|
||||||
Arabic::
|
Arabic::
|
||||||
{lucene-analysis-docs}/ar/ArabicStemmer.html[*`arabic`*]
|
{lucene-analysis-docs}/ar/ArabicStemmer.html[*`arabic`*]
|
||||||
|
|
||||||
Armenian::
|
Armenian::
|
||||||
http://snowball.tartarus.org/algorithms/armenian/stemmer.html[*`armenian`*]
|
https://snowballstem.org/algorithms/armenian/stemmer.html[*`armenian`*]
|
||||||
|
|
||||||
Basque::
|
Basque::
|
||||||
http://snowball.tartarus.org/algorithms/basque/stemmer.html[*`basque`*]
|
https://snowballstem.org/algorithms/basque/stemmer.html[*`basque`*]
|
||||||
|
|
||||||
Bengali::
|
Bengali::
|
||||||
https://www.tandfonline.com/doi/abs/10.1080/02564602.1993.11437284[*`bengali`*]
|
https://www.tandfonline.com/doi/abs/10.1080/02564602.1993.11437284[*`bengali`*]
|
||||||
|
@ -134,36 +134,36 @@ Bulgarian::
|
||||||
http://members.unine.ch/jacques.savoy/Papers/BUIR.pdf[*`bulgarian`*]
|
http://members.unine.ch/jacques.savoy/Papers/BUIR.pdf[*`bulgarian`*]
|
||||||
|
|
||||||
Catalan::
|
Catalan::
|
||||||
http://snowball.tartarus.org/algorithms/catalan/stemmer.html[*`catalan`*]
|
https://snowballstem.org/algorithms/catalan/stemmer.html[*`catalan`*]
|
||||||
|
|
||||||
Czech::
|
Czech::
|
||||||
https://dl.acm.org/doi/10.1016/j.ipm.2009.06.001[*`czech`*]
|
https://dl.acm.org/doi/10.1016/j.ipm.2009.06.001[*`czech`*]
|
||||||
|
|
||||||
Danish::
|
Danish::
|
||||||
http://snowball.tartarus.org/algorithms/danish/stemmer.html[*`danish`*]
|
https://snowballstem.org/algorithms/danish/stemmer.html[*`danish`*]
|
||||||
|
|
||||||
Dutch::
|
Dutch::
|
||||||
http://snowball.tartarus.org/algorithms/dutch/stemmer.html[*`dutch`*],
|
https://snowballstem.org/algorithms/dutch/stemmer.html[*`dutch`*],
|
||||||
http://snowball.tartarus.org/algorithms/kraaij_pohlmann/stemmer.html[`dutch_kp`]
|
https://snowballstem.org/algorithms/kraaij_pohlmann/stemmer.html[`dutch_kp`]
|
||||||
|
|
||||||
English::
|
English::
|
||||||
http://snowball.tartarus.org/algorithms/porter/stemmer.html[*`english`*],
|
https://snowballstem.org/algorithms/porter/stemmer.html[*`english`*],
|
||||||
https://ciir.cs.umass.edu/pubfiles/ir-35.pdf[`light_english`],
|
https://ciir.cs.umass.edu/pubfiles/ir-35.pdf[`light_english`],
|
||||||
http://snowball.tartarus.org/algorithms/lovins/stemmer.html[`lovins`],
|
https://snowballstem.org/algorithms/lovins/stemmer.html[`lovins`],
|
||||||
https://www.researchgate.net/publication/220433848_How_effective_is_suffixing[`minimal_english`],
|
https://www.researchgate.net/publication/220433848_How_effective_is_suffixing[`minimal_english`],
|
||||||
http://snowball.tartarus.org/algorithms/english/stemmer.html[`porter2`],
|
https://snowballstem.org/algorithms/english/stemmer.html[`porter2`],
|
||||||
{lucene-analysis-docs}/en/EnglishPossessiveFilter.html[`possessive_english`]
|
{lucene-analysis-docs}/en/EnglishPossessiveFilter.html[`possessive_english`]
|
||||||
|
|
||||||
Estonian::
|
Estonian::
|
||||||
https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/tartarus/snowball/ext/EstonianStemmer.html[*`estonian`*]
|
https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/tartarus/snowball/ext/EstonianStemmer.html[*`estonian`*]
|
||||||
|
|
||||||
Finnish::
|
Finnish::
|
||||||
http://snowball.tartarus.org/algorithms/finnish/stemmer.html[*`finnish`*],
|
https://snowballstem.org/algorithms/finnish/stemmer.html[*`finnish`*],
|
||||||
http://clef.isti.cnr.it/2003/WN_web/22.pdf[`light_finnish`]
|
http://clef.isti.cnr.it/2003/WN_web/22.pdf[`light_finnish`]
|
||||||
|
|
||||||
French::
|
French::
|
||||||
https://dl.acm.org/citation.cfm?id=1141523[*`light_french`*],
|
https://dl.acm.org/citation.cfm?id=1141523[*`light_french`*],
|
||||||
http://snowball.tartarus.org/algorithms/french/stemmer.html[`french`],
|
https://snowballstem.org/algorithms/french/stemmer.html[`french`],
|
||||||
https://dl.acm.org/citation.cfm?id=318984[`minimal_french`]
|
https://dl.acm.org/citation.cfm?id=318984[`minimal_french`]
|
||||||
|
|
||||||
Galician::
|
Galician::
|
||||||
|
@ -172,8 +172,8 @@ http://bvg.udc.es/recursos_lingua/stemming.jsp[`minimal_galician`] (Plural step
|
||||||
|
|
||||||
German::
|
German::
|
||||||
https://dl.acm.org/citation.cfm?id=1141523[*`light_german`*],
|
https://dl.acm.org/citation.cfm?id=1141523[*`light_german`*],
|
||||||
http://snowball.tartarus.org/algorithms/german/stemmer.html[`german`],
|
https://snowballstem.org/algorithms/german/stemmer.html[`german`],
|
||||||
http://snowball.tartarus.org/algorithms/german2/stemmer.html[`german2`],
|
https://snowballstem.org/algorithms/german2/stemmer.html[`german2`],
|
||||||
http://members.unine.ch/jacques.savoy/clef/morpho.pdf[`minimal_german`]
|
http://members.unine.ch/jacques.savoy/clef/morpho.pdf[`minimal_german`]
|
||||||
|
|
||||||
Greek::
|
Greek::
|
||||||
|
@ -183,18 +183,18 @@ Hindi::
|
||||||
http://computing.open.ac.uk/Sites/EACLSouthAsia/Papers/p6-Ramanathan.pdf[*`hindi`*]
|
http://computing.open.ac.uk/Sites/EACLSouthAsia/Papers/p6-Ramanathan.pdf[*`hindi`*]
|
||||||
|
|
||||||
Hungarian::
|
Hungarian::
|
||||||
http://snowball.tartarus.org/algorithms/hungarian/stemmer.html[*`hungarian`*],
|
https://snowballstem.org/algorithms/hungarian/stemmer.html[*`hungarian`*],
|
||||||
https://dl.acm.org/citation.cfm?id=1141523&dl=ACM&coll=DL&CFID=179095584&CFTOKEN=80067181[`light_hungarian`]
|
https://dl.acm.org/citation.cfm?id=1141523&dl=ACM&coll=DL&CFID=179095584&CFTOKEN=80067181[`light_hungarian`]
|
||||||
|
|
||||||
Indonesian::
|
Indonesian::
|
||||||
http://www.illc.uva.nl/Publications/ResearchReports/MoL-2003-02.text.pdf[*`indonesian`*]
|
http://www.illc.uva.nl/Publications/ResearchReports/MoL-2003-02.text.pdf[*`indonesian`*]
|
||||||
|
|
||||||
Irish::
|
Irish::
|
||||||
http://snowball.tartarus.org/otherapps/oregan/intro.html[*`irish`*]
|
https://snowballstem.org/otherapps/oregan/[*`irish`*]
|
||||||
|
|
||||||
Italian::
|
Italian::
|
||||||
https://www.ercim.eu/publication/ws-proceedings/CLEF2/savoy.pdf[*`light_italian`*],
|
https://www.ercim.eu/publication/ws-proceedings/CLEF2/savoy.pdf[*`light_italian`*],
|
||||||
http://snowball.tartarus.org/algorithms/italian/stemmer.html[`italian`]
|
https://snowballstem.org/algorithms/italian/stemmer.html[`italian`]
|
||||||
|
|
||||||
Kurdish (Sorani)::
|
Kurdish (Sorani)::
|
||||||
{lucene-analysis-docs}/ckb/SoraniStemmer.html[*`sorani`*]
|
{lucene-analysis-docs}/ckb/SoraniStemmer.html[*`sorani`*]
|
||||||
|
@ -206,7 +206,7 @@ Lithuanian::
|
||||||
https://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_5_3/lucene/analysis/common/src/java/org/apache/lucene/analysis/lt/stem_ISO_8859_1.sbl?view=markup[*`lithuanian`*]
|
https://svn.apache.org/viewvc/lucene/dev/branches/lucene_solr_5_3/lucene/analysis/common/src/java/org/apache/lucene/analysis/lt/stem_ISO_8859_1.sbl?view=markup[*`lithuanian`*]
|
||||||
|
|
||||||
Norwegian (Bokmål)::
|
Norwegian (Bokmål)::
|
||||||
http://snowball.tartarus.org/algorithms/norwegian/stemmer.html[*`norwegian`*],
|
https://snowballstem.org/algorithms/norwegian/stemmer.html[*`norwegian`*],
|
||||||
{lucene-analysis-docs}/no/NorwegianLightStemmer.html[*`light_norwegian`*],
|
{lucene-analysis-docs}/no/NorwegianLightStemmer.html[*`light_norwegian`*],
|
||||||
{lucene-analysis-docs}/no/NorwegianMinimalStemmer.html[`minimal_norwegian`]
|
{lucene-analysis-docs}/no/NorwegianMinimalStemmer.html[`minimal_norwegian`]
|
||||||
|
|
||||||
|
@ -217,26 +217,26 @@ Norwegian (Nynorsk)::
|
||||||
Portuguese::
|
Portuguese::
|
||||||
https://dl.acm.org/citation.cfm?id=1141523&dl=ACM&coll=DL&CFID=179095584&CFTOKEN=80067181[*`light_portuguese`*],
|
https://dl.acm.org/citation.cfm?id=1141523&dl=ACM&coll=DL&CFID=179095584&CFTOKEN=80067181[*`light_portuguese`*],
|
||||||
pass:macros[http://www.inf.ufrgs.br/~buriol/papers/Orengo_CLEF07.pdf[`minimal_portuguese`\]],
|
pass:macros[http://www.inf.ufrgs.br/~buriol/papers/Orengo_CLEF07.pdf[`minimal_portuguese`\]],
|
||||||
http://snowball.tartarus.org/algorithms/portuguese/stemmer.html[`portuguese`],
|
https://snowballstem.org/algorithms/portuguese/stemmer.html[`portuguese`],
|
||||||
https://www.inf.ufrgs.br/\~viviane/rslp/index.htm[`portuguese_rslp`]
|
https://www.inf.ufrgs.br/\~viviane/rslp/index.htm[`portuguese_rslp`]
|
||||||
|
|
||||||
Romanian::
|
Romanian::
|
||||||
http://snowball.tartarus.org/algorithms/romanian/stemmer.html[*`romanian`*]
|
https://snowballstem.org/algorithms/romanian/stemmer.html[*`romanian`*]
|
||||||
|
|
||||||
Russian::
|
Russian::
|
||||||
http://snowball.tartarus.org/algorithms/russian/stemmer.html[*`russian`*],
|
https://snowballstem.org/algorithms/russian/stemmer.html[*`russian`*],
|
||||||
https://doc.rero.ch/lm.php?url=1000%2C43%2C4%2C20091209094227-CA%2FDolamic_Ljiljana_-_Indexing_and_Searching_Strategies_for_the_Russian_20091209.pdf[`light_russian`]
|
https://doc.rero.ch/lm.php?url=1000%2C43%2C4%2C20091209094227-CA%2FDolamic_Ljiljana_-_Indexing_and_Searching_Strategies_for_the_Russian_20091209.pdf[`light_russian`]
|
||||||
|
|
||||||
Spanish::
|
Spanish::
|
||||||
https://www.ercim.eu/publication/ws-proceedings/CLEF2/savoy.pdf[*`light_spanish`*],
|
https://www.ercim.eu/publication/ws-proceedings/CLEF2/savoy.pdf[*`light_spanish`*],
|
||||||
http://snowball.tartarus.org/algorithms/spanish/stemmer.html[`spanish`]
|
https://snowballstem.org/algorithms/spanish/stemmer.html[`spanish`]
|
||||||
|
|
||||||
Swedish::
|
Swedish::
|
||||||
http://snowball.tartarus.org/algorithms/swedish/stemmer.html[*`swedish`*],
|
https://snowballstem.org/algorithms/swedish/stemmer.html[*`swedish`*],
|
||||||
http://clef.isti.cnr.it/2003/WN_web/22.pdf[`light_swedish`]
|
http://clef.isti.cnr.it/2003/WN_web/22.pdf[`light_swedish`]
|
||||||
|
|
||||||
Turkish::
|
Turkish::
|
||||||
http://snowball.tartarus.org/algorithms/turkish/stemmer.html[*`turkish`*]
|
https://snowballstem.org/algorithms/turkish/stemmer.html[*`turkish`*]
|
||||||
====
|
====
|
||||||
|
|
||||||
`name`::
|
`name`::
|
||||||
|
|