[DOCS] Added documentation for the keep word token filter
This commit is contained in:
parent 5f170cb4fd
commit a9fdcadf01
@@ -70,3 +70,5 @@ include::tokenfilters/common-grams-tokenfilter.asciidoc[]
include::tokenfilters/normalization-tokenfilter.asciidoc[]
include::tokenfilters/delimited-payload-tokenfilter.asciidoc[]
include::tokenfilters/keep-words-tokenfilter.asciidoc[]

@@ -0,0 +1,49 @@
[[analysis-keep-words-tokenfilter]]
=== Keep Words Token Filter

A token filter of type `keep` that keeps only tokens whose text is contained
in a predefined set of words. The set of words can be defined in the settings
or loaded from a text file containing one word per line.
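
For example, a file referenced via `keep_words_path` is a plain text word
list; a sketch of what such a file might contain:

--------------------------------------------------
one
two
three
--------------------------------------------------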

[float]
=== Options
[horizontal]
keep_words:: a list of words to keep
keep_words_path:: a path to a file containing the words to keep, one word per line
keep_words_case:: a boolean indicating whether to lower case the words in the keep list (defaults to `false`)
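
For example, a `keep` filter whose word list is written in mixed case could
set `keep_words_case` to `true` so that the configured words are lower cased
and match tokens that have already passed through a `lowercase` filter. A
minimal sketch, assuming a hypothetical filter name `capitalized_keep`:

[source,js]
--------------------------------------------------
{
    "filter" : {
        "capitalized_keep" : {
            "type" : "keep",
            "keep_words" : [ "One", "Two", "Three" ],
            "keep_words_case" : true
        }
    }
}
--------------------------------------------------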

[float]
=== Settings example

[source,js]
--------------------------------------------------
{
    "index" : {
        "analysis" : {
            "analyzer" : {
                "my_analyzer" : {
                    "tokenizer" : "standard",
                    "filter" : ["standard", "lowercase", "words_till_three"]
                },
                "my_analyzer1" : {
                    "tokenizer" : "standard",
                    "filter" : ["standard", "lowercase", "words_on_file"]
                }
            },
            "filter" : {
                "words_till_three" : {
                    "type" : "keep",
                    "keep_words" : [ "one", "two", "three" ]
                },
                "words_on_file" : {
                    "type" : "keep",
                    "keep_words_path" : "/path/to/word/file"
                }
            }
        }
    }
}
--------------------------------------------------
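
With these settings applied when an index is created, the filters can be
checked with the Analyze API. A sketch, assuming a hypothetical index
`my_index` created with the settings above:

[source,js]
--------------------------------------------------
# "four" is not in the keep list, so only the tokens "one" and "two" remain
curl -XGET 'localhost:9200/my_index/_analyze?analyzer=my_analyzer&pretty' -d 'one two four'
--------------------------------------------------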

@@ -11,3 +11,5 @@ http://lucene.apache.org/core/4_3_1/analyzers-common/org/apache/lucene/analysis/
or the
http://lucene.apache.org/core/4_3_1/analyzers-common/org/apache/lucene/analysis/fa/PersianNormalizer.html[PersianNormalizer]
documentation.

*Note:* These filters are available since `0.90.2`.