[DOCS] Reformat lowercase token filter docs (#49935)

James Rodewig 2019-12-12 09:39:06 -05:00
parent 521933aa11
commit 1186a5dc09
1 changed file with 122 additions and 12 deletions


[[analysis-lowercase-tokenfilter]]
=== Lowercase token filter
++++
<titleabbrev>Lowercase</titleabbrev>
++++

Changes token text to lowercase. For example, you can use the `lowercase` filter
to change `THE Lazy DoG` to `the lazy dog`.

In addition to a default filter, the `lowercase` token filter provides access to
Lucene's language-specific lowercase filters for Greek, Irish, and Turkish.

[[analysis-lowercase-tokenfilter-analyze-ex]]
==== Example

The following <<indices-analyze,analyze API>> request uses the default
`lowercase` filter to change `THE Quick FoX JUMPs` to lowercase:

[source,console]
--------------------------------------------------
GET _analyze
{
  "tokenizer" : "standard",
  "filter" : ["lowercase"],
  "text" : "THE Quick FoX JUMPs"
}
--------------------------------------------------

The filter produces the following tokens:

[source,text]
--------------------------------------------------
[ the, quick, fox, jumps ]
--------------------------------------------------
/////////////////////
[source,console-result]
--------------------------------------------------
{
  "tokens" : [
    {
      "token" : "the",
      "start_offset" : 0,
      "end_offset" : 3,
      "type" : "<ALPHANUM>",
      "position" : 0
    },
    {
      "token" : "quick",
      "start_offset" : 4,
      "end_offset" : 9,
      "type" : "<ALPHANUM>",
      "position" : 1
    },
    {
      "token" : "fox",
      "start_offset" : 10,
      "end_offset" : 13,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "jumps",
      "start_offset" : 14,
      "end_offset" : 19,
      "type" : "<ALPHANUM>",
      "position" : 3
    }
  ]
}
--------------------------------------------------
/////////////////////

[[analysis-lowercase-tokenfilter-analyzer-ex]]
==== Add to an analyzer

The following <<indices-create-index,create index API>> request uses the
`lowercase` filter to configure a new
<<analysis-custom-analyzer,custom analyzer>>.

[source,console]
--------------------------------------------------
PUT lowercase_example
{
  "settings" : {
    "analysis" : {
      "analyzer" : {
        "whitespace_lowercase" : {
          "tokenizer" : "whitespace",
          "filter" : ["lowercase"]
        }
      }
    }
  }
}
--------------------------------------------------
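
To try out the new analyzer (a quick check sketched here, not part of the
original example), you can reference it by name in the
<<indices-analyze,analyze API>>:

[source,console]
--------------------------------------------------
GET lowercase_example/_analyze
{
  "analyzer" : "whitespace_lowercase",
  "text" : "THE Quick FoX JUMPs"
}
--------------------------------------------------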

[[analysis-lowercase-tokenfilter-configure-parms]]
==== Configurable parameters

`language`::
+
--
(Optional, string)
Language-specific lowercase token filter to use. Valid values include:

`greek`::: Uses Lucene's https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/el/GreekLowerCaseFilter.html[GreekLowerCaseFilter]
`irish`::: Uses Lucene's https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/ga/IrishLowerCaseFilter.html[IrishLowerCaseFilter]
`turkish`::: Uses Lucene's https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/tr/TurkishLowerCaseFilter.html[TurkishLowerCaseFilter]

If not specified, defaults to Lucene's https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/core/LowerCaseFilter.html[LowerCaseFilter].
--
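
The `language` parameter can also be set on a filter defined inline in an
<<indices-analyze,analyze API>> request. The following sketch (an
illustration, not part of the original page) uses the Turkish filter, which
lowercases the dotless capital `I` to `ı` rather than `i`:

[source,console]
--------------------------------------------------
GET _analyze
{
  "tokenizer" : "standard",
  "filter" : [
    {
      "type" : "lowercase",
      "language" : "turkish"
    }
  ],
  "text" : "ISPARTA"
}
--------------------------------------------------

With the Turkish filter, the request above produces the token `ısparta`; the
default filter would return `isparta` instead.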

[[analysis-lowercase-tokenfilter-customize]]
==== Customize

To customize the `lowercase` filter, duplicate it to create the basis
for a new custom token filter. You can modify the filter using its configurable
parameters.

For example, the following request creates a custom `lowercase` filter for the
Greek language:

[source,console]
--------------------------------------------------
PUT custom_lowercase_example
{
  "settings": {
    "analysis": {
      "analyzer": {
        "standard_lowercase_example": {
          "type": "custom",
          "tokenizer": "standard",
          "filter": ["lowercase"]
        },
        "greek_lowercase_example": {
          "type": "custom",
          "tokenizer": "standard",