[DOCS] Reformat ASCII folding token filter docs (#48143)

2019-10-23 15:06:18 -05:00 · 2019-10-23 15:06:18 -05:00 · 06dc1fbd96
parent 8163e0a9e5
commit 06dc1fbd96
2 changed files with 98 additions and 11 deletions
--- a/docs/reference/analysis/tokenfilters/apostrophe-tokenfilter.asciidoc
+++ b/docs/reference/analysis/tokenfilters/apostrophe-tokenfilter.asciidoc
@ -8,7 +8,7 @@ Strips all characters after an apostrophe, including the apostrophe itself.

 This filter is included in {es}'s built-in <<turkish-analyzer,Turkish language
 analyzer>>. It uses Lucene's
-https://lucene.apache.org/core/4_8_0/analyzers-common/org/apache/lucene/analysis/tr/ApostropheFilter.html[ApostropheFilter],
+https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/tr/ApostropheFilter.html[ApostropheFilter],
 which was built for the Turkish language.


--- a/docs/reference/analysis/tokenfilters/asciifolding-tokenfilter.asciidoc
+++ b/docs/reference/analysis/tokenfilters/asciifolding-tokenfilter.asciidoc
@ -1,10 +1,83 @@
 [[analysis-asciifolding-tokenfilter]]
-=== ASCII Folding Token Filter
+=== ASCII folding token filter
++++
+<titleabbrev>ASCII folding</titleabbrev>
++++

-A token filter of type `asciifolding` that converts alphabetic, numeric,
-and symbolic Unicode characters which are not in the first 127 ASCII
-characters (the "Basic Latin" Unicode block) into their ASCII
-equivalents, if one exists.  Example:
+Converts alphabetic, numeric, and symbolic characters that are not in the Basic
+Latin Unicode block (first 127 ASCII characters) to their ASCII equivalent, if
+one exists. For example, the filter changes `à` to `a`.
+
+This filter uses Lucene's
+https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/miscellaneous/ASCIIFoldingFilter.html[ASCIIFoldingFilter].
+
+[[analysis-asciifolding-tokenfilter-analyze-ex]]
+==== Example
+
+The following <<indices-analyze,analyze API>> request uses the `asciifolding`
+filter to drop the diacritical marks in `açaí à la carte`:
+
+[source,console]
+--------------------------------------------------
+GET /_analyze
+{
+  "tokenizer" : "standard",
+  "filter" : ["asciifolding"],
+  "text" : "açaí à la carte"
+}
+--------------------------------------------------
+
+The filter produces the following tokens:
+
+[source,text]
+--------------------------------------------------
+[ acai, a, la, carte ]
+--------------------------------------------------
+
+/////////////////////
+[source,console-result]
+--------------------------------------------------
+{
+  "tokens" : [
+    {
+      "token" : "acai",
+      "start_offset" : 0,
+      "end_offset" : 4,
+      "type" : "<ALPHANUM>",
+      "position" : 0
+    },
+    {
+      "token" : "a",
+      "start_offset" : 5,
+      "end_offset" : 6,
+      "type" : "<ALPHANUM>",
+      "position" : 1
+    },
+    {
+      "token" : "la",
+      "start_offset" : 7,
+      "end_offset" : 9,
+      "type" : "<ALPHANUM>",
+      "position" : 2
+    },
+    {
+      "token" : "carte",
+      "start_offset" : 10,
+      "end_offset" : 15,
+      "type" : "<ALPHANUM>",
+      "position" : 3
+    }
+  ]
+}
+--------------------------------------------------
+/////////////////////
+
+[[analysis-asciifolding-tokenfilter-analyzer-ex]]
+==== Add to an analyzer
+
+The following <<indices-create-index,create index API>> request uses the
+`asciifolding` filter to configure a new 
+<<analysis-custom-analyzer,custom analyzer>>.

 [source,console]
 --------------------------------------------------
@ -13,7 +86,7 @@ PUT /asciifold_example
    "settings" : {
        "analysis" : {
            "analyzer" : {
-                "default" : {
+                "standard_asciifolding" : {
                    "tokenizer" : "standard",
                    "filter" : ["asciifolding"]
                }
@ -23,9 +96,23 @@ PUT /asciifold_example
 }
 --------------------------------------------------

-Accepts `preserve_original` setting which defaults to false but if true
-will keep the original token as well as emit the folded token.  For
-example:
+[[analysis-asciifolding-tokenfilter-configure-parms]]
+==== Configurable parameters
+
+`preserve_original`::
+(Optional, boolean)
+If `true`, emit both original tokens and folded tokens.
+Defaults to `false`.
+
+[[analysis-asciifolding-tokenfilter-customize]]
+==== Customize
+
+To customize the `asciifolding` filter, duplicate it to create the basis
+for a new custom token filter. You can modify the filter using its configurable
+parameters.
+
+For example, the following request creates a custom `asciifolding` filter with
+`preserve_original` set to true:

 [source,console]
 --------------------------------------------------
@ -34,7 +121,7 @@ PUT /asciifold_example
    "settings" : {
        "analysis" : {
            "analyzer" : {
-                "default" : {
+                "standard_asciifolding" : {
                    "tokenizer" : "standard",
                    "filter" : ["my_ascii_folding"]
                }