[DOCS] Reformat condition token filter (#48775)

2019-11-11 08:49:01 -05:00 · 2019-11-11 08:49:01 -05:00 · dd92830801
parent eb0d8f3383
commit dd92830801
1 changed files with 120 additions and 60 deletions
--- a/docs/reference/analysis/tokenfilters/condition-tokenfilter.asciidoc
+++ b/docs/reference/analysis/tokenfilters/condition-tokenfilter.asciidoc
@@ -1,88 +1,148 @@
 [[analysis-condition-tokenfilter]]
-=== Conditional Token Filter
+=== Conditional token filter
 ++++
 <titleabbrev>Conditional</titleabbrev>
 ++++
-The conditional token filter takes a predicate script and a list of subfilters, and
+Applies a set of token filters to tokens that match conditions in a provided
-only applies the subfilters to the current token if it matches the predicate.
+predicate script.
-[float]
+This filter uses Lucene's
-=== Options
+https://lucene.apache.org/core/{lucene_version_path}/analyzers-common/org/apache/lucene/analysis/miscellaneous/ConditionalTokenFilter.html[ConditionalTokenFilter].
 [horizontal]
 filter:: a chain of token filters to apply to the current token if the predicate
  matches. These can be any token filters defined elsewhere in the index mappings.
-script:: a predicate script that determines whether or not the filters will be applied
+[[analysis-condition-analyze-ex]]
-  to the current token.  Note that only inline scripts are supported
+==== Example
-[float]
+The following <<indices-analyze,analyze API>> request uses the `condition`
-=== Settings example
+filter to match tokens with fewer than 5 characters in `THE QUICK BROWN FOX`.
-
+It then applies the <<analysis-lowercase-tokenfilter,`lowercase`>> filter to
-You can set it up like:
+those matching tokens, converting them to lowercase.
 [source,console]
 --------------------------------------------------
-PUT /condition_example
+GET /_analyze
 {
-    "settings" : {
+  "tokenizer": "standard",
-        "analysis" : {
+  "filter": [
-            "analyzer" : {
+    {
-                "my_analyzer" : {
+      "type": "condition",
-                    "tokenizer" : "standard",
+      "filter": [ "lowercase" ],
-                    "filter" : [ "my_condition" ]
+      "script": {
-                }
+        "source": "token.getTerm().length() < 5"
-            },
+      }
            "filter" : {
                "my_condition" : {
                    "type" : "condition",
                    "filter" : [ "lowercase" ],
                    "script" : {
                        "source" : "token.getTerm().length() < 5"  <1>
                    }
                }
            }
        }
    }
  ],
  "text": "THE QUICK BROWN FOX"
 }
 --------------------------------------------------
-<1> This will only apply the lowercase filter to terms that are less than 5
+The filter produces the following tokens:
 characters in length
-And test it like:
+[source,text]
 [source,console]
 --------------------------------------------------
-POST /condition_example/_analyze
+[ the, QUICK, BROWN, fox ]
 {
  "analyzer" : "my_analyzer",
  "text" : "What Flapdoodle"
 }
 --------------------------------------------------
 // TEST[continued]
 And it'd respond:
 /////////////////////
 [source,console-result]
 --------------------------------------------------
 {
-  "tokens": [
+  "tokens" : [
    {
-      "token": "what",              <1>
+      "token" : "the",
-      "start_offset": 0,
+      "start_offset" : 0,
-      "end_offset": 4,
+      "end_offset" : 3,
-      "type": "<ALPHANUM>",
+      "type" : "<ALPHANUM>",
-      "position": 0
+      "position" : 0
    },
    {
-      "token": "Flapdoodle",        <2>
+      "token" : "QUICK",
-      "start_offset": 5,
+      "start_offset" : 4,
-      "end_offset": 15,
+      "end_offset" : 9,
-      "type": "<ALPHANUM>",
+      "type" : "<ALPHANUM>",
-      "position": 1
+      "position" : 1
    },
    {
      "token" : "BROWN",
      "start_offset" : 10,
      "end_offset" : 15,
      "type" : "<ALPHANUM>",
      "position" : 2
    },
    {
      "token" : "fox",
      "start_offset" : 16,
      "end_offset" : 19,
      "type" : "<ALPHANUM>",
      "position" : 3
    }
  ]
 }
 --------------------------------------------------
 /////////////////////
-<1> The term `What` has been lowercased, because it is only 4 characters long
+[[analysis-condition-tokenfilter-configure-parms]]
-<2> The term `Flapdoodle` has been left in its original case, because it doesn't pass
+==== Configurable parameters
-    the predicate
+
 `filter`::
 +
 --
 (Required, array of token filters)
 Array of token filters. If a token matches the predicate script in the `script`
 parameter, these filters are applied to the token in the order provided.
 These filters can include custom token filters defined in the index settings.
 --
 `script`::
 +
 --
 (Required, <<modules-scripting-using,script object>>)
 Predicate script used to apply token filters. If a token
 matches this script, the filters in the `filter` parameter are applied to the
 token.
 For valid parameters, see <<_script_parameters>>. Only inline scripts are
 supported. Painless scripts are executed in the
 {painless}/painless-analysis-predicate-context.html[analysis predicate context]
 and require a `token` property.
 --
 [[analysis-condition-tokenfilter-customize]]
 ==== Customize and add to an analyzer
 To customize the `condition` filter, duplicate it to create the basis
 for a new custom token filter. You can modify the filter using its configurable
 parameters.
 For example, the following <<indices-create-index,create index API>> request
 uses a custom `condition` filter to configure a new
 <<analysis-custom-analyzer,custom analyzer>>. The custom `condition` filter
 matches the first token in a stream. It then reverses that matching token using
 the <<analysis-reverse-tokenfilter,`reverse`>> filter.
 [source,console]
 --------------------------------------------------
 PUT /palindrome_list
 {
  "settings": {
    "analysis": {
      "analyzer": {
        "whitespace_reverse_first_token": {
          "tokenizer": "whitespace",
          "filter": [ "reverse_first_token" ]
        }
      },
      "filter": {
        "reverse_first_token": {
          "type": "condition",
          "filter": [ "reverse" ],
          "script": {
            "source": "token.getPosition() === 0"
          }
        }
      }
    }
  }
 }
 --------------------------------------------------
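
The per-token behavior described in the reformatted page can be sketched outside Elasticsearch. The Python below is a simplified illustration and not Lucene's ConditionalTokenFilter: the helper name `conditional_filter` is hypothetical, tokens are plain strings split on whitespace, and each subfilter is assumed to map one term to one term. The two predicates mirror the scripts in the diff (`token.getTerm().length() < 5` and `token.getPosition() === 0`).

```python
from typing import Callable, List

# Hypothetical helper: a simplified, one-term-in/one-term-out model of a
# conditional token filter. Real Lucene filters operate on token streams.
def conditional_filter(
    tokens: List[str],
    predicate: Callable[[int, str], bool],
    subfilters: List[Callable[[str], str]],
) -> List[str]:
    result = []
    for position, term in enumerate(tokens):
        if predicate(position, term):
            # Apply the subfilters in the order provided, as the
            # `filter` parameter description states.
            for subfilter in subfilters:
                term = subfilter(term)
        result.append(term)
    return result


# Analyze example: lowercase only the tokens shorter than 5 characters.
lowercased_short = conditional_filter(
    "THE QUICK BROWN FOX".split(),
    lambda position, term: len(term) < 5,
    [str.lower],
)

# Custom analyzer example: reverse only the first token in the stream.
reversed_first = conditional_filter(
    "quick brown fox".split(),
    lambda position, term: position == 0,
    [lambda term: term[::-1]],
)

print(lowercased_short)  # ['the', 'QUICK', 'BROWN', 'fox']
print(reversed_first)    # ['kciuq', 'brown', 'fox']
```

The first result matches the token list shown in the new docs, `[ the, QUICK, BROWN, fox ]`. Tokens that fail the predicate pass through unchanged, which is why `QUICK` and `BROWN` keep their case.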