[DOCS] Reformats interval query (#45350)

2025-02-23 05:15:04 +00:00 · 2019-08-09 08:53:25 -04:00 · 2019-08-09 08:53:25 -04:00 · 846928a52a
commit 846928a52a
parent 1506d4436b
1 changed files with 194 additions and 110 deletions
--- a/docs/reference/query-dsl/intervals-query.asciidoc
+++ b/docs/reference/query-dsl/intervals-query.asciidoc
@@ -4,17 +4,25 @@
 <titleabbrev>Intervals</titleabbrev>
 ++++
-An `intervals` query allows fine-grained control over the order and proximity of
+Returns documents based on the order and proximity of matching terms.
-matching terms.  Matching rules are constructed from a small set of definitions,
+
-and the rules are then applied to terms from a particular `field`.
+The `intervals` query uses *matching rules*, constructed from a small set of
 definitions. These rules are then applied to terms from a specified `field`.
 The definitions produce sequences of minimal intervals that span terms in a
-body of text.  These intervals can be further combined and filtered by
+body of text. These intervals can be further combined and filtered by
 parent sources.
-The example below will search for the phrase `my favourite food` appearing
+
-before the terms `hot` and `water` or `cold` and `porridge` in any order, in
+[[intervals-query-ex-request]]
-the field `my_text`
+==== Example request
 The following `intervals` search returns documents containing `my
 favorite food` immediately followed by `hot water` or `cold porridge` in the
 `my_text` field.
 This search would match a `my_text` value of `my favorite food is cold
 porridge` but not `when it's cold my favorite food is porridge`.
 [source,js]
 --------------------------------------------------
@@ -28,7 +36,7 @@ POST _search
          "intervals" : [
            {
              "match" : {
-                "query" : "my favourite food",
+                "query" : "my favorite food",
                "max_gaps" : 0,
                "ordered" : true
              }
@@ -42,8 +50,7 @@ POST _search
              }
            }
          ]
-        },
+        }
-        "_name" : "favourite_food"
      }
    }
  }
@@ -51,69 +58,103 @@ POST _search
 --------------------------------------------------
 // CONSOLE
-In the above example, the text `my favourite food is cold porridge` would
+[[intervals-top-level-params]]
-match because the two intervals matching `my favourite food` and `cold
+==== Top-level parameters for `intervals`
-porridge` appear in the correct order, but the text `when it's cold my
+[[intervals-rules]]
-favourite food is porridge` would not match, because the interval matching
+`<field>`::
-`cold porridge` starts before the interval matching `my favourite food`.
+
 --
 (Required, rule object) Field you wish to search.
 The value of this parameter is a rule object used to match documents
 based on matching terms, order, and proximity.
 Valid rules include:
 * <<intervals-match,`match`>>
 * <<intervals-prefix,`prefix`>>
 * <<intervals-wildcard,`wildcard`>>
 * <<intervals-all_of,`all_of`>>
 * <<intervals-any_of,`any_of`>>
 * <<interval_filter,`filter`>>
 --
 [[intervals-match]]
-==== `match`
+==== `match` rule parameters
-The `match` rule matches analyzed text, and takes the following parameters:
+The `match` rule matches analyzed text.
 [horizontal]
 `query`::
-The text to match.
+(Required, string) Text you wish to find in the provided `<field>`.
 `max_gaps`::
-Specify a maximum number of gaps between the terms in the text.  Terms that
+
-appear further apart than this will not match. If unspecified, or set to -1,
+--
-then there is no width restriction on the match.  If set to 0 then the terms
+(Optional, integer) Maximum number of positions between the matching terms.
-must appear next to each other.
+Terms further apart than this are not considered matches. Defaults to
 `-1`.
 If unspecified or set to `-1`, there is no width restriction on the match. If
 set to `0`, the terms must appear next to each other.
 --
 `ordered`::
-Whether or not the terms must appear in their specified order.  Defaults to
+(Optional, boolean) 
-`false`
+If `true`, matching terms must appear in their specified order. Defaults to
 `false`.
 `analyzer`::
-Which analyzer should be used to analyze terms in the `query`.  By
+(Optional, string) <<analysis, analyzer>> used to analyze terms in the `query`.
-default, the search analyzer of the top-level field will be used.
+Defaults to the top-level `<field>`'s analyzer.
 `filter`::
-An optional <<interval_filter,interval filter>>
+(Optional, <<interval_filter,interval filter>> rule object) Rule used to filter
 returned intervals.
 `use_field`::
-If specified, then match intervals from this field rather than the top-level field.
+(Optional, string) If specified, then match intervals from this
-Terms will be analyzed using the search analyzer from this field.  This allows you
+field rather than the top-level `<field>`. Terms are analyzed using the
-to search across multiple fields as if they were all the same field; for example,
+search analyzer from this field. This allows you to search across multiple
-you could index the same text into stemmed and unstemmed fields, and search for
+fields as if they were all the same field; for example, you could index the same
-stemmed tokens near unstemmed ones.
+text into stemmed and unstemmed fields, and search for stemmed tokens near
 unstemmed ones.
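
As a rough illustration of the `match` parameters above, the following Python sketch builds a request body that could be sent to `POST _search`. The `match_rule` helper and the `my_text` field name are assumptions made for this example; they are not part of any client library.

```python
import json


def match_rule(query, max_gaps=-1, ordered=False):
    """Build a `match` rule (hypothetical helper).

    The defaults mirror the documented ones: `max_gaps` of -1 means no
    width restriction, and `ordered` of False allows any term order.
    """
    return {"match": {"query": query, "max_gaps": max_gaps, "ordered": ordered}}


# `my_text` is an assumed field name; replace it with a field from your mapping.
body = {
    "query": {
        "intervals": {
            "my_text": match_rule("my favorite food", max_gaps=0, ordered=True)
        }
    }
}

print(json.dumps(body, indent=2))
```

Setting `max_gaps` to `0` together with `ordered` set to `true` asks for the terms to appear next to each other and in the given order, which is how the example request earlier on this page expresses a phrase.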
 [[intervals-prefix]]
-==== `prefix`
+==== `prefix` rule parameters
-The `prefix` rule finds terms that start with a specified prefix.  The prefix will
+The `prefix` rule matches terms that start with a specified set of characters.
-expand to match at most 128 terms; if there are more matching terms in the index,
+This prefix can expand to match at most 128 terms. If the prefix matches more
-then an error will be returned.  To avoid this limit, enable the
+than 128 terms, {es} returns an error. You can use the
-<<index-prefixes,`index-prefixes`>> option on the field being searched.
+<<index-prefixes,`index-prefixes`>> option in the field mapping to avoid this
 limit.
 [horizontal]
 `prefix`::
-Match terms starting with this prefix
+(Required, string) Beginning characters of terms you wish to find in the
 top-level `<field>`.
 `analyzer`::
-Which analyzer should be used to normalize the `prefix`.  By default, the
+(Optional, string) <<analysis, analyzer>> used to normalize the `prefix`.
-search analyzer of the top-level field will be used.
+Defaults to the top-level `<field>`'s analyzer.
 `use_field`::
-If specified, then match intervals from this field rather than the top-level field.
+
-The `prefix` will be normalized using the search analyzer from this field, unless
+--
-`analyzer` is specified separately.
+(Optional, string) If specified, then match intervals from this field rather
 than the top-level `<field>`.
 The `prefix` is normalized using the search analyzer from this field, unless a
 separate `analyzer` is specified.
 --
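
A minimal sketch of a `prefix` rule body, written in Python for illustration. The `prefix_rule` helper and the `my_text` field are hypothetical. The helper leaves out optional parameters that are not set, so the documented defaults apply.

```python
import json


def prefix_rule(prefix, analyzer=None, use_field=None):
    """Build a `prefix` rule (hypothetical helper), omitting unset options."""
    rule = {"prefix": prefix}
    if analyzer is not None:
        rule["analyzer"] = analyzer
    if use_field is not None:
        rule["use_field"] = use_field
    return {"prefix": rule}


# `my_text` is an assumed field name. `keyword` is used here only as an
# example analyzer for normalizing the prefix.
body = {
    "query": {
        "intervals": {
            "my_text": prefix_rule("out", analyzer="keyword")
        }
    }
}

print(json.dumps(body, indent=2))
```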
 [[intervals-wildcard]]
-==== `wildcard`
+==== `wildcard` rule parameters
-The `wildcard` rule finds terms that match a wildcard pattern.  The pattern will
+The `wildcard` rule matches terms using a wildcard pattern. This pattern can
-expand to match at most 128 terms; if there are more matching terms in the index,
+expand to match at most 128 terms. If the pattern matches more than 128 terms,
-then an error will be returned.
+{es} returns an error.
 [horizontal]
 `pattern`::
-Find terms matching this pattern
+(Required, string) Wildcard pattern used to find matching terms.
 +
 --
 This parameter supports two wildcard operators:
@@ -125,51 +166,112 @@ WARNING: Avoid beginning patterns with `*` or `?`. This can increase
 the iterations needed to find matching terms and slow search performance.
 --
 `analyzer`::
-Which analyzer should be used to normalize the `pattern`.  By default, the
+(Optional, string) <<analysis, analyzer>> used to normalize the `pattern`.
-search analyzer of the top-level field will be used.
+Defaults to the top-level `<field>`'s analyzer.
 `use_field`::
-If specified, then match intervals from this field rather than the top-level field.
+
-The `pattern` will be normalized using the search analyzer from this field, unless
+--
 (Optional, string) If specified, match intervals from this field rather than the
 top-level `<field>`.
 The `pattern` is normalized using the search analyzer from this field, unless
 `analyzer` is specified separately.
 --
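
The sketch below shows one possible way to build a `wildcard` rule in Python while guarding against the leading-wildcard patterns the warning above advises against. The `wildcard_rule` helper, its `allow_leading_wildcard` flag, and the `my_text` field are assumptions for this example. The check is a client-side convenience, not something the query itself enforces.

```python
import json


def wildcard_rule(pattern, allow_leading_wildcard=False):
    """Build a `wildcard` rule (hypothetical helper).

    Rejects patterns that begin with `*` or `?` unless explicitly allowed,
    because such patterns can slow search performance.
    """
    if pattern[:1] in ("*", "?") and not allow_leading_wildcard:
        raise ValueError("pattern starts with a wildcard operator: %r" % pattern)
    return {"wildcard": {"pattern": pattern}}


# `my_text` is an assumed field name.
body = {"query": {"intervals": {"my_text": wildcard_rule("out?ide*")}}}

print(json.dumps(body, indent=2))
```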
 [[intervals-all_of]]
-==== `all_of`
+==== `all_of` rule parameters
-`all_of` returns returns matches that span a combination of other rules.
+The `all_of` rule returns matches that span a combination of other rules.
 [horizontal]
 `intervals`::
-An array of rules to combine.  All rules must produce a match in a
+(Required, array of rule objects) An array of rules to combine. All rules must
-document for the overall source to match.
+produce a match in a document for the overall source to match.
 `max_gaps`::
-Specify a maximum number of gaps between the rules.  Combinations that match
+
-across a distance greater than this will not match.  If set to -1 or
+--
-unspecified, there is no restriction on this distance.  If set to 0, then the
+(Optional, integer) Maximum number of positions between the matching terms.
-matches produced by the rules must all appear immediately next to each other.
+Intervals produced by the rules further apart than this are not considered
 matches. Defaults to `-1`.
 If unspecified or set to `-1`, there is no width restriction on the match. If
 set to `0`, the terms must appear next to each other.
 --
 `ordered`::
-Whether the intervals produced by the rules should appear in the order in
+(Optional, boolean) If `true`, intervals produced by the rules should appear in
-which they are specified.  Defaults to `false`
+the order in which they are specified. Defaults to `false`.
 `filter`::
-An optional <<interval_filter,interval filter>>
+(Optional, <<interval_filter,interval filter>> rule object) Rule used to filter
 returned intervals.
 [[intervals-any_of]]
-==== `any_of`
+==== `any_of` rule parameters
-The `any_of` rule emits intervals produced by any of its sub-rules.
+The `any_of` rule returns intervals produced by any of its sub-rules.
 [horizontal]
 `intervals`::
-An array of rules to match
+(Required, array of rule objects) An array of rules to match.
 `filter`::
-An optional <<interval_filter,interval filter>>
+(Optional, <<interval_filter,interval filter>> rule object) Rule used to filter
 returned intervals.
 [[interval_filter]]
-==== filters
+==== `filter` rule parameters
-You can filter intervals produced by any rules by their relation to the
+The `filter` rule returns intervals based on a query. See
-intervals produced by another rule.  The following example will return
+<<interval-filter-rule-ex>> for an example.
-documents that have the words `hot` and `porridge` within 10 positions
+
-of each other, without the word `salty` in between:
+`after`::
 (Optional, query object) Query used to return intervals that follow an interval
 from the `filter` rule.
 `before`::
 (Optional, query object) Query used to return intervals that occur before an
 interval from the `filter` rule.
 `contained_by`::
 (Optional, query object) Query used to return intervals contained by an interval
 from the `filter` rule.
 `containing`::
 (Optional, query object) Query used to return intervals that contain an interval
 from the `filter` rule.
 `not_contained_by`::
 (Optional, query object) Query used to return intervals that are *not*
 contained by an interval from the `filter` rule.
 `not_containing`::
 (Optional, query object) Query used to return intervals that do *not* contain
 an interval from the `filter` rule.
 `not_overlapping`::
 (Optional, query object) Query used to return intervals that do *not* overlap
 with an interval from the `filter` rule.
 `overlapping`::
 (Optional, query object) Query used to return intervals that overlap with an
 interval from the `filter` rule.
 `script`::
 (Optional, <<modules-scripting-using, script object>>) Script used to return
 matching documents. This script must return a boolean value, `true` or `false`.
 See <<interval-script-filter>> for an example.
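
To make the shape of a `filter` concrete, here is a small Python sketch that attaches a `before` filter to a `match` rule. It assumes a hypothetical `my_text` field. The intent is to keep intervals for `hot` only when they occur before an interval for `porridge`.

```python
import json

# `my_text` is an assumed field name. The nested rule under `before` is
# written as a `match` rule, the same kind of rule used at the top level.
body = {
    "query": {
        "intervals": {
            "my_text": {
                "match": {
                    "query": "hot",
                    "filter": {
                        "before": {
                            "match": {"query": "porridge"}
                        }
                    }
                }
            }
        }
    }
}

print(json.dumps(body, indent=2))
```

The other filter parameters listed above take the same position in the request as `before`.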
 [[intervals-query-note]]
 ==== Notes
 [[interval-filter-rule-ex]]
 ===== Filter example
 The following search includes a `filter` rule. It returns documents that have
 the words `hot` and `porridge` within 10 positions of each other, without the
 word `salty` in between:
 [source,js]
 --------------------------------------------------
@@ -196,31 +298,12 @@ POST _search
 --------------------------------------------------
 // CONSOLE
 The following filters are available:
 [horizontal]
 `containing`::
 Produces intervals that contain an interval from the filter rule
 `contained_by`::
 Produces intervals that are contained by an interval from the filter rule
 `not_containing`::
 Produces intervals that do not contain an interval from the filter rule
 `not_contained_by`::
 Produces intervals that are not contained by an interval from the filter rule
 `overlapping`::
 Produces intervals that overlap with an interval from the filter rule
 `not_overlapping`::
 Produces intervals that do not overlap with an interval from the filter rule
 `before`::
 Produces intervals that appear before an interval from the filter rule
 `after`::
 Produces intervals that appear after an interval from the filter rule
 [[interval-script-filter]]
-==== Script filters
+===== Script filters
-You can also filter intervals based on their start position, end position and
+You can use a script to filter intervals based on their start position, end
-internal gap count, using a script.  The script has access to an `interval`
+position, and internal gap count. The following `filter` script uses the
-variable, with `start`, `end` and `gaps` methods:
+`interval` variable with the `start`, `end`, and `gaps` methods:
 [source,js]
 --------------------------------------------------
@@ -244,12 +327,13 @@ POST _search
 --------------------------------------------------
 // CONSOLE
 [[interval-minimization]]
-==== Minimization
+===== Minimization
 The intervals query always minimizes intervals, to ensure that queries can
-run in linear time.  This can sometimes cause surprising results, particularly
+run in linear time. This can sometimes cause surprising results, particularly
-when using `max_gaps` restrictions or filters.  For example, take the
+when using `max_gaps` restrictions or filters. For example, take the
 following query, searching for `salty` contained within the phrase `hot
 porridge`:
@@ -277,15 +361,15 @@ POST _search
 --------------------------------------------------
 // CONSOLE
-This query will *not* match a document containing the phrase `hot porridge is
+This query does *not* match a document containing the phrase `hot porridge is
 salty porridge`, because the intervals returned by the match query for `hot
 porridge` only cover the initial two terms in this document, and these do not
 overlap the intervals covering `salty`.
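
The effect described above can be reproduced with a deliberately simplified Python model. This is a sketch of the idea of minimization only, assuming whitespace tokenization and zero-based term positions. It is not how {es} or Lucene implement intervals, and the `minimize` and `contained_by` helpers are hypothetical.

```python
def minimize(intervals):
    """Keep only intervals that do not contain another, different interval."""
    return [
        a for a in intervals
        if not any(b != a and a[0] <= b[0] and b[1] <= a[1] for b in intervals)
    ]


def contained_by(inner, outer):
    """Return the intervals from `inner` that lie inside some interval in `outer`."""
    return [i for i in inner if any(o[0] <= i[0] and i[1] <= o[1] for o in outer)]


tokens = "hot porridge is salty porridge".split()
hot = [pos for pos, tok in enumerate(tokens) if tok == "hot"]
porridge = [pos for pos, tok in enumerate(tokens) if tok == "porridge"]
salty = [(pos, pos) for pos, tok in enumerate(tokens) if tok == "salty"]

# Every span that covers one `hot` and one `porridge`, in any order.
candidates = [(min(h, p), max(h, p)) for h in hot for p in porridge]
minimal = minimize(candidates)

print("all spans:     ", candidates)
print("minimal spans: ", minimal)
print("salty inside minimal spans:", contained_by(salty, minimal))
print("salty inside all spans:    ", contained_by(salty, candidates))
```

In this model the wider span that would have contained `salty` is discarded because it contains the shorter span over the first two terms, so the `contained_by` check finds nothing.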
 Another restriction to be aware of is the case of `any_of` rules that contain
-sub-rules which overlap.  In particular, if one of the rules is a strict
+sub-rules which overlap. In particular, if one of the rules is a strict
-prefix of the other, then the longer rule will never be matched, which can
+prefix of the other, then the longer rule can never match, which can
-cause surprises when used in combination with `max_gaps`.  Consider the
+cause surprises when used in combination with `max_gaps`. Consider the
 following query, searching for `the` immediately followed by `big` or `big bad`,
 immediately followed by `wolf`:
@@ -316,10 +400,10 @@ POST _search
 --------------------------------------------------
 // CONSOLE
-Counter-intuitively, this query *will not* match the document `the big bad
+Counter-intuitively, this query does *not* match the document `the big bad
-wolf`, because the `any_of` rule in the middle will only produce intervals
+wolf`, because the `any_of` rule in the middle only produces intervals
 for `big` - intervals for `big bad` being longer than those for `big`, while
-starting at the same position, and so being minimized away.  In these cases,
+starting at the same position, and so being minimized away. In these cases,
 it's better to rewrite the query so that all of the options are explicitly
 laid out at the top level: