[DOCS] Add performance warning for scripts (#59890) (#59913)

2020-07-20 15:05:33 -04:00 · 2020-07-20 15:05:33 -04:00 · 24fec52447
parent e16e565c5e
commit 24fec52447
6 changed files with 175 additions and 156 deletions
--- a/docs/painless/painless-lang-spec/painless-regexes.asciidoc
+++ b/docs/painless/painless-lang-spec/painless-regexes.asciidoc
@@ -10,6 +10,10 @@ are always constants and compiled efficiently a single time.
 Pattern p = /[aeiou]/
 ---------------------------------------------------------

+WARNING: A poorly written regular expression can significantly slow performance.
+If possible, avoid using regular expressions, particularly in frequently run
+scripts.
+
 [[pattern-flags]]
 ==== Pattern flags

--- a/docs/reference/aggregations/metrics/scripted-metric-aggregation.asciidoc
+++ b/docs/reference/aggregations/metrics/scripted-metric-aggregation.asciidoc
@@ -3,6 +3,9 @@

 A metric aggregation that executes using scripts to provide a metric output.

+WARNING: Using scripts can result in slower search speeds. See
+<<scripts-and-search-speed>>.
+
 Example:

 [source,console]
--- a/docs/reference/aggregations/metrics/string-stats-aggregation.asciidoc
+++ b/docs/reference/aggregations/metrics/string-stats-aggregation.asciidoc
@@ -6,6 +6,9 @@
 A `multi-value` metrics aggregation that computes statistics over string values extracted from the aggregated documents.
 These values can be retrieved either from specific `keyword` fields in the documents or can be generated by a provided script.

+WARNING: Using scripts can result in slower search speeds. See
+<<scripts-and-search-speed>>.
+
 The string stats aggregation returns the following results:

 * `count` - The number of non-empty fields counted.
--- a/docs/reference/how-to/search-speed.asciidoc
+++ b/docs/reference/how-to/search-speed.asciidoc
@@ -165,163 +165,9 @@ include::../mapping/types/numeric.asciidoc[tag=map-ids-as-keyword]
 === Avoid scripts

 If possible, avoid using <<modules-scripting,scripts>> or
-<<request-body-search-script-fields,scripted fields>> in searches. Because
-scripts can't make use of index structures, using scripts in search queries can
-result in slower search speeds.
+<<request-body-search-script-fields,scripted fields>> in searches. See
+<<scripts-and-search-speed>>.

-If you often use scripts to transform indexed data, you can speed up search by
-making these changes during ingest instead. However, that often means slower
-index speeds.
-
-.*Example*
-[%collapsible]
-====
-An index, `my_test_scores`, contains two `long` fields:
-
-* `math_score`
-* `verbal_score`
-
-When running searches, users often use a script to sort results by the sum of
-these two field's values.
-
-[source,console]
----
-GET /my_test_scores/_search
-{
-  "query": {
-    "term": {
-      "grad_year": "2020"
-    }
-  },
-  "sort": [
-    {
-      "_script": {
-        "type": "number",
-        "script": {
-          "source": "doc['math_score'].value + doc['verbal_score'].value"
-        },
-        "order": "desc"
-      }
-    }
-  ]
-}
----
-// TEST[s/^/PUT my_test_scores\n/]
-
-To speed up search, you can perform this calculation during ingest and index the
-sum to a field instead.
-
-First, <<indices-put-mapping,add a new field>>, `total_score`, to the index. The
-`total_score` field will contain sum of the `math_score` and `verbal_score`
-field values.
-
-[source,console]
----
-PUT /my_test_scores/_mapping
-{
-  "properties": {
-    "total_score": {
-      "type": "long"
-    }
-  }
-}
----
-// TEST[continued]
-
-Next, use an <<ingest,ingest pipeline>> containing the
-<<script-processor,`script`>> processor to calculate the sum of `math_score` and
-`verbal_score` and index it in the `total_score` field.
-
-[source,console]
----
-PUT _ingest/pipeline/my_test_scores_pipeline
-{
-  "description": "Calculates the total test score",
-  "processors": [
-    {
-      "script": {
-        "source": "ctx.total_score = (ctx.math_score + ctx.verbal_score)"
-      }
-    }
-  ]
-}
----
-// TEST[continued]
-
-To update existing data, use this pipeline to <<docs-reindex,reindex>> any
-documents from `my_test_scores` to a new index, `my_test_scores_2`.
-
-[source,console]
----
-POST /_reindex
-{
-  "source": {
-    "index": "my_test_scores"
-  },
-  "dest": {
-    "index": "my_test_scores_2",
-    "pipeline": "my_test_scores_pipeline"
-  }
-}
----
-// TEST[continued]
-
-Continue using the pipeline to index any new documents to `my_test_scores_2`.
-
-[source,console]
----
-POST /my_test_scores_2/_doc/?pipeline=my_test_scores_pipeline
-{
-  "student": "kimchy",
-  "grad_year": "2020",
-  "math_score": 800,
-  "verbal_score": 800
-}
----
-// TEST[continued]
-
-These changes may slow indexing but allow for faster searches. Users can now
-sort searches made on `my_test_scores_2` using the `total_score` field instead
-of using a script.
-
-[source,console]
----
-GET /my_test_scores_2/_search
-{
-  "query": {
-    "term": {
-      "grad_year": "2020"
-    }
-  },
-  "sort": [
-    {
-      "total_score": {
-        "order": "desc"
-      }
-    }
-  ]
-}
----
-// TEST[continued]
-
-////
-[source,console]
----
-DELETE /_ingest/pipeline/my_test_scores_pipeline
----
-// TEST[continued]
-
-[source,console-result]
----
-{
-"acknowledged": true
-}
----
-////
-====
-
-We recommend testing and benchmarking any indexing changes before deploying them
-in production.

 [float]
 === Search rounded dates
--- a/docs/reference/query-dsl/script-query.asciidoc
+++ b/docs/reference/query-dsl/script-query.asciidoc
@@ -7,6 +7,9 @@
 Filters documents based on a provided <<modules-scripting-using,script>>. The
 `script` query is typically used in a <<query-filter-context,filter context>>.

+WARNING: Using scripts can result in slower search speeds. See
+<<scripts-and-search-speed>>.
+

 [[script-query-ex-request]]
 ==== Example request
--- a/docs/reference/scripting/using.asciidoc
+++ b/docs/reference/scripting/using.asciidoc
@@ -260,6 +260,166 @@ changed by setting `script.max_size_in_bytes` setting to increase that soft
 limit, but if scripts are really large then a
 <<modules-scripting-engine,native script engine>> should be considered.

+[[scripts-and-search-speed]]
+=== Scripts and search speed
+
+Scripts can't make use of {es}'s index structures or related optimizations. This
+can sometimes result in slower search speeds.
+
+If you often use scripts to transform indexed data, you can speed up search by
+making these changes during ingest instead. However, that often means slower
+index speeds.
+
+.*Example*
+[%collapsible]
+=====
+An index, `my_test_scores`, contains two `long` fields:
+
+* `math_score`
+* `verbal_score`
+
+When running searches, users often use a script to sort results by the sum of
+these two fields' values.
+
+[source,console]
+----
+GET /my_test_scores/_search
+{
+  "query": {
+    "term": {
+      "grad_year": "2099"
+    }
+  },
+  "sort": [
+    {
+      "_script": {
+        "type": "number",
+        "script": {
+          "source": "doc['math_score'].value + doc['verbal_score'].value"
+        },
+        "order": "desc"
+      }
+    }
+  ]
+}
+----
+// TEST[s/^/PUT my_test_scores\n/]
+
+To speed up search, you can perform this calculation during ingest and index the
+sum to a field instead.
+
+First, <<indices-put-mapping,add a new field>>, `total_score`, to the index. The
+`total_score` field will contain the sum of the `math_score` and `verbal_score`
+field values.
+
+[source,console]
+----
+PUT /my_test_scores/_mapping
+{
+  "properties": {
+    "total_score": {
+      "type": "long"
+    }
+  }
+}
+----
+// TEST[continued]
+
+Next, use an <<ingest,ingest pipeline>> containing the
+<<script-processor,`script`>> processor to calculate the sum of `math_score` and
+`verbal_score` and index it in the `total_score` field.
+
+[source,console]
+----
+PUT _ingest/pipeline/my_test_scores_pipeline
+{
+  "description": "Calculates the total test score",
+  "processors": [
+    {
+      "script": {
+        "source": "ctx.total_score = (ctx.math_score + ctx.verbal_score)"
+      }
+    }
+  ]
+}
+----
+// TEST[continued]
+
+To update existing data, use this pipeline to <<docs-reindex,reindex>> any
+documents from `my_test_scores` to a new index, `my_test_scores_2`.
+
+[source,console]
+----
+POST /_reindex
+{
+  "source": {
+    "index": "my_test_scores"
+  },
+  "dest": {
+    "index": "my_test_scores_2",
+    "pipeline": "my_test_scores_pipeline"
+  }
+}
+----
+// TEST[continued]
+
+Continue using the pipeline to index any new documents to `my_test_scores_2`.
+
+[source,console]
+----
+POST /my_test_scores_2/_doc/?pipeline=my_test_scores_pipeline
+{
+  "student": "kimchy",
+  "grad_year": "2099",
+  "math_score": 800,
+  "verbal_score": 800
+}
+----
+// TEST[continued]
+
+These changes may slow indexing but allow for faster searches. Users can now
+sort searches made on `my_test_scores_2` using the `total_score` field instead
+of using a script.
+
+[source,console]
+----
+GET /my_test_scores_2/_search
+{
+  "query": {
+    "term": {
+      "grad_year": "2099"
+    }
+  },
+  "sort": [
+    {
+      "total_score": {
+        "order": "desc"
+      }
+    }
+  ]
+}
+----
+// TEST[continued]
+
+////
+[source,console]
+----
+DELETE /_ingest/pipeline/my_test_scores_pipeline
+----
+// TEST[continued]
+
+[source,console-result]
+----
+{
+"acknowledged": true
+}
+----
+////
+=====
+
+We recommend testing and benchmarking any indexing changes before deploying them
+in production.
+
 [float]
 [[modules-scripting-errors]]
 === Script errors
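
The trade-off described in the new "Scripts and search speed" section can be sketched outside {es} in plain Python. This is only an illustration under stated assumptions: the `docs` list is a hypothetical in-memory stand-in for the `my_test_scores` index, and the helper names (`sort_with_script`, `ingest`, `sort_with_field`) are invented for this sketch and are not part of any Elasticsearch API. It shows that both approaches give the same ordering, while the ingest-time version computes each sum once instead of on every search.

```python
# Hypothetical in-memory stand-in for documents in `my_test_scores`.
docs = [
    {"student": "a", "math_score": 700, "verbal_score": 650},
    {"student": "b", "math_score": 800, "verbal_score": 800},
    {"student": "c", "math_score": 600, "verbal_score": 720},
]


def sort_with_script(documents):
    """Search-time approach: recompute the sum for every document on each search."""
    return sorted(
        documents,
        key=lambda d: d["math_score"] + d["verbal_score"],
        reverse=True,
    )


def ingest(document):
    """Ingest-time approach: compute total_score once, when the document is stored."""
    enriched = dict(document)  # copy so the original document is left unchanged
    enriched["total_score"] = document["math_score"] + document["verbal_score"]
    return enriched


def sort_with_field(documents):
    """Search-time work is reduced to reading the precomputed field."""
    return sorted(documents, key=lambda d: d["total_score"], reverse=True)


# Pay the cost once per document at "index" time.
indexed = [ingest(d) for d in docs]

script_order = [d["student"] for d in sort_with_script(docs)]
field_order = [d["student"] for d in sort_with_field(indexed)]
```

The extra work in `ingest` mirrors the slower indexing the section warns about. How much search time this saves in a real cluster depends on the data and queries, so benchmarking remains necessary.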