[DOCS] Add performance warning for scripts (#59890) (#59913)

James Rodewig 2020-07-20 15:05:33 -04:00 committed by GitHub
parent e16e565c5e
commit 24fec52447
6 changed files with 175 additions and 156 deletions

@@ -10,6 +10,10 @@ are always constants and compiled efficiently a single time.
Pattern p = /[aeiou]/
---------------------------------------------------------
WARNING: A poorly written regular expression can significantly slow performance.
If possible, avoid using regular expressions, particularly in frequently run
scripts.
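
As a rough illustration of the advice above, the check below replaces a regular expression with a plain membership test. This is a sketch in Python, not Painless; the function names are hypothetical, and it makes no claim about how much faster either form runs in a real script.

```python
import re

# Characters matched by the /[aeiou]/ pattern shown above
VOWELS = frozenset("aeiou")


def has_vowel_regex(text: str) -> bool:
    """Regex version: searches the text for any character in [aeiou]."""
    return re.search(r"[aeiou]", text) is not None


def has_vowel_plain(text: str) -> bool:
    """Regex-free version: a set membership test per character."""
    return any(ch in VOWELS for ch in text)


samples = ["rhythm", "script", "", "xyz", "aggregation"]

# Both versions must agree on every sample
for sample in samples:
    assert has_vowel_regex(sample) == has_vowel_plain(sample)

print([has_vowel_plain(sample) for sample in samples])
# prints [False, True, False, False, True]
```

When a simple string operation gives the same answer, it is usually easier to reason about than a pattern. Benchmark both forms before assuming one is faster.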
[[pattern-flags]]
==== Pattern flags

@@ -3,6 +3,9 @@
A metric aggregation that executes using scripts to provide a metric output.
WARNING: Using scripts can result in slower search speeds. See
<<scripts-and-search-speed>>.
Example:
[source,console]

@@ -6,6 +6,9 @@
A `multi-value` metrics aggregation that computes statistics over string values extracted from the aggregated documents.
These values can be retrieved either from specific `keyword` fields in the documents or can be generated by a provided script.
WARNING: Using scripts can result in slower search speeds. See
<<scripts-and-search-speed>>.
The string stats aggregation returns the following results:
* `count` - The number of non-empty fields counted.

@@ -165,163 +165,9 @@ include::../mapping/types/numeric.asciidoc[tag=map-ids-as-keyword]
=== Avoid scripts
If possible, avoid using <<modules-scripting,scripts>> or
<<request-body-search-script-fields,scripted fields>> in searches. Because
scripts can't make use of index structures, using scripts in search queries can
result in slower search speeds.
<<request-body-search-script-fields,scripted fields>> in searches. See
<<scripts-and-search-speed>>.
If you often use scripts to transform indexed data, you can speed up search by
making these changes during ingest instead. However, that often means slower
index speeds.
.*Example*
[%collapsible]
====
An index, `my_test_scores`, contains two `long` fields:
* `math_score`
* `verbal_score`
When running searches, users often use a script to sort results by the sum of
these two fields' values.
[source,console]
----
GET /my_test_scores/_search
{
"query": {
"term": {
"grad_year": "2020"
}
},
"sort": [
{
"_script": {
"type": "number",
"script": {
"source": "doc['math_score'].value + doc['verbal_score'].value"
},
"order": "desc"
}
}
]
}
----
// TEST[s/^/PUT my_test_scores\n/]
To speed up search, you can perform this calculation during ingest and index the
sum to a field instead.
First, <<indices-put-mapping,add a new field>>, `total_score`, to the index. The
`total_score` field will contain the sum of the `math_score` and `verbal_score`
field values.
[source,console]
----
PUT /my_test_scores/_mapping
{
"properties": {
"total_score": {
"type": "long"
}
}
}
----
// TEST[continued]
Next, use an <<ingest,ingest pipeline>> containing the
<<script-processor,`script`>> processor to calculate the sum of `math_score` and
`verbal_score` and index it in the `total_score` field.
[source,console]
----
PUT _ingest/pipeline/my_test_scores_pipeline
{
"description": "Calculates the total test score",
"processors": [
{
"script": {
"source": "ctx.total_score = (ctx.math_score + ctx.verbal_score)"
}
}
]
}
----
// TEST[continued]
To update existing data, use this pipeline to <<docs-reindex,reindex>> any
documents from `my_test_scores` to a new index, `my_test_scores_2`.
[source,console]
----
POST /_reindex
{
"source": {
"index": "my_test_scores"
},
"dest": {
"index": "my_test_scores_2",
"pipeline": "my_test_scores_pipeline"
}
}
----
// TEST[continued]
Continue using the pipeline to index any new documents to `my_test_scores_2`.
[source,console]
----
POST /my_test_scores_2/_doc/?pipeline=my_test_scores_pipeline
{
"student": "kimchy",
"grad_year": "2020",
"math_score": 800,
"verbal_score": 800
}
----
// TEST[continued]
These changes may slow indexing but allow for faster searches. Users can now
sort searches made on `my_test_scores_2` using the `total_score` field instead
of using a script.
[source,console]
----
GET /my_test_scores_2/_search
{
"query": {
"term": {
"grad_year": "2020"
}
},
"sort": [
{
"total_score": {
"order": "desc"
}
}
]
}
----
// TEST[continued]
////
[source,console]
----
DELETE /_ingest/pipeline/my_test_scores_pipeline
----
// TEST[continued]
[source,console-result]
----
{
"acknowledged": true
}
----
////
====
We recommend testing and benchmarking any indexing changes before deploying them
in production.
[float]
=== Search rounded dates

@@ -7,6 +7,9 @@
Filters documents based on a provided <<modules-scripting-using,script>>. The
`script` query is typically used in a <<query-filter-context,filter context>>.
WARNING: Using scripts can result in slower search speeds. See
<<scripts-and-search-speed>>.
[[script-query-ex-request]]
==== Example request

@@ -260,6 +260,166 @@ changed by setting `script.max_size_in_bytes` setting to increase that soft
limit, but if scripts are really large then a
<<modules-scripting-engine,native script engine>> should be considered.
[[scripts-and-search-speed]]
=== Scripts and search speed
Scripts can't make use of {es}'s index structures or related optimizations. This
can sometimes result in slower search speeds.
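
To make the difference concrete, here is a toy model in Python. It is a deliberate simplification and not a description of how {es} or Lucene work internally: the in-memory `docs` map, the `grad_year_index` dictionary, and both helper functions are hypothetical. It only shows that an index-backed match is a lookup, while a script-style match has to evaluate every document.

```python
from collections import defaultdict

# Hypothetical documents keyed by document ID
docs = {
    1: {"grad_year": "2099", "math_score": 700, "verbal_score": 650},
    2: {"grad_year": "2098", "math_score": 800, "verbal_score": 790},
    3: {"grad_year": "2099", "math_score": 780, "verbal_score": 800},
}

# Toy inverted index, built once at index time: term -> set of doc IDs
grad_year_index = defaultdict(set)
for doc_id, doc in docs.items():
    grad_year_index[doc["grad_year"]].add(doc_id)


def term_lookup(term):
    """Index-backed match: one dictionary lookup, no documents visited."""
    return sorted(grad_year_index.get(term, set()))


def script_filter(predicate):
    """Script-style match: the predicate runs against every document."""
    evaluated = 0
    hits = []
    for doc_id, doc in docs.items():
        evaluated += 1
        if predicate(doc):
            hits.append(doc_id)
    return sorted(hits), evaluated


index_hits = term_lookup("2099")
script_hits, evaluated = script_filter(lambda d: d["grad_year"] == "2099")

print(index_hits, script_hits, evaluated)
# prints [1, 3] [1, 3] 3
```

Both approaches return the same documents, but the script-style filter's work grows with the number of documents it must evaluate.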
If you often use scripts to transform indexed data, you can speed up search by
making these changes during ingest instead. However, that often means slower
index speeds.
.*Example*
[%collapsible]
=====
An index, `my_test_scores`, contains two `long` fields:
* `math_score`
* `verbal_score`
When running searches, users often use a script to sort results by the sum of
these two fields' values.
[source,console]
----
GET /my_test_scores/_search
{
"query": {
"term": {
"grad_year": "2099"
}
},
"sort": [
{
"_script": {
"type": "number",
"script": {
"source": "doc['math_score'].value + doc['verbal_score'].value"
},
"order": "desc"
}
}
]
}
----
// TEST[s/^/PUT my_test_scores\n/]
To speed up search, you can perform this calculation during ingest and index the
sum to a field instead.
First, <<indices-put-mapping,add a new field>>, `total_score`, to the index. The
`total_score` field will contain the sum of the `math_score` and `verbal_score`
field values.
[source,console]
----
PUT /my_test_scores/_mapping
{
"properties": {
"total_score": {
"type": "long"
}
}
}
----
// TEST[continued]
Next, use an <<ingest,ingest pipeline>> containing the
<<script-processor,`script`>> processor to calculate the sum of `math_score` and
`verbal_score` and index it in the `total_score` field.
[source,console]
----
PUT _ingest/pipeline/my_test_scores_pipeline
{
"description": "Calculates the total test score",
"processors": [
{
"script": {
"source": "ctx.total_score = (ctx.math_score + ctx.verbal_score)"
}
}
]
}
----
// TEST[continued]
To update existing data, use this pipeline to <<docs-reindex,reindex>> any
documents from `my_test_scores` to a new index, `my_test_scores_2`.
[source,console]
----
POST /_reindex
{
"source": {
"index": "my_test_scores"
},
"dest": {
"index": "my_test_scores_2",
"pipeline": "my_test_scores_pipeline"
}
}
----
// TEST[continued]
Continue using the pipeline to index any new documents to `my_test_scores_2`.
[source,console]
----
POST /my_test_scores_2/_doc/?pipeline=my_test_scores_pipeline
{
"student": "kimchy",
"grad_year": "2099",
"math_score": 800,
"verbal_score": 800
}
----
// TEST[continued]
These changes may slow indexing but allow for faster searches. Users can now
sort searches made on `my_test_scores_2` using the `total_score` field instead
of using a script.
[source,console]
----
GET /my_test_scores_2/_search
{
"query": {
"term": {
"grad_year": "2099"
}
},
"sort": [
{
"total_score": {
"order": "desc"
}
}
]
}
----
// TEST[continued]
////
[source,console]
----
DELETE /_ingest/pipeline/my_test_scores_pipeline
----
// TEST[continued]
[source,console-result]
----
{
"acknowledged": true
}
----
////
=====
We recommend testing and benchmarking any indexing changes before deploying them
in production.
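
The trade-off in the example can also be sketched outside {es}. The Python below is an illustrative model only: `ingest_pipeline` is a hypothetical stand-in for the `script` processor, and the student names and scores are made up. It shows that sorting on a sum computed at search time and sorting on a precomputed `total_score` field give the same order, with the arithmetic moved to ingest.

```python
def ingest_pipeline(doc):
    """Stand-in for the script processor: add total_score once, at ingest."""
    enriched = dict(doc)  # copy so the raw document is left unchanged
    enriched["total_score"] = enriched["math_score"] + enriched["verbal_score"]
    return enriched


# Hypothetical raw documents
raw_docs = [
    {"student": "a", "math_score": 700, "verbal_score": 650},
    {"student": "b", "math_score": 800, "verbal_score": 800},
    {"student": "c", "math_score": 780, "verbal_score": 790},
]

# Search-time approach: the sum is recomputed for each document on every sort
script_sorted = sorted(
    raw_docs,
    key=lambda d: d["math_score"] + d["verbal_score"],
    reverse=True,
)

# Ingest-time approach: pay for the sum once, then sort on the stored field
indexed_docs = [ingest_pipeline(d) for d in raw_docs]
field_sorted = sorted(indexed_docs, key=lambda d: d["total_score"], reverse=True)

script_order = [d["student"] for d in script_sorted]
field_order = [d["student"] for d in field_sorted]

print(script_order, field_order)
# prints ['b', 'c', 'a'] ['b', 'c', 'a']
```

The results match, so the choice comes down to where the cost is paid, which is why benchmarking both indexing and search is worthwhile.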
[float]
[[modules-scripting-errors]]
=== Script errors