OpenSearch/docs/reference/search/benchmark.asciidoc

[[search-benchmark]]
== Benchmark

.Experimental!
[IMPORTANT]
=====
This feature is marked as experimental, and may be subject to change in the
future. If you use this feature, please let us know your experience with it!
=====

The benchmark API provides a standard mechanism for submitting queries and
measuring their performance relative to one another.

[IMPORTANT]
=====
To be eligible to run benchmarks, a node must be started with `--node.bench true`. This setting only marks the node as an "executor"; searches are still distributed across the cluster in the normal manner. It is primarily a defensive measure to prevent production nodes from being flooded with a potentially large number of requests. Typically one would start a single node with this setting and submit benchmark requests to it.
=====

[source,bash]
--------------------------------------------------
$ ./bin/elasticsearch --node.bench true
--------------------------------------------------

Benchmarking a search request is as simple as executing the following command:

[source,js]
--------------------------------------------------
$ curl -XPUT 'localhost:9200/_bench/?pretty=true' -d '{
    "name": "my_benchmark",
    "competitors": [ {
        "name": "my_competitor",
        "requests": [ {
            "query": {
                "match": { "_all": "a*" }
            }
        } ]
    } ]
}'
--------------------------------------------------
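
The same request can also be prepared from a script. The following is a minimal Python sketch, assuming a node listening on `localhost:9200` as in the curl command. `build_bench_request` is a hypothetical helper name, not part of any client library, and only the standard library is used. The request is built but not sent.

[source,python]
--------------------------------------------------
import json
import urllib.request

def build_bench_request(body, host="localhost:9200"):
    """Build (but do not send) a PUT request for the _bench endpoint."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        "http://%s/_bench/?pretty=true" % host,
        data=data,
        # Set explicitly here; the curl example relies on curl's default.
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

benchmark = {
    "name": "my_benchmark",
    "competitors": [{
        "name": "my_competitor",
        "requests": [{"query": {"match": {"_all": "a*"}}}],
    }],
}

request = build_bench_request(benchmark)
print(request.get_method(), request.full_url)
--------------------------------------------------

Passing `request` to `urllib.request.urlopen` would submit it, which requires a running node started with `--node.bench true`.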

Response:

[source,js]
--------------------------------------------------
{
  "status" : "complete",
  "competitors" : {
    "my_competitor" : {
      "summary" : {
        "nodes" : [ "localhost" ],
        "total_iterations" : 5,
        "completed_iterations" : 5,
        "total_queries" : 1000,
        "concurrency" : 5,
        "multiplier" : 100,
        "avg_warmup_time" : 43.0,
        "statistics" : {
          "min" : 1,
          "max" : 10,
          "mean" : 4.19,
          "qps" : 238.663,
          "std_dev" : 1.938,
          "millis_per_hit" : 1.064,
          "percentile_10" : 2,
          "percentile_25" : 3,
          "percentile_50" : 4,
          "percentile_75" : 5,
          "percentile_90" : 7,
          "percentile_99" : 10
        },
        "slowest" : [ {
          "node" : "localhost",
          "max_time" : 15,
          "avg_time" : 4,
          "request":{"query":{"match":{"_all":"a*"}}}
        } ]
      }
    }
  }
}
--------------------------------------------------
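
The percentile values are reported as separate `percentile_N` keys inside `statistics`. As a rough sketch, assuming the response has already been parsed into a Python dict and that the keys always follow the `percentile_<number>` pattern seen in the sample, they can be collected into a sorted list. `percentiles` is a hypothetical helper name.

[source,python]
--------------------------------------------------
def percentiles(statistics):
    """Return [(percentile, value), ...] sorted by percentile."""
    prefix = "percentile_"
    pairs = [
        (float(key[len(prefix):]), value)
        for key, value in statistics.items()
        if key.startswith(prefix)
    ]
    return sorted(pairs)

# Values copied from the "statistics" object of the sample response.
statistics = {
    "min": 1, "max": 10, "mean": 4.19, "qps": 238.663,
    "std_dev": 1.938, "millis_per_hit": 1.064,
    "percentile_10": 2, "percentile_25": 3, "percentile_50": 4,
    "percentile_75": 5, "percentile_90": 7, "percentile_99": 10,
}

for p, value in percentiles(statistics):
    print("p%g: %s" % (p, value))
--------------------------------------------------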

A 'competitor' defines one or more search requests to execute, along with parameters that describe how the searches should be run.
Multiple competitors may be submitted as a group, in which case they execute one after the other. This makes it easy to compare
competing alternatives side by side.

There are several parameters which may be set at the competitor level:
[horizontal]
`name`::            Unique name for the competitor
`iterations`::      Number of times to run the competitors
`concurrency`::     Within each iteration use this level of parallelism
`multiplier`::      Within each iteration run the query this many times
`warmup`::          Perform warmup of query
`num_slowest`::     Record N slowest queries
`search_type`::     Type of search, e.g. "query_then_fetch", "dfs_query_then_fetch", "count" 
`requests`::        Query DSL describing search requests
`clear_caches`::    Whether caches should be cleared on each iteration, and if so, how
`indices`::         Array of indices (and optional types) to search, e.g. ["my_index_1/my_type_1", "my_index_2", "my_index_3/my_type_3"]

Cache clearing parameters:
[horizontal]
`clear_caches`::                Set to 'false' to disable cache clearing completely
`clear_caches.filter`::         Whether to clear the filter cache
`clear_caches.field_data`::     Whether to clear the field data cache
`clear_caches.id`::             Whether to clear the id cache
`clear_caches.recycler`::       Whether to clear the recycler cache
`clear_caches.fields`::         Array of fields to clear
`clear_caches.filter_keys`::    Array of filter keys to clear
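
The dotted names in this list correspond to keys nested under `clear_caches` in the request body, as the request bodies in this section show. A small Python sketch of assembling one competitor entry follows. `make_competitor` is a hypothetical helper name, and the set of fields it writes is simply the subset used in this section's examples.

[source,python]
--------------------------------------------------
import json

def make_competitor(name, indices, search_type, clear_caches=False):
    """Assemble one entry for the "competitors" array of a benchmark request.

    clear_caches is either False (no cache clearing) or a dict using the
    keys listed above without the "clear_caches." prefix.
    """
    return {
        "name": name,
        "indices": list(indices),
        "search_type": search_type,
        "clear_caches": clear_caches,
    }

competitor = make_competitor(
    "competitor_1",
    ["my_index_1"],
    "query_then_fetch",
    clear_caches={"filter": True, "field_data": True, "fields": ["title"]},
)
print(json.dumps(competitor, indent=4))
--------------------------------------------------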

Global parameters:
[horizontal]
`name`::                    Unique name for the benchmark
`num_executor_nodes`::      Number of cluster nodes from which to submit and time benchmarks. This allows a benchmark to run simultaneously on one or more nodes so that timings can be compared. Note that this does not control how many nodes a search request actually executes on. Defaults to: 1.
`percentiles`::             Array of percentile values to report. Defaults to: [10, 25, 50, 75, 90, 99]

Additionally, the following competitor-level parameters may be set globally: `iterations`, `concurrency`, `multiplier`, `warmup`, and `clear_caches`.
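
One way to picture how global and competitor-level values combine is sketched below in Python. This rests on an assumption the text does not state: that a value set on a competitor takes precedence over the same parameter set globally. `effective_settings` is a hypothetical helper, and the parameter names use the `iterations` spelling from the parameter list.

[source,python]
--------------------------------------------------
# Parameters described above as settable globally.
INHERITABLE = ("iterations", "concurrency", "multiplier", "warmup", "clear_caches")

def effective_settings(benchmark, competitor):
    """Combine global and competitor-level values for one competitor.

    Assumption (not confirmed by the documentation): a competitor-level
    value overrides the global one.
    """
    settings = {key: benchmark[key] for key in INHERITABLE if key in benchmark}
    settings.update({key: competitor[key] for key in INHERITABLE if key in competitor})
    return settings

benchmark = {"name": "my_benchmark", "iterations": 5, "multiplier": 1000, "clear_caches": False}
competitor = {"name": "competitor_1", "clear_caches": {"filter": True}}
print(effective_settings(benchmark, competitor))
--------------------------------------------------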

Using these parameters it is possible to describe precisely how to execute a benchmark under various conditions. In the following example we run a filtered query against two different indices using two different search types.

[source,js]
--------------------------------------------------
$ curl -XPUT 'localhost:9200/_bench/?pretty=true' -d '{
    "name": "my_benchmark",
    "num_executor_nodes": 1,
    "percentiles" : [ 25, 50, 75 ],
    "iterations": 5,
    "multiplier": 1000,
    "concurrency": 5,
    "num_slowest": 0,
    "warmup": true,
    "clear_caches": false,

    "requests": [ {
        "query" : {
            "filtered" : {
                "query" : { "match" : { "_all" : "*" } },
                "filter" : {
                    "and" : [ { "term" : { "title" : "Spain" } },
                              { "term" : { "title" : "rain" } },
                              { "term" : { "title" : "plain" } } ]
                }
            }
        }
    } ],

    "competitors": [ {
        "name": "competitor_1",
        "search_type": "query_then_fetch",
        "indices": [ "my_index_1" ],
        "clear_caches" : {
            "filter" : true,
            "field_data" : true,
            "id" : true,
            "recycler" : true,
            "fields": ["title"]
        }
    }, {
        "name": "competitor_2",
        "search_type": "dfs_query_then_fetch",
        "indices": [ "my_index_2" ],
        "clear_caches" : {
            "filter" : true,
            "field_data" : true,
            "id" : true,
            "recycler" : true,
            "fields": ["title"]
        }
    } ]
}'
--------------------------------------------------

Response:

[source,js]
--------------------------------------------------
{
  "status" : "complete",
  "competitors" : {
    "competitor_1" : {
      "summary" : {
        "nodes" : [ "localhost" ],
        "total_iterations" : 5,
        "completed_iterations" : 5,
        "total_queries" : 5000,
        "concurrency" : 5,
        "multiplier" : 1000,
        "avg_warmup_time" : 54.0,
        "statistics" : {
          "min" : 0,
          "max" : 3,
          "mean" : 0.533,
          "qps" : 1872.659,
          "std_dev" : 0.528,
          "millis_per_hit" : 0.0,
          "percentile_25" : 0.0,
          "percentile_50" : 1.0,
          "percentile_75" : 1.0
        },
        "slowest" : [ ]
      }
    },
    "competitor_2" : {
      "summary" : {
        "nodes" : [ "localhost" ],
        "total_iterations" : 5,
        "completed_iterations" : 5,
        "total_queries" : 5000,
        "concurrency" : 5,
        "multiplier" : 1000,
        "avg_warmup_time" : 4.0,
        "statistics" : {
          "min" : 0,
          "max" : 4,
          "mean" : 0.487,
          "qps" : 2049.180,
          "std_dev" : 0.545,
          "millis_per_hit" : 0.0,
          "percentile_25" : 0.0,
          "percentile_50" : 0.0,
          "percentile_75" : 1.0
        },
        "slowest" : [ ]
      }
    }
  }
}
--------------------------------------------------
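
With several competitors in one response, a side-by-side comparison can be scripted. The following Python sketch ranks competitors by their reported `qps`, assuming the response has been parsed into a dict with the shape shown above. `rank_by_qps` is a hypothetical helper name.

[source,python]
--------------------------------------------------
def rank_by_qps(response):
    """Return [(competitor_name, qps), ...], highest qps first."""
    rows = [
        (name, body["summary"]["statistics"]["qps"])
        for name, body in response["competitors"].items()
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Trimmed copy of the sample response; only the fields used here are kept.
response = {
    "status": "complete",
    "competitors": {
        "competitor_1": {"summary": {"statistics": {"mean": 0.533, "qps": 1872.659}}},
        "competitor_2": {"summary": {"statistics": {"mean": 0.487, "qps": 2049.180}}},
    },
}

for name, qps in rank_by_qps(response):
    print("%s: %.3f qps" % (name, qps))
--------------------------------------------------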

In some cases it may be desirable to view the progress of a long-running benchmark and optionally terminate it early. To view all active benchmarks use:

[source,js]
--------------------------------------------------
$ curl -XGET 'localhost:9200/_bench?pretty'
--------------------------------------------------

This displays run-time statistics in the same format as the sample output above.

To abort a long-running benchmark use the 'abort' endpoint:

[source,js]
--------------------------------------------------
$ curl -XPOST 'localhost:9200/_bench/abort/my_benchmark?pretty'
--------------------------------------------------

Response:

[source,js]
--------------------------------------------------
{
    "aborted_benchmarks" : [ {
        "node" : "localhost",
        "benchmark_name" : "my_benchmark",
        "aborted" : true
    } ]
}
--------------------------------------------------