[[search-benchmark]]
== Benchmark

.Experimental!
[IMPORTANT]
=====
This feature is marked as experimental, and may be subject to change in the
future. If you use this feature, please let us know your experience with it!
=====

The benchmark API provides a standard mechanism for submitting queries and
measuring their performance relative to one another.

[IMPORTANT]
=====
To be eligible to run benchmarks, nodes must be started with *es.node.bench=true*. This setting simply marks certain nodes as "executors"; searches are still distributed across the cluster in the normal manner. It is primarily a defensive measure to prevent production nodes from being flooded with potentially many requests. Typically one would start a single node with this setting and submit benchmark requests to it.
=====

[source,bash]
--------------------------------------------------
$ ./bin/elasticsearch -Des.node.bench=true
--------------------------------------------------

Benchmarking a search request is as simple as executing the following command:

[source,js]
--------------------------------------------------
$ curl -XPUT 'localhost:9200/_bench/?pretty=true' -d '{
    "name": "my_benchmark",
    "competitors": [ {
        "name": "my_competitor",
        "requests": [ {
            "query": {
                "match": { "_all": "a*" }
            }
        } ]
    } ]
}'
--------------------------------------------------

Response:

[source,js]
--------------------------------------------------
{
  "status" : "complete",
  "competitors" : {
    "my_competitor" : {
      "summary" : {
        "nodes" : [ "localhost" ],
        "total_iterations" : 5,
        "completed_iterations" : 5,
        "total_queries" : 1000,
        "concurrency" : 5,
        "multiplier" : 100,
        "avg_warmup_time" : 43.0,
        "statistics" : {
          "min" : 1,
          "max" : 10,
          "mean" : 4.19,
          "qps" : 238.663,
          "std_dev" : 1.938,
          "millis_per_hit" : 1.064,
          "percentile_10" : 2,
          "percentile_25" : 3,
          "percentile_50" : 4,
          "percentile_75" : 5,
          "percentile_90" : 7,
          "percentile_99" : 10
        },
        "slowest" : [ {
          "node" : "localhost",
          "max_time" : 15,
          "avg_time" : 4,
          "request":{"query":{"match":{"_all":"a*"}}}
        } ]
      }
    }
  }
}
--------------------------------------------------

A 'competitor' defines one or more search requests to execute, along with parameters that describe how the search(es) should be run.
Multiple competitors may be submitted as a group, in which case they execute one after the other. This makes it easy to compare
competing alternatives side by side.
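
As a rough illustration of the request shape, the following Python sketch assembles a benchmark body with two competitors and wraps it in a `PUT` request to `_bench`. The helper names (`make_competitor`, `build_benchmark_request`), the host, and the competitor names are illustrative assumptions rather than part of the API. The request is only constructed here, not sent.

```python
import json
import urllib.request


def make_competitor(name, query, **settings):
    """Return one competitor: a name, its search request, and optional settings."""
    competitor = {"name": name, "requests": [{"query": query}]}
    competitor.update(settings)  # e.g. search_type, indices
    return competitor


def build_benchmark_request(name, competitors, host="localhost:9200"):
    """Build (but do not send) a PUT request for the _bench endpoint."""
    body = {"name": name, "competitors": competitors}
    return urllib.request.Request(
        url="http://%s/_bench/?pretty=true" % host,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


query = {"match": {"_all": "a*"}}
request = build_benchmark_request(
    "my_benchmark",
    [
        make_competitor("competitor_1", query, search_type="query_then_fetch"),
        make_competitor("competitor_2", query, search_type="dfs_query_then_fetch"),
    ],
)
# Passing `request` to urllib.request.urlopen() would submit it to a
# bench-enabled node; that step is left out so the sketch runs offline.
```

Because the competitors are submitted as one group, both appear in a single response, which keeps the comparison between them in one place.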

There are several parameters which may be set at the competition level:
[horizontal]
`name`::            Unique name for the competition
`iterations`::      Number of times to run the competitors
`concurrency`::     Within each iteration use this level of parallelism
`multiplier`::      Within each iteration run the query this many times
`warmup`::          Perform warmup of query
`num_slowest`::     Record N slowest queries
`search_type`::     Type of search, e.g. "query_then_fetch", "dfs_query_then_fetch", "count"
`requests`::        Query DSL describing search requests
`clear_caches`::    Whether caches should be cleared on each iteration, and if so, how
`indices`::         Array of indices (and optional types) to search, e.g. ["my_index_1/my_type_1", "my_index_2", "my_index_3/my_type_3"]

Cache clearing parameters:
[horizontal]
`clear_caches`::                Set to 'false' to disable cache clearing completely
`clear_caches.filter`::         Whether to clear the filter cache
`clear_caches.field_data`::     Whether to clear the field data cache
`clear_caches.id`::             Whether to clear the id cache
`clear_caches.recycler`::       Whether to clear the recycler cache
`clear_caches.fields`::         Array of fields to clear
`clear_caches.filter_keys`::    Array of filter keys to clear

Global parameters:
[horizontal]
`name`::                    Unique name for the benchmark
`num_executor_nodes`::      Number of cluster nodes from which to submit and time benchmarks. Allows the user to run a benchmark simultaneously on one or more nodes and compare timings. Note that this does not control how many nodes a search request will actually execute on. Defaults to: 1.
`percentiles`::             Array of percentile values to report. Defaults to: [10, 25, 50, 75, 90, 99]

Additionally, the following competition-level parameters may be set globally: `iterations`, `concurrency`, `multiplier`, `warmup`, and `clear_caches`.
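
One way to picture how global and competitor-level settings might combine is the Python sketch below. It is an assumption that a value set on a competitor takes precedence over the global one; the text above does not state the precedence. The function name `effective_settings` is hypothetical.

```python
# Settings the text lists as usable at both the global and the competitor level.
SHARED_SETTINGS = ("iterations", "concurrency", "multiplier", "warmup", "clear_caches")


def effective_settings(benchmark, competitor):
    """Combine global and competitor-level settings for one competitor.

    Assumption: a value set on the competitor wins over the global value.
    Settings present at neither level are omitted rather than guessed.
    """
    resolved = {}
    for key in SHARED_SETTINGS:
        if key in competitor:
            resolved[key] = competitor[key]
        elif key in benchmark:
            resolved[key] = benchmark[key]
    return resolved


benchmark = {
    "name": "my_benchmark",
    "iterations": 5,
    "multiplier": 1000,
    "clear_caches": False,
    "competitors": [
        {"name": "competitor_1", "clear_caches": {"filter": True}},
        {"name": "competitor_2", "multiplier": 100},
    ],
}

for competitor in benchmark["competitors"]:
    print(competitor["name"], effective_settings(benchmark, competitor))
```

Setting shared values globally keeps each competitor entry limited to what actually differs between the alternatives being compared.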

Using these parameters, it is possible to describe precisely how to execute a benchmark under various conditions. The following example runs a filtered query against two different indices using two different search types.

[source,js]
--------------------------------------------------
$ curl -XPUT 'localhost:9200/_bench/?pretty=true' -d '{
    "name": "my_benchmark",
    "num_executor_nodes": 1,
    "percentiles" : [ 25, 50, 75 ],
    "iterations": 5,
    "multiplier": 1000,
    "concurrency": 5,
    "num_slowest": 0,
    "warmup": true,
    "clear_caches": false,

    "requests": [ {
        "query" : {
            "filtered" : {
                "query" : { "match" : { "_all" : "*" } },
                "filter" : {
                    "and" : [ { "term" : { "title" : "Spain" } },
                              { "term" : { "title" : "rain" } },
                              { "term" : { "title" : "plain" } } ]
                }
            }
        }
    } ],

    "competitors": [ {
        "name": "competitor_1",
        "search_type": "query_then_fetch",
        "indices": [ "my_index_1" ],
        "clear_caches" : {
            "filter" : true,
            "field_data" : true,
            "id" : true,
            "recycler" : true,
            "fields": ["title"]
        }
    }, {
        "name": "competitor_2",
        "search_type": "dfs_query_then_fetch",
        "indices": [ "my_index_2" ],
        "clear_caches" : {
            "filter" : true,
            "field_data" : true,
            "id" : true,
            "recycler" : true,
            "fields": ["title"]
        }
    } ]
}'
--------------------------------------------------

Response:

[source,js]
--------------------------------------------------
{
  "status" : "complete",
  "competitors" : {
    "competitor_1" : {
      "summary" : {
        "nodes" : [ "localhost" ],
        "total_iterations" : 5,
        "completed_iterations" : 5,
        "total_queries" : 5000,
        "concurrency" : 5,
        "multiplier" : 1000,
        "avg_warmup_time" : 54.0,
        "statistics" : {
          "min" : 0,
          "max" : 3,
          "mean" : 0.533,
          "qps" : 1872.659,
          "std_dev" : 0.528,
          "millis_per_hit" : 0.0,
          "percentile_25" : 0.0,
          "percentile_50" : 1.0,
          "percentile_75" : 1.0
        },
        "slowest" : [ ]
      }
    },
    "competitor_2" : {
      "summary" : {
        "nodes" : [ "localhost" ],
        "total_iterations" : 5,
        "completed_iterations" : 5,
        "total_queries" : 5000,
        "concurrency" : 5,
        "multiplier" : 1000,
        "avg_warmup_time" : 4.0,
        "statistics" : {
          "min" : 0,
          "max" : 4,
          "mean" : 0.487,
          "qps" : 2049.180,
          "std_dev" : 0.545,
          "millis_per_hit" : 0.0,
          "percentile_25" : 0.0,
          "percentile_50" : 0.0,
          "percentile_75" : 1.0
        },
        "slowest" : [ ]
      }
    }
  }
}
--------------------------------------------------
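
To compare competitors programmatically, a response like the one above can be parsed and sorted. The Python sketch below uses a trimmed copy of that sample response and ranks competitors by their `qps` value, read here as queries per second, which is an interpretation of the field name rather than something the text defines. The function name `rank_by_qps` is hypothetical.

```python
import json

# Trimmed copy of the sample response above, keeping only the fields used here.
SAMPLE_RESPONSE = """
{
  "status": "complete",
  "competitors": {
    "competitor_1": {"summary": {"statistics": {"mean": 0.533, "qps": 1872.659}}},
    "competitor_2": {"summary": {"statistics": {"mean": 0.487, "qps": 2049.180}}}
  }
}
"""


def rank_by_qps(response):
    """Return (name, qps, mean) tuples, highest qps first."""
    rows = []
    for name, result in response["competitors"].items():
        stats = result["summary"]["statistics"]
        rows.append((name, stats["qps"], stats["mean"]))
    return sorted(rows, key=lambda row: row[1], reverse=True)


ranking = rank_by_qps(json.loads(SAMPLE_RESPONSE))
for name, qps, mean in ranking:
    print("%-14s qps=%.3f mean=%.3f" % (name, qps, mean))
```

With the sample figures, `competitor_2` ranks first, which matches its lower mean time in the response.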

In some cases it may be desirable to view the progress of a long-running benchmark and optionally terminate it early. To view all active benchmarks use:

[source,js]
--------------------------------------------------
$ curl -XGET 'localhost:9200/_bench?pretty'
--------------------------------------------------

This displays run-time statistics in the same format as the sample output above.

To abort a long-running benchmark use the 'abort' endpoint:

[source,js]
--------------------------------------------------
$ curl -XPOST 'localhost:9200/_bench/abort/my_benchmark?pretty'
--------------------------------------------------

Response:

[source,js]
--------------------------------------------------
{
    "aborted_benchmarks" : [ {
        "node" : "localhost",
        "benchmark_name" : "my_benchmark",
        "aborted" : true
    } ]
}
--------------------------------------------------
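
The status and abort calls shown above can also be prepared from a script. The Python sketch below only builds the two requests with the standard library and does not send them. The function names `status_request` and `abort_request` and the default host are illustrative assumptions.

```python
import urllib.request
from urllib.parse import quote


def status_request(host="localhost:9200"):
    """GET request listing all active benchmarks."""
    return urllib.request.Request("http://%s/_bench?pretty" % host, method="GET")


def abort_request(benchmark_name, host="localhost:9200"):
    """POST request asking the cluster to abort the named benchmark."""
    # Percent-encode the name so unusual characters cannot alter the URL path.
    path = "/_bench/abort/%s?pretty" % quote(benchmark_name, safe="")
    return urllib.request.Request("http://%s%s" % (host, path), method="POST")


abort = abort_request("my_benchmark")
print(abort.get_method(), abort.full_url)
# Passing either request to urllib.request.urlopen() would perform the call.
```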