[[search-benchmark]]
== Benchmark
experimental[]
The benchmark API provides a standard mechanism for submitting queries and
measuring their performance relative to one another.
[IMPORTANT]
=====
To be eligible to run benchmarks, nodes must be started with `--node.bench true`. This setting only marks certain nodes as "executors"; searches are still distributed across the cluster in the normal manner. It is primarily a defensive measure that prevents production nodes from being flooded with a potentially large number of requests. Typically, one would start a single node with this setting and submit benchmark requests to it.
=====
[source,bash]
--------------------------------------------------
$ ./bin/elasticsearch --node.bench true
--------------------------------------------------
Benchmarking a search request is as simple as executing the following command:
[source,js]
--------------------------------------------------
$ curl -XPUT 'localhost:9200/_bench/?pretty=true' -d '{
    "name": "my_benchmark",
    "competitors": [ {
        "name": "my_competitor",
        "requests": [ {
            "query": {
                "match": { "_all": "a*" }
            }
        } ]
    } ]
}'
--------------------------------------------------
Response:
[source,js]
--------------------------------------------------
{
    "status" : "complete",
    "competitors" : {
        "my_competitor" : {
            "summary" : {
                "nodes" : [ "localhost" ],
                "total_iterations" : 5,
                "completed_iterations" : 5,
                "total_queries" : 1000,
                "concurrency" : 5,
                "multiplier" : 100,
                "avg_warmup_time" : 43.0,
                "statistics" : {
                    "min" : 1,
                    "max" : 10,
                    "mean" : 4.19,
                    "qps" : 238.663,
                    "std_dev" : 1.938,
                    "millis_per_hit" : 1.064,
                    "percentile_10" : 2,
                    "percentile_25" : 3,
                    "percentile_50" : 4,
                    "percentile_75" : 5,
                    "percentile_90" : 7,
                    "percentile_99" : 10
                }
            }
        }
    }
}
--------------------------------------------------
A 'competitor' defines one or more search requests to execute, along with parameters that describe how the search(es) should be run.
Multiple competitors may be submitted as a group, in which case they execute one after the other. This makes it easy to compare
competing alternatives side by side.
There are several parameters which may be set at the competition level:
[horizontal]
`name`:: Unique name for the competition.
`iterations`:: Number of times to run the competitors. Defaults to `5`.
`concurrency`:: Within each iteration use this level of parallelism. Defaults to `5`.
`multiplier`:: Within each iteration run the query this many times. Defaults to `1000`.
`warmup`:: Perform warmup of query. Defaults to `true`.
`num_slowest`:: Record N slowest queries. Defaults to `1`.
`search_type`:: Type of search, e.g. "query_then_fetch", "dfs_query_then_fetch", "count". Defaults to `query_then_fetch`.
`requests`:: Query DSL describing search requests.
`clear_caches`:: Whether caches should be cleared on each iteration, and if so, how. Caches are not cleared by default.
`indices`:: Array of indices to search, e.g. ["my_index_1", "my_index_2", "my_index_3"].
`types`:: Array of index types to search, e.g. ["my_type_1", "my_type_2"].
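
The competition-level parameters above can be assembled programmatically before being sent to the API. The following is a minimal sketch in Python, not part of Elasticsearch itself: the helper name `build_competitor` and the two lookup tables are hypothetical, and the defaults are copied from the list above only so that misspelled parameter names can be rejected. Parameters that are not passed are left out of the body so that the server-side defaults apply.

[source,python]
--------------------------------------------------
import json

# Defaults as documented in the parameter list above (for reference only).
DOCUMENTED_DEFAULTS = {
    "iterations": 5,
    "concurrency": 5,
    "multiplier": 1000,
    "warmup": True,
    "num_slowest": 1,
    "search_type": "query_then_fetch",
}

# Documented parameters that have no scalar default.
OTHER_PARAMETERS = {"clear_caches", "indices", "types"}


def build_competitor(name, requests, **settings):
    """Return a competitor object, rejecting parameter names not listed above."""
    unknown = set(settings) - set(DOCUMENTED_DEFAULTS) - OTHER_PARAMETERS
    if unknown:
        raise ValueError("unknown competitor parameters: %s" % sorted(unknown))
    competitor = {"name": name, "requests": list(requests)}
    competitor.update(settings)  # only explicit overrides are sent
    return competitor


competitor = build_competitor(
    "my_competitor",
    [{"query": {"match": {"_all": "a*"}}}],
    concurrency=10,
    indices=["my_index_1"],
)
body = {"name": "my_benchmark", "competitors": [competitor]}
print(json.dumps(body, indent=2))
--------------------------------------------------

The printed JSON can be used as the `-d` payload of the `curl` command shown earlier.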
Cache clearing parameters:
[horizontal]
`clear_caches`:: Set to 'false' to disable cache clearing completely.
`clear_caches.filter`:: Whether to clear the filter cache.
`clear_caches.field_data`:: Whether to clear the field data cache.
`clear_caches.id`:: Whether to clear the id cache.
`clear_caches.recycler`:: Whether to clear the recycler cache.
`clear_caches.fields`:: Array of fields to clear.
`clear_caches.filter_keys`:: Array of filter keys to clear.
Global parameters:
[horizontal]
`name`:: Unique name for the benchmark.
`num_executor_nodes`:: Number of cluster nodes from which to submit and time benchmarks. Allows the user to run a benchmark simultaneously on one or more nodes and compare timings. Note that this does not control how many nodes a search request will actually execute on. Defaults to `1`.
`percentiles`:: Array of percentile values to report. Defaults to `[10, 25, 50, 75, 90, 99]`.
Additionally, the following competition-level parameters may be set globally: `iterations`, `concurrency`, `multiplier`, `warmup`, and `clear_caches`.
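
To reason about which settings a given competitor ends up with, the global and competition-level values can be combined on the client side. The sketch below is illustrative only: the function `effective_settings` is hypothetical, and it assumes that a value set on a competitor replaces the global value of the same name, which this page suggests but does not state explicitly.

[source,python]
--------------------------------------------------
# Parameters the text lists as settable at the global level.
GLOBALLY_SETTABLE = ("iterations", "concurrency", "multiplier", "warmup", "clear_caches")


def effective_settings(benchmark, competitor):
    """Combine global and competitor-level settings for one competitor.

    Assumption: a value set on the competitor replaces the global value.
    """
    merged = {key: benchmark[key] for key in GLOBALLY_SETTABLE if key in benchmark}
    merged.update({key: value for key, value in competitor.items() if key != "name"})
    return merged


benchmark = {
    "name": "my_benchmark",
    "iterations": 5,
    "clear_caches": False,
}
competitor = {"name": "competitor_1", "clear_caches": {"filter": True}}

print(effective_settings(benchmark, competitor))
--------------------------------------------------
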
Using these parameters it is possible to describe precisely how to execute a benchmark under various conditions. In the following example we run a filtered query against two different indices using two different search types.
[source,js]
--------------------------------------------------
$ curl -XPUT 'localhost:9200/_bench/?pretty=true' -d '{
    "name": "my_benchmark",
    "num_executor_nodes": 1,
    "percentiles" : [ 25, 50, 75 ],
    "iterations": 5,
    "multiplier": 1000,
    "concurrency": 5,
    "num_slowest": 0,
    "warmup": true,
    "clear_caches": false,
    "requests": [ {
        "query" : {
            "filtered" : {
                "query" : { "match" : { "_all" : "*" } },
                "filter" : {
                    "and" : [ { "term" : { "title" : "Spain" } },
                              { "term" : { "title" : "rain" } },
                              { "term" : { "title" : "plain" } } ]
                }
            }
        }
    } ],
    "competitors": [ {
        "name": "competitor_1",
        "search_type": "query_then_fetch",
        "indices": [ "my_index_1" ],
        "types": [ "my_type_1" ],
        "clear_caches" : {
            "filter" : true,
            "field_data" : true,
            "id" : true,
            "recycler" : true,
            "fields": ["title"]
        }
    }, {
        "name": "competitor_2",
        "search_type": "dfs_query_then_fetch",
        "indices": [ "my_index_2" ],
        "types": [ "my_type_2" ],
        "clear_caches" : {
            "filter" : true,
            "field_data" : true,
            "id" : true,
            "recycler" : true,
            "fields": ["title"]
        }
    } ]
}'
--------------------------------------------------
Response:
[source,js]
--------------------------------------------------
{
    "status" : "complete",
    "competitors" : {
        "competitor_1" : {
            "summary" : {
                "nodes" : [ "localhost" ],
                "total_iterations" : 5,
                "completed_iterations" : 5,
                "total_queries" : 5000,
                "concurrency" : 5,
                "multiplier" : 1000,
                "avg_warmup_time" : 54.0,
                "statistics" : {
                    "min" : 0,
                    "max" : 3,
                    "mean" : 0.533,
                    "qps" : 1872.659,
                    "std_dev" : 0.528,
                    "millis_per_hit" : 0.0,
                    "percentile_25" : 0.0,
                    "percentile_50" : 1.0,
                    "percentile_75" : 1.0
                }
            }
        },
        "competitor_2" : {
            "summary" : {
                "nodes" : [ "localhost" ],
                "total_iterations" : 5,
                "completed_iterations" : 5,
                "total_queries" : 5000,
                "concurrency" : 5,
                "multiplier" : 1000,
                "avg_warmup_time" : 4.0,
                "statistics" : {
                    "min" : 0,
                    "max" : 4,
                    "mean" : 0.487,
                    "qps" : 2049.180,
                    "std_dev" : 0.545,
                    "millis_per_hit" : 0.0,
                    "percentile_25" : 0.0,
                    "percentile_50" : 0.0,
                    "percentile_75" : 1.0
                }
            }
        }
    }
}
--------------------------------------------------
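
Since each competitor reports its own `statistics` object, a response like the one above can be reduced to a short side-by-side comparison. The following Python sketch shows one way to do that; the function name `summarize` is hypothetical, and the input is a trimmed copy of the sample response containing only the `mean` and `qps` fields.

[source,python]
--------------------------------------------------
def summarize(response):
    """Return (competitor name, mean, qps) tuples, lowest mean first."""
    rows = []
    for name, result in response["competitors"].items():
        stats = result["summary"]["statistics"]
        rows.append((name, stats["mean"], stats["qps"]))
    return sorted(rows, key=lambda row: row[1])


# Trimmed copy of the sample response above.
response = {
    "status": "complete",
    "competitors": {
        "competitor_1": {"summary": {"statistics": {"mean": 0.533, "qps": 1872.659}}},
        "competitor_2": {"summary": {"statistics": {"mean": 0.487, "qps": 2049.180}}},
    },
}

for name, mean, qps in summarize(response):
    print("%-14s mean=%.3f  qps=%.3f" % (name, mean, qps))
--------------------------------------------------

With the sample figures, `competitor_2` is listed first because it has the lower mean.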
In some cases it may be desirable to view the progress of a long-running benchmark and optionally terminate it early. To view all active benchmarks use:
[source,js]
--------------------------------------------------
$ curl -XGET 'localhost:9200/_bench?pretty'
--------------------------------------------------
This displays run-time statistics in the same format as the sample output above.
To abort a long-running benchmark use the 'abort' endpoint:
[source,js]
--------------------------------------------------
$ curl -XPOST 'localhost:9200/_bench/abort/my_benchmark?pretty'
--------------------------------------------------
Response:
[source,js]
--------------------------------------------------
{
    "aborted_benchmarks" : [ {
        "node" : "localhost",
        "benchmark_name" : "my_benchmark",
        "aborted" : true
    } ]
}
--------------------------------------------------
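
The three endpoints used on this page (submit, status, and abort) can also be addressed from a script. The sketch below uses only the Python standard library to construct the requests without sending them. The helper names are hypothetical, the host and port are assumed to match the `curl` examples, and percent-encoding the benchmark name in the abort path is an assumption rather than documented behavior.

[source,python]
--------------------------------------------------
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumed, as in the curl examples


def submit_request(body):
    """Build a PUT to /_bench carrying a JSON benchmark definition."""
    data = json.dumps(body).encode("utf-8")
    return Request(BASE_URL + "/_bench/?pretty=true", data=data, method="PUT")


def status_request():
    """Build a GET that lists all active benchmarks."""
    return Request(BASE_URL + "/_bench?pretty", method="GET")


def abort_request(benchmark_name):
    """Build a POST to the abort endpoint for the named benchmark."""
    path = "/_bench/abort/" + quote(benchmark_name, safe="")
    return Request(BASE_URL + path + "?pretty", method="POST")


request = abort_request("my_benchmark")
print(request.get_method(), request.full_url)
--------------------------------------------------

Passing one of these objects to `urllib.request.urlopen` would send it to a running node.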