OpenSearch/docs/reference/search/benchmark.asciidoc
Adrien Grand 95f46f1212 Docs: Use the new experimental annotation.
We now have a very useful annotation to mark features or parameters as
experimental. Let's use it! This commit replaces some custom text warnings with
this annotation and adds this annotation to some existing features/parameters:
 - inner_hits (unreleased yet)
 - terminate_after (released in 1.4)
 - per-bucket doc count errors in the terms agg (released in 1.4)

I also tagged with this annotation settings which should either be not needed
(like the ability to evict entries from the filter cache based on time) or that
are too deep into the way that Elasticsearch works like the Directory
implementation or merge settings.

Close #9563
2015-02-05 15:29:45 +01:00


[[search-benchmark]]
== Benchmark
experimental[]
The benchmark API provides a standard mechanism for submitting queries and
measuring their performance relative to one another.
[IMPORTANT]
=====
To be eligible to run benchmarks, a node must be started with `--node.bench true`. This setting only marks the node as an "executor": searches are still distributed across the cluster in the normal manner. It is primarily a defensive measure that prevents production nodes from being flooded with a potentially large number of requests. Typically, one would start a single node with this setting and submit benchmark requests to it.
=====
[source,bash]
--------------------------------------------------
$ ./bin/elasticsearch --node.bench true
--------------------------------------------------
Benchmarking a search request is as simple as executing the following command:
[source,js]
--------------------------------------------------
$ curl -XPUT 'localhost:9200/_bench/?pretty=true' -d '{
    "name": "my_benchmark",
    "competitors": [ {
        "name": "my_competitor",
        "requests": [ {
            "query": {
                "match": { "_all": "a*" }
            }
        } ]
    } ]
}'
--------------------------------------------------
Response:
[source,js]
--------------------------------------------------
{
    "status" : "complete",
    "competitors" : {
        "my_competitor" : {
            "summary" : {
                "nodes" : [ "localhost" ],
                "total_iterations" : 5,
                "completed_iterations" : 5,
                "total_queries" : 1000,
                "concurrency" : 5,
                "multiplier" : 100,
                "avg_warmup_time" : 43.0,
                "statistics" : {
                    "min" : 1,
                    "max" : 10,
                    "mean" : 4.19,
                    "qps" : 238.663,
                    "std_dev" : 1.938,
                    "millis_per_hit" : 1.064,
                    "percentile_10" : 2,
                    "percentile_25" : 3,
                    "percentile_50" : 4,
                    "percentile_75" : 5,
                    "percentile_90" : 7,
                    "percentile_99" : 10
                }
            }
        }
    }
}
--------------------------------------------------
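
As a rough illustration, the same request can also be prepared from a script. The sketch below is not part of the benchmark API: it relies only on Python's standard `urllib`, and the helper name `build_benchmark_request` and its `host` default are hypothetical. It builds the `PUT` request without sending it.

[source,python]
--------------------------------------------------
import json
import urllib.request


def build_benchmark_request(body, host="localhost:9200"):
    """Return an unsent PUT request for the _bench endpoint.

    `host` is an assumed default that mirrors the curl example above.
    """
    data = json.dumps(body).encode("utf-8")
    url = "http://%s/_bench/?pretty=true" % host
    return urllib.request.Request(url, data=data, method="PUT")


# Same body as the curl example above.
body = {
    "name": "my_benchmark",
    "competitors": [{
        "name": "my_competitor",
        "requests": [{"query": {"match": {"_all": "a*"}}}],
    }],
}

request = build_benchmark_request(body)
print(request.get_method(), request.full_url)
--------------------------------------------------

Passing the returned object to `urllib.request.urlopen` would submit the benchmark, assuming a node started with `--node.bench true` is listening on that address.
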
A 'competitor' defines one or more search requests to execute, along with parameters that describe how the searches should be run.
Multiple competitors may be submitted as a group, in which case they execute one after the other. This makes it easy to compare
competing alternatives side by side.

The following parameters may be set at the competitor level:
[horizontal]
`name`:: Unique name for the competitor.
`iterations`:: Number of times to run the competitor. Defaults to `5`.
`concurrency`:: Level of parallelism to use within each iteration. Defaults to `5`.
`multiplier`:: Number of times to run the query within each iteration. Defaults to `1000`.
`warmup`:: Whether to perform a warmup of the query. Defaults to `true`.
`num_slowest`:: Number of slowest queries to record. Defaults to `1`.
`search_type`:: Type of search, e.g. `query_then_fetch`, `dfs_query_then_fetch`, `count`. Defaults to `query_then_fetch`.
`requests`:: Query DSL describing the search requests.
`clear_caches`:: Whether caches should be cleared on each iteration and, if so, how. Caches are not cleared by default.
`indices`:: Array of indices to search, e.g. `["my_index_1", "my_index_2", "my_index_3"]`.
`types`:: Array of index types to search, e.g. `["my_type_1", "my_type_2"]`.
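
To make the defaults listed above concrete, here is a minimal sketch of a helper that assembles a competitor definition. The names `COMPETITOR_DEFAULTS` and `make_competitor` are hypothetical and exist only for illustration. Sending the defaults explicitly is not required; the sketch simply makes them visible and rejects misspelled parameter names.

[source,python]
--------------------------------------------------
# Defaults copied from the parameter list above.
COMPETITOR_DEFAULTS = {
    "iterations": 5,
    "concurrency": 5,
    "multiplier": 1000,
    "warmup": True,
    "num_slowest": 1,
    "search_type": "query_then_fetch",
}

# Parameters that the list above documents without a concrete default value.
OPTIONAL_KEYS = {"clear_caches", "indices", "types"}


def make_competitor(name, requests, **overrides):
    """Build a competitor dict, starting from the documented defaults."""
    unknown = set(overrides) - set(COMPETITOR_DEFAULTS) - OPTIONAL_KEYS
    if unknown:
        raise ValueError("unknown competitor parameters: %s" % sorted(unknown))
    competitor = dict(COMPETITOR_DEFAULTS)
    competitor.update(overrides)
    competitor["name"] = name
    competitor["requests"] = requests
    return competitor


competitor = make_competitor(
    "my_competitor",
    [{"query": {"match": {"_all": "a*"}}}],
    multiplier=100,
    indices=["my_index_1"],
)
--------------------------------------------------
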
Cache clearing parameters:
[horizontal]
`clear_caches`:: Set to `false` to disable cache clearing completely.
`clear_caches.filter`:: Whether to clear the filter cache.
`clear_caches.field_data`:: Whether to clear the field data cache.
`clear_caches.id`:: Whether to clear the id cache.
`clear_caches.recycler`:: Whether to clear the recycler cache.
`clear_caches.fields`:: Array of fields to clear.
`clear_caches.filter_keys`:: Array of filter keys to clear.
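
The dotted names above correspond to keys nested under `clear_caches` in the request body. The following sketch shows one way to build that value. The helper name `clear_caches_setting` is hypothetical; it returns `False` when nothing is selected and a nested object otherwise.

[source,python]
--------------------------------------------------
# Boolean cache flags taken from the list above.
CACHE_FLAGS = ("filter", "field_data", "id", "recycler")


def clear_caches_setting(flags=(), fields=None, filter_keys=None):
    """Return the value for `clear_caches` in a benchmark request."""
    unknown = set(flags) - set(CACHE_FLAGS)
    if unknown:
        raise ValueError("unknown cache flags: %s" % sorted(unknown))
    if not flags and not fields and not filter_keys:
        # Nothing selected: disable cache clearing completely.
        return False
    setting = {flag: True for flag in flags}
    if fields:
        setting["fields"] = list(fields)
    if filter_keys:
        setting["filter_keys"] = list(filter_keys)
    return setting


no_clearing = clear_caches_setting()
some_clearing = clear_caches_setting(["filter", "id"], fields=["title"])
--------------------------------------------------
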
Global parameters:
[horizontal]
`name`:: Unique name for the benchmark.
`num_executor_nodes`:: Number of cluster nodes from which to submit and time benchmarks. This allows a benchmark to be run simultaneously on one or more nodes so that timings can be compared. Note that it does not control how many nodes a search request actually executes on. Defaults to `1`.
`percentiles`:: Array of percentile values to report. Defaults to `[10, 25, 50, 75, 90, 99]`.

Additionally, the following competitor-level parameters may be set globally: `iterations`, `concurrency`, `multiplier`, `warmup`, and `clear_caches`.
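
One way to reason about a request that sets a parameter both globally and on a competitor is sketched below. It rests on an assumption that this page does not state explicitly: that a value set on a competitor takes precedence over the global one. The helper name `effective_settings` is hypothetical.

[source,python]
--------------------------------------------------
# Parameters that may be set globally as well as per competitor.
SHARED_KEYS = ("iterations", "concurrency", "multiplier", "warmup", "clear_caches")


def effective_settings(benchmark, competitor):
    """Combine global and competitor-level values.

    ASSUMPTION: competitor-level values override global ones.
    """
    settings = {key: benchmark[key] for key in SHARED_KEYS if key in benchmark}
    settings.update({key: competitor[key] for key in SHARED_KEYS if key in competitor})
    return settings


benchmark = {"name": "my_benchmark", "iterations": 5, "clear_caches": False}
competitor = {"name": "competitor_1", "clear_caches": {"filter": True}}
settings = effective_settings(benchmark, competitor)
--------------------------------------------------
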
Using these parameters, it is possible to describe precisely how to execute a benchmark under various conditions. The following example runs a filtered query against two different indices using two different search types.
[source,js]
--------------------------------------------------
$ curl -XPUT 'localhost:9200/_bench/?pretty=true' -d '{
    "name": "my_benchmark",
    "num_executor_nodes": 1,
    "percentiles" : [ 25, 50, 75 ],
    "iterations": 5,
    "multiplier": 1000,
    "concurrency": 5,
    "num_slowest": 0,
    "warmup": true,
    "clear_caches": false,
    "requests": [ {
        "query" : {
            "filtered" : {
                "query" : { "match" : { "_all" : "*" } },
                "filter" : {
                    "and" : [ { "term" : { "title" : "Spain" } },
                              { "term" : { "title" : "rain" } },
                              { "term" : { "title" : "plain" } } ]
                }
            }
        }
    } ],
    "competitors": [ {
        "name": "competitor_1",
        "search_type": "query_then_fetch",
        "indices": [ "my_index_1" ],
        "types": [ "my_type_1" ],
        "clear_caches" : {
            "filter" : true,
            "field_data" : true,
            "id" : true,
            "recycler" : true,
            "fields": ["title"]
        }
    }, {
        "name": "competitor_2",
        "search_type": "dfs_query_then_fetch",
        "indices": [ "my_index_2" ],
        "types": [ "my_type_2" ],
        "clear_caches" : {
            "filter" : true,
            "field_data" : true,
            "id" : true,
            "recycler" : true,
            "fields": ["title"]
        }
    } ]
}'
--------------------------------------------------
Response:
[source,js]
--------------------------------------------------
{
    "status" : "complete",
    "competitors" : {
        "competitor_1" : {
            "summary" : {
                "nodes" : [ "localhost" ],
                "total_iterations" : 5,
                "completed_iterations" : 5,
                "total_queries" : 5000,
                "concurrency" : 5,
                "multiplier" : 1000,
                "avg_warmup_time" : 54.0,
                "statistics" : {
                    "min" : 0,
                    "max" : 3,
                    "mean" : 0.533,
                    "qps" : 1872.659,
                    "std_dev" : 0.528,
                    "millis_per_hit" : 0.0,
                    "percentile_25" : 0.0,
                    "percentile_50" : 1.0,
                    "percentile_75" : 1.0
                }
            }
        },
        "competitor_2" : {
            "summary" : {
                "nodes" : [ "localhost" ],
                "total_iterations" : 5,
                "completed_iterations" : 5,
                "total_queries" : 5000,
                "concurrency" : 5,
                "multiplier" : 1000,
                "avg_warmup_time" : 4.0,
                "statistics" : {
                    "min" : 0,
                    "max" : 4,
                    "mean" : 0.487,
                    "qps" : 2049.180,
                    "std_dev" : 0.545,
                    "millis_per_hit" : 0.0,
                    "percentile_25" : 0.0,
                    "percentile_50" : 0.0,
                    "percentile_75" : 1.0
                }
            }
        }
    }
}
--------------------------------------------------
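
Because every competitor reports the same statistics, a response like the one above lends itself to a side-by-side comparison. The sketch below ranks competitors by `qps`. The helper name `rank_by_qps` is hypothetical, and the embedded JSON is a trimmed copy of the sample response rather than output from a live cluster.

[source,python]
--------------------------------------------------
import json


def rank_by_qps(response):
    """Return (competitor name, qps, mean) tuples, highest qps first."""
    rows = []
    for name, competitor in response["competitors"].items():
        stats = competitor["summary"]["statistics"]
        rows.append((name, stats["qps"], stats["mean"]))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Trimmed copy of the sample response above.
sample = json.loads("""
{
    "status": "complete",
    "competitors": {
        "competitor_1": {"summary": {"statistics": {"mean": 0.533, "qps": 1872.659}}},
        "competitor_2": {"summary": {"statistics": {"mean": 0.487, "qps": 2049.180}}}
    }
}
""")

ranking = rank_by_qps(sample)
for name, qps, mean in ranking:
    print("%s: %.3f qps, mean %.3f" % (name, qps, mean))
--------------------------------------------------

For the sample data, `competitor_2` is listed first because it reports the higher `qps` value.
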
In some cases, it may be desirable to view the progress of a long-running benchmark and optionally terminate it early. To view all active benchmarks, use:
[source,js]
--------------------------------------------------
$ curl -XGET 'localhost:9200/_bench?pretty'
--------------------------------------------------
This displays run-time statistics in the same format as the sample output above.

To abort a long-running benchmark, use the `abort` endpoint:
[source,js]
--------------------------------------------------
$ curl -XPOST 'localhost:9200/_bench/abort/my_benchmark?pretty'
--------------------------------------------------
Response:
[source,js]
--------------------------------------------------
{
    "aborted_benchmarks" : [
        {
            "node" : "localhost",
            "benchmark_name" : "my_benchmark",
            "aborted" : true
        }
    ]
}
--------------------------------------------------
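
The status and abort calls shown above can be prepared from a script in the same way. This is a minimal sketch using Python's standard `urllib`. The helper names `build_status_request` and `build_abort_request` and the `host` default are hypothetical, and neither request is sent.

[source,python]
--------------------------------------------------
import urllib.request
from urllib.parse import quote


def build_status_request(host="localhost:9200"):
    """Return an unsent GET request that lists active benchmarks."""
    return urllib.request.Request("http://%s/_bench?pretty" % host, method="GET")


def build_abort_request(benchmark_name, host="localhost:9200"):
    """Return an unsent POST request that aborts the named benchmark."""
    # Percent-encode the name so it is safe to place in the URL path.
    name = quote(benchmark_name, safe="")
    url = "http://%s/_bench/abort/%s?pretty" % (host, name)
    return urllib.request.Request(url, method="POST")


status_request = build_status_request()
abort_request = build_abort_request("my_benchmark")
print(abort_request.get_method(), abort_request.full_url)
--------------------------------------------------
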