OpenSearch

README.md

Steps to execute the benchmark

1. Build client-benchmark-noop-api-plugin with ./gradlew :client:client-benchmark-noop-api-plugin:assemble
2. Install it on the target host with bin/elasticsearch-plugin install file:///full/path/to/client-benchmark-noop-api-plugin.zip.
3. Start Elasticsearch on the target host (ideally not on the machine that runs the benchmarks).
4. Run the benchmark with:

./gradlew -p client/benchmark run --args ' params go here'

Everything inside the single quotes is passed on the command line to JMH. The leading space inside the quotes is important: without it, parameters are sometimes passed to Gradle instead.
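The leading space is easy to lose when the argument string is assembled by a script. The following is a minimal sketch of one way to guard against that; `benchmark_args` is a hypothetical helper name, not something the benchmark provides.

```shell
# Hypothetical helper (not part of the benchmark): prints all of its
# arguments as one string that starts with the required leading space
# and separates the parameters with single spaces (default IFS assumed).
benchmark_args() {
  printf ' %s' "$*"
}

# Print the resulting command line for inspection before running it.
echo "./gradlew -p client/benchmark run --args '$(benchmark_args rest bulk localhost build/documents-2.json geonames type 8647880 5000)'"
```

Printing the command first makes it easy to confirm that the quoted string really begins with a space.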

See below for some example invocations.

Example benchmark

In general, you should define a few GC-related settings (-Xms8192M -Xmx8192M -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails) and keep an eye on GC activity. The CMS collector was removed in JDK 14, so omit -XX:+UseConcMarkSweepGC on JDK 14 and later. You can also define -XX:+PrintCompilation to see JIT activity.

Bulk indexing

Download benchmark data from http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames and decompress it.

Example invocation:

wget http://benchmarks.elasticsearch.org.s3.amazonaws.com/corpora/geonames/documents-2.json.bz2
bzip2 -d documents-2.json.bz2
mv documents-2.json client/benchmark/build
./gradlew -p client/benchmark run --args ' rest bulk localhost build/documents-2.json geonames type 8647880 5000'

All parameters go inside the single quotes, in this order:

1. Client type: Use either "rest" or "transport"
2. Benchmark type: Use either "bulk" or "search"
3. Benchmark target host IP (the host where Elasticsearch is running)
4. Full path to the file that should be bulk indexed
5. Name of the index
6. Name of the (sole) type in the index
7. Number of documents in the file
8. Bulk size
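The document count has to match the file, so it can help to derive it rather than type it by hand. The sketch below assumes the corpus contains exactly one JSON document per line, which is an assumption about the file format and not something the benchmark checks. `bulk_args` is a hypothetical helper name.

```shell
# Hypothetical helper: prints the --args value for a bulk run.
# Arguments: client type, host, file, index name, type name, bulk size.
# Assumption: one JSON document per line, each line ending in a newline,
# so the line count equals the document count.
bulk_args() {
  client=$1 host=$2 file=$3 index=$4 type=$5 bulk_size=$6
  # tr strips the padding that some wc implementations add.
  doc_count=$(wc -l < "$file" | tr -d ' ')
  printf ' %s bulk %s %s %s %s %s %s' \
    "$client" "$host" "$file" "$index" "$type" "$doc_count" "$bulk_size"
}

# Usage (the file must exist):
#   ./gradlew -p client/benchmark run \
#     --args "$(bulk_args rest localhost build/documents-2.json geonames type 5000)"
```

If the last line of the file has no trailing newline, `wc -l` reports one document too few, so check the result against the known size of the corpus.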

Search

Example invocation:

./gradlew -p client/benchmark run --args ' rest search localhost geonames {"query":{"match_phrase":{"name":"Sankt Georgen"}}} 500,1000,1100,1200'

The parameters, in order, are:

1. Client type: Use either "rest" or "transport"
2. Benchmark type: Use either "bulk" or "search"
3. Benchmark target host IP (the host where Elasticsearch is running)
4. Name of the index
5. Search request body (remember to escape double quotes). TransportClientBenchmark uses QueryBuilders.wrapperQuery() internally, which automatically adds the root key "query", so that key must not be present in the command-line parameter when the client type is "transport".
6. Comma-separated list of target throughput rates
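Because the two client types expect different request bodies, one option is to keep only the inner query and add the root "query" key when it is needed. This is a minimal sketch under that reading of the parameter list above; `search_body` is a hypothetical helper name.

```shell
# Hypothetical helper: prints the search request body for a client type.
# Arguments: client type ("rest" or "transport"), inner query as JSON.
search_body() {
  client=$1 inner_query=$2
  case "$client" in
    # REST: send the full body, including the root "query" key.
    rest)      printf '{"query":%s}' "$inner_query" ;;
    # Transport: wrapperQuery() adds the root key, so pass the inner query only.
    transport) printf '%s' "$inner_query" ;;
    *)         echo "unknown client type: $client" >&2; return 1 ;;
  esac
}

search_body rest '{"match_phrase":{"name":"Sankt Georgen"}}'; echo
search_body transport '{"match_phrase":{"name":"Sankt Georgen"}}'; echo
```

The helper produces the body only. The double quotes in its output still need to be escaped when the body is placed inside the --args string.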