OpenSearch/client/benchmark
Nik Everett 22459576d7
Logging: Make node name consistent in logger (#31588)
First, some background: we have 15 different methods to get a logger in
Elasticsearch but they can be broken down into three broad categories
based on what information is provided when building the logger.

Just a class like:
```
private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class);
```
or:
```
protected final Logger logger = Loggers.getLogger(getClass());
```

The class and settings:
```
this.logger = Loggers.getLogger(getClass(), settings);
```

Or more information like:
```
Loggers.getLogger("index.store.deletes", settings, shardId)
```

The goal of the "class and settings" variant is to attach the node name
to the logger. Because we don't always have the settings available, we
often use the "just a class" variant and get loggers without node names
attached. There isn't any real consistency here. Some loggers get the
node name because it is convenient and some do not.

This change makes the node name available to all loggers all the time.
Almost. There are some caveats are testing that I'll get to. But in
*production* code the node name is node available to all loggers. This
means we can stop using the "class and settings" variants to fetch
loggers which was the real goal here, but a pleasant side effect is that
the ndoe name is now consitent on every log line and optional by editing
the logging pattern. This is all powered by setting the node name
statically on a logging formatter very early in initialization.

Now to tests: tests can't set the node name statically because
subclasses of `ESIntegTestCase` run many nodes in the same jvm, even in
the same class loader. Also, lots of tests don't run with a real node so
they don't *have* a node name at all. To support multiple nodes in the
same JVM tests suss out the node name from the thread name which works
surprisingly well and easy to test in a nice way. For those threads
that are not part of an `ESIntegTestCase` node we stick whatever useful
information we can get form the thread name in the place of the node
name. This allows us to keep the logger format consistent.
2018-07-31 10:54:24 -04:00
..
src/main Logging: Make node name consistent in logger (#31588) 2018-07-31 10:54:24 -04:00
README.md Add client-benchmark-noop-api-plugin to stress clients even more in benchmarks (#20103) 2016-08-26 09:05:47 +02:00
build.gradle Build: Move shadow customizations into common code (#32014) 2018-07-17 14:20:41 -04:00

README.md

Steps to execute the benchmark

  1. Build client-benchmark-noop-api-plugin with gradle :client:client-benchmark-noop-api-plugin:assemble
  2. Install it on the target host with bin/elasticsearch-plugin install file:///full/path/to/client-benchmark-noop-api-plugin.zip
  3. Start Elasticsearch on the target host (ideally not on the same machine)
  4. Build an uberjar with gradle :client:benchmark:shadowJar and execute it.

Repeat all steps above for the other benchmark candidate.

Example benchmark

In general, you should define a few GC-related settings -Xms8192M -Xmx8192M -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails and keep an eye on GC activity. You can also define -XX:+PrintCompilation to see JIT activity.

Bulk indexing

Download benchmark data from http://benchmarks.elastic.co/corpora/geonames/documents.json.bz2 and decompress them.

Example command line parameters:

rest bulk 192.168.2.2 ./documents.json geonames type 8647880 5000

The parameters are in order:

  • Client type: Use either "rest" or "transport"
  • Benchmark type: Use either "bulk" or "search"
  • Benchmark target host IP (the host where Elasticsearch is running)
  • full path to the file that should be bulk indexed
  • name of the index
  • name of the (sole) type in the index
  • number of documents in the file
  • bulk size

Bulk indexing

Example command line parameters:

rest search 192.168.2.2 geonames "{ \"query\": { \"match_phrase\": { \"name\": \"Sankt Georgen\" } } }\"" 500,1000,1100,1200

The parameters are in order:

  • Client type: Use either "rest" or "transport"
  • Benchmark type: Use either "bulk" or "search"
  • Benchmark target host IP (the host where Elasticsearch is running)
  • name of the index
  • a search request body (remember to escape double quotes). The TransportClientBenchmark uses QueryBuilders.wrapperQuery() internally which automatically adds a root key query, so it must not be present in the command line parameter.
  • A comma-separated list of target throughput rates