22459576d7
First, some background: we have 15 different methods to get a logger in Elasticsearch, but they can be broken down into three broad categories based on what information is provided when building the logger.

Just a class like:
```
private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class);
```
or:
```
protected final Logger logger = Loggers.getLogger(getClass());
```

The class and settings:
```
this.logger = Loggers.getLogger(getClass(), settings);
```

Or more information like:
```
Loggers.getLogger("index.store.deletes", settings, shardId)
```

The goal of the "class and settings" variant is to attach the node name to the logger. Because we don't always have the settings available, we often use the "just a class" variant and get loggers without node names attached. There isn't any real consistency here. Some loggers get the node name because it is convenient and some do not.

This change makes the node name available to all loggers all the time. Almost. There are some caveats around testing that I'll get to. But in *production* code the node name is now available to all loggers. This means we can stop using the "class and settings" variants to fetch loggers, which was the real goal here, but a pleasant side effect is that the node name is now consistent on every log line and can be made optional by editing the logging pattern. This is all powered by setting the node name statically on a logging formatter very early in initialization.

Now to tests: tests can't set the node name statically because subclasses of `ESIntegTestCase` run many nodes in the same JVM, even in the same class loader. Also, lots of tests don't run with a real node so they don't *have* a node name at all. To support multiple nodes in the same JVM, tests suss out the node name from the thread name, which works surprisingly well and is easy to test.

For those threads that are not part of an `ESIntegTestCase` node we stick whatever useful information we can get from the thread name in the place of the node name. This allows us to keep the logger format consistent.
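The thread-name lookup described above could look roughly like the following. This is only a sketch: the `elasticsearch[<node name>][...]` thread-name shape, the class name, and the method name are assumptions made for illustration, not the code from this change.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Elasticsearch.
final class NodeNameFromThread {
    // Assumed thread-name shape: "elasticsearch[<node name>][<pool>][T#<n>]".
    private static final Pattern NODE_THREAD =
        Pattern.compile("^elasticsearch\\[([^\\]]+)\\]");

    /**
     * Returns the node name when the thread name matches the assumed shape,
     * otherwise falls back to the thread name itself so the log format
     * always has something to print in the node-name slot.
     */
    static String nodeNameOrFallback(String threadName) {
        Matcher m = NODE_THREAD.matcher(threadName);
        return m.find() ? m.group(1) : threadName;
    }

    public static void main(String[] args) {
        // prints "node_s0"
        System.out.println(nodeNameOrFallback("elasticsearch[node_s0][search][T#1]"));
        // prints the current thread's name, since it is not a node thread
        System.out.println(nodeNameOrFallback(Thread.currentThread().getName()));
    }
}
```

Falling back to the raw thread name mirrors the idea of putting whatever useful information is available in place of the node name.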
README.md
Steps to execute the benchmark
- Build `client-benchmark-noop-api-plugin` with `gradle :client:client-benchmark-noop-api-plugin:assemble`
- Install it on the target host with `bin/elasticsearch-plugin install file:///full/path/to/client-benchmark-noop-api-plugin.zip`
- Start Elasticsearch on the target host (ideally not on the same machine)
- Build an uberjar with `gradle :client:benchmark:shadowJar` and execute it.
Repeat all steps above for the other benchmark candidate.
Example benchmark
In general, you should define a few GC-related settings (`-Xms8192M -Xmx8192M -XX:+UseConcMarkSweepGC -verbose:gc -XX:+PrintGCDetails`) and keep an eye on GC activity. You can also define `-XX:+PrintCompilation` to see JIT activity.
Bulk indexing
Download benchmark data from http://benchmarks.elastic.co/corpora/geonames/documents.json.bz2 and decompress it.
Example command line parameters:
rest bulk 192.168.2.2 ./documents.json geonames type 8647880 5000
The parameters are in order:
- Client type: Use either "rest" or "transport"
- Benchmark type: Use either "bulk" or "search"
- Benchmark target host IP (the host where Elasticsearch is running)
- full path to the file that should be bulk indexed
- name of the index
- name of the (sole) type in the index
- number of documents in the file
- bulk size
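To make the parameter order concrete, here is a minimal sketch of how the eight positional parameters for a bulk run could be validated and read. The class and field names are hypothetical and chosen for illustration; this is not the benchmark's actual argument handling.

```java
// Hypothetical holder for the bulk benchmark's positional parameters.
final class BulkBenchmarkArgs {
    final String clientType;    // "rest" or "transport"
    final String benchmarkType; // "bulk" for this variant
    final String targetHost;    // host where Elasticsearch is running
    final String documentsFile; // path to the file to bulk index
    final String indexName;
    final String typeName;
    final int documentCount;    // number of documents in the file
    final int bulkSize;

    private BulkBenchmarkArgs(String[] a) {
        clientType = a[0];
        benchmarkType = a[1];
        targetHost = a[2];
        documentsFile = a[3];
        indexName = a[4];
        typeName = a[5];
        documentCount = Integer.parseInt(a[6]);
        bulkSize = Integer.parseInt(a[7]);
    }

    /** Checks the count and the two enumerated values, then reads the rest in order. */
    static BulkBenchmarkArgs parse(String... args) {
        if (args.length != 8) {
            throw new IllegalArgumentException("expected 8 parameters but got " + args.length);
        }
        if (!args[0].equals("rest") && !args[0].equals("transport")) {
            throw new IllegalArgumentException("client type must be rest or transport");
        }
        if (!args[1].equals("bulk")) {
            throw new IllegalArgumentException("benchmark type must be bulk");
        }
        return new BulkBenchmarkArgs(args);
    }

    public static void main(String[] args) {
        BulkBenchmarkArgs parsed = parse(
            "rest", "bulk", "192.168.2.2", "./documents.json",
            "geonames", "type", "8647880", "5000");
        // prints "geonames 8647880 5000"
        System.out.println(parsed.indexName + " " + parsed.documentCount + " " + parsed.bulkSize);
    }
}
```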
Search
Example command line parameters:
rest search 192.168.2.2 geonames "{ \"query\": { \"match_phrase\": { \"name\": \"Sankt Georgen\" } } }" 500,1000,1100,1200
The parameters are in order:
- Client type: Use either "rest" or "transport"
- Benchmark type: Use either "bulk" or "search"
- Benchmark target host IP (the host where Elasticsearch is running)
- name of the index
- a search request body (remember to escape double quotes). The `TransportClientBenchmark` uses `QueryBuilders.wrapperQuery()` internally which automatically adds a root key `query`, so it must not be present in the command line parameter.
- A comma-separated list of target throughput rates
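Escaping the request body by hand is error-prone, so a small helper can produce the quoted argument. The sketch below assumes a POSIX-style shell and uses hypothetical class and method names; it also shows one way to read the comma-separated throughput list. Neither method is taken from the benchmark itself.

```java
import java.util.Arrays;

// Hypothetical helpers for preparing and reading search benchmark parameters.
final class SearchBenchmarkArgs {
    /**
     * Wraps a JSON body in double quotes for a POSIX-style shell, escaping the
     * characters that stay special inside double quotes: " \ $ and backtick.
     */
    static String shellQuote(String body) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : body.toCharArray()) {
            if (c == '"' || c == '\\' || c == '$' || c == '`') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('"').toString();
    }

    /** Parses a list such as "500,1000,1100,1200" into integer rates. */
    static int[] parseThroughputRates(String csv) {
        return Arrays.stream(csv.split(","))
            .map(String::trim)
            .mapToInt(Integer::parseInt)
            .toArray();
    }

    public static void main(String[] args) {
        System.out.println(shellQuote("{ \"match_phrase\": { \"name\": \"Sankt Georgen\" } }"));
        // prints "[500, 1000, 1100, 1200]"
        System.out.println(Arrays.toString(parseThroughputRates("500,1000,1100,1200")));
    }
}
```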