OpenSearch/benchmarks
Nik Everett 22459576d7
Logging: Make node name consistent in logger (#31588)
First, some background: we have 15 different methods to get a logger in
Elasticsearch but they can be broken down into three broad categories
based on what information is provided when building the logger.

Just a class like:
```
private static final Logger logger = ESLoggerFactory.getLogger(ActionModule.class);
```
or:
```
protected final Logger logger = Loggers.getLogger(getClass());
```

The class and settings:
```
this.logger = Loggers.getLogger(getClass(), settings);
```

Or more information like:
```
Loggers.getLogger("index.store.deletes", settings, shardId)
```

The goal of the "class and settings" variant is to attach the node name
to the logger. Because we don't always have the settings available, we
often use the "just a class" variant and get loggers without node names
attached. There isn't any real consistency here. Some loggers get the
node name because it is convenient and some do not.

This change makes the node name available to all loggers all the time.
Almost. There are some caveats around testing that I'll get to. But in
*production* code the node name is now available to all loggers. This
means we can stop using the "class and settings" variants to fetch
loggers, which was the real goal here, but a pleasant side effect is that
the node name is now consistent on every log line and can be made
optional by editing the logging pattern. This is all powered by setting
the node name statically on a logging formatter very early in
initialization.

Now to tests: tests can't set the node name statically because
subclasses of `ESIntegTestCase` run many nodes in the same JVM, even in
the same class loader. Also, lots of tests don't run with a real node so
they don't *have* a node name at all. To support multiple nodes in the
same JVM, tests suss out the node name from the thread name, which works
surprisingly well and is easy to test in a nice way. For those threads
that are not part of an `ESIntegTestCase` node we stick whatever useful
information we can get from the thread name in the place of the node
name. This allows us to keep the logger format consistent.
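
As a rough illustration of the thread-name approach described above, the
sketch below pulls a node name out of a thread name with a regular
expression. The thread-name shape `elasticsearch[<node>][<pool>]...`, the
class name `ThreadNameNodeName`, and the fallback behaviour are all
assumptions made for this example; this is not the code from this change.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not the implementation from this change.
final class ThreadNameNodeName {

    // Assumed thread-name shape: elasticsearch[<node name>][<pool>]<anything>
    private static final Pattern NODE_THREAD =
        Pattern.compile("elasticsearch\\[([^\\]]+)\\]\\[.+\\].*");

    // Returns the node name if the thread name matches the assumed shape,
    // otherwise falls back to the thread name itself so the log format
    // always has something in the node name slot.
    static String fromThreadName(String threadName) {
        Matcher m = NODE_THREAD.matcher(threadName);
        if (m.matches()) {
            return m.group(1);
        }
        return threadName;
    }

    public static void main(String[] args) {
        System.out.println(fromThreadName("elasticsearch[node_s0][generic][T#1]"));
        System.out.println(fromThreadName("main"));
    }
}
```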
2018-07-31 10:54:24 -04:00
src/main Logging: Make node name consistent in logger (#31588) 2018-07-31 10:54:24 -04:00
README.md Refine wording in benchmark README and correct typos 2016-06-15 23:01:56 +02:00
build.gradle Build: Move shadow customizations into common code (#32014) 2018-07-17 14:20:41 -04:00

README.md

Elasticsearch Microbenchmark Suite

This directory contains the microbenchmark suite of Elasticsearch. It relies on JMH.

Purpose

We do not want to microbenchmark everything but the kitchen sink and should typically rely on our macrobenchmarks with Rally. Microbenchmarks are intended to spot performance regressions in performance-critical components. The microbenchmark suite is also handy for ad-hoc microbenchmarks, but please remove them again before merging your PR.

Getting Started

Just run gradle :benchmarks:jmh from the project root directory. It will build all microbenchmarks, execute them and print the result.

Running Microbenchmarks

Benchmarks are always run via Gradle with gradle :benchmarks:jmh.

Running via an IDE is not supported as the results are meaningless (we have no control over the JVM running the benchmarks).

If you want to run a specific benchmark class, e.g. org.elasticsearch.benchmark.MySampleBenchmark, or have special requirements, generate the uberjar with gradle :benchmarks:jmhJar and run it directly with:

java -jar benchmarks/build/distributions/elasticsearch-benchmarks-*.jar

JMH supports lots of command line parameters. Add -h to the command above to see the available command line options.

Adding Microbenchmarks

Before adding a new microbenchmark, make yourself familiar with the JMH API. You can check our existing microbenchmarks and also the JMH samples.

In contrast to tests, the actual name of the benchmark class is not relevant to JMH. However, stick to the naming convention and end the class name of a benchmark with Benchmark. To have JMH execute a benchmark, annotate the respective methods with @Benchmark.

Tips and Best Practices

To get realistic results, you should exercise care when running benchmarks. Here are a few tips:

Do

  • Ensure that the system executing your microbenchmarks has as little load as possible. Shut down every process that can cause unnecessary runtime jitter. Watch the Error column in the benchmark results to see the run-to-run variance.
  • Make sure to run enough warmup iterations to get the benchmark into a stable state. If you are unsure, don't change the defaults.
  • Avoid CPU migrations by pinning your benchmarks to specific CPU cores. On Linux you can use taskset.
  • Fix the CPU frequency to prevent Turbo Boost from kicking in and skewing your results. On Linux you can use cpufreq-set and the performance CPU governor.
  • Vary the problem input size with @Param.
  • Use the integrated profilers in JMH to dig deeper if benchmark results do not match your hypotheses:
    • Run the generated uberjar directly and use -prof gc to check whether the garbage collector runs during a microbenchmark and skews your results. If so, try to force a GC between runs (-gc true) but watch out for the caveats.
    • Use -prof perf or -prof perfasm (both only available on Linux) to see hotspots.
  • Have your benchmarks peer-reviewed.
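
To make the advice about the Error column a little more concrete, here is a minimal sketch that summarises a handful of per-iteration timings as a mean and a sample standard deviation. The class name, method names, and sample values are made up for illustration, and the calculation is a simplification rather than the statistic JMH reports in its Error column.

```java
// Hypothetical helper for illustration only; JMH computes its own statistics.
final class IterationStats {

    // Arithmetic mean of the samples.
    static double mean(double[] samples) {
        double sum = 0.0;
        for (double s : samples) {
            sum += s;
        }
        return sum / samples.length;
    }

    // Sample standard deviation (n - 1 in the denominator); needs at least two samples.
    static double stdDev(double[] samples) {
        if (samples.length < 2) {
            throw new IllegalArgumentException("need at least two samples");
        }
        double mean = mean(samples);
        double squares = 0.0;
        for (double s : samples) {
            double d = s - mean;
            squares += d * d;
        }
        return Math.sqrt(squares / (samples.length - 1));
    }

    // Standard deviation relative to the mean, as a rough "how noisy is this run" signal.
    static double relativeSpread(double[] samples) {
        return stdDev(samples) / mean(samples);
    }

    public static void main(String[] args) {
        // Made-up per-iteration timings with the same mean but very different spread.
        double[] quiet = {100, 101, 99, 100, 100};
        double[] noisy = {100, 140, 70, 120, 70};
        System.out.printf("quiet: mean=%.1f spread=%.1f%%%n", mean(quiet), 100 * relativeSpread(quiet));
        System.out.printf("noisy: mean=%.1f spread=%.1f%%%n", mean(noisy), 100 * relativeSpread(noisy));
    }
}
```

Both sample sets have the same mean, so looking at the average alone would not distinguish a quiet run from a noisy one; only the spread does.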

Don't

  • Blindly believe the numbers that your microbenchmark produces. Verify them by measuring, e.g. with -prof perfasm.
  • Run more threads than your number of CPU cores (in case you run multi-threaded microbenchmarks).
  • Look only at the Score column and ignore Error. Instead, take countermeasures to keep Error low and the variance explainable.
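
For multi-threaded microbenchmarks, a small guard like the one below can help keep the thread count within the number of available cores. It is only a sketch: the class and method names are invented for this example, and Runtime.availableProcessors() reports the processors visible to the JVM, which may differ from the physical core count (for example with hyper-threading or container limits).

```java
// Hypothetical helper for illustration; not part of the benchmark suite.
final class BenchmarkThreads {

    // Caps the requested thread count at the given number of cores.
    static int cap(int requested, int availableCores) {
        if (requested < 1 || availableCores < 1) {
            throw new IllegalArgumentException("thread and core counts must be positive");
        }
        return Math.min(requested, availableCores);
    }

    // Caps the requested thread count at the processors visible to this JVM.
    static int cap(int requested) {
        return cap(requested, Runtime.getRuntime().availableProcessors());
    }

    public static void main(String[] args) {
        int requested = 16; // made-up value for the example
        System.out.println("threads to use: " + cap(requested));
    }
}
```

The resulting value could then be passed to JMH's -t command line option, which sets the number of worker threads.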