78a93dc18f
This commit allows coordinating node to account the memory used to perform partial and final reduce of aggregations in the request circuit breaker. The search coordinator adds the memory that it used to save and reduce the results of shard aggregations in the request circuit breaker. Before any partial or final reduce, the memory needed to reduce the aggregations is estimated and a CircuitBreakingException} is thrown if exceeds the maximum memory allowed in this breaker. This size is estimated as roughly 1.5 times the size of the serialized aggregations that need to be reduced. This estimation can be completely off for some aggregations but it is corrected with the real size after the reduce completes. If the reduce is successful, we update the circuit breaker to remove the size of the source aggregations and replace the estimation with the serialized size of the newly reduced result. As a follow up we could trigger partial reduces based on the memory accounted in the circuit breaker instead of relying on a static number of shard responses. A simpler follow up that could be done in the mean time is to [reduce the default batch reduce size](https://github.com/elastic/elasticsearch/issues/51857) of blocking search request to a more sane number. Closes #37182 |
||
---|---|---|
.. | ||
src/main | ||
README.md | ||
build.gradle |
README.md
Elasticsearch Microbenchmark Suite
This directory contains the microbenchmark suite of Elasticsearch. It relies on JMH.
Purpose
We do not want to microbenchmark everything but the kitchen sink and should typically rely on our macrobenchmarks with Rally. Microbenchmarks are intended to spot performance regressions in performance-critical components. The microbenchmark suite is also handy for ad-hoc microbenchmarks but please remove them again before merging your PR.
Getting Started
Just run gradlew -p benchmarks run
from the project root
directory. It will build all microbenchmarks, execute them and print
the result.
Running Microbenchmarks
Running via an IDE is not supported as the results are meaningless because we have no control over the JVM running the benchmarks.
If you want to run a specific benchmark class like, say,
MemoryStatsBenchmark
, you can use --args
:
gradlew -p benchmarks run --args ' MemoryStatsBenchmark'
Everything in the '
gets sent on the command line to JMH. The leading
inside the '
s is important. Without it parameters are sometimes sent to
gradle.
Adding Microbenchmarks
Before adding a new microbenchmark, make yourself familiar with the JMH API. You can check our existing microbenchmarks and also the JMH samples.
In contrast to tests, the actual name of the benchmark class is not relevant to JMH. However, stick to the naming convention and
end the class name of a benchmark with Benchmark
. To have JMH execute a benchmark, annotate the respective methods with @Benchmark
.
Tips and Best Practices
To get realistic results, you should exercise care when running benchmarks. Here are a few tips:
Do
- Ensure that the system executing your microbenchmarks has as little load as possible. Shutdown every process that can cause unnecessary
runtime jitter. Watch the
Error
column in the benchmark results to see the run-to-run variance. - Ensure to run enough warmup iterations to get the benchmark into a stable state. If you are unsure, don't change the defaults.
- Avoid CPU migrations by pinning your benchmarks to specific CPU cores. On Linux you can use
taskset
. - Fix the CPU frequency to avoid Turbo Boost from kicking in and skewing your results. On Linux you can use
cpufreq-set
and theperformance
CPU governor. - Vary the problem input size with
@Param
. - Use the integrated profilers in JMH to dig deeper if benchmark results to not match your hypotheses:
- Add
-prof gc
to the options to check whether the garbage collector runs during a microbenchmarks and skews your results. If so, try to force a GC between runs (-gc true
) but watch out for the caveats. - Add
-prof perf
or-prof perfasm
(both only available on Linux) to see hotspots.
- Add
- Have your benchmarks peer-reviewed.
Don't
- Blindly believe the numbers that your microbenchmark produces but verify them by measuring e.g. with
-prof perfasm
. - Run more threads than your number of CPU cores (in case you run multi-threaded microbenchmarks).
- Look only at the
Score
column and ignoreError
. Instead take countermeasures to keepError
low / variance explainable.
Disassembling
Disassembling is fun! Maybe not always useful, but always fun! Generally, you'll want to install perf
and FCML's hsdis
.
perf
is generally available via apg-get install perf
or pacman -S perf
. FCML is a little more involved. This worked
on 2020-08-01:
wget https://github.com/swojtasiak/fcml-lib/releases/download/v1.2.2/fcml-1.2.2.tar.gz
tar xf fcml*
cd fcml*
./configure
make
cd example/hsdis
make
sudo cp .libs/libhsdis.so.0.0.0 /usr/lib/jvm/java-14-adoptopenjdk/lib/hsdis-amd64.so
If you want to disassemble a single method do something like this:
gradlew -p benchmarks run --args ' MemoryStatsBenchmark -jvmArgs "-XX:+UnlockDiagnosticVMOptions -XX:CompileCommand=print,*.yourMethodName -XX:PrintAssemblyOptions=intel"
If you want perf
to find the hot methods for you then do add -prof:perfasm
.