Decay functions currently only use the first value in a field that contains
multiple values to compute the distance to the origin. Instead, it should
consider all distances if more values are in the field and then use
one of min/max/sum/avg which is defined by the user.
Relates to #3960closes#5940
Separate version check logic for reads and writes for all version types, which allows different behavior in these cases.
Change `VersionType.EXTERNAL` & `VersionType.EXTERNAL_GTE` to behave the same as `VersionType.INTERNAL` for read operations.
The previous behavior was fit for writes but is useless in reads.
This commit also makes the usage of `EXTERNAL` & `EXTERNAL_GTE` in the update api raise a validation error as it make cause data to
be lost.
Closes#5663 , Closes#5661, Closes#5929
The current setting of 20MB/sec seems to be too conservative given
the capabilities of modern hardware / network throughput.
A 50MB default should provide better out of the box performance.
Change the default numeric precision_step to 16 for 64-bit types,
8 for 32-bit and 16-bit types. Disable precision_step for the 8-bit
byte type.
Closes#5905
The current setting of 20MB/sec seems to be too conservative given
the capabilities of modern hardware. Even on cloud infrastructure this
seems to be too lowish. A 50MB default should provide better out of the box
performance
Currently we use 5k operations as a flush threshold. Indexing 5k documents
per second is rather common which would cause the index to be committed on
the lucene level each time the flush logic runs which is 5 seconds by default.
We should rather use a size based threshold similar to the lucene index writer
that doesn't cause such agressive commits which can slow down indexing significantly
especially since they cause the underlying devices to fsync their data.
Load tests showed that SerialMS has problems to keep up with
the merges under high load. We should switch back to CMS
until we have a better story to balance merge
threads / efforts across shards on a single node.
Closes#5817
Add an API endpoint at /_bench for submitting, listing, and aborting
search benchmarks. This API can be used for timing search requests,
subject to various user-defined settings.
Benchmark results provide summary and detailed statistics on such
values as min, max, and mean time. Values are reported per-node so that
it is easy to spot outliers. Slow requests are also reported.
Long running benchmarks can be viewed with a GET request, or aborted
with a POST request.
Benchmark results are optionally stored in an index for subsequent
analysis.
Closes#5407