Currently, `finishHim` can either execute on the specified executor
(in the less likely case that the local node request is the last to arrive)
or on a transport thread.
In case of e.g. `org.elasticsearch.action.admin.cluster.stats.TransportClusterStatsAction`
this leads to an expensive execution that deserializes all mapping metadata in the cluster
running on the transport thread and destabilizing the cluster. In case of this transport
action it was specifically moved to the `MANAGEMENT` thread to avoid the high cost of processing
the stats requests on the nodes during fan-out but that did not cover the final execution
on the node that received the initial request. This PR adds to ability to optionally specify the executor for the final step of the
nodes request execution and uses that to work around the issue for the slow `TransportClusterStatsAction`.
Note: the specific problem that motivated this PR is essentially the same as https://github.com/elastic/elasticsearch/pull/57937 where we moved the execution off the transport and on the management thread as a fix as well.
We support `"""` in `console` snippets to emulate kibana's CONSOLE.
CONSOLE also spits out `"""` when a json field contains a new line or a
double quote. This adds support for those sorts of responses to the
handling of `console-response` snippets.
Revises the current 'How to avoid oversharding' docs to incorporate
information from our [shard sizing blog post][0].
Changes:
* Streamlines introduction
* Adds "Things to remember" section to describe how shards work
* Adds "Guidelines" section based on blog tips
* Creates a "Fix an oversharded cluster" section
[0]: https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
Passing FieldMappers to point parsing functions makes trying to build source-only
fields from MappedFieldTypes more complicated. This small refactoring changes
things so that the relevant parsing and factory functions from
AbstractGeometryFieldMapper are instead passed as lambdas to the PointParser
constructor.
make node resolving more robust by ignoring null values. This is a bug in
the usage of this class, however you don't want NPE's in prod. The root cause
might be a corner case. Because silencing the root cause is bad, the assert
causes a fail if assertions are enabled
relates #62847
We have to make sure the applier and not the accept state versions allign here.
Otherwise we can get into the situation where the data node is so slow to process
one version that the next one arrives, gets rejected and the request return with
ack `false` and we fail the assertion that the put mapping request didn't complete.
Closes#62446
JAVA_HOME is set as necessary in packaging tests, depending on whether
it is needed for no-jdk distributions or testing override behavior. We
currently rely on gradle finding java through PATH. However, JAVA_HOME
can sometimes be set by the system itself, which then leaks through to
the packaging test. This commit reworks our handling of JAVA_HOME to
pass it through for gradle, and then explicitly clear it whenever
running shell commands in packaging tests.
This test was disabled with an awaits fix, but the underlying issue has
been worked around, so the test can be re-enabled.
relates #46050
relates #58628
Currently Netty will batch compression an entire HTTP response
regardless of its content size. It allocates a byte array at least of
the same size as the uncompressed content. This causes issues with our
attempts to remove humungous G1GC allocations. This commit resolves the
issue by split responses into 128KB chunks.
This has the side-effect of making large outbound HTTP responses that
are compressed be send as chunked transfer-encoding.
Currently we duplicate our specialized cors logic in all transport
plugins. This is unnecessary as it could be implemented in a single
place. This commit moves the logic to server. Additionally it fixes a
but where we are incorrectly closing http channels on early Cors
responses.
Introduce 64-bit unsigned long field type
This field type supports
- indexing of integer values from [0, 18446744073709551615]
- precise queries (term, range)
- precise sort and terms aggregations
- other aggregations are based on conversion of long values
to double and can be imprecise for large values.
Backport for #60050Closes#32434
This commit adds a mechanism to MapperTestCase that allows implementing
test classes to check that their parameters can be updated, or throw conflict
errors as advertised. Child classes override the registerParameters method
and tell the passed-in UpdateChecker class about their parameters. Simple
conflicts can be checked, using the existing minimal mappings as a base to
compare against, or alternatively a particular initial mapping can be provided
to check edge cases (eg, norms can be updated from true to false, but not
vice versa). Updates are registered with a predicate that checks that the update
has in fact been applied to the resulting FieldMapper.
Fixes#61631
Same as in the normal Netty tests we have to disable the runtime proc
setting in the normal tests task just like we do for the internal cluster tests.
Closes#61919Closes#62298
This commit allows coordinating node to account the memory used to perform partial and final reduce of
aggregations in the request circuit breaker. The search coordinator adds the memory that it used to save
and reduce the results of shard aggregations in the request circuit breaker. Before any partial or final
reduce, the memory needed to reduce the aggregations is estimated and a CircuitBreakingException} is thrown
if exceeds the maximum memory allowed in this breaker.
This size is estimated as roughly 1.5 times the size of the serialized aggregations that need to be reduced.
This estimation can be completely off for some aggregations but it is corrected with the real size after
the reduce completes.
If the reduce is successful, we update the circuit breaker to remove the size of the source aggregations
and replace the estimation with the serialized size of the newly reduced result.
As a follow up we could trigger partial reduces based on the memory accounted in the circuit breaker instead
of relying on a static number of shard responses. A simpler follow up that could be done in the mean time is
to [reduce the default batch reduce size](https://github.com/elastic/elasticsearch/issues/51857) of blocking
search request to a more sane number.
Closes#37182
This adds the ability to fetch java primitives like `long` and `float`
from grok matches rather than their boxed versions. It also allows
customizing the which fields are extracted and how they are extracted.
By default we continue to fetch a `Map<String, Object>` but runtime
fields will be able to catch *just* the fields it is interested
in, and the values will be primitives.
for get trained models include_model_definition is now deprecated.
This commit writes a deprecation warning if that parameter is used and suggests the caller to utilize the replacement
Backport #62825 to 7.x branch.
Today if a data stream is auto created, but an index with same name as the
first backing index already exists then internally that error is ignored,
which then result that later in the execution of a bulk request, the
bulk item fails due to that the data stream hasn't been auto created.
This situation can only occur if an index with same is created that
will be the backing index of a data stream prior to the creation
of the data stream.
Co-authored-by: Dan Hermann <danhermann@users.noreply.github.com>
A few of us were talking about ways to speed up the `date_histogram`
using the index for the timestamp rather than the doc values. To do that
we'd have to pre-compute all of the "round down" points in the index. It
turns out that *just* precomputing those values speeds up rounding
fairly significantly:
```
Benchmark (count) (interval) (range) (zone) Mode Cnt Score Error Units
before 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 96461080.982 ± 616373.011 ns/op
before 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 130598950.850 ± 1249189.867 ns/op
after 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 52311775.080 ± 107171.092 ns/op
after 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 54800134.968 ± 373844.796 ns/op
```
That's a 46% speed up when there isn't a time zone and a 58% speed up
when there is.
This doesn't work for every time zone, specifically those that have two
midnights in a single day due to daylight savings time will produce wonky
results. So they don't get the optimization.
Second, this requires a few expensive computation up front to make the
transition array. And if the transition array is too large then we give
up and use the original mechanism, throwing away all of the work we did
to build the array. This seems appropriate for most usages of `round`,
but this change uses it for *all* usages of `round`. That seems ok for
now, but it might be worth investigating in a follow up.
I ran a macrobenchmark as well which showed an 11% preformance
improvement. *BUT* the benchmark wasn't tuned for my desktop so it
overwhelmed it and might have produced "funny" results. I think it is
pretty clear that this is an improvement, but know the measurement is
weird:
```
Benchmark (count) (interval) (range) (zone) Mode Cnt Score Error Units
before 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 96461080.982 ± 616373.011 ns/op
before 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 g± 1249189.867 ns/op
after 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 52311775.080 ± 107171.092 ns/op
after 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 54800134.968 ± 373844.796 ns/op
Before:
| Min Throughput | hourly_agg | 0.11 | ops/s |
| Median Throughput | hourly_agg | 0.11 | ops/s |
| Max Throughput | hourly_agg | 0.11 | ops/s |
| 50th percentile latency | hourly_agg | 650623 | ms |
| 90th percentile latency | hourly_agg | 821478 | ms |
| 99th percentile latency | hourly_agg | 859780 | ms |
| 100th percentile latency | hourly_agg | 864030 | ms |
| 50th percentile service time | hourly_agg | 9268.71 | ms |
| 90th percentile service time | hourly_agg | 9380 | ms |
| 99th percentile service time | hourly_agg | 9626.88 | ms |
|100th percentile service time | hourly_agg | 9884.27 | ms |
| error rate | hourly_agg | 0 | % |
After:
| Min Throughput | hourly_agg | 0.12 | ops/s |
| Median Throughput | hourly_agg | 0.12 | ops/s |
| Max Throughput | hourly_agg | 0.12 | ops/s |
| 50th percentile latency | hourly_agg | 519254 | ms |
| 90th percentile latency | hourly_agg | 653099 | ms |
| 99th percentile latency | hourly_agg | 683276 | ms |
| 100th percentile latency | hourly_agg | 686611 | ms |
| 50th percentile service time | hourly_agg | 8371.41 | ms |
| 90th percentile service time | hourly_agg | 8407.02 | ms |
| 99th percentile service time | hourly_agg | 8536.64 | ms |
|100th percentile service time | hourly_agg | 8538.54 | ms |
| error rate | hourly_agg | 0 | % |
```
If `track_total_hits=true` is used, the exact value of the number of hits is returned - i.e. the value is effectively limitless, and not the default value of 10,000
Co-authored-by: AndyHunt66 <andrew.hunt@elastic.co>
The `migrate` action will now configure the
`index.routing.allocation.include._tier_preference` setting to the corresponding
tiers. For the HOT phase it will configure `data_hot`, for the WARM phase it will
configure `data_warm,data_hot` and for the COLD phase
`data_cold,data_warm,data_cold`.
(cherry picked from commit 9dbf0e6f0c267e40c5bcfb568bb2254da103ae40)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
Replace common Like and RLike queries that match all characters with
IsNotNull (exists) queries
Fix#62585
(cherry picked from commit 4c23fad0468a9edd7325b06c6a96f7af37625dbf)
Closes#62466. Since we're still seeing occasional failures when
checking the GID of all files in the Docker image due to Elasticsearch
running in the background, instead run a new container with ES running
at all.
In case of more than 500 transforms, get and stats return paged results which can be requested using
page parameters. For >500 transforms count wasn't parsed out of the server response but taken from
size of the list of transforms.
The change also adds client/server hlrc tests and fixes a wrong type for count in get.
fixes#56245
We use the bundled jdk for unit, integ and packaging tests. Since
upgrading to jdk 15, centos-6 and oracle enterprise linux 6 have failed
due to versions of glibc no longer supported by the jdk. This commit
adds detection of the old glibc versions to gradle, and utilizes that
when deciding which jdk to use for tests.
relates #62709closes#62635
The name `FieldFetcher` fits better with the 'fetch' terminology we use
elsewhere, for example `FetchFieldsPhase` and `ValueFetcher`.
This PR also moves the construction of the fetcher off the context and onto
`FetchFieldsPhase`, which feels like a more natural place for it, and fixes a
TODO in javadocs.
This test checks to see if the index has been created before version 6.4, in which
case index prefixes are unavailable and so it expects to see a span multi-term
wrapper. However, the production code doesn't bother with checking for versions,
because if the field in question is configured with index_prefixes then it knows that
it must have been created post 6.4 (you can't merge in a new index_prefixes
configuration).
This commit alters the test to remove the random version checks, as we know we
will always have a prefix field available in this scenario.
Fixes#58199
Backport #62766 to 7.x branch.
The bulk api cache the resolved concrete indices when resolving the user provided
index name into the actual index name. The validation that prevents write ops other
than create from being executed in a data stream was only performed if the result
wasn't cached. In case of cached resolvings, the validation never occurs.
The validation would be skipped for all bulk items for a data stream after a create
operation for that same data stream. This commit ensures that the validation is always
performed for all bulk items (whether the concrete index resolution has been cached or
not cached).
Closes#62762
This adds a method to `Grok` that matches against sections offset from
utf-8 byte arrays:
```
Map<String, Object> captures(byte[] utf8Bytes, int offset, int length)
```
This'll be useful for the grok-flavored runtime fields because they
want to match against utf-8 encoded strings stored in a big array. And
joni already supports this.
When state persistence was first implemented for data frame analytics
we had the assumption that state would always fit in a single document.
However this is not the case any more.
This commit adds handling of state that spreads over multiple documents.
Backport of #62564