Today we use the version of a DirectoryReader as a component of the key
of IndicesRequestCache. This usage is perfectly fine since the version
is advanced every time a new change is made into IndexWriter. In other
words, two DirectoryReaders with the same version should have the same
content. However, this invariant is only guaranteed in the context of a
single IndexWriter because the version is reset to the committed version
value when IndexWriter is re-opened.
Since #33473, each IndexShard may have more than one IndexWriter, and
using the version of a DirectoryReader as a part of the cache key can
cause IndicesRequestCache to return stale cached values. For example, in
#27650, we rollback the engine (i.e., re-open IndexWriter), index new
documents, refresh, then make a count request, but the search layer
mistakenly returns the count of the DirectoryReader of the previous
IndexWriter because the current DirectoryReader has the same version of
the old DirectoryReader even their documents are different. This is
possible because these two readers come from different IndexWriters.
This commit replaces the the version with the reader cache key of
IndexReader as a component of the cache key of IndicesRequestCache.
Closes#27650
Relates #33473
This commit adds the support to early terminate the collection of a leaf
in the min/max aggregator. If the query matches all documents the min and max value
for a numeric field can be retrieved efficiently in the points reader.
This change applies this optimization when possible.
* Make text message not required in constructor for slack
* Remove unnecessary comments in test file
* Throw exception when reduce or combine is not provided; update tests
* Update integration tests for scripted metrics to always include reduce and combine
* Remove some old changes from previous branches
* Rearrange script presence checks to be earlier in build
* Change null check order in script builder for aggregated metrics; correct test scripts in IT
* Add breaking change details to PR
Today we reverse the initial order of the nested documents when we
index them in order to ensure that parents documents appear after
their children. This means that a query will always match nested documents
in the reverse order of their offsets in the source document.
Reversing all documents is not needed so this change ensures that parents
documents appear after their children without modifying the initial order
in each nested level. This allows to match children in the order of their
appearance in the source document which is a requirement to efficiently
implement #33587. Old indices created before this change will continue
to reverse the order of nested documents to ensure backwark compatibility.
* Adds trace logging to IndicesRequestCache
This change adds trace level logging to `IndicesrrequestCache` witht eh
primary aim of helping to identify the cause of teh failures in
https://github.com/elastic/elasticsearch/issues/32827. The cache will
log at trace level when a cache hit or miss occurs including the reader
version and the cache key. Note that this change adds a
`cacheKeyRenderer` whcih supplies a human readable String of the cache
key since the actual cache key itself is a `BytesReference` containing
the wire protocol serialised form of the request.
Logging is also added for the case where a search timeout occurs and fr
that reason the cache entry is invalidated.
* Adds comment to remaind us to remove cacheKeyRenderer
In #28941 we changed the computation of cluster state task descriptions but
this introduced a bug in which we only log the empty descriptions (rather than
the non-empty ones). This change fixes that.
Optionals containing boxed primitive types are prohibitively costly because they
have two level of boxing. For Optional<Integer> the analogous OptionalInt can be
used to avoid the boxing of the contained int value.
This change fixes a bug in the cross fields mode of the `query_string`
query. The multi fields query builder must be reseted before parsing
in order to clear the list of expanded fields coming from the previous text block.
Closes#34215
Mappings with completion type and multi-fields, were not able to index array or
object format on completion fields. Only string format was supported.
This is fixed by providing multiField parser with externalValueContext with already parsed object
closes#15115
This adds some method into the `DateFormatter` interface, namely
* `withLocale()` to change the locale of a date formatter
* `getLocale()`
* `getZone()`
* `hashCode()`
* `equals()`
These methods will be needed for aggregations and mapping changes, where
zones and locales can be specified in the mapping or in search/aggs
parts of a search request.
When nested objects are present in the mappings, we add a filter in
queries to exclude them if there is no evidence that the query cannot
match in this space. In 6x we visit the query in order to find a mandatory
clause that can match root documents only. If we find one we can omit the
nested documents filter. Currently only `term` and `range` queries are checked,
this change adds the support for `terms` query to effectively remove the nested filter
if a mandatory `terms` clause targets a non-nested field.
Closes#34067
Mainly this fixes a warning by replacing the unchecked `new ActionListener`
with the checked `new ActionListener<Response>`, and it also fixes the line
length violations in this class.
This commit adds a check for "enabled" attribute change for types when
a RestPutMappingAction is received. A MappingException is thrown when
such a change is detected. Change are prevented in both ways: "false -> true"
and "true -> false".
Closes#33566
#32281 adds elasticsearch-shard to provide bwc version of elasticsearch-translog for 6.x; have to remove elasticsearch-translog for 7.0
Relates to #31389
The unfollow API changes a follower index into a regular index, so that it will accept write requests from clients.
For the unfollow api to work the index follow needs to be stopped and the index needs to be closed.
Closes#33931
This change introduces the indexing optimization using sequence numbers
in the FollowingEngine. This optimization uses the max_seq_no_updates
which is tracked on the primary of the leader and replicated to replicas
and followers.
Relates #33656
This commit removes the use of ExecutableScript from watcher in favor of
custom script contexts for both watcher condition scripts and transform
scripts.
Fixes the equals and hash function to ignore the order of aggregations to ensure equality after serialization
and deserialization. This ensures storing configs with aggregation works properly.
This also addresses a potential issue in caching when the same query contains aggregations but in
different order. 1st it will not hit in the cache, 2nd cache objects which shall be equal might end up twice in
the cache.
In order to be compatible with joda time, this adds an epoch seconds
formatter, that is able to parse floating point values.
However joda time discards the floating point values, but still parses
the data, where as this one is able to parse the whole value including
milliseconds.
The synonym filters no longer need access to the AnalysisRegistry in their
constructors, so we can remove the special-case code and move them to the
common analysis module.
This commit means that synonyms are no longer available for `server` integration tests,
so several of these are either rewritten or migrated to the common analysis module
as rest-spec-api tests
* Make sure 'ignored' and 'routing' field types inherit from StringFieldType.
* Add tests for prefix and regexp queries.
* Support prefix and regexp queries on _index fields.
This commit adds the ability to plug in compilation of custom contexts
in mock script engine. This is needed for testing plugins which add
custom contexts like watcher.
Prior to this change when a pipeline processor called another
pipeline, only the stats for the first processor were recorded.
The stats for the subsequent pipelines were ignored. This change
properly accounts for pipelines irregardless if they are the first
or subsequently called pipelines.
This change moves the state of the stats from the IngestService
to the pipeline itself. Cluster updates are safe since the pipelines
map is atomically swapped, and if a cluster update happens
while iterating over stats (now read directly from the pipeline)
a slightly stale view of stats may be shown.
Recently we introduced the settings cluster.remote to take the place of
search.remote for configuring remote cluster connections. We made this
change due to the fact that we have generalized the remote cluster
infrastructure to also be used within cross-cluster replication and not
only cross-cluster search. For backwards compatibility, when we made this
change, we allowed that cluster.remote would fallback to
search.remote. Alas, the initial change for this contained a bug for
handling the proxy and seeds settings. The bug for the seeds settings
arose because we were manually iterating over the concrete settings only
for cluster.remote seeds but not for search.remote seeds. This commit
addresses this by iterating over both cluster.remote seeds and
search.remote seeds. Additionally, when checking for existence of proxy
settings, we have to not only check cluster.remote proxy settings, but
also fallback to search.remote proxy settings. This commit addresses
both issues, and adds tests for these situations.
* Handle MatchNoDocsQuery in span query wrappers
This change adds a new SpanMatchNoDocsQuery query that replaces
MatchNoDocsQuery in the span query wrappers.
The `wildcard` query now returns MatchNoDocsQuery if the target field is not
in the mapping (#34093) so we need the equivalent span query in order to
be able to pass it to other span wrappers.
Closes#34105
EngineSearcher can be easily folded into Engine.Searcher which removes
a level of inheritance that is necessary for most of it's subclasses.
This change folds it into Engine.Searcher and removes the dependency on
ReferenceManager.
* This should surface what errors are thrown on CI
and in org.elasticsearch.transport.RemoteClusterConnection.ConnectHandler#collectRemoteNodes
(the sequence of caught error in the last catch block and moving on to the next seed node
seems to be the only path by which the errors logged in #33756 could come about)
* Relates #33756
This commit adds "engine is closed" as an expected failure message.
This change is due to #33967 in which we might access a closed engine on
promotion.
Relates #33967
This change is related to #33903 that ports the DocStats
simplification to the master branch. This change builds the docStats
in the ReadOnlyEngine from the last committed segment infos rather than
the reader.
Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
Although we allow to index BigInteger and BigDecimal into a keyword
field, source filtering on these fields would fail
as XContentBuilder was not able to deserialize BigInteger and BigDecimal
to json.
This modifies XContentBuilder to allow to handle BigInteger and
BigDecimal.
Closes#32395
This commits creates a DateMathParser interface, which is already
implemented for both joda and java time. While currently the java time
DateMathParser is not used, this change will allow a followup which will
create a DateMathParser from a DateFormatter, so the caller does not
need to know the internals of the DateFormatter they have.
Previously, unmapped aggs try to delegate reduction to a sibling agg that is
mapped. That delegated agg will run the reductions, and also
reduce any pipeline aggs. But because delegation comes before running
pipelines, the unmapped agg _also_ tries to run pipeline aggs.
This causes the pipeline to run twice, and potentially double it's output
in buckets which can create invalid JSON (e.g. same key multiple times)
and break when converting to maps.
This fixes by sorting the list of aggregations ahead of time so that mapped
aggs appear first, meaning they preferentially lead the reduction. If all aggs
are unmapped, the first unmapped agg simply creates a new unmapped object
and returns that for the reduction.
This means that unmapped aggs no longer defer and there is no chance for
a secondary execution of pipelines (or other side effects caused by deferring
execution).
Closes#33514
`SingleFieldsVisitor` is meant to load a single stored field but it
manages to be quite complex to reason about because it inherits from our
"basic" `FieldsVisitor` which is designed to load many fields. This
breaks that inheritance and adds logic to `SingleFieldsVisitor` so it can
be properly stand alone. While this amounts to more lines of code they
ought to be significantly easier to reason about.