While it's not possible to upgrade the Jackson dependencies
to their latest versions yet (see #27032 (comment) for more),
it's still possible to upgrade to the latest 2.8.x version.
We have a hidden setting called `index.queries.cache.term_queries` that disables caching of term queries in the query cache.
However, term queries have not been cached by the Lucene UsageTrackingQueryCachingPolicy since version 6.5.
This makes the Elasticsearch policy useless and also makes it impossible to re-enable caching for term queries.
This change appeared in Lucene 6.5, so this setting has been a no-op since version 5.4 of Elasticsearch.
The change in this PR removes the setting and the custom policy.
Only tests should use the single-argument Environment constructor. To
enforce this, the single-argument Environment constructor has been replaced
with a test framework factory method.
Production code (beyond initial Bootstrap) should always use the same
Environment object that Node.getEnvironment() returns. This Environment
is also available via dependency injection.
If the master disconnects from the cluster after initiating a snapshot, but just before the snapshot switches from the INIT to the STARTED state, the snapshot can get stuck in the INIT state indefinitely. This error is specific to v5.x+ and was triggered by keeping the master node that stepped down in the node list: the cleanup logic in snapshot/restore assumed that if the master steps down, it is always removed from the node list. This commit changes the logic to trigger cleanup even if no nodes left the cluster.
Closes #27180
For FsBlobStore and HdfsBlobStore, if the repository is read-only, the blob store should be aware of the readonly setting and should not create directories if they don't exist.
Closes #21495
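
The behaviour described above can be illustrated with a minimal sketch. It is not the FsBlobStore or HdfsBlobStore code: the class `ReadOnlyAwareStore` and its method `containerPath` are hypothetical names, and the sketch only assumes that the read-only flag is known when the store is constructed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch: a store that creates directories only when it is writable.
final class ReadOnlyAwareStore {
    private final Path root;
    private final boolean readOnly;

    ReadOnlyAwareStore(Path root, boolean readOnly) throws IOException {
        this.root = root;
        this.readOnly = readOnly;
        if (!readOnly) {
            // Only a writable store is allowed to touch the file system here.
            Files.createDirectories(root);
        }
    }

    // Resolves the path of a container; creates it only if the store is writable.
    Path containerPath(String name) throws IOException {
        Path path = root.resolve(name);
        if (!readOnly) {
            Files.createDirectories(path);
        }
        return path;
    }
}
```

With `readOnly` set to `true`, `containerPath("indices")` returns the resolved path without creating anything on disk.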
After recent changes in InternalStats#doXContentBody the corresponding xContent
output of the parsed aggregation needed to be changed in a similar way.
* Uses norms for exists query if enabled
This change means that for indexes created from 6.1.0, if norms are enabled we will not write the field name to the `_field_names` field and for an exists query we will instead use the NormsFieldExistsQuery which was added in Lucene 7.1.0. If norms are not enabled or if the index was created before 6.1.0, `_field_names` will be used as before.
* Fixes tests
The uid bytes (as the type#id) were needlessly being created even though
they are no longer needed after the move to single type per index. This
commit avoids creating these when parsed documents are constructed.
Relates #27241
We do some accounting in IndexShard that is not necessarily correct since
we maintain two different index readers. This change moves the accounting under
the engine which knows what reader we are refreshing.
Relates to #26972
The local checkpoint tracker uses collections of bit sets to track
which sequence numbers are complete, eventually removing these bit sets
when the local checkpoint advances. However, these bit sets were eagerly
allocated so that if a sequence number far ahead of the checkpoint was
marked as completed, all bit sets between the "last" bit set and the bit
set needed to track the marked sequence number were allocated. If this
sequence number was too far ahead, the memory requirements could be
excessive. This commit opts for a different strategy for holding on to
these bit sets and enables them to be lazily allocated.
Relates #27179
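
The lazy allocation strategy can be sketched roughly as follows. This is not the actual tracker implementation: the class `LazyCheckpointTracker`, its methods, and the choice of a `HashMap` keyed by bit set index are assumptions made for illustration only.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of lazily allocated bit sets; not the Elasticsearch implementation.
final class LazyCheckpointTracker {
    private final int bitsPerSet;
    // Bit sets keyed by (seqNo / bitsPerSet). A set is allocated only when a
    // sequence number in its range is marked, never for the ranges in between.
    private final Map<Long, BitSet> processed = new HashMap<>();
    // Highest sequence number such that all sequence numbers up to it are complete.
    private long checkpoint = -1;

    LazyCheckpointTracker(int bitsPerSet) {
        this.bitsPerSet = bitsPerSet;
    }

    synchronized void markCompleted(long seqNo) {
        if (seqNo <= checkpoint) {
            return; // already covered by the checkpoint
        }
        processed.computeIfAbsent(seqNo / bitsPerSet, key -> new BitSet(bitsPerSet))
            .set((int) (seqNo % bitsPerSet));
        advanceCheckpoint();
    }

    private void advanceCheckpoint() {
        long next = checkpoint + 1;
        while (true) {
            long key = next / bitsPerSet;
            BitSet bitSet = processed.get(key);
            if (bitSet == null || !bitSet.get((int) (next % bitsPerSet))) {
                return;
            }
            checkpoint = next;
            next++;
            if (next % bitsPerSet == 0) {
                // Every bit of this set is now below the checkpoint, so drop it.
                processed.remove(key);
            }
        }
    }

    synchronized long getCheckpoint() {
        return checkpoint;
    }

    synchronized int allocatedBitSets() {
        return processed.size();
    }
}
```

In this sketch, marking a sequence number far ahead of the checkpoint allocates exactly one bit set, regardless of how large the gap is.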
We added an index-level setting for controlling the size of the bit sets
used to back the local checkpoint tracker. This setting is really only
needed to control the memory footprint of the bit sets but we do not
think this setting is going to be needed. This commit removes this
setting before it is released to the wild after which we would have to
worry about BWC implications.
Relates #27191
* Enhances exists queries to reduce need for `_field_names`
Before this change we wrote the names of all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance.
This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`.
Closes #26770
* Addresses review comments
* Addresses more review comments
* Implements existsQuery explicitly on every mapper
* Reinstates ability to perform term query on `_field_names`
* Added bwc depending on index created version
* Review Comments
* Skips tests that are not supported in 6.1.0
These values will need to be changed after backporting this PR to 6.x
This query returns documents that match at least one
of the provided terms. The number of terms that must match varies
per document and is either controlled by a minimum should match
field or computed per document in a minimum should match script.
Closes #26915
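
The matching rule described above can be sketched in a few lines. This only illustrates the semantics, not how the query is executed: `TermsSetMatcher` and its parameters are hypothetical, and the per-document minimum is passed in directly rather than read from a field or computed by a script. Treating a minimum below one as one is an assumption taken from the "at least one" wording.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the per-document matching rule; not the query implementation.
final class TermsSetMatcher {
    // docTerms: terms indexed for one document
    // queryTerms: terms provided in the query
    // minimumShouldMatch: the value resolved for this particular document
    static boolean matches(Set<String> docTerms, List<String> queryTerms, long minimumShouldMatch) {
        // Count each query term at most once, even if it was provided twice.
        Set<String> distinctQueryTerms = new HashSet<>(queryTerms);
        long matched = distinctQueryTerms.stream().filter(docTerms::contains).count();
        // Assumption: a document must always match at least one term.
        return matched >= Math.max(1, minimumShouldMatch);
    }
}
```

For example, a document containing `java` and `go` matches the query terms `java`, `go`, `rust` when its minimum is 2, while a document containing only `java` does not.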
It is required in order to work correctly with bulk scorer implementations
that change the scorer during the collection process. Otherwise sub collectors
might call `Scorer.score()` on the wrong scorer.
Closes #27131
This commit is a minor refactoring of internal engine to move hooks for
generating sequence numbers into the engine itself. As such, we refactor
tests that relied on this hook to use the new hook, and remove the hook
from the sequence number service itself.
Relates #27082
This change is required in order to support a size based check for the
index rollover.
The index size is estimated by sampling the existing segments only. We
prefer using segments to StoreStats because StoreStats is not reliable
if indexing or merging operations are in progress.
Relates #27004
This change makes sure that we track score when sort is set to relevancy only.
In this case we always track max score like normal search does.
Closes #23840
* Apply missing request options to the expand phase
This change adds some missing options to the expand query that builds the inner hits for field collapsing.
The following options are now applied to the inner_hits query:
* post_filters
* preferences
* routing
Closes #27079
Closes #26649
The new discovery stats were pushed to the 6.x branch (currently
versioned at 6.1.0) but master was not updated to reflect this. This
impacts the mixed-cluster BWC tests because a 6.1.0 node will be trying
to send a 7.0.0 node the new discovery stats but the 7.0.0 node did not yet
understand that it should be reading these when talking to a 6.1.0
node. This commit addresses this, and changes the skip version on the
discovery stats REST tests.
Windows handles trying to read a file that does not exist because a
component of the path is not a directory differently than other
operating systems do. This commit adjusts these assertions for Windows.
When executing a cluster settings update that leaves the cluster state
unchanged, we skip validation and this avoids deprecation logging for
deprecated settings in the cluster state. This commit addresses this by
running validation even if the settings are unchanged.
Relates #27017
When a search is executing locally over many shards, we can stack
overflow during query phase execution. This happens due to callbacks
that occur after a phase completes for a shard and we move to the same
phase on another shard. If all the shards for the query are local to the
local node then we will never go async and these callbacks will end up
as recursive calls. With sufficiently many shards, this will end up as a
stack overflow. This commit addresses this by truncating the stack by
forking to another thread on the executor for the phase.
Relates #27069
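
The stack truncation idea can be sketched as below. The class `ForkingPhaseRunner`, the single-threaded executor, and the depth measurement are all assumptions for illustration; the actual search code differs. The point is only that each continuation is submitted to the executor instead of being invoked directly, so the stack does not grow with the number of shards.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: move to the next shard by forking rather than recursing.
final class ForkingPhaseRunner {

    // Runs a no-op "phase" on every shard and returns the deepest stack observed.
    static int runAll(int totalShards) throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicInteger maxDepth = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(1);
        try {
            executor.execute(() -> runShard(0, totalShards, executor, maxDepth, done));
            done.await();
            return maxDepth.get();
        } finally {
            executor.shutdown();
        }
    }

    private static void runShard(int shard, int totalShards, ExecutorService executor,
                                 AtomicInteger maxDepth, CountDownLatch done) {
        // Record how deep the call stack is while this shard is processed.
        maxDepth.accumulateAndGet(Thread.currentThread().getStackTrace().length, Math::max);
        if (shard + 1 == totalShards) {
            done.countDown();
            return;
        }
        // Calling runShard(shard + 1, ...) directly here would add a frame per shard.
        // Forking starts the next shard on a fresh, shallow stack instead.
        executor.execute(() -> runShard(shard + 1, totalShards, executor, maxDepth, done));
    }
}
```

Because every shard starts from the executor's worker loop, the observed depth stays small and constant even for tens of thousands of shards.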
Finder creates these files if you browse a directory there. These files
are annoying in themselves, and it's an incredible pain for users that
they are created unbeknownst to them and then get in the way of
Elasticsearch starting. This commit adds leniency on macOS only to skip
these files.
Relates #27108
Turns out that `ShardSearchTarget` is nullable, hence its fields may not be printed out as part of `ShardSearchFailure#toXContent`, in which case `fromXContent` cannot parse it back. We would previously try to create the object with all of its fields set to null, but `Index` complains about it in the constructor. Also made sure that this code path is covered by our unit tests in `ShardSearchFailureTests`.
Closes #27055
Introduce a minimal thread scheduler as a base class for `ThreadPool`. Such a class can be used from the `BulkProcessor` to schedule retries and the flush task. This allows removing the `ThreadPool` dependency from `BulkProcessor`, which required providing settings that contain `node.name` and also needed log4j for logging. Instead, it now needs a `Scheduler` that is much lighter and gets automatically created and shut down on close.
Closes #26028
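
A lightweight scheduler of this kind can be sketched on top of the JDK alone. The class `MinimalScheduler` below is a hypothetical stand-in, not the `Scheduler` class introduced by this change; it only assumes that a single scheduling thread is enough for retries and a flush task.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of a scheduler that needs no settings and no logging setup.
final class MinimalScheduler implements AutoCloseable {
    private final ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);

    // Schedules a one-off task, such as a retry, after the given delay.
    ScheduledFuture<?> schedule(Runnable task, long delayMillis) {
        return executor.schedule(task, delayMillis, TimeUnit.MILLISECONDS);
    }

    // Schedules a recurring task, such as a periodic flush.
    ScheduledFuture<?> scheduleWithFixedDelay(Runnable task, long intervalMillis) {
        return executor.scheduleWithFixedDelay(task, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    boolean isClosed() {
        return executor.isShutdown();
    }

    // Shuts the scheduler down together with its owner.
    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

Used in a try-with-resources block, the scheduler is created and shut down together with the component that owns it.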