Our query DSL supports empty queries (`{}`), which have a different meaning depending on the query that holds it, either ignored, match_all or match_none. We deprecated the support for empty queries in 5.0, where we log a deprecation warning wherever they are used.
The way we supported it once we moved query parsing to the coordinating node was having an Optional<QueryBuilder> return type in all of our parse methods (called fromXContent). See #17624. The central place for this was QueryParseContext#parseInnerQueryBuilder. We can now remove all the optional return types and simply throw an exception whenever an empty query is found.
When we decided to deprecate and remove fuzzy query in #15760, we didn't realize we would take away the possibililty for uses to use a fuzzy query as part of a span query, which is not possible using match query. This means we have to go back and un-deprecate fuzzy query, which will not be removed.
Closes#15760
Queries must be rewritten before the query phase executes otherwise non-executable queries like `wrapper` query or `terms` will fail or queries that require resources like script service can't access these service unless rewritten.
Relates to #21303
`include` / `exclude` in terms / sig-terms aggs seems completely broken
and massively untested. This commit makes the TermsTests pass again that
randomly use `include` / `exclude`. This class must be tested individually
and we need real integ tests that use xcontent that use this feature.
An earlier commit removed BWC for pre-5.0 snapshots, which also meant removing the capability to load pre-5.0 snapshots. In 6.0, such snapshots are now
invisible and must be treated by the BWC tests in that way.
URLBlobContainer can in certain situations throw a FileNotFoundException. To fulfill the contract of the readBlob method it should throw a NoSuchFileException instead when the given blob cannot be found.
Today we connect and publish the nodes connection before we execute a
handshake with the node we connect to. In the case of connecting to a node
that won't pass the handshake this connection is already `published` and other
code paths can use it. This commit detaches the connection and the publish of the
connection such that `TransportService` can do a handshake before actually connect
and publish the connection.
To get #22003 in cleanly we need to centralize as much `XContentParser` creation as possible into `RestRequest`. That'll mean we have to plumb the `NamedXContentRegistry` into fewer places.
This removes `RestAction.hasBody`, `RestAction.guessBodyContentType`, and `RestActions.getRestContent`, moving callers over to `RestRequest.hasContentOrSourceParam`, `RestRequest.contentOrSourceParam`, and `RestRequest.contentOrSourceParamParser` and `RestRequest.withContentOrSourceParamParserOrNull`. The idea is to use `withContentOrSourceParamParserOrNull` if you need to handle requests without any sort of body content and to use `contentOrSourceParamParser` otherwise.
I believe the vast majority of this PR to be purely mechanical but I know I've made the following behavioral change (I'll add more if I think of more):
* If you make a request to an endpoint that requires a request body and has cut over to the new APIs instead of getting `Failed to derive xcontent` you'll get `Body required`.
* Template parsing is now non-strict by default. This is important because we need to be able to deprecate things without requests failing.
Improves the error message returned when looking up a task that
belongs to a node that is no longer part of the cluster. The new
error message tells the user that the node isn't part of the cluster.
This is useful because if you start a task and the node goes down
there isn't a record of the task at all. This hints to the user that
the task might have died with the node.
Relates to #22027
In 5.0, the search slow log switched to the multi-line format with no option to get back to the origin single-line format that was used prior to 5.0 by default. This commit removes the reformat option from the search slow log and returns the search slow log back to the single-line format.
Closes#21711
A shard that is locally marked as relocated, but where the relocation target shard has not been activated yet by the master, can still receive index operations, which in return can lead to flushes being triggered. Flushing is currently (wrongly) prohibited on shards marked as relocated, which makes the flushing process go into an endless retry loop and log warnings until the shard is closed. This commit fixes this situation by allowing flush, force_merge and upgrade operations to run on shards that are marked as relocated.
If you make a mistake and specify a mapping like:
```
{
"parent": {
"properties": {}
},
"child": {
"_parent": "parent",
"properties": {}
}
}
```
then the error message you get back amounts to
`Failed to parse mapping for [child]: can't cast a String to a Map`.
Since it doens't tell you *which* string can't be cast to a map you
have to dig through the stack trace to figure out what to fix. This
replaces the error message with:
```
Failed to parse mapping [child]: [_parent] must be an object containing [type]
```
so you can tell that the problem is with the `parent` field.
This adds a fromXContent method and unit test to InternalNestedIdentity so we can parse it as part of a search response. This is part of the preparation for parsing search responses on the client side.
Fixes an issue where indexing requests with operation type "create" auto-convert external versioning to internal versioning and silently ignore the version number instead of failing with an error message.
This is an attempt to start moving aggs parsing to `ObjectParser`. There is
still A LOT to do, but ObjectParser is way better than the way aggregations
parsing works today. For instance in most cases, we reject numbers that are
provided as strings, which we are supposed to accept since some client languages
(looking at you Perl) cannot make sure to use the appropriate types.
Relates to #22009
* Remove 2.0 prerelease version constants
This is a start to addressing #21887. This removes:
* pre 2.0 snapshot format support
* automatic units addition to cluster settings
* bwc check for delete by query in pre 2.0 indexes
This adds the `_primary_term` field internally to the mappings. This field is
populated with the current shard's primary term.
It is intended to be used for collision resolution when two document copies have
the same sequence id, therefore, doc_values for the field are stored but the
filed itself is not indexed.
This also fixes the `_seq_no` field so that doc_values are retrievable (they
were previously stored but irretrievable) and changes the `stats` implementation
to more efficiently use the points API to retrieve the min/max instead of
iterating on each doc_value value. Additionally, even though we intend to be
able to search on the field, it was previously not searchable. This commit makes
it searchable.
There is no user-visible `_primary_term` field. Instead, the fields are
updated by calling:
```java
index.parsedDoc().updateSeqID(seqNum, primaryTerm);
```
This includes example methods in `Versions` and `Engine` for retrieving the
sequence id values from the index (see `Engine.getSequenceID`) that are only
used in unit tests. These will be extended/replaced by actual implementations
once we make use of sequence numbers as a conflict resolution measure.
Relates to #10708
Supercedes #21480
P.S. As a side effect of this commit, `SlowCompositeReaderWrapper` cannot be
used for documents that contain `_seq_no` because it is a Point value and SCRW
cannot wrap documents with points, so the tests have been updated to loop
through the `LeafReaderContext`s now instead.
Before, it was possible that the SameShardAllocationDecider would allow
force allocation of an unassigned primary to the same node on which an
active replica is assigned. This could only happen with shadow replica
indices, because when a shadow replica primary fails, the replica gets
promoted to primary but in the INITIALIZED state, not in the STARTED
state (because the engine has specific reinitialization that must take
place in the case of shadow replicas). Therefore, if the now promoted
primary that is initializing fails also, the primary will be in the
unassigned state, because replica to primary promotion only happens when
the failed shard was in the started state. The now unassigned primary
shard will go through the allocation deciders, where the
SameShardsAllocationDecider would return a NO decision, but would still
permit force allocation on the primary if all deciders returned NO.
This commit implements canForceAllocatePrimary on the
SameShardAllocationDecider, which ensures that a primary cannot be
force allocated to the same node on which an active replica already
exists.
the ReplicaShardAllocator, when in explain mode, would get the
node decisions for all nodes in the cluster. The PrimaryShardAllocator
neglected to do this and tried to use the shard fetch data in explain
mode, which had not yet been fully fetched. This commit fixes this by
ensuring the PrimaryShardAllocator gets node decisions in the same way
the ReplicaShardAllocator does in explain mode, if shard data is still
being fetched.
This commit enhances the allocator decision result objects (namely,
AllocateUnassignedDecision, MoveDecision, and RebalanceDecision)
to enable them to be used directly by the cluster allocation explain API. In
particular, this commit does the following:
- Adds serialization and toXContent methods to the response objects,
which will form the explain API responses.
- Moves the calculation of the final explanation to the response
object itself, removing it from the responsibility of the allocators.
- Adds shard store information to the NodeAllocationResult, so that
store information is available for each node, when explaining a
shard allocation by the PrimaryShardAllocator or the ReplicaShardAllocator.
- Removes RebalanceDecision in favor of using MoveDecision for both
moving and rebalancing shards.
- Removes NodeRebalanceResult in favor of using NodeAllocationResult.
- Changes the notion of weight ranking to be relative to the current node,
instead of an absolute weight that doesn't convey any added value to the
API user and can be confusing.
- Introduces a new enum AllocationDecision to convey the decision type,
which enables conveying unassigned, moving, and rebalancing scenarios
with more detail as opposed to just Decision.Type and AllocationStatus.
* Ingest: Moved ingest invocation into index/bulk actions
Ingest was originally setup as a plugin, and in order to hook into the
index and bulk actions, action filters were used. However, ingest was
later moved into core, but the action filters were never removed. This
change moves the execution of ingest into the index and bulk actions.
* Address PR comments
* Remove forwarder direct dependency on ClusterService
This adds a fromXContent method and unit test to the HighlightField class so we
can parse it as part of a serch response. This is part of the preparation for
parsing search responses on the client side.
Failing an initializing primary when shadow replicas are enabled for the index can leave the primary unassigned with replicas being active. Instead, a replica should be promoted to primary, which is fixed by this commit.
This commit makes two changes to how the in-sync allocations set is updated:
- the set is only trimmed when it grows. This prevents trimming too eagerly when the number of replicas was decreased while shards were unassigned.
- the allocation id of an active primary that failed is only removed from the in-sync set if another replica gets promoted to primary. This prevents the situation where the only available shard copy in the cluster gets removed the in-sync set.
Closes#21719
We had tests for the regular factories, but not for the pre-built ones, that
ship by default without requiring users to define them in the analysis settings.
Currently we expose the internal representation that we use for ip addresses,
which are the ipv6 bytes. However, this is not really usable, exposes internal
implementation details and also does not work fine with other APIs that expect
that the values can be `toString`'d.
Closes#21977
When you submit a _termvectors request for an artificial document and
specify the 'preference' parameter to send the request to a particular
shard, the request sometimes hits NPE. Fix this case by ignoring the
auto-generated artificial document ID and pick a shard per the
preference parameter, or a random shard.
This closes#21928
* Include unindexed field in FieldStats response
This change adds non-searchable fields to the FieldStats response. These fields do not have min/max informations but they can be aggregatable. Fields that are only stored in _source (store:no, index:no, doc_values:no) will still be missing since they do not have any useful information to show. Indices and clients must be at least on V_5_2_0 to see this change.
Since the removal of local discovery of #https://github.com/elastic/elasticsearch/pull/20960 we rely on minimum master nodes to be set in our test cluster. The settings is automatically managed by the cluster (by default) but current management doesn't work with concurrent single node async starting. On the other hand, with `MockZenPing` and the `discovery.initial_state_timeout` set to `0s` node starting and joining is very fast making async starting an unneeded complexity. Test that still need async starting could, in theory, still do so themselves via background threads.
Note that this change also removes the usage of `INITIAL_STATE_TIMEOUT_SETTINGS` as the starting of nodes is done concurrently (but building them is sequential)
With this commit we document that ingest processors need to be
thread-safe. Previously this could be inferred from reading the
source code but we got several user questions about this so
it is stated explicitly in the Javadocs of Processor now.