* Fix index prefixes to work with span_multi
Text fields that use `index_prefixes` can rewrite `prefix` queries into
`term` queries internally. This commit fix the handling of this rewriting
in the `span_multi` query.
This change also copies the index options of the text field into the
prefix field in order to be able to run positional queries. This is mandatory
for `span_multi` to work but this could also be useful to optimize `match_phrase_prefix`
queries in a follow up. Note that this change can only be done on indices created
after 6.3 since we set the index options to doc only in this version.
Fixes#31056
ObjectParser should throw XContentParseExceptions, not IAE. A dedicated parsing
exception can includes the place where the error occurred.
Closes#30605
This snapshot includes:
- LUCENE-8341: Record soft deletes in SegmentCommitInfo which will resolve#30851
- LUCENE-8335: Enforce soft-deletes field up-front
When `lenient=false`, attempts to create match phrase queries with custom analyzers against non-text fields will throw an IllegalArgumentException.
Also changes `*Match*QueryBuilderTests` so that it avoids this scenario
Fixes#31061
The majority of Responses inheriting from AcknowledgeResponse implement
the readFrom and writeTo serialization method in the same way. Moving this
as a default into AcknowledgeResponse and letting the few exceptions that
need a slightly different implementation handle this themselves saves a lot
of duplication.
Specifying `index_phrases: true` on a text field mapping will add a subsidiary
[field]._index_phrase field, indexing two-term shingles from the parent field.
The parent analysis chain is re-used, wrapped with a FixedShingleFilter.
At query time, if a phrase match query is executed, the mapping will redirect it
to run against the subsidiary field.
This should trade faster phrase querying for a larger index and longer indexing
times.
Relates to #27049
This commit fixes an issue with
PersistentTasksCustomMetaDataTests#testMinVersionSerialization. There
were two problems here:
- some versions do not have future compatible version (e.g., betas)
- the feature logic was incorrect
With #31020 we introduced the ability for transport clients to indicate what features they support
in order to make sure we don't serialize object to them they don't support. This PR adapts the
serialization logic of persistent tasks to be aware of those features and not serialize tasks that
aren't supported.
Also, a version check is added for the future where we may add new tasks implementations and
need to be able to indicate they shouldn't be serialized both to nodes and clients.
As the implementation relies on the interface of `PersistentTaskParams`, these are no longer
optional. That's acceptable as all current implementation have them and we plan to make
`PersistentTaskParams` more central in the future.
Relates to #30731
We compute a random version and later try to compute the version prior
that random version. If the random version is the earliest version in
our list of versions then it, by definition, does not have a previous
version. Yet trying to find its previous is someting we do and so the
test fails. This commit adds a version check to the randomization so
that we do not select the earliest version in our list.
This is related to #31017. That issue identified that these three http
methods were treated like GET requests. This commit adds them to
RestRequest. This means that these methods will be handled properly and
generate 405s.
This commit introduces the ability for a client to communicate to the
server features that it can support and for these features to be used in
influencing the decisions that the server makes when communicating with
the client. To this end we carry the features from the client to the
underlying stream as we carry the version of the client today. This
enables us to enhance the logic where we make protocol decisions on the
basis of the version on the stream to also make protocol decisions on
the basis of the features on the stream. With such functionality, the
client can communicate to the server if it is a transport client, or if
it has, for example, X-Pack installed. This enables us to support
rolling upgrades from the OSS distribution to the default distribution
without breaking client connectivity as we can now elect to serialize
customs in the cluster state depending on whether or not the client
reports to us using the feature capabilities that it can under these
customs. This means that we would avoid sending a client pieces of the
cluster state that it can not understand. However, we want to take care
and always send the full cluster state during node-to-node communication
as otherwise we would end up with different understanding of what is in
the cluster state across nodes depending on which features they reported
to have. This is why when deciding whether or not to write out a custom
we always send the custom if the client is not a transport client and
otherwise do not send the custom if the client is transport client that
does not report to have the feature required by the custom.
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
With the default distribution changing in 6.3, clusters might now contain custom metadata that a
pure OSS transport client cannot deserialize. As this can break transport clients when accessing
the cluster state or reroute APIs, we've decided to exclude any custom metadata that the transport
client might not be able to deserialize. This will ensure compatibility between a < 6.3 transport
client and a 6.3 default distribution cluster. Note that this PR only covers interoperability with older
clients, another follow-up PR will cover full interoperability for >= 6.3 transport clients where we will
make it possible again to get the custom metadata from the cluster state.
Relates to #30731
This change adds an option named `split_queries_on_whitespace` to the `keyword`
field type. When set to true full text queries (`match`, `multi_match`, `query_string`, ...) that target the field will split the input on whitespace to build the query terms. Defaults to `false`.
Closes#30393
The randomized alias names could contain unicode controll charactes that don't
survive an xContent rendering and parsing roundtrip when using the YAML xContent
type. This fix filters the randomized unicode string for control characters to
avoid this particular problem.
Closes#30911
In case an error is returned when calling search_shards on a remote
cluster, which will lead to throwing an exception in the coordinating
node, we should make sure that the status code returned by the
coordinating node is the same as the one returned by the remote
cluster. Up until now a 500 - Internal Server Error was always
returned. This commit changes this behaviour so that for instance if an
index is not found, which causes an 404, a 404 is also returned by the
coordinating node to the client.
Closes#27461
This commit reworks testing for `ListTasksResponse` so that random
fields insertion can be tested and xcontent equivalence can be checked
too. Proper exclusions need to be configured, and failures need to be
tested separately. This helped finding a little problem, whenever there
is a node failure returned, the nodeId was lost as it was never printed
out as part of the exception toXContent.
This is related to #30141. Right now in the transport client we open a
temporary node connection and take the node information. This node
information is used to open a permanent connection that is used for the
client. However, we continue to use the configured transport address.
If the configured transport address is a load balancer, you might
connect to a different node for the permanent connection. This causes
the handshake validation to fail. This commit removes the handshake
validation for the transport client when it simple node sample mode.
Since master will always communicate with a >=6.4 node, the logic for
checking if the node is 6.4 and conditionally reading and writing based
on that can be removed from master. This logic will stay in 6.x as it is
the bridge to the cleaner response in master. This also unmutes the
failing test due to this bwc change.
Closes#30807
This commit removes the RequestBuilder generic type from Action. It was
needed to be used by the newRequest method, which in turn was used by
client.prepareExecute. Both of these methods are now removed, along with
the existing users of prepareExecute constructing the appropriate
builder directly.
This change deprecates completion queries and documents without context that target a
context enabled completion field. Querying without context degrades the search
performance considerably (even when the number of indexed contexts is low).
This commit targets master but the deprecation will take place in 6.x and the functionality
will be removed in 7 in a follow up.
Closes#29222
This commit adds Verify Repository, the associated docs and tests for
the high level REST API client. A few small changes to the Verify
Repository Response went into the commit as well.
Relates #27205
Currently failures to compile a script usually lead to a ScriptException, which
inherits the 500 INTERNAL_SERVER_ERROR from ElasticsearchException if it does
not contain another root cause. Instead, this should be a 400 Bad Request error.
This PR changes this more generally for script compilation errors by changing
ScriptException to return 400 (bad request) as status code.
Closes#12315
When we are connecting to a remote cluster we should never select
dedicated master nodes as gateway nodes, or we will end up loading them
with requests that should rather go to other type of nodes e.g. data
nodes or coord_only nodes.
This commit adds the selection based on the node role, to the existing
selection based on version and potential node attributes.
Closes#30687
AliasMetaData should be parsed more leniently so that the high-level REST client can support forward compatibility on it. This commit addresses this issue that was found as part of #28799 and adds dedicated XContent tests as well.
With multiple data paths, we write the state files for index metadata to all data paths. We only properly fsync on the first location, though. For other locations, we possibly expose the file before its contents is properly fsynced. This can lead to situations where, after a crash, and where the first data path is not available anymore, ES will see a partially-written state file, preventing the node to start up.
This change adds a new option to the composite aggregation named `missing_bucket`.
This option can be set by source and dictates whether documents without a value for the
source should be ignored. When set to true, documents without a value for a field emits
an explicit `null` value which is then added in the composite bucket.
The `missing` option that allows to set an explicit value (instead of `null`) is deprecated in this change and will be removed in a follow up (only in 7.x).
This commit also changes how the big arrays are allocated, instead of reserving
the provided `size` for all sources they are created with a small intial size and they grow
depending on the number of buckets created by the aggregation:
Closes#29380
This commit renames methods in the PersistentTasksService, to
make obvious that the methods send requests in order to change
the state of persistent tasks.
Relates to #29608.
* Make sure all instance variables are final.
* Make generateKey a private static method, instead of protected.
* Rename formatter -> format for consistency.
* Serialize bucket keys as strings as opposed to optional strings.
* Pull the stream serialization logic for buckets into the Bucket class.
Currently AbstractHttpServerTransport is in a netty4 module. This is the
incorrect location. This commit moves it out of netty4 module.
Additionally, it moves unit tests that test AbstractHttpServerTransport
logic to server.