This is related to #23020. There are some cases where this method
might be called with a URL to a file inside a jar. This commit allows
this method to read URLs with a protocol of 'jar:/'.
Secure settings from the elasticsearch keystore were not yet validated.
This change improves support in Settings so that secure settings more
seamlessly blend in with normal settings, allowing the existing settings
validation to work. Note that the setting names are still not validated
(yet) when using the elasticsearch-keystore tool.
As part of #22116 we are going to forbid usage of the API
java.net.URL#openStream(). However, in a number of places across the
codebase we use this method to read files from the local filesystem. This
commit introduces a helper method openFileURLStream(URL url) to read files
from URLs. It validates the URL so that only file:/ URLs are read.
Additionally, this commit removes the unneeded method
FileSystemUtil.newBufferedReader(URL, Charset). This method used the
openStream() method, which will soon be forbidden. Instead we use
Files.newBufferedReader(Path, Charset).
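A helper along these lines could look like the sketch below. Only the method name openFileURLStream(URL url) and the file:/ restriction come from the description above; the class name `FileUrlStreams`, the exception message, and the use of `Files.newInputStream` are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical holder class; the real helper may live elsewhere.
final class FileUrlStreams {

    private FileUrlStreams() {}

    /** Opens a stream for the given URL, rejecting anything that is not a file: URL. */
    static InputStream openFileURLStream(URL url) throws IOException {
        String protocol = url.getProtocol();
        if ("file".equals(protocol) == false) {
            throw new IllegalArgumentException("Invalid protocol [" + protocol + "], must be [file]");
        }
        try {
            // Assumption: read through java.nio instead of URL#openStream().
            return Files.newInputStream(Paths.get(url.toURI()));
        } catch (URISyntaxException e) {
            throw new IOException("Cannot convert URL to a file path: " + url, e);
        }
    }
}
```

Resolving the URL to a `Path` keeps the read on the local filesystem and avoids URL#openStream() altogether, which is one way to satisfy the planned restriction.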
This commit adds support for the newline delimited JSON Content-Type, which is how
the bulk, multi-search, and multi-search template APIs expect data to be formatted. The
`elasticsearch-js` client has also been using this content type for these types of requests.
Closes #22943
Today either all nodes in the cluster connect to remote clusters or only nodes
that have remote clusters configured in their node config. To allow global remote
cluster configuration but restrict connections to a set of nodes in the cluster,
this change adds a new setting `search.remote.connect` (defaults to `true`) that
allows remote cluster connections to be disabled on a per-node basis.
In order to support the evolving GeoPoint encodings in Lucene 5 and 6, ES 2.x and 5.x implement an abstraction layer over the GeoPointFieldMapper classes. As of 5.x the geo_point field mapper settled on using Lucene's more performant LatLonPoint field type and deprecated all other encodings. In 6.0 all encodings except LatLonPoint have been removed, rendering this abstraction layer useless. This commit removes the abstraction layer and renames the LatLonPointFieldMapper back to GeoPointFieldMapper to maintain consistency with ES field naming.
This changes the exception thrown when attempting to
create a snapshot with a name that already exists in the repository.
Instead of throwing a SnapshotCreateException, which results in a
generic 500 status code, a duplicate snapshot name will throw an
InvalidSnapshotNameException, which will result in a 400 status code
(bad request).
`QUERY_AND_FETCH` has been treated as an internal optimization for 2 major
versions. This commit removes the search type and its implementation details and
folds the optimization in the case of a single shard into the search controller such
that every search with a single shard (non DFS) will receive this optimization.
When a node receives a new cluster state from the master, it opens up connections to any new node in the cluster state. That has always been done serially on the cluster state thread, but it has been a long-standing TODO to do this concurrently, which is done by this PR.
This is a spin-off of #22828, where an extra handshake is done whenever connecting to a node, which may slow down connecting. Also, the handshake is done in a blocking fashion, which triggers assertions w.r.t. blocking requests on the cluster state thread. Instead of adding an exception, I opted to implement concurrent connections, which both sidesteps the assertion and compensates for the extra handshake.
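The general shape of connecting concurrently while still waiting for every attempt to finish can be sketched as follows. This is an illustration only, not the code in this PR: `ConcurrentConnector`, `connectAll`, and the use of plain strings for nodes are hypothetical names chosen for the example.

```java
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.function.Consumer;

// Hypothetical sketch: fan out connection attempts, then wait for all of them.
final class ConcurrentConnector {

    private ConcurrentConnector() {}

    static void connectAll(List<String> nodes, Consumer<String> connector, ExecutorService pool)
            throws InterruptedException {
        CountDownLatch latch = new CountDownLatch(nodes.size());
        for (String node : nodes) {
            pool.execute(() -> {
                try {
                    connector.accept(node); // may block, e.g. on a handshake
                } catch (RuntimeException e) {
                    // Assumption: a failed connection is ignored here so that
                    // it cannot prevent the remaining nodes from being tried.
                } finally {
                    latch.countDown();
                }
            });
        }
        // The caller resumes only after every attempt has completed.
        latch.await();
    }
}
```

With this shape the total wait is bounded by the slowest connection rather than the sum of all of them, which is what offsets the cost of an extra handshake per node.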
This changes the way that replica failures are handled such that not all
failures will cause the replica shard to be failed or marked as stale.
In some cases such as refresh operations, or global checkpoint syncs, it is
"okay" for the operation to fail without the shard being failed (because no data
is out of sync). In these cases, instead of failing the shard we should simply
fail the operation, and, in the event it is a user-facing operation, return a
5xx response code including the shard-specific failures.
This was accomplished by having two forms of the `Replicas` proxy, one that is
for non-write operations that does not fail the shard, and one that is for write
operations that will fail the shard when an operation fails.
Relates to #10708
It was accidentally renamed `enabled_position_increment` in the cleanups
for 5.0. This adds `enable_position_increment` as a deprecated alias
so it will continue to work.
This commit removes the following queries and parameters (which were deprecated in 5.0):
* GeoPointDistanceRangeQuery
* coerce and ignore_malformed for GeoBoundingBoxQuery, GeoDistanceQuery, GeoPolygonQuery, and GeoDistanceSort
This is related to #22116. Core no longer needs `SocketPermission`
`connect`.
This permission is relegated to these modules/plugins:
- transport-netty4 module
- reindex module
- repository-url module
- discovery-azure-classic plugin
- discovery-ec2 plugin
- discovery-gce plugin
- repository-azure plugin
- repository-gcs plugin
- repository-hdfs plugin
- repository-s3 plugin
And for tests:
- mocksocket jar
- rest client
- httpcore-nio jar
- httpasyncclient jar
This commit upgrades the checkstyle configuration from version 5.9 to
version 7.5, the latest version as of today. The main enhancement
obtained via this upgrade is better detection of redundant modifiers.
Relates #22960
When a primary is relocated from an old node to a new node, it can have
ops in its translog that do not have a sequence number assigned. When a
file-based recovery is started, this can lead to skipping these ops when
replaying the translog due to a bug in the recovery logic. This commit
addresses this bug and adds a test in the BWC tests.
Relates #22945
This change also removes the reference to the difference between full name and index name.
They have always been the same since 2.x, and `name` no longer refers to `author.name` automatically.
A simple pattern must be used instead.
Remove redundant code that checks the field name twice.
Today if a user invokes the remove plugin command without specifying the
name of a plugin to remove, we arrive at a null pointer exception. This
commit adds logic to cleanly handle this situation and provide clear
feedback to the user.
Relates #22930
Currently, the update action internally uses the deprecated index and delete
transport actions. As of #21964, these transport actions were deprecated
in favour of using a single-item bulk request. In this commit, the update
action uses the single-item bulk action.
This change adds a strict mode for xcontent parsing on the rest layer. The strict mode will be off by default for 5.x and in a separate commit will be enabled by default for 6.0. The strict mode, which can be enabled by setting `http.content_type.required: true` in 5.x, will require that all incoming rest requests have a valid and supported content type header before the request is dispatched. In the non-strict mode, the Content-Type header will be inspected and if it is not present or not valid, we will continue with auto detection of content like we have done previously.
The content type header is parsed to the matching XContentType value with the only exception being for plain text requests. This value is then passed on with the content bytes so that we can reduce the number of places where we need to auto-detect the content type.
As part of this, many transport requests and builders were updated to provide methods that
accept the XContentType along with the bytes; the methods that rely on auto-detection have been deprecated.
In the non-strict mode, deprecation warnings are issued whenever a request with body doesn't provide the Content-Type header.
See #19388
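The difference between the two modes can be sketched roughly as below. This is not the actual REST layer code: the class `ContentTypeCheck`, its method names, the list of supported media types, and the warning text are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch of strict vs. non-strict Content-Type handling.
final class ContentTypeCheck {

    // Assumption: media types treated as supported in this sketch.
    private static final List<String> SUPPORTED = Arrays.asList(
        "application/json", "application/yaml", "application/smile", "application/cbor");

    final List<String> deprecationWarnings = new ArrayList<>();
    private final boolean strict;

    ContentTypeCheck(boolean strict) {
        this.strict = strict;
    }

    /** Returns the media type to use, or null when the caller should auto-detect. */
    String resolve(String contentTypeHeader) {
        String mediaType = parse(contentTypeHeader);
        if (mediaType != null) {
            return mediaType;
        }
        if (strict) {
            // Strict mode: reject before the request is dispatched.
            throw new IllegalArgumentException(
                "Content-Type header [" + contentTypeHeader + "] is missing or not supported");
        }
        if (contentTypeHeader == null) {
            deprecationWarnings.add("request body sent without a Content-Type header");
        }
        return null; // non-strict mode: fall back to auto-detection
    }

    private static String parse(String header) {
        if (header == null) {
            return null;
        }
        // Drop parameters such as "; charset=UTF-8" before matching.
        String mediaType = header.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return SUPPORTED.contains(mediaType) ? mediaType : null;
    }
}
```

Returning the resolved type alongside the body is what lets later stages skip auto-detection, while a null result marks the one place where the fallback still happens.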
GeoDistance query, sort, and scripts make use of a crazy GeoDistance enum for handling 4 different ways of computing geo distance: SLOPPY_ARC, ARC, FACTOR, and PLANE. Only two of these are necessary: ARC and PLANE. This commit removes SLOPPY_ARC and FACTOR and cleans up the way geo distance is computed.
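For readers unfamiliar with the two remaining modes, the sketch below shows one common way to compute each: a haversine great-circle distance for ARC and an equirectangular approximation for PLANE. It is an illustration under stated assumptions, not the code touched by this commit; the class name, method names, and the earth radius constant are chosen for the example.

```java
// Hypothetical sketch of the two distance computations.
final class GeoDistanceSketch {

    // Assumption: a mean earth radius in meters.
    static final double EARTH_MEAN_RADIUS_METERS = 6_371_008.7714;

    private GeoDistanceSketch() {}

    /** ARC: great-circle distance in meters using the haversine formula. */
    static double arc(double lat1, double lon1, double lat2, double lon2) {
        double phi1 = Math.toRadians(lat1);
        double phi2 = Math.toRadians(lat2);
        double sinHalfDPhi = Math.sin((phi2 - phi1) / 2);
        double sinHalfDLambda = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        double a = sinHalfDPhi * sinHalfDPhi
            + Math.cos(phi1) * Math.cos(phi2) * sinHalfDLambda * sinHalfDLambda;
        return 2 * EARTH_MEAN_RADIUS_METERS * Math.asin(Math.min(1.0, Math.sqrt(a)));
    }

    /** PLANE: flat-earth approximation in meters; cheaper, less accurate over long distances. */
    static double plane(double lat1, double lon1, double lat2, double lon2) {
        double x = Math.toRadians(lon2 - lon1) * Math.cos(Math.toRadians((lat1 + lat2) / 2.0));
        double y = Math.toRadians(lat2 - lat1);
        return Math.sqrt(x * x + y * y) * EARTH_MEAN_RADIUS_METERS;
    }
}
```

The two results agree closely for nearby points and diverge as the points move apart or toward the poles, which is the usual trade-off between the two modes.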
This commit changes the ElasticsearchException.failureFromXContent() method so that it now parses root causes, which were ignored before, and adds them as suppressed exceptions of the returned exception.
Implemented by wrapping an array of reused `ModuleDateTime`s that
we grow when needed. The `ModuleDateTime`s are reused when we
move to the next document.
Also improves the error message returned when attempting to modify
the `ScriptDocValues`, removes a couple of allocations, and documents
that the date functions are available in Painless.
Relates to #22162
This disallows object mappings that would accidentally create something like
`foo..bar`, which is then unparsable for the `bar` field as it does not know
what its parent is.
Resolves #22794
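The kind of check involved can be sketched as below. The class `ObjectPathValidator`, the method name, and the error message are hypothetical; the sketch only shows how an empty path component such as the one in `foo..bar` can be detected.

```java
// Hypothetical sketch: reject field paths that contain an empty component.
final class ObjectPathValidator {

    private ObjectPathValidator() {}

    static void validate(String fullPath) {
        // A limit of -1 keeps trailing empty strings, so "foo." is also caught.
        for (String part : fullPath.split("\\.", -1)) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException(
                    "field name [" + fullPath + "] contains an empty path component");
            }
        }
    }
}
```

Under this sketch `foo.bar` passes, while `foo..bar`, `.foo`, and `foo.` are all rejected.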
The test tried to create a situation where a stale replica is the only shard copy available. It did so by stopping the node with the replica, indexing some documents, stopping the primary node, and starting a new node. This is flawed because the newly started node may reuse the data path of the primary node, in which case things go back to green. Instead we should make sure that the replica is on the path that will be selected when the new node is started (i.e., the path with the smaller ordinal).
This commit adds a BytesRestResponse.errorFromXContent() method to parse the error returned by BytesRestResponse. It returns an ElasticsearchStatusException instance.
Currently, if a previously allocated shard has no in-sync copy in the
cluster, but there is a stale replica copy, the explain API does not
include information about the stale replica copies in its output. This
commit includes any shard copy information available (even for stale
copies) when explaining an unassigned primary shard that was previously
allocated in the cluster.
This situation can arise as follows: imagine an index with 1 primary and
1 replica and a cluster with 2 nodes. If the node holding the replica
is shut down, and data continues to be indexed, only the primary will
have the latest data and the replica that has gone offline will be
marked as stale. Now, suppose the node holding the primary is shut
down. There are no copies of the shard data in the cluster. Now, start
the first stopped node (holding the stale replica) back up. The cluster
is red because there is no in-sync copy available. Running the explain
API before would inform the user that there is no valid shard copy in
the cluster for that shard, but it would not provide any information
about the existence of the stale replica that exists on the restarted
node. With this commit, the explain API provides information about all
the stale replica copies when explaining the unassigned primary.
Currently, stored scripts use a namespace of (lang, id) to be put, get, deleted, and executed. This is not necessary since the lang is stored with the stored script. A user should only have to specify an id to use a stored script. This change makes that possible while keeping backwards compatibility with the previous namespace of (lang, id). Anywhere the previous namespace is used will log deprecation warnings.
The new behavior is the following:
When a user specifies a stored script, that script will be stored under both the new namespace and old namespace.
Take for example script 'A' with lang 'L0' and data 'D0'. If we add script 'A' to the empty set, the scripts map will be {"A" -- D0, "A#L0" -- D0}. If a script 'A' with lang 'L1' and data 'D1' is then added, the scripts map will be {"A" -- D1, "A#L1" -- D1, "A#L0" -- D0}.
When a user deletes a stored script, that script will be deleted from both the new namespace (if it exists) and the old namespace.
Take for example a scripts map with {"A" -- D1, "A#L1" -- D1, "A#L0" -- D0}. If a script is removed specified by an id 'A' and lang null then the scripts map will be {"A#L0" -- D0}. To remove the final script, the deprecated namespace must be used, so an id 'A' and lang 'L0' would need to be specified.
When a user gets/executes a stored script, if the new namespace is used then the script will be retrieved/executed using only 'id', and if the old namespace is used then the script will be retrieved/executed using 'id' and 'lang'.
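The put/delete/get behavior described above can be modeled with a plain map, as in the sketch below. This is not the actual implementation: `StoredScripts`, `Script`, and the method signatures are hypothetical, and only the "id" and "id#lang" key scheme is taken from the examples.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the dual-namespace scripts map.
final class StoredScripts {

    static final class Script {
        final String lang;
        final String data;

        Script(String lang, String data) {
            this.lang = lang;
            this.data = data;
        }
    }

    final Map<String, Script> scripts = new HashMap<>();

    /** Stores the script under both the new key (id) and the old key (id#lang). */
    void put(String id, String lang, String data) {
        Script script = new Script(lang, data);
        scripts.put(id, script);
        scripts.put(id + "#" + lang, script);
    }

    /** A null lang selects the new namespace; a non-null lang selects the deprecated one. */
    void delete(String id, String lang) {
        Script current = scripts.get(id);
        if (lang == null) {
            if (current != null) {
                scripts.remove(id);
                scripts.remove(id + "#" + current.lang);
            }
        } else {
            scripts.remove(id + "#" + lang);
            if (current != null && current.lang.equals(lang)) {
                scripts.remove(id);
            }
        }
    }

    /** Looks up by id alone, or by id and lang when the old namespace is used. */
    Script get(String id, String lang) {
        return scripts.get(lang == null ? id : id + "#" + lang);
    }
}
```

Replaying the example: putting ('A', 'L0', 'D0') and then ('A', 'L1', 'D1') leaves three entries, deleting 'A' with a null lang leaves only "A#L0", and deleting ('A', 'L0') empties the map.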