Reindex-from-remote had a race when it tried to clear the scroll. It
first starts the request to clear the scroll and then submits a task
to the generic threadpool to shutdown the client. These two things
race and, in my experience, closing the scroll generally loses. That
means that most of the time reindex-from-remote isn't clearing the
scrolls that it uses. This isn't the end of the world because we
flush old scroll contexts after a while but this isn't great.
Noticed while experimenting with #22514.
Reindex-from-remote was accepting source filtering in the request
but ignoring it and setting `_source=true` on the search URI. This
fixes the filtering so it is piped through to the remote node and
adds tests for that.
Closes#22507
Removes `AggregatorParsers`, replacing all of its functionality with
`XContentParser#namedObject`.
This is the third bit of payoff from #22003, one less thing to pass
around the entire application.
If the remote doesn't return a content type then reindex
tried to guess the content-type. This didn't work most
of the time and produced a rather useless error message.
Given that Elasticsearch always returns the content-type
we are dropping content-type detection in favor of just
failing the request if the remote didn't return a content-type.
Closes#22329
1. Escape sequences we're working. For example `\\` is now correctly
interpreted as `\` instead of `\\`. Same with `\'` being `'` and
`\"` being `"`.
2. `'` delimited strings weren't allowed to contain `"`s but it looked
like they were intended to support it. Now they do.
3. Improves the error message when the script contains an invalid
escape sequence inside a string to include a list of the valid
escape sequences.
Closes#22372
This integrates the mocksocket jar with elasticsearch tests. Mocksocket wraps actions requiring SocketPermissions in doPrivilege blocks. This will eventually allow SocketPermissions to be assigned to the mocksocket jar opposed to the entire elasticsearch codebase.
We previously named the thread using a frame from the stack trace, but
this was removed to simplify the code here. However, the comment
explaining this was left behind and this commit cleans that up.
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.
I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.
As the translog evolves towards a full operations log as part of the
sequence numbers push, there is a need for the translog to be able to
represent operations for which a sequence number was assigned, but the
operation did not mutate the index. Examples of how this can arise are
operations that fail after the sequence number is assigned, and gaps in
this history that arise when an operation is assigned a sequence number
but the operation never completed (e.g., a node crash). It is important
that these operations appear in the history so that they can be
replicated and replayed during recovery as otherwise the history will be
incomplete and local checkpoints will not be able to advance. This
commit introduces a no-op to the translog to set the stage for these
efforts.
Relates #22291
Introduces `XContentParser#namedObject which works a little like
`StreamInput#readNamedWriteable`: on startup components register
parsers under names and a superclass. At runtime we look up the
parser and call it to parse the object.
Right now the parsers take a context object they use to help with
the parsing but I hope to be able to eliminate the need for this
context as most what it is used for at this point is to move
around parser registries which should be replaced by this method
eventually. I make no effort to do so in this PR because it is
big enough already. This is meant to the a start down a road that
allows us to remove classes like `QueryParseContext`,
`AggregatorParsers`, `IndicesQueriesRegistry`, and
`ParseFieldRegistry`.
The goal here is to reduce the amount of plumbing required to
allow parsing pluggable things. With this you don't have to pass
registries all over the place. Instead you must pass a super
registry to fewer places and use it to wrap the reader. This is
the same tradeoff that we use for NamedWriteable and it allows
much, much simpler binary serialization. We think we want that
same thing for xcontent serialization.
The only parsing actually converted to this method is parsing
`ScoreFunctions` inside of `FunctionScoreQuery`. I chose this
because it is relatively self contained.
It looks like the exception reason can differ in different default
locales, so the build would fail in any non-English locale. This
switches the catch to the name of the exception which shouldn't
vary.
We are currenlty checking that no deprecation warnings are emitted in our query tests. That can be moved to ESTestCase (disabled in ESIntegTestCase) as it allows us to easily catch where our tests use deprecated features and assert on the expected warnings.
We return deprecation warnings as response headers, besides logging them. Strict parsing mode stayed around, but was only used in query tests, though we also introduced checks for deprecation warnings there that don't need strict parsing anymore (see #20993).
We can then safely remove support for strict parsing mode. The final goal is to remove the ParseFieldMatcher class, but there are many many users of it. This commit prepares the field for the removal, by deprecating ParseFieldMatcher and making it effectively not needed. Strict parsing is removed from ParseFieldMatcher, and strict parsing is replaced in tests where needed with deprecation warnings checks.
Note that the setting to enable strict parsing was never ported to the new settings infra hance it cannot be set in production. It is really only used in our own tests.
Relates to #19552
This commit makes mapping updates atomic when multiple types in an index are updated. Mappings for an index are now applied in a single atomic operation, which also allows to optimize some of the cross-type updates and checks.
In #22094 we introduce a test-only setting to simulate transport
impls that don't support handshakes. This commit implements the same logic
without a setting.
The JSON processor has an optional field called "target_field".
If you don't specify target_field then target_field becomes what you specified as "field".
There isn't anyway to add the fields to the root of a document. By
setting `add_to_root`, now serialized fields will be inserted into the
top-level fields of the ingest document.
Closes#21898.
Today we initialize Netty in a static initializer. We trigger this
method via static initializers from Netty-related classes, but we can
trigger this method earlier than we do to ensure that Netty is
initialized how we want it to be.
Inline scripts defined in Ingest Pipelines are now compiled at creation time to preemptively catch errors on initialization of the pipeline.
Fixes#21842.
Low level handshake code doesn't handle situations gracefully if the connection
is concurrently closed or reset by peer. This commit adds the relevant code to
fail the handshake if the connection is closed.
Moves the last of the "easy" parser construction into
`RestRequest`, this time with a new method
`RestRequest#contentParser`. The rest of the production
code that builds `XContentParser` isn't "easy" because it is
exposed in the Transport Client API (a Builder) object.
The creation of the `ValuesSource` used to pass `DateTimeZone.UTC` as a time
zone all the time in case of empty fields in spite of the fact that all doc
value formats but the date one reject this parameter.
This commit centralizes the creation of the `ValuesSource` and adds unit tests
to it.
Closes#22009
With this commit we enable the Jackson feature 'STRICT_DUPLICATE_DETECTION'
by default. This ensures that JSON keys are always unique. While this has
a performance impact, benchmarking has indicated that the typical drop in
indexing throughput is around 1 - 2%.
As a last resort, we allow users to still disable strict duplicate checks
by setting `-Des.json.strict_duplicate_detection=false` which is
intentionally undocumented.
Closes#19614
Grok was originally ignoring potential matches to named-capture groups
larger than one. For example, If you had two patterns containing the
same named field, but only the second pattern matched, it would fail to
pick this up.
This PR fixes this by exploring all potential places where a
named-capture was used and chooses the first one that matched.
Fixes#22117.
Today we rely on the version that the API user passes in together with the DiscoveryNode. This commit introduces a low level handshake where nodes exchange their version to be used with the transport protocol that is executed every time a connection to a node is established. This, on the one hand allows to change the wire protocol based on the version we are talking to even without a full cluster restart. Today we would need to carry on a BWC layer across major versions but with a handshake we can rely on the fact that the latest version of the previous minor executes a handshake and uses the latest protocol version across all communication with the N+1 version nodes.
This change is yet fully backwards compatible, a followup PR will remove the BWC in 6.0 once this has been back-ported to the 5.x branch