As the query of a search request defaults to match_all,
calling _delete_by_query without an explicit query may
result in deleting all data.
In order to protect users against falling into that
pitfall, this commit adds a check to require the explicit
setting of a query.
Closes#23629
The current rest backcompat tests, which run against a mixed cluster of
5.x and 6.0 nodes, depend on snapshot builds of 5.x. However, this has
the potential for inconsistency that results in CI failures, and happens
quite often, whenever some backcompat logic is added to 5.x, but the bwc
test on master fails because the 5.x code has not yet been published as
a snapshot.
This change creates a git clone of the 5.x branch,
builds the zip distribution, and ties that into gradle substitutions for
the 5.x version.
Removed `parse(String index, String type, String id, BytesReference source)` in DocumentMapper.java and replaced all of its use in Test files with `parse(SourceToParse source)`.
`parse(String index, String type, String id, BytesReference source)` was only used in test files and never in the main code so it was removed. All of the test files that used it was then modified to use `parse(SourceToParse source)` method that existing in DocumentMapper.java
Without this change, if write a script with multiple regexes
*sometimes* the lexer will decide to look at them like one
big regex and then some trailing garbage. Like this discuss post:
https://discuss.elastic.co/t/error-with-the-split-function-in-painless-script/79021
```
def val = /\\\\/.split(ctx._source.event_data.param17);
if (val[2] =~ /\\./) {
def val2 = /\\./.split(val[2]);
ctx._source['user_crash'] = val2[0]
} else {
ctx._source['user_crash'] = val[2]
}
```
The error message you get from the lexer is `lexer_no_viable_alt_exception`
right after the *second* regex.
With this change each regex is just a single regex like it ought to be.
As a bonus, while looking into this issue I found that the error
reporting for regexes wasn't very nice. If you specify an invalid
pattern then you get an error marker on the start of the pattern
with the JVM's regex error message which attempts to point you to the
location in the regex but is totally unreadable in the JSON response.
This change fixes the location to point to the appropriate spot
inside the pattern and removes the portion of the JVM's error message
that doesn't render well. It is no longer needed now that we point
users to the appropriate spot in the pattern.
Changes reindex and friends to wait until the entire request has
been "cleaned up" before responding. "Clean up" in this context
is clearing the scroll and (for reindex-from-remote) shutting
down the client. Failures to clean up are still only logged, not
returned to the user.
Closes#23653
This commit upgrades the Netty dependencies from version 4.1.8 to
version 4.1.9. This commit picks up a few bug fixes that impacted us:
- Netty was incorrectly ignoring interfaces with self-assigned MAC
addresses (e.g., instances running in Docker containers or on EC2)
- incorrect handling of the Expect: 100-continue header
Relates #23540
With this commit we change the default receive predictor size for Netty
from 32kB to 64kB as our testing has shown that this leads to less
allocations on smaller heaps like the default out of the box
configuration and this value also works reasonably well for larger
heaps.
Closes#23185
This commit mutes a ton of Painless lambda tests on JDK 9. This commit
did not attempt to discover exactly which tests are failing, but instead
just blanket muted all tests in LambdaTests, FunctionRefTests, and
AugmentationTests.
Relates #23473
Previously, the RestController would stash the context prior to copying headers. However, there could be deprecation
log messages logged and in turn warning headers being added to the context prior to the stashing of the context. These
headers in the context would then be removed from the request and also leaked back into the calling thread's context.
This change moves the stashing of the context to the HttpTransport so that the network threads' context isn't
accidentally populated with warning headers and to ensure the headers added early on in the RestController are not
excluded from the response.
Throw error when skip or do sections are malformed, such as they don't start with the proper token (START_OBJECT). That signals bad indentation, which would be ignored otherwise. Thanks (or due to) our pull parsing code, we were still able to properly parse the sections, yet other runners weren't able to.
Closes#21980
* [TEST] fix indentation in matrix_stats yaml tests
* [TEST] fix indentation in painless yaml test
* [TEST] fix indentation in analysis yaml tests
* [TEST] fix indentation in generated docs yaml tests
* [TEST] fix indentation in multi_cluster_search yaml tests
We have many version constants in master that have already been
released, but are still marked (by naming convention) as unreleased.
This commit renames those version constants.
The dependencyLicenses check has the ability to map multiple jar files
to the same license file. However, netty was not taking advantage of
this, and had duplicate copies of its license/notice files for each jar.
This commit reduces the copies to one and uses the mapping feature.
This commit sets the intial size of the pipeline handler queue small to
prevent waste if pipelined requests are never sent. Since the queue will
grow quickly if pipeline requests are indeed set, this should not be
problematic.
Relates #23335
When pipelined responses are sent to the pipeline handler for writing,
they are not necessarily written immediately. They must be held in a
priority queue until all responses preceding the given response are
written. This means that when write is invoked on the handler, the
promise that is attached to the write invocation will not necessarily be
the promise associated with the responses that are written while the
queue is drained. To address this, the promise associated with a
pipelined response must be held with the response and then used when the
channel context is actually written to. This was introduced when
ensuring that the releasing promise is always chained through on write
calls lest the releasing promise never be invoked. This leads to many
failing test cases, so no new test cases are needed here.
Relates #23317
Gradle's finalizedBy on tasks only ensures one task runs after another,
but not immediately after. This is problematic for our integration tests
since it allows multiple project's integ test clusters to be
simultaneously. While this has not been a problem thus far (gradle 2.13
happened to keep the finalizedBy tasks close enough that no clusters
were running in parallel), with gradle 3.3 the task graph generation has
changed, and numerous clusters may be running simultaneously, causing
memory pressure, and thus generally slower tests, or even failure if the
system has a limited amount of memory (eg in a vagrant host).
This commit reworks how integ tests are configured. It adds an
`integTestCluster` extension to gradle which is equivalent to the current
`integTest.cluster` and moves the rest test runner task to
`integTestRunner`. The `integTest` task is then just a dummy task,
which depends on the cluster runner task, as well as the cluster stop
task. This means running `integTest` in one project will both run the
rest tests, and shut down the cluster, before running `integTest` in
another project.
When sending a response to a client, we attach a releasing listener to
the channel promise. If the client disappears before the response is
sent, the releasing listener was never notified. The reason the
listeners were never notified was due to a mistaken invocation of write
and flush on the channel which has two overrides: one that takes an
existing promise, and one that does not and instead creates a new
promise. When the client disappears, it is this latter promise that is
notified, which does not contain the releasing listener. This commit
addreses this issue by invoking the override that passes our channel
promise through.
Relates #23310
Now that search templates always get converted to json, we don't need to try and auto-detect their content-type, which anyways didn't work as expected before given that only json was really working.
Elasticsearch accepts multiple content-type formats, hence scripts can be stored/provided in json, yaml, cbor or smile. Yet the format that should be used internally is json. This is a problem mainly around search templates, as they only support json out of the four content-types, so instead of maintaining the content-type of the request we should rather convert the scripts/templates to json.
Binary formats were not previously supported. If you stored a template in yaml format, you'd get back an error "No encoder found for MIME type [application/yaml]" when trying to execute it. With this commit the request content-type is independent from the template, which always gets converted to json internally. That is transparent to users and doesn't affect the content type of the response obtained when executing the template.
Fixes Painless to properly implement scripts that return primitives
and void. Adds some simple tests that we emit sane opcodes and some
other tests that we implement primitives as expected.
Mostly this is just a fix following up from #22983 but there is one
thing I did really worth talking about, I think. So, before this script
Painless scripts could only ever return Object and they did would always
return null for paths that didn't return any values. Now that they
can return primitives the question is "what should Painless return
from paths that don't return any values?" And I answered that with
"whatever the JLS default value is". So 0/0L/0f/0d/false.
Generalizes three previously hard coded things in painless into
generic concepts:
1. The "main method" is no longer hardcoded to:
```
public abstract Object execute(Map<String, Object> params,
Scorer scorer, LeafDocLookup doc, Object value);
```
Instead Painless's compiler takes an interface and implements it. It looks like:
```
public interface SomeScript {
// Argument names we expose to Painless scripts
String[] ARGUMENTS = new String[] {"a", "b"};
// Method implemented by Painless script. Must be named execute but can have any parameters or return any value.
Object execute(String a, int b);
// Is the "a" argument used by the script?
boolean uses$a();
}
SomeScript script = scriptEngine.compile(SomeScript.class, null, "the_script_here", emptyMap());
Object result = script.execute("a", 1);
```
`PainlessScriptEngine` now compiles all scripts to the new
`GenericElasticsearchScript` interface by default for compatibility
with the rest of Elasticsearch until it is able to use this new
ability.
2. `_score` and `ctx` are no longer hardcoded to be extracted from
`#score` and `params` respectively. Instead Painless's default
implementation of Elasticsearch scripts uses the `uses$_score` and
`uses$ctx` methods to determine if it is used and gives them
dummy values if they are not used.
3. Throwing the `ScriptException` is now handled by the Painless
script itself. That way Painless doesn't have to leak the metadata
that is required to build the fancy stack trace. And all painless scripts
get the fancy stack trace.
Previously we calculated Netty' receive predictor size for HTTP and transport
traffic based on available memory and worker nodes. This resulted in a receive
predictor size between 64kb and 512kb. In our benchmarks this leads to increased
GC pressure.
With this commit we set Netty's receive predictor size to 32kb. This value is in
a sweet spot between heap memory waste (-> GC pressure) and effect on request
metrics (achieved throughput and latency numbers).
Closes#23185
This commit enforces the requirement of Content-Type for the REST layer and removes the deprecated methods in transport
requests and their usages.
While doing this, it turns out that there are many places where *Entity classes are used from the apache http client
libraries and many of these usages did not specify the content type. The methods that do not specify a content type
explicitly have been added to forbidden apis to prevent more of these from entering our code base.
Relates #19388
Get HEAD requests incorrectly return a content-length header of 0. This
commit addresses this by removing the special handling for get HEAD
requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.
Relates #23186
Get source HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for get
source HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.
Relates #23151
Today all search phases are inner classes of AbstractSearchAsyncAction or one of it's
subclasses. This makes unit testing of these classes practically impossible. This commit
Extracts `DfsQueryPhase` and `FetchSearchPhase` or of the code that composes the actual
query execution types and moves most of the fan-out and collect code into an `InitialSearchPhase`
class that can be used to build initial search phases (phases that retry on shards). This will
make modification to these classes simpler and allows to easily compose or add new search phases
down the road if additional roundtrips are required.
When Netty decodes a bad HTTP request, it marks the decoder result on
the HTTP request as a failure, and reroutes the request to GET
/bad-request. This either leads to puzzling responses when a bad request
is sent to Elasticsearch (if an index named "bad-request" does not exist
then it produces an index not found exception and otherwise responds
with the index settings for the index named "bad-request"). This commit
addresses this by inspecting the decoder result on the HTTP request and
dispatching the request to a bad request handler preserving the initial
cause of the bad request and providing an error message to the client.
Relates #23153
This commit adds a new method to the TransportChannel that provides access to the version of the
remote node that the response is being sent on and that the request came from. This is helpful
for serialization of data attached as headers.
Template HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for
template HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.
Relates #23130
Index HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for index
HEAD requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.
Relates #23112
Alias HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for alias
HEAD requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.
Relates #23094