For the record, I also had to remove the geo-hash cell and geo-distance range
queries to make the code compile. These queries already throw an exception in
all cases with 5.x indices, so that does not hurt any more.
I also had to rename all 2.x bwc indices from `index-${version}` to
`unsupported-${version}` to make `OldIndexBackwardCompatibilityIT`
happy.
SearchTemplateRequest to implement CompositeIndicesRequest
Given that SearchTemplateRequest effectively delegates to search when a search is being executed, it should implement the CompositeIndicesRequest interface. The subrequests method should return a single search request. When a search is not going to be executed, because we are in simulate mode, there are no inner requests, and there are no corresponding indices to that request either.
Closes#21747
Set lucene version to 6.4.0-snapshot-ec38570 and update all the sha1s/license
Fix invalid combo after upgrade in query_string query. split_on_whitespace=false is disallowed if auto_generate_phrase_queries=true
Adapt the expectations of some tests to the new format of the Lucene explain output
Lucene 6.2 added index and query support for numeric ranges. This commit adds a new RangeFieldMapper for indexing numeric (int, long, float, double) and date ranges and creating appropriate range and term queries. The design is similar to NumericFieldMapper in that it uses a RangeType enumerator for implementing the logic specific to each type. The following range types are supported by this field mapper: int_range, float_range, long_range, double_range, date_range.
Lucene does not provide a DocValue field specific to RangeField types so the RangeFieldMapper implements a CustomRangeDocValuesField for handling doc value support.
When executing a Range query over a Range field, the RangeQueryBuilder has been enhanced to accept a new relation parameter for defining the type of query as one of: WITHIN, CONTAINS, INTERSECTS. This provides support for finding all ranges that are related to a specific range in a desired way. As with other spatial queries, DISJOINT can be achieved as a MUST_NOT of an INTERSECTS query.
The Transport#connectToNodeLight concepts is confusing and not very flexible.
neither really testable on a unittest level. This commit cleans up the code used
to connect to nodes and simplifies transport implementations to share more code.
This also allows to connect to nodes with custom profiles if needed, for instance
future improvements can be added to connect to/from nodes that are non-data nodes without
dedicated bulks and recovery connections.
When Netty listens on a socket, it specifies the established connection
backlog for the socket. On Linux, Netty tries to read the system-wide
configuration for this from /proc/sys/net/core/somaxconn and falls back
to a default value when it can not read this value. This commit grants
Netty permission to read this file so that it can honor the system-wide
configuration for the connection backlog for sockets that it is
listening on. This also removes an obnoxious stack trace that appears
when Netty logging is set to debug logging.
Relates #21840
In the past we ran yaml tests against an internal cluster, which would get restarted after each test failure, hence the client objects needed to eventually be refreshed before each test. That is why we had the initClient method to re-initialize the YamlTestClient in the execution context. We ended up though re-initializing the client unconditionally, which is not needed.
Also, ESRestTestCase recreates the RestClient against the external cluster before each test, which is not needed given that nothing changes in the external cluster.
This commit removes the initClient method from the yaml tests execution context. The YamlTestClient can be eagerly created before the first yaml test runs and then re-used in subsequent tests. Also api calls to check for nodes versions etc. are moved out of YamlTestClient to ESClientYamlSuiteTestCase. Also the RestClient is now initialized in ESRestTestCase before the first test runs, and kept around afterwards as a static member.
Basically each subclass of EsRestTestCase will have its own RestClient instance, but the client will be shared across the different tests within the same class. The yaml test suite is just a special suite, composed of 600+ tests that are loaded from files, which will share the same client instance.
This change should speed tests up as well, as we don't recreate the RestClient before each single test, and we don't call _cat/nodes either before each single test.
This commit simplifies the handling of fatal errors on the network
layer. The simplification here is to remove the use of a
StringWriter/PrintWriter pair to format the stack trace, removing the
need for the method to declare that it throws a checked IOException.
If a bug occurs in painless compilation (not from a user, but from the
painless infrastructure), a VerifyError may be thrown when compiling the
broken generated class. This commit wraps VerifyErrors in
ScriptException so that useful information is returned to the user,
which can be passed on to the ES team for analysis.
This bug would cause a VerifyError when scripts using the === operator
were comparing a def type against a primitive type since the primitive
type wasn't being appropriately boxed.
Today when handling unreleased versions for backwards compatilibity
support, we scatted version constants across the code base and add some
asserts to support removing these constants when the version in question
is actually released. This commit improves this situation, enabling us
to just add a single unreleased version constant that can be renamed
when the version is actually released. This should make maintenance of
these versions simpler.
Relates #21760
NOTE: The result of `?.` and `?:` can't be assigned to primitives. So
`int[] someArray = null; int l = someArray?.length` and
`int s = params.size ?: 100` don't work. Do
`def someArray = null; def l = someArray?.length` and
`def s = params.size ?: 100` instead.
Relates to #21748
* Scripting: Remove groovy scripting language
Groovy was deprecated in 5.0. This change removes it, along with the
legacy default language infrastructure in scripting.
You can use `Debug.explain(someObject)` in painless to throw an
`Error` that can't be caught by painless code and contains an
object's class. This is useful because painless's sandbox doesn't
allow you to call `someObject.getClass()`.
Closes#20263
When a fatal error is thrown on the network layer, such an error never
makes its way to the uncaught exception handler. This prevents the node
from being torn down if an out of memory error or other fatal error is
thrown while handling HTTP or transport traffic. This commit adds logic
to ensure that such errors bubble their way up to the uncaught exception
handler, even though Netty tries really hard to swallow everything.
Relates #21720
This should make debugging painless' analysis and code generation a
little easier.
The `toString` implementations mirror the AST somewhat, and look like
`(SSource (SReturn (ENumeric 1)))`.
Today we read a vint from the stream to allocate the size of an array up-front
before we start reading the values. This can be dangerous if for instance we read
from a corrupted stream or if some manipulated bytes are send for instance from
an attacker or a fuzzer. In most of the cases we can apply some best effort and
validate the array size to be _sane_ by ensuring we can at read at least N bytes
where N is the expected size of the array.
When Groovy detects a bug in its runtime because an internal assertion
was violated, it throws an GroovyBugError. This descends from
AssertionError and if it goes uncaught will land in the uncaught
exception handler and will not deliver any useful information to the
user. This commit wraps GroovyBugErrors in ScriptExceptions so that
useful information is returned to the user.
Implements a null coalescing operator in painless that looks like `?:`. This form was chosen to emulate Groovy's `?:` operator. It is different in that it only coalesces null values, instead of Groovy's `?:` operator which coalesces all falsy values. I believe that makes it the same as Kotlin's `?:` operator. In other languages this operator looks like `??` (C#) and `COALESCE` (SQL) and `:-` (bash).
This operator is lazy, meaning the right hand side is only evaluated at all if the left hand side is null.
This commit exposes the executor service interface from thread
pool. This will enable some high-level concurrency primitives that will
make some code cleaner and simpler.
Relates #21608
We kept `netty_3` as a fallback in the 5.x series but now that master
is 6.0 we don't need this or in other words all issues coming up with
netty 4 will be blockers for 6.0.
* master: (22 commits)
Add proper toString() method to UpdateTask (#21582)
Fix `InternalEngine#isThrottled` to not always return `false`. (#21592)
add `ignore_missing` option to SplitProcessor (#20982)
fix trace_match behavior for when there is only one grok pattern (#21413)
Remove dead code from GetResponse.java
Fixes date range query using epoch with timezone (#21542)
Do not cache term queries. (#21566)
Updated dynamic mapper section
Docs: Clarify date_histogram bucket sizes for DST time zones
Handle release of 5.0.1
Fix skip reason for stats API parameters test
Reduce skip version for stats API parameter tests
Strict level parsing for indices stats
Remove cluster update task when task times out (#21578)
[DOCS] Mention "all-fields" mode doesn't search across nested documents
InternalTestCluster: when restarting a node we should validate the cluster is formed via the node we just restarted
Fixed bad asciidoc in boolean mapping docs
Fixed bad asciidoc ID in node stats
Be strict when parsing values searching for booleans (#21555)
Fix time zone rounding edge case for DST overlaps
...
There is an issue in the Grok Processor, where trace_match: true does not inject the _ingest._grok_match_index into the ingest-document when there is just one pattern provided. This is due to an optimization in the regex construction. This commit adds a check for when this is the case, and injects a static index value of "0", since there is only one pattern matched (at the first index into the patterns).
To make this clearer, more documentation was added to the grok-processor docs.
Fixes#21371.
* master:
Hack around cluster service and logging race
Do not prematurely shutdown Log4j
Support decimal constants with trailing [dD] in painless (#21412)
In painless suggest a long constant if int won't do (#21415)
Account for different paths for sysctl utilities
[TEST] testRebalancePossible() may not have an assigned node id
Tests: Disable merge in SearchCancellationTests
Tests: clean search scroll at the end of SearchCancellationIT
This adds support to painless for decimal constants with trailing `d` or
`D` to make it compatible with Java. It already supported integer
constants with a trailing `d` or `D` but this adds tests for it.
Closes#21116
In painless we prefer explicit types over implicit ones whereas
groovy is the other way around. Take this groovy code:
```
> 86400000.class
java.lang.Integer
> 864000000000.class
java.lang.Long
```
Painless accepts `86400000` just fine because that is a valid `int`
in the jvm. It rejects `864000000000` as an invlid `int` constant
because, in painless as in java, `long` constants always end in `L`
or `l`.
To ease the transition from groovy to painless, this changes the
compilation error returned from these invalid constants from:
```
Invalid int constant [864000000000].
```
to
```
Invalid int constant [864000000000]. If you want a long constant then change it to [864000000000L].
```
Inspired by #21313
* master: (516 commits)
Avoid angering Log4j in TransportNodesActionTests
Add trace logging when aquiring and releasing operation locks for replication requests
Fix handler name on message not fully read
Remove accidental import.
Improve log message in TransportNodesAction
Clean up of Script.
Update Joda Time to version 2.9.5 (#21468)
Remove unused ClusterService dependency from SearchPhaseController (#21421)
Remove max_local_storage_nodes from elasticsearch.yml (#21467)
Wait for all reindex subtasks before rethrottling
Correcting a typo-Maan to Man-in README.textile (#21466)
Fix InternalSearchHit#hasSource to return the proper boolean value (#21441)
Replace all index date-math examples with the URI encoded form
Fix typos (#21456)
Adapt ES_JVM_OPTIONS packaging test to ubuntu-1204
Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431)
Add VirtualBox version check (#21370)
Export ES_JVM_OPTIONS for SysV init
Skip reindex rethrottle tests with workers
Make forbidden APIs be quieter about classpath warnings (#21443)
...
In the test for reindex and friend's rethrottling feature we were waiting
only for a single reindex sub task to start before rethrottling. This
mostly worked because starting tasks is fast. But it didn't *always work
and CI found that for us. This fixes the test to wait for all subtasks
to start before rethrottling.
I reproduced this locally semi-consistently with some fairly creative
`Thread.sleep` calls and this test fix fixes the issue even with the
sleeps so I'm fairly sure this will work consistently.
Closes#21446
The method used to be called `isSourceEmpty`, and was renamed to `hasSource`, but the return value never changed. Updated tests and users accordingly.
Closes#21419
Null safe dereferences make handling null or missing values shorter.
Compare without:
```
if (ctx._source.missing != null && ctx._source.missing.foo != null) {
ctx._source.foo_length = ctx.source.missing.foo.length()
}
```
To with:
```
Integer length = ctx._source.missing?.foo?.length();
if (length != null) {
ctx._source.foo_length = length
}
```
Combining this with the as of yet unimplemented elvis operator allows
for very concise defaults for nulls:
```
ctx._source.foo_length = ctx._source.missing?.foo?.length() ?: 0;
```
Since you have to start somewhere, we started with null safe dereferenes.
Anyway, this is a feature borrowed from groovy. Groovy allows writing to
null values like:
```
def v = null
v?.field = 'cat'
```
And the writes are simply ignored. Painless doesn't support this at this
point because it'd be complex to implement and maybe not all that useful.
There is no runtime cost for this feature if it is not used. When it is
used we implement it fairly efficiently, adding a jump rather than a
temporary variable.
This should also work fairly well with doc values.
If you try to reindex with multiple slices against a node that
doesn't support it we throw an `IllegalArgumentException` so
`assertVersionSerializable` is ok with it and so if this happens
in REST it comes back as a 400 error.
* Rest client: don't reuse that same HttpAsyncResponseConsumer across multiple retries
Turns out that AbstractAsyncResponseConsumer from apache async http client is stateful and cannot be reused across multiple requests. The failover mechanism was mistakenly reusing that same instance, which can be provided by users, across retries in case nodes are down or return 5xx errors. The downside is that we have to change the signature of two public methods, as HttpAsyncResponseConsumer cannot be provided directly anymore, rather its factory needs to be provided which is going to be used to create one instance of the consumer per request attempt.
Up until now we tested our RestClient against multiple nodes only in a mock environment, where we don't really send http requests. In that scenario we can verify that retries etc. work properly but the interaction with the http client library in a real scenario is different and can catch other problems. With this commit we also add an integration test that sends requests to multiple hosts, and some of them may also get stopped meanwhile. The specific test for pathPrefix was also removed as pathPrefix is now randomly applied by default, hence implicitly tested. Moved also a small test method that checked the validity of the path argument to the unit test RestClientSingleHostTests.
Also increase default buffer limit to 100MB and make it required in default consumer
The default buffer limit used to be 10MB but that proved not to be high enough for scroll requests (see reindex from remote). With this commit we increase the limit to 100MB and make it a bit more visibile in the consumer factory.
At one point in the past when moving out the rest tests from core to
their own subproject, we had multiple test classes which evenly split up
the tests to run. However, we simplified this and went back to a single
test runner to have better reproduceability in tests. This change
removes the remnants of that multiplexing support.
Adds support for `?slices=N` to reindex which automatically
parallelizes the process using parallel scrolls on `_uid`. Performance
testing sees a 3x performance improvement for simple docs
on decent hardware, maybe 30% performance improvement
for more complex docs. Still compelling, especially because
clusters should be able to get closer to the 3x than the 30%
number.
Closes#20624