Commit Graph

3720 Commits

Author SHA1 Message Date
Nik Everett 457c2d8fb0 Add Debug.explain to painless
You can use `Debug.explain(someObject)` in painless to throw an
`Error` that can't be caught by painless code and contains an
object's class. This is useful because painless's sandbox doesn't
allow you to call `someObject.getClass()`.

Closes #20263
2016-11-22 12:46:02 -05:00
Jason Tedor 446037ccb8 Die with dignity on the network layer
When a fatal error is thrown on the network layer, such an error never
makes its way to the uncaught exception handler. This prevents the node
from being torn down if an out of memory error or other fatal error is
thrown while handling HTTP or transport traffic. This commit adds logic
to ensure that such errors bubble their way up to the uncaught exception
handler, even though Netty tries really hard to swallow everything.

Relates #21720
2016-11-21 22:14:30 -05:00
Nik Everett f5c8c746e6 Implement toString in painless's AST
This should make debugging painless' analysis and code generation a
little easier.

The `toString` implementations mirror the AST somewhat, and look like
`(SSource (SReturn (ENumeric 1)))`.
2016-11-21 16:24:10 -05:00
Simon Willnauer cb5c25ab4f Add a StreamInput#readArraySize method that ensures sane array sizes (#21697)
Today we read a vint from the stream to allocate the size of an array up-front
before we start reading the values. This can be dangerous if for instance we read
from a corrupted stream or if some manipulated bytes are send for instance from
an attacker or a fuzzer. In most of the cases we can apply some best effort and
validate the array size to be _sane_ by ensuring we can at read at least N bytes
where N is the expected size of the array.
2016-11-21 21:39:21 +01:00
Jason Tedor 655c4fe172 Wrap GroovyBugErrors in ScriptExceptions
When Groovy detects a bug in its runtime because an internal assertion
was violated, it throws an GroovyBugError. This descends from
AssertionError and if it goes uncaught will land in the uncaught
exception handler and will not deliver any useful information to the
user. This commit wraps GroovyBugErrors in ScriptExceptions so that
useful information is returned to the user.
2016-11-19 07:11:13 -05:00
Nik Everett ae468441dc Implement the ?: operator in painless (#21506)
Implements a null coalescing operator in painless that looks like `?:`. This form was chosen to emulate Groovy's `?:` operator. It is different in that it only coalesces null values, instead of Groovy's `?:` operator which coalesces all falsy values. I believe that makes it the same as Kotlin's `?:` operator. In other languages this operator looks like `??` (C#) and `COALESCE` (SQL) and `:-` (bash).

This operator is lazy, meaning the right hand side is only evaluated at all if the left hand side is null.
2016-11-18 13:54:26 -05:00
Jack Conradson ced433e9a8 Fix reserved variable availability in lambdas in Painless 2016-11-17 13:39:08 -08:00
Jason Tedor b08a2e1f31 Expose executor service interface from thread pool
This commit exposes the executor service interface from thread
pool. This will enable some high-level concurrency primitives that will
make some code cleaner and simpler.

Relates #21608
2016-11-17 09:18:49 -05:00
Simon Willnauer de04aad994 Remove `modules/transport_netty_3` in favor of `netty_4` (#21590)
We kept `netty_3` as a fallback in the 5.x series but now that master
is 6.0 we don't need this or in other words all issues coming up with
netty 4 will be blockers for 6.0.
2016-11-17 12:44:42 +01:00
Jason Tedor d06a8903fd Merge branch 'master' into feature/seq_no
* master: (22 commits)
  Add proper toString() method to UpdateTask (#21582)
  Fix `InternalEngine#isThrottled` to not always return `false`. (#21592)
  add `ignore_missing` option to SplitProcessor (#20982)
  fix trace_match behavior for when there is only one grok pattern (#21413)
  Remove dead code from GetResponse.java
  Fixes date range query using epoch with timezone (#21542)
  Do not cache term queries. (#21566)
  Updated dynamic mapper section
  Docs: Clarify date_histogram bucket sizes for DST time zones
  Handle release of 5.0.1
  Fix skip reason for stats API parameters test
  Reduce skip version for stats API parameter tests
  Strict level parsing for indices stats
  Remove cluster update task when task times out (#21578)
  [DOCS] Mention "all-fields" mode doesn't search across nested documents
  InternalTestCluster: when restarting a node we should validate the cluster is formed via the node we just restarted
  Fixed bad asciidoc in boolean mapping docs
  Fixed bad asciidoc ID in node stats
  Be strict when parsing values searching for booleans (#21555)
  Fix time zone rounding edge case for DST overlaps
  ...
2016-11-16 09:10:35 -05:00
Tal Levy 6796464f16 add `ignore_missing` option to SplitProcessor (#20982)
Closes #20840.
2016-11-16 15:46:09 +02:00
Tal Levy 04b712bdc5 fix trace_match behavior for when there is only one grok pattern (#21413)
There is an issue in the Grok Processor, where trace_match: true does not inject the _ingest._grok_match_index into the ingest-document when there is just one pattern provided. This is due to an optimization in the regex construction. This commit adds a check for when this is the case, and injects a static index value of "0", since there is only one pattern matched (at the first index into the patterns).

To make this clearer, more documentation was added to the grok-processor docs.

Fixes #21371.
2016-11-16 15:41:54 +02:00
Boaz Leskes 2c0338fa87 Merge remote-tracking branch 'upstream/master' into feature/seq_no 2016-11-15 17:09:08 +00:00
Adrien Grand df4482fdc8 Do not cache the QueryShardContext in PercolatorFieldMapper: it is cheap to create. 2016-11-15 15:45:18 +01:00
Adrien Grand 54809065a6 Make PercolatorFieldMapper get a QueryShardContext lazily. 2016-11-15 12:02:40 +01:00
Boaz Leskes c9f49039d3 Merge remote-tracking branch 'upstream/master' into feature/seq_no 2016-11-15 10:14:47 +00:00
Ryan Ernst d14c470b89 Remove generics from ActionRequest
closes #21368
2016-11-14 15:32:01 -08:00
Adrien Grand 1fd5c47e7f Upgrade to lucene-6.3.0. (#21464) 2016-11-14 09:36:45 +01:00
Jason Tedor c7a1b3eb50 Merge branch 'master' into feature/seq_no
* master:
  Hack around cluster service and logging race
  Do not prematurely shutdown Log4j
  Support decimal constants with trailing [dD] in painless (#21412)
  In painless suggest a long constant if int won't do (#21415)
  Account for different paths for sysctl utilities
  [TEST] testRebalancePossible() may not have an assigned node id
  Tests: Disable merge in SearchCancellationTests
  Tests: clean search scroll at the end of SearchCancellationIT
2016-11-13 20:01:44 -05:00
Nik Everett 2a328034ef Support decimal constants with trailing [dD] in painless (#21412)
This adds support to painless for decimal constants with trailing `d` or
`D` to make it compatible with Java. It already supported integer
constants with a trailing `d` or `D` but this adds tests for it.

Closes #21116
2016-11-12 11:08:39 -05:00
Nik Everett a26b5a113c In painless suggest a long constant if int won't do (#21415)
In painless we prefer explicit types over implicit ones whereas
groovy is the other way around. Take this groovy code:

```
> 86400000.class
java.lang.Integer
> 864000000000.class
java.lang.Long
```

Painless accepts `86400000` just fine because that is a valid `int`
in the jvm. It rejects `864000000000` as an invlid `int` constant
because, in painless as in java, `long` constants always end in `L`
or `l`.

To ease the transition from groovy to painless, this changes the
compilation error returned from these invalid constants from:

```
Invalid int constant [864000000000].
```

to

```
Invalid int constant [864000000000]. If you want a long constant then change it to [864000000000L].
```

Inspired by #21313
2016-11-12 11:08:18 -05:00
Jason Tedor d3417fb022 Merge branch 'master' into feature/seq_no
* master: (516 commits)
  Avoid angering Log4j in TransportNodesActionTests
  Add trace logging when aquiring and releasing operation locks for replication requests
  Fix handler name on message not fully read
  Remove accidental import.
  Improve log message in TransportNodesAction
  Clean up of Script.
  Update Joda Time to version 2.9.5 (#21468)
  Remove unused ClusterService dependency from SearchPhaseController (#21421)
  Remove max_local_storage_nodes from elasticsearch.yml (#21467)
  Wait for all reindex subtasks before rethrottling
  Correcting a typo-Maan to Man-in README.textile (#21466)
  Fix InternalSearchHit#hasSource to return the proper boolean value (#21441)
  Replace all index date-math examples with the URI encoded form
  Fix typos (#21456)
  Adapt ES_JVM_OPTIONS packaging test to ubuntu-1204
  Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431)
  Add VirtualBox version check (#21370)
  Export ES_JVM_OPTIONS for SysV init
  Skip reindex rethrottle tests with workers
  Make forbidden APIs be quieter about classpath warnings (#21443)
  ...
2016-11-10 23:40:33 -05:00
Jack Conradson aeb97ff412 Clean up of Script.
Closes #21321
2016-11-10 09:59:13 -08:00
Nik Everett 4db21db0aa Wait for all reindex subtasks before rethrottling
In the test for reindex and friend's rethrottling feature we were waiting
only for a single reindex sub task to start before rethrottling. This
mostly worked because starting tasks is fast. But it didn't *always work
and CI found that for us. This fixes the test to wait for all subtasks
to start before rethrottling.

I reproduced this locally semi-consistently with some fairly creative
`Thread.sleep` calls and this test fix fixes the issue even with the
sleeps so I'm fairly sure this will work consistently.

Closes #21446
2016-11-10 10:49:25 -05:00
Luca Cavanna bd23921a3a Fix InternalSearchHit#hasSource to return the proper boolean value (#21441)
The method used to be called `isSourceEmpty`, and was renamed to `hasSource`, but the return value never changed. Updated tests and users accordingly.

Closes #21419
2016-11-10 13:13:38 +01:00
Nik Everett b0f5ea3f59 Skip reindex rethrottle tests with workers
They are flakey and spuriously fail the build. I'll hunt down
the cause soon and reenabled but for now they should stop.

Relates #21446
2016-11-09 17:50:09 -05:00
Nik Everett d03b8e4abb Implement reading from null safe dereferences
Null safe dereferences make handling null or missing values shorter.
Compare without:
```
if (ctx._source.missing != null && ctx._source.missing.foo != null) {
  ctx._source.foo_length = ctx.source.missing.foo.length()
}
```

To with:
```
Integer length = ctx._source.missing?.foo?.length();
if (length != null) {
  ctx._source.foo_length = length
}
```

Combining this with the as of yet unimplemented elvis operator allows
for very concise defaults for nulls:
```
ctx._source.foo_length = ctx._source.missing?.foo?.length() ?: 0;
```

Since you have to start somewhere, we started with null safe dereferenes.

Anyway, this is a feature borrowed from groovy. Groovy allows writing to
null values like:
```
def v = null
v?.field = 'cat'
```
And the writes are simply ignored. Painless doesn't support this at this
point because it'd be complex to implement and maybe not all that useful.

There is no runtime cost for this feature if it is not used. When it is
used we implement it fairly efficiently, adding a jump rather than a
temporary variable.

This should also work fairly well with doc values.
2016-11-09 07:20:11 -05:00
Nik Everett a3bd6d1ad9 Switch reindex with slices error to IAE
If you try to reindex with multiple slices against a node that
doesn't support it we throw an `IllegalArgumentException` so
`assertVersionSerializable` is ok with it and so if this happens
in REST it comes back as a 400 error.
2016-11-08 11:42:07 -05:00
Luca Cavanna 293a3cab01 Rest client: don't reuse that same HttpAsyncResponseConsumer across multiple retries (#21378)
* Rest client: don't reuse that same HttpAsyncResponseConsumer across multiple retries

Turns out that AbstractAsyncResponseConsumer from apache async http client is stateful and cannot be reused across multiple requests. The failover mechanism was mistakenly reusing that same instance, which can be provided by users, across retries in case nodes are down or return 5xx errors. The downside is that we have to change the signature of two public methods, as HttpAsyncResponseConsumer cannot be provided directly anymore, rather its factory needs to be provided which is going to be used to create one instance of the consumer per request attempt.

Up until now we tested our RestClient against multiple nodes only in a mock environment, where we don't really send http requests. In that scenario we can verify that retries etc. work properly but the interaction with the http client library in a real scenario is different and can catch other problems. With this commit we also add an integration test that sends requests to multiple hosts, and some of them may also get stopped meanwhile. The specific test for pathPrefix was also removed as pathPrefix is now randomly applied by default, hence implicitly tested. Moved also a small test method that checked the validity of the path argument to the unit test RestClientSingleHostTests.

Also increase default buffer limit to 100MB and make it required in default consumer

The default buffer limit used to be 10MB but that proved not to be high enough for scroll requests (see reindex from remote). With this commit we increase the limit to 100MB and make it a bit more visibile in the consumer factory.
2016-11-08 16:42:42 +01:00
Ryan Ernst 7a2c984bcc Test: Remove multi process support from rest test runner (#21391)
At one point in the past when moving out the rest tests from core to
their own subproject, we had multiple test classes which evenly split up
the tests to run. However, we simplified this and went back to a single
test runner to have better reproduceability in tests. This change
removes the remnants of that multiplexing support.
2016-11-07 15:07:34 -08:00
Jason Tedor 23a271f092 Address race condition in HTTP pipeline tests
This commit adapts a previous fix to the HTTP pipeline tests for Netty 4
to Netty 3.

Relates #19845
2016-11-07 13:20:22 -05:00
Nik Everett a13a050271 Add automatic parallelization support to reindex and friends (#20767)
Adds support for `?slices=N` to reindex which automatically
parallelizes the process using parallel scrolls on `_uid`. Performance
testing sees a 3x performance improvement for simple docs
on decent hardware, maybe 30% performance improvement
for more complex docs. Still compelling, especially because
clusters should be able to get closer to the 3x than the 30%
number.

Closes #20624
2016-11-04 20:59:15 -04:00
Adrien Grand 2a70f6e7b1 Upgrade to lucene-6.3.0-snapshot-a66a445. (#21309)
This addresses a bug that was introduced with https://issues.apache.org/jira/browse/LUCENE-7501.
2016-11-04 10:34:04 +01:00
Nik Everett 24d5f31a54 Make painless's assertion about out of bound less brittle
Instead of asserting that the message is shaped a certain way we
cause the exception and catch it and assert that the messages are
the same. This is the way to go because the exception message from
the jvm is both local and jvm dependent.

This is the CI failure that found this:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+java9-periodic/515/consoleFull
2016-11-02 12:38:51 -04:00
Christoph Büscher b3370de715 Tests: Add warning header checks to QueryBuilder tests and QueryParseContextTests
This adds checks for expected warning headers to the query builder test
infrastructure. Tests that are adding deprecation warnings to the response
headers need to check those, otherwise the abstract base class for the test
class will complain at teardown.
2016-11-02 15:45:33 +01:00
Adrien Grand aa6cd93e0f Require arguments for QueryShardContext creation. (#21196)
The `IndexService#newQueryShardContext()` method creates a QueryShardContext on
shard `0`, with a `null` reader and that uses `System.currentTimeMillis()` to
resolve `now`. This may hide bugs, since the shard id is sometimes used for
query parsing (it is used to salt random score generation in `function_score`),
passing a `null` reader disables query rewriting and for some use-cases, it is
simply not ok to rely on the current timestamp (eg. percolation). So this pull
request removes this method and instead requires that all call sites provide
these parameters explicitly.
2016-11-02 09:48:49 +01:00
Nik Everett a612e5988e Bump reindex-from-remote's buffer to 200mb
It was 10mb and that was causing trouble when folks reindex-from-remoted
with large documents.

We also improve the error reporting so it tells folks to use a smaller
batch size if they hit a buffer size exception. Finally, adds some docs
to reindex-from-remote mentioning the buffer and giving an example of
lowering the size.

Closes #21185
2016-11-01 13:19:28 -04:00
Jason Tedor 38663351dc Fix logger names for Netty
Previously Elasticsearch would only use the package name for logging
levels, truncating the package prefix and the class name. This meant
that logger names for Netty were just prefixed by netty3 and netty. We
changed this for Elasticsearch so that it's the fully-qualified class
name now, but never corrected this for Netty. This commit fixes the
logger names for the Netty modules so that their levels are controlled
by the fully-qualified class name.

Relates #21223
2016-10-31 17:23:21 -04:00
Jack Conradson 185dff7346 Cleanup ScriptType (#21179)
Refactored ScriptType to clean up some of the variable and method names. Added more documentation. Deprecated the 'in' ParseField in favor of 'stored' to match the indexed scripts being replaced by stored scripts.
2016-10-31 13:48:51 -07:00
Nik Everett 1bbd3c5400 Fix painless's out of bounds assertions in java 9
Java 9's exception message when lists have an out of bounds index
is much better than java 8 but the painless code asserted on the
java 8 message. Now it'll accept either.

I'm tempted to weaken the assertion but I like asserting that the
message is readable.
2016-10-29 22:21:57 -04:00
Nik Everett 3a7a218e8f Support negative array ofsets in painless
Adds support for indexing into lists and arrays with negative
indexes meaning "counting from the back". So for if
`x = ["cat", "dog", "chicken"]` then `x[-1] == "chicken"`.

This adds an extra branch to every array and list access but
some performance testing makes it look like the branch predictor
successfully predicts the branch every time so there isn't a
in execution time for this feature when the index is positive.
When the index is negative performance testing showed the runtime
is the same as writing `x[x.length - 1]`, again, presumably thanks
to the branch predictor.

Those performance metrics were calculated for lists and arrays but
`def`s get roughly the same treatment though instead of inlining
the test they need to make a invoke dynamic so we don't screw up
maps.

Closes #20870
2016-10-29 16:12:40 -04:00
Adrien Grand b3cc54cf0d Upgrade to lucene-6.3.0-snapshot-ed102d6 (#21150)
Lucene 6.3 is expected to be released in the next weeks so it'd be good to give
it some integration testing. I had to upgrade randomized-testing too so that
both Lucene and Elasticsearch are on the same version.
2016-10-28 14:47:15 +02:00
Jack Conradson 512a77a633 Refactor ScriptType to be a top-level class. 2016-10-26 10:21:22 -07:00
Jason Tedor 9c3e4d6e22 Add correct Content-Length on HEAD requests
This commit fixes responses to HEAD requests so that the value of the
Content-Length is correct per the HTTP spec. Namely, the value of this
header should be equal to the Content-Length if the request were not a
HEAD request.

This commit also fixes a memory leak on HEAD requests to the main action
that arose from the bytes on a builder not being released due to them
being dropped on the floor to ensure that the response to the main
action did not have a body.

Relates #21123
2016-10-25 23:08:19 -04:00
Nik Everett 18393a06f3 Fix reindex-from-remote for parent/child from <2.0
Versions before 2.0 needed to be told to return interesting fields
like `_parent`, `_routing`, `_ttl`, and `_timestamp`. And they come
back inside a `fields` block which we need to parse.

Closes #21044
2016-10-21 13:14:33 -04:00
Jason Tedor f51bf8ee47 Upgrade to Netty 4.1.6
This commit upgrades the transport-netty4 module dependency from Netty
version 4.1.5 to version 4.1.6. This is a bug fix release of Netty.

Relates #21051
2016-10-20 20:13:29 -04:00
Jack Conradson ceaae47d38 Remove more equivalents of the now method from the Painless whitelist. 2016-10-20 10:35:26 -07:00
Nik Everett b5da42905f Remove publishAddress from reindex whitelist
Removes the `publishAddress` parameter from the reindex-from-remote
whitelist checking because it isn't in use after #21004.
2016-10-20 12:51:10 -04:00
Fanfan 043a45746c some misspelled words in code (#21012)
as the title mentioned, misspelling as follows, "construct" to "constrcut", "cumulation" to "cumalation", "initialize" to "intialize".
2016-10-19 11:42:38 -04:00
Nik Everett acf7c7430b Add "simple match" support for reindex-from-remote whitelist
This allows you to whitelist `localhost:*` or `127.0.10.*:9200`.
It explicitly checks for patterns like `*` in the whitelist and
refuses to start if the whitelist would match everything. Beyond
that the user is on their own designing a secure whitelist.
2016-10-18 21:47:21 -04:00