3900 Commits

Author SHA1 Message Date
Ryan Ernst
175bda64a0 Build: Rework integ test setup and shutdown to ensure stop runs when desired (#23304)
Gradle's finalizedBy on tasks only ensures one task runs after another,
but not immediately after. This is problematic for our integration tests
since it allows multiple project's integ test clusters to be
simultaneously. While this has not been a problem thus far (gradle 2.13
happened to keep the finalizedBy tasks close enough that no clusters
were running in parallel), with gradle 3.3 the task graph generation has
changed, and numerous clusters may be running simultaneously, causing
memory pressure, and thus generally slower tests, or even failure if the
system has a limited amount of memory (eg in a vagrant host).

This commit reworks how integ tests are configured. It adds an
`integTestCluster` extension to gradle which is equivalent to the current
`integTest.cluster` and moves the rest test runner task to
`integTestRunner`.  The `integTest` task is then just a dummy task,
which depends on the cluster runner task, as well as the cluster stop
task. This means running `integTest` in one project will both run the
rest tests, and shut down the cluster, before running `integTest` in
another project.
2017-02-22 12:43:15 -08:00
Jason Tedor
708d11f54a Ensure that releasing listener is called
When sending a response to a client, we attach a releasing listener to
the channel promise. If the client disappears before the response is
sent, the releasing listener was never notified. The reason the
listeners were never notified was due to a mistaken invocation of write
and flush on the channel which has two overrides: one that takes an
existing promise, and one that does not and instead creates a new
promise. When the client disappears, it is this latter promise that is
notified, which does not contain the releasing listener. This commit
addreses this issue by invoking the override that passes our channel
promise through.

Relates #23310
2017-02-22 13:54:17 -05:00
javanna
594f00c582 Remove content type auto-detection from search templates
Now that search templates always get converted to json, we don't need to try and auto-detect their content-type, which anyways didn't work as expected before given that only json was really working.
2017-02-22 16:20:53 +01:00
javanna
f2acf466aa Convert script/template objects to json format
Elasticsearch accepts multiple content-type formats, hence scripts can be stored/provided in json, yaml, cbor or smile. Yet the format that should be used internally is json. This is a problem mainly around search templates, as they only support json out of the four content-types, so instead of maintaining the content-type of the request we should rather convert the scripts/templates to json.

 Binary formats were not previously supported. If you stored a template in yaml format, you'd get back an error "No encoder found for MIME type [application/yaml]" when trying to execute it. With this commit the request content-type is independent from the template, which always gets converted to json internally. That is transparent to users and doesn't affect the content type of the response obtained when executing the template.
2017-02-22 16:20:53 +01:00
javanna
9391c6ffa9 Replace CustomMustacheFactory constant with same constant from Script (CONTENT_TYPE_OPTION) 2017-02-22 16:20:53 +01:00
Nik Everett
38d25a0369 Fix Painless's implementation of interfaces returning primitives (#23298)
Fixes Painless to properly implement scripts that return primitives
and void. Adds some simple tests that we emit sane opcodes and some
other tests that we implement primitives as expected.

Mostly this is just a fix following up from #22983 but there is one
thing I did really worth talking about, I think. So, before this script
Painless scripts could only ever return Object and they did would always
return null for paths that didn't return any values. Now that they
can return primitives the question is "what should Painless return
from paths that don't return any values?" And I answered that with
"whatever the JLS default value is". So 0/0L/0f/0d/false.
2017-02-21 17:10:55 -05:00
Martijn van Groningen
81d53470e7 percolator: add support for term extraction for MultiPhraseQuery 2017-02-21 21:10:55 +01:00
Nik Everett
9105672969 Allow painless to implement more interfaces (#22983)
Generalizes three previously hard coded things in painless into
generic concepts:

1. The "main method" is no longer hardcoded to:
```
public abstract Object execute(Map<String, Object> params,
        Scorer scorer, LeafDocLookup doc, Object value);
```
Instead Painless's compiler takes an interface and implements it. It looks like:
```
public interface SomeScript {
    // Argument names we expose to Painless scripts
    String[] ARGUMENTS = new String[] {"a", "b"};
    // Method implemented by Painless script. Must be named execute but can have any parameters or return any value.
    Object execute(String a, int b);
    // Is the "a" argument used by the script?
    boolean uses$a();
}
SomeScript script = scriptEngine.compile(SomeScript.class, null, "the_script_here", emptyMap());
Object result = script.execute("a", 1);
```

`PainlessScriptEngine` now compiles all scripts to the new
`GenericElasticsearchScript` interface by default for compatibility
with the rest of Elasticsearch until it is able to use this new
ability.

2. `_score` and `ctx` are no longer hardcoded to be extracted from
`#score` and `params` respectively. Instead Painless's default
implementation of Elasticsearch scripts uses the `uses$_score` and
`uses$ctx` methods to determine if it is used and gives them
dummy values if they are not used.

3. Throwing the `ScriptException` is now handled by the Painless
script itself. That way Painless doesn't have to leak the metadata
that is required to build the fancy stack trace. And all painless scripts
get the fancy stack trace.
2017-02-21 14:08:57 -05:00
Jack Conradson
fac2d954e3 Fix certain bad casts in Painless due to boxing/unboxing. (#23282) 2017-02-21 10:23:27 -08:00
Daniel Mitterdorfer
0744a00001 Set network receive predictor size to 32kb (#23284)
Previously we calculated Netty' receive predictor size for HTTP and transport
traffic based on available memory and worker nodes. This resulted in a receive
predictor size between 64kb and 512kb. In our benchmarks this leads to increased
GC pressure.

With this commit we set Netty's receive predictor size to 32kb. This value is in
a sweet spot between heap memory waste (-> GC pressure) and effect on request
metrics (achieved throughput and latency numbers).

Closes #23185
2017-02-21 14:45:33 +01:00
Jay Modi
b234644035 Enforce Content-Type requirement on the rest layer and remove deprecated methods (#23146)
This commit enforces the requirement of Content-Type for the REST layer and removes the deprecated methods in transport
requests and their usages.

While doing this, it turns out that there are many places where *Entity classes are used from the apache http client
libraries and many of these usages did not specify the content type. The methods that do not specify a content type
explicitly have been added to forbidden apis to prevent more of these from entering our code base.

Relates #19388
2017-02-17 14:45:41 -05:00
Jason Tedor
0a5917d182 Fix get HEAD requests
Get HEAD requests incorrectly return a content-length header of 0. This
commit addresses this by removing the special handling for get HEAD
requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.

Relates #23186
2017-02-15 13:07:29 -05:00
Ryan Ernst
79a1629f74 Fix line length 2017-02-14 21:23:21 -08:00
Jason Tedor
9e80e290d6 Add failing tests for expect header violations
This commit adds unit tests for two cases where Elasticsearch violates
expect header handling. These tests are marked as awaits fix.

Relates #23173
2017-02-14 19:24:22 -05:00
Jason Tedor
673754b1d5 Fix get source HEAD requests
Get source HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for get
source HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.

Relates #23151
2017-02-14 16:37:22 -05:00
Martijn van Groningen
cab43707dc [percolator] Removed old 2.x bwc logic. 2017-02-14 22:17:17 +01:00
Simon Willnauer
aef0665ddb Detach SearchPhases from AbstractSearchAsyncAction (#23118)
Today all search phases are inner classes of AbstractSearchAsyncAction or one of it's
subclasses. This makes unit testing of these classes practically impossible. This commit
Extracts `DfsQueryPhase` and `FetchSearchPhase` or of the code that composes the actual
query execution types and moves most of the fan-out and collect code into an `InitialSearchPhase`
class that can be used to build initial search phases (phases that retry on shards). This will
make modification to these classes simpler and allows to easily compose or add new search phases
down the road if additional roundtrips are required.
2017-02-14 12:34:25 +01:00
Jason Tedor
5343b87502 Handle bad HTTP requests
When Netty decodes a bad HTTP request, it marks the decoder result on
the HTTP request as a failure, and reroutes the request to GET
/bad-request. This either leads to puzzling responses when a bad request
is sent to Elasticsearch (if an index named "bad-request" does not exist
then it produces an index not found exception and otherwise responds
with the index settings for the index named "bad-request"). This commit
addresses this by inspecting the decoder result on the HTTP request and
dispatching the request to a bad request handler preserving the initial
cause of the bad request and providing an error message to the client.

Relates #23153
2017-02-13 17:39:25 -05:00
Jay Modi
61e383813d Make the version of the remote node accessible on a transport channel (#23019)
This commit adds a new method to the TransportChannel that provides access to the version of the
remote node that the response is being sent on and that the request came from. This is helpful
for serialization of data attached as headers.
2017-02-13 15:15:57 -05:00
jaymode
d8d03f45c2
Fix communication with 5.3.0 nodes
This commit fixes communication with 5.3.0 nodes to send XContentType to these nodes since #22691 was backported to the
5.3 branch.
2017-02-13 13:15:51 -05:00
Jason Tedor
0f21ed5b70 Fix template HEAD requests
Template HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for
template HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.

Relates #23130
2017-02-11 18:30:16 -05:00
Jason Tedor
a6158398dd Fix index HEAD requests
Index HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for index
HEAD requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.

Relates #23112
2017-02-10 09:44:01 -05:00
Jason Tedor
7ac44656df Fix alias HEAD requests
Alias HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for alias
HEAD requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.

Relates #23094
2017-02-10 09:19:35 -05:00
Adrien Grand
709cc9ba65 Upgrade to lucene-6.5.0-snapshot-f919485. (#23087) 2017-02-10 15:08:47 +01:00
Tanguy Leroux
e2e5937455 Use typed_keys parameter to prefix suggester names by type in search responses (#23080)
This pull request reuses the typed_keys parameter added in #22965, but this time it applies it to suggesters. When set to true, the suggester names in the search response will be prefixed with a prefix that reflects their type.
2017-02-10 10:53:38 +01:00
Nik Everett
0250c7ab18 Fix reindex test after toString change
Weakens the assertion on wait_for_active_shards so that we don't
check the toString of the bulk request because it isn't important.

Relates to #22900
2017-02-09 16:48:40 -05:00
Tim Brooks
a331405aff Isolated SocketPermissions to Netty (#23057)
Netty 4.1.8 wraps connect and accept operations in doPrivileged blocks.
This means that we not need to give permissions to the entire transport
module. Additionally this commit deletes the privileged socket channel
and privileged server socket chanel.
2017-02-09 10:00:25 -06:00
Tanguy Leroux
3553522328 Add parameter to prefix aggs name with type in search responses (#22965)
This pull request adds a new parameter to the REST Search API named `typed_keys`. When set to true, the aggregation names in the search response will be prefixed with a prefix that reflects the internal type of the aggregation.

Here is a simple example:
```
GET /_search?typed_keys
{
    "aggs": {
        "tweets_per_user": {
            "terms": {
                "field": "user"
            }
        }
    },
    "size": 0
}
```

And the response:

```
{
    "aggs": {
        "sterms:tweets_per_user": {
            ...
        }
    }
}
```

This parameter is intended to make life easier for REST clients that could parse back the prefix and could detect the type of the aggregation to parse. It could also be implemented for suggesters.
2017-02-09 11:19:04 +01:00
Tim Brooks
735e5b1983 Upgrade to Netty 4.1.8 (#23055)
This commit upgrades the Netty dependency to version 4.1.8.Final.
2017-02-08 11:44:36 -06:00
Simon Willnauer
ecb01c15b9 Fold InternalSearchHits and friends into their interfaces (#23042)
We have a bunch of interfaces that have only a single implementation
for 6 years now. These interfaces are pretty useless from a SW development
perspective and only add unnecessary abstractions. They also require
lots of casting in many places where we expect that there is only one
concrete implementation. This change removes the interfaces, makes
all of the classes final and removes the duplicate `foo` `getFoo` accessors
in favor of `getFoo` from these classes.
2017-02-08 14:40:08 +01:00
Tim Brooks
fcc568fd8d Add methods requiring connect to forbidden apis (#22964)
This is related to #22116. This commit adds calls that require
SocketPermission connect to forbidden APIs.

The following calls are now forbidden:

- java.net.URL#openStream()
- java.net.URLConnection#connect()
- java.net.URLConnection#getInputStream()
- java.net.Socket#connect(java.net.SocketAddress)
- java.net.Socket#connect(java.net.SocketAddress, int)
- java.nio.channels.SocketChannel#open(java.net.SocketAddress)
- java.nio.channels.SocketChannel#connect(java.net.SocketAddress)
2017-02-07 14:41:50 -06:00
Boaz Leskes
ba06c14a97 TransportService.connectToNode should validate remote node ID (#22828)
#22194 gave us the ability to open low level temporary connections to remote node based on their address. With this use case out of the way, actual full blown connections should validate the node on the other side, making sure we speak to who we think we speak to. This helps in case where multiple nodes are started on the same host and a quick node restart causes them to swap addresses, which in turn can cause confusion down the road.
2017-02-07 22:11:32 +02:00
Tim Brooks
27b7d9bd8d Add FileSystemUtil method to read 'file:/' URLs (#23020)
As part of #22116 we are going to forbid usage of api
java.net.URL#openStream(). However in a number of places across the
we use this method to read files from the local filesystem. This commit
introduces a helper method openFileURLStream(URL url) to read files
from URLs. It does specific validation to only ensure that file:/
urls are read.

Additionlly, this commit removes unneeded method
FileSystemUtil.newBufferedReader(URL, Charset). This method used the
openStream () method which will soon be forbidden. Instead we use the
Files.newBufferedReader(Path, Charset).
2017-02-07 10:24:22 -06:00
Jay Modi
c898e8ab83 Add support for newline delimited JSON Content-Type (#22947)
This commit adds support for the newline delimited JSON Content-Type, which is how
the bulk, multi-search, and multi-search template APIs expect data to be formatted. The
`elasticsearch-js` client has also been using this content type for these types of requests.

Closes #22943
2017-02-07 09:20:06 -05:00
Nik Everett
0d6e622242 Make dates be ReadableDateTimes in scripts (#22948)
Instead of longs. If you want millis since epoch you can call doc.date_field.value.millis.

Relates to #22875
2017-02-06 16:44:56 -05:00
Nicholas Knize
1c9fdfd1b3 Remove GeoPointFieldMapper abstraction
In order to support the evolving GeoPoint encodings in Lucene 5 and 6, ES 2.x and 5.x implements an abstraction layer to the GeoPointFieldMapper classes. As of 5.x the geo_point field mapper settled on using Lucene's more performant LatLonPoint field type and deprecated all other encodings. In 6.0 all encodings except LatLonPoint have been removed rendering this abstraction layer useless. This commit removes the abstraction layer and renames the LatLonPointFieldMapper back to GeoPointFieldMapper to mantain consistency with ES field naming.
2017-02-06 14:17:21 -06:00
Adrien Grand
c8496fc4f4 Upgrade to Lucene 6.4.1. (#22978) 2017-02-06 09:28:43 +01:00
Nik Everett
b0c9759441 Painless: Don't allow casting from void to def (#22969)
Painless can cast anything into the magic type `def` but it
really shouldn't try to cast **nothing** into `def`. That causes
the byte code generation library to freak out a little.

Closes #22908
2017-02-03 16:38:47 -05:00
Nik Everett
9ca871af7e Test: weaken assertion in fix sliced reindex test
This test was using initial count of slices instead of the count
of unfinished slices to pick the expected throttle. Unfortunely
due to race conditions the actual rethrottle count is between the
two. So we weaken the assertion from "the new throttle is exactly X"
to "the new throttle is between X and Y (inclusive)".
2017-02-03 13:00:49 -05:00
Tim Brooks
f70188ac58 Remove connect SocketPermissions from core (#22797)
This is related to #22116. Core no longer needs `SocketPermission`
`connect`.

This permission is relegated to these modules/plugins:
- transport-netty4 module
- reindex module
- repository-url module
- discovery-azure-classic plugin
- discovery-ec2 plugin
- discovery-gce plugin
- repository-azure plugin
- repository-gcs plugin
- repository-hdfs plugin
- repository-s3 plugin

And for tests:
- mocksocket jar
- rest client
- httpcore-nio jar
- httpasyncclient jar
2017-02-03 09:39:56 -06:00
Christoph Büscher
c33f894846 Fixing compilation problem in Eclipse (#22956) 2017-02-03 16:16:51 +01:00
Nik Everett
18eb0827e6 Reindex: do not log when can't clear old scroll (#22942)
Versions of Elasticsearch prior to 2.0 would return a scroll id
even with the last scroll response. They'd then automatically
clear the scroll because it is empty. When terminating reindex
will attempt to clear the last scroll it received, regardless of
the remote version. This quiets the warning when the scroll cannot
be cleared for versions before 2.0.

Closes #22937
2017-02-03 10:08:27 -05:00
Jason Tedor
9a0b216c36 Upgrade checkstyle to version 7.5
This commit upgrades the checkstyle configuration from version 5.9 to
version 7.5, the latest version as of today. The main enhancement
obtained via this upgrade is better detection of redundant modifiers.

Relates #22960
2017-02-03 09:46:44 -05:00
Nik Everett
ea4eb06b0a Test: Make update-by-query test more resilient
`UpdateByQueryWhileModifyingTests#testUpdateWhileReindexing`
runs update-by-query and concurrently updates, asserting that
the update-by-query never reverts any changes made by the update.
It is a smoke test for concurrent updates.

Now, it expects to hit a certain number of version conflicts
during the updates. This is normal as it is racing the
update-by-query. We have a maximum number of failures we
expect (10) and I'd never seen us come close until
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=sles/495/console

This bumps the max failures from 10 to 50 and improves
logging a bit. If we continue to see this failure then we have
some other issue.

Closes #22938
2017-02-03 09:18:26 -05:00
Jay Modi
7520a107be Optionally require a valid content type for all rest requests with content (#22691)
This change adds a strict mode for xcontent parsing on the rest layer. The strict mode will be off by default for 5.x and in a separate commit will be enabled by default for 6.0. The strict mode, which can be enabled by setting `http.content_type.required: true` in 5.x, will require that all incoming rest requests have a valid and supported content type header before the request is dispatched. In the non-strict mode, the Content-Type header will be inspected and if it is not present or not valid, we will continue with auto detection of content like we have done previously.

The content type header is parsed to the matching XContentType value with the only exception being for plain text requests. This value is then passed on with the content bytes so that we can reduce the number of places where we need to auto-detect the content type.

As part of this, many transport requests and builders were updated to provide methods that
accepted the XContentType along with the bytes and the methods that would rely on auto-detection have been deprecated.

In the non-strict mode, deprecation warnings are issued whenever a request with body doesn't provide the Content-Type header.

See #19388
2017-02-02 14:07:13 -05:00
Nik Everett
ce8e042b66 Reindex: fix reindex-from-remote from <2.0 (#22931)
In 5.2 we stopped sending the source parameter if the user didn't
specify it. This was a mistake as versions before 2.0 look like
they don't always include the `_source`. This is because reindex
requests some metadata fields. Anyway, now we say `"_source": true`
if there isn't a `_source` configured in the reindex request.

Closes #22893
2017-02-02 11:46:24 -05:00
Nik Everett
73bf29072f Painless: Fix def invoked qualified method refs (#22918)
We were incorrectly resolving qualified method references at run
time when invoked on `def`. This lead to errors like
`The struct with name [org] has not been defined.` when attempting

```
doc.date.dates.stream().map(
  org.joda.time.ReadableDateTime::centuryOfEra
).collect(Collectors.toList())
```
2017-02-02 10:15:03 -05:00
Nik Everett
dacc150934 Expose multi-valued dates to scripts and document painless's date functions (#22875)
Implemented by wrapping an array of reused `ModuleDateTime`s that
we grow when needed. The `ModuleDateTime`s are reused when we
move to the next document.

Also improves the error message returned when attempting to modify
the `ScriptdocValues`, removes a couple of allocations, and documents
that the date functions are available in Painless.

Relates to #22162
2017-02-01 21:57:07 -05:00
Jack Conradson
3d2626c4c6 Change Namespace for Stored Script to Only Use Id (#22206)
Currently, stored scripts use a namespace of (lang, id) to be put, get, deleted, and executed. This is not necessary since the lang is stored with the stored script. A user should only have to specify an id to use a stored script. This change makes that possible while keeping backwards compatibility with the previous namespace of (lang, id). Anywhere the previous namespace is used will log deprecation warnings.

The new behavior is the following:

When a user specifies a stored script, that script will be stored under both the new namespace and old namespace.

Take for example script 'A' with lang 'L0' and data 'D0'. If we add script 'A' to the empty set, the scripts map will be ["A" -- D0, "A#L0" -- D0]. If a script 'A' with lang 'L1' and data 'D1' is then added, the scripts map will be ["A" -- D1, "A#L1" -- D1, "A#L0" -- D0].

When a user deletes a stored script, that script will be deleted from both the new namespace (if it exists) and the old namespace.

Take for example a scripts map with {"A" -- D1, "A#L1" -- D1, "A#L0" -- D0}. If a script is removed specified by an id 'A' and lang null then the scripts map will be {"A#L0" -- D0}. To remove the final script, the deprecated namespace must be used, so an id 'A' and lang 'L0' would need to be specified.

When a user gets/executes a stored script, if the new namespace is used then the script will be retrieved/executed using only 'id', and if the old namespace is used then the script will be retrieved/executed using 'id' and 'lang'
2017-01-31 13:27:02 -08:00
Nik Everett
2e48fb8294 Move delete by query helpers into core (#22810)
This moves the building blocks for delete by query into core. This
should enabled two thigns:
1. Plugins other than reindex to implement "bulk by scroll" style
operations.
2. Plugins to directly call delete by query. Those plugins should
be careful to make sure that task cancellation still works, but
this should be possible.

Notes:
1. I've mostly just moved classes and moved around tests methods.
2. I haven't been super careful about cohesion between these core
classes and reindex. They are quite interconnected because I wanted
to make the change as mechanical as possible.

Closes #22616
2017-01-27 16:09:18 -05:00