Commit Graph

3970 Commits

Author SHA1 Message Date
Nik Everett bc65be2a65 Reindex: wait for cleanup before responding (#23677)
Changes reindex and friends to wait until the entire request has
been "cleaned up" before responding. "Clean up" in this context
is clearing the scroll and (for reindex-from-remote) shutting
down the client. Failures to clean up are still only logged, not
returned to the user.

Closes #23653
2017-03-21 15:33:39 -04:00
Jason Tedor 8dfb68cf1c Upgrade to Netty 4.1.9
This commit upgrades the Netty dependencies from version 4.1.8 to
version 4.1.9. This commit picks up a few bug fixes that impacted us:
 - Netty was incorrectly ignoring interfaces with self-assigned MAC
   addresses (e.g., instances running in Docker containers or on EC2)
 - incorrect handling of the Expect: 100-continue header

Relates #23540
2017-03-11 18:28:31 -08:00
Daniel Mitterdorfer 6f7cd71e1f Adjust default Netty receive predictor size to 64k (#23542)
With this commit we change the default receive predictor size for Netty
from 32kB to 64kB as our testing has shown that this leads to less
allocations on smaller heaps like the default out of the box
configuration and this value also works reasonably well for larger
heaps.

Closes #23185
2017-03-11 17:32:35 -08:00
Jason Tedor 8e09eca9a6 Mute Painless lambda tests on JDK 9
This commit mutes a ton of Painless lambda tests on JDK 9. This commit
did not attempt to discover exactly which tests are failing, but instead
just blanket muted all tests in LambdaTests, FunctionRefTests, and
AugmentationTests.

Relates #23473
2017-03-02 22:36:26 -05:00
Jay Modi 01502893eb HTTP transport stashes the ThreadContext instead of the RestController (#23456)
Previously, the RestController would stash the context prior to copying headers. However, there could be deprecation
log messages logged and in turn warning headers being added to the context prior to the stashing of the context. These
headers in the context would then be removed from the request and also leaked back into the calling thread's context.

This change moves the stashing of the context to the HttpTransport so that the network threads' context isn't
accidentally populated with warning headers and to ensure the headers added early on in the RestController are not
excluded from the response.
2017-03-02 14:44:01 -05:00
Luca Cavanna cc65a94fd4 [TEST] improve yaml test sections parsing (#23407)
Throw error when skip or do sections are malformed, such as they don't start with the proper token (START_OBJECT). That signals bad indentation, which would be ignored otherwise. Thanks (or due to) our pull parsing code, we were still able to properly parse the sections, yet other runners weren't able to.

Closes #21980

* [TEST] fix indentation in matrix_stats yaml tests

* [TEST] fix indentation in painless yaml test

* [TEST] fix indentation in analysis yaml tests

* [TEST] fix indentation in generated docs yaml tests

* [TEST] fix indentation in multi_cluster_search yaml tests
2017-03-02 12:43:20 +01:00
Nik Everett 2dcdaa1c9d Mustache: don't extend AbstractComponent (#23419)
Don't extend `AbstractComponent` in `MustacheScriptEngine` because
it doesn't buy anything.
2017-03-01 14:54:27 -05:00
Ryan Ernst 019263d664 Revert "Internal: Change version constant names for already released versions (#23416)"
This reverts commit dc0e93ed62.
2017-02-28 14:45:13 -08:00
Ryan Ernst dc0e93ed62 Internal: Change version constant names for already released versions (#23416)
We have many version constants in master that have already been
released, but are still marked (by naming convention) as unreleased.
This commit renames those version constants.
2017-02-28 13:05:44 -08:00
Tanguy Leroux 33eb6a13bf Tests: Fix RemoteScrollableHitSourceTests
With  #23307, the expected exception is wrapped two times into a RuntimeException instead of being thrown directly.
2017-02-28 11:30:33 +01:00
Jim Ferenczi 5c84640126 Upgrade to lucene-6.5.0-snapshot-d00c5ca (#23385)
Lucene upgrade
2017-02-27 18:39:04 +01:00
javanna 9a2dba3036 [TEST] add support for binary responses to REST tests infra 2017-02-27 12:27:03 +01:00
javanna dad025a6ad [TEST] move test for binary field to specific test file that sets Content-Type header explicitly 2017-02-27 12:27:03 +01:00
Ryan Ernst 9df95def90 Build: Remove extra copies of netty license (#23361)
The dependencyLicenses check has the ability to map multiple jar files
to the same license file. However, netty was not taking advantage of
this, and had duplicate copies of its license/notice files for each jar.
This commit reduces the copies to one and uses the mapping feature.
2017-02-24 14:40:07 -08:00
Jason Tedor f85a7aed37 Keep the pipeline handler queue small initially
This commit sets the intial size of the pipeline handler queue small to
prevent waste if pipelined requests are never sent. Since the queue will
grow quickly if pipeline requests are indeed set, this should not be
problematic.

Relates #23335
2017-02-23 14:17:46 -05:00
sabi0 09b3c7f270 Do not create String instances in 'Strings' methods accepting StringBuilder (#22907) 2017-02-23 10:57:34 -08:00
Christoph Büscher 8b1b152e91 Remove abstract InternalMetricsAggregation class (#23326)
This class doesn't seem to do much other than to group together
certain types of aggregations.
2017-02-23 18:03:40 +01:00
Jason Tedor 3e69c38dbd Respect promises on pipelined responses
When pipelined responses are sent to the pipeline handler for writing,
they are not necessarily written immediately. They must be held in a
priority queue until all responses preceding the given response are
written. This means that when write is invoked on the handler, the
promise that is attached to the write invocation will not necessarily be
the promise associated with the responses that are written while the
queue is drained. To address this, the promise associated with a
pipelined response must be held with the response and then used when the
channel context is actually written to. This was introduced when
ensuring that the releasing promise is always chained through on write
calls lest the releasing promise never be invoked. This leads to many
failing test cases, so no new test cases are needed here.

Relates #23317
2017-02-23 09:32:43 -05:00
Jason Tedor 6ca90a61a6 Relocate a comment in HttpPipeliningHandler
This commit moves a comment in HttpPipeliningHandler as it makes more
sense for this comment to be where the field that it is explaining is
declared.
2017-02-22 20:51:18 -05:00
Jason Tedor 30f723d2b0 Add comments to HttpPipeliningHandler
This commit adds some comments explaining the design of
HttpPipeliningHandler.
2017-02-22 20:47:34 -05:00
Ryan Ernst 175bda64a0 Build: Rework integ test setup and shutdown to ensure stop runs when desired (#23304)
Gradle's finalizedBy on tasks only ensures one task runs after another,
but not immediately after. This is problematic for our integration tests
since it allows multiple project's integ test clusters to be
simultaneously. While this has not been a problem thus far (gradle 2.13
happened to keep the finalizedBy tasks close enough that no clusters
were running in parallel), with gradle 3.3 the task graph generation has
changed, and numerous clusters may be running simultaneously, causing
memory pressure, and thus generally slower tests, or even failure if the
system has a limited amount of memory (eg in a vagrant host).

This commit reworks how integ tests are configured. It adds an
`integTestCluster` extension to gradle which is equivalent to the current
`integTest.cluster` and moves the rest test runner task to
`integTestRunner`.  The `integTest` task is then just a dummy task,
which depends on the cluster runner task, as well as the cluster stop
task. This means running `integTest` in one project will both run the
rest tests, and shut down the cluster, before running `integTest` in
another project.
2017-02-22 12:43:15 -08:00
Jason Tedor 708d11f54a Ensure that releasing listener is called
When sending a response to a client, we attach a releasing listener to
the channel promise. If the client disappears before the response is
sent, the releasing listener was never notified. The reason the
listeners were never notified was due to a mistaken invocation of write
and flush on the channel which has two overrides: one that takes an
existing promise, and one that does not and instead creates a new
promise. When the client disappears, it is this latter promise that is
notified, which does not contain the releasing listener. This commit
addreses this issue by invoking the override that passes our channel
promise through.

Relates #23310
2017-02-22 13:54:17 -05:00
javanna 594f00c582 Remove content type auto-detection from search templates
Now that search templates always get converted to json, we don't need to try and auto-detect their content-type, which anyways didn't work as expected before given that only json was really working.
2017-02-22 16:20:53 +01:00
javanna f2acf466aa Convert script/template objects to json format
Elasticsearch accepts multiple content-type formats, hence scripts can be stored/provided in json, yaml, cbor or smile. Yet the format that should be used internally is json. This is a problem mainly around search templates, as they only support json out of the four content-types, so instead of maintaining the content-type of the request we should rather convert the scripts/templates to json.

 Binary formats were not previously supported. If you stored a template in yaml format, you'd get back an error "No encoder found for MIME type [application/yaml]" when trying to execute it. With this commit the request content-type is independent from the template, which always gets converted to json internally. That is transparent to users and doesn't affect the content type of the response obtained when executing the template.
2017-02-22 16:20:53 +01:00
javanna 9391c6ffa9 Replace CustomMustacheFactory constant with same constant from Script (CONTENT_TYPE_OPTION) 2017-02-22 16:20:53 +01:00
Nik Everett 38d25a0369 Fix Painless's implementation of interfaces returning primitives (#23298)
Fixes Painless to properly implement scripts that return primitives
and void. Adds some simple tests that we emit sane opcodes and some
other tests that we implement primitives as expected.

Mostly this is just a fix following up from #22983 but there is one
thing I did really worth talking about, I think. So, before this script
Painless scripts could only ever return Object and they did would always
return null for paths that didn't return any values. Now that they
can return primitives the question is "what should Painless return
from paths that don't return any values?" And I answered that with
"whatever the JLS default value is". So 0/0L/0f/0d/false.
2017-02-21 17:10:55 -05:00
Martijn van Groningen 81d53470e7 percolator: add support for term extraction for MultiPhraseQuery 2017-02-21 21:10:55 +01:00
Nik Everett 9105672969 Allow painless to implement more interfaces (#22983)
Generalizes three previously hard coded things in painless into
generic concepts:

1. The "main method" is no longer hardcoded to:
```
public abstract Object execute(Map<String, Object> params,
        Scorer scorer, LeafDocLookup doc, Object value);
```
Instead Painless's compiler takes an interface and implements it. It looks like:
```
public interface SomeScript {
    // Argument names we expose to Painless scripts
    String[] ARGUMENTS = new String[] {"a", "b"};
    // Method implemented by Painless script. Must be named execute but can have any parameters or return any value.
    Object execute(String a, int b);
    // Is the "a" argument used by the script?
    boolean uses$a();
}
SomeScript script = scriptEngine.compile(SomeScript.class, null, "the_script_here", emptyMap());
Object result = script.execute("a", 1);
```

`PainlessScriptEngine` now compiles all scripts to the new
`GenericElasticsearchScript` interface by default for compatibility
with the rest of Elasticsearch until it is able to use this new
ability.

2. `_score` and `ctx` are no longer hardcoded to be extracted from
`#score` and `params` respectively. Instead Painless's default
implementation of Elasticsearch scripts uses the `uses$_score` and
`uses$ctx` methods to determine if it is used and gives them
dummy values if they are not used.

3. Throwing the `ScriptException` is now handled by the Painless
script itself. That way Painless doesn't have to leak the metadata
that is required to build the fancy stack trace. And all painless scripts
get the fancy stack trace.
2017-02-21 14:08:57 -05:00
Jack Conradson fac2d954e3 Fix certain bad casts in Painless due to boxing/unboxing. (#23282) 2017-02-21 10:23:27 -08:00
Daniel Mitterdorfer 0744a00001 Set network receive predictor size to 32kb (#23284)
Previously we calculated Netty' receive predictor size for HTTP and transport
traffic based on available memory and worker nodes. This resulted in a receive
predictor size between 64kb and 512kb. In our benchmarks this leads to increased
GC pressure.

With this commit we set Netty's receive predictor size to 32kb. This value is in
a sweet spot between heap memory waste (-> GC pressure) and effect on request
metrics (achieved throughput and latency numbers).

Closes #23185
2017-02-21 14:45:33 +01:00
Jay Modi b234644035 Enforce Content-Type requirement on the rest layer and remove deprecated methods (#23146)
This commit enforces the requirement of Content-Type for the REST layer and removes the deprecated methods in transport
requests and their usages.

While doing this, it turns out that there are many places where *Entity classes are used from the apache http client
libraries and many of these usages did not specify the content type. The methods that do not specify a content type
explicitly have been added to forbidden apis to prevent more of these from entering our code base.

Relates #19388
2017-02-17 14:45:41 -05:00
Jason Tedor 0a5917d182 Fix get HEAD requests
Get HEAD requests incorrectly return a content-length header of 0. This
commit addresses this by removing the special handling for get HEAD
requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.

Relates #23186
2017-02-15 13:07:29 -05:00
Ryan Ernst 79a1629f74 Fix line length 2017-02-14 21:23:21 -08:00
Jason Tedor 9e80e290d6 Add failing tests for expect header violations
This commit adds unit tests for two cases where Elasticsearch violates
expect header handling. These tests are marked as awaits fix.

Relates #23173
2017-02-14 19:24:22 -05:00
Jason Tedor 673754b1d5 Fix get source HEAD requests
Get source HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for get
source HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.

Relates #23151
2017-02-14 16:37:22 -05:00
Martijn van Groningen cab43707dc [percolator] Removed old 2.x bwc logic. 2017-02-14 22:17:17 +01:00
Simon Willnauer aef0665ddb Detach SearchPhases from AbstractSearchAsyncAction (#23118)
Today all search phases are inner classes of AbstractSearchAsyncAction or one of it's
subclasses. This makes unit testing of these classes practically impossible. This commit
Extracts `DfsQueryPhase` and `FetchSearchPhase` or of the code that composes the actual
query execution types and moves most of the fan-out and collect code into an `InitialSearchPhase`
class that can be used to build initial search phases (phases that retry on shards). This will
make modification to these classes simpler and allows to easily compose or add new search phases
down the road if additional roundtrips are required.
2017-02-14 12:34:25 +01:00
Jason Tedor 5343b87502 Handle bad HTTP requests
When Netty decodes a bad HTTP request, it marks the decoder result on
the HTTP request as a failure, and reroutes the request to GET
/bad-request. This either leads to puzzling responses when a bad request
is sent to Elasticsearch (if an index named "bad-request" does not exist
then it produces an index not found exception and otherwise responds
with the index settings for the index named "bad-request"). This commit
addresses this by inspecting the decoder result on the HTTP request and
dispatching the request to a bad request handler preserving the initial
cause of the bad request and providing an error message to the client.

Relates #23153
2017-02-13 17:39:25 -05:00
Jay Modi 61e383813d Make the version of the remote node accessible on a transport channel (#23019)
This commit adds a new method to the TransportChannel that provides access to the version of the
remote node that the response is being sent on and that the request came from. This is helpful
for serialization of data attached as headers.
2017-02-13 15:15:57 -05:00
jaymode d8d03f45c2
Fix communication with 5.3.0 nodes
This commit fixes communication with 5.3.0 nodes to send XContentType to these nodes since #22691 was backported to the
5.3 branch.
2017-02-13 13:15:51 -05:00
Jason Tedor 0f21ed5b70 Fix template HEAD requests
Template HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for
template HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.

Relates #23130
2017-02-11 18:30:16 -05:00
Jason Tedor a6158398dd Fix index HEAD requests
Index HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for index
HEAD requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.

Relates #23112
2017-02-10 09:44:01 -05:00
Jason Tedor 7ac44656df Fix alias HEAD requests
Alias HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for alias
HEAD requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.

Relates #23094
2017-02-10 09:19:35 -05:00
Adrien Grand 709cc9ba65 Upgrade to lucene-6.5.0-snapshot-f919485. (#23087) 2017-02-10 15:08:47 +01:00
Tanguy Leroux e2e5937455 Use `typed_keys` parameter to prefix suggester names by type in search responses (#23080)
This pull request reuses the typed_keys parameter added in #22965, but this time it applies it to suggesters. When set to true, the suggester names in the search response will be prefixed with a prefix that reflects their type.
2017-02-10 10:53:38 +01:00
Nik Everett 0250c7ab18 Fix reindex test after toString change
Weakens the assertion on wait_for_active_shards so that we don't
check the toString of the bulk request because it isn't important.

Relates to #22900
2017-02-09 16:48:40 -05:00
Tim Brooks a331405aff Isolated SocketPermissions to Netty (#23057)
Netty 4.1.8 wraps connect and accept operations in doPrivileged blocks.
This means that we not need to give permissions to the entire transport
module. Additionally this commit deletes the privileged socket channel
and privileged server socket chanel.
2017-02-09 10:00:25 -06:00
Tanguy Leroux 3553522328 Add parameter to prefix aggs name with type in search responses (#22965)
This pull request adds a new parameter to the REST Search API named `typed_keys`. When set to true, the aggregation names in the search response will be prefixed with a prefix that reflects the internal type of the aggregation.

Here is a simple example:
```
GET /_search?typed_keys
{
    "aggs": {
        "tweets_per_user": {
            "terms": {
                "field": "user"
            }
        }
    },
    "size": 0
}
```

And the response:

```
{
    "aggs": {
        "sterms:tweets_per_user": {
            ...
        }
    }
}
```

This parameter is intended to make life easier for REST clients that could parse back the prefix and could detect the type of the aggregation to parse. It could also be implemented for suggesters.
2017-02-09 11:19:04 +01:00
Tim Brooks 735e5b1983 Upgrade to Netty 4.1.8 (#23055)
This commit upgrades the Netty dependency to version 4.1.8.Final.
2017-02-08 11:44:36 -06:00
Simon Willnauer ecb01c15b9 Fold InternalSearchHits and friends into their interfaces (#23042)
We have a bunch of interfaces that have only a single implementation
for 6 years now. These interfaces are pretty useless from a SW development
perspective and only add unnecessary abstractions. They also require
lots of casting in many places where we expect that there is only one
concrete implementation. This change removes the interfaces, makes
all of the classes final and removes the duplicate `foo` `getFoo` accessors
in favor of `getFoo` from these classes.
2017-02-08 14:40:08 +01:00
Tim Brooks fcc568fd8d Add methods requiring connect to forbidden apis (#22964)
This is related to #22116. This commit adds calls that require
SocketPermission connect to forbidden APIs.

The following calls are now forbidden:

- java.net.URL#openStream()
- java.net.URLConnection#connect()
- java.net.URLConnection#getInputStream()
- java.net.Socket#connect(java.net.SocketAddress)
- java.net.Socket#connect(java.net.SocketAddress, int)
- java.nio.channels.SocketChannel#open(java.net.SocketAddress)
- java.nio.channels.SocketChannel#connect(java.net.SocketAddress)
2017-02-07 14:41:50 -06:00
Boaz Leskes ba06c14a97 TransportService.connectToNode should validate remote node ID (#22828)
#22194 gave us the ability to open low level temporary connections to remote node based on their address. With this use case out of the way, actual full blown connections should validate the node on the other side, making sure we speak to who we think we speak to. This helps in case where multiple nodes are started on the same host and a quick node restart causes them to swap addresses, which in turn can cause confusion down the road.
2017-02-07 22:11:32 +02:00
Tim Brooks 27b7d9bd8d Add FileSystemUtil method to read 'file:/' URLs (#23020)
As part of #22116 we are going to forbid usage of api
java.net.URL#openStream(). However in a number of places across the
we use this method to read files from the local filesystem. This commit
introduces a helper method openFileURLStream(URL url) to read files
from URLs. It does specific validation to only ensure that file:/
urls are read.

Additionlly, this commit removes unneeded method
FileSystemUtil.newBufferedReader(URL, Charset). This method used the
openStream () method which will soon be forbidden. Instead we use the
Files.newBufferedReader(Path, Charset).
2017-02-07 10:24:22 -06:00
Jay Modi c898e8ab83 Add support for newline delimited JSON Content-Type (#22947)
This commit adds support for the newline delimited JSON Content-Type, which is how
the bulk, multi-search, and multi-search template APIs expect data to be formatted. The
`elasticsearch-js` client has also been using this content type for these types of requests.

Closes #22943
2017-02-07 09:20:06 -05:00
Nik Everett 0d6e622242 Make dates be ReadableDateTimes in scripts (#22948)
Instead of longs. If you want millis since epoch you can call doc.date_field.value.millis.

Relates to #22875
2017-02-06 16:44:56 -05:00
Nicholas Knize 1c9fdfd1b3 Remove GeoPointFieldMapper abstraction
In order to support the evolving GeoPoint encodings in Lucene 5 and 6, ES 2.x and 5.x implements an abstraction layer to the GeoPointFieldMapper classes. As of 5.x the geo_point field mapper settled on using Lucene's more performant LatLonPoint field type and deprecated all other encodings. In 6.0 all encodings except LatLonPoint have been removed rendering this abstraction layer useless. This commit removes the abstraction layer and renames the LatLonPointFieldMapper back to GeoPointFieldMapper to mantain consistency with ES field naming.
2017-02-06 14:17:21 -06:00
Adrien Grand c8496fc4f4 Upgrade to Lucene 6.4.1. (#22978) 2017-02-06 09:28:43 +01:00
Nik Everett b0c9759441 Painless: Don't allow casting from void to def (#22969)
Painless can cast anything into the magic type `def` but it
really shouldn't try to cast **nothing** into `def`. That causes
the byte code generation library to freak out a little.

Closes #22908
2017-02-03 16:38:47 -05:00
Nik Everett 9ca871af7e Test: weaken assertion in fix sliced reindex test
This test was using initial count of slices instead of the count
of unfinished slices to pick the expected throttle. Unfortunely
due to race conditions the actual rethrottle count is between the
two. So we weaken the assertion from "the new throttle is exactly X"
to "the new throttle is between X and Y (inclusive)".
2017-02-03 13:00:49 -05:00
Tim Brooks f70188ac58 Remove connect SocketPermissions from core (#22797)
This is related to #22116. Core no longer needs `SocketPermission`
`connect`.

This permission is relegated to these modules/plugins:
- transport-netty4 module
- reindex module
- repository-url module
- discovery-azure-classic plugin
- discovery-ec2 plugin
- discovery-gce plugin
- repository-azure plugin
- repository-gcs plugin
- repository-hdfs plugin
- repository-s3 plugin

And for tests:
- mocksocket jar
- rest client
- httpcore-nio jar
- httpasyncclient jar
2017-02-03 09:39:56 -06:00
Christoph Büscher c33f894846 Fixing compilation problem in Eclipse (#22956) 2017-02-03 16:16:51 +01:00
Nik Everett 18eb0827e6 Reindex: do not log when can't clear old scroll (#22942)
Versions of Elasticsearch prior to 2.0 would return a scroll id
even with the last scroll response. They'd then automatically
clear the scroll because it is empty. When terminating reindex
will attempt to clear the last scroll it received, regardless of
the remote version. This quiets the warning when the scroll cannot
be cleared for versions before 2.0.

Closes #22937
2017-02-03 10:08:27 -05:00
Jason Tedor 9a0b216c36 Upgrade checkstyle to version 7.5
This commit upgrades the checkstyle configuration from version 5.9 to
version 7.5, the latest version as of today. The main enhancement
obtained via this upgrade is better detection of redundant modifiers.

Relates #22960
2017-02-03 09:46:44 -05:00
Nik Everett ea4eb06b0a Test: Make update-by-query test more resilient
`UpdateByQueryWhileModifyingTests#testUpdateWhileReindexing`
runs update-by-query and concurrently updates, asserting that
the update-by-query never reverts any changes made by the update.
It is a smoke test for concurrent updates.

Now, it expects to hit a certain number of version conflicts
during the updates. This is normal as it is racing the
update-by-query. We have a maximum number of failures we
expect (10) and I'd never seen us come close until
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=sles/495/console

This bumps the max failures from 10 to 50 and improves
logging a bit. If we continue to see this failure then we have
some other issue.

Closes #22938
2017-02-03 09:18:26 -05:00
Jay Modi 7520a107be Optionally require a valid content type for all rest requests with content (#22691)
This change adds a strict mode for xcontent parsing on the rest layer. The strict mode will be off by default for 5.x and in a separate commit will be enabled by default for 6.0. The strict mode, which can be enabled by setting `http.content_type.required: true` in 5.x, will require that all incoming rest requests have a valid and supported content type header before the request is dispatched. In the non-strict mode, the Content-Type header will be inspected and if it is not present or not valid, we will continue with auto detection of content like we have done previously.

The content type header is parsed to the matching XContentType value with the only exception being for plain text requests. This value is then passed on with the content bytes so that we can reduce the number of places where we need to auto-detect the content type.

As part of this, many transport requests and builders were updated to provide methods that
accepted the XContentType along with the bytes and the methods that would rely on auto-detection have been deprecated.

In the non-strict mode, deprecation warnings are issued whenever a request with body doesn't provide the Content-Type header.

See #19388
2017-02-02 14:07:13 -05:00
Nik Everett ce8e042b66 Reindex: fix reindex-from-remote from <2.0 (#22931)
In 5.2 we stopped sending the source parameter if the user didn't
specify it. This was a mistake as versions before 2.0 look like
they don't always include the `_source`. This is because reindex
requests some metadata fields. Anyway, now we say `"_source": true`
if there isn't a `_source` configured in the reindex request.

Closes #22893
2017-02-02 11:46:24 -05:00
Nik Everett 73bf29072f Painless: Fix def invoked qualified method refs (#22918)
We were incorrectly resolving qualified method references at run
time when invoked on `def`. This lead to errors like
`The struct with name [org] has not been defined.` when attempting

```
doc.date.dates.stream().map(
  org.joda.time.ReadableDateTime::centuryOfEra
).collect(Collectors.toList())
```
2017-02-02 10:15:03 -05:00
Nik Everett dacc150934 Expose multi-valued dates to scripts and document painless's date functions (#22875)
Implemented by wrapping an array of reused `ModuleDateTime`s that
we grow when needed. The `ModuleDateTime`s are reused when we
move to the next document.

Also improves the error message returned when attempting to modify
the `ScriptdocValues`, removes a couple of allocations, and documents
that the date functions are available in Painless.

Relates to #22162
2017-02-01 21:57:07 -05:00
Jack Conradson 3d2626c4c6 Change Namespace for Stored Script to Only Use Id (#22206)
Currently, stored scripts use a namespace of (lang, id) to be put, get, deleted, and executed. This is not necessary since the lang is stored with the stored script. A user should only have to specify an id to use a stored script. This change makes that possible while keeping backwards compatibility with the previous namespace of (lang, id). Anywhere the previous namespace is used will log deprecation warnings.

The new behavior is the following:

When a user specifies a stored script, that script will be stored under both the new namespace and old namespace.

Take for example script 'A' with lang 'L0' and data 'D0'. If we add script 'A' to the empty set, the scripts map will be ["A" -- D0, "A#L0" -- D0]. If a script 'A' with lang 'L1' and data 'D1' is then added, the scripts map will be ["A" -- D1, "A#L1" -- D1, "A#L0" -- D0].

When a user deletes a stored script, that script will be deleted from both the new namespace (if it exists) and the old namespace.

Take for example a scripts map with {"A" -- D1, "A#L1" -- D1, "A#L0" -- D0}. If a script is removed specified by an id 'A' and lang null then the scripts map will be {"A#L0" -- D0}. To remove the final script, the deprecated namespace must be used, so an id 'A' and lang 'L0' would need to be specified.

When a user gets/executes a stored script, if the new namespace is used then the script will be retrieved/executed using only 'id', and if the old namespace is used then the script will be retrieved/executed using 'id' and 'lang'
2017-01-31 13:27:02 -08:00
Nik Everett 2e48fb8294 Move delete by query helpers into core (#22810)
This moves the building blocks for delete by query into core. This
should enabled two thigns:
1. Plugins other than reindex to implement "bulk by scroll" style
operations.
2. Plugins to directly call delete by query. Those plugins should
be careful to make sure that task cancellation still works, but
this should be possible.

Notes:
1. I've mostly just moved classes and moved around tests methods.
2. I haven't been super careful about cohesion between these core
classes and reindex. They are quite interconnected because I wanted
to make the change as mechanical as possible.

Closes #22616
2017-01-27 16:09:18 -05:00
Nik Everett 8a2d424d68 Generate reference links for painless API (#22775)
Adds "Appending B. Painless API Reference", a reference of all classes
and methods available from Painless. Removes links to java packages
because they contain methods that we don't expose and don't contain
methods that we do expose (the ones in Augmentation). Instead this
generates a list of every class and every exposed method using the same
type information available to the
interpreter/compiler/whatever-we-call-it. From there you can jump to
the relevant docs.

Right now you build all the asciidoc files by running
```
gradle generatePainlessApi
```

These files are expected to be committed because we build the docs
without running `gradle`.

Also changes the output of `Debug.explain` so that it is easy to
search for the class in the generated reference documentation.

You can also run it in an IDE safely if you pass the path to the
directory in which to generate the docs as the first parameter. It'll
blow away the entire directory an recreate it from scratch so be careful.

And then you can build the docs by running something like:
```
../docs/build_docs.pl --out ../built_docs/ --doc docs/reference/index.asciidoc --open
```

That is, if you have checked out https://github.com/elastic/docs in
`../docs`. Wait a minute or two and your browser will pop open in with
all of Elasticsearch's reference documentation. If you go to
`http://localhost:8000/painless-api-reference.html` you can see this
list. Or you can get there by following the links to `Modules` and
`Scripting` and `Painless` and then clicking the link in the paragraphs
below titled `Appendix B. Painless API Reference`.

I like having these in asciidoc because we can deep link to them from the
rest of the guide with constructs like
`<<painless-api-reference-Object-hashCode-0>>` and
`<<painless-api-reference->>` and we get link checking. Then the only
brittle link maintenance bit is the link generation for javadoc. Which
sucks. But I think it is important that we link to the methods directly
so they are easy to find.

Relates to #22720
2017-01-26 10:39:19 -05:00
Tim Brooks 719e75bb3f Add repository-url module and move URLRepository (#22752)
This is related to #22116. URLRepository requires SocketPermission
connect. This commit introduces a new module called "repository-url"
where URLRepository will reside. With the new module, permissions can
be removed from core.
2017-01-25 17:09:25 -06:00
Tal Levy e9a68b3287 fix date-processor to a new default year for every new pipeline execution. (#22601)
Beforehand, the DateProcessor constructs its joda pattern formatter during processor
construction. This led to newly ingested documents being defaulted to
the year that the pipeline was constructed, not that of processing.

Fixes #22547.
2017-01-25 15:09:07 -08:00
Chris Earle f0f75b187a Support Preemptive Authentication with RestClient (#21336)
This adds the necessary `AuthCache` needed to support preemptive authorization. By adding every host to the cache, the automatically added `RequestAuthCache` interceptor will add credentials on the first pass rather than waiting to do it after _each_ anonymous request is rejected (thus always sending everything twice when basic auth is required).
2017-01-24 11:34:05 -05:00
Luca Cavanna 47c0e13a3b Stop returning "es." internal exception headers as http response headers (#22703)
move "es." internal headers to separate metadata set in ElasticsearchException and stop returning them as response headers

Closes #17593

* [TEST] remove ESExceptionTests, move its methods to ElasticsearchExceptionTests or ExceptionSerializationTests
2017-01-24 16:12:45 +01:00
Nik Everett 28cfc533e2 Generate javadoc jar for painless's public API (#22704)
The simplest way to do that is to move the public API into a
new package and generate javadoc for that package.
2017-01-23 17:16:20 -05:00
Jim Ferenczi e48bc2eed7 Add field collapsing for search request (#22337)
* Add top hits collapsing to search request

The field collapsing is done with a custom top docs collector that "collapse" search hits with same field value.
The distributed aspect is resolve using the two passes that the regular search uses. The first pass "collapse" the top hits, then the coordinating node merge/collapse the top hits from each shard.

```
GET _search
{
   "collapse": {
      "field": "category",
   }
}
```

This change also adds an ExpandCollapseSearchResponseListener that intercepts the search response and expands collapsed hits using the CollapseBuilder#innerHit} options.
The retrieval of each inner_hits is done by sending a query to all shards filtered by the collapse key.

```
GET _search
{
   "collapse": {
      "field": "category",
      "inner_hits": {
	"size": 2
      }
   }
}
```
2017-01-23 16:33:51 +01:00
Tim Brooks a4ac29c005 Add single static instance of SpecialPermission (#22726)
This commit adds a SpecialPermission constant and uses that constant
opposed to introducing new instances everywhere.

Additionally, this commit introduces a single static method to check that
the current code has permission. This avoids all the duplicated access
blocks that exist currently.
2017-01-21 12:03:52 -06:00
Jim Ferenczi 8028578305 Upgrade to Lucene 6.4.0 (#22724)
* Upgrade to Lucene 6.4.0

`ValueSource`s are now converted to `DoubleValueSource`s using the Lucene adapter made for the migration to the new API in 6.4.0.
2017-01-21 04:48:01 +01:00
Nik Everett 6265ef1c1b Deguice rest handlers (#22575)
There are presently 7 ctor args used in any rest handlers:
* `Settings`: Every handler uses it to initialize a logger and
  some other strange things.
* `RestController`: Every handler registers itself with it.
* `ClusterSettings`: Used by `RestClusterGetSettingsAction` to
  render the default values for cluster settings.
* `IndexScopedSettings`: Used by `RestGetSettingsAction` to get
  the default values for index settings.
* `SettingsFilter`: Used by a few handlers to filter returned
  settings so we don't expose stuff like passwords.
* `IndexNameExpressionResolver`: Used by `_cat/indices` to
  filter the list of indices.
* `Supplier<DiscoveryNodes>`: Used to fill enrich the response
  by handlers that list tasks.

We probably want to reduce these arguments over time but
switching construction away from guice gives us tighter
control over the list of available arguments.

These parameters are passed to plugins using
`ActionPlugin#initRestHandlers` which is expected to build and
return that handlers immediately. This felt simpler than
returning an reference to the ctors given all the different
possible args.

Breaks java plugins by moving rest handlers off of guice.
2017-01-20 11:48:51 -05:00
Tim Brooks bc16162d21 Remove accept SocketPermissions from core (#22622)
This is related to #22116. Core no longer needs SocketPermission 
accept. This permission is relegated to the transport-netty4 module 
and (for tests) to the mocksocket jar.
2017-01-20 09:27:45 -06:00
Nik Everett 22f1c9fa0f Remove @header we no longer need 2017-01-19 11:44:13 -05:00
Nik Everett bb83c283bb Make lexer abstract 2017-01-19 11:41:50 -05:00
Nik Everett dbb4a2ca6c Move lexer hacks to EnhancedPainlessLexer
This "feels" nicer. Less classes at least.
2017-01-19 11:23:16 -05:00
Nik Everett e2da6a8ee5 Improve painless's javadocs
Hopefully useful references.
2017-01-19 11:04:08 -05:00
Tim Brooks a10aa8aade Add TestWithDependenciesPlugin to build (#22646)
This commit adds a MessyRestTestPlugin to the gradle build. It extends 
StandaloneRestTestPlugin. The main piece of functionality that it adds 
is to copy plugin-metadata from dependencies into the 
generated-resources for the current test source. This is necessary to 
ensure that permissions for dependencies are applied when running the 
tests.

A current limitation is that the permissions are applied differently 
than in the distribution sources. When permissions are granted to all 
depedencies for a module or plugin, the permissions are granted to all 
dependencies on the classpath for tests besides a few hardcoded 
exclusions:
- es core
- es test framework
- lucene test framework
- randomized runner
- junit library
2017-01-19 09:43:53 -06:00
Nik Everett 3ce41a0e15 Painless: Add augmentation to string for base 64 (#22665)
We don't want to expose `String#getBytes` which is required for
`Base64.getEncoder.encode` to work because we're worried about
character sets. This adds `encodeBase64` and `decodeBase64`
methods to `String` in Painless that are duals of one another
such that:
`someString == someString.encodeBase64().decodeBase64()`.

Both methods work with the UTF-8 encoding of the string.

Closes #22648
2017-01-19 09:31:45 -05:00
Nik Everett ee5f8c4522 Consolidate some reindex utility classes (#22666)
Everything that extended `AbstractAsyncBulkByScrollAction` also
extended `AbstractAsyncBulkIndexByScrollAction` so this removes
`AbstractAsyncBulkIndexByScrollAction`, merging it into
`AbstractAsyncBulkByScrollAction`.
2017-01-18 16:58:39 -05:00
Nik Everett 1fe74a6b4b Better error when can't auto create index (#22488)
Changes the error message when `action.auto_create_index` or
`index.mapper.dynamic` forbids automatic creation of an index
from `no such index` to one of:
* `no such index and [action.auto_create_index] is [false]`
* `no such index and [index.mapper.dynamic] is [false]`
* `no such index and [action.auto_create_index] contains [-<pattern>] which forbids automatic creation of the index`
* `no such index and [action.auto_create_index] ([all patterns]) doesn't match`

This should make it more clear *why* there is `no such index`.

Closes #22435
2017-01-18 15:18:32 -05:00
Simon Willnauer 24e2847af2 Streamline foreign stored context restore and allow to perserve response headers (#22677)
Today we do not preserve response headers if they are present on a transport protocol
response. While preserving these headers is not always desired, in the most cases we
should pass on these headers to have consistent results for depreciation headers etc.
yet, this hasn't been much of a problem since most of the deprecations are detected early
ie. on the coordinating node such that this bug wasn't uncovered until #22647

This commit allow to optionally preserve headers when a context is restored and also streamlines
the context restore since it leaked frequently into the callers thread context when the callers
context wasn't restored again.
2017-01-18 16:17:54 +01:00
Igor Motov 500548fcda Remove taskManager.registerChildTask
Instead of forcing each task to register all nodes where its children are running, this commit runs cancellation on all nodes. The task cancellation operation doesn't run too frequently, so this optimization doesn't seem to be worth additional complexity of the interface.
2017-01-17 18:07:31 -05:00
Ali Beyad e2977889b8 Allow comma delimited array settings to have a space after each entry (#22591)
Previously, certain settings that could take multiple comma delimited
values would pick up incorrect values for all entries but the first if
each comma separated value was followed by a whitespace character.  For
example, the multi-value "A,B,C" would be correctly parsed as
["A", "B", "C"] but the multi-value "A, B, C" would be incorrectly parsed
as ["A", " B", " C"].

This commit allows a comma separated list to have whitespace characters
after each entry.  The specific settings that were affected by this are:

  cluster.routing.allocation.awareness.attributes
  index.routing.allocation.require.*
  index.routing.allocation.include.*
  index.routing.allocation.exclude.*
  cluster.routing.allocation.require.*
  cluster.routing.allocation.include.*
  cluster.routing.allocation.exclude.*
  http.cors.allow-methods
  http.cors.allow-headers

For the allocation filtering related settings, this commit also provides
validation of each specified entry if the filtering is done by _ip,
_host_ip, or _publish_ip, to ensure that each entry is a valid IP
address.

Closes #22297
2017-01-17 08:51:04 -06:00
Tanguy Leroux f5542ed47f Simplify ElasticsearchException rendering as a XContent (#22611)
This commit tries to simplify the way ElasticsearchException are rendered to xcontent. It adds some documentation and renames and merges some methods. Current behavior is preserved, the goal is to be more readable and centralize everything in the ElasticsearchException class.
2017-01-17 15:44:49 +01:00
Tim Brooks 16a76d9bc0 Remove blocking TCP clients and servers (#22639)
This commit removes the option to use the blocking variants of the TCP
transport server, TCP transport client, or http server.
2017-01-16 18:38:51 -06:00
Simon Willnauer f30b1f82ee Remove HttpServer and HttpServerAdapter in favor of a simple dispatch method (#22636)
Today we have quite some abstractions that are essentially providing a simple
dispatch method to the plugins defining a `HttpServerTransport`. This commit
removes `HttpServer` and `HttpServerAdaptor` and introduces a simple `Dispatcher` functional
interface that delegate to `RestController` by default.

Relates to #18482
2017-01-16 21:06:08 +01:00
javanna a8a13bb46f replace custom functional interface with CheckedFunction in percolate module 2017-01-16 13:57:58 +01:00
Alexander Reelsen f6ee6e420b Indexing: Add shard id to indexing operation listener (#22606)
The IndexingOperationListener interface did not provide any
information about the shard id when a document was indexed.

This commit adds the shard id as the first parameter to all methods
in the IndexingOperationListener.
2017-01-16 09:08:16 +01:00
Tim Brooks f4270f9914 Wrap netty accept/connect ops with doPrivileged (#22572)
This is related to #22116. netty channels require socket `connect` and
`accept` privileges. Netty does not currently wrap these operations
with `doPrivileged` blocks. These changes extend the netty channels
and wrap calls to the relevant super methods in doPrivileged blocks.
2017-01-13 14:27:09 -06:00
Zachary Tong 18fdc39b8c Increase visibility of doExecute so it can be used directly (#22614) 2017-01-13 09:42:02 -05:00
Nik Everett baed02bbe2 Whitelist some ScriptDocValues in painless (#22600)
Without this whitelist painless can't use ip or binary doc values.

Closes #22584
2017-01-12 15:26:09 -05:00