Commit Graph

610 Commits

Author SHA1 Message Date
Ryan Ernst 2fc41adeb5 Merge branch 'master' into ingest_plugin_api 2016-07-05 20:53:03 -07:00
Nik Everett b3c015e2bb Reindex from remote
This adds a remote option to reindex that looks like

```
curl -POST 'localhost:9200/_reindex?pretty' -d'{
  "source": {
    "remote": {
      "host": "http://otherhost:9200"
    },
    "index": "target",
    "query": {
      "match": {
        "foo": "bar"
      }
    }
  },
  "dest": {
    "index": "target"
  }
}'
```

This reindex has all of the features of local reindex:
* Using queries to filter what is copied
* Retry on rejection
* Throttle/rethottle
The big advantage of this version is that it goes over the HTTP API
which can be made backwards compatible.

Some things are different:

The query field is sent directly to the other node rather than parsed
on the coordinating node. This should allow it to support constructs
that are invalid on the coordinating node but are valid on the target
node. Mostly, that means old syntax.
2016-07-05 16:13:17 -04:00
Jason Tedor 96f283c195 Rename writeThrowable to writeException
This commit renames writeThrowable to writeException. The situation here
stems from the fact that the StreamOutput method for serializing
Exceptions needs to accept Throwables too as Throwables can be the cause
of serialized Exceptions. Yet, we do not serialize Throwables in the
Error sub-hierarchy in a way that they can be deserialized into their
initial type. This leads to an asymmetry in the StreamOutput method for
serializing Exceptions and the StreamInput method for writing
Excpetions. Namely, the former will accept Throwables but the latter
will only return Exceptions. A goal with the stream methods has always
been symmetry in the method names so that serialization/deserialization
routines appear symmetrical in code. It is this asymmetry on the
input/output types for Exceptions on StreamOutput/StreamInput that
clashes with the desired symmetry of naming. Despite this, we should
favor symmetry in the naming of the methods. This commit renames
StreamOutput#writeThrowable to StreamOutput#writeException which leaves
us with Exception StreamInput#readException and void
StreamOutput#writeException(Throwable).
2016-07-05 14:37:01 -04:00
Boaz Leskes 6861d3571e Persistent Node Ids (#19140)
Node IDs are currently randomly generated during node startup. That means they change every time the node is restarted. While this doesn't matter for ES proper, it makes it hard for external services to track nodes. Another, more minor, side effect is that indexing the output of, say, the node stats API results in creating new fields due to node ID being used as keys.

The first approach I considered was to use the node's published address as the base for the id. We already [treat nodes with the same address as the same](https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/discovery/zen/NodeJoinController.java#L387) so this is a simple change (see [here](https://github.com/elastic/elasticsearch/compare/master...bleskes:node_persistent_id_based_on_address)). While this is simple and it works for probably most cases, it is not perfect. For example, if after a node restart, the node is not able to bind to the same port (because it's not yet freed by the OS), it will cause the node to still change identity. Also in environments where the host IP can change due to a host restart, identity will not be the same. 

Due to those limitation, I opted to go with a different approach where the node id will be persisted in the node's data folder. This has the upside of connecting the id to the nodes data. It also means that the host can be adapted in any way (replace network cards, attach storage to a new VM). I

It does however also have downsides - we now run the risk of two nodes having the same id, if someone copies clones a data folder from one node to another. To mitigate this I changed the semantics of the protection against multiple nodes with the same address to be stricter - it will now reject the incoming join if a node exists with the same id but a different address. Note that if the existing node doesn't respond to pings (i.e., it's not alive) it will be removed and the new node will be accepted when it tries another join.

Last, and most importantly, this change requires that *all* nodes persist data to disk. This is a change from current behavior where only data & master nodes store local files. This is the main reason for marking this PR as breaking.

Other less important notes:
- DummyTransportAddress is removed as we need a unique network address per node. Use `LocalTransportAddress.buildUnique()` instead.
- I renamed `node.add_lid_to_custom_path` to `node.add_lock_id_to_custom_path` to avoid confusion with the node ID which is now part of the `NodeEnvironment` logic.
- I removed the `version` paramater from `MetaDataStateFormat#write` , it wasn't really used and was just in the way :)
- TribeNodes are special in the sense that they do start multiple sub-nodes (previously known as client nodes). Those sub-nodes do not store local files but derive their ID from the parent node id, so they are generated consistently.
2016-07-04 21:09:25 +02:00
Tanguy Leroux 0e7faf1005 Enable Checkstyle RedundantModifier 2016-07-04 15:22:12 +02:00
Jason Tedor 3343ceeae4 Do not catch throwable
Today throughout the codebase, catch throwable is used with reckless
abandon. This is dangerous because the throwable could be a fatal
virtual machine error resulting from an internal error in the JVM, or an
out of memory error or a stack overflow error that leaves the virtual
machine in an unstable and unpredictable state. This commit removes
catch throwable from the codebase and removes the temptation to use it
by modifying listener APIs to receive instances of Exception instead of
the top-level Throwable.

Relates #19231
2016-07-04 08:41:06 -04:00
Ryan Ernst 5a66c08ae9 Merge branch 'master' into ingest_plugin_api 2016-07-01 16:27:52 -07:00
Ryan Ernst 822c995367 Internal: Remove generics from LifecycleComponent
The only reason for LifecycleComponent taking a generic type was so that
it could return that type on its start and stop methods. However, this
chaining has no practical necessity. Instead, start and stop can be
void, and a whole bunch of confusing generics disappear.
2016-07-01 16:17:42 -07:00
Ryan Ernst e5caadc4f3 Merge branch 'master' into ingest_plugin_api 2016-07-01 12:35:26 -07:00
Nik Everett f30a70c51f Fix comment
I forgot a word....
2016-07-01 14:48:08 -04:00
Nik Everett ff42d7cfc6 Add embedded stash key support to rest tests
This allowes embedding stash keys in string like `t${key}est`. This
allows simple string concatenation like acitons.

The test for this is in `ObjectPathTests` because `Stash` doesn't seem
to have a test on its own and it is simple enough to test embedded
stashes this way. And this is a way I expect them to be used eventually.
2016-07-01 14:11:11 -04:00
Ryan Ernst 65c9b0b588 Merge branch 'master' into ingest_plugin_api 2016-07-01 09:26:17 -07:00
Tanguy Leroux 8c40b2b54e Fix order of modifiers 2016-07-01 16:57:14 +02:00
Simon Willnauer 5c8164a561 Clean up BytesReference (#19196)
BytesReference should be a really simple interface, yet it has a gazillion
ways to achieve the same this. Methods like `#hasArray`, `#toBytesArray`, `#copyBytesArray`
`#toBytesRef` `#bytes` are all really duplicates. This change simplifies the interface
dramatically and makes implementations of it much simpler. All array access has been removed
and is streamlined through a single `#toBytesRef` method. Utility methods to materialize a
compact byte array has been added too for convenience.
2016-07-01 16:09:31 +02:00
javanna dd781d410a fix line length problems in all classes under o.e.test.rest package 2016-07-01 11:13:10 +02:00
javanna 0b5a549305 [TEST] remove special treatment for stashed $body in REST tests, instead always evaluate the stash through ObjectPath
When we introduced docs testing we added a special case for $body in Stash, so that the last stashed body could be evaluated, and expressions like "$body.took" could be extracted out of it. We can instead do that for any object in the stash, by simply wrapping the internal map in an ObjectPath instance. We can then drop the special stashResponse method and go back to using the ordinary stashValue too.

The downside of this change is that it adds a feature that may not be supported by other REST test runners, namely the evaluation of compouned paths from the stash. If we have "object" stashed as an object, it is now possible to extract directly each subobject of it as well e.g. "object.subobject.field1". None of the current REST tests rely on this, but our docs snippets tests do.
2016-07-01 11:13:10 +02:00
javanna 43b82ce244 [TEST] remove feature yaml from REST tests
The only runner that supported it was the java runner, we can use json format instead given that the default one with cat apis is text
2016-07-01 11:13:10 +02:00
javanna 60bafa5d78 [TEST] parse yaml responses too through ObjectPath rather than only json responses
No need to match against yaml responses via regexes in REST tests, yaml responses can be properly parsed via ObjectPath instead. Few REST tests need to be updated accordingly.
2016-07-01 11:13:10 +02:00
javanna 34f5c50a7f [TEST] eagerly parse response body at ObjectPath initialization and read content type from response headers
We are going to parse the body anyways whenever it's in json format as it is going to be stashed. It is not useful to lazily parse it anymore. Also this allows us to not rely on automatic detection of the xcontent type based on the content of the response, but rather read the content type from the response headers.
2016-07-01 11:13:10 +02:00
javanna d5df738538 [TEST] ObjectPath to support parsing yaml or json that have an array as root object
ObjectPath used a Map up until now for the internal representation of its navigable object. That works in most of the cases, but there could also be an array as root object, in which case a List needs to be used instead of a Map. This commit changes the internal representation of the object to Object which can either be a List or a Map. The change is minimal as ObjectPath already had the checks in place to verify the type of the object in the current position and navigate through it.

  Note: The new test added to ObjectPathTest uses yaml format explicitly as auto-detection of json format works only for a json object that starts with '{', not if the root object is actually an array and starts with '['.
2016-07-01 11:13:10 +02:00
javanna bbaa23bdfd [TEST] extend ObjectPathTests to support also yaml format 2016-07-01 11:13:10 +02:00
javanna 44dc801e90 [TEST] make JsonPath independent of data format, rename to ObjectPath
The internal representation of the object that JsonPath gives access to is a map. That is independent of the initial input format, which is json but could also be yaml etc.
This commit renames JsonPath to ObjectPath and adds a static method to create an ObjectPath from an XContent
2016-07-01 11:13:10 +02:00
javanna 76199ce497 [TEST] rename REST tests Stash methods to distinguish between retrieving a value and replacing values within a map
Stash#unstashMap -> replaceStashedValues
Stash#unstashValue -> getValue
2016-07-01 11:13:10 +02:00
javanna 62462f5d9b [TEST] replace ResponseBodyAssertion with existing MatchAssertion
We introduced a special response_body assertion to test our docs snippets. The match assertion does the same job though and can be reused and adapted where needed. ResponseBodyAssertion contains provides much better and accurate errors though, which can be now utilized in MatchAssertion so that many more REST tests can benefit from readable error messages.

 Each response body gets always stashed and can be retrieved for later evaluations already. Instead of providing the response body as strings that get parsed to json objects separately, then converted to maps as ResponseBodyAssertion did, we parse everything once, the json is part of the yaml test, which is supported. The only downside is that json comments cannot be used, rather yaml comments should be used (// C style vs # ). There were only two docs tests that were using comments in ingest-node.asciidoc where I went ahead and remove the comments which didn't seem that useful anyways.
2016-07-01 11:13:10 +02:00
javanna 598c36128e Revert "Raised IOException on deleteBlob (#18815)"
This reverts commit d24cc65cad as it seems to be causing test failures.
2016-07-01 11:00:32 +02:00
gfyoung d24cc65cad Raised IOException on deleteBlob (#18815)
Raise IOException on deleteBlob if the blob doesn't exist

This commit raises an IOException on BlobContainer#deleteBlob
if the blob does not exist, in conformance with the BlobContainer
interface contract.  Each implementation of BlobContainer now
conforms to this contract (file system, S3, Azure, HDFS).  This 
commit also contains blob container tests for each of the 
repository implementations.

Closes #18530
2016-06-30 23:00:10 -04:00
Nik Everett f5a269b029 Start migration away from aggregation streams
We'll migrate to NamedWriteable so we can share code with the rest
of the system. So we can work on this in multiple pull requests without
breaking Elasticsearch in between the commits this change supports
*both* old style `InternalAggregations.stream` serialization and
`NamedWriteable` style serialization. As such it creates about a
half dozen `// NORELEASE` comments that will have to be removed
once the migration is complete.

This also introduces a boolean `transportClient` flag to `SearchModule`
which is used to skip inappropriate registrations for for the
transport client while still registering the things it needs. In
this case that means that the `InternalAggregation` subclasses are
registered with the `NamedWriteableRegistry` but the `AggregationBuilder`
subclasses are not.

Finally, this moves aggregation registration from guice configuration
time to `SearchModule` construction time. This will make it simpler to
work with in the future as we further clean up Elasticsearch's
extension points.
2016-06-30 12:57:34 -04:00
Boaz Leskes 09ca6d6ed2 Add a BridgePartition to be used by testAckedIndexing (#19172)
We have long worked to capture different partitioning scenarios in our testing infra. This PR adds a new variant, inspired by the Jepsen blogs, which was forgotten far - namely a partition where one node can still see and be seen by all other nodes. It also updates the resiliency page to better reflect all the work that was done in this area.
2016-06-30 17:58:12 +02:00
jaymode 983a64c833 Add support for `teardown` section in REST tests
This commits adds support for a `teardown` section that can be defined in REST tests to
clean up any items that may have been created by the test and are not cleaned up by
deletion of indices and templates.
2016-06-30 11:33:29 -04:00
Ryan Ernst 0732004ae8 Merge pull request #19177 from rjernst/ingest_factory_generic
Remove generics from ingest Processor.Factory
2016-06-30 08:08:26 -07:00
Simon Willnauer 40ec639c89 Factor out abstract TCPTransport* classes to reduce the netty footprint (#19096)
Today we have a ton of logic inside the NettyTransport* codebase. The footprint
of the code that has a direct netty dependency is large and alternative implementations
are pretty hard today since they need to know all about our proticol etc.
This change moves most of the code into TCPTransport* baseclasses and moves all
the protocol send code together. The base classes now contain the majority of the logic
while NettyTransport* classes remain to implement the glue code, configuration and optimization.
2016-06-30 13:41:53 +02:00
Ryan Ernst e4f265eb3a Ingest: Remove generics from Processor.Factory
The factory for ingest processor is generic, but that is only for the
return type of the create mehtod. However, the actual consumer of the
factories only cares about Processor, so generics are not needed.

This change removes the generic type from the factory. It also removes
AbstractProcessorFactory which only existed in order pull the optional
tag from config. This functionality is moved to the caller of the
factories in ConfigurationUtil, and the create method now takes the tag.
This allows the covariant return of the implementation to work with
tests not needing casts.
2016-06-30 02:33:54 -07:00
Ryan Ernst 08b3b6264e Tests pass, started removing generics from processor factory 2016-06-30 01:49:22 -07:00
Ryan Ernst f1376262fe Merge branch 'master' into ingest_plugin_api 2016-06-29 14:16:16 -07:00
Simon Willnauer 872cdffc27 Factor out ChannelBuffer from BytesReference (#19129)
The ChannelBuffer interface today leaks into the BytesReference abstraction
which causes a hard dependency on Netty across the board. This chance moves
this dependency and all BytesReference -> ChannelBuffer conversion into
NettyUtlis and removes the abstraction leak on BytesReference.
This change also removes unused methods on the BytesReference interface
and simplifies access to internal pages.
2016-06-29 10:45:05 +02:00
Ryan Ernst 258c3e86ab Added IngestPlugin api, cutover common and geoip, changed ingest factory
api to take ProcessorsRegistry
2016-06-28 10:52:07 -07:00
Yannick Welsch 3cc2251e33 Fix number of arguments provided to logger calls 2016-06-28 17:38:56 +02:00
Yannick Welsch 98276111e1 Re-enable logger usage checks
It was inadvertently disabled after applying code review comments. This commit reenables the logger usage checker and makes it less naggy when encountering logging usages of the form  logger.info(someStringBuilder). Previously it would fail with the error message "First argument must be a string constant so that we can statically ensure proper place holder usage". Now it will only fail in case any arguments are provided as well, for example logger.info(someStringBuilder, 42).
2016-06-28 16:48:05 +02:00
Boaz Leskes 2512594d9e Testing infra - stablize data folder usage and clean up (#19111)
The plan for persistent node ids ( #17811 ) is to tie the node identity to a file stored in it's data folders. As such it becomes important that nodes in our testing infra have better affinity with their data folders and that their data folders are not cleaned underneath them. The first is important because we fix the random seed used for node id generation (for reproducibility) and allowing the same node to use two different data folders causes two separate nodes to have the same id, which prevents the cluster from forming. The second is important, for example, where a full cluster restart / single node restart need to maintain node identity and wiping the data folders at the wrong moment prevents this.

Concretely this commit does the following:
1) Remove previous attempts to have data folder per role using a prefix. This wasn't effective as it was using the data paths settings which are only used for part of the runs. An attempt to completely separate the paths via the home dir failed due to assumptions made by index custom path about node data folder ordinal uniqueness (see #19076)
2) Change full cluster restarts to start up nodes in the same order their were first created in, only randomly swapping nodes with the same roles.
3) Change test cluster reset methods to first shutdown the unneeded nodes and then re-start the shared nodes that were shut down, so they'll reclaim their data folders.
4) Improve data folder wiping logic and make sure it wipes only folders of "offline" nodes.
5) Add some very basic tests
2016-06-28 16:38:56 +02:00
Jason Tedor 2f638b5a23 Keep input time unit when parsing TimeValues
This commit modifies TimeValue parsing to keep the input time unit. This
enables round-trip parsing from instances of String to instances of
TimeValue and vice-versa. With this, this commit removes support for the
unit "w" representing weeks, and also removes support for fractional
values of units (e.g., 0.5s).

Relates #19102
2016-06-27 18:41:18 -04:00
Nik Everett 79fa778e33 Fix percolator tests
They need their plugin or they'll break!
2016-06-27 15:34:36 -04:00
Ryan Ernst 33ccc5aead Merge branch 'master' into mapper_plugin_api 2016-06-27 11:19:59 -07:00
Boaz Leskes cb0824e957 Make shard store fetch less dependent on the current cluster state, both on master and non data nodes (#19044)
#18938 has changed the timing in which we send out to nodes to fetch their shard stores. Instead of doing this after the cluster state resulting of the node's join was published, #18938 made it be sent concurrently to the publishing processes. This revealed a couple of points where the shard store fetching is dependent of the current state of affairs of the cluster state, both on the master and the data nodes. The problem discovered were already present without #18938 but required a failure/extreme situations to make them happen.This PR tries to remove as much as possible of these dependencies making shard store fetching simpler and make the way to re-introduce #18938 which was reverted.

These are the notable changes:
1) Allow TransportNodesAction (of which shard store fetching is derived) callers to supply concrete disco nodes, so it won't need the cluster state to resolve them. This was a problem because the cluster state containing the needed nodes was not yet made available through ClusterService. Note that long term we can expect the rest layer to resolve node ids to concrete nodes, making this mode the only one needed.
2) The data node relied on the cluster state to have the relevant index meta data so it can find data when custom paths are used. We now fall back to read the meta data from disk if needed.
3) The data node was relying on it's own IndexService state to indicate whether the data it has corresponds to an existing allocation. This is of course something it can not know until it got (and processed) the new cluster state from the master. This flag in the response is now removed. This is not a problem because we used that flag to protect against double assigning of a shard to the same node, but we are already protected from it by the allocation deciders.
4) I removed the redundant filterNodeIds method in TransportNodesAction - if people want to filter they can override resolveRequest.
2016-06-27 15:05:06 +02:00
Nik Everett 71b95fb63c Switch analysis from push to pull
Instead of plugins calling `registerTokenizer` to extend the analyzer
they now instead have to implement `AnalysisPlugin` and override
`getTokenizer`. This lines up extending plugins in with extending
scripts. This allows `AnalysisModule` to construct the `AnalysisRegistry`
immediately as part of its constructor which makes testing anslysis
much simpler.

This also moves the default analysis configuration into `AnalysisModule`
which is how search is setup.

Like `ScriptModule`, `AnalysisModule` no longer extends `AbstractModule`.
Instead it is only responsible for building `AnslysisRegistry`. We still
bind `AnalysisRegistry` but we only do so in `Node`. This is means it
is available at module construction time so we slowly remove the need to
bind it in guice.
2016-06-26 07:15:42 -04:00
Ryan Ernst 6995bde710 Merge branch 'master' into mapper_plugin_api 2016-06-24 11:15:06 -07:00
Yannick Welsch a5908a5da5 [TEST] Increase timeouts for Rest test client (#19042)
Some Rest / Doc tests were running into the default socket timeout of 10 seconds.
2016-06-23 14:05:56 +02:00
Adrien Grand 7ba5bceebe Add a MultiTermAwareComponent marker interface to analysis factories. #19028
This is the same as what Lucene does for its analysis factories, and we hawe
tests that make sure that the elasticsearch factories are in sync with
Lucene's. This is a first step to move forward on #9978 and #18064.
2016-06-23 10:19:24 +02:00
Tanguy Leroux 04da1bda0d Move templates out of the Search API, into lang-mustache module
This commit moves template support out of the Search API to its own dedicated Search Template API in the lang-mustache module. It provides a new SearchTemplateAction that can be used to render templates before it gets delegated to the usual Search API. The current REST endpoint are identical, but the Render Search Template endpoint now uses the same Search Template API with a new "simulate" option. When this option is enabled, the Search Template API only renders template and returns immediatly, without executing the search.

Closes #17906
2016-06-23 09:30:53 +02:00
Nik Everett 0bf447c697 Group client projects under :client
:client ---------> :client:rest
:client-sniffer -> :client:sniffer
:client-test ----> :client:test

This lines the client up with how we do things like modules and
plugins.
2016-06-22 14:26:41 -04:00
javanna 490d9c8cf7 Merge branch 'master' into feature/http_client 2016-06-22 09:50:07 +02:00
Adrien Grand db9af54ec0 Remove `_timestamp` and `_ttl` on 5.x indices. #18980
This removes the ability to use `_timestamp` and `_ttl` on indices created on
or after 5.0.

Closes #18280
2016-06-22 08:35:54 +02:00
Ryan Ernst e817b5daa3 Plugins: Remove guice from Mapper plugins
This changes adds a MapperPlugin interface which allows pull style
retrieval of mappers and metadata mappers added by plugins. For now, I
have kept the MapperRegistry, but this should be removed in the future
as it is just a silly container for 2 maps which could themselves be
passed around.
2016-06-21 22:50:39 -07:00
Nik Everett 8925400f67 Remove guice from ScriptService
Makes ScriptModule just a plain class that manages building the
ScriptSettings and ScriptService from plugins. When we *need*
to bind ScriptService with guice we bind it in a lambda.
2016-06-21 16:45:45 -04:00
Adrien Grand 8078c205f9 Revert "Remove `_timestamp` and `_ttl` on 5.x indices. #18980"
This reverts commit 969e953645.
Docs are failing because of the removed functionality. I will
fix the docs before pushing it again.
2016-06-21 19:19:49 +02:00
Adrien Grand 969e953645 Remove `_timestamp` and `_ttl` on 5.x indices. #18980
This removes the ability to use `_timestamp` and `_ttl` on indices created on
or after 5.0.

Closes #18280
2016-06-21 18:04:58 +02:00
javanna 886cb37efb Merge branch 'master' into feature/http_client 2016-06-21 15:53:37 +02:00
Nik Everett ba1d6907ab Quiet the logging of the docs tests
Significantly quiets the logging of the docs tests by:
1. Switching two log statements to debug level.
2. Only calling ESTestCase#afterIfFailed if the test failure wasn't
just assumptions being violated.
2016-06-21 08:31:09 -04:00
Martijn van Groningen 82f7bfad98 ingest: merged o.e.ingest.core with o.e.ingest and in ingest-common module added o.e.ingest.common package
and moved all code to that package.
2016-06-21 09:24:00 +02:00
Simon Willnauer 459665914b Detach BigArrays from Guice (#18973)
BigArrays can be fully constructed without Guice, this change cleans up
it's creation and the mocking in MockNode.
2016-06-20 13:18:19 +02:00
Simon Willnauer e50314bb6e Remove NodeClientModule and PluginsModule 2016-06-20 11:53:07 +02:00
Simon Willnauer 7fea5bd8e7 Remove obsolete Modules that can simply be inlined in node creation 2016-06-20 11:28:14 +02:00
Simon Willnauer 260f38fd76 Remove VersionModule and use Version#current consistently.
We pretended to be able to ackt like a different version node for so long it's
time to be honest and remove this ability. It's just confusing and where needed
and tested we should build dedicated extension points.
2016-06-20 10:55:52 +02:00
Tanguy Leroux 98951b1203 Compile each Groovy script in its own classloader
closes #18572
2016-06-20 08:17:09 +02:00
Boaz Leskes 14cd8a6794 Introduce Replication unit tests using real shards (#18930)
This commit introduce unit testing infrastructure to test replication operations using real index shards. This is infra is complementary to the full integration tests and unit testing of ReplicationOperation we already have. The new ESIndexLevelReplicationTestCase base makes it easier to test and simulate failure mode that require real shards and but do not need the full blow stack of a complete node.

The commit also add a simple "nothing is wrong" test plus a test that checks we don't drop docs during the various stages of recovery.

For now, only single doc indexing is supported but this can be easily extended in the future.
2016-06-18 18:53:47 +02:00
Areek Zillur 9356a6090f Merge branch 'master' into enhancement/rollover_api 2016-06-17 11:35:57 -04:00
Simon Willnauer bdb6dcea3a Cleanup ClusterService dependencies and detached from Guice (#18941)
This change removes some unnecessary dependencies from ClusterService
and cleans up ClusterName creation. ClusterService is now not created
by guice anymore.
2016-06-17 17:07:19 +02:00
Areek Zillur 545ffa7801 Merge branch 'master' into enhancement/rollover_api 2016-06-17 10:33:11 -04:00
javanna af93533a17 Merge branch 'master' into feature/http_client 2016-06-17 13:50:18 +02:00
Areek Zillur 6adffa6b7b Merge branch 'master' into enhancement/rollover_api 2016-06-16 17:27:32 -04:00
Ryan Ernst 8196cf01e3 Merge branch 'master' into plugin_name_api 2016-06-16 13:49:28 -07:00
Simon Willnauer b22c526b34 Cut over settings registration to a pull model (#18890)
Today we have a push model for registering basically anything. All our extension points
are defined on modules which we pass in to plugins. This is harder to maintain and adds
unnecessary dependencies on the modules itself. This change moves towards a pull model
where the plugin offers a getter kind of method to get the extensions. This will also
help in the future if we need to pass dependencies to the extension points which can
easily be defined on the method as arguments if a pull model is used.
2016-06-16 15:52:58 +02:00
Nik Everett 5aa4769b25 Move waitForTaskCompletion into TaskManager
This allows for listening for the waiting to start using
MockTaskManager. This allows us to work around a race condition
in the TasksIT.
2016-06-16 09:45:46 -04:00
Simon Willnauer 18ff051ad5 Simplify ScriptModule and script registration (#18903)
Registering a script engine or native scripts still uses Guice today
and is much more complicated than needed. This change moves to a pull
based model where script plugins have to implement a dedicated interface
`ScriptPlugin` and defines simple getter returning instances rather than
classes.
2016-06-16 09:35:13 +02:00
Ryan Ernst a4503c2aed Plugins: Remove name() and description() from api
In 2.0 we added plugin descriptors which require defining a name and
description for the plugin. However, we still have name() and
description() which must be overriden from the Plugin class. This still
exists for classpath plugins. But classpath plugins are mainly for
tests, and even then, referring to classpath plugins with their class is
a better idea. This change removes name() and description(), replacing
the name for classpath plugins with the full class name.
2016-06-15 17:12:22 -07:00
Tal Levy a26260fb72 new ScriptProcessor for Ingest (#18193)
add new ScriptProcessor for executing ES Scripts within pipelines
2016-06-15 14:57:18 -07:00
Daniel Mitterdorfer f32b700472 Exclude admin / diagnostic requests from HTTP request limiting
With this commit we exclude certain HTTP requests that are needed to inspect the cluster
from HTTP request limiting to ensure these commands are processed even in critical
memory conditions.

Relates #17951, relates #18145, closes #18833
2016-06-15 14:29:46 +02:00
javanna ace3a7b146 Merge branch 'master' into feature/http_client 2016-06-15 11:44:46 +02:00
Simon Willnauer 429dd3a876 Simplify FetchSubPhase registration and detach it from Guice (#18862)
this commit removes FetchSubPhrase registration by class to registration
by instance. No Guice binding needed anymore.
2016-06-15 09:13:02 +02:00
Nik Everett d0e4485d42 Move NamingConventionsCheck into buildSrc
This will let things that don't depend on :test:framework like the
client use it.

Also skip initializing the classes we check because we don't care
about their initialization behavior because we're not executing them.
This makes the naming conventions check pretty close to instant
from a "human eye" perspective.
2016-06-14 18:30:34 -04:00
Colin Goodheart-Smithe d7e3f9e4eb #18854 Remove size 0 options in aggregations
Remove size 0 options in aggregations
2016-06-14 15:32:42 +01:00
Simon Willnauer 4d78f280ed Remove dead code and dead parameters (#18855) 2016-06-14 15:25:44 +02:00
Colin Goodheart-Smithe cfd3356ee3 Remove size 0 options in aggregations
This removes the ability to set `size: 0` in the `terms`, `significant_terms` and `geohash_grid` aggregations for the reasons described in https://github.com/elastic/elasticsearch/issues/18838

Closes #18838
2016-06-14 13:07:02 +01:00
Adrien Grand 44c653f5a8 Upgrade to lucene-6.1.0-snapshot-3a57bea. 2016-06-10 16:18:12 +02:00
javanna cf6e713d77 Merge branch 'master' into feature/http_client 2016-06-09 17:43:45 +02:00
javanna 437c4f210b rename ElasticsearchResponse to Response and ElasticsearchResponseException to ResponseException 2016-06-09 14:38:32 +02:00
javanna 04d620da74 require hosts when creating RestClient.Builder
Also fix order of arguments when using assertEquals
2016-06-08 12:37:50 +02:00
Martijn van Groningen f611f1c99e ingest: Move processors from core to ingest-common module.
Folded grok processor into ingest-common module.

The rest tests have been moved to ingest-common module as well, because these tests don't run in the rest-api-spec module but in the distribution:integ-test-zip module
and adding a test plugin there felt just wrong to me. I think this is ok. I left a tiny ingest rest test behind in that tests with an empty pipeline.

Removed messy tests, these tests were already covered in the rest tests

Added ingest test plugin in test infra so that each module testing integration with ingest doesn't need write its own plugin

Moved reindex ingest tests to qa module

Closes #18490
2016-06-07 17:32:52 +02:00
Jason Tedor da74323141 Register thread pool settings
This commit refactors the handling of thread pool settings so that the
individual settings can be registered rather than registering the top
level group. With this refactoring, individual plugins must now register
their own settings for custom thread pools that they need, but a
dedicated API is provided for this in the thread pool module. This
commit also renames the prefix on the thread pool settings from
"threadpool" to "thread_pool". This enables a hard break on the settings
so that:
 - some of the settings can be given more sensible names (e.g., the max
   number of threads in a scaling thread pool is now named "max" instead
   of "size")
 - change the soft limit on the number of threads in the bulk and
   indexing thread pools to a hard limit
 - the settings names for custom plugins for thread pools can be
   prefixed (e.g., "xpack.watcher.thread_pool.size")
 - remove dynamic thread pool settings

Relates #18674
2016-06-06 22:09:12 -04:00
Areek Zillur d96fe20e3a add named writable registry glue 2016-06-06 16:11:46 -04:00
Yannick Welsch 0a8afa2e72 Add back pending deletes (#18698)
Triggering the pending deletes logic was accidentally removed in the clean up PR #18602.
2016-06-06 15:14:09 +02:00
javanna a461dd84d2 Build: add hamcrest and securemock to version.properties 2016-06-06 15:02:52 +02:00
javanna 56e689e1b3 [TEST] remove unused method 2016-06-04 01:05:53 +02:00
javanna b15279b5ef Allow to pass socket facttry registry to createDefaultHttpClient method 2016-06-03 23:59:26 +02:00
javanna b891c46657 [TEST] remove status matcher and hasStatus assertion
All it does is checking the status code of a response, which can be done with a single line in each test
2016-06-03 23:25:17 +02:00
javanna f17f0f9247 rename ElasticsearchResponse#getFirstHeader to getHeader 2016-06-03 18:28:31 +02:00
javanna 23a94bb974 [TEST] create standard RestClient at first request and reuse it
A RestClient instance is now created whenever EsIntegTestCase#getRestClient is invoked for the first time. It is then kept until the cluster is cleared (depending on the cluster scope of the test).

Renamed other two restClient methods to createRestClient, as that instance needs to be closed and managed in the tests.
2016-06-03 18:00:54 +02:00
javanna e81aad972a remove usage of deprecated api 2016-06-03 16:01:07 +02:00
javanna eae914ae8e Replace rest test client with low level RestClient
We still have a wrapper called RestTestClient that is very specific to Rest tests, as well as RestTestResponse etc. but all the low level bits around http connections etc. are now handled by RestClient.
2016-06-03 16:01:07 +02:00
javanna 325b723930 [TEST] add rest client test dependency and replace usage of HttpRequestBuilder with RestClient in integration tests 2016-06-03 16:01:07 +02:00
Ali Beyad b720216395 Adds UUIDs to snapshots
This commit adds a UUID for each snapshot, in addition to the already
existing repository and snapshot name. The addition of UUIDs will enable
more robust handling of the deletion of previous snapshots and lingering
files from partially failed delete operations, on top of being able to
uniquely track each snapshot.

Closes #18228
Relates #18156
2016-06-02 17:01:48 -04:00
Christoph Büscher 9067407cdd Adressing review comments 2016-06-02 16:19:23 +02:00
Christoph Büscher e2b6dbc020 Add tests to check that toQuery() doesn't return null 2016-06-02 11:25:56 +02:00
Christoph Büscher 359f45988f Handle empty query bodies at parse time and remove EmptyQueryBuilder
Currently we support empty query clauses like the filter in

"constant_score" : {  "filter" : { } }

How these clauses are handled depends on the surrounding query.
They later are either ignored, converted to match all or no documents or
passed up further in the query hierarchy. During parsing these claues are
currently represented as EmptyQueryBuilders. When not handled anywhere else,
these special cases need to be checked for on the shard when building the
lucene query.

This is trappy, so this PR changes the parsing of compound queries. Instead
of returning QueryBuilder, the core query parsing method
QueryShardContext#parseInnerQueryBuilder() now return an Optional which can
be empty in the case of empty query clauses. This has the advantage of forcing
callers to deal with this sooner or later. When encountering empty Optionals,
compound query builders now have the choice to ignore them, pass them on or
rewrite to a different query, depending on context.
2016-06-02 11:25:56 +02:00
Yannick Welsch c20bf5d747 [TEST] Fix tests that rely on assumption that data dirs are removed after index deletion (#18681)
Relates to #18602
2016-06-01 17:02:09 +02:00
Simon Willnauer 88800e8e47 Move PageCacheRecycler into BigArrays (#18666)
PageCacheRecycler is really just an implementation detail of
BigArrays. There is no need to leak this class anywhere outside of it.
2016-06-01 09:43:11 +02:00
Ali Beyad 0efac76f01 Clarify the semantics of the BlobContainer interface
This commit clarifies the behavior that must be adhered to by any
implementors of the BlobContainer interface.  This is done through
expanded Javadocs.

Closes #18157
Closes #15580
2016-05-31 19:22:55 -04:00
Jason Tedor e21d8b31f1 Remove thread pool from page cache recycler
The page cache recycler has a dependency on thread pool that was there
for historical reasons but is no longer needed. This commit removes this
now unneeded dependency.

Relates #18664
2016-05-31 14:51:58 -04:00
Simon Willnauer 502a775a7c Add primitive to shrink an index into a single shard (#18270)
This adds a low level primitive operations to shrink an existing
index into a new index with a single shard. This primitive expects
all shards of the source index to allocated on a single node. Once the target index is initializing on the shrink node it takes a snapshot of the source index shards and copies all files into the target indices data folder. An [optimization](https://issues.apache.org/jira/browse/LUCENE-7300) coming in Lucene 6.1 will also allow for optional constant time copy if hard-links are supported by the filesystem. All mappings are merged into the new indexes metadata once the snapshots have been taken on the merge node.

To shrink an existing index all shards must be moved to a single node (one instance of each shard) and the index must be read-only:

```BASH
$ curl -XPUT 'http://localhost:9200/logs/_settings' -d '{
    "settings" : {
        "index.routing.allocation.require._name" : "shrink_node_name",
        "index.blocks.write" : true 
    }
}
```
once all shards are started on the shrink node. the new index can be created via:

```BASH
$ curl -XPUT 'http://localhost:9200/logs/_shrink/logs_single_shard' -d '{
    "settings" : {
        "index.codec" : "best_compression",
        "index.number_of_replicas" : 1
    }
}'
```

This API will perform all needed check before the new index is created and selects the shrink node based on the allocation of the source index. This call returns immediately, to monitor shrink progress the recovery API should be used since all copy operations are reflected in the recovery API with byte copy progress etc.

The shrink operation does not modify the source index, if a shrink operation should
be canceled or if the shrink failed, the target index can simply be deleted and
all resources are released.
2016-05-31 10:41:44 +02:00
Boaz Leskes 318a4e3ef6 Introduce dedicated master nodes in testing infrastructure (#18514)
This PR changes the InternalTestCluster to support dedicated master nodes. The creation of dedicated master nodes can be controlled using a new `supportsMasterNodes` parameter to the ClusterScope annotation. If set to true (the default), dedicated master nodes will randomly be used. If set to false,  no master nodes will be created and data nodes will also be allowed to become masters. If active, test runs will either have 1 or 3 masternodes
2016-05-27 08:44:20 +02:00
Yannick Welsch 31b0777c91 Simplify delayed shard allocation (#18351)
This commit simplifies the delayed shard allocation implementation by assigning clear responsibilities to the various components that are affected by delayed shard allocation:

- UnassignedInfo gets a boolean flag delayed which determines whether assignment of the shard should be delayed. The flag gets persisted in the cluster state and is thus available across nodes, i.e. each node knows whether a shard was delayed-unassigned in a specific cluster state. Before, nodes other than the current master were unaware of that information.
- This flag is initially set as true if the shard becomes unassigned due to a node leaving and the index setting index.unassigned.node_left.delayed_timeout being strictly positive. From then on, unassigned shards can only transition from delayed to non-delayed, never in the other direction.
- The reroute step is in charge of removing the delay marker (comparing timestamp when node left to current timestamp).
- A dedicated service DelayedAllocationService, reacting to cluster change events, has the responsibility to schedule reroutes to remove the delay marker.

Closes #18293
2016-05-26 13:39:55 +02:00
Adrien Grand cad959b980 Validate parameters of native sig score scripts so that we know which ones are not set. 2016-05-26 10:07:38 +02:00
Jason Tedor 9d39b05845 Remove deprecation suppression
Failing the build on deprecation warnings was removed in
19b3ec88af. This commit removes the
suppressed deprecation warnings so that their use is surfaced in the
build now.

Relates #18582
2016-05-25 17:15:36 -04:00
Nik Everett bef1c8511d s/tests.logger.level/tests.es.logger.level/
This is a leftover spot that wasn't changed. It was breaking
ClusterSettingsIT#ClusterSettingsIT because that test expected
the test's log level to default to the default logger level for
the nodes.
2016-05-24 13:25:16 -04:00
Martijn van Groningen 27cc2fe4dc Moved the percolator from core to its own module
Significant changes:
* AbstractQueryTestCase has moved to the test framework module, in order for query builder tests in modules and plugins
* Added support to AbstractQueryTestCase to register plugins
* Lift the restriction that only one percolator could be added per index. This validation existed in MapperService, but because the percolator moved to a module it could no longer exist there. Instead of bringing it back it was removed. This validation existed since the percolator cache only supported one percolator query per document, since the percolator cache has been removed this restriction could removed as well.
* While moving percolator tests to the new module, also removed a couple of tests for the deprecated percolate and mpercolate api. These APIs are now sugar  APIs for bwc and rediect to the searvh and msearvh APIs. Some tests were still testing as if percolate and mpercolate API did the percolation, but this no longer the case and these tests could be removed.
2016-05-24 11:01:57 +02:00
Ryan Ernst f6074d383b Merge pull request #18532 from rjernst/less_assert_busy
Tests: Remove unnecessary Callable variant of assertBusy
2016-05-23 17:11:54 -07:00
Chris Earle b49635539d Remove support for -Des.* system properties in integration tests
This now requires that system properties passed to Gradle must be in the form of "-Dtests.es.*" instead of
"-Des.*". It then chops off "tests.es." and passes that as a "-E" property to Elasticsearch.

Also changed system properties:

- `tests.logger.level` became `tests.es.logger.level`
- `node.mode` became `tests.es.node.mode`
- `node.local` became `tests.es.node.local`
2016-05-23 19:38:21 -04:00
Ryan Ernst c7b45b2cc7 Tests: Remove unnecessary Callable variant of assertBusy
The assertBusy method currently has both a Runnable and Callable
version. This has caused confusion with type inference and lambdas
sometimes, in particular with java 9. This change removes the callable
version as nothing was actually using it.
2016-05-23 16:17:43 -07:00
Jason Tedor f63d1255d1 Cleanup settings and system properties entanglement
This commit cleans up some additional places where system properties
were being used to pass settings to Elasticsearch.

Relates #18524
2016-05-23 14:47:22 -04:00
Luca Cavanna d2afe759a7 prevent registration of duplicated rest spec (#18504)
Rather than having one win against the other, reject duplicated apis. Also enforce the convention that see the api name have the same name as the name of the rest spec file that defines it.
2016-05-23 12:17:42 +02:00
Ryan Ernst 37d36f2f4c Merge branch 'master' into java9 2016-05-21 14:19:58 -07:00
Ryan Ernst 41a5c0cfa1 Force java9 log4j hack in testing 2016-05-21 13:41:38 -07:00
Ryan Ernst 1d40c4bbc1 Make java9 work again
This change makes ES compile with java9 again, build 118.
* There are a handful of changes due to failure to determine types during compile.
* The attachment plugins which use tika needed to have tika upgraded in order to pickup fixes there for java 9.
* azure discovery and s3 repository indirectly depend on jaxb, which is no longer in the default modules. They now add a jaxb dependency externally, and make JarHell allow for this package.
2016-05-21 09:41:51 -07:00
Lee Hinman fdfd2a2f18 Remove ScriptMode class in favor of boolean true/false
This removes the ScriptMode class entirely, which was an enum with two
options (ON and OFF) which essentially boiled down to true and false.
Now the boolean values are used instead.
2016-05-20 15:01:30 -06:00
Martijn van Groningen 80fee8666f percolator: Removed percolator cache
Before 5.0 for it was required that the percolator queries were cached in jvm heap as Lucene queries for two reasons:
1) Performance. The percolator evaluated all percolator queries all the time. There was no pre-selecting queries that are likely to match like we have today.
2) Updates made to percolator queries were visible in realtime, Today these changes are visible in near realtime. So updating no longer requires the percolator to have the queries in jvm heap.

So having the percolator queries in jvm heap via the percolator cache is now less attractive. Especially when there are many percolator queries then these queries can consume many GBs of jvm heap.
Removing the percolator cache does make the percolate query slower compared to how the execution time in 5.0.0-alpha1 and alpha2, but it is still faster compared to 2.x and before.
2016-05-20 14:52:16 +02:00
Luca Cavanna fcee329332 update http client version to 4.5.2 and http-core 4.4.4 (#18399)
StrictHostnameVerifier can now be removed
2016-05-20 12:02:42 +02:00
Jason Tedor c257e2c51f Remove settings and system properties entanglement
Today when parsing settings during bootstrap, we add a system property
for every Elasticsearch setting. Additionally, settings can be set via
system properties. This commit simplifies this situation.
 - settings are no longer propogated to system properties
 - system properties can not be used to set settings
 - the "es." prefix on settings is no longer required (nor permitted)
 - test logging has a dedicated system property (tests.logger.level)

Relates #18198
2016-05-19 14:08:08 -04:00
Christoph Büscher d2515727d0 Improve random DateTimeZone creation in tests
We often require a random joda DateTimeZone in our tests. Currently
there are a few options for generating such a random DateTimeZone
from the set of available ids. Currently most random picks are not
really reproducable across different jvms because they rely on order
in the ids set implementation. The helper in DateProcessorFactoryTests
thus performs a sort on the set of ids before random picking from
the result, so I moved this to ESTestCase to make it publicly
available and changed all other tests to use that method.
2016-05-19 18:12:48 +02:00
Tanguy Leroux 35d3bdab84 Add Google Cloud Storage repository plugin
Closes #12880
2016-05-19 13:26:23 +02:00
Jason Tedor ecce53f0df Add I/O statistics on Linux
This commit adds a variety of real disk metrics for the block devices
that back Elasticsearch data paths. A collection of statistics are read
from /proc/diskstats and are used to report the raw metrics for
operations and read/write bytes.

Relates #15915
2016-05-17 16:16:39 -04:00
Adrien Grand 864ed04059 Lessen leniency of the query dsl. #18276
This change does the following:
 - Queries that are currently unsupported such as prefix queries on numeric
   fields or term queries on geo fields now throw an error rather than returning
   a query that does not match anything.
 - Fuzzy queries on numeric, date and ip fields are now unsupported: they used
   to create range queries, we now expect users to use range queries directly.
   Fuzzy, regexp and prefix queries are now only supported on text/keyword
   fields (including `_all`).
 - The `_uid` and `_id` fields do not support prefix or range queries anymore as
   it would prevent us to store them more efficiently in the future, eg. by
   using a binary encoding.

Note that it is still possible to ignore these errors by using the `lenient`
option of the `match` or `query_string` queries.
2016-05-16 17:37:00 +02:00
Robert Muir 2028691e66 painless: improve exception stacktraces
closes #18319
2016-05-13 15:40:45 -04:00
Lee Hinman 9bcdafedda Allow only a single extension for a scripting engine
Previously multiple extensions could be provided, however, this can lead
to confusion with on-disk scripts (ie, "foo.js" and "foo.javascript")
having different content. Only a single extension is now supported.

The only language currently supporting multiple extensions was the
Javascript engine ("js" and "javascript"). It now only supports the
`.js` extension.

Relates to #10598
2016-05-13 09:54:31 -06:00
Lee Hinman efff3918d8 Remove support for mulitple languages per scripting engine 2016-05-13 09:24:31 -06:00
Lee Hinman a4060f7436 Remove vestiges of script engine sandboxing
This removes all the mentions of the sandbox from the script engine
services and permissions model. This means that the following settings
are no longer supported:

```yaml
script.inline: sandbox
script.stored: sandbox
```

Instead, only a `true` or `false` value can be specified.

Since this would otherwise break the default-allow parameter for
languages like expressions, painless, and mustache, all script engines
have been updated to have individual settings, for instance:

```yaml
script.engine.groovy.inline: true
```

Would enable all inline scripts for groovy. (they can still be
overridden on a per-operation basis).

Expressions, Painless, and Mustache all default to `true` for inline,
file, and stored scripts to preserve the old scripting behavior.

Resolves #17114
2016-05-13 09:24:31 -06:00
Yannick Welsch 7753420540 Make ShardRouting and UnassignedInfo immutable (#17821)
This makes defensive copying of ShardRouting objects obsolete whenever we do a reroute and trashes less objects.
2016-05-10 19:11:04 +02:00
Nik Everett ddc531e729 Build a plugin for testing docs
This makes it much easier to apply to other projects.

Fixes to doc tests infrastructure:
* Fix comparing lists. Was totally broken.
* Fix order of actual vs expected parameters.
* Allow multiple `// TESTRESPONSE` lines with substitutions to join
into one big list of subtitutions. This makes lets the docs look
tidier.
* Exclude build from snippet scanning
* Allow subclasses of ESRestTestCase access to the admin execution context
2016-05-09 14:07:27 -04:00
Nik Everett b7d02fbd1e Improve logging of raw rest actions on failure
Log the method and the path.
2016-05-09 13:04:33 -04:00
Nik Everett ef2e3a8c39 Rest tests: More defense around stashing body
Integration tests failed:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-intake/483/console

We'll see if the rest tests were hiding some other failure.
2016-05-09 09:52:23 -04:00
Ryan Ernst 3d1be071c9 Merge branch 'master' into pom_gen 2016-05-06 12:56:51 -07:00
Chris Earle 5be79ed02c Add Failure Details to every NodesResponse
Most of the current implementations of BaseNodesResponse (plural Nodes) ignore FailedNodeExceptions.

- This adds a helper function to do the grouping to TransportNodesAction
- Requires a non-null array of FailedNodeExceptions within the BaseNodesResponse constructor
- Reads/writes the array to output
- Also adds StreamInput and StreamOutput methods for generically reading and writing arrays
2016-05-06 14:59:43 -04:00
Christoph Büscher 7d14728960 Add xContent shuffling to some more tests
This adds some random shuffling of xContent to some more test cases.

Relates to #5831
2016-05-06 10:46:39 +02:00
Adrien Grand de8354dd7f Allow binary sort values. #17959
The `ip` field uses a binary representation internally. This breaks when
rendering sort values in search responses since elasticsearch tries to write a
binary byte[] as an utf8 json string. This commit extends the `DocValueFormat`
API in order to give fields a chance to choose how to render values.

Closes #6077
2016-05-06 09:27:02 +02:00
Ryan Ernst e16af604bf Build: Add pom generation to assemble task
In preparation for a unified release process, we need to be able to
generate the pom files independently of trying to actually publish. This
change adds back the maven-publish plugin just for that purpose. The
nexus plugin still exists for now, so that we do not break snapshots,
but that can be removed at a later time once snapshots are happenign
through the unified tools. Note I also changed the dir jars are written
into so that all our artifacts are under build/distributions.
2016-05-05 17:57:44 -07:00
Nik Everett 4b1c116461 Generate and run tests from the docs
Adds infrastructure so `gradle :docs:check` will extract tests from
snippets in the documentation and execute the tests. This is included
in `gradle check` so it should happen on CI and during a normal build.

By default each `// AUTOSENSE` snippet creates a unique REST test. These
tests are executed in a random order and the cluster is wiped between
each one. If multiple snippets chain together into a test you can annotate
all snippets after the first with `// TEST[continued]` to have the
generated tests for both snippets joined.

Snippets marked as `// TESTRESPONSE` are checked against the response
of the last action.

See docs/README.asciidoc for lots more.

Closes #12583. That issue is about catching bugs in the docs during build.
This catches *some* bugs in the docs during build which is a good start.
2016-05-05 13:58:03 -04:00
Jason Tedor 784c9e5fb9 Introduce node handshake
This commit introduces a handshake when initiating a light
connection. During this handshake, node information, cluster name, and
version are received from the target node of the connection. This
information can be used to immediately validate that the target node is
a member of the same cluster, and used to set the version on the
stream. This will allow us to extend APIs that are used during initial
cluster recovery without a major version change.

Relates #15971
2016-05-04 20:06:47 -04:00
Jason Tedor 78d615f320 Merge pull request #18110 from jasontedor/strings-split-as-array
Remove Strings#splitStringToArray

Remove arbitrary separator/wildcard from PathTrie
2016-05-04 09:38:47 -04:00
Jason Tedor 2dea449949 Remove Strings#splitStringToArray
This commit removes the method Strings#splitStringToArray and replaces
the call sites with invocations to String#split. There are only two
explanations for the existence of this method. The first is that
String#split is slightly tricky in that it accepts a regular expression
rather than a character to split on. This means that if s is a string,
s.split(".")  does not split on the character '.', but rather splits on
the regular expression '.' which splits on every character (of course,
this is easily fixed by invoking s.split("\\.") instead). The second
possible explanation is that (again) String#split accepts a regular
expression. This means that there could be a performance concern
compared to just splitting on a single character. However, it turns out
that String#split has a fast path for the case of splitting on a single
character and microbenchmarks show that String#split has 1.5x--2x the
throughput of Strings#splitStringToArray. There is a slight behavior
difference between Strings#splitStringToArray and String#split: namely,
the former would return an empty array in cases when the input string
was null or empty but String#split will just NPE at the call site on
null and return a one-element array containing the empty string when the
input string is empty. There was only one place relying on this behavior
and the call site has been modified accordingly.
2016-05-04 08:12:41 -04:00
Isabel Drost-Fromm a8bf75983f Merge branch 'master' into tests/switch_to_random_value_other_than_for_sort 2016-05-04 10:24:46 +02:00
Daniel Mitterdorfer 0a6f40c7f5 Enable HTTP compression by default with compression level 3
With this commit we compress HTTP responses provided the client
supports it (as indicated by the HTTP header 'Accept-Encoding').

We're also able to process compressed HTTP requests if needed.

The default compression level is lowered from 6 to 3 as benchmarks
have indicated that this reduces query latency with a negligible
increase in network traffic.

Closes #7309
2016-05-03 08:53:15 +02:00
Isabel Drost-Fromm 372eceb854 Switch to using predicate for testing existing value 2016-05-02 15:41:05 +02:00
Isabel Drost-Fromm 47fefdd273 Switch from separate sort_mode to more general randomValueOtherThan
... for sort tests only ...
2016-04-28 14:45:56 +02:00
Jason Tedor efeec4d096 Merge pull request #17017 from jasontedor/generic-thread-pool
Actually bound the generic thread pool
2016-04-26 08:27:48 -04:00
Alexander Reelsen 486c783f08 Testing: Remove unused junit rule (#17947)
This rule was used to repeat failed tests due to binding on an already bound port. The test has been fixed
so we can get rid of this rule as well.
2016-04-26 09:53:49 +02:00
Adrien Grand 31a9845bc2 Remove the `SearchType` setter on `SearchContext`. #17955
It was not used.
2016-04-26 09:08:37 +02:00
Ali Beyad d39eb2d691 Adds tombstones to cluster state for index deletions
Previously, we would determine index deletes in the cluster state by
comparing the index metadatas between the current cluster state and the
previous cluster state and decipher which ones were missing (the missing
ones are deleted indices).  This led to a situation where a node that
went offline and rejoined the cluster could potentially cause dangling
indices to be imported which should have been deleted, because when a node
rejoins, its previous cluster state does not contain reliable state.

This commit introduces the notion of index tombstones in the cluster
state, where we are explicit about which indices have been deleted.
In the case where the previous cluster state is not useful for index
metadata comparisons, a node now determines which indices are to be
deleted based on these tombstones in the cluster state.  There is also
functionality to purge the tombstones after exceeding a certain amount.

Closes #17265
Closes #16358
Closes #17435
2016-04-25 15:43:20 -04:00
Jason Tedor 5608fa7ac1 Actually bound the generic thread pool
This commit actually bounds the size of the generic thread pool. The
generic thread pool was of type cached, a thread pool with an unbounded
number of workers and an unbounded work queue. With this commit, the
generic thread pool is now of type scaling. As such, the cached thread
pool type has been removed. By default, the generic thread pool is
constructed with a core pool size of four, a max pool size of 128 and
idle workers can be reaped after a keep-alive time of thirty seconds
expires. The work queue for this thread pool remains unbounded.
2016-04-25 06:47:26 -04:00
Martijn van Groningen c5ad2e2865 Changed indexed scripts to be stored in the cluster state instead of the `.scripts` index.
Also added max script size soft limit for stored scripts.

Closes #16651
2016-04-22 13:42:55 +02:00
Nik Everett 65f6f6bc8d Normalize registration for SignificanceHeuristics
When I pulled on the thread that is "Remove PROTOTYPEs from
SignificanceHeuristics" I ended up removing SignificanceHeuristicStreams
and replacing it with readNamedWriteable. That seems like a lot at once
but it made sense at the time. And it is what we want in the end, I think.

Anyway, this also converts registration of SignificanceHeuristics to
use ParseFieldRegistry to make them consistent with Queries, Aggregations
and lots of other stuff.

Adds a new and wonderous hack to support serialization checking of
NamedWriteables registered by plugins!

Related to #17085
2016-04-19 09:47:37 -04:00
Daniel Mitterdorfer 3688629e11 Adjust line-length of transport related classes to coding standard 2016-04-15 10:12:24 +02:00
Ali Beyad b87fd54ba9 Improvements to the IndicesService class
This commit contains the following improvements/fixes:
  1. Renaming method names and variables to better reflect the purpose
of the method and the semantics of the variable.
  2. For deleting indexes, replace the closed parameter passed to the
delete index/store methods with obtaining the index's state from the
IndexSettings that is already passed in.
  3. Added tests to the IndexWithShadowReplicaIT suite, some of which
show issues in the shadow replica delete process that are captured in
Github issue 17695.

Closes #17638
2016-04-14 11:14:02 -04:00
Nik Everett 64f5a4f848 Stop map collisions on FiltersTests
Adds randomUnique to generate unique things and uses it to make unique
keys.

The offending seed was 81AE616FEAD10F17.
2016-04-13 08:35:45 -04:00
Daniel Mitterdorfer 117bc68af3 Limit request size on HTTP level
With this commit we limit the size of all in-flight requests on
HTTP level. The size is guarded by the same circuit breaker that
is also used on transport level. Similarly, the size that is used
is HTTP content length.

Relates #16011
2016-04-13 09:58:08 +02:00
Daniel Mitterdorfer 52b2016447 Limit request size on transport level
With this commit we limit the size of all in-flight requests on
transport level. The size is guarded by a circuit breaker and is
based on the content size of each request.

By default we use 100% of available heap meaning that the parent
circuit breaker will limit the maximum available size. This value
can be changed by adjusting the setting

network.breaker.inflight_requests.limit

Relates #16011
2016-04-13 09:54:59 +02:00
Adrien Grand 226644ea2c Do not assume term queries use the inverted index. #17532
We have a couple places in the code base that assume that search is always done
on the inverted index. However with the new points API in Lucene 6, this is not
true anymore. This commit makes MappedFieldType.indexedValueForSearch protected
and fixes call sites to keep working for field types that use the inverted
index and either work differently ar throw an exception otherwise. For instance,
it will still be possible to run cross_fields multi match queries on numeric
fields, but the score contributions will not be blended as well as before, and
significant terms aggregations on long terms will not be possible anymore since
points do not record document frequencies.
2016-04-12 09:47:20 +02:00
Adrien Grand 0eb1a816c8 Allow the query cache to be disabled. #16268
This replaces the internal `index.queries.cache.type` setting with
a new `index.queries.cache.enabled` setting, which is documented.

Closes #15802
2016-04-11 18:06:16 +02:00
Alexander Reelsen da19ddf3e6 Ingest Attachment: Allow to prevent base64 conversions by using raw bytes (#16601)
CBOR is natively supported in Elasticsearch and allows for byte arrays.
This means, that by using CBOR the user can prevent base64 conversions
for the data being sent back and forth.

This PR adds support to extract data from a byte array in addition to
a string. This also required to add a ByteArrayValueSource class.
2016-04-11 14:14:56 +02:00
David Pilato 1e346d1ac1 Merge branch 'fix/17625-close-ingest-factory' 2016-04-11 10:00:19 +02:00
Nik Everett 525ce40d1c Give SearchContext a toString
and move the string capturing to capture time.
2016-04-10 20:55:31 -04:00
Nik Everett ac94e5f287 Provide more information about open contexts
Sometimes we get a test failure caused by search contexts left open.
The tests include a stack trace of the call that opened the context
but nothing else about the context. This adds more information about
the context that has been left open like what query it was running,
what shard it targeted, and whether or not it was a scroll.

Relates to #17582
2016-04-10 20:55:31 -04:00
David Pilato 24f48b86b5 Update after review and add a Test 2016-04-09 13:14:25 +02:00
Adrien Grand 42526ac28e Remove Settings.settingsBuilder.
We have both `Settings.settingsBuilder` and `Settings.builder` that do exactly
the same thing, so we should keep only one. I kept `Settings.builder` since it
has my preference but also it is the one that we use in examples of the Java API.
2016-04-08 18:10:02 +02:00
Chris Earle d97d5ebb8b Remove hostname from NetworkAddress.format
This removes the inconsistent output of IP addresses. The format was parsing-unfriendly and it makes it hard
to reason about API responses, such as to _nodes.

With this change in place, it will never print the hostname as part of the default format, which has the
added benefit that it can be used consistently for URIs, which was not the case when the hostname might
appear at the front with "hostname/ip:port".
2016-04-07 17:27:59 -04:00
Adrien Grand c33300c543 Make MappedFieldType responsible for providing a parser/formatter. #17546
Aggregations need to perform instanceof calls on MappedFieldType instances in
order to know how they should be parsed or formatted. Instead, we should let
the field types provide a formatter/parser that can can be used.
2016-04-07 16:57:50 +02:00
jaymode f9d1e8a5f3 Root rest api delegates to a transport action
This change makes the root (/) rest api delegate to a transport action to get the
data for the response. This aligns this rest api with all of the other apis, which
delegate to one or more actions.

In doing this, unit tests were added to provide coverage of the RestMainAction
and the associated classes.
2016-04-07 10:03:49 -04:00
Jason Tedor 0a69985153 Merge pull request #17038 from jasontedor/enable_acked
Prepare for enabling acked indexing
2016-04-06 18:13:28 -04:00
Jimmy Jones f157dae053 Disallow unquoted field names, fix testcases using unquoted JSON 2016-04-06 14:37:15 -06:00
Clinton Gormley cbbf80ca35 v2.3.0 has been released and no longer needs to be hardcoded as -SNAPSHOT 2016-04-04 19:03:43 +02:00
Jason Tedor c7c8b1d825 Merge branch 'master' into enable_acked
* master: (156 commits)
  Make JNA calls optional
  Added RPM metadata
  Remove PROTOTYPE from MLT.Item
  Remove PROTOTYPE from VersionType
  Fix mistake in TopHits change
  Remove PROTOTYPEs from highlighting
  Clean up some log messages
  Command line arguments with comma must be quoted on windows
  Cluster Health should run on applied states, even if waitFor=0 #17440
  ingest: make concrete processor impl final, like all other processor concrete impls.
  Improve some test method comments.
  Document task id's as string in the rest spec
  Replace FieldStatsProvider with a method on MappedFieldType. #17334
  cleanup test
  Remove MathUtils. #17454
  Addressing review comments
  fix javadocs
  Make TranslogConfig immutable and pass TranslogGeneration as a ctor arg to Translog
  [reindex] Don't get rejected
  Remove redundant commit - #openTranslog() already commits in that case
  ...
2016-04-02 13:56:00 -04:00
Christoph Büscher 9d68a515b8 Merge pull request #17453 from cbuescher/add-xcontent-randomization
Add randomization of XContentBuilder output to query tests
2016-04-01 15:02:01 +02:00
Christoph Büscher 7a1b06ce0b Improve some test method comments. 2016-04-01 11:04:56 +02:00
Christoph Büscher 1a697a1ae6 Addressing review comments 2016-03-31 21:46:17 +02:00
Simon Willnauer baa2d51e59 Merge pull request #17422 from s1monw/recovery_mem_buffer_access
Move translog recover outside of the engine

We changed the way we manage engine memory buffers to an
open model where each shard can essentially has infinite memory.
The indexing memory controller is responsible for moving memory to disk
when it's needed. Yet, this doesn't work today when we recover from store/translog
since the engine is not fully initialized such that IMC has no access to the engine,
neither to it's memory buffer nor can it move data to disk.

The biggest issue here is that translog recovery happends inside the Engine constructor
which is problematic by itself since it might take minutes and uses a not yet fully
initialzied engine to perform write operations on.

This change detaches the translog recovery and makes it the responsibility of the caller
to run it once the engine is fully constructed or skip it if not necessary.
2016-03-31 21:03:00 +02:00
Christoph Büscher bbb6d91147 Add randomization of XContentBuilder output to query tests
Currently our testing of parsing query builders is limited to the
default order of the parameters that each builders toXContent()
method produces. To better test real queries where the order of
parameters can be different, this change adds a helper
method to ESTestCase that takes a XContentBuilder and randomly
shuffles the order of the fields inside an object. This is
used in AbstractQueryTestCase, but it can be used in other similar
places in the future.
2016-03-31 18:17:39 +02:00
Simon Willnauer 1e06139584 Move translog recover outside of the engine
We changed the way we manage engine memory buffers to an
open model where each shard can essentially has infinite memory.
The indexing memory controller is responsible for moving memory to disk
when it's needed. Yet, this doesn't work today when we recover from store/translog
since the engine is not fully initialized such that IMC has no access to the engine,
neither to it's memory buffer nor can it move data to disk.

The biggest issue here is that translog recovery happends inside the Engine constructor
which is problematic by itself since it might take minutes and uses a not yet fully
initialzied engine to perform write operations on.

This change detaches the translog recovery and makes it the responsibility of the caller
to run it once the engine is fully constructed or skip it if not necessary.
2016-03-30 23:24:24 +02:00
javanna b9f9b2e3ee Merge branch 'master' into enhancement/discovery_node_one_getter 2016-03-30 17:22:40 +02:00
javanna 62ac7d219f Remove DiscoveryNodes#masterNode in favour of existing DiscoveryNodes#getMasterNode 2016-03-30 15:28:32 +02:00
javanna f8b5d1f5b0 Remove DiscoveryNodes#masterNodeId in favour of existing DiscoveryNodes#getMasterNodeId 2016-03-30 15:28:06 +02:00
javanna 2dbba45f2c Rename static DiscoveryNode#localNode(Settings) to DiscoveryNode#isLocalNode(Settings) 2016-03-30 15:27:26 +02:00
javanna 49e952e272 Rename static DiscoveryNode#dataNode(Settings) to isDataNode 2016-03-30 15:26:41 +02:00
javanna 2230fec9ea Rename static DiscoveryNode#masterNode(Settings) to isMasterNode 2016-03-30 15:26:10 +02:00
javanna a8bbdff3bc Remove DiscoveryNode#name in favour of existing DiscoveryNode#getName 2016-03-30 14:47:36 +02:00
javanna 9889f10e5e Remove DiscoveryNode#id in favour of existing DiscoveryNode#getId 2016-03-30 14:42:15 +02:00
Camilo Diaz Repka 7be11a36cd Refactor: replace all ocurrences of ESTestCase.getRandom() for random().
Remove getRandom().
2016-03-29 23:18:05 -04:00
javanna 061f09d9a4 Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-29 20:19:33 +02:00
Jason Tedor c4324f9964 Merge branch 'master' into enable_acked
* master: (25 commits)
  Replication operation that try to perform the primary phase on a replica should be retried
  split long line in ConvertProcessorTests
  add  type conversion support to ConvertProcessor
  percolator: Make explain use the two phase iterator
  test: make sure we don't flush during indexing the percolator queries
  Added experimental annotation to the update-by-query and reindex docs
  Fixed bad YAML in reindex REST test: 50_routing.yaml
  Update-by-query rest tests: fixed bad yaml and deleted a client-dependent test
  Prevents exception being raised when ordering by an aggregation which wasn't collected
  The reindex body is now required, which changes the exception thrown by the REST test
  Docs: Included Nodes Task API and tidied reindex/update-by-query
  Rename update-by-query REST tests to update_by_query
  REST: The body is required in the reindex API
  The source parameter should not be defined in the delete-by-query REST spec
  Renamed update-by-query REST spec to update_by_query
  Fix test bug in TypeQueryBuilderTests.
  Add comment why it is safe to check the number of nested fields in MapperService.merge.
  Automatically add a sub keyword field to string dynamic mappings. #17188
  Type filters should not have a performance impact when there is a single type. #17350
  Add API to explain why a shard is or isn't assigned
  ...
2016-03-29 11:42:34 -04:00
Colin Goodheart-Smithe ff3fd99074 Prevents exception being raised when ordering by an aggregation which wasn't collected
If a terms aggregation was ordered by a metric nested in a single bucket aggregator which did not collect any documents (e.g. a filters aggregation which did not match in that term bucket) an ArrayOutOfBoundsException would be thrown when the ordering code tried to retrieve the value for the metric. This fix fixes all numeric metric aggregators so they return their default value when a bucket ordinal is requested which was not collected.

Closes #17225
2016-03-29 13:28:03 +01:00
javanna de5cbda8e7 Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-29 10:48:47 +02:00
Lee Hinman 80ab366de4 Add API to explain why a shard is or isn't assigned
This adds a new `/_cluster/allocation/explain` API that explains why a
shard can or cannot be allocated to nodes in the cluster. Additionally,
it will show where the master *desires* to put the shard, according to
the `ShardsAllocator`.

It looks like this:

```
GET /_cluster/allocation/explain?pretty
{
  "index": "only-foo",
  "shard": 0,
  "primary": false
}
```

Though, you can optionally send an empty body, which means "explain the
allocation for the first unassigned shard you find".

The output when a shard is unassigned looks like this:

```
{
  "shard" : {
    "index" : "only-foo",
    "index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
    "id" : 0,
    "primary" : false
  },
  "assigned" : false,
  "unassigned_info" : {
    "reason" : "INDEX_CREATED",
    "at" : "2016-03-22T20:04:23.620Z"
  },
  "nodes" : {
    "V-Spi0AyRZ6ZvKbaI3691w" : {
      "node_name" : "Susan Storm",
      "node_attributes" : {
        "bar" : "baz"
      },
      "final_decision" : "NO",
      "weight" : 0.06666675,
      "decisions" : [ {
        "decider" : "filter",
        "decision" : "NO",
        "explanation" : "node does not match index include filters [foo:\"bar\"]"
      } ]
    },
    "Qc6VL8c5RWaw1qXZ0Rg57g" : {
      "node_name" : "Slipstream",
      "node_attributes" : {
        "bar" : "baz",
        "foo" : "bar"
      },
      "final_decision" : "NO",
      "weight" : -1.3833332,
      "decisions" : [ {
        "decider" : "same_shard",
        "decision" : "NO",
        "explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
      } ]
    },
    "PzdyMZGXQdGhqTJHF_hGgA" : {
      "node_name" : "The Symbiote",
      "node_attributes" : { },
      "final_decision" : "NO",
      "weight" : 2.3166666,
      "decisions" : [ {
        "decider" : "filter",
        "decision" : "NO",
        "explanation" : "node does not match index include filters [foo:\"bar\"]"
      } ]
    }
  }
}
```

And when the shard *is* assigned, the output looks like:

```
{
  "shard" : {
    "index" : "only-foo",
    "index_uuid" : "KnW0-zELRs6PK84l0r38ZA",
    "id" : 0,
    "primary" : true
  },
  "assigned" : true,
  "assigned_node_id" : "Qc6VL8c5RWaw1qXZ0Rg57g",
  "nodes" : {
    "V-Spi0AyRZ6ZvKbaI3691w" : {
      "node_name" : "Susan Storm",
      "node_attributes" : {
        "bar" : "baz"
      },
      "final_decision" : "NO",
      "weight" : 1.4499999,
      "decisions" : [ {
        "decider" : "filter",
        "decision" : "NO",
        "explanation" : "node does not match index include filters [foo:\"bar\"]"
      } ]
    },
    "Qc6VL8c5RWaw1qXZ0Rg57g" : {
      "node_name" : "Slipstream",
      "node_attributes" : {
        "bar" : "baz",
        "foo" : "bar"
      },
      "final_decision" : "CURRENTLY_ASSIGNED",
      "weight" : 0.0,
      "decisions" : [ {
        "decider" : "same_shard",
        "decision" : "NO",
        "explanation" : "the shard cannot be allocated on the same node id [Qc6VL8c5RWaw1qXZ0Rg57g] on which it already exists"
      } ]
    },
    "PzdyMZGXQdGhqTJHF_hGgA" : {
      "node_name" : "The Symbiote",
      "node_attributes" : { },
      "final_decision" : "NO",
      "weight" : 3.6999998,
      "decisions" : [ {
        "decider" : "filter",
        "decision" : "NO",
        "explanation" : "node does not match index include filters [foo:\"bar\"]"
      } ]
    }
  }
}
```

Only "NO" decisions are returned by default, but all decisions can be
shown by specifying the `?include_yes_decisions=true` parameter in the
request.

Resolves #14593
2016-03-28 15:21:02 -06:00
Jason Tedor 4793630eb8 Merge branch 'master' into enable_acked
* master: (419 commits)
  Remove PROTOTYPE from ShapeBuilders
  Take filterNodeIds into consideration while sending tasks actions requests to nodes
  test: cleanup imports and method rename
  Remove PROTOTYPE from SortBuilders
  percolator: Add query extract support for the blended term query and the common terms query.
  Don't iterate over shard routing if it's null
  [TEST] Reduce size of random shapes
  Add some debug logging to testPrimaryRelocationWhileIndexing
  Order methods in IndicesClusterStateService according to execution
  Tidied up percolator doc annotations
  In cat.snapshots, repository is required
  Do not retrieve all indices stats when checking for cache resets
  Enforce `discovery.zen.minimum_master_nodes` is set when bound to a public ip #17288
  Port Primary Terms to master #17044
  Revert "Add debug logging for Vagrant upgrade test"
  Ownership for data, logs, and configs for packages
  add on_failure exception metadata to ingest document for verbose simulate
  Revert "Merge pull request #16843 from xuzha/s3-encryption"
  Update Format, add new settings into the setting test
  Update and rebase the init implementation.
  ...
2016-03-28 12:29:53 -04:00
javanna a9f4982c40 Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-25 20:16:40 +01:00