Commit Graph

913 Commits

Author SHA1 Message Date
Nik Everett 8943421494 Only log rest connection setup once per suite (#21280)
This is a bit funky to do with junit because we need per test state
but we only want to log it per suite. So we use a static flag that
we test per test and reset before every suite.
2016-11-03 21:47:11 -04:00
Yannick Welsch 39f4229594 Add information about in-flight requests when checking IndexShard operation counter (#21308)
Our test infrastructure checks after running each test that there are no more in-flight requests on the shard level. Whenever the check fails, we only know that there were in-flight requests but don't know what requests were causing this issue. This commit adds the replication tasks that are still active at that moment to the assertion error.
2016-11-03 18:36:07 +01:00
Ryan Ernst dc6ed7b8d4 Remove pluggability of ZenPing (#21049)
Plugins: Remove pluggability of ZenPing

ZenPing is the part of zen discovery which knows how to ping nodes.
There is only one alternative implementation, which is just for testing.
This change removes the ability to add custom zen pings, and instead
hooks in the MockZenPing for tests through an overridden method in
MockNode. This also folds in the ZenPingService (which was really just a
single method) into ZenDiscovery, and removes the idea of having
multiple ZenPing instances. Finally, this was the last usage of the
ExtensionPoint classes, so that is also removed here.
2016-11-03 08:20:20 -07:00
Boaz Leskes be1772b70d pending states assertion should dump states
This was removed in a cleanup assuming that Hamcrest will dump the array content. Sadly it only dumps the size.
2016-11-03 09:02:29 +01:00
Christoph Büscher b3370de715 Tests: Add warning header checks to QueryBuilder tests and QueryParseContextTests
This adds checks for expected warning headers to the query builder test
infrastructure. Tests that are adding deprecation warnings to the response
headers need to check those, otherwise the abstract base class for the test
class will complain at teardown.
2016-11-02 15:45:33 +01:00
Yannick Welsch 6930a4846c [TEST] Check static test state after suite scoped cluster is shut down (#21256)
Checks on static test state are run by an @After method in ESTestCase. Suite-scoped tests in ESIntegTestCase only shut down in an @AfterClass method, which executes after the @After method in ESTestCase. The suite-scoped cluster can thus still execute actions that will violate the checks in @After without those being caught. A subsequent test executing within the same JVM will fail these checks however when @After gets called for that test.

This commit adds an explicit call to check the static test state after the suite-scoped cluster has been shut down.
2016-11-02 15:00:16 +01:00
Boaz Leskes 0daf483587 Change ClusterState and PendingClusterTasksResponse's toString() to their prettyPrint format (#21245)
The current XContent output is much harder to read than the prettyPrint format. This commit folds prettyPrint into toString and removes it.
2016-11-02 13:43:39 +01:00
Simon Willnauer cf1457ed22 Allow skip test by version OR feature (#21240)
Today these two are considered mutual exclusive but they are not in
practice. For instance a mixed version cluster might not return a
given warning depending on which node we talk to but on the other hand
some runners might not even support warnings at all so the test might be
skipped either by version or by feature.
2016-11-02 12:24:20 +01:00
Adrien Grand aa6cd93e0f Require arguments for QueryShardContext creation. (#21196)
The `IndexService#newQueryShardContext()` method creates a QueryShardContext on
shard `0`, with a `null` reader and that uses `System.currentTimeMillis()` to
resolve `now`. This may hide bugs, since the shard id is sometimes used for
query parsing (it is used to salt random score generation in `function_score`),
passing a `null` reader disables query rewriting and for some use-cases, it is
simply not ok to rely on the current timestamp (eg. percolation). So this pull
request removes this method and instead requires that all call sites provide
these parameters explicitly.
2016-11-02 09:48:49 +01:00
Simon Willnauer 2ba4dadea0 [TEST] fix extrasFS file filtering in OldIndexUtils 2016-11-02 09:38:51 +01:00
Simon Willnauer 4db1ac931f Fix InternalEngineTests#testUpgradeOldIndex for 5.0.0 BWC indices
Relates to #21147
2016-11-02 09:38:44 +01:00
Jason Tedor 7751049c14 Add version for 5.0.0
This commit adds the version constant for 5.0.0.

Relates #21244
2016-11-01 14:09:00 -04:00
Boaz Leskes 523f7ea71e Fix a racing condition in MockTransportService#addUnresponsiveRule where a request can be delayed even if the rule was removed.
Relates to #21129

Also properly reset DiscoveryWithServiceDisruptionsIT#disableBeforeIndexDeletion
2016-11-01 14:08:18 +01:00
Boaz Leskes ef192ff2cf ESIntegTestCase.jav: use ClusterState.prettyPrint for pending ClusterState assertions 2016-11-01 12:54:20 +01:00
Yannick Welsch d7d5909e69 Disconnect from newly added nodes if cluster state publishing fails (#21197)
Before publishing a cluster state the master connects to the nodes that are added in the cluster state. When publishing fails, however, it does not disconnect from these nodes, leaving NodeConnectionsService out of sync with the currently applied cluster state.
2016-10-31 15:09:43 +01:00
Simon Willnauer 9598616dfe Fallback to '/' info call to fetch cluster version
The `_cat/nodes` API might not be available in all clusters for instance
if they have authorization enabled. This change falls back to the previously
used method of using the '/' endpoint to fetch the nodes version, this is best
effort and will emit a warning.
2016-10-28 16:22:53 +02:00
Adrien Grand b3cc54cf0d Upgrade to lucene-6.3.0-snapshot-ed102d6 (#21150)
Lucene 6.3 is expected to be released in the next weeks so it'd be good to give
it some integration testing. I had to upgrade randomized-testing too so that
both Lucene and Elasticsearch are on the same version.
2016-10-28 14:47:15 +02:00
Simon Willnauer 43dbf9c7b6 Use all available hosts in REST tests and allow for real master election (#21161)
Today we only use a single node to send requests to when we run REST tests.
In some cases we have more than one node (ie. in the BWC case) where we should
send requests to all nodes in a round-robin fashion. This change passes all
available node endpoints to the rest test.

Additionally, this change adds the setting of `discovery.zen.minimum_master_nodes`
to the cluster formation forcing the nodes to wait for all other nodes until the cluster
is formed. This allows for a more realistic master election and allows all master eligable
nodes to become master while before always the first node in the cluster became the master.

This also adds logging to each test run to log the master nodes version and the minimum node
version in the cluster to help debugging BWC test failures.
2016-10-28 12:18:47 +02:00
Simon Willnauer 97cc426a89 Fix bwc cluster formation in order to run BWC tests against a mixed version cluster (#21145)
This fixes our cluster formation task to run REST tests against a mixed version cluster.
Yet, due to some limitations in our test framework `indices.rollover` tests are currently
disabled for the BWC case since they select the current master as the merge node which
happens to be a BWC node and we can't relocate all shards to it since the primaries are on
a higher version node. This will be fixed in a followup.

Closes #21142

Note: This has been cherry-picked from 5.0 and fixes several rest tests
as well as a BWC break in `OsStats.java`
2016-10-27 17:03:53 +02:00
Yannick Welsch f3e578f942 Stop delaying existing requests after network delay rule is cleared (#21129)
The network disruption type "network delay" continues delaying existing requests even after the disruption has been cleared. This commit ensures that the requests get to execute right after the delay rule is cleared.
2016-10-27 13:48:17 +02:00
Jason Tedor 9c3e4d6e22 Add correct Content-Length on HEAD requests
This commit fixes responses to HEAD requests so that the value of the
Content-Length is correct per the HTTP spec. Namely, the value of this
header should be equal to the Content-Length if the request were not a
HEAD request.

This commit also fixes a memory leak on HEAD requests to the main action
that arose from the bytes on a builder not being released due to them
being dropped on the floor to ensure that the response to the main
action did not have a body.

Relates #21123
2016-10-25 23:08:19 -04:00
Igor Motov 17ad88d539 Makes search action cancelable by task management API
Long running searches now can be cancelled using standard task cancellation mechanism.
2016-10-25 12:27:34 -10:00
Christoph Büscher f6f129b21f Consolidate code for equals/hashCode testing in central utility class
Currently test that check that equals() and hashCode() are working as expected
for classes implementing them are quiet similar. This change moves common
assertions in this method to a common utility class. In addition, another common
utility function in most of these test classes that creates copies of input
object by running them through a StreamOutput and reading them back in, is moved
to ESTestCase so it can be shared across all these classes.

Closes #20629
2016-10-24 15:50:40 +02:00
Simon Willnauer 0a410d3916 Pass executor name to request interceptor to support async intercept calls (#21089)
Today the request interceptor can't support async calls since the response
of the async call would execute on a different thread ie. a client or listener
thread. This means in-turn that the intercepted handler is not executed with the
thread it was supposed to run and therefor can, if it's executing blocking
operations, potentially deadlock an entire server.
2016-10-24 13:57:07 +02:00
Ryan Ernst 53cff0f00f Move all zen discovery classes into o.e.discovery.zen (#21032)
* Move all zen discovery classes into o.e.discovery.zen

This collapses sub packages of zen into zen. These all had just a couple
classes each, and there is really no reason to have the subpackages.

* fix checkstyle
2016-10-20 00:44:48 -07:00
javanna c92b550df2 [TEST] Remove create special case in yaml test client
Now that the create api has its own spec, we can remove the special case in the yaml test client for it

Relates to #20924
2016-10-20 08:48:15 +02:00
Boaz Leskes c3987156ab Remove local discovery in favor of a simpler `MockZenPings` (#20960)
`LocalDiscovery` is a discovery implementation that uses static in memory maps to keep track of current live nodes. This is used extensively in our tests in order to speed up cluster formation (i.e., shortcut the 3 second ping period used by `ZenDiscovery` by default). This is sad as that mean that most of the test run using a different discovery semantics than what is used in production. Instead of replacing the entire discovery logic, we can use a similar approach to only shortcut the pinging components.
2016-10-18 21:12:15 +02:00
Boaz Leskes eaa105951f Simplify GlobalCheckpointService and properly hook it for cluster state updates (#20720)
During a recent merge from master, we lost the bridge from IndicesClusterStateService to the GlobalCheckpointService of primary shards, notifying them of changes to the current set of active/initializing shards. This commits add the bridge back (with unit tests). It also simplifies the GlobalCheckpoint tracking to use a simpler model (which makes use the fact that the global check point sync is done periodically).

The old integration CheckpointIT test is moved to IndexLevelReplicationTests. I also added similar assertions to RelocationsIT, which surfaced a bug in the primary relocation logic and how it plays with global checkpoint updates. The test is currently await-fixed and will be fixed in a follow up issue.
2016-10-17 16:33:03 +02:00
Tanguy Leroux 1755cc08f3 REST API parser should fail on duplicate params/paths/methods/parts (#20940)
This commit changes the current REST API parser to make it fail and throw an exception when a REST specification file contains a duplicated parameters, or path, or method, or path part.
2016-10-17 09:19:07 +02:00
Simon Willnauer 5137f44bd6 [TEST] return empty array if AbstractQueryTestCase#currentTypes is null
This is important to allow any test to use RandomQueryBuilder#createQuery()
since some of the query builders that are used in this test test the length
of the types array and otherwise will thow NPE if the test is not a subclass
of AbstractQueryTestCase.
2016-10-15 14:46:54 +02:00
Boaz Leskes bc8ad8de5a MockBigArrays should tell you who originally released them 2016-10-12 13:03:40 +02:00
Tanguy Leroux 44ac5d057a Remove empty javadoc (#20871)
This commit removes as many as empty javadocs comments my regexp has found
2016-10-12 10:27:09 +02:00
Adrien Grand 1914df7b5f Do not cache script queries. (#20799)
The cache relies on the equals() method so we just need to make sure script
queries can never be equals, even to themselves in the case that a weight
is used to produce a Scorer on the same segment multiple times.

Closes #20763
2016-10-11 09:17:21 +02:00
Simon Willnauer 4fd1276542 Prevent AbstractArrays from release bytes more than once (#20819)
Today we throw an assertion error if we release an AbstractArray more than once.
Yet, it's recommended to implement close methods such that they can be invoked
more than once. Guaranteed single release calls are hard to implement and some
situations might not be tested causing for instance `CircuitBreaker` to operate on
corrupted memory stats.
2016-10-10 17:30:37 +02:00
javanna e154e6a758 [TEST] reformatted comment in query tests 2016-10-10 10:53:17 +02:00
Nik Everett cf4038b668 DeGuice some of IndicesModule
UpdateHelper, MetaDataIndexUpgradeService, and some recovery
stuff.

Move ClusterSettings to nullable ctor parameter of TransportService
so it isn't forgotten.
2016-10-07 11:14:38 -04:00
Simon Willnauer 7452028e50 Simplify TransportAddress (#20798)
since TransportAddress is now final we can simplify it's interface a bit
and remove methods that are only used in tests or are plain delegates.
2016-10-07 15:56:54 +02:00
Simon Willnauer 194a6b1df0 Remove LocalTransport in favor of MockTcpTransport (#20695)
This change proposes the removal of all non-tcp transport implementations. The
mock transport can be used by default to run tests instead of local transport that has
roughly the same performance compared to TCP or at least not noticeably slower.

This is a master only change, deprecation notice in 5.x will be committed as a
separate change.
2016-10-07 11:27:47 +02:00
Simon Willnauer 9c9afe3f01 Remove SearchContext#current and all it's threadlocals (#20778)
Today SearchContext expose the current context as a thread local which makes any kind of sane interface design very very hard. This PR removes the thread local entirely and instead passes the relevant context anywhere needed. This simplifies state management dramatically and will allow for a much leaner SearchContext interface down the road.
2016-10-06 19:51:54 +02:00
Colin Goodheart-Smithe 40f8f281e0 Merge branch 'master' into dont_cache_scripts 2016-10-06 09:09:23 +01:00
Colin Goodheart-Smithe ce6f6d3835 Review comments 2016-10-06 08:55:31 +01:00
Simon Willnauer 134b1f9b4d Prevent thread suspension when inside SecurityManager (#20770)
LongGCDisruption suspends and resumes node threads but respects several
`unsafe` class name patterns where it's unsafe to suspend. For instance
log4j uses a global lock so we can't suspend a thread that is currently
calling into log4j. The same is true for the security manager, it's similar
to log4j a shared resource between the test and the node that is _suspended_.
This change adds `java.lang.SecrityManager` to the unsafe patterns.
This prevents test framework deadlocking if a nodes thread is supended
while it's calling into the security manager that uses synchronized maps etc.
2016-10-05 21:40:27 +02:00
Simon Willnauer e556c289b9 use a private rewrite context to prevent exposing isCachable 2016-10-05 11:41:49 +02:00
Simon Willnauer 7ba22bb75b fix random score function builder to deal with empty seeds 2016-10-05 10:45:24 +02:00
Simon Willnauer 587bdcef38 add extra safety when accessing scripts or now and reqeusts are cached 2016-10-05 09:41:48 +02:00
Simon Willnauer 94b7873b49 Add a #markAsNotCachable() method to context to mark requests as not cachable 2016-10-04 18:05:00 +02:00
Simon Willnauer 56f35baf47 Add date-math support to `_rollover` (#20709)
today it's not possible to use date-math efficiently with the `_rollover`
API. This change adds support for date-math in the target index as well as
support for preserving the math logic when an existing index that was created with
a date math expression all subsequent indices are created with the same expression.
2016-10-03 16:52:33 +02:00
Boaz Leskes 27eab74510 merge from master 2016-09-30 17:19:30 +02:00
Jason Tedor 3a4ffd7b86 Fix failing logging listener tests
The logging listener tests started failing after
953a8a959b when the tests are run with
tests.es.logger.level set to any level other than debug. This is because
these tests were based around the assumption that the default logging
level was info, which was the case before that commit fixed setting the
default logging level via that system property. This commit fixes these
failing tests by adjusting this assumption to account for the fact that
the default logging level could be different.
2016-09-30 08:09:35 +02:00
Boaz Leskes a16d644c68 allow settings logging level via a sys config in unit tests
Pipe in the `tests.es.logger.level` system property to the log4j config file used in tests. We still default to info. Also adapts the logger name to use the first letter of packages.
2016-09-29 03:04:43 +02:00
Jason Tedor 0808611184 Fix failing tests after merge
This commit fixes failing tests in feature/seq_no after merging master
in.
2016-09-29 03:04:37 +02:00
Boaz Leskes 953a8a959b allow settings logging level via a sys config in unit tests
Pipe in the `tests.es.logger.level` system property to the log4j config file used in tests. We still default to info. Also adapts the logger name to use the first letter of packages.
2016-09-29 01:33:13 +02:00
Jason Tedor 25fd9e26c4 Merge branch 'master' into feature/seq_no
* master: (1199 commits)
  [DOCS] Remove non-valid link to mapping migration document
  Revert "Default `include_in_all` for numeric-like types to false"
  test: add a test with ipv6 address
  docs: clearify that both ip4 and ip6 addresses are supported
  Include complex settings in settings requests
  Add production warning for pre-release builds
  Clean up confusing error message on unhandled endpoint
  [TEST] Increase logging level in testDelayShards()
  change health from string to enum (#20661)
  Provide error message when plugin id is missing
  Document that sliced scroll works for reindex
  Make reindex-from-remote ignore unknown fields
  Remove NoopGatewayAllocator in favor of a more realistic mock (#20637)
  Remove Marvel character reference from guide
  Fix documentation for setting Java I/O temp dir
  Update client benchmarks to log4j2
  Changes the API of GatewayAllocator#applyStartedShards and (#20642)
  Removes FailedRerouteAllocation and StartedRerouteAllocation
  IndexRoutingTable.initializeEmpty shouldn't override supplied primary RecoverySource (#20638)
  Smoke tester: Adjust to latest changes (#20611)
  ...
2016-09-29 00:22:31 +02:00
Jason Tedor 3c8ff45917 Add production warning for pre-release builds
This commit adds a usage warning when Elasticsearch is started with a
pre-release build.

Relates #20674
2016-09-27 20:13:12 -04:00
Boaz Leskes ee76c1a5c9 Remove NoopGatewayAllocator in favor of a more realistic mock (#20637)
Many of our unit tests instantiate an `AllocationService`, which requires having a `GatewayAllocator`. Today almost all of our test use a class called `NoopGatewayAllocator` which does nothing, effectively leaving all shard assignments to the balanced allocator. This is sad as it means we test a system that behaves differently than our production logic in very basic things. For example, a started primary that is lost will be assigned to a node that didn't use to have it.

This PR removes `NoopGatewayAllocator` in favor of a new `TestGatewayAllocator` that inherits the standard `GatewayAllocator` and overrides shard information fetching to return information based on historical assignments the allocator has done. The only exception is `BalanceConfigurationTests` which does test only the balancer and I opted to not have it work around the `GatewayAllocator` being in it's way.
2016-09-25 20:15:30 +02:00
Ali Beyad ac1b13dde7 Changes the API of GatewayAllocator#applyStartedShards and (#20642)
Changes the API of GatewayAllocator#applyStartedShards and 
GatewayAllocator#applyFailedShards to take both a RoutingAllocation
and a list of shards to apply. This allows better mock allocators
to be created as being done in #20637.

Closes #20642
2016-09-23 09:31:46 -04:00
Ali Beyad 029fc909b5 Removes FailedRerouteAllocation and StartedRerouteAllocation
Removes the FailedRerouteAllocation class and StartedRerouteAllocation
class, as they were just wrappers for RerouteAllocation that stored
started and failed shards, but these started and failed shards can
be passed in directly to the methods that needed them, removing the
need for this wrapper class and extra level of indirection.

Closes #20626
2016-09-23 09:02:36 -04:00
Simon Willnauer fe1803c957 Remove AnalysisService and reduce it to a simple name to analyzer mapping (#20627)
Today we hold on to all possible tokenizers, tokenfilters etc. when we create
an index service on a node. This was mainly done to allow the `_analyze` API to
directly access all these primitive. We fixed this in #19827 and can now get rid of
the AnalysisService entirely and replace it with a simple map like class. This
ensures we don't create a gazillion long living objects that are entirely useless since
they are never used in most of the indices. Also those objects might consume a considerable
amount of memory since they might load stopwords or synonyms etc.

Closes #19828
2016-09-23 08:53:50 +02:00
Yannick Welsch 191fadafcc Fix logger usage checker for Log4j 2
With the switch to Log4j 2 throughout our code base, the logger usage checker was temporarily disabled. This commit
adapts the checks to work with Log4j 2 and re-enables the Gradle checks.

Closes #20243
2016-09-21 14:44:14 +02:00
Simon Willnauer 0151974500 `_flush` should block by default (#20597)
This commit changes the default behavior of `_flush` to block if other flushes are ongoing.
This also removes the use of `FlushNotAllowedException` and instead simply return immediately
by skipping the flush. Users should be aware if they set this option that the flush might or might
not flush everything to disk ie. no transactional behavior of some sort.

Closes #20569
2016-09-21 14:20:24 +02:00
Tanguy Leroux 7645abaad9 Remove duplicate methods in ByteSizeValue (#20560)
This commit removes `ByteSizeValue`'s methods that are duplicated (ex: `mbFrac()` and `getMbFrac()`) in order to only keep the `getN` form.
    
It also renames `mb()` -> `getMb()`, `kb()` -> `getKB()` in order to be more coherent with the `ByteSizeUnit` method names.
2016-09-20 14:07:23 +02:00
Ali Beyad 50584c4103 Merge pull request #20532 from rjernst/rolling_upgrades
This PR introduces backward compatibility index tests to test the rolling upgrade process amongst Elasticsearch instances within the same major version. The test executes in three phases. In the first phase, we form a cluster of 2 ES instances on an old version. In the second phase, we keep one of the nodes from the old cluster, kill the other node, but preserve its data directory and start an instance of the current version of ES using the same data directory as the killed instance. In the third phase, we kill the other old version ES instance from the first phase and launch a new instance, using the same data directory as the killed instance. Therefore, during phase 3, we have fully migrated and have all current versions of ES running. In each phase, we run REST tests that index documents and search them, ensuring at each stage that the documents from the previous phase are still there.

Note that because we haven't released a GA yet of 5.0, the tests currently don't start an old version cluster in the first phase. Once GA is released, this will be changed to make the backward compatibility version 5.0, while the current version in the cluster will be 5.x.
2016-09-19 16:14:38 -04:00
Simon Willnauer ee8d14798f Unguice Transport and friends (#20526)
This change removes all guice interaction from Transport, HttpServerTransport,
HttpServer and TransportService. All these classes as well as their subclasses
or extended version configured via plugins are now created by using plain old
bloody java constructors. YAY!
2016-09-19 22:10:47 +02:00
Boaz Leskes 2ee9ab25d9 Remove `RoutingAllocation.Result` (#20538)
Currently all the reroute-like methods of `AllocationService` return a result object of type `RoutingAllocation.Result`. The result object contains the new `RoutingTable` and `MetaData` plus an indication whether those were changed. The caller is then responsible of updating a cluster state with these. These means that things can easily go wrong and one can take one of these but not the other causing inconsistencies. We already have a utility method on the `ClusterState` builder that does but no one forces you to do so. Also 99% of the callers do the same thing: i.e., check if the result was changed and if so update the very same cluster state that was passed to `AllocationService`.  This PR folds this pattern into `AllocationService` and changes almost all it's methods to return a new cluster state (potentially the original one).  This saves some 500 lines of code.

The one exception here is the reroute API which executes allocation commands and potentially returns an explanation as well (next to the routing table and metadata). That API now returns a `CommandsResult` object which encapsulate a cluster state and the explanation.
2016-09-19 13:54:35 +02:00
Ali Beyad 98230d035a Adds a preserveIndicesUponCompletion method to ESRestTestCase
that can be overridden by subclasses if the test must not
delete indices it created after exiting.
2016-09-16 19:21:26 -04:00
Ali Beyad ce86ed1fdd Merge remote-tracking branch 'upstream/master' into rolling_upgrades 2016-09-16 10:43:38 -04:00
Simon Willnauer f5daa165f1 Remove ability to plug-in TransportService (#20505)
TransportService is such a central part of the core server, replacing
it's implementation is risky and can cause serious issues. This change removes the ability to
plug in TransportService but allows registering a TransportInterceptor that enables
plugins to intercept requests on both the sender and the receiver ends. This is a commonly used
and overwritten functionality but encapsulates the custom code in a contained manner.
2016-09-16 09:47:53 +02:00
Boaz Leskes 577dcb3237 Add current cluster state version to zen pings and use them in master election (#20384)
During a networking partition, cluster states updates (like mapping changes or shard assignments)
are committed if a majority of the masters node received the update correctly. This means that the current master has access to enough nodes in the cluster to continue to operate correctly. When the network partition heals, the isolated nodes catch up with the current state and get the changes they couldn't receive before. However, if a second partition happens while the cluster
is still recovering from the previous one *and* the old master is put in the minority side, it may be that a new master is elected which did not yet catch up. If that happens, cluster state updates can be lost.

This commit fixed 95% of this rare problem by adding the current cluster state version to `PingResponse` and use them when deciding which master to join (and thus casting the node's vote). 

Note: this doesn't fully mitigate the problem as a cluster state update which is issued concurrently with a network partition can be lost if the partition prevents the commit message (part of the two phased commit of cluster state updates) from reaching any single node in the majority side *and* the partition does allow for the master to acknowledge the change. We are working on a more comprehensive fix but that requires considerate work  and is targeted at 6.0.
2016-09-15 23:39:11 +02:00
Nik Everett d0be96df7b Clean up snapshots after each REST test
The only repository we can be sure is safe to clean is `fs` so we clean
any snapshots in those repositories after each test. Other repositories
like url and azure tend to throw exceptions rather than let us fetch
their contents during the REST test. So we clean what we can....

Closes #18159
2016-09-15 14:49:11 -04:00
Boaz Leskes 8469c98e34 Fix LongGCDisruption to be aware of log4j2 (#20348)
LongGCDisruption simulates a Long GC by suspending all threads belonging to a node. That's fine, unless those threads hold shared locks that can prevent other nodes from running. Concretely the logging infrastructure, which is shared between the nodes, can cause some deadlocks. LongGCDisruption has protection for this, but it needs to be updated to point at log4j2 classes, introduced in #20235

This commit also fixes improper handling of retry logic in LongGCDisruption and adds a protection against deadlocking the test code which activates the disruption (and uses logging too! :)).

On top of that we have some new, evil and nasty tests.
2016-09-15 08:50:18 +02:00
Ali Beyad 3f79874042 Prevent the rolling upgrades rest tests from cleaning up indices
after finishing if a the tests.rest.preserve_indices system property
is set
2016-09-14 23:34:19 -04:00
Simon Willnauer 17ddee7011 Remove TransportService#registerRequestHandler leniency (#20469)
`TransportService#registerRequestHandler` allowed to register
handlers more than once and issues an annoying warn log message when
this happens. This change simple throws an exception to prevent regsitering
the same handler more than once. This commit also removes the ability
to remove request handlers.

Relates to #20468
2016-09-14 20:32:29 +02:00
Luca Cavanna 14e17f44a1 Replace usage of LuceneTestCase#getBaseTempDirForTestClass (#20484)
LuceneTestCase#getBaseTempDirForTestClass is deprecated, we should not use it.

Closes #15845
2016-09-14 19:35:20 +02:00
Simon Willnauer 89640965d2 Unguice SearchModule (#20456)
After this change SearchModule doesn't subclass AbstractModule anymore and all wiring
happens in `Node.java`. As a side-effect several tests don't need a guice injector anymore.
2016-09-14 10:07:53 +02:00
Jason Tedor 7560101ec7 Complete Elasticsearch logger names
This commit modifies the logger names within Elasticsearch to be the
fully-qualified class name as opposed removing the org.elasticsearch
prefix and dropping the class name. This change separates the root
logger from the Elasticsearch loggers (they were equated from the
removal of the org.elasticsearch prefix) and enables log levels to be
set at the class level (instead of the package level).

Relates #20457
2016-09-13 22:46:54 -04:00
Jason Tedor fbe27664a6 Fix prefix logging
Today we add a prefix when logging within Elasticsearch. This prefix
contains the node name, and index and shard-level components if
appropriate.

Due to some implementation details with Log4j 2 , this does not work for
integration tests; instead what we see is the node name for the last
node to startup. The implementation detail here is that Log4j 2 there is
only one logger for a name, message factory pair, and the key derived
from the message factory is the class name of the message factory. So,
when the last node starts up and starts setting prefixes on its message
factories, it will impact the loggers for the other nodes.

Additionally, the prefixes are lost when logging an exception. This is
due to another implementation detail in Log4j 2. Namely, since we log
exceptions using a parameterized message, Log4j 2 decides that that
means that we do not want to use the message factory that we have
provided (the prefix message factory) and so logs the exception without
the prefix.

This commit fixes both of these issues.

Relates #20429
2016-09-13 14:46:34 -04:00
Nicholas Knize 1a60e1c3d2 Update docs for LatLonPoint cut over
This commit removes documentation for:

* geohash cell query
* lat_lon parameter
* geohash parameter
* geohash_precision parameter
* geohash_prefix parameter

It also updates failing tests that reference these parameters for backcompat.
2016-09-13 12:18:21 -05:00
Nicholas Knize ef926894f4 Cut over geo_point field and queries to new LatLonPoint type
This commit cuts over geo_point fields to use Lucene's new point-based LatLonPoint type for indexes created in 5.0. Indexes created prior to 5.0 continue to use their respective encoding type. Below is a description of the changes made to support the new encoding type:

* New indexes use a new LatLonPointFieldMapper which provides a parse method for the new type
* The new LatLonPoint parse method removes support for lat_lon and geohash parameters
* Backcompat testing for deprecated lat_lon and geohash parameters is added to all unit and integration tests
* LatLonPointFieldMapper provides DocValues support (enabled by default) which uses Lucene's new LatLonDocValuesField type
* New LatLonPoint field data classes are added for aggregation support (wraps LatLonPoint's Numeric Doc Values)
* MultiFields use the geohash as the string value instead of the lat,lon string making it easier to perform geo string queries on the geohash instead of a lat,lon comma delimited string.

Removed Features:

* With the removal of geohash indexing, GeoHashCellQuery support is removed for all new indexes (still supported on existing indexes)
* LatLonPoint does not support a Distance Range query because it is super inefficient. Instead, the geo_distance_range query should be accomplished using either the geo_distance aggregation, sorting by descending distance on a geo_distance query, or a boolean must not of the excluded distance (which is what the distance_range query did anyway).

TODO:

* fix/finish yaml changes for plugin and rest integration tests
* update documentation
2016-09-13 12:17:36 -05:00
Jason Tedor 013e3f6fcc Remove unused import from BootstrapForTesting
This commit removes an unused import for o.e.c.l.LogConfigurator from
o.e.b.BootstrapForTesting.
2016-09-13 09:49:15 -04:00
Tanguy Leroux 6090c51fc5 Add quiet option to disable console logging (#20422)
This commit adds a -q/--quiet option to Elasticsearch so that it does not log anything in the console and closes stdout & stderr streams. This is useful for SystemD to avoid duplicate logs in both journalctl and /var/log/elasticsearch/elasticsearch.log while still allows the JVM to print error messages in stdout/stderr if needed.

closes #17220
2016-09-13 14:08:24 +02:00
Lee Hinman 44278db1bc Merge pull request #20433 from dakrone/remove-cluster-name-folder-fallback
No longer allow cluster name in data path
2016-09-12 17:01:49 -05:00
Lee Hinman 94625d74e4 No longer allow cluster name in data path
In 5.x we allowed this with a deprecation warning. This removes the code
added for that deprecation, requiring the cluster name to not be in the
data path.

Resolves #20391
2016-09-12 15:47:01 -06:00
Simon Willnauer 686994ae2d Deguice SearchService and friends (#20423)
This change removes the guice dependency handling for SearchService and
several related classes like SearchTransportController and SearchPhaseController.
The latter two now have package private constructors and dependencies like FetchPhase
are now created by calling their constructors explicitly. This also cleans up several users
of the DefaultSearchContext and centralized it's creation inside SearchService.
2016-09-12 22:42:55 +02:00
Ali Beyad b1e87aa13c Split allocator decision making from decision application (#20347)
Splits the PrimaryShardAllocator and ReplicaShardAllocator's decision
making for a shard from the implementation of that decision on the
routing table. This is a step toward making it easier to use the same
logic for the cluster allocation explain APIs.
2016-09-12 16:21:39 -04:00
Boaz Leskes b08352047d Introduce IndexShardTestCase (#20411)
Introduce a base class for unit tests that are based on real `IndexShard`s. The base class takes care of all the little details needed to create and recover shards. 

This commit also moves `IndexShardTests` and `ESIndexLevelReplicationTestCase` to use the new base class. All tests in `IndexShardTests` that required a full node environment were moved to a new `IndexShardIT` suite.
2016-09-12 18:20:25 +02:00
Ali Beyad f39f9b9760 Update discovery nodes after cluster state is published (#20409)
Before, when there was a new cluster state to publish,
zen discovery would first update the set of nodes to
ping based on the new cluster state, then publish the new
cluster state. This is problematic because if the cluster
state failed to publish, then the set of nodes to ping
should not have been updated.

This commit fixes the issue by updating the set of
nodes to ping for fault detection only *after* the new
cluster state has been published.
2016-09-12 12:07:51 -04:00
Luca Cavanna 4b00cc37a1 Merge pull request #20382 from javanna/enhancement/cleanup_parse_elements
Cleanup sub fetch phase extension point
2016-09-09 22:47:15 +02:00
Tal Levy dda32545bb add ignore_missing option to relevant processors (#20194) 2016-09-09 12:20:18 -07:00
javanna 90ab460fcc move parsing of search ext sections to the coordinating node 2016-09-09 19:10:42 +02:00
javanna 65c7f61ad9 decouple registration of SearchExtParsers from sub fetch phases
Search section supports an ext section that is used to provide additional config needed from plugins. It is now tied to sub fetch phases because it is the only section that may need additional config, but there is no reason for the two to be tightly coupled.

It is now possible to register a searchExtParser independently from a sub fetch phase. All a search ext parser does is parsing some ext section of a search request, whose parsed resulting object is stored in the search context for later retrieval.
2016-09-09 18:05:49 +02:00
javanna f9530dfe8f remove FetchSubPhaseContext in favour of generic fetch sub phase builder of type object
The context was an object where the parsed info are stored. That is more of what we call the builder since after the search refactoring. No need for generics in FetchSubPhaseParser then. Also the previous setHitsExecutionNeeded wasn't useful, it can be removed as well, given that once there is a parsed ext section, it will become a builder that can be retrieved by the sub fetch phase. The sub fetch phase is responsible for doing nothing in case the builder is not set, meaning that the fetch sub phase is plugged in but the request didn't have the corresponding section.
2016-09-09 18:05:49 +02:00
javanna dc2ba90f48 clarify that SearchParseElement is only used for custom fetch sub phases and clean up extension point
SearchParseElement is renamed to FetchSubPhaseParser and moved to the search.fetch package. Its parse method doesn't get the SearchContext as argument anymore, only the XContentParser, and the return type is what gets parsed (the fetch sub phase context which we may as well rename later).

It is the parser that initializes the FetchSubPhaseContext then. SearchService retrieves the parser by name, calls parse against it and stores the result of parsing by name. No need for FetchSubPhase.ContextFactory anymore, which can be removed.
2016-09-09 18:05:49 +02:00
javanna a33ca70ff5 make docValueFields similar to other standard sub fetch phases
Given that doc value fields is our own fetch sub phase, it doesn't need to be implemented like if it was plugged in from the outside. It doesn't need its own fetch sub phase context, but it can just be an instance member in SearchContext
2016-09-09 18:05:49 +02:00
Jason Tedor d8475488b8 Disable console logging
Previously we would disable console logging in certain circumstances
(for example, if Elasticsearch is not in the foreground, or if
Elasticsearch is in the foreground but an exception was thrown during
bootstrap). This commit makes this handling work with Log4j 2. This will
prevent users from seeing double bootstrap check failure messages.

Relates #20387
2016-09-09 09:15:35 -04:00
Jason Tedor de43565abc Do not log full bootstrap checks exception
By default, when an exception causes the JVM to terminate, the stack
trace is printed. In the case of failing bootstrap checks, this stack
trace is useless to the user, and might even distract them from seeing
that the bootstrap checks failed for reasons under their control. With
this commit, we cause the stack trace for a failing bootstrap check to
be truncated.

We also modify some methods to not declare that they throw the top level
checked exception type Exception, but instead explicitly declare the
exceptions that they throw. These exceptions are caught and wrapped in a
BootstrapException so that we can percolate only two exception types out
of Bootstrap#init as checked exception, BootstrapException and
NodeValidationException.

Relates #19989
2016-09-08 10:56:11 -04:00
Tanguy Leroux 4fb7ac8254 Clean up XContentBuilder
This commit cleans most of the methods of XContentBuilder so that:
- Jackson's convenience methods are used instead of our custom ones (ie field(String,long) now uses Jackson's writeNumberField(String, long) instead of calling writeField(String) then writeNumber(long))
- null checks are added for all field names and values
- methods are grouped by type in the class source
- methods have the same parameters names
- duplicated methods like field(String, String...) and array(String, String...) are removed
- varargs methods now have the "array" name to reflect that it builds arrays
- unused methods like field(String,BigDecimal) are removed
- all methods now follow the execution path: field(String,?) -> field(String) then value(?), and value(?) -> writeSomething() method. Methods to build arrays also follow the same execution path.
2016-09-08 15:09:09 +02:00
Alexander Lin f825e8f4cb Exposing lucene 6.x minhash filter. (#20206)
Exposing lucene 6.x minhash tokenfilter

Generate min hash tokens from an incoming stream of tokens that can
be used to estimate document similarity.

Closes #20149
2016-09-07 09:38:12 +02:00
Simon Willnauer 11f2da5f14 Skip loading of jansi from log4j2 (#20334)
Jython shades `jansi` into it's classpath without changing it's package or
anything like that. This causes attempts to load native code on windows which
blows up tests. This change adds `log4j.skipJansi=true` system property to our
tests as well as to the JVM properties we set.
2016-09-06 05:53:00 -04:00
Simon Willnauer 5c2d9fa158 Improve error reporting for tests with BackgroundIndexer (#20324)
The BackgroundIndexer now uses auto-generated IDs randomly. This causes some problems
for tests that still rely on the fact that the IDs are increasing integers. This change
exposes all IDs via a Set<String> to iterate over for tests.
2016-09-05 16:28:49 +02:00
Nik Everett 549ca3178b Rename method in OldIndexUtils
loadIndexList -> loadDataFilesList. The new method name is more accurate.
2016-09-02 10:16:30 -04:00
javanna 7c03f65c36 [TEST] adjusted EsTestCase#randomPositiveLong 2016-09-02 10:23:49 +02:00
javanna 536d13ff11 ProcessInfo to implement Writeable rather than Streamable 2016-09-02 10:23:05 +02:00
Simon Willnauer 825b80f2a6 [TEST] fix possible NPE in ClientYamlTestExecutionContext 2016-09-02 10:07:58 +02:00
Jason Tedor 1e80adbfbe Configure test logging with Log4j 2
This commit configures test logging for Log4j 2. The default logger
configuration uses the console appender but at the error level, so most
tests are missing logging. Instead, this commit provides a configuration
for tests which is picked up from the classpath by Log4j 2 when it
initializes. However, this now means that we can no longer initialize
Log4j with a bare-bones configuration when tests run as doing so will
prevent Log4j 2 from attempting to configure logging via the
classpath. Consequently, we move this needed initialization (as
commented, to avoid a message about a status logger not being configured
when we are preparing to configure Log4j from properties files in the
config directory) to only run when we are explicitly configuring Log4j
from properties files.

Relates #20284
2016-09-01 14:00:47 -04:00
Simon Willnauer a0becd26b1 Optimize indexing for the autogenerated ID append-only case (#20211)
If elasticsearch controls the ID values as well as the documents
version we can optimize the code that adds / appends the documents
to the index. Essentially we an skip the version lookup for all
documents unless the same document is delivered more than once.

On the lucene level we can simply call IndexWriter#addDocument instead
of #updateDocument but on the Engine level we need to ensure that we deoptimize
the case once we see the same document more than once.

This is done as follows:

1. Mark every request with a timestamp. This is done once on the first node that
receives a request and is fixed for this request. This can be even the
machine local time (see why later). The important part is that retry
requests will have the same value as the original one.

2. In the engine we make sure we keep the highest seen time stamp of "retry" requests.
This is updated while the retry request has its doc id lock. Call this `maxUnsafeAutoIdTimestamp`

3. When the engine runs an "optimized" request comes, it compares it's timestamp with the
current `maxUnsafeAutoIdTimestamp` (but doesn't update it). If the the request
timestamp is higher it is safe to execute it as optimized (no retry request with the same
timestamp has been run before). If not we fall back to "non-optimzed" mode and run the request as a retry one
and update the `maxUnsafeAutoIdTimestamp` unless it's been updated already to a higher value

Relates to #19813
2016-09-01 10:39:40 +02:00
Simon Willnauer 419627c460 Ensure ESTestCase is initialized before we run tests 2016-09-01 09:39:44 +02:00
Jason Tedor 76ab02e002 Merge branch 'master' into log4j2
* master:
  Avoid NPE in LoggingListener
  Randomly use Netty 3 plugin in some tests
  Skip smoke test client on JDK 9
  Revert "Don't allow XContentBuilder#writeValue(TimeValue)"
  [docs] Remove coming in 2.0.0
  Don't allow XContentBuilder#writeValue(TimeValue)
  [doc] Remove leftover from CONSOLE conversion
  Parameter improvements to Cluster Health API wait for shards (#20223)
  Add 2.4.0 to packaging tests list
  Docs: clarify scale is applied at origin+offest (#20242)
2016-08-31 16:37:55 -04:00
Stian Lindhom c2eddaf2c9 Avoid NPE in LoggingListener
This commit avoids an NPE that could arise when implementing an
ESTestCase for test classes placed in the default package.

Relates #20269
2016-08-31 16:11:12 -04:00
Ali Beyad 4641254ea6 Parameter improvements to Cluster Health API wait for shards (#20223)
* Params improvements to Cluster Health API wait for shards

Previously, the cluster health API used a strictly numeric value
for `wait_for_active_shards`. However, with the introduction of
ActiveShardCount and the removal of write consistency level for
replication operations, `wait_for_active_shards` is used for
write operations to represent values for ActiveShardCount. This
commit moves the cluster health API's usage of `wait_for_active_shards`
to be consistent with its usage in the write operation APIs.

This commit also changes `wait_for_relocating_shards` from a
numeric value to a simple boolean value `wait_for_no_relocating_shards`
to set whether the cluster health operation should wait for
all relocating shards to complete relocation.

* Addresses code review comments

* Don't be lenient if `wait_for_relocating_shards` is set
2016-08-31 11:58:19 -04:00
Jason Tedor e166459bbe Merge branch 'master' into log4j2
* master:
  Increase visibility of deprecation logger
  Skip transport client plugin installed on JDK 9
  Explicitly disable Netty key set replacement
  percolator: Fail indexing percolator queries containing either a has_child or has_parent query.
  Make it possible for Ingest Processors to access AnalysisRegistry
  Allow RestClient to send array-based headers
  Silence rest util tests until the bogusness can be simplified
  Remove unknown HttpContext-based test as it fails unpredictably on different JVMs
  Tests: Improve rest suite names and generated test names for docs tests
  Add support for a RestClient base path
2016-08-31 10:59:27 -04:00
Jason Tedor 21dbc5ba84 Add empty test to ESLoggerUsageTests
This commit adds an empty test to ESLoggerUsageTests to avoid the test
suite from failing for having no tests after the existing tests were
marked as awaits fix in 1d197eddcc.
2016-08-31 04:41:07 -04:00
Jason Tedor 1d197eddcc Mark ESLoggerUsageTests as awaits fix
This commit marks the ESLoggerUsageTests as awaits fix as these tests
are not currently compatible with the Log4j 2 API.
2016-08-30 22:30:14 -04:00
Jason Tedor 4e69ac0272 Add link to open logger usage issue
This commit adds comments linking to an open issue regarding updating
the logger usage check for the Log4j 2 API.
2016-08-30 21:13:17 -04:00
Jason Tedor abf8a1a3f0 Avoid allocating log parameterized messages
This commit modifies the call sites that allocate a parameterized
message to use a supplier so that allocations are avoided unless the log
level is fine enough to emit the corresponding log message.
2016-08-30 18:17:09 -04:00
Ryan Ernst 2a7a187bf8 Silence rest util tests until the bogusness can be simplified 2016-08-30 14:58:44 -07:00
Ryan Ernst e19f2b6348 Tests: Improve rest suite names and generated test names for docs tests
Rest test suites are currently only the directory above the yaml test
file. That is confusing when there are more than one directory level
which contain yaml tests, as there are in generated docs tests. This
change makes rest tests use the full relative path to the rest test root
as the suite name, and also makes the test names for docs tests a little
clearer (that they are testing an example from a specific line number,
instead of just the line number as an opaque test name).
2016-08-30 13:55:44 -07:00
Jason Tedor 7da0cdec42 Introduce Log4j 2
This commit introduces Log4j 2 to the stack.
2016-08-30 13:31:24 -04:00
Jason Tedor 5a8f2d7fb3 Disable logger usage checks
This commit disables the logger usage checks as they will not be
compatible with Log4j 2. This disabling is temporary, they will return.
2016-08-30 13:28:07 -04:00
javanna 61145bfb2f [TEST] minor cleanups to AbstractQueryTestCase
Removed null check for token, if we are outside the null it already means it is null.

Fixed typo in comment and remove leftover assignment to unused local variable.
2016-08-29 16:52:11 +02:00
Yannick Welsch f070c8727b [TEST] Add additional logging to testStaleMasterNotHijackingMajority
This test is periodically failing. As I suspect that the GCDisruption scheme is somehow making the wrong node block on
its cluster state update thread, I've added some more logging and a thread dump once the given assertion triggers
again.
2016-08-29 13:42:13 +02:00
Yannick Welsch 1b75cb63a2 Add recovery source to ShardRouting (#19516)
Adds an explicit recoverySource field to ShardRouting that characterizes the type of recovery to perform:

- fresh empty shard copy
- existing local shard copy
- recover from peer (primary)
- recover from snapshot
- recover from other local shards on same node (shrink index action)
2016-08-27 16:11:10 +02:00
Tanguy Leroux 68b943dc53 Fix MoreLikeThisQueryBuilderTests.testUnknownObjectException()
Objects hierarchy must be tracked when entering/leaving an object so that it better knows if the "newField" has been inserted into an arbitrary holding object.

Can be reproduced with gradle :core:test -Dtests.seed=760F8BD0F7E46D45 -Dtests.class=org.elasticsearch.index.query.MoreLikeThisQueryBuilderTests -Dtests.method="testUnknownObjectException" -Dtests.security.manager=true -Dtests.locale=ko -Dtests.timezone=Etc/Zulu
2016-08-25 20:54:06 +02:00
Tanguy Leroux fbcfddbb77 Fix AbstractQueryTestCase.testUnknownObjectException()
When need to check the whole hierarchy of objects to know if the newly inserted "newField" object is part of an arbitrary holding object or not.

Reproduced with `gradle :modules:percolator:test -Dtests.seed=736B0B67DA7A3632 -Dtests.class=org.elasticsearch.percolator.PercolateQueryBuilderTests -Dtests.method="testUnknownObjectException" -Dtests.security.manager=true -Dtests.locale=es-ES -Dtests.timezone=ART`
2016-08-25 16:24:22 +02:00
Michael McCandless 1fe3e36934 Merge pull request #20147 from mikemccand/lucene_620_upgrade
Upgrade to Lucene 6.2.0
2016-08-25 06:03:34 -04:00
Tanguy Leroux 20719f9b2f Improve AbstractQueryTestCase#unknownObjectExceptionTest()
This method fails when a randomized string value contains a double-quote. This commit changes the method so that it is not based on string concatenation anymore. It now use XContentGenerator & XContentParser to mutate the valid queries.

Related #19864
2016-08-25 10:57:30 +02:00
Mike McCandless 5eb66e3378 Mark Scandinavian analysis components as multi term aware 2016-08-24 19:50:25 -04:00
Mike McCandless 7492300544 Remove now unused Store.renameFile, and obsolete commented out code 2016-08-24 18:20:30 -04:00
Mike McCandless 0ccfe69789 Upgrade to Lucene 6.2.0 2016-08-24 17:26:28 -04:00
Jim Ferenczi 4682fc34ae Add the ability to disable the retrieval of the stored fields entirely
This change adds a special field named _none_ that allows to disable the retrieval of the stored fields in a search request or in a TopHitsAggregation.

To completely disable stored fields retrieval (including disabling metadata fields retrieval such as _id or _type) use _none_ like this:

````
POST _search
{
   "stored_fields": "_none_"
}
````
2016-08-24 16:40:08 +02:00
Nicholas Knize 28ed0e7abf Deprecate optimize_bbox on geodistance queries
Deprecates the optimize_bbox parameter on geodistance queries. This has no longer been needed since version 2.2 because lucene geo distance queries (postings and LatLonPoint) already optimize by bounding box.
2016-08-23 09:14:54 -05:00
Yannick Welsch 771668f380 Use routingResult method to update cluster state after reroute
This ensures that the routing table as well as the metadata (with the primary terms and in-sync allocation ids) is updated.
2016-08-19 17:15:02 +02:00
Ryan Ernst 8c60455ed6 Fix checkstyle line length violations in allocation tests 2016-08-17 16:28:31 -07:00
Ryan Ernst 1ff348ed7f Plugins: Make custom allocation deciders use pull based extensions
This change converts AllocationDecider registration from push based on
ClusterModule to implementing with a new ClusterPlugin interface.
AllocationDecider instances are allowed to use only Settings and
ClusterSettings.
2016-08-17 15:55:31 -07:00
Ryan Ernst 2ea50bc162 Merge pull request #20018 from rjernst/split_disk_threshold
Internal: Split disk threshold monitoring from decider
2016-08-17 07:57:50 -07:00
Yannick Welsch 27a760f9c1 Add routing changes API to RoutingAllocation (#19992)
Adds a class that records changes made to RoutingAllocation, so that at the end of the allocation round other values can be more easily derived based on these changes. Most notably, it:

- replaces the explicit boolean flag that is passed around everywhere to denote changes to the routing table. The boolean flag is automatically updated now when changes actually occur, preventing issues where it got out of sync with actual changes to the routing table.
- records actual changes made to RoutingNodes so that primary term and in-sync allocation ids, which are part of index metadata, can be efficiently updated just by looking at the shards that were actually changed.
2016-08-17 10:46:59 +02:00
Ryan Ernst b2c0f2d08f Internal: Split disk threshold monitoring from decider
In addition to be an allocation decider, DiskThresholdDecider also
monitors the used disk in order to trigger a reroute when the thresholds
are crossed. This change splits out the settings for disk thresholds
into DiskThresholdSettings, and moves the monitoring to a new
DiskThresholdMonitor.  DiskThresholdDecider is then in line with other
allocation deciders, needing only Settings and ClusterSettings for
construction, which will allow deguicing allocation deciders.
2016-08-17 00:22:16 -07:00
Lee Hinman 1825d8060c Merge remote-tracking branch 'dakrone/lockobtainfailed-replacement' 2016-08-16 14:41:27 -06:00
Lee Hinman 1de3388fa3 Switching LockObtainFailedException over to ShardLockObtainFailedException
`LobObtainFailedException` should be reserved for on-disk locks that
Lucene attempts (like `write.lock`). This switches our in-memory
semaphore locks for shards to use a different exception. Additionally,
ShardLockObtainFailedException no longer subclasses IOException, since
no IO is being done is this case.

Resolves #19978
2016-08-16 14:37:36 -06:00
Nik Everett 46bf8baf2e Switch aggregation registration for push to pull
Adds `getAggregations` to `SearchPlugin` which can be used to register
aggregations.

Fixup MockNode which wasn't createing MockBigArrays.
2016-08-16 09:08:36 -04:00
Nik Everett cf6e1a4362 Move all FetchSubPhases to `o.e.search.fetch.subphase`
As the most complicated `FetchSubPhase` highlighting gets its own package
(`o.e.seach.fetch.subphase.highlight`. No other `FetchSubPhase`s get their
own package. Instead they all reside together in `o.e.search.fetch.subphase`.

Add package descriptions to `o.e.search.fetch` and subpackages.
2016-08-12 18:21:15 -04:00
Jason Tedor 1f0673c9bd Default max local storage nodes to one
This commit defaults the max local storage nodes to one. The motivation
for this change is that a default value greather than one is dangerous
as users sometimes end up unknowingly starting a second node and start
thinking that they have encountered data loss.

Relates #19964
2016-08-12 09:26:20 -04:00
Nik Everett 9f8f2ea54b Remove ESIntegTestCase#pluginList
It was a useful method in 1.7 when javac's type inference wasn't as
good, but now we can just replace it with `Arrays.asList`.
2016-08-11 15:44:02 -04:00
Yannick Welsch 522b137097 Make NetworkPartition disruption scheme configurable (#19534)
This commit separates the description of the links in the network that are to be disrupted from the failure that is to be applied to the links (disconnect/unresponsive/delay). Previously we had subclasses for the various kind of network disruption schemes combining on one hand failure mode (disconnect/unresponsive/delay) as well as the network links to cut (two partitions / bridge partitioning) into a single class.
2016-08-11 14:55:06 +02:00
Adrien Grand 0d6ac57acf Collapse o.e.index.mapper packages. #19921
I also reduced the visibility of a couple classes and renamed/consolidated some
test classes for consistency, eg. removing the `Simple` prefix or using the
`<Type>FieldMapperTests` convention for testing field mappers.
2016-08-10 17:51:11 +02:00
javanna 7d4a6499e1 [TEST] add inline comments to AbstractQueryTestCase#unknownObjectExceptionTest 2016-08-10 12:21:25 +02:00
javanna 8391e6de37 [TEST] enable testUnknownObjectException for alternate query versions too 2016-08-10 12:21:25 +02:00
javanna 0a98b5e56e [TEST] make AbstractQueryTestCase#testUnknownObjectException more accurate
testUnknownObjectException used to generate malformed json objects in some cases, due to the existence of arrays as it was not closing the injected object correctly. That is why the test was catching JsonParseException among the exception that are expected to be thrown. That is fixed by tracking where the new object is placed and placing its end object marker to the right level rather than always at the end.

Also introduced a mechanism to explicitly declare objects that won't cause any exception when they get additional objects injected, so that there is no need to override the method anymore as that caused copy pasting of the whole test method. This also makes sure that changes are reflected in tests, as those inner objects are not skipped but we actually check that what is declared is true (no exceptions get thrown when an additional object is added within them.
2016-08-10 11:48:51 +02:00
Lee Hinman 5849c488b5 Merge remote-tracking branch 'dakrone/compliation-breaker' 2016-08-09 11:57:26 -06:00
Lee Hinman 2be52eff09 Circuit break the number of inline scripts compiled per minute
When compiling many dynamically changing scripts, parameterized
scripts (<https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-using.html#prefer-params>)
should be preferred. This enforces a limit to the number of scripts that
can be compiled within a minute. A new dynamic setting is added -
`script.max_compilations_per_minute`, which defaults to 15.

If more dynamic scripts are sent, a user will get the following
exception:

```json
{
  "error" : {
    "root_cause" : [
      {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "i",
        "node" : "a5V1eXcZRYiIk8lecjZ4Jw",
        "reason" : {
          "type" : "general_script_exception",
          "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
          "caused_by" : {
            "type" : "circuit_breaking_exception",
            "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
            "bytes_wanted" : 0,
            "bytes_limit" : 0
          }
        }
      }
    ],
    "caused_by" : {
      "type" : "general_script_exception",
      "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
      "caused_by" : {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    }
  },
  "status" : 500
}
```

This also fixes a bug in `ScriptService` where requests being executed
concurrently on a single node could cause a script to be compiled
multiple times (many in the case of a powerful node with many shards)
due to no synchronization between checking the cache and compiling the
script. There is now synchronization so that a script being compiled
will only be compiled once regardless of the number of concurrent
searches on a node.

Relates to #19396
2016-08-09 10:26:27 -06:00
javanna 329eaaea65 [TEST] expand AbstractQueryTestCase#testQueryWrappedInArray to run against query alternate versions 2016-08-08 19:09:43 +02:00
javanna 2437226802 [TEST] restore tests repeatability in AbstractQueryTestCase
Some random operations were conditionally performed in the before test, which made tests not repeatable. For instance take the seed chain to repeat a specific iteration and try to reproduce it, this conditional code would get executed in both cases when trying to isolate the failure, but not among the different iterations (as only the first method/iteration executes it), hence the failure will not reproduce.

Moved the random operations to beforeClass and left the non random part in the before method, which is needed as it depends on some method that can be overridden by subclasses.
2016-08-05 22:38:31 +02:00
Luca Cavanna 4c1a3b9a53 Merge pull request #19791 from javanna/fix/multiple_fields_queries
Query parsers to throw exception when multiple field names are provided
2016-08-05 15:53:35 +02:00
Ali Beyad f59ca9083b Snapshot repository cleans up empty index folders (#19751)
This commit cleans up indices in a snapshot repository when all
snapshots containing the index are all deleted. Previously, empty
indices folders would lay around after all snapshots containing
them were deleted.
2016-08-05 09:39:02 -04:00
javanna 7f0bd56094 [TEST] use expectThrows wherever possible in query builder unit tests 2016-08-05 13:55:18 +02:00
Nik Everett 1e587406d8 Fail yaml tests and docs snippets that get unexpected warnings
Adds `warnings` syntax to the yaml test that allows you to expect
a `Warning` header that looks like:
```
    - do:
        warnings:
            - '[index] is deprecated'
            - quotes are not required because yaml
            - but this argument is always a list, never a single string
            - no matter how many warnings you expect
        get:
            index:    test
            type:    test
            id:        1
```

These are accessible from the docs with:
```
// TEST[warning:some warning]
```

This should help to force you to update the docs if you deprecate
something. You *must* add the warnings marker to the docs or the build
will fail. While you are there you *should* update the docs to add
deprecation warnings visible in the rendered results.
2016-08-04 15:23:05 -04:00
Daniel Mitterdorfer 4598c36027 Fix various concurrency issues in transport (#19675)
Due to various issues (most notably a missing happens-before edge
between socket accept and channel close in MockTcpTransport),
MockTcpTransportTests sometimes did not terminate.

With this commit we fix various concurrency issues that led to
this hanging test.

Failing example build: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-os-compatibility/os=oraclelinux/835/console
2016-08-04 21:00:59 +02:00
javanna cd9388ce66 [TEST] parse query alternate versions in strict mode
AbstractQueryTestCase parses the main version of the query in strict mode, meaning that it will fail if any deprecated syntax is used. It should do the same for alternate versions (e.g. short versions). This is the way it is because the two alternate versions for ids query are both deprecated. Moved testing for those to a specific test method that isolates the deprecations and actually tests that the two are deprecated.
2016-08-04 19:49:43 +02:00
javanna 146f02183d [TEST] remove unused methods and fix some warnings in AbstractQueryTestCase
Also fix line length issues
2016-08-04 10:06:25 +02:00
Luca Cavanna c5a9427293 Merge pull request #19750 from javanna/fix/npe_parse_field_array
Throw ParsingException if a query is wrapped in an array
2016-08-03 18:21:39 +02:00
javanna 4805250ecf Throw ParsingException if a query is wrapped in an array
Our parsing code accepted up until now queries in the following form (note that the query starts with `[`:

```
{
    "bool" : [
        {
          "must" : []
        }
    ]
}
```

This would lead to a null pointer exception as most parsers assume that the field name ("must" in this example) is the first thing that can be found in a query if its json is valid, hence always non null while parsing. Truth is that the additional array layer doesn't make the json invalid, hence the following code fragment would cause NPE within ParseField, because null gets passed to `parseContext.isDeprecatedSetting`:

```
if (token == XContentParser.Token.FIELD_NAME) {
    currentFieldName = parser.currentName();
} else if (parseContext.isDeprecatedSetting(currentFieldName)) {
    // skip
} else if (token == XContentParser.Token.START_OBJECT) {
```

We could add null checks in each of our parsers in lots of places, but we rely on `currentFieldName` being non null in all of our parsers, and we should consider it a bug when these unexpected situations are not caught explicitly. It would be best to find a way to prevent such queries altogether without changing all of our parsers.

The reason why such a query goes through is that we've been allowing a query to start with either `[` or `{`. The only reason I found is that we accept `match_all : []`. This seems like an undocumented corner case that we could drop support for. Then we can be stricter and accept only `{` as start token of a query. That way the only next token that the parser can encounter if the json is valid (otherwise the json parser would barf earlier) is actually a field_name, hence the assumption that all our parser makes hold.

The downside of this is simply dropping support for `match_all : []`

Relates to #12887
2016-08-03 17:05:14 +02:00
Nik Everett ca8f666c66 Add line number to yaml test failures
Old:
```
   > Throwable #1: java.lang.AssertionError: expected [2xx] status code but api [reindex] returned [400 Bad Request] [{"error":{"root_cause":[{"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25}],"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25,"caused_by":{"type":"illegal_argument_exception","reason":"[dest] unknown field [asdfadf], parser not found"}},"status":400}]
   >    at __randomizedtesting.SeedInfo.seed([9325F8C5C6F227DD:1B71C71F680E4A25]:0)
   >    at org.elasticsearch.test.rest.yaml.section.DoSection.execute(DoSection.java:119)
   >    at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:309)
   >    at java.lang.Thread.run(Thread.java:745)
```

New:
```
   > Throwable #1: java.lang.AssertionError: Failure at [reindex/10_basic:12]: expected [2xx] status code but api [reindex] returned [400 Bad Request] [{"error":{"root_cause":[{"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25}],"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25,"caused_by":{"type":"illegal_argument_exception","reason":"[dest] unknown field [asdfadf], parser not found"}},"status":400}]
   >    at __randomizedtesting.SeedInfo.seed([444DEEAF47322306:CC19D175E9CE4EFE]:0)
   >    at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.executeSection(ESClientYamlSuiteTestCase.java:329)
   >    at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:309)
   >    at java.lang.Thread.run(Thread.java:745)
   > Caused by: java.lang.AssertionError: expected [2xx] status code but api [reindex] returned [400 Bad Request] [{"error":{"root_cause":[{"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25}],"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25,"caused_by":{"type":"illegal_argument_exception","reason":"[dest] unknown field [asdfadf], parser not found"}},"status":400}]
   >    at org.elasticsearch.test.rest.yaml.section.DoSection.execute(DoSection.java:119)
   >    at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.executeSection(ESClientYamlSuiteTestCase.java:325)
   >    ... 37 more
```

Sorry for the longer stack trace, but I wanted to be sure I didn't throw
anything away by accident.
2016-08-03 10:59:57 -04:00
Britta Weber abcb4c8a97 [Test] move methods from bwc test to test package for use in plugins (#19738)
* [Test] move methods from bwc test to test package for use in other plugins
2016-08-03 11:41:46 +02:00
Ryan Ernst df8dc64e9b Plugins: Make NamedWriteableRegistry immutable and add extenion point for named writeables
Currently any code that wants to added NamedWriteables to the
NamedWriteableRegistry can do so via guice injection of the registry,
and registering at construction time. However, this makes the registry
complex: it has both get and register methods synchronized, and there is
likely contention on the read side from multiple threads.  The
registration has mostly already been contained to guice modules at node
construction time.

This change makes the registry immutable, taking all of the
NamedWriteable readers at construction time. It also allows plugins to
added arbitrary named writables that it may use in its own transport
actions.
2016-08-02 15:56:25 -07:00
Ali Beyad c4ae23f5d8 Enables implementations of the BlobContainer interface to (#19749)
conform with the requirements of the writeBlob method by
throwing a FileAlreadyExistsException if attempting to write
to a blob that already exists. This change means implementations
of BlobContainer should never overwrite blobs - to overwrite a
blob, it must first be deleted and then can be written again.

Closes #15579
2016-08-02 09:48:21 -04:00
Ali Beyad 456ea56527 Cleans up the BlobContainer interface by removing the (#19727)
writeBlob method takes a BytesReference in favor of just
the writeBlob method that takes an InputStream.

Closes #18528
2016-08-02 09:21:43 -04:00
Ali Beyad 25d8eca62d Removes the notion of write consistency level across all APIs in
favor of waiting for active shard copy count (wait_for_active_shards).
2016-08-01 13:35:29 -04:00
Ali Beyad 9f88a8194a Merge pull request #19706 from elastic/enhancement/snapshot-blob-handling
More resilient blob handling in snapshot repositories
2016-08-01 12:03:53 -04:00
Tanguy Leroux 386902903e [TEST] Kill remaining lang-groovy messy tests
After #13834 many tests that used Groovy scripts (for good or bad reason) in their tests have been moved in the lang-groovy module and the issue #13837 has been created to track these messy tests in order to clean them up.

The work started with #19280, #19302 and #19336 and this PR moves the remaining messy tests back in core, removes the dependency on Groovy, changes the scripts in order to use the mocked script engine, and change the tests to integration tests.

It also moves IndexLookupIT test back (even if it has good chance to be removed soon) and fixes its tests.

It also changes AbstractQueryTestCase to use custom script plugins in tests.

closes #13837
2016-08-01 16:59:47 +02:00
Alexander Lin 9ac6389e43 Rename operation to result and reworking responses
* Rename operation to result and reworking responses
* Rename DocWriteResponse.Operation enum to DocWriteResponse.Result

These are just easier to interpret names.

Closes #19664
2016-08-01 10:42:58 -04:00
Alexander Lin 119026b4fb Remove isCreated and isFound from the Java API
This is cleanup work from #19566, where @nik9000 suggested trying to nuke the isCreated and isFound methods. I've combined nuking the two methods with removing UpdateHelper.Operation in favor of DocWriteResponse.Operation here.

Closes #19631.
2016-07-29 14:21:43 -04:00
Nik Everett 2e7336dc10 Add package-info to o.e.test.rest
This removes two packages, consolidating them into their parent package
and adds `package-info.java` files to describe all of the packages under
`org.elasticsearch.test.rest`.
2016-07-28 16:07:44 -04:00
David Pilato 0d2ccf0989 Merge branch 'pr/15724-gce-network-host-master' 2016-07-28 16:59:18 +02:00
Nik Everett fb45f6a8a8 Add authentication to reindex-from-remote
The tests for authentication extend ESIntegTestCase and use a mock
authentication plugin. This way the clients don't have to worry about
running it. Sadly, that means we don't really have good coverage on the
REST portion of the authentication.

This also adds ElasticsearchStatusException, and exception on which you
can set an explicit status. The nice thing about it is that you can
set the RestStatus that it returns to whatever arbitrary status you like
based on the status that comes back from the remote system.
reindex-from-remote then uses it to wrap all remote failures, preserving
the status from the remote Elasticsearch or whatever proxy is between us
and the remove Elasticsearch.
2016-07-27 14:17:41 -04:00
David Pilato e9339a1960 Merge branch 'master' into pr/15724-gce-network-host-master 2016-07-27 11:24:53 +02:00
Boaz Leskes 6f76740a58 await fix testConcurrentSendRespondAndDisconnect 2016-07-26 23:42:10 +02:00
Nik Everett 9270e8b22b Rename client yaml test infrastructure
This makes it obvious that these tests are for running the client yaml
suites. Now that there are other ways of running tests using the REST
client against a running cluster we can't go on calling the shared
client yaml tests "REST tests". They are rest tests, but they aren't
**the** rest tests.
2016-07-26 13:53:44 -04:00
David Pilato 0d3edee928 Merge branch 'master' into pr/15724-gce-network-host-master 2016-07-26 18:51:01 +02:00
David Pilato fde15ae470 Move custom name resolvers to NetworkService CTOR
Instead of using NetworkModule we can directly inject them in NetworkService CTOR.

See https://github.com/elastic/elasticsearch/pull/15765#issuecomment-235307974
2016-07-26 18:26:30 +02:00
Boaz Leskes fabfd425f0 remove socket timeout from MockTcpTransport
added in b208a7dbae
2016-07-26 18:04:05 +02:00
Boaz Leskes dbdb6341a5 increase logging information in testConcurrentSendRespondAndDisconnect 2016-07-26 18:02:22 +02:00
Daniel Mitterdorfer b208a7dbae Add socket timeout in MockTcpTransport
With this commit we set an explicit socket timeout in
MockTcpTransport to avoid hanging tests in case of disconnections.
2016-07-26 16:04:51 +02:00
Nik Everett a95d4f4ee7 Add Location header and improve REST testing
This adds a header that looks like `Location: /test/test/1` to the
response for the index/create/update API. The requirement for the header
comes from https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

https://tools.ietf.org/html/rfc7231#section-7.1.2 claims that relative
URIs are OK. So we use an absolute path which should resolve to the
appropriate location.

Closes #19079

This makes large changes to our rest test infrastructure, allowing us
to write junit tests that test a running cluster via the rest client.
It does this by splitting ESRestTestCase into two classes:
* ESRestTestCase is the superclass of all tests that use the rest client
to interact with a running cluster.
* ESClientYamlSuiteTestCase is the superclass of all tests that use the
rest client to run the yaml tests. These tests are shared across all
official clients, thus the `ClientYamlSuite` part of the name.
2016-07-25 17:02:40 -04:00
Boaz Leskes b90dff7292 increase log level to debug in testConcurrentSendRespondAndDisconnect 2016-07-25 22:01:09 +02:00
Ali Beyad 2f831c3abb BytesArray tests fix: offsets don't matter on a zero bytes array
Closes #19582
2016-07-25 15:22:08 -04:00
Tanguy Leroux f745c96949 Clean up more messy tests
After #13834 many tests that used Groovy scripts (for good or bad reason) in their tests have been moved in the lang-groovy module and the issue #13837 has been created to track these messy tests in order to clean them up.

This commit moves more tests back in core, removes the dependency on Groovy, changes the scripts in order to use the mocked script engine, and change the tests  to integration tests.
2016-07-25 17:02:49 +02:00
Boaz Leskes cd596772ee Persistent Node Names (#19456)
With #19140 we started persisting the node ID across node restarts. Now that we have a "stable" anchor, we can use it to generate a stable default node name and make it easier to track nodes over a restarts. Sadly, this means we will not have those random fun Marvel characters but we feel this is the right tradeoff.

On the implementation side, this requires a bit of juggling because we now need to read the node id from disk before we can log as the node node is part of each log message. The PR move the initialization of NodeEnvironment as high up in the starting sequence as possible, with only one logging message before it to indicate we are initializing. Things look now like this:

```
[2016-07-15 19:38:39,742][INFO ][node                     ] [_unset_] initializing ...
[2016-07-15 19:38:39,826][INFO ][node                     ] [aAmiW40] node name set to [aAmiW40] by default. set the [node.name] settings to change it
[2016-07-15 19:38:39,829][INFO ][env                      ] [aAmiW40] using [1] data paths, mounts [[ /(/dev/disk1)]], net usable_space [5.5gb], net total_space [232.6gb], spins? [unknown], types [hfs]
[2016-07-15 19:38:39,830][INFO ][env                      ] [aAmiW40] heap size [1.9gb], compressed ordinary object pointers [true]
[2016-07-15 19:38:39,837][INFO ][node                     ] [aAmiW40] version[5.0.0-alpha5-SNAPSHOT], pid[46048], build[473d3c0/2016-07-15T17:38:06.771Z], OS[Mac OS X/10.11.5/x86_64], JVM[Oracle Corporation/Java HotSpot(TM) 64-Bit Server VM/1.8.0_51/25.51-b03]
[2016-07-15 19:38:40,980][INFO ][plugins                  ] [aAmiW40] modules [percolator, lang-mustache, lang-painless, reindex, aggs-matrix-stats, lang-expression, ingest-common, lang-groovy, transport-netty], plugins []
[2016-07-15 19:38:43,218][INFO ][node                     ] [aAmiW40] initialized
```

Needless to say, settings `node.name` explicitly still works as before.

The commit also contains some clean ups to the relationship between Environment, Settings and Plugins. The previous code suggested the path related settings could be changed after the initial Environment was changed. This did not have any effect as the security manager already locked things down.
2016-07-23 22:46:48 +02:00
Jason Tedor 2d1b0587dd Introduce Netty 4
This commit adds transport-netty4, a transport and HTTP implementation
based on Netty 4.

Relates #19526
2016-07-22 22:26:35 -04:00
Ali Beyad a0a4d67eae All snapshot metadata files use UUID for the blob ID 2016-07-22 13:52:13 -04:00
gfyoung d98fd36dad Added deleteBlob IOException test 2016-07-22 13:48:45 -04:00
javanna db8beeba3b Merge branch 'master' into feature/async_rest_client 2016-07-22 15:51:03 +02:00
Boaz Leskes bd574d92ae Verify lower level transport exceptions don't bubble up on disconnects (#19518)
#19096 introduced a generic TCPTransport base class so we can have multiple TCP based transport implementation. These implementations can vary in how they respond internally to situations where we concurrently send, receive and handle disconnects and can have different exceptions. However, disconnects are important events for the rest of the code base and should be distinguished from other errors (for example, it signals TransportMasterAction that it needs to retry and wait for the a (new) master to come back).  Therefore, we should make sure that all the implementations do the proper translation from their internal exceptions into ConnectTransportException which is used externally. 

Similarly we should make sure that the transport implementation properly recognize errors that were caused by a disconnect as such and deal with them correctly. This was, for example, the source of a build failure at https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-intake/1080 , where a concurrency issue cause SocketException to bubble out of MockTcpTransport.

This PR adds a tests which concurrently simulates connects, disconnects, sending and receiving and makes sure the above holds. It also fixes anything (not much!) that was found it.
2016-07-22 14:35:47 +02:00
Tal Levy f7cd86ef6d rethrow script compilation exceptions into ingest configuration exceptions (#19318)
* rethrow script compilation exceptions into ingest configuration exceptions
* update readProcessor to rethrow any exception as an ElasticsearchException
2016-07-20 10:37:56 -07:00
javanna a9b5c5adbe restore throws IOException clause on all performRequest sync methods
We throw IOException, which is the exception that is going to be thrown in 99% of the cases. A more generic exception can happen, and if it is a runtime one we just let it bubble up as is, otherwise we wrap it into runtime one so that we don't require to catch Exception everywhere, which seems odd.

Also adjusted javadocs for all performRequest methods
2016-07-19 15:18:05 +02:00
javanna 1bb33cf572 Remove RestClient#JSON_CONTENT_TYPE constant, already available in ContentType class 2016-07-19 15:17:12 +02:00
javanna e742d65e02 [TEST] Make sure the last response body is always available in our REST tests
With the introduction of the async client, ResponseException doesn't eagerly read the response body anymore into a string. That is better, but raised a problem in our REST tests infra: we were reading the response body twice, while it can only be consumed once. Introduced a RestTestResponseException that wraps a ResponseException and exposes the body which now gets read only once.
2016-07-19 15:16:45 +02:00
javanna 41e97a7cb1 RestClient: take builder out to its own class
The RestClient class is getting bigger and bigger, its builder can definitely be taken out to its own top level class: RestClientBuilder
2016-07-19 15:16:45 +02:00
javanna 1fbec71243 Rest client: introduce async performRequest method and use async client under the hood for sync requests too
The new method accepts the usual parameters (method, endpoint, params, entity and headers) plus a response listener and an async response consumer. Shortcut methods are also added that don't require params, entity and the async response consumer optional.

There are a few relevant api changes as a consequence of the move to async client that affect sync methods:
- Response doesn't implement Closeable anymore, responses don't need to be closed
- performRequest throws Exception rather than just IOException, as that is the the exception that we get from the FutureCallback#failed method in the async http client
- ssl configuration is a bit simpler, one only needs to call setSSLStrategy from a custom HttpClientConfigCallback, that doesn't end up overridng any other default around connection pooling (it used to happen with the sync client and make ssl configuration more complex)

Relates to #19055
2016-07-19 15:15:58 +02:00
Nik Everett a2a7ea1f17 Make ExtendedBounds immutable
We used to mutate it as part of building the aggregation. That
caused assertVersionSerializable to fail because it assumes that
requests aren't mutated after they are sent.

Closes #19481
2016-07-19 08:48:14 -04:00
Simon Willnauer 8394544548 Add a dedicated client/transport project for transport-client (#19435)
The `client/transport` project adds a new jar build project that
pulls in all dependencies and configures all required modules.

Preinstalled modules are:
 * transport-netty
 * lang-mustache
 * reindex
 * percolator

The `TransportClient` classes are still in core
while `TransportClient.Builder` has only a protected construcutor
such that users are redirected to use the new `TransportClientBuilder`
from the new jar.

Closes #19412
2016-07-18 15:42:24 +02:00
Martijn van Groningen e0ebf5da1c Template cleanup:
* Removed `Template` class and unified script & template parsing logic. Templates are scripts, so they should be defined as a script. Unless there will be separate template infrastructure, templates should share as much code as possible with scripts.
* Removed ScriptParseException in favour for ElasticsearchParseException
* Moved TemplateQueryBuilder to lang-mustache module because this query is hard coded to work with mustache only
2016-07-18 10:16:01 +02:00
Ali Beyad 687e2e12b3 Merge pull request #19450 from elastic/feature/friendly-index-creation
Makes index creation more friendly
2016-07-15 11:48:21 -04:00
Ali Beyad d78f40fb1e Index creation waits for active shard copies before returning (#18985)
Before returning, index creation now waits for the configured number
of shard copies to be started. In the past, a client would create an
index and then potentially have to check the cluster health to wait
to execute write operations. With the cluster health semantics changing
so that index creation does not cause the cluster health to go RED,
this change enables waiting for the desired number of active shards
to be active before returning from index creation.

Relates #9126
2016-07-15 11:19:27 -04:00
Martijn van Groningen d0069f0fbb Provide access to ThreadContext in ingest plugins
Also introduced a `Processor.Parameters` class that is holder for several services processors rely on,
the  IngestPlugin#getProcessors(...) method has been changed to accept `Processor.Parameters` instead
of each service seperately.
2016-07-15 08:16:15 +02:00
Jason Tedor 31c648eee8 Rename transport-netty to transport-netty3
This commit renames the Netty 3 transport module from transport-netty to
transport-netty3. This is to make room for a Netty 4 transport module,
transport-netty4.

Relates #19439
2016-07-14 22:03:14 -04:00
Jason Tedor 575fa4e00a Fix line-length in o/e/t/r/s/Features.java
This commit fixes a line-length checkstyle violation in
o/e/t/r/s/Features.java.
2016-07-14 18:10:35 -04:00
Honza Král e21b1e8066 [TEST] add 'yaml' feature for the test runner (#19436)
Also renamed 30_yaml.yaml to 30_json.yaml since it tests json, not yaml
2016-07-14 17:30:32 +02:00
Simon Willnauer 5616251f22 Remove `node.mode` and `node.local` settings (#19428)
Today `node.mode` and `node.local` serve almost the same purpose, they
are a shortcut for `discovery.type` and `transport.type`. If `node.local: true`
or `node.mode: local` is set elasticsearch will start in _local_ mode which means
only nodes within the same JVM are discovered and a non-network based transport
is used. The _local_ mode it only really used in tests or if nodes are embedded.
For both, embedding and tests explicit configuration via `discovery.type` and `transport.type`
should be preferred.

This change removes all the usage of these settings and by-default doesn't
configure a default transport implemenation since netty is now a module. Yet, to make
the user expericence flawless, plugins or modules can set a `http.type.default` and
`transport.type.default`. Plugins set this via `PluginService#additionalSettings()`
which enforces _set-once_ which prevents node startup if set multiple times. This means
that our distributions will just startup with netty transport since it's packaged as a
module unless `transport.type` or `http.transport.type` is explicitly set.

This change also found a bunch of bugs since several NamedWriteables were not registered if a
transport client is used. Now that we don't rely on the `node.mode` leniency which is inherited
instead of using explicit settings, `TransportClient` uses `AssertingLocalTransport` which detects these problems since it serializes all messages.

Closes #16234
2016-07-14 13:21:10 +02:00
Simon Willnauer 29fd0f1bd8 [TEST] Remove wrong transportName from MockTcpTransport#ctor 2016-07-13 12:50:52 +02:00
Simon Willnauer 067ca1f996 [TEST] Use a semaphore to block unitl all in-flight requests are released 2016-07-13 10:31:05 +02:00
Simon Willnauer 814c7224f9 Merge pull request #19392 from elastic/modularize_netty
This moves all netty related code into modules/transport-netty the module is build as a zip file as well as a JAR to serve as a dependency for transport client. For the time being this is required otherwise we have no network based impl. for transport client users. This might be subject to change given that we move forward http client.
2016-07-13 09:52:03 +02:00
Simon Willnauer eba69ffade [TEST] First decrement in-flight requests before releasing the latch 2016-07-12 22:58:03 +02:00
Simon Willnauer ec55f9fff7 [TEST] Make AbstractSimpleTransportTestCase#testTimeoutSendExceptionWithDelayedResponse more robust and wait for in-flight request 2016-07-12 20:41:37 +02:00
Simon Willnauer 4fb79707bd Fix remaining tests that either need access to the netty module or require explict configuration
Some tests still start http implicitly or miss configuring the transport clients correctly.
This commit fixes all remaining tests and adds a depdenceny to `transport-netty` from
`qa/smoke-test-http` and `modules/reindex` since they need an http server running on the nodes.

This also moves all required permissions for netty into it's module and out of core.
2016-07-12 16:29:57 +02:00
Luca Cavanna f6aec3fdb5 Merge pull request #19373 from javanna/enhancement/rest_client_builder_callback
Rest Client: add callback to customize http client settings
2016-07-12 13:30:27 +02:00
javanna 512b8be791 RestClient: simplify ssl configuration and make http config callback functional friendly 2016-07-12 13:25:55 +02:00
Boaz Leskes 081d04afac Make NotMasterException a first class citizen (#19385)
That exception is currently serialized as its current base class IllegalStateException which confuses code supposed to deal with the stepping down of a master. This is an important exception and we should be able to serialize it correctly. This commit fixes it by moving the exception to inherit from ElasticsearchException and properly register it.

As a bonus I adapted CapturingTransport to properly simulate serialized exceptions.
2016-07-12 12:44:40 +02:00
javanna fa0b354e66 Rest Client: add callback to customize http client settings
The callback replaces the ability to fully replace the http client instance. By doing that, one used to lose any default that the RestClient had set for the underlying http client. Given that you'd usually override one or two things only, like a couple of timeout values, the ssl factory or the default credentials providers, it is not uder friendly if by doing that users end up replacing the whole http client instance and lose any default set by us.
2016-07-12 12:31:28 +02:00
Simon Willnauer 199a5a1f04 Fix TcpTransport#sendRequest to raise NotConnectedExcepiton if we get disconnected while sending
This also fixes a race in AbstractSimpleTransportTestCase where we never wait long enough
for all response to finish causing expected failures.
2016-07-12 10:56:20 +02:00
Ryan Ernst 93aebbef0f Merge branch 'master' into modularize_netty 2016-07-11 23:49:00 -07:00
Ryan Ernst 7195d1e0ff Fix plugins service to not double bind plugin components 2016-07-11 17:05:56 -07:00
Nik Everett 8263873783 Switch search extension from push to pull
Switches most search behavior extensions from push (`onModule(SearchModule)`)
to pull (`implements SearchPlugin`). This effort in general gives plugin
authors a much cleaner view of how to extend Elasticsearch and starts to
set up portions of Elasticsearch as "the plugin API". This commit in
particular does that for search-time behavior like customized suggesters,
highlighters, score functions, and significance heuristics.

It also switches most such customization to being done at search module
construction time which is much, much easier to reason about from a testing
perspective. It also helps significantly in the process of de-guice-ing
Elasticsearch's startup.

There are at least two major search time extensions that aren't covered in
this commit that will simply have to wait for the next commit on the topic
because this one has already grown large: custom aggregations and custom
queries. These will likely live in the same SearchPlugin interface as well.
2016-07-11 18:49:05 -04:00
Ryan Ernst 99ac65931a Plugins: Add components creator as bridge between guice and new plugin init world
This change adds a createComponents() method to Plugin implementations
which they can use to return already constructed componenents/services.
Eventually this should be just services ("components" don't really do
anything), but for now it allows any object so that preconstructed
instances by plugins can still be bound to guice. Over time we should
add basic services as arguments to this method, but for now I have left
it empty so as to not presume what is a necessary service.
2016-07-11 14:14:06 -07:00
Simon Willnauer 048e4416e7 Move netty transport and http into a module
This moves all netty code and it's dependency into a module.
2016-07-11 22:21:29 +02:00
Ali Beyad 0faf638710 Blocked allocations on primary causes RED health
If the allocation decision for a primary shard was NO, this should
cause the cluster health for the shard to go RED, even if the shard
belongs to a newly created index or is part of cluster recovery.

Relates #9126
2016-07-11 15:32:13 -04:00
Ali Beyad 417bd0cd63 Index creation does not cause the cluster health to go RED
Previously, index creation would momentarily cause the cluster health to
go RED, because the primaries were still being assigned and activated.
This commit ensures that when an index is created or an index is being
recovered during cluster recovery and it does not have any active
allocation ids, then the cluster health status will not go RED, but
instead be YELLOW.

Relates #9126
2016-07-11 15:30:47 -04:00
Simon Willnauer 47bd2f9ca5 More cleanups aroung tests that require HTTP to be enalbed. (#19363)
this commit moves the most of the http related integ tests out into it's own 
`qa/smoke-test-http` project where most of the test can run against the external cluster.
2016-07-11 20:44:57 +02:00
Nik Everett 4b171b84cb Fix modifier order
checkstyle
2016-07-11 12:59:45 -04:00
Christoph Büscher 0d428b6ba8 Add test for GeoHashUtils#bbox() 2016-07-11 10:46:31 -05:00
Simon Willnauer ee193f7697 [TEST] Catch RejectedOperationException when disconnecting from node in MockTcpTransport 2016-07-11 16:36:26 +02:00
Simon Willnauer 07260d4351 [TEST] Use AbstractRunnable when forking off threads on an executor 2016-07-11 16:27:07 +02:00
Simon Willnauer 3f3c93ec65 Add blocking socket based MockTcpTransport (#19332)
Today we have a bunch of tests that use netty transport for several reasons
these tests use it because they need to run some tcp based transport. Yet, this
couples our tests tightly to the netty implementation which should be tested on it's own.
This change adds a plain socket based blocking TcpTransport implementation that is used by
default in tests if local transport is suppressed or if network is selected.
It also adds another tcp network implementation as a showcase how the interface works.
2016-07-11 12:17:52 +02:00
javanna 942e342662 Rest Client: use short performRequest methods when possible 2016-07-11 10:36:26 +02:00
Jason Tedor e86aa29f67 Die with dignity
Today when a thread encounters a fatal unrecoverable error that
threatens the stability of the JVM, Elasticsearch marches on. This
includes out of memory errors, stack overflow errors and other errors
that leave the JVM in a questionable state. Instead, the Elasticsearch
JVM should die when these errors are encountered. This commit causes
this to be the case.

Relates #19272
2016-07-07 14:44:03 -04:00
Tanguy Leroux b58f2eb5c2 Move back some messy tests from Groovy plugin to core
This commit moves back some messy tests that have been placed in lang-groovy module in https://github.com/elastic/elasticsearch/pull/13834. It removes the dependency on Groovy plugin as well as change back the tests to integration tests (IT suffix).

It also changes the current MockScriptEngine and MockScriptPlugin to make it easier to use.
2016-07-07 15:26:36 +02:00
Alexander Reelsen 71b48fb16c Dependencies: Update to jopt-5.0 (#19278)
The new version of jopt allows us to remove a couple of TODOs in the code.

Closes #12368
2016-07-07 08:50:10 +02:00
Ryan Ernst e7818f75e1 Fix checkstyle for TestProcessor 2016-07-05 22:33:08 -07:00
Ryan Ernst 2fc41adeb5 Merge branch 'master' into ingest_plugin_api 2016-07-05 20:53:03 -07:00
Jason Tedor d0765d0761 Merge branch 'master' into feature/seq_no
* master: (192 commits)
  [TEST] Fix rare OBOE in AbstractBytesReferenceTestCase
  Reindex from remote
  Rename writeThrowable to writeException
  Start transport client round-robin randomly
  Reword Refresh API reference (#19270)
  Update fielddata.asciidoc
  Fix stored_fields message
  Add missing footer notes in mapper size docs
  Remote BucketStreams
  Add doc values support to the _size field in the mapper-size plugin
  Bump version to 5.0.0-alpha5.
  Update refresh.asciidoc
  Update shrink-index.asciidoc
  Change Debian repository for Vagrant debian-8 box
  [TEST] fix test to account for internal empyt reference optimization
  Upgrade to netty 3.10.6.Final (#19235)
  [TEST] fix histogram test when extended bounds overlaps data
  Remove redundant modifier
  Simplify TcpTransport interface by reducing send code to a single send method (#19223)
  Fix style violation in InstallPluginCommand.java
  ...
2016-07-05 22:01:07 -04:00
Nik Everett b3c015e2bb Reindex from remote
This adds a remote option to reindex that looks like

```
curl -POST 'localhost:9200/_reindex?pretty' -d'{
  "source": {
    "remote": {
      "host": "http://otherhost:9200"
    },
    "index": "target",
    "query": {
      "match": {
        "foo": "bar"
      }
    }
  },
  "dest": {
    "index": "target"
  }
}'
```

This reindex has all of the features of local reindex:
* Using queries to filter what is copied
* Retry on rejection
* Throttle/rethottle
The big advantage of this version is that it goes over the HTTP API
which can be made backwards compatible.

Some things are different:

The query field is sent directly to the other node rather than parsed
on the coordinating node. This should allow it to support constructs
that are invalid on the coordinating node but are valid on the target
node. Mostly, that means old syntax.
2016-07-05 16:13:17 -04:00
Jason Tedor 96f283c195 Rename writeThrowable to writeException
This commit renames writeThrowable to writeException. The situation here
stems from the fact that the StreamOutput method for serializing
Exceptions needs to accept Throwables too as Throwables can be the cause
of serialized Exceptions. Yet, we do not serialize Throwables in the
Error sub-hierarchy in a way that they can be deserialized into their
initial type. This leads to an asymmetry in the StreamOutput method for
serializing Exceptions and the StreamInput method for writing
Excpetions. Namely, the former will accept Throwables but the latter
will only return Exceptions. A goal with the stream methods has always
been symmetry in the method names so that serialization/deserialization
routines appear symmetrical in code. It is this asymmetry on the
input/output types for Exceptions on StreamOutput/StreamInput that
clashes with the desired symmetry of naming. Despite this, we should
favor symmetry in the naming of the methods. This commit renames
StreamOutput#writeThrowable to StreamOutput#writeException which leaves
us with Exception StreamInput#readException and void
StreamOutput#writeException(Throwable).
2016-07-05 14:37:01 -04:00
Boaz Leskes 6861d3571e Persistent Node Ids (#19140)
Node IDs are currently randomly generated during node startup. That means they change every time the node is restarted. While this doesn't matter for ES proper, it makes it hard for external services to track nodes. Another, more minor, side effect is that indexing the output of, say, the node stats API results in creating new fields due to node ID being used as keys.

The first approach I considered was to use the node's published address as the base for the id. We already [treat nodes with the same address as the same](https://github.com/elastic/elasticsearch/blob/master/core/src/main/java/org/elasticsearch/discovery/zen/NodeJoinController.java#L387) so this is a simple change (see [here](https://github.com/elastic/elasticsearch/compare/master...bleskes:node_persistent_id_based_on_address)). While this is simple and it works for probably most cases, it is not perfect. For example, if after a node restart, the node is not able to bind to the same port (because it's not yet freed by the OS), it will cause the node to still change identity. Also in environments where the host IP can change due to a host restart, identity will not be the same. 

Due to those limitation, I opted to go with a different approach where the node id will be persisted in the node's data folder. This has the upside of connecting the id to the nodes data. It also means that the host can be adapted in any way (replace network cards, attach storage to a new VM). I

It does however also have downsides - we now run the risk of two nodes having the same id, if someone copies clones a data folder from one node to another. To mitigate this I changed the semantics of the protection against multiple nodes with the same address to be stricter - it will now reject the incoming join if a node exists with the same id but a different address. Note that if the existing node doesn't respond to pings (i.e., it's not alive) it will be removed and the new node will be accepted when it tries another join.

Last, and most importantly, this change requires that *all* nodes persist data to disk. This is a change from current behavior where only data & master nodes store local files. This is the main reason for marking this PR as breaking.

Other less important notes:
- DummyTransportAddress is removed as we need a unique network address per node. Use `LocalTransportAddress.buildUnique()` instead.
- I renamed `node.add_lid_to_custom_path` to `node.add_lock_id_to_custom_path` to avoid confusion with the node ID which is now part of the `NodeEnvironment` logic.
- I removed the `version` paramater from `MetaDataStateFormat#write` , it wasn't really used and was just in the way :)
- TribeNodes are special in the sense that they do start multiple sub-nodes (previously known as client nodes). Those sub-nodes do not store local files but derive their ID from the parent node id, so they are generated consistently.
2016-07-04 21:09:25 +02:00
Tanguy Leroux 0e7faf1005 Enable Checkstyle RedundantModifier 2016-07-04 15:22:12 +02:00
Jason Tedor 3343ceeae4 Do not catch throwable
Today throughout the codebase, catch throwable is used with reckless
abandon. This is dangerous because the throwable could be a fatal
virtual machine error resulting from an internal error in the JVM, or an
out of memory error or a stack overflow error that leaves the virtual
machine in an unstable and unpredictable state. This commit removes
catch throwable from the codebase and removes the temptation to use it
by modifying listener APIs to receive instances of Exception instead of
the top-level Throwable.

Relates #19231
2016-07-04 08:41:06 -04:00
Ryan Ernst 5a66c08ae9 Merge branch 'master' into ingest_plugin_api 2016-07-01 16:27:52 -07:00
Ryan Ernst 822c995367 Internal: Remove generics from LifecycleComponent
The only reason for LifecycleComponent taking a generic type was so that
it could return that type on its start and stop methods. However, this
chaining has no practical necessity. Instead, start and stop can be
void, and a whole bunch of confusing generics disappear.
2016-07-01 16:17:42 -07:00
Ryan Ernst e5caadc4f3 Merge branch 'master' into ingest_plugin_api 2016-07-01 12:35:26 -07:00
Nik Everett f30a70c51f Fix comment
I forgot a word....
2016-07-01 14:48:08 -04:00
Nik Everett ff42d7cfc6 Add embedded stash key support to rest tests
This allowes embedding stash keys in string like `t${key}est`. This
allows simple string concatenation like acitons.

The test for this is in `ObjectPathTests` because `Stash` doesn't seem
to have a test on its own and it is simple enough to test embedded
stashes this way. And this is a way I expect them to be used eventually.
2016-07-01 14:11:11 -04:00
Ryan Ernst 65c9b0b588 Merge branch 'master' into ingest_plugin_api 2016-07-01 09:26:17 -07:00
Tanguy Leroux 8c40b2b54e Fix order of modifiers 2016-07-01 16:57:14 +02:00