Commit Graph

790 Commits

Author SHA1 Message Date
Simon Willnauer 0c0353fc7d [TEST] Add some testlogging 2016-12-16 14:25:17 +01:00
Masaru Hasegawa a0185c83a7 Merge pull request #21393 from masaruh/alias_boost
Resolve index names in indices_boost
2016-12-16 15:07:51 +09:00
Nik Everett 61597f2c20 Send error_trace by default when testing (#22195)
Sends the `error_trace` parameter with all requests sent by the
yaml test framework, including the doc snippet tests. This can be
overridden by settings `error_trace: false`. While this drift's
core's handling of the yaml tests from the client's slightly this
should only be a problem for tests that rely on the default value,
both of which I've fixed by setting the value explicitly.

This also escapes `\n` and `\t` in the `Stash dump on failure` so
the `stack_trace` is more readable.

Also fixes `RestUpdateSettingsAction` to not think of the `error_trace`
parameter as a setting.
2016-12-15 13:35:14 -05:00
Boaz Leskes b6cbcc49ba ClusterService should expose "applied" cluster states (i.e., remove ClusterStateStatus) (#21817)
`ClusterService` is responsible of updating the cluster state on every node (as a response to an API call on the master and when non-masters receive a new state from the master). When a new cluster state is processed, it is made visible via the `ClusterService#state` method and is sent to series of listeners. Those listeners come in two flavours - one is to change the state of the node in response to the new cluster state (call these cluster state appliers), the other is to start a secondary process. Examples for the later include an indexing operation waiting for a shard to be started or a master node action waiting for a master to be elected. 

The fact that we expose the state before applying it means that samplers of the cluster state had to worry about two things - working based on a stale CS and working based on a future, i.e., "being applied" CS. The `ClusterStateStatus` was used to allow distinguishing between the two. Working with a stale cluster state is not avoidable. How this PR changes things to make sure consumers don't need to worry about future CS, removing the need for the status and simplifying the waiting logic.

This change does come with a price as "cluster state appliers" can't sample the cluster state from `ClusterService` whenever they want as the cluster state isn't exposed yet. However, recent clean ups made this is situation easier and this PR takes the last steps to remove such sampling. This also helps clarify the "information flow" and helps component separation (and thus potential unit testing). It also adds an assertion that will trigger if the cluster state is sampled by such listeners. 

Note that there are still many "appliers" that could be made a simpler, unrestricted "listener" but this can be done in smaller bits in the future. The commit also makes it clear what the `appliers` and what the `listeners` are by using dedicated interfaces.

Also, since I had to change the listener types I went ahead and changed the data structure for temporary/timeout listeners (used for the observer) so addition and removal won't be an O(n) operation.
2016-12-15 17:06:25 +01:00
Simon Willnauer ef610636b6 Remove TCP handshake BWC from master (#22151)
Since #22094 has been back-ported to 5.2 we can remove all BWC layers from master since all supported version will handle handshake requests.

Relates to #22094
2016-12-15 12:47:01 +01:00
Simon Willnauer d27a12510b Handle race-condition when connection is closed before handshake listener was added
Today sending a message on a closed channel doesn't throw an exception. The channel
might just swallow the exception and informs the internal async exception handler
that a channel got disconnected. This change adds a safety check that we fail
the handshake if we registered a handler but the channel has been closed already
for instance due to a reset by peer.
2016-12-15 12:41:50 +01:00
Simon Willnauer 80d6539e9c Handle connection close / reset events gracefully during handshake (#22178)
Low level handshake code doesn't handle situations gracefully if the connection
is concurrently closed or reset by peer. This commit adds the relevant code to
fail the handshake if the connection is closed.
2016-12-14 23:04:14 +01:00
Boaz Leskes bf65a69bbf Enforce min master nodes in test cluster (#22065)
In order to start clusters with min master nodes set without setting `discovery.initial_state_timeout`, #21846 has changed the way we start nodes. Instead to the previous serial start up, we now always start the nodes in an async fashion (internally). This means that starting a cluster is unsafe without `min_master_nodes` being set. We should therefore make it mandatory.
2016-12-14 20:14:16 +01:00
Daniel Mitterdorfer 7e5058037b Enable strict duplicate checks for JSON content
With this commit we enable the Jackson feature 'STRICT_DUPLICATE_DETECTION'
by default. This ensures that JSON keys are always unique. While this has
a performance impact, benchmarking has indicated that the typical drop in
indexing throughput is around 1 - 2%.

As a last resort, we allow users to still disable strict duplicate checks
by setting `-Des.json.strict_duplicate_detection=false` which is
intentionally undocumented.

Closes #19614
2016-12-14 09:35:53 +01:00
Nik Everett 49bdd29f91 Consolidate more parser creation into ESTestCase
This will make it easier to add the forthcoming required argument,
`NamedXContentRegistry`.
2016-12-13 20:28:41 -05:00
Jason Tedor 510ad7b9c7 Add shutdown hook for closing CLI commands
This commit enables CLI commands to be closeable and installs a runtime
shutdown hook to ensure that if the JVM shuts down (as opposed to
aborting) the close method is called.

It is not enough to wrap uses of commands in main methods in
try-with-resources blocks as these will not run if, say, the virtual
machine is terminated in response to SIGINT, or system shutdown event.

Relates #22126
2016-12-13 19:10:11 -05:00
Nik Everett 872984d21a Continue consolidating `XContentParser` construction in tests (#22145)
Consolidate more parser creation in tests

Moves more parser creation in tests to the `createParser` methods
in `ESTestCase`.
2016-12-13 17:22:39 -05:00
Simon Willnauer 7a9b667e98 Introduce a low level protocol handshake (#22094)
Today we rely on the version that the API user passes in together with the DiscoveryNode. This commit introduces a low level handshake where nodes exchange their version to be used with the transport protocol that is executed every time a connection to a node is established. This, on the one hand allows to change the wire protocol based on the version we are talking to even without a full cluster restart. Today we would need to carry on a BWC layer across major versions but with a handshake we can rely on the fact that the latest version of the previous minor executes a handshake and uses the latest protocol version across all communication with the N+1 version nodes.

This change is yet fully backwards compatible, a followup PR will remove the BWC in 6.0 once this has been back-ported to the 5.x branch
2016-12-13 21:06:23 +01:00
Nik Everett ce86405394 Start to centralize creation of XContentParser in tests (#22096)
Starts to centralize creation of the `XContentParser` in
`protected final` methods on `ESTestCase`. The idea is to enable
adding `NamedXContentRegistry` relatively easily by giving tests
a single place they can override to define the
`NamedXContentRegistry`. Since `NamedXContentRegistry` doesn't
exist yet neither does the override point.

This doesn't attempt to migrate all the tests to calling the
new methods to build the parsers. I wanted to make this so we
could review the concept and then I'll merge a followup to
migrate the tests.
2016-12-13 11:22:15 -05:00
Simon Willnauer b667ff46c4 Allow plugins to install bootstrap checks (#22110)
Plugins also have the need to provide better OOTB experience by configuring
defaults unless the plugin is used in _production_ mode. This change exposes
the bootstrap check infrastructure as part of the plugin API to allow plugins
to specify / install their own bootstrap checks if necessary.
2016-12-12 17:35:00 +01:00
Luca Cavanna 6d987a9b69 Remove support for empty queries (#22092)
Our query DSL supports empty queries (`{}`), which have a different meaning depending on the query that holds it, either ignored, match_all or match_none. We deprecated the support for empty queries in 5.0, where we log a deprecation warning wherever they are used.

The way we supported it once we moved query parsing to the coordinating node was having an Optional<QueryBuilder> return type in all of our parse methods (called fromXContent). See #17624. The central place for this was QueryParseContext#parseInnerQueryBuilder. We can now remove all the optional return types and simply throw an exception whenever an empty query is found.
2016-12-12 12:37:12 +01:00
Masaru Hasegawa 3df2a086d4 Resolve index names in indices_boost
This change allows specifying alias/wildcard expression in indices_boost.
And added another format for specifying indices_boost. It accepts array of index name and boost pair.
If an index is included in multiple aliases/wildcard expressions, the first match will be used.
With new format, old format is marked as deprecated.

Closes #4756
2016-12-11 21:41:49 +09:00
Simon Willnauer 01d67e09b9 Detach handshake from connect to node (#22037)
Today we connect and publish the nodes connection before we execute a
handshake with the node we connect to. In the case of connecting to a node
that won't pass the handshake this connection is already `published` and other
code paths can use it. This commit detaches the connection and the publish of the
connection such that `TransportService` can do a handshake before actually connect
and publish the connection.
2016-12-10 10:03:26 +01:00
Ryan Ernst b1cef5fdf8 Remove 2.0 prerelease version constants (#22004)
* Remove 2.0 prerelease version constants

This is a start to addressing #21887. This removes:
* pre 2.0 snapshot format support
* automatic units addition to cluster settings
* bwc check for delete by query in pre 2.0 indexes
2016-12-08 21:48:35 -08:00
Nik Everett e9bb8d8b38 Don't allow yaml tests with `warnings` that don't skip `warnings` (#21989)
If you write a yaml test with a `warnings` section in a `do` block
that doesn't also have a corresponding `skip` section for `warnings`
then client test runners that don't support `warnings` will fail.
This causes the elasticsearch build to fail so we catch these errors
earlier.

Related to #21811
2016-12-08 13:17:31 -05:00
Ali Beyad e6e7bab58c Prepares allocator decision objects for use with the allocation explain API (#21691)
This commit enhances the allocator decision result objects (namely,
AllocateUnassignedDecision, MoveDecision, and RebalanceDecision)
to enable them to be used directly by the cluster allocation explain API. In
particular, this commit does the following:

- Adds serialization and toXContent methods to the response objects,
which will form the explain API responses.
- Moves the calculation of the final explanation to the response
object itself, removing it from the responsibility of the allocators.
- Adds shard store information to the NodeAllocationResult, so that
store information is available for each node, when explaining a
shard allocation by the PrimaryShardAllocator or the ReplicaShardAllocator.
- Removes RebalanceDecision in favor of using MoveDecision for both
moving and rebalancing shards.
- Removes NodeRebalanceResult in favor of using NodeAllocationResult.
- Changes the notion of weight ranking to be relative to the current node,
instead of an absolute weight that doesn't convey any added value to the
API user and can be confusing.
- Introduces a new enum AllocationDecision to convey the decision type,
which enables conveying unassigned, moving, and rebalancing scenarios
with more detail as opposed to just Decision.Type and AllocationStatus.
2016-12-07 17:37:51 -05:00
Adrien Grand c746854e03 Pre-built analysis factories do not implement MultiTermAware correctly. (#21981)
We had tests for the regular factories, but not for the pre-built ones, that
ship by default without requiring users to define them in the analysis settings.
2016-12-07 10:32:25 +01:00
Boaz Leskes 4519bdfeb0 InternalTestCluster shouldn't auto heal an active disruption when a new one is set
Instead people should explicitly clear the existing one so it's clear what's going on.
2016-12-06 19:58:11 +01:00
Boaz Leskes a7050b2d56 Remove `InternalTestCluster.startNode(s)Async` (#21846)
Since the removal of local discovery of #https://github.com/elastic/elasticsearch/pull/20960 we rely on minimum master nodes to be set in our test cluster. The settings is automatically managed by the cluster (by default) but current management doesn't work with concurrent single node async starting. On the other hand, with `MockZenPing` and the `discovery.initial_state_timeout` set to `0s` node starting and joining is very fast making async starting an unneeded complexity. Test that still need async starting could, in theory, still do so themselves via background threads.

Note that this change also removes the usage of `INITIAL_STATE_TIMEOUT_SETTINGS` as the starting of nodes is done concurrently (but building them is sequential)
2016-12-06 12:06:15 +01:00
Nik Everett 2087234d74 Timeout improvements for rest client and reindex (#21741)
Changes the default socket and connection timeouts for the rest
client from 10 seconds to the more generous 30 seconds.

Defaults reindex-from-remote to those timeouts and make the
timeouts configurable like so:
```
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://otherhost:9200",
      "socket_timeout": "1m",
      "connect_timeout": "10s"
    },
    "index": "source",
    "query": {
      "match": {
        "test": "data"
      }
    }
  },
  "dest": {
    "index": "dest"
  }
}
```

Closes #21707
2016-12-05 10:54:51 -05:00
Nik Everett 0c724b1878 Keep context during reindex's retries (#21941)
* Keep context during reindex's retries

This fixes reindex and friend's retries to keep the context.

* Docs
2016-12-02 13:48:51 -05:00
Tanguy Leroux fe95aef6a9 [TEST] Remove CompositeTestCluster and ExternalNode (#21933)
They are not used anymore. Related #21915
2016-12-02 13:25:40 +01:00
Simon Willnauer 20177f6eee [TEST] Add back ExternalTestCluster - downstream tests still use it 2016-12-02 10:54:27 +01:00
Simon Willnauer adf9bd90a4 Remove legacy BWC test infrastructure and tests (#21915)
We don't use the test infra nor do we run the tests. They might all be
entirely out of date. We also have a different BWC test infra in-place.
This change removes all of the legacy infra.
2016-12-02 08:06:20 +01:00
Simon Willnauer 6522538033 Add validation for supported index version on node join, restore, upgrade & open index (#21830)
Today we can easily join a cluster that holds an index we don't support since
we currently allow rolling upgrades from 5.x to 6.x. Along the same lines we don't check if we can support an index based on the nodes in the cluster when we open, restore or metadata-upgrade and index. This commit adds
additional safety that fails cluster state validation, open, restore and /or upgrade if there is an open index with an incompatible index version created in the cluster.

Realtes to #21670
2016-12-01 15:40:35 +01:00
Simon Willnauer 155de53fe3 Add a connect timeout to the ConnectionProfile to allow per node connect timeouts (#21847)
Timeouts are global today across all connections this commit allows to specify
a connection timeout per node such that depending on the context connections can
be established with different timeouts.

Relates to #19719
2016-12-01 15:39:49 +01:00
Boaz Leskes 087a85a4e7 always auto manage min master node in testTwoNodeCluster 2016-12-01 12:57:44 +01:00
Boaz Leskes 9097abee04 Add before and after logging for unit tests
Currently we have these logs for integration tests only.

This adds the following log at the start:
```
logger.info("[{}]: before test", getTestName());
```

and this is logged at the end, but before any clean up done in sub classes

```
 logger.info("[{}]: after test", getTestName());
```
2016-12-01 12:56:37 +01:00
Luca Cavanna 103984a4a1 Remove indices query (#21837)
The indices query is deprecated since 5.0.0 (#17710). It can now be removed in master (future 6.0 version).
2016-11-30 19:37:01 +01:00
Adrien Grand 34e682d3bc Prevent testing on double values whose toString may use the scientific notation.
This might break query parsers because the standard analyzer splits on
punctuation.
2016-11-30 16:48:46 +01:00
Adrien Grand 6231009a8f Remove 2.x backward compatibility of mappings. (#21670)
For the record, I also had to remove the geo-hash cell and geo-distance range
queries to make the code compile. These queries already throw an exception in
all cases with 5.x indices, so that does not hurt any more.

I also had to rename all 2.x bwc indices from `index-${version}` to
`unsupported-${version}` to make `OldIndexBackwardCompatibilityIT`
happy.
2016-11-30 13:34:46 +01:00
Boaz Leskes be4074e13d improve debug logging when node waits for initial cluster state
And enabled debug logging in InternalTestClusterTests so we can see it.
2016-11-29 20:38:19 +01:00
Nicholas Knize af1ab68b64 Add RangeFieldMapper for numeric and date range types
Lucene 6.2 added index and query support for numeric ranges. This commit adds a new RangeFieldMapper for indexing numeric (int, long, float, double) and date ranges and creating appropriate range and term queries. The design is similar to NumericFieldMapper in that it uses a RangeType enumerator for implementing the logic specific to each type. The following range types are supported by this field mapper: int_range, float_range, long_range, double_range, date_range.

Lucene does not provide a DocValue field specific to RangeField types so the RangeFieldMapper implements a CustomRangeDocValuesField for handling doc value support.

When executing a Range query over a Range field, the RangeQueryBuilder has been enhanced to accept a new relation parameter for defining the type of query as one of: WITHIN, CONTAINS, INTERSECTS. This provides support for finding all ranges that are related to a specific range in a desired way. As with other spatial queries, DISJOINT can be achieved as a MUST_NOT of an INTERSECTS query.
2016-11-29 10:10:14 -06:00
Simon Willnauer f5ff69fabe Remove connectToNodeLight and replace it with a connection profile (#21799)
The Transport#connectToNodeLight concepts is confusing and not very flexible.
neither really testable on a unittest level. This commit cleans up the code used
to connect to nodes and simplifies transport implementations to share more code.
This also allows to connect to nodes with custom profiles if needed, for instance
future improvements can be added to connect to/from nodes that are non-data nodes without
dedicated bulks and recovery connections.
2016-11-29 09:35:07 +01:00
Luca Cavanna 360b74eda8 [TEST] Don't reinitialize YamlTestClient and RestClient before each single test (#21807)
In the past we ran yaml tests against an internal cluster, which would get restarted after each test failure, hence the client objects needed to eventually be refreshed before each test. That is why we had the initClient method to re-initialize the YamlTestClient in the execution context. We ended up though re-initializing the client unconditionally, which is not needed.

Also, ESRestTestCase recreates the RestClient against the external cluster before each test, which is not needed given that nothing changes in the external cluster.

This commit removes the initClient method from the yaml tests execution context. The YamlTestClient can be eagerly created before the first yaml test runs and then re-used in subsequent tests. Also api calls to check for nodes versions etc. are moved out of YamlTestClient to ESClientYamlSuiteTestCase. Also the RestClient is now initialized in ESRestTestCase before the first test runs, and kept around afterwards as a static member.

Basically each subclass of EsRestTestCase will have its own RestClient instance, but the client will be shared across the different tests within the same class. The yaml test suite is just a special suite, composed of 600+ tests that are loaded from files, which will share the same client instance.

This change should speed tests up as well, as we don't recreate the RestClient before each single test, and we don't call _cat/nodes either before each single test.
2016-11-28 18:43:27 +01:00
Simon Willnauer b7292a6005 Remove TcpTransport#addressSupported since TransportAddress is now final
TransportAddress used to be customizable per transport but this has been removed
a while ago. Therefore we can remove all usage of this method as well.

Relates to #20695
2016-11-28 16:06:59 +01:00
Yannick Welsch 8390648709 Minor clean-ups in MockBigArrays (#21822)
Removes an unused static variable and an unused instance variable.
2016-11-28 14:09:26 +01:00
Yannick Welsch 7e198f0e41 Detect nodes being blocked by GC-disrupted node (#21797)
The disruption type LongGCDisruption simulates GCs on a node by suspending all the threads of that node. If the suspended threads are in a code section with shared JVM locks, however, it can prevent the other nodes from doing their thing. The class LongGCDisruption has a list of class names for which we know that this can occur. Whenever a test using the GC disruption type fails in mysterious ways, it becomes a long guessing game to find the offending class. This commit adds code to LongGCDisruption to automatically detect these situations, fail the test early and report the offending class and all relevant context.
2016-11-28 11:24:25 +01:00
Simon Willnauer 41e9ed13d6 [TEST] Fix AbstractBytesReferenceTestCase#testSlice to not assert on offset 2016-11-24 15:31:36 +01:00
Jason Tedor 8416b16dfd Improve handling of unreleased versions
Today when handling unreleased versions for backwards compatilibity
support, we scatted version constants across the code base and add some
asserts to support removing these constants when the version in question
is actually released. This commit improves this situation, enabling us
to just add a single unreleased version constant that can be renamed
when the version is actually released. This should make maintenance of
these versions simpler.

Relates #21760
2016-11-23 15:49:05 -05:00
Ryan Ernst 6940b2b8c7 Remove groovy scripting language (#21607)
* Scripting: Remove groovy scripting language

Groovy was deprecated in 5.0. This change removes it, along with the
legacy default language infrastructure in scripting.
2016-11-22 19:24:12 -08:00
Nik Everett 1791623700 Document `error_trace`
The `error_trace` parameter turns on the `stack_trace` field
in errors which returns stack traces.

Removes documentation for `camelCase` because it hasn't worked
in a while....

Documents the internal parameters used to render stack traces as
internal only.

Closes #21708
2016-11-22 19:16:07 -05:00
Simon Willnauer a9a2753f0b Add a HostFailureListener to notify client code if a node got disconnected (#21709)
Today there is no way to get notified if a node is disconnected. Client code
must poll the TransportClient constantly to detect that a node is not connected
anymore in order to react and add new nodes or notify altering etc. For instance
if a hostname  gets resolved to an IP but that host is disconnected clients want
to reconnect by resolving the hostname again which is a common situation in cloud
environments.

Closes #21424
2016-11-22 20:46:28 +01:00
Jason Tedor 9dc65037bc Lazy resolve unicast hosts
Today we eagerly resolve unicast hosts. This means that if DNS changes,
we will never find the host at the new address. Moreover, a single host
failng to resolve causes startup to abort. This commit introduces lazy
resolution of unicast hosts. If a DNS entry changes, there is an
opportunity for the host to be discovered. Note that under the Java
security manager, there is a default positive cache of infinity for
resolved hosts; this means that if a user does want to operate in an
environment where DNS can change, they must adjust
networkaddress.cache.ttl in their security policy. And if a host fails
to resolve, we warn log the hostname but continue pinging other
configured hosts.

When doing DNS resolutions for unicast hostnames, we wait until the DNS
lookups timeout. This appears to be forty-five seconds on modern JVMs,
and it is not configurable. If we do these serially, the cluster can be
blocked during ping for a lengthy period of time. This commit introduces
doing the DNS lookups in parallel, and adds a user-configurable timeout
for these lookups.

Relates #21630
2016-11-22 14:17:04 -05:00
Areek Zillur 0ccf8a742d Add support for merging custom meta data in tribe node (#21552)
* Add support for merging custom meta data in tribe node

Currently, when any underlying cluster has custom metadata
(via plugin), tribe node does not store custom meta data in its
cluster state. This is because the tribe node has no idea how to
select the appropriate custom metadata from one or many custom
metadata (corresponding to the number of underlying clusters).

This change adds an interface that custom metadata implementations
can extend to add support for merging mulitple custom metadata of
the same type for storing in the tribe state.

Relates to #20544
Supersedes #20791

* Simplify updating tribe state

* Add tests for merging multiple custom metadata types in tribe node

* cleanup merging custom md logic in tribe service
2016-11-21 12:03:01 -05:00