Commit Graph

7198 Commits

Author SHA1 Message Date
Nik Everett 3adefb7b4a Begin centralizing XContentParser creation into RestRequest (#22041)
To get #22003 in cleanly we need to centralize as much `XContentParser` creation as possible into `RestRequest`. That'll mean we have to plumb the `NamedXContentRegistry` into fewer places.

This removes `RestAction.hasBody`, `RestAction.guessBodyContentType`, and `RestActions.getRestContent`, moving callers over to `RestRequest.hasContentOrSourceParam`, `RestRequest.contentOrSourceParam`, and `RestRequest.contentOrSourceParamParser` and `RestRequest.withContentOrSourceParamParserOrNull`. The idea is to use `withContentOrSourceParamParserOrNull` if you need to handle requests without any sort of body content and to use `contentOrSourceParamParser` otherwise.

I believe the vast majority of this PR to be purely mechanical but I know I've made the following behavioral change (I'll add more if I think of more):
* If you make a request to an endpoint that requires a request body and has cut over to the new APIs instead of getting `Failed to derive xcontent` you'll get `Body required`.
* Template parsing is now non-strict by default. This is important because we need to be able to deprecate things without requests failing.
2016-12-09 20:23:02 -05:00
Nik Everett ddade1b5ac Improve the error message if task and node isn't found (#22062)
Improves the error message returned when looking up a task that
belongs to a node that is no longer part of the cluster. The new
error message tells the user that the node isn't part of the cluster.
This is useful because if you start a task and the node goes down
there isn't a record of the task at all. This hints to the user that
the task might have died with the node.

Relates to #22027
2016-12-09 15:50:46 -05:00
Igor Motov 93b5e55660 Restores the original default format of search slow log
In 5.0, the search slow log switched to the multi-line format with no option to get back to the origin single-line format that was used prior to 5.0 by default. This commit removes the reformat option from the search slow log and returns the search slow log back to the single-line format.

Closes #21711
2016-12-09 12:38:28 -05:00
Yannick Welsch b20b160a5e Allow flush/force_merge/upgrade on shard marked as relocated (#22078)
A shard that is locally marked as relocated, but where the relocation target shard has not been activated yet by the master, can still receive index operations, which in return can lead to flushes being triggered. Flushing is currently (wrongly) prohibited on shards marked as relocated, which makes the flushing process go into an endless retry loop and log warnings until the shard is closed. This commit fixes this situation by allowing flush, force_merge and upgrade operations to run on shards that are marked as relocated.
2016-12-09 17:56:40 +01:00
Nik Everett bcef1e7452 Better error message when _parent isn't an object (#21987)
If you make a mistake and specify a mapping like:
```
{
  "parent": {
    "properties": {}
  },
  "child": {
    "_parent": "parent",
    "properties": {}
  }
}
```

then the error message you get back amounts to
`Failed to parse mapping for [child]: can't cast a String to a Map`.
Since it doens't tell you *which* string can't be cast to a map you
have to dig through the stack trace to figure out what to fix. This
replaces the error message with:
```
Failed to parse mapping [child]: [_parent] must be an object containing [type]
```
so you can tell that the problem is with the `parent` field.
2016-12-09 11:33:31 -05:00
Yannick Welsch a724f4eb61 Don't update nodes list when stepping down as master (#22049)
This commit simplifies the node update logic so that nodes are never removed from the cluster state when the cluster state is not published.
2016-12-09 14:55:48 +01:00
Christoph Büscher 2592ff86ce Add fromXContent to InternalNestedIdentity
This adds a fromXContent method and unit test to InternalNestedIdentity so we can parse it as part of a search response. This is part of the preparation for parsing search responses on the client side.
2016-12-09 14:52:06 +01:00
Yannick Welsch db0660a7ea Reject external versioning and explicit version numbers on create (#21998)
Fixes an issue where indexing requests with operation type "create" auto-convert external versioning to internal versioning and silently ignore the version number instead of failing with an error message.
2016-12-09 14:21:22 +01:00
Michael McCandless 613a1a6a18 Add stored binary fields to static backwards compatibility indices tests (#22054)
Add stored binary fields to static backwards compatibility indices tests
2016-12-09 05:32:40 -05:00
Adrien Grand 6714e02bef Mute RestoreBackwardsCompatIT.testRestoreUnsupportedSnapshots. 2016-12-09 10:41:07 +01:00
Adrien Grand 787519ee4c Fix `other_bucket` on the `filters` agg to be enabled if a key is set. (#21994)
Closes #21951
2016-12-09 09:48:48 +01:00
Adrien Grand 1bdf4a2c5b Partition-based include-exclude does not implement equals/hashcode/serialization correctly. (#22051) 2016-12-09 09:48:16 +01:00
Adrien Grand 9524c81af9 Document the `locale` option of the `date` field. (#22050)
This also adds another level of protection against using the default locale.
Relates to https://discuss.elastic.co/t/mapping-for-12h-date-format/68433/3.
2016-12-09 09:45:53 +01:00
Adrien Grand 36f598138a Start using `ObjectParser` for aggs. (#22048)
This is an attempt to start moving aggs parsing to `ObjectParser`. There is
still A LOT to do, but ObjectParser is way better than the way aggregations
parsing works today. For instance in most cases, we reject numbers that are
provided as strings, which we are supposed to accept since some client languages
(looking at you Perl) cannot make sure to use the appropriate types.

Relates to #22009
2016-12-09 09:45:16 +01:00
Ryan Ernst b1cef5fdf8 Remove 2.0 prerelease version constants (#22004)
* Remove 2.0 prerelease version constants

This is a start to addressing #21887. This removes:
* pre 2.0 snapshot format support
* automatic units addition to cluster settings
* bwc check for delete by query in pre 2.0 indexes
2016-12-08 21:48:35 -08:00
Igor Motov 7f79c99e9a Add descriptions to bulk tasks
Related to #21768
2016-12-08 21:59:52 -05:00
Lee Hinman ef64d230e7 Merge remote-tracking branch 'dakrone/index-seq-id-and-primary-term' 2016-12-08 19:47:21 -07:00
Lee Hinman ee22a477df Add internal _primary_term doc values field, fix _seq_no indexing
This adds the `_primary_term` field internally to the mappings. This field is
populated with the current shard's primary term.

It is intended to be used for collision resolution when two document copies have
the same sequence id, therefore, doc_values for the field are stored but the
filed itself is not indexed.

This also fixes the `_seq_no` field so that doc_values are retrievable (they
were previously stored but irretrievable) and changes the `stats` implementation
to more efficiently use the points API to retrieve the min/max instead of
iterating on each doc_value value. Additionally, even though we intend to be
able to search on the field, it was previously not searchable. This commit makes
it searchable.

There is no user-visible `_primary_term` field. Instead, the fields are
updated by calling:

```java
index.parsedDoc().updateSeqID(seqNum, primaryTerm);
```

This includes example methods in `Versions` and `Engine` for retrieving the
sequence id values from the index (see `Engine.getSequenceID`) that are only
used in unit tests. These will be extended/replaced by actual implementations
once we make use of sequence numbers as a conflict resolution measure.

Relates to #10708
Supercedes #21480

P.S. As a side effect of this commit, `SlowCompositeReaderWrapper` cannot be
used for documents that contain `_seq_no` because it is a Point value and SCRW
cannot wrap documents with points, so the tests have been updated to loop
through the `LeafReaderContext`s now instead.
2016-12-08 19:47:03 -07:00
Jason Tedor c9882dd1a0 Avoid NPE in NodeService#stats if HTTP is disabled
This commit adds safety against an NPE if HTTP stats are requested but
HTTP is disabled on a node.

Relates #22060
2016-12-08 19:59:02 -05:00
Jason Tedor f713106827 Bump version to 5.1.2
This commit bumps the version to 5.1.2.

Relates #22057
2016-12-08 16:40:39 -05:00
Ali Beyad 3da04293f3 Cannot force allocate primary to a node where the shard already exists (#22031)
Before, it was possible that the SameShardAllocationDecider would allow
force allocation of an unassigned primary to the same node on which an
active replica is assigned.  This could only happen with shadow replica
indices, because when a shadow replica primary fails, the replica gets
promoted to primary but in the INITIALIZED state, not in the STARTED
state (because the engine has specific reinitialization that must take
place in the case of shadow replicas).  Therefore, if the now promoted
primary that is initializing fails also, the primary will be in the
unassigned state, because replica to primary promotion only happens when
the failed shard was in the started state.  The now unassigned primary
shard will go through the allocation deciders, where the
SameShardsAllocationDecider would return a NO decision, but would still
permit force allocation on the primary if all deciders returned NO.

This commit implements canForceAllocatePrimary on the
SameShardAllocationDecider, which ensures that a primary cannot be
force allocated to the same node on which an active replica already
exists.
2016-12-08 12:21:19 -05:00
Adrien Grand 182e119699 IP range masks exclude the maximum address of the range. (#22018)
Closes #22005
2016-12-08 15:58:32 +01:00
makeyang ce0ad4e08e add test case for parse include_in_all in mulit fields 2016-12-08 19:44:40 +08:00
Ali Beyad 30bcb06606 When shard data is still being fetched from nodes in the cluster,
the ReplicaShardAllocator, when in explain mode, would get the
node decisions for all nodes in the cluster.  The PrimaryShardAllocator
neglected to do this and tried to use the shard fetch data in explain
mode, which had not yet been fully fetched.  This commit fixes this by
ensuring the PrimaryShardAllocator gets node decisions in the same way
the ReplicaShardAllocator does in explain mode, if shard data is still
being fetched.
2016-12-07 22:21:09 -05:00
makeyang 46cdb411b5 modified code according to nik9000's comments. 2016-12-08 11:04:33 +08:00
Ali Beyad e6e7bab58c Prepares allocator decision objects for use with the allocation explain API (#21691)
This commit enhances the allocator decision result objects (namely,
AllocateUnassignedDecision, MoveDecision, and RebalanceDecision)
to enable them to be used directly by the cluster allocation explain API. In
particular, this commit does the following:

- Adds serialization and toXContent methods to the response objects,
which will form the explain API responses.
- Moves the calculation of the final explanation to the response
object itself, removing it from the responsibility of the allocators.
- Adds shard store information to the NodeAllocationResult, so that
store information is available for each node, when explaining a
shard allocation by the PrimaryShardAllocator or the ReplicaShardAllocator.
- Removes RebalanceDecision in favor of using MoveDecision for both
moving and rebalancing shards.
- Removes NodeRebalanceResult in favor of using NodeAllocationResult.
- Changes the notion of weight ranking to be relative to the current node,
instead of an absolute weight that doesn't convey any added value to the
API user and can be confusing.
- Introduces a new enum AllocationDecision to convey the decision type,
which enables conveying unassigned, moving, and rebalancing scenarios
with more detail as opposed to just Decision.Type and AllocationStatus.
2016-12-07 17:37:51 -05:00
Ali Beyad 05f64c550a [TEST] fixes line length issue in BulkRequestModifierTests 2016-12-07 13:11:55 -05:00
Ryan Ernst f02a2b6546 Ingest: Moved ingest invocation into index/bulk actions (#22015)
* Ingest: Moved ingest invocation into index/bulk actions

Ingest was originally setup as a plugin, and in order to hook into the
index and bulk actions, action filters were used. However, ingest was
later moved into core, but the action filters were never removed. This
change moves the execution of ingest into the index and bulk actions.

* Address PR comments

* Remove forwarder direct dependency on ClusterService
2016-12-07 08:43:26 -08:00
Christoph Büscher 7454a9647b Add fromXContent to HighlightField
This adds a fromXContent method and unit test to the HighlightField class so we
can parse it as part of a serch response. This is part of the preparation for
parsing search responses on the client side.
2016-12-07 16:32:44 +01:00
Yannick Welsch c87cc15d49 Add toString() for TransportReplicationAction.ConcreteShardRequest 2016-12-07 15:58:22 +01:00
Christoph Büscher 31a1c2e240 Remove redundant source setters from IndexRequestBuilder 2016-12-07 15:20:36 +01:00
Yannick Welsch 9630b1a6e7 Promote shadow replica to primary when initializing primary fails (#22021)
Failing an initializing primary when shadow replicas are enabled for the index can leave the primary unassigned with replicas being active. Instead, a replica should be promoted to primary, which is fixed by this commit.
2016-12-07 13:59:43 +01:00
Yannick Welsch 13e1a6fd40 Trim in-sync allocations set only when it grows (#21976)
This commit makes two changes to how the in-sync allocations set is updated:

- the set is only trimmed when it grows. This prevents trimming too eagerly when the number of replicas was decreased while shards were unassigned.
- the allocation id of an active primary that failed is only removed from the in-sync set if another replica gets promoted to primary. This prevents the situation where the only available shard copy in the cluster gets removed the in-sync set.

Closes #21719
2016-12-07 10:59:11 +01:00
Adrien Grand c746854e03 Pre-built analysis factories do not implement MultiTermAware correctly. (#21981)
We had tests for the regular factories, but not for the pre-built ones, that
ship by default without requiring users to define them in the analysis settings.
2016-12-07 10:32:25 +01:00
Adrien Grand 33b8d7a19d Expose `ip` fields as strings in scripts. (#21997)
Currently we expose the internal representation that we use for ip addresses,
which are the ipv6 bytes. However, this is not really usable, exposes internal
implementation details and also does not work fine with other APIs that expect
that the values can be `toString`'d.

Closes #21977
2016-12-07 10:32:11 +01:00
Boaz Leskes 4519bdfeb0 InternalTestCluster shouldn't auto heal an active disruption when a new one is set
Instead people should explicitly clear the existing one so it's clear what's going on.
2016-12-06 19:58:11 +01:00
shaie 6da44c8164 Fix _termvectors with preference to not hit NPE (#21959)
When you submit a _termvectors request for an artificial document and
specify the 'preference' parameter to send the request to a particular
shard, the request sometimes hits NPE. Fix this case by ignoring the
auto-generated artificial document ID and pick a shard per the
preference parameter, or a random shard.

This closes #21928
2016-12-06 17:29:09 +01:00
Jim Ferenczi b42ca6bcc9 Include unindexed field in FieldStats response (#21821)
* Include unindexed field in FieldStats response

This change adds non-searchable fields to the FieldStats response. These fields do not have min/max informations but they can be aggregatable. Fields that are only stored in _source (store:no, index:no, doc_values:no) will still be missing since they do not have any useful information to show. Indices and clients must be at least on V_5_2_0 to see this change.
2016-12-06 13:32:57 +01:00
Boaz Leskes a7050b2d56 Remove `InternalTestCluster.startNode(s)Async` (#21846)
Since the removal of local discovery of #https://github.com/elastic/elasticsearch/pull/20960 we rely on minimum master nodes to be set in our test cluster. The settings is automatically managed by the cluster (by default) but current management doesn't work with concurrent single node async starting. On the other hand, with `MockZenPing` and the `discovery.initial_state_timeout` set to `0s` node starting and joining is very fast making async starting an unneeded complexity. Test that still need async starting could, in theory, still do so themselves via background threads.

Note that this change also removes the usage of `INITIAL_STATE_TIMEOUT_SETTINGS` as the starting of nodes is done concurrently (but building them is sequential)
2016-12-06 12:06:15 +01:00
Daniel Mitterdorfer a02bc8ed1c Document thread-safety for ingest processors
With this commit we document that ingest processors need to be
thread-safe. Previously this could be inferred from reading the
source code but we got several user questions about this so
it is stated explicitly in the Javadocs of Processor now.
2016-12-06 10:07:51 +01:00
Adrien Grand 26cbda41ea AsciiFoldingFilter's multi-term component should never preserve the original token. (#21982)
This ports the fix of https://issues.apache.org/jira/browse/LUCENE-7536 to
Elasticsearch's ASCIIFoldingTokenFilterFactory.
2016-12-06 10:01:04 +01:00
Ryan Ernst c8f241f284 Plugins: Remove response action filters (#21950)
Action filters currently have the ability to filter both the request and
response. But the response side was not actually used. This change
removes support for filtering responses with action filters.
2016-12-05 16:14:04 -08:00
Jim Ferenczi 03a0a0aebb Undeprecate GetResponse#getFields and GetResponse#getField
These functions should not have been deprecated as they can be used to retrieve stored and doc-value field.
2016-12-05 15:31:53 +01:00
makeyang 318ce6ab16 fix bug: https://github.com/elastic/elasticsearch/issues/21710 2016-12-05 19:14:34 +08:00
Ali Beyad ff9959c865 Don't output null source node in RecoveryFailedException (#21963)
The RecoveryFailedException's output prints the source and
target nodes for the recovery.  However, sometimes there is
no source node for the recovery, only a target node (such as
when recovering a primary shard from disk).  In this case,
we don't want to display the source node.  This commit fixes
this by displaying "Recovery failed on target node.." instead
of "Recovery failed from null to target node" which is what the
output currently displays.
2016-12-04 15:23:35 -05:00
Jason Tedor 60aa14f48e Increase test logging on test simple pings test
This commit increases the test logging on the unicast zeng ping test of
simple pings to gather more info for chasing a race condition that is
happening in this test.
2016-12-04 08:06:01 -05:00
Jason Tedor 040c05df36 Increase timeouts in UnicastZenPingTests
Sadly, the timeouts here need to be increased to reduce the likelihood
of spurious test failures (test hosts under load are especially prone to
this). This does slow down this test suite a bit, but it's still not as
slow as it was before this endeavor of lowering these timeouts started.
2016-12-03 22:19:55 -05:00
Jason Tedor 2c8229fcaf Cleanup unicast zen ping unknown hosts cached test
This commit cleans up the unicast zen ping unknown hosts cached test:
 - send pings from the same node to more clearly indicate DNS lookups
   are not cached (within the same UnicastZenPing instance)
 - increase ping and wait timeout to 500ms to address race conditions
   (on a test host under load, the timeout was too short for the
   connect/handshake/ping cycle to complete)
2016-12-03 22:00:30 -05:00
Jason Tedor 460e787049 Increase resolve timeout in unknown hosts test
The port limit test is a simple test that fakes that resolving an
address with a port range results the correct address collection.  This
test is subject to a race condition where the timeout on the resolve
request can fire before the resolve code finishes executing (this race
is exceptionally rare, because there are not actually any DNS lookups
being done here since we are just resolving addresses). This commit
increases the timeout here to significantly reduce the chance of a
losing race causing a spurious test failure. This increased timeout
should not increase the runtime of the test, just make failures less
likely.
2016-12-03 09:02:44 -05:00
Jason Tedor f5cbc36896 Increase resolve timeout in unknown hosts test
The unknown hosts test is a simple test that fakes that resolving an
address results in an unknown host exception. The main purpose of this
test is to ensure that we log (and do not silently drop) when a host
fails to resolve. This test is subject to a race condition where the
timeout on the resolve request can fire before the resolve code finishes
executing (this race is exceptionally rare, because there are not
actually any DNS lookups being done here, just a mock resolve
implementation that throws an exception and that's where losing the race
can arise). This commit increases the timeout here to significantly
reduce the chance of a losing race causing a spurious test failure. This
increased timeout should not increase the runtime of the test, just make
failures less likely.
2016-12-03 08:46:24 -05:00
Igor Motov bb9317253a Add descriptions to create snapshot and restore snapshot tasks.
Related to #21768
2016-12-02 21:13:54 -05:00
Jason Tedor c6efd4eb42 Rename method in InternalEngine
This commit renames InternalEngine#loadSeqNoStatsLucene to
InternalEngine#loadSeqNoStatsFromLucene to make this name consistent
with the method InternalEngine#loadSeqNoStatsFromLuceneAndTranslog.
2016-12-02 20:46:26 -05:00
Ryan Ernst 34eb23e98e Plugins: Replace Rest filters with RestHandler wrapper (#21905)
* Plugins: Replace Rest filters with RestHandler wrapper

RestFilters are a complex way of allowing plugins to add extra code
before rest actions are executed. This change removes rest filters, and
replaces with a wrapper which a single plugin may provide.
2016-12-02 14:54:51 -08:00
Jason Tedor b0e8696143 Clarify global checkpoint recovery
Today when starting a new engine, we read the global checkpoint from the
translog only if we are opening an existing translog. This commit
clarifies this situation by distinguishing the three cases of engine
creation in the constructor leading to clearer code.

Relates #21934
2016-12-02 15:00:16 -05:00
Jason Tedor 0afef53a17 Add system call filter bootstrap check
Today if system call filters fail to install on startup, we log a
message but otherwise march on. This might leave users without system
call filters installed not knowing that they have implicitly accepted
the additional risk. We should not be lenient like this, instead clearly
informing the user that they have to either fix their configuration or
accept the risk of not having system call filters installed. This commit
adds a bootstrap check that if system call filters are enabled, they
must successfully install.

Relates #21940
2016-12-02 14:27:54 -05:00
Nik Everett 0c724b1878 Keep context during reindex's retries (#21941)
* Keep context during reindex's retries

This fixes reindex and friend's retries to keep the context.

* Docs
2016-12-02 13:48:51 -05:00
Jay Modi 429e517476 Do not lose host information when pinging
In #21828, serialization of the host string was added to preserve this information when
a TransportAddress gets serialized. However, there is still a case where this did not always
work. In UnicastZenPings, DiscoveryNode instances are created for the ping hosts with the
minimum compatibility version, which is currently less than the version required to preserve
the host information. This means that when a node is received from a PingResponse that the
host information is no longer set correctly on the InetSocketAddress contained in the
DiscoveryNode.

This commit adds a workaround for this situation by allowing the host string to be passed
into the TransportAddress constructor that takes a StreamInput and using that as the host
for the InetAddress that is created during deserialization.
2016-12-02 12:21:53 -05:00
Ke Li 7cc9833606 Avoid some redundant unboxing and object creation (#21909) 2016-12-02 16:11:41 +01:00
shaie 8fd3637891 Return correct term statistics when a field is not found in a shard (#21922)
If you ask for the term vectors of an artificial document with
term_statistics=true, but a shard does not have any terms of the doc's
field(s), it returns the doc's term vectors values as the shard-level
term statistics. This commit fixes that to return 0 for `ttf` and also
field-level aggregated statistics.

Closes #21906
2016-12-02 08:14:45 +01:00
Simon Willnauer adf9bd90a4 Remove legacy BWC test infrastructure and tests (#21915)
We don't use the test infra nor do we run the tests. They might all be
entirely out of date. We also have a different BWC test infra in-place.
This change removes all of the legacy infra.
2016-12-02 08:06:20 +01:00
makeyang 3f1d7be07a Refactor shard limit allocation decider
This commit simplifies the shard limit allocation decider, removing some
duplicated code into a common method.

Relates #21845
2016-12-01 21:27:02 -05:00
Ryan Ernst a6ad89bee0 Mappings: Fix get mapping when no indexes exist to not fail in response generation (#21924)
When there are no indexes, get mapping has a series of special cases.
Two of those expect the response object already started, and the other
two respond with an exception. Those two cases (types passed in but no
indexes and vice versa) would fail in their error response generation
because it did not expect an object to already be started in the json
generator. This change moves the object start to where it is needed for
the empty responses.

closes #21916
2016-12-01 16:57:12 -08:00
Simon Willnauer 6522538033 Add validation for supported index version on node join, restore, upgrade & open index (#21830)
Today we can easily join a cluster that holds an index we don't support since
we currently allow rolling upgrades from 5.x to 6.x. Along the same lines we don't check if we can support an index based on the nodes in the cluster when we open, restore or metadata-upgrade and index. This commit adds
additional safety that fails cluster state validation, open, restore and /or upgrade if there is an open index with an incompatible index version created in the cluster.

Realtes to #21670
2016-12-01 15:40:35 +01:00
Simon Willnauer 155de53fe3 Add a connect timeout to the ConnectionProfile to allow per node connect timeouts (#21847)
Timeouts are global today across all connections this commit allows to specify
a connection timeout per node such that depending on the context connections can
be established with different timeouts.

Relates to #19719
2016-12-01 15:39:49 +01:00
Boaz Leskes 92fa9149f3 rename more before() methods that now conflict with ESTestCase 2016-12-01 13:40:27 +01:00
Simon Willnauer dd5256c324 Reduce number of connections per node depending on the nodes role (#21849)
We currently treat every node equally when we establish connections to a node.
Yet, if we are not master eligible or can't hold any data there is no point in creating
a dedicated connection for sending the cluster state or running remote recoveries respectively.
The usage of STATE and RECOVERY connections on non-master and/or non-data nodes will result in an IllegalStateException.
2016-12-01 08:00:48 +01:00
Jim Ferenczi fc9b63877e Handle specialized term queries in MappedFieldType.extractTerm(TermQuery) (#21889)
For some fields we have a specialized implementation of a TermQuery that is specific for the field.
When these kind of fields are used in a wildcard query or a span term query it fails with an exception because they don't recognize the specialized form.
The impacted fields are [_all] and [_type] and the impacted queries are [span_term] and [wilcard].
This change handles these forms and correctly extracts the term inside them for further use.

Fixes #21882
2016-11-30 23:11:38 +01:00
Jason Tedor 92f05e796e Remove traces during connect with handshake
This commit removes two trace logging statements during connection with
handshake as they are just clutter.
2016-11-30 15:29:33 -05:00
Jason Tedor 761325bf94 Throw exception on ping from another cluster
When we receive a ping from another cluster, we should throw an
exception so as to not leak the channel.
2016-11-30 15:28:56 -05:00
Jason Tedor c90ba67abb Do not reply to pings from another cluster
Today when sending responses to discovery pings, we unconditionally
reply. Instead, this commit modifies the response handler to not reply
when the cluster names do not match.

This addresses a race condition identified after reducing the timeout in
UnicastZenPingTests#testSimplePings. In particular, we send pings in the
following way:
 - if not connected to the node, connect to the node and after
   successful handshake, send a ping
 - if connected to the node, send a ping

When the ping timeout is set low, a subsequent batch of pings can race
against a connect/disconnect cycle from a prior batch of pings. In
particular, consider the following scenario:
 - node A from cluster X
 - node B from cluster Y
 - pings are initiated from node A with node B in the hosts list
 - node A will try to connect and handshake with B
 - the connection will succeed, and the handshake will eventually fail due to mismatched cluster names
 - on a short timeout, a second batch of pings will fire, and on this
   batch node A will see that it is still connected to node B; thus, it
   will immediately fire a ping to node B and node B will dutifully
   respond

Relates #21894
2016-11-30 15:09:42 -05:00
Luca Cavanna 103984a4a1 Remove indices query (#21837)
The indices query is deprecated since 5.0.0 (#17710). It can now be removed in master (future 6.0 version).
2016-11-30 19:37:01 +01:00
Adrien Grand 117944093e Remove testing of 2.x indices in DecayFunctionScoreIT.
Such old indices will not be supported in 6.0.
2016-11-30 17:16:13 +01:00
Jason Tedor 6c45695d52 Add version 5.1.1
This commit removes the version constant for 5.1.0 (due to an
inadvertent release) and adds the version constant for 5.1.1.

Relates #21890
2016-11-30 11:14:17 -05:00
Adrien Grand f5ac27a20d Fix TermsQueryBuilderTests expectations. 2016-11-30 17:07:53 +01:00
Adrien Grand c5b9c98b99 Remove the `default` store type. (#21616)
It used to be a hybrid store between `niofs` and `mmapfs`, which we removed when
we switched to `fs` by default (which is `mmapfs` on 64-bits systems).
2016-11-30 15:33:26 +01:00
Adrien Grand 90ab477f19 The `terms` query should always map to a Lucene `TermsQuery`. (#21786)
Currently, the `terms` query is just syctactic sugar for a `bool` query when
used in a query context. This change proposes to always generate the same query
in query and filter contexts, which is less confusing.
2016-11-30 15:29:09 +01:00
Luca Cavanna 5b8bdba12e Remove subrequests method from CompositeIndicesRequest (#21873) 2016-11-30 15:03:58 +01:00
Matt Weber 1e722c060b Remove forked XRollingBuffer and XQueryBuilder. (#21866)
Remove the forked versions now that we are on lucene-6.4.0-snapshot.
2016-11-30 13:45:54 +01:00
Adrien Grand a3ef674992 Reduce memory pressure when sending large terms queries. (#21776)
When users send large `terms` query to Elasticsearch, every value is stored in
an object. This change does not reduce the amount of created objects, but makes
sure these objects die young by optimizing the list storage in case all values
are either non-null instances of Long objects or BytesRef objects, which seems
to help the JVM significantly.
2016-11-30 13:35:56 +01:00
Adrien Grand 6231009a8f Remove 2.x backward compatibility of mappings. (#21670)
For the record, I also had to remove the geo-hash cell and geo-distance range
queries to make the code compile. These queries already throw an exception in
all cases with 5.x indices, so that does not hurt any more.

I also had to rename all 2.x bwc indices from `index-${version}` to
`unsupported-${version}` to make `OldIndexBackwardCompatibilityIT`
happy.
2016-11-30 13:34:46 +01:00
Jason Tedor 072007c759 Speed up UnicastZenPingTests
These tests using ping timeouts on the order of seconds, but this is
unnecessary since all the sockets are within the same JVM it really
should not take that long.

Relates #21874
2016-11-29 23:27:25 -05:00
Jason Tedor b6ba4ae34b Add version 5.0.3
This commit adds version 5.0.3 and the BWC indices for version 5.0.2.

Relates #21867
2016-11-29 18:34:55 -05:00
Jay Modi 404b42ee95 DiscoveryNode and TransportAddress should preserve host information
In some cases, such as the creation of DiscoveryNode instances for unicast ping requests, the
host information was not being populated properly and instead the address string was being used.
Additionally, when serializing a DiscoveryNode and in turn a transport address, the host was not
being set on the InetAddress when deserializing the object, so even if the address was created
from a hostname, the address in the deserialized instance had no knowledge of the hostname that
was originally used.
2016-11-29 16:18:08 -05:00
Luca Cavanna 6eaff9432d SearchTemplateRequest to implement CompositeIndicesRequest (#21865)
SearchTemplateRequest to implement CompositeIndicesRequest

Given that SearchTemplateRequest effectively delegates to search when a search is being executed, it should implement the CompositeIndicesRequest interface. The subrequests method should return a single search request. When a search is not going to be executed, because we are in simulate mode, there are no inner requests, and there are no corresponding indices to that request either.

Closes #21747
2016-11-29 20:52:43 +01:00
Boaz Leskes be4074e13d improve debug logging when node waits for initial cluster state
And enabled debug logging in InternalTestClusterTests so we can see it.
2016-11-29 20:38:19 +01:00
Luca Cavanna f253621feb Remove deprecated query names: in, geo_bbox, mlt, fuzzy_match and match_fuzzy (#21852)
These query names were all deprecated in 5.0.0:
- in is removed in favour of terms
- geo_bbox is removed in favour of geo_bounding_box
- mlt is removed in favour of more_like_this
- fuzzy_match and match_fuzzy are removed in favour of match
2016-11-29 19:07:01 +01:00
Jim Ferenczi d791ddf704 Upgrade to lucene-6.4.0-snapshot-ec38570 (#21853)
Set lucene version to 6.4.0-snapshot-ec38570 and update all the sha1s/license
Fix invalid combo after upgrade in query_string query. split_on_whitespace=false is disallowed if auto_generate_phrase_queries=true
Adapt the expectations of some tests to the new format of the Lucene explain output
2016-11-29 18:40:31 +01:00
Nicholas Knize af1ab68b64 Add RangeFieldMapper for numeric and date range types
Lucene 6.2 added index and query support for numeric ranges. This commit adds a new RangeFieldMapper for indexing numeric (int, long, float, double) and date ranges and creating appropriate range and term queries. The design is similar to NumericFieldMapper in that it uses a RangeType enumerator for implementing the logic specific to each type. The following range types are supported by this field mapper: int_range, float_range, long_range, double_range, date_range.

Lucene does not provide a DocValue field specific to RangeField types so the RangeFieldMapper implements a CustomRangeDocValuesField for handling doc value support.

When executing a Range query over a Range field, the RangeQueryBuilder has been enhanced to accept a new relation parameter for defining the type of query as one of: WITHIN, CONTAINS, INTERSECTS. This provides support for finding all ranges that are related to a specific range in a desired way. As with other spatial queries, DISJOINT can be achieved as a MUST_NOT of an INTERSECTS query.
2016-11-29 10:10:14 -06:00
Simon Willnauer f5ff69fabe Remove connectToNodeLight and replace it with a connection profile (#21799)
The Transport#connectToNodeLight concepts is confusing and not very flexible.
neither really testable on a unittest level. This commit cleans up the code used
to connect to nodes and simplifies transport implementations to share more code.
This also allows to connect to nodes with custom profiles if needed, for instance
future improvements can be added to connect to/from nodes that are non-data nodes without
dedicated bulks and recovery connections.
2016-11-29 09:35:07 +01:00
Ali Beyad a884573898 [TEST] fixes FilterAllocationDecider test for decision explanation
when the initial recovery is LOCAL_SHARDS
2016-11-28 20:37:19 -05:00
Ali Beyad 07bd0a30f0 Improves allocation decider decision explanation messages (#21771)
This commit improves the decision explanation messages,
particularly for NO decisions, in the various AllocationDecider
implementations by including the setting(s) in the explanation
message that led to the decision.

This commit also returns a THROTTLE decision instead of a NO
decision when the concurrent rebalances limit has been reached
in ConcurrentRebalanceAllocationDecider, because it more accurately
reflects a temporary throttling that will turn into a YES decision
once the number of concurrent rebalances lessens, as opposed to a
more permanent NO decision (e.g. due to filtering).
2016-11-28 20:23:16 -05:00
Matt Weber 04e07bcdb6 Synonym Graph Support (LUCENE-6664) (#21517)
Integrate the patch from LUCENE-6664 into elasticsearch and
add support for handling a graph token stream in match/multi-match
queries.

This fixes longstanding bugs with multi-token synonyms returning
incorrect results with proximity queries.
2016-11-28 09:25:49 -08:00
Jim Ferenczi 8affb7c845 Fix FiltersFunctionScoreQuery highlighting (#21827)
This is a cleanup of the fix pushed in https://github.com/elastic/elasticsearch/pull/20400.
FiltersFunctionScoreQuery sub query should be extracted in CustomQueryScorer.extract (and not in CustomQueryScorer.extractUnknownQuery).
This does not fix any bug in this branch (it's just a cleanup) but the intent is first to clean up and then to backport in 2.x where there is a real bug.
The bug is in 2.x only because the backport of https://github.com/elastic/elasticsearch/pull/20400 in 2.x mistakenly renamed the FiltersFunctionScoreQuery to FunctionScoreQuery.
This leads to incorrect highlighting on FiltersFunctionScoreQuery in 2.x.
2016-11-28 17:56:24 +01:00
Nik Everett 145d0813b5 Log ScriptException's xcontent if file script compilation fails (#21767)
When a file script fails to compile, rather than logging the
exception that caused the failure this logs the xcontent of
that exception. This is both shorter and has the script stack
which is useful for figuring out why the compilation failed.

Still logs the entire stacktrace at debug level just in case
you need it.

Relates to #21733
2016-11-28 11:36:06 -05:00
Ali Beyad db7362da67 Fixes shard level snapshot metadata loading when index-N file is missing (#21813)
In making changes for the 5.0 version of snapshots, a bug was
introduced where if an index-N file could not be found for an
individual shard, the backup was to iterate over all snap-*.dat
files in the shard folder to know which snapshots contain that
shard's data, but in 5.0, reading the snap-*.dat files as backup
was incorrectly passing in the blob name for the snap-*.dat file,
thereby failing to load all index files for a given snapshot
when the index-N file is missing.  This condition should be rare
as there is no reason an index-N file should be absent (unless
it was deleted or there was corruption reading the file), but
nevertheless, this situation can be encountered and this commit
fixes the bug by reading the correct snap-*.dat blob name in the
shard data folder.
2016-11-28 10:46:33 -05:00
Simon Willnauer b7292a6005 Remove TcpTransport#addressSupported since TransportAddress is now final
TransportAddress used to be customizable per transport but this has been removed
a while ago. Therefore we can remove all usage of this method as well.

Relates to #20695
2016-11-28 16:06:59 +01:00
Jim Ferenczi 69f35aa07f Fix cross_fields type on multi_match query with synonyms (#21638)
* Fix cross_fields type on multi_match query with synonyms

This change fixes the cross_fields type of the multi_match query when synonyms are involved.
Since 2.x the Lucene query parser creates SynonymQuery for words that appear at the same position.
For simple term query the CrossFieldsQueryBuilder expands the term to all requested fields and creates a BlendedTermQuery.
This change adds the same mechanism for SynonymQuery which otherwise are not expanded to all requested fields.
As a side note I wonder if we should not replace the BlendedTermQuery with the SynonymQuery. They have the same purpose and behave similarly.

Fixes #21633

* Fallback to SynonymQuery for blended terms on a single field
2016-11-28 14:14:01 +01:00
Yannick Welsch 7e198f0e41 Detect nodes being blocked by GC-disrupted node (#21797)
The disruption type LongGCDisruption simulates GCs on a node by suspending all the threads of that node. If the suspended threads are in a code section with shared JVM locks, however, it can prevent the other nodes from doing their thing. The class LongGCDisruption has a list of class names for which we know that this can occur. Whenever a test using the GC disruption type fails in mysterious ways, it becomes a long guessing game to find the offending class. This commit adds code to LongGCDisruption to automatically detect these situations, fail the test early and report the offending class and all relevant context.
2016-11-28 11:24:25 +01:00
Adrien Grand 243a788289 Fail to index fields with dots in field names when one of the intermediate objects is nested. (#21787)
Closes #21726
2016-11-28 09:57:32 +01:00
Clinton Gormley c1fa80d40f Log failure to connect to node at info instead of debug (#21809)
Closes #6468
2016-11-26 13:18:26 +01:00
Ali Beyad efba64d60a Removing unused AllocationExplanation class (#21805)
This commit removes the unused AllocationExplanation class.  The
RoutingAllocation class only created an empty instance of it and
never used it anywhere else.  The allocation explanations will be
encompassed in the various decision classes exposed via the cluster
allocation explain API.  Therefore, there is no reason to keep the
AllocationExplanation class.
2016-11-25 12:18:23 -05:00
Luca Cavanna 720b165350 Search shards to print out aliases array together with alias filter (#21784)
With #21738 we added an indices section to the search shards api, that will return the concrete indices hit by the request, and eventually the corresponding alias filter.

The java API returns the AliasFilter object, which holds the filter itself and an array of aliases that pointed to the index in the original request. The REST layer doesn't print out the aliases array though. This commit adds the aliases array as well and tests for this.
2016-11-25 10:58:06 +01:00
Simon Willnauer 9809760eb0 Fix settings diff generation for affix, list and group settings (#21788)
Group, List and Affix settings generate a bogus diff that turns the actual
diff into a string containing a json structure for instance:

```
"action" : {
  "search" : {
    "remote" : {
      "" : "{\"my_remote_cluster\":\"[::1]:60378\"}"
    }
  }
}
```

which make reading the setting impossible. This happens for instance
if a group or affix setting is rendered via `_cluster/settings?include_defaults=true`
This change fixes the issue as well as several minor issues with affix settings that
where not accepted as valid setting today.
2016-11-24 21:53:04 +01:00
Simon Willnauer 72ef6fa0d7 Handle spaces in `action.auto_create_index` gracefully (#21790)
Today if a comma-separated list is passed to action.auto_create_index
leading and trailing whitespaces are not trimmed but since the values are
index expressions whitespaces should be removed for convenience.

Closes #21449
2016-11-24 21:43:58 +01:00
markharwood aa60e5cc07 Aggregations - support for partitioning set of terms used in aggregations so that multiple requests can be done without trying to compute everything in one request.
Closes #21487
2016-11-24 15:10:46 +00:00
Luca Cavanna ac2aa56350 Cluster search shards improvements: expose ShardId, adjust visibility of some members (#21752)
* ClusterSearchShardsGroup to return ShardId rather than the int shard id

This allows more info to be retrieved, like the index uuid which is exposed through the ShardId object but was not available before

* Make ClusterSearchShardsResponse empty constructor public

This allows to receive such responses when sending ClusterSearchShardsRequests directly through TransportService (not using ClusterSearchShardsAction via Client), otherwise an empty response cannot be created unless the class that does it is in org.elasticsearch.action, admin.cluster.shards package

* adjust visibility of ClusterSearchShards members
2016-11-24 09:46:57 +01:00
Luca Cavanna d8c934a7fa Use index uuid as key in the alias filter map rather than the index name (#21749)
The index uuid is unique across multiple clusters, while the index name is not. Using the index uuid to look up filters in the alias filters map is better and will be needed for multi cluster search.
2016-11-24 09:43:42 +01:00
Luca Cavanna 6a16a60c7e Remove unused assignedReplicasIncludingRelocating from ShardsIterator interface (#21687) 2016-11-23 22:25:51 +01:00
Luca Cavanna 887ada4819 Move SearchTransportService and SearchPhaseController creation outside of TransportSearchAction constructor (#21754)
This commit makes sure that there is only one instance of the two services rather than one per transport action that uses it.
Also, we take their initialization out of guice's hands by binding it to a specific instance. Otherwise those two objects would get created within a constructor that is called by guice. That may cause problem for instance when throwing an exception from such constructors as guice tries all over again to re-initialize objects and fills up logs with stacktraces.
2016-11-23 22:18:50 +01:00
Luca Cavanna a84353d2d6 Don't carry ShardRouting around when not needed in AbstractSearchAsyncAction (#21753)
* replace ShardRouting argument in AbstractSearchAsyncAction#onFirstPhaseResult with more contained String nodeId

There is no need to pass in ShardRouting if the only info read from it is the current node id, the shard id can be read directly from the ShardIterator that's already provided as an argument.

* avoid creating a new ShardId when creating a SearchShardTarget in SnapshotsService
2016-11-23 22:18:02 +01:00
Jason Tedor 8416b16dfd Improve handling of unreleased versions
Today when handling unreleased versions for backwards compatilibity
support, we scatted version constants across the code base and add some
asserts to support removing these constants when the version in question
is actually released. This commit improves this situation, enabling us
to just add a single unreleased version constant that can be renamed
when the version is actually released. This should make maintenance of
these versions simpler.

Relates #21760
2016-11-23 15:49:05 -05:00
Luca Cavanna 033eece6d4 ShardSearchRequest to take ShardId constructor argument rather than the whole ShardRouting (#21750)
ShardSearchRequest was previously taking in the whole ShardRouting as a constructor argument while it only needs the ShardsId, changed that to carry over only the needed bits.
2016-11-23 15:34:55 +01:00
Ryan Ernst 10a945ae72 Plugins: Remove support for onModule (#21416)
All plugin extension points have been converted to pull based
interfaces. This change removes the infrastructure for the black-magic
onModule methods.
2016-11-22 23:12:14 -08:00
Ryan Ernst d8808210f1 Transport client: Fix remove address to actually work (#21743)
* Transport client: Fix remove address to actually work

The removeTransportAddress method of TransportClient removes the address
from the list of nodes that the client pings to sniff for nodes.
However, it does not remove it from the list of existing connected
nodes. This means removing a node is not possible, as long as that node
is still up.

This change removes the node from the connected nodes list before
triggering sampling (ie sniffing).  While the fix is simple, testing was
not because there were no existing tests for sniffing. This change also
modifies the mocks used by transport client unit tests in order to allow
mocking sniffing.
2016-11-22 22:50:11 -08:00
Igor Motov c7b69a0133 Add search task descriptions
Since we added ability to cancel searches it would be nice to see which searches we are actually cancelling.
2016-11-22 23:15:49 -05:00
Ryan Ernst 6940b2b8c7 Remove groovy scripting language (#21607)
* Scripting: Remove groovy scripting language

Groovy was deprecated in 5.0. This change removes it, along with the
legacy default language infrastructure in scripting.
2016-11-22 19:24:12 -08:00
Nik Everett 1791623700 Document `error_trace`
The `error_trace` parameter turns on the `stack_trace` field
in errors which returns stack traces.

Removes documentation for `camelCase` because it hasn't worked
in a while....

Documents the internal parameters used to render stack traces as
internal only.

Closes #21708
2016-11-22 19:16:07 -05:00
Jason Tedor c7b70fc770 Mark Security#addBindPermissions as private
This commit marks the method Security#addBindPermissions as private,
it's package-private visibility was not used anywhere.
2016-11-22 18:40:18 -05:00
Jason Tedor 41ae784a6f Refactor handling of bind permissions
This commit refactors the handling of bind permissions, which is in need
of a little cleanup. For example, in its current state, the code for
handling permissions for transport profiles is split across two
methods. This commit refactors this code hopefully making it easier to
work with in future changes. This change is mostly mechanical, no
functionality is changed.

Relates #21742
2016-11-22 18:39:14 -05:00
Jason Tedor 1576eaba25 Increase lower bound for random resolve timeout in test
The test UnicastZenPing#testResolveTimeout chooses a random resolve
timeout between 1ms and 100ms. Close to the lower bound, this is far too
short and the test races against the concurrent resolves executing
before the timeout elapses. This commit increases the timeout to
something that is far less likely to race, yet will not slow the test
down since we are not doing resolves against a real DNS service anyway.
Note that we still want a short resolve timeout since we are testing
whether or not timeouts really work here (by latching one of the
resolves to respond slowly).
2016-11-22 18:35:57 -05:00
Luca Cavanna db5a72774b Add indices and filter information to search shards api output (#21738)
Add indices and filter information to search shards api output

The search shards api returns info about which shards are going to be hit by executing a search with provided parameters: indices, routing, preference. Indices can also be aliases, which can also hold filters. The output includes an array of shards and a summary of all the nodes the shards are allocated on. This commit adds a new indices section to the search shards output that includes one entry per index, where each index can be associated with an optional filter in case the index was hit through a filtered alias.

This is relevant since we have moved parsing of alias filters to the coordinating node.

Relates to #20916
2016-11-22 23:00:25 +01:00
Nik Everett 29e68323a2 Clean up ScriptQuerySearchIT
Shorten line and remove forbidden API.

Relates to #21484
2016-11-22 16:23:33 -05:00
umeshdangat f37db2fe17 Support binary field type in script values (#21484)
Add ScriptDocValues.BytesRefs for reading binary fieldtype
2016-11-22 16:23:23 -05:00
Simon Willnauer a9a2753f0b Add a HostFailureListener to notify client code if a node got disconnected (#21709)
Today there is no way to get notified if a node is disconnected. Client code
must poll the TransportClient constantly to detect that a node is not connected
anymore in order to react and add new nodes or notify altering etc. For instance
if a hostname  gets resolved to an IP but that host is disconnected clients want
to reconnect by resolving the hostname again which is a common situation in cloud
environments.

Closes #21424
2016-11-22 20:46:28 +01:00
Jason Tedor 9dc65037bc Lazy resolve unicast hosts
Today we eagerly resolve unicast hosts. This means that if DNS changes,
we will never find the host at the new address. Moreover, a single host
failng to resolve causes startup to abort. This commit introduces lazy
resolution of unicast hosts. If a DNS entry changes, there is an
opportunity for the host to be discovered. Note that under the Java
security manager, there is a default positive cache of infinity for
resolved hosts; this means that if a user does want to operate in an
environment where DNS can change, they must adjust
networkaddress.cache.ttl in their security policy. And if a host fails
to resolve, we warn log the hostname but continue pinging other
configured hosts.

When doing DNS resolutions for unicast hostnames, we wait until the DNS
lookups timeout. This appears to be forty-five seconds on modern JVMs,
and it is not configurable. If we do these serially, the cluster can be
blocked during ping for a lengthy period of time. This commit introduces
doing the DNS lookups in parallel, and adds a user-configurable timeout
for these lookups.

Relates #21630
2016-11-22 14:17:04 -05:00
Clinton Gormley 3ff8faf514 Added version 2.4.2 and bwc indices 2016-11-22 19:45:59 +01:00
Yannick Welsch a44655763e Allow master to assign primary shard to node that has shard store locked during shard state fetching (#21656)
PR #19416 added a safety mechanism to shard state fetching to only access the store when the shard lock can be acquired. This can lead to the following situation however where a shard has not fully shut down yet while the shard fetching is going on, resulting in a ShardLockObtainFailedException. PrimaryShardAllocator that decides where to allocate primary shards sees this exception and treats the shard as unusable. If this is the only shard copy in the cluster, the cluster stays red and a new shard fetching cycle will not be triggered as shard state fetching treats exceptions while opening the store as permanent failures.

This commit makes it so that PrimaryShardAllocator treats the locked shard as a possible allocation target (although with the least priority).
2016-11-22 19:35:47 +01:00
Jay Modi 080d55a393 Rethrow ExecutionException from the loader to concurrent callers of Cache#computeIfAbsent
This commit clarifies the contract of Cache#computeIfAbsent so that an exception that occurs during the execution
of the loader is thrown to all callers. Prior to this commit, the first caller would get the ExecutionException
and other callers that called during the load execution would get null, which is confusing.
2016-11-22 13:24:15 -05:00
Luca Cavanna db8b2dceea Remove ignored type parameter in search_shards api (#21688)
The `type` parameter has always been accepted by the search_shards api, probably to make the api and its urls the same as search. Truth is that the type never had any effect, it's been ignored from day one while accepting it may make users think that we actually do something with it.

This commit removes support for the type parameter from the REST layer and the Java API. Backwards compatibility is maintained on the transport layer though.

The new added serialization test also uncovered a bug in the java API where the `ClusterSearchShardsRequest` could be created with no arguments, but the indices were required to be not null otherwise the request couldn't be serialized as `writeTo` would throw NPE. Fixed by setting a default value (empty array) for indices.
2016-11-22 17:22:33 +01:00
Jason Tedor 775638c281 Die with dignity on the Lucene layer
When a fatal error tragically closes an index writer, such an error
never makes its way to the uncaught exception handler. This prevents the
node from being torn down if an out of memory error or other fatal error
is thrown in the Lucene layer. This commit ensures that such events
bubble their way up to the uncaught exception handler.

Relates #21721
2016-11-22 11:21:24 -05:00
Jason Tedor 221caa1c5e Refactor handling for bad default permissions
This commit refactors the handling of bad default permissions that come
from the system security policy.

Relates #21735
2016-11-22 10:26:36 -05:00
Yannick Welsch 50e25912c8 Split main ClusterService method into smaller chunks #21666
Splits the main method in ClusterService into smaller chunks so that it's easier to understand and simpler to modify in subsequent PRs.
2016-11-22 12:20:53 +01:00
Yannick Welsch c521219b2f Adapt BWC layer checks for Exceptions to include v5.0.2 support
The PR #21694 was initially planned to go into v6.0.0 and v5.1.0. Due to another PR relying on this one though for backport to v5.0.2, #21694 must go to v5.0.2
as well. As such, the initial backward compatibility rules established by the PR must be changed to include v5.0.2 and above.
2016-11-22 12:02:56 +01:00
Lee Hinman dd1012d570 Merge remote-tracking branch 'dakrone/fix-lenient-overriding' 2016-11-21 22:10:19 -07:00
Lee Hinman 11da09e9bc Allow overriding all-field leniency when `lenient` option is specified
As part of #20925 and #21341 we added an "all-fields" mode to the
`query_string` and `simple_query_string`. This would expand the query to
all fields and automatically set `lenient` to true.

However, we should still allow a user to override the `lenient` flag to
whichever value they desire, should they add it in the request. This
commit does that.
2016-11-21 21:32:25 -07:00
Areek Zillur 933c4f42b3 [FIX] make MergableCustomMetaData public in TribeService 2016-11-21 23:02:36 -05:00
Nik Everett c79371fd5b Remove lang-python and lang-javascript (#20734)
They were deprecated in 5.0. We are concentrating on making
Painless awesome rather than supporting every language possible.

Closes #20698
2016-11-21 22:13:25 -05:00
Jason Tedor 4225737db9 Install a security manager on startup
When Elasticsearch starts, we go through some initialization before we
install a security manager. Yet, the JVM makes internal policy decisions
on the basis of whether or not a security manager is present. This
commit installs a security manager immediately on startup so that the
JVM always thinks a security manager is present when making such policy
decisions.

Relates #21716
2016-11-21 20:33:42 -05:00
Simon Willnauer cb5c25ab4f Add a StreamInput#readArraySize method that ensures sane array sizes (#21697)
Today we read a vint from the stream to allocate the size of an array up-front
before we start reading the values. This can be dangerous if for instance we read
from a corrupted stream or if some manipulated bytes are send for instance from
an attacker or a fuzzer. In most of the cases we can apply some best effort and
validate the array size to be _sane_ by ensuring we can at read at least N bytes
where N is the expected size of the array.
2016-11-21 21:39:21 +01:00
Areek Zillur 0ccf8a742d Add support for merging custom meta data in tribe node (#21552)
* Add support for merging custom meta data in tribe node

Currently, when any underlying cluster has custom metadata
(via plugin), tribe node does not store custom meta data in its
cluster state. This is because the tribe node has no idea how to
select the appropriate custom metadata from one or many custom
metadata (corresponding to the number of underlying clusters).

This change adds an interface that custom metadata implementations
can extend to add support for merging mulitple custom metadata of
the same type for storing in the tribe state.

Relates to #20544
Supersedes #20791

* Simplify updating tribe state

* Add tests for merging multiple custom metadata types in tribe node

* cleanup merging custom md logic in tribe service
2016-11-21 12:03:01 -05:00
Simon Willnauer 71a21b3208 Add BWC layer for Exceptions (#21694)
Today it's not possible to add exceptions to the serialization layer
without breaking BWC. This commit adds the ability to specify the Version
an exception was added that allows to fall back not NotSerializableExceptionWrapper
if the exception is not present in the streams version.

Relates to #21656
2016-11-21 12:51:06 +01:00
Tanguy Leroux e7b9e65fc3 Add checkstyle rule to forbid empty javadoc comments (#20881)
This commit adds a RegexpMultiline check to checkstyle that yells when an empty Javadoc comment is found in Java files.

Related #20871
2016-11-21 12:36:44 +01:00
Luca Cavanna 6122b84eba remove pointless catch exception in TransportSearchAction (#21689)
TransportSearchAction optimizes the search_type in certain cases, when for instance we are searching against a single shard, or when there is only a suggest section in the request. That optimization is wrapped in a try catch, and when an exception happens we log it and ignore it. This may be a leftover from the past though, as no exception is expected to be thrown in that code block, hence if there is any exception we are probably better off bubbling it up rather than ignoring it.
2016-11-21 11:46:26 +01:00
Luca Cavanna a1d88e6550 Rename ClusterState#lookupPrototypeSafe to `lookupPrototype` and remove previous "unsafe" unused variant (#21686)
The `lookupPrototype` method is not used anywhere. Seems like we rather use its `lookupProrotypeSafe` variant (which also throws exception if the prototype is not found) is always. This commit makes the safer variant the default one, by renaming it to  "lookupPrototype" and removes the previous "unsafe" variant.
2016-11-21 11:36:56 +01:00
Simon Willnauer d913242ca1 Use a buffer to do character to byte conversion in StreamOutput#writeString (#21680)
Today we call `writeByte` up to 3x per character in each string written via
`StreamOutput#writeString` this can have quite some overhead when strings
are long or many strings are written. This change adds a local buffer to
convert chars to bytes into the local buffer. Converted bytes are then
written via `writeBytes` instead reducing the overhead of this opertion.

Closes #21660
2016-11-21 10:47:50 +01:00
Adrien Grand 23d5293f82 Fix integer overflows when dealing with templates. (#21628)
The overflows were happening in two places, the parsing of the template that
implicitly truncates the `order` when its value does not fall into the `integer`
range, and the comparator that sorts templates in ascending order, since it
returns `order2-order1`, which might overflow.

Closes #21622
2016-11-21 10:41:08 +01:00
Jim Ferenczi 90247446aa Fix highlighting on a stored keyword field (#21645)
* Fix highlighting on a stored keyword field

The highlighter converts stored keyword fields using toString().
Since the keyword fields are stored as utf8 bytes the conversion is broken.
This change uses BytesRef.utf8toString() to convert the field value in a valid string.

Fixes #21636

* Replace BytesRef#utf8ToString with MappedFieldType#valueForDisplay
2016-11-21 10:29:30 +01:00
David Roberts 6daeb56969 Set execute permissions for native plugin programs (#21657) 2016-11-21 09:20:09 +00:00
javanna 9594b6f50f adjust visibility of DiscoveryNodes.Delta constructor
It can be private as it gets called by DiscoveryNodes#delta method, which is supposed to be the only way to create a Delta
2016-11-21 10:17:05 +01:00
javanna e0661c5262 Remove unused DiscoveryNodes.Delta constructor 2016-11-21 10:17:05 +01:00
javanna 596eebcf98 Remove unused DiscoveryNode#removeDeadMembers public method 2016-11-21 10:17:05 +01:00
javanna b19c606cef Remove minNodeVersion and corresponding public `getSmallestVersion` getter method from DiscoveryNodes 2016-11-21 10:17:05 +01:00
Jason Tedor aed88fe7a2 Log node ID on startup
If the node name is explicitly set it's not derived from the node ID
meaning that it doesn't immediately appear in the logs. While it can be
tracked down in other places, it would be easier for info purposes if it
just showed up explicitly. This commit adds the node ID to the logs,
whether or not the node name is set.

Relates #21673
2016-11-19 06:27:25 -05:00
Jason Tedor 484ad31ed9 Clarify that plugins can be closed
Plugins are closed if they implement java.io.Closeable but this is not
clear from the plugin interface. This commit clarifies this by declaring
that Plugins implement java.io.Closeable and adding an empty
implementation to the base Plugin class.

Relates #21669
2016-11-18 13:04:28 -05:00
Simon Willnauer 99f8c21d9a Don't reset non-dynamic settings unless explicitly requested (#21646)
AbstractScopedSettings has the ability to only apply updates/deletes
to dynamic settings. The flag is currently not respected when a setting
is reset/deleted which causes static node settings to be reset if a non-dynamic
key is reset via `null` value.

Closes #21593
2016-11-18 16:40:18 +01:00
Ali Beyad 1d2a1540cc Makes allocator decision classes top-level classes (#21662)
This commit moves several allocation decider related inner classes
into their own top-level class, in order to use more easily in
the allocation explain API. This commit also renames some of those
decision related classes to more suitable names.

This is simply a cosmetic change - no functionality changes with this
commit whatsoever.

To summarize the changes:
 1. ShardAllocationDecision renamed to AllocateUnassignedDecision
 2. RelocationDecision moved to a top-level class
 3. MoveDecision moved to a top-level class
 4. RebalanceDecision moved to a top-level class
 5. ShardAllocationDecisionTests renamed to AllocateUnassignedDecisionTests
 6. NodeRebalanceResult moved to a top-level class
 7. ShardAllocationDecision#WeightedDecision moved to a top-level class and renamed to NodeAllocationResult.
2016-11-18 10:19:27 -05:00
Yannick Welsch b1fd257c42 [TEST] Fix testTimedOutUpdateTaskCleanedUp to wait for blocking task to be completed
The "test" task can complete its execution with a timeout exception before the "block-task" actually starts executing. The test thus has to wait for both to be
completed before checking that the updateTasksPerExecutor map has been properly cleaned up.
2016-11-18 12:34:50 +01:00
Christoph Büscher 4a7b70cc08 Don't require `types` parameter in IdsQueryBuilder constructor
According to the docs and our own tests we accept an ids query without specified
types and default to all types in the index mapping in this case. This changes
the builder to reflect this by making the types no longer a required constructor
argument and changes the parser to reflect that.
2016-11-17 20:22:48 +01:00
Christoph Büscher b8cae39b7c Using ObjectParser in MatchAllQueryBuilder and IdsQueryBuilder
A first step moving away from the current parsing to use the generalized
Objectparser and ConstructingObjectParser. This PR start by making use of it in
MatchAllQueryBuilder and IdsQueryBuilder.
2016-11-17 20:22:48 +01:00
Nik Everett 2a1e08f76a Fix compilation in Eclipse (#21606)
* Fix compilation in Eclipse

I'm not sure what the bug is, but ecj doesn't like this expression
unless the type is set explicitly.

* Add comment explaining why no diamond operator
2016-11-17 12:54:57 -05:00
Jim Ferenczi 09fbb4d06d Fix match_phrase_prefix on boosted fields (#21623)
This change fixes the match_phrase_prefix on fields that define a boost in their mapping.

Fixes #21613
2016-11-17 18:45:34 +01:00
Dimitris Athanasiou a75320f89b Replace IndexAlreadyExistsException with ResourceAlreadyExistsException (#21494) 2016-11-17 14:30:21 +00:00
Jason Tedor b08a2e1f31 Expose executor service interface from thread pool
This commit exposes the executor service interface from thread
pool. This will enable some high-level concurrency primitives that will
make some code cleaner and simpler.

Relates #21608
2016-11-17 09:18:49 -05:00
David Roberts 116593e5f5 Adjust bootstrap sequence (#21543)
Added the ability for plugins to spawn a controller process at startup
2016-11-17 09:58:09 +00:00
Adrien Grand 6581b77198 Remove store throttling. (#21573)
Store throttling has been disabled by default since Lucene added automatic
throttling of merge operations based on the indexing rate.
2016-11-17 09:33:32 +01:00
Jason Tedor 9792b5792a Respect default search timeout
The default search timeout is not respected because the timeout is
unconditionally set from the query. This commit fixes this issue, and
adds a test that the default search timeout is correctly attached to the
search context.

Relates #21599
2016-11-16 12:43:47 -05:00
Jason Tedor d06a8903fd Merge branch 'master' into feature/seq_no
* master: (22 commits)
  Add proper toString() method to UpdateTask (#21582)
  Fix `InternalEngine#isThrottled` to not always return `false`. (#21592)
  add `ignore_missing` option to SplitProcessor (#20982)
  fix trace_match behavior for when there is only one grok pattern (#21413)
  Remove dead code from GetResponse.java
  Fixes date range query using epoch with timezone (#21542)
  Do not cache term queries. (#21566)
  Updated dynamic mapper section
  Docs: Clarify date_histogram bucket sizes for DST time zones
  Handle release of 5.0.1
  Fix skip reason for stats API parameters test
  Reduce skip version for stats API parameter tests
  Strict level parsing for indices stats
  Remove cluster update task when task times out (#21578)
  [DOCS] Mention "all-fields" mode doesn't search across nested documents
  InternalTestCluster: when restarting a node we should validate the cluster is formed via the node we just restarted
  Fixed bad asciidoc in boolean mapping docs
  Fixed bad asciidoc ID in node stats
  Be strict when parsing values searching for booleans (#21555)
  Fix time zone rounding edge case for DST overlaps
  ...
2016-11-16 09:10:35 -05:00
Yannick Welsch aa73a76ffd Add proper toString() method to UpdateTask (#21582)
Adds a proper toString() method to ClusterService.UpdateTask
2016-11-16 15:07:26 +01:00
Adrien Grand d7fa2eb155 Fix `InternalEngine#isThrottled` to not always return `false`. (#21592)
Currently it inherits from the default implementation which always returns
`false`, even if indexing is being throttled.
2016-11-16 15:01:05 +01:00
Tal Levy 6796464f16 add `ignore_missing` option to SplitProcessor (#20982)
Closes #20840.
2016-11-16 15:46:09 +02:00
Simon Willnauer 6baded8e7f Remove dead code from GetResponse.java 2016-11-16 10:48:15 +01:00
Colin Goodheart-Smithe c6c734dce1 Fixes date range query using epoch with timezone (#21542)
This change fixes the rnage query so that an exception is always thrown if the range query uses epoch time together with a time zone. Since epoch time is always UTC it should not be used with a time zone.

Closes #21501
2016-11-16 09:11:04 +00:00
Adrien Grand 00de8e07fc Do not cache term queries. (#21566)
There have been reports that the query cache did not manage to speed up search
requests when the query includes a large number of different sub queries since
a single request may manage to exhaust the whole history (256 queries) while
the query cache only starts caching queries once they appear multiple times in
the history (#16031). On the other hand, increasing the size of the query cache
is a bit controversial (#20116) so this pull request proposes a different
approach that consists of never caching term queries, and not adding them to the
history of queries either. The reasoning is that these queries should be fast
anyway, regardless of caching, so taking them out of the equation should not
cause any slow down. On the other hand, the fact that they are not added to the
cache history anymore means that other queries have greater chances of being
cached.
2016-11-16 10:02:24 +01:00
Nik Everett e66261eee9 Handle release of 5.0.1
Adds a version constant for it, bwc indices, and a vagrant upgrade-from
version. Also bumps the "upgrade from" version for the backwards-5.0
test and adds `skip`s for tests that don't fail against 5.0 so we skip
them during the backwards testing.

Finally, this skips the "Shrink index via API" test because it fails
consistently for me. Inconsistently for CI, but consistently for me.
I'll work on making it consistent tomorrow.
2016-11-15 19:31:28 -05:00
Jason Tedor 17b0041aaf Strict level parsing for indices stats
A previous commit added strict level parsing for the node stats API, but
that commit missed adding the same for the indices stats API. This
commit rectifies this miss.

Relates #21577
2016-11-15 16:26:37 -05:00
Yannick Welsch 40e0162e61 Remove cluster update task when task times out (#21578)
Fixes an issue where the cluster service does not remove an update task from its internal data structures that are used for batching cluster state updates.

* review comments

* checkstyle
2016-11-15 21:38:58 +01:00
Lee Hinman 96122aa518 Be strict when parsing values searching for booleans (#21555)
This changes only the query parsing behavior to be strict when searching on
boolean values. We continue to accept the variety of values during index time,
but searches will only be parsed using `"true"` or `"false"`.

Resolves #21545
2016-11-15 10:36:57 -07:00
Christoph Büscher cd4634bdc6 Fix time zone rounding edge case for DST overlaps
When using TimeUnitRounding with a DAY_OF_MONTH unit, failing tests in #20833
uncovered an issue when the DST shift happenes just one hour after midnight
local time and sets back the clock to midnight, leading to an overlap.
Previously this would lead to two different rounding values, depending on
whether a date before or after the transition was rounded. This change detects
this special case and correct for it by using the previous rounding date for
both cases.

Closes #20833
2016-11-15 18:23:47 +01:00
Jason Tedor f5ac0e5076 Remove lenient stats parsing
Today when parsing a stats request, Elasticsearch silently ignores
incorrect metrics. This commit removes lenient parsing of stats requests
for the nodes stats and indices stats APIs.

Relates #21417
2016-11-15 12:17:26 -05:00
Boaz Leskes 2c0338fa87 Merge remote-tracking branch 'upstream/master' into feature/seq_no 2016-11-15 17:09:08 +00:00
Boaz Leskes d6c2b4f7c5 Adapt InternalTestCluster to auto adjust `minimum_master_nodes` (#21458)
#20960 removed `LocalDiscovery` and we now use `ZenDiscovery` in all our tests. To keep cluster forming fast, we are using a `MockZenPing` implementation which uses static maps to return instant results making master election fast. Currently, we don't set `minimum_master_nodes` causing the occasional split brain when starting multiple nodes concurrently and their pinging is so fast that it misses the fact that one of the node has elected it self master. To solve this, `InternalTestCluster` is modified to behave like a true cluster and manage and set `minimum_master_nodes` correctly with every change to the number of nodes.

Tests that want to manage the settings themselves can opt out using a new `autoMinMasterNodes` parameter to the `ClusterScope` annotation. 

Having `min_master_nodes` set means the started node may need to wait for other nodes to be started as well. To combat this, we set `discovery.initial_state_timeout` to `0` and wait for the cluster to form once all node have been started. Also, because a node may wait and ping while other nodes are started, `MockZenPing` is adapted to wait rather than busy-ping.
2016-11-15 13:42:26 +00:00
Jason Tedor ee722d738a Fix internal engine sequence number test bug
This commit fixes a test bug in internal engine tests, and adds some
additional assertions.
2016-11-15 08:34:54 -05:00
Simon Willnauer 66fbb0dbc2 Don't fail in `afterExecute` if context is already closed (#21563)
We run an assert on an potentially closed thread context. this should
not bubble up the `IllegalStateException`.
2016-11-15 13:55:50 +01:00
Adrien Grand 54809065a6 Make PercolatorFieldMapper get a QueryShardContext lazily. 2016-11-15 12:02:40 +01:00
Boaz Leskes c9f49039d3 Merge remote-tracking branch 'upstream/master' into feature/seq_no 2016-11-15 10:14:47 +00:00
Simon Willnauer 200a2850a9 [TEST] Don't stop MockAppender some nodes might concurrently use it 2016-11-15 10:48:39 +01:00
Boaz Leskes 6d9af2fff4 Uncommitted mapping updates should not efect existing indices (#21306)
When processing a mapping updates, the master current creates an `IndexService` and uses its mapper service to do the hard work. However, if the master is also a data node and it already has an instance of `IndexService`, we currently reuse the the `MapperService` of that instance. Sadly, since mapping updates are change the in memory objects, this means that a mapping change that can rejected later on during cluster state publishing will leave a side effect on the index in question, bypassing the cluster state safety mechanism.

This commit removes this optimization and replaces the `IndexService` creation with a direct creation of a `MapperService`. 

Also, this fixes an issue multiple from multiple shards for the same field caused unneeded cluster state publishing as the current code always created a new cluster state.

This were discovered while researching #21189
2016-11-15 10:47:34 +01:00
Adrien Grand ad94bea0bb Remove XPointValues. (#21541)
This class had been added to address a bug in PointValues, which has been fixed
since then.
2016-11-15 10:11:41 +01:00
Martijn van Groningen 8a3a885058 inner_hits: Skip adding a parent field to nested documents.
Otherwise an empty string get added as _parent field.

Closes #21503
2016-11-15 07:32:28 +01:00
Ryan Ernst c7bd4f3454 Tests: Add TestZenDiscovery and replace uses of MockZenPing with it (#21488)
This changes adds a test discovery (which internally uses the existing
mock zenping by default). Having the mock the test framework selects be a discovery
greatly simplifies discovery setup (no more weird callback to a Node
method).
2016-11-14 21:46:10 -08:00
Ryan Ernst d14c470b89 Remove generics from ActionRequest
closes #21368
2016-11-14 15:32:01 -08:00
Jason Tedor 48579cccab Add socket permissions for tribe nodes
Today when a node starts, we create dynamic socket permissions based on
the configured HTTP ports and transport ports. If no ports are
configured, we use the default port ranges. When a tribe node starts, a
tribe node creates an internal node client for connecting to each remote
cluster. If neither an explicit HTTP port nor transport ports were
specified, the default port ranges are large enough for the tribe node
and its internal node clients. If an explicit HTTP port or transport
port was specified for the tribe node, then socket permissions for those
ports will be created, but not for the internal node clients. Whether
the internal node clients have explicit ports specified, or attempt to
bind within the default range, socket permissions for these will not
have been created and the internal node clients will hit a permissions
issue when attempting to bind. This commit addresses this issue by also
accounting for tribe nodes when creating the dynamic socket
permissions. Additionally, we add our first real integration test for
tribe nodes.

Relates #21546
2016-11-14 15:09:45 -05:00
Jay Modi 87d76c3ff8 assert blocking calls are not made on the cluster state update thread
This commit adds an assertion to ensure that we do not introduce blocking calls in code
that is called in a ClusterStateListener or another part of the cluster state update process.
2016-11-14 14:30:01 -05:00
Jason Tedor 9fb54f4ef8 Remove unnecessary hash map copy in o.e.b.Security
This commit removes an unnecessary copying of the tribe node group
settings in o.e.b.Security.
2016-11-14 13:49:16 -05:00
Jason Tedor a12f09317d Fallback to settings if transport profile is empty
If the transport profile does not contain a TCP port range, we fallback
to the top-level settings.
2016-11-14 13:48:12 -05:00
Jason Tedor 491a945ac8 Add socket permissions for tribe nodes
Today when a node starts, we create dynamic socket permissions based on
the configured HTTP ports and transport ports. If no ports are
configured, we use the default port ranges. When a tribe node starts, a
tribe node creates an internal node client for connecting to each remote
cluster. If neither an explicit HTTP port nor transport ports were
specified, the default port ranges are large enough for the tribe node
and its internal node clients. If an explicit HTTP port or transport
port was specified for the tribe node, then socket permissions for those
ports will be created, but not for the internal node clients. Whether
the internal node clients have explicit ports specified, or attempt to
bind within the default range, socket permissions for these will not
have been created and the internal node clients will hit a permissions
issue when attempting to bind. This commit addresses this issue by also
accounting for tribe nodes when creating the dynamic socket
permissions. Additionally, we add our first real integration test for
tribe nodes.
2016-11-14 11:58:44 -05:00
Simon Willnauer 1d8c8529ed Remove `IndexTemplateAlreadyExistsException` and `IndexShardAlreadyExistsException` (#21539)
Both exception can be replaced with java built-in exception, IAE and ISE respectively.
This should be back ported partially to 5.x which the transport layer code should be preserved.

Relates to #21494
2016-11-14 17:09:57 +01:00
Simon Willnauer 26375256ff Enable 5.x to 6.x BWC tests (#21537)
This commit enables real BWC testing against a 5.1 snapshot. All
REST tests plus rolling upgrade test now run against a mixed version
cross major version cluster.
2016-11-14 17:03:57 +01:00
Yannick Welsch d3e97ce6cd Fix line length in TCPTransportTests
Makes checkstyle happy
2016-11-14 16:55:14 +01:00
Yannick Welsch d42f7eec61 Check valid cluster service state transitions (#21538)
This commit adds assertions to check whether the cluster service state transitions in a way that we expect it to.

Relates to #21379.
2016-11-14 16:49:25 +01:00
Simon Willnauer 26a8a94e56 [TEST] Add test to ensure `transport.tcp.compress` works
This adds a basic unittest to ensure `transport.tcp.compress` has effect
on all basic TcpTransport implementations.

Relates to #21526
2016-11-14 16:13:44 +01:00
Simon Willnauer 7d4bde8e00 remove forbidden API 2016-11-14 15:30:07 +01:00
Yannick Welsch 8655cd7182 Add assertion that checks that the same shard with same id is not added to same node (#21498)
Adds an assertion that checks that the same shard with same id is not added to same node. Previously we would just silently ignore the second shard being added.
2016-11-14 15:14:14 +01:00
Simon Willnauer bdc942fa72 Enable 5.x to 6.x BWC tests
This commit enables real BWC testing against a 5.1 snapshot. All
REST tests plus rolling upgrade test now run against a mixed version
cross major version cluster.
2016-11-14 14:26:49 +01:00
Adrien Grand 1fd5c47e7f Upgrade to lucene-6.3.0. (#21464) 2016-11-14 09:36:45 +01:00
Jason Tedor c7a1b3eb50 Merge branch 'master' into feature/seq_no
* master:
  Hack around cluster service and logging race
  Do not prematurely shutdown Log4j
  Support decimal constants with trailing [dD] in painless (#21412)
  In painless suggest a long constant if int won't do (#21415)
  Account for different paths for sysctl utilities
  [TEST] testRebalancePossible() may not have an assigned node id
  Tests: Disable merge in SearchCancellationTests
  Tests: clean search scroll at the end of SearchCancellationIT
2016-11-13 20:01:44 -05:00
Jason Tedor 19decd7552 Hack around cluster service and logging race
When a cluster update task executes, there can be log messages after the
update task has finished processing and the new cluster state becomes
visible. The visibility of the cluster state allows the test thread in
UpdateSettingsIT#testUpdateAutoThrottleSettings and
UpdateSettingsiT#testUpdateMergeMaxThreadCount to proceed. The test
thread will remove and stop a mock appender setup at the beginning of
the test. The log messages in the cluster state update task that occur
after processing has finished can race with the removal of the
appender. Log4j will grab a reference to the appenders when processing
these log messages, and this races with the removal and stopping of the
appenders. If Log4j grabs a reference to the appenders before the mock
appender has been removed, and the test thread subsequently removes and
stops the appender before Log4j has appended the log message, Log4j will
get angry that we are appending to a stopped appender, causing the test
to fail. This commit addresses this race by waiting for the cluster
state update task to have finished processing before freeing the test
thread to make its assertions and finally remove and stop the
appender. Yes, this is a hack.

Relates #21518
2016-11-13 18:06:12 -05:00
Jason Tedor d273419d00 Do not prematurely shutdown Log4j
When a node closes, we shutdown logging as the last statement. This
statement must be last lest any subsequent attempts to log will blow up
by running into security permissions. Yet, in the case of a tribe node
this isn't enough. The first internal tribe node to close will shutdown
logging, and subsequent node closes will blow up with the aforementioned
problem. This commit migrate the Log4j shutdown to occur as part of the
shutdown hook that closes the node, after all nodes have
closed. Consequently, we can remove a hack in the test infrastructure to
prevent Log4j shutdowns when internal test nodes close and instead just
register a single shutdown hook that runs when the test JVM exits.

Relates #21519
2016-11-13 17:27:30 -05:00
Boaz Leskes fac6cf0d4e testUpgradeOldIndex should properly set index setting. They are needed for assertions 2016-11-12 11:42:02 +01:00
Ali Beyad 38023fb58d [TEST] testRebalancePossible() may not have an assigned node id 2016-11-11 23:10:34 -05:00
Igor Motov ca639e8c86 Tests: Disable merge in SearchCancellationTests
We have to have at least 2 segments for the test to work and sometimes random merge policy merges them into one.
2016-11-11 18:22:28 -05:00
Igor Motov 058b6e019c Tests: clean search scroll at the end of SearchCancellationIT
Under some rare conditions search cancellation response might not fully clean scroll context. For now this commit adds the cleaning operation to the test, and we will address the root cause in https://github.com/elastic/elasticsearch/issues/21511
2016-11-11 18:22:15 -05:00
Jason Tedor 1ea69b1a80 Merge branch 'master' into feature/seq_no
* master:
  Set vm.max_map_count on systemd package install
  [TEST] reduce the number of snapshotted shards to 1 in testSnapshotSucceedsAfterSnapshotFailure() so that we are more likely to trigger I/O exceptions on writing the control files during the finalize phase of snapshotting (with the aim of triggering an I/O failure when writing pending-index-*).
  Add documentation for Logger with Transport Client
  Enable appender exceptions in UpdateSettingsIT
  [TEST] remove AwaitsFix from testSnapshotSucceedsAfterSnapshotFailure, turns out the issue is specific to Java 9 v143
  Cleanup formatting in UpdateSettingsIT.java
  [TEST] mute the testSnapshotSucceedsAfterSnapshotFailure() test until its clear what is going wrong.
  Mark SearchQueryIT test as awaits fix
  Makes snapshot throttling test go much faster (#21485)
  Breaking changes docs for template index_patterns
  [TEST] adds randomness between atomic and non-atomic move operations in MockRepository
  Cache successful shard deletion checks (#21438)
  Task cancellation command should wait for all child nodes to receive cancellation request before returning
2016-11-11 17:03:01 -05:00
Jason Tedor d06f43c706 Tighten sequence number assertion
We have an assertion in the engine regarding the initial state of a
sequence number before an indexing operation. This assertion is too
loose, it catches operations during recovery from old indices where
sequence numbers do not even exist. This commit tightens these
assertions to not catch such operations and enables us to reenable some
tests.

Relates #21509
2016-11-11 16:49:13 -05:00
Ali Beyad 5f1d108704 [TEST] reduce the number of snapshotted shards to 1 in testSnapshotSucceedsAfterSnapshotFailure()
so that we are more likely to trigger I/O exceptions on writing the control files during the
finalize phase of snapshotting (with the aim of triggering an I/O failure when writing pending-index-*).
2016-11-11 16:22:11 -05:00
Jason Tedor 8d1260a58a Convert nocommit to TODO in SeqNoFieldMapper
This commit converts a nocommit to a TODO in SeqNoFieldMapper that will
be dealt with in a follow-up.
2016-11-11 16:11:41 -05:00
Jason Tedor c77d285699 Remove nocommit in TransportShardBulkAction
This commit removes a nocommit in TransportShardBulkAction that deserves
a larger issue.
2016-11-11 16:10:22 -05:00
Jason Tedor 33f7cd5a16 Remove shard ID from doc write response
This commit removes the shard ID from doc write response; this was
useful for debugging but its time has passed.

Relates #21508
2016-11-11 15:18:25 -05:00
Jason Tedor 9352d16602 Enable appender exceptions in UpdateSettingsIT
This commit sets the mock appender in UpdateSettingsIT to not ignore
exceptions. This means that when an exception is hit, we will see an
actual stack trace that could be useful in debugging a non-reproducible
test failure.

Relates #21461
2016-11-11 12:41:20 -05:00
Ali Beyad c9c3992f94 [TEST] remove AwaitsFix from testSnapshotSucceedsAfterSnapshotFailure,
turns out the issue is specific to Java 9 v143
2016-11-11 12:37:04 -05:00
Jason Tedor 79076334ae Cleanup formatting in UpdateSettingsIT.java
This commit cleans up some code formatting in UpdateSettingsIT.java and
removes this from from the checkstyle line-length supressions.
2016-11-11 12:10:32 -05:00
Ali Beyad 8f85e388da [TEST] mute the testSnapshotSucceedsAfterSnapshotFailure() test
until its clear what is going wrong.

Relates #21496
2016-11-11 11:50:23 -05:00
Jason Tedor 372480a16a Mark SearchQueryIT test as awaits fix
This commit marks the test SearchQueryIT#testRangeQueryWithTimeZone as
awaits fix.

Relates #21501
2016-11-11 11:33:17 -05:00
Yannick Welsch 9cbb23f3d7 Test distinctNodes 2016-11-11 17:29:51 +01:00
Jason Tedor 1e7c424479 Merge branch 'master' into feature/seq_no
* master:
  ShardActiveResponseHandler shouldn't hold to an entire cluster state
  Ensures cleanup of temporary index-* generational blobs during snapshotting (#21469)
  Remove (again) test uses of onModule (#21414)
  [TEST] Add assertBusy when checking for pending operation counter after tests
  Revert "Add trace logging when aquiring and releasing operation locks for replication requests"
  Allows multiple patterns to be specified for index templates (#21009)
  [TEST] fixes rebalance single shard check as it isn't guaranteed that a rebalance makes sense and the method only tests if rebalance is allowed
  Document _reindex with random_score
2016-11-11 11:25:27 -05:00
Ali Beyad a5ccd02e76 Makes snapshot throttling test go much faster (#21485)
[TEST] Makes the snapshot throttling test go much faster.  Before, 
the snapshot throttling test would throttle at a rate of 0.5 kb per
second, even though it would snapshot/restore about 25 kb of data.
This commit increases the throttling rate to 10kb per second, so
we still test the throttling mechanism while speeding up the test from
taking 30 plus seconds down to 2 seconds or less.
2016-11-11 10:52:26 -05:00
Yannick Welsch d195ef258b test fix 2016-11-11 16:09:34 +01:00
Yannick Welsch 1635baf876 fix tests that add duplicate shards 2016-11-11 15:28:40 +01:00
Yannick Welsch 7099f10909 Add assertion that checks that the same shard with same id is not added to same node 2016-11-11 15:28:40 +01:00
Ali Beyad adb7aaded4 [TEST] adds randomness between atomic and non-atomic move
operations in MockRepository
2016-11-11 09:07:28 -05:00
Yannick Welsch 2d3a52c0f2 Cache successful shard deletion checks (#21438)
Each node checks on every cluster state update if there are shards that it can possibly delete from its disk. It decides this by doing a file-system lookup for each shard id that is fully allocated in the cluster. With lots of shards, this amounts to lots of Files.exists() checks, considerably slowing down cluster state updates. This commit adds a caching layer so that the Files.exists() checks can be skipped if not needed.
2016-11-11 10:06:15 +01:00
Jason Tedor d3417fb022 Merge branch 'master' into feature/seq_no
* master: (516 commits)
  Avoid angering Log4j in TransportNodesActionTests
  Add trace logging when aquiring and releasing operation locks for replication requests
  Fix handler name on message not fully read
  Remove accidental import.
  Improve log message in TransportNodesAction
  Clean up of Script.
  Update Joda Time to version 2.9.5 (#21468)
  Remove unused ClusterService dependency from SearchPhaseController (#21421)
  Remove max_local_storage_nodes from elasticsearch.yml (#21467)
  Wait for all reindex subtasks before rethrottling
  Correcting a typo-Maan to Man-in README.textile (#21466)
  Fix InternalSearchHit#hasSource to return the proper boolean value (#21441)
  Replace all index date-math examples with the URI encoded form
  Fix typos (#21456)
  Adapt ES_JVM_OPTIONS packaging test to ubuntu-1204
  Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431)
  Add VirtualBox version check (#21370)
  Export ES_JVM_OPTIONS for SysV init
  Skip reindex rethrottle tests with workers
  Make forbidden APIs be quieter about classpath warnings (#21443)
  ...
2016-11-10 23:40:33 -05:00
Igor Motov df965fc9b3 Task cancellation command should wait for all child nodes to receive cancellation request before returning
Currently the task cancellation command returns as soon as the top-level parent child is marked as cancelled. This create race conditions in tests where child tasks on other nodes may continue to run for some time after the main task is cancelled. This commit fixes this situation making task cancellation command to wait until it got propagated to all nodes that have child tasks.

Closes #21126
2016-11-10 22:43:43 -05:00
Igor Motov 06a50fa31e ShardActiveResponseHandler shouldn't hold to an entire cluster state
ShardActiveResponseHandler doesn't need to hold to an entire cluster state since it only needs to know the cluster state version. It seems that on overloaded systems where nodes are unresponsive holding onto a lot of different cluster states can make the situation worse.

Closes #21394
2016-11-10 22:28:49 -05:00
Ali Beyad 3001b636db Ensures cleanup of temporary index-* generational blobs during snapshotting (#21469)
Ensures pending index-* blobs are deleted when snapshotting.  The
index-* blobs are generational files that maintain the snapshots
in the repository.  To write these atomically, we first write a
`pending-index-*` blob, then move it to `index-*`, which also deletes
`pending-index-*` in case its not a file-system level move (e.g.
S3 repositories) .  For example, to write the 5th generation of the
index blob for the repository, we would first write the bytes to
`pending-index-5` and then move `pending-index-5` to `index-5`.  It is
possible that we fail after writing `pending-index-5`, but before
moving it to `index-5` or deleting `pending-index-5`.  In this case,
we will have a dangling `pending-index-5` blob laying around.  Since
snapshot #5 would have failed, the next snapshot assumes a generation
number of 5, so it tries to write to `index-5`, which first tries to
write to `pending-index-5` before moving the blob to `index-5`.  Since
`pending-index-5` is leftover from the previous failure, the snapshot
fails as it cannot overwrite this blob.

This commit solves the problem by first, adding a UUID to the
`pending-index-*` blobs, and secondly, strengthen the logic around
failure to write the `index-*` generational blob to ensure pending
files are deleted on cleanup.

Closes #21462
2016-11-10 21:45:02 -05:00
Ryan Ernst 48bfb142b9 Remove (again) test uses of onModule (#21414)
This change was reverted after it caused random test failures. This was
due to a copy/paste error in the original PR which caused the mock
version of ClusterInfoService to be used whenever the mock *ZenPing* was
used, and the real ClusterInfoService to be used when MockZenPing was
not used.
2016-11-10 16:06:14 -08:00
Areek Zillur 7ed195fe93 [TEST] Add assertBusy when checking for pending operation counter after tests
Currently, pending operations can complete after tests with disruption scheme
completes. This commit waits for the pending operation counter to complete
after the tests are run
2016-11-10 18:35:52 -05:00
Areek Zillur 5b4c3fb1ac Revert "Add trace logging when aquiring and releasing operation locks for replication requests"
This reverts commit 4e996ca9f5.
2016-11-10 18:35:25 -05:00
Alexander Lin 0219a211d3 Allows multiple patterns to be specified for index templates (#21009)
* Allows for an array of index template patterns to be provided to an
index template, and rename the field from 'template' to 'index_pattern'.

Closes #20690
2016-11-10 18:00:30 -05:00
Ali Beyad 5c4392e58a [TEST] fixes rebalance single shard check as it isn't guaranteed that a
rebalance makes sense and the method only tests if rebalance is allowed
2016-11-10 17:13:39 -05:00
Jason Tedor 179dd885e2 Avoid angering Log4j in TransportNodesActionTests
When logging a mock exception, Log4j attempts to render the stack
trace. On a mock exception, this will be null and Log4j will hit a
NullPointerException. This NullPointerException will get recorded in the
status logger buffer that we use to ensure that we do not having any
misuses of Log4j in production code. This commit replaces the use of a
mock exception with an actual exception to avoid angering the Log4j
assertions in ESTestCase.
2016-11-10 16:08:08 -05:00
Areek Zillur 4e996ca9f5 Add trace logging when aquiring and releasing operation locks for replication requests 2016-11-10 15:13:42 -05:00
Jason Tedor 0a06a0c2b3 Fix handler name on message not fully read
Today when a message is not fully read on a response, we log (among
other details) the handler name. Unfortunately, if the handler is a
wrapper, all that we see is

o.e.t.TransportService$ContextRestoreResponseHandler@7446ba18

completely losing the offending handler. This commit adds an override
for TransportService$ContextRestoreResponseHandler#toString so that the
underlying offender can be discovered.

Relates #21478
2016-11-10 14:56:48 -05:00
Jack Conradson 834976823a Remove accidental import. 2016-11-10 11:46:14 -08:00
Jason Tedor fdbe336104 Improve log message in TransportNodesAction
Today when handling responses from nodes in TransportNodesAction, if a
node timeouts or some other failure occurs and the action is not
accumulating exceptions, we log a confusing message:

org.elasticsearch.action.admin.cluster.stats.TransportClusterStatsAction]
ignoring unexpected response [null] of type [null], expected
[ClusterStatsNodeResponse] or [FailedNodeException]

Moreover, the original exception is completely lost. Since this log
message is confusing and unhelpful, we can drop it. Instead, we hold
onto the exception and log it at the warn level before dropping it from
the response.

Relates #21476
2016-11-10 14:32:14 -05:00
Jack Conradson aeb97ff412 Clean up of Script.
Closes #21321
2016-11-10 09:59:13 -08:00
Tanguy Leroux 2e531902ff Update Joda Time to version 2.9.5 (#21468)
This commit updates JodaTime to version 2.9.5 that contain a fix for a bug when parsing time zones (see https://github.com/JodaOrg/joda-time/pull/332, https://github.com/JodaOrg/joda-time/issues/386 and https://github.com/JodaOrg/joda-time/issues/373).

It also remove the joda-convert dependency that seems to be unused.
    
closes #20911

Here is the changelog for 2.9.5:
```
Changes in 2.9.5
----------------
 - Add Norwegian period translations [#378]

 - Add Duration.dividedBy(long,RoundingMode) [#69, #379]

 - DateTimeZone data updated to version 2016i

 - Fixed bug where clock read twice when comparing two nulls in DateTimeComparator [#404]

 - Fixed minor issues with historic time-zone data [#373]

 - Fix bug in time-zone binary search [#332, #386]
  The fix in v2.9.2 caused problems when the time-zone being parsed
  was not the last element in the input string. New approach uses a
  different approach to the problem.

 - Update tests for JDK 9 [#394]

 - Close buffered reader correctly in zone info compiler [#396]

 - Handle locale correctly zone info compiler [#397]
```
2016-11-10 17:32:46 +01:00
Luca Cavanna 10a4288a4c Remove unused ClusterService dependency from SearchPhaseController (#21421) 2016-11-10 17:32:19 +01:00
Luca Cavanna bd23921a3a Fix InternalSearchHit#hasSource to return the proper boolean value (#21441)
The method used to be called `isSourceEmpty`, and was renamed to `hasSource`, but the return value never changed. Updated tests and users accordingly.

Closes #21419
2016-11-10 13:13:38 +01:00
Nguyễn Thanh Tiến 27a7b30349 Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431)
Add null check in InternalSearchHit#sourceRef to prevent NPE

Closes #19279
2016-11-10 10:54:43 +01:00