OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nik Everett	3adefb7b4a	Begin centralizing XContentParser creation into RestRequest (#22041 ) To get #22003 in cleanly we need to centralize as much `XContentParser` creation as possible into `RestRequest`. That'll mean we have to plumb the `NamedXContentRegistry` into fewer places. This removes `RestAction.hasBody`, `RestAction.guessBodyContentType`, and `RestActions.getRestContent`, moving callers over to `RestRequest.hasContentOrSourceParam`, `RestRequest.contentOrSourceParam`, `RestRequest.contentOrSourceParamParser`, and `RestRequest.withContentOrSourceParamParserOrNull`. The idea is to use `withContentOrSourceParamParserOrNull` if you need to handle requests without any sort of body content and to use `contentOrSourceParamParser` otherwise. I believe the vast majority of this PR to be purely mechanical but I know I've made the following behavioral changes (I'll add more if I think of more): * If you make a request to an endpoint that requires a request body and has cut over to the new APIs, instead of getting `Failed to derive xcontent` you'll get `Body required`. * Template parsing is now non-strict by default. This is important because we need to be able to deprecate things without requests failing.	2016-12-09 20:23:02 -05:00
Nik Everett	ddade1b5ac	Improve the error message if task and node isn't found (#22062 ) Improves the error message returned when looking up a task that belongs to a node that is no longer part of the cluster. The new error message tells the user that the node isn't part of the cluster. This is useful because if you start a task and the node goes down there isn't a record of the task at all. This hints to the user that the task might have died with the node. Relates to #22027	2016-12-09 15:50:46 -05:00
Igor Motov	93b5e55660	Restores the original default format of search slow log In 5.0, the search slow log switched to the multi-line format with no option to get back to the original single-line format that was used prior to 5.0 by default. This commit removes the reformat option from the search slow log and returns the search slow log back to the single-line format. Closes #21711	2016-12-09 12:38:28 -05:00
Yannick Welsch	b20b160a5e	Allow flush/force_merge/upgrade on shard marked as relocated (#22078 ) A shard that is locally marked as relocated, but where the relocation target shard has not been activated yet by the master, can still receive index operations, which in return can lead to flushes being triggered. Flushing is currently (wrongly) prohibited on shards marked as relocated, which makes the flushing process go into an endless retry loop and log warnings until the shard is closed. This commit fixes this situation by allowing flush, force_merge and upgrade operations to run on shards that are marked as relocated.	2016-12-09 17:56:40 +01:00
Nik Everett	bcef1e7452	Better error message when _parent isn't an object (#21987 ) If you make a mistake and specify a mapping like: ``` { "parent": { "properties": {} }, "child": { "_parent": "parent", "properties": {} } } ``` then the error message you get back amounts to `Failed to parse mapping for [child]: can't cast a String to a Map`. Since it doesn't tell you which string can't be cast to a map you have to dig through the stack trace to figure out what to fix. This replaces the error message with: ``` Failed to parse mapping [child]: [_parent] must be an object containing [type] ``` so you can tell that the problem is with the `parent` field.	2016-12-09 11:33:31 -05:00
Yannick Welsch	a724f4eb61	Don't update nodes list when stepping down as master (#22049 ) This commit simplifies the node update logic so that nodes are never removed from the cluster state when the cluster state is not published.	2016-12-09 14:55:48 +01:00
Christoph Büscher	2592ff86ce	Add fromXContent to InternalNestedIdentity This adds a fromXContent method and unit test to InternalNestedIdentity so we can parse it as part of a search response. This is part of the preparation for parsing search responses on the client side.	2016-12-09 14:52:06 +01:00
Yannick Welsch	db0660a7ea	Reject external versioning and explicit version numbers on create (#21998 ) Fixes an issue where indexing requests with operation type "create" auto-convert external versioning to internal versioning and silently ignore the version number instead of failing with an error message.	2016-12-09 14:21:22 +01:00
Michael McCandless	613a1a6a18	Add stored binary fields to static backwards compatibility indices tests (#22054 ) Add stored binary fields to static backwards compatibility indices tests	2016-12-09 05:32:40 -05:00
Adrien Grand	6714e02bef	Mute RestoreBackwardsCompatIT.testRestoreUnsupportedSnapshots.	2016-12-09 10:41:07 +01:00
Adrien Grand	787519ee4c	Fix `other_bucket` on the `filters` agg to be enabled if a key is set. (#21994 ) Closes #21951	2016-12-09 09:48:48 +01:00
Adrien Grand	1bdf4a2c5b	Partition-based include-exclude does not implement equals/hashcode/serialization correctly. (#22051 )	2016-12-09 09:48:16 +01:00
Adrien Grand	9524c81af9	Document the `locale` option of the `date` field. (#22050 ) This also adds another level of protection against using the default locale. Relates to https://discuss.elastic.co/t/mapping-for-12h-date-format/68433/3.	2016-12-09 09:45:53 +01:00
Adrien Grand	36f598138a	Start using `ObjectParser` for aggs. (#22048 ) This is an attempt to start moving aggs parsing to `ObjectParser`. There is still A LOT to do, but ObjectParser is way better than the way aggregations parsing works today. For instance in most cases, we reject numbers that are provided as strings, which we are supposed to accept since some client languages (looking at you Perl) cannot make sure to use the appropriate types. Relates to #22009	2016-12-09 09:45:16 +01:00
Ryan Ernst	b1cef5fdf8	Remove 2.0 prerelease version constants (#22004 ) * Remove 2.0 prerelease version constants This is a start to addressing #21887. This removes: * pre 2.0 snapshot format support * automatic units addition to cluster settings * bwc check for delete by query in pre 2.0 indexes	2016-12-08 21:48:35 -08:00
Igor Motov	7f79c99e9a	Add descriptions to bulk tasks Related to #21768	2016-12-08 21:59:52 -05:00
Lee Hinman	ef64d230e7	Merge remote-tracking branch 'dakrone/index-seq-id-and-primary-term'	2016-12-08 19:47:21 -07:00
Lee Hinman	ee22a477df	Add internal _primary_term doc values field, fix _seq_no indexing This adds the `_primary_term` field internally to the mappings. This field is populated with the current shard's primary term. It is intended to be used for collision resolution when two document copies have the same sequence id, therefore, doc_values for the field are stored but the field itself is not indexed. This also fixes the `_seq_no` field so that doc_values are retrievable (they were previously stored but irretrievable) and changes the `stats` implementation to more efficiently use the points API to retrieve the min/max instead of iterating on each doc_value value. Additionally, even though we intend to be able to search on the field, it was previously not searchable. This commit makes it searchable. There is no user-visible `_primary_term` field. Instead, the fields are updated by calling: ```java index.parsedDoc().updateSeqID(seqNum, primaryTerm); ``` This includes example methods in `Versions` and `Engine` for retrieving the sequence id values from the index (see `Engine.getSequenceID`) that are only used in unit tests. These will be extended/replaced by actual implementations once we make use of sequence numbers as a conflict resolution measure. Relates to #10708 Supersedes #21480 P.S. As a side effect of this commit, `SlowCompositeReaderWrapper` cannot be used for documents that contain `_seq_no` because it is a Point value and SCRW cannot wrap documents with points, so the tests have been updated to loop through the `LeafReaderContext`s now instead.	2016-12-08 19:47:03 -07:00
Jason Tedor	c9882dd1a0	Avoid NPE in NodeService#stats if HTTP is disabled This commit adds safety against an NPE if HTTP stats are requested but HTTP is disabled on a node. Relates #22060	2016-12-08 19:59:02 -05:00
Jason Tedor	f713106827	Bump version to 5.1.2 This commit bumps the version to 5.1.2. Relates #22057	2016-12-08 16:40:39 -05:00
Ali Beyad	3da04293f3	Cannot force allocate primary to a node where the shard already exists (#22031 ) Before, it was possible that the SameShardAllocationDecider would allow force allocation of an unassigned primary to the same node on which an active replica is assigned. This could only happen with shadow replica indices, because when a shadow replica primary fails, the replica gets promoted to primary but in the INITIALIZED state, not in the STARTED state (because the engine has specific reinitialization that must take place in the case of shadow replicas). Therefore, if the now promoted primary that is initializing fails also, the primary will be in the unassigned state, because replica to primary promotion only happens when the failed shard was in the started state. The now unassigned primary shard will go through the allocation deciders, where the SameShardAllocationDecider would return a NO decision, but would still permit force allocation on the primary if all deciders returned NO. This commit implements canForceAllocatePrimary on the SameShardAllocationDecider, which ensures that a primary cannot be force allocated to the same node on which an active replica already exists.	2016-12-08 12:21:19 -05:00
Adrien Grand	182e119699	IP range masks exclude the maximum address of the range. (#22018 ) Closes #22005	2016-12-08 15:58:32 +01:00
makeyang	ce0ad4e08e	add test case for parsing include_in_all in multi fields	2016-12-08 19:44:40 +08:00
Ali Beyad	30bcb06606	When shard data is still being fetched from nodes in the cluster, the ReplicaShardAllocator, when in explain mode, would get the node decisions for all nodes in the cluster. The PrimaryShardAllocator neglected to do this and tried to use the shard fetch data in explain mode, which had not yet been fully fetched. This commit fixes this by ensuring the PrimaryShardAllocator gets node decisions in the same way the ReplicaShardAllocator does in explain mode, if shard data is still being fetched.	2016-12-07 22:21:09 -05:00
makeyang	46cdb411b5	modified code according to nik9000's comments.	2016-12-08 11:04:33 +08:00
Ali Beyad	e6e7bab58c	Prepares allocator decision objects for use with the allocation explain API (#21691 ) This commit enhances the allocator decision result objects (namely, AllocateUnassignedDecision, MoveDecision, and RebalanceDecision) to enable them to be used directly by the cluster allocation explain API. In particular, this commit does the following: - Adds serialization and toXContent methods to the response objects, which will form the explain API responses. - Moves the calculation of the final explanation to the response object itself, removing it from the responsibility of the allocators. - Adds shard store information to the NodeAllocationResult, so that store information is available for each node, when explaining a shard allocation by the PrimaryShardAllocator or the ReplicaShardAllocator. - Removes RebalanceDecision in favor of using MoveDecision for both moving and rebalancing shards. - Removes NodeRebalanceResult in favor of using NodeAllocationResult. - Changes the notion of weight ranking to be relative to the current node, instead of an absolute weight that doesn't convey any added value to the API user and can be confusing. - Introduces a new enum AllocationDecision to convey the decision type, which enables conveying unassigned, moving, and rebalancing scenarios with more detail as opposed to just Decision.Type and AllocationStatus.	2016-12-07 17:37:51 -05:00
Ali Beyad	05f64c550a	[TEST] fixes line length issue in BulkRequestModifierTests	2016-12-07 13:11:55 -05:00
Ryan Ernst	f02a2b6546	Ingest: Moved ingest invocation into index/bulk actions (#22015 ) * Ingest: Moved ingest invocation into index/bulk actions Ingest was originally set up as a plugin, and in order to hook into the index and bulk actions, action filters were used. However, ingest was later moved into core, but the action filters were never removed. This change moves the execution of ingest into the index and bulk actions. * Address PR comments * Remove forwarder direct dependency on ClusterService	2016-12-07 08:43:26 -08:00
Christoph Büscher	7454a9647b	Add fromXContent to HighlightField This adds a fromXContent method and unit test to the HighlightField class so we can parse it as part of a search response. This is part of the preparation for parsing search responses on the client side.	2016-12-07 16:32:44 +01:00
Yannick Welsch	c87cc15d49	Add toString() for TransportReplicationAction.ConcreteShardRequest	2016-12-07 15:58:22 +01:00
Christoph Büscher	31a1c2e240	Remove redundant source setters from IndexRequestBuilder	2016-12-07 15:20:36 +01:00
Yannick Welsch	9630b1a6e7	Promote shadow replica to primary when initializing primary fails (#22021 ) Failing an initializing primary when shadow replicas are enabled for the index can leave the primary unassigned with replicas being active. Instead, a replica should be promoted to primary, which is fixed by this commit.	2016-12-07 13:59:43 +01:00
Yannick Welsch	13e1a6fd40	Trim in-sync allocations set only when it grows (#21976 ) This commit makes two changes to how the in-sync allocations set is updated: - the set is only trimmed when it grows. This prevents trimming too eagerly when the number of replicas was decreased while shards were unassigned. - the allocation id of an active primary that failed is only removed from the in-sync set if another replica gets promoted to primary. This prevents the situation where the only available shard copy in the cluster gets removed from the in-sync set. Closes #21719	2016-12-07 10:59:11 +01:00
Adrien Grand	c746854e03	Pre-built analysis factories do not implement MultiTermAware correctly. (#21981 ) We had tests for the regular factories, but not for the pre-built ones, that ship by default without requiring users to define them in the analysis settings.	2016-12-07 10:32:25 +01:00
Adrien Grand	33b8d7a19d	Expose `ip` fields as strings in scripts. (#21997 ) Currently we expose the internal representation that we use for ip addresses, which are the ipv6 bytes. However, this is not really usable, exposes internal implementation details and also does not work fine with other APIs that expect that the values can be `toString`'d. Closes #21977	2016-12-07 10:32:11 +01:00
Boaz Leskes	4519bdfeb0	InternalTestCluster shouldn't auto heal an active disruption when a new one is set Instead people should explicitly clear the existing one so it's clear what's going on.	2016-12-06 19:58:11 +01:00
shaie	6da44c8164	Fix _termvectors with preference to not hit NPE (#21959 ) When you submit a _termvectors request for an artificial document and specify the 'preference' parameter to send the request to a particular shard, the request sometimes hits NPE. Fix this case by ignoring the auto-generated artificial document ID and pick a shard per the preference parameter, or a random shard. This closes #21928	2016-12-06 17:29:09 +01:00
Jim Ferenczi	b42ca6bcc9	Include unindexed field in FieldStats response (#21821 ) * Include unindexed field in FieldStats response This change adds non-searchable fields to the FieldStats response. These fields do not have min/max information but they can be aggregatable. Fields that are only stored in _source (store:no, index:no, doc_values:no) will still be missing since they do not have any useful information to show. Indices and clients must be at least on V_5_2_0 to see this change.	2016-12-06 13:32:57 +01:00
Boaz Leskes	a7050b2d56	Remove `InternalTestCluster.startNode(s)Async` (#21846 ) Since the removal of local discovery in https://github.com/elastic/elasticsearch/pull/20960 we rely on minimum master nodes to be set in our test cluster. The setting is automatically managed by the cluster (by default) but current management doesn't work with concurrent single node async starting. On the other hand, with `MockZenPing` and the `discovery.initial_state_timeout` set to `0s`, node starting and joining is very fast, making async starting an unneeded complexity. Tests that still need async starting could, in theory, still do so themselves via background threads. Note that this change also removes the usage of `INITIAL_STATE_TIMEOUT_SETTINGS` as the starting of nodes is done concurrently (but building them is sequential)	2016-12-06 12:06:15 +01:00
Daniel Mitterdorfer	a02bc8ed1c	Document thread-safety for ingest processors With this commit we document that ingest processors need to be thread-safe. Previously this could be inferred from reading the source code but we got several user questions about this so it is stated explicitly in the Javadocs of Processor now.	2016-12-06 10:07:51 +01:00
Adrien Grand	26cbda41ea	AsciiFoldingFilter's multi-term component should never preserve the original token. (#21982 ) This ports the fix of https://issues.apache.org/jira/browse/LUCENE-7536 to Elasticsearch's ASCIIFoldingTokenFilterFactory.	2016-12-06 10:01:04 +01:00
Ryan Ernst	c8f241f284	Plugins: Remove response action filters (#21950 ) Action filters currently have the ability to filter both the request and response. But the response side was not actually used. This change removes support for filtering responses with action filters.	2016-12-05 16:14:04 -08:00
Jim Ferenczi	03a0a0aebb	Undeprecate GetResponse#getFields and GetResponse#getField These functions should not have been deprecated as they can be used to retrieve stored and doc-value field.	2016-12-05 15:31:53 +01:00
makeyang	318ce6ab16	fix bug: https://github.com/elastic/elasticsearch/issues/21710	2016-12-05 19:14:34 +08:00
Ali Beyad	ff9959c865	Don't output null source node in RecoveryFailedException (#21963 ) The RecoveryFailedException's output prints the source and target nodes for the recovery. However, sometimes there is no source node for the recovery, only a target node (such as when recovering a primary shard from disk). In this case, we don't want to display the source node. This commit fixes this by displaying "Recovery failed on target node.." instead of "Recovery failed from null to target node" which is what the output currently displays.	2016-12-04 15:23:35 -05:00
Jason Tedor	60aa14f48e	Increase test logging on test simple pings test This commit increases the test logging on the unicast zen ping test of simple pings to gather more info for chasing a race condition that is happening in this test.	2016-12-04 08:06:01 -05:00
Jason Tedor	040c05df36	Increase timeouts in UnicastZenPingTests Sadly, the timeouts here need to be increased to reduce the likelihood of spurious test failures (test hosts under load are especially prone to this). This does slow down this test suite a bit, but it's still not as slow as it was before this endeavor of lowering these timeouts started.	2016-12-03 22:19:55 -05:00
Jason Tedor	2c8229fcaf	Cleanup unicast zen ping unknown hosts cached test This commit cleans up the unicast zen ping unknown hosts cached test: - send pings from the same node to more clearly indicate DNS lookups are not cached (within the same UnicastZenPing instance) - increase ping and wait timeout to 500ms to address race conditions (on a test host under load, the timeout was too short for the connect/handshake/ping cycle to complete)	2016-12-03 22:00:30 -05:00
Jason Tedor	460e787049	Increase resolve timeout in unknown hosts test The port limit test is a simple test that fakes that resolving an address with a port range results in the correct address collection. This test is subject to a race condition where the timeout on the resolve request can fire before the resolve code finishes executing (this race is exceptionally rare, because there are not actually any DNS lookups being done here since we are just resolving addresses). This commit increases the timeout here to significantly reduce the chance of a losing race causing a spurious test failure. This increased timeout should not increase the runtime of the test, just make failures less likely.	2016-12-03 09:02:44 -05:00
Jason Tedor	f5cbc36896	Increase resolve timeout in unknown hosts test The unknown hosts test is a simple test that fakes that resolving an address results in an unknown host exception. The main purpose of this test is to ensure that we log (and do not silently drop) when a host fails to resolve. This test is subject to a race condition where the timeout on the resolve request can fire before the resolve code finishes executing (this race is exceptionally rare, because there are not actually any DNS lookups being done here, just a mock resolve implementation that throws an exception and that's where losing the race can arise). This commit increases the timeout here to significantly reduce the chance of a losing race causing a spurious test failure. This increased timeout should not increase the runtime of the test, just make failures less likely.	2016-12-03 08:46:24 -05:00
Igor Motov	bb9317253a	Add descriptions to create snapshot and restore snapshot tasks. Related to #21768	2016-12-02 21:13:54 -05:00
Jason Tedor	c6efd4eb42	Rename method in InternalEngine This commit renames InternalEngine#loadSeqNoStatsLucene to InternalEngine#loadSeqNoStatsFromLucene to make this name consistent with the method InternalEngine#loadSeqNoStatsFromLuceneAndTranslog.	2016-12-02 20:46:26 -05:00
Ryan Ernst	34eb23e98e	Plugins: Replace Rest filters with RestHandler wrapper (#21905 ) * Plugins: Replace Rest filters with RestHandler wrapper RestFilters are a complex way of allowing plugins to add extra code before rest actions are executed. This change removes rest filters, and replaces with a wrapper which a single plugin may provide.	2016-12-02 14:54:51 -08:00
Jason Tedor	b0e8696143	Clarify global checkpoint recovery Today when starting a new engine, we read the global checkpoint from the translog only if we are opening an existing translog. This commit clarifies this situation by distinguishing the three cases of engine creation in the constructor leading to clearer code. Relates #21934	2016-12-02 15:00:16 -05:00
Jason Tedor	0afef53a17	Add system call filter bootstrap check Today if system call filters fail to install on startup, we log a message but otherwise march on. This might leave users without system call filters installed not knowing that they have implicitly accepted the additional risk. We should not be lenient like this, instead clearly informing the user that they have to either fix their configuration or accept the risk of not having system call filters installed. This commit adds a bootstrap check that if system call filters are enabled, they must successfully install. Relates #21940	2016-12-02 14:27:54 -05:00
Nik Everett	0c724b1878	Keep context during reindex's retries (#21941 ) * Keep context during reindex's retries This fixes reindex and friends' retries to keep the context. * Docs	2016-12-02 13:48:51 -05:00
Jay Modi	429e517476	Do not lose host information when pinging In #21828, serialization of the host string was added to preserve this information when a TransportAddress gets serialized. However, there is still a case where this did not always work. In UnicastZenPings, DiscoveryNode instances are created for the ping hosts with the minimum compatibility version, which is currently less than the version required to preserve the host information. This means that when a node is received from a PingResponse, the host information is no longer set correctly on the InetSocketAddress contained in the DiscoveryNode. This commit adds a workaround for this situation by allowing the host string to be passed into the TransportAddress constructor that takes a StreamInput and using that as the host for the InetAddress that is created during deserialization.	2016-12-02 12:21:53 -05:00
Ke Li	7cc9833606	Avoid some redundant unboxing and object creation (#21909 )	2016-12-02 16:11:41 +01:00
shaie	8fd3637891	Return correct term statistics when a field is not found in a shard (#21922 ) If you ask for the term vectors of an artificial document with term_statistics=true, but a shard does not have any terms of the doc's field(s), it returns the doc's term vectors values as the shard-level term statistics. This commit fixes that to return 0 for `ttf` and also field-level aggregated statistics. Closes #21906	2016-12-02 08:14:45 +01:00
Simon Willnauer	adf9bd90a4	Remove legacy BWC test infrastructure and tests (#21915 ) We don't use the test infra nor do we run the tests. They might all be entirely out of date. We also have a different BWC test infra in-place. This change removes all of the legacy infra.	2016-12-02 08:06:20 +01:00
makeyang	3f1d7be07a	Refactor shard limit allocation decider This commit simplifies the shard limit allocation decider, removing some duplicated code into a common method. Relates #21845	2016-12-01 21:27:02 -05:00
Ryan Ernst	a6ad89bee0	Mappings: Fix get mapping when no indexes exist to not fail in response generation (#21924 ) When there are no indexes, get mapping has a series of special cases. Two of those expect the response object already started, and the other two respond with an exception. Those two cases (types passed in but no indexes and vice versa) would fail in their error response generation because it did not expect an object to already be started in the json generator. This change moves the object start to where it is needed for the empty responses. closes #21916	2016-12-01 16:57:12 -08:00
Simon Willnauer	6522538033	Add validation for supported index version on node join, restore, upgrade & open index (#21830 ) Today we can easily join a cluster that holds an index we don't support since we currently allow rolling upgrades from 5.x to 6.x. Along the same lines we don't check if we can support an index based on the nodes in the cluster when we open, restore or metadata-upgrade an index. This commit adds additional safety that fails cluster state validation, open, restore and/or upgrade if there is an open index with an incompatible index version created in the cluster. Relates to #21670	2016-12-01 15:40:35 +01:00
Simon Willnauer	155de53fe3	Add a connect timeout to the ConnectionProfile to allow per node connect timeouts (#21847 ) Timeouts are global today across all connections. This commit allows specifying a connection timeout per node so that, depending on the context, connections can be established with different timeouts. Relates to #19719	2016-12-01 15:39:49 +01:00
Boaz Leskes	92fa9149f3	rename more before() methods that now conflict with ESTestCase	2016-12-01 13:40:27 +01:00
Simon Willnauer	dd5256c324	Reduce number of connections per node depending on the nodes role (#21849 ) We currently treat every node equally when we establish connections to a node. Yet, if we are not master eligible or can't hold any data there is no point in creating a dedicated connection for sending the cluster state or running remote recoveries respectively. The usage of STATE and RECOVERY connections on non-master and/or non-data nodes will result in an IllegalStateException.	2016-12-01 08:00:48 +01:00
Jim Ferenczi	fc9b63877e	Handle specialized term queries in MappedFieldType.extractTerm(TermQuery) (#21889 ) For some fields we have a specialized implementation of a TermQuery that is specific for the field. When these kinds of fields are used in a wildcard query or a span term query it fails with an exception because they don't recognize the specialized form. The impacted fields are [_all] and [_type] and the impacted queries are [span_term] and [wildcard]. This change handles these forms and correctly extracts the term inside them for further use. Fixes #21882	2016-11-30 23:11:38 +01:00
Jason Tedor	92f05e796e	Remove traces during connect with handshake This commit removes two trace logging statements during connection with handshake as they are just clutter.	2016-11-30 15:29:33 -05:00
Jason Tedor	761325bf94	Throw exception on ping from another cluster When we receive a ping from another cluster, we should throw an exception so as to not leak the channel.	2016-11-30 15:28:56 -05:00
Jason Tedor	c90ba67abb	Do not reply to pings from another cluster Today when sending responses to discovery pings, we unconditionally reply. Instead, this commit modifies the response handler to not reply when the cluster names do not match. This addresses a race condition identified after reducing the timeout in UnicastZenPingTests#testSimplePings. In particular, we send pings in the following way: - if not connected to the node, connect to the node and after successful handshake, send a ping - if connected to the node, send a ping When the ping timeout is set low, a subsequent batch of pings can race against a connect/disconnect cycle from a prior batch of pings. In particular, consider the following scenario: - node A from cluster X - node B from cluster Y - pings are initiated from node A with node B in the hosts list - node A will try to connect and handshake with B - the connection will succeed, and the handshake will eventually fail due to mismatched cluster names - on a short timeout, a second batch of pings will fire, and on this batch node A will see that it is still connected to node B; thus, it will immediately fire a ping to node B and node B will dutifully respond Relates #21894	2016-11-30 15:09:42 -05:00
Luca Cavanna	103984a4a1	Remove indices query (#21837 ) The indices query is deprecated since 5.0.0 (#17710). It can now be removed in master (future 6.0 version).	2016-11-30 19:37:01 +01:00
Adrien Grand	117944093e	Remove testing of 2.x indices in DecayFunctionScoreIT. Such old indices will not be supported in 6.0.	2016-11-30 17:16:13 +01:00
Jason Tedor	6c45695d52	Add version 5.1.1 This commit removes the version constant for 5.1.0 (due to an inadvertent release) and adds the version constant for 5.1.1. Relates #21890	2016-11-30 11:14:17 -05:00
Adrien Grand	f5ac27a20d	Fix TermsQueryBuilderTests expectations.	2016-11-30 17:07:53 +01:00
Adrien Grand	c5b9c98b99	Remove the `default` store type. (#21616 ) It used to be a hybrid store between `niofs` and `mmapfs`, which we removed when we switched to `fs` by default (which is `mmapfs` on 64-bit systems).	2016-11-30 15:33:26 +01:00
Adrien Grand	90ab477f19	The `terms` query should always map to a Lucene `TermsQuery`. (#21786 ) Currently, the `terms` query is just syntactic sugar for a `bool` query when used in a query context. This change proposes to always generate the same query in query and filter contexts, which is less confusing.	2016-11-30 15:29:09 +01:00
Luca Cavanna	5b8bdba12e	Remove subrequests method from CompositeIndicesRequest (#21873 )	2016-11-30 15:03:58 +01:00
Matt Weber	1e722c060b	Remove forked XRollingBuffer and XQueryBuilder. (#21866 ) Remove the forked versions now that we are on lucene-6.4.0-snapshot.	2016-11-30 13:45:54 +01:00
Adrien Grand	a3ef674992	Reduce memory pressure when sending large terms queries. (#21776 ) When users send large `terms` queries to Elasticsearch, every value is stored in an object. This change does not reduce the amount of created objects, but makes sure these objects die young by optimizing the list storage in case all values are either non-null instances of Long objects or BytesRef objects, which seems to help the JVM significantly.	2016-11-30 13:35:56 +01:00
Adrien Grand	6231009a8f	Remove 2.x backward compatibility of mappings. (#21670 ) For the record, I also had to remove the geo-hash cell and geo-distance range queries to make the code compile. These queries already throw an exception in all cases with 5.x indices, so that does not hurt any more. I also had to rename all 2.x bwc indices from `index-${version}` to `unsupported-${version}` to make `OldIndexBackwardCompatibilityIT` happy.	2016-11-30 13:34:46 +01:00
Jason Tedor	072007c759	Speed up UnicastZenPingTests These tests use ping timeouts on the order of seconds, but this is unnecessary: since all the sockets are within the same JVM, it really should not take that long. Relates #21874	2016-11-29 23:27:25 -05:00
Jason Tedor	b6ba4ae34b	Add version 5.0.3 This commit adds version 5.0.3 and the BWC indices for version 5.0.2. Relates #21867	2016-11-29 18:34:55 -05:00
Jay Modi	404b42ee95	DiscoveryNode and TransportAddress should preserve host information In some cases, such as the creation of DiscoveryNode instances for unicast ping requests, the host information was not being populated properly and instead the address string was being used. Additionally, when serializing a DiscoveryNode and in turn a transport address, the host was not being set on the InetAddress when deserializing the object, so even if the address was created from a hostname, the address in the deserialized instance had no knowledge of the hostname that was originally used.	2016-11-29 16:18:08 -05:00
Luca Cavanna	6eaff9432d	SearchTemplateRequest to implement CompositeIndicesRequest (#21865 ) SearchTemplateRequest to implement CompositeIndicesRequest Given that SearchTemplateRequest effectively delegates to search when a search is being executed, it should implement the CompositeIndicesRequest interface. The subrequests method should return a single search request. When a search is not going to be executed, because we are in simulate mode, there are no inner requests, and there are no corresponding indices to that request either. Closes #21747	2016-11-29 20:52:43 +01:00
Boaz Leskes	be4074e13d	improve debug logging when node waits for initial cluster state And enabled debug logging in InternalTestClusterTests so we can see it.	2016-11-29 20:38:19 +01:00
Luca Cavanna	f253621feb	Remove deprecated query names: in, geo_bbox, mlt, fuzzy_match and match_fuzzy (#21852 ) These query names were all deprecated in 5.0.0: - in is removed in favour of terms - geo_bbox is removed in favour of geo_bounding_box - mlt is removed in favour of more_like_this - fuzzy_match and match_fuzzy are removed in favour of match	2016-11-29 19:07:01 +01:00
Jim Ferenczi	d791ddf704	Upgrade to lucene-6.4.0-snapshot-ec38570 (#21853 ) Set lucene version to 6.4.0-snapshot-ec38570 and update all the sha1s/license Fix invalid combo after upgrade in query_string query. split_on_whitespace=false is disallowed if auto_generate_phrase_queries=true Adapt the expectations of some tests to the new format of the Lucene explain output	2016-11-29 18:40:31 +01:00
Nicholas Knize	af1ab68b64	Add RangeFieldMapper for numeric and date range types Lucene 6.2 added index and query support for numeric ranges. This commit adds a new RangeFieldMapper for indexing numeric (int, long, float, double) and date ranges and creating appropriate range and term queries. The design is similar to NumericFieldMapper in that it uses a RangeType enumerator for implementing the logic specific to each type. The following range types are supported by this field mapper: int_range, float_range, long_range, double_range, date_range. Lucene does not provide a DocValue field specific to RangeField types so the RangeFieldMapper implements a CustomRangeDocValuesField for handling doc value support. When executing a Range query over a Range field, the RangeQueryBuilder has been enhanced to accept a new relation parameter for defining the type of query as one of: WITHIN, CONTAINS, INTERSECTS. This provides support for finding all ranges that are related to a specific range in a desired way. As with other spatial queries, DISJOINT can be achieved as a MUST_NOT of an INTERSECTS query.	2016-11-29 10:10:14 -06:00
Simon Willnauer	f5ff69fabe	Remove connectToNodeLight and replace it with a connection profile (#21799 ) The Transport#connectToNodeLight concept is confusing and not very flexible, nor is it really testable at the unit-test level. This commit cleans up the code used to connect to nodes and simplifies transport implementations to share more code. This also allows connecting to nodes with custom profiles if needed; for instance, future improvements can be added to connect to/from nodes that are non-data nodes without dedicated bulk and recovery connections.	2016-11-29 09:35:07 +01:00
Ali Beyad	a884573898	[TEST] fixes FilterAllocationDecider test for decision explanation when the initial recovery is LOCAL_SHARDS	2016-11-28 20:37:19 -05:00
Ali Beyad	07bd0a30f0	Improves allocation decider decision explanation messages (#21771 ) This commit improves the decision explanation messages, particularly for NO decisions, in the various AllocationDecider implementations by including the setting(s) in the explanation message that led to the decision. This commit also returns a THROTTLE decision instead of a NO decision when the concurrent rebalances limit has been reached in ConcurrentRebalanceAllocationDecider, because it more accurately reflects a temporary throttling that will turn into a YES decision once the number of concurrent rebalances lessens, as opposed to a more permanent NO decision (e.g. due to filtering).	2016-11-28 20:23:16 -05:00
Matt Weber	04e07bcdb6	Synonym Graph Support (LUCENE-6664) (#21517 ) Integrate the patch from LUCENE-6664 into elasticsearch and add support for handling a graph token stream in match/multi-match queries. This fixes longstanding bugs with multi-token synonyms returning incorrect results with proximity queries.	2016-11-28 09:25:49 -08:00
Jim Ferenczi	8affb7c845	Fix FiltersFunctionScoreQuery highlighting (#21827 ) This is a cleanup of the fix pushed in https://github.com/elastic/elasticsearch/pull/20400. FiltersFunctionScoreQuery sub query should be extracted in CustomQueryScorer.extract (and not in CustomQueryScorer.extractUnknownQuery). This does not fix any bug in this branch (it's just a cleanup) but the intent is first to clean up and then to backport in 2.x where there is a real bug. The bug is in 2.x only because the backport of https://github.com/elastic/elasticsearch/pull/20400 in 2.x mistakenly renamed the FiltersFunctionScoreQuery to FunctionScoreQuery. This leads to incorrect highlighting on FiltersFunctionScoreQuery in 2.x.	2016-11-28 17:56:24 +01:00
Nik Everett	145d0813b5	Log ScriptException's xcontent if file script compilation fails (#21767 ) When a file script fails to compile, rather than logging the exception that caused the failure this logs the xcontent of that exception. This is both shorter and has the script stack which is useful for figuring out why the compilation failed. Still logs the entire stacktrace at debug level just in case you need it. Relates to #21733	2016-11-28 11:36:06 -05:00
Ali Beyad	db7362da67	Fixes shard level snapshot metadata loading when index-N file is missing (#21813 ) In making changes for the 5.0 version of snapshots, a bug was introduced where if an index-N file could not be found for an individual shard, the backup was to iterate over all snap-.dat files in the shard folder to know which snapshots contain that shard's data, but in 5.0, reading the snap-.dat files as backup was incorrectly passing in the blob name for the snap-.dat file, thereby failing to load all index files for a given snapshot when the index-N file is missing. This condition should be rare as there is no reason an index-N file should be absent (unless it was deleted or there was corruption reading the file), but nevertheless, this situation can be encountered and this commit fixes the bug by reading the correct snap-.dat blob name in the shard data folder.	2016-11-28 10:46:33 -05:00
Simon Willnauer	b7292a6005	Remove TcpTransport#addressSupported since TransportAddress is now final TransportAddress used to be customizable per transport but this has been removed a while ago. Therefore we can remove all usage of this method as well. Relates to #20695	2016-11-28 16:06:59 +01:00
Jim Ferenczi	69f35aa07f	Fix cross_fields type on multi_match query with synonyms (#21638 ) * Fix cross_fields type on multi_match query with synonyms This change fixes the cross_fields type of the multi_match query when synonyms are involved. Since 2.x the Lucene query parser creates SynonymQuery for words that appear at the same position. For simple term query the CrossFieldsQueryBuilder expands the term to all requested fields and creates a BlendedTermQuery. This change adds the same mechanism for SynonymQuery which otherwise are not expanded to all requested fields. As a side note I wonder if we should not replace the BlendedTermQuery with the SynonymQuery. They have the same purpose and behave similarly. Fixes #21633 * Fallback to SynonymQuery for blended terms on a single field	2016-11-28 14:14:01 +01:00
Yannick Welsch	7e198f0e41	Detect nodes being blocked by GC-disrupted node (#21797 ) The disruption type LongGCDisruption simulates GCs on a node by suspending all the threads of that node. If the suspended threads are in a code section with shared JVM locks, however, it can prevent the other nodes from doing their thing. The class LongGCDisruption has a list of class names for which we know that this can occur. Whenever a test using the GC disruption type fails in mysterious ways, it becomes a long guessing game to find the offending class. This commit adds code to LongGCDisruption to automatically detect these situations, fail the test early and report the offending class and all relevant context.	2016-11-28 11:24:25 +01:00
Adrien Grand	243a788289	Fail to index fields with dots in field names when one of the intermediate objects is nested. (#21787 ) Closes #21726	2016-11-28 09:57:32 +01:00
Clinton Gormley	c1fa80d40f	Log failure to connect to node at info instead of debug (#21809 ) Closes #6468	2016-11-26 13:18:26 +01:00
Ali Beyad	efba64d60a	Removing unused AllocationExplanation class (#21805 ) This commit removes the unused AllocationExplanation class. The RoutingAllocation class only created an empty instance of it and never used it anywhere else. The allocation explanations will be encompassed in the various decision classes exposed via the cluster allocation explain API. Therefore, there is no reason to keep the AllocationExplanation class.	2016-11-25 12:18:23 -05:00
Luca Cavanna	720b165350	Search shards to print out aliases array together with alias filter (#21784 ) With #21738 we added an indices section to the search shards api, that will return the concrete indices hit by the request, and eventually the corresponding alias filter. The java API returns the AliasFilter object, which holds the filter itself and an array of aliases that pointed to the index in the original request. The REST layer doesn't print out the aliases array though. This commit adds the aliases array as well and tests for this.	2016-11-25 10:58:06 +01:00
Simon Willnauer	9809760eb0	Fix settings diff generation for affix, list and group settings (#21788 ) Group, List and Affix settings generate a bogus diff that turns the actual diff into a string containing a json structure, for instance: ``` "action" : { "search" : { "remote" : { "" : "{\"my_remote_cluster\":\"[::1]:60378\"}" } } } ``` which makes reading the setting impossible. This happens for instance if a group or affix setting is rendered via `_cluster/settings?include_defaults=true`. This change fixes the issue as well as several minor issues with affix settings that were not accepted as valid settings today.	2016-11-24 21:53:04 +01:00
Simon Willnauer	72ef6fa0d7	Handle spaces in `action.auto_create_index` gracefully (#21790 ) Today if a comma-separated list is passed to action.auto_create_index leading and trailing whitespaces are not trimmed but since the values are index expressions whitespaces should be removed for convenience. Closes #21449	2016-11-24 21:43:58 +01:00
markharwood	aa60e5cc07	Aggregations - support for partitioning set of terms used in aggregations so that multiple requests can be done without trying to compute everything in one request. Closes #21487	2016-11-24 15:10:46 +00:00
Luca Cavanna	ac2aa56350	Cluster search shards improvements: expose ShardId, adjust visibility of some members (#21752 ) * ClusterSearchShardsGroup to return ShardId rather than the int shard id This allows more info to be retrieved, like the index uuid, which is exposed through the ShardId object but was not available before * Make ClusterSearchShardsResponse empty constructor public This allows receiving such responses when sending ClusterSearchShardsRequests directly through TransportService (not using ClusterSearchShardsAction via Client); otherwise an empty response cannot be created unless the class that does it is in the org.elasticsearch.action.admin.cluster.shards package * adjust visibility of ClusterSearchShards members	2016-11-24 09:46:57 +01:00
Luca Cavanna	d8c934a7fa	Use index uuid as key in the alias filter map rather than the index name (#21749 ) The index uuid is unique across multiple clusters, while the index name is not. Using the index uuid to look up filters in the alias filters map is better and will be needed for multi cluster search.	2016-11-24 09:43:42 +01:00
Luca Cavanna	6a16a60c7e	Remove unused assignedReplicasIncludingRelocating from ShardsIterator interface (#21687 )	2016-11-23 22:25:51 +01:00
Luca Cavanna	887ada4819	Move SearchTransportService and SearchPhaseController creation outside of TransportSearchAction constructor (#21754 ) This commit makes sure that there is only one instance of the two services rather than one per transport action that uses it. Also, we take their initialization out of guice's hands by binding it to a specific instance. Otherwise those two objects would get created within a constructor that is called by guice. That may cause problem for instance when throwing an exception from such constructors as guice tries all over again to re-initialize objects and fills up logs with stacktraces.	2016-11-23 22:18:50 +01:00
Luca Cavanna	a84353d2d6	Don't carry ShardRouting around when not needed in AbstractSearchAsyncAction (#21753 ) * replace ShardRouting argument in AbstractSearchAsyncAction#onFirstPhaseResult with more contained String nodeId There is no need to pass in ShardRouting if the only info read from it is the current node id, the shard id can be read directly from the ShardIterator that's already provided as an argument. * avoid creating a new ShardId when creating a SearchShardTarget in SnapshotsService	2016-11-23 22:18:02 +01:00
Jason Tedor	8416b16dfd	Improve handling of unreleased versions Today when handling unreleased versions for backwards compatibility support, we scatter version constants across the code base and add some asserts to support removing these constants when the version in question is actually released. This commit improves this situation, enabling us to just add a single unreleased version constant that can be renamed when the version is actually released. This should make maintenance of these versions simpler. Relates #21760	2016-11-23 15:49:05 -05:00
Luca Cavanna	033eece6d4	ShardSearchRequest to take ShardId constructor argument rather than the whole ShardRouting (#21750 ) ShardSearchRequest was previously taking in the whole ShardRouting as a constructor argument while it only needs the ShardId; changed that to carry over only the needed bits.	2016-11-23 15:34:55 +01:00
Ryan Ernst	10a945ae72	Plugins: Remove support for onModule (#21416 ) All plugin extension points have been converted to pull based interfaces. This change removes the infrastructure for the black-magic onModule methods.	2016-11-22 23:12:14 -08:00
Ryan Ernst	d8808210f1	Transport client: Fix remove address to actually work (#21743 ) * Transport client: Fix remove address to actually work The removeTransportAddress method of TransportClient removes the address from the list of nodes that the client pings to sniff for nodes. However, it does not remove it from the list of existing connected nodes. This means removing a node is not possible, as long as that node is still up. This change removes the node from the connected nodes list before triggering sampling (ie sniffing). While the fix is simple, testing was not because there were no existing tests for sniffing. This change also modifies the mocks used by transport client unit tests in order to allow mocking sniffing.	2016-11-22 22:50:11 -08:00
Igor Motov	c7b69a0133	Add search task descriptions Since we added ability to cancel searches it would be nice to see which searches we are actually cancelling.	2016-11-22 23:15:49 -05:00
Ryan Ernst	6940b2b8c7	Remove groovy scripting language (#21607 ) * Scripting: Remove groovy scripting language Groovy was deprecated in 5.0. This change removes it, along with the legacy default language infrastructure in scripting.	2016-11-22 19:24:12 -08:00
Nik Everett	1791623700	Document `error_trace` The `error_trace` parameter turns on the `stack_trace` field in errors which returns stack traces. Removes documentation for `camelCase` because it hasn't worked in a while.... Documents the internal parameters used to render stack traces as internal only. Closes #21708	2016-11-22 19:16:07 -05:00
Jason Tedor	c7b70fc770	Mark Security#addBindPermissions as private This commit marks the method Security#addBindPermissions as private; its package-private visibility was not used anywhere.	2016-11-22 18:40:18 -05:00
Jason Tedor	41ae784a6f	Refactor handling of bind permissions This commit refactors the handling of bind permissions, which is in need of a little cleanup. For example, in its current state, the code for handling permissions for transport profiles is split across two methods. This commit refactors this code hopefully making it easier to work with in future changes. This change is mostly mechanical, no functionality is changed. Relates #21742	2016-11-22 18:39:14 -05:00
Jason Tedor	1576eaba25	Increase lower bound for random resolve timeout in test The test UnicastZenPing#testResolveTimeout chooses a random resolve timeout between 1ms and 100ms. Close to the lower bound, this is far too short and the test races against the concurrent resolves executing before the timeout elapses. This commit increases the timeout to something that is far less likely to race, yet will not slow the test down since we are not doing resolves against a real DNS service anyway. Note that we still want a short resolve timeout since we are testing whether or not timeouts really work here (by latching one of the resolves to respond slowly).	2016-11-22 18:35:57 -05:00
Luca Cavanna	db5a72774b	Add indices and filter information to search shards api output (#21738 ) Add indices and filter information to search shards api output The search shards api returns info about which shards are going to be hit by executing a search with provided parameters: indices, routing, preference. Indices can also be aliases, which can also hold filters. The output includes an array of shards and a summary of all the nodes the shards are allocated on. This commit adds a new indices section to the search shards output that includes one entry per index, where each index can be associated with an optional filter in case the index was hit through a filtered alias. This is relevant since we have moved parsing of alias filters to the coordinating node. Relates to #20916	2016-11-22 23:00:25 +01:00
Nik Everett	29e68323a2	Clean up ScriptQuerySearchIT Shorten line and remove forbidden API. Relates to #21484	2016-11-22 16:23:33 -05:00
umeshdangat	f37db2fe17	Support binary field type in script values (#21484 ) Add ScriptDocValues.BytesRefs for reading binary fieldtype	2016-11-22 16:23:23 -05:00
Simon Willnauer	a9a2753f0b	Add a HostFailureListener to notify client code if a node got disconnected (#21709 ) Today there is no way to get notified if a node is disconnected. Client code must poll the TransportClient constantly to detect that a node is not connected anymore in order to react and add new nodes, notify alerting, etc. For instance, if a hostname gets resolved to an IP but that host is disconnected, clients want to reconnect by resolving the hostname again, which is a common situation in cloud environments. Closes #21424	2016-11-22 20:46:28 +01:00
Jason Tedor	9dc65037bc	Lazy resolve unicast hosts Today we eagerly resolve unicast hosts. This means that if DNS changes, we will never find the host at the new address. Moreover, a single host failing to resolve causes startup to abort. This commit introduces lazy resolution of unicast hosts. If a DNS entry changes, there is an opportunity for the host to be discovered. Note that under the Java security manager, there is a default positive cache of infinity for resolved hosts; this means that if a user does want to operate in an environment where DNS can change, they must adjust networkaddress.cache.ttl in their security policy. And if a host fails to resolve, we log a warning with the hostname but continue pinging other configured hosts. When doing DNS resolutions for unicast hostnames, we wait until the DNS lookups time out. This appears to be forty-five seconds on modern JVMs, and it is not configurable. If we do these serially, the cluster can be blocked during ping for a lengthy period of time. This commit introduces doing the DNS lookups in parallel, and adds a user-configurable timeout for these lookups. Relates #21630	2016-11-22 14:17:04 -05:00
Clinton Gormley	3ff8faf514	Added version 2.4.2 and bwc indices	2016-11-22 19:45:59 +01:00
Yannick Welsch	a44655763e	Allow master to assign primary shard to node that has shard store locked during shard state fetching (#21656 ) PR #19416 added a safety mechanism to shard state fetching to only access the store when the shard lock can be acquired. This can lead to the following situation however where a shard has not fully shut down yet while the shard fetching is going on, resulting in a ShardLockObtainFailedException. PrimaryShardAllocator that decides where to allocate primary shards sees this exception and treats the shard as unusable. If this is the only shard copy in the cluster, the cluster stays red and a new shard fetching cycle will not be triggered as shard state fetching treats exceptions while opening the store as permanent failures. This commit makes it so that PrimaryShardAllocator treats the locked shard as a possible allocation target (although with the least priority).	2016-11-22 19:35:47 +01:00
Jay Modi	080d55a393	Rethrow ExecutionException from the loader to concurrent callers of Cache#computeIfAbsent This commit clarifies the contract of Cache#computeIfAbsent so that an exception that occurs during the execution of the loader is thrown to all callers. Prior to this commit, the first caller would get the ExecutionException and other callers that called during the load execution would get null, which is confusing.	2016-11-22 13:24:15 -05:00
Luca Cavanna	db8b2dceea	Remove ignored type parameter in search_shards api (#21688 ) The `type` parameter has always been accepted by the search_shards api, probably to make the api and its urls the same as search. Truth is that the type never had any effect, it's been ignored from day one while accepting it may make users think that we actually do something with it. This commit removes support for the type parameter from the REST layer and the Java API. Backwards compatibility is maintained on the transport layer though. The new added serialization test also uncovered a bug in the java API where the `ClusterSearchShardsRequest` could be created with no arguments, but the indices were required to be not null otherwise the request couldn't be serialized as `writeTo` would throw NPE. Fixed by setting a default value (empty array) for indices.	2016-11-22 17:22:33 +01:00
Jason Tedor	775638c281	Die with dignity on the Lucene layer When a fatal error tragically closes an index writer, such an error never makes its way to the uncaught exception handler. This prevents the node from being torn down if an out of memory error or other fatal error is thrown in the Lucene layer. This commit ensures that such events bubble their way up to the uncaught exception handler. Relates #21721	2016-11-22 11:21:24 -05:00
Jason Tedor	221caa1c5e	Refactor handling for bad default permissions This commit refactors the handling of bad default permissions that come from the system security policy. Relates #21735	2016-11-22 10:26:36 -05:00
Yannick Welsch	50e25912c8	Split main ClusterService method into smaller chunks #21666 Splits the main method in ClusterService into smaller chunks so that it's easier to understand and simpler to modify in subsequent PRs.	2016-11-22 12:20:53 +01:00
Yannick Welsch	c521219b2f	Adapt BWC layer checks for Exceptions to include v5.0.2 support The PR #21694 was initially planned to go into v6.0.0 and v5.1.0. Due to another PR relying on this one though for backport to v5.0.2, #21694 must go to v5.0.2 as well. As such, the initial backward compatibility rules established by the PR must be changed to include v5.0.2 and above.	2016-11-22 12:02:56 +01:00
Lee Hinman	dd1012d570	Merge remote-tracking branch 'dakrone/fix-lenient-overriding'	2016-11-21 22:10:19 -07:00
Lee Hinman	11da09e9bc	Allow overriding all-field leniency when `lenient` option is specified As part of #20925 and #21341 we added an "all-fields" mode to the `query_string` and `simple_query_string`. This would expand the query to all fields and automatically set `lenient` to true. However, we should still allow a user to override the `lenient` flag to whichever value they desire, should they add it in the request. This commit does that.	2016-11-21 21:32:25 -07:00
Areek Zillur	933c4f42b3	[FIX] make MergableCustomMetaData public in TribeService	2016-11-21 23:02:36 -05:00
Nik Everett	c79371fd5b	Remove lang-python and lang-javascript (#20734 ) They were deprecated in 5.0. We are concentrating on making Painless awesome rather than supporting every language possible. Closes #20698	2016-11-21 22:13:25 -05:00
Jason Tedor	4225737db9	Install a security manager on startup When Elasticsearch starts, we go through some initialization before we install a security manager. Yet, the JVM makes internal policy decisions on the basis of whether or not a security manager is present. This commit installs a security manager immediately on startup so that the JVM always thinks a security manager is present when making such policy decisions. Relates #21716	2016-11-21 20:33:42 -05:00
Simon Willnauer	cb5c25ab4f	Add a StreamInput#readArraySize method that ensures sane array sizes (#21697 ) Today we read a vint from the stream to allocate the size of an array up-front before we start reading the values. This can be dangerous if, for instance, we read from a corrupted stream or if some manipulated bytes are sent, for instance from an attacker or a fuzzer. In most cases we can apply some best effort and validate the array size to be _sane_ by ensuring we can read at least N bytes, where N is the expected size of the array.	2016-11-21 21:39:21 +01:00
Areek Zillur	0ccf8a742d	Add support for merging custom meta data in tribe node (#21552 ) * Add support for merging custom meta data in tribe node Currently, when any underlying cluster has custom metadata (via plugin), tribe node does not store custom meta data in its cluster state. This is because the tribe node has no idea how to select the appropriate custom metadata from one or many custom metadata (corresponding to the number of underlying clusters). This change adds an interface that custom metadata implementations can extend to add support for merging multiple custom metadata of the same type for storing in the tribe state. Relates to #20544 Supersedes #20791 * Simplify updating tribe state * Add tests for merging multiple custom metadata types in tribe node * cleanup merging custom md logic in tribe service	2016-11-21 12:03:01 -05:00
Simon Willnauer	71a21b3208	Add BWC layer for Exceptions (#21694 ) Today it's not possible to add exceptions to the serialization layer without breaking BWC. This commit adds the ability to specify the Version in which an exception was added, which allows falling back to NotSerializableExceptionWrapper if the exception is not present in the stream's version. Relates to #21656	2016-11-21 12:51:06 +01:00
Tanguy Leroux	e7b9e65fc3	Add checkstyle rule to forbid empty javadoc comments (#20881 ) This commit adds a RegexpMultiline check to checkstyle that yells when an empty Javadoc comment is found in Java files. Related #20871	2016-11-21 12:36:44 +01:00
Luca Cavanna	6122b84eba	remove pointless catch exception in TransportSearchAction (#21689 ) TransportSearchAction optimizes the search_type in certain cases, when for instance we are searching against a single shard, or when there is only a suggest section in the request. That optimization is wrapped in a try catch, and when an exception happens we log it and ignore it. This may be a leftover from the past though, as no exception is expected to be thrown in that code block, hence if there is any exception we are probably better off bubbling it up rather than ignoring it.	2016-11-21 11:46:26 +01:00
Luca Cavanna	a1d88e6550	Rename ClusterState#lookupPrototypeSafe to `lookupPrototype` and remove previous "unsafe" unused variant (#21686 ) The `lookupPrototype` method is not used anywhere. It seems we always use its `lookupPrototypeSafe` variant (which also throws an exception if the prototype is not found) instead. This commit makes the safer variant the default one by renaming it to "lookupPrototype", and removes the previous "unsafe" variant.	2016-11-21 11:36:56 +01:00
Simon Willnauer	d913242ca1	Use a buffer to do character to byte conversion in StreamOutput#writeString (#21680 ) Today we call `writeByte` up to 3x per character in each string written via `StreamOutput#writeString`; this can have quite some overhead when strings are long or many strings are written. This change adds a local buffer into which chars are converted to bytes. Converted bytes are then written via `writeBytes` instead, reducing the overhead of this operation. Closes #21660	2016-11-21 10:47:50 +01:00
Adrien Grand	23d5293f82	Fix integer overflows when dealing with templates. (#21628 ) The overflows were happening in two places, the parsing of the template that implicitly truncates the `order` when its value does not fall into the `integer` range, and the comparator that sorts templates in ascending order, since it returns `order2-order1`, which might overflow. Closes #21622	2016-11-21 10:41:08 +01:00
Jim Ferenczi	90247446aa	Fix highlighting on a stored keyword field (#21645 ) * Fix highlighting on a stored keyword field The highlighter converts stored keyword fields using toString(). Since the keyword fields are stored as utf8 bytes the conversion is broken. This change uses BytesRef.utf8ToString() to convert the field value into a valid string. Fixes #21636 * Replace BytesRef#utf8ToString with MappedFieldType#valueForDisplay	2016-11-21 10:29:30 +01:00
David Roberts	6daeb56969	Set execute permissions for native plugin programs (#21657 )	2016-11-21 09:20:09 +00:00
javanna	9594b6f50f	adjust visibility of DiscoveryNodes.Delta constructor It can be private as it gets called by DiscoveryNodes#delta method, which is supposed to be the only way to create a Delta	2016-11-21 10:17:05 +01:00
javanna	e0661c5262	Remove unused DiscoveryNodes.Delta constructor	2016-11-21 10:17:05 +01:00
javanna	596eebcf98	Remove unused DiscoveryNode#removeDeadMembers public method	2016-11-21 10:17:05 +01:00
javanna	b19c606cef	Remove minNodeVersion and corresponding public `getSmallestVersion` getter method from DiscoveryNodes	2016-11-21 10:17:05 +01:00
Jason Tedor	aed88fe7a2	Log node ID on startup If the node name is explicitly set it's not derived from the node ID meaning that it doesn't immediately appear in the logs. While it can be tracked down in other places, it would be easier for info purposes if it just showed up explicitly. This commit adds the node ID to the logs, whether or not the node name is set. Relates #21673	2016-11-19 06:27:25 -05:00
Jason Tedor	484ad31ed9	Clarify that plugins can be closed Plugins are closed if they implement java.io.Closeable but this is not clear from the plugin interface. This commit clarifies this by declaring that Plugins implement java.io.Closeable and adding an empty implementation to the base Plugin class. Relates #21669	2016-11-18 13:04:28 -05:00
Simon Willnauer	99f8c21d9a	Don't reset non-dynamic settings unless explicitly requested (#21646 ) AbstractScopedSettings has the ability to only apply updates/deletes to dynamic settings. The flag is currently not respected when a setting is reset/deleted which causes static node settings to be reset if a non-dynamic key is reset via `null` value. Closes #21593	2016-11-18 16:40:18 +01:00
Ali Beyad	1d2a1540cc	Makes allocator decision classes top-level classes (#21662 ) This commit moves several allocation decider related inner classes into their own top-level class, in order to use more easily in the allocation explain API. This commit also renames some of those decision related classes to more suitable names. This is simply a cosmetic change - no functionality changes with this commit whatsoever. To summarize the changes: 1. ShardAllocationDecision renamed to AllocateUnassignedDecision 2. RelocationDecision moved to a top-level class 3. MoveDecision moved to a top-level class 4. RebalanceDecision moved to a top-level class 5. ShardAllocationDecisionTests renamed to AllocateUnassignedDecisionTests 6. NodeRebalanceResult moved to a top-level class 7. ShardAllocationDecision#WeightedDecision moved to a top-level class and renamed to NodeAllocationResult.	2016-11-18 10:19:27 -05:00
Yannick Welsch	b1fd257c42	[TEST] Fix testTimedOutUpdateTaskCleanedUp to wait for blocking task to be completed The "test" task can complete its execution with a timeout exception before the "block-task" actually starts executing. The test thus has to wait for both to be completed before checking that the updateTasksPerExecutor map has been properly cleaned up.	2016-11-18 12:34:50 +01:00
Christoph Büscher	4a7b70cc08	Don't require `types` parameter in IdsQueryBuilder constructor According to the docs and our own tests we accept an ids query without specified types and default to all types in the index mapping in this case. This changes the builder to reflect this by making the types no longer a required constructor argument and changes the parser to reflect that.	2016-11-17 20:22:48 +01:00
Christoph Büscher	b8cae39b7c	Using ObjectParser in MatchAllQueryBuilder and IdsQueryBuilder A first step in moving away from the current parsing to use the generalized ObjectParser and ConstructingObjectParser. This PR starts by making use of it in MatchAllQueryBuilder and IdsQueryBuilder.	2016-11-17 20:22:48 +01:00
Nik Everett	2a1e08f76a	Fix compilation in Eclipse (#21606 ) * Fix compilation in Eclipse I'm not sure what the bug is, but ecj doesn't like this expression unless the type is set explicitly. * Add comment explaining why no diamond operator	2016-11-17 12:54:57 -05:00
Jim Ferenczi	09fbb4d06d	Fix match_phrase_prefix on boosted fields (#21623 ) This change fixes the match_phrase_prefix on fields that define a boost in their mapping. Fixes #21613	2016-11-17 18:45:34 +01:00
Dimitris Athanasiou	a75320f89b	Replace IndexAlreadyExistsException with ResourceAlreadyExistsException (#21494 )	2016-11-17 14:30:21 +00:00
Jason Tedor	b08a2e1f31	Expose executor service interface from thread pool This commit exposes the executor service interface from thread pool. This will enable some high-level concurrency primitives that will make some code cleaner and simpler. Relates #21608	2016-11-17 09:18:49 -05:00
David Roberts	116593e5f5	Adjust bootstrap sequence (#21543 ) Added the ability for plugins to spawn a controller process at startup	2016-11-17 09:58:09 +00:00
Adrien Grand	6581b77198	Remove store throttling. (#21573 ) Store throttling has been disabled by default since Lucene added automatic throttling of merge operations based on the indexing rate.	2016-11-17 09:33:32 +01:00
Jason Tedor	9792b5792a	Respect default search timeout The default search timeout is not respected because the timeout is unconditionally set from the query. This commit fixes this issue, and adds a test that the default search timeout is correctly attached to the search context. Relates #21599	2016-11-16 12:43:47 -05:00
Jason Tedor	d06a8903fd	Merge branch 'master' into feature/seq_no * master: (22 commits) Add proper toString() method to UpdateTask (#21582) Fix `InternalEngine#isThrottled` to not always return `false`. (#21592) add `ignore_missing` option to SplitProcessor (#20982) fix trace_match behavior for when there is only one grok pattern (#21413) Remove dead code from GetResponse.java Fixes date range query using epoch with timezone (#21542) Do not cache term queries. (#21566) Updated dynamic mapper section Docs: Clarify date_histogram bucket sizes for DST time zones Handle release of 5.0.1 Fix skip reason for stats API parameters test Reduce skip version for stats API parameter tests Strict level parsing for indices stats Remove cluster update task when task times out (#21578) [DOCS] Mention "all-fields" mode doesn't search across nested documents InternalTestCluster: when restarting a node we should validate the cluster is formed via the node we just restarted Fixed bad asciidoc in boolean mapping docs Fixed bad asciidoc ID in node stats Be strict when parsing values searching for booleans (#21555) Fix time zone rounding edge case for DST overlaps ...	2016-11-16 09:10:35 -05:00
Yannick Welsch	aa73a76ffd	Add proper toString() method to UpdateTask (#21582 ) Adds a proper toString() method to ClusterService.UpdateTask	2016-11-16 15:07:26 +01:00
Adrien Grand	d7fa2eb155	Fix `InternalEngine#isThrottled` to not always return `false`. (#21592 ) Currently it inherits from the default implementation which always returns `false`, even if indexing is being throttled.	2016-11-16 15:01:05 +01:00
Tal Levy	6796464f16	add `ignore_missing` option to SplitProcessor (#20982 ) Closes #20840.	2016-11-16 15:46:09 +02:00
Simon Willnauer	6baded8e7f	Remove dead code from GetResponse.java	2016-11-16 10:48:15 +01:00
Colin Goodheart-Smithe	c6c734dce1	Fixes date range query using epoch with timezone (#21542 ) This change fixes the range query so that an exception is always thrown if the range query uses epoch time together with a time zone. Since epoch time is always UTC, it should not be used with a time zone. Closes #21501	2016-11-16 09:11:04 +00:00
Adrien Grand	00de8e07fc	Do not cache term queries. (#21566 ) There have been reports that the query cache did not manage to speed up search requests when the query includes a large number of different sub queries since a single request may manage to exhaust the whole history (256 queries) while the query cache only starts caching queries once they appear multiple times in the history (#16031). On the other hand, increasing the size of the query cache is a bit controversial (#20116) so this pull request proposes a different approach that consists of never caching term queries, and not adding them to the history of queries either. The reasoning is that these queries should be fast anyway, regardless of caching, so taking them out of the equation should not cause any slow down. On the other hand, the fact that they are not added to the cache history anymore means that other queries have greater chances of being cached.	2016-11-16 10:02:24 +01:00
Nik Everett	e66261eee9	Handle release of 5.0.1 Adds a version constant for it, bwc indices, and a vagrant upgrade-from version. Also bumps the "upgrade from" version for the backwards-5.0 test and adds `skip`s for tests that don't fail against 5.0 so we skip them during the backwards testing. Finally, this skips the "Shrink index via API" test because it fails consistently for me. Inconsistently for CI, but consistently for me. I'll work on making it consistent tomorrow.	2016-11-15 19:31:28 -05:00
Jason Tedor	17b0041aaf	Strict level parsing for indices stats A previous commit added strict level parsing for the node stats API, but that commit missed adding the same for the indices stats API. This commit rectifies this miss. Relates #21577	2016-11-15 16:26:37 -05:00
Yannick Welsch	40e0162e61	Remove cluster update task when task times out (#21578 ) Fixes an issue where the cluster service does not remove an update task from its internal data structures that are used for batching cluster state updates. * review comments * checkstyle	2016-11-15 21:38:58 +01:00
Lee Hinman	96122aa518	Be strict when parsing values searching for booleans (#21555 ) This changes only the query parsing behavior to be strict when searching on boolean values. We continue to accept the variety of values during index time, but searches will only be parsed using `"true"` or `"false"`. Resolves #21545	2016-11-15 10:36:57 -07:00
Christoph Büscher	cd4634bdc6	Fix time zone rounding edge case for DST overlaps When using TimeUnitRounding with a DAY_OF_MONTH unit, failing tests in #20833 uncovered an issue when the DST shift happens just one hour after midnight local time and sets the clock back to midnight, leading to an overlap. Previously this would lead to two different rounding values, depending on whether a date before or after the transition was rounded. This change detects this special case and corrects for it by using the previous rounding date for both cases. Closes #20833	2016-11-15 18:23:47 +01:00
Jason Tedor	f5ac0e5076	Remove lenient stats parsing Today when parsing a stats request, Elasticsearch silently ignores incorrect metrics. This commit removes lenient parsing of stats requests for the nodes stats and indices stats APIs. Relates #21417	2016-11-15 12:17:26 -05:00
Boaz Leskes	2c0338fa87	Merge remote-tracking branch 'upstream/master' into feature/seq_no	2016-11-15 17:09:08 +00:00
Boaz Leskes	d6c2b4f7c5	Adapt InternalTestCluster to auto adjust `minimum_master_nodes` (#21458 ) #20960 removed `LocalDiscovery` and we now use `ZenDiscovery` in all our tests. To keep cluster forming fast, we are using a `MockZenPing` implementation which uses static maps to return instant results, making master election fast. Currently, we don't set `minimum_master_nodes`, causing the occasional split brain when starting multiple nodes concurrently and their pinging is so fast that it misses the fact that one of the nodes has elected itself master. To solve this, `InternalTestCluster` is modified to behave like a true cluster and manage and set `minimum_master_nodes` correctly with every change to the number of nodes. Tests that want to manage the settings themselves can opt out using a new `autoMinMasterNodes` parameter to the `ClusterScope` annotation. Having `min_master_nodes` set means the started node may need to wait for other nodes to be started as well. To combat this, we set `discovery.initial_state_timeout` to `0` and wait for the cluster to form once all nodes have been started. Also, because a node may wait and ping while other nodes are started, `MockZenPing` is adapted to wait rather than busy-ping.	2016-11-15 13:42:26 +00:00
Jason Tedor	ee722d738a	Fix internal engine sequence number test bug This commit fixes a test bug in internal engine tests, and adds some additional assertions.	2016-11-15 08:34:54 -05:00
Simon Willnauer	66fbb0dbc2	Don't fail in `afterExecute` if context is already closed (#21563 ) We run an assert on a potentially closed thread context. This should not bubble up the `IllegalStateException`.	2016-11-15 13:55:50 +01:00
Adrien Grand	54809065a6	Make PercolatorFieldMapper get a QueryShardContext lazily.	2016-11-15 12:02:40 +01:00
Boaz Leskes	c9f49039d3	Merge remote-tracking branch 'upstream/master' into feature/seq_no	2016-11-15 10:14:47 +00:00
Simon Willnauer	200a2850a9	[TEST] Don't stop MockAppender some nodes might concurrently use it	2016-11-15 10:48:39 +01:00
Boaz Leskes	6d9af2fff4	Uncommitted mapping updates should not affect existing indices (#21306 ) When processing mapping updates, the master currently creates an `IndexService` and uses its mapper service to do the hard work. However, if the master is also a data node and it already has an instance of `IndexService`, we currently reuse the `MapperService` of that instance. Sadly, since mapping updates change the in-memory objects, this means that a mapping change that can be rejected later on during cluster state publishing will leave a side effect on the index in question, bypassing the cluster state safety mechanism. This commit removes this optimization and replaces the `IndexService` creation with a direct creation of a `MapperService`. Also, this fixes an issue where multiple mapping updates from multiple shards for the same field caused unneeded cluster state publishing, as the current code always created a new cluster state. This was discovered while researching #21189	2016-11-15 10:47:34 +01:00
Adrien Grand	ad94bea0bb	Remove XPointValues. (#21541 ) This class had been added to address a bug in PointValues, which has been fixed since then.	2016-11-15 10:11:41 +01:00
Martijn van Groningen	8a3a885058	inner_hits: Skip adding a parent field to nested documents. Otherwise an empty string gets added as the _parent field. Closes #21503	2016-11-15 07:32:28 +01:00
Ryan Ernst	c7bd4f3454	Tests: Add TestZenDiscovery and replace uses of MockZenPing with it (#21488 ) This change adds a test discovery (which internally uses the existing mock zenping by default). Making the mock that the test framework selects a discovery greatly simplifies discovery setup (no more weird callback to a Node method).	2016-11-14 21:46:10 -08:00
Ryan Ernst	d14c470b89	Remove generics from ActionRequest closes #21368	2016-11-14 15:32:01 -08:00
Jason Tedor	48579cccab	Add socket permissions for tribe nodes Today when a node starts, we create dynamic socket permissions based on the configured HTTP ports and transport ports. If no ports are configured, we use the default port ranges. When a tribe node starts, a tribe node creates an internal node client for connecting to each remote cluster. If neither an explicit HTTP port nor transport ports were specified, the default port ranges are large enough for the tribe node and its internal node clients. If an explicit HTTP port or transport port was specified for the tribe node, then socket permissions for those ports will be created, but not for the internal node clients. Whether the internal node clients have explicit ports specified, or attempt to bind within the default range, socket permissions for these will not have been created and the internal node clients will hit a permissions issue when attempting to bind. This commit addresses this issue by also accounting for tribe nodes when creating the dynamic socket permissions. Additionally, we add our first real integration test for tribe nodes. Relates #21546	2016-11-14 15:09:45 -05:00
Jay Modi	87d76c3ff8	assert blocking calls are not made on the cluster state update thread This commit adds an assertion to ensure that we do not introduce blocking calls in code that is called in a ClusterStateListener or another part of the cluster state update process.	2016-11-14 14:30:01 -05:00
Jason Tedor	9fb54f4ef8	Remove unnecessary hash map copy in o.e.b.Security This commit removes an unnecessary copying of the tribe node group settings in o.e.b.Security.	2016-11-14 13:49:16 -05:00
Jason Tedor	a12f09317d	Fallback to settings if transport profile is empty If the transport profile does not contain a TCP port range, we fall back to the top-level settings.	2016-11-14 13:48:12 -05:00
Jason Tedor	491a945ac8	Add socket permissions for tribe nodes Today when a node starts, we create dynamic socket permissions based on the configured HTTP ports and transport ports. If no ports are configured, we use the default port ranges. When a tribe node starts, a tribe node creates an internal node client for connecting to each remote cluster. If neither an explicit HTTP port nor transport ports were specified, the default port ranges are large enough for the tribe node and its internal node clients. If an explicit HTTP port or transport port was specified for the tribe node, then socket permissions for those ports will be created, but not for the internal node clients. Whether the internal node clients have explicit ports specified, or attempt to bind within the default range, socket permissions for these will not have been created and the internal node clients will hit a permissions issue when attempting to bind. This commit addresses this issue by also accounting for tribe nodes when creating the dynamic socket permissions. Additionally, we add our first real integration test for tribe nodes.	2016-11-14 11:58:44 -05:00
Simon Willnauer	1d8c8529ed	Remove `IndexTemplateAlreadyExistsException` and `IndexShardAlreadyExistsException` (#21539 ) Both exceptions can be replaced with Java built-in exceptions, IAE and ISE respectively. This should be partially backported to 5.x, where the transport layer code should be preserved. Relates to #21494	2016-11-14 17:09:57 +01:00
Simon Willnauer	26375256ff	Enable 5.x to 6.x BWC tests (#21537 ) This commit enables real BWC testing against a 5.1 snapshot. All REST tests plus rolling upgrade test now run against a mixed version cross major version cluster.	2016-11-14 17:03:57 +01:00
Yannick Welsch	d3e97ce6cd	Fix line length in TCPTransportTests Makes checkstyle happy	2016-11-14 16:55:14 +01:00
Yannick Welsch	d42f7eec61	Check valid cluster service state transitions (#21538 ) This commit adds assertions to check whether the cluster service state transitions in a way that we expect it to. Relates to #21379.	2016-11-14 16:49:25 +01:00
Simon Willnauer	26a8a94e56	[TEST] Add test to ensure `transport.tcp.compress` works This adds a basic unit test to ensure `transport.tcp.compress` has an effect on all basic TcpTransport implementations. Relates to #21526	2016-11-14 16:13:44 +01:00
Simon Willnauer	7d4bde8e00	remove forbidden API	2016-11-14 15:30:07 +01:00
Yannick Welsch	8655cd7182	Add assertion that checks that the same shard with same id is not added to same node (#21498 ) Adds an assertion that checks that the same shard with same id is not added to same node. Previously we would just silently ignore the second shard being added.	2016-11-14 15:14:14 +01:00
Simon Willnauer	bdc942fa72	Enable 5.x to 6.x BWC tests This commit enables real BWC testing against a 5.1 snapshot. All REST tests plus rolling upgrade test now run against a mixed version cross major version cluster.	2016-11-14 14:26:49 +01:00
Adrien Grand	1fd5c47e7f	Upgrade to lucene-6.3.0. (#21464 )	2016-11-14 09:36:45 +01:00
Jason Tedor	c7a1b3eb50	Merge branch 'master' into feature/seq_no * master: Hack around cluster service and logging race Do not prematurely shutdown Log4j Support decimal constants with trailing [dD] in painless (#21412) In painless suggest a long constant if int won't do (#21415) Account for different paths for sysctl utilities [TEST] testRebalancePossible() may not have an assigned node id Tests: Disable merge in SearchCancellationTests Tests: clean search scroll at the end of SearchCancellationIT	2016-11-13 20:01:44 -05:00
Jason Tedor	19decd7552	Hack around cluster service and logging race When a cluster update task executes, there can be log messages after the update task has finished processing and the new cluster state becomes visible. The visibility of the cluster state allows the test thread in UpdateSettingsIT#testUpdateAutoThrottleSettings and UpdateSettingsIT#testUpdateMergeMaxThreadCount to proceed. The test thread will remove and stop a mock appender set up at the beginning of the test. The log messages in the cluster state update task that occur after processing has finished can race with the removal of the appender. Log4j will grab a reference to the appenders when processing these log messages, and this races with the removal and stopping of the appenders. If Log4j grabs a reference to the appenders before the mock appender has been removed, and the test thread subsequently removes and stops the appender before Log4j has appended the log message, Log4j will get angry that we are appending to a stopped appender, causing the test to fail. This commit addresses this race by waiting for the cluster state update task to have finished processing before freeing the test thread to make its assertions and finally remove and stop the appender. Yes, this is a hack. Relates #21518	2016-11-13 18:06:12 -05:00
Jason Tedor	d273419d00	Do not prematurely shutdown Log4j When a node closes, we shut down logging as the last statement. This statement must be last lest any subsequent attempts to log blow up by running into security permissions. Yet, in the case of a tribe node this isn't enough. The first internal tribe node to close will shut down logging, and subsequent node closes will blow up with the aforementioned problem. This commit migrates the Log4j shutdown to occur as part of the shutdown hook that closes the node, after all nodes have closed. Consequently, we can remove a hack in the test infrastructure to prevent Log4j shutdowns when internal test nodes close and instead just register a single shutdown hook that runs when the test JVM exits. Relates #21519	2016-11-13 17:27:30 -05:00
Boaz Leskes	fac6cf0d4e	testUpgradeOldIndex should properly set index setting. They are needed for assertions	2016-11-12 11:42:02 +01:00
Ali Beyad	38023fb58d	[TEST] testRebalancePossible() may not have an assigned node id	2016-11-11 23:10:34 -05:00
Igor Motov	ca639e8c86	Tests: Disable merge in SearchCancellationTests We have to have at least 2 segments for the test to work and sometimes random merge policy merges them into one.	2016-11-11 18:22:28 -05:00
Igor Motov	058b6e019c	Tests: clean search scroll at the end of SearchCancellationIT Under some rare conditions search cancellation response might not fully clean scroll context. For now this commit adds the cleaning operation to the test, and we will address the root cause in https://github.com/elastic/elasticsearch/issues/21511	2016-11-11 18:22:15 -05:00
Jason Tedor	1ea69b1a80	Merge branch 'master' into feature/seq_no * master: Set vm.max_map_count on systemd package install [TEST] reduce the number of snapshotted shards to 1 in testSnapshotSucceedsAfterSnapshotFailure() so that we are more likely to trigger I/O exceptions on writing the control files during the finalize phase of snapshotting (with the aim of triggering an I/O failure when writing pending-index-*). Add documentation for Logger with Transport Client Enable appender exceptions in UpdateSettingsIT [TEST] remove AwaitsFix from testSnapshotSucceedsAfterSnapshotFailure, turns out the issue is specific to Java 9 v143 Cleanup formatting in UpdateSettingsIT.java [TEST] mute the testSnapshotSucceedsAfterSnapshotFailure() test until its clear what is going wrong. Mark SearchQueryIT test as awaits fix Makes snapshot throttling test go much faster (#21485) Breaking changes docs for template index_patterns [TEST] adds randomness between atomic and non-atomic move operations in MockRepository Cache successful shard deletion checks (#21438) Task cancellation command should wait for all child nodes to receive cancellation request before returning	2016-11-11 17:03:01 -05:00
Jason Tedor	d06f43c706	Tighten sequence number assertion We have an assertion in the engine regarding the initial state of a sequence number before an indexing operation. This assertion is too loose, it catches operations during recovery from old indices where sequence numbers do not even exist. This commit tightens these assertions to not catch such operations and enables us to reenable some tests. Relates #21509	2016-11-11 16:49:13 -05:00
Ali Beyad	5f1d108704	[TEST] reduce the number of snapshotted shards to 1 in testSnapshotSucceedsAfterSnapshotFailure() so that we are more likely to trigger I/O exceptions on writing the control files during the finalize phase of snapshotting (with the aim of triggering an I/O failure when writing pending-index-*).	2016-11-11 16:22:11 -05:00
Jason Tedor	8d1260a58a	Convert nocommit to TODO in SeqNoFieldMapper This commit converts a nocommit to a TODO in SeqNoFieldMapper that will be dealt with in a follow-up.	2016-11-11 16:11:41 -05:00
Jason Tedor	c77d285699	Remove nocommit in TransportShardBulkAction This commit removes a nocommit in TransportShardBulkAction that deserves a larger issue.	2016-11-11 16:10:22 -05:00
Jason Tedor	33f7cd5a16	Remove shard ID from doc write response This commit removes the shard ID from doc write response; this was useful for debugging but its time has passed. Relates #21508	2016-11-11 15:18:25 -05:00
Jason Tedor	9352d16602	Enable appender exceptions in UpdateSettingsIT This commit sets the mock appender in UpdateSettingsIT to not ignore exceptions. This means that when an exception is hit, we will see an actual stack trace that could be useful in debugging a non-reproducible test failure. Relates #21461	2016-11-11 12:41:20 -05:00
Ali Beyad	c9c3992f94	[TEST] remove AwaitsFix from testSnapshotSucceedsAfterSnapshotFailure, turns out the issue is specific to Java 9 v143	2016-11-11 12:37:04 -05:00
Jason Tedor	79076334ae	Cleanup formatting in UpdateSettingsIT.java This commit cleans up some code formatting in UpdateSettingsIT.java and removes this file from the checkstyle line-length suppressions.	2016-11-11 12:10:32 -05:00
Ali Beyad	8f85e388da	[TEST] mute the testSnapshotSucceedsAfterSnapshotFailure() test until its clear what is going wrong. Relates #21496	2016-11-11 11:50:23 -05:00
Jason Tedor	372480a16a	Mark SearchQueryIT test as awaits fix This commit marks the test SearchQueryIT#testRangeQueryWithTimeZone as awaits fix. Relates #21501	2016-11-11 11:33:17 -05:00
Yannick Welsch	9cbb23f3d7	Test distinctNodes	2016-11-11 17:29:51 +01:00
Jason Tedor	1e7c424479	Merge branch 'master' into feature/seq_no * master: ShardActiveResponseHandler shouldn't hold to an entire cluster state Ensures cleanup of temporary index-* generational blobs during snapshotting (#21469) Remove (again) test uses of onModule (#21414) [TEST] Add assertBusy when checking for pending operation counter after tests Revert "Add trace logging when aquiring and releasing operation locks for replication requests" Allows multiple patterns to be specified for index templates (#21009) [TEST] fixes rebalance single shard check as it isn't guaranteed that a rebalance makes sense and the method only tests if rebalance is allowed Document _reindex with random_score	2016-11-11 11:25:27 -05:00
Ali Beyad	a5ccd02e76	Makes snapshot throttling test go much faster (#21485 ) [TEST] Makes the snapshot throttling test go much faster. Before, the snapshot throttling test would throttle at a rate of 0.5 kb per second, even though it would snapshot/restore about 25 kb of data. This commit increases the throttling rate to 10kb per second, so we still test the throttling mechanism while speeding up the test from taking 30 plus seconds down to 2 seconds or less.	2016-11-11 10:52:26 -05:00
Yannick Welsch	d195ef258b	test fix	2016-11-11 16:09:34 +01:00
Yannick Welsch	1635baf876	fix tests that add duplicate shards	2016-11-11 15:28:40 +01:00
Yannick Welsch	7099f10909	Add assertion that checks that the same shard with same id is not added to same node	2016-11-11 15:28:40 +01:00
Ali Beyad	adb7aaded4	[TEST] adds randomness between atomic and non-atomic move operations in MockRepository	2016-11-11 09:07:28 -05:00
Yannick Welsch	2d3a52c0f2	Cache successful shard deletion checks (#21438 ) Each node checks on every cluster state update if there are shards that it can possibly delete from its disk. It decides this by doing a file-system lookup for each shard id that is fully allocated in the cluster. With lots of shards, this amounts to lots of Files.exists() checks, considerably slowing down cluster state updates. This commit adds a caching layer so that the Files.exists() checks can be skipped if not needed.	2016-11-11 10:06:15 +01:00
Jason Tedor	d3417fb022	Merge branch 'master' into feature/seq_no * master: (516 commits) Avoid angering Log4j in TransportNodesActionTests Add trace logging when aquiring and releasing operation locks for replication requests Fix handler name on message not fully read Remove accidental import. Improve log message in TransportNodesAction Clean up of Script. Update Joda Time to version 2.9.5 (#21468) Remove unused ClusterService dependency from SearchPhaseController (#21421) Remove max_local_storage_nodes from elasticsearch.yml (#21467) Wait for all reindex subtasks before rethrottling Correcting a typo-Maan to Man-in README.textile (#21466) Fix InternalSearchHit#hasSource to return the proper boolean value (#21441) Replace all index date-math examples with the URI encoded form Fix typos (#21456) Adapt ES_JVM_OPTIONS packaging test to ubuntu-1204 Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431) Add VirtualBox version check (#21370) Export ES_JVM_OPTIONS for SysV init Skip reindex rethrottle tests with workers Make forbidden APIs be quieter about classpath warnings (#21443) ...	2016-11-10 23:40:33 -05:00
Igor Motov	df965fc9b3	Task cancellation command should wait for all child nodes to receive cancellation request before returning Currently the task cancellation command returns as soon as the top-level parent task is marked as cancelled. This creates race conditions in tests where child tasks on other nodes may continue to run for some time after the main task is cancelled. This commit fixes this situation by making the task cancellation command wait until the cancellation has propagated to all nodes that have child tasks. Closes #21126	2016-11-10 22:43:43 -05:00
Igor Motov	06a50fa31e	ShardActiveResponseHandler shouldn't hold to an entire cluster state ShardActiveResponseHandler doesn't need to hold on to an entire cluster state since it only needs to know the cluster state version. It seems that on overloaded systems where nodes are unresponsive, holding onto a lot of different cluster states can make the situation worse. Closes #21394	2016-11-10 22:28:49 -05:00
Ali Beyad	3001b636db	Ensures cleanup of temporary index-* generational blobs during snapshotting (#21469 ) Ensures pending index-* blobs are deleted when snapshotting. The index-* blobs are generational files that maintain the snapshots in the repository. To write these atomically, we first write a `pending-index-` blob, then move it to `index-`, which also deletes `pending-index-` in case it's not a file-system level move (e.g. S3 repositories). For example, to write the 5th generation of the index blob for the repository, we would first write the bytes to `pending-index-5` and then move `pending-index-5` to `index-5`. It is possible that we fail after writing `pending-index-5`, but before moving it to `index-5` or deleting `pending-index-5`. In this case, we will have a dangling `pending-index-5` blob lying around. Since snapshot #5 would have failed, the next snapshot assumes a generation number of 5, so it tries to write to `index-5`, which first tries to write to `pending-index-5` before moving the blob to `index-5`. Since `pending-index-5` is left over from the previous failure, the snapshot fails as it cannot overwrite this blob. This commit solves the problem by first adding a UUID to the `pending-index-` blobs, and second, strengthening the logic around failure to write the `index-*` generational blob to ensure pending files are deleted on cleanup. Closes #21462	2016-11-10 21:45:02 -05:00
Ryan Ernst	48bfb142b9	Remove (again) test uses of onModule (#21414 ) This change was reverted after it caused random test failures. This was due to a copy/paste error in the original PR which caused the mock version of ClusterInfoService to be used whenever the mock ZenPing was used, and the real ClusterInfoService to be used when MockZenPing was not used.	2016-11-10 16:06:14 -08:00
Areek Zillur	7ed195fe93	[TEST] Add assertBusy when checking for pending operation counter after tests Currently, pending operations can complete after tests with a disruption scheme complete. This commit waits for the pending operation counter to complete after the tests are run	2016-11-10 18:35:52 -05:00
Areek Zillur	5b4c3fb1ac	Revert "Add trace logging when aquiring and releasing operation locks for replication requests" This reverts commit `4e996ca9f5`.	2016-11-10 18:35:25 -05:00
Alexander Lin	0219a211d3	Allows multiple patterns to be specified for index templates (#21009 ) * Allows for an array of index template patterns to be provided to an index template, and rename the field from 'template' to 'index_pattern'. Closes #20690	2016-11-10 18:00:30 -05:00
Ali Beyad	5c4392e58a	[TEST] fixes rebalance single shard check as it isn't guaranteed that a rebalance makes sense and the method only tests if rebalance is allowed	2016-11-10 17:13:39 -05:00
Jason Tedor	179dd885e2	Avoid angering Log4j in TransportNodesActionTests When logging a mock exception, Log4j attempts to render the stack trace. On a mock exception, this will be null and Log4j will hit a NullPointerException. This NullPointerException will get recorded in the status logger buffer that we use to ensure that we do not have any misuses of Log4j in production code. This commit replaces the use of a mock exception with an actual exception to avoid angering the Log4j assertions in ESTestCase.	2016-11-10 16:08:08 -05:00
Areek Zillur	4e996ca9f5	Add trace logging when aquiring and releasing operation locks for replication requests	2016-11-10 15:13:42 -05:00
Jason Tedor	0a06a0c2b3	Fix handler name on message not fully read Today when a message is not fully read on a response, we log (among other details) the handler name. Unfortunately, if the handler is a wrapper, all that we see is o.e.t.TransportService$ContextRestoreResponseHandler@7446ba18 completely losing the offending handler. This commit adds an override for TransportService$ContextRestoreResponseHandler#toString so that the underlying offender can be discovered. Relates #21478	2016-11-10 14:56:48 -05:00
Jack Conradson	834976823a	Remove accidental import.	2016-11-10 11:46:14 -08:00
Jason Tedor	fdbe336104	Improve log message in TransportNodesAction Today when handling responses from nodes in TransportNodesAction, if a node times out or some other failure occurs and the action is not accumulating exceptions, we log a confusing message: org.elasticsearch.action.admin.cluster.stats.TransportClusterStatsAction] ignoring unexpected response [null] of type [null], expected [ClusterStatsNodeResponse] or [FailedNodeException] Moreover, the original exception is completely lost. Since this log message is confusing and unhelpful, we can drop it. Instead, we hold onto the exception and log it at the warn level before dropping it from the response. Relates #21476	2016-11-10 14:32:14 -05:00
Jack Conradson	aeb97ff412	Clean up of Script. Closes #21321	2016-11-10 09:59:13 -08:00
Tanguy Leroux	2e531902ff	Update Joda Time to version 2.9.5 (#21468 ) This commit updates JodaTime to version 2.9.5 that contains a fix for a bug when parsing time zones (see https://github.com/JodaOrg/joda-time/pull/332, https://github.com/JodaOrg/joda-time/issues/386 and https://github.com/JodaOrg/joda-time/issues/373). It also removes the joda-convert dependency that seems to be unused. closes #20911 Here is the changelog for 2.9.5: ``` Changes in 2.9.5 ---------------- - Add Norwegian period translations [#378] - Add Duration.dividedBy(long,RoundingMode) [#69, #379] - DateTimeZone data updated to version 2016i - Fixed bug where clock read twice when comparing two nulls in DateTimeComparator [#404] - Fixed minor issues with historic time-zone data [#373] - Fix bug in time-zone binary search [#332, #386] The fix in v2.9.2 caused problems when the time-zone being parsed was not the last element in the input string. New approach uses a different approach to the problem. - Update tests for JDK 9 [#394] - Close buffered reader correctly in zone info compiler [#396] - Handle locale correctly zone info compiler [#397] ```	2016-11-10 17:32:46 +01:00
Luca Cavanna	10a4288a4c	Remove unused ClusterService dependency from SearchPhaseController (#21421 )	2016-11-10 17:32:19 +01:00
Luca Cavanna	bd23921a3a	Fix InternalSearchHit#hasSource to return the proper boolean value (#21441 ) The method used to be called `isSourceEmpty`, and was renamed to `hasSource`, but the return value never changed. Updated tests and users accordingly. Closes #21419	2016-11-10 13:13:38 +01:00
Nguyễn Thanh Tiến	27a7b30349	Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431 ) Add null check in InternalSearchHit#sourceRef to prevent NPE Closes #19279	2016-11-10 10:54:43 +01:00
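
The write-then-move scheme described in commit 3001b636db can be illustrated with a small sketch. This is not the repository code from that commit: it is a minimal Python illustration on a local directory, and the function name `write_index_generation`, the `pending-index-<generation>-<uuid>` naming, and the use of files in place of repository blobs are all assumptions made for the example.

```python
import os
import tempfile
import uuid


def write_index_generation(repo_dir: str, generation: int, data: bytes) -> str:
    """Write index-<generation> by way of a uniquely named pending file.

    Hypothetical helper: the name and the exact pending-file format are
    assumptions for illustration, not taken from the commit.
    """
    # The UUID suffix keeps this attempt's temporary name from colliding
    # with a pending file left behind by an earlier failed attempt.
    pending = os.path.join(
        repo_dir, f"pending-index-{generation}-{uuid.uuid4().hex}"
    )
    final = os.path.join(repo_dir, f"index-{generation}")
    try:
        # Mode "xb" refuses to overwrite an existing file, standing in for
        # a blob store that does not allow overwrites.
        with open(pending, "xb") as f:
            f.write(data)
        # Move the pending file into place under its final name.
        os.replace(pending, final)
    finally:
        # Cleanup on every path: if the write or the move failed, make sure
        # this attempt's pending file does not linger.
        if os.path.exists(pending):
            os.remove(pending)
    return final


if __name__ == "__main__":
    repo = tempfile.mkdtemp()
    # Simulate a dangling pending file from a previously failed snapshot.
    open(os.path.join(repo, "pending-index-5"), "wb").close()
    print(write_index_generation(repo, 5, b"generation 5"))
```

With a fixed name such as `pending-index-5`, the exclusive-create step would fail on the dangling file; the per-attempt UUID avoids that collision, and the `finally` block covers the second half of the fix by removing the attempt's own pending file on failure.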


7198 Commits