Commit Graph

5196 Commits

Author SHA1 Message Date
Boaz Leskes 1cc5da43b3 Logging: suppress long mapping logging during mapping updates (unless in TRACE)
Currently DEBUG logs can get very verbose because IndicesClusterStateService logs the complete mapping with every mapping update. We should suppress it if long in DEBUG mode and always log the full one in TRACE.

Closes #7949
2014-10-02 22:19:29 +02:00
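The idea in 1cc5da43b3 can be sketched roughly as follows. This is a minimal illustration only, not the actual IndicesClusterStateService code: the class `MappingLogFormatter`, the method `forLog`, the `maxLength` threshold and the placeholder text are all hypothetical names chosen for the example.

```java
final class MappingLogFormatter {
    // Hypothetical helper: returns the text to log for a mapping update.
    // At TRACE the full mapping is always returned; otherwise a long
    // mapping is replaced by a short placeholder.
    static String forLog(String mappingSource, boolean traceEnabled, int maxLength) {
        if (traceEnabled || mappingSource.length() <= maxLength) {
            return mappingSource;
        }
        return "<mapping suppressed, " + mappingSource.length()
                + " chars; enable TRACE to see it>";
    }

    public static void main(String[] args) {
        String mapping = "{\"properties\":{\"title\":{\"type\":\"string\"}}}";
        System.out.println(forLog(mapping, false, 16)); // DEBUG: placeholder
        System.out.println(forLog(mapping, true, 16));  // TRACE: full mapping
    }
}
```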
Boaz Leskes be2229c183 Discovery: add a finalize round to multicast pinging
When sending a multicast ping, there is no way to determine how long it will take before all nodes will respond. Currently we send two pings (one at start, one after half timeout) and wait until the ping timeout has passed for all responses to come back. However, if all nodes are fast to respond, there is a relatively large gap between the moment that pings were gathered and the election that is based on them. This commit adds a last ping round (at timeout) where we know the number of nodes we expect to receive answers from. Once all nodes have responded, we complete the pinging.

Closes #7924
2014-10-02 15:17:54 +02:00
Boaz Leskes ab5d1b9633 Discovery: only accept unicast pings when started
Due to component start order we may process an incoming ping while the ZenDiscovery module is not yet started. This leads to an exception (from which we recover correctly, but the logs are not nice). UnicastZenPing should only start processing pings if it is started. Previously we processed pings whenever it was not closed or stopped.

Closes #7950
2014-10-02 15:00:25 +02:00
Boaz Leskes c4866b3f03 DiscoveryWithServiceDisruptions: some more java docs and todos 2014-10-02 14:02:31 +02:00
Adrien Grand 3b38db121b Mappings: Make lookup structures immutable.
This commit makes the lookup structures that are used for mappings immutable.
When changes are required, a new instance is created while the current instance
is left unmodified. This is done efficiently thanks to a hash table
implementation based on an array hash trie, see
org.elasticsearch.common.collect.CopyOnWriteHashMap.

ManyMappingsBenchmark returns indexing times that are similar to the ones that
can be observed in current master.

Ultimately, I would like to see if we can make mappings completely immutable as
well and updated atomically. This is not trivial however, e.g. because of dynamic
mappings. So here is a first baby step that should help move towards that
direction.

Close #7486
2014-10-02 13:42:20 +02:00
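The copy-on-write pattern described in 3b38db121b can be sketched as below. This is a deliberately naive illustration: it copies the whole backing map on every update (O(n) per write), whereas the commit relies on a hash trie in org.elasticsearch.common.collect.CopyOnWriteHashMap to make updates efficient. The class `ImmutableLookup` and its methods are hypothetical and do not mirror the real API.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

final class ImmutableLookup<K, V> {
    private final Map<K, V> entries;

    private ImmutableLookup(Map<K, V> entries) {
        this.entries = Collections.unmodifiableMap(entries);
    }

    static <K, V> ImmutableLookup<K, V> empty() {
        return new ImmutableLookup<K, V>(new HashMap<K, V>());
    }

    // Returns a new instance containing the extra entry.
    // The current instance is left unmodified, so concurrent readers
    // never observe a partially updated structure.
    ImmutableLookup<K, V> copyAndPut(K key, V value) {
        Map<K, V> copy = new HashMap<K, V>(entries);
        copy.put(key, value);
        return new ImmutableLookup<K, V>(copy);
    }

    V get(K key) {
        return entries.get(key);
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        ImmutableLookup<String, String> before = ImmutableLookup.empty();
        ImmutableLookup<String, String> after = before.copyAndPut("title", "string");
        System.out.println(before.size()); // 0
        System.out.println(after.size());  // 1
    }
}
```

A writer would publish the instance returned by `copyAndPut` (for example through a volatile field), while readers keep using whichever instance they already hold.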
Alex Ksikes 8d4373ab66 [TEST] MLT malformed doc test fixed 2014-10-01 14:39:55 +02:00
Boaz Leskes dc86ac5752 Test: AckTests.test*Warmer* - make sure at least one shard is started
The Put Warmer API executes the search encapsulated in the warmer before accepting it. This requires that at least one shard is started. The tests used ensureGreen to check for that, but because of a publish timeout of 0 (needed to check the ack mechanism) this doesn't guarantee the shard is really started, just that the master has changed the CS to say so. This commit replaces the ensureGreen with the indexing of a single document.
2014-10-01 13:53:37 +02:00
Simon Willnauer 5747c9ebba [TEST] move fragile tests to BadApples rather than AwaitsFix 2014-10-01 12:37:59 +02:00
Boaz Leskes a2029ed6ec Test: AckClusterUpdateSettingsTests - only set publish_timeout to 0 after green 2014-10-01 12:33:58 +02:00
Lee Hinman 9c8beb8220 Be stricter parsing ids for ids query
Adds a check to make sure that all ids in the query are either strings
or numbers. This is to prevent the case where a user accidentally
specifies:

"ids": [["1", "2"]]

(note the double array)

With this change, an exception will be thrown since the second "[" is
not a string or number, it is a Token.START_ARRAY.

Fixes #7686
2014-10-01 10:34:35 +02:00
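The stricter check described in 9c8beb8220 can be illustrated on already-parsed values. The real change operates on parser tokens (it rejects Token.START_ARRAY); the sketch below instead validates a plain `List<?>`, and the class `IdsValidation`, the method `parseIds` and the exception message are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class IdsValidation {
    // Accepts only strings and numbers as ids; anything else
    // (for example a nested list) is rejected with an exception.
    static List<String> parseIds(List<?> rawIds) {
        List<String> ids = new ArrayList<String>();
        for (Object raw : rawIds) {
            if (raw instanceof String || raw instanceof Number) {
                ids.add(raw.toString());
            } else {
                String type = raw == null ? "null" : raw.getClass().getSimpleName();
                throw new IllegalArgumentException(
                        "Illegal value for id, expected a string or number, got: " + type);
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(parseIds(Arrays.asList("1", 2)));
        try {
            // Mirrors "ids": [["1", "2"]] (note the double array).
            parseIds(Arrays.asList(Arrays.asList("1", "2")));
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```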
Simon Willnauer 50923a764c [TEST] Use canonical path for comparison rather than absolute path 2014-10-01 10:25:20 +02:00
Alexander Reelsen 9903c2480e PluginManager: Fix config path extraction from plugin handle
The PluginManager had a subtle bug in case the config directory was not in the
es home directory - which is always true in case of packaging.

This fixes the plugin manager, so that when specifying a path.home and a
path.conf variable on the commandline, the plugin manager acts
appropriately.
2014-09-30 19:51:07 +02:00
Igor Motov b7a4c6da65 Snapshot/Restore: Allow custom metadata to specify whether or not it should be in a snapshot
Before this change, all persistent custom metadata was stored as part of a snapshot, which required us to remove repositories metadata later during the recovery process. This change allows custom metadata to specify whether or not it should be stored as part of a snapshot.

Fixes #7900
2014-09-30 19:16:42 +04:00
uboness ddbeb910be Changed the root rest endpoint ('/') to use cluster service
Instead of issuing a redundant cluster state request.

Closes #7899
2014-09-30 16:48:22 +02:00
Alex Ksikes e53b2eede7 MLT Query/API: fix `minimum_should_match` bwc
Rounding to the nearest int avoids issues in which (int) (0.59f * 100)
= 58 instead of 59.
2014-09-30 15:38:34 +02:00
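The truncation issue mentioned in e53b2eede7 is easy to reproduce in isolation. The sketch below compares a plain cast with `Math.round`; the class and method names are hypothetical, and only the arithmetic is taken from the commit message.

```java
final class PercentRounding {
    // A narrowing cast drops the fractional part, so a product that lands
    // just below 59 in float arithmetic becomes 58.
    static int truncatedPercent(float fraction) {
        return (int) (fraction * 100);
    }

    // Math.round(float) rounds to the nearest int instead.
    static int roundedPercent(float fraction) {
        return Math.round(fraction * 100);
    }

    public static void main(String[] args) {
        System.out.println(truncatedPercent(0.59f)); // prints 58
        System.out.println(roundedPercent(0.59f));   // prints 59
    }
}
```

The difference arises because 0.59f is stored as a value slightly below 0.59, so the float product is slightly below 59.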
Lee Hinman c86fdecd25 [TESTS] Be less strict about breaker child limit
Failing a parent breaker check is eventually consistent, so the test
could fail the parent limit, throw an exception, and before being
adjusted back down, increment more and throw a circuit breaking
exception on the child. This increases the child's limit, to ensure
we're only testing the parent limit.

It adds an additional assert to ensure that the breaker total is
correctly re-adjusted when the parent breaker has been tripped.
2014-09-30 13:01:27 +02:00
Michael McCandless 4e3f3e7ef8 1.3.4 release: add 1.3.5 Version constant 2014-09-30 06:44:19 -04:00
Michael McCandless 0be4c6a73d Core: go back to unbounded (scaling) thread pool for management threads (revert #7318) 2014-09-30 04:54:28 -04:00
Britta Weber e99be5cb0b [TEST] Mute MoreLikeThisActionTests#*ArtificialDocs 2014-09-30 09:29:32 +02:00
Ryan Ernst 37b294aaec Fix optimize behavior with 'force' and 'flush' flags.
This does the following:
* Make 'force' flag only build a merge if the delegate MP returned no merges
* Add async handling for 'flush' when 'waitForMerges' is false
* Remove flush at the beginning of optimize.  This is something the user can
  do if they wish, before calling optimize.

closes #7886
closes #7904
closes #7920
2014-09-29 15:20:19 -07:00
Suyog Rao 25bce1db5d Nest original exception while creating NoShardAvailableActionException
Closes #7756
2014-09-29 14:10:16 -07:00
Simon Willnauer 20a0c68964 [BUILD] Release version should match latest version
This commit ensures that the latest version in our code is identical
to the project.version specified in the pom.xml file.
2014-09-29 17:45:10 +02:00
Simon Willnauer cfd9ac2f63 [TEST] Use Shutdown API only if nodes are on 1.3.3 or newer to prevent shutdown problems 2014-09-29 17:18:26 +02:00
Michael McCandless aa89c481b0 1.3.3 release: add 1.3.4 version constant 2014-09-29 10:29:18 -04:00
javanna c06b772df0 [TEST] make sure that IndicesRequestTests is repeatable using the same seed
Remove the creation of a node client if not there before each test through setup method. `numClientNodes` makes sure that the client node gets created during suite cluster initialization.
2014-09-29 15:57:14 +02:00
Alex Ksikes b118558962 MLT Query: Support for artificial documents
Previously, the only way to specify a document not present in the index was to
use `like_text`. This would usually lead to complex queries made of multiple
MLT queries per document field. This commit adds the ability to the MLT query
to directly specify documents not present in the index (artificial documents).
The syntax is similar to the Percolator API or to the Multi Term Vector API.

Closes #7725
2014-09-29 15:49:13 +02:00
javanna 43a1e1c353 [TEST] create client nodes using node.client: true instead of node.data: false and node.master: false
Create client nodes using `node.client: true` instead of `node.data: false` and `node.master: false`.

We should create client nodes in our test infra using the `node.client: true` setting, as that is the one that users use and the one that we use in `ClientNodePredicate`. Otherwise we end up not finding client nodes, as they weren't created with the proper setting.

Updated also the `DataNodePredicate` so that `client: true` is enough, no need for `data: false` as well.

Closes #7911
2014-09-29 15:24:17 +02:00
Lee Hinman ab9cc336e5 [TESTS] Additional logging for `testThreadedUpdatesToChildBreakerWithParentLimit` 2014-09-29 15:06:36 +02:00
Boaz Leskes 9b4bf4379a Test: testNodeNotReachableFromMaster had a typo when choosing a non master node 2014-09-29 11:38:39 +02:00
Alex Ksikes 5014158d6b MLT Query: use minimum should match more extensive syntax
The minimum number of optional should clauses of the generated query to match
can now be set using the more extensive minimum should match syntax. This
deprecates the `percent_terms_to_match` parameter in favor of a new
`minimum_should_match` parameter.

Closes #7898
2014-09-29 11:14:56 +02:00
Boaz Leskes 03d880de38 Discovery: master fault detection fall back to cluster state thread upon error
With #7834, we simplified ZenDiscovery by making it use the current cluster state for all its decisions. This had the side effect that a node may start its Master FD before the master has fully processed the cluster state update that adds that node (or elects the master). This is due to the fact that master FD is started when a node receives a cluster state from the master, while the master itself may still be publishing to other nodes.

This commit makes sure that a master FD ping is only failed once we know that there is no current cluster state update in progress.

Closes #7908
2014-09-29 11:12:11 +02:00
Lee Hinman 168b3752ef Refactor the Translog.read(Location) method
It was only used by `readSource`, it has been changed to return a
Translog.Operation, which can have .getSource() called on it to return
the source. `readSource` has been removed.

This also removes the checked IOException, any exception thrown is
unexpected and should throw a runtime exception.

Moves the ReleasableBytesStreamOutput allocation into the body of the
try-catch block so the lock can be released in the event of an exception
during allocation.
2014-09-29 10:13:45 +02:00
mikemccand 6bf635039c Core: upgrade to Lucene 4.10.1 2014-09-28 13:42:12 -04:00
mikemccand 9e8c51b70d fix concurrency bug in index throttling 2014-09-28 12:30:48 -04:00
Boaz Leskes b70f0d5eef Internal: MulticastChannel should wait on receiver thread to stop during shutdown
This was signaled by our tests which shutdown class and check for thread leakage.

Closes #7835
2014-09-27 14:23:07 +02:00
Martijn van Groningen 71adb3ada2 If a node is being shut down some in-flight ping requests may be executed. Make sure to keep track of those ping requests and close the unicast connect executor service.
Closes #7903
2014-09-27 00:05:15 +02:00
javanna e85e07941d Internal: split internal fetch request used within scroll and search
Similar to #7856 but relates to the fetch shard level requests. We currently use the same internal request when we need to fetch within search and scroll. The two original requests though diverged after #6933 as SearchRequest implements IndicesRequest while SearchScrollRequest doesn't. That said, with #7319 we made `FetchSearchRequest` implement IndicesRequest by making it hold the original indices taken from the original request, which are null if the fetch was originated by a search scroll, and that is why original indices are optional there.

This commit introduces a separate fetch request and transport action for scroll, which doesn't hold original indices. The new action is only used against nodes that expose it, the previous action name will be used for nodes older than 1.4.0.Beta1.

As a result, in 1.4 we have a new `indices:data/read/search[phase/fetch/id/scroll]` action that is equivalent to the previous `indices:data/read/search[phase/fetch/id]`, whose request now implements IndicesRequest and holds the original indices coming from the original request. The original indices in the latter request can only be null during a rolling upgrade (already existing version checks make sure that serialization is bw compatible), when some nodes are still < 1.4.

Closes #7870
2014-09-26 18:24:53 +02:00
Britta Weber bac1da25f6 node shutdown: make close() synchronized
An example scenario where this will help:

When the node is shutdown via api call
(https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/test/ExternalNode.java#L219 )
then the call returns immediately even if the node is not actually shutdown yet
(https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/action/admin/cluster/node/shutdown/TransportNodesShutdownAction.java#L226).
If at the same time the process is killed, then the hook that would usually prevent
uncontrolled shutdown
(https://github.com/elasticsearch/elasticsearch/blob/master/src/main/java/org/elasticsearch/bootstrap/Bootstrap.java#L75)
has no effect: It again calls close() which might then just return
for example because one of the lifecycles was moved to closed already.

The bwc test FunctionScoreBackwardCompatibilityTests.testSimpleFunctionScoreParsingWorks
failed because of this. The translog was not properly
written because if the shutdown was called via api, the following process.destroy()
(https://github.com/elasticsearch/elasticsearch/blob/master/src/test/java/org/elasticsearch/test/ExternalNode.java#L225)
killed the node before the translog was written to disk.

closes #7885
2014-09-26 12:46:18 +02:00
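The effect of making close() synchronized, as described in bac1da25f6, can be sketched as follows. This is an illustration under assumptions, not the actual node implementation: the class `ClosableNode` and its fields are hypothetical. The point is that a second caller (such as a shutdown hook) blocks until the first close() has finished, instead of returning early while shutdown work is still in progress.

```java
final class ClosableNode {
    private boolean closed;
    private int closeRuns;

    // Only one thread can be inside close() at a time. A concurrent caller
    // waits for the monitor and then sees closed == true, so it returns only
    // after the shutdown work has actually completed.
    synchronized void close() {
        if (closed) {
            return;
        }
        closeRuns++;
        // Shutdown work (for example flushing state to disk) would go here.
        closed = true;
    }

    synchronized boolean isClosed() {
        return closed;
    }

    synchronized int closeRuns() {
        return closeRuns;
    }

    public static void main(String[] args) throws InterruptedException {
        final ClosableNode node = new ClosableNode();
        Thread apiShutdown = new Thread(new Runnable() {
            @Override
            public void run() {
                node.close();
            }
        });
        apiShutdown.start();
        node.close(); // stands in for the shutdown hook
        apiShutdown.join();
        System.out.println(node.isClosed() + " " + node.closeRuns());
    }
}
```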
Boaz Leskes 36c3e896de NodesFD: simplify concurrency control to fully rely on a single map
The node fault detection class is used by the master node to ping the nodes in the cluster and verify they are alive. This PR simplifies the concurrency controls in the class + adds a test for a scenario that surfaced the problem.

Closes #7889
2014-09-26 11:21:55 +02:00
Boaz Leskes db54e9c2d5 Discovery: remove any local state and use clusterService.state instead
At the moment, ZenDiscovery contains a local copy of the disco nodes plus a flag that indicates whether the local node is master or not. This is redundant as the same information is stored in the cluster state. Having a duplicate copy can lead to unneeded concurrency issues. This PR removes the duplication, including moving the ownership of the localNode creation to ClusterState.

The PR introduces a tighter control of the background joining thread to make sure it is started and stopped together with any cluster state changes. This solves potential concurrency bugs where a joining thread may fail to start.

Lastly, we add a couple of safety checks to make sure that if a node receives a cluster state from a new master while actively trying to join another one (or electing itself) we go back to pinging to actively join it.

Closes #7834
2014-09-26 11:21:55 +02:00
Britta Weber eb9d39f611 [TEST] wait for yellow else assertSearchResponse will trip 2014-09-26 11:13:12 +02:00
Britta Weber 75d2a84772 [TEST] wait for yellow else assertSearchResponse will trip 2014-09-26 10:52:44 +02:00
Michael McCandless e207189037 Tests: turn off CheckIndex for now (it's buggy: there is a race w/ deletion of all files in the data dirs) 2014-09-26 04:44:11 -04:00
Michael McCandless 87e9aba2ac disable CheckIndex for these no-ack tests 2014-09-26 04:08:03 -04:00
Britta Weber 526b464025 field name lookup: return List instead of Set for names matching a pattern
The returned sets are only used for iterating. Therefore we might
as well return a list since this guarantees order.

This is the same effect as in
https://github.com/elasticsearch/elasticsearch/pull/7698
The test SimpleIndexQueryParserTests#testQueryStringFieldsMatch
failed on openjdk 1.7.0_65 with
<jdk.map.althashing.threshold>0</jdk.map.althashing.threshold>

closes #7709
2014-09-26 09:59:12 +02:00
Britta Weber 7feb742a9b script with _score: remove dependency of DocLookup and scorer
As pointed out in #7487 DocLookup is a variable that is accessible by all scripts
for one doc while the query is executed. But the _score and therefore the scorer
depends on the current context, that is, which part of query is currently executed.
Instead of setting the scorer for DocLookup
and having Script access the DocLookup for getting the score, the Scorer should just
be explicitly set for each script.
DocLookup should not have any reference to a scorer.
This was similarly discussed in #7043.

This dependency caused a stack overflow when running script score in combination with an
aggregation on _score. Also the wrong scorer was called when nesting several script scores.

closes #7487
closes #7819
2014-09-26 09:59:12 +02:00
Igor Motov 9c9cd01854 Fix NumberFormatException in Simple Query String Query
Incorrect usage of XContentParser.hasTextCharacters() can result in NumberFormatException as well as other possible issues in template query parser and phrase suggest parsers.

Fixes #7875
2014-09-26 10:49:05 +04:00
Michael McCandless 3db50b2ebf don't CheckIndex for this test case 2014-09-25 18:21:12 -04:00
Michael McCandless 637c6d1606 Tests: always run Lucene's CheckIndex when shards are closed in tests and fail the test if corruption is detected
Today we only run 10% of the time, and the test doesn't fail when
corruption is detected.

I think it's better to always run and fail the test, so we can catch
any possible resiliency bugs in Lucene/Elasticsearch causing corruption.

For known tests that create corrupted indices, it's easy to set
MockFSDirectoryService.CHECK_INDEX_ON_CLOSE to false...

Closes #7730
2014-09-25 16:50:48 -04:00
markharwood e97b8fd217 Aggs - support for arrays of numeric values in include/exclude clauses
Closes #7714
2014-09-25 11:02:29 +01:00