9689 Commits

Author SHA1 Message Date
Boaz Leskes 4677d05048 Recovery: mapping check during phase2 should be done in cluster state update task
Before phase2 we verify that the local mapping is in sync with the cluster state mapping (and send & wait on a master update mapping task if not). This check should be done under a cluster state update task to make sure an incoming cluster state update does not change things while we check.

Closes #7744
2014-09-22 11:05:00 +02:00
Boaz Leskes d17fd26f23 Test: RecoveryWhileUnderLoadTests.recoverWhileRelocating should report cluster state when failing to reach green 2014-09-21 20:16:45 +02:00
Boaz Leskes 41fd5d02f4 Discovery: Give a unique id to each ping response
During discovery a node gossips with other nodes to discover the current state of the cluster - what nodes are out there, what version they use and most importantly whether there is an active master out there. During this ping process we may end up in a situation where old information is mixed with new. This is common if a couple of master elections happen in rapid succession.

This commit adds a monotonically increasing id to each ping response. This makes it easy to always select the last ping from every node.

Closes #7769
2014-09-20 12:58:15 +02:00
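A minimal sketch of the selection rule described in 41fd5d02f4 above: keep, for every node, only the ping response with the highest id. The `PingReply` and `PingReplies` names and their fields are hypothetical and chosen for illustration; they are not the actual Elasticsearch classes.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical reply type for illustration; not the actual Elasticsearch class.
final class PingReply {
    final String nodeId;
    final long id;          // assumed to increase monotonically per responding node
    final String masterId;  // master the node reported, or null if none

    PingReply(String nodeId, long id, String masterId) {
        this.nodeId = nodeId;
        this.id = id;
        this.masterId = masterId;
    }
}

final class PingReplies {
    // Keeps only the reply with the highest id for each node, so stale
    // information from an earlier election round is discarded.
    static Map<String, PingReply> latestPerNode(Collection<PingReply> replies) {
        Map<String, PingReply> latest = new HashMap<>();
        for (PingReply reply : replies) {
            PingReply seen = latest.get(reply.nodeId);
            if (seen == null || reply.id > seen.id) {
                latest.put(reply.nodeId, reply);
            }
        }
        return latest;
    }
}
```

Because the id only ever grows, arrival order does not matter: a late-arriving old reply can never replace a newer one.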
Martijn van Groningen afcbffbfc1 Core: Check if from + size don't cause overflow and fail with a better error.
Closes #7778
2014-09-20 12:34:48 +02:00
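One way the check in afcbffbfc1 above could look, as a sketch only: widen both values to `long` before adding, so the sum cannot wrap, and fail with a descriptive message. The class name, method name and the non-negative check are assumptions made for this example, not taken from the commit.

```java
final class PagingBounds {
    // Returns from + size, or throws if the sum does not fit in an int.
    static int checkedWindowEnd(int from, int size) {
        if (from < 0 || size < 0) {
            // Assumption for this sketch: negative paging values are rejected up front.
            throw new IllegalArgumentException(
                "from and size must be non-negative, got from=" + from + ", size=" + size);
        }
        long end = (long) from + size; // widen first so the addition cannot wrap
        if (end > Integer.MAX_VALUE) {
            throw new IllegalArgumentException(
                "from + size must not exceed " + Integer.MAX_VALUE
                    + ", got from=" + from + ", size=" + size);
        }
        return (int) end;
    }
}
```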
mikemccand dbe4e6e674 Internal: remove ForceSyncDirectory
Historical code, not used anymore.

Closes #7804
2014-09-19 14:15:06 -04:00
Brian Murphy 8e742c2096 Indexed Scripts/Templates : Cleanup
This contains several cleanups to the indexed scripts.
Remove the unused FetchSourceContext from the Get request.
Add lang,_version,_id to the REST GET API.
Removes the routing from GetIndexedScriptRequest since the script index is a single shard that is replicated across all nodes.
Fix backward compatible template file reference
Before 1.3.0 on disk scripts could be referenced by requesting
````
_search/template

{
  "template" : "ondiskscript"
}
````
This was broken in 1.3.0 by requiring
````
{
  "template" :
  {
    "file" : "ondiskscript"
  }
}
````
This commit restores the previous behavior.
Remove support for preference, realtime and refresh
These parameters don't make sense anymore for indexed scripts as we always force the preference to _local and
always refresh after a Put to the indexed scripts index.

Closes #7568
Closes #7559
Closes #7647
Closes #7567
2014-09-19 11:59:08 +01:00
Lee Hinman 4185566e93 Add option to take currently relocating shards' sizes into account
When using the DiskThresholdDecider, it's possible that shards could
already be marked as relocating to the node being evaluated. This commit
adds a new setting `cluster.routing.allocation.disk.include_relocations`
which adds the size of the shards currently being relocated to this node
to the node's used disk space.

This new option defaults to `true`; however, it's possible to
over-estimate the usage for a node if the relocation is already
partially complete, for instance:

A node with a 10gb shard that's 45% of the way through a relocation
would add 10gb + (.45 * 10) = 14.5gb to the node's disk usage before
examining the watermarks to see if a new shard can be allocated.

Fixes #7753
Relates to #6168
2014-09-19 12:36:51 +02:00
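The arithmetic in 4185566e93 above can be sketched as follows. This is an illustration under stated assumptions, not the DiskThresholdDecider code: `DiskUsageEstimate` and its parameters are hypothetical, and the estimate simply adds the full size of every shard relocating to the node on top of the bytes already used.

```java
final class DiskUsageEstimate {
    // Returns the disk usage to compare against the watermarks. When
    // includeRelocations is true, the full size of each incoming shard is
    // added, even if part of that shard has already been copied to the node.
    static long usedBytesWithRelocations(long usedBytes,
                                         long[] incomingShardSizes,
                                         boolean includeRelocations) {
        if (!includeRelocations) {
            return usedBytes;
        }
        long total = usedBytes;
        for (long size : incomingShardSizes) {
            total += size;
        }
        return total;
    }
}
```

With the commit's example, a node that already holds 4.5gb of a 10gb shard (45% copied) is counted as 4.5gb + 10gb = 14.5gb, which is where the over-estimate comes from.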
Brian Murphy 61c21f9a0e Bulk API: Do not fail whole request on closed index
The whole bulk API request was marked as failed
if any of the requests inside it referred to
a closed index.

Implementation Note: Currently the implementation is a bit more verbose in order to prevent an instanceof check and another cast - if that is fast enough, we could execute that logic only once at the
beginning of the loop (thinking this might be a bit overoptimization here).

Closes #6410
2014-09-19 10:55:49 +01:00
Brian Murphy 4f791b06db Revert "Bulk API: Do not fail whole request on closed index"
This reverts commit 405e5816b8.
2014-09-19 10:27:28 +01:00
Brian Murphy c7c61bfd91 Revert "Bulk Request : Add Document Request"
This reverts commit 86f575dcea.
2014-09-19 10:27:16 +01:00
Brian Murphy 86f575dcea Bulk Request : Add Document Request
This file was missing.
2014-09-19 10:05:45 +01:00
Brian Murphy 405e5816b8 Bulk API: Do not fail whole request on closed index
The whole bulk API request was marked as failed
if any of the requests inside it referred to
a closed index.

Implementation Note: Currently the implementation is a bit more verbose in order to prevent an instanceof check and another cast - if that is fast enough, we could execute that logic only once at the beginning of the loop (thinking this might be a bit overoptimization here).

Closes #6410
2014-09-19 09:56:49 +01:00
javanna 4fa924494d [TEST] move REST tests to their own test group
Closes #7795
2014-09-19 10:34:30 +02:00
Simon Willnauer 9f6d6d540b [ENGINE] try increment store before searcher is acquired
InternalEngine#refreshNeeded must increment the ref count on the
store before it checks whether the searcher is current, since
internally a searcher ref is acquired and if that happens concurrently
with an engine close it might violate the assumption that all files
are closed when the store is closed.

This commit also converts some try / finally blocks into try-with-resources.
2014-09-19 00:37:30 +02:00
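The ordering described in 9f6d6d540b above (pin the store first, then touch the searcher) follows a common reference-counting pattern. The sketch below illustrates that pattern only; `RefCountedStore`, `RefreshCheck` and the `searcherIsStale` callback are hypothetical stand-ins, not the Elasticsearch `Store` or `InternalEngine` classes.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical ref-counted resource; starts with one reference held by its owner.
final class RefCountedStore implements AutoCloseable {
    private final AtomicInteger refCount = new AtomicInteger(1);
    private volatile boolean filesClosed = false;

    // Takes a reference unless the store has already been fully released.
    boolean tryIncRef() {
        while (true) {
            int current = refCount.get();
            if (current <= 0) {
                return false;
            }
            if (refCount.compareAndSet(current, current + 1)) {
                return true;
            }
        }
    }

    void decRef() {
        if (refCount.decrementAndGet() == 0) {
            filesClosed = true; // last reference gone: files would be closed here
        }
    }

    int refCount() { return refCount.get(); }

    boolean filesClosed() { return filesClosed; }

    @Override
    public void close() { decRef(); } // releases the owner's reference
}

final class RefreshCheck {
    // Pins the store for the duration of the check, so a concurrent close
    // cannot release the files while the searcher is being inspected.
    static boolean refreshNeeded(RefCountedStore store, BooleanSupplier searcherIsStale) {
        if (!store.tryIncRef()) {
            return false; // store already closed: nothing to refresh
        }
        try {
            return searcherIsStale.getAsBoolean();
        } finally {
            store.decRef();
        }
    }
}
```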
javanna 508ff29e0d [TEST] allow to fully disable REST tests including parsing via -Dtests.rest=false
We currently look for REST tests on file system although they are disabled. We should not do that and move the check earlier on. This way third parties using our test infra, which don't have REST tests on file system, can effectively disable the REST tests, otherwise they would get initialization error despite having disabled them.  The downside is that the number of tests visualized is going to be zero instead of the real number of parsed REST tests, but there is nothing we can do about this. Tests get ignored anyways.
2014-09-18 17:05:34 +02:00
Colin Goodheart-Smithe 66417a93a0 Aggregations: Removes isSingleUserCriteria check
This change removes the backwards compatibility workaround that checks that a compoundOrder originated from a single user defined criteria for the purposes of serialising to older versioned nodes.
2014-09-18 15:22:43 +01:00
javanna 5e1f95ca93 [TEST] close REST test execution context only if not null
The context can be null when REST tests are disabled via sysprop.
2014-09-18 16:14:59 +02:00
Colin Goodheart-Smithe f0e9b7b8ef [DOC] Add GET Alias API note to breaking changes
Note explains that GET Alias API now supports IndicesOptions and will error if an index is missing
2014-09-18 15:09:01 +01:00
Simon Willnauer d3e348ef90 [CORE] Add AbstractRunnable support to ThreadPool to simplify async operation on bounded threadpools
Today we have to catch rejected operation exceptions in various places
and notify an ActionListener. This pattern is error prone and adds a lot
of boilerplate code. It's also easy to miss catching this exception,
which is only relevant if nodes are under high load. This commit adds
infrastructure that makes ActionListener a first-class citizen on async
actions.

Closes #7765
2014-09-18 15:25:10 +02:00
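The pattern that d3e348ef90 above centralizes can be sketched with plain `java.util.concurrent` types. This is an illustration, not the Elasticsearch `AbstractRunnable` or `ActionListener` API: `ResultListener` and `ListenerAwareExecution` are hypothetical names, and the only point shown is that a rejection by a bounded pool reaches the listener instead of propagating to the caller.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Executor;
import java.util.concurrent.RejectedExecutionException;

// Hypothetical listener type for this sketch.
interface ResultListener<T> {
    void onResponse(T result);
    void onFailure(Throwable cause);
}

final class ListenerAwareExecution {
    // Submits the task and guarantees the listener hears about the outcome:
    // a result, a failure inside the task, or a rejection by the executor.
    static <T> void execute(Executor executor, Callable<T> task, ResultListener<T> listener) {
        try {
            executor.execute(() -> {
                T result;
                try {
                    result = task.call();
                } catch (Exception e) {
                    listener.onFailure(e);
                    return;
                }
                listener.onResponse(result);
            });
        } catch (RejectedExecutionException e) {
            // The pool or its queue is full: notify instead of throwing at the caller.
            listener.onFailure(e);
        }
    }
}
```

Callers then handle rejection in the same `onFailure` path as any other error, which removes the per-call-site try/catch the commit message describes.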
Simon Willnauer b2477a43c8 [TEST] Reimplement AckTests#testDeleteWarmerNoAcknowledgement
This test was not testing what it was supposed to test. This commit
implements the test as an actual delete warmer test without ack
returned.
2014-09-18 15:21:02 +02:00
javanna 5f97bccb54 Internal: add indices setter to IndicesRequest interface
We currently expose generic getters for indices and indicesOptions on the IndicesRequest interface. This commit adds a generic setter as well, which can be used to set the indices to a request. The setter impl throws `UnsupportedOperationException` if called on internal requests. Also throws exception if called on single index operations, since it accepts an array as argument.

Closes #7734
2014-09-18 14:26:11 +02:00
javanna 6717de9e46 Internal: make sure that update internal requests share the same original headers and request context
Update request internally executes index and delete operations. We need to make sure that those internal operations hold the same headers and context as the original update request. Achieved via copy constructors that accept the current request and the original request.

Closes #7766
2014-09-18 14:00:51 +02:00
javanna b9b5842acc Internal: make sure that all delete mapping internal requests share the same original headers and context
Delete mapping executes flush, delete by query and refresh operations internally. Those internal requests are now initialized by passing in the original delete mapping request so that its headers and request context are kept around.

Closes #7736
2014-09-18 13:58:03 +02:00
Martijn van Groningen f43a8e2961 Aggregations: Fix regression bug for the support of terms aggregation on the `_parent` field. 2014-09-18 12:27:55 +02:00
Simon Willnauer 66421c5a83 [TEST] Only reset test cluster if a test actually failed
Previously we reset the test cluster for all subsequent tests
even though they didn't fail. This makes suites like REST tests faster
and prevents crazy timeouts.

Closes #7775
2014-09-18 10:40:30 +02:00
Simon Willnauer dd97a95b04 [TEST] Wait until warmer is registered when testing timeout 2014-09-18 09:15:17 +02:00
Simon Willnauer 19c969a800 [SNAPSHOT] Minor code cleanup 2014-09-17 19:17:40 +02:00
Simon Willnauer 63eb49d202 [TEST] adjust chunk size to create fewer but bigger files to trigger throttling more reliably 2014-09-17 19:17:40 +02:00
Martijn van Groningen 94ecf59e65 Test: increase zen logging 2014-09-17 19:06:24 +02:00
Simon Willnauer 2be018db84 [TEST] Only close GLOBAL_CLUSTER if it's non-null 2014-09-17 14:53:58 +02:00
Colin Goodheart-Smithe 8a70b115f2 Aggregations: More consistent response format for scripted metrics aggregation
Changes the name of the field in the scripted metrics aggregation from 'aggregation' to 'value' to be more in line with the other metrics aggregations like 'avg'
2014-09-17 11:46:26 +01:00
javanna dd2ef8e014 Internal: make sure that internally generated percolate request re-uses the original headers and request context
Closes #7767
2014-09-17 12:34:29 +02:00
Shay Banon b75d1d885a Add missing cluster blocks handling for master operations
Master node related operations were missing proper handling of cluster blocks, allowing for example to perform cluster level update settings even before the state was fully restored on initial cluster startup

Note: the change allows read-only related settings to be changed without checking for blocks on update settings, since otherwise one could not re-enable metadata/write. It also doesn't check for blocks on the cluster state and health APIs, as those are allowed to be used even when blocked to figure out what causes the block.
closes #7763
closes #7740
2014-09-17 10:55:34 +02:00
Simon Willnauer a2d07058e8 [CORE] Notify listener when execution was rejected 2014-09-17 09:48:51 +02:00
Jordan Snodgrass 6246aac9ab Docs: Indicate that the Children Aggregation is coming in 1.4.0 2014-09-17 09:22:02 +02:00
Britta Weber 364de19251 [TEST] wait until all nodes have joined the cluster after upgrade
upgradeOneNode() only checked if the new node is in the nodes info.
However, this does not guarantee that all nodes have joined the cluster
already. For example the new node could be the master and might not yet
know about all nodes and the other nodes might not know about the new
master yet.
Depending on which client is picked later, the client might then try to
send requests to the old node that was shut down instead of the new one.
Instead of just checking if the new node is in the nodes info we should
therefore also check whether all nodes are in the nodes info.
2014-09-16 23:17:30 +02:00
Simon Willnauer 76657251e0 [SNAPSHOT] Reset missing file hash instead of existing hash
Commit e8a1f2598b504183c1a3f2e60363ceaa0d4b298e introduced a regression
where the already existing hash was replaced instead of the missing one.
2014-09-16 22:51:54 +02:00
Simon Willnauer 21f6bc84fa [TEST] Mute DedicatedClusterSnapshotRestoreTests#restorePersistentSettingsTest 2014-09-16 22:13:25 +02:00
Simon Willnauer d2e19ea665 [TEST] Wait for nodes before calling the API stats API 2014-09-16 19:35:35 +02:00
Boaz Leskes 2083ca0aa3 Discovery: UnicastZenPing don't rename configured host name
#7719 introduced temporary node ids for nodes that can't be resolved via their address. The change is overly aggressive and creates temporary nodes also for the configured target hosts.

Closes #7747
2014-09-16 18:05:35 +02:00
javanna e78737b19b [TEST] Update REST client before each test in our REST tests
In #7723 we removed the `updateAddresses` method from `RestClient` under the assumption that the addresses never change during the suite execution, as REST tests rely on the global cluster. Due to #6734 we restart the global cluster though before each test if there was a failure in the suite. If that happens we do need to make sure that the REST client points to the proper nodes. What was missing before was the http call to verify the es version every time the addresses change, which we do now since we effectively re-initialize the REST client when needed (if the http addresses have changed).

Closes #7737
2014-09-16 16:09:48 +02:00
Simon Willnauer cc99bfe802 [TEST] ensure HTTP is enabled in JsonP tests 2014-09-16 14:56:05 +02:00
Simon Willnauer f4d8f1673a [TEST] Improve HTTP support in TestClusters
This commit disables HTTP for all the suite and test scope tests
since it's an unused / unneeded module which takes time to startup.
This also uses a JVM private port range for HTTP ports to ensure
there are no cross JVM conflicts.
2014-09-16 14:31:43 +02:00
Simon Willnauer 3ed32e022e [SNAPSHOT] Trigger retry logic also if we hit a JsonException
We rely on retry logic when reading a snapshot since it's concurrently
serialized. We should move to better logic here, but the refactoring
of the blobstore changed the semantics and this now throws Json
exceptions rather than returning an unexpected Token.
2014-09-16 14:15:00 +02:00
Shay Banon 99f91f7616 Bulk operation can create duplicates on primary relocation
When a bulk request with create index operations and auto-generated ids is executed while the primary is relocating, and the relocation completes after N items from the bulk have executed, the full shard bulk request will be retried on the new primary. This can create duplicates because the request is not marked as potentially holding conflicts.

This change carries over the response for each item on the request level, and if a conflict is detected on the primary shard, and the response is there (indicating that the request was executed once already), use the mentioned response as the actual response for that bulk shard item.

On top of that, when a primary fails and is retried, the change now marks the request as potentially causing duplicates, so the actual impl will do the extra lookup needed.

This change also fixes a bug in our exception handling on the replica, where if a specific item failed, and it's not an exception we can ignore, we should actually cause the shard to fail.

closes #7729
2014-09-16 12:13:39 +02:00
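A heavily simplified sketch of the carry-over idea from 99f91f7616 above: responses are kept per item on the request, and when a retried item hits a conflict while a response is already present, that earlier response is reused. `ShardBulkSketch`, the string responses and the `relocateAt` parameter are all hypothetical simplifications; a single in-memory set stands in for the shard data seen by both the old and the new primary.

```java
import java.util.HashSet;
import java.util.Set;

final class ShardBulkSketch {
    // Stand-in for the documents held by the shard (visible to old and new primary).
    private final Set<String> docs = new HashSet<>();

    // Executes create operations for ids, recording one response per item in
    // `responses`. relocateAt is the item index at which a relocation is
    // simulated (-1 for none). Returns false if the caller must retry the
    // whole request with the same `responses` array.
    boolean execute(String[] ids, String[] responses, int relocateAt) {
        for (int i = 0; i < ids.length; i++) {
            if (i == relocateAt) {
                return false; // primary relocated mid-request
            }
            boolean created = docs.add(ids[i]);
            if (created) {
                responses[i] = "created:" + ids[i];
            } else if (responses[i] == null) {
                // Conflict with no earlier response: a genuine duplicate id.
                responses[i] = "conflict:" + ids[i];
            }
            // Otherwise: conflict, but the carried-over response shows this item
            // already ran in an earlier attempt, so that response is kept.
        }
        return true;
    }

    int docCount() { return docs.size(); }
}
```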
Boaz Leskes 12cbb3223a Discovery: node join requests should be handled at lower priority than master election
When a node is elected as master or receives a join request, we submit a cluster state update task. We should give the node join update task a lower priority than the elect-as-master task to increase the chance it will not be rejected. During master election there is a big chance that these will happen concurrently.

This commit lowers the priority of node joins from IMMEDIATE to URGENT

Closes #7733
2014-09-16 11:41:59 +02:00
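The effect of the priority change in 12cbb3223a above can be illustrated with a small priority queue. This is a sketch under assumptions: `TaskPriority`, `UpdateTask` and `UpdateTaskQueue` are hypothetical, only the IMMEDIATE and URGENT names come from the commit message, and the queue is not thread-safe.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical priority levels, declared from most to least urgent.
enum TaskPriority { IMMEDIATE, URGENT, HIGH, NORMAL }

final class UpdateTask {
    final String source;
    final TaskPriority priority;
    final long seq; // submission order, breaks ties within a priority

    UpdateTask(String source, TaskPriority priority, long seq) {
        this.source = source;
        this.priority = priority;
        this.seq = seq;
    }
}

final class UpdateTaskQueue {
    // Orders by priority first (enum declaration order), then by submission order.
    private final PriorityQueue<UpdateTask> queue = new PriorityQueue<>(
        Comparator.comparing((UpdateTask t) -> t.priority)
            .thenComparingLong(t -> t.seq));
    private long nextSeq = 0;

    void submit(String source, TaskPriority priority) {
        queue.add(new UpdateTask(source, priority, nextSeq++));
    }

    // Returns the source of the next task to run, or null if the queue is empty.
    String pollSource() {
        UpdateTask next = queue.poll();
        return next == null ? null : next.source;
    }
}
```

With joins submitted as URGENT and the election as IMMEDIATE, the election task runs first even when a join was queued before it.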
Simon Willnauer ec28d7c465 [STORE] Fold two hashFile implementations into one 2014-09-16 11:01:49 +02:00
Simon Willnauer 723a40ef34 [VERSION] s/V_1_4_0_Beta/V_1_4_0_Beta1/g 2014-09-16 10:54:41 +02:00
Simon Willnauer a7dde8dd80 [TEST] Make flush in #indexRandom optional
Some tests like CorruptedTranslogTests rely on the fact that we
are recovering from translog. In those cases we need to prevent
flushes from happening during indexing. This change adds an optional
flag on the #indexRandom utility to disable flushes.
2014-09-16 10:43:28 +02:00
javanna 38f5aa2248 [TEST] Fixed ActionNamesTests to not use random action names that conflict with existing ones
ActionNamesTests#testIncomingAction rarely uses a random action name to make sure that actions registered via plugins work properly. In some cases the random action would conflict with an existing one (e.g. tv) and make the test fail. Fixed also testOutgoingAction although the probability of conflict there is way lower due to longer action names used from 1.4 on.
2014-09-16 10:23:02 +02:00