Commit Graph

11713 Commits

Author SHA1 Message Date
Boaz Leskes cde7d9af1c Transport: fix racing condition in timeout handling
If a request comes in at the same moment the timeout handler for it runs, we may leak a timeoutInfoHolder and erroneously log "Transport response handler not found of id" . The same issue could cause the request tracer to fire a traceUnresolvedResponse call instead of traceReceivedResponse , causing a failure of testTracerLog ( see #10187 ) .

This commit makes sure timeoutInfoHolder is visible before removing the corresponding RequestHolder. It also unifies the TransportService.Adapter#remove(requestId) with TransportService.Adapter#onResponseReceived(requestId), as they are always called together to indicate a response was received.

Closes #10187
Closes #10220
2015-03-23 14:40:08 +01:00
Boaz Leskes f5f9739117 Recovery: only cancel when primary completed relocation
When a primary moves to another node, we cancel ongoing recoveries and retry from the primary's new home. At the moment this happens when the primary relocation *starts*. It's a shame as we cancel recoveries that may be close to completion and will finish before the primary has been fully relocated. This commit only triggers the cancelation once the primary relocation is completed.

Next to this, it fixes a race condition between recovery cancellation and the recovery completion. At the moment we may trigger remove a recovered shard just after it was completed. Instead, we should use the recovery cancellation logic to make sure only one code path is followed.

All of the above caused the recoverWhileUnderLoadWithNodeShutdown test to fail (see http://build-us-00.elastic.co/job/es_core_15_debian/32/ ). The test creates an index and then increasingly disallows nodes for it, until only 1 node is left in the allocation filtering rules. Normally, this means we stay in green, but the premature recovery cancellation plus the race condition mentioned above caused a shard to be failed and stay unassigned and the test asserts to fail. This happens due to the following sequence:

- The shard has finished recovering and sent the master a shard started command.
- The recovery is cancelled locally, removing the index shard.
- Master starts shard (deleting it's other copy).
- Local node gets a cluster state with the shard started in it, which cause it to send a shard failed (to make the master aware).
- Shard is failed and can't be re-assigned due to the allocation filter.

The recoverWhileUnderLoadWithNodeShutdown is also adapted a bit to fit the current behavior of allocation filtering (in the past it used to really shut down nodes). Last, all tests in that class are given better names to fit the current terminology.

Clsoes #10218
2015-03-23 14:24:37 +01:00
Simon Willnauer cae2707375 [ALLOCATION] Verify shards index UUID when fetching started shards
Today we simply fetch the shards metadata without verifying the
index UUID the shard belongs to. We recently added this UUID
to the shard state metadata. This commit adds verification
to the shard metadata fetching to prevent bringing shards
back into an index it doesn't belong to due to name collisions.
2015-03-23 14:04:15 +01:00
Martijn van Groningen 911f522a0e change url to use elastic organization 2015-03-23 13:13:26 +01:00
Boaz Leskes 4970e3e225 Revert "Rest: Add json in request body to scroll, clear scroll, and analyze API"
This reverts commit 16083d454c.
2015-03-23 12:57:19 +01:00
Britta Weber f34c6b7065 Remove unused code 2015-03-23 11:31:21 +01:00
Sylwester Lachiewicz 341a52d829 ThreadPool: use `equals` instead of `==` for threadpool names comparisons
Although the current comparisons work, since we rely on the fact that we always use the same constants, `equals` seems safer.

Closes #10204
2015-03-23 10:39:08 +01:00
Colin Goodheart-Smithe b751f0e11b added validation of reducers 2015-03-23 09:12:28 +00:00
Shay Banon 9f9ada4b7b Upgrade to Jackson 2.5.1
Note, Jackson 2.5 is less lenient when it comes to not starting an object before starting to add fields on a fresh builder, fixed where applicable.
closes #10210
2015-03-23 09:03:27 +01:00
Jun Ohtani 16083d454c Rest: Add json in request body to scroll, clear scroll, and analyze API
Add json support to scroll, clear scroll, and analyze

Closes #5866
2015-03-23 15:35:38 +09:00
Tim Schlechter 56117e3f70 [SPEC] Remove duplicated timeout param from bulk REST spec
Closes #10205
2015-03-22 10:26:12 +01:00
Devin Chollak 3eab3ca1c4 Logging: change logging to warning in LocalAllocateDangledIndices
Closes #9593
2015-03-21 20:06:43 +01:00
Shay Banon 34c5b934f1 TEST: use the random variable
we have to use the variable instead of the static methods to reproduce failing tests
2015-03-21 19:08:33 +01:00
Shay Banon 1032429d31 ping: remove version check 2015-03-21 19:02:20 +01:00
Shay Banon 0cc13b9d5b Schedule transport ping interval
Sometimes, when using transport client for example, through a load balancer, there is a need to send a scheduled ping message to keep each channel alive.

Add support for `transport.ping_schedule`, which controls the schedule (-1 for disabled) at which a ping message will be sent. For transport client case, it gets enabled automatically since almost always this is the desired behavior.

We use the same 6 bytes header format for the ping message, with ES header and -1 for data length for ping message, and simply continue to process the next messages once this is encountered.

closes #10189
2015-03-21 18:55:10 +01:00
javanna a8f55c0677 [DOCS] removed extra line break from monitoring.asciidoc 2015-03-21 14:19:30 +01:00
javanna 481ef7d3f4 [TEST] add REST test for update index settings api using indices options
Relates to #10030
2015-03-21 11:10:24 +01:00
olivier bourgain 9561678ca0 Update Settings api: fix handling of IndicesOptions parameters in REST layer
The Update Settings API tries to merge the query_string params with the settings sent as body and excluding some "well known params" such as `pretty`, `timeout` etc. Those well known params do not include the params used by IndicesOptions though, so they get merged resulting in invalid settings that get rejected.

Closes #10030
2015-03-21 10:59:44 +01:00
javanna 65b9a94baa Benchmark api: removed leftovers
Closes #10182
Closes #10184
2015-03-21 10:36:05 +01:00
Simon Willnauer 7257345db9 Revert Benchmark API
The benchmark api is being worked on feature/bench branch and will be merged from there when ready.
2015-03-21 10:36:04 +01:00
javanna 66fa72f21c Java API: remove duplicated consistency level setters
`setConsistencyLevel` setter is already present in the base class `ShardReplicationOperationRequestBuilder`. It is not needed in `DeleteRequestBuilder` and `IndexRequestBuilder`.

Closes #10188
2015-03-21 10:30:36 +01:00
Asimov4 649e3aa4c5 [DOCS] Fix typos in percolate.asciidoc 2015-03-21 10:23:15 +01:00
popovae b7328a4d45 [DOCS] Add ElasticOcean mobile app to monitoring.asciidoc 2015-03-21 10:03:56 +01:00
Reuben Sutton 051bc8c49e Rest: remove indices_boost _search URL param
Relates to https://github.com/elastic/elasticsearch-ruby/issues/29, the parsing logic is incorrect as comma is used twice as a separator (e.g. indices_boost=index1,5,index,10 in expected but will never be parsed correctly). Anyways `indices_boost` makes more sense in the request body only where properly parsed and supported.

Closes #6281
2015-03-21 09:25:46 +01:00
javanna 88e506e58c [DOCS] add -i flag to more curl HEAD calls 2015-03-21 08:56:20 +01:00
Corey Daley 366f01b4d2 [DOCS] add -i flag to curl HEAD call
without -i you never see the status:200 or status:404 messages
2015-03-21 08:56:12 +01:00
Robert Muir 6da99b3ef0 [Bootstrap] Throw exception if the JVM will corrupt data.
Detect the worst-offenders, all IBM versions and several known hotspot
versions that can cause index corruption, and fail on startup.

Provide/detect compiler workarounds when they exist, but warn about
performance degradation.

In all cases the check can be bypassed completely with a safety
switch via undocumented system property (es.bypass.vm.check=true)

Closes #7580
2015-03-21 02:47:44 -04:00
Simon Willnauer 5e5f00de08 Read first bytes as ints to respect method contract 2015-03-20 17:09:07 -07:00
Florian Hopf 865cbeb3d8 Filter indices stats for translog
Added the missing call in the RestAction, closes #8262
2015-03-20 14:38:49 -07:00
Shay Banon 31b63b1a84 Better detection of CBOR
CBOR has a special header that is optional, if exists, allows for exact detection. Also, since we know which formats we support in ES, we can support the object major type case.
closes #7640
2015-03-20 22:06:03 +01:00
Al Lefebvre 94f82368f0 Update templates.asciidoc
I've been attempting to programatically verify that adding index templates via the `{path.conf}/templates/` directory works fine although I was never able to validate this via an API call to the `/_template/`.  It seems that these templates do not appear in that API call, which I discovered in the following mail thread:

http://elasticsearch-users.115913.n3.nabble.com/Loading-of-index-settings-template-from-file-in-config-templates-td4024923.html#d1366317284000-912

My question is why wouldn't the `/_template/*` method return these templates?  This tends to complicate things for those that want to perform automated tests to verify that they are in fact being recognized and used by Elasticsearch.
2015-03-20 14:52:20 -06:00
Simon Willnauer 1d357cb48a Merge pull request #10157 from spinscale/1503-cleanup-remove-bytesarray-unsafe
Cleanup: Remove unsafe field in BytesStreamInput
2015-03-20 13:14:02 -07:00
Michael McCandless 34b397597c Core: increase default rate limiting for snapshot, restore and recovery to 40 MB/sec
This also fixes a possible issue that may cause over-throttling when
there are many small files being copied, which should be rare.

Closed #10185
2015-03-20 16:09:42 -04:00
Simon Willnauer 93fedcbb88 [RECOVERY] Wipe shard state before switching recovered files live
Today we leave the shard state behind even if a recovery is half finished
this causes in rare conditions shards to be recovered and promoted as
primaries that have never been fully recovered.

Closes #10053
2015-03-20 08:42:03 -07:00
Boaz Leskes 0f2d2d0495 Test: testCancelRecoveryAndResume - add network hook before bumping replicas 2015-03-20 16:24:36 +01:00
Boaz Leskes 7d64877326 remove left over recovery log 2015-03-20 16:10:26 +01:00
javanna 9ca74c8217 [DOCS] clarify no-master-block docs
Closes #9739
2015-03-20 15:58:16 +01:00
Boaz Leskes 8c69535580 Recovery: add throttle stats
This commit adds throttling statistics to the index stats API and to the recovery state.

Closes #10097
2015-03-20 10:13:40 +01:00
javanna 4348959f9d Delete api: remove broadcast delete if routing is missing when required
This commit changes the behaviour of the delete api when processing a delete request that refers to a type that has routing set to required in the mapping, and the routing is missing in the request. Up until now the delete api sent a broadcast delete request to all of the shards that belong to the index, making sure that the document could be found although the routing value wasn't specified. This was probably not the best choice: if the routing is set to required, an error should be thrown instead.

A `RoutingMissingException` gets now thrown instead, like it happens in the same situation with every other api (index, update, get etc.). Last but not least, this change allows to get rid of a couple of `TransportAction`s, `Request`s and `Response`s and simplify the codebase.

Closes #9123
Closes #10136
2015-03-20 09:19:43 +01:00
Simon Willnauer 780fe57a2c [TEST] Add more logging to TruncatedRecoveryTests 2015-03-19 23:17:48 -07:00
Lee Hinman f14e226e82 [TEST] Add additional logging to testCorruptTranslogFiles 2015-03-19 22:58:49 -06:00
Lee Hinman 25b09ede16 [TEST] Use less shards in testShadowReplicaNaturalRelocation
This makes the assertion a bit more flexible and removes the
`ensureGreen` in favor of `ensureYellow`, which is really all that is
needed to perform a search. On slow machines the relocations can take a
while and time out the `ensureGreen`.
2015-03-19 18:12:21 -06:00
Britta Weber f253542abb remove unused method. close() is neither needed nor called anywhere
closes #10175
2015-03-19 17:05:24 -07:00
Simon Willnauer 48388dbb6e apply feedback 2015-03-19 15:56:28 -07:00
Simon Willnauer d9be606c5e [LIFECYCLE] Add before/afterIndexShardDelete callback
This commit allows code to be executed before or after a shards content
is deleted from disk. This is only executed if the shard owns the
content ie. on a shard file system only a primary shard will execute
these operations.
2015-03-19 15:47:51 -07:00
Simon Willnauer a6897aa073 [GATEWAY] Move GatewayShardsState logic into IndexShard
The index shard should take care of shard state persistence since it has
all the information and the gateway concept has been removed in master.
2015-03-19 14:48:02 -07:00
Simon Willnauer 1168347b9d [REPLICATION] Remove `async` replication
Closes #10114
2015-03-19 14:44:21 -07:00
Boaz Leskes 955ff05a5e Transport: fix potential NPE in new tracer log if request timeout
Closes #9994
2015-03-19 21:29:34 +01:00
Clinton Gormley a8e5c6eeb0 Merge pull request #10169 from clintongormley/deprecate_thrift_memcached
Remove references to the thrift and memcached transport plugins
2015-03-19 20:56:25 +01:00
Clinton Gormley aa94ced0ae Remove references to the thrift and memcached transport plugins
as they are no longer supported

Closes #10166
2015-03-19 20:49:58 +01:00