3003 Commits

Author SHA1 Message Date
Jason Tedor 871d1b4885 Remove and forbid use of j.u.c.ThreadLocalRandom
This commit removes and now forbids all uses of
java.util.concurrent.ThreadLocalRandom across the codebase. The
underlying issue with ThreadLocalRandom is that it cannot be
seeded. This means that if ThreadLocalRandom is used in production code,
then tests that cover any code path containing ThreadLocalRandom cannot
be made reproducible. Instead,
using org.elasticsearch.common.random.Randomness#get will give
reproducible sources of random when running under tests and otherwise
still give an instance of ThreadLocalRandom when running as production
code.
2016-01-08 12:23:48 -05:00
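The idea described in this commit can be illustrated with a minimal sketch. The class name `SeedableRandomness` and the system property `example.tests.seed` are assumptions made for illustration only; this is not the actual `org.elasticsearch.common.random.Randomness` implementation. It returns a seeded `java.util.Random` when a seed is configured and falls back to `ThreadLocalRandom` otherwise.

```java
import java.util.Random;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical stand-in for the idea behind Randomness#get; not the Elasticsearch class.
final class SeedableRandomness {
    // Assumed property name, used only for this illustration.
    static final String SEED_PROPERTY = "example.tests.seed";

    private SeedableRandomness() {}

    /** Returns a seeded Random when a test seed is configured, else ThreadLocalRandom. */
    static Random get() {
        String seed = System.getProperty(SEED_PROPERTY);
        if (seed != null) {
            // Reproducible: the same seed always yields the same sequence.
            return new Random(Long.parseLong(seed));
        }
        // Production path: fast, but cannot be seeded.
        return ThreadLocalRandom.current();
    }

    public static void main(String[] args) {
        System.setProperty(SEED_PROPERTY, "42");
        int first = get().nextInt(1000);
        int second = get().nextInt(1000);
        System.out.println(first == second); // prints "true": same seed, same first value
    }
}
```

A test framework could set the seed once per run and print it on failure, so that a failing run can be replayed with the same random choices.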
Jason Tedor 21f5b0ff35 Remove dead o.e.c.m.UnboxedMathUtils
This commit removes the dead UnboxedMathUtils from the codebase.
2016-01-08 11:58:39 -05:00
Simon Willnauer 71796e2319 [TEST] Close failable translog in a controlled way otherwise assertions are off in the test 2016-01-08 13:10:09 +01:00
Adrien Grand 581fd49dac Merge pull request #15836 from jpountz/fix/bitset_cache_duplicates
BitSetFilterCache duplicates its content.
2016-01-08 09:59:28 +01:00
Igor Motov 8fbb3686cd Improve stability of the testBatchingShardUpdateTask test
On slow machines, when this test randomly picks a large number of shards, it can occasionally take more than 32.5 seconds to snapshot all shards. That causes the test to miss the second-to-last assert in awaitsBusy at 32.5 seconds and then time out in BlockingClusterStateListener at 60 seconds. Due to the timeout, the pending task queue is cleaned before the last awaitsBusy assert at 65 seconds, and as a result the last assert runs on a completely empty queue and fails with a very confusing assertion error.

This commit makes the timeout in BlockingClusterStateListener occur after the last assert in assertBusyPendingTasks and therefore allows assertBusyPendingTasks to perform the last assert before the pending tasks queue is cleaned.

This commit also reduces the maximum number of shards used in the test to 10 in order to speed up this test.
2016-01-07 19:33:50 -05:00
Adrien Grand 3ef9ec25f8 BitSetFilterCache duplicates its content.
We have a bug that makes all per-index bitset caches store bitsets for all
indices. In the case that you have many indices, which is fairly common with
time-based data, this could translate to a lot of wasted memory.

Closes #15820
2016-01-07 18:50:14 +01:00
Britta Weber f93b4cb215 sync translog to disk after recovery from primary
Otherwise, if that node is shut down and restarted, it might have lost all operations
that were in the translog.
2016-01-07 16:27:40 +01:00
Adrien Grand 8bd54dbf5a Merge pull request #15828 from jpountz/enhancement/stricter_metadata_parsing
Make MetaData parsing less lenient.
2016-01-07 15:20:51 +01:00
Adrien Grand 6ce7a972bc Make MetaData parsing less lenient.
Today this simply ignores everything that is not recognized.
2016-01-07 15:20:16 +01:00
Nik Everett 52f28888d5 Merge pull request #15813 from nik9000/xlint1
Remove Xlint:-override,-fallthrough,-static
2016-01-07 08:34:40 -05:00
Boaz Leskes d5e6eb58a8 Log uncaught exceptions from scheduled once tasks
`ScheduledThreadPoolExecutor` allows you to schedule tasks to run once or periodically in the future. If such a task throws an exception, that exception is caught and reported in the future that `ScheduledThreadPoolExecutor#schedule` returns. However, we typically do not capture the future or do not check it for errors. This results in exceptions being swallowed and not reported. To mitigate this, we now wrap any command in a LoggingRunnable (already used for periodic tasks). Also, RunnableCommand is changed to no longer swallow exceptions but to propagate them so that the future reports them.

Closes #15824
2016-01-07 14:04:35 +01:00
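The behavior described in this commit can be reproduced with the JDK alone. The sketch below is an illustration under assumptions: `LoggingRunnableSketch` and its `logging` helper are hypothetical names, and an `AtomicReference` stands in for a real logger. It shows that a failure in a one-shot scheduled task only surfaces through the returned future, and that a wrapper can record it while still propagating it.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration; not the Elasticsearch LoggingRunnable.
final class LoggingRunnableSketch {
    /** Wraps a task so that a failure is recorded (standing in for logging) and then rethrown. */
    static Runnable logging(Runnable delegate, AtomicReference<Throwable> log) {
        return () -> {
            try {
                delegate.run();
            } catch (RuntimeException e) {
                log.set(e); // a real implementation would call a logger here
                throw e;    // still propagate so the future reports the failure
            }
        };
    }

    /** Schedules a failing task once; true if the failure was both recorded and reported by the future. */
    static boolean failureIsVisible() {
        ScheduledThreadPoolExecutor executor = new ScheduledThreadPoolExecutor(1);
        AtomicReference<Throwable> log = new AtomicReference<>();
        try {
            ScheduledFuture<?> future = executor.schedule(
                logging(() -> { throw new IllegalStateException("boom"); }, log),
                10, TimeUnit.MILLISECONDS);
            try {
                future.get(5, TimeUnit.SECONDS);
                return false; // unexpected: the task did not fail
            } catch (ExecutionException e) {
                // The future wraps the original exception as its cause.
                return log.get() == e.getCause();
            } catch (Exception e) {
                return false; // interrupted or timed out
            }
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(failureIsVisible()); // prints "true"
    }
}
```

Without the wrapper, and without anyone calling `get()` on the future, the `IllegalStateException` above would never be seen.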
Simon Willnauer e7f9d685f1 [TEST] Test that translog can recover after random IOException
This commit adds a new test that can throw an IOException at any point in time
and ensures that all previously synced documents can be successfully recovered after hitting
an exception.

Relates to #15788
2016-01-07 10:17:31 +01:00
Adrien Grand 67d233cecd Remove warmers and the warmer API.
Warmers are now barely useful and will be removed in 3.0. Note that this only
removes the warmer API and query-based warmers. We still have warmers internally
for e.g. global ordinals.

Close #15607
2016-01-07 09:57:07 +01:00
Martijn van Groningen 604d59a95e muted test 2016-01-07 09:54:59 +01:00
Nik Everett 20e7fa97db Remove Xlint:-override,-fallthrough,-static
Adds `@SuppressWarnings("fallthrough")` in two places where the fallthrough
is used to implement well known hashing algorithms.
2016-01-06 22:27:14 -05:00
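As an illustration of the kind of code that needs this annotation, the sketch below packs up to three trailing bytes into an int using deliberate switch fallthrough, in the style of the tail handling found in some well-known hash functions. The class and method names are hypothetical and this is not the code touched by the commit.

```java
// Hypothetical illustration of intentional fallthrough; not the Elasticsearch code.
final class FallthroughSketch {
    /** Packs up to three trailing bytes into an int, lowest byte first, using deliberate fallthrough. */
    @SuppressWarnings("fallthrough")
    static int packTail(byte[] data, int offset, int remaining) {
        int k = 0;
        switch (remaining) {
            case 3:
                k ^= (data[offset + 2] & 0xff) << 16;
                // intentional fallthrough
            case 2:
                k ^= (data[offset + 1] & 0xff) << 8;
                // intentional fallthrough
            case 1:
                k ^= (data[offset] & 0xff);
                break;
            default:
                break;
        }
        return k;
    }

    public static void main(String[] args) {
        byte[] data = {1, 2, 3};
        System.out.println(packTail(data, 0, 3)); // prints 197121 (0x030201)
        System.out.println(packTail(data, 0, 2)); // prints 513 (0x0201)
    }
}
```

With `-Xlint:fallthrough` enabled, the compiler would warn on the `case 2` and `case 1` labels unless the method carries the annotation.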
Nik Everett 74c132afc6 Standardize some methods on varargs
Right now we define the same sort of methods as taking String arrays and
string varargs. We should standardize on one and varargs is easier to
call so let's use varargs!
2016-01-06 21:01:58 -05:00
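A short sketch of why varargs is the more convenient of the two signatures: a `String...` parameter accepts individual arguments, no arguments, or an existing array. The method name `joinIndices` is hypothetical and not taken from the codebase.

```java
// Hypothetical illustration; the method name is not from the Elasticsearch codebase.
final class VarargsSketch {
    /** Accepts zero or more names; callers may also pass an existing String[]. */
    static String joinIndices(String... indices) {
        return String.join(",", indices);
    }

    public static void main(String[] args) {
        System.out.println(joinIndices("logs", "metrics"));               // prints "logs,metrics"
        System.out.println(joinIndices(new String[] {"logs", "metrics"})); // prints "logs,metrics"
        System.out.println(joinIndices().isEmpty());                       // prints "true"
    }
}
```

A method declared with `String[]` would force every caller to build the array explicitly, which is the friction the commit removes.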
Jason Tedor a583edb2df Merge pull request #15801 from jasontedor/cyclic-barriers-for-boaz
Use CyclicBarriers for synchronizing driver and test threads
2016-01-06 20:09:28 -05:00
Jason Tedor c147fe5691 Do not lose CacheTest failure stack traces 2016-01-06 20:00:11 -05:00
Nik Everett d54f1a8f20 Merge pull request #15796 from nik9000/boundary_chars
Add test for boundary chars
2016-01-06 18:26:38 -05:00
Nik Everett 9935ae921e Version.LATEST instead of Lucene.VERSION
There was a TODO for it.
2016-01-06 17:36:10 -05:00
Jason Tedor 4c0f5bda47 Use CyclicBarriers for synchronizing driver and test threads
This commit modifies some tests to use CyclicBarriers to correctly and
simply synchronize driver and test threads.
2016-01-06 15:07:05 -05:00
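The pattern named in this commit can be sketched as follows. This is a generic illustration with hypothetical names, not the test code from the commit: a single `CyclicBarrier` with one party per worker plus one for the driver is awaited twice, first so that all threads start together and then so that the driver knows all workers are done.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of driver/worker synchronization with a reusable barrier.
final class BarrierSketch {
    /** Starts the given number of workers and returns how many completed their work. */
    static int runWorkers(int workers) {
        // One party per worker plus one for the driver thread.
        CyclicBarrier barrier = new CyclicBarrier(workers + 1);
        AtomicInteger completed = new AtomicInteger();
        Thread[] threads = new Thread[workers];
        for (int i = 0; i < workers; i++) {
            threads[i] = new Thread(() -> {
                try {
                    barrier.await();             // wait until every thread is ready
                    completed.incrementAndGet(); // the "work" under test
                    barrier.await();             // tell the driver we are done
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new RuntimeException(e);
                }
            });
            threads[i].start();
        }
        try {
            barrier.await(); // release all workers at once
            barrier.await(); // wait for all workers to finish
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new RuntimeException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4)); // prints 4
    }
}
```

Because a `CyclicBarrier` resets after each trip, the same instance serves both synchronization points, which tends to be simpler than pairing separate latches.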
Jason Tedor 22abf14812 Visible failures in cluster state update task execution ordering test 2016-01-06 14:43:24 -05:00
Jason Tedor 557b11cc2b Synchronize threads in cluster state update task execution ordering test
This commit uses a CyclicBarrier to correctly and simply synchronize the
driver and test threads in
ClusterServiceIT#testClusterStateUpdateTasksAreExecutedInOrder.
2016-01-06 14:41:43 -05:00
Jason Tedor d1b4cf6778 Further simplify cluster state update task execution ordering test 2016-01-06 14:41:42 -05:00
Jason Tedor 18b42ce798 Simplify cluster state task execution ordering test 2016-01-06 14:41:42 -05:00
Jason Tedor 270b08b302 Add test that cluster state update tasks are executed in order
This commit adds a test that ensures that cluster state update tasks
are executed in order from the perspective of a single thread.
2016-01-06 14:41:38 -05:00
Jason Tedor ef16113697 Merge pull request #15735 from jasontedor/master-node-change-predicate
Refactor master node change predicate for reuse
2016-01-06 13:58:13 -05:00
Nik Everett add60a7560 [highlighting] Another test for boundary chars 2016-01-06 13:42:15 -05:00
Nicholas Knize 7df9ba6053 [TEST] Speed up GeoShapeQueryTests
This commit speeds up GeoShapeQueryTests by reducing the size of the randomly generated shapes and defaulting geo_shape indexes to use quadtree (more efficient for shapes) over geohash.
2016-01-06 12:41:04 -06:00
Martijn van Groningen 04b79c112f test: unmuted test
The test failed because the percolator now returns up to 10 matches, whereas before this was unbounded. The test has been updated to take this into account by checking the total count instead of the number of matches.
2016-01-06 19:10:55 +01:00
Jason Tedor 3b192cfc74 Merge pull request #15791 from jasontedor/relocating-shard-failure
Only fail the relocation target when a replication request on it fails

Closes #15790
2016-01-06 12:56:49 -05:00
Jason Tedor bb4d857e44 Redundant assertion in TransportReplicationActionTests#runReplicateTest 2016-01-06 12:53:45 -05:00
Jason Tedor c291c17142 Cleanup TransportReplicationActionTests#runReplicateTest
This commit cleans up some of the assertions in
TransportReplicationActionTests#runReplicateTest:
 - use a Map to track actual vs. expected requests
 - assert that no request was sent to the local node
 - use RoutingTable#shardRoutingTable convenience method
 - explicitly use false in boolean conditions
 - clarify requests are expected on replica shards when assigned and
   execution on replicas is true
 - test ShardRouting equality when checking the failed shard request
2016-01-06 12:53:45 -05:00
Jason Tedor 6413adb5bc Assert that replication requests are sent to the correct shard copies
This commit adds tighter assertions in
TransportReplicationActionTests#runReplicateTest that replication
requests are sent to the correct shard copies.
2016-01-06 12:53:45 -05:00
Jason Tedor 75106daf9c Only fail the relocation target when a replication request on it fails
This commit addresses an issue when handling a failed replication
request against a relocating target shard. Namely, if a replication
request fails against the target of a relocation we currently fail both
the source and the target. This leads to an unnecessary
recovery. Instead, only the target of the relocation should be failed.
2016-01-06 12:53:41 -05:00
Nik Everett f5898fb07f [highlighting] Test for boundary chars 2016-01-06 12:32:09 -05:00
Martijn van Groningen 81cffd1be3 test: mute test 2016-01-06 18:30:04 +01:00
Martijn van Groningen 247ce06fc3 percolator: if size is 0 then use TotalHitCountCollector
Fixes PercolateIT#testPercolateSizingWithQueryAndFilter test
2016-01-06 18:00:00 +01:00
Jason Tedor cd56366378 Assert that we fail the correct shard when a replication request fails
This commit adds an assertion to
TransportReplicationActionTests#runReplicateTest that when a replication
request fails, we fail the correct shard.
2016-01-06 11:01:02 -05:00
Martijn van Groningen 2d6adf6428 Percolator refactoring:
* Added percolator field mapper that extracts the query terms and indexes these terms with the percolator query.
* At percolate time these extracted terms are used to query for the percolator queries that are likely to match and therefore need to be evaluated. This can significantly cut down the time it takes to percolate, whereas before all percolator queries were evaluated to see whether they matched the document being percolated.
* Changes made to percolator queries are no longer immediately visible; a refresh needs to happen before the changes are visible.
* By default the percolate API only returns up to 10 matches instead of returning all matching percolator queries.
* Made percolate more modular, so that it is easier to add unit tests.
* Added unit tests for the percolator.

Closes #12664
Closes #13646
2016-01-06 16:08:10 +01:00
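The term-based pre-selection described in this commit can be pictured with a deliberately simplified sketch. Everything here is an assumption for illustration: the class name, the in-memory map from term to query ids, and the string ids are hypothetical, and the real implementation indexes extracted terms in Lucene rather than in a Java map. The sketch only shows the principle that queries sharing no term with the document are never evaluated.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration of term-based candidate selection; not the Elasticsearch percolator.
final class TermPrefilterSketch {
    /** Returns the ids of registered queries that share at least one term with the document. */
    static Set<String> candidates(Map<String, Set<String>> termToQueryIds, Set<String> documentTerms) {
        Set<String> result = new TreeSet<>();
        for (String term : documentTerms) {
            result.addAll(termToQueryIds.getOrDefault(term, Collections.emptySet()));
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> index = new HashMap<>();
        index.put("error", new HashSet<>(Arrays.asList("q1", "q2")));
        index.put("disk", new HashSet<>(Arrays.asList("q3")));

        Set<String> documentTerms = new HashSet<>(Arrays.asList("error", "cpu"));
        // Only q1 and q2 need to be fully evaluated against the document; q3 is skipped.
        System.out.println(candidates(index, documentTerms)); // prints [q1, q2]
    }
}
```

The candidates still have to be evaluated against the document afterwards, since sharing a term does not guarantee a match.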
Yannick Welsch de6dfe15a7 Add PathHierarchy type back to path_hierarchy tokenizer for backward compatibility with 1.x
Closes #15785
2016-01-06 14:37:33 +01:00
Yannick Welsch a6ec1434d6 [TEST] Reduce log level in NodeVersionAllocationDeciderTests 2016-01-06 14:35:47 +01:00
Simon Willnauer 8a90c8085d Merge pull request #15788 from s1monw/dont_delete_tlog_file
Never delete translog-N.tlog file when creation fails
2016-01-06 14:31:22 +01:00
Simon Willnauer 5c833750d7 apply feedback from @bleskes 2016-01-06 14:19:58 +01:00
Simon Willnauer 12b93e72f0 Never delete translog-N.tlog file when creation fails
Today we delete the translog-N.tlog file if any subsequent operation fails,
but we might actually be in a good state if, for instance, the creation of the writer
fails after we successfully baked the new translog generation into the checkpoint. In this situation
we used to delete the translog-N.tlog file and then failed on the next recovery of the translog with a
NoSuchFileException | FileNotFoundException, just like in https://discuss.elastic.co/t/cannot-recover-index-because-of-missing-tanslog-files/38336

This commit changes the behavior and cleans up that limbo state on recovery: if we already have a generation+1 file written but not baked into
the checkpoint, we remove that file, but only if the previous ckp file has already been renamed; otherwise we know we can't be in this state.
2016-01-06 13:10:21 +01:00
Simon Willnauer 56329d0f53 Never call a listener under lock in InternalEngine
We have a postIndex|DeleteUnderLock listener callback to load percolator
queries, which by now is entirely private to the index shard. Yet,
it still calls an external callback while holding an indexing lock, which is scary
since we have no control over how long the operation could possibly take.

This commit decouples the percolator registry entirely from the ShardIndexingService
by pessimistically fetching percolator documents from the engine using realtime get.
Even in situations where the same document is changed concurrently we will eventually end up
in the correct state without losing an update. This also moves the index throttling stats directly into
the engine to entirely remove the need for the dependency between InternalEngine and ShardIndexingService.
2016-01-06 11:38:34 +01:00
Yannick Welsch 55cc88e1ae Fix version-based allocation decider to prevent peer recovery from node with older version
Relocating a non-primary shard from one node to another is actually done by recovering from the active
primary shard in the cluster, and not the node that we are logically relocating from.

Closes #15775
2016-01-06 10:07:39 +01:00
Adrien Grand 7e3ccf2ee3 Merge pull request #15746 from jpountz/fix/missing_terms_agg
Make `missing` on terms aggs work with all execution modes.
2016-01-06 09:32:39 +01:00
Jason Tedor d032dabed5 Merge pull request #15777 from jasontedor/safer-cluster-state-task-notifications
Safe cluster state task notifications
2016-01-05 16:56:24 -05:00
Jason Tedor 05c46c9d35 Safe cluster state task notifications
This commit addresses an issue where a cluster state task listener
throwing an exception could prevent other listeners from being notified,
and could prevent the executor from receiving notifications that a new
cluster state was published. Additionally, this commit also addresses a
similar issue for executors handling cluster state publication
notifications.
2016-01-05 16:44:59 -05:00
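The failure mode addressed by this commit, one throwing listener preventing the rest from being notified, can be shown with a generic sketch. The names below are hypothetical and this is not the cluster service code: each listener is invoked inside its own try/catch so that a failure is recorded and the loop continues.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration of exception-safe listener notification.
final class SafeNotifySketch {
    /** Notifies every listener; failures are collected instead of aborting the loop. */
    static <T> int notifyEach(List<Consumer<T>> listeners, T event, List<Throwable> failures) {
        int notified = 0;
        for (Consumer<T> listener : listeners) {
            try {
                listener.accept(event);
                notified++;
            } catch (Exception e) {
                failures.add(e); // a real implementation would log this
            }
        }
        return notified;
    }

    public static void main(String[] args) {
        List<String> seen = new ArrayList<>();
        List<Consumer<String>> listeners = new ArrayList<>();
        listeners.add(event -> seen.add("first:" + event));
        listeners.add(event -> { throw new IllegalStateException("listener failed"); });
        listeners.add(event -> seen.add("third:" + event));

        List<Throwable> failures = new ArrayList<>();
        int notified = notifyEach(listeners, "state-1", failures);
        System.out.println(notified);        // prints 2
        System.out.println(failures.size()); // prints 1
        System.out.println(seen);            // prints [first:state-1, third:state-1]
    }
}
```

The third listener still runs even though the second one throws, which is the guarantee the commit message describes.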