8198 Commits

Author SHA1 Message Date
Shay Banon 342a32fb16 Search might not return on thread pool rejection
When a thread pool rejects the execution on the local node, the search might not return.
This happens because we move to the next shard only *within* the execution on the thread pool in the start method. If submitting the task to the thread pool fails, the code goes through the fail-shard logic, but without "counting" the current shard itself. The relevant shard then executes more times than intended, which skews the total ops counter. For example, if the search succeeds on another shard, the total ops will be incremented *beyond* expectedTotalOps, so the == check used as the exit condition never succeeds.
The fix makes sure that the shard iterator progresses properly even in the case of rejections, and also improves when the clean-context request is sent in case of failures (problems there were exposed by the test).
Though the change fixes the problem, we should work on simplifying the code path considerably. The first suggestion, as a follow-up, is to remove the support for operation threading (also in broadcast) and move the local execution optimization to SearchService. This will simplify the code in the different search actions considerably and will allow removing the problematic #firstOrNull method on the shard iterator.
The second suggestion is to move the optimization of local execution to the TransportService, so that actions do not have to perform the mentioned optimization explicitly.
fixes #4887
2014-05-05 09:24:53 +02:00
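The exit-condition problem described in the commit above can be illustrated with a minimal Java sketch. The class and method names (`OpsCounterSketch`, `reachesExit`) are hypothetical and are not taken from the Elasticsearch search action. The sketch also assumes that a successful shard may credit more than one operation at once, which is what makes it possible for the counter to jump past the expected total.

```java
public class OpsCounterSketch {
    /**
     * Replays a sequence of counter increments and reports whether the
     * running total ever lands exactly on expectedTotalOps, i.e. whether
     * an "==" exit check would fire. (Hypothetical helper for illustration.)
     */
    static boolean reachesExit(int expectedTotalOps, int[] increments) {
        int totalOps = 0;
        for (int inc : increments) {
            totalOps += inc;
            if (totalOps == expectedTotalOps) {
                return true; // the search would return here
            }
        }
        return false; // the counter skipped the expected value: the search hangs
    }

    public static void main(String[] args) {
        // Intended accounting: two failures on one shard, then a success
        // on another shard that credits two operations at once.
        System.out.println(reachesExit(4, new int[] {1, 1, 2}));
        // Skewed accounting: the rejected shard is counted one extra time,
        // so the later success pushes the total from 3 straight to 5.
        System.out.println(reachesExit(4, new int[] {1, 1, 1, 2}));
    }
}
```

A `>=` comparison would tolerate the overshoot, but the commit instead fixes the accounting itself so that the iterator advances on rejection.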
javanna e96e634d10 [TEST] fixed _cat/thread_pool REST tests with local transport, in case the transport port is not available and gets returned as '-'
Re-enabled REST tests suite

Closes #6033
2014-05-04 22:10:03 +02:00
mikemccand 6bc3a744a1 Fix StackOverflowException for long suggestion strings
Changed getFiniteStrings to use an iterative implementation instead of
recursive, so we don't use a Java stack-frame per character for each
suggestion at build & query time.
2014-05-04 13:35:05 -04:00
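The recursive-to-iterative change described in the commit above can be sketched as follows. This is not the Lucene `getFiniteStrings` implementation; the automaton representation (a map of per-state transitions), the class name and the method signature are assumptions made for illustration. The sketch also assumes the automaton is acyclic, otherwise the loop would not terminate.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class IterativeFiniteStrings {
    /** One pending unit of work: a state plus the string spelled so far. */
    private static final class Frame {
        final int state;
        final String prefix;

        Frame(int state, String prefix) {
            this.state = state;
            this.prefix = prefix;
        }
    }

    /**
     * Enumerates every string accepted by an acyclic automaton using an
     * explicit heap-allocated stack, so the traversal depth is not limited
     * by the Java call stack.
     */
    static List<String> finiteStrings(Map<Integer, Map<Character, Integer>> transitions,
                                      Set<Integer> accepting, int start) {
        List<String> results = new ArrayList<>();
        Deque<Frame> stack = new ArrayDeque<>();
        stack.push(new Frame(start, ""));
        while (!stack.isEmpty()) {
            Frame frame = stack.pop();
            if (accepting.contains(frame.state)) {
                results.add(frame.prefix);
            }
            Map<Character, Integer> outgoing = transitions.get(frame.state);
            if (outgoing == null) {
                continue; // dead end: no outgoing transitions
            }
            for (Map.Entry<Character, Integer> e : outgoing.entrySet()) {
                stack.push(new Frame(e.getValue(), frame.prefix + e.getKey()));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Tiny automaton accepting "ab" and "ac".
        Map<Integer, Map<Character, Integer>> t = new HashMap<>();
        Map<Character, Integer> from0 = new HashMap<>();
        from0.put('a', 1);
        t.put(0, from0);
        Map<Character, Integer> from1 = new HashMap<>();
        from1.put('b', 2);
        from1.put('c', 2);
        t.put(1, from1);
        Set<Integer> accepting = new HashSet<>();
        accepting.add(2);
        System.out.println(finiteStrings(t, accepting, 0));
    }
}
```

The sketch copies the prefix on every transition for simplicity; a production implementation would likely share or reuse buffers.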
Shay Banon c9f1792c81 Change default filter cache to 10% and circuit breaker to 60%
The defaults we have today for our data-intensive memory structures don't add up to proper protection from potential OOMs.
The circuit breaker, today at 80%, aims at protecting from extensive field data loading. The default threshold today is too permissive and can still cause OOMs.
The filter cache today is at 20%, which is too high when added to the other limits we have. Reduce it to 10%, which is still a big enough portion of the heap, yet provides improved safety.
closes #5990
2014-05-04 15:38:16 +02:00
Adrien Grand 01eb01cb70 [TEST] Disable REST tests until #6033 is fixed. 2014-05-04 11:58:30 +02:00
Zachary Tong f4c5cde8af [TEST] Replace folded blocks with literal blocks
The regex tests are formatted with blocks for readability.  Previously,
they were formatted using folded style blocks (e.g. using `>`). Folded
blocks convert newlines into spaces.  This is problematic for our regex,
since comments can only be terminated with a newline.

Effectively, anything after a comment will be commented out, making many
of the regex "silently pass".

This commit replaces them with scalar-style blocks (e.g. using `|`), which
treats newlines as significant, and thus correctly terminates comments
inside the regex.

Also fixes a regex test (`cat.thread_pool/10_basic.yaml`) that started
to fail after the block was fixed.  The test was missing a `\s+` before
the closing newline.
2014-05-02 18:30:48 -04:00
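The "silently pass" effect described in the commit above can be reproduced with a small Java sketch. It uses `java.util.regex` with the `Pattern.COMMENTS` flag as a stand-in for the REST test runner's regex handling, and it models a folded block simply by replacing newlines with spaces; both are simplifying assumptions, and the class and constant names are hypothetical.

```java
import java.util.regex.Pattern;

public class FoldedBlockRegex {
    // The regex as a literal block (`|`) would deliver it: newlines are kept,
    // so each "#" comment ends at the end of its own line.
    static final String LITERAL =
        "^ \\d+   # leading digits\n"
      + " \\s+ foo  # then the word foo\n"
      + " $";

    // Simplified model of a folded block (`>`): newlines become spaces,
    // so the first "#" comments out everything that follows it.
    static final String FOLDED = LITERAL.replace('\n', ' ');

    static boolean matches(String regex, String input) {
        return Pattern.compile(regex, Pattern.COMMENTS).matcher(input).find();
    }

    public static void main(String[] args) {
        String wrong = "123 bar";
        // Literal form: the full pattern is applied and rejects the input.
        System.out.println(matches(LITERAL, wrong));
        // Folded form: only "^\d+" survives, so the bad input is accepted.
        System.out.println(matches(FOLDED, wrong));
    }
}
```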
gabriel-tessier 48930c2950 [DOC] Fix typo in function score query documentation. 2014-05-02 23:44:56 +02:00
Boaz Leskes 694bf287d6 Do not start a recovery process if the primary shard is currently allocated on a node which is not part of the cluster state
If a source node disconnects during recovery, the target node will respond by canceling the recovery. Typically the master will respond by removing the disconnected node from the cluster state and promoting another shard to become primary. This is sent to all nodes, and the target node will start recovering from the new primary. However, if the drop of a node caused the node count to go below min_master_node, the master will step down and will not promote a shard immediately. When a new master is elected, we may publish a new cluster state (whose point is to notify of a new master) which is not yet updated. This caused the node to start a recovery to a non-existent node. Previously, we aborted the recovery without cleaning up the shard, causing subsequent correct cluster states to be ignored. We should not start the recovery process but wait for another cluster state to come in.

Closes #6024
2014-05-02 23:30:24 +02:00
Alex Ksikes b55d8ed2e3 Fix behavior on default boost factor for More Like This.
A boost terms factor of 1.0 is not the same as no boosting of terms.
The desired behavior is to deactivate boosting by default. If the user
specifies any value other than 0, then boosting is activated.

Closes #6021
2014-05-02 16:59:09 +02:00
Mansur Ashraf d5f90e9803 [DOCS] Added Twitter Storehaus client
2014-05-02 12:08:05 +02:00
Holger Hoffstätte f5c9bf6f0f Update JNA to latest version
Updating to this version allows configuring a special JNA directory,
in case the /tmp directory is mounted with the noexec option, as JNA
extracts some data and tries to execute parts of it.

Also updated documentation to clarify mlockall and memory settings as well
as pointing to the new jna.tmpdir system property.

Closes #5493
2014-05-02 11:52:57 +02:00
Britta Weber 2e44040388 function_score parser throws exception if both functions:[] and single function given
In addition, add a special warning if the misplaced function is a "boost_factor"
function to avoid confusion of "boost" and "boost_function".

closes #5995
2014-05-02 10:53:33 +02:00
Shay Banon a557ee8daf Support empty properties array in mappings
closes #5887
2014-05-01 12:18:39 -04:00
Boaz Leskes 42a112f50b debug log of receiving a cluster state from another master could be erroneously logged
Added trace logging to MinimumMasterNodesTests.multipleNodesShutdownNonMasterNodes
2014-05-01 13:15:08 +02:00
Martijn van Groningen 9493824a0e [TEST] (RecoveryPercolatorTests) Don't stop the master node and always use the client of the master node 2014-05-01 14:06:34 +07:00
Martijn van Groningen 61093f1bd1 [TEST] Replace execute().actionGet() with get() 2014-05-01 14:06:34 +07:00
Shay Banon 23f200bc0e Use non analyzed token stream optimization everywhere
In the string type, we have an optimization to reuse the StringTokenStream on a thread local when a non-analyzed field is used (instead of creating it each time). We should use this across the board, in all places where we create a field with a String.
Also, move to a specific XStringField, so that we can reuse the StringTokenStream instead of copying it.
closes #6001
2014-04-30 17:18:15 -04:00
Martijn van Groningen 12f43fbbc0 Fixed license headers. 2014-05-01 00:33:17 +07:00
Martijn van Groningen 013b319415 Added `reverse_nested` aggregation.
The `reverse_nested` aggregation allows aggregating on properties outside of the nested scope of a `nested` aggregation.

Closes #5507
2014-05-01 00:23:05 +07:00
Martijn van Groningen 5a0070071a Use collectExistingBucket in GlobalOrdinalsSignificantTermsAggregator.WithHash.
Relates to #5955.
2014-04-30 23:24:33 +07:00
Matt Weber 2663d04a96 Run tests through forbidden-apis. 2014-04-30 17:48:33 +02:00
Adrien Grand 34fb5e48e2 Use collectExistingBucket in GlobalOrdinalsStringTermsAggregator.WithHash.
Relates to #5955.
2014-04-30 15:34:01 +02:00
Boaz Leskes 870bd90f54 ThreadPool.EstimatedTimeThread should be set on initialization
Some tests run before the thread is started and thus use 0 as the current time, which later on leads to big time jumps and thus failures.
Ex. InternalEngineTests.testVersioningReplicaConflict2
2014-04-30 11:47:47 +02:00
Adrien Grand b2db7c8222 Improve the way sub-aggregations are collected.
Sub-aggregations are currently collected directly, by just forwarding the
doc ID and bucket ordinal to them. This change adds the new BucketCollector
abstract class that Aggregator extends, so that we have more flexibility to
add implicit filters or buffering between an aggregator and its sub
aggregators.

Close #5975
2014-04-30 08:47:25 +02:00
Adrien Grand 2eeaa56d95 Fix setting of readerGen in BytesRefOrdValComparator on nested documents.
Sorting was broken on nested documents because the `missing(slot)` method
didn't correctly set the segment ordinal (readerGen), causing term ordinals to
be compared across segments.

Close #5986
2014-04-30 08:21:26 +02:00
Shay Banon 2076194d8f Upgrade to Jackson 2.3.3
fixes the long value bug as well...
2014-04-29 20:13:43 -04:00
Shay Banon 34302a7cc5 disable using CBOR in randomized test infra
Due to a bug in CBOR handling of long values (a test case to verify it is included), disable using CBOR in our tests until it gets fixed.
2014-04-29 19:11:12 -04:00
Spencer Alger 9038db7bfc [REST-SPEC] update to update test, to check for es-js error messages 2014-04-29 14:18:20 -07:00
Martijn van Groningen dce127bcdf Added global ordinals based implementations for significant terms aggregator.
Closes #5970
2014-04-30 01:36:02 +07:00
Shay Banon a4ef418e6e Range/Term query/filter on dates fail to handle numbers properly
When a number (milliseconds since epoch, UTC) is provided, range and term query/filter don't handle it correctly: they convert it to a string, which is then first parsed as a date.
closes #5969
2014-04-29 14:25:05 -04:00
mikemccand fb53784e3b add thread name to logger message from IndexWriter's infoStream 2014-04-29 10:50:36 -04:00
Adrien Grand 6ec01c13e5 Fix computation of the missing ord (leftover of the ordinals change). 2014-04-29 16:29:01 +02:00
Binh Ly fe89b8735a [DOC] Fixed filtered_query typo 2014-04-29 10:24:52 -04:00
Britta Weber 9d214d14fe Provide meaningful error message if field has no fielddata type
closes #5930
2014-04-29 15:19:01 +02:00
mikemccand a8d4c04fc2 include thread name when logging IndexWriter's infoStream messages 2014-04-29 05:50:13 -04:00
Adrien Grand d07c5a5c32 Aggregations parsing is too lenient.
Close #5827
2014-04-29 11:07:06 +02:00
Martijn van Groningen 8817281a70 Added AwaitsFix 2014-04-29 13:58:39 +07:00
Martijn van Groningen 0f23485a3c Cut p/c queries (has_child and has_parent queries) over to use global ordinals instead of being bytes values based.
Closes #5846
2014-04-29 12:41:04 +07:00
Martijn van Groningen fc3efda6af Cut other aggregations over to use collectExistingBucket() when the bucket ord that has been hit already exists.
Closes #5955
2014-04-29 11:07:12 +07:00
Martijn van Groningen f3219f7098 Added global ordinals terms aggregator impl that is optimized for low-cardinality fields.
Instead of resolving the global ordinal for each hit on the fly, resolve the global ordinals during post collect.
On fields with few unique values, that can reduce the number of global ordinal lookups significantly.

Closes #5895
Closes #5854
2014-04-29 11:04:03 +07:00
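The post-collect approach described in the commit above can be sketched roughly as follows. This is not the Elasticsearch aggregator; the class name, the method names and the array-based segment-to-global mapping are assumptions chosen to keep the example self-contained.

```java
import java.util.Arrays;

public class PostCollectOrdinals {
    /** Collect phase: count hits per *segment* ordinal, with no global lookup. */
    static long[] collectSegment(int[] docOrds, int numSegmentOrds) {
        long[] counts = new long[numSegmentOrds];
        for (int ord : docOrds) {
            counts[ord]++;
        }
        return counts;
    }

    /**
     * Post-collect phase: resolve each segment ordinal to its global ordinal
     * once and fold the per-segment counts into global counts.
     */
    static long[] mergeCounts(long[][] segmentCounts, int[][] segToGlobal, int numGlobalOrds) {
        long[] global = new long[numGlobalOrds];
        for (int seg = 0; seg < segmentCounts.length; seg++) {
            for (int ord = 0; ord < segmentCounts[seg].length; ord++) {
                long count = segmentCounts[seg][ord];
                if (count != 0) {
                    global[segToGlobal[seg][ord]] += count;
                }
            }
        }
        return global;
    }

    public static void main(String[] args) {
        // Global terms: a=0, b=1, c=2.
        // Segment 0 holds terms {a, c}; segment 1 holds terms {b, c}.
        int[][] segToGlobal = {{0, 2}, {1, 2}};
        long[][] counts = {
            collectSegment(new int[] {0, 1, 1}, 2),
            collectSegment(new int[] {1, 1, 0}, 2)
        };
        System.out.println(Arrays.toString(mergeCounts(counts, segToGlobal, 3)));
    }
}
```

With this layout the number of global lookups is bounded by the number of distinct segment ordinals rather than by the number of hits, which is why the approach suits fields with few unique values.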
Matt Weber 4df4506875 Use URI vs URL accessing File from classpath.
URL escapes special characters such as spaces which
causes the resource to not be found when used to create
a File object.  Use URI.

Closes #5915
2014-04-28 18:49:55 +02:00
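The difference between the two approaches in the commit above can be seen in a short Java sketch. The class and helper names are hypothetical, and the directory name is made up for the example; only standard `java.io` and `java.net` calls are used.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URISyntaxException;
import java.net.URL;

public class UrlVsUriPath {
    /** Builds a file: URL the way a classpath lookup might return it. */
    static URL urlFor(File file) {
        try {
            return file.toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Problematic variant: URL#getPath() keeps percent-escapes such as %20. */
    static File fileViaUrlPath(URL url) {
        return new File(url.getPath());
    }

    /** Safer variant: File(URI) decodes the escapes back to real characters. */
    static File fileViaUri(URL url) {
        try {
            return new File(url.toURI());
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        URL url = urlFor(new File("dir with space", "config.yml"));
        System.out.println(fileViaUrlPath(url).getPath()); // contains "dir%20with%20space"
        System.out.println(fileViaUri(url).getPath());     // contains "dir with space"
    }
}
```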
javanna 51ba3ca220 [TEST] made sure nodeSettings method gets called for every node type, not only data nodes in case numDataNodes is specified.
This fixes the ZenUnicastDiscoveryTests test when running in network mode
2014-04-28 18:31:47 +02:00
javanna a414e4f2f3 [TEST] randomly introduced a client node within test cluster
The default number of client nodes is randomized between 0 and 1, applied to all cluster scopes (global, suite and test). Can be changed through the newly added `@ClusterScope#numClientNodes`.

In our tests we currently refer to nodes in a generic way. All the tests that either stop or start nodes rely on the fact that those nodes hold data though. Made that clearer as that becomes more important when introducing other types of nodes within the test cluster. Reflected this by adapting and renaming the following methods in `TestCluster`:

- ensureAtLeastNumNodes to ensureAtLeastNumDataNodes
- ensureAtMostNumNodes to ensureAtMostNumDataNodes
- stopRandomNode to stopRandomDataNode

and the following ones in `ElasticsearchIntegrationTest`:

- allowNodes to allowDataNodes
- dataNodes to numDataNodes
- @ClusterScope#numNodes to numDataNodes
- @ClusterScope#minNumNodes to minNumDataNodes
- @ClusterScope#maxNumNodes to maxNumDataNodes

Added facilities to be able to deal with data nodes specifically, like for instance retrieve a client to a data node, or retrieve an instance of a class through guice only from data nodes.

Adapted existing tests to successfully run although there's a node client around.

Fixed _cat/allocation REST tests to make disk.total, disk.avail and disk.percent optional as client nodes won't return that info.

Closes #5949
2014-04-28 16:31:36 +02:00
Martijn van Groningen 17a5575757 Disabled parent/child queries in the delete by query api.
It wasn't properly implemented and could cause a shard to fail and be unable to recover.

Closes #5828 #5916
2014-04-28 20:12:54 +07:00
Adrien Grand 22cbdd930c [TEST] Fix test bug in MultiOrdinalsTests. 2014-04-28 13:56:01 +02:00
Clinton Gormley 2dfc77a4ed Removed spec and YAML tests for indices.status
Related #4854
2014-04-28 13:00:08 +02:00
Robert Muir 8e0a479316 Upgrade to Lucene 4.8
Closes #5932
2014-04-28 06:45:50 -04:00
Chris Earle 5528370e24 Added type, max, min, queueSize & keepAlive to _cat/thread_pool
Closes #5366
2014-04-28 12:00:27 +02:00
Simon Willnauer f285ffc610 Multi value handling in decay functions
Decay functions currently only use the first value in a field that contains
multiple values to compute the distance to the origin. Instead, it should
consider all distances if more values are in the field and then use
one of min/max/sum/avg which is defined by the user.

Relates to #3960
closes #5940
2014-04-28 11:55:32 +02:00
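The multi-value handling described in the commit above can be sketched as below. Only the step that combines per-value distances is shown; the decay curve itself is left out. The class, enum and method names are hypothetical and are not the Elasticsearch types, and a one-dimensional numeric field is assumed.

```java
public class MultiValueDistance {
    /** User-selected way of reducing several distances to one. */
    enum Mode { MIN, MAX, SUM, AVG }

    /**
     * Computes the distance of every value to the origin and reduces the
     * distances to a single number according to the chosen mode.
     */
    static double combinedDistance(double[] values, double origin, Mode mode) {
        if (values.length == 0) {
            throw new IllegalArgumentException("field has no values");
        }
        double min = Double.POSITIVE_INFINITY;
        double max = 0; // distances are never negative
        double sum = 0;
        for (double v : values) {
            double d = Math.abs(v - origin);
            min = Math.min(min, d);
            max = Math.max(max, d);
            sum += d;
        }
        switch (mode) {
            case MIN: return min;
            case MAX: return max;
            case SUM: return sum;
            default:  return sum / values.length; // AVG
        }
    }

    public static void main(String[] args) {
        double[] values = {1, 5, 9}; // distances to origin 4 are 3, 1 and 5
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + combinedDistance(values, 4, mode));
        }
    }
}
```

Using only the first value, as before the change, would correspond to a distance of 3 here regardless of the other values in the field.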
Britta Weber f993945e5c Move SortMode to org.elasticsearch.search and rename to MultiValueMode 2014-04-28 11:55:32 +02:00