Commit Graph

7773 Commits

Author SHA1 Message Date
Tim Brooks 5b1fbe5e6c Decouple BulkProcessor from client implementation (#23373)
This commit modifies the BulkProcessor to be decoupled from the
client implementation. Instead it just takes a
BiConsumer<BulkRequest, ActionListener<BulkResponse>> that executes
the BulkRequest.
2017-04-05 12:12:43 -05:00
Lee Hinman 0257a7b97a Only re-parse operation if a mapping update was needed
When executing an index operation on the primary shard,
`TransportShardBulkAction` first parses the document, sees if there are any
mapping updates that needs to be applied, and then updates the mapping on the
master node. It then re-parses the document to make sure that the mappings have
been applied and propagated.

This adds a check that skips the second parsing of the document in the event
there was not a mapping update applied in the first case.

Fixes a performance regression introduced in #23665
2017-04-05 09:29:44 -06:00
Adrien Grand d5d0f140d6 The `filter` and `significant_terms` aggregations should parse the `filter` as a filter, not as a query. (#23797)
This is important for some queries like `bool`, which are parsed differently
depending on whether we want to get a query or a filter.
2017-04-05 16:46:21 +02:00
Simon Willnauer adccdbb3cf Simplify sorted top docs merging in SearchPhaseController (#23881)
Today we have several code paths to merge top docs based on the number of
search results returned from the shards. If there is a only a single shard
holding any hits we go a different code path with quite some complexity while
if there are more than one the code is basically duplicated to safe the
creation of a dense array of top docs which can be large if there are many results.
This commit removes the need of the dense array and in-turn the justification for
the optimization. This commit introduces a single code path to merge top docs.
2017-04-05 14:49:35 +02:00
Boaz Leskes 75b4f408e0 Refactor InternalEngine's index/delete flow for better clarity (#23711)
The InternalEngine Index/Delete methods (plus satellites like version loading from Lucene) have accumulated some cruft over the years making it hard to clearly the code flows for various use cases (primary indexing/recovery/replicas etc). This PR refactors those methods for better readability. The methods are broken up into smaller sub methods, albeit at the price of less code I reused.

To support the refactoring I have considerably beefed up the versioning tests.

This PR is a spin-off from #23543 , which made it clear this is needed.
2017-04-05 14:43:01 +02:00
Boaz Leskes c89fdd938e ZenDiscovery - only validate min_master_nodes values if local node is master (#23915)
The purpose of this validation is to make sure that the master doesn't step down
due to a change in master nodes, which also means that there is no way to revert
an accidental change. Since we validate using the current cluster state (and
not the one from which the settings come from) we have to be careful and only
validate if the local node is already a master. Doing so all the time causes
subtle issues. For example, a node that joins a cluster has no nodes in its
current cluster state. When it receives a cluster state from the master with
a dynamic minimum master nodes setting int it, we must make sure we don't reject it.

Closes #23695
2017-04-05 14:31:32 +02:00
Jason Tedor 24127bf416 Remove hardcoded ports from SingleNodeDiscoveryIT
SingleNodeDiscoveryIT uses a hardcoded port for the purpose of binding
two nodes within the limited port range that an unconfigured unicast zen
ping hosts list would try to discover another node on. This commit at
least removes this hardcoding for the first node to come up, although
still tries to bind the second node to the limited port range after the
first node has bound.
2017-04-05 08:17:33 -04:00
Luca Cavanna 318d365b12 [TEST] make sure that fromXContent doesn't rely on keys ordering (#23901)
We shuffle the keys before we parse our responses for the high level client so that we make sure we never rely on keys ordering.
2017-04-05 11:12:34 +02:00
Jason Tedor afd45c1432 Revert "Closing a ReleasableBytesStreamOutput closes the underlying BigArray (#23572)"
This reverts commit 6bfecdf921.
2017-04-04 20:33:51 -04:00
Jay Modi 6bfecdf921 Closing a ReleasableBytesStreamOutput closes the underlying BigArray (#23572)
This commit makes closing a ReleasableBytesStreamOutput release the underlying BigArray so
that we can use try-with-resources with these streams and avoid leaking memory by not returning
the BigArray. As part of this change, the ReleasableBytesStreamOutput adds protection to only release the BigArray once.

In order to make some of the changes cleaner, the ReleasableBytesStream interface has been
removed. The BytesStream interface is changed to a abstract class so that we can use it as a
useable return type for a new method, Streams#flushOnCloseStream. This new method wraps a
given stream and overrides the close method so that the stream is simply flushed and not closed.
This behavior is used in the TcpTransport when compression is used with a
ReleasableBytesStreamOutput as we need to close the compressed stream to ensure all of the data
is written from this stream. Closing the compressed stream will try to close the underlying stream
but we only want to flush so that all of the written bytes are available.

Additionally, an error message method added in the BytesRestResponse did not use a builder
provided by the channel and instead created its own JSON builder. This changes that method to use the channel builder and in turn the bytes stream output that is managed by the channel.
2017-04-04 17:01:30 +01:00
Jason Tedor 3136ed1490 Rename random ASCII helper methods
This commit renames the random ASCII helper methods in ESTestCase. This
is because this method ultimately uses the random ASCII methods from
randomized runner, but these methods actually only produce random
strings generated from [a-zA-Z].

Relates #23886
2017-04-04 11:04:18 -04:00
Jason Tedor a01f77210a Fix Javadocs for BootstrapChecks#enforceLimits
This commit adds a description for a parameter that was added to
BootstrapChecks#enforceLimits(BoundTransportAddress, String) without the
Javadocs having been updated.
2017-04-04 09:42:19 -04:00
Jason Tedor 51b5dbffb7 Disable bootstrap checks for single-node discovery
While there are use-cases where a single-node is in production, there
are also use-cases for starting a single-node that binds transport to an
external interface where the node is not in production (for example, for
testing the transport client against a node started in a Docker
container). It's tricky to balance the desire to always enforce the
bootstrap checks when a node might be in production with the need for
the community to perform testing in situations that would trip the
bootstrap checks. This commit enables some flexibility for these
users. By setting the discovery type to "single-node", we disable the
bootstrap checks independently of how transport is bound. While this
sounds like a hole in the bootstrap checks, the bootstrap checks can
already be avoided in the single-node use-case by binding only HTTP but
not transport. For users that are genuinely in production on a
single-node use-case with transport bound to an external use-case, they
can set the system property "es.enable.bootstrap.checks" to force
running the bootstrap checks. It would be a mistake for them not to do
this.

Relates #23598
2017-04-04 09:39:04 -04:00
Jim Ferenczi c14be20744 Add unit tests for the missing aggregator (#23895)
* Add unit tests for the missing aggregator

Relates #22278
2017-04-04 14:37:33 +02:00
Jim Ferenczi a04350f0dd Add a property to mark setting as final (#23872)
This change adds a setting property that sets the value of a setting as final.
Updating a final setting is prohibited in any context, for instance an index setting
marked as final must be set at index creation and will refuse any update even if the index is closed.
This change also marks the setting `index.number_of_shards` as Final and the special casing for refusing the updates on this setting has been removed.
2017-04-04 12:35:48 +02:00
Jason Tedor 71293a89bf Introduce single-node discovery
This commit adds a single node discovery type. With this discovery type,
a node will elect itself as master and never form a cluster with another
node.

Relates #23595
2017-04-04 03:02:58 -04:00
Jason Tedor 3bd2efa177 Await termination after shutting down executors
When terminating an executor service or a thread pool, we first
shutdown. Then, we do a timed await termination. If the await
termination fails because there are still tasks running, we then
shutdown now. However, this method does not wait for actively executing
tasks to terminate, so we should again wait for termination of these
tasks before returning. This commit does that.

Relates #23889
2017-04-04 03:01:00 -04:00
Jason Tedor 6234a49fb3 Fix initialization issue in ElasticsearchException
If a test touches ElasticsearchExceptionHandle before the class
initialzer for ElasticsearchException has run, a circular class
initialization problem can arise. Namely, the class initializer for
ElasticsearchExceptionHandle depends on the class initializer for
ElasticsearchExceptionHandle which depends on the class initializer for
all the classes that extend ElasticsearchException, but these classes
can not be loaded because ElasticsearchException has not finished its
class initializer. There are tests that can trigger this before
ElasticsearchException has been loaded due to an unlucky ordering of
test execution. This commit addresses this issue by making
ElasticsearchExceptionHandle private, and then exposing methods that
provide the necessary values from ElasticsearchExceptionHandle. Touching
these methods will force the class initializer for
ElasticsearchException to run first.
2017-04-04 00:33:00 -04:00
Boaz Leskes 48b0121f60 SpecificMasterNodesIT shouldn't use autoMinMasterNodes
as it tweaks the `discovery.initial_state_timeout` setting.
2017-04-03 16:23:17 +02:00
Boaz Leskes 40eb68c95a testRestorePersistentSettings doesn't to mess with discovery settings 2017-04-03 16:23:17 +02:00
Colin Goodheart-Smithe 8482503f9b
Adds tests for cardinality and filter aggregations
Relates to #22278
2017-04-03 10:09:27 +01:00
Colin Goodheart-Smithe cad4fcd9c9
Revert "Adds tests for cardinality and filter aggregations (#23826)"
This reverts commit 058869ed54.
2017-04-03 09:45:16 +01:00
Colin Goodheart-Smithe 058869ed54 Adds tests for cardinality and filter aggregations (#23826)
* Adds tests for cardinality and filter aggregations

Relates to #22278

* addresses review comments
2017-04-03 09:39:03 +01:00
Jim Ferenczi 7316b663e2 Replace custom sort field with SortedSetSortField and SortedNumericSortField when possible (#23827)
Currently for field sorting we always use a custom sort field and a custom comparator source.
Though for numeric fields this custom sort field could be replaced with a standard SortedNumericSortField unless
the field is nested especially since we removed the FieldData for numerics.
We can also use a SortedSetSortField for string sort based on doc_values when the field is not nested.

This change replaces IndexFieldData#comparatorSource with IndexFieldData#sortField that returns a Sorted{Set,Numeric}SortField when possible or a custom
 sort field when the field sort spec is not handled by the SortedSortFields.
2017-04-03 09:57:26 +02:00
Simon Willnauer bdb1cabe71 Prevent nodes from joining if newer indices exist in the cluster (#23843)
Today we prevent nodes from joining when indices exists that are too old.
Yet, the opposite can happen too since lucene / elasticsearch is not forward
compatible when it gets to indices we won't let nodes join the cluster once
there are indices in the clusterstate that are newer than the nodes version.
This prevents forward compatibility issues which we never test against. Yet,
this will not prevent rolling restarts or anything like this since indices
are always created with the minimum node version in the cluster such that an index
can only get the version of the higher nodes once all nodes are upgraded to this version.
2017-04-03 09:52:09 +02:00
Simon Willnauer 998eeb7687 Synchronized CollapseTopFieldDocs with lucenes relatives (#23854)
TopDocs et.al. got additional parameters to incrementally reduce
top docs. In order to add incremental reduction `CollapseTopFieldDocs`
needs to have the same properties.
2017-04-03 09:50:44 +02:00
Jason Tedor 7082baaed9 Stricter parsing of remote node attribute
This commit enables stricter parsing of the remote node attribute,
instead of leniently parsing values that are not "true" as false.
2017-04-01 13:18:46 -04:00
Jason Tedor 38b3fec885 Fix cross-cluster remote node gateway attributes
Remote nodes in cross-cluster search can be marked as eligible for
acting a gateway node via a remote node attribute setting. For example,
if search.remote.node.attr is set to "gateway", only nodes that have
node.attr.gateway set to "true" can be connected to for cross-cluster
search. Unfortunately, there is a bug in the handling of these
attributes due to the use of a dangerous method
Boolean#getBoolean(String) which obtains the system property with
specified name as a boolean. We are not looking at system properties
here, but node settings. This commit fixes this situation, and adds a
test. A follow-up will ban the use of Boolean#getBoolean.

Relates #23863
2017-04-01 13:04:51 -04:00
Jim Ferenczi ee68e75332 FieldCapabilitiesRequest should implements Replaceable since it accepts index patterns 2017-03-31 20:21:06 +02:00
Alexander Reelsen f720767cbc Cleanup: Remove unused FieldMappers class (#23851)
This class is unused, so it can be removed.
2017-03-31 18:26:59 +02:00
Nik Everett ba62229f47 Fix FieldCapabilities compilation in Eclipse (#23855)
Eclipse can't deal with the generics, maybe the fixed but
unreleased https://bugs.eclipse.org/bugs/show_bug.cgi?id=511750
2017-03-31 12:10:15 -04:00
Tanguy Leroux 28099162ab Cluster stats should not render empty http/transport types (#23735)
This commit changes the ClusterStatsNodes.NetworkTypes so that is does
not print out empty field names when no Transport or HTTP type is defined:

```
{
"network_types": {
        ...
        "http_types": {
          "": 2
        }
      }
}
```

is now rendered as:

```
{
"network_types": {
        ...
        "http_types": {
        }
      }
}
```
2017-03-31 17:13:27 +02:00
Simon Willnauer 135eae42b9 Cleanup SearchPhaseController interface (#23844)
SearchPhaseController is tighly coupled to AtomicArray which makes
non-dense representations of results very difficult. This commit removes
the coupling and cuts over to Collection rather than List to ensure no
order or random access lookup is implied.
2017-03-31 16:25:15 +02:00
Jim Ferenczi a8250b26e7 Add FieldCapabilities (_field_caps) API (#23007)
This change introduces a new API called `_field_caps` that allows to retrieve the capabilities of specific fields.

Example:

````
GET t,s,v,w/_field_caps?fields=field1,field2
````
... returns:
````
{
   "fields": {
      "field1": {
         "string": {
            "searchable": true,
            "aggregatable": true
         }
      },
      "field2": {
         "keyword": {
            "searchable": false,
            "aggregatable": true,
            "non_searchable_indices": ["t"]
            "indices": ["t", "s"]
         },
         "long": {
            "searchable": true,
            "aggregatable": false,
            "non_aggregatable_indices": ["v"]
            "indices": ["v", "w"]
         }
      }
   }
}
````

In this example `field1` have the same type `text` across the requested indices `t`, `s`, `v`, `w`.
Conversely `field2` is defined with two conflicting types `keyword` and `long`.
Note that `_field_caps` does not treat this case as an error but rather return the list of unique types seen for this field.
2017-03-31 15:34:46 +02:00
Colin Goodheart-Smithe 9f66b8cd38 Improves disabled fielddata error message (#23841)
Closes #22768
2017-03-31 10:01:07 +01:00
Simon Willnauer 5badf68bd9 Add infrastructure to mark contexts as system contexts (#23830)
Today we have no way to mark an execution as internal. This commit adds
a simple thread context header that allows executing code in a system context.
This allows intercepting code can make better decisions down the road when
it gets to authentication.
2017-03-31 10:47:10 +02:00
Tim Brooks 5fa80a6521 Pass exception from sendMessage to listener (#23559)
This commit changes the listener passed to sendMessage from a Runnable
to a ActionListener.

This change also removes IOException from the sendMessage signature.
That signature is misleading as it allows implementers to assume an
exception will be thrown in case of failure. That does not happen due
to Netty's async nature.
2017-03-30 15:08:23 -05:00
Jason Tedor 48357e43d3 Honor update request timeout
When executing an update request, the request timeout is not transferred
to the index/delete request executed on behalf of the update
request. This leads to update requests not timing out when they should
(e.g., if not all shards are available when the request specifies
wait_for_shards=all with a small timeout). This commit causes the
index/delete requests to honor the update request timeout.

Relates #23825
2017-03-30 14:38:34 -04:00
Christoph Büscher b92371a4dc Tests: Add base tests for InternalSimpleValue and InternalDerivative (#23799)
As an addition to #22278 we should probably also have base tests for InternalSimpleValue
and InternalDerivative.
2017-03-30 20:23:49 +02:00
Simon Willnauer 4125f012b9 Streamline shard index availability in all SearchPhaseResults (#23788)
Today we have the shard target and the target request ID available in SearchPhaseResults.
Yet, the coordinating node maintains a shard index to reference the request, response tuples
internally which is also used in many other classes to reference back from fetch results to
query results etc. Today this shard index is implicitly passed via the index in AtomicArray
which causes an undesirable dependency on this interface.
This commit moves the shard index into the SearchPhaseResult and removes some dependencies
on AtomicArray. Further removals will follow in the future. The most important refactoring here
is the removal of AtomicArray.Entry which used to be created for every element in the atomic array
to maintain the shard index during result processing. This caused an unnecessary indirection, dependency
and potentially thousands of unnecessary objects in every search phase.
2017-03-30 14:32:42 +02:00
Jim Ferenczi 3b559e01be Fixed sliced search tests that rely on BytesRef.hashCode output 2017-03-30 10:14:41 +02:00
David Causse a49e1c0062 Use a fixed seed for computing term hashCode in TermsSliceQuery (#23795)
I think this query should not use the hashCode provided BytesRef#hashCode().
It uses StringHelper#GOOD_FAST_HASH_SEED which is initialized in a static
block to System.currentTimeMillis().
Running this query on different replicas may return inconsistent results.

Using a fixed seed should guaranty that the docs are sliced consistently
accross replicas.

Fixes #23096
2017-03-30 10:10:32 +02:00
Lee Hinman c8081bde91 Further refactor and extend testing for `TransportShardBulkAction`
This moves `updateReplicaRequest` to `createPrimaryResponse` and separates the
translog updating to be a separate function so that the function purpose is more
easily understood (and testable).

It also separates the logic for `MappingUpdatePerformer` into two functions,
`updateMappingsIfNeeded` and `verifyMappings` so they don't do too much in a
single function. This allows finer-grained error testing for when a mapping
fails to parse or be applied.

Finally, it separates parsing and version validation for
`executeIndexRequestOnReplica` into a separate
method (`prepareIndexOperationOnReplica`) and adds a test for it.

Relates to #23359
2017-03-29 10:56:51 -06:00
Jason Tedor 72824609df Add lower bound for translog generation threshold
The translog already occupies 43 bytes on disk when empty. If the
translog generation threshold is below this, the flush thread can get
stuck in an infinite loop repeatedly rolling the generation. This commit
adds a lower bound on the translog generation to avoid this problem,
however we keep the lower bound small for convenience in testing.

Relates #23779
2017-03-28 14:11:50 -04:00
Ali Beyad 2d3c2a4800 Adds backwards compatibility index and repository for v5.3.0 2017-03-28 14:02:07 -04:00
Ali Beyad c675d92a56 Adds v5.3.1 to the version constants 2017-03-28 13:04:28 -04:00
Ali Beyad 2120086d82 Adds pattern keyword marker filter support (#23600)
This commit adds support for the pattern keyword marker filter in
Lucene.  Previously, the keyword marker filter in Elasticsearch
supported specifying a keywords set or a path to a set of keywords.
This commit exposes the regular expression pattern based keyword marker
filter also available in Lucene, so that any token matching the pattern
specified by the `keywords_pattern` setting is excluded from being
stemmed by any stemming filters.

Closes #4877
2017-03-28 11:13:34 -04:00
Dimitris Athanasiou 34f116eae3 Require explicit query in _delete_by_query API (#23632)
As the query of a search request defaults to match_all,
calling _delete_by_query without an explicit query may
result in deleting all data.

In order to protect users against falling into that
pitfall, this commit adds a check to require the explicit
setting of a query.

Closes #23629
2017-03-28 15:44:57 +01:00
Ali Beyad 8359dd05c9 Adds boolean similarity to Elasticsearch (#23637)
This commit adds the boolean similarity scoring from Lucene to
Elasticsearch.  The boolean similarity provides a means to specify that
a field should not be scored with typical full-text ranking algorithms,
but rather just whether the query terms match the document or not.
Boolean similarity scores a query term equal to its query boost only.
Boolean similarity is available as a default similarity option and thus
a field can be specified to have boolean similarity by declaring in its
mapping:
    "similarity": "boolean"

Closes #6731
2017-03-28 10:17:23 -04:00
Stuart Neivandt 3caf887632 Improve error handling for epoch format parser with time zone (#23689)
Change the error response when using a non UTF timezone for range queries with epoch_millis
or epoch_second formats to an illegal argument exception. The goal is to provide a better 
explanation of why the query has failed. The current behavior is to respond with a parse exception.

Closes #22621
2017-03-28 14:45:20 +02:00