8798 Commits

Author SHA1 Message Date
Christoph Büscher ba02485541 Make sure SortBuilders rewrite inner nested sorts (#26532)
The three SortBuilders that can have inner NestedSortBuilders currently don't
rewrite any of the filters contained in them. This change adds a rewrite method
to NestedSortBuilder and changes rewriting in FieldSortBuilder,
ScriptSortBuilder and GeoDistanceSortBuilder to make sure inner nested sorts get
rewritten if they need to.
2017-09-07 14:04:50 +02:00
Christoph Büscher 47ffa17efb Extend testing of build method in ScriptSortBuilder (#26520)
Improve testing around the ScriptSortBuilder#build method, adding checks for
correct transfers of the sort mode and nested sorts.

Also changing the behaviour around the nested_path, nested_filter vs. nested
parameter in a similar way as in #26490 and deprecating the setters/getters for
the old syntax.

Closes #17286
2017-09-07 10:37:50 +02:00
Ryan Ernst c9964d17bf Internal: Add versionless alias for rest client codebase in policy files (#26521)
Security manager policy files contain grants for specific codebases,
where a codebase is a jar file. We use a system property containing the
name of the jar file to resolve the jar file location when parsing the
policy file. However, this means the version of the jars must be
modified when versions of dependencies change. This is particularly
messy for elasticsearch, where we now have a dependency on the rest
client, and need to support both a snapshot version for testing and non
snapshot for release.

This commit adds an alias for the elasticsearch rest client without a
version to be used in policy files. That allows the policy files to not care whether
the rest client is a snapshot or release.
2017-09-06 18:57:10 -07:00
Lee Hinman fe02350e73 With too many incoming tasks, reset measurements to 1ns instead of 0ns
Resolves #26332, where too many tasks occurred while an adjustment was happening,
the measurements were reset to 0, and an assert then failed because tasks
appeared to execute in 0 nanoseconds.
2017-09-06 15:34:51 -06:00
Jason Tedor 9c795bd838 Fix cache compute if absent for expired entries
When a cache entry expires, it remains in the cache (both the segment
that it belongs to, and the LRU list) until an eviction occurs. The
problem here is that the compute if absent implementation relies on
there not being an association to a key that we are trying to put
because it internally uses put if absent on the underlying segment. If
we try to put an association for a key that has expired but not been
evicted, then compute if absent will return as if there is nothing in
the cache for the given key, yet no call to compute if absent will
succeed in putting a new association for the key. To remedy this, we
modify the internal get method for the cache to let the caller take
action if the entry they are retrieving is expired. This allows the
compute if absent method to take the action of evicting the entry from
the cache, thus allowing the put if absent method used by compute if
absent to succeed for one of the callers trying to compute if absent a
new association.

Relates #26516
2017-09-06 13:44:20 -04:00
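
The expired-entry problem described in 9c795bd838 can be illustrated with a minimal Python sketch. This is not the Elasticsearch `Cache` class: `TinyCache`, its method names and the `on_expired` callback are hypothetical, and segments, locking and the LRU list are left out.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TinyCache:
    """Toy cache in which expired entries linger until something evicts them."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[object, float]] = {}  # key -> (value, write time)

    def _get(self, key: str, on_expired: Callable[[str], None]) -> Optional[object]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        value, written_at = entry
        if self._clock() - written_at > self._ttl:
            on_expired(key)  # the caller decides what happens to the stale entry
            return None
        return value

    def get(self, key: str) -> Optional[object]:
        # A plain read leaves the expired entry in place.
        return self._get(key, on_expired=lambda k: None)

    def _put_if_absent(self, key: str, value: object) -> bool:
        if key in self._entries:
            return False
        self._entries[key] = (value, self._clock())
        return True

    def compute_if_absent(self, key: str, loader: Callable[[str], object]) -> object:
        # Evict on expiry so that the put-if-absent below can succeed.
        value = self._get(key, on_expired=lambda k: self._entries.pop(k, None))
        if value is not None:
            return value
        self._put_if_absent(key, loader(key))
        return self._entries[key][0]
```

Without the eviction callback in `compute_if_absent`, the stale entry would still occupy the key and `_put_if_absent` would never store the newly computed value.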
Jim Ferenczi 0c799eedc5 Add upper limit for scroll expiry (#26448)
This change adds a dynamic cluster setting named `search.max_keep_alive`.
It is used as an upper limit for scroll expiry time in scroll queries and defaults to 1 hour.
This change also ensures that the existing setting `search.default_keep_alive` is always smaller than `search.max_keep_alive`.

Relates #11511

* check style

* add skip for bwc

* iter

* Add a maximum throttle wait time of 1h for reindex

* review

* remove empty line
2017-09-06 10:06:48 +02:00
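
A rough Python sketch of the limit introduced in 0c799eedc5 follows. Only the setting names and the 1 hour default come from the commit message; the 5 minute default, the function names, and the choice to reject (rather than clamp) an oversized request are assumptions made for illustration.

```python
from datetime import timedelta
from typing import Optional

MAX_KEEP_ALIVE = timedelta(hours=1)        # search.max_keep_alive default per the commit
DEFAULT_KEEP_ALIVE = timedelta(minutes=5)  # assumed value for search.default_keep_alive


def validate_settings(default_keep_alive: timedelta, max_keep_alive: timedelta) -> None:
    """The default must never exceed the upper limit."""
    if default_keep_alive > max_keep_alive:
        raise ValueError("search.default_keep_alive must not exceed search.max_keep_alive")


def resolve_keep_alive(
    requested: Optional[timedelta],
    default_keep_alive: timedelta = DEFAULT_KEEP_ALIVE,
    max_keep_alive: timedelta = MAX_KEEP_ALIVE,
) -> timedelta:
    """Return the keep-alive to use for a scroll request."""
    validate_settings(default_keep_alive, max_keep_alive)
    if requested is None:
        return default_keep_alive
    if requested > max_keep_alive:
        # Assumption: an oversized keep-alive is an error, not silently shortened.
        raise ValueError(f"keep alive {requested} is larger than the maximum {max_keep_alive}")
    return requested
```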
Christoph Büscher 1b49bf3079 Remove deprecated parameters from `ids_query` (#26508)
The `_type` and `types` versions of the current `type` parameter have been
deprecated since 5.0. We can remove support for them in 7.0 and also in 6.x and
6.0.
2017-09-05 18:12:31 +02:00
Tim Brooks c1a20f7e48 Merge tsa with ts (#26369)
We currently have a weird relationship between Transport,
TransportService, and TransportServiceAdaptor. At some point I think
that we would like to collapse these all into one concept as we only
support TCP transports.

This commit moves in that direction by eliminating the adaptor and just
passing the transport service to the transport.
2017-09-05 09:15:56 -06:00
Christoph Büscher 760bd6c568 Extend testing of build method in GeoDistanceSortBuilder (#26498)
Improve testing around the GeoDistanceSortBuilder#build method, adding checks for correct
transfers of the sort order, mode, nested sorts and points validation and coercion.

Also changing the behaviour around the nested_path, nested_filter vs. nested parameter in
a similar way as in #26490 and deprecating the setters/getters for the old syntax.

Relates to #17286
2017-09-05 14:38:10 +02:00
Martijn van Groningen 78e9c96d7f
Added a limit to from + size in top_hits and inner hits.
Relates to #11511
2017-09-05 08:44:45 +02:00
Christoph Büscher 8f0369296f Prohibit using `nested_filter`, `nested_path` and new `nested` Option at the same time in FieldSortBuilder (#26490)
Currently we allow both "old" and "new" way of setting nested sorts on the
FieldSortBuilder at the same time. This should throw an error; the user
should instead choose one of the two possible options.

Also adding testing for the now deprecated nestedPath/nestedFilter parameters,
including checks that they emit warnings on parsing and that the new
NestedSortBuilder overwrites the deprecated parameters when building the
SortField.

Relates to #17286
2017-09-04 17:19:52 +02:00
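
The mutual-exclusion rule from 8f0369296f could look roughly like the Python sketch below. The function `resolve_nested_sort` and the dictionary shape it returns are hypothetical; only the parameter names `nested`, `nested_path` and `nested_filter` are taken from the commit message.

```python
import warnings
from typing import Optional


def resolve_nested_sort(
    nested: Optional[dict] = None,
    nested_path: Optional[str] = None,
    nested_filter: Optional[dict] = None,
) -> Optional[dict]:
    """Accept either the new `nested` option or the deprecated pair, never both."""
    uses_old_syntax = nested_path is not None or nested_filter is not None
    if nested is not None and uses_old_syntax:
        raise ValueError("use either [nested] or [nested_path]/[nested_filter], not both")
    if uses_old_syntax:
        # The old syntax still works but warns, mirroring the deprecation.
        warnings.warn("nested_path/nested_filter are deprecated, use nested", DeprecationWarning)
        return {"path": nested_path, "filter": nested_filter}
    return nested
```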
Boaz Leskes 2fd4af82e4 Move `UNASSIGNED_SEQ_NO` and `NO_OPS_PERFORMED` to SequenceNumbers (#26494)
Where they better belong.
2017-09-04 16:31:00 +02:00
Alexander Reelsen 3706a16baf Docs: Update broken link to flake ids in uuid generators 2017-09-04 10:48:50 +02:00
Christoph Büscher f8fc0f3ebe [Tests] Check that quoteAnalyzer overrides analyzer in `query_string` query (#26473)
Adding a check to QueryStringQueryBuilderTests that checks the override
behaviour of `quote_analyzer`, also adding documentation explaining the use of
this parameter in `query_string` query.

Closes #25417
2017-09-02 11:53:02 +02:00
Jason Tedor 1757bd8d92 Prettify primary response in assertion message
We are getting the default Object#toString implementation here, we need
more than this. This commit instead formats the primary response to JSON
so we can see into its soul.
2017-09-01 19:25:06 -04:00
Tal Levy 9735e7d706 migrate some MasterNodeRequest subclasses to Writeable Readers (#26463)
migrate some MasterNodeRequest subclasses to Writeable Readers
2017-09-01 15:27:45 -07:00
Boaz Leskes 2d0997be16 Add version 6.0.0-rc1 2017-09-01 17:48:24 -04:00
Christoph Büscher c2853c8281 Remove old norelease comment, the test is okay as it is 2017-09-01 18:25:27 +02:00
Christoph Büscher 2d342c0830 [Tests] Add unit tests for NestedSortBuilder (#26458)
The new NestedSortBuilder currently is only tested via its use in the other
SortBuilder implementations it can be used in. This adds its own simple unit
test class that at first checks our usual fromXContent parsing, serialization
and hashCode/equals checks. It also adds tests for cases where NestedSortBuilder
is nested in itself and reuses the code for creating randomized instances in the
other SortBuilder tests.

In addition to the tests, this changes the `path` parameter in NestedSortBuilder
to be mandatory and removes the `read` method since it is not really needed.
2017-09-01 10:53:51 +02:00
Alexander Reelsen 80d0a32f8e ScriptService: Replace max compilation per minute setting with max compilation rate (#26399)
The current script service has a script compilation limit for a one
minute window. This is set to a small default value of 15. Instead of
increasing that default value, this commit introduces a new setting
that allows configuring a rate per time unit, so that the script service can deal with bursts better.

The new setting is named `script.max_compilations_rate`,
requires a nonnegative number and a positive time value.

The default is `75/5m`, which is equivalent to the existing 15 per minute.
2017-09-01 10:15:27 +02:00
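
To show why `75/5m` keeps the same long-run rate as 15 per minute while tolerating bursts, here is a small Python sketch of a token-bucket limiter. It is an illustration only: the parser, the supported unit suffixes and the `CompilationLimiter` class are assumptions, not the script service's actual code.

```python
import re
from typing import Tuple

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600}  # assumption: only these suffixes are handled


def parse_rate(text: str) -> Tuple[int, int]:
    """Parse a value such as '75/5m' into (count, window in seconds)."""
    match = re.fullmatch(r"(\d+)/(\d+)([smh])", text)
    if match is None:
        raise ValueError(f"invalid rate: {text!r}")
    count = int(match.group(1))
    seconds = int(match.group(2)) * _UNIT_SECONDS[match.group(3)]
    if seconds <= 0:
        raise ValueError("the time value must be positive")
    return count, seconds


class CompilationLimiter:
    """Token bucket: holds up to `capacity` tokens, refilled evenly over `window` seconds."""

    def __init__(self, rate: str, now: float = 0.0):
        self.capacity, self.window = parse_rate(rate)
        self.tokens = float(self.capacity)
        self.last = now

    def try_compile(self, now: float) -> bool:
        elapsed = now - self.last
        self.last = now
        refill = elapsed * self.capacity / self.window
        self.tokens = min(float(self.capacity), self.tokens + refill)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

With this shape, a burst of up to 75 compilations is accepted at once, after which one new compilation becomes available every 4 seconds.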
Jason Tedor 111defdfe1 Allow double aborts on bulk item requests
In some cases a request can already be aborted and retried. This means
the condition that aborting a request should only happen when an item
has not been processed yet is too strict. This commit allows for a
double abort. If we attempt to abort an operation that was previously
processed but not aborted, we treat that as a hard failure.

Relates #26434
2017-08-31 14:37:02 -04:00
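
The abort rules from 111defdfe1 can be read as a small state machine. The Python sketch below is a hypothetical model (`BulkItem`, `ItemState` and keeping the first abort reason are all assumptions), not the bulk request classes themselves.

```python
from enum import Enum
from typing import Optional


class ItemState(Enum):
    PENDING = "pending"
    PROCESSED = "processed"
    ABORTED = "aborted"


class BulkItem:
    def __init__(self, item_id: int):
        self.item_id = item_id
        self.state = ItemState.PENDING
        self.abort_reason: Optional[str] = None

    def mark_processed(self) -> None:
        # Only pending items can be processed; aborted items stay aborted.
        if self.state is ItemState.PENDING:
            self.state = ItemState.PROCESSED

    def abort(self, reason: str) -> None:
        if self.state is ItemState.PROCESSED:
            # Processed but not aborted: treated as a hard failure.
            raise RuntimeError(f"item {self.item_id} was already processed")
        if self.state is ItemState.ABORTED:
            # Double abort (for example after a retry) is tolerated.
            return
        self.state = ItemState.ABORTED
        self.abort_reason = reason
```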
Christoph Büscher 294d167973 Revert accidental deletion of cast needed for Java 9 2017-08-31 16:13:12 +02:00
Jason Tedor 697bc266ce Upgrade to Log4j 2.9.0
This commit upgrades the Log4j dependency from version 2.8.2 to version
2.9.0.

Relates #26450
2017-08-31 09:54:35 -04:00
Tim Vernum eb87df9ff9 Allow abort of bulk items before processing (#26434)
Adds support for bulk items to be aborted before they are processed by the TransportShardBulkAction.
This can be used by an ActionFilter to reject a subset of the items in a bulk action without rejecting the whole action (or all the items for a shard).
2017-08-31 21:23:14 +10:00
Christoph Büscher adad605081 [Tests] Improve testing of FieldSortBuilder (#26437)
Currently we don't have much unit testing around the SortField that is created when
calling the SortBuilder's `build` method. Most of this is covered by integration tests
somewhere but it would be good to have some basic checks in FieldSortBuilderTest
as well.

This adds testing for the sort order, mode, missing values and checks that `nested` 
gets set in the XFieldComparatorSource when `nestedPath` and `nestedFilter` are 
set on the builder.

Relates to #17286
2017-08-31 12:15:09 +02:00
Adrien Grand 78681bc9e5 Upgrade to lucene-7.0.0-snapshot-d94a5f0. (#26441) 2017-08-31 09:06:40 +02:00
Lee Hinman c3da66d021 Implement adaptive replica selection (#26128)
* Implement adaptive replica selection

This implements the selection algorithm described in the C3 paper for
determining which copy of the data a query should be routed to.

By using the service time EWMA, response time EWMA, and queue size EWMA we
calculate the score of a node by piggybacking these metrics with each search
request.

Since Elasticsearch lacks the "broadcast to every copy" behavior that Cassandra
has (as mentioned in the C3 paper) to update metrics after a node has been
highly weighted, this implementation adjusts a node's response stats using the
average of its own and the "best" node's metrics. This is so that a long GC
or other activity that may cause a node's rank to increase dramatically does not
permanently keep a node from having requests routed to it, instead it will
eventually lower its score back to the realm where it is a potential candidate
for new queries.

This feature is off by default and can be turned on with the dynamic setting
`cluster.routing.use_adaptive_replica_selection`.

Relates to #24915, however instead of `b=3` I used `b=4` (after benchmarking)

* Randomly use adaptive replica selection for internal test cluster

* Use an action name *prefix* for retrieving pending requests

* Add unit test for replica selection

* don't use adaptive replica selection in SearchPreferenceIT

* Track client connections in a SearchTransportService instead of TransportService

* Bind `entry` pieces in local variables

* Add javadoc link to C3 paper and javadocs for stat adjustments

* Bind entry's key and value to local variables

* Remove unneeded actionNamePrefix parameter

* Use conns.longValue() instead of cached Long

* Add comments about removing entries from the map

* Pull out bindings for `entry` in IndexShardRoutingTable

* Use .compareTo instead of manually comparing

* add assert for connections not being null and gte to 1

* Copy map for pending search connections instead of "live" map

* Increase the number of pending search requests used for calculating rank when chosen

When a node gets chosen, this increases the number of search counts for the
winning node so that it will not be as likely to be chosen again for
non-concurrent search requests.

* Remove unused HashMap import

* Rename rank -> rankShardsAndUpdateStats

* Rename rankedActiveInitializingShardsIt -> activeInitializingShardsRankedIt

* Instead of precalculating winning node, use "winning" shard from ranked list

* Sort null ranked nodes before nodes that have a rank
2017-08-30 20:55:11 -06:00
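
The ranking idea in c3da66d021 can be sketched in Python as below. This is a simplified reading of the C3-style score with the exponent `b=4` mentioned in the commit; the exact formula, units and data structures in Elasticsearch may differ, and every name here (`NodeStats`, `rank`, `order_nodes`, `adjust_toward_best`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

QUEUE_EXPONENT = 4  # the commit mentions b=4 instead of b=3


@dataclass
class NodeStats:
    queue_size: float     # EWMA of the search queue size
    service_time: float   # EWMA of the service time
    response_time: float  # EWMA of the response time


def ewma(previous: float, sample: float, alpha: float = 0.3) -> float:
    """Exponentially weighted moving average; alpha is an assumed smoothing factor."""
    return alpha * sample + (1 - alpha) * previous


def rank(stats: NodeStats, outstanding: int, clients: int = 1) -> float:
    """Lower is better. The queue estimate is raised to the power b."""
    q_hat = 1 + outstanding * clients + stats.queue_size
    return stats.response_time - stats.service_time + (q_hat ** QUEUE_EXPONENT) * stats.service_time


def order_nodes(stats_by_node: Dict[str, Optional[NodeStats]], outstanding: Dict[str, int]) -> List[str]:
    """Nodes without stats sort first, then nodes by ascending rank."""
    def sort_key(node: str):
        stats = stats_by_node[node]
        if stats is None:
            return (0, 0.0)
        return (1, rank(stats, outstanding.get(node, 0)))
    return sorted(stats_by_node, key=sort_key)


def adjust_toward_best(node: NodeStats, best: NodeStats) -> NodeStats:
    """Average a node's stats with the best node's so a bad spell is not permanent."""
    return NodeStats(
        queue_size=(node.queue_size + best.queue_size) / 2,
        service_time=(node.service_time + best.service_time) / 2,
        response_time=(node.response_time + best.response_time) / 2,
    )
```

Sorting unranked nodes first gives them a chance to receive a request and start reporting metrics, and incrementing the winner's outstanding count makes it less likely to win again immediately.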
Tal Levy ed151d829d Migrate Search requests to use Writeable reading strategies (#26428)
Migrates many SearchRequest objects to use Writeable conventions and rejects usage of `readFrom` in these new classes.
2017-08-30 11:00:33 -07:00
Martijn van Groningen ea3fa768f9
Changed version from 7.0.0-alpha1 to 6.1.0 in the nested sorting serialization check. 2017-08-30 19:56:10 +02:00
Matt Weber 140395c83f Multi-level Nested Sort with Filters (#26395)
Multi-level Nested Sort with Filters

Allow multiple levels of nested sorting where each level can have its own filter.
Backward compatible with previous single-level nested sort.
2017-08-30 18:52:56 +02:00
Martijn van Groningen c821dce3fe
Revert "Multi-level Nested Sort with Filters"
This reverts commit 6377afa6c3.
2017-08-30 14:53:25 +02:00
Martijn van Groningen 410c6c281a
Revert "Temporarily set bwc version for new nested sorting to 7.0.0-alpha1 until the change has been backported to 6.x branch."
This reverts commit 472a5dd56b.
2017-08-30 14:53:10 +02:00
Martijn van Groningen 472a5dd56b
Temporarily set bwc version for new nested sorting to 7.0.0-alpha1 until the change has been backported to 6.x branch. 2017-08-30 14:30:20 +02:00
Martijn van Groningen 6377afa6c3
Multi-level Nested Sort with Filters
Allow multiple levels of nested sorting where each level
can have its own filter. Backward compatible with
previous single-level nested sort.
2017-08-30 14:30:20 +02:00
Colin Goodheart-Smithe ce1d85d7d0 Moves deferring code into its own subclass (#26421)
* Moves deferring code into its own subclass

This change moves the code that deals with deferring collection to a subclass of BucketAggregator called DeferringBucketAggregator. This means that the code in AggregatorBase is simplified and also means that the code for deferring collection is in one place and easier to maintain.

* Makes SingleBucketAggregator an interface

This is so aggregators that extend BucketsAggregator directly and those that extend DeferringBucketAggregator can be a single bucket aggregator

* review comments

* More review comments
2017-08-30 11:15:40 +01:00
Adrien Grand 34a6c7af26 Consolidate locale parsing. (#26400)
Mappings and ingest have different locale parsing code.
2017-08-30 10:58:33 +02:00
Sergey Galkin c075323522 Refactor create index service to be unit testable
This commit refactors MetaDataCreateIndexService so that it is unit
testable.

Relates #25961
2017-08-29 16:55:44 -04:00
Jason Tedor 7a035f5f84 setgid on /etc/elasticsearch on package install
When creating the keystore explicitly (from executing
elasticsearch-keystore create) or implicitly (for plugins that require
the keystore to be created on install) on an Elasticsearch package
installation, we are running as the root user. This leaves
/etc/elasticsearch/elasticsearch.keystore having the wrong ownership
(root:root) so that the elasticsearch user can not read the keystore on
startup. This commit adds setgid to /etc/elasticsearch on package
installation so that when executing this directory (as we would when
creating the keystore), we will end up with the correct ownership
(root:elasticsearch). Additionally, we set the permissions on the
keystore to be 660 so that the elasticsearch user via its group can read
this file on startup.

Relates #26412
2017-08-28 20:47:42 -04:00
Jim Ferenczi 86d97971a4 Remove the _all metadata field (#26356)
* Remove the _all metadata field

This change removes the `_all` metadata field. This field is deprecated in 6
and cannot be activated for indices created in 6 so it can be safely removed in
the next major version (e.g. 7).
2017-08-28 17:43:59 +02:00
Stuart Neivandt f842ff1ae1 Simple verification of the format of the language tag used in DateProcessor. (#25513)
Closes #26186
2017-08-28 10:59:00 +02:00
Adrien Grand d692ccf261 Reject IPv6-mapped IPv4 addresses when using the CIDR notation. (#26254)
It introduces ambiguity as to whether the prefix length should be interpreted as
a v4 prefix length or a v6 prefix length.

See https://issues.apache.org/jira/browse/LUCENE-7920.

Closes #26078
2017-08-28 10:04:05 +02:00
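
The ambiguity addressed in d692ccf261 is easy to see with Python's standard `ipaddress` module: for `::ffff:192.168.0.7/24` it is unclear whether `24` counts bits of the embedded v4 address or of the full v6 address. The helper `parse_cidr` below is a hypothetical sketch of the rejection, not the Elasticsearch or Lucene code.

```python
import ipaddress


def parse_cidr(text: str):
    """Parse 'address/prefix', rejecting IPv6 addresses that embed an IPv4 address."""
    address_text, separator, _prefix = text.partition("/")
    if not separator:
        raise ValueError(f"expected address/prefix, got {text!r}")
    address = ipaddress.ip_address(address_text)
    if isinstance(address, ipaddress.IPv6Address) and address.ipv4_mapped is not None:
        raise ValueError(f"ambiguous prefix length for mapped address in {text!r}")
    # strict=False allows host bits to be set, as in 192.168.0.7/24.
    return ipaddress.ip_network(text, strict=False)
```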
Adrien Grand 262ea9534f Make locale parsing less lenient. (#26361)
The `locale` field of `date` fields accepts almost any string and unknown
locales are simply ignored, which is trappy. We should fail on unknown languages
or countries.

This commit also makes `-` an accepted separator in addition to `_` since `-`
is the recommended separator (https://tools.ietf.org/html/rfc5646#section-2.1).
`_` is probably still worth supporting since it is the separator used by
`Locale#toString()`.
2017-08-28 09:59:25 +02:00
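
A minimal Python sketch of the stricter parsing described in 262ea9534f is shown below. The tiny sets of known languages and countries are placeholders for whatever the runtime actually knows, and `parse_locale` is a hypothetical helper; only the two accepted separators and the fail-on-unknown behaviour come from the commit message.

```python
from typing import Tuple

KNOWN_LANGUAGES = {"en", "fr", "de"}        # placeholder data for the sketch
KNOWN_COUNTRIES = {"US", "GB", "FR", "DE"}  # placeholder data for the sketch


def parse_locale(text: str) -> Tuple[str, str, str]:
    """Return (language, country, variant); accept '-' or '_' as separator."""
    parts = text.replace("-", "_").split("_")
    if len(parts) > 3:
        raise ValueError(f"too many components in locale {text!r}")
    language = parts[0].lower()
    if language not in KNOWN_LANGUAGES:
        raise ValueError(f"unknown language in locale {text!r}")
    country = parts[1].upper() if len(parts) > 1 else ""
    if country and country not in KNOWN_COUNTRIES:
        raise ValueError(f"unknown country in locale {text!r}")
    variant = parts[2] if len(parts) > 2 else ""
    return language, country, variant
```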
Adrien Grand 36e22bc30f Remove 5.x backcompat from synonym filters. 2017-08-28 09:56:01 +02:00
Adrien Grand eb782492be Remove support for lenient booleans.
Closes #22298
2017-08-28 09:56:01 +02:00
Alexander Reelsen bdf2c3c691 Script Stats: Add compilation limit counter to stats (#26387)
In order to know when the script compilation limit has kicked in,
this commit adds a counter in the script stats to expose that
information.

So far the only way to find out about this was to check the logs
or check out responses of individual requests.
2017-08-28 09:51:49 +02:00
Adrien Grand 6eac3ee8ba Avoid hardcoded error message that depends on the current version in tests. (#26391)
It makes it painful to bump the current version.
2017-08-28 09:11:31 +02:00
Michael Basnight cfd14cd2b8 Revert shading for the low level rest client (#26367)
At present, we do not feel there is enough of a reason to shade the low
level rest client. It caused problems with commons logging and IDEs
during the brief time it was used. We did not know exactly how many
users will need this, and decided that leaving shading out until we
gather more information is best. Users can still shade the jar
themselves. For information and feedback, see issue #26366.

Closes #26328

This reverts commit 3a20922046.
This reverts commit 2c271f0f22.
This reverts commit 9d10dbea39.
This reverts commit e816ef89a2.
2017-08-25 14:13:12 -05:00
Ryan Ernst 3655f3f2a3 Test: Remove irrelevant access after close test for stream (#26392)
This commit removes the streams test for access after closing the bytes
stream. Output streams being closed mean they can no longer be written
to, but other methods to retrieve side state of the stream can still
make sense, such as bytes() in this case.

relates #12620
2017-08-25 11:30:37 -07:00
Nik Everett b3edd11aa0 Allow plugins to plug rescore implementations (#26368)
This allows plugins to plug rescore implementations into
Elasticsearch. While this is a fairly expert thing to do I've
done my best to point folks to the QueryRescorer as one that at
least documents the tradeoffs that it makes. I've attempted to
limit the API surface area by removing `SearchContext` from the
exposed interface, instead exposing just the IndexSearcher and
`QueryShardContext`. I also tried to make some of the class names
more consistent and do some general cleanup while I was there.

I entertained the notion of moving the `QueryRescorer` to a module.
After all, it'd be a wonderful test to prove that you can plug a
rescore implementation into Elasticsearch if the only built in
rescore implementation is in the module. But I decided against it
because the new module would require a client jar and it'd require
moving some more things around. I think if we really want to do
it, we should do it as a followup.

I did, on the other hand, create an "example" rescore plugin which
should both be a nice example for anyone wanting to plug in their
own rescore implementation and serves as a good integration test
to make sure that you can indeed plug one in.

Closes #26208
2017-08-25 13:46:57 -04:00
Jim Ferenczi 74cd32942a Handle leniency for phrase query on a field indexed without positions (#26388)
This change rewrites a phrase query built on a field indexed without positions
to match_no_docs query when the `lenient` option is set to true.
This change affects all full text queries.
2017-08-25 16:41:01 +02:00