8505 Commits

Author SHA1 Message Date
Adrien Grand 40bb1663ee Index ids in binary form. (#25352)
Indexing ids in binary form should help with indexing speed since we would
have to compare fewer bytes upon sorting, should help with memory usage of
the live version map since keys will be shorter, and might help with disk
usage depending on how efficient the terms dictionary is at compressing
terms.

Since we can only expect base64 ids in the auto-generated case, this PR tries
to use an encoding that makes the binary id equal to the base64-decoded id in
the majority of cases (253 out of 256). It also specializes numeric ids, since
this seems to be common when content that is stored in Elasticsearch comes
from another database that uses e.g. auto-increment ids.

Another option could be to require base64 ids all the time. It would make things
simpler but I'm not sure users would welcome this requirement.

This PR should bring some benefits, but I expect it to be mostly useful when
coupled with something like #24615.

Closes #18154
2017-07-07 14:22:47 +02:00
Christoph Büscher 870d63d0cd [Tests] Add tests for PhraseSuggestionBuilder#build() (#25571)
This adds a unit test that checks the PhraseSuggestionContext output 
of PhraseSuggestionBuilder#build.
2017-07-07 12:53:06 +02:00
Christoph Büscher abe80b9ccb Remove unused class MinimalMap (#25590) 2017-07-07 12:51:38 +02:00
Yu 2e5e45161e Disable date field mapping changing (#25285)
Make date field mapping unchangeable.

Closes #25271
2017-07-07 11:49:09 +02:00
Simon Willnauer d368d7cb9f [TEST] Remove test trace logging 2017-07-07 11:03:07 +02:00
Christoph Büscher 31f73cc06c [Tests] Fixing test failure in CompletionSuggesterBuilderTests 2017-07-07 10:39:58 +02:00
Martijn van Groningen 6db708ef75 Move more token filters to analysis-common module
The following token filters were moved: common grams, limit token, pattern capture and pattern replace.

Relates to #23658
2017-07-07 10:02:52 +02:00
Christoph Büscher d71feceb23 [Tests] Add tests for TermSuggestionBuilder#build() (#25558)
Adds a unit test that checks the TermSuggestionContext contents that result
from TermSuggestionBuilder#build against the values the original builder contains.
2017-07-07 09:47:21 +02:00
Simon Willnauer 1f67d079b1 Validate `transport.profiles.*` settings (#25508)
Transport profiles unfortunately have never been validated. Yet, it's very
easy to make a mistake when configuring profiles which will most likely stay
undetected since we don't validate the settings but allow almost everything
based on the wildcard in `transport.profiles.*`. This change removes the
settings subset based parsing of profiles but rather uses concrete affix settings
for the profiles which makes it easier to fall back to higher level settings since
the fallback settings are present when the profile setting is parsed. Previously, it was
unclear in the code which setting was used, i.e. the profile settings (with removed
prefixes) or the global node setting. There is no distinction anymore since we don't pull
prefix based settings.
2017-07-07 09:40:59 +02:00
Simon Willnauer e9f6210dac Add cluster name validation to RemoteClusterConnection (#25568)
This change adds validation to the RemoteClusterConnection to ensure
we always use seed nodes from the same cluster. While we still allow the use of
an arbitrary cluster alias, once we have connected to a cluster for the first time
we always check against that initial cluster name when we execute a seed node handshake.
2017-07-06 19:18:10 +02:00
Ali Beyad dda68643b6 Removes deprecated usage of the FieldStats API in a test that verifies
sequence number data in Lucene commit points.  Instead, the test
retrieves the _seq_no value from the commit point directly and converts
it to a Long value.
2017-07-06 12:00:00 -04:00
Christoph Büscher 41d0ff32c8 [Tests] Check output of SuggestionBuilder#build method (#25549)
This change adds a basic unit test for the SuggestionSearchContext that is
created as output of SuggestionBuilder#build. The current test only adds checks
for the common fields (like text, prefix, fieldName etc...).

Relates to #17118
2017-07-06 17:32:34 +02:00
Jim Ferenczi 31614c3ddb Remove deprecated fielddata_fields from search request (#25566)
... and inner_hits
2017-07-06 13:02:28 +02:00
Lee Hinman 30b5ca7ab7 Refactor PathTrie and RestController to use a single trie for all methods (#25459)
* Refactor PathTrie and RestController to use a single trie for all methods

This changes `PathTrie` and `RestController` to use a single `PathTrie` for all
endpoints, it also allows retrieving the endpoints' supported HTTP methods more
easily.

This is a spin-off and prerequisite of #24437

* Use EnumSet instead of multiple if conditions

* Make MethodHandlers package-private and final

* Remove duplicate registerHandler method

* Remove public modifier
2017-07-05 17:28:10 -06:00
Simon Willnauer 6e5cc424a8 Switch indices read-only if a node runs out of disk space (#25541)
Today when we run out of disk all kinds of crazy things can happen
and nodes are becoming hard to maintain once out of disk is hit.
While we try to move shards away if we hit watermarks this might not
be possible in many situations. Based on the discussion in #24299
this change monitors disk utilization and adds a flood-stage watermark
that causes all indices that are allocated on a node hitting the flood-stage
mark to be switched read-only (with the option to be deleted). This allows users to react to the low disk
situation while subsequent write requests will be rejected. Users can switch
individual indices read-write once the situation is sorted out. There is no
automatic read-write switch once the node has enough space. This requires
user interaction.

The flood-stage watermark is set to `95%` utilization by default.

Closes #24299
2017-07-05 22:18:23 +02:00
Jason Tedor 7dcd81b41b Throw back replica local checkpoint on new primary
This commit causes a replica to throw back its local checkpoint to the
global checkpoint when learning of a new primary through a replica
operation.

Relates #25452
2017-07-05 09:17:16 -04:00
Simon Willnauer 7c637a0bfe Ensure `index.mapping.single_type` can only be set on 5.x indices (#25375)
In 6.x we prevent multiple types and default to `index.mapping.single_type: false`
This change removes the registered setting and ensures that it's preserved for
5.x indices.

Relates to #24961
2017-07-05 15:16:40 +02:00
Simon Willnauer ca351b60b7 [TEST] Enable transport tracer for RemoteClusterServiceTests#testCollectNodes #25301 2017-07-05 11:23:14 +02:00
Simon Willnauer 8e861b3896 [TEST] Add another valid exception that can occur with concurrent disconnects 2017-07-05 11:23:14 +02:00
Christoph Büscher 3185eaece8 QueryBuilders should implement ToXContentObject (#25530)
All query builders are written as self-contained xContent objects, so we should mark
them accordingly using ToXContentObject. This also makes it possible to use
things like XContentHelper#toXContent to render query builders in tests.
2017-07-05 09:50:10 +02:00
Adrien Grand e7e5216382 Make totalHits a long in CollapseTopFieldDocs.
Relates to #25349.
2017-07-04 18:35:51 +02:00
Colin Goodheart-Smithe 41abccf6c5 Adds rewrite phase to aggregations (#25495)
* Adds rewrite phase to aggregations

This change adds aggregations to the rewrite performed by the `SearchSourceBuilder`. This means that `AggregationBuilder`s are able to implement a `rewrite()` method where they can return a new `AggregationBuilder` which is functionally the same but in a more primitive form. This is exactly analogous to the rewrite done by the `QueryBuilder`s.

The first aggregations to implement the rewrite are the filter and filters aggregations, so they can rewrite the filters they contain.

Closes #17676

* Removes rewrite from PipelineAggregationBuilder

Rewrite is based on shard level information. Since pipeline aggregations are run in the reduce phase it doesn’t make sense to rewrite them on the shards. In fact, eventually we shouldn’t be transporting them to the shards at all and should be retaining them on the coordinating node for execution in the reduce phase.

* Addresses review comments

* addresses more review comments

* Fixed imports
2017-07-04 16:47:48 +01:00
Simon Willnauer 1c4ef0d214 Upgrade randomizedrunner to 2.5.2 (#25533)
An issue causing confusing error messages during test execution
has been fixed in randomizedtesting/randomizedtesting#250
2017-07-04 16:48:11 +02:00
Jun Ohtani 6894ef6057 [Analysis] Support normalizer in request param (#24767)
* [Analysis] Support normalizer in request param

Support normalizer param
Support custom normalizer with char_filter/filter param

Closes #23347
2017-07-04 19:16:56 +09:00
Christoph Büscher 5200665295 Remove deprecated IdsQueryBuilder constructor (#25529)
The constructor using `types` has been deprecated for a while now (starting with
ES 5.1). It can be removed in the next major version. Since types are optional
they should be added with the #types() setter.
2017-07-04 11:59:48 +02:00
Colin Goodheart-Smithe 43efcffcc2 Adds check for negative search request size (#25397)
* Adds check for negative search request size

This change adds a check to `SearchSourceBuilder` to throw an exception if the size set on it is a negative value.

Closes #22530

* fix error in reindex

* update re-index tests

* Addresses review comment

* Fixed tests

* Added random negative size test

* Fixes test
2017-07-04 10:51:38 +01:00
Christoph Büscher f576c987ce Remove QueryParseContext (#25486)
QueryParseContext is currently only used as a wrapper for an XContentParser, so
this change removes it entirely and changes the appropriate APIs that use it so
far to only accept a parser instead.
2017-07-03 17:30:40 +02:00
Tanguy Leroux 0e2cfc66bb [Test] Use a common testing class for all XContent filtering tests (#25491)
We have two ways to filter XContent:

- The first method is to parse the XContent as a map and use
XContentMapValues.filter(). This method filters the content of the map
using an automaton. It is used for source filtering, both at search and
indexing time. It performs well but can generate a lot of objects and
garbage collections when large XContent are filtered. It also returns
empty objects (see f2710c16eb) when all
the sub fields have been filtered out and handles dots in field names as
if they were sub fields.

- The second method is to parse the XContent and copy the XContentParser
 structure to a XContentBuilder initialized with includes/excludes
 filters. This method uses the Jackson streaming filter feature. It is
 used by the Response Filtering ('filter_path') feature. It does not
 generate a lot of objects, and does not return empty objects and also
 does not handle dots in field names explicitly.

 Both methods have similar goals but different tests. This commit changes
 the current XContentBuilder test class so that it becomes a more generic
 testing class and we can now ensure that filtering methods generate the
 same results.

 It also removes some tests from the XContentMapValuesTests class that
 should be in XContentParserTests.
2017-07-03 14:45:26 +02:00
markharwood a9ea742a85 Tests fix - Significant terms/text aggs (#25499)
The significance aggs return Lucene index-level statistics that when merged are assumed to be from different shards. The Aggregator unit tests assume segments can be treated as shards and thus break the significance stats and introduce double-counting of background doc frequencies. This change addresses this problem by ensuring test indexes have only one shard.
Closes #25429
2017-07-03 09:52:23 +01:00
Simon Willnauer 1205610023 [TEST] Expect nodes getting disconnected quickly
If all nodes get disconnected before we can send the request we might
try to reconnect and that will fail with an ISE instead of a transport
exception.

Closes #25301
2017-07-02 22:12:35 +02:00
Boaz Leskes a4fae1540e testPrimaryFailureIncreasesTerm should use assertBusy to wait for yellow
ensureYellow ensures at least yellow.

Also, since we only have 1 replica, we don't need to index for it to know about the primary term promotion

Closes #25287
2017-07-02 21:19:51 +02:00
Simon Willnauer 5a7c8bb04e Cleanup network / transport related settings (#25489)
This commit makes the use of the global network settings explicit instead
of implicit within NetworkService. It cleans up several places where we fall
back to the global settings while we should have used tcp or http ones.

In addition this change also removes unnecessary settings classes
2017-07-02 10:16:50 +02:00
Yannick Welsch bb23d3b2c5 Remove allocation id from replica replication response (#25488)
The replica replication response object has an extra allocationId field that contains the allocation id of the replica on which the request was executed. As we are sending the allocation id with the actual replica replication request, and check when executing the replica replication action that the allocation id of the replica shard is what we expect, there is no need to communicate back the allocation id as part of the response object.
2017-07-01 11:36:45 +02:00
Jason Tedor c70c440050 Adjust status on bad allocation explain requests
When a user requests a cluster allocation explain in a situation where
it does not make sense (for example, there are no unassigned shards), we
should consider this a bad request instead of a server error. Yet, today
by throwing an illegal state exception, these are treated as server
errors. This commit adjusts these so that they throw illegal argument
exceptions and are treated as bad requests.

Relates #25503
2017-06-30 17:50:20 -04:00
Drew Raines 6deb18c0de Preliminary support for ARM
This commit adds preliminary support for 64-bit ARM architectures.

Relates #25318
2017-06-30 14:22:20 -04:00
Jason Tedor dd93ef3f24 Add additional test for sequence-number recovery
This commit adds a test for a scenario where a replica receives an extra
document that the promoted replica does not receive, misses the
primary/replica re-sync, and then recovers from the newly-promoted
primary.

Relates #25493
2017-06-30 10:59:03 -04:00
Martijn van Groningen c8da7f84a2 WrapperQueryBuilder should also rewrite the parsed query.
Failing to do so can cause other errors later on during query execution.
For example if  `WrapperQueryBuilder` wraps a `GeoShapeQueryBuilder` that fetches the shape from an index then it will skip the shape fetching
and fail later with the error that no shapes have been fetched.
2017-06-30 13:48:18 +02:00
Yannick Welsch 1fee1045b9 Remove dead code and stale Javadoc 2017-06-30 12:25:56 +02:00
Jason Tedor d219a85b33 Use LRU set to reduce repeat deprecation messages
This commit adds an LRU set used to determine if a keyed deprecation
message should be written to the deprecation logs, or only added to the
response headers on the thread context.
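
The idea of an LRU set for keyed messages can be sketched as follows. This is only an illustration under stated assumptions, not the code from the commit: the class name `KeyedDeprecationFilter`, the method `shouldLog`, and the configurable capacity are hypothetical. It relies on the standard `LinkedHashMap` access-order mode and `removeEldestEntry` hook.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: remembers the most recently seen keys and reports
// whether a key is new (and should therefore be written to the log).
final class KeyedDeprecationFilter {
    private final Set<String> recentKeys;

    KeyedDeprecationFilter(final int maxSize) {
        // Access-ordered LinkedHashMap that evicts its least recently used
        // entry once it grows beyond maxSize; wrapped for thread safety.
        this.recentKeys = Collections.newSetFromMap(Collections.synchronizedMap(
            new LinkedHashMap<String, Boolean>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                    return size() > maxSize;
                }
            }));
    }

    // Returns true only if the key is not currently held in the LRU set.
    boolean shouldLog(String key) {
        return recentKeys.add(key);
    }
}
```

A caller would write the message to the deprecation log only when `shouldLog(key)` returns true, and could still add it to the response headers in either case.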

Relates #25474
2017-06-29 16:36:43 -04:00
Tim Brooks cac2eec7d2 Add NioTransport threads to thread name checks (#25477)
We have various assertions that check we never block on transport
threads. This commit adds the thread names for the NioTransport to
these assertions.

With this change I had to fix two places where we were calling blocking
methods from the transport threads.
2017-06-29 15:16:07 -05:00
Christoph Büscher c32c21e875 Add shortcut for AbstractQueryBuilder.parseInnerQueryBuilder to QueryShardContext 2017-06-29 21:45:02 +02:00
Christoph Büscher 99aa04b79c Fix Java 9 compilation issue
My IDE ate a cast that seems required to make Java 9 happy.
2017-06-29 20:57:22 +02:00
Christoph Büscher 927111c91d Remove QueryParseContext from parsing QueryBuilders (#25448)
Currently QueryParseContext is only a thin wrapper around an XContentParser that
adds little functionality of its own. It provides helpers for long deprecated
field names which can be removed and two helper methods that can be made static
and moved to other classes. This is a first step in helping to remove
QueryParseContext entirely.
2017-06-29 17:10:20 +02:00
Lee Hinman 22ff76da0c Promote replica on the highest version node (#25277)
* Promote replica on the highest version node

This changes the replica selection to prefer to return replicas on the highest
version when choosing a replacement to promote when the primary shard fails.

Consider this situation:

- A replica on a 5.6 node
- Another replica on a 6.0 node
- The primary on a 6.0 node

The primary shard is sending sequence numbers to the replica on the 6.0 node and
skipping sending them for the 5.6 node. Now assume that the primary shard fails
and (prior to this change) the replica on 5.6 node gets promoted to primary, it
now has no knowledge of sequence numbers and the replica on the 6.0 node will be
expecting sequence numbers but will never receive them.

Relates to #10708

* Switch from map of node to version to retrieving the version from the node

* Remove unneeded null check

* You can pretend you're a functional language Java, but you're not fooling me.

* Randomize node versions

* Add test with random cluster state with multiple versions that fails shards

* Re-add comment and remove extra import

* Remove unneeded stuff, randomly start replicas a few more times

* Move test into FailedNodeRoutingTests

* Make assertions actually test replica version promotion

* Rewrite test, taking Yannick's feedback into account
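
The selection rule described in this commit (prefer the replica on the highest-version node) can be sketched roughly as below. This is a simplified illustration, not the actual routing code: `ReplicaCandidate`, `ReplicaPromotion`, `selectForPromotion`, and the integer version encoding are all hypothetical stand-ins.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for an active replica and the version of its node.
final class ReplicaCandidate {
    final String nodeId;
    final int nodeVersion; // assumed encoding, e.g. 50600 for 5.6 and 60000 for 6.0

    ReplicaCandidate(String nodeId, int nodeVersion) {
        this.nodeId = nodeId;
        this.nodeVersion = nodeVersion;
    }
}

final class ReplicaPromotion {
    // Picks the replica whose node runs the highest version, if any exists.
    static Optional<ReplicaCandidate> selectForPromotion(List<ReplicaCandidate> activeReplicas) {
        return activeReplicas.stream()
            .max(Comparator.comparingInt((ReplicaCandidate r) -> r.nodeVersion));
    }
}
```

In the 5.6/6.0 scenario above, this rule would select the replica on the 6.0 node, so the new primary keeps sending sequence numbers.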
2017-06-29 08:56:34 -06:00
Martijn van Groningen a2b4080fba use diamond operator 2017-06-29 13:43:39 +02:00
Christoph Büscher aa2038f9d7 Use DocumentField#toXContent and parsing in SearchHit (#25469)
As a small follow-up to #25361, we can use DocumentFields toXContent/fromXContent
in SearchHit now.
2017-06-29 13:32:13 +02:00
olcbean 3518e313b8 Unify the result interfaces from get and search in Java client (#25361)
As GetField and SearchHitField have the same members, they have been unified into
DocumentField.

Closes #16440
2017-06-29 11:35:28 +02:00
Jason Tedor da59c178e2 Emit settings deprecation logging at most once
When a setting is deprecated, if that setting is used repeatedly we
currently emit a deprecation warning every time the setting is used. In
cases like hitting settings endpoints over and over against a node with
a lot of deprecated settings, this can lead to excessive deprecation
warnings which can crush a node. This commit ensures that a given
setting only sees deprecation logging at most once.

Relates #25457
2017-06-28 22:18:46 -04:00
Ali Beyad b18bfd6062 Output all empty snapshot info fields if in verbose mode (#25455)
In #24477, a less verbose option was added to retrieve snapshot info via
GET /_snapshot/{repo}/{snapshots}.  The point of adding this less
verbose option was so that if the repository is a cloud based one, and
there are many snapshots for which the snapshot info needed to be
retrieved, then each snapshot would require reading a separate snapshot
metadata file to pull out the necessary information.  This can be costly
(performance and cost) on cloud based repositories, so a less verbose
option was added that only retrieves very basic information about each
snapshot that is all available in the index-N blob - requiring only one
read!

In order to display this less verbose snapshot info appropriately, logic
was added to not display those fields which could not be populated.
However, this broke integrators (e.g. ECE) that required these fields to
be present, even if empty.  This commit is to return these fields in the
response, even if empty, if the verbose option is set.
2017-06-28 17:37:56 -05:00
Jay Modi 64d11b8831 Fix race condition in RemoteClusterConnection node supplier (#25432)
This commit fixes a race condition in the node supplier used by the RemoteClusterConnection. The
node supplier stores an iterator over a set backed by a ConcurrentHashMap, but the get operation
of the supplier uses multiple methods of the iterator and is susceptible to a race between the
calls to hasNext() and next(). The test in this commit fails under the old implementation with a
NoSuchElementException. This commit adds a wrapper object over a set and an iterator, with all methods
being synchronized to avoid races. Modifications to the set result in the iterator being set to null
and the next retrieval creates a new iterator.
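
A minimal sketch of the wrapper described above, assuming a round-robin style retrieval; the class name `SynchronizedRoundRobin` and its method names are hypothetical and not taken from the commit. Every method is synchronized, so `hasNext()` and `next()` can never interleave with a modification of the set.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical wrapper over a set and an iterator; all access is synchronized.
final class SynchronizedRoundRobin<T> {
    private final Set<T> items = Collections.newSetFromMap(new ConcurrentHashMap<>());
    private Iterator<T> iterator; // null means "create a new iterator on next retrieval"

    synchronized boolean add(T item) {
        boolean added = items.add(item);
        if (added) {
            iterator = null; // modification invalidates the current iterator
        }
        return added;
    }

    synchronized boolean remove(T item) {
        boolean removed = items.remove(item);
        if (removed) {
            iterator = null;
        }
        return removed;
    }

    // Returns the next item, wrapping around when the iterator is exhausted.
    synchronized T next() {
        if (items.isEmpty()) {
            throw new IllegalStateException("no items available");
        }
        if (iterator == null || !iterator.hasNext()) {
            iterator = items.iterator();
        }
        return iterator.next();
    }
}
```

Because the check and the retrieval happen under the same lock as every modification, the `NoSuchElementException` scenario from the unsynchronized version cannot occur in this sketch.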
2017-06-28 15:50:24 -06:00