Commit Graph

9259 Commits

Author SHA1 Message Date
Jason Tedor 470e5e7cfc Add additional low-level logging handler ()
* Add additional low-level logging handler

We have the trace handler which is useful for recording sent messages
but there are times where it would be useful to have more low-level
logging about the events occurring on a channel. This commit adds a
logging handler that can be enabled by setting a certain log level
(org.elasticsearch.transport.netty4.ESLoggingHandler) to trace that
provides trace logging on low-level channel events and includes some
information about the request/response read/write events on the channel
as well.

* Remove imports

* License header

* Remove redundant

* Add test

* More assertions
2017-10-05 12:10:58 -04:00
Simon Willnauer 8583727590 [TEST] add test to ensure legacy list syntax in yml works fine 2017-10-05 14:41:51 +02:00
Simon Willnauer 41925e1171 Bump BWC version for settings serialization to 6.1.0 2017-10-05 14:07:05 +02:00
Md. Abdulla-Al-Sun a40c474e10
Added Bengali Analyzer to Elasticsearch with respect to the lucene update(PR#238) 2017-10-05 13:25:05 +02:00
kel a978ddf37b Fix toString() in SnapshotStatus (#26852)
Closes #26851
2017-10-05 12:57:46 +02:00
Jim Ferenczi 24359c1a75 #26870 change bwc version for fuzzy_transpositions to 6.1 after backport 2017-10-05 11:28:59 +02:00
Martijn van Groningen b863eaff4d
update lucene version after upgrading Lucene deps in 6.x branch too 2017-10-05 09:49:30 +02:00
Simon Willnauer 00dfdf50cf Represent lists as actual lists inside Settings (#26878)
Today we represent each value of a list setting with it's own dedicated key
that ends with the index of the value in the list. Aside of the obvious
weirdness this has several issues especially if lists are massive since it
causes massive runtime penalties when validating settings. Like a list of 100k
words will literally cause a create index call to timeout and in-turn massive
slowdown on all subsequent validations runs.

With this change we use a simple string list to represent the list. This change
also forbids to add a settings that ends with a .0 which was internally used to
detect a list setting.  Once this has been rolled out for an entire major
version all the internal .0 handling can be removed since all settings will be
converted.

Relates to #26723
2017-10-05 09:27:08 +02:00
Martijn van Groningen dca787ed8a
upgrade to Lucene 7.1.0 snapshot version 2017-10-05 09:06:56 +02:00
Alexander Kazakov 9c95e91471 Expose `fuzzy_transpositions` parameter in fuzzy queries (#26870)
Add fuzzy_transpositions parameter to multi_match and query_string queries.
Add fuzzy_transpositions, fuzzy_prefix_length and fuzzy_max_expansions
parameters to simple_query_string query.
2017-10-05 09:01:09 +02:00
Jason Tedor c5a0e77fc6 Remove unused local transport constant
Local transport was removed previously but a stale constant was left
behind. This commit removes this unused constant.
2017-10-04 22:26:29 -04:00
Luca Cavanna 9b9cb81c41 Fix serialization errors when cross cluster search goes to a single shard (#26881)
The single shard optimization that we have in our search api changes the type of response returned by the query transport action name based on the shard search request. if the request goes to one shard, we will do query and fetch at the same time, hence the response will be different. The proxying layer used in cross cluster search was not aware of this distinction, which causes serialization issues every time a cross cluster search request goes to a single shard and goes through a gateway node which has to forward the shard request to a data node. The coordinating node would then expect a QueryFetchSearchResult while the gateway would return a QuerySearchResult.

Closes #26833
2017-10-04 22:39:14 +02:00
Tim Brooks ca35fcabd0 Do not set SO_LINGER to 0 when not shutting down (#26871)
This is a follow up to #26764. That commit set SO_LINGER to 0 in order
to fix a scenario where we were running out of resources during CI. We
are primarily interested in setting this to 0 when stopping the
tranport. Allowing TIMED_WAIT is standard for other failure scenarios
during normal operation.

Unfortunately this commit set SO_LINGER to 0 every time we close
NodeChannels. NodeChannels can be closed in case of an exception or
other failures (such as parsing a response). We want to only disable
linger when actually shutting down.
2017-10-04 10:27:26 -06:00
Simon Willnauer d1533e2397 Remove Settings#getAsMap() (#26845)
Since `#getAsMap` exposes internal representation we are trying to remove it
step by step. This commit is cleaning up some xcontent writing as well as
usage in tests
2017-10-04 01:21:38 -06:00
David Roberts 2bdeee8840 Fix test bug from #26166 2017-10-03 15:55:37 +01:00
David Roberts ea7be2d527 Adjust transport compatibility logic following backport of #26166 2017-10-03 14:16:31 +01:00
David Roberts a292740b9e Add cgroup memory usage/limit to OS stats on Linux (#26166)
This change adds cgroup memory usage/limit to the OS stats section of
the node stats on Linux.  This information is useful because in Docker
containers the standard node stats report the host memory limit, not
taking account of extra restrictions that may have been applied to the
container.

The original idea was to store these values as Long, truncating any values
outside the range of long.  However, this meant that in the relatively common
case of no limit being applied, users would not see the same value in the OS
stats as they see by querying Linux directly.  So instead the values are stored
as String.  This change places a burden on consumers of the strings to
convert the strings to numbers and decide what to do about extremely large
values, but there will be very few consumers and they would need to have a
policy for dealing with "no limit" in any case.
2017-10-03 12:08:36 +01:00
Simon Willnauer 7b8d036ab5 Replace group map settings with affix setting (#26819)
We use group settings historically instead of using a prefix setting which is more restrictive and type safe. The majority of the usecases needs to access a key, value map based on the _leave node_ of the setting ie. the setting `index.tag.*` might be used to tag an index with `index.tag.test=42` and `index.tag.staging=12` which then would be turned into a `{"test": 42, "staging": 12}` map. The group settings would always use `Settings#getAsMap` which is loosing type information and uses internal representation of the settings. Using prefix settings allows now to access such a method type-safe and natively.
2017-09-30 14:27:21 +02:00
David Turner 8fe9a20982 Forbid negative values for index.unassigned.node_left.delayed_timeout (#26828)
Change delayed_timeout to be a positiveTimeSetting, and add note that this is a breaking change
2017-09-29 14:44:43 +01:00
David Turner 1715fd7036 Nitpicking typos in comments (#26831) 2017-09-29 11:37:05 +01:00
Boaz Leskes eabd0d687e MetaData Builder doesn't properly prevent an alias with the same name as an index (#26804)
Elasticsearch doesn't allow having an index alias named with the same name as an existing index. We currently have logic that tries to prevents that in the `MetaData.Builder#build()` method. Sadly that logic is flawed. Depending on iteration order, we may allow the above to happen (if we encounter the alias before the index).

This commit fixes the above and improves the error message while at it.

Note that we have a lot of protections in place before we end up relying on the metadata builder (validating this when we process APIs). I takes quite an abuse of the cluster to get that far.
2017-09-29 11:06:58 +02:00
Tal Levy ce8533e251 Add version 6.0.0-rc2 2017-09-28 11:07:35 -07:00
Jim Ferenczi aade2f6d63 Change ParentFieldSubFetchPhase to create doc values iterator once per segment (#26815) 2017-09-28 16:24:05 +02:00
Jim Ferenczi 0acf74420c Change VersionFetchSubPhase to create doc values iterator once per segment (#26809) 2017-09-28 13:57:10 +02:00
Jim Ferenczi f9f8856d89 Change ScriptFieldsFetchSubPhase to create search scripts once per segment (#26808)
Closes #26775
2017-09-28 12:23:53 +02:00
James Baiera 81d181394f Adding unreleased 5.6.3 version number to Version.java (#26794) 2017-09-26 17:49:22 -04:00
Armin Braun af06231d4c #26701 Close TcpTransport on RST in some Spots to Prevent Leaking TIME_WAIT Sockets (#26764)
#26701 Added option to RST instead of FIN to TcpTransport#closeChannels
2017-09-26 19:58:11 +00:00
olcbean 6952f7b560 Validate top-level keys for create index request (#23755) (#23869)
This commit ensures create index requests do not ignore unknown keys passed to the request.

closes #23755
2017-09-26 09:49:20 -07:00
Alexander Kazakov 52887b1437 Throw exception if setting during dynamic settings update isn't recognized (#26569)
Closes #25607
2017-09-26 13:07:01 +02:00
Simon Willnauer a506ba8602 Remove `Settings,put(Map<String,String>)` (#26785)
`Map<String,String>` is basically erasing the type while other methods on
the `Settings.Builder` are type safe and have corresponding `get` methods.
2017-09-26 12:15:20 +02:00
Jim Ferenczi 74473c1c3d Early termination with index sorting should not set terminated_early in the response (#26597)
Early termination with index sorting always return the best top N in the response but set the flag `terminated_early`
in the response. This can be confusing because we use the same flag for `terminate_after` which on the contrary returns partial results.
This change removes the flag when results are not partial (early termination due to index sorting) and keeps it only when `terminate_after` is used.

Closes #26408
2017-09-26 11:37:11 +02:00
Christoph Büscher 6189c54c84 Reject the `index_options` parameter for numeric fields (#26668)
Numeric fields no longer support the index_options parameter. This changes the parameter
to be rejected in numeric field types after it was deprecated in 6.0.

Closes #21475
2017-09-25 23:43:14 +02:00
Jason Tedor 530d80fdc1 Fix global checkpoint sync log message
The log message here is incorrect, a failure here is occuring on the
post-operation global checkpoint sync, not the background sync. This
commit fixes the log message.
2017-09-25 16:38:58 -04:00
Nik Everett eb754a71be Fix update_by_query's default size parameter (#26784)
We were accidentally defaulting it to the scroll size.
Untwists some of the tricks that we play with parsing
so that the size is no longer scrambled.

Closes #26761
2017-09-25 16:25:27 -04:00
Jason Tedor 57aee93693 Add shard ID to failed global checkpoint messages
When a global checkpoint sync fails we should log the shard ID for the
shard the sync failed for. This commit causes this to be the case.
2017-09-25 14:32:55 -04:00
Boaz Leskes ec0c621072 `IndexShard.routingEntry` should only be updated once all internal state is ready (#26776)
The routing entry is used by external components to check whether the shard is ready to perform as primary. Most notably, the peer recovery source handler delays recoveries until the shard routing entry says the shard is ready. 

When a shard is promoted to primary, we currently update the shard's routing entry before we finish all the work relating to the promotion. This can cause recoveries to fail later on because the `GlobalCheckpointTracker` isn't set (yet) to primary mode. 

This commit fixes this issue by updating the routing entry last.
2017-09-25 19:57:16 +02:00
Adrien Grand 579cca9adb Allow copying from a field to another field that belongs to the same nested object. (#26774)
The previous test was too strict and enforced that the target object was a
parent. It has been relaxed so that fields that belong to the same nested
object can copy to each other.

The commit also improves error handling in case of multi-fields. The current
validation works but may throw confusing error messages since it assumes that
only object fields may introduce dots in fields names while multi fields may
too.

Closes #26763
2017-09-25 18:33:26 +02:00
Christoph Büscher 3827918417 Add configurable `maxTokenLength` parameter to whitespace tokenizer (#26749)
Other tokenizers like the standard tokenizer allow overriding the default
maximum token length of 255 using the `"max_token_length` parameter. This change
enables using this parameter also with the whitespace tokenizer. The range that
is currently allowed is from 0 to StandardTokenizer.MAX_TOKEN_LENGTH_LIMIT,
which is 1024 * 1024 = 1048576 characters.

Closes #26643
2017-09-25 17:21:19 +02:00
Adrien Grand 324ee8cc96 Use the 6.6.1 Lucene version constant. (#26768)
It is only possible since we moved to Lucene 7.0.0 GA. Previous snapshots did
not know about it.
2017-09-25 17:06:16 +02:00
kel 8f5f63452a Fix typo in date format (#26503) 2017-09-25 17:01:39 +02:00
Simon Willnauer aab4655e63 Unify Settings xcontent reading and writing (#26739)
This change adds a fromXContent method to Settings that allows to read
the xcontent that is produced by toXContent. It also replaces the entire settings
loader infrastructure and removes the structured map representation. Future PRs will
also tackle the `getAsMap` that exposes the internal represenation of settings for
better encapsulation.
2017-09-25 13:23:01 +02:00
Jason Tedor d87ab67e44 Fix global checkpoint sync test
This commit fixes issues with the global checkpoint sync test. The test
was off in initializing the maximum sequence number on the primary
shard, and off setting the local knowledge on the primary of the global
checkpoint on the replica.
2017-09-22 17:12:32 -04:00
Jason Tedor 2e63a13c0a Upgrade to Log4j 2.9.1
This commit upgrades the Log4j dependency, picking up a fix for an issue
with handling stack traces on JDK 9.

Relates #26750
2017-09-22 11:57:06 -04:00
Yannick Welsch df5c450e89 Add v6.1 BWC layer for adding wait_for_active_shards to index open command
This commit disables BWC tests while adding a v6.1 BWC layer for the PR #26682
2017-09-22 16:30:07 +02:00
Martijn van Groningen a056c5d469
aggs: Changed how top_hits initialises leaf collectors
Both TopDocsCollector and LeafCollector were being kept around at the aggregator level. In case the nested aggregator would do a post collection then this could cause pushing down docids to top hits child aggregators that already moved the next LeafCollector (causing assertions to trip and incorrect results).

By keeping track of the LeafCollector in a seperate map at the leaf bucket level this problem can simply not happen any more as the place holding LeafCollector is no longer shared.

Also LeafCollector instances for TopDocsCollectors are no longer pre-created as the beginning a new segment is evaluated. There is no guarantee that TopHitsAggregator encounters a document for a particular bucket and there has to be logic to create LeafCollector instances which have not been seen before.

Closes #26738
2017-09-22 15:59:43 +02:00
Alexander Kazakov ff737a880c Add wait_for_active_shards parameter to index open command (#26682)
Adds the wait_for_active_shards parameter to the index open command. Similar to the index creation command, the index open command will now, by default, wait until the primaries have been allocated.

Closes #20937
2017-09-22 11:15:03 +02:00
Martijn van Groningen 109c6c2717
aggs: Do not delegate a null scorer to LeafBucketCollectors
Closes #26611
2017-09-22 09:20:57 +02:00
Yannick Welsch 76e1b7437c [TEST] Remove assertSeqNos from testAckedIndexing 2017-09-22 08:31:36 +02:00
Jason Tedor e0db89bc35 Upgrade to Lucene 7.0.0
This commit upgrades to the GA release of Luence 7!

Relates #26744
2017-09-21 19:19:33 -04:00
Jason Tedor f35d1de502 Introduce global checkpoint background sync
It is the exciting return of the global checkpoint background
sync. Long, long ago, in snapshot version far, far away we had and only
had a global checkpoint background sync. This sync would fire
periodically and send the global checkpoint from the primary shard to
the replicas so that they could update their local knowledge of the
global checkpoint. Later in time, as we sped ahead towards finalizing
the initial version of sequence IDs, we realized that we need the global
checkpoint updates to be inline. This means that on a replication
operation, the primary shard would piggy back the global checkpoint with
the replication operation to the replicas. The replicas would update
their local knowledge of the global checkpoint and reply with their
local checkpoint. However, this could allow the global checkpoint on the
primary to advance again and the replicas would fall behind in their
local knowledge of the global checkpoint. If another replication
operation never fired, then the replicas would be permanently behind. To
account for this, we added one more sync that would fire when the
primary shard fell idle. However, this has problems:
 - the shard idle timer defaults to five minutes, a long time to wait
   for the replicas to learn of the new global checkpoint
 - if a replica missed the sync, there was no follow-up sync to catch
   them up
 - there is an inherent race condition where the primary shard could
   fall idle mid-operation (after having sent the replication request to
   the replicas); in this case, there would never be a background sync
   after the operation completes
 - tying the global checkpoint sync to the idle timer was never natural

To fix this, we add two additional changes for the global checkpoint to
be synced to the replicas. The first is that we add a post-operation
sync that only fires if there are no operations in flight and there is a
lagging replica. This gives us a chance to sync the global checkpoint to
the replicas immediately after an operation so that they are always kept
up to date. The second is that we add back a global checkpoint
background sync that fires on a timer. This timer fires every thirty
seconds, and is not configurable (for simplicity). This background sync
is smarter than what we had previously in the sense that it only sends a
sync if the global checkpoint on at least one replica is lagging that of
the primary. When the timer fires, we can compare the global checkpoint
on the primary to its knowledge of the global checkpoint on the replicas
and only send a sync if there is a shard behind.

Relates #26591
2017-09-21 15:34:13 -04:00
Martijn van Groningen fda8f8b827
muted test 2017-09-21 17:24:18 +02:00
Jay Modi c47f24d406 BulkProcessor flush runnable preserves the thread context from creation time (#26718)
When using a bulk processor, the thread context was not preserved for the flush runnable which is
executed in another thread in the thread pool. This change wraps the flush runnable in a context
preserving runnable so that the headers and transients from the creation time of the bulk processor
are available during the execution of the flush.

Closes #26596
2017-09-20 10:19:42 -06:00
Simon Willnauer b9c0d4447c Catch exceptions and inform handler in RemoteClusterConnection#collectNodes (#26725)
This adds a missing catch block to invoke the action listener instead of bubbeling
up the exception.

Closes #26700
2017-09-20 17:53:12 +02:00
Christoph Büscher 86b00b84bc Remove parse field deprecations in query builders (#26711)
The `fielddata` field and the use of the `_name` field in the short syntax of the range 
query have been deprecated in 5.0 and can be removed.

The same goes for the deprecated `score_mode` field in HasParentQueryBuilder,
the deprecated `like_text`, `ids` and `docs` parameter in the `more_like_this` query,
the deprecated query name in the short version of the `regexp` query, and several
deprecated alternative field names in other query builders.
2017-09-20 16:22:21 +02:00
Christoph Büscher 3d67915ed5 #26720: Set the correct bwc version after backport to 6.0 2017-09-20 16:14:11 +02:00
Christoph Büscher 22e200e79a Remove deprecated type and slop field in MatchQueryBuilder (#26720)
The `type` field has been deprecated in 5.0 and can be removed. It has been
replaced by using the MatchPhraseQueryBuilder or the
MatchPhrasePrefixQueryBuilder. The `slop` field has also been deprecated and can
be removed, the phrase and phrase prefix query builders still provide this
parameter.
2017-09-20 14:24:30 +02:00
Yannick Welsch 5f407062ad Refactoring of Gateway*** classes (#26706)
- Removes mutual dependency between GatewayMetaState and TransportNodesListGatewayMetaState
- Deguices MetaDataIndexUpgradeService
- Deguices GatewayMetaState
- Makes Gateway the master-level component that is only responsible for coordinating the state recovery
2017-09-20 12:51:58 +02:00
Yannick Welsch ff1e26276d Deguice ActionFilter (#26691)
Allows to instantiate TransportAction instances without Guice.
2017-09-20 10:30:21 +02:00
Martijn van Groningen 61849a1150
aggs: Allow aggregation sorting via nested aggregation.
The nested aggregator now buffers all bucket ords per parent document and
emits all bucket ords for a parent document's nested document once. This way
the nested documents document DocIdSetIterator gets used once per bucket
instead of wrapping the nested aggregator inside a multi bucket aggregator,
which was the current solution upto now. This allows sorting by buckets
under a nested bucket.

Closes #16838
2017-09-20 07:44:53 +02:00
Jason Tedor 581a873124 Remove assertion from checkpoint tracker invariants
This assertion is wrong because the global checkpoint on a promoted
primary can be lagging the replicas until it catches up after through
resyncs, ongoing indexing operations and removing the old primary from
the in-sync set.
2017-09-19 17:52:41 -04:00
Igor Motov 5090260119 Upgrade API: fix excessive logging and unnecessary template updates (#26698)
TemplateUpgradeService might get stuck in repeatedly upgrading templates after upgrade to 5.6.0. This is caused by shuffling mappings definition in the template during template serialization. This commit makes the template serialization consistent.

Closes #26673
2017-09-19 16:32:17 -04:00
Boaz Leskes 04385a9ce9 Restoring from snapshot should force generation of a new history uuid (#26694)
Restoring a shard from snapshot throws the primary back in time violating assumptions and bringing the validity of global checkpoints in question. To avoid problems, we should make sure that a shard that was restored will never be the source of an ops based recovery to a shard that existed before the restore. To this end we have introduced the notion of `histroy_uuid` in #26577 and required that both source and target will have the same history to allow ops based recoveries. This PR make sure that a shard gets a new uuid after restore.

As suggested by @ywelsch , I derived the creation of a `history_uuid` from the `RecoverySource` of the shard. Store recovery will only generate a uuid if it doesn't already exist (we can make this stricter when we don't need to deal with 5.x indices). Peer recovery follows the same logic (note that this is different than the approach in #26557, I went this way as it means that shards always have a history uuid after being recovered on a 6.x node and will also mean that a rolling restart is enough for old indices to step over to the new seq no model). Local shards and snapshot force the generation of a new translog uuid.

Relates #10708
Closes #26544
2017-09-19 15:58:36 +02:00
Martijn van Groningen 332b4d12fa
test: Use a single primary shard so that the exception can caught in the same way 2017-09-19 15:14:24 +02:00
Jason Tedor 256721018b Move pre-6.0 node checkpoint to SequenceNumbers
This commit moves the pre-6.0 node checkpoint constant from
SequenceNumbersService to SequenceNumbers so it can chill with the other
sequence number-related constants.

Relates #26690
2017-09-19 06:27:56 -04:00
Armin Braun 2db3bccd37 Invalid JSON request body caused endless loop (#26680)
Request bodys that only consists of a String value can lead to endless loops in the
parser of several rest requests like e.g. `_count`. Up to 5.2 this seems to have been caught 
in the logic guessing the content type of the request, but since then it causes the node to
block. This change introduces checks for receiving a valid xContent object before starting the
parsing in RestActions#parseTopLevelQueryBuilder().

Closes #26083
2017-09-19 12:02:05 +02:00
Martijn van Groningen 6c46a67dd6
added comment 2017-09-19 11:02:15 +02:00
Martijn van Groningen a3a6ce6220
fix line length violation 2017-09-19 11:02:15 +02:00
Martijn van Groningen f782f618cc
Moved the check to fetch phase. This basically means that we throw
a better error message instead of an AOBE and not adding more restrictions.
2017-09-19 11:02:15 +02:00
Martijn van Groningen d05aee7eda
inner hits: Do not allow inner hits that use _source and have a non nested object field as parent
Closes #25315
2017-09-19 11:02:15 +02:00
Tal Levy cc726cb3b6 convert more admin requests to writeable (#26566) 2017-09-18 13:19:34 -07:00
Nik Everett 98f8bde389 Handle release of 5.6.1
* Add a version constant for 5.6.2 so that the 5.6.1 constant
represents the 5.6.1 release and the 5.6.2 constant represents
the unreleased 5.6 branch.
2017-09-18 15:41:09 -04:00
Simon Willnauer 9f97f9072a Allow `InputStreamStreamInput` array size validation where applicable (#26692)
Today we can't validate the array length in `InputStreamStreamInput` since
we can't rely on `InputStream.available` yet in some situations we know
the size of the stream and can apply additional validation.
2017-09-18 17:52:36 +02:00
Jason Tedor 23093adcb9 Update global checkpoint with permit after recovery
After recovery completes from a primary, we now update the local
knowledge on the primary of the global checkpoint on the recovery
target. However if this occurs concurrently with a relocation, an
assertion could trip that we are no longer in primary mode. As this
local knowledge should only be tracked when we are in primary mode,
updating this local knowledge should be done under a permit. This commit
causes that to be the case.

Relates #26666
2017-09-18 07:48:08 -04:00
Jason Tedor 6f25163aef Filter pre-6.0 nodes for checkpoint invariants
When checking that the global checkpoint on the primary is consistent
with the local checkpoints of the in-sync shards, we have to filter
pre-6.0 nodes from the check or the invariant will trivially trip. This
commit filters these nodes out when checking this invariant.

Relates #26666
2017-09-18 06:51:22 -04:00
Jason Tedor c238b79cf4 Add global checkpoint tracking on the primary
This commit adds local tracking of the global checkpoints on all shard
copies when a global checkpoint tracker is operating in primary
mode. With this, we relay the global checkpoint on a shard copy back to
the primary shard during replication operations. This serves as another
step towards adding a background sync of the global checkpoint to the
shard copies.

Relates #26666
2017-09-18 06:04:44 -04:00
Michael Basnight 296c239611 Add check for invalid index in WildcardExpressionResolver (#26409)
This commit adds validation to the resolving of indexes in the wildcard
expression resolver. It no longer throws a 404 Not Found when resolving
invalid indices. It throws a 400 instead, as it is an invalid
index. This was the behavior of 5.x.
2017-09-15 17:00:41 -05:00
kel 0f2a11695e Filter unsupported relation for range query builder (#26620) 2017-09-15 14:01:35 +02:00
Boaz Leskes ffc9999567 fix StartRecoveryRequestTests.testSerialization 2017-09-14 23:20:55 +03:00
Boaz Leskes 1ca0b5e9e4 Introduce a History UUID as a requirement for ops based recovery (#26577)
The new ops based recovery, introduce as part of  #10708, is based on the assumption that all operations below the global checkpoint known to the replica do not need to be synced with the primary. This is based on the guarantee that all ops below it are available on primary and they are equal. Under normal operations this guarantee holds. Sadly, it can be violated when a primary is restored from an old snapshot. At the point the restore primary can miss operations below the replica's global checkpoint, or even worse may have total different operations at the same spot. This PR introduces the notion of a history uuid to be able to capture the difference with the restored primary (in a follow up PR).

The History UUID is generated by a primary when it is first created and is synced to the replicas which are recovered via a file based recovery. The PR adds a requirement to ops based recovery to make sure that the history uuid of the source and the target are equal. Under normal operations, all shard copies will stay with that history uuid for the rest of the index lifetime and thus this is a noop. However, it gives us a place to guarantee we fall back to file base syncing in special events like a restore from snapshot (to be done as a follow up) and when someone calls the truncate translog command which can go wrong when combined with primary recovery (this is done in this PR).

We considered in the past to use the translog uuid for this function (i.e., sync it across copies) and thus avoid adding an extra identifier. This idea was rejected as it removes the ability to verify that a specific translog really belongs to a specific lucene index. We also feel that having a history uuid will serve us well in the future.
2017-09-14 21:25:02 +03:00
Christoph Büscher c7c6443b10 [Docs] "The the" is a great band, but ... (#26644)
Removing several occurrences of this typo in the docs and javadocs, seems to be
a common mistake. Corrections turn up once in a while in PRs, better to correct
some of this in one sweep.
2017-09-14 15:08:20 +02:00
Jason Tedor ca6bce75da Refactor bootstrap check results and error messages
This commit refactors the bootstrap checks into a single result object
that encapsulates whether or not the check passed, and a failure message
if the check failed. This simpifies the checks, and enables the messages
to more easily be based on the state used to discern whether or not the
check passed.

Relates #26637
2017-09-13 21:30:27 -04:00
Simon Willnauer b4de2a6f28 Add BootstrapContext to expose settings and recovered state to bootstrap checks (#26628)
This exposes the node settings and the persistent part of the cluster state to the
bootstrap checks to allow plugins to enforce certain preconditions based on the
recovered state.
2017-09-13 22:14:17 +02:00
Christoph Büscher 2eaf7534f3 [Tests] Removing skipping tests in search rest tests
After backporting the script_field soft limit to the 6.x branches, this test can
now also run in a mixed cluster.

Relates to #26598

 enter the commit message for your changes. Lines starting
2017-09-13 18:21:15 +02:00
Jason Tedor 7be5ee5f28 Initialize checkpoint tracker with allocation ID
This commit pushes the allocation ID down through to the global
checkpoint tracker at construction rather than when activated as a
primary.

Relates #26630
2017-09-13 12:15:15 -04:00
Adrien Grand 93da7720ff Move non-core mappers to a module. (#26549)
Today we have all non-plugin mappers in core. I'd like to start moving those
that neither map to json datatypes nor are very frequently used like `date` or
`ip` to a module.

This commit creates a new module called `mappers-extra` and moves the
`scaled_float` and `token_count` mappers to it. I'd like to eventually move
`range` fields there but it's more complicated due to their intimate
relationship with range queries.

Relates #10368
2017-09-13 17:58:53 +02:00
Christoph Büscher 027c555c9b Add soft limit on allowed number of script fields in request (#26598)
Requesting to many script_fields in a search request can be costly
because of script execution. This change introduces a soft limit on the number
of script fields that are allowed per request. The setting can be
changed per index using the index.max_script_fields setting.

Relates to #26390
2017-09-13 17:22:16 +02:00
Adrien Grand 64770b3fbd Remove MapperService#dynamic. (#26603)
We ignore it as of 6.0 and forbid it as of 7.0.
2017-09-13 17:00:52 +02:00
Adrien Grand 454cfc2cea More efficient encoding of range fields. (#26470)
This PR removes the vInt that precedes every value in order to know how long
they are. Instead the query takes an enum that tells how to compute the length
of values: for fixed-length data (ip addresses, double, float) the length is a
constant while longs and integers use a variable-length representation that
allows the length to be computed from the encoded values.

Also the encoding of ints/longs was made a bit more efficient in order not to
waste 3 bits in the header. As a consequence, values between -8 and 7 can now
be encoded on 1 byte and values between -2048 and 2047 can now be encoded on 2
bytes or less.

Closes #26443
2017-09-13 15:26:33 +02:00
Ivan Brusic 9e05b3260b Add boolean similarity to built in similarity types (#26613) 2017-09-13 13:58:30 +02:00
Jason Tedor b3e7e85cf1 Let search phases override max concurrent requests
If the query coordinating node is also a data node that holds all the
shards for a search request, we can end up recursing through the can
match phase (because we send a local request and on response in the
listener move to the next shard and do this again, without ever having
returned from previous shards). This recursion can lead to stack
overflow for even a reasonable number of indices (daily indices over a
sixty days with five shards per day is enough to trigger the stack
overflow). Moreover, all this execution would be happening on a network
thread (the thread that initially received the query). With this commit,
we allow search phases to override max concurrent requests. This allows
the can match phase to avoid recursing through the shards towards a
stack overflow.

Relates #26484
2017-09-13 06:16:27 -04:00
Christoph Büscher e00db235bc Add a soft limit for the number of requested doc-value fields (#26574)
Requesting to many docvalue_fields in a search request can potentially be costly
because it might incur a per-field per-document seek. This change introduces a
soft limit on the number of fields that can be retrieved. The setting can be
changed per index using the `index.max_docvalue_fields_search` setting.

Relates to #26390
2017-09-13 11:57:06 +02:00
Adrien Grand 04b24c7780 Fix Lucene version of 5.6.1. 2017-09-12 17:54:50 +02:00
Michael Basnight 0e57a416f1 Handle the 5.6.0 release 2017-09-12 09:48:09 -05:00
Simon Willnauer 42f3129d7b Allow plugins to validate cluster-state on join (#26595)
Today we don't have a pluggable way to validate if the cluster state
is compatible with the node that joins. We already apply some checks for index
compatibility that prevents nodes to join a cluster with indices it doesn't support
but for plugins this isn't possible. This change adds a cluster state validator that
allows plugins to prevent a join if the cluster-state is incompatible.
2017-09-12 15:32:33 +02:00
Yu 3d4e28aee1 Remove index mapper dynamic settings (#25734)
Remove "index.mapper.dynamic" setting for 6.0 (and after) indices, but
still keep working for 5.x (and before) indices. Remove two index
dynamic disable test cases as the disability of index.mapper.dynamic is
already removed for current version. Add a new test class for version
test.
2017-09-12 14:29:10 +02:00
Ryan Ernst 5c35bff1c3 Test: Remove leftover static bwc test case (#26584)
This test case was leftover from the static bwc tests. There was still
one use for checking we do not load old indices, but this PR moves the
legacy code needed for that directly into the test. I also opened a
follow up issue to completely remove the unsupported test: #26583.
2017-09-11 15:38:30 -07:00
Jason Tedor b2e4bfa0a7 Snapshot fallback should consider build.snapshot
When determining if a build is a snapshot build, we look for a field in
the JAR manifest. However, when running tests, we are not running with a
compiled core Elasticsearch JAR, we are running with the compiled core
classes on the classpath. We have a fallback for this, we always assume
such a situation is a snapshot build. However, when running builds with
-Dbuild.snapshot=false, this is not the case. As such, we need to
fallback to the value of build.snapshot. However, there are cases where
we are not running with a compiled core Elasticsearch JAR (e.g., when
the transport client is embedded in a web container) so we should only
do this fallback if we are in tests. To verify we are in tests, we check
if randomized runner is on the classpath.

Relates #26554
2017-09-11 07:42:11 -04:00
Jim Ferenczi c62b0192d0 #26496: Set the correct bwc version after backport to 6.x 2017-09-11 13:09:44 +02:00
Adrien Grand 1adee8b5a8 Fix the MapperFieldType.rangeQuery API. (#26552)
RangeQueryBuilder needs to perform too many `instanceof` checks in order to
check for `date` or `range` fields in order to know what it should do with the
shape relation, time zone and date format.

This commit adds those 3 parameters to the `rangeQuery` factory method so that
those instanceof checks are not necessary anymore.
2017-09-11 11:02:05 +02:00
Adrien Grand 2bc3eeccde Deduplicate `_field_names`. (#26550)
This is a minor optimization that should save some utf8 conversions and indexing.
2017-09-11 10:57:08 +02:00
Md.Abdulla-Al-Sun d00d18a36d [Docs] Fix typo in javadocs (#26556) 2017-09-09 22:25:31 +02:00
Lee Hinman 2702918780 Limit the number of expanded fields it query_string and simple_query_string (#26541)
* Limit the number of expanded fields it query_string and simple_query_string

This limits the number of automatically expanded fields for the "all fields"
mode (`"default_field": "*"`) for the `query_string` and `simple_query_string`
queries to 1024 fields.

Resolves #25105

* Add blurb about limit to the docs
2017-09-08 13:37:55 -06:00
Lee Hinman dd90cf1bbb Throw a better error message for empty field names (#26543)
* Throw a better error message for empty field names

When a document is parsed with a `""` for a field name, we currently throw a
confusing error about `.` being present in the field. This changes the error
message to be clearer about what's causing the problem.

Resolves #23348

* Fix exception message in test
2017-09-08 13:30:17 -06:00
Lee Hinman 4e43aac0f8 Expand "NO" decision message in NodeVersionAllocationDecider (#26542)
This explains the `NO` Decision a little more.

Resolves #10403
2017-09-08 09:18:34 -06:00
Antonio Matarrese 155db7326a _reroute's retry_failed flag should reset failure counter (#25888)
To protect against poisonous situations, ES will only try to allocate a shard 5 times (by default). After 5 consecutive failures, ES will stop assigning the shard and wait for an operator to fix the problem. Once the problem is fixed, the operator is expected to call `_reroute` with a `retry_failed` flag to force retrying of those shards. Currently that retry flag is only used for a single allocation run. However, if not all shards can be allocated at once (due to throttling) the operator has to keep on calling the API until all shards are assigned which is cumbersome. This PR changes the behavior of the flag to reset the failed allocations counter and this allowing shards to be assigned again.
2017-09-08 12:18:52 +02:00
Jim Ferenczi 3435c9f4e2 #26496: Fix sporadic failure of ContextCompletionSuggestSearchIT#testGeoBoosting
This test should not rely on strict ordering for same score suggestions.
The Lucene completion suggester uses the doc id in case of a tie and documents are indexed randomly.
2017-09-08 11:30:40 +02:00
Jason Tedor e3b0cc9867 Remove norelease regarding destroying history
This commit removes a norelease from the codebase now that there is a CI
job that fails on the norelease pattern being present. Instead, a new
issue has been opened to track this one.

Relates #26544
2017-09-07 21:57:08 -04:00
Jim Ferenczi e684c5e0a5 #26496: handle `shard_size` correctly in the completion suggester and tests.
The completion suggester has a `shard_size` option that sets the size of the suggestions to retrieve per shard but it is ignored
 by the builder. This commit restores the handling of this option and fixes a test that can randomly fail without it.
2017-09-07 18:22:28 +02:00
Lee Hinman cff904bf97 Enable adaptive replica selection by default (#26522)
Relates to #24915
2017-09-07 09:25:05 -06:00
Jim Ferenczi d68d8c9cef Expose duplicate removal in the completion suggester (#26496)
This change exposes the duplicate removal option added in Lucene for the completion suggester
with a new option called `skip_duplicates` (defaults to false).
This commit also adapts the custom suggest collector to handle deduplication when multiple contexts match the input.

Closes #23364
2017-09-07 17:11:01 +02:00
Jim Ferenczi abe83c4fac Fail query when a sort is provided in conjunction with rescorers (#26510)
This change fixes a regression introduced in 6 that removes the skipping of the rescore phase
when a sort other than _score is used.
We now fail the request when a sort is provided in conjunction with rescore instead of just skipping the rescore phase
This commit also adds an assert that checks if the topdocs are sorted by _score after the rescoring.
This is the responsibility of the rescorer to make sure that topdocs are sorted after rescore so we
just check that this condition is met in the rescore phase.
2017-09-07 14:17:37 +02:00
Christoph Büscher ba02485541 Make sure SortBuilders rewrite inner nested sorts (#26532)
The three SortBuilders that can have inner NestedSortBuilders currently don't
rewrite any of the filters contained in them. This change adds a rewrite method
to NestedSortBuilder and changes rewriting in FieldSortBuilder,
ScriptSortBuilder and GeoDistanceSortBuilder to make sure inner nested sorts get
rewritten if they need to.
2017-09-07 14:04:50 +02:00
Christoph Büscher 47ffa17efb Extend testing of build method in ScriptSortBuilder (#26520)
Improve testing around the ScriptSortBuilder#build method, adding checks for
correct transfers of the sort mode and nested sorts.

Also changing the behaviour around the nested_path, nested_filter vs. nested
parameter in a similar way as in #26490 and deprecating the setters/getters for
the old syntax.

Closes #17286
2017-09-07 10:37:50 +02:00
Ryan Ernst c9964d17bf Internal: Add versionless alias for rest client codebase in policy files (#26521)
Security manager policy files contains grants for specific codebases,
where a codebase is a jar file. We use a system property containing the
name of the jar file to resolve the jar file location when parsing the
policy file. However, this means the version of the jars must be
modified when versions of dependencies change. This is particularly
messy for elasticsearch, where we now have a dependency on the rest
client, and need to support both a snapshot version for testing and non
snapshot for release.

This commit adds an alias for the elasticsearch rest client without a
version to be used in policy files. That allows the policy files to not care whether
the rest client is a snapshot or release.
2017-09-06 18:57:10 -07:00
Lee Hinman fe02350e73 With too many incoming tasks, reset measurements to 1ns instead of 0ns
Resoves #26332 where too many tasks occurred while adjustment was happening, the
measurements were reset to 0, and then an assert failed due to tasks executing
in 0 nanoseconds
2017-09-06 15:34:51 -06:00
Jason Tedor 9c795bd838 Fix cache compute if absent for expired entries
When a cache entry expires, it remains in the cache (both the segment
that it belongs to, and the LRU list) until an eviction occurs. The
problem here is that the compute if absent implementation relies on
there not being an association to a key that we are trying to put
because it internally uses put if absent on the underlying segment. If
we try to put an association for a key that has expired but not been
evicted, then compute if absent will return as if there is nothing in
the cache for the given key, yet no call to compute if absent will
succeed in putting a new association for the key. To remedy this, we
modify the internal get method for the cache to let the caller take
action if the entry they are retrieving is expired. This allows the
compute if absent method to take the action of evicting the entry from
the cache, thus allowing the put if absent method used by compute if
absent to succeed for one of the callers trying to compute if absent a
new association.

Relates #26516
2017-09-06 13:44:20 -04:00
Jim Ferenczi 0c799eedc5 Add upper limit for scroll expiry (#26448)
This change adds a dynamic cluster setting named `search.max_keep_alive`.
It is used as an upper limit for scroll expiry time in scroll queries and defaults to 1 hour.
This change also ensures that the existing setting `search.default_keep_alive` is always smaller than `search.max_keep_alive`.

Relates #11511

* check style

* add skip for bwc

* iter

* Add a maxium throttle wait time of 1h for reindex

* review

* remove empty line
2017-09-06 10:06:48 +02:00
Christoph Büscher 1b49bf3079 Remove deprecated parameters from `ids_query` (#26508)
The `_type` and `types` version of the current `type` parameter have been
deprecated since 5.0. We can remove support for them in 7.0 and also in 6.x and
6.0.
2017-09-05 18:12:31 +02:00
Tim Brooks c1a20f7e48 Merge tsa with ts (#26369)
We currently have a weird relationship between Transport,
TransportService, and TransportServiceAdaptor. At some point I think
that we would like to collapse these all into one concept as we only
support TCP transports.

This commit moves in that direction by eliminating the adaptor and just
passing the transport service to the transport.
2017-09-05 09:15:56 -06:00
Christoph Büscher 760bd6c568 Extend testing of build method in GeoDistanceSortBuilder (#26498)
Improve testing around the GeoDistanceSortBuilder#build method, adding checks for correct
transfers of the sort order, mode, nested sorts and points validation and coercion.

Also changing the behaviour around the nested_path, nested_filter vs. nested parameter in
a similar way as in #26490 and deprecating the setters/getters for the old syntax.

Relates to #17286
2017-09-05 14:38:10 +02:00
Martijn van Groningen 78e9c96d7f
Added a limit to from + size in top_hits and inner hits.
Relates to #11511
2017-09-05 08:44:45 +02:00
Christoph Büscher 8f0369296f Prohibit using `nested_filter`, `nested_path` and new `nested` Option at the same time in FieldSortBuilder (#26490)
Currently we allow both "old" and "new" way of setting nested sorts on the
FieldSortBuilder at the same time. This should throw an error, instead the user
should choose one of the two possible options.

Also adding testing for the now deprecated nestedPath/nestedFilter parameters,
inlcuding checks that they emmit warnings on parsing and that the new
NestetedSortBuilder overwrites the deprecated parameters when building the
SortField.

Relates to #17286
2017-09-04 17:19:52 +02:00
Boaz Leskes 2fd4af82e4 Move `UNASSIGNED_SEQ_NO` and `NO_OPS_PERFORMED` to SequenceNumbers (#26494)
Where they better belong.
2017-09-04 16:31:00 +02:00
Alexander Reelsen 3706a16baf Docs: Update broken link to flake ids in uuid generators 2017-09-04 10:48:50 +02:00
Christoph Büscher f8fc0f3ebe [Tests] Check that quoteAnalyzer overrides analyzer in `query_string` query (#26473)
Adding a check to QueryStringQueryBuilderTests that checks the override
behaviour of `quote_analyzer`, also adding documentation explaining the use of
this parameter in `query_string` query.

Closes #25417
2017-09-02 11:53:02 +02:00
Jason Tedor 1757bd8d92 Prettify primary response in assertion message
We are getting the default Object#toString implementation here, we need
more than this. This commit instead formats the primary response to JSON
so we can see into its soul.
2017-09-01 19:25:06 -04:00
Tal Levy 9735e7d706 migrate some MasterNodeRequest subclasses to Writeable Readers (#26463)
migrate some MasterNodeRequest subclasses to Writeable Readers
2017-09-01 15:27:45 -07:00
Boaz Leskes 2d0997be16 Add version 6.0.0-rc1 2017-09-01 17:48:24 -04:00
Christoph Büscher c2853c8281 Remove old norelease comment, the test is okay as it is 2017-09-01 18:25:27 +02:00
Christoph Büscher 2d342c0830 [Tests] Add unit tests for NestedSortBuilder (#26458)
The new NestedSortBuilder currently is only tested via its use in the other
SortBuilder implementations it can be used in. This adds its own simple unit
test class that at first checks our usual fromXContent parsing, serialization
and hashCode/equals checks. It also adds tests for cases where NestedSortBuilder
is nested in itself and reuses the code for creating randomized instances in the
other SortBuilder tests.

In addition to the tests, this changes the `path` parameter in NestedSortBuilder
to be mandatory and removes the `read` method since it is not really needed.
2017-09-01 10:53:51 +02:00
Alexander Reelsen 80d0a32f8e ScriptService: Replace max compilation per minute setting with max compilation rate (#26399)
The current script service has a script compilation limit for a one
minute window. This is set to a small default value of 15. Instead of
increasing that default value, this commit introduces a new setting 
that allows to configure a rate per time unit, so that the script service can deal with bursts better.

The new setting is named `script.max_compilations_rate`,
requires a nonnegative number and a positive time value.

The default is `75/5m`, which is equivalent to the existing 15 per minute.
2017-09-01 10:15:27 +02:00
Jason Tedor 111defdfe1 Allow double aborts on bulk item requests
In some cases a request can already be aborted and retried. This means
the condition that aborting a request should only happen when an item
has not been processed yet is too strict. This commit allows for a
double abort. If we attempt to abort an operation that was previously
processed but not aborted, we treat that as a hard failure.

Relates #26434
2017-08-31 14:37:02 -04:00
Christoph Büscher 294d167973 Revert accidental deletion of cast needed for Java 9 2017-08-31 16:13:12 +02:00
Jason Tedor 697bc266ce Upgrade to Log4j 2.9.0
This commit upgrades the Log4j dependency from version 2.8.2 to version
2.9.0.

Relates #26450
2017-08-31 09:54:35 -04:00
Tim Vernum eb87df9ff9 Allow abort of bulk items before processing (#26434)
Adds support for bulk items to be aborted before they are processed by the TransportShardBulkAction.
This can be used by an ActionFilter to reject a subset of the items in a bulk action without rejecting the whole action (or all the items for a shard).
2017-08-31 21:23:14 +10:00
Christoph Büscher adad605081 [Tests] Improve testing of FieldSortBuilder (#26437)
Currently we don't have much unit testing about the SortField that is created then
calling the SortBuilders `build` method. Most of this is covered by integration tests
somewhere but it would be good to have some basic checks in FieldSortBuilderTest
as well.

This adds testing for the sort order, mode, missing values and checks that `nested` 
gets set in the XFieldComparatorSource when `nestedPath` and `nestedFilter` are 
set on the builder.

Relates to #17286
2017-08-31 12:15:09 +02:00
Adrien Grand 78681bc9e5 Upgrade to lucene-7.0.0-snapshot-d94a5f0. (#26441) 2017-08-31 09:06:40 +02:00
Lee Hinman c3da66d021 Implement adaptive replica selection (#26128)
* Implement adaptive replica selection

This implements the selection algorithm described in the C3 paper for
determining which copy of the data a query should be routed to.

By using the service time EWMA, response time EWMA, and queue size EWMA we
calculate the score of a node by piggybacking these metrics with each search
request.

Since Elasticsearch lacks the "broadcast to every copy" behavior that Cassandra
has (as mentioned in the C3 paper) to update metrics after a node has been
highly weighted, this implementation adjusts a node's response stats using the
average of the its own and the "best" node's metrics. This is so that a long GC
or other activity that may cause a node's rank to increase dramatically does not
permanently keep a node from having requests routed to it, instead it will
eventually lower its score back to the realm where it is a potential candidate
for new queries.

This feature is off by default and can be turned on with the dynamic setting
`cluster.routing.use_adaptive_replica_selection`.

Relates to #24915, however instead of `b=3` I used `b=4` (after benchmarking)

* Randomly use adaptive replica selection for internal test cluster

* Use an action name *prefix* for retrieving pending requests

* Add unit test for replica selection

* don't use adaptive replica selection in SearchPreferenceIT

* Track client connections in a SearchTransportService instead of TransportService

* Bind `entry` pieces in local variables

* Add javadoc link to C3 paper and javadocs for stat adjustments

* Bind entry's key and value to local variables

* Remove unneeded actionNamePrefix parameter

* Use conns.longValue() instead of cached Long

* Add comments about removing entries from the map

* Pull out bindings for `entry` in IndexShardRoutingTable

* Use .compareTo instead of manually comparing

* add assert for connections not being null and gte to 1

* Copy map for pending search connections instead of "live" map

* Increase the number of pending search requests used for calculating rank when chosen

When a node gets chosen, this increases the number of search counts for the
winning node so that it will not be as likely to be chosen again for
non-concurrent search requests.

* Remove unused HashMap import

* Rename rank -> rankShardsAndUpdateStats

* Rename rankedActiveInitializingShardsIt -> activeInitializingShardsRankedIt

* Instead of precalculating winning node, use "winning" shard from ranked list

* Sort null ranked nodes before nodes that have a rank
2017-08-30 20:55:11 -06:00
Tal Levy ed151d829d Migrate Search requests to use Writeable reading strategies (#26428)
Migrates many SearchRequest objects to use Writeable conventions and rejects usage of `readFrom` in these new classes.
2017-08-30 11:00:33 -07:00
Martijn van Groningen ea3fa768f9
Changed version from 7.0.0-alpha1 to 6.1.0 in the nested sorting serialization check. 2017-08-30 19:56:10 +02:00
Matt Weber 140395c83f Multi-level Nested Sort with Filters (#26395)
Multi-level Nested Sort with Filters

Allow multiple levels of nested sorting where each level can have it's own filter.
Backward compatible with previous single-level nested sort.
2017-08-30 18:52:56 +02:00
Martijn van Groningen c821dce3fe
Revert "Multi-level Nested Sort with Filters"
This reverts commit 6377afa6c3.
2017-08-30 14:53:25 +02:00
Martijn van Groningen 410c6c281a
Revert "Temporarily set bwc version for new nested sorting to 7.0.0-alpha1 until the change has been backported to 6.x branch."
This reverts commit 472a5dd56b.
2017-08-30 14:53:10 +02:00
Martijn van Groningen 472a5dd56b
Temporarily set bwc version for new nested sorting to 7.0.0-alpha1 until the change has been backported to 6.x branch. 2017-08-30 14:30:20 +02:00
Martijn van Groningen 6377afa6c3
Multi-level Nested Sort with Filters
Allow multple levels of nested sorting where each level
can have it's own filter.  Backward compatible with
previous single-level nested sort.
2017-08-30 14:30:20 +02:00
Colin Goodheart-Smithe ce1d85d7d0 Moves deferring code into its own subclass (#26421)
* Moves deferring code into its own subclass

This change moves the code that deals with deferring collection to a subclass of BucketAggregator called DeferringBucketAggregator. This means that the code in AggregatorBase is simplified and also means that the code for deferring colleciton is in one place and easier to maintain.

* Makes SIngleBucketAggregator an interface

This is so aggregators that extend BucketsAggregator directly and those that extend DeferringBucketAggregator can be a single bucket aggregator

* review comments

* More review comments
2017-08-30 11:15:40 +01:00
Adrien Grand 34a6c7af26 Consolidate locale parsing. (#26400)
Mappings and ingest have different locale parsing code.
2017-08-30 10:58:33 +02:00
Sergey Galkin c075323522 Refactor create index service to be unit testable
This commit refactors MetaDataCreateIndexService so that it is unit
testable.

Relates #25961
2017-08-29 16:55:44 -04:00
Jason Tedor 7a035f5f84 setgid on /etc/elasticearch on package install
When creating the keystore explicitly (from executing
elasticsearch-keystore create) or implicitly (for plugins that require
the keystore to be created on install) on an Elasticsearch package
installation, we are running as the root user. This leaves
/etc/elasticsearch/elasticsearch.keystore having the wrong ownership
(root:root) so that the elasticsearch user can not read the keystore on
startup. This commit adds setgid to /etc/elasticsearch on package
installation so that when executing this directory (as we would when
creating the keystore), we will end up with the correct ownership
(root:elasticsearch). Additionally, we set the permissions on the
keystore to be 660 so that the elasticsearch user via its group can read
this file on startup.

Relates #26412
2017-08-28 20:47:42 -04:00
Jim Ferenczi 86d97971a4 Remove the _all metadata field (#26356)
* Remove the _all metadata field

This change removes the `_all` metadata field. This field is deprecated in 6
and cannot be activated for indices created in 6 so it can be safely removed in
the next major version (e.g. 7).
2017-08-28 17:43:59 +02:00
Stuart Neivandt f842ff1ae1 Simple verification of the format of the language tag used in DateProcessor. (#25513)
Closes #26186
2017-08-28 10:59:00 +02:00
Adrien Grand d692ccf261 Reject IPv6-mapped IPv4 addresses when using the CIDR notation. (#26254)
It introduces ambiguity as to whether the prefix length should be interpreted as
a v4 prefix length or a v6 prefix length.

See https://issues.apache.org/jira/browse/LUCENE-7920.

Closes #26078
2017-08-28 10:04:05 +02:00
Adrien Grand 262ea9534f Make locale parsing less lenient. (#26361)
The `locale` field of `date` fields accepts almost any string and unknown
locales are simply ignored, which is trappy. We should fail on unknown languages
or countries.

This commit also makes `-` an accepted separator in addition to `_` since `-`
is the recommended separator (https://tools.ietf.org/html/rfc5646#section-2.1).
`_` is probably still worth supporting since it is the separator used by
`Locale#toString()`.
2017-08-28 09:59:25 +02:00
Adrien Grand 36e22bc30f Remove 5.x backcompat from synonym filters. 2017-08-28 09:56:01 +02:00
Adrien Grand eb782492be Remove support for lenient booleans.
Closes #22298
2017-08-28 09:56:01 +02:00
Alexander Reelsen bdf2c3c691 Script Stats: Add compilation limit counter to stats (#26387)
In order to know, when the script compilation limit has kicked in,
this commit adds a counter in the script stats to expose that
information.

So far the only way to find out about this was to check the logs
or check out responses of individual requests.
2017-08-28 09:51:49 +02:00
Adrien Grand 6eac3ee8ba Avoid hardcoded error message that depends on the current version in tests. (#26391)
It makes it painful to bump the current version.
2017-08-28 09:11:31 +02:00
Michael Basnight cfd14cd2b8 Revert shading for the low level rest client (#26367)
At current, we do not feel there is enough of a reason to shade the low
level rest client. It caused problems with commons logging and IDE's
during the brief time it was used. We did not know exactly how many
users will need this, and decided that leaving shading out until we
gather more information is best. Users can still shade the jar
themselves. For information and feeback, see issue #26366.

Closes #26328

This reverts commit 3a20922046.
This reverts commit 2c271f0f22.
This reverts commit 9d10dbea39.
This reverts commit e816ef89a2.
2017-08-25 14:13:12 -05:00
Ryan Ernst 3655f3f2a3 Test: Remove irrelevant access after close test for stream (#26392)
This commit removes the streams test for access after closing the bytes
stream. Output streams being closed mean they can no longer be written
to, but other methods to retrieve side state of the stream can still
make sense, such as bytes() in this case.

relates #12620
2017-08-25 11:30:37 -07:00
Nik Everett b3edd11aa0 Allow plugins to plug rescore implementations (#26368)
This allows plugins to plug rescore implementations into
Elasticsearch. While this is a fairly expert thing to do I've
done my best to point folks to the QueryRescorer as one that at
least documents the tradeoffs that it makes. I've attempted to
limit the API surface area by removing `SearchContext` from the
exposed interface, instead exposing just the IndexSearcher and
`QueryShardContext`. I also tried to make some of the class names
more consistent and do some general cleanup while I was there.

I entertained the notion of moving the `QueryRescorer` to module.
After all, it'd be a wonderful test to prove that you can plug
rescore implementation into Elasticsearch if the only built in
rescore implementation is in the module. But I decided against it
because the new module would require a client jar and it'd require
moving some more things around. I think if we really want to do
it, we should do it as a followup.

I did, on the other hand, create an "example" rescore plugin which
should both be a nice example for anyone wanting to plug in their
own rescore implementation and servers as a good integration test
to make sure that you can indeed plug one in.

Closes #26208
2017-08-25 13:46:57 -04:00
Jim Ferenczi 74cd32942a Handle leniency for phrase query on a field indexed without positions (#26388)
This change rewrite phrase query built on a field indexed without positions
to match_no_docs query when the `lenient` option is set to true.
This change affects all full text queries.
2017-08-25 16:41:01 +02:00
Yannick Welsch 0390c76f0a Remove reinitShadowPrimary (#26349)
With shadow replicas gone, there is no need to have this method anymore.
2017-08-25 10:37:51 +09:30
Tim Brooks 0551d2ff68 Move generic http settings out of netty module (#26310)
There is a group of five settings relating to raw tcp configurations
(no_delay, buffer sizes, etc) that we have for the http transport. These
currently live in the netty module. As they are unrelated to netty
specifically, this commit moves these settings to the
`HttpTransportSettings` class in core.
2017-08-24 19:27:56 -05:00
Ryan Ernst 5202e7e93b Settings: Move keystore creation to plugin installation (#26329)
This commit removes the keystore creation on elasticsearch startup, and
instead adds a plugin property which indicates the plugin needs the
keystore to exist. It does still make sure the keystore.seed exists on
ES startup, but through an "upgrade" method that loading the keystore in
Bootstrap calls.

closes #26309
2017-08-24 12:12:47 -07:00
Jay Modi 7fb716daab Resync replication action should be internal (#26345)
This commit renames the TransportResyncReplicationAction name to be an internal action as this is
not an action that should be invoked by a user, but is instead internal to the operation of the
system.
2017-08-24 11:04:30 -06:00
Colin Goodheart-Smithe c8ca015c0b Check bucket metric ages point to a multi bucket agg (#26215)
* Check bucket metric ages point to a multi bucket agg

This adds a validation step to the BucketMetricsPipelineAggregationBuilder which ensure that the first aggregation in the `buckets_path` is a multi-bucket aggregation. It does this using a new `MultiBucketAggregationBuilder` marker interface.

The change also moves the validate of pipeline aggregations to the `AggregatorFactories.build()` method so the validate can inspect sibling `AggregatorBuilder` objects rather than `AggregatorFactory` objects. Further it removes the validate from `AggregatorFactory` since this was never implemented and since aggregators only depend on their own internal state and not on other aggregators they should be validated ideally at setter time but in rare case where this is not possible the validation should be done in the `AggregationBuilder.build()` step.

Closes #25775

Move validate stage to happen during AggregatorFactories.Builder.build

Also removes validate method from normal aggs since it was never used.

* review comment fix
2017-08-24 12:05:03 +01:00
Jim Ferenczi c1ba860b71 #26320: Reset default setting after test 2017-08-23 16:05:52 +02:00
Jim Ferenczi de1e4e0c15 Accept an array of field names and boosts in the index.query.default_field setting (#26320)
* Accept an array of field names and boosts in the index.query.default_field setting

This commit allows to define an array of field names and boosts for the index setting `index.query.default_field`.
The format is equivalent to the `fields` options of the full text search queries (e.g. field_name^boost).
This commit also makes this setting dynamically updatable.

Fixes #25946
2017-08-23 15:39:54 +02:00
Colin Goodheart-Smithe c3cc8262a7 Migrates more ToXContentClasses (#26321)
* More XContent migrations

* Removes ToXContentToBytes

* Adds toString to classes that used to extend ToXContentToBytes

* use XContentHelper

* more review comments

* prettify tostring output
2017-08-23 08:17:32 +01:00
Jim Ferenczi 8b8c06398e remove Lucene class copies that are not needed anymore (#26325) 2017-08-23 09:02:00 +02:00
Yannick Welsch 4b813adf52 [TEST] Account for relocating primary in SearchWhileCreatingIndexIT
The test verifies that search on the primary works by executing a search with preference _primary. If the primary is relocating, however, it
does not take the primary relocation target into account. The test only makes sense, however, if balancing is not happening yet, i.e., the
cluster is not green.
2017-08-23 14:23:14 +09:30
Yannick Welsch 73dff6d21f Add workaround for Javadoc generation issues on JDK 9 b181
The javadoc tool on JDK 9 has issues with the combination of anonymous classes and varargs parameters.
This commit simply refactors a few anonymous classes to private inner classes.
2017-08-23 10:15:01 +09:30
Tal Levy 6ab4b6b0ac revamp TransportRequest handlers to support Writeable (#26315)
This PR begins the long journey to deprecating Streamable.

The idea here is to add additional method signatures that
support Writeable.Reader, so that the work to migrate objects TransportMessage to
implement Writeable and not Streamable.

One example conversion is done in this PR: SimulatePipelineRequest.
2017-08-22 15:47:05 -07:00
Jim Ferenczi 4756c9a884 Fix nested query highlighting (#26305)
This commit extracts the inner query in the ESToParentBlockJoinQuery for highlighting.
This query has been added in 5.4 and breaks plain highlighting on nested queries.
Highlighters that use postings or term vectors are not affected because they can't highlight nested documents correctly.

Fixes #26230
2017-08-22 11:36:45 +02:00
Yannick Welsch 3d8feff66e Use Java 9 FilePermission model (#26302)
This commit makes the security code aware of the Java 9 FilePermission changes (see #21534) and allows us to remove the `jdk.io.permissionsUseCanonicalPath` system property.
2017-08-22 11:22:00 +09:30
Andy Bristol bdefcbdcd6 reroute API: log messages from commands (#25955)
Gives allocation commands from the cluster reroute API
the ability to provide messages to be logged once the 
cluster state change has been committed. 

The purpose of this change is to create a record in the 
logs when allocation commands which could potentially
be destructive are applied. The allocate_empty_primary
and allocate_stale_primary commands are the only ones
that currently provide log messages.

Closes #22821
2017-08-21 17:09:40 -07:00
Jim Ferenczi a48616272f #26173: Removed global_ordinals_hash and global_ordinals_low_cardinality exeuction hint deprecated in 6.1 2017-08-21 20:44:34 +02:00
Jim Ferenczi 977dcfe789 Deprecate global_ordinals_hash and global_ordinals_low_cardinality (#26173)
* Deprecate global_ordinals_hash and global_ordinals_low_cardinality

This change deprecates the `global_ordinals_hash` and `global_ordinals_low_cardinality` and
makes the `global_ordinals` execution hint choose internally if global ords should be remapped or use the segment ord directly.
These hints are too sensitive and expert to be exposed and we should be able to take the right decision internally based on the agg tree.
2017-08-21 19:12:27 +02:00
Christoph Büscher 5dae277bb2 Support distance units in GeoHashGrid aggregation precision (#26291)
Currently the `precision` parameter must be a precision level
in the range of [1,12]. In #5042 it was suggested also supporting
distance units like "1km" to automatically approcimate the needed
precision level. This change adds this support to the Rest API by
making use of GeoUtils#geoHashLevelsForPrecision.

Plain integer values without a unit are still treated as precision
levels like before. Distance values that are too small to be represented
by a precision level of 12 (values approx. less than 0.056m) are
rejected.

Closes #5042
2017-08-21 17:29:28 +02:00
Christoph Büscher 4ff12c9a0b Throw exception in scroll requests using `from` (#26235)
The `from` search parameter cannot really be used in scrolled searches. This
commit adds a check for this case to the SearchRequest#validate() method so we
can reported it as an error rather than silently ignoring it.

Closes #9373
2017-08-21 15:12:34 +02:00
Boaz Leskes 181e881a0f enable testIssue8226
The linked issue has been long closed
2017-08-21 14:33:04 +02:00
Jim Ferenczi 8fd71a5d6d #26145 Fix test expectation with MatchNoDocsQuery 2017-08-21 14:17:43 +02:00
Jim Ferenczi 4bce727165 Refactor simple_query_string to handle text part like multi_match and query_string (#26145)
This change is a continuation of #25726 that aligns field expansions for the simple_query_string with the query_string and multi_match query.
The main changes are:

 * For exact field name, the new behavior is to rewrite to a matchnodocs query when the field name is not found in the mapping.

 * For partial field names (with * suffix), the expansion is done only on keyword, text, date, ip and number field types. Other field types are simply ignored.

 * For all fields (*), the expansion is done on accepted field types only (see above) and metadata fields are also filtered.

The use_all_fields option is deprecated in this change and can be replaced by setting `*` in the fields parameter.
This commit also changes how text fields are analyzed. Previously the default search analyzer (or the provided analyzer) was used to analyze every text part
, ignoring the analyzer set on the field in the mapping. With this change, the field analyzer is used instead unless an analyzer has been forced in the parameter of the query.

Finally now that all full text queries can handle the special "*" expansion (`all_fields` mode), the `index.query.default_field` is now set to `*` for indices created in 6.
2017-08-21 13:12:27 +02:00
Sergey Galkin 9a3216dfee Stricter validation for min/max values for whole numbers (#26137) 2017-08-21 12:16:45 +02:00
Antonio Matarrese 93cc2d0372 Configurable distance limit with the AUTO fuzziness. (#25731)
Make the distance thresholds configurable with the AUTO fuzziness.
2017-08-21 11:00:20 +02:00
Ryan Ernst 96b0d3e0cc Script: Convert script query to a dedicated script context (#26003)
This commit converts script query to use a new FilterScript context. The
new context returns a boolean, so the error that would have previously
happened at runtime if a non boolean was returned would now happen at
script compilation. Also, the leniency of supporting returning a number
and 0 mapping to false, non-zero to true is gone, but it was never
documented. With the new context compilation will now also fail if
special variables are used at compilation time, instead of runtime, eg
ctx.
2017-08-18 15:18:35 -07:00
Tim Brooks 5d7a78fcdb Use PlainListenableActionFuture for CloseFuture (#26242)
Right now we use a custom future for the CloseFuture associated with a
channel. This is because we need special unwrapping logic to ensure that
exceptions from a future failure are a certain type (opposed to an
UncategorizedException). However, the current version is limiting
because we can only attach one listener.

This commit changes the CloseFuture to extend the
PlainListenableActionFuture. This change allows us to attach multiple
listeners.
2017-08-18 13:38:38 -05:00
Andy Bristol 6eef6c4f7a [TEST] wait until reindex tasks ready for rethrottle (#26250)
When slices is set as auto, there's an additional network call
needed for the reindex tasks to know how to rethrottle. Sometimes
the rethrottle action happens before the reindex task is fully
initialized, so in the test we wait for the task to be ready.

This commit also adds some safeguards to ensure that
cancel and rethrottle operations are handled correctly

Closes #26192
2017-08-18 11:01:27 -07:00
Jason Tedor 8a7d48538e Add friendlier message on bad keystore permissions
If we do not have permissions to write the keystore, an unclear access
denied exception is thrown. This commit catches this exception so that
we can decorate it with a friendlier error message.

Relates #26284
2017-08-18 10:39:38 -04:00
Nik Everett 542fe864f8 Handle the 5.5.2 release
That looks to be as simple as adding the 5.5.3 version constant.
2017-08-17 20:08:44 -04:00
Lee Hinman f18ec511ca Disallow : in cluster and index/alias names (#26247)
We use `:` for cross-cluster search (eg `cluster:index`), therefore, we should
not allow the ambiguity when allowing cluster or index names.

Relates to #23892
2017-08-17 14:57:26 -06:00
Simon Willnauer e3cc24685d Persist created keystore on startup unless keystore is present (#26253)
We already added the functionality to create a new keystore on startup
in #26126 but apparently missed to persist the keystore. This change adds
peristence and adds a test for the boostrap loading.
2017-08-17 15:32:23 +02:00
Adrien Grand 15b7aeeb0f Remove back compat layer with 2.x indices. (#26245)
As of 6.0 we do not need to support 2.x indices.
2017-08-17 10:16:24 +02:00
Adrien Grand 22292e8d96 Add segment attributes to the `_segments` API. (#26157)
This contains information about whether high compression was enabled for instance.

Closes #26130
2017-08-16 19:01:29 +02:00
Colin Goodheart-Smithe a975f4e5d6 Moves more classes over to ToXContentObject/Fragment (#26234)
* Moves more classes over to ToXContentObject/Fragment

* review comments
2017-08-16 15:40:40 +01:00
Simon Willnauer 54bf7d78e8 Prevent cluster internal `ClusterState.Custom` impls to leak to a client (#26232)
Today a `ClusterState.Custom` can be fetched by a transport client and
leaks to the user even if the classes are private etc since the serialized
bytes can be reconstructed. This change adds an option to customs to mark
them as private such that our clusterstate action will never leak it.
2017-08-16 12:54:17 +02:00
Yannick Welsch ca6eaf9831 [TEST] Reenable RareClusterStateIt#testDeleteCreateInOneBulk
The AwaitsFix issue has been closed as the deleting an index and recreating with same name will give the
shard a fresh folder to be written to (based on the index uuid).
2017-08-16 15:41:11 +08:00
Yannick Welsch 01f6851691 Serialize and expose timeout of acknowledged requests in REST layer (#26189)
Due to the weird way of structuring the serialization code in AcknowledgedRequest, many request types forgot to properly serialize the request timeout, for example "index deletion", "index rollover", "index shrink", "putting pipeline", and other requests. This means that if those requests were not directly sent to the master node, the acknowledgement timeout information would be lost (and the default used instead).
Some requests also don't properly expose the timeout mechanism in the REST layer, such as put / delete stored script. This commit fixes all that.
2017-08-16 07:43:05 +08:00
desmorto 292dd8f992 (refactor) some opportunities to use diamond operator (#25585)
* (refactor) some opportunities to use diamond operator

* Update ExceptionRetryIT.java

update typo
2017-08-15 16:36:42 -06:00
Ryan Ernst b2d6ff9116 Settings: Add keystore.seed auto generated secure setting (#26149)
This commit adds a keystore.seed setting that is automatically
generated when the ES keystore is created. This setting may be used by
plugins as a secure, random value. This commit also auto creates the
keystore upon startup to ensure the new setting is always available.
2017-08-15 14:04:03 -07:00
Jason Tedor 1ff8334d26 Fix document field equals and hash code test
For the document field equals and hash code tests, we try to mutate the
document field to intentionally produce a document field not equal to
our provided one. We do this by randomly choosing a document field that
has either
 - a randomly chosen field name and the same field value as the provided
   document field
 - a randomly chosen field value and the same field value as the
   provided document field

If we are unlucky, it can be that the document field chosen by this
method can be equal to the provided document field. In this case, our
test will fail because the mutation really should be not equal. In this
case, we should simply try the other mutation. Note that random document
field produced by the second method can be equal to the provided
document because it has the same field name and we can get unlucky with
our randomly chosen field values. It is not the case that the random
document field produced by the first method can be equal to the provided
document field; this is because the current implementation guarantees
that the field name length will be different guaranteeing that we have a
different field name. Nevertheless, we fix the issue here by checking
that our random choice gives us a non-equal document field, and assert
that if we got unlucky the other one will work for us.
2017-08-15 14:11:13 -04:00
Jason Tedor d1780a8052 Use holder pattern for lazy deprecation loggers
In a few places we need to lazy initialize static deprecation
loggers. This is needed to avoid touching logging before logging is
configured, but deprecation loggers that are used in foundational
classes like settings and parsers would be initialized before logging is
configured. Previously we used a lazy set once pattern which is fine,
but there's a simpler approach: the holder pattern.

Relates #26218
2017-08-15 13:46:19 -04:00
Ryan Ernst 7ed501b230 Settings: Add keystore creation to add commands (#26126)
This commits changes the keystore cli add commands to prompt for
creating the keystore if it does not exist. This will make it easier on
users starting out, not having to run a separate command for creation.
2017-08-15 10:15:55 -07:00
Zachary Tong d26becc040 Fix NPE when `values` is omitted on percentile_ranks agg (#26046)
An array of values is required because there is no default (or
reasonable way to set a default).  But validation for values
only happens if it is actually set.  If the values param is omitted
entirely than the agg builder will NPE.
2017-08-15 13:09:15 -04:00
Simon Willnauer a9169e536b Several internal improvements to internal test cluster infra (#26214)
This chance adds several random test infrastructure improvements that caused
issues in on-going developments but are generally useful. For instance is it impossible
to restart a node with a secure setting source since we close it after the node is started.
This change makes it cloneable such that we can reuse it for a restart.
2017-08-15 17:42:15 +02:00
Jason Tedor 1331741d7c Fix typo in comment in o/e/b/Elasticsearch
This commit fixes a typo (missing word) in
org/elasticsearch/bootstrap/Elasticsearch.java.
2017-08-15 09:43:35 -04:00
Christoph Büscher 34610b841d Reject multiple methods in `percentiles` aggregation (#26163)
Currently the `percentiles` aggregation allows specifying both possible methods
in the query DSL, but only the later one is used. This changes it to rejecting
such requests with an error. Setting the method multiple times via the java API
still works (and the last one wins).

Closes #26095
2017-08-15 14:11:57 +02:00
Colin Goodheart-Smithe f6d14717ed Makes hashCode and equals in InternalAggregations abstract (#26216)
This simply removes the default identity hashcode and equals methods in InternalAggregation which where only temporarily put there while we implmeneted the methods in the subclasses.
2017-08-15 11:14:57 +01:00
Yannick Welsch 0127528d97 Register setting cluster.indices.tombstones.size (#26193)
The node setting `cluster.indices.tombstones.size` was not registered with the settings infrastructure, making it impossible for it to be set by a user.

Closes #26191
2017-08-15 09:21:38 +08:00
Yannick Welsch fe0c68ec8f Allow wildcards for shard IP filtering (#26187)
Fixes the broken usage of wildcards for IP-based allocation filtering (introduced by PR #22591), which is documented at https://www.elastic.co/guide/en/elasticsearch/reference/current/shard-allocation-filtering.html

Closes #26184
2017-08-15 09:16:53 +08:00
Jason Tedor 447d92e482 Allow not configure logging without config
For CLI tools, we configure logging without reading the
log4j2.properties file. This because any log statements in a CLI tool
should dump to the console while reading from the log4j2.properties file
would cause them to dump whereever the log configuration there indicates
(e.g., possibly a remote machine). To do this, we added some code to the
base implementation of all CLI tools to configure logging without a
config file. This code is also executed when Elasticsearch starts up. In
the past this was fine yet we previously added detection to
Elasticsearch to find cases where we use logging before it is
configured. Because of configuring logging without a config, this means
we only catch uses of logging before the logging without config is
performed. To correct this, we enable a CLI tool to skip enabling
logging without a config and then in the Elasticsearch CLI we indeed
utilize this to skip configuring logging without a config.

Relates #26209
2017-08-14 19:39:14 -04:00
Jason Tedor 685e35e0ae Fix DiskThresholdMonitor flood warning
The flood warning checks the wrong threshold, namely the high
watermark. This would impact any node for which the disk usage is above
the high watermark and below the flood stage watermark. This commit
fixes this so that it compares to the flood threshold.

Relates #26204
2017-08-15 00:22:27 +09:00
Jim Ferenczi d896e62703 Rewrite range queries with open bounds to exists query (#26160)
* Rewrite range queries with open bounds to exists query

This change rewrites range query with open bounds to an exists query that should be faster to execute.

Fixes #22640
2017-08-14 09:50:36 +02:00
Christoph Büscher 6e085c75af Fix eclipse compilation problem (#26170) 2017-08-13 19:19:12 +02:00
Albert Zaharovits 3e3132fe3f Epoch millis and second formats parse float implicitly (Closes #14641) (#26119)
`epoch_millis` and `epoch_second` date formats truncate float values, as numbers or as strings.
The `coerce` parameter is not defined for `date` field type and this is not changing.
See PR #26119

Closes #14641
2017-08-13 08:35:45 +03:00
Martijn van Groningen 1146a35870
Move more token filters to analysis-common module
The following token filters were moved: arabic_stem, brazilian_stem, czech_stem, dutch_stem, french_stem, german_stem and russian_stem.

Relates to #23658
2017-08-11 17:39:24 +02:00
Andy Bristol 7e3cd6a019 reindex: automatically choose the number of slices (#26030)
In reindex APIs, when using the `slices` parameter to choose the number of slices, adds the option to specify `slices` as "auto" which will choose a reasonable number of slices. It uses the number of shards in the source index, up to a ceiling. If there is more than one source index, it uses the smallest number of shards among them.

This gives users an easy way to use slicing in these APIs without having to make decisions about how to configure it, as it provides a good-enough configuration for them out of the box. This may become the default behavior for these APIs in the future.
2017-08-11 08:25:25 -07:00
Adrien Grand 73e936a065 Fix serialization of the `_all` field. (#26143)
By default we only serialize analyzers if the index analyzer is not the
`default` analyzer or if the `search_analyzer` is different from the index
`analyzer`. This raises issues with the `_all` field when the
`index.analysis.analyzer.default_search` is set, since it automatically makes
the `search_analyzer` different from the index `analyzer`. Then there are
exceptions since we expect the `_all` configuration to be empty on 6.0 indices.

Closes #26136
2017-08-11 17:11:18 +02:00
Adrien Grand 1011791f4f Remove SimpleQueryStringIT#testPhraseQueryOnFieldWithNoPositions.
This test does not make sense now that `_all` is gone.
2017-08-11 11:31:09 +02:00
Adrien Grand 93cfbe29e0 Tests: reenable ShardReduceIT#testIpRange. 2017-08-11 11:04:40 +02:00
Simon Willnauer 6f82b0c6e2 Allow `ClusterState.Custom` to be created on initial cluster states (#26144)
Today we have a `null` invariant on all `ClusterState.Custom`. This makes
several code paths complicated and requires complex state handling in some cases.
This change allows to register a custom supplier that is used to initialize the
initial clusterstate with these transient customs.
2017-08-11 09:51:49 +02:00
Martijn van Groningen 076167fbe5
inner hits: Unfiltered nested source should keep its full path
like filtered nested source.

Closes #23090
2017-08-10 15:58:29 +02:00
Adrien Grand 0bf8a354a0 Use `global_ordinals_hash` execution mode when sorting by sub aggregations. (#26014)
This is a safer default since sorting by sub aggregations prevents these
aggregations from being deferred. `global_ordinals_hash` will at least
make sure that we do not use memory for buckets that are not collected.

Closes #24359
2017-08-10 12:28:19 +02:00
Martijn van Groningen 0e5460324c
Removed static indices and repos and the scripts that create them.
Two tests were still using the static indices:
* IndexFolderUpgraderTests#testUpgradeRealIndex()
* InternalEngineTests#testUpgradeOldIndex()

I removed these tests too, because these tests functionally overlap
with the full-cluster-restart qa tests.

Relates to #24939
2017-08-10 09:52:29 +02:00
Colin Goodheart-Smithe 20b7258d41
[TEST] fixes mutate methods in aggs tests
Closes #26121
2017-08-10 07:05:46 +01:00
Christoph Büscher 566992d2a1 Tests: Fix failure in InternalGeoBoundsTests (#26112)
This occasionally fails now because if `top` is `-Infinity` (which we sometimes 
test for in randomization), the value might not get changed for the
equals/hashCode tests.

Closes #26107
2017-08-09 23:01:36 +02:00
Colin Goodheart-Smithe dfbaf90951 Adds ToXContentFragment (#25771)
* Adds ToXContentFragment

This interface is meant for objects that implement `ToXContent` but are not complete objects. It is basically the opposite of `ToXContentObject`. It means that it will be easier to track the migration of classes over to the fragment/not fragment ToXContent model as it will be clear which classes are not migrated. When no classes directly implement `ToXContent` we can make `ToXContent` package private to be sure that all new classes must implement `ToXContentObject` or `ToXContentFragment`.

* review comments

* more review comments

* javadocs

* iter

* Adds tests

* iter

* adds toString test for aggs

* improves tests following review comments

* iter

* iter
2017-08-09 15:53:30 +01:00
Sergey Galkin d8ff6e9831 Reject out of range numbers for float, double and half_float (#25826)
* validate half float values

* test upper bound for numeric mapper

* test for upper bound for float, double and half_float

* more tests on NaN and Infinity for NumberFieldMapper

* fix checkstyle errors

* minor renaming

* comments for disabled test

* tests for byte/short/integer/long removed and will be added in separate PR

* remove unused import

* Fix scaledfloat out of range validation message

* 1) delayed autoboxing in numbertype.parse(...)
2) no redudant checks in half_float validation
3) tests with negative values for half_float/float/double
2017-08-09 12:44:57 +01:00
Jim Ferenczi 15598f2174 #26097: Adapt version check for the new query option: auto_generate_synonyms_phrase_query 2017-08-09 13:19:08 +02:00
Albert Zaharovits b22147854b Workaround Eclipse Oxygen type inference error (#26001) 2017-08-09 13:36:23 +03:00
Jim Ferenczi a7e1610134 Add support for auto_generate_synonyms_phrase_query in match_query, multi_match_query, query_string and simple_query_string (#26097)
* Add support for auto_generate_synonyms_phrase_query in match_query, multi_match_query, query_string and simple_query_string

This change adds a new parameter called auto_generate_synonyms_phrase_query (defaults to true).
This option can be used in conjunction with synonym_graph token filter to generate phrase queries
when multi terms synonyms are encountered.
For example, a synonym like "ny, new york" would produce the following boolean query when "ny city" is parsed:
((ny OR "new york") AND city)

Note how the multi terms synonym "new york" produces a phrase query.
2017-08-09 12:15:09 +02:00
Zachary Tong 59c670cbfa
Add version 6.0.0-beta2 after release 2017-08-08 14:13:47 -04:00
Adrien Grand f0c1e30544 Upgrade to lucene-7.0.0-snapshot-a128fcb. (#26090) 2017-08-08 13:03:19 +02:00
Colin Goodheart-Smithe 18e0fb5b3f [TEST] Adds mutate method to more tests (#26094)
* Adds mutate method to more tests

Relates to #25929

* fixes tests
2017-08-08 11:31:45 +01:00
olcbean 5c4c1c5e15 Verify that _bulk and _msearch requests are terminated by a newline (#25740) 2017-08-08 10:45:44 +02:00
Simon Willnauer 82fa531ab4 Remove `_index` fielddata hack if cluster alias is present (#26082)
We introduced a hack in #25885 to respect the cluster alias if available on the `_index` field. This is important if aggregations or other field data related operations are executed. Yet, we added a small hack that duplicated an implementation detail from the `_index` field data builder to make this work. This change adds a necessary but simple API change that allows us to remove the hack and only have a single implementation.
2017-08-08 09:24:24 +02:00
Adrien Grand f0cba4fce5 Add a scripted similarity. (#25831)
The goal of this similarity is to help users who would like to keep the
functionality of the `tf-idf` similarity that we want to remove, or to allow
for specific usec-cases (disabling idf, disabling tf, disabling length norm,
etc.) to not have to build a custom plugin and familiarize with the low-level
Lucene API.
2017-08-08 08:55:12 +02:00
Christoph Büscher 0ad4c0529b Tests: Fix edge case in InternalBucketMetricValueTests
Same problem as in #26084.
2017-08-07 18:37:51 +02:00
Christoph Büscher 729e09ed6e Tests: Fix edge case in InternalSimpleValueTests (#26084)
When value is NaN, the mutate function might return a new instance that is
equal to the original one.

* Add same fix for InternalDerivativeTests
2017-08-07 18:30:18 +02:00
Colin Goodheart-Smithe a4ae8a9156 [TEST] Adds mutate function for all metric aggregation tests (#26056)
* Adds mutate function for all metric aggregation tests

Relates to #25929

* fixes tests

* fixes review comments

* Fixes cardinality equals method

* Fixes scripted metric test
2017-08-07 13:30:49 +01:00
Colin Goodheart-Smithe 8fda74aee1 Adds mutate function for all pipeline aggregation tests (#26058)
* Adds mutate function for all metric aggregation tests

 Relates to #25929

* Fixes review comments
2017-08-07 10:09:41 +01:00
Boaz Leskes e11cbed534 Adding a refresh listener to a recovering shard should be a noop (#26055)
When `refresh=wait_for` is set on an indexing request, we register a listener on the shards that are call during the next refresh. During the recover translog phase, when the engine is open, we have a window of time when indexing operations succeed and they can add their listeners. Those listeners will only be called when the recovery finishes as we do not refresh during recoveries (unless the indexing buffer is full). Next to being a bad user experience, it can also cause deadlocks with an ongoing peer recovery that may wait for those operations to mark the replica in sync (details below).

To fix this, this PR changes refresh listeners to be a noop when the shard is not yet serving reads (implicitly covering the recovery period). It doesn't matter anyway. 

Deadlock with recovery:

When finalizing a peer recovery we mark the peer as "in sync". To do so we wait until the peer's local checkpoint is at least as high as the global checkpoint. If an operation with `refresh=wait_for` is added as a listener on that peer during recovery, it is not completed from the perspective of the primary. The primary than may wait for it to complete before advancing the local checkpoint for that peer. Since that peer is not considered in sync, the global checkpoint on the primary can be higher, causing a deadlock. Operation waits for recovery to finish and a refresh to happen. Recovery waits on the operation.
2017-08-04 19:51:15 +02:00
Igor Motov c9bb686927 Snapshot/Restore: Update version of shard failure reason serialization
Updating the version in SnapshotsInProgress serialization method to reflect that #25941 was backported to 6.0.0-beta1.

Relates to #25878
2017-08-03 16:16:30 -04:00
Stuart Neivandt 8ef7438d6c Accept ingest simulate params as ints or strings (#23885)
* Allow ingest simulate to parse _id, _index, _type, _routing and _parent as either string or int (#23823)

* Generate data that includes Integer and String type fields for testing document parsing.
2017-08-03 11:29:21 -07:00
Colin Goodheart-Smithe 5f1634dff4 Fixes array out of bounds for value count agg (#26038)
https://github.com/elastic/elasticsearch/pull/17379 fixed many metric aggs so that if the parent aggregation does not collect any documents an empty bucket value is returned instead of an ArrayOutOfBoundsException being thrown. Unfortunately the value count aggregation was mised from this fix.

This change applies this fix from #17379 for the value count aggregation.
2017-08-03 10:19:14 +01:00
Colin Goodheart-Smithe aafd7f90fd [TEST] fix NPE when generating random query (#26023)
`ClusterSearchShardsResponseTests.testSerialization` randomly uses `IdsQueryBuilderTests` to generate an alias filter. `IdsQueryBuilderTests` shecks if the array of current types is length zero but it can also be null which causes a `NullPointerException`. This changes adds a null check to avoid the exception.

Closes #26021
2017-08-02 18:28:26 +01:00
Colin Goodheart-Smithe 87c6e63e73 Adds mutate function to various tests (#25999)
* Adds mutate function to various tests

Relates to #25929

* fix test

* implements mutate function for all single bucket aggs

* review comments

* convert getMutateFunction to mutateIInstance
2017-08-02 11:38:31 +01:00
Adrien Grand 88d456989e Make FieldMapper.copyTo() always non-null. (#25994)
Otherwise it is confusing that both a null copyTo and an empty copyTo should
be treated the same.
2017-08-02 10:07:29 +02:00
Adrien Grand 58feb5efa0 Fix `_exists_` in query_string on empty indices. (#25993)
It currently fails if there are no mappings yet.

Closes #25956
2017-08-02 10:06:34 +02:00
Luca Cavanna e2d25c3c89 [TEST] Remove duplicated main response unit test (#25855)
Also move MainResponseTets to extend AbstractStreamableXContentTestCase
2017-08-02 08:42:38 +02:00
Tim Brooks 0f4f49496f Use nio transport in test clusters (#25986)
This commit adds the nio transport as an option in place of the mock tcp
transport for tests. Each test will only use one transport type. The
transport type is decided by a random boolean generated inside of the
`ESTestCase` class.
2017-08-01 16:19:31 -05:00
Ryan Ernst 072281d5aa Update version to 7.0.0-alpha1 (#25876)
This commit updates the version for master to 7.0.0-alpha1. It also adds
the 6.1 version constant, and fixes many tests, as well as marking some
as awaits fix.

Closes #25893
Closes #25870
2017-08-01 15:47:48 -04:00
Adrien Grand e9669b3762 Better validation of `copy_to`. (#25983)
We are currently quite lenient about the targets of `copy_to`. However in a
number of cases we can detect illegal use of `copy_to` at mapping update time.
For instance, it does not make sense to use object fields as targets of
`copy_to`, or fields that would end up in a different nested document.
2017-08-01 16:23:28 +02:00
Boaz Leskes 9f1d116967 Node should start up despite of a lingering `.es_temp_file` (#21210)
When ES starts up we verify we can write to all data folders and that they support atomic moves. We do so by creating and deleting temp files. If for some reason the files was successfully created but not successfully deleted, we still shut down correctly but subsequent start attempts will fail with a file already exists exception.

This commit makes sure to first clean any existing temporary files.

Superseeds #21007
2017-08-01 15:41:27 +02:00
Tanguy Leroux 52c79629e2 QueryBuilders does not need to be abstract (#25982) 2017-08-01 10:39:21 +02:00
Luca Cavanna 4d589afbc2 AbstractQueryBuilder to no longer extend ToXContentBytes (#25948)
ToXContentToBytes is used as a base class that adds toString and buildAsBytes method implementation to classes that implement ToXContent. With the ongoing cleanups, this class is limited and doesn't add a lot of value, given that buildAsBytes can be replaced with XContentHelper.toXContent and toString can be replaced with Strings.toString(this).

The plan would be to remove ToXContentToBytes entirely, and AbstractQueryBuilder is the first place where we can remove its usage.
2017-07-31 17:38:24 +02:00
Boaz Leskes 9d10ffd547 Goodbye, Translog Views (#25962)
During peer recoveries, we need to copy over lucene files and replay the operations they miss from the source translog. Guaranteeing that translog files are not cleaned up has seen many iterations overtime. Back in the old 1.0 days, recoveries went through the Engine and actively prevented both translog cleaning and lucene commits. We then moved to a notion called Translog Views, which allowed the recovery code to "acquire" a view into the translog which is then guaranteed to be kept around until the view is closed. The Engine code was free to commit lucene and do what it ever it wanted without coordinating with recoveries. Translog file deletion logic was based on reference counting on the file level. Those counters were incremented when a view was acquired but also when the view was used to create a `Snapshot` that allowed you to read operations from the files. At some point we removed the file based counting complexity in favor of constructs on the Translog level that just keep track of "open" views and the minimum translog generation they refer to. To do so, Views had to be kept around until the last snapshot that was made from them was consumed. This was fine in recovery code but lead to [a subtle bug](https://github.com/elastic/elasticsearch/pull/25862) in the [Primary Replica Resyncer](https://github.com/elastic/elasticsearch/pull/25862). 

Concurrently, we have developed the notion of a `TranslogDeletionPolicy` which is responsible for the liveness aspect of translog files. This class makes it very simple to take translog Snapshot into account for keep translog files around, allowing people that just need a snapshot to just take a snapshot and not worry about views and such. Recovery code which actually does need a view can now prevent trimming by acquiring a simple retention lock (a `Closable`). This removes the need for the notion of a View.
2017-07-31 17:29:43 +02:00
honourednihilist 0848ffd52e Fixed bug that mapper_parsing_exception is thrown for numeric field with ignore_malformed=true when inserting "NaN", "Infinity" or "-Infinity" values (#25967) 2017-07-31 16:14:30 +02:00
Sam Cinco e0359e7331 Fix term(s) query for range field (#25918) 2017-07-31 16:01:01 +02:00
Martijn van Groningen 0b776a1de0
Move more token filters to analysis-common module
The following token filters were moved: delimited_payload_filter, keep, keep_types, classic, apostrophe, decimal_digit, fingerprint, min_hash and scandinavian_folding.

Relates to #23658
2017-07-31 15:15:04 +02:00
Jason Tedor 2ef0f8af38 Add max file size bootstrap check
This commit adds a bootstrap check for the maximum file size, and
ensures the limit is set correctly when Elasticsearch is installed as a
service on systemd-based systems.

Relates #25974
2017-07-31 21:01:47 +09:00
Jason Tedor 1afc9afcac Version option should display if snapshot
We have a command-line flag -V or --version that can be used to display
the version of Elasticsearch. However, the version that we display does
not contain whether or not the version is a snapshot build. This commit
changes the behavior here so that if the build is a snapshot, that is
included in the version string.

Relates #25970
2017-07-31 11:45:06 +09:00
Jason Tedor 9267048878 Remove dead code for checking exclusive options
Previously we manually checked if mutually exclusive options are passed
on the command line. Yet, after an upgrade to our option parser
dependency, we were able to use built-in functionality to establish
these mutually exclusive options and the parser would take care of
checking if such options are passed on the command line. However, the
previous manually checking code is now dead and was left behind. This
commit removes that dead code.

Relates #19278
2017-07-31 10:00:31 +09:00
Jason Tedor b54886d502 Fix typo in Elasticsearch help
This commit fixes a small typo in the help output displayed by
Elasticsearch when the --help flag is passed.
2017-07-31 09:56:47 +09:00
Jason Tedor 4c37335f1d Format CLI error message when es.path.conf not set
This commit adds some formatting to the message displayed when
es.path.conf is not set.
2017-07-30 09:49:55 +09:00
Jim Ferenczi 636748e270 [Test] Make sure the same exception is thrown for every test run. Fixes #25952 2017-07-28 19:02:58 +02:00
Igor Motov fe46ef393b Snapshot/Restore: Ensure that shard failure reasons are correctly stored in CS (#25941)
The failure reason for snapshot shard failures might not be propagated properly if the master node changes after the errors were reported by other data nodes. This commits ensures that the snapshot shard failure reason is preserved properly and adds workaround for reading old snapshot files where this information might not have been preserved.

Closes #25878
2017-07-28 12:28:02 -04:00
Martijn van Groningen 7c3735bdc4
percolator: Store the QueryBuilder's Writable representation instead of its XContent representation.
The Writeble representation is less heavy to parse and that will benefit percolate performance and throughput.

The query builder's binary format has now the same bwc guarentees as the xcontent format.

Added a qa test that verifies that percolator queries written in older versions are still readable by the current version.
2017-07-28 12:24:10 +02:00
Yannick Welsch 1a01514081 Move tribe to a module (#25778)
This commit moves tribe to a module, stripping core from the tribe functionality.
2017-07-28 11:23:50 +02:00
Jim Ferenczi 562c3744ca Merge FunctionScoreQuery and FiltersFunctionScoreQuery (#25889)
This change merges the functionality of the FiltersFunctionScoreQuery in the FunctionScoreQuery.
It also ensures that an exception is thrown when the computed score is equals to Float.NaN or Float.NEGATIVE_INFINITY.
These scores are invalid for TopDocsCollectors that relies on score comparison.

Fixes #15709
Fixes #23628
2017-07-28 09:22:20 +02:00
Jason Tedor 1492ccd7ae Fix environment-aware command tests
This commit fixes tests for environment-aware commands. A previous
change added a check that es.path.conf is not null. The problem is that
this system property is not being set in tests so this check trips every
single time. To fix this, we move the check into a method that can be
overridden, and then override this method in relevant places in tests to
avoid having to set the property in tests. We also add a test that this
check works as expected.
2017-07-28 14:37:04 +09:00
Jason Tedor c1ee65f990 Remove unused imports from EnvironmentAwareCommand
This commit removes two unused imports from EnvironmentAwareCommand that
were left behind after a previous change.
2017-07-28 12:20:18 +09:00
Jason Tedor 8639bf4a1a Pass config path as a system property
A previous change enabled it so that users could configure the
configuration path via a command-line option --path.conf. However, a
subsequent change has made it so that we expect users to set the
configuration path via the environment variable CONF_DIR. To enable
this, we now pass the value of CONF_DIR as the value for the
command-line option --path.conf. This has two problems:
 - the presence of --path.conf always being on the command line breaks
   other flags like --help for multi-commands
 - the scripts for which --help is not broken say that you can pass
   --path.conf but this is a lie since passing it will make it appear
   twice in the command-line arguments breaking the script

Since --path.conf is no longer the way that we want users to set the
configuration path, we should remove the --path.conf option. However, we
still need a way to get the configuration path from the scripts to the
running Java process. To do this, we now pass the configuration path as
a system property. This keeps it off the script command line fixing the
above problems.

The only remaining question (that I can see) is whether or not to
respect -Des.path.conf=<some path> if the user sets this in their
jvm.options or via ES_JAVA_OPTS. I think that we should not do this (as
has been our tradition), es.path.home and es.path.conf are special,
should be set by our scripts only so users should not be setting them at
all so we should not take any effort to respect these flags if the user
tries to otherwise use them.

Relates #25943
2017-07-28 12:15:22 +09:00
Yannick Welsch efd79882a2 Allow build to directly run under JDK 9 (#25859)
With Gradle 4.1 and newer JDK versions, we can finally invoke Gradle directly using a JDK9 JAVA_HOME without requiring a JDK8 to "bootstrap" the build. As the thirdPartyAudit task runs within the JVM that Gradle runs in, it needs to be adapted now to be JDK9 aware.

This commit also changes the `JavaCompile` tasks to only fork if necessary (i.e. when Gradle's JVM and JAVA_HOME's JVM differ).
2017-07-27 16:14:04 +02:00
Yannick Welsch 020ba41c5d Close translog view after primary-replica resync (#25862)
The translog view was being closed too early, possibly causing a failed resync. Note: The bug only affects unreleased code.

Relates to #24841
2017-07-27 14:36:51 +02:00
Yannick Welsch 620536f850 Release operation permit on thread-pool rejection (#25930)
At the shard level we use an operation permit to coordinate between regular shard operations and special operations that need exclusive access. In ES versions < 6, the operation requiring exclusive access was invoked during primary relocation, but ES versions >= 6 this exclusive access is also used when a replica learns about a new primary or when a replica is promoted to primary.

These special operations requiring exclusive access delay regular operations from running, by adding them to a queue, and after finishing the exclusive access, release these operations which then need to be put back on the original thread-pool they were running on. In the presence of thread pool rejections, the current implementation had two issues:

- it would not properly release the operation permit when hitting a rejection (i.e. when calling ThreadedActionListener.onResponse from IndexShardOperationPermits.acquire).
- it would not invoke the onFailure method of the action listener when the shard was closed, and just log a warning instead (see ThreadedActionListener.onFailure), which would ultimately lead to the replication task never being cleaned up (see #25863).

This commit fixes both issues by introducing a custom threaded action listener that is permit-aware and properly deals with rejections.

Closes #25863
2017-07-27 14:15:00 +02:00
Adrien Grand 1cd5e3413d Caching a MinDocQuery can lead to wrong results. (#25909)
Queries are supposed to be cacheable per segment, yet matches of this query
also depend on how many documents exist on previous segments.
2017-07-27 11:19:20 +02:00
Adrien Grand 876c7e0400 Fix random score generation when no seed is provided. (#25908)
It fixes random score generation to ensure that you will not always get the
same scores on a read-only index by integrating the seed into the score
computation when using doc ids. It also removes `ctx.docBase` from the formula
since it might change over time if deletes are compacted while scores are
supposed to be cacheable per segment.
2017-07-27 11:17:56 +02:00
Martijn van Groningen edad7b4737
Add support for selecting percolator query candidate matches containing range queries.
Extracts ranges from range queries on byte, short, integer, long, half_float, scaled_float, float, double, date and ip fields.
byte, short, integer and date ranges are normalized to Lucene's LongRange.
half_float and float are normalized to Lucene's DoubleRange.

When extracting range queries, the QueryAnalyzer computes the width of the range.  This width is used to determine
what range should be preferred in a conjunction query. The QueryAnalyzer prefers the smaller ranges, because these
ranges tend to match with less documents.

Closes #21040
2017-07-26 21:25:45 +02:00
Simon Willnauer b72c71083c Cleanup IndexFieldData visibility (#25900)
Today we expose `IndexFieldDataService` outside of IndexService to do maintenance
or lookup field data in different ways. Yet, we have a streamlined way to access IndexFieldData
via `QueryShardContext` that should encapsulate all access to it. This also ensures that we control all other functionality like cache clearing etc.

This change also removes the `recycler` option from `ClearIndicesCacheRequest` this option is a no-op and should have been removed long ago.
2017-07-26 20:03:42 +02:00
Tim Brooks 6d02b45f10 Support client-only mode for NioTransport (#25839)
Currently, NioTransport does start normal socket selectors and the
client when the network server setting is set to false. This commit
makes it so that the client will be started even when the network server
is not enabled.

Additionally, it randomly introduces the NioTransport as an option for
the MockTransportClient throughout tests.
2017-07-26 10:27:15 -05:00
Boaz Leskes 03eb1460ad MasterNodeChangePredicate should use the node instance to detect master change (#25877)
This predicate is used to deal with the intricacies of detecting when a master is reelected/nodes rejoins an existing master. The current implementation is based on nodeIds, which is fine if the master really change. If the nodeId is equal the code falls back to detecting an increment in the cluster state version which happens when a node is re-elected or when the node rejoins. Sadly this doesn't cover the case where the same node is elected after a full restart of all master nodes. In that case we recover the cluster state from disk but the version is reset back to 0. To fix this, the check should be done based on ephemeral IDs which are reset on restart.

Fixes #25471
2017-07-26 17:02:42 +02:00
Luca Cavanna d8203f19fd Remove XContentHelper#toString(ToXContent) in favour of Strings#toString(ToXContent) (#25866)
These two methods do do the same thing. The subtle difference between the two is that the former prints out pretty printed content by default while the latter doesn't. There are way more usages of the latter throughout the codebase hence I kept that variant although I do think that it would be much better to print out prettified content by default from a `toString`. That breaks quite some tests so I didn't make that change yet.

Also XContentHelper#toString was outdated as it didn't check the ToXContent#isFragment method to decide whether a new anonymous object has to be created or not. It would simply fail with any ToXContentObject.
2017-07-26 16:00:59 +02:00
Boaz Leskes 26e82610b7 testWaitForPendingSeqNo didn't properly wait for all pending ops to "stuck"
The test only waited for one op to be stuck. In rare occasions the other ops were still in flight when recovery captured a translog snapshot throwing doc count off.
2017-07-26 13:49:15 +02:00
Boaz Leskes 015424d9f4 Test: indexOnReplicaWithGaps should randomly add a gap at the end
This confuses assertion because if it's the only gap, it looks like one operation less is indexed and there are no gaps at all.
2017-07-26 13:28:39 +02:00
Michael Basnight 9d10dbea39 Fix rest client causing jarHell for gradle 3.5+ (#25892)
The configuration removed from the runtime configuration did not
properly remove the deps jar from gradle versions > 3.3. The rest client
now removes both the 3.3 and 3.3+ configurations so this works on both
versions of gradle.

Closes #25884
Relates #25208
2017-07-26 11:25:25 +02:00
Simon Willnauer ca4f77039c Remove unused member in IndicesService 2017-07-26 10:20:40 +02:00
Simon Willnauer 4baf7a9e50 Remove unnecessary imports 2017-07-26 10:08:23 +02:00
Simon Willnauer 634ce90dc0 Respect cluster alias in `_index` aggs and queries (#25885)
Today when we aggregate on the `_index` field the cross cluster search
alias is not taken into account. Neither is it respected when we search
on the field. This change adds support for cluster alias when the cluster
alias is present on the `_index` field.

Closes #25606
2017-07-26 09:16:52 +02:00
Scott Somerville 2f8def11b5 Coerce decimal strings for whole number types by truncating the decimal part (#25835)
This changes makes it so you can index a value like "1.0" or "1.1" into whole
number field types like byte and integer. Without this change then the above
values would have resulted in an error, even with coerce set to true.

Closes #25819
2017-07-26 08:21:42 +02:00
Lee Hinman 6e79062078 Add 5.5.2 Version and 5.5.1 BWC indices 2017-07-25 10:58:39 -06:00
Adrien Grand 315319b763 Remove assertion about deviation when casting to a float. (#25806)
We cannot guarantee that the result of computations will be in the float range,
since it depends on the data and how scores are computed. We already use doubles
as intermediate representations and cast to a float as a final step, which is
the right thing to do. Small doubles will just be rounded to zero, there is not
much we can or should do about it.

Closes #25330
2017-07-25 15:07:45 +02:00
Martijn van Groningen a9ae52e78b
inner hits: Only access stored fields when needed
Stored fields were still being accessed for nested inner hits even if the _source was not requested.
This was done to figure out the id of the root document. However this is already known higher up the stack.
So instead this change adds the id to the nested search context, so that it is no longer required to be fetched via the stored fields.

In case the _source is large and no source is requested then hot threads like these ones would still appear:

```
100.3% (501.3ms out of 500ms) cpu usage by thread 'elasticsearch[AfXKKfq][search][T#6]'
     2/10 snapshots sharing following 22 elements
       org.apache.lucene.store.DataInput.skipBytes(DataInput.java:352)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.skipField(CompressingStoredFieldsReader.java:246)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:601)
       org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
       org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
       org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:347)
       org.elasticsearch.search.fetch.FetchPhase.createNestedSearchHit(FetchPhase.java:219)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:150)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:422)
```

and:

```
8/10 snapshots sharing following 27 elements
       org.apache.lucene.codecs.compressing.LZ4.decompress(LZ4.java:135)
       org.apache.lucene.codecs.compressing.CompressionMode$4.decompress(CompressionMode.java:138)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState$1.fillBuffer(CompressingStoredFieldsReader.java:531)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader$BlockState$1.readBytes(CompressingStoredFieldsReader.java:550)
       org.apache.lucene.store.DataInput.readBytes(DataInput.java:87)
       org.apache.lucene.store.DataInput.skipBytes(DataInput.java:350)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.skipField(CompressingStoredFieldsReader.java:246)
       org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.visitDocument(CompressingStoredFieldsReader.java:601)
       org.apache.lucene.index.CodecReader.document(CodecReader.java:88)
       org.apache.lucene.index.FilterLeafReader.document(FilterLeafReader.java:411)
       org.elasticsearch.search.fetch.FetchPhase.loadStoredFields(FetchPhase.java:347)
       org.elasticsearch.search.fetch.FetchPhase.createNestedSearchHit(FetchPhase.java:219)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:150)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.fetch.subphase.InnerHitsFetchSubPhase.hitsExecute(InnerHitsFetchSubPhase.java:73)
       org.elasticsearch.search.fetch.FetchPhase.execute(FetchPhase.java:166)
       org.elasticsearch.search.SearchService.executeFetchPhase(SearchService.java:422)
```
2017-07-25 12:10:59 +02:00
Yannick Welsch 7e08753bd2 [TEST] Set proper version on InputStream 2017-07-25 10:11:06 +02:00
Boaz Leskes cd508555f9 Engine.close should only return when resources are freed (#25852)
Currently Engine.close can return immediately if the engine is already at the process of shutting down (due to a concurrent close call or an engine failure). This is a shame because some of our testing infra wants to do things like checking the index. This commit changes the logic to make sure that all calls to close wait until resources are freed. Failing the engine is still non blocking.

Fixes #25817
2017-07-25 08:08:44 +02:00
Michael Basnight e816ef89a2 Shade external dependencies in the rest client jar
This commit removes all external dependencies from the rest client jar
and shades them in an 'org.elasticsearch.client' package within the jar
using shadowJar gradle plugin. All projects that depended on the
existing jar have been converted to using the 'org.elasticsearch.client'
package prefixes to interact with the rest client.

Closes #25208
2017-07-24 12:55:43 -05:00
Jim Ferenczi 4a9995145c [Docs]: Clarify query_string parser splits on operator 2017-07-24 18:36:16 +02:00
Boaz Leskes 17714acb9e add debug logging to SpecificMasterNodesIT
Chasing https://github.com/elastic/elasticsearch/issues/25471

Also beefed up tests in TransportMasterNodeActionTests trying to simulate possible failures
2017-07-24 18:33:44 +02:00
Jim Ferenczi 3a59b6a16c Context suggester should filter doc values field (#25858)
The context suggester extracts the context field values from the document but it does not filter doc values field coming from Keyword field.
This change filters doc values field when building the context values.

Fixes #25404
2017-07-24 17:45:01 +02:00
Jim Ferenczi 2f8f440e80 #25851: Fix ParentFieldMapper.toXContent to print eager_global_ordinals only when it is set to false 2017-07-24 15:03:05 +02:00
Jim Ferenczi d73e17c103 SpanNearQueryBuilder should return the inner clause when a single clause is provided (#25856)
This change handles the case where a SpanNearQueryBuilder tries to create a query with a single clause.
This is not allowed in the SpanNearQuery so instead of throwing an exception when the weight is built, this change builds and returns
the singleton inner clause on toQuery.

Fixes #25630
2017-07-24 13:24:29 +02:00
Jim Ferenczi 93b04fb7bd The default _parent field should not try to load global ordinals (#25851)
The default _parent field tries to load global ordinals because it is created with eager_global_ordinals=true.
This leads to an IllegalStateException because this field does not have doc_values.
This change explicitely sets eager_global_ordinals to false in order to avoid the ISE on startup.

Fixes #25849
2017-07-24 13:07:19 +02:00
Boaz Leskes c72fc55283 adapt testDoubleDeliveryReplicaAppendingOnly to #25827 2017-07-22 08:46:21 +02:00
Boaz Leskes d21ad9b652 fix compilation 2017-07-21 20:16:58 +02:00
Boaz Leskes ab1636d547 Engine - do not index operations with seq# lower than the local checkpoint into lucene (#25827)
When a replica processes out of order operations, it can drop some due to version comparisons. In the past that would have resulted in a VersionConflictException being thrown and the operation was totally ignored. With the seq# push, we started storing these operations in the translog (but not indexing them into lucene) in order to have complete op histories to facilitate ops based recoveries. This in turn had the undesired effect that deleted docs may be resurrected during recovery in some extreme edge situation (see a complete explanation below). This PR contains a simple fix, which is also an optimization for the recovery process, incoming operation that have a seq# lower than the current local checkpoint (i.e., have already been processed) should not be indexed into lucene. Note that sometimes we can also skip storing them in the translog, but this is not required for the fix and is more complicated.

This is the equivalent of #25592

## More details on resurrected ops 

Consider two operations: 
 - Index d1, seq no 1
 - Delete d1, seq no 3

On a replica they come out of order:
 - Translog gen 1 contains:
    - delete (seqNo 3)
 - Translog gen 2 contains:
    - index (seqNo 1) (wasn't indexed into lucene, but put into the translog)
    - another operation (seqNo 10)
 - Translog gen 3 
    - another op (seqNo 9)
 - Engine commits with:
    - local checkpoint 9
    - refers to gen 2 

If this replica becomes a primary:
    - Local recovery will replay translog gen 2 and up, causing index #1 to be re-index. 
    - Even if recovery will start at gen 3, the translog retention policy will cause file based recovery to replay the entire translog. If it happens to start at gen 2 (but not 1), we will run into the same problem.

#### Some context - out of order delivery involving deletes:

On normal operations, this relies on the gc_deletes setting. We assume that the setting represents an upper bound on the time between the index and the delete operation. The index operation will be detected as stale based on the tombstone map in the LiveVersionMap.

Recovery presents a challenge as it can replay an old index operation that was in the translog and override a delete operation that was done when the engine was opened (and is not part of the replayed snapshot). To deal with this situation, we disable GC deletes (i.e. retain all deletes) for the duration of recoveries. This means that the delete operation will be remembered and the index operation ignored.

Both of the above scenarios (local recover + peer recovery) create a situation where the delete operation is never replayed. It this "lost" as lucene doesn't remember it happened and our LiveVersionMap is populated with it.

#### Solution:

Note that both local and peer recovery represent a scenario where we replay translog ops on top of an existing lucene index, potentially with ongoing indexing. Therefore we can treat them the same.

The local checkpoint in Lucene represent a marker indicating that all operations below it were performed on the index. This is the only form of "memory" that we have that relates to deletes. If we can achieve the following:
1) All ops below the local checkpoint are not indexed to lucene.
2) All ops above the local checkpoint are

It will mean that all  variants are covered: (i# == index op seq#, d# == delete op seq#, lc == local checkpoint in commit)
1) i# < d# <= lc - document is already deleted in lucene and stays that way.
2) i# <= lc < d# - delete is replayed on index - document is deleted
3) lc < i# < d# - index is replayed and then delete - document is deleted.

More formally - we want to make sure that for all ops that performed on the primary o1 and o2, if o2 is processed on a shard before o1, o1 will be dropped. We have the following scenarios

1) If both o1 or o2 are not included in the replayed snapshot and are above it (i.e., have a higher seq#), they fall under the gc deletes assumption.
2) If both o1 is part of the replayed snapshot but o2 is above it:
	- if o2 arrives first, o1 must arrive due to the recovery and potentially via replication as well. since gc deletes is disabled we are guaranteed to know of o2's existence.
3) If both o2 and o1 are part of the replayed snapshot:
	- we fall under the same scenarios as #2 - disabling GC deletes ensures we know of o2 if it arrives first.
4) If o1 falls before the snapshot and o2 is either part of the snapshot or higher:
	- Since the snapshot is guaranteed to contain all ops that are not part of lucene and are above the lc in the commit used, this means that o1 is part of lucene and o1 < local checkpoint. This means it won't be processed and we're not in the scenario we're discussing.
5) If o2 falls before the snapshot but o1 is part of it:
	- by the same reasoning above, o2 is < local checkpoint. Since o1 < o2, we also get o1 < local checkpoint and this will be dropped.


#### Implementation:

For local recovery, we can filter the ops we read of the translog and avoid replaying them. For peer recovery this is tricky as we do want to send the operations in order to have some history on the target shard. Filtering operations on the engine level (i.e., not indexing to lucene if op seq# <= lc) would work for both.
2017-07-21 17:19:54 +02:00
Jim Ferenczi c3784326eb Refactor field expansion for match, multi_match and query_string query (#25726)
This commit changes the way we handle field expansion in `match`, `multi_match` and `query_string` query.
 The main changes are:

- For exact field name, the new behavior is to rewrite to a matchnodocs query when the field name is not found in the mapping.

- For partial field names (with `*` suffix), the expansion is done only on `keyword`, `text`, `date`, `ip` and `number` field types. Other field types are simply ignored.

- For all fields (`*`), the expansion is done on accepted field types only (see above) and metadata fields are also filtered.

- The `*` notation can also be used to set `default_field` option on`query_string` query. This should replace the needs for the extra option `use_all_fields` which is deprecated in this change.

This commit also rewrites simple `*` query to matchalldocs query when all fields are requested (Fixes #25556). 

The same change should be done on `simple_query_string` for completeness.

`use_all_fields` option in `query_string` is also deprecated in this change, `default_field` should be set to `*` instead.

Relates #25551
2017-07-21 16:52:57 +02:00
Boaz Leskes 47f92d7c62 testRejectingJoinWithIncompatibleVersion(WithUnrecoveredState) should use immediate priorities
That will prevent race conditions with the join task, causing failures.
2017-07-21 16:43:18 +02:00
Yannick Welsch a2624dfcef Move primary term from ReplicationRequest to ConcreteShardRequest (#25822)
Removes the primary term from the replication request and pushes it into the transport envelope. This makes it possible to remove the term from the ReplicationOperation universe. The primary term that is to be used for a replication operation is now determined in the reroute phase when the node decides to execute a primary action (and validated once the primary action gets to execute). This makes it possible to validate that the primary action was sent to the correct primary shard instance that it was meant to be sent to (currently we only validate primary actions using the allocation id, which can be reused for failed and reallocated primaries).
2017-07-21 15:57:42 +02:00
Yannick Welsch d6a8984be6 Make sure shard is not closed when updating local checkpoint
If a primary shard is relocated, and then subsequently closed, there is a short window where ReplicationOperation could access the
closed shard (engine is not shut down yet) and, because it does not know that the shard was relocated, try to update the local
checkpoint, tripping an assertion in GlobalCheckPointTracker that a local checkpoint cannot be updated if it's not in primary mode.
2017-07-21 14:27:39 +02:00
Simon Willnauer 682abb90ee [TEST] Rename variable to make it less confusing 2017-07-21 13:02:33 +02:00
Yannick Welsch fd57101952 Make sure shard is not closed when accessing ReplicationGroup 2017-07-21 11:45:24 +02:00
Simon Willnauer 0e3ad522a2 Rewrite search requests on the coordinating nodes (#25814)
This change rewrites search requests on the coordinating node before
we send requests to the individual shards. This will reduce the rewrite load
and object creation for each rewrite on the executing nodes and will fetch
resources only once instead of N times once per shard for queries like `terms`
query with index lookups. (among percolator and geo-shape)

Relates to #25791
2017-07-21 09:38:38 +02:00
Simon Willnauer 0d0c103451 First increment shard stats before notifing and potentially sending response (#25818)
When we skip a shard we should first increment the skip and successful shard
counters before we notify the super class about a skipped shard which could
send back the result before we increment the stats.
2017-07-21 08:46:10 +02:00
Ryan Ernst cfdfa4705e Bump the min compat version to 5.6.0 (#25805)
This commit increases the min compat version for 6.0 to 5.6.0. This is
already what is being tested by gradle, but the code was out of sync.
2017-07-20 13:02:07 -07:00
Ryan Ernst 8ab0d10387 Add compatibility versions to main action response (#25799)
This commit adds the min wire/index compat versions to the main action
output. Not only will this make the compatility expected more
transparent, but it also allows to test which version others think the
compat versions are, similar to how we test the lucene version.
2017-07-20 13:01:41 -07:00
Boaz Leskes 7488877d1a Validate a joining node's version with version of existing cluster nodes (#25808)
When a node tries to join a cluster, it goes through a validation step to make sure the node is compatible with the cluster. Currently we validation that the node can read the cluster state and that it is compatible with the indexes of the cluster. This PR adds validation that the joining node's version is compatible with the versions of existing nodes. Concretely we check that:

1) The node's min compatible version is higher or equal to any node in the cluster (this prevents a too-new node from joining)
2) The node's version is higher or equal to the min compat version of all cluster nodes (this prevents a too old join where, for example, the master is on 5.6, there's another 6.0 node in the cluster and a 5.4 node tries to join).
3) The node's major version is at least as higher as the lowest node in the cluster. This is important as we use the minimum version in the cluster to stop executing bwc code for operations that require multiple nodes. If the nodes are already operating in "new cluster mode", we should prevent nodes from the previous major to join (even if they are wire level compatible). This does mean that if you have a very unlucky partition during the upgrade which partitions all old nodes which are also a minority / data nodes only, the may not be able to re-join the cluster. We feel this edge case risk is well worth the simplification it brings to BWC layers only going one way. This restriction only holds if the cluster state has been recovered (i.e., the cluster has properly formed).

 Also, the node join validation can now selectively fail specific nodes (previously the entire batch was failed). This is an important preparation for a follow up PR where we plan to have a rejected joining node die with dignity.
2017-07-20 20:11:29 +02:00
Boaz Leskes de6ad7a704 awaitFix testCorruptTranslogTruncationOfReplica
see https://github.com/elastic/elasticsearch/issues/25817
2017-07-20 20:04:42 +02:00
Jack Conradson 9f7463e796 remove lang url parameter from stored script requests (#25779)
Also has updates to ScriptMetaData for allowing the old namespace format to be loaded all the way back through 5.0; however, it will throw an exception if two scripts share the same id but different languages.
2017-07-20 08:51:08 -07:00
Jason Tedor 9d8f11dc27 Remove legacy checks for config file settings
This commit removes legacy checks for unsupported an environment
variable and unsupported system properties. This environment variable
and these system properties have not been supported since 1.x so it is
safe to stop checking for the existence of these settings.

Relates #25809
2017-07-20 22:42:39 +09:00
Simon Willnauer 5e629cfba0 Ensure query resources are fetched asynchronously during rewrite (#25791)
The `QueryRewriteContext` used to provide a client object that can
be used to fetch geo-shapes, terms or documents for percolation. Unfortunately
all client calls used to be blocking calls which can have significant impact on the
rewrite phase since it occupies an entire search thread until the resource is
received. In the case that the index the resource is fetched from isn't on the local
node this can have significant impact on query throughput.

Note: this doesn't fix MLT since it fetches stuff in doQuery which is a different beast. Yet, it is a huge step in the right direction
2017-07-20 15:37:50 +02:00
Jay Modi 3e4bc027eb RestClient uses system properties and system default SSLContext (#25757)
This commit calls the `useSystemProperties` method on the HttpAsyncClientBuilder so that the jvm
system properties are used. The primary reason for doing this is to ensure the builder uses the
system default SSLContext rather than the default instance created by the http client library.

Closes #23231
2017-07-20 07:36:56 -06:00
Boaz Leskes 9989ac69a4 Revert "Validate a joining node's version with version of existing cluster nodes (#25770)"
This reverts commit 1e1f8e6376.
2017-07-19 17:34:53 +02:00
Simon Willnauer 4d78935df7 Introduce a new Rewriteable interface to streamline rewriting (#25788)
Today we have duplicated code that is quite complicated to iterate
over rewriteable (`QueryBuilders` mainly) This change introduces a
`Rewriteable` interface that allow to share code to do the rewriting as
well as encapsulation and composition of queries.
2017-07-19 15:06:49 +02:00
Adrien Grand 7a0eeb3978 Fix compilation. 2017-07-19 14:46:30 +02:00
Adrien Grand 55ad318541 Reduce the overhead of timeouts and low-level search cancellation. (#25776)
Setting a timeout or enforcing low-level search cancellation used to make us
wrap the collector and check either the current time or whether the search
task was cancelled for every collected document. This can be significant
overhead on cheap queries that match many documents.

This commit changes the approach to wrap the bulk scorer rather than the
collector and exponentially increase the interval between two consecutive
checks in order to reduce the overhead of those checks.
2017-07-19 14:15:53 +02:00
Adrien Grand 94a98daa37 Fix parsing of ip range queries. (#25768)
Closes #25636
2017-07-19 14:12:54 +02:00
Adrien Grand 01f083ca83 Reduce profiling overhead. (#25772)
Calling `System.nanoTime()` for each method call may have a significant
performance impact.

Closes #24799
2017-07-19 14:12:14 +02:00
Adrien Grand f1ff7f2454 Require a field when a `seed` is provided to the `random_score` function. (#25594)
We currently use fielddata on the `_id` field which is trappy, especially as we
do it implicitly. This changes the `random_score` function to use doc ids when
no seed is provided and to suggest a field when a seed is provided.

For now the change only emits a deprecation warning when no field is supplied
but this should be replaced by a strict check on 7.0.

Closes #25240
2017-07-19 14:11:15 +02:00
Boaz Leskes 1e1f8e6376 Validate a joining node's version with version of existing cluster nodes (#25770)
When a node tries to join a cluster, it goes through a validation step to make sure the node is compatible with the cluster. Currently we validation that the node can read the cluster state and that it is compatible with the indexes of the cluster. This PR adds validation that the joining node's version is compatible with the versions of existing nodes. Concretely we check that:

1) The node's min compatible version is higher or equal to any node in the cluster (this prevents a too-new node from joining)
2) The node's version is higher or equal to the min compat version of all cluster nodes (this prevents a too old join where, for example, the master is on 5.6, there's another 6.0 node in the cluster and a 5.4 node tries to join).
3) The node's major version is at least as higher as the lowest node in the cluster. This is important as we use the minimum version in the cluster to stop executing bwc code for operations that require multiple nodes. If the nodes are already operating in "new cluster mode", we should prevent nodes from the previous major to join (even if they are wire level compatible). This does mean that if you have a very unlucky partition during the upgrade which partitions all old nodes which are also a minority / data nodes only, the may not be able to re-join the cluster. We feel this edge case risk is well worth the simplification it brings to BWC layers only going one way.

 Also, the node join validation can now selectively fail specific nodes (previously the entire batch was failed). This is an important preparation for a follow up PR where we plan to have a rejected joining node die with dignity.
2017-07-19 12:57:29 +02:00
Simon Willnauer 9882d2b9d3 Reduce the scope of `QueryRewriteContext` (#25787)
Today we provide a lot of functionality on the `QueryRewriteContext` that
we potentially don't have ie. if we rewrite on a coordinating node or when
we percolating. This change moves most of the unnecessary shard level or
index level services and dependencies to `QueryShardContext` instead.
2017-07-19 12:30:38 +02:00
Jason Tedor 4b18800df9 Fix handling of invalid error trace parameter
If a request contains an invalid error trace parameter, we send a error
on the channel. This should immediately abort any additional processing
of the request but instead we march on, dispatch the request and
subsequently send another message on the channel. The problem here is
this means two writes on the channel which leads to the request being
released twice ultimately raising in illegal reference count
exception. This commit addresses this by performing an early return in
the case that the request contained an invalid error trace parameter.

Relates #25785
2017-07-19 18:07:11 +09:00
Jason Tedor 82f52b17e1 Remove timed latch await in listeners test
This commit removes a timed latch await in a transport client listeners
test. The problem with a timed wait here is that on an overloaded
machine, the test can fail because the waiting thread was not unlatched
quickly enough. This makes the test unnecessarily flaky. Instead, we
should wait indefinitely and simply let the test fail by the test
timeout if the latch is not counted down for some reason.

Closes #25760
2017-07-19 16:51:27 +09:00
Jim Ferenczi 4cd9728f55 [Test] Make sure that QueryPhaseTests#testIndexSortScrollOptimization creates segments that can be early terminated 2017-07-18 19:30:15 +02:00
Christoph Büscher e24af64de2 Add strict parsing of aggregation ranges (#25769)
Currently we ignore unknown field names when parsing RangeAggregator.Range and
GeoDistanceAggregationBuilder.Range from `range`, `date_range` or `geo_distance`
aggregations. This can hide subtle errors in the query. This change makes parsing `ranges`
stricter.
2017-07-18 18:31:04 +02:00
Boaz Leskes c0e6dafcab CombinedDeletionPolicy can't assert it has no commits when creating an index
This is an appealing assertion, but there scenarios where it can happen under normal operations. For example, when an index is created it may run into an exception when the lucene files have already been created. The master will try to assign the shard to another node (it's empty, so no need to look for data) but if there is no other node, it will reassign it to the same node. At that point the deletion will get a list of existing commits (which it will typically delete).
2017-07-18 17:23:54 +02:00
Luca Cavanna 5c5d723b86 Improve error message when aliases are not supported (#25728)
With #23997 and #25268 we have changed put alias, delete alias, update aliases and delete index to not accept aliases. Instead concrete indices should be provided as their index parameter.

This commit improves the error message in case aliases are provided, from an IndexNotFoundException (404 status code) with "no such index" message, to an IllegalArgumentException (400 status code) with "The provided expression [alias] matches an alias, specify the corresponding concrete indices instead." message.

Note that there is no specific error message for the case where wildcard expressions match one or more aliases. In fact, aliases are simply ignored when expanding wildcards for such APIs. An error is thrown only when the expression ends up matching no indices at all, and allow_no_indices is set to false. In that case the error is still the generic "404 - no such index".
2017-07-18 15:40:17 +02:00
Luca Cavanna 0d8b753325 IndexClosedException to return 400 rather than 403 (#25752)
403 can be confused with security. If an API doesn't support working against closed indices and closed indices are referred to in a request, that is a bad request, hence 400 is more appropriate.
2017-07-18 10:26:32 +02:00
Boaz Leskes 194f267110 TruncateTranslogIT.testCorruptTranslogTruncation should wait for replica to allocate
The test checks if a file based or ops based recovery happened, but if the replica shard never finished recovering expectations are not met.

Fixes #25761
2017-07-18 10:17:39 +02:00
Christoph Büscher a6e3d356ed Change parsing of numeric `to` and `from` parameters in `date_range` aggregation (#25376)
Currently the `to` and `from` parameter in the `date_range` aggregation is not
parsed with the correct date field format from the mappings or the aggregation
if the argument is numeric, but always treated as a long value specifying
`epoch_millis`. This leads to problems e.g. when the format is `epoch_second`,
but the `to` and `from` are currently treated as millis.

With this change, we interpret these parameters according to the `format` of the target field.
If the `format` in the mappings is not compatible with numeric input values,
a compatible `format` (e.g. `epoch_millis`, `epoch_second`) must be specified in
the `date_range` aggregation itself, otherwise an error is thrown.

#Closes #17920
2017-07-18 09:45:28 +02:00
Boaz Leskes f347bd4a4e await fix testCorruptTranslogTruncation 2017-07-18 09:15:17 +02:00
Jim Ferenczi c6d9456693 #25747: Fix check of termVector with and without offsets 2017-07-17 19:46:42 +02:00
Jim Ferenczi 41ea8fdcec Picks offset source for the unified highlighter directly from the es mapping (#25747)
This commit changes how the offset source is picked for each field using the es mapping rather than the underlying Lucene field infos.
It's mandatory for large mappings where field infos retrieval can be costly (the global field infos is merged for each highlighted field in every hit by the Lucene impl).

Fixes #25699
2017-07-17 19:10:46 +02:00
Lee Hinman 610ba7e427 Register data node stats from info carried back in search responses (#25430)
* Register data node stats from info carried back in search responses

This is part of #24915, where we now calculate the EWMA of service time for
tasks in the search threadpool, and send that as well as the current queue size
back to the coordinating node. The coordinating node now tracks this information
for each node in the cluster.

This information will be used in the future the determining the best replica a
search request should be routed to. This change has no user-visible difference.

* Move response time timing into ResponseListenerWrapper

* Move ResponseListenerWrapper to ActionListener instead of SearchActionListener

Also removes the logger

* Move `requestIndex` back to private

* De-guice-ify ResponseCollectorService \o/

* Undo all changes to SearchQueryThenFetchAsyncAction

* Remove unneeded response collector from TransportSearchAction

* Undo all changes to SearchDfsQueryThenFetchAsyncAction

* Completely rewrite the inside of ResponseCollectorService's record keeping

* Documentation and cleanups for ResponseCollectorService

* Add unit test for collection of queue size and service time

* Fix Guice construction error

* Add basic unit tests for ResponseCollectorService

* Fix version constant for the master merge

* Fix test compilation after master merge

* Add a test for node removal on cluster changed event

* Remove integration test as there are now unit tests

* Rename ResponseListenerWrapper -> SearchExecutionStatsCollector

* Fix line-length

* Make classes private and final where appropriate

* Pass nodeId into SearchExecutionStatsCollector and use only ActionListener

* Get nodeId from connection so searchShardTarget can be private

* Remove threadpool from SearchContext, get it from IndexShard instead

* Add missing import

* Use BiFunction for responseWrapper rather than passing in collector service
2017-07-17 11:04:51 -06:00
Simon Willnauer cb4eebcd6a Make `index` in TermsLookup mandatory (#25753)
This change removes the leniency of having a `null` index to fetch
terms from in 6.0 onwards. This feature will be deprecated in the 5.x series
and 6.0 nodes will require the index to be set.

Closes #25750
2017-07-17 18:50:30 +02:00
Simon Willnauer 9ff259c260 Use concrete version for BWC checks in SearchTransportService (#25748)
We used to compare agaisnt the min compatible version which is misleading since
it might move over time and since we backported the `can_match` API entirely
it's better to compare against a version constant.
2017-07-17 18:49:50 +02:00
Boaz Leskes c0751c8650 deubg logging to TruncateTranslogIT
To see what data paths are used.
2017-07-17 17:18:05 +02:00
Adrien Grand 949db39fad Fix reproducibility of UUIDTests.
Closes #25714
2017-07-17 15:43:28 +02:00
Adrien Grand 78a6c3427b Optimize `terms` queries on `ip` addresses to use a `PointInSetQuery` whenever possible. (#25669)
We can't do it in the general case because of prefix queries, but I believe this
is mostly used in query strings and not in explicit `terms` queries.

Closes #25667
2017-07-17 15:39:01 +02:00
Adrien Grand 264088f1c4 Deprecate the `_default_` mapping. (#25652)
Now that indices cannot have types anymore, this feature does not buy anything
anymore.

Closes #25500
2017-07-17 15:37:59 +02:00
Boaz Leskes 7739aad1aa Add testing around recovery to TruncateTranslogIT 2017-07-17 10:48:26 +02:00
Jason Tedor f121cd3beb Fix pre-6.0 response to unknown replication actions
When sending replica requests for replication operations, we skip
sending the request to pre-6.0 nodes for operations that such nodes
would not be aware of (e.g., the background global checkpoint sync, or
the primary/replica resync) since they would not know what to do with
these requests. Yet, we simulate that we received responses from these
nodes. Today, this is done by simulating that they sent us that their
local checkpoint is unassigned sequence number. However, for pre-6.0
nodes we have introduced a special local checkpoint used in the global
checkpoint tracker for such nodes and that is what we should use here
too. This commit fixes this issue.

Relates #25744
2017-07-17 17:47:48 +09:00
Martijn van Groningen 8003171a0c
Move more token filters to analysis-common module
The following token filters were moved: arabic_normalization, german_normalization, hindi_normalization, indic_normalization, persian_normalization, scandinavian_normalization, serbian_normalization, sorani_normalization, cjk_width and cjk_width

Relates to #23658
2017-07-17 08:29:44 +02:00
Simon Willnauer 8364279b98 Prevent skipping shards if a suggest builder is present (#25739)
Even if the query part can rewrite to match none we can't skip the
suggest execution since it might yield results.

Relates to #25658
2017-07-16 19:06:47 +02:00
Simon Willnauer ccda0441e1 Bump BWC versions after #25658 backport to 5.6 2017-07-15 11:34:16 +02:00
Yannick Welsch 8f0b357651 Let primary own its replication group (#25692)
Currently replication and recovery are both coordinated through the latest cluster state available on the ClusterService as well as through the GlobalCheckpointTracker (to have consistent local/global checkpoint information), making it difficult to understand the relation between recovery and replication, and requiring some tricky checks in the recovery code to coordinate between the two. This commit makes the primary the single owner of its replication group, which simplifies the replication model and allows to clean up corner cases we have in our recovery code. It also reduces the dependencies in the code, so that neither RecoverySourceXXX nor ReplicationOperation need access to the latest state on ClusterService anymore. Finally, it gives us the property that in-sync shard copies won't receive global checkpoint updates which are above their local checkpoint (relates #25485).
2017-07-14 13:52:53 +02:00
Luca Cavanna 7930b8a720 Fix indices options parsing from REST in delete index API (#25709)
When parsing indices options from REST, we parse the optional parameters that are supported at REST (ignore_unavailable, allow_no_indices and expand_wildcards) and we provide the API default values for all the other (internal) options so that they are set to the new indices options while parsing. The `ignoreAliases` option was forgotten though, which means that whenever you pass in any index option at REST to the delete index API, you get to delete aliases like it was supported before (as ignoreAliases gets set to false like in all the other APIs).

Added unit tests for IndicesOptions parsing from REST parameters, and yaml tests for the delete index API.
2017-07-14 10:39:44 +02:00
Jim Ferenczi 13da3eb53e Refactor QueryStringQuery for 6.0 (#25646)
This change refactors the query_string query to analyze the query text around logical operators of the query string the same way than a match_query/multi_match_query.
It also adds a type parameter that can be used to change the way multi fields query are built the same way than a multi_match query does.

Now that these queries share the same behavior regarding text analysis, some parameters are obsolete and have been deprecated:

split_on_whitespace: This setting is now ignored with a deprecation notice
if it is used explicitely. With this PR The query_string always splits on logical operator.
It simplifies the understanding of the other parameters that can have different meanings
depending on the value of split_on_whitespace.

auto_generate_phrase_queries: This setting is now ignored with a deprecation notice
if it is used explicitely. This setting only makes sense when the parser splits on whitespace.

use_dismax: This setting is now ignored with a deprecation notice
if it is used explicitely. The tie_breaker parameter is sufficient to handle best_fields/most_fields.

Fixes #25574
2017-07-13 15:32:17 +02:00
Igor Motov 6125f535ae mget with an alias shouldn't ignore alias routing (#25697)
Closes #25696
2017-07-13 09:27:37 -04:00
Simon Willnauer 0e5d324c36 Prevent `can_match` requests from sending to incompatible nodes (#25705)
With cross cluster search we can potentially proxy `can_match` requests
to nodes that don't have the endpoint. This might not cause any problem
from a functional perspecitve but will cause ugly error messages on
the target node. This commit will cause an IAE if we try to talk to an
incompatible node via a proxy.

Relates to #25704
2017-07-13 14:59:41 +02:00
Colin Goodheart-Smithe 11477a608f Removes FieldStats API (#25628)
* Removes FieldStats API

* iter

* iter
2017-07-13 11:56:46 +01:00
Luca Cavanna ec66d655b5 Rename client artifacts (#25693)
It was brought up that our current client artifacts have generic names like 'rest' that may cause conflicts with other artifacts.

This commit renames:

- rest -> elasticsearch-rest-client
- sniffer -> elasticsearch-rest-client-sniffer
- rest-high-level -> elasticsearch-rest-high-level-client

A couple of small changes are also preparing the high level client for its first release.

Closes #20248
2017-07-13 09:44:25 +02:00
Christoph Büscher 97c4c43fb7 Make slop optional when parsing `span_near` query (#25677)
The slop parameter defaults to 0 in the Lucene SpanNearQuery, so we can set it
to this default value also and don't have to require it being specified in the
query when using the Rest API. Leaving `slop` a ctro arg in the Java API as it
should normally be specified and we can keep it `final` that way.

Closes #25642
2017-07-13 09:21:49 +02:00
Simon Willnauer 02e9ad6d6f Register correct response for `can_match` proxy response
Relates to #25658
Closes #25698
2017-07-13 08:33:56 +02:00
Sergey Galkin e2bfb35f4a Shrunk indices should ignore templates
A shrunk index should ignore anything from templates and instead take
its mappings, aliases, and settings from the original index, plus any
new settings and aliases passed in with the shrink request. This commit
causes this to be the case.

Relates #25380
2017-07-12 18:27:38 -04:00
Simon Willnauer e81804cfa4 Add a shard filter search phase to pre-filter shards based on query rewriting (#25658)
Today if we search across a large amount of shards we hit every shard. Yet, it's quite
common to search across an index pattern for time based indices but filtering will exclude
all results outside a certain time range ie. `now-3d`. While the search can potentially hit
hundreds of shards the majority of the shards might yield 0 results since there is not document
that is within this date range. Kibana for instance does this regularly but used `_field_stats`
to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results.

This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards
and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.
2017-07-12 22:19:20 +02:00
Christoph Büscher f3e7a1c4a4 Adding basic search request documentation for high level client (#25651) 2017-07-12 17:06:46 +02:00
Jack Conradson d2b4f7ac5a Disallow lang to be used with Stored Scripts (#25610)
Requests that execute a stored script will no longer be allowed to specify the lang of the script. This information is stored in the cluster state making only an id necessary to execute against. Putting a stored script will still require a lang.
2017-07-12 07:55:57 -07:00
Antonio Matarrese 8d7cbc43b5 Fix typo in ScriptDocValues deprecation warnings (#25672) 2017-07-12 16:17:55 +02:00
Colin Goodheart-Smithe 55a157e964 Changes DocValueFieldsFetchSubPhase to reuse doc values iterators for multiple hits (#25644)
* Changes DocValueFieldsFetchSubPhase to reuse doc values iterators for multiple hits

Closes #24986

* iter

* Update ScriptDocValues to not reuse GeoPoint and Date objects

* added Javadoc about script value re-use
2017-07-12 12:03:49 +00:00
Martijn van Groningen 0a25558f98
Query range fields by doc values when they are expected to be more efficient than points.
* Enable doc values for range fields by default.
* Store ranges in a binary format that support multi field fields.
* Added BinaryDocValuesRangeQuery that can query ranges that have been encoded into a binary doc values field.
* Wrap range queries on a range field in IndexOrDocValuesQuery query.

Closes #24314
2017-07-12 13:04:14 +02:00
Christoph Büscher ad01a67c51 Remove SearchHit#internalHits (#25653)
This method does exactly what getHits() does and is used in only a few places,
so it can safely be removed. It seems to be a left-over from when
InternalSearchHits was folded into the SearchHits interface, which didn't
contain this method.
2017-07-12 10:01:18 +02:00
Jason Tedor e165c405ac Add an underscore to flood stage setting
This is a minor nitty bikeshedding change that renames the suffix of the
disk flood stage setting to "flood_stage" from "floodstage".

Relates #25659
2017-07-11 22:02:00 -04:00
Simon Willnauer 831dbbf291 Ensure we rewrite common queries to `match_none` if possible (#25650)
In certain situations we can early terminate and just skip the entire
query phase or make the lucene level rewrite very cheap if we can already
tell that a query won't match any documents. For instance if there is a single
`match_none` ie. due to some range rewrite in a filter or must clause of a boolean
query it can just drop all it's other queries since it will never match.
2017-07-11 21:19:14 +02:00
Adrien Grand f9fbce84b6 Optimize the order of bytes in uuids for better compression. (#24615)
Flake ids organize bytes in such a way that ids are ordered. However, we do not
need that property and could reorganize bytes in an order that would better suit
Lucene's terms dict instead.

Some synthetic tests suggest that this change decreases the disk footprint of
the `_id` field by about 50% in many cases (see `UUIDTests.testCompression`).
For instance, when simulating the indexing of 10M docs at a rate of 10k docs
per second, the current uid generator used 20.2 bytes per document on average,
while this new generator which only puts bytes in a different order uses 9.6
bytes per document on average.

We had already explored this idea in #18209 but the attempt to share long common
prefixes had had a bad impact on indexing speed. This time I have been more
careful about putting discriminant bytes early in the `_id` in a way that
preserves indexing speed on par with today, while still allowing for better
compression.
2017-07-11 17:28:23 +02:00
Tim Brooks a3ade99fcf Fix BytesReferenceStreamInput#skip with offset (#25634)
There is a bug when a call to `BytesReferenceStreamInput` skip is made
on a `BytesReference` that has an initial offset. The offset for the
current slice is added to the current index and then subtracted from the
length. This introduces the possibility of a negative number of bytes to
skip. This happens inside a loop, which leads to an infinte loop.

This commit correctly subtracts the current slice index from the
slice.length. Additionally, the `BytesArrayTests` are modified to test
instances that include an offset.
2017-07-11 09:54:29 -05:00
Simon Willnauer 98c91a3bd0 Limit the number of concurrent shard requests per search request (#25632)
This is a protection mechanism to prevent a single search request from
hitting a large number of shards in the cluster concurrently. If a search is
executed against all indices in the cluster this can easily overload the cluster
causing rejections etc. which is not necessarily desirable. Instead this PR adds
a per request limit of `max_concurrent_shard_requests` that throttles the number of
concurrent initial phase requests to `256` by default. This limit can be increased per request
and protects single search requests from overloading the cluster. Subsequent PRs can introduces
addiontional improvemetns ie. limiting this on a `_msearch` level, making defaults a factor of
the number of nodes or sort shards iters such that we gain the best concurrency across nodes.
2017-07-11 16:23:10 +02:00
Adrien Grand 481d5d09b2 Upgrade to lucene-7.0.0-snapshot-00142c9. (#25641)
Lucene 7.0 is feature-frozen now, so there should not be many changes until GA.
2017-07-11 13:58:55 +02:00
Simon Willnauer 538110bd60 Change compatibility version to 5.6 after backport 2017-07-11 11:39:08 +02:00
Simon Willnauer ec1afe30ea Ensure remote cluster alias is preserved in inner hits aggs (#25627)
We lost the cluster alias due to some special caseing in inner hits
and due to the fact that we didn't pass on the alias to the shard request.
This change ensures that we have the cluster alias present on the shard to
ensure all SearchShardTarget reads preserve the alias.

Relates to #25606
2017-07-11 11:34:06 +02:00
Tal Levy e04be73ad5 remove ingest.new_date_format (#25583) 2017-07-10 13:07:50 -07:00
Tim Brooks b22bbf94da Avoid blocking on channel close on network thread (#25521)
Currently when we close a channel in Netty4Utils.closeChannels we
block until the closing is complete. This introduces the possibility
that a network selector thread will block while waiting until a
separate network selector thread closes a channel.

For instance: T1 closes channel 1 (which is assigned to a T1 selector).
Channel 1's close listener executes the closing of the node. That
means that T1 now tries to close channel 2. However, channel 2 is
assigned to a selector that is running on T2. T1 now must wait until T2
closes that channel at some point in the future.

This commit addresses this by adding a boolean to closeChannels
indicating if we should block on close. We only set this boolean to true
if we are closing down the server channels at shutdown. This call is
never made from a network thread. When we call the closeChannels method
with that boolean set to false, we do not block on close.
2017-07-10 10:50:51 -05:00
Yannick Welsch 7836bbf4d4 Fix tribe node cluster state version increments (#25629)
With #24236, tribe nodes submit cluster state changes to their MasterService, making it unnecessary to explicitly update the cluster state version. This PR fixes the double-incrementing of cluster state versions on tribe nodes, which are not harmful, but unnecessary.
2017-07-10 16:25:11 +02:00
Colin Goodheart-Smithe 3a5a54e83e Collapses package structure for some bucket aggs (#25579)
This change collapses some of the packages for the bucket aggregations into their parent packages. This was done for the following aggregations:
* The variants of the range aggregation (geo_distance, date and ip) were moved into the `o.e.s.a.bucket.range` package
* The `o.e.s.a.bucket.terms.support` package was removed and the classes were moved to `o.e.s.a.bucket.terms`
* The filter aggregation was moved to `o.e.s.a.bucket.filter`

Since this PR is already relatively large with only the above changes subsequent PRs will do similar operations on relevant metric and pipeline aggregations

Relates to #22868
2017-07-10 15:08:15 +01:00
Boaz Leskes e93e10f93b Close Translog trimming task when IndexService is closed
Relates to https://github.com/elastic/elasticsearch/pull/25622
2017-07-10 14:40:23 +02:00
Yannick Welsch b5521872bb [TEST] Use correct StreamInput version to deserialize in testSnapshotDeletionsInProgressSerialization
The test is currently serializing the cluster state using an older ES version format, but then deserializes those same bytes by
assuming they are of the current ES version.
2017-07-10 14:03:12 +02:00
Luca Cavanna a932591007 Treat aliases as unavailable indices in delete index and update aliases api (#25524)
When resolving wildcards, aliases should be treated as unavailable indices when the `ignoreAliases` option is set to `true` (currently enabled with delete index api and update aliases api). This way the `allow_no_indices` and `ignore_unavailable` options can be honoured, otherwise WildcardExpressionResolver ends up treating aliases differently and there is no way to control when an error is thrown.

The default behaviour for the delete index api, which has `ignore_unavailable` set to `false` and `allow_no_indices` set to `true` by default, is to throw an error when executed against an alias, same as when it's executed against an index that does not exist.
2017-07-10 10:58:00 +02:00
Boaz Leskes 09378f48e4 Add a scheduled translog retention check (#25622)
We currently check whether translog files can be trimmed whenever we create a new translog generation or close a view. However #25294 added a long translog retention period (12h, max 512MB by default), which means translog files should potentially be cleaned up long after there isn't any indexing activity to trigger flushes/the creation of new translog files. We therefore need a scheduled background check to clean up those files once they are no longer needed.

Relates to #10708
2017-07-10 10:28:39 +02:00
Jason Tedor c084542731 Bump version to 6.0.0-beta1
This commit does two things:
 - bumps the version from 6.0.0-alpha3 to 6.0.0-beta1
 - renames the 6.0.0-alpha3 version constant to 6.0.0-beta1

Relates #25621
2017-07-09 18:12:50 -04:00
Jason Tedor c75ddd2c85 Fix scaling thread pool test bug
This commit adjusts the expectation for the max number of threads in the
scaling thread pool configuration test. The reason that this expectation
is incorrect is because we removed the limitation that the number of
processors maxes out at 32, instead letting it be the true number of
logical processors on the machine. However, when we removed this
limitation, this test was never adjusted to reflect the new reality yet
it never arose since our tests were not running on machines with
incredibly high core counts.

Relates #20874
2017-07-09 08:00:27 -04:00
Boaz Leskes 1f4d8a05d1 testConcurrentWriteViewsAndSnapshot: writers should expose the local checkpoint to readers before trimming the translog 2017-07-09 12:26:54 +02:00
Jason Tedor cb3674c5ee Add reason to global checkpoint updates on replica
Updating the global checkpoint on a replica can occur for a few
different reasons:
 - from inlined global checkpoint updates
 - from a primary term transition
 - from finalizing recovery

Yet, the trace logging for a global checkpoint update does not present
this information that can be useful when tracing test failures. This
commit adds a reason for the global checkpoint update on a replica so
that we can trace these updates.

Relates #25612
2017-07-08 17:05:24 -04:00
Boaz Leskes 40ae134f5a Move `BulkItemRequest` BWC to 5.x (#25511)
The current BWC code in `BulkItemRequest` mutates the underlying `DocWriteRequests` which causes test failures and unexpected state (our test infra checks bwc serialization on the fly). This PR removes this logic from master. Another PR will add a BWC layer to 5.x only.

This PR contains the logic in https://github.com/elastic/elasticsearch/pull/25510 , which is needed to run the tests.
2017-07-08 11:42:57 +02:00
Boaz Leskes f189e819be testRecoveryAfterPrimaryPromotion: seqNo recovery doesn't require some initial indexing
Previously the primary didn't update it's own local checkpoint (and thus the global checkpoint) before some indexing occurred. With recent changes the primary now properly initializes it self and thus ops recovery is possible even if no indexing has occurred.
2017-07-08 10:05:05 +02:00
Jason Tedor bc22c1c286 Add disk threshold settings validation
This commit adds cross-settings validation for the low/high/flood stage
disk watermark settings. This validation was enabled by the introduction
of multiple settings validation.

Relates #25600
2017-07-07 19:54:36 -04:00
Jason Tedor 93311ab717 Restore local checkpoint tracker on promotion
When a shard is promoted to replica, it's possible that it was
previously a replica that started following a new primary. When it
started following this new primary, the state of its local checkpoint
tracker was reset. Upon promotion, it's possible that the state of the
local checkpoint tracker has not yet restored from a successful
primary-replica re-sync. To account for this, we must restore the state
of the local checkpoint tracker when a replica shard is promoted to
primary. To do this, we stream the operations in the translog, marking
the operations that are in the translog as completed. We do this before
we fill the gaps on the newly promoted primary, ensuring that we have a
primary shard with a complete history up to the largest maximum sequence
number it has ever seen.

Relates #25553
2017-07-07 14:38:35 -04:00
Yannick Welsch baa87db5d1 Harden global checkpoint tracker
This commit refactors the global checkpont tracker to make it more
resilient. The main idea is to make it more explicit what state is
actually captured and how that state is updated through
replication/cluster state updates etc. It also fixes the issue where the
local checkpoint information is not being updated when a shard becomes
primary. The primary relocation handoff becomes very simple too, we can
just verbatim copy over the internal state.

Relates #25468
2017-07-07 14:04:28 -04:00
olcbean 2ba9fd2aec Remove deprecated created and found from index, delete and bulk (#25516)
The created and found fields in index and delete responses became obsolete after the introduction of the result field in index, update and delete responses (#19566).

After deprecating the created and found fields in 5.x (#19633), now they are removed.

Fixes #19630
2017-07-07 13:58:46 -04:00
Boaz Leskes efb29031f1 fix testEnsureVersionCompatibility for 5.5.0 release 2017-07-07 19:04:12 +02:00
Boaz Leskes 1d2a10bad8 fix Version.v6_0_0 min compatibility version to 5.5.0 2017-07-07 19:04:12 +02:00
Boaz Leskes ad1b9feb20 Add bwc indices for 5.5.0 2017-07-07 19:04:12 +02:00