Commit Graph

964 Commits

Author SHA1 Message Date
Nhat Nguyen 04dd738782 TEST: trim unsafe commits before opening engine
Since #29260, unsafe commits must be trimmed before opening an engine.
This makes the engine constructor follow Lucene standard semantics and
use the last commit. However, we haven't fully applied this change in some
tests.

Relates #29260
2018-03-29 14:25:42 -04:00
Boaz Leskes eb8b31746a Move trimming unsafe commits from engine ctor to store (#29260)
As follow up to #28245 , this PR removes the logic for selecting the 
right start commit from the Engine constructor in favor of explicitly
trimming them in the Store, before the engine is opened. This makes the
constructor in engine follow standard Lucene semantics and use the last
commit.

Relates #28245
Relates #29156
2018-03-29 13:35:57 -04:00
Igor Motov 04d0edc8ee
Fix incorrect geohash for lat 90, lon 180 (#29256)
Due to special treatment for the 0xFFFFFF... value in GeoHashUtils'
encodeLatLon method, the hashcode for lat 90, lon 180 is incorrectly
encoded as `"000000000000"` instead of "zzzzzzzzzzzz". This commit
removes the special treatment and fixes the issue.

Closes #22163
2018-03-29 09:23:43 -04:00
Tanguy Leroux b6568d0cfd
Do not load global state when deleting a snapshot (#29278)
When deleting a snapshot, it is not necessary to load and to parse the
global metadata of the snapshot to delete. Now indices are stored in
the snapshot metadata file, we have all the information to resolve the
shards files to delete.

This commit removes the readSnapshotMetaData() method that was used to
load both global and index metadata files. Test coverage should be
enough as SharedClusterSnapshotRestoreIT already contains several
deletion tests.

Related to #28934
2018-03-29 09:16:53 +02:00
Nhat Nguyen 9bc167466f TEST: add log testDoNotRenewSyncedFlushWhenAllSealed
This test was failed recently. This commit enables debug log and prints
out seals.

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-unix-compatibility/os=oraclelinux/2234/console
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.x+intake/1437/console
2018-03-28 22:05:00 -04:00
Jason Tedor 4ef3de40bc
Fix handling of bad requests (#29249)
Today we have a few problems with how we handle bad requests:
 - handling requests with bad encoding
 - handling requests with invalid value for filter_path/pretty/human
 - handling requests with a garbage Content-Type header

There are two problems:
 - in every case, we give an empty response to the client
 - in most cases, we leak the byte buffer backing the request!

These problems are caused by a broader problem: poor handling preparing
the request for handling, or the channel to write to when the response
is ready. This commit addresses these issues by taking a unified
approach to all of them that ensures that:
 - we respond to the client with the exception that blew us up
 - we do not leak the byte buffer backing the request
2018-03-28 16:25:01 -04:00
Simon Willnauer 13e19e7428
Allow _update and upsert to read from the transaction log (#29264)
We historically removed reading from the transaction log to get consistent
results from _GET calls. There was also the motivation that the read-modify-update
principle we apply should not be hidden from the user. We still agree on the fact
that we should not hide these aspects but the impact on updates is quite significant
especially if the same documents is updated before it's written to disk and made serachable.

This change adds back the ability to read from the transaction log but only for update calls.
Calls to the _GET API will always do a refresh if necessary to return consistent results ie.
if stored fields or DocValues Fields are requested.

Closes #26802
2018-03-28 18:03:34 +02:00
Christoph Büscher 27e45fc552
Remove IndicesOptions bwc serialization layer (#29281)
On master we don't need to talk to pre-6.0 nodes anymore.
2018-03-28 16:19:45 +02:00
Luca Cavanna 245dd73156
Bulk processor#awaitClose to close scheduler (#29263)
When the `BulkProcessor` is used with the high-level REST client, a scheduler is internally created that allows to schedule tasks. Such scheduler is not exposed to users and needs to be closed once the `BulkProcessor` is closed. There are two ways to close the `BulkProcessor` though, one is the ordinary `close` method and the other one is `awaitClose`. The former closes the scheduler while the latter doesn't, leaving threads lingering.
2018-03-28 16:09:18 +02:00
Yannick Welsch cacf759213
Remove RELOCATED index shard state (#29246)
as this information is already covered by ReplicationTracker.primaryMode.
2018-03-28 12:25:46 +02:00
Robin Neatherway ea8e3661d0 Fix a type check that is always false (#27726)
DocumentParser: The checks for Text and Keyword were masked by the
earlier check for String, which they are child classes of. As String
field types are no longer supported, this check can be removed.
2018-03-28 10:20:20 +02:00
Tanguy Leroux 36f8531bf4
Don't load global state when only restoring indices (#29239)
Restoring a snapshot, or getting the status of finished
snapshots, currently always load the global state metadata
 file from the repository even if it not required. This
slows down the restore process (or listing statuses process)
 and can also be an issue if the global state cannot be
deserialized (because it has unknown customs for example).

This commit splits the Repository.getSnapshotMetadata()
method into two distincts methods: getGlobalMetadata()
and getIndexMetadata() that are now called only when needed.
2018-03-28 09:35:05 +02:00
Lee Hinman eebda6974d
Decouple NamedXContentRegistry from ElasticsearchException (#29253)
* Decouple NamedXContentRegistry from ElasticsearchException

This commit decouples `NamedXContentRegistry` from using either
`ElasticsearchException`, `ParsingException`, or `UnknownNamedObjectException`.

This will allow us to move NamedXContentRegistry to its own lib as part of the
xcontent extraction work.

Relates to #28504
2018-03-27 16:51:31 -06:00
Lee Hinman 7df66abaf5 [TEST] Fix issue with HttpInfo passed invalid parameter
HttpInfo is passed the maxContentLength as a parameter, but this value should
never be negative. This fixes the test to only pass a positive random value.
2018-03-27 14:20:06 -06:00
Lee Hinman b4c78019b0
Remove all dependencies from XContentBuilder (#29225)
* Remove all dependencies from XContentBuilder

This commit removes all of the non-JDK dependencies from XContentBuilder, with
the exception of `CollectionUtils.ensureNoSelfReferences`. It adds a third
extension point around dealing with time-based fields and formatters to work
around the Joda dependency.

This decoupling allows us to be able to move XContentBuilder to a separate lib
so it can be available for things like the high level rest client.

Relates to #28504
2018-03-27 12:58:22 -06:00
Jim Ferenczi 3db6f1c9d5 Fix sporadic failure in CompositeValuesCollectorQueueTests
This commit fixes a test bug that causes an NPE on empty segments.

Closes #29269
2018-03-27 20:11:21 +02:00
Jim Ferenczi 2aaa057387
Propagate ignore_unmapped to inner_hits (#29261)
In 5.2 `ignore_unmapped` was added to `inner_hits` in order to ignore invalid mapping.
This value was automatically set to the value defined in the parent query (`nested`, `has_child`, `has_parent`) but the refactoring of the parent/child in 5.6 removed this behavior unintentionally.
This commit restores this behavior but also makes sure that we always automatically enforce this value when the query builder is used directly (previously this was only done by the XContent deserialization).

Closes #29071
2018-03-27 18:55:42 +02:00
Nhat Nguyen dfc9e721d8 TEST: Increase timeout for testPrimaryReplicaResyncFailed
The default timeout (eg. 10 seconds) may not be enough for CI to
re-allocate shards after the partion is healed. This commit increases
the timeout to 30 seconds and enables logging in order to have more
detailed information in case this test failed again.

Closes #29060
2018-03-27 12:18:09 -04:00
Nhat Nguyen d1d3edf156 TEST: Use different translog dir for a new engine
In #testPruneOnlyDeletesAtMostLocalCheckpoint, we create a new engine
but mistakenly use the same translog directory of  the existing engine.
This prevents translog files from cleaning up when closing the engines.

ERROR   0.12s J2 | InternalEngineTests.testPruneOnlyDeletesAtMostLocalCheckpoint <<< FAILURES!
   > Throwable #1: java.io.IOException: could not remove the following files (in the order of attempts):
   >    translog-primary-060/translog-2.tlog:  java.io.IOException: access denied:

This commit makes sure to use a separate directory for each engine in
this tes.
2018-03-27 09:45:51 -04:00
Christoph Büscher 8d6832c5ee
Make SearchStats implement Writeable (#29258)
Moves another class over from Streamable to Writeable. By this,
also some constructors can be removed or made private.
2018-03-27 15:21:11 +02:00
Nhat Nguyen 0ac89a32cc
Do not optimize append-only if seen normal op with higher seqno (#28787)
When processing an append-only operation, primary knows that operations 
can only conflict with another instance of the same operation. This is
true as the id was freshly generated. However this property doesn't hold
for replicas. As soon as an auto-generated ID was indexed into the
primary, it can be exposed to a search and users can issue a follow up
operation on it. In extremely rare cases, the follow up operation can be
arrived and processed on a replica before the original append-only
request. In this case we can't simply proceed with the append-only
request and blindly add it to the index without consulting the version
map. 

The following scenario can cause difference between primary and
replica.

1. Primary indexes an auto-gen-id doc. (id=X, v=1, s#=20)
2. A refresh cycle happens on primary
3. The new doc is picked up and modified - say by a delete by query
   request - Primary gets a delete doc (id=X, v=2, s#=30)
4. Delete doc is processed first on the replica (id=X, v=2, s#=30)
5. Indexing operation arrives on the replica, since it's an auto-gen-id
   request and the retry marker is lower, we put it into lucene without 
   any check. Replica has a doc the primary doesn't have.

To deal with a potential conflict between an append-only operation and a 
normal operation on replicas, we need to rely on sequence numbers. This
commit maintains the max seqno of non-append-only operations on replica
then only apply optimization for an append-only operation only if its
seq# is higher than the seq# of all non-append-only.
2018-03-26 16:56:12 -04:00
Nhat Nguyen 87957603c0
Prune only gc deletes below local checkpoint (#28790)
Once a document is deleted and Lucene is refreshed, we will not be able 
to look up the `version/seq#` associated with that delete in Lucene. As
conflicting operations can still be indexed, we need another mechanism
to remember these deletes. Therefore deletes should still be stored in
the Version Map, even after Lucene is refreshed. Obviously, we can't
remember all deletes forever so a trimming mechanism is needed.
Currently, we remember deletes for at least 1 minute (the default GC
deletes cycle) and clean them periodically. This is, at the moment, the
best we can do on the primary for user facing APIs but this arbitrary
time limit is problematic for replicas. Furthermore, we can't rely on
the primary and replicas doing the trimming in a synchronized manner,
and failing to do so results in the replica and primary making different
decisions. 

The following scenario can cause inconsistency between
primary and replica.

1. Primary index doc (index, id=1, v2)
2. Network packet issue causes index operation to back off and wait
3. Primary deletes doc (delete, id=1, v3)
4. Replica processes delete (delete, id=1, v3)
5. 1+ minute passes (GC deletes runs replica)
6. Indexing op is finally sent to the replica which no processes it 
   because it forgot about the delete.

We can reply on sequence-numbers to prevent this issue. If we prune only 
deletes whose seqno at most the local checkpoint, a replica will
correctly remember what it needs. The correctness is explained as
follows:

Suppose o1 and o2 are two operations on the same document with seq#(o1) 
< seq#(o2), and o2 arrives before o1 on the replica. o2 is processed
normally since it arrives first; when o1 arrives it should be discarded:
 
1. If seq#(o1) <= LCP, then it will be not be added to Lucene, as it was
  already previously added.

2. If seq#(o1)  > LCP, then it depends on the nature of o2:
  - If o2 is a delete then its seq# is recorded in the VersionMap,
    since seq#(o2) > seq#(o1) > LCP, so a lookup can find it and
    determine that o1 is stale.
  
  - If o2 is an indexing then its seq# is either in Lucene (if
    refreshed) or the VersionMap (if not refreshed yet), so a 
    real-time lookup can find it and determine that o1 is stale.

In this PR, we prefer to deploy a single trimming strategy, which 
satisfies both requirements, on primary and replicas because:

- It's simpler - no need to distinguish if an engine is running at
primary mode or replica mode or being promoted.

- If a replica subsequently is promoted, user experience is fully
maintained as that replica remembers deletes for the last GC cycle.

However, the version map may consume less memory if we deploy two 
different trimming strategies for primary and replicas.
2018-03-26 13:42:08 -04:00
Boaz Leskes bca264699a remove testUnassignedShardAndEmptyNodesInRoutingTable
testUnassignedShardAndEmptyNodesInRoutingTable and that test is as old as time and does a very bogus thing.
it is an IT test which extracts the GatewayAllocator from the node and tells it to allocated unassigned
shards, while giving it a conjured cluster state with no nodes in it (it uses the DiscoveryNodes.EMPTY_NODES.
This is never a cluster state we want to reroute on (we always have at least master node in it).
I'm going to just delete the test as I don't think it adds much value.

Closes #21463
2018-03-26 17:10:57 +02:00
Boaz Leskes f5d4550e93
Fold EngineDiskUtils into Store, for better lock semantics (#29156)
#28245 has introduced the utility class`EngineDiskUtils` with a set of methods to prepare/change
translog and lucene commit points. That util class bundled everything that's needed to create and
empty shard, bootstrap a shard from a lucene index that was just restored etc. 

In order to safely do these manipulations, the util methods acquired the IndexWriter's lock. That
would sometime fail due to concurrent shard store fetching or other short activities that require the
files not to be changed while they read from them. 

Since there is no way to wait on the index writer lock, the `Store` class has other locks to make
sure that once we try to acquire the IW lock, it will succeed. To side step this waiting problem, this
PR folds `EngineDiskUtils` into `Store`. Sadly this comes with a price - the store class doesn't and
shouldn't know about the translog. As such the logic is slightly less tight and callers have to do the
translog manipulations on their own.
2018-03-26 14:08:03 +02:00
Christoph Büscher 318b0af953 Remove execute mode bit from source files
Some source files seem to have the execute bit (a+x) set, which doesn't
really seem to hurt but is a bit odd. This change removes those, making
the permissions similar to other source files in the repository.
2018-03-26 13:37:55 +02:00
Jim Ferenczi 5288235ca3
Optimize the composite aggregation for match_all and range queries (#28745)
This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values
present in the leading source of the composite definition. This mode does not need to visit all documents since it can early terminate
the collection when the leading source value is greater than the lowest value in the queue.
Instead of collecting the documents in the order of their doc_id, this mode uses the inverted lists (or the bkd tree for numerics) to collect documents
in the order of the values present in the leading source.
For instance the following aggregation:

```
"composite" : {
  "sources" : [
    { "value1": { "terms" : { "field": "timestamp", "order": "asc" } } }
  ],
  "size": 10
}
```
... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents.
For composite aggregation with more than one source the execution can early terminate as soon as one of the 10 lowest values produces enough
composite buckets. For instance if visiting the first two lowest timestamp created 10 composite buckets we can early terminate the collection since it
is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the one already visited.

This mode can execute iff:
 * The leading source in the composite definition uses an indexed field of type `date` (works also with `date_histogram` source), `integer`, `long` or `keyword`.
 * The query is a match_all query or a range query over the field that is used as the leading source in the composite definition.
 * The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only).

If these conditions are not met this aggregation visits each document like any other agg.
2018-03-26 09:51:37 +02:00
Nicholas Knize fede633563 Add Z value support to geo_shape
This enhancement adds Z value support (source only) to geo_shape fields. If vertices are provided with a third dimension, the third dimension is ignored for indexing but returned as part of source. Like beofre, any values greater than the 3rd dimension are ignored.

closes #23747
2018-03-23 08:50:55 -05:00
Nhat Nguyen 794de63232
Remove type casts in logging in server component (#28807)
This commit removes type-casts in logging in the server component (other 
components will be done later). This also adds a parameterized message
test which would catch breaking-changes related to lambdas in Log4J.
2018-03-23 07:35:50 -04:00
Yu 4a8099c696 Change BroadcastResponse from ToXContentFragment to ToXContentObject (#28878)
While working on #27799, we find that it might make sense to change BroadcastResponse from ToXContentFragment to ToXContentObject, seeing that it's rather a complete XContent object and also the other Responses are normally ToXContentObject.

By doing this, we can also move the XContent build logic of BroadcastResponse's subclasses, from Rest Layer to the concrete classes themselves.

Relates to #3889
2018-03-23 10:53:37 +01:00
Milan Chovatiya 8328b9c5cd REST : Split `RestUpgradeAction` into two actions (#29124)
Closes #29062
2018-03-23 10:37:31 +01:00
Nhat Nguyen 14157c8705
Harden periodically check to avoid endless flush loop (#29125)
In #28350, we fixed an endless flushing loop which may happen on 
replicas by tightening the relation between the flush action and the
periodically flush condition.

1. The periodically flush condition is enabled only if it is disabled 
after a flush.

2. If the periodically flush condition is enabled then a flush will
actually happen regardless of Lucene state.

(1) and (2) guarantee that a flushing loop will be terminated. Sadly, 
the condition 1 can be violated in edge cases as we used two different
algorithms to evaluate the current and future uncommitted translog size.

- We use method `uncommittedSizeInBytes` to calculate current 
  uncommitted size. It is the sum of translogs whose generation at least
the minGen (determined by a given seqno). We pick a continuous range of
translogs since the minGen to evaluate the current uncommitted size.

- We use method `sizeOfGensAboveSeqNoInBytes` to calculate the future 
  uncommitted size. It is the sum of translogs whose maxSeqNo at least
the given seqNo. Here we don't pick a range but select translog one by
one.

Suppose we have 3 translogs `gen1={#1,#2}, gen2={}, gen3={#3} and 
seqno=#1`, `uncommittedSizeInBytes` is the sum of gen1, gen2, and gen3
while `sizeOfGensAboveSeqNoInBytes` is the sum of gen1 and gen3. Gen2 is
excluded because its maxSeqno is still -1.

This commit removes both `sizeOfGensAboveSeqNoInBytes` and 
`uncommittedSizeInBytes` methods, then enforces an engine to use only
`sizeInBytesByMinGen` method to evaluate the periodically flush condition.

Closes #29097
Relates ##28350
2018-03-22 14:31:15 -04:00
Jim Ferenczi c93c7f3121
Remove deprecated options for query_string (#29203)
This commit removes some parameters deprecated in 6.x (or 5.x):
`use_dismax`, `split_on_whitespace`, `all_fields` and `lowercase_expanded_terms`.

Closes #25551
2018-03-22 18:37:08 +01:00
Yu 24c8d8f5ef REST high-level client: add force merge API (#28896)
Relates to #27205
2018-03-22 17:17:16 +01:00
Lee Hinman 7d1de890b8
Decouple more classes from XContentBuilder and make builder strict (#29197)
This commit decouples `BytesRef`, `Releaseable`, and `TimeValue` from
XContentBuilder, and paves the way for doupling `ByteSizeValue` as well. It
moves much of the Lucene and Joda encoding into a new SPI extension that is
loaded by XContentBuilder to know how to encode these values.

Part of doing this also allows us to make JSON encoding strict, as we no longer
allow just any old object to be passed (in the past it was possible to get json
that was `"field": "java.lang.Object@d8355a8"` if no one was careful about what
was passed in).

Relates to #28504
2018-03-22 08:18:55 -06:00
Christoph Büscher d6d3fb3c73
Use EnumMap in ClusterBlocks (#29112)
By using EnumMap instead of an ImmutableLevelHolder array we can avoid
the using enum ordinals to index into the array.
2018-03-22 11:14:24 +01:00
Tanguy Leroux edf27a599e
Add new setting to disable persistent tasks allocations (#29137)
This commit adds a new setting `cluster.persistent_tasks.allocation.enable`
that can be used to enable or disable the allocation of persistent tasks.
The setting accepts the values `all` (default) or `none`. When set to
none, the persistent tasks that are created (or that must be reassigned)
won't be assigned to a node but will reside in the cluster state with
a no "executor node" and a reason describing why it is not assigned:

```
"assignment" : {
  "executor_node" : null,
  "explanation" : "persistent task [foo/bar] cannot be assigned [no
  persistent task assignments are allowed due to cluster settings]"
}
```
2018-03-22 09:18:07 +01:00
Nhat Nguyen 7d44d75774 Adjust PreSyncedFlushResponse bwc versions
We discussed and agreed to include the synced-flush change in 6.3.0+ but
not in 5.6.9. We will re-evaluate the urgency and importance of the
issue then decide which versions that the change should be included.
2018-03-21 16:50:35 -04:00
markharwood 93ff973afc
Tests - fix incorrect test assumption that zero-doc buckets will be returned by the adjacency matrix aggregation. Closes #29159 (#29167) 2018-03-21 10:42:14 +00:00
Jason Tedor 2f6c77337e Remove 6.1.5 version constant
The assumption here is that we will no longer be making a release from
the 6.1 branch. Since we assume that all versions on this branch are
actually released, we do not want to leave behind any versions that
would require a snapshot build. We do have a test that verifies that all
released versions are present here, so if another release is performed
from the 6.1 branch, that test will fail and we will know to add the
version constant at that time.
2018-03-21 06:28:17 -04:00
Adrien Grand 8f9d2ee4e2
Reject updates to the `_default_` mapping. (#29165)
This will reject mapping updates to the `_default_` mapping with 7.x indices
and still emit a deprecation warning with 6.x indices.

Relates #15613
Supersedes #28248
2018-03-21 10:44:11 +01:00
Nhat Nguyen f938c4267e Fix BWC issue for PreSyncedFlushResponse
I misunderstood how the bwc versions works. If we backport to 5.x, we
need to backport to all supported 6.*.  This commit corrects the BWC
versions for PreSyncedFlushResponse.

Relates #29103
2018-03-20 13:56:15 -04:00
Lee Hinman b4af451ec5
Remove BytesArray and BytesReference usage from XContentFactory (#29151)
* Remove BytesArray and BytesReference usage from XContentFactory

This removes the usage of `BytesArray` and `BytesReference` from
`XContentFactory`. Instead, a regular `byte[]` should be passed. To assist with
this a helper has been added to `XContentHelper` that will preserve the offset
and length from the underlying BytesReference.

This is part of ongoing work to separate the XContent parts from ES so they can
be factored into their own jar.

Relates to #28504
2018-03-20 11:52:26 -06:00
Lee Hinman 4bd217c94f
Add pluggable XContentBuilder writers and human readable writers (#29120)
* Add pluggable XContentBuilder writers and human readable writers

This adds the ability to use SPI to plug in writers for XContentBuilder. By
implementing the XContentBuilderProvider class we can allow Elasticsearch to
plug in different ways to encode types to JSON.

Important caveat for this, we should always try to have the class implement
`ToXContentFragment` first, however, in the case of classes from our
dependencies (think Joda classes or Lucene classes) we need a way to specify
writers for these classes.

This also makes the human-readable field writers generic and pluggable, so that
we no longer need to tie XContentBuilder to things like `TimeValue` and
`ByteSizeValue`. Contained as part of this moves all the TimeValue human
readable fields to the new `humanReadableField` method. A future commit will
move the `ByteSizeValue` calls over to this method.

Relates to #28504
2018-03-20 11:39:24 -06:00
Christoph Büscher 701625b065 Add unreleased version 6.2.4 (#29171) 2018-03-20 18:38:06 +01:00
Christoph Büscher 5a97fe75da Add unreleased version 6.1.5 (#29168) 2018-03-20 18:31:59 +01:00
Luca Cavanna ff09c82319
REST high-level client: add clear cache API (#28866)
* REST high-level client: add clear cache API

Relates to #27205

Also Closes #26947 (rest-spec were outdated)
2018-03-20 10:39:36 +01:00
Lee Hinman 687577a516 Fix javadoc warning in Strings for missing parameter description
Fixes a parameter in `Strings` that had a javadoc annotation but was missing the
description, causing warnings in the build.
2018-03-19 12:28:15 -06:00
Lee Hinman 3025295f7e
Decouple Text and Geopoint from XContentBuilder (#29119)
This removes the `Text` and `Geopoint` special handling from `XContentBuilder`.
Instead, these classes now implement `ToXContentFragment` and render themselves
accordingly.

This allows us to further decouple XContentBuilder from Elasticsearch-specific
classes so it can be factored into a standalone lib at a later time.

Relates to #28504
2018-03-19 08:54:10 -06:00
Nik Everett bf05c600c4
REST: Include suppressed exceptions on failures (#29115)
This modifies xcontent serialization of Exceptions to contain suppressed
exceptions. If there are any suppressed exceptions they are included in
the exception response by default. The reasoning here is that they are
fairly rare but when they exist they almost always add extra useful
information. Take, for example, the response when you specify two broken
ingest pipelines:

```
{
  "error" : {
    "root_cause" : ...snip...
    "type" : "parse_exception",
    "reason" : "[field] required property is missing",
    "header" : {
      "processor_type" : "set",
      "property_name" : "field"
    },
    "suppressed" : [
      {
        "type" : "parse_exception",
        "reason" : "[field] required property is missing",
        "header" : {
          "processor_type" : "convert",
          "property_name" : "field"
        }
      }
    ]
  },
  "status" : 400
}
```

Moreover, when suppressed exceptions come from 500 level errors should
give us more useful debugging information.

Closes #23392
2018-03-19 10:52:50 -04:00
Tanguy Leroux 0f93b7abdf Fix compilation errors in ML integration tests
After elastic/elasticsearch#29109, the `needsReassignment` method has
been moved to the PersistentTasksClusterService. This commit fixes
some compilation in tests I introduced.
2018-03-19 09:46:53 +01:00
Tanguy Leroux b57bd695f2
Small code cleanups and refactorings in persistent tasks (#29109)
This commit consists of small code cleanups and refactorings in the
persistent tasks framework. Most changes are in
PersistentTasksClusterService where some methods have been renamed
or merged together, documentation has been added, unused code removed
in order to improve readability of the code.
2018-03-19 09:26:17 +01:00
Nhat Nguyen f1029aaad5
getMinGenerationForSeqNo should acquire read lock (#29126)
The method Translog#getMinGenerationForSeqNo does not modify the current
translog but only access, it therefore should acquire the readLock
instead of writeLock.
2018-03-17 17:43:20 -04:00
Nhat Nguyen c9749180a1 Backport - Do not renew sync-id PR to 5.6 and 6.3
Relates ##29103
2018-03-17 11:38:22 -04:00
Jason Tedor 2e93a9158f
Align thread pool info to thread pool configuration (#29123)
Today we report thread pool info using a common object. This means that
we use a shared set of terminology that is not consistent with the
terminology used to the configure thread pools. This holds in particular
for the minimum and maximum number of threads in the thread pool where
we use the following terminology:
 thread pool info | fixed | scaling
 min                core    size
 max                max     size

This commit changes the display of thread pool info to be dependent on
the type of the thread pool so that we can align the terminology in the
output of thread pool info with the terminology used to configure a
thread pool.
2018-03-16 22:47:06 -04:00
Nhat Nguyen 22ad52a288 TEST: Adjust translog size assumption in new engine
A new engine now can have more than one empty translog since #28676.
This cause #testShouldPeriodicallyFlush failed because in the test we
asssume an engine should have one empty translog. This commit takes into
account the extra translog size of a new engine.
2018-03-16 21:50:31 -04:00
olcbean 47211c00e9 REST: Clear Indices Cache API simplify param parsing (#29111)
Simplify the parsing of the params in Clear Indices Cache API, as
a follow up to the removing of the deprecated parameter names.
2018-03-16 16:50:34 -04:00
Jason Tedor 4d62640bf1 Fix typo in ExceptionSerializationTests
This commit fixes a little typo in ExceptionSerializationTests.java
replacing "weas" by "was".
2018-03-16 15:52:39 -04:00
Jason Tedor 1f1a4d17b4 Remove BWC layer for rejected execution exception
The serialization changes for rejected execution exceptions has been
backported to 6.x with the intention to appear in all versions since
6.3.0. Therefore, this BWC layer is no longer needed in master since
master would never speak to a node that does not speak the same
serialization.
2018-03-16 14:40:17 -04:00
Jason Tedor 6bf742dd1b
Fix EsAbortPolicy to conform to API (#29075)
The rejected execution handler API says that rejectedExecution(Runnable,
ThreadPoolExecutor) throws a RejectedExecutionException if the task must
be rejected due to capacity on the executor. We do throw something that
smells like a RejectedExecutionException (it is named
EsRejectedExecutionException) yet we violate the API because
EsRejectedExecutionException is not a RejectedExecutionException. This
has caused problems before where we try to catch RejectedExecution when
invoking rejectedExecution but this causes EsRejectedExecutionException
to go uncaught. This commit addresses this by modifying
EsRejectedExecutionException to extend
RejectedExecutionException.
2018-03-16 14:34:36 -04:00
David Turner 158bb23887
Remove usages of obsolete settings (#29087)
The settings `indices.recovery.concurrent_streams` and
`indices.recovery.concurrent_small_file_streams` were removed in
f5e4cd4616. This commit removes their last traces
from the codebase.
2018-03-16 15:35:40 +00:00
Nhat Nguyen 2c1ef3d4c6
Do not renew sync-id if all shards are sealed (#29103)
Today the synced-flush always issues a new sync-id even though all
shards haven't been changed since the last seal. This causes active
shards to have different a sync-id from offline shards even though all
were sealed and no writes since then.

This commit adjusts not to renew sync-id if all active shards are sealed
with the same sync-id.

Closes #27838
2018-03-16 11:16:30 -04:00
Adrien Grand 0755ff425f
Clarify requirements of strict date formats. (#29090)
Closes #29014
2018-03-16 14:39:36 +01:00
Alan Woodward a2d5cf6514 Compilation fix for #29067 2018-03-16 13:33:25 +00:00
Alan Woodward 986e518170
Store offsets in index prefix fields when stored in the parent field (#29067)
The index prefix field is normally indexed as docs-only, given that it cannot
be used in phrases.  However, in the case that the parent field has been indexed
with offsets, or has term-vector offsets, we should also store this in the index
prefix field for highlighting.

Note that this commit does not implement highlighting on prefix fields, but
rather ensures that future work can implement this without a backwards-break
in index data.

Closes #28994
2018-03-16 11:39:46 +00:00
Tanguy Leroux f14146982f
Use removeTask instead of finishTask in PersistentTasksClusterService (#29055)
The method `PersistentTasksClusterService.finishTask()` has been
modified since it was added and does not use any `removeOncompletion`
flag anymore. Its behavior is now similar to `removeTask()` and can be
replaced by this one. When a non existing task is removed, the cluster
state update task will fail and its `source` will still indicate
`finish persistent task`/`remove persistent task`.
2018-03-16 10:20:56 +01:00
Yogesh Gaikwad a685784cea
CLI: Close subcommands in MultiCommand (#28954)
* CLI Command: MultiCommand must close subcommands to release resources properly

- Changes are done to override the close method and call close on subcommands using IOUtils#close
- Unit Test

Closes #28953
2018-03-16 09:59:23 +11:00
Nhat Nguyen c75790e7c0
TEST: write ops should execute under shard permit (#28966)
Currently ESIndexLevelReplicationTestCase executes write operations
without acquiring  index shard permit. This may prevent the primary term
on replica from being updated or cause a race between resync and
indexing on primary. This commit ensures that write operations are
always executed under shard permit like the production code.
2018-03-15 14:42:15 -04:00
Mayya Sharipova 8cb3d18eac Revert "Improve error message for installing plugin (#28298)"
This reverts commit 0cc1ffdf20

The reason is that Windows test are failing,
because of the incorrect path for the plugin
2018-03-15 10:47:50 -07:00
Adrien Grand 404e776a45
Validate regular expressions in dynamic templates. (#29013)
Today you would only get these errors at index time.

Relates #24749
2018-03-15 16:43:56 +01:00
Christoph Büscher 312ccc05d5
[Tests] Fix GetResultTests and DocumentFieldTests failures (#29083)
Changes made in #28972 seems to have changed some assumptions about how
SMILE and CBOR write byte[] values and how this is tested. This changes
the generation of the randomized DocumentField values back to BytesArray
while expecting the JSON and YAML deserialisation to produce Base64
encoded strings and SMILE and CBOR to parse back BytesArray instances.

Closes #29080
2018-03-15 16:42:26 +01:00
Adrien Grand 18d848f218
Reenable LiveVersionMapTests.testRamBytesUsed on Java 9. (#29063)
I also had to make the test more lenient. This is due to the fact that
Lucene's RamUsageTester was changed in order not to reflect `java.*`
classes and the way that it estimates ram usage of maps is by assuming
it has similar memory usage to an `Object[]` array that stores all keys
and values. The implementation in `LiveVersionMap` tries to be slightly
more realistic by taking the load factor and linked lists into account,
so it usually gives a higher estimate which happens to be closer to
reality.

Closes #22548
2018-03-15 16:39:02 +01:00
Christoph Büscher 85933161d4 Mute failing GetResultTests and DocumentFieldTests 2018-03-15 11:49:45 +01:00
Mayya Sharipova 0cc1ffdf20
Improve error message for installing plugin (#28298)
Provide more actionable error message when installing an offline plugin
in the plugins directory, and the `plugins` directory for the node
contains plugin distribution.

Closes #27401
2018-03-14 16:19:04 -07:00
Lee Hinman 8425257593 [TEST] Fix issue parsing response out of order
When parsing GetResponse it was possible that the equality check failed because
items in the map were in a different order (in the `.equals` implementation).
2018-03-14 16:34:40 -06:00
Christoph Büscher ae912cbde4
[Docs] Fix Java Api index administration usage (#28260)
The Java API documentation for index administration currenty is wrong because
the PutMappingRequestBuilder#setSource(Object... source) an
CreateIndexRequestBuilder#addMapping(String type, Object... source) methods
delegate to methods that check that the input arguments are valid key/value
pairs. This changes the docs so the java api code examples are included from
documentation integration tests so we detect compile and runtime issues earlier.

Closes #28131
2018-03-14 22:02:06 +01:00
olcbean 3d81497f25 REST: Clear Indices Cache API remove deprecated url params (#29068)
By the time the master branch is released the deprecated url
parameters in the `/_cache/clear` API will have been deprecated
for a couple of minor releases. Since master will be the next
major release we are fine with removing these parameters.
2018-03-14 16:37:50 -04:00
Boaz Leskes bf65cb4914
Untangle Engine Constructor logic (#28245)
Currently we have a fairly complicated logic in the engine constructor logic to deal with all the 
various ways we want to mutate the lucene index and translog we're opening.

We can:
1) Create an empty index
2) Use the lucene but create a new translog
3) Use both
4) Force a new history uuid in all cases.

This leads complicated code flows which makes it harder and harder to make sure we cover all the 
corner cases. This PR tries to take another approach. Constructing an InternalEngine always opens 
things as they are and all needed modifications are done by static methods directly on the 
directory, one at a time.
2018-03-14 20:59:47 +01:00
Lee Hinman 8e8fdc4f0e
Decouple XContentBuilder from BytesReference (#28972)
* Decouple XContentBuilder from BytesReference

This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.

While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).

Relates to #28504
2018-03-14 13:47:57 -06:00
David Kyle cb9d10f971
Protect against NPE in RestNodesAction (#29059)
* Protect against NPE in RestNodesAction
2018-03-14 15:47:18 +00:00
David Roberts 5bf92ca3b3
Enforce that java.io.tmpdir exists on startup (#28217)
If the default java.io.tmpdir is used then the startup script creates
it, but if a custom java.io.tmpdir is used then the user must ensure it
exists before running Elasticsearch. If they forget then it can cause
errors that are hard to understand, so this change adds an explicit
check early in the bootstrap and reports a clear error if java.io.tmpdir
is not an accessible directory.
2018-03-14 15:43:53 +00:00
Jason Tedor 24d10adaab
Main response should not have status 503 when okay (#29045)
The REST status 503 means "I can not handle the request that you sent
me." However today we respond to a main request with a 503 when there
are certain cluster blocks despite still responding with an actual main
response. This is broken, we should respond with a 200 status. This
commit removes this silliness.
2018-03-14 06:36:37 -04:00
Jason Tedor 647d0a1e95
Do not swallow fail to convert exceptions (#29043)
When converting the source for an indexing request to JSON, the
conversion can throw an I/O exception which we swallow and proceed with
logging to the slow log. The cause of the I/O exception is lost. This
commit changes this behavior and chooses to drop the entry from the slow
logs and instead lets an exception percolate up to the indexing
operation listener loop. Here, the exception will be caught and logged
at the warn level.
2018-03-13 23:42:16 -04:00
Jason Tedor 46fcd07153
Add total hits to the search slow log (#29034)
This commit adds the total hits to the search slow log.
2018-03-13 20:40:47 -04:00
Jason Tedor 4dc3adad51
Archive unknown or invalid settings on updates (#28888)
Today we can end up in a situation where the cluster state contains
unknown or invalid settings. This can happen easily during a rolling
upgrade. For example, consider two nodes that are on a version that
considers the setting foo.bar to be known and valid. Assume one of these
nodes is restarted on a higher version that considers foo.bar to now be
either unknown or invalid, and then the second node is restarted
too. Now, both nodes will be on a version that consider foo.bar to be
unknown or invalid yet this setting will still be contained in the
cluster state. This means that if a cluster settings update is applied
and we validate the settings update with the existing settings then
validation will fail. In such a state, the offending setting can not
even be removed. This commit helps out with this situation by archiving
any settings that are unknown or invalid at the time that a settings
update is applied. This allows the setting update to go through, and the
archived settings can be removed at a later time.
2018-03-13 17:32:18 -04:00
Jason Tedor c8e71327ab
Log template creation and deletion (#29027)
These can be seen at the debug level via cluster state update logging
but really they should be more visible like index creation and
deletion. This commit adds info-level logging for template puts and
deletes.
2018-03-13 16:31:19 -04:00
Jason Tedor 697b9f8b82
Remove interning from prefix logger (#29031)
This interning is completely unnecessary because we look up the marker
by the prefix (value, not identity) anyway. This means that regardless
of the identity of the prefix, we end up with the same marker. That is
all that we really care about here.
2018-03-13 16:30:13 -04:00
olcbean edc57f6f34 REST: deprecate `field_data` in Clear Cache API (#28943)
We call it `fielddata` everywhere else in the code and API so we may as
well be consistent.
2018-03-13 15:16:27 -04:00
Jason Tedor 5904d936fa
Copy Lucene IOUtils (#29012)
As we have factored Elasticsearch into smaller libraries, we have ended
up in a situation that some of the dependencies of Elasticsearch are not
available to code that depends on these smaller libraries but not server
Elasticsearch. This is a good thing, this was one of the goals of
separating Elasticsearch into smaller libraries, to shed some of the
dependencies from other components of the system. However, this now
means that simple utility methods from Lucene that we rely on are no
longer available everywhere. This commit copies IOUtils (with some small
formatting changes for our codebase) into the fold so that other
components of the system can rely on these methods where they no longer
depend on Lucene.
2018-03-13 12:49:33 -04:00
Jason Tedor 6088af5887 Fix comment regarding removal of requiresKeystore
The requiresKeystore flag was removed from PluginInfo in 6.3.0. This
commit fixes a pair of code comments that incorrectly refer to this
version as 7.0.0.
2018-03-12 14:20:02 -04:00
Jason Tedor b8e165a994 Fix BWC versions on plugin info
This commit fixes the BWC versions on the plugin info serialization
which was changed to remove the requiresKeystore flag.
2018-03-12 13:05:48 -04:00
Jason Tedor 6331bcaf76
Create keystore on package install (#28928)
This commit removes the ability to specify that a plugin requires the
keystore and instead creates the keystore on package installation or
when Elasticsearch is started for the first time. The reason that we opt
to create the keystore on package installation is to ensure that the
keystore has the correct permissions (the package installation scripts
run as root as opposed to Elasticsearch running as the elasticsearch
user) and to enable removing the keystore on package removal if the
keystore is not modified.
2018-03-12 12:48:00 -04:00
Mika⠙ a7b53fd3b7 Add check when trying to reroute a shard to a non-data discovery node (#28886)
While trying to reroute a shard to or from a non-data node (a node with ``node.data=false``), I encountered a null pointer exception. Though an exception is to be expected, the NPE was occurring because ``allocation.routingNodes()`` would not contain any non-data nodes, so when you attempt to do ``allocation.routingNodes.node(non-data-node)`` it would not find it, and thus error. This occurred regardless of whether I was rerouting to or from a non-data node.

This PR adds a check (as well as a test for these use cases) to return a legible, useful exception if the discovery node you are rerouting to or from is not a data node.
2018-03-12 16:48:51 +01:00
Jason Tedor b1b469e30f
Avoid class cast exception from index writer (#28989)
When an index writer encounters a tragic exception, it could be a
Throwable and not an Exception. Yet we blindly cast the tragic exception
to an Exception which can encounter a ClassCastException. This commit
addresses this by checking if the tragic exception is an Exception and
otherwise wrapping the Throwable in a RuntimeException if it is not. We
choose to wrap the Throwable instead of passing it around because
passing it around leads to changing a lot of places where we handle
Exception to handle Throwable instead. In general, we have tried to
avoid handling Throwable and instead let those bubble up to the uncaught
exception handler.
2018-03-12 08:42:02 -04:00
Yannick Welsch be7f5dde24
Disallow logger methods with Object parameter (#28969)
Log4j2 provides a wide range of logging methods. Our code typically only uses a subset of them. In particular, uses of the methods trace|debug|info|warn|error|fatal(Object) or trace|debug|info|warn|error|fatal(Object, Throwable) have all been wrong, leading to not properly logging the provided message. To prevent these issues in the future, the corresponding Logger methods have been blacklisted.
2018-03-12 03:05:24 -07:00
Jim Ferenczi 7afe5ad943
Restore tiebreaker for cross fields query (#28935)
This commit restores the handling of tiebreaker for multi_match
cross fields query. This functionality was lost during a refactoring
of the multi_match query (#25115).

Fixes #28933
2018-03-12 09:58:20 +01:00
Ryan Ernst 4216fc9f64
Plugins: Allow modules to spawn controllers (#28968)
This commit makes the controller spawner also look under modules. It
also fixes a bug in module security policy loading where the module is a
meta plugin.
2018-03-11 09:01:27 -07:00
Nhat Nguyen 4f644d04a3 TEST: Use non-zero number for #testCompareUnits
In `ByteSizeValueTests#testCompareUnits`, we expect non-zero for the
variable `number` however `randomNonNegativeLong` can return zero.

CI: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.2+oracle-java10-periodic/147/console
2018-03-10 22:56:22 -05:00
Jason Tedor 4ba80a7952
Maybe die before failing engine (#28973)
Today we check for a few cases where we should maybe die before failing
the engine (e.g., when a merge fails). However, there are still other
cases where a fatal error can be hidden from us (for example, a failed
index writer commit). This commit modifies the mechanism for failing the
engine to always check for a fatal error before failing the engine.
2018-03-10 07:41:51 -05:00
Jason Tedor 950c4363bf
Remove special handling for _all in nodes info
Today when requesting _all we return all nodes regardless of what other
node qualifiers are in the request. This is contrary to how the
remainder of the API behaves which acts as additive and subtractive
based on the qualifiers and their ordering. It is also contrary to how
the wildcard * behaves. This commit removes the special handling for
_all so that it behaves identical to the wildcard *.

Relates #28971
2018-03-09 18:23:16 -05:00
Lee Hinman 97b513e925
Remove Booleans use from XContent and ToXContent (#28768)
* Remove Booleans use from XContent and ToXContent

This removes the use of the `common.Boolean` class from two of the XContent
classes, so they can be decoupled from the ES code as much as possible.

Related to #28754, #28504
2018-03-09 14:58:54 -07:00
Nhat Nguyen 4973887a10
Make primary-replica resync failures less lenient (#28534)
Today, failures from the primary-replica resync are ignored as the best 
effort to not mark shards as stale during the cluster restart. However
this can be problematic if replicas failed to execute resync operations
but just fine in the subsequent write operations. When this happens,
replica will miss some operations from the new primary. There are some
implications if the local checkpoint on replica can't advance because of
the missing operations.

1. The global checkpoint won't advance - this causes both primary and 
replicas keep many index commits

2. Engine on replica won't flush periodically because uncommitted stats
is calculated based on the local checkpoint

3. Replica can use a large number of bitsets to keep track operations seqno

However we can prevent this issue but still reserve the best-effort by 
failing replicas which fail to execute resync operations but not mark
them as stale. We have prepared to the required infrastructure in #28049
and #28054 for this change.

Relates #24841
2018-03-09 09:55:45 -08:00
Martijn van Groningen b32e999960 Use different pipeline id in test.
(pipelines do not get removed between tests extending from ESIntegTestCase)
2018-03-09 14:29:42 +01:00
David Turner 033a83b98b
Use String.join() to describe a list of tasks (#28941)
This change replaces the use of string concatenation with a call to
String.join(). String concatenation might be quadratic, unless the compiler can
optimise it away, whereas String.join() is more reliably linear. There can
sometimes be a large number of pending ClusterState update tasks and #28920
includes a report that this operation sometimes takes a long time.
2018-03-09 09:42:44 +00:00
Martijn van Groningen 41519da45a Fixed incorrect test try-catch statement 2018-03-09 09:38:16 +01:00
Ryan Ernst 62293ec1c9
Plugins: Consolidate plugin and module loading code (#28815)
At one point, modules and plugins were very different. But effectively
now they are the same, just from different directories. This commit
unifies the loading methods so they are simply two different
directories. Note that the main codepath to load plugin bundles had
duplication (was not calling getPluginBundles) since previous
refactorings to add meta plugins. Note this change also rewords the
primary exception message when a plugin descriptor is missing, as the
wording asking if the plugin was built before 2.0 isn't really
applicable anymore (it is highly unlikely someone tries to install a 1.x
plugin on any modern version).
2018-03-08 22:49:27 -08:00
Lee Hinman 46a79127ed
Remove FastStringReader in favor of vanilla StringReader (#28944)
This allows us to remove another dependency in the decoupling of the XContent
code. Rather than move this class over or decouple it, it can simply be removed.

Relates tangentially to #28504
2018-03-08 17:17:36 -07:00
Lee Hinman d6d7ee7320
Remove FastCharArrayReader and FastCharArrayWriter (#28951)
These classes are used only in two places, and can be replaced by the
`CharArrayReader` and `CharArrayWriter`. The JDK can also perform lock biasing
and elision as well as escape analysis to optimize away non-contended locks,
rendering their lock-free implementations unnecessary.
2018-03-08 17:05:11 -07:00
Tal Levy 7784c1bff9
Continue registering pipelines after one pipeline parse failure. (#28752)
Ingest has been failing to apply existing pipelines from cluster-state
into the in-memory representation that are no longer valid. One example of
this is a pipeline with a script processor. If a cluster starts up with scripting
disabled, these pipelines will not be loaded. Even though GETing a pipeline worked,
indexing operations claimed that this pipeline did not exist. This is because one
gets information from cluster-state and the other is from an in-memory data-structure.

Now, two things happen
1. suppress the exceptions until after other successful pipelines are loaded
2. replace failed pipelines with a placeholder pipeline

If the pipeline execution service encounters the stubbed pipeline, it is known that
something went wrong at the time of pipeline creation and an exception was thrown to
the user at some point at start-up.

closes #28269.
2018-03-08 15:22:59 -08:00
Lee Hinman 17fc07a193
Switch XContentBuilder from BytesStreamOutput to ByteArrayOutputStream (#28945)
This switches the underlying byte output representation used by default in
`XContentBuilder` from `BytesStreamOutput` to a `ByteArrayOutputStream` (an
`OutputStream` can still be specified manually)

This is groundwork to allow us to decouple `XContent*` from the rest of the ES
core code so that it may be factored into a separate jar.

Since `BytesStreamOutput` was not using the recycling instance of `BigArrays`,
this should not affect the circuit breaking capabilities elsewhere in the
system.

Relates to #28504
2018-03-08 15:45:51 -07:00
Lee Hinman 697f3f1a3b
Factor UnknownNamedObjectException into its own class (#28931)
* Factor UnknownNamedObjectException into its own class

This moves the inner class `UnknownNamedObjectException` from
`NamedXContentRegistry` into a top-level class. This is so that
`NamedXContentRegistry` doesn't have to depend on StreamInput and StreamOutput.

Relates to #28504
2018-03-08 15:32:41 -07:00
Lee Hinman ec92796ed8
Remove now-unused createParser that uses BytesReference (#28926)
This removes `BytesReference` use from XContent and all subclasses.

Relates to #28504
2018-03-08 09:10:21 -07:00
Jim Ferenczi bc8b3fc71c Revert "Rescore collapsed documents (#28521)"
This reverts commit f057fc294a.
The rescorer does not resort the collapsed values inside the top docs
during rescoring. For this reason the Lucene rescorer is not compatible
with collapsing.
Relates #27243
2018-03-08 11:20:29 +01:00
Lee Hinman 9d4f09db68 [TEST] AwaitsFix QueryRescorerIT.testRescoreAfterCollapse
See: https://github.com/elastic/elasticsearch/issues/28932
2018-03-07 15:32:29 -07:00
Lee Hinman 818920a281
Decouple XContentType from StreamInput/Output (#28927)
This removes the readFrom and writeTo methods from XContentType, instead using
the more generic `readEnum` and `writeEnum` methods. Luckily they are both
encoded exactly the same way, so there is no compatibility layer needed for
backwards compatibility.

Relates to #28504
2018-03-07 14:50:30 -07:00
Lee Hinman 2d1d6503a4
Remove BytesRef usage from XContentParser and its subclasses (#28792)
* Remove BytesRef usage from XContentParser and its subclasses

This removes all the BytesRef usage from XContentParser in favor of directly
returning a CharBuffer (this was originally what was returned, it was just
immediately wraped in a BytesRef).

Relates to #28504

* Rename method after Ryan's feedback
2018-03-07 10:09:56 -07:00
Lee Hinman e7d1e12675
Wrap stream passed to createParser in try-with-resources (#28897)
* Wrap stream passed to createParser in try-with-resources

This wraps the stream (`.streamInput()`) that is passed to many of the
`createParser` instances in the enclosing (or a new) try-with-resources block.
This ensures the `BytesReference.streamInput()` is closed.

Relates to #28504

* Use try-with-resources instead of closing in a finally block
2018-03-04 16:48:03 -07:00
Sergey Galkin f057fc294a Rescore collapsed documents (#28521)
This change adds the ability to rescore collapsed documents.
2018-03-04 13:39:50 -08:00
Jim Ferenczi c26bd6046b
Fix (simple)_query_string to ignore removed terms (#28871)
This change ensures that we ignore terms removed from the analysis rather than returning a match_no_docs query for the part
that contain the stop word. For instance a query like "the AND fox" should ignore "the" if it is considered as a stop word instead of
adding a match_no_docs query.
This change also fixes the analysis of prefix terms that start with a stop word (e.g. `the*`). In such case if `analyze_wildcard` is true and `the`
is considered as a stop word this part of the query is rewritten into a match_no_docs query. Since it's a prefix query this change forces the prefix query
on `the` even if it is removed from the analysis.

Fixes #28855
Fixes #28856
2018-03-04 13:25:46 -08:00
Simon Willnauer 5a1d9f33a0
Try if tombstone is eligable for pruning before locking on it's key (#28767)
Pruning tombstones is quite expensive since we have to walk though all
deletes in the live version map and acquire a lock on every value even though
it's impossible to prune it. This change does a pre-check if a delete is old enough
and if not it skips acquireing the lock.
2018-03-04 11:31:13 -08:00
Mayya Sharipova f53d159aa1
Limit analyzed text for highlighting (improvements) (#28808)
Increase the default limit of `index.highlight.max_analyzed_offset` to 1M instead of previous 10K.

Enhance an error message when offset increased to include field name, index name and doc_id.

Relates to https://github.com/elastic/kibana/issues/16764
2018-03-02 08:09:05 -08:00
olcbean 472acf7833 [DOCS] fix put_mapping snippet (#28814)
Add a java snippet to be run in an integration test
in order to guarantee that the snippet is correct

Closes #28778
2018-03-01 10:59:52 +01:00
Luca Cavanna 184a8718d8
REST high-level client: add flush API (#28852)
Relates to #27205
2018-03-01 10:56:03 +01:00
Yu 95dea2408d Add Refresh API for RestHighLevelClient (#27799)
Relates to #27205
2018-02-28 11:49:14 +01:00
lzh3636 929dba8667 Fix log message when virtual lock not possible (#28829)
When virtual lock is not possible because JNA is unavailable, we log a
warning message. Yet, this log message refers to mlockall rather than
virtual lock, presumably because of a copy/paste error. This commit
fixes this issue.
2018-02-26 06:55:32 -08:00
Nhat Nguyen f1a94de9e7
Replace log4j.Supplier by jdk’s in non-logging usage (#28812)
This commit replaces `org.apache.logging.log4j.util.Supplier` by
`java.util.function.Supplier` in non-logging code. These usages are
neither incorrect nor wrong but rather than accidental. I think our
intention was to use the JDK's Supplier in these places.
2018-02-24 14:28:32 -05:00
Ke Li a77273fc01 Reject regex search if regex string is too long (#28542)
* Reject regex search if regex string is too long (#28344)

* Add docs

* Introduce index level setting `index.max_regex_length`
 to control the maximum length of the regular expression

Closes #28344
2018-02-23 10:41:24 -08:00
Luca Cavanna cd3d9c9f80
[TEST] share code between streamable/writeable/xcontent base test classes (#28785)
Today we have two test base classes that have a lot in common when it comes to testing wire and xcontent serialization: `AbstractSerializingTestCase` and `AbstractXContentStreamableTestCase`. There are subtle differences though between the two, in the way they work, what can be overridden and features that they support (e.g. insertion of random fields).

This commit introduces a new base class called `AbstractWireTestCase` which holds all of the serialization test code in common between `Streamable` and `Writeable`. It has two minimal subclasses called `AbstractWireSerializingTestCase` and `AbstractStreamableTestCase` which are specialized for `Writeable` and `Streamable`.

This commit also introduces a new test class called `AbstractXContentTestCase` for all of the xContent testing, which holds a testFromXContent method for parsing and rendering to xContent. This one can be delegated to from the existing `AbstractStreamableXContentTestCase` and `AbstractSerializingTestCase` so that we avoid code duplicate as much as possible and all these base classes offer the same functionalities in the same way. Having this last base class decoupled from the serialization testing may also help with the REST high-level client testing, as there are some classes where it's hard to implement equals/hashcode and this makes it possible to override `assertEqualInstances` for custom equality comparisons (also this base class doesn't require implementing equals/hashcode as it doesn't test such methods.
2018-02-23 10:48:48 +01:00
Nicholas Knize 3728c50d85 [GEO] Fix points_only indexing failure for GeoShapeFieldMapper
This commit fixes a bug that was introduced in PR #27415 for 6.1
and 7.0 where a change to support MULTIPOINT shapes mucked up
indexing of standalone points.
2018-02-22 20:57:09 -06:00
Lee Hinman 5bb79558e7
Decouple XContentGenerator and JsonXContentGenerator from BytesReference (#28772)
This removes the link these two classes have with BytesReference, in favor of an
`InputStream` approach.

Relates to #28504
2018-02-22 14:22:37 -07:00
jxy 497b3d7a20 Fix node ID reported by ThrottlingAllocationDecider (#28779)
Previously the message reported when `node_concurrent_outgoing_recoveries`
resulted in a `THROTTLE` decision included the reporting node's ID rather than
that of the primary. This commit fixes that.

Fixes #28777.
2018-02-22 19:23:15 +00:00
Luca Cavanna 1df711c5b7
Remove AcknowledgedRestListener in favour of RestToXContentListener (#28724)
This commit makes AcknowledgedResponse implement ToXContentObject, so that the response knows how to print its own content out to XContent, which allows us to remove AcknowledgedRestListener.
2018-02-22 09:13:30 +01:00
Jason Tedor 3dfb4b8b18 Skip some plugins service tests on Windows
These tests need to be skipped. They cause plugins to be loaded which
causes a child classloader to be opened. We do not want to add the
permissions to be able to close a classloader solely for these tests,
and the JARs created in the test can not be deleted on Windows until the
classloader is closed. Since this will not happen before test teardown,
the test will fail on Windows. So, we skip these tests.
2018-02-21 15:22:27 -05:00
Luca Cavanna 8b4a298874
Migrate some *ResponseTests to AbstractStreamableXContentTestCase (#28749)
This allows us to save a bit of code, but also adds more coverage as it tests serialization which was missing in some of the existing tests. Also it requires implementing equals/hashcode and we get the corresponding tests for them for free from the base test class.
2018-02-21 20:04:12 +01:00
Lee Hinman d7eae4b90f
Pass InputStream when creating XContent parser (#28754)
* Pass InputStream when creating XContent parser

Rather than passing the raw `BytesReference` in when creating the xcontent
parser, this passes the StreamInput (which is an InputStream), this allows us to
decouple XContent from BytesReference.

This also removes the use of `commons.Booleans` so it doesn't require more
external commons classes.

Related to #28504

* Undo boolean removal

* Enhance deprecation javadoc
2018-02-21 11:03:25 -07:00
Yu 7d8fb69d50 version set in ingest pipeline (#27573)
Add support version and version_type in ingest pipelines

Add support for setting document version and version type in set
processor of an ingest pipeline.
2018-02-21 09:34:51 +01:00
Michael Basnight eaa6b41b03 Add 5.6.9 snapshot version 2018-02-20 12:16:02 -06:00
Jim Ferenczi 5991e977d2 Add unreleased v6.2.3 version 2018-02-20 17:49:41 +01:00
Lee Hinman d4fddfa2a0
Remove log4j dependency from elasticsearch-core (#28705)
* Remove log4j dependency from elasticsearch-core

This removes the log4j dependency from our elasticsearch-core project. It was
originally necessary only for our jar classpath checking. It is now replaced by
a `Consumer<String>` so that the es-core dependency doesn't have external
dependencies.

The parts of #28191 which were moved in conjunction (like `ESLoggerFactory` and
`Loggers`) have been moved back where appropriate, since they are not required
in the core jar.

This is tangentially related to #28504

* Add javadocs for `output` parameter

* Change @code to @link
2018-02-20 09:15:54 -07:00
Simon Willnauer b00870600b
Never block on key in `LiveVersionMap#pruneTombstones` (#28736)
Pruning tombstones is best effort and should not block if a key is currently
locked. This can cause a deadlock in rare situations if we switch of append
only optimization while heavily updating the same key in the engine
while the LiveVersionMap is locked. This is very rare since this code
patch only executed every 15 seconds by default since that is the interval
we try to prune the deletes in the version map.

Closes #28714
2018-02-20 16:35:05 +01:00
Luca Cavanna 8bbb3c9ffa
REST high-level client: add support for Rollover Index API (#28698)
Relates to #27205
2018-02-20 15:58:58 +01:00
Jason Tedor 94594f19ab
Fix handling of mandatory meta plugins
This commit fixes an issue with setting plugin.mandatory to include a
meta-plugin. The issue here is that the names that we collect are the
underlying plugins, not the meta-plugin. We should not use the
underlying plugins instead using the names of non-meta plugins and the
names of meta-plugins. This commit addresses this. The strategy here is
that when we look at the installed plugins on the filesystem, we keep
track of which ones are meta-plugins and carry this information up to
where check which plugins are installed against the mandatory plugins.

Relates #28710
2018-02-20 08:57:04 -05:00
Nhat Nguyen 0c2871c4d2
Replace CAS loop by updateAndGet to improve readability
Relates #28737
2018-02-20 08:03:24 -05:00
Simon Willnauer 779bc6fd5c
Simplify Engine.Searcher creation (#28728)
Today we have several levels of indirection to acquire an Engine.Searcher.
We first acquire a the reference manager for the scope then acquire an
IndexSearcher and then create a searcher for the engine based on that.
This change simplifies the creation into a single method call instead of
3 different ones.
2018-02-20 09:35:49 +01:00
Simon Willnauer 13a8ba4740 [TEST] Fix flaky IndexServiceTests#testRefreshActuallyWorks 2018-02-20 09:34:20 +01:00
Jason Tedor 105dcb544c
Enable selecting adaptive selection stats
The node stats API enables filtlering the top-level stats for only
desired top-level stats. Yet, this was never enabled for adaptive
replica selection stats. This commit enables this. We also add setting
these stats on the request builder, and fix an inconsistent name in a
setter.

Relates #28721
2018-02-19 16:56:36 -05:00
Nhat Nguyen ff2164c4f9
Revisit deletion policy after release the last snapshot (#28627)
We currently revisit the index deletion policy whenever the global
checkpoint has advanced enough. We should also revisit the deletion
policy after releasing the last snapshot of a snapshotting commit. With 
this change, the old index commits will be cleaned up as soon as
possible.

Follow-up of #28140
https://github.com/elastic/elasticsearch/pull/28140#discussion_r162458207
2018-02-19 11:39:15 -05:00
Simon Willnauer 8325786b33 Remove unused method 2018-02-19 12:24:28 +01:00
Simon Willnauer 56edb5eb3a
Track deletes only in the tombstone map instead of maintaining as copy (#27868)
Today we maintain a copy of every delete in the live version maps. This is unnecessary
and might add quite some overhead if maps grow large. This change moves out the deletes
tracking into the tombstone map only and relies on the cleaning of tombstones when deletes
are collected.
2018-02-19 12:23:38 +01:00
Yannick Welsch c0026648f0
Fix AdaptiveSelectionStats serialization bug (#28718)
The AdaptiveSelectionStats object serializes the clientOutgoingConnections map that's concurrently updated in SearchTransportService. Serializing the map consists of first writing the size of the map and then serializing the entries. If the number of entries changes while the map is being serialized, the size and number of entries go out of sync. The deserialization routine expects those to be in sync though.

Closes #28713
2018-02-19 10:16:29 +01:00
Nhat Nguyen df07943522 TEST: Fix InternalEngine#testAcquireIndexCommit
The acquireIndexCommit was separated into acquireSafeIndexCommit and
acquireLastIndexCommit, however the test was not updated accordingly.

Relates #28271
2018-02-17 09:49:14 -05:00
Nhat Nguyen 84fd39f5bb
Separate acquiring safe commit and last commit (#28271)
Previously we introduced a new parameter to `acquireIndexCommit` to
allow acquire either a safe commit or a last commit. However with the
new parameters, callers can provide a nonsense combination - flush first
but acquire the safe commit. This commit separates acquireIndexCommit
method into two different methods to avoid that problem. Moreover, this
change should also improve the readability.

Relates #28038
2018-02-16 21:25:58 -05:00
Nhat Nguyen 2f011295ec Backported the translog files age stats to v6.3.0
Relates #28613
2018-02-16 12:07:08 -05:00
Lee Hinman 0dd79028c9
Remove deprecated createParser methods (#28697)
* Remove deprecated createParser methods

This removes the final instances of the callers of `XContent.createParser` and
`XContentHelper.createParser` that did not pass in the `DeprecationHandler`. It
also removes the now-unused deprecated methods and fully removes any mention of
Log4j or LoggingDeprecationHandler from the XContent code.

Relates to #28504

* Add comments in JsonXContentGenerator
2018-02-16 08:26:30 -07:00
Justin Wyer 5aeb479ffd Add translog file age to Translog Stats (#28613)
Expose the age of translog files in the translog stats. This is useful to reason about your translog retention policy.

Closes #28189
2018-02-16 16:23:33 +01:00
Jason Tedor 57a56d8e64 Fix test concurrent remote connection updates
This test has a race condition. The action listener used to listen for
connections has a guard against being executed twice. However, this
listener can be executed twice. After on success is invoked the test
starts to tear down. At this point, the threads the test forked will
terminate and the remote cluster connection will be closed. However, a
thread forked to the management thread pool by the remote cluster
connection can still be executing and try to continue connecting. This
thread will be cancelled when the remote cluster connection is closed
and this leads to the action listener being invoked again. To address
this, we explicitly check that the reason that on failure was invoked
was cancellation, and we assert that the listener was already previously
invoked. Interestingly, this issue has always been present yet a recent
change (#28667) exposed errors that occur on tasks submitted to the
thread pool and were silently being lost.

Relates #28695
2018-02-16 07:30:15 -05:00
Andy Bristol 70b279dbbc [TEST] AwaitsFix testTriggerUpdatesConcurrently 2018-02-15 16:34:24 -08:00
Lee Hinman d90a440bf7
Add XContentHelper shim for move to passing in deprecation handler (#28684)
In order to allow us to gradually move to passing the deprecation handler is, we
need a shim that contains both the non-passed and passed version.

Relates to #28504
2018-02-15 11:01:01 -07:00
Jason Tedor 3e846ab251
Handle throws on tasks submitted to thread pools
When we submit a task to a thread pool for asynchronous execution, we
are returned a future. Since we submitted to go asynchronous, these
futures are not inspected for failure (we would have to block a thread
to do that). While we have on failure handlers for exceptions that are
thrown during execution, we do not handle throwables that are not
exceptions and these end up silently lost. This commit adds a check
after the runnable returns that inspects the status of the future. If an
unhandled throwable occurred during execution, this throwable is
propogated out where it will land in the uncaught exception handler.

Relates #28667
2018-02-15 11:59:12 -05:00
olcbean 02fc16f10e Add Cluster Put Settings API to the high level REST client (#28633)
Relates to #27205
2018-02-15 17:21:45 +01:00
Jason Tedor 671e7e2f00
Lift error finding utility to exceptions helpers
We have code used in the networking layer to search for errors buried in
other exceptions. This code will be useful in other locations so with
this commit we move it to our exceptions helpers.

Relates #28691
2018-02-15 09:48:52 -05:00
Boaz Leskes beb55d148a
Simplify the Translog constructor by always expecting an existing translog (#28676)
Currently the Translog constructor is capable both of opening an existing translog and creating a
new one (deleting existing files). This PR separates these two into separate code paths. The
constructors opens files and a dedicated static methods creates an empty translog.
2018-02-15 09:24:09 +01:00
Ke Li fc406c9a5a Upgrade t-digest to 3.2 (#28295) (#28305) 2018-02-15 08:23:20 +00:00
Jason Tedor cd54c96d56 Add comment explaining lazy declared versions
A recent change moved computing declared versions from using reflection
which occurred repeatedly to a lazily-initialized holder so that
declared versions are computed exactly once. This commit adds a comment
explaining the motivation for this change.
2018-02-14 23:15:59 -05:00
Nhat Nguyen 452bfc0d83 Backported synced-flush PR to v5.6.8 and v6.2.2
Relates #28464
2018-02-14 14:48:29 -05:00
Lee Hinman b59b1cf59d
Move more XContent.createParser calls to non-deprecated version (#28672)
* Move more XContent.createParser calls to non-deprecated version

Part 2

This moves more of the callers to pass in the DeprecationHandler.

Relates to #28504

* Use parser's deprecation handler where appropriate

* Use logging handler in test that uses deprecated field on purpose
2018-02-14 11:24:48 -07:00
Lee Hinman 7c1f5f5054
Move more XContent.createParser calls to non-deprecated version (#28670)
* Move more XContent.createParser calls to non-deprecated version

This moves more of the callers to pass in the DeprecationHandler.

Relates to #28504

* Use parser's deprecation handler where available
2018-02-14 09:01:40 -07:00
Tal Levy 6c7d12c34c [TEST] bump timeout in testFetchShardsSkipUnavailable to 5s
in response to #28668.
2018-02-13 13:09:43 -08:00
Lee Hinman 7c201a64b5 [TEST] Synchronize searcher list in IndexShardTests
It's possible to check the list size, then attempt to remove a searcher and
throw an IndexOutOfBoundsException due to multiple threads.

Resolves #27651
2018-02-13 12:16:04 -07:00
Scott Somerville a138e0e225 Compute declared versions in a static block
This method is called often enough (when computing minimum compatibility
versions) that the reflection and sort can be seen while profiling. This
commit addresses this issue by computing the declared versions exactly
once.

Relates #28661
2018-02-13 13:24:19 -05:00
Jim Ferenczi 3b9f530839
Inc store reference before refresh (#28656)
If a tragic even happens while we are refreshing a searcher/reader the engine can open new files on a store that is already closed
For instance the following CI job failed because a merge was concurrently called on a failing shard:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+oracle-java10-periodic/84
This change increments the ref count of the store during a refresh in order to postpone the closing after a tragic event.
2018-02-13 15:38:02 +01:00
Jim Ferenczi 813d8e1f7e Fix meta plugin installation that contains plugins with dependencies
When installing a meta plugin we check the dependency of each sub plugin during the installation.
Though if the extended plugin is part of the meta plugin the installation fails because we only check for plugins that are
already installed. This change is a workaround that extracts all plugins (even those that are not fully installed yet) when the dependency check
is made during the installation. Note that this is how the plugin installation worked before https://github.com/elastic/elasticsearch/pull/28581.
2018-02-13 11:43:34 +01:00
Ryan Ernst ea381969be
Plugins: Separate plugin semantic validation from properties format validation (#28581)
This commit moves the semantic validation (like which version a plugin
was built for or which java version it is compatible with) from reading
a plugin descriptor, leaving the checks on the format of the descriptor
intact.

relates #28540
2018-02-12 21:30:11 -08:00
Robin Neatherway 282974215c MetaDataIndexAliasesService wrong get type (#28614)
A get of the wrong type would always have returned null so these
indices would have been inserted into the map repeatedly.
2018-02-12 15:55:17 -08:00
Robin Neatherway 3174b2cbfa NXYSignificanceHeuristic equality update (#28616)
* NXYSignificanceHeuristic.java: implementation of equality would have
  failed with a ClassCastException when comparing to another type.
  Replaced with the Eclipse generated form.
2018-02-12 15:51:57 -08:00
Robin Neatherway 68b7a5c281 Fix DeadlockAnalyzer printer (#28615)
Remove `if` block that was always true.
2018-02-12 15:28:56 -08:00
Nhat Nguyen 9eb9ce3843
Require translogUUID when reading global checkpoint (#28587)
Today we use the persisted global checkpoint to calculate the starting 
seqno in peer-recovery. However we do not check whether the translog 
actually belongs to the existing Lucene index when reading the global
checkpoint. In some rare cases if the translog does not match the Lucene
index, that recovering replica won't be able to complete its recovery.
This can happen as follows.

1. Replica executes a file-based recovery
2. Index files are copied to replica but crashed before finishing the recovery
3. Replica starts recovery again with seq-based as the copied commit is safe
4. Replica fails to open engine because translog and Lucene index are not matched
5. Replica won't be able to recover from primary

This commit enforces the translogUUID requirement when reading the 
global checkpoint directly from the checkpoint file.

Relates #28435
2018-02-12 13:23:32 -05:00
Lee Hinman 6538542603
Switch to hardcoding Smile as the state format (#28610)
This commit changes the state format that was previously passed in to
`MetaDataStateFormat` to always use Smile. This doesn't actually change the
format, since we have used Smile for writing the format since at least 5.0. This
removes the automatic detection of the state format when reading state, since
any state that could be processed in 6.x and 7.x would already have been written
in Smile format.

This is work towards removing the deprecated methods in the XContent code where
we do automatic content-type detection.

Relates to #28504
2018-02-12 08:07:01 -07:00
Jim Ferenczi e6a8528554
Force depth_first mode execution for terms aggregation under a nested context (#28421)
This commit forces the depth_first mode for `terms` aggregation that contain a sub-aggregation that need to access the score of the document
in a nested context (the `terms` aggregation is a child of a `nested` aggregation). The score of children documents is not accessible in
breadth_first mode because the `terms` aggregation cannot access the nested context.

Close #28394
2018-02-12 13:38:11 +01:00
Jim Ferenczi 7dc00ef1f5
Search option terminate_after does not handle post_filters and aggregations correctly (#28459)
* Search option terminate_after does not handle post_filters and aggregations correctly

This change fixes the handling of the `terminate_after` option when post_filters (or min_score) are used.
`post_filter` should be applied before `terminate_after` in order to terminate the query when enough document are accepted
by the post_filters.
This commit also changes the type of exception thrown by `terminate_after` in order to ensure that multi collectors (aggregations)
do not try to continue the collection when enough documents have been collected.

Closes #28411
2018-02-12 13:36:33 +01:00
Ke Li 55448b2630 [Tests] Remove unnecessary condition check (#28559)
The condition value in question is true, regardless of the randomBoolean() value.
This change simplifies this removing the condition blocks.
2018-02-12 11:33:19 +01:00
Boaz Leskes 4aece92b2c
IndexShardOperationPermits: shouldn't use new Throwable to capture stack traces (#28598)
The is a follow up to #28567 changing the method used to capture stack traces, as requested
during the review. Instead of creating a throwable, we explicitly capture the stack trace of the
current thread. This should Make Jason Happy Again ™️ .
2018-02-12 10:33:13 +01:00
Michael Basnight e0bea70070
Generalize BWC logic (#28505)
Generalizing BWC building so that there is less code to modify for a release. This ensures we do not
need to think about what major or minor version is in the gradle code. It follows the general rules of the
elastic release structure. For more information on the rules, see the VersionCollection's javadoc.

This also removes the additional bwc snapshots that will never be released, such as 6.0.2, which were
being built and tested against every time we ran bwc tests.

Additionally, it creates 4 new projects that correspond to the different types of snapshots that may exist
for a given version. Its possible to now run those individual tasks to work out bwc logic whereas
previously it was impossible and the entire suite of bwc tests had to be run to work out any logic
changes in the build tools' bwc project. Please note that if the project does not make sense for the 
version that is current, that an error will be thrown from that individual project if an attempt is made to 
run it.

This should allow for automating the version bumps as well, since it removes all the hardcoded version
logic from the configs.
2018-02-09 14:55:10 -06:00
Lee Hinman 5263b8cc7e
Remove all instances of the deprecated `ParseField.match` method (#28586)
This removes all the server references to the deprecated `ParseField.match`
method in favor of the method that passes in the deprecation logger.

Relates to #28504
2018-02-09 09:19:24 -07:00
Martijn van Groningen 766b9d600e
Fixed a bug that prevents pipelines to load that use stored scripts after a restart.
The bug was caused because the ScriptService had no reference to a ClusterState instance,
because it received the ClusterState after the PipelineStore. This only is the case
after a restart.

A bad side effect is that during a restart, any pipeline to be loaded after the pipeline that uses a stored script,
was never loaded, which caused many pipeline to be missing in bulk / index request api calls.
2018-02-09 17:14:00 +01:00
Yannick Welsch 5735e088f9
Fsync directory after cleanup (#28604)
After copying over the Lucene segments during peer recovery, we call cleanupAndVerify which removes all other files in the directory and which then calls getMetadata to check if the resulting files are a proper index. There are two issues with this:

- the directory is not fsynced after the deletions, so that the call to getMetadata, which lists files in the directory, can get a stale view, possibly seeing a deleted corruption marker (which leads to the exception seen in #28435)
- failing to delete a corruption marker should result in a hard failure, as the shard is otherwise unusable.
2018-02-09 17:06:36 +01:00
Igor Motov da1a10fa92
Add generic array support to AbstractObjectParser (#28552)
Adds a generic declareFieldArray that can process arrays of arbitrary elements.
2018-02-08 19:46:12 -05:00
Nhat Nguyen dbf9fb31e4
Do not ignore shard not-available exceptions in replication (#28571)
The shard not-available exceptions are currently ignored in the
replication as the best effort avoids failing not-yet-ready shards.
However these exceptions can also happen from fully active shards. If
this is the case, we may have skipped important failures from replicas.
Since #28049, only fully initialized shards are received write requests.
This restriction allows us to handle all exceptions in the replication.

There is a side-effect with this change. If a replica retries its peer
recovery second time after being tracked in the replication group, it
can receive replication requests even though it's not-yet-ready. That
shard may be failed and allocated to another node even though it has a
good lucene index on that node.

This PR does not change the way we report replication errors to users,
hence the shard not-available exceptions won't be reported as before.

Relates #28049
Relates #28534
2018-02-08 18:05:27 -05:00
Boaz Leskes ba59cf1262
Capture stack traces while issuing IndexShard operations permits to easy debugging (#28567)
Today we acquire a permit from the shard to coordinate between indexing operations, recoveries and other state transitions. When we leak an  permit it's practically impossible to find who the culprit is. This PR add stack traces capturing for each permit so we can identify which part of the code is responsible for acquiring the unreleased permit. This code is only active when assertions are active. 

The output is something like:
```
java.lang.AssertionError: shard [test][1] on node [node_s0] has pending operations:
--> java.lang.RuntimeException: something helpful 2
	at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:223)
	at org.elasticsearch.index.shard.IndexShard.<init>(IndexShard.java:322)
	at org.elasticsearch.index.IndexService.createShard(IndexService.java:382)
	at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:514)
	at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:143)
	at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:552)
	at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:529)
	at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:231)
	at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$6(ClusterApplierService.java:498)
	at java.base/java.lang.Iterable.forEach(Iterable.java:75)
	at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:495)
	at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:482)
	at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
	at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:161)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:566)
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244)
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
	at java.base/java.lang.Thread.run(Thread.java:844)

--> java.lang.RuntimeException: something helpful
	at org.elasticsearch.index.shard.IndexShardOperationPermits.acquire(IndexShardOperationPermits.java:223)
	at org.elasticsearch.index.shard.IndexShard.<init>(IndexShard.java:311)
	at org.elasticsearch.index.IndexService.createShard(IndexService.java:382)
	at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:514)
	at org.elasticsearch.indices.IndicesService.createShard(IndicesService.java:143)
	at org.elasticsearch.indices.cluster.IndicesClusterStateService.createShard(IndicesClusterStateService.java:552)
	at org.elasticsearch.indices.cluster.IndicesClusterStateService.createOrUpdateShards(IndicesClusterStateService.java:529)
	at org.elasticsearch.indices.cluster.IndicesClusterStateService.applyClusterState(IndicesClusterStateService.java:231)
	at org.elasticsearch.cluster.service.ClusterApplierService.lambda$callClusterStateAppliers$6(ClusterApplierService.java:498)
	at java.base/java.lang.Iterable.forEach(Iterable.java:75)
	at org.elasticsearch.cluster.service.ClusterApplierService.callClusterStateAppliers(ClusterApplierService.java:495)
	at org.elasticsearch.cluster.service.ClusterApplierService.applyChanges(ClusterApplierService.java:482)
	at org.elasticsearch.cluster.service.ClusterApplierService.runTask(ClusterApplierService.java:432)
	at org.elasticsearch.cluster.service.ClusterApplierService$UpdateTask.run(ClusterApplierService.java:161)
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:566)
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:244)
	at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:207)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
	at java.base/java.lang.Thread.run(Thread.java:844)

```
2018-02-08 22:59:02 +01:00
Nhat Nguyen 5b8870f193
Only log warning when actually failing shards (#28558)
Currently the master node logs a warning message whenever it receives a 
failed shard request. However, this can be noisy because

- Multiple failed shard requests can be issued for a single shard
- Failed shard requests can be still issued for an already failed shard

This commit moves the log-warn to AllocationService in which the failing 
shard action actually happens. This is another prerequisite step in 
order to not ignore the shard not-available exceptions in the
replication.

Relates #28534
2018-02-08 15:37:52 -05:00
Jason Tedor 5badacf391
Fix race condition in queue size test
The queue size test has a race condition. Namely the offering thread can
run so quickly completing all of its offering iterations before the
queue size thread ever has a chance to run a single size poll
iteration. This means that the size will never actually be polled and
the test can spuriously fail. What we really want to do here, since this
test is checking for a race condition between polling the size of the
queue and offers to the queue, we want to execute each iteration in
lockstep giving the threads multiple changes for the race between
polling the size and offers to occur. This commit addresses this by
running the two threads in lockstep for multiple iterations so that they
have multiple chances to race.

Relates #28584
2018-02-08 14:23:24 -05:00
Igor Motov 80c8c3b114 Add 6.2.2 version constant 2018-02-08 13:43:34 -05:00
Lee Hinman 2e4c834a13
Switch to non-deprecated ParseField.match method for o.e.search (#28526)
* Switch to non-deprecated ParseField.match method for o.e.search

This replaces more of the `ParseField.match` calls with the same call using a
deprecation handler. It encapsulates all of the instances in the
`org.elastsicsearch.search` package.

Relates to #28504

* Address Nik's comments
2018-02-08 11:17:57 -07:00
Lee Hinman b64bf51000
Replace more deprecated ParseField.match calls with non-deprecated call (#28525)
* Replace more deprecated ParseField.match calls with non-deprecated call

This replaces more of the `ParseField.match` calls with the same call using a
deprecation handler.

Relates to #28504

* Address Nik's comments
2018-02-08 09:45:28 -07:00
Ryan Ernst a55eda626f
Plugins: Store elasticsearch and java versions in PluginInfo (#28556)
Plugin descriptors currently contain an elasticsearch version,
which the plugin was built against, and a java version, which the plugin
was built with. These versions are read and validated, but not stored.
This commit keeps them in PluginInfo so they can be used later.
While seeing the elasticsearch version is less interesting (since it is
enforced to match that of the running elasticsearc node), the java
version is interesting since we only validate the format, not the actual
version. This also makes PluginInfo have full parity with the plugin
properties file.
2018-02-08 08:31:39 -08:00
Jason Tedor 86fd48e5f5
Fix size blocking queue to not lie about its weight
Today when offering an item to a size blocking queue that is at
capacity, we first increment the size of the queue and then check if the
capacity is exceeded or not. If the capacity is indeed exceeded, we do
not add the item to the queue and immediately decrement the size of the
queue. However, this incremented size is exposed externally even though
the offered item was never added to the queue (this is effectively a
race on the size of the queue). This can lead to misleading statistics
such as the size of a queue backing a thread pool. This commit fixes
this issue so that such a size is never exposed. To do this, we replace
the hidden CAS loop that increments the size of the queue with a CAS
loop that only increments the size of the queue if we are going to be
successful in adding the item to the queue.

Relates #28557
2018-02-08 06:54:39 -05:00
Jason Tedor 666c4f9414
Index shard should roll generation via the engine
Today when a replica shard detects a new primary shard (via a primary
term transition), we roll the translog generation. However, the
mechanism that we are using here is by reaching through the engine to
the translog directly. By poking all the way through rather than asking
the engine to manage the roll for us we miss:
 - taking a read lock in the engine while the roll is occurring
 - trimming unreferenced readers

This commit addresses this by asking the engine to roll the translog
generation for us.

Relates #28537
2018-02-07 14:57:50 -05:00
Martijn van Groningen 2023c98bea
Added more parameter to PersistentTaskPlugin#getPersistentTasksExecutor(...) 2018-02-07 17:43:35 +01:00
Christoph Büscher c0886cf7c6
[Tests] Relax assertion in SuggestStatsIT (#28544)
The test expects suggest times in milliseconds that are strictly
positive. Internally they are measured in nanos, it is possible that on
really fast execution this is rounded to 0L, so this should also be an
accepted value.

Closes #28543
2018-02-07 17:26:08 +01:00
Christoph Büscher 305b87b4b7
Make internal Rounding fields final (#28532)
The fields in the internal rounding classes can be made final with very minor
adjustments to how they are read from a StreamInput.
2018-02-07 09:33:21 +01:00
Jason Tedor c2fcf15d9d
Fix the ability to remove old plugin
We now read the plugin descriptor when removing an old plugin. This is
to check if we are removing a plugin that is extended by another
plugin. However, when reading the descriptor we enforce that it is of
the same version that we are. This is not the case when a user has
upgraded Elasticsearch and is now trying to remove an old plugin. This
commit fixes this by skipping the version enforcement when reading the
plugin descriptor only when removing a plugin.

Relates #28540
2018-02-06 17:38:26 -05:00
Lee Hinman 6b4ea4e6fb Add 6.2.1 version constant 2018-02-06 12:13:24 -07:00
Yannick Welsch e6f873c620
Remove feature parsing for GetIndicesAction (#28535)
Removes dead code. Follow-up of #24723
2018-02-06 18:00:14 +01:00
Yannick Welsch c8df446000
No refresh on shard activation needed (#28013)
A shard is fully baked when it moves to POST_RECOVERY. There is no need to do an extra refresh on shard activation again as the shard has already been refreshed when it moved to POST_RECOVERY.
2018-02-06 17:29:22 +01:00
Yannick Welsch d43f0b5f26
Improve failure message when restoring an index that already exists in the cluster (#28498)
Makes the message more actionable and removes the focus on the fact that the index is open.
2018-02-06 14:24:52 +01:00
Lee Hinman eebff4d2b3
Use non deprecated xcontenthelper (#28503)
* Move to non-deprecated XContentHelper.createParser(...)

This moves away from one of the now-deprecated XContentHelper.createParser
methods in favor of specifying the deprecation logger at parser creation time.

Relates to #28449

Note that this doesn't move all the `createParser` calls because some of them
use the already-deprecated method that doesn't specify the XContentType.

* Remove the deprecated (and now non-needed) createParser method
2018-02-05 16:18:18 -07:00
Nik Everett 5003ef18ac
Scripts: Fix security for deprecation warning (#28485)
If you call `getDates()` on a long or date type field add a deprecation
warning to the response and log something to the deprecation logger.
This *mostly* worked just fine but if the deprecation logger happens to
roll then the roll will be performed with the script's permissions
rather than the permissions of the server. And scripts don't have
permissions to, say, open files. So the rolling failed. This fixes that
by wrapping the call the deprecation logger in `doPriviledged`.

This is a strange `doPrivileged` call because it doens't check
Elasticsearch's `SpecialPermission`. `SpecialPermission` is a permission
that no-script code has and that scripts never have. Usually all
`doPrivileged` calls check `SpecialPermission` to make sure that they
are not accidentally acting on behalf of a script. But in this case we
are *intentionally* acting on behalf of a script.

Closes #28408
2018-02-03 14:56:08 -05:00
Nhat Nguyen de6d31ebc2 Backport fail shard w/o marking as stale PR to v6.3
Relates #28054
2018-02-03 12:07:39 -05:00
Nhat Nguyen 965efa51cc
Allows failing shards without marking as stale (#28054)
Currently when failing a shard we also mark it as stale (eg. remove its
allocationId from from the InSync set). However in some cases, we need 
to be able to fail shards but keep them InSync set. This commit adds
such capacity. This is a preparatory change to make the primary-replica
resync less lenient.

Relates #24841
2018-02-03 09:41:53 -05:00
Nhat Nguyen 875bbfe699 Backported the harden synced-flush PR to v6.3.0
Relates #28464
2018-02-02 14:31:37 -05:00
David Turner ab8f5ea54c
Forbid trappy methods from java.time (#28476)
ava.time has the functionality needed to deal with timezones with varying 
offsets correctly, but it also has a bunch of methods that silently let you
forget about the hard cases, which raises the risk that we'll quietly do the
wrong thing at some point in the future.

This change adds the trappy methods to the list of forbidden methods to try and
help stop this from happening.

It also fixes the only use of these methods in the codebase so far:
IngestDocument#deepCopy() used ZonedDateTime.of() which may alter the offset of
the given time in cases where the offset is ambiguous.
2018-02-02 18:24:02 +00:00
Lee Hinman 3ddea8d8d2
Start switching to non-deprecated ParseField.match method (#28488)
This commit switches all the modules and server test code to use the
non-deprecated `ParseField.match` method, passing in the parser's deprecation
handler or the logging deprecation handler when a parser is not available (like
in tests).

Relates to #28449
2018-02-02 10:10:13 -07:00
Nhat Nguyen 5f2121960e
Synced-flush should not seal index of out of sync replicas (#28464)
Today the correctness of synced-flush is guaranteed by ensuring that 
there is no ongoing indexing operations on the primary. Unfortunately, a
replica might fall out of sync with the primary even the condition is
met. Moreover, if synced-flush mistakenly issues a sync_id for an out of
sync replica, then that replica would not be able to recover from the 
primary. ES prevents that peer-recovery because it detects that both
indexes from primary and replica were sealed with the same sync_id but
have a different content. This commit modifies the synced-flush to not
issue sync_id for out of sync replicas. This change will report the
divergence issue earlier to users and also prevent replicas from getting
into the "unrecoverable" state.

Relates #10032
2018-02-02 11:20:38 -05:00
Jim Ferenczi 88f4c1c03a
Add a test for sub-aggregations rewrite (#28491)
This commit adds a test to check that the rewrite of a sub-aggregation triggers a copy of the parent aggregation.

Relates #28430
Closes #27782
2018-02-02 16:04:33 +01:00
Haris Osmanagić 897ef458f3 Add support for indices exists to REST high level client (#27384)
Relates to #27205
2018-02-02 11:25:36 +01:00
Yannick Welsch 031415a5f6
Replicate writes only to fully initialized shards (#28049)
The primary currently replicates writes to all other shard copies as soon as they're added to the routing table. Initially those shards are not even ready yet to receive these replication requests, for example when undergoing a file-based peer recovery. Based on the specific stage that the shard copies are in, they will throw different kinds of exceptions when they receive the replication requests. The primary then ignores responses from shards that match certain exception types. With this mechanism it's not possible for a primary to distinguish between a situation where a replication target shard is not allocated and ready yet to receive requests and a situation where the shard was successfully allocated and active but subsequently failed.
This commit changes replication so that only initializing shards that have successfully opened their engine are used as replication targets. This removes the need to replicate requests to initializing shards that are not even ready yet to receive those requests. This saves on network bandwidth and enables features that rely on the distinction between a "not-yet-ready" shard and a failed shard.
2018-02-02 11:13:07 +01:00
Nhat Nguyen 5be478f938 Remove uncommitted ops assertion in shouldFlush
This assertion does not hold if engine is flushed between the invocation
of translog.uncommittedSizeInBytes and translog.uncommittedOperations.
These two values can be calculated from different commits.
2018-02-01 18:20:56 -05:00
markharwood 998461c737
Test fix - reenable BWC tests and lower version checks now that PR 28440 for allowPartialSearchResults flag backported to 6.x (#28482)
Support for allowPartialSearchResults is now in 6.3 so changing master BWC checks accordingly
2018-02-01 19:20:21 +00:00
Nhat Nguyen 1970e01782
Add lower bound for translog flush threshold (#28382)
If the translog flush threshold is too small (eg. smaller than the
translog header), we may repeatedly flush even there is no uncommitted
operation because the shouldFlush condition can still be true after
flushing. This is currently avoided by adding an extra guard against the
uncommitted operations. However, this extra guard makes the shouldFlush
complicated. This commit replaces that extra guard by a lower bound for
translog flush threshold. We keep the lower bound small for convenience
in testing.

Relates #28350
Relates #23606
2018-02-01 13:51:53 -05:00
Luca Cavanna d860971572
REST high-level client: add support for split and shrink index API (#28425)
Relates to #27205
2018-02-01 16:37:01 +01:00
Martijn van Groningen 61806802fb
Add persistent tasks
Persistent tasks are build on top of node tasks and provide functionality to restart a task to run on a different coordination node in case the coordinating node is no longer available.
It is up to a persistent task implementation to keep track of status, so that in case the task is restarted, the task can continue were it left off before it was restarted.
2018-02-01 15:26:17 +01:00
Alan Woodward 9500513ba1
Move leftover aliases test from core/ to server/ (#28463) 2018-02-01 11:40:29 +00:00
Colin Goodheart-Smithe 65157e9428
[TEST] Replaces flaky breaker IT test with unit test (#28418)
This change remove the `CircuitBreakerIT. testParentChecking` test method which fails intermittently in unexpected ways with a `MemoryCircuitBreakerTests. testBorrowingSiblingBreakerMemory` unit test method which can test the borrowing functionality more directly

Closes #28223
2018-02-01 08:32:46 +00:00
Jim Ferenczi dd40b984c4
Add a shallow copy method to aggregation builders (#28430)
This change adds a shallow copy method for aggregation builders. This method returns a copy of the builder replacing the factoriesBuilder and metaDada
This method is used when the builder is rewritten (AggregationBuilder#rewrite) in order to make sure that we create a new instance of the parent builder when sub aggregations are rewritten.

Relates #27782
2018-02-01 09:22:32 +01:00
Jim Ferenczi c7d5a54b42
Fix AIOOB on indexed geo_shape query (#28458)
This change fixes a possible AIOOB during the parsing of the document that contains the indexed shape.
This change ensures that the parsing does not continue when the field that contains the shape has been found.

Closes #28456
2018-02-01 09:01:48 +01:00
Jason Tedor 1b3d529bef Introduce secure security manager to project
This commit migrates SecureSM, our secure security manager
implementation, from its own repository to being a sub-project of
Elasticsearch.
2018-01-31 18:23:28 -05:00
Nhat Nguyen 5e0be61774
Add logging to index commit deletion policy (#28448)
This would help us to figure out which index commit that an engine 
started with or used in peer-recovery.

Relates #28405
2018-01-31 11:09:49 -05:00
markharwood 77d2dd203e
Search - add allow_partial_search_results flag with default setting false (#28440)
Adds allow_partial_search_results flag to search requests with default setting = true.
When false, will error if search either timeouts, has partial errors or has missing shards rather
than returning partial search results. A cluster-level setting provides a default for search requests with no flag.

Closes #27435
2018-01-31 15:51:29 +00:00
David Turner 4c154b70d3
Fix rounding of time values near to overlapping days (#28151)
Sometimes, in some places, the clocks are set back across midnight, leading to
overlapping days. This was not handled as expected, and this change fixes this.

Additionally, in this situation it is not true that rounding a time down to the
nearest day is a monotonic operation, as asserted in these tests. This change 
suppresses those assertions in those rare cases.

Fixes #27966.
2018-01-31 15:10:47 +00:00
kel 5819e57baa Replace Bits with new abstract class to respresent documents that have a value (#24088) (#28334) 2018-01-31 15:42:11 +01:00
Martijn van Groningen 592eedbf49
Make persistent tasks work.
Made persistent tasks executors pluggable.
2018-01-31 12:28:06 +01:00
Martijn van Groningen 07e727c769
Removed ClientHelper dependency from PersistentTasksService. 2018-01-31 12:28:06 +01:00
David Roberts cc16f9d9c9
Added AllocatedPersistentTask#waitForPersistentTaskStatus(...) that delegates to PersistentTasksService#waitForPersistentTaskStatus(...)
This allows persistent tasks executor implementations to not have an instance of PersistentTasksService.
2018-01-31 12:28:06 +01:00
Igor Motov 41071e4711
Add adding ability to associate an ID with tasks.
Persistent tasks portion of elastic/elasticsearch#23250
2018-01-31 12:28:06 +01:00
Jay Modi 8521b2d11e
Remove InternalClient and InternalSecurityClient (#3054)
This change removes the InternalClient and the InternalSecurityClient. These are replaced with
usage of the ThreadContext and a transient value, `action.origin`, to indicate which component the
request came from. The security code has been updated to look for this value and ensure the
request is executed as the proper user. This work comes from #2808 where @s1monw suggested
that we do this.

While working on this, I came across index template registries and rather than updating them to use
the new method, I replaced the ML one with the template upgrade framework so that we could
remove this template registry. The watcher template registry is still needed as the template must be
updated for rolling upgrades to work (see #2950).
2018-01-31 12:28:05 +01:00
Martijn van Groningen 4dd69951f3
Make the persistent task status available to PersistentTasksExecutor.nodeOperation(...) method 2018-01-31 12:28:05 +01:00
Colin Goodheart-Smithe 1c489ee867
Refactor/to x content fragments2 (#2329)
* Moves more classes over to ToXContentObject/Fragment

* Removes ToXContentToBytes

* Removes ToXContent from Enums

* review comment fix

* slight change to use XContantHelper
2018-01-31 12:28:05 +01:00
David Roberts 7313ad5b29
Make AllocatedPersistentTask members volatile (#2297)
These members are default initialized on contruction and then set by the
init() method.  It's possible that another thread accessing the object
after init() is called could still see the null/0 values, depending on how
the compiler optimizes the code.
2018-01-31 12:28:05 +01:00
Colin Goodheart-Smithe b0de3c38d6
Moves more classes over to ToXContentObject/Fragment (#2283) 2018-01-31 12:28:04 +01:00
Luca Cavanna 65ce2276eb
Adapt to upstream changes made to AbstractStreamableXContentTestCase (#2117) 2018-01-31 12:28:04 +01:00
Yannick Welsch b5f281386a
Move tribe to a module (#2088)
Companion PR to elastic/elasticsearch#25778
2018-01-31 12:28:04 +01:00
Igor Motov ffdb05e48e
Persistent Tasks: remove unused isCurrentStatus method (#2076)
Removes a method that is no longer used in production code.

Relates to #957
2018-01-31 12:28:04 +01:00
David Kyle 0d50f9c6a9
Call initialising constructor of BaseTasksRequest (#1771) 2018-01-31 12:28:04 +01:00
Chris Earle 1cef531165
Always Accumulate Transport Exceptions (#1619)
This is the x-pack side of the removal of `accumulateExceptions()` for both `TransportNodesAction` and `TransportTasksAction`.

There are occasional, random failures that occur during API calls that are silently ignored from the caller's perspective, which also leads to weird API responses that have no response and also no errors, which is obviously untrue.
2018-01-31 12:28:03 +01:00
Hendrik Muhs 614aef2527
Pass down the provided timeout. 2018-01-31 12:28:03 +01:00
Simon Willnauer 292e383d2c
Fix static / version based BWC tests (#1456)
With the leniency in Version.java we missed to really setup BWC
testing for static indices. This change brings back the testing and adds
missing bwc indices.

Relates to elastic/elasticsearch#24732
2018-01-31 12:28:03 +01:00
Yannick Welsch e69317b24b
Don't call ClusterService.state() in a ClusterStateUpdateTask
The current state is readily available as a parameter
2018-01-31 12:28:02 +01:00
Yannick Welsch 44ea5d6b3e
Separate publishing from applying cluster states
Companion commit to elastic/elasticsearch#24236
2018-01-31 12:28:02 +01:00
Igor Motov a08e2d9e5e
Persistent tasks: require allocation id on task completion (#1107)
Persistent tasks should verify that completion notification is done for correct version of the task, otherwise a delayed notification from an old node can accidentally close a newly reassigned task.
2018-01-31 12:28:01 +01:00
Colin Goodheart-Smithe 76cd7b1eb2
Fixes compile errors in Eclipse due to generics
PersistentTasksCustomMetadata was using a generic param named `Params`. This conflicted with the imported interface `ToXContent.Params`. The java compiler was preferring the generic param over the interface so everything was fine but Eclipse apparently prefers the interface int his case which was screwing up the Hierarchy and causing compile errors in Eclipse. This changes fixes it by renaming the Generic param to `P`
2018-01-31 12:27:34 +01:00
Igor Motov fc524bc9b5
Persistent Tasks: force writeable name of params and status to be the same as their task (#1072)
Changes persistent task serialization and forces params and status to have the same writeable name as the task itself.
2018-01-31 12:27:34 +01:00
Martijn van Groningen 4771965931
Use task builder instead of creating persistent tasks directly. 2018-01-31 12:27:34 +01:00
Igor Motov abd9ae399c
Persistent Tasks: PersistentTaskRequest -> PersistTaskParams (#1057)
Removes the last pieces of ActionRequest from PersistentTaskRequest and renames it into PersistTaskParams, which is now just an interface that extends NamedWriteable and ToXContent.
2018-01-31 12:27:33 +01:00
Igor Motov 6bfea09dd6
Persistent Tasks: switch from long task ids to string task ids (#1035)
This commit switches from long persistent task ids to caller-supplied string persistent task ids.
2018-01-31 12:27:33 +01:00
Hendrik Muhs 0a1f25588b
Added PersistentTasksService#waitForPersistentTasksStatus(...) method to allow callers to wait when an executor node has updated its task status. 2018-01-31 12:27:33 +01:00
Igor Motov 0a1abd430d
Persistent Tasks: remove listener from PersistentTasksExecutor#nodeOperation (#1032)
Instead of having a separate listener for indicating that the current task is finished, this commit is switching to use allocated object itself.
2018-01-31 12:27:32 +01:00
Igor Motov 95c6005f6f
Persistent Tasks: remove retries on notification failures (#977)
Retries should be already handled by TransportMasterNodeAction, there is no need to introduce another retry layer in Persistent Tasks code.
2018-01-31 12:27:32 +01:00
Martijn van Groningen fab0dc449a
Remove PersistentTask#isCurrentStatus() usages 2018-01-31 12:27:32 +01:00
Igor Motov 5a8512bf4e
Persistent Tasks: refactor PersistentTasksService to use ActionListener (#937)
PersistentTasksService methods are not using ActionListener<PersistentTask<?>> instead of PersistentTaskOperationListener.
2018-01-31 12:27:29 +01:00
Jason Tedor 97822dbea3
Respond to rename random ASCII helper methods
This commit is response to the renaming of the random ASCII helper
methods in ESTestCase. The name of this method was changed because these
methods only produce random strings generated from [a-zA-Z], not from
all ASCII characters.
2018-01-31 12:00:10 +01:00
Igor Motov 5b45b167bd
Persistent Tasks: check the current state in waitForPersistentTaskStatus (#935)
Add a check for the current state waitForPersistentTaskStatus before waiting for the next one. This fixes sporadic failure in testPersistentActionStatusUpdate test.

Fixes #928
2018-01-31 12:00:09 +01:00
Martijn van Groningen a5acb556b0
Use PersistentTasksService#waitForPersistentTaskStatus(...) to wait for job and datafeed status and use PersistentTasksService#removeTask(...) to force close job and force stop datafeed. 2018-01-31 12:00:09 +01:00
Igor Motov 1b0f5b9572
Persistent Tasks: require correct allocation id for status updates (#923)
In order to prevent tasks state updates by stale executors, this commit adds a check for correct allocation id during status update operation.
2018-01-31 12:00:09 +01:00
Igor Motov 6ca044736e
Persistent Tasks: Add waitForPersistentTaskStatus method (#901)
This method allows to wait for tasks to change their status to match the supplied predicate.
2018-01-31 12:00:09 +01:00
Martijn van Groningen 78b844e79b
Check allocationIdOnLastStatusUpdate when trying to detect whether a task is stale. 2018-01-31 11:59:02 +01:00
Igor Motov b142d7e29c
Persistent Tasks: Remove unused stopped and removeOnCompletion flags (#853)
The stopped and removeOnCompletion flags are not currently used, this commit removes them for now to simplify things.
2018-01-31 11:59:01 +01:00
Igor Motov 37fad04879
Persistent Tasks: Merge NodePersistentTask and RunningPersistentTask (#842)
Refactors NodePersistentTask and RunningPersistentTask into a single AllocatedPersistentTask. Makes it possible to update Persistent Task Status via AllocatedPersistentTask.
2018-01-31 11:59:01 +01:00
Igor Motov 19f39fd392
Persistent Tasks: remove task restart on failure (#815)
If a persistent task throws an exception, the persistent tasks framework will no longer try to restart the task. This is a temporary measure to prevent threshing the cluster with endless restart attempt. We will revisit this in the future version to make the restart process more robust. Please note, however, that if node executing the task goes down, the task will still be restarted on another node.
2018-01-31 11:59:01 +01:00
Igor Motov 9bd24418d5
Make PersistentAction independent from TransportActions (#742)
Removes the transport layer dependency from PersistentActions, makes PersistentActionRegistry immutable and rename actions into tasks in class and variable names.
2018-01-31 11:59:01 +01:00
Igor Motov 810d9335c0
Simplify names of PersistentTasks-related classes
PersistentTask -> NodePersistentTask
PersistentTasksInProgress -> PersistentTasks
PersistentTaskInProgress -> PersistentTask
2018-01-31 11:59:00 +01:00
Igor Motov b33fc05492
Request and Status in Persistent Tasks should be serialized using their writable names
Refactors xcontent serialization of Request and Status to use their writable names instead of action name. That simplifies the parsing logic, allows reuse of the same status object for multiple actions and is consistent with how named objects in xcontent are used.
2018-01-31 11:59:00 +01:00
Igor Motov 5eeb480d97
Add persistent task assignment explanations.
This commit allows persistent actions to indicate why a task was or wasn't assigned to a certain node.
2018-01-31 11:59:00 +01:00
Martijn van Groningen 479429c6ef
In order to keep track of restarted tasks, `allocationIdOnLastStatusUpdate` field was added to `PersistentTaskInProgress` class.
This will allow persistent task implementors to detect whether the executor node has changed or has been unset since the last status update has occured.
2018-01-31 11:58:07 +01:00
Igor Motov 16e661c34b
Make persistent task persist full cluster restart
This commit moves persistent tasks from ClusterState.Custom to MetaData.Custom and adds ability for the task to remain in the metadata after completion.
2018-01-31 11:58:07 +01:00
Martijn van Groningen 243b7e4499
Moved job lifecycle over to persistent tasks
Also replaced the DELETING status from JobState with a boolean flag on Job. The state of a job is now stored inside a persistent task in cluster state. Jobs that aren't running don't have a persistent task, so I moved that notion of being deleted to the job config itself.

Original commit: elastic/x-pack@21cd19ca1c
2018-01-31 11:58:07 +01:00
Igor Motov d340c190b2
Replace List with Map in PersistentTasksInProgress
Store currently running persistent tasks in a map instead of a list.

Original commit: elastic/x-pack@f88c9adef5
2018-01-31 11:58:06 +01:00
David Kyle 32e406181e
Fix check style error after upgrade
Original commit: elastic/x-pack@3bf4025f78
2018-01-31 11:58:06 +01:00
Igor Motov ac67d02bc3
Add support for task status on persistent tasks
Similarly to task status on normal tasks it's now possible to update task status on the persistent tasks. This should allow updating the state of the running tasks (such as loading, started, etc) as well as store intermediate state or progress.

Original commit: elastic/x-pack@048006b467
2018-01-31 11:58:06 +01:00
Martijn van Groningen 777b21f2ef
Add a number of auxiliary methods to persistent tasks classes.
Original commit: elastic/x-pack@7f44b41b7a
2018-01-31 11:57:02 +01:00
Igor Motov f136bfa6e0
Adds support for persistent actions
A persistent action is a transport-like action that is using the cluster state instead of transport to start tasks. This allows persistent tasks to survive restart of executing nodes. A persistent action can be implemented by extending TransportPersistentAction. TransportPersistentAction will start the task by using PersistentActionService, which controls persistent tasks lifecycle.  See TestPersistentActionPlugin for an example implementing a persistent action.

Original commit: elastic/x-pack@5e83f1bfa3
2018-01-31 11:08:56 +01:00
Jim Ferenczi cb1fef7f6e
Fix intermittent failure in InternalEngineTest#testRefreshScopedSearcher (#28417)
This change switches the merge policy to none (for this specific test) in order to make sure that refreshes are always triggered
 by a change in the writer.

 Closes #27514
2018-01-31 09:24:15 +01:00
Nik Everett 3b6af15a60
XContent: Factor deprecation handling into callback (#28449)
Factors the way in which XContent parsing handles deprecated fields
into a callback that is set at parser construction time. The goals here
are:
1. Remove Log4J as a dependency of XContent so that XContent can be used
by clients without forcing log4j and our particular deprecation handling
scheme.
2. Simplify handling of deprecated fields in tests. Now tests can listen
directly for the deprecation callback rather than digging through a
ThreadLocal.

More accurately, this change begins this work. It deprecates a number of
methods, pointing folks to the new versions of those methods that take
`DeprecationHandler`. The plan is to slowly drop these deprecated
methods. Once they are entirely removed we can remove Log4j as
dependency of XContent.
2018-01-30 18:21:10 -05:00
Simon Willnauer 3bf8554114
Remove tribe node support (#28443)
Tribe node has been superseeded by Cross-Cluster-Search. This change
removes the tribe node support entirely.
2018-01-30 20:40:19 +01:00
Alexander Reelsen 1d311dfb65 Versions: Add 6.1.4/5.6.8 snapshot versions 2018-01-30 20:13:34 +01:00
Christoph Büscher 6731c76900
Add ranking evaluation API to High Level Rest Client (#28357)
This change adds support for the new ranking evaluation API to the High Level Rest Client.
This mostly means adding support for parsing the various response objects back from the
REST representation. It includes one change to the response syntax where previously we didn't
print the type of the metric details section but we now need it to pick the right parser to
parse this section back.

Closes #28198
2018-01-30 17:48:09 +01:00
Luca Cavanna 2c99bfc947
REST high-level client: Fix parsing of script fields (#28395)
Script fields can get a bit more complicated than just stored fields. A script can return null, an object and also an array. Extended parsing to support such valid values. Also renamed util method from `parseStoredFieldsValue` to `parseFieldsValue` given that it can parse stored fields but also script fields, anything that's returned as `fields`.

Closes #28380
2018-01-30 13:19:08 +01:00
Boaz Leskes 613fc1654f testTranslogReplayWithFailure: checkStyle 2018-01-30 11:26:06 +01:00
Boaz Leskes f5f679f921 testTranslogReplayWithFailure: do not assume engine is recoverable when disk errors stop
The test currently makes the assumption that if underlying directory stops throwing exceptions, we can always open the engine. This is not the case as some errors can cause a corruption marker to be placed in the store.

This commit refactors the test to only check that everything is OK if the engine was successfully opened. On top of that, there is no point in checking replay with no errors as we have another test for that.

Closes #28426
2018-01-30 10:02:26 +01:00
Alan Woodward 424ecb3c7d
Add ability to index prefixes on text fields (#28290)
This adds the ability to index term prefixes into a hidden subfield, enabling prefix queries to be run without multitermquery rewrites. The subfield reuses the analysis chain of its parent text field, appending an EdgeNGramTokenFilter. It can be configured with minimum and maximum ngram lengths. Query terms with lengths outside this min-max range fall back to using prefix queries against the parent text field.

The mapping looks like this:

"my_text_field" : {
"type" : "text",
"analyzer" : "english",
"index_prefix" : { "min_chars" : 1, "max_chars" : 10 }
}

Relates to #27049
2018-01-30 08:26:56 +00:00
Yannick Welsch 6f84503c33
Use final fields in UnicastZenPing request/response objects (#28406)
Prevents a NullPointerException that can happen due to concurrency in UnicastZenPing, see #21658.
2018-01-29 14:30:51 +01:00
Alan Woodward e208e959bd
TextFieldMapper defaults can be final (#28313) 2018-01-29 10:05:54 +00:00
Ryan Ernst b47b399f00
Settings: Reimplement keystore format to use FIPS compliant algorithms (#28255)
This commit switches the internal format of the elasticsearch keystore
to no longer use java's KeyStore class, but instead encrypt the binary
data of the secrets using AES-GCM. The cipher key is generated using
PBKDF2WithHmacSHA512. Tests are also added for backcompat reading the v1
and v2 formats.
2018-01-26 15:51:07 -08:00
Ryan Ernst 3dd833ca0a
Plugins: Use one confirmation of all meta plugin permissions (#28366)
Currently meta plugins will ask for confirmation of security policy
exceptions for each bundled plugin. This commit collects the necessary
permissions of each bundled plugin, and asks for confirmation of all of
them at the same time.
2018-01-26 15:44:44 -08:00
Nhat Nguyen 414b5de661 TEST: Lower num of shards in testShrinkIndexPrimaryTerm
In some cases testShrinkIndexPrimaryTerm creates then 'mutates' 210
shards. If each shard opens more than 10 files (translog, lucene index),
we exceeded the maximum allowed file handles. In our test, the number of
file handles is limited to 2048 by HandleLimitFS. This commit reduces
the number of shards in testShrinkIndexPrimaryTerm to avoid such errors.

Closes #28153
2018-01-26 17:16:00 -05:00
Nhat Nguyen 583085d7e8 Increase timeout for ensureGreen in testShrinkIndexPrimaryTerm
If we have created 210 shards, we may need more than 30 seconds for all
shards become green.

Relates #28153
2018-01-26 15:31:37 -05:00
Lee Hinman c25a4637e8 Ensure total nanosecond time for tasks is at least 1 nanosecond
In rare cases the total nanoseconds for an entire window of operations can be 0
nanoseconds, causing the assertion in
QueueResizingEsThreadPoolExecutor.calculateLambda to trip. This ensures that we
calculate the lambda value with at least 1 nanosecond.

Resolves #27607
2018-01-26 11:36:44 -07:00
Lee Hinman 96e7da53c7 [TEST] Expand assert to mention which files are not deleted
Relates to #25335
2018-01-26 09:03:51 -07:00
Antonio Matarrese f61591c6ec Fix string terms get key as number to see integers
Currently this method parses the string as a double. This means that it
might lose accuracy if the value is a long that is greater than
2^52. This commit changes this method to try to detect whether the
string represents a long first.
2018-01-26 06:22:42 -05:00
Nhat Nguyen f39402a039
Fix peer recovery flushing loop (#28350)
Today after writing an operation to an engine, we will call 
`IndexShard#afterWriteOperation` to flush a new commit if needed. The 
`shouldFlush` condition is purely based on the uncommitted translog size
and the translog flush threshold size setting. However this can cause a
replica execute an infinite loop of flushing in the following situation.

1. Primary has a fully baked index commit with its local checkpoint 
equals to max_seqno
2. Primary sends that fully baked commit, then replays all retained
translog operations to the replica
3. No operations are added to Lucence on the replica as seqno of these
operations are at most the local checkpoint
4. Once translog operations are replayed, the target calls 
`IndexShard#afterWriteOperation` to flush. If the total size of the
replaying operations exceeds the flush threshold size, this call will
`Engine#flush`. However the engine won't flush as its index writer does
not have any uncommitted operations. The method
`IndexShard#afterWriteOperation` will keep flushing as the condition
`shouldFlush` is still true.

This issue can be avoided if we always flush if the `shouldFlush` 
condition is true.
2018-01-25 14:29:46 -05:00
olcbean 9db23e48cd Add Indices Aliases API to the high level REST client (#27876)
Relates to #27205
2018-01-25 14:34:06 +01:00
olcbean 0c83240b5f Java Api clean up: remove deprecated `isShardsAcked` (#28311)
This PR removes previously deprecated `isShardsAcked()` method in
favour of `isShardsAcknowledged()` on `CreateIndexResponse`, `CreateIndexClusterStateUpdateResponse` and `RolloverResponse` 

Related to #27784
Follow-up of #27819
2018-01-25 14:13:20 +01:00
Jim Ferenczi 95c45aeb5d Adapt bwc version after backport #28358 2018-01-25 09:26:10 +01:00
Jim Ferenczi c26d4ac6c1
Always return the after_key in composite aggregation response (#28358)
This change adds the `after_key` of a composite aggregation directly in the response.
It is redundant when all buckets are not filtered/removed by a pipeline aggregation since in this case the `after_key` is always the last bucket
in the response. Though when using a pipeline aggregation to filter composite buckets, the `after_key` can be lost if the last bucket is filtered.
This commit fixes this situation by always returning the `after_key` in a dedicated section.
2018-01-25 09:15:27 +01:00
Tanguy Leroux 5f0cb3a07e
[Test] Fix DiscoveryNodesTests.testDeltas() (#28361)
The DiscoveryNodes.Delta was changed in #28197. Previous/Master nodes
are now always set in the `Delta` (before the change they were set only
if the master changed) and the `masterChanged()` method is now based on
object equality and nodes ephemeral ids (before the change it was based
on nodes id).

This commit adapts the DiscoveryNodesTests.testDeltas() to reflect the
changes.
2018-01-25 08:52:30 +01:00
Nhat Nguyen 7847cded80 Only assert single commit iff index created on 6.2
We introduced a single commit assertion when opening an index but create
a new translog. However, this assertion is not held in this situation.

1. A replica with two commits c1 and c2 starts peer-recovery with c1
2. The recovery is sequence-based recovery but the primary is before 6.2 so
it sent true for “createNewTranslog”
3. Replica opens engine and create translog. We expect "open index and
create translog" have 1 commit but we have c1 and c2.

This commit makes sure to assert this iff the index was created on 6.2+.
2018-01-24 10:49:44 -05:00
Nhat Nguyen 80a7943d6a isHeldByCurrentThread should return primitive bool 2018-01-24 10:48:05 -05:00
Alexander Reelsen a87714aafc
Settings: Introduce settings updater for a list of settings (#28338)
This introduces a settings updater that allows to specify a list of
settings. Whenever one of those settings changes, the whole block of
settings is passed to the consumer.

This also fixes an issue with affix settings, when used in combination
with group settings, which could result in no found settings when used
to get a setting for a namespace.

Lastly logging has been slightly changed, so that filtered settings now
only log the setting key.

Another bug has been fixed for the mock log appender, which did not
work, when checking for the exact message.

Closes #28047
2018-01-24 09:47:17 +01:00
Jim Ferenczi b10d166190 Adapt bwc version after backport #28310 2018-01-24 09:17:30 +01:00
Simon Willnauer 4d3f7a7695
Ensure we protect Collections obtained from scripts from self-referencing (#28335)
Self referencing maps can cause SOE if they are iterated ie. in their toString methods. This chance adds some protected to the usage of those collections.
2018-01-23 16:57:26 +01:00
Jim Ferenczi 19cfc25873
Adds the ability to specify a format on composite date_histogram source (#28310)
This commit adds the ability to specify a date format on the `date_histogram` composite source.
If the format is defined, the key for the source is returned as a formatted date.

Closes #27923
2018-01-23 15:14:49 +01:00
Simon Willnauer d31e964a86
Provide a better error message for the case when all shards failed (#28333)
Today we don't specify a cause which can make debugging very very tricky.
This change is best effort to supply at least one cause for the failure.
2018-01-23 14:50:02 +01:00
Christoph Büscher ba9e2e44cb
[Test] Re-Add integer_range and date_range field types for query builder tests (#28171)
The tests for those field types were removed in #26549 because the range mapper
was moved to a module, but later this mapper was moved back to core in #27854.
This change adds back those two field types like before to the general setup in
AbstractQueryTestCase and adds some specifics to the RangeQueryBuilder and
TermsQueryBuilder tests. Also adding back an integration test in SearchQueryIT that
has been removed before but that can be kept with the mapper back in core now.

Relates to #28147
2018-01-23 13:08:54 +01:00
Catalin Ursachi cf61d792b2 Added Put Mapping API to high-level Rest client (#27869)
Relates to #27205
2018-01-23 11:03:32 +01:00
Martijn van Groningen 4ef341a0c3
Revert change that does not return all indices if a specific alias is requested via get alias api. (#28294)
Reopens #27763
2018-01-23 09:06:02 +01:00
Lee Hinman ba5b583203
Notify affixMap settings when any under the registered prefix matches (#28317)
* Notify affixMap settings when any under the registered prefix matches

Previously if an affixMap setting was registered, and then a completely
different setting was applied, the affixMap update consumer would be notified
with an empty map. This caused settings that were previously set to be unset in
local state in a consumer that assumed it would only be called when the affixMap
setting was changed.

This commit changes the behavior so if a prefix `foo.` is registered, any
setting under the prefix will have the update consumer notified if there are
changes starting with `foo.`.

Resolves #28316

* Add unit test

* Address feedback
2018-01-22 11:55:54 -07:00
Luca Cavanna 0c83ee2a5d
Trim down usages of `ShardOperationFailedException` interface (#28312)
In many cases we use the `ShardOperationFailedException` interface to abstract an exception that can only be of one type, namely `DefaultShardOperationException`. There is no need to use the interface in such cases, the concrete type should be used instead. That has the additional advantage of simplifying parsing such exceptions back from rest responses for the high-level REST client
2018-01-22 15:51:46 +01:00
Martijn van Groningen 509ecf2aa6
Do not return all indices if a specific alias is requested via get aliases api.
If a get alias api call requests a specific alias pattern then
indices not having any matching aliases should not be included in the response.

Closes #27763
2018-01-22 14:02:53 +01:00
Adrien Grand 8d195c86de
CountedBitSet doesn't need to extend BitSet. (#28239) 2018-01-22 12:43:34 +01:00
kel 452c36c552 Calculate sum in Kahan summation algorithm in aggregations (#27807) (#27848) 2018-01-22 12:42:56 +01:00
Adrien Grand 700d9ecc95
Remove the `update_all_types` option. (#28288)
This option is not useful in 7.x since no indices may have more than one type
anymore.
2018-01-22 12:03:07 +01:00
Tanguy Leroux 119b1b5c2b
Add information when master node left to DiscoveryNodes' shortSummary() (#28197)
This commit changes `DiscoveryNodes.Delta.shortSummary()` in order to
add information to the summary when the master node left.
2018-01-22 09:52:57 +01:00
Jason Tedor ef76d99d86 Add 6.3 version constant to master
This commit adds the 6.3 version constant to the master branch after 6.2
was cut from 6.x.
2018-01-20 22:16:59 -05:00
Nhat Nguyen 9db9bd52f7
Clean up commits when global checkpoint advanced (#28140)
Today we keep multiple index commits based on the current global 
checkpoint, but only clean up unneeded index commits when we have a new 
index commit. However, we can release the old index commits earlier once
the global checkpoint has advanced enough. This commit makes an engine
revisit the index deletion policy whenever a new global checkpoint value
is persisted and advanced enough.

Relates #10708
2018-01-18 15:45:06 -05:00
Jim Ferenczi c38c12e3bf
Fix simple_query_string on invalid input (#28219)
This change converts any exception that occurs during the parsing of
a simple_query_string to a match_no_docs query (instead of a null query)
when leniency is activated.

Closes #28204
2018-01-18 10:49:34 +01:00
Jason Tedor 6b0036e0e1
Add client actions to action plugin
This commit adds an extension point for client actions to action
plugins. This is useful for plugins to expose the client-side actions
without exposing the server-side implementations to the client. The
default implementation, of course, delegates to extracting the
client-side action from the server-side implementation.

Relates #28280
2018-01-17 21:57:03 -05:00
Tony Zeng 1335232e6b Add toString() implementation for UpdateRequest (#27997) 2018-01-17 17:04:04 +00:00
Alexander Reelsen 707d8d6fe6
Dependencies: Update joda time to 2.9.9 (#28261) 2018-01-17 14:58:52 +01:00
David Turner 9bd7f2c65b
Improve wording in deprecation message (#28259) 2018-01-17 12:42:20 +00:00
Tanguy Leroux 6256c330c0 [Test] Wait for no relocating shards in indices.stats/13_fields tests
MixedClusterClientYamlTestSuiteIT sometimes fails when executing the
indices.stats/13_fields/* REST tests. It does not reproduce locally
but the execution logs show that it failed when a shard is relocating
during the set up execution. This commit change the set up so that it
now waits for all shards to be active before executing the tests.

closes #26732, #27146
2018-01-17 13:35:29 +01:00
olcbean b98514c6d9 Add Close Index API to the high level REST client (#27734)
Add support for _close endpoint to the high level REST client

Relates to #27205
2018-01-17 11:47:08 +01:00
Lee Hinman f2cd580332 Remove duplicated javadoc `fieldType` param 2018-01-16 16:34:44 -07:00
Nik Everett 4ec0569a19 Handle 5.6.6 and 6.1.2 release
Add new version constants for 5.6.6 and 6.1.2 release.
2018-01-16 16:41:05 -05:00
Jason Tedor 045dd4ad48
Introduce multi-release JAR
This commit introduces the ability for the core Elasticsearch JAR to be
a multi-release JAR containing code that is compiled for JDK 8 and code
that is compiled for JDK 9. At runtime, a JDK 8 JVM will ignore the JDK
9 compiled classfiles, and a JDK 9 JVM will use the JDK 9 compiled
classfiles instead of the JDK 8 compiled classfiles. With this work, we
utilize the new JDK 9 API for obtaining the PID of the running JVM,
instead of relying on a hack.

For now, we want to keep IDEs on JDK 8 so when the build is in an IDE we
ignore the JDK 9 source set (as otherwise the IDE would give compilation
errors). However, with this change, running Gradle from the command-line
now requires JAVA_HOME and JAVA_9_HOME to be set. This will require
follow-up work in our CI infrastructure and our release builds to
accommodate this change.

Relates #28051
2018-01-16 15:10:29 -05:00
Jason Tedor e5a698447b Move the multi-get response tests to server
This test file was accidentally pushed to core instead of server. This
commit moves this file to its proper location.
2018-01-16 14:11:31 -05:00
Christoph Büscher 8a58df46f3 Revert "[Docs] Fix Java Api index administration usage (#28133)"
This reverts commit 67c1f1c856.
2018-01-16 17:31:11 +01:00
Martijn van Groningen 853f7e8780
Added multi get api to the high level rest client.
Relates to #27205
2018-01-16 17:27:02 +01:00
Nhat Nguyen 65e90079ad
Open engine should keep only starting commit (#28228)
Keeping unsafe commits when opening an engine can be problematic because
these commits are not safe at the recovering time but they can suddenly
become safe in the future. The following issues can happen if unsafe
commits are kept oninit.

1. Replica can use unsafe commit in peer-recovery. This happens when a
replica with a safe commit c1 (max_seqno=1) and an unsafe commit c2
(max_seqno=2) recovers from a primary with c1(max_seqno=1). If a new
document (seqno=2) is added without flushing, the global checkpoint is
advanced to 2; and the replica recovers again, it will use the unsafe
commit c2 (max_seqno=2 <= gcp=2) as the starting commit for sequenced
based recovery even the commit c2 contains a stale operation and the
document (with seqno=2) will not be replicated to the replica.

2. Min translog gen for recovery can go backwards in peer-recovery. This
happens when a replica with a safe commit c1 (local_checkpoint=1,
recovery_translog_gen=1) and an unsafe commit c2 (local_checkpoint=2,
recovery_translog_gen=2). The replica recovers from a primary, and keeps
c2 as the last commit, then sets last_translog_gen to 2. Flushing a new
commit on the replica will cause exception as the new last commit c3
will have recovery_translog_gen=1. The recovery translog generation of a
commit is calculated based on the current local checkpoint. The local
checkpoint of c3 is 1 while the local checkpoint of c2 is 2.

3. Commit without translog can be used for recovery. An old index, which
was created before multiple-commits is introduced (v6.2), may not have a
safe commit. If that index has a snapshotted commit without translog and
an unsafe commit, the policy can consider the snapshotted commit as a
safe commit for recovery even the commit does not have translog.

These issues can be avoided if the combined deletion policy keeps only
the starting commit onInit.

Relates #27804
Relates #28181
2018-01-16 08:37:42 -05:00
Christoph Büscher 67c1f1c856
[Docs] Fix Java Api index administration usage (#28133)
The Java API documentation for index administration currenty is wrong because
the PutMappingRequestBuilder#setSource(Object... source) and
CreateIndexRequestBuilder#addMapping(String type, Object... source) methods
delegate to methods that check that the input arguments are valid key/value
pairs:

https://www.elastic.co/guide/en/elasticsearch/client/java-api/current/java-admin-indices.html

This changes the docs so the java api code examples are included from
documentation integration tests so we detect compile and runtime issues earlier.

Closes #28131
2018-01-16 12:05:03 +01:00
Yannick Welsch 196c7b80dc
Never return null from Strings.tokenizeToStringArray (#28224)
This method has a different contract than all the other methods in this class, returning null instead of an empty array when receiving a null input. While switching over some methods from delimitedListToStringArray to this method tokenizeToStringArray, this resulted in unexpected nulls in some places of our code.

Relates #28213
2018-01-16 09:58:58 +01:00
Yannick Welsch 0c4e2cbc19
Fallback to TransportMasterNodeAction for cluster health retries (#28195)
ClusterHealthAction does not use the regular retry logic, possibly causing StackOverflowErrors.

Relates #28169
2018-01-16 09:50:06 +01:00
Nhat Nguyen 6c297ad7c8 TEST: Update logging for testAckedIndexing
- Log the response of indexing requests
- Correct logging setting for discovery package
2018-01-15 18:14:04 -05:00
Nicholas Knize 5ed25f1e12 [GEO] Add WKT Support to GeoBoundingBoxQueryBuilder
Add WKT BBOX parsing support to GeoBoundingBoxQueryBuilder.
2018-01-15 13:30:51 -06:00
Adrien Grand 0a92e43f62
Avoid doing redundant work when checking for self references. (#26927)
Currently we test all maps, arrays or iterables. However, in the case that maps
contain sub maps for instance, we will test the sub maps again even though the
work has already been done for the top-level map.

Relates #26907
2018-01-15 18:36:32 +01:00
Adrien Grand a16f80a832
Fix casts in HotThreads. (#27578)
Even though an overflow would be very unlikely, it's better to use the longs
directly in the comparator.
2018-01-15 18:35:27 +01:00
Adrien Grand 77a7e2480b
Allow update of `eager_global_ordinals` on `_parent`. (#28014)
A bug introduced in #24407 currently prevents `eager_global_ordinals` from
being updated. This new approach should fix the issue while still allowing
mapping updates to not specify the `_parent` field if it doesn't need
updating, which was the goal of #24407.
2018-01-15 18:34:10 +01:00
Jim Ferenczi bd11e6c441
Fix NPE on composite aggregation with sub-aggregations that need scores (#28129)
The composite aggregation defers the collection of sub-aggregations to a second pass that visits documents only if they
appear in the top buckets. Though the scorer for sub-aggregations is not set on this second pass and generates an NPE if any sub-aggregation
tries to access the score. This change creates a scorer for the second pass and makes sure that sub-aggs can use it safely to check the score of
the collected documents.
2018-01-15 18:30:38 +01:00
Tim Brooks ee7eac8dc1
`MockTcpTransport` to connect asynchronously (#28203)
The method `initiateChannel` on `TcpTransport` is explicit in that
channels can be connect asynchronously. All production implementations
do connect asynchronously. Only the blocking `MockTcpTransport`
connects in a synchronous manner. This avoids testing some of the
blocking code in `TcpTransport` that waits on connections to complete.
Additionally, it requires a more extensive method signature than
required for other transports.

This commit modifies the `MockTcpTransport` to make these connections
asynchronously on a different thread. Additionally, it simplifies that
`initiateChannel` method signature.
2018-01-15 10:20:30 -07:00
Jim Ferenczi 190f1e1fb3
Fix synonym phrase query expansion for cross_fields parsing (#28045)
* Fix synonym phrase query expansion for cross_fields parsing

The `cross_fields` mode for query parser ignores phrase query generated by multi-word synonyms.
In such case only the first field of each analyzer group is kept. This change fixes this issue
by expanding the phrase query for each analyzer group to **all** fields using a disjunction max query.
2018-01-15 18:00:20 +01:00
Tim Brooks 3895add2ca
Introduce elasticsearch-core jar (#28191)
This is related to #27933. It introduces a jar named elasticsearch-core
in the lib directory. This commit moves the JarHell class from server to
elasticsearch-core. Additionally, PathUtils and some of Loggers are
moved as JarHell depends on them.
2018-01-15 09:59:01 -07:00
Jim Ferenczi 5973c2bf31 #28218: Update the Lucene version for 6.2.0 after backport 2018-01-15 17:27:51 +01:00
Jim Ferenczi be012b1326
upgrade to lucene 7.2.1 (#28218) 2018-01-15 16:47:46 +01:00
Colin Goodheart-Smithe 023d08ee91
Adds metadata to rewritten aggregations (#28185)
* Adds metadata to rewritten aggregations

Previous to this change, if any filters in the filters aggregation were rewritten, the rewritten version of the FiltersAggregationBuilder would not contain the metadata form the original. This is because `AbstractAggregationBuilder.getMetadata()` returns an empty map when not metadata is set.

Closes #28170

* Always set metadata when rewritten
2018-01-15 08:44:49 +00:00
Igor Motov aec0c0f9b6 Update version of TaskInfo header serialization after backport
Update the serialization version after backporting #27764 to 6.x.
2018-01-14 19:20:32 -05:00
Nhat Nguyen fbb840b5c8 TEST: Tightens file-based condition in peer-recovery
As a replica always keeps a safe commit and starts peer-recovery with
that commit; file-based recovery  only happens if new operations are
added to the primary and the required translog is not fully retained. In
the test, we tried to produce this condition by flushing a new commit in
order to trim all translog. However, if the new global checkpoint is not
persisted yet, we will keep two commits and not trim translog. This
commit tightens the file-based condition in the test by waiting for the
global checkpoint persisted properly on the new primary before flushing.

Close #28209
Relates #28181
2018-01-13 22:03:30 -05:00
Nhat Nguyen 9774ba35a1 Correct backport replica rollback to 6.2 (#28181)
The previous backport was not corect.

Relates #28181
2018-01-13 14:10:23 -05:00
Nhat Nguyen 0151c1565d Backport replica rollback to 6.2 (#28181)
Relates #28181
2018-01-13 11:44:13 -05:00
Nhat Nguyen e44e34f42a Rename deleteLocalTranslog to createNewTranslog
We introduced a new option `createNewTranslog` in #28181. However, we
named that parameter as deleteLocalTranslog in other places. This commit
makes sure to have a consistent naming in these places.

Relates #28181
2018-01-13 11:44:13 -05:00
Nhat Nguyen fafdb8d9e3 AwaitsFix #testRecoveryAfterPrimaryPromotion
Relates #28209
2018-01-13 11:44:13 -05:00
Nhat Nguyen 82722ebad3 TEST: init unassigned gcp in testAcquireIndexCommit
The global checkpoint should be assigned to unassigned rather than 0. If
a single document is indexed and the global checkpoint is initialized
with 0, the first commit is safe which the test does not suppose.

Relates #28038
2018-01-12 20:09:34 -05:00
Nhat Nguyen 095f31b80e
Replica start peer recovery with safe commit (#28181)
Today a replica starts a peer-recovery with the last commit. If the last
commit is not a safe commit, a replica will immediately fallback to the
file based sync which is more expensive than the sequence based
recovery. This commit modifies the peer-recovery in replica to start
with a safe commit. Moreover we can keep the existing translog on the
target if the recovery is sequence based recovery.

Relates #10708
2018-01-12 19:09:31 -05:00
Nhat Nguyen f2db2a02e2
Truncate tlog cli should assign global checkpoint (#28192)
We are targeting to always have a safe index once the recovery is done. 
This invariant does not hold if the translog is manually truncated by 
users because the truncate translog cli resets the global checkpoint to
unassigned. This commit assigns the global checkpoint to the max_seqno
of the last commit when truncating translog. We can only safely do it
because the truncate translog command will generate a new history uuid
for that shard. With a new history UUID, sequence-based recovery between
that shard and other old shards will be disabled.

Relates #28181
2018-01-12 19:06:04 -05:00
Jason Tedor a15ba75d93
Fix lock accounting in releasable lock
Releasble locks hold accounting on who holds the lock when assertions
are enabled. However, the underlying lock can be re-entrant yet we mark
the lock as not held by the current thread as soon as the releasable is
closed. For a re-entrant lock this is not right because the thread could
have entered the lock multiple times. Instead, we have to count how many
times the thread has entered the lock and only mark the lock as not held
by the current thread when the counter reaches zero.

Relates #28202
2018-01-12 16:17:30 -05:00
Igor Motov c75ac319a6
Add ability to associate an ID with tasks (#27764)
Adds support for capturing the X-Opaque-Id header from a REST request and storing it's value in the tasks that this request started. It works for all user-initiated tasks (not only search).

Closes #23250

Usage:
```
$ curl -H "X-Opaque-Id: imotov" -H "foo:bar" "localhost:9200/_tasks?pretty&group_by=parents"
{
  "tasks" : {
    "7qrTVbiDQKiZfubUP7DPkg:6998" : {
      "node" : "7qrTVbiDQKiZfubUP7DPkg",
      "id" : 6998,
      "type" : "transport",
      "action" : "cluster:monitor/tasks/lists",
      "start_time_in_millis" : 1513029940042,
      "running_time_in_nanos" : 266794,
      "cancellable" : false,
      "headers" : {
        "X-Opaque-Id" : "imotov"
      },
      "children" : [
        {
          "node" : "V-PuCjPhRp2ryuEsNw6V1g",
          "id" : 6088,
          "type" : "netty",
          "action" : "cluster:monitor/tasks/lists[n]",
          "start_time_in_millis" : 1513029940043,
          "running_time_in_nanos" : 67785,
          "cancellable" : false,
          "parent_task_id" : "7qrTVbiDQKiZfubUP7DPkg:6998",
          "headers" : {
            "X-Opaque-Id" : "imotov"
          }
        },
        {
          "node" : "7qrTVbiDQKiZfubUP7DPkg",
          "id" : 6999,
          "type" : "direct",
          "action" : "cluster:monitor/tasks/lists[n]",
          "start_time_in_millis" : 1513029940043,
          "running_time_in_nanos" : 98754,
          "cancellable" : false,
          "parent_task_id" : "7qrTVbiDQKiZfubUP7DPkg:6998",
          "headers" : {
            "X-Opaque-Id" : "imotov"
          }
        }
      ]
    }
  }
}
```
2018-01-12 15:34:17 -05:00
Nhat Nguyen 55a14230a7
Do not keep 5.x commits once having 6.x commits (#28188)
Currently we keep a 5.x index commit as a safe commit until we have a
6.x safe commit. During that time, if peer-recovery happens, a primary
will send a 5.x commit in file-based sync and the recovery will even
fail as the snapshotted commit does not have sequence number tags.

This commit updates the combined deletion policy to delete legacy
commits if there are 6.x commits.

Relates #27606
Relates #28038
2018-01-11 18:34:17 -05:00
Tim Brooks 99f88f15c5
Rename core module to server (#28180)
This is related to #27933. It renames the core module to server. This is
the first step towards introducing an elasticsearch-core jar.
2018-01-11 11:30:43 -07:00