Commit Graph

402 Commits

Author SHA1 Message Date
Lee Hinman eebda6974d
Decouple NamedXContentRegistry from ElasticsearchException (#29253)
* Decouple NamedXContentRegistry from ElasticsearchException

This commit decouples `NamedXContentRegistry` from using either
`ElasticsearchException`, `ParsingException`, or `UnknownNamedObjectException`.

This will allow us to move NamedXContentRegistry to its own lib as part of the
xcontent extraction work.

Relates to #28504
2018-03-27 16:51:31 -06:00
Lee Hinman 7df66abaf5 [TEST] Fix issue with HttpInfo passed invalid parameter
HttpInfo is passed the maxContentLength as a parameter, but this value should
never be negative. This fixes the test to only pass a positive random value.
2018-03-27 14:20:06 -06:00
Lee Hinman b4c78019b0
Remove all dependencies from XContentBuilder (#29225)
* Remove all dependencies from XContentBuilder

This commit removes all of the non-JDK dependencies from XContentBuilder, with
the exception of `CollectionUtils.ensureNoSelfReferences`. It adds a third
extension point around dealing with time-based fields and formatters to work
around the Joda dependency.

This decoupling allows us to be able to move XContentBuilder to a separate lib
so it can be available for things like the high level rest client.

Relates to #28504
2018-03-27 12:58:22 -06:00
Jim Ferenczi 3db6f1c9d5 Fix sporadic failure in CompositeValuesCollectorQueueTests
This commit fixes a test bug that causes an NPE on empty segments.

Closes #29269
2018-03-27 20:11:21 +02:00
Jim Ferenczi 2aaa057387
Propagate ignore_unmapped to inner_hits (#29261)
In 5.2 `ignore_unmapped` was added to `inner_hits` in order to ignore invalid mapping.
This value was automatically set to the value defined in the parent query (`nested`, `has_child`, `has_parent`) but the refactoring of the parent/child in 5.6 removed this behavior unintentionally.
This commit restores this behavior but also makes sure that we always automatically enforce this value when the query builder is used directly (previously this was only done by the XContent deserialization).

Closes #29071
2018-03-27 18:55:42 +02:00
Nhat Nguyen dfc9e721d8 TEST: Increase timeout for testPrimaryReplicaResyncFailed
The default timeout (e.g. 10 seconds) may not be enough for CI to
re-allocate shards after the partition is healed. This commit increases
the timeout to 30 seconds and enables logging in order to have more
detailed information in case this test fails again.

Closes #29060
2018-03-27 12:18:09 -04:00
Nhat Nguyen d1d3edf156 TEST: Use different translog dir for a new engine
In #testPruneOnlyDeletesAtMostLocalCheckpoint, we create a new engine
but mistakenly use the same translog directory as the existing engine.
This prevents translog files from cleaning up when closing the engines.

ERROR   0.12s J2 | InternalEngineTests.testPruneOnlyDeletesAtMostLocalCheckpoint <<< FAILURES!
   > Throwable #1: java.io.IOException: could not remove the following files (in the order of attempts):
   >    translog-primary-060/translog-2.tlog:  java.io.IOException: access denied:

This commit makes sure to use a separate directory for each engine in
this test.
2018-03-27 09:45:51 -04:00
Christoph Büscher 8d6832c5ee
Make SearchStats implement Writeable (#29258)
Moves another class over from Streamable to Writeable. This also
allows some constructors to be removed or made private.
2018-03-27 15:21:11 +02:00
Nhat Nguyen 0ac89a32cc
Do not optimize append-only if seen normal op with higher seqno (#28787)
When processing an append-only operation, the primary knows that operations
can only conflict with another instance of the same operation. This is
true as the id was freshly generated. However this property doesn't hold
for replicas. As soon as an auto-generated ID was indexed into the
primary, it can be exposed to a search and users can issue a follow up
operation on it. In extremely rare cases, the follow up operation can
arrive and be processed on a replica before the original append-only
request. In this case we can't simply proceed with the append-only
request and blindly add it to the index without consulting the version
map. 

The following scenario can cause a difference between primary and
replica.

1. Primary indexes an auto-gen-id doc. (id=X, v=1, s#=20)
2. A refresh cycle happens on primary
3. The new doc is picked up and modified - say by a delete by query
   request - Primary gets a delete doc (id=X, v=2, s#=30)
4. Delete doc is processed first on the replica (id=X, v=2, s#=30)
5. Indexing operation arrives on the replica. Since it's an auto-gen-id
   request and the retry marker is lower, we put it into Lucene without
   any check. Replica has a doc the primary doesn't have.

To deal with a potential conflict between an append-only operation and a 
normal operation on replicas, we need to rely on sequence numbers. This
commit maintains the max seqno of non-append-only operations on the
replica and applies the optimization for an append-only operation only
if its seq# is higher than the seq# of all non-append-only operations.
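
The rule described above can be sketched as follows. This is a minimal illustration, not the engine code from the PR; the class `AppendOnlyGuard` and its method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified tracker; not the actual engine implementation.
class AppendOnlyGuard {
    // Highest seq# seen so far for any non-append-only operation (-1 = none yet).
    private final AtomicLong maxSeqNoOfNonAppendOnlyOps = new AtomicLong(-1);

    // Record a normal (non-append-only) operation such as a delete or an update.
    void onNonAppendOnlyOp(long seqNo) {
        maxSeqNoOfNonAppendOnlyOps.accumulateAndGet(seqNo, Math::max);
    }

    // An append-only operation may skip the version map lookup only when its
    // seq# is higher than that of every non-append-only operation seen so far.
    boolean canOptimizeAppendOnly(long seqNo) {
        return seqNo > maxSeqNoOfNonAppendOnlyOps.get();
    }
}
```

With the scenario above, once the delete with s#=30 has been processed, the append-only indexing operation with s#=20 no longer qualifies for the optimization and has to consult the version map.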
2018-03-26 16:56:12 -04:00
Nhat Nguyen 87957603c0
Prune only gc deletes below local checkpoint (#28790)
Once a document is deleted and Lucene is refreshed, we will not be able 
to look up the `version/seq#` associated with that delete in Lucene. As
conflicting operations can still be indexed, we need another mechanism
to remember these deletes. Therefore deletes should still be stored in
the Version Map, even after Lucene is refreshed. Obviously, we can't
remember all deletes forever so a trimming mechanism is needed.
Currently, we remember deletes for at least 1 minute (the default GC
deletes cycle) and clean them periodically. This is, at the moment, the
best we can do on the primary for user facing APIs but this arbitrary
time limit is problematic for replicas. Furthermore, we can't rely on
the primary and replicas doing the trimming in a synchronized manner,
and failing to do so results in the replica and primary making different
decisions. 

The following scenario can cause inconsistency between
primary and replica.

1. Primary indexes doc (index, id=1, v2)
2. Network packet issue causes index operation to back off and wait
3. Primary deletes doc (delete, id=1, v3)
4. Replica processes delete (delete, id=1, v3)
5. 1+ minute passes (GC deletes runs on replica)
6. Indexing op is finally sent to the replica, which now processes it
   because it forgot about the delete.

We can rely on sequence numbers to prevent this issue. If we prune only
deletes whose seqno is at most the local checkpoint, a replica will
correctly remember what it needs. The correctness is explained as
follows:

Suppose o1 and o2 are two operations on the same document with seq#(o1) 
< seq#(o2), and o2 arrives before o1 on the replica. o2 is processed
normally since it arrives first; when o1 arrives it should be discarded:
 
1. If seq#(o1) <= LCP, then it will not be added to Lucene, as it was
  already previously added.

2. If seq#(o1)  > LCP, then it depends on the nature of o2:
  - If o2 is a delete then its seq# is recorded in the VersionMap,
    since seq#(o2) > seq#(o1) > LCP, so a lookup can find it and
    determine that o1 is stale.
  
  - If o2 is an indexing then its seq# is either in Lucene (if
    refreshed) or the VersionMap (if not refreshed yet), so a 
    real-time lookup can find it and determine that o1 is stale.

In this PR, we prefer to deploy a single trimming strategy, which 
satisfies both requirements, on primary and replicas because:

- It's simpler - no need to distinguish if an engine is running at
primary mode or replica mode or being promoted.

- If a replica subsequently is promoted, user experience is fully
maintained as that replica remembers deletes for the last GC cycle.

However, the version map may consume less memory if we deploy two 
different trimming strategies for primary and replicas.
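
A minimal sketch of the single trimming strategy, assuming a delete is pruned only when it is both older than the GC deletes interval and at or below the local checkpoint. `TombstonePruner` and `Tombstone` are hypothetical names, not the actual version map classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical, simplified tombstone store; not the actual version map.
class TombstonePruner {
    // A remembered delete: its sequence number and the time it happened.
    static final class Tombstone {
        final long seqNo;
        final long timeMillis;

        Tombstone(long seqNo, long timeMillis) {
            this.seqNo = seqNo;
            this.timeMillis = timeMillis;
        }
    }

    final Map<String, Tombstone> tombstones = new ConcurrentHashMap<>();

    // Drop a delete only if it is older than the GC deletes interval AND its
    // seq# is at most the local checkpoint; anything above the checkpoint is
    // kept so that a late, stale operation can still be detected.
    void prune(long nowMillis, long gcDeletesMillis, long localCheckpoint) {
        tombstones.values().removeIf(t ->
            nowMillis - t.timeMillis > gcDeletesMillis && t.seqNo <= localCheckpoint);
    }
}
```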
2018-03-26 13:42:08 -04:00
Boaz Leskes bca264699a remove testUnassignedShardAndEmptyNodesInRoutingTable
testUnassignedShardAndEmptyNodesInRoutingTable is as old as time and does a very bogus thing.
It is an IT test which extracts the GatewayAllocator from the node and tells it to allocate unassigned
shards, while giving it a conjured cluster state with no nodes in it (it uses DiscoveryNodes.EMPTY_NODES).
This is never a cluster state we want to reroute on (we always have at least the master node in it).
I'm going to just delete the test as I don't think it adds much value.

Closes #21463
2018-03-26 17:10:57 +02:00
Boaz Leskes f5d4550e93
Fold EngineDiskUtils into Store, for better lock semantics (#29156)
#28245 introduced the utility class `EngineDiskUtils` with a set of methods to prepare/change
translog and lucene commit points. That util class bundled everything that's needed to create an
empty shard, bootstrap a shard from a lucene index that was just restored, etc.

In order to safely do these manipulations, the util methods acquired the IndexWriter's lock. That
would sometimes fail due to concurrent shard store fetching or other short activities that require the
files not to be changed while they read from them.

Since there is no way to wait on the index writer lock, the `Store` class has other locks to make
sure that once we try to acquire the IW lock, it will succeed. To side step this waiting problem, this
PR folds `EngineDiskUtils` into `Store`. Sadly this comes with a price - the store class doesn't and
shouldn't know about the translog. As such the logic is slightly less tight and callers have to do the
translog manipulations on their own.
2018-03-26 14:08:03 +02:00
Christoph Büscher 318b0af953 Remove execute mode bit from source files
Some source files seem to have the execute bit (a+x) set, which doesn't
really seem to hurt but is a bit odd. This change removes those, making
the permissions similar to other source files in the repository.
2018-03-26 13:37:55 +02:00
Jim Ferenczi 5288235ca3
Optimize the composite aggregation for match_all and range queries (#28745)
This change refactors the composite aggregation to add an execution mode that visits documents in the order of the values
present in the leading source of the composite definition. This mode does not need to visit all documents since it can early terminate
the collection when the leading source value is greater than the lowest value in the queue.
Instead of collecting the documents in the order of their doc_id, this mode uses the inverted lists (or the bkd tree for numerics) to collect documents
in the order of the values present in the leading source.
For instance the following aggregation:

```
"composite" : {
  "sources" : [
    { "value1": { "terms" : { "field": "timestamp", "order": "asc" } } }
  ],
  "size": 10
}
```
... can use the field `timestamp` to collect the documents with the 10 lowest values for the field instead of visiting all documents.
For composite aggregation with more than one source the execution can early terminate as soon as one of the 10 lowest values produces enough
composite buckets. For instance if visiting the first two lowest timestamp created 10 composite buckets we can early terminate the collection since it
is guaranteed that the third lowest timestamp cannot create a composite key that compares lower than the one already visited.

This mode can execute iff:
 * The leading source in the composite definition uses an indexed field of type `date` (works also with `date_histogram` source), `integer`, `long` or `keyword`.
 * The query is a match_all query or a range query over the field that is used as the leading source in the composite definition.
 * The sort order of the leading source is the natural order (ascending since postings and numerics are sorted in ascending order only).

If these conditions are not met this aggregation visits each document like any other agg.
2018-03-26 09:51:37 +02:00
Nicholas Knize fede633563 Add Z value support to geo_shape
This enhancement adds Z value support (source only) to geo_shape fields. If vertices are provided with a third dimension, the third dimension is ignored for indexing but returned as part of source. Like before, any values beyond the 3rd dimension are ignored.

closes #23747
2018-03-23 08:50:55 -05:00
Nhat Nguyen 794de63232
Remove type casts in logging in server component (#28807)
This commit removes type-casts in logging in the server component (other 
components will be done later). This also adds a parameterized message
test which would catch breaking-changes related to lambdas in Log4J.
2018-03-23 07:35:50 -04:00
Yu 4a8099c696 Change BroadcastResponse from ToXContentFragment to ToXContentObject (#28878)
While working on #27799, we find that it might make sense to change BroadcastResponse from ToXContentFragment to ToXContentObject, seeing that it's rather a complete XContent object and also the other Responses are normally ToXContentObject.

By doing this, we can also move the XContent build logic of BroadcastResponse's subclasses, from Rest Layer to the concrete classes themselves.

Relates to #3889
2018-03-23 10:53:37 +01:00
Milan Chovatiya 8328b9c5cd REST : Split `RestUpgradeAction` into two actions (#29124)
Closes #29062
2018-03-23 10:37:31 +01:00
Nhat Nguyen 14157c8705
Harden periodically check to avoid endless flush loop (#29125)
In #28350, we fixed an endless flushing loop which may happen on 
replicas by tightening the relation between the flush action and the
periodically flush condition.

1. The periodically flush condition is enabled only if it is disabled 
after a flush.

2. If the periodically flush condition is enabled then a flush will
actually happen regardless of Lucene state.

(1) and (2) guarantee that a flushing loop will be terminated. Sadly, 
the condition 1 can be violated in edge cases as we used two different
algorithms to evaluate the current and future uncommitted translog size.

- We use method `uncommittedSizeInBytes` to calculate the current
  uncommitted size. It is the sum of translogs whose generation is at
  least the minGen (determined by a given seqno). We pick a continuous
  range of translogs starting from the minGen to evaluate the current
  uncommitted size.

- We use method `sizeOfGensAboveSeqNoInBytes` to calculate the future
  uncommitted size. It is the sum of translogs whose maxSeqNo is at least
  the given seqNo. Here we don't pick a range but select translogs one by
  one.

Suppose we have 3 translogs `gen1={#1,#2}, gen2={}, gen3={#3} and 
seqno=#1`, `uncommittedSizeInBytes` is the sum of gen1, gen2, and gen3
while `sizeOfGensAboveSeqNoInBytes` is the sum of gen1 and gen3. Gen2 is
excluded because its maxSeqno is still -1.
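
The mismatch in this example can be reproduced with a small model. This is only a sketch: the class, the method names and the byte sizes (100, 50, 100) are made up for illustration, and only the two selection rules follow the description above.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of translog generations; sizes are made up.
class TranslogSizeExample {
    static final class Gen {
        final long generation;
        final long maxSeqNo;    // -1 when the generation holds no operations
        final long sizeInBytes;

        Gen(long generation, long maxSeqNo, long sizeInBytes) {
            this.generation = generation;
            this.maxSeqNo = maxSeqNo;
            this.sizeInBytes = sizeInBytes;
        }
    }

    // Range-based rule: sum every generation from minGen onwards.
    static long sizeFromMinGen(List<Gen> gens, long minGen) {
        return gens.stream()
            .filter(g -> g.generation >= minGen)
            .mapToLong(g -> g.sizeInBytes)
            .sum();
    }

    // One-by-one rule: sum only generations whose maxSeqNo is at least seqNo.
    static long sizeOfGensAboveSeqNo(List<Gen> gens, long seqNo) {
        return gens.stream()
            .filter(g -> g.maxSeqNo >= seqNo)
            .mapToLong(g -> g.sizeInBytes)
            .sum();
    }

    // gen1={#1,#2}, gen2={}, gen3={#3}
    static List<Gen> example() {
        return Arrays.asList(new Gen(1, 2, 100), new Gen(2, -1, 50), new Gen(3, 3, 100));
    }
}
```

For seqno=#1 the minGen is gen1, so the range-based rule counts all three generations while the one-by-one rule skips the empty gen2. The two results differ by the size of gen2, which is how the two evaluations of the flush condition can disagree.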

This commit removes both `sizeOfGensAboveSeqNoInBytes` and 
`uncommittedSizeInBytes` methods, then enforces an engine to use only
`sizeInBytesByMinGen` method to evaluate the periodically flush condition.

Closes #29097
Relates #28350
2018-03-22 14:31:15 -04:00
Jim Ferenczi c93c7f3121
Remove deprecated options for query_string (#29203)
This commit removes some parameters deprecated in 6.x (or 5.x):
`use_dismax`, `split_on_whitespace`, `all_fields` and `lowercase_expanded_terms`.

Closes #25551
2018-03-22 18:37:08 +01:00
Yu 24c8d8f5ef REST high-level client: add force merge API (#28896)
Relates to #27205
2018-03-22 17:17:16 +01:00
Lee Hinman 7d1de890b8
Decouple more classes from XContentBuilder and make builder strict (#29197)
This commit decouples `BytesRef`, `Releaseable`, and `TimeValue` from
XContentBuilder, and paves the way for decoupling `ByteSizeValue` as well. It
moves much of the Lucene and Joda encoding into a new SPI extension that is
loaded by XContentBuilder to know how to encode these values.

Part of doing this also allows us to make JSON encoding strict, as we no longer
allow just any old object to be passed (in the past it was possible to get json
that was `"field": "java.lang.Object@d8355a8"` if no one was careful about what
was passed in).

Relates to #28504
2018-03-22 08:18:55 -06:00
Christoph Büscher d6d3fb3c73
Use EnumMap in ClusterBlocks (#29112)
By using EnumMap instead of an ImmutableLevelHolder array we can avoid
using enum ordinals to index into the array.
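
As an illustration of the pattern, here is a small sketch that keys data by enum constant with `java.util.EnumMap` rather than indexing an array by `ordinal()`. The `Level` enum and the `BlocksByLevel` class are hypothetical stand-ins, not the real `ClusterBlocks` types.

```java
import java.util.Collections;
import java.util.EnumMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in; not the real ClusterBlocks implementation.
class BlocksByLevel {
    // Assumed set of levels, for illustration only.
    enum Level { READ, WRITE, METADATA_READ, METADATA_WRITE }

    // Keyed directly by the enum constant: no ordinal() arithmetic and no
    // risk of an array going out of sync when constants are reordered.
    private final Map<Level, Set<String>> blocks = new EnumMap<>(Level.class);

    void add(Level level, String block) {
        blocks.computeIfAbsent(level, l -> new HashSet<>()).add(block);
    }

    Set<String> get(Level level) {
        return blocks.getOrDefault(level, Collections.emptySet());
    }
}
```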
2018-03-22 11:14:24 +01:00
Tanguy Leroux edf27a599e
Add new setting to disable persistent tasks allocations (#29137)
This commit adds a new setting `cluster.persistent_tasks.allocation.enable`
that can be used to enable or disable the allocation of persistent tasks.
The setting accepts the values `all` (default) or `none`. When set to
none, the persistent tasks that are created (or that must be reassigned)
won't be assigned to a node but will reside in the cluster state with
no "executor node" and a reason describing why they are not assigned:

```
"assignment" : {
  "executor_node" : null,
  "explanation" : "persistent task [foo/bar] cannot be assigned [no
  persistent task assignments are allowed due to cluster settings]"
}
```
2018-03-22 09:18:07 +01:00
Nhat Nguyen 7d44d75774 Adjust PreSyncedFlushResponse bwc versions
We discussed and agreed to include the synced-flush change in 6.3.0+ but
not in 5.6.9. We will re-evaluate the urgency and importance of the
issue, then decide which versions the change should be included in.
2018-03-21 16:50:35 -04:00
markharwood 93ff973afc
Tests - fix incorrect test assumption that zero-doc buckets will be returned by the adjacency matrix aggregation. Closes #29159 (#29167) 2018-03-21 10:42:14 +00:00
Jason Tedor 2f6c77337e Remove 6.1.5 version constant
The assumption here is that we will no longer be making a release from
the 6.1 branch. Since we assume that all versions on this branch are
actually released, we do not want to leave behind any versions that
would require a snapshot build. We do have a test that verifies that all
released versions are present here, so if another release is performed
from the 6.1 branch, that test will fail and we will know to add the
version constant at that time.
2018-03-21 06:28:17 -04:00
Adrien Grand 8f9d2ee4e2
Reject updates to the `_default_` mapping. (#29165)
This will reject mapping updates to the `_default_` mapping with 7.x indices
and still emit a deprecation warning with 6.x indices.

Relates #15613
Supersedes #28248
2018-03-21 10:44:11 +01:00
Nhat Nguyen f938c4267e Fix BWC issue for PreSyncedFlushResponse
I misunderstood how the bwc versions work. If we backport to 5.x, we
need to backport to all supported 6.*. This commit corrects the BWC
versions for PreSyncedFlushResponse.

Relates #29103
2018-03-20 13:56:15 -04:00
Lee Hinman b4af451ec5
Remove BytesArray and BytesReference usage from XContentFactory (#29151)
* Remove BytesArray and BytesReference usage from XContentFactory

This removes the usage of `BytesArray` and `BytesReference` from
`XContentFactory`. Instead, a regular `byte[]` should be passed. To assist with
this a helper has been added to `XContentHelper` that will preserve the offset
and length from the underlying BytesReference.

This is part of ongoing work to separate the XContent parts from ES so they can
be factored into their own jar.

Relates to #28504
2018-03-20 11:52:26 -06:00
Lee Hinman 4bd217c94f
Add pluggable XContentBuilder writers and human readable writers (#29120)
* Add pluggable XContentBuilder writers and human readable writers

This adds the ability to use SPI to plug in writers for XContentBuilder. By
implementing the XContentBuilderProvider class we can allow Elasticsearch to
plug in different ways to encode types to JSON.

Important caveat for this, we should always try to have the class implement
`ToXContentFragment` first, however, in the case of classes from our
dependencies (think Joda classes or Lucene classes) we need a way to specify
writers for these classes.

This also makes the human-readable field writers generic and pluggable, so that
we no longer need to tie XContentBuilder to things like `TimeValue` and
`ByteSizeValue`. As part of this, all the TimeValue human readable
fields move to the new `humanReadableField` method. A future commit will
move the `ByteSizeValue` calls over to this method.

Relates to #28504
2018-03-20 11:39:24 -06:00
Christoph Büscher 701625b065 Add unreleased version 6.2.4 (#29171) 2018-03-20 18:38:06 +01:00
Christoph Büscher 5a97fe75da Add unreleased version 6.1.5 (#29168) 2018-03-20 18:31:59 +01:00
Luca Cavanna ff09c82319
REST high-level client: add clear cache API (#28866)
* REST high-level client: add clear cache API

Relates to #27205

Also Closes #26947 (rest-spec were outdated)
2018-03-20 10:39:36 +01:00
Lee Hinman 687577a516 Fix javadoc warning in Strings for missing parameter description
Fixes a parameter in `Strings` that had a javadoc annotation but was missing the
description, causing warnings in the build.
2018-03-19 12:28:15 -06:00
Lee Hinman 3025295f7e
Decouple Text and Geopoint from XContentBuilder (#29119)
This removes the `Text` and `Geopoint` special handling from `XContentBuilder`.
Instead, these classes now implement `ToXContentFragment` and render themselves
accordingly.

This allows us to further decouple XContentBuilder from Elasticsearch-specific
classes so it can be factored into a standalone lib at a later time.

Relates to #28504
2018-03-19 08:54:10 -06:00
Nik Everett bf05c600c4
REST: Include suppressed exceptions on failures (#29115)
This modifies xcontent serialization of Exceptions to contain suppressed
exceptions. If there are any suppressed exceptions they are included in
the exception response by default. The reasoning here is that they are
fairly rare but when they exist they almost always add extra useful
information. Take, for example, the response when you specify two broken
ingest pipelines:

```
{
  "error" : {
    "root_cause" : ...snip...
    "type" : "parse_exception",
    "reason" : "[field] required property is missing",
    "header" : {
      "processor_type" : "set",
      "property_name" : "field"
    },
    "suppressed" : [
      {
        "type" : "parse_exception",
        "reason" : "[field] required property is missing",
        "header" : {
          "processor_type" : "convert",
          "property_name" : "field"
        }
      }
    ]
  },
  "status" : 400
}
```

Moreover, when suppressed exceptions come from 500 level errors they
should give us more useful debugging information.
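
For readers unfamiliar with the JDK side of this, suppressed exceptions are attached with `Throwable#addSuppressed` and read back with `Throwable#getSuppressed`. The sketch below renders them into a nested map shaped like the response above; `SuppressedRendering` is a hypothetical helper, not the xcontent serialization code changed by this commit.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; only the Throwable methods are real JDK API.
class SuppressedRendering {
    static Map<String, Object> render(Throwable t) {
        Map<String, Object> out = new LinkedHashMap<>();
        out.put("type", t.getClass().getSimpleName());
        out.put("reason", t.getMessage());
        Throwable[] suppressed = t.getSuppressed();
        // Only emit the "suppressed" array when there is something to show.
        if (suppressed.length > 0) {
            List<Object> rendered = new ArrayList<>();
            for (Throwable s : suppressed) {
                rendered.add(render(s));
            }
            out.put("suppressed", rendered);
        }
        return out;
    }
}
```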

Closes #23392
2018-03-19 10:52:50 -04:00
Tanguy Leroux 0f93b7abdf Fix compilation errors in ML integration tests
After elastic/elasticsearch#29109, the `needsReassignment` method has
been moved to the PersistentTasksClusterService. This commit fixes
some compilation errors in tests I introduced.
2018-03-19 09:46:53 +01:00
Tanguy Leroux b57bd695f2
Small code cleanups and refactorings in persistent tasks (#29109)
This commit consists of small code cleanups and refactorings in the
persistent tasks framework. Most changes are in
PersistentTasksClusterService where some methods have been renamed
or merged together, documentation has been added, unused code removed
in order to improve readability of the code.
2018-03-19 09:26:17 +01:00
Nhat Nguyen f1029aaad5
getMinGenerationForSeqNo should acquire read lock (#29126)
The method Translog#getMinGenerationForSeqNo does not modify the current
translog but only accesses it; it should therefore acquire the readLock
instead of the writeLock.
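
A minimal sketch of the locking pattern, using the JDK's `ReentrantReadWriteLock`. `GenerationIndex` is a hypothetical class with simplified semantics (it returns -1 when no generation matches); it is not the actual Translog code.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified stand-in for a translog; not the real class.
class GenerationIndex {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    // maxSeqNoPerGen[i] is the max seq# stored in generation i + 1.
    private long[] maxSeqNoPerGen = new long[0];

    // Mutates state, so it takes the exclusive write lock.
    void addGeneration(long maxSeqNo) {
        lock.writeLock().lock();
        try {
            maxSeqNoPerGen = Arrays.copyOf(maxSeqNoPerGen, maxSeqNoPerGen.length + 1);
            maxSeqNoPerGen[maxSeqNoPerGen.length - 1] = maxSeqNo;
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Only reads state, so the shared read lock is enough and concurrent
    // readers do not block each other.
    long minGenerationForSeqNo(long seqNo) {
        lock.readLock().lock();
        try {
            for (int i = 0; i < maxSeqNoPerGen.length; i++) {
                if (maxSeqNoPerGen[i] >= seqNo) {
                    return i + 1;
                }
            }
            return -1; // simplification: no generation contains seqNo or above
        } finally {
            lock.readLock().unlock();
        }
    }
}
```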
2018-03-17 17:43:20 -04:00
Nhat Nguyen c9749180a1 Backport - Do not renew sync-id PR to 5.6 and 6.3
Relates #29103
2018-03-17 11:38:22 -04:00
Jason Tedor 2e93a9158f
Align thread pool info to thread pool configuration (#29123)
Today we report thread pool info using a common object. This means that
we use a shared set of terminology that is not consistent with the
terminology used to configure thread pools. This holds in particular
for the minimum and maximum number of threads in the thread pool where
we use the following terminology:
 thread pool info | fixed | scaling
 min                core    size
 max                max     size

This commit changes the display of thread pool info to be dependent on
the type of the thread pool so that we can align the terminology in the
output of thread pool info with the terminology used to configure a
thread pool.
2018-03-16 22:47:06 -04:00
Nhat Nguyen 22ad52a288 TEST: Adjust translog size assumption in new engine
A new engine can now have more than one empty translog since #28676.
This caused #testShouldPeriodicallyFlush to fail because in the test we
assume an engine should have one empty translog. This commit takes into
account the extra translog size of a new engine.
2018-03-16 21:50:31 -04:00
olcbean 47211c00e9 REST: Clear Indices Cache API simplify param parsing (#29111)
Simplify the parsing of the params in Clear Indices Cache API, as
a follow up to the removing of the deprecated parameter names.
2018-03-16 16:50:34 -04:00
Jason Tedor 4d62640bf1 Fix typo in ExceptionSerializationTests
This commit fixes a little typo in ExceptionSerializationTests.java
replacing "weas" by "was".
2018-03-16 15:52:39 -04:00
Jason Tedor 1f1a4d17b4 Remove BWC layer for rejected execution exception
The serialization changes for rejected execution exceptions have been
backported to 6.x with the intention to appear in all versions since
6.3.0. Therefore, this BWC layer is no longer needed in master since
master would never speak to a node that does not speak the same
serialization.
2018-03-16 14:40:17 -04:00
Jason Tedor 6bf742dd1b
Fix EsAbortPolicy to conform to API (#29075)
The rejected execution handler API says that rejectedExecution(Runnable,
ThreadPoolExecutor) throws a RejectedExecutionException if the task must
be rejected due to capacity on the executor. We do throw something that
smells like a RejectedExecutionException (it is named
EsRejectedExecutionException) yet we violate the API because
EsRejectedExecutionException is not a RejectedExecutionException. This
has caused problems before where we try to catch RejectedExecution when
invoking rejectedExecution but this causes EsRejectedExecutionException
to go uncaught. This commit addresses this by modifying
EsRejectedExecutionException to extend
RejectedExecutionException.
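
The difference can be shown with plain JDK types. In this sketch both exception classes are hypothetical stand-ins: one merely looks like a rejection, the other extends `java.util.concurrent.RejectedExecutionException` as the fix does for `EsRejectedExecutionException`.

```java
import java.util.concurrent.RejectedExecutionException;

// Hypothetical demo classes; only RejectedExecutionException is real JDK API.
class RejectionDemo {
    // Looks like a rejection by name, but is not a RejectedExecutionException.
    static class LookalikeRejection extends RuntimeException {}

    // Conforms to the handler API by extending the JDK exception.
    static class ConformingRejection extends RejectedExecutionException {}

    // Mimics a caller written against the JDK API: it only expects
    // RejectedExecutionException from a rejected execution handler.
    static boolean caughtAsRejection(RuntimeException thrownByHandler) {
        try {
            throw thrownByHandler;
        } catch (RejectedExecutionException e) {
            return true;  // handled as a rejection
        } catch (RuntimeException e) {
            return false; // would have escaped the rejection handling
        }
    }
}
```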
2018-03-16 14:34:36 -04:00
David Turner 158bb23887
Remove usages of obsolete settings (#29087)
The settings `indices.recovery.concurrent_streams` and
`indices.recovery.concurrent_small_file_streams` were removed in
f5e4cd4616. This commit removes their last traces
from the codebase.
2018-03-16 15:35:40 +00:00
Nhat Nguyen 2c1ef3d4c6
Do not renew sync-id if all shards are sealed (#29103)
Today the synced-flush always issues a new sync-id even though no
shards have changed since the last seal. This causes active shards to
have a different sync-id from offline shards even though all were
sealed and there have been no writes since then.

This commit avoids renewing the sync-id if all active shards are sealed
with the same sync-id.

Closes #27838
2018-03-16 11:16:30 -04:00
Adrien Grand 0755ff425f
Clarify requirements of strict date formats. (#29090)
Closes #29014
2018-03-16 14:39:36 +01:00
Alan Woodward a2d5cf6514 Compilation fix for #29067 2018-03-16 13:33:25 +00:00
Alan Woodward 986e518170
Store offsets in index prefix fields when stored in the parent field (#29067)
The index prefix field is normally indexed as docs-only, given that it cannot
be used in phrases.  However, in the case that the parent field has been indexed
with offsets, or has term-vector offsets, we should also store this in the index
prefix field for highlighting.

Note that this commit does not implement highlighting on prefix fields, but
rather ensures that future work can implement this without a backwards-break
in index data.

Closes #28994
2018-03-16 11:39:46 +00:00
Tanguy Leroux f14146982f
Use removeTask instead of finishTask in PersistentTasksClusterService (#29055)
The method `PersistentTasksClusterService.finishTask()` has been
modified since it was added and does not use any `removeOncompletion`
flag anymore. Its behavior is now similar to `removeTask()` and can be
replaced by this one. When a non existing task is removed, the cluster
state update task will fail and its `source` will still indicate
`finish persistent task`/`remove persistent task`.
2018-03-16 10:20:56 +01:00
Yogesh Gaikwad a685784cea
CLI: Close subcommands in MultiCommand (#28954)
* CLI Command: MultiCommand must close subcommands to release resources properly

- Changes are done to override the close method and call close on subcommands using IOUtils#close
- Unit Test
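
A sketch of the idea using only JDK types. The actual change calls `IOUtils#close`; the loop below is a hand-written approximation of "close everything, keep the first failure", and `MultiCloser` is a hypothetical class.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a command that owns several subcommands.
class MultiCloser implements Closeable {
    final Map<String, Closeable> subcommands = new LinkedHashMap<>();

    @Override
    public void close() throws IOException {
        IOException first = null;
        // Attempt to close every subcommand even if an earlier one fails.
        for (Closeable subcommand : subcommands.values()) {
            try {
                subcommand.close();
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```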

Closes #28953
2018-03-16 09:59:23 +11:00
Nhat Nguyen c75790e7c0
TEST: write ops should execute under shard permit (#28966)
Currently ESIndexLevelReplicationTestCase executes write operations
without acquiring the index shard permit. This may prevent the primary term
on replica from being updated or cause a race between resync and
indexing on primary. This commit ensures that write operations are
always executed under shard permit like the production code.
2018-03-15 14:42:15 -04:00
Mayya Sharipova 8cb3d18eac Revert "Improve error message for installing plugin (#28298)"
This reverts commit 0cc1ffdf20

The reason is that Windows tests are failing
because of the incorrect path for the plugin.
2018-03-15 10:47:50 -07:00
Adrien Grand 404e776a45
Validate regular expressions in dynamic templates. (#29013)
Today you would only get these errors at index time.
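
Validating eagerly can be sketched with `java.util.regex.Pattern`: compile the expression when the template is defined and turn a syntax error into an immediate failure. `TemplatePatternValidator` is a hypothetical helper, not the mapping code touched by this commit.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper; only the java.util.regex calls are real JDK API.
class TemplatePatternValidator {
    // Fails fast when the expression does not compile, instead of waiting
    // until the first document is matched against it.
    static void validate(String regex) {
        try {
            Pattern.compile(regex);
        } catch (PatternSyntaxException e) {
            throw new IllegalArgumentException(
                "invalid regular expression [" + regex + "]: " + e.getDescription(), e);
        }
    }
}
```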

Relates #24749
2018-03-15 16:43:56 +01:00
Christoph Büscher 312ccc05d5
[Tests] Fix GetResultTests and DocumentFieldTests failures (#29083)
Changes made in #28972 seem to have changed some assumptions about how
SMILE and CBOR write byte[] values and how this is tested. This changes
the generation of the randomized DocumentField values back to BytesArray
while expecting the JSON and YAML deserialisation to produce Base64
encoded strings and SMILE and CBOR to parse back BytesArray instances.

Closes #29080
2018-03-15 16:42:26 +01:00
Adrien Grand 18d848f218
Reenable LiveVersionMapTests.testRamBytesUsed on Java 9. (#29063)
I also had to make the test more lenient. This is due to the fact that
Lucene's RamUsageTester was changed in order not to reflect `java.*`
classes and the way that it estimates ram usage of maps is by assuming
it has similar memory usage to an `Object[]` array that stores all keys
and values. The implementation in `LiveVersionMap` tries to be slightly
more realistic by taking the load factor and linked lists into account,
so it usually gives a higher estimate which happens to be closer to
reality.

Closes #22548
2018-03-15 16:39:02 +01:00
Christoph Büscher 85933161d4 Mute failing GetResultTests and DocumentFieldTests 2018-03-15 11:49:45 +01:00
Mayya Sharipova 0cc1ffdf20
Improve error message for installing plugin (#28298)
Provide a more actionable error message when installing an offline plugin
in the plugins directory, and the `plugins` directory for the node
contains a plugin distribution.

Closes #27401
2018-03-14 16:19:04 -07:00
Lee Hinman 8425257593 [TEST] Fix issue parsing response out of order
When parsing GetResponse it was possible that the equality check failed because
items in the map were in a different order (in the `.equals` implementation).
2018-03-14 16:34:40 -06:00
Christoph Büscher ae912cbde4
[Docs] Fix Java Api index administration usage (#28260)
The Java API documentation for index administration currently is wrong because
the PutMappingRequestBuilder#setSource(Object... source) and
CreateIndexRequestBuilder#addMapping(String type, Object... source) methods
delegate to methods that check that the input arguments are valid key/value
pairs. This changes the docs so the java api code examples are included from
documentation integration tests so we detect compile and runtime issues earlier.

Closes #28131
2018-03-14 22:02:06 +01:00
olcbean 3d81497f25 REST: Clear Indices Cache API remove deprecated url params (#29068)
By the time the master branch is released the deprecated url
parameters in the `/_cache/clear` API will have been deprecated
for a couple of minor releases. Since master will be the next
major release we are fine with removing these parameters.
2018-03-14 16:37:50 -04:00
Boaz Leskes bf65cb4914
Untangle Engine Constructor logic (#28245)
Currently we have fairly complicated logic in the engine constructor to deal with all the
various ways we want to mutate the lucene index and translog we're opening.

We can:
1) Create an empty index
2) Use the lucene index but create a new translog
3) Use both
4) Force a new history uuid in all cases.

This leads to complicated code flows, which makes it harder and harder to make sure we cover all the
corner cases. This PR tries to take another approach. Constructing an InternalEngine always opens 
things as they are and all needed modifications are done by static methods directly on the 
directory, one at a time.
2018-03-14 20:59:47 +01:00
Lee Hinman 8e8fdc4f0e
Decouple XContentBuilder from BytesReference (#28972)
* Decouple XContentBuilder from BytesReference

This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.

While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).

Relates to #28504
2018-03-14 13:47:57 -06:00
David Kyle cb9d10f971
Protect against NPE in RestNodesAction (#29059)
* Protect against NPE in RestNodesAction
2018-03-14 15:47:18 +00:00
David Roberts 5bf92ca3b3
Enforce that java.io.tmpdir exists on startup (#28217)
If the default java.io.tmpdir is used then the startup script creates
it, but if a custom java.io.tmpdir is used then the user must ensure it
exists before running Elasticsearch. If they forget then it can cause
errors that are hard to understand, so this change adds an explicit
check early in the bootstrap and reports a clear error if java.io.tmpdir
is not an accessible directory.
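
A minimal sketch of such an early check, assuming a hypothetical `TmpDirCheck` class and error wording (neither is taken from the actual change):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class TmpDirCheck {
    // Hypothetical helper: fails fast if the given temp dir is missing
    // or is not a directory, instead of failing obscurely later on.
    static void checkTmpDir(String tmpDir) {
        if (tmpDir == null) {
            throw new IllegalStateException("java.io.tmpdir is not set");
        }
        Path path = Paths.get(tmpDir);
        if (!Files.isDirectory(path)) {
            throw new IllegalStateException(
                "java.io.tmpdir [" + tmpDir + "] is not an accessible directory");
        }
    }

    public static void main(String[] args) {
        // Validate the temp dir the JVM was actually started with.
        checkTmpDir(System.getProperty("java.io.tmpdir"));
        System.out.println("java.io.tmpdir looks usable");
    }
}
```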
2018-03-14 15:43:53 +00:00
Jason Tedor 24d10adaab
Main response should not have status 503 when okay (#29045)
The REST status 503 means "I can not handle the request that you sent
me." However today we respond to a main request with a 503 when there
are certain cluster blocks despite still responding with an actual main
response. This is broken, we should respond with a 200 status. This
commit removes this silliness.
2018-03-14 06:36:37 -04:00
Jason Tedor 647d0a1e95
Do not swallow fail to convert exceptions (#29043)
When converting the source for an indexing request to JSON, the
conversion can throw an I/O exception which we swallow and proceed with
logging to the slow log. The cause of the I/O exception is lost. This
commit changes this behavior and chooses to drop the entry from the slow
logs and instead lets an exception percolate up to the indexing
operation listener loop. Here, the exception will be caught and logged
at the warn level.
2018-03-13 23:42:16 -04:00
Jason Tedor 46fcd07153
Add total hits to the search slow log (#29034)
This commit adds the total hits to the search slow log.
2018-03-13 20:40:47 -04:00
Jason Tedor 4dc3adad51
Archive unknown or invalid settings on updates (#28888)
Today we can end up in a situation where the cluster state contains
unknown or invalid settings. This can happen easily during a rolling
upgrade. For example, consider two nodes that are on a version that
considers the setting foo.bar to be known and valid. Assume one of these
nodes is restarted on a higher version that considers foo.bar to now be
either unknown or invalid, and then the second node is restarted
too. Now, both nodes will be on a version that considers foo.bar to be
unknown or invalid yet this setting will still be contained in the
cluster state. This means that if a cluster settings update is applied
and we validate the settings update with the existing settings then
validation will fail. In such a state, the offending setting can not
even be removed. This commit helps out with this situation by archiving
any settings that are unknown or invalid at the time that a settings
update is applied. This allows the setting update to go through, and the
archived settings can be removed at a later time.
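
The archiving idea can be sketched roughly as follows. The `SettingsArchiver` class, the plain `Map` representation and the `archived.` prefix are assumptions made for illustration, not the actual settings infrastructure:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

public class SettingsArchiver {
    // Assumed prefix for this sketch; archived keys are kept but renamed.
    static final String ARCHIVED_PREFIX = "archived.";

    // Returns a copy of the settings in which every key that is not known
    // is moved under the archived prefix, so validation no longer sees it.
    static Map<String, String> archiveUnknown(Map<String, String> settings, Set<String> knownKeys) {
        Map<String, String> result = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : settings.entrySet()) {
            String key = entry.getKey();
            if (knownKeys.contains(key) || key.startsWith(ARCHIVED_PREFIX)) {
                result.put(key, entry.getValue());
            } else {
                result.put(ARCHIVED_PREFIX + key, entry.getValue());
            }
        }
        return result;
    }
}
```

Because the value is preserved under the renamed key, the archived entry can still be inspected and removed by a later update.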
2018-03-13 17:32:18 -04:00
Jason Tedor c8e71327ab
Log template creation and deletion (#29027)
These can be seen at the debug level via cluster state update logging
but really they should be more visible like index creation and
deletion. This commit adds info-level logging for template puts and
deletes.
2018-03-13 16:31:19 -04:00
Jason Tedor 697b9f8b82
Remove interning from prefix logger (#29031)
This interning is completely unnecessary because we look up the marker
by the prefix (value, not identity) anyway. This means that regardless
of the identity of the prefix, we end up with the same marker. That is
all that we really care about here.
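
To illustrate why interning adds nothing here, a small sketch with a hypothetical `MarkerLookup` cache (a plain `HashMap` with `Object` markers, not the real prefix logger): the lookup compares keys by value, so two distinct `String` instances with the same contents resolve to the same marker.

```java
import java.util.HashMap;
import java.util.Map;

public class MarkerLookup {
    // Hypothetical cache keyed by prefix; HashMap compares keys with
    // equals()/hashCode(), i.e. by value rather than by identity.
    // Not thread-safe; a sketch only.
    private static final Map<String, Object> MARKERS = new HashMap<>();

    static Object markerFor(String prefix) {
        return MARKERS.computeIfAbsent(prefix, p -> new Object());
    }

    public static void main(String[] args) {
        String a = new String("[node-1]");
        String b = new String("[node-1]");
        // Different identities, same marker.
        System.out.println(a != b);
        System.out.println(markerFor(a) == markerFor(b));
    }
}
```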
2018-03-13 16:30:13 -04:00
olcbean edc57f6f34 REST: deprecate `field_data` in Clear Cache API (#28943)
We call it `fielddata` everywhere else in the code and API so we may as
well be consistent.
2018-03-13 15:16:27 -04:00
Jason Tedor 5904d936fa
Copy Lucene IOUtils (#29012)
As we have factored Elasticsearch into smaller libraries, we have ended
up in a situation where some of the dependencies of Elasticsearch are not
available to code that depends on these smaller libraries but not server
Elasticsearch. This is a good thing, this was one of the goals of
separating Elasticsearch into smaller libraries, to shed some of the
dependencies from other components of the system. However, this now
means that simple utility methods from Lucene that we rely on are no
longer available everywhere. This commit copies IOUtils (with some small
formatting changes for our codebase) into the fold so that other
components of the system can rely on these methods where they no longer
depend on Lucene.
2018-03-13 12:49:33 -04:00
Jason Tedor 6088af5887 Fix comment regarding removal of requiresKeystore
The requiresKeystore flag was removed from PluginInfo in 6.3.0. This
commit fixes a pair of code comments that incorrectly refer to this
version as 7.0.0.
2018-03-12 14:20:02 -04:00
Jason Tedor b8e165a994 Fix BWC versions on plugin info
This commit fixes the BWC versions on the plugin info serialization
which was changed to remove the requiresKeystore flag.
2018-03-12 13:05:48 -04:00
Jason Tedor 6331bcaf76
Create keystore on package install (#28928)
This commit removes the ability to specify that a plugin requires the
keystore and instead creates the keystore on package installation or
when Elasticsearch is started for the first time. The reason that we opt
to create the keystore on package installation is to ensure that the
keystore has the correct permissions (the package installation scripts
run as root as opposed to Elasticsearch running as the elasticsearch
user) and to enable removing the keystore on package removal if the
keystore is not modified.
2018-03-12 12:48:00 -04:00
Mika⠙ a7b53fd3b7 Add check when trying to reroute a shard to a non-data discovery node (#28886)
While trying to reroute a shard to or from a non-data node (a node with ``node.data=false``), I encountered a null pointer exception. Though an exception is to be expected, the NPE was occurring because ``allocation.routingNodes()`` would not contain any non-data nodes, so when you attempt to do ``allocation.routingNodes.node(non-data-node)`` it would not find it, and thus error. This occurred regardless of whether I was rerouting to or from a non-data node.

This PR adds a check (as well as a test for these use cases) to return a legible, useful exception if the discovery node you are rerouting to or from is not a data node.
2018-03-12 16:48:51 +01:00
Jason Tedor b1b469e30f
Avoid class cast exception from index writer (#28989)
When an index writer encounters a tragic exception, it could be a
Throwable and not an Exception. Yet we blindly cast the tragic exception
to an Exception which can result in a ClassCastException. This commit
addresses this by checking if the tragic exception is an Exception and
otherwise wrapping the Throwable in a RuntimeException. We
choose to wrap the Throwable instead of passing it around because
passing it around leads to changing a lot of places where we handle
Exception to handle Throwable instead. In general, we have tried to
avoid handling Throwable and instead let those bubble up to the uncaught
exception handler.
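
A minimal sketch of the check-then-wrap approach described above; the `TragicExceptions` class and `asException` method are hypothetical names, not the engine's actual code:

```java
public class TragicExceptions {
    // Returns the tragic event as an Exception without a blind cast:
    // Exceptions pass through unchanged, any other Throwable (for example
    // an Error) is wrapped so callers can keep handling Exception only.
    static Exception asException(Throwable tragic) {
        if (tragic instanceof Exception) {
            return (Exception) tragic;
        }
        return new RuntimeException(tragic);
    }
}
```

The original Throwable stays reachable through `getCause()` on the wrapper.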
2018-03-12 08:42:02 -04:00
Yannick Welsch be7f5dde24
Disallow logger methods with Object parameter (#28969)
Log4j2 provides a wide range of logging methods. Our code typically only uses a subset of them. In particular, uses of the methods trace|debug|info|warn|error|fatal(Object) or trace|debug|info|warn|error|fatal(Object, Throwable) have all been wrong, leading to not properly logging the provided message. To prevent these issues in the future, the corresponding Logger methods have been blacklisted.
2018-03-12 03:05:24 -07:00
Jim Ferenczi 7afe5ad943
Restore tiebreaker for cross fields query (#28935)
This commit restores the handling of tiebreaker for multi_match
cross fields query. This functionality was lost during a refactoring
of the multi_match query (#25115).

Fixes #28933
2018-03-12 09:58:20 +01:00
Ryan Ernst 4216fc9f64
Plugins: Allow modules to spawn controllers (#28968)
This commit makes the controller spawner also look under modules. It
also fixes a bug in module security policy loading where the module is a
meta plugin.
2018-03-11 09:01:27 -07:00
Nhat Nguyen 4f644d04a3 TEST: Use non-zero number for #testCompareUnits
In `ByteSizeValueTests#testCompareUnits`, we expect non-zero for the
variable `number` however `randomNonNegativeLong` can return zero.

CI: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+6.2+oracle-java10-periodic/147/console
2018-03-10 22:56:22 -05:00
Jason Tedor 4ba80a7952
Maybe die before failing engine (#28973)
Today we check for a few cases where we should maybe die before failing
the engine (e.g., when a merge fails). However, there are still other
cases where a fatal error can be hidden from us (for example, a failed
index writer commit). This commit modifies the mechanism for failing the
engine to always check for a fatal error before failing the engine.
2018-03-10 07:41:51 -05:00
Jason Tedor 950c4363bf
Remove special handling for _all in nodes info
Today when requesting _all we return all nodes regardless of what other
node qualifiers are in the request. This is contrary to how the
remainder of the API behaves which acts as additive and subtractive
based on the qualifiers and their ordering. It is also contrary to how
the wildcard * behaves. This commit removes the special handling for
_all so that it behaves identically to the wildcard *.

Relates #28971
2018-03-09 18:23:16 -05:00
Lee Hinman 97b513e925
Remove Booleans use from XContent and ToXContent (#28768)
* Remove Booleans use from XContent and ToXContent

This removes the use of the `common.Boolean` class from two of the XContent
classes, so they can be decoupled from the ES code as much as possible.

Related to #28754, #28504
2018-03-09 14:58:54 -07:00
Nhat Nguyen 4973887a10
Make primary-replica resync failures less lenient (#28534)
Today, failures from the primary-replica resync are ignored as a best
effort to not mark shards as stale during a cluster restart. However
this can be problematic if replicas fail to execute resync operations
but succeed in the subsequent write operations. When this happens, the
replica will miss some operations from the new primary. There are some
implications if the local checkpoint on the replica can't advance because of
the missing operations.

1. The global checkpoint won't advance - this causes both primary and
replicas to keep many index commits

2. Engine on replica won't flush periodically because uncommitted stats
are calculated based on the local checkpoint

3. Replica can use a large number of bitsets to keep track of operation seqnos

However we can prevent this issue but still preserve the best-effort behavior by
failing replicas which fail to execute resync operations but not marking
them as stale. We have prepared the required infrastructure in #28049
and #28054 for this change.

Relates #24841
2018-03-09 09:55:45 -08:00
Martijn van Groningen b32e999960 Use different pipeline id in test.
(pipelines do not get removed between tests extending from ESIntegTestCase)
2018-03-09 14:29:42 +01:00
David Turner 033a83b98b
Use String.join() to describe a list of tasks (#28941)
This change replaces the use of string concatenation with a call to
String.join(). String concatenation might be quadratic, unless the compiler can
optimise it away, whereas String.join() is more reliably linear. There can
sometimes be a large number of pending ClusterState update tasks and #28920
includes a report that this operation sometimes takes a long time.
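
As a small illustration of the difference, assuming a hypothetical `TaskDescriptions` helper and a `", "` separator (neither is taken from the change itself):

```java
import java.util.Arrays;
import java.util.List;

public class TaskDescriptions {
    // Repeated += in a loop copies the accumulated string on every
    // iteration, which can be quadratic in the total length.
    static String describeByConcatenation(List<String> tasks) {
        String result = "";
        for (int i = 0; i < tasks.size(); i++) {
            if (i > 0) {
                result += ", ";
            }
            result += tasks.get(i);
        }
        return result;
    }

    // String.join builds the same result in a single pass.
    static String describeByJoin(List<String> tasks) {
        return String.join(", ", tasks);
    }

    public static void main(String[] args) {
        List<String> tasks = Arrays.asList("put-mapping", "create-index", "shard-started");
        System.out.println(describeByJoin(tasks));
    }
}
```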
2018-03-09 09:42:44 +00:00
Martijn van Groningen 41519da45a Fixed incorrect test try-catch statement 2018-03-09 09:38:16 +01:00
Ryan Ernst 62293ec1c9
Plugins: Consolidate plugin and module loading code (#28815)
At one point, modules and plugins were very different. But effectively
now they are the same, just from different directories. This commit
unifies the loading methods so they are simply two different
directories. Note that the main codepath to load plugin bundles had
duplication (was not calling getPluginBundles) since previous
refactorings to add meta plugins. Note this change also rewords the
primary exception message when a plugin descriptor is missing, as the
wording asking if the plugin was built before 2.0 isn't really
applicable anymore (it is highly unlikely someone tries to install a 1.x
plugin on any modern version).
2018-03-08 22:49:27 -08:00
Lee Hinman 46a79127ed
Remove FastStringReader in favor of vanilla StringReader (#28944)
This allows us to remove another dependency in the decoupling of the XContent
code. Rather than move this class over or decouple it, it can simply be removed.

Relates tangentially to #28504
2018-03-08 17:17:36 -07:00
Lee Hinman d6d7ee7320
Remove FastCharArrayReader and FastCharArrayWriter (#28951)
These classes are used only in two places, and can be replaced by the
`CharArrayReader` and `CharArrayWriter`. The JDK can also perform lock biasing
and elision as well as escape analysis to optimize away non-contended locks,
rendering their lock-free implementations unnecessary.
2018-03-08 17:05:11 -07:00
Tal Levy 7784c1bff9
Continue registering pipelines after one pipeline parse failure. (#28752)
Ingest has been failing to apply existing pipelines that are no longer valid
from cluster-state into the in-memory representation. One example of
this is a pipeline with a script processor. If a cluster starts up with scripting
disabled, these pipelines will not be loaded. Even though GETing a pipeline worked,
indexing operations claimed that this pipeline did not exist. This is because one
gets information from cluster-state and the other is from an in-memory data-structure.

Now, two things happen:
1. suppress the exceptions until after other successful pipelines are loaded
2. replace failed pipelines with a placeholder pipeline

If the pipeline execution service encounters the stubbed pipeline, it is known that
something went wrong at the time of pipeline creation and an exception was thrown to
the user at some point at start-up.
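
The two steps above can be sketched as follows. `PipelineLoader`, its `Pipeline` type and the toy `parse` rule are hypothetical stand-ins for illustration, not the ingest service's real classes:

```java
import java.util.Map;

public class PipelineLoader {
    // Hypothetical stand-in for a parsed pipeline; `placeholder` marks a
    // stub that replaces a pipeline which failed to parse.
    static final class Pipeline {
        final String id;
        final boolean placeholder;

        Pipeline(String id, boolean placeholder) {
            this.id = id;
            this.placeholder = placeholder;
        }
    }

    // Toy parser for this sketch: any config mentioning "script" fails,
    // mimicking a script processor on a cluster with scripting disabled.
    static Pipeline parse(String id, String config) {
        if (config.contains("script")) {
            throw new IllegalArgumentException("pipeline [" + id + "] could not be parsed");
        }
        return new Pipeline(id, false);
    }

    // Loads every pipeline; a failure is deferred rather than aborting the
    // loop, and the failed pipeline is replaced by a placeholder.
    static void loadAll(Map<String, String> configs, Map<String, Pipeline> target) {
        RuntimeException deferred = null;
        for (Map.Entry<String, String> entry : configs.entrySet()) {
            String id = entry.getKey();
            try {
                target.put(id, parse(id, entry.getValue()));
            } catch (RuntimeException e) {
                target.put(id, new Pipeline(id, true));
                if (deferred == null) {
                    deferred = e;
                } else {
                    deferred.addSuppressed(e);
                }
            }
        }
        // Only report failures once all other pipelines are registered.
        if (deferred != null) {
            throw deferred;
        }
    }
}
```

An execution path that finds `placeholder == true` can then report that the pipeline failed to load instead of claiming that it does not exist.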

closes #28269.
2018-03-08 15:22:59 -08:00
Lee Hinman 17fc07a193
Switch XContentBuilder from BytesStreamOutput to ByteArrayOutputStream (#28945)
This switches the underlying byte output representation used by default in
`XContentBuilder` from `BytesStreamOutput` to a `ByteArrayOutputStream` (an
`OutputStream` can still be specified manually)

This is groundwork to allow us to decouple `XContent*` from the rest of the ES
core code so that it may be factored into a separate jar.

Since `BytesStreamOutput` was not using the recycling instance of `BigArrays`,
this should not affect the circuit breaking capabilities elsewhere in the
system.

Relates to #28504
2018-03-08 15:45:51 -07:00
Lee Hinman 697f3f1a3b
Factor UnknownNamedObjectException into its own class (#28931)
* Factor UnknownNamedObjectException into its own class

This moves the inner class `UnknownNamedObjectException` from
`NamedXContentRegistry` into a top-level class. This is so that
`NamedXContentRegistry` doesn't have to depend on StreamInput and StreamOutput.

Relates to #28504
2018-03-08 15:32:41 -07:00
Lee Hinman ec92796ed8
Remove now-unused createParser that uses BytesReference (#28926)
This removes `BytesReference` use from XContent and all subclasses.

Relates to #28504
2018-03-08 09:10:21 -07:00
Jim Ferenczi bc8b3fc71c Revert "Rescore collapsed documents (#28521)"
This reverts commit f057fc294a.
The rescorer does not re-sort the collapsed values inside the top docs
during rescoring. For this reason the Lucene rescorer is not compatible
with collapsing.
Relates #27243
2018-03-08 11:20:29 +01:00