Commit Graph

8420 Commits

Author SHA1 Message Date
Christoph Büscher 5541debd5a Merge branch 'master' into feature/client_aggs_parsing 2017-04-24 14:48:20 +02:00
Yannick Welsch 7c395070e2 [TEST] Wait for tribe node to be fully connected before shutting it down
The tribe node was being shut down by the test while a publishing round (that adds the tribe node to a cluster) had not completed yet (i.e. the node itself
knows that it became part of the cluster, and the test shuts the tribe node down, but another node has not applied the cluster state yet). This makes
that node hang while trying to connect to the node that is shutting down (due to connect_timeout being 30 seconds), delaying publishing for 30
seconds, and subsequently tripping an assertion when another tribe instance wants to join.

Relates to #23695
2017-04-24 12:27:41 +02:00
Colin Goodheart-Smithe 6d6a230f70 Makes StoredScriptSource implement ToXContentObject 2017-04-24 10:20:15 +01:00
Colin Goodheart-Smithe d4a6ba8ec9 No longer add illegal content type option to stored search templates (#24251)
When parsing StoredSearchScript we were adding a Content type option that was forbidden (by a check that threw an exception) by the parser that's used to parse the template when we read it from the cluster state. This was stopping Elasticsearch from starting after stored search templates had been added.

This change no longer adds the content type option to the StoredScriptSource object when parsing from the put search template request. This is safe because the StoredScriptSource content is always JSON when it is stored in the cluster state, since we do a conversion to JSON before this point.

Also removes the check for the content type in the options when parsing StoredScriptSource so users who already have stored scripts can start Elasticsearch.

Closes #24227
2017-04-22 13:37:04 -04:00
Ryan Ernst 473e98981b Scripts: Remove unnecessary executable shortcut (#24264)
ScriptService has two executable methods, one which takes a
CompiledScript, which is similar to search, and one that takes a raw
Script and both compiles and returns an ExecutableScript for it. The
latter is not needed, and the call sites which used one or the other
were mixed. This commit removes the extra executable method in favor of
callers first calling compile, then executable.
2017-04-21 17:53:03 -07:00
Ryan Ernst aadc33d260 Scripts: Remove unwrap method from executable scripts (#24263)
The unwrap method was left over from supporting JavaScript and Python. Since
those languages are removed in 6.0, this commit removes the unwrap
feature from scripts.
2017-04-21 17:50:22 -07:00
Nik Everett 447f307ebb Fix _bulk response when it can't create an index (#24048)
Before #22488 when an index couldn't be created during a `_bulk`
operation we'd do all the *other* actions and return the index
creation error on each failing action. In #22488 we accidentally
changed it so that we now reject the entire bulk request if a single
action cannot create an index that it must create to run. This
reverts to the old behavior while still keeping the nicer
error messages. Instead of failing the entire request we now only
fail the portions of the request that can't work because the index
doesn't exist.

Closes #24028
2017-04-21 18:56:04 -04:00
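The per-action failure behavior described in 447f307ebb above can be illustrated with a small sketch. It is written in Python rather than Elasticsearch's Java, and every name in it (`run_bulk`, `create_index`, the result dictionaries) is hypothetical; it shows the idea only, not the actual bulk implementation.

```python
def run_bulk(actions, create_index):
    """Run bulk actions, failing only those whose index cannot be created.

    `actions` is a list of (index_name, document) pairs. `create_index` is a
    hypothetical callable that raises ValueError when an index cannot be created.
    """
    created = set()   # indices that were created successfully
    failed = {}       # index name -> error message
    results = []
    for index, _doc in actions:
        if index not in created and index not in failed:
            try:
                create_index(index)
                created.add(index)
            except ValueError as e:
                failed[index] = str(e)
        if index in failed:
            # Only this action fails; the rest of the request still runs.
            results.append({"index": index, "status": "failed", "error": failed[index]})
        else:
            results.append({"index": index, "status": "ok"})
    return results


def create_index(name):
    # Illustrative rule only: reject index names that are not lower case.
    if name != name.lower():
        raise ValueError(f"invalid index name [{name}]")
```

With `run_bulk([("good", {}), ("BAD", {}), ("good", {})], create_index)`, only the middle action is reported as failed, carrying the index creation error.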
Jason Tedor fe91c72151 Use a marker file when removing a plugin
Today when removing a plugin, we attempt to move the plugin directory to
a temporary directory and then delete that directory from the
filesystem. We do this to avoid a plugin being in a half-removed
state. We previously tried an atomic move, and fell back to a non-atomic
move if that failed. Atomic moves can fail on union filesystems when the
plugin directory is not in the top layer of the
filesystem. Interestingly, the regular move can fail as well. This is
because when the JDK is executing such a move, it first tries to rename
the source directory to the target directory and if this fails with
EXDEV (as in the case of an atomic move failing), it falls back to
copying the source to the target, and then attempts to rmdir the
source. The bug here is that the JDK never deleted the contents of the
source so the rmdir will always fail (except in the case of an empty
directory).

Given all this silliness, we were inspired to find a different
strategy. The strategy is simple. We will add a marker file to the
plugin directory that indicates the plugin is in a state of
removal. This file will be the last file out the door during removal. If
this file exists during startup, we fail startup.

Relates #24252
2017-04-21 15:50:44 -04:00
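A minimal sketch of the marker-file strategy from fe91c72151 above, in Python rather than Java. The marker file name `.removing` and both function names are assumptions made for illustration; the commit message does not state them.

```python
import shutil
import tempfile
from pathlib import Path

MARKER = ".removing"  # hypothetical marker file name


def remove_plugin(plugin_dir: Path) -> None:
    """Remove a plugin directory, deleting the marker file last."""
    marker = plugin_dir / MARKER
    marker.touch()  # from here on the plugin is flagged as "being removed"
    for entry in plugin_dir.iterdir():
        if entry == marker:
            continue
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
    marker.unlink()     # the marker is the last file out the door
    plugin_dir.rmdir()


def check_plugins_on_startup(plugins_dir: Path) -> None:
    """Fail startup if any plugin was left in a half-removed state."""
    for plugin_dir in plugins_dir.iterdir():
        if plugin_dir.is_dir() and (plugin_dir / MARKER).exists():
            raise RuntimeError(f"plugin [{plugin_dir.name}] is in a state of removal")
```

Because the marker is created first and deleted last, a removal that is interrupted at any point leaves the marker behind, and the startup check can detect it without relying on an atomic directory move.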
Simon Willnauer 2ca7072b24 Fill missing sequence IDs up to max sequence ID when recovering from store (#24238)
Today we might promote a primary and recover from store where after translog
recovery the local checkpoint is still behind the maximum sequence ID seen.
To fill the holes in the sequence ID history this PR adds a utility method
that fills up all missing sequence IDs up to the maximum seen sequence ID
with no-ops.

Relates to #10708
2017-04-21 20:28:00 +02:00
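The gap filling described in 2ca7072b24 above could look roughly like the following Python sketch. The function names and the dictionary-based history are hypothetical, and the sketch assumes sequence numbers start at 0.

```python
def missing_seq_nos(seen, max_seq_no):
    """Return the sequence numbers in [0, max_seq_no] that were never seen."""
    seen = set(seen)
    return [n for n in range(max_seq_no + 1) if n not in seen]


def fill_with_noops(history, max_seq_no):
    """Return a copy of `history` (seq_no -> operation) with every gap filled by a no-op."""
    filled = dict(history)
    for n in missing_seq_nos(history, max_seq_no):
        filled[n] = "no-op"
    return filled
```

For a history containing sequence numbers 0, 1 and 4, the holes at 2 and 3 are filled with no-ops while the existing operations are left untouched.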
Ryan Ernst ba48674695 Build: Move plugin cli and tests to distribution tool (#24220)
The plugin cli currently resides inside the elasticsearch jar. This
commit moves it into a plugin-cli jar. This change alone is a no-op;
it does not change anything about what is loaded at runtime. But it will
allow easier testing (with fixtures in the future to test ES or maven
installation), as well as eventually not loading these classes when
starting elasticsearch.
2017-04-21 09:25:58 -07:00
Boaz Leskes badb2be066 Peer Recovery: remove maxUnsafeAutoIdTimestamp hand off (#24243)
With #24149, it is now stored in the Lucene commit and is implicitly transferred in the file phase of the recovery.
2017-04-21 17:31:50 +02:00
Ali Beyad 63e5aff5d6 Adds version 5.3.2 and backwards compatibility indices for 5.3.1 2017-04-21 10:48:41 -04:00
Tanguy Leroux adef5e227a Merge remote-tracking branch 'origin/master' into feature/client_aggs_parsing 2017-04-21 15:46:18 +02:00
Tanguy Leroux 480bf0996d Add utility method to parse named XContent objects with typed prefix (#24240)
This commit adds a XContentParserUtils.parseTypedKeysObject() method
that can be used to parse named XContent objects identified by a field
name containing a type identifier, a delimiter and the name of the
object to parse.
2017-04-21 15:41:27 +02:00
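The field-name convention handled by 480bf0996d above (type identifier, delimiter, object name) can be sketched as follows. This is Python, not the Java `XContentParserUtils` code; the `#` delimiter, the function names and the `parsers` registry are assumptions for illustration.

```python
def parse_typed_key(field_name, delimiter="#"):
    """Split 'type<delimiter>name' into (type, name).

    Raises ValueError when the delimiter or either part is missing.
    """
    type_id, sep, name = field_name.partition(delimiter)
    if not sep or not type_id or not name:
        raise ValueError(f"cannot parse typed key [{field_name}]")
    return type_id, name


def parse_typed_keys_object(field_name, raw, parsers):
    """Dispatch `raw` to the parser registered for the type found in `field_name`."""
    type_id, name = parse_typed_key(field_name)
    if type_id not in parsers:
        raise ValueError(f"unknown type [{type_id}]")
    return parsers[type_id](name, raw)
```

A registry such as `{"avg": lambda name, raw: (name, raw["value"])}` would turn the field `avg#price` with body `{"value": 3.5}` into `("price", 3.5)`.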
Tanguy Leroux 251b6d452b MultiBucketsAggregation.Bucket should not extend Writeable (#24216)
The MultiBucketsAggregation.Bucket interface extends Writeable, forcing
all implementation classes to implement writeTo(). This commit removes
the Writeable from the interface and moves it down to the InternalBucket
implementation.
2017-04-21 15:29:53 +02:00
Yannick Welsch c2deb1c81d Don't expose cleaned-up tasks as pending in PrioritizedEsThreadPoolExecutor (#24237)
Changes in #24102 exposed the following oddity: PrioritizedEsThreadPoolExecutor.getPending() can return Pending entries where pending.task == null. This can happen for example when tasks are added to the pending list while they are in the clean up phase, i.e. TieBreakingPrioritizedRunnable#runAndClean has run already, but afterExecute has not removed the task yet. Instead of safeguarding consumers of the API (as was done before #24102) this changes the executor to not count these tasks as pending at all.
2017-04-21 15:25:19 +02:00
Christoph Büscher 4a69b658cd Merge branch 'master' into feature/client_aggs_parsing 2017-04-21 12:18:06 +02:00
Colin Goodheart-Smithe 3c7c4bc824 Adds declareNamedObjects methods to ConstructingObjectParser (#24219)
* Adds declareNamedObjects methods to ConstructingObjectParser

* Addresses review comments
2017-04-21 09:50:30 +01:00
Christoph Büscher c8ad26edc9 Tests: Extend InternalStatsTests (#24212)
Currently we don't test for count = 0, which will make a difference when adding
parsing tests for the high level rest client. min/max/sum should also
be tested with negative values and over a larger range.
2017-04-21 10:38:09 +02:00
Adrien Grand 81b64ed587 IndicesQueryCache should delegate the scorerSupplier method. (#24209)
Otherwise the range improvements that we did on range queries would not work.
This is similar to https://issues.apache.org/jira/browse/LUCENE-7749.
2017-04-21 10:33:02 +02:00
Adrien Grand f322f537e4 Speed up parsing of large `terms` queries. (#24210)
The addition of the normalization feature on keywords slowed down the parsing
of large `terms` queries since all terms now have to go through normalization.
However this can be avoided in the default case that the analyzer is a
`keyword` analyzer since all that normalization will do is a UTF8 conversion.
Using `Analyzer.normalize` for that is a bit overkill and could be skipped.
2017-04-21 10:32:33 +02:00
Jim Ferenczi a4365971a0 [TEST] make sure that the random query_string query generator defines a default_field or a list of fields 2017-04-21 02:56:26 +02:00
Fabien Baligand 4a45579506 token_count type : add an option to count tokens (fix #23227) (#24175)
Add option "enable_position_increments" with default value true.
If option is set to false, indexed value is the number of tokens
(not position increments count)
2017-04-21 00:53:28 +02:00
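To illustrate the difference that `enable_position_increments` makes in 4a45579506 above, here is a simplified Python sketch. It models a token stream as `(term, position_increment)` pairs; the function name and this representation are assumptions, and the real field mapper may handle details (such as trailing increments) differently.

```python
def token_count(tokens, enable_position_increments=True):
    """Count tokens in a stream of (term, position_increment) pairs.

    With position increments enabled, gaps left by removed tokens
    (for example stop words) are included in the count.
    """
    if enable_position_increments:
        return sum(increment for _term, increment in tokens)
    return len(tokens)


# "the quick fox" after a stop filter removed "the":
# "quick" carries an increment of 2 to account for the removed token.
tokens = [("quick", 2), ("fox", 1)]
```

For this stream the count is 3 with position increments and 2 when the option is set to false.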
javanna d9916f20a6 Merge branch 'master' into feature/client_aggs_parsing 2017-04-20 23:03:58 +02:00
Jim Ferenczi 525101b64d Query string default field (#24214)
Currently any `query_string` query that uses a wildcard field with no matching field is rewritten with the `_all` field.

For instance:
````
#creating test doc
PUT testing/t/1
{
  "test": {
    "field_one": "hello",
    "field_two": "world"
  }
}
#searching abc.* (does not exist) -> hit
GET testing/t/_search
{
  "query": {
    "query_string": {
      "fields": [
        "abc.*"
      ],
      "query": "hello"
    }
  }
}
````

This bug first appeared in 5.0 after the query refactoring and impacts only users that use `_all` as default field.
Indices created in 6.x will not have this problem since `_all` is deactivated in this version.

This change fixes this bug by returning a MatchNoDocsQuery for any term that expands to an empty list of fields.
2017-04-20 22:12:20 +02:00
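The fix in 525101b64d above can be pictured with the following Python sketch. The function names are hypothetical, the dictionaries merely stand in for Lucene query objects such as MatchNoDocsQuery, and `fnmatchcase` is used as a stand-in for Elasticsearch's own wildcard matching.

```python
from fnmatch import fnmatchcase

MATCH_NO_DOCS = {"match_none": {}}  # stand-in for MatchNoDocsQuery


def expand_fields(pattern, mapped_fields):
    """Return the mapped fields that match a wildcard pattern such as 'abc.*'."""
    return [f for f in mapped_fields if fnmatchcase(f, pattern)]


def build_query(patterns, text, mapped_fields):
    """Build a query over the expanded fields, matching nothing if none exist."""
    fields = [f for p in patterns for f in expand_fields(p, mapped_fields)]
    if not fields:
        # Previously this case fell back to `_all`, which produced the unexpected hit.
        return MATCH_NO_DOCS
    return {"multi_match": {"query": text, "fields": fields}}
```

With the mapping from the example request (`test.field_one`, `test.field_two`), the pattern `abc.*` expands to nothing and the sketch returns the match-nothing query, while `test.*` expands to both fields.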
Luca Cavanna 82c678b5c7 Make Aggregations an abstract class rather than an interface (#24184)
Some of the base methods that don't have to do with the reduce phase and serialization can be moved to the base class, which is no longer an interface. This will be reusable by the high level REST client further down the road. It also simplifies things, as having an interface with a single implementor is not that helpful.
2017-04-20 21:31:34 +02:00
Areek Zillur 077a6c3ee7 [TEST] ensure expected sequence no and version are set when index/delete engine operation has a document failure 2017-04-20 13:38:52 -04:00
Yannick Welsch 22e0795990 Extract batch executor out of cluster service (#24102)
Refactoring that extracts the task batching functionality from ClusterService and makes it a reusable component that can be tested in isolation.
2017-04-20 17:28:43 +02:00
Tanguy Leroux c8fc30a999 [Test] Expose AbstractPercentilesTestCase.randomPercents() 2017-04-20 13:36:16 +02:00
Tanguy Leroux 35946a13d0 Merge remote-tracking branch 'origin/master' into feature/client_aggs_parsing
# Conflicts:
#	core/src/test/java/org/elasticsearch/search/aggregations/metrics/percentiles/InternalPercentilesTestCase.java
#	core/src/test/java/org/elasticsearch/search/aggregations/metrics/percentiles/hdr/InternalHDRPercentilesTests.java
2017-04-20 13:35:21 +02:00
Tanguy Leroux d0df1ed193 [Test] Always check the XContent equivalent when parsing aggregations (#24208)
In InternalAggregationTestCase, we can check that the internal aggregation and the parsed aggregation always produce the same XContent, whether or not the original internal aggregation has been shuffled.
2017-04-20 13:22:50 +02:00
Tanguy Leroux 55a879ee8d Align behavior of HDR percentiles iterator with percentile() method (#24206) 2017-04-20 12:37:33 +02:00
Christoph Büscher 6e22a1e9ba Add parsing for InternalBucketMetricValue (#24182) 2017-04-20 11:49:09 +02:00
Tanguy Leroux 2b14db9b70 Remove @Repeat(iterations = 1000) in tests 2017-04-20 10:44:12 +02:00
Tanguy Leroux 11da77388a Add parsing methods for Percentiles aggregations (#24183) 2017-04-20 10:16:51 +02:00
Nik Everett caf376c8af Start building analysis-common module (#23614)
Start moving built in analysis components into the new analysis-common
module. The goal of this project is:
1. Remove core's dependency on lucene-analyzers-common.jar which should
shrink the dependencies for transport client and high level rest client.
2. Prove that analysis plugins can do all the "built in" things by moving all
"built in" behavior to a plugin.
3. Force tests not to depend on any oddball analyzer behavior. If tests
need anything more than the standard analyzer they can use the mock
analyzer provided by Lucene's test infrastructure.
2017-04-19 18:51:34 -04:00
Jason Tedor 4796557a30 Add primary term to doc write response
This commit adds the primary term to the doc write response.

Relates #24171
2017-04-19 14:44:22 -04:00
Ryan Ernst c7e9231a86 Plugins: Remove leniency for missing plugins dir (#24173)
This leniency was left in after plugin installer refactoring for 2.0
because some tests still relied on it. However, the need for this
leniency no longer exists.
2017-04-19 09:09:34 -07:00
Christoph Büscher e12339a683 Merge branch 'master' into feature/client_aggs_parsing 2017-04-19 16:37:22 +02:00
Christoph Büscher a9657a5a09 Add BucketMetricValue interface (#24188)
Unlike other implementations of InternalNumericMetricsAggregation.SingleValue,
the InternalBucketMetricValue aggregation currently doesn't implement a
specialized interface that exposes the `keys()` method. This change adds this so
that clients can access the keys via the interface.
2017-04-19 16:27:33 +02:00
Jim Ferenczi f05af0a382 Enable index-time sorting (#24055)
This change adds an index setting to define how the documents should be sorted inside each Segment.
It allows any numeric, date, boolean or keyword field inside a mapping to be used to sort the index on disk.
It is not allowed to use a `nested` field inside an index that defines an index sort, since `nested` fields rely on the original sort of the index.
This change does not add early termination capabilities in the search layer. This will be added in a follow up.

Relates #6720
2017-04-19 14:36:11 +02:00
Christoph Büscher 4562c8a345 Add parsing for InternalSimpleValue and InternalDerivative (#24162) 2017-04-19 12:58:26 +02:00
Tanguy Leroux bf5cfabe04 Fix checkstyle violation 2017-04-19 10:23:25 +02:00
Tanguy Leroux 5717ac3cc6 Merge remote-tracking branch 'origin/master' into feature/client_aggs_parsing
# Conflicts:
#	core/src/main/java/org/elasticsearch/search/DocValueFormat.java
#	core/src/test/java/org/elasticsearch/search/aggregations/InternalAggregationTestCase.java
#	core/src/test/java/org/elasticsearch/search/aggregations/metrics/InternalMaxTests.java
#	core/src/test/java/org/elasticsearch/search/aggregations/metrics/avg/InternalAvgTests.java
#	core/src/test/java/org/elasticsearch/search/aggregations/metrics/min/InternalMinTests.java
2017-04-19 10:12:11 +02:00
Boaz Leskes 8758c541b3 ElectMasterService.hasEnoughMasterNodes should return false if no masters were found
This is a regression introduced in #20063
2017-04-19 09:52:06 +02:00
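The intended behavior of 8758c541b3 above can be stated as a small predicate. This Python sketch is an illustration only: the function signature is hypothetical, and treating a minimum below 1 as "no minimum configured" is an assumption, not something the commit message states.

```python
def has_enough_master_nodes(master_eligible_count, minimum_master_nodes):
    """Return True if enough master-eligible nodes were found for an election."""
    # Zero master-eligible nodes can never be "enough", whatever the configured minimum.
    if master_eligible_count == 0:
        return False
    # Assumption: a minimum below 1 means that no minimum is configured.
    return minimum_master_nodes < 1 or master_eligible_count >= minimum_master_nodes
```

The first check is the point of the fix: without it, an unconfigured minimum would report "enough" even when no master-eligible node was found.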
Tanguy Leroux 741c031384 [Test] Add unit tests for InternalHDRPercentilesTests (#24157)
Related to #22278
2017-04-19 09:37:01 +02:00
Areek Zillur 4f773e2dbb Replicate write failures (#23314)
* Replicate write failures

Currently, when a primary write operation fails after generating
a sequence number, the failure is not communicated to the replicas.
Ideally, every operation which generates a sequence number on primary
should be recorded in all replicas.

In this change, a sequence number is associated with write operation
failure. When a failure with an assigned sequence number arrives at a
replica, the failure cause and sequence number are recorded in the translog
and the sequence number is marked as completed via executing `Engine.noOp`
on the replica engine.

* use zlong to serialize seq_no

* Incorporate feedback

* track write failures in translog as a noop in primary

* Add tests for replicating write failures.

Test that document failures (w/ seq no generated) are recorded
as no-ops in the translog for primary and replica shards

* Update to master

* update shouldExecuteOnReplica comment

* rename indexshard noop to markSeqNoAsNoOp

* remove redundant conditional

* Consolidate possible replica action for bulk item request
depending on its primary execution

* remove bulk shard result abstraction

* fix failure handling logic for bwc

* add more tests

* minor fix

* cleanup

* incorporate feedback

* incorporate feedback

* add assert to remove handling noop primary response when 5.0 nodes are not supported
2017-04-19 01:23:54 -04:00
Jason Tedor 9e0ebc5965 Rename variable in translog simple commit test
This commit renames a variable for clarity in the translog simple commit
test.
2017-04-18 23:43:25 -04:00
Jason Tedor 20181dd0ad Strengthen translog commit with open view test
This commit strengthens an assertion in the translog commit with open
view test.
2017-04-18 23:41:55 -04:00
Jason Tedor 180d1f2219 Stronger check in translog prepare and commit test
This commit strengthens an assertion in the translog prepare commit and
commit test.
2017-04-18 23:37:54 -04:00
Jason Tedor 23b224a5a9 Fix translog prepare commit and commit test
This test was terribly, horribly, no goodly, and badly broken it's
amazing it ever passed so this commit fixes it.
2017-04-18 23:32:47 -04:00
Boaz Leskes edff30f82a Engine: store maxUnsafeAutoIdTimestamp in commit (#24149)
The `maxUnsafeAutoIdTimestamp` timestamp is a safety marker guaranteeing that no retried-indexing operation with a higher auto gen id timestamp was processed by the engine. This allows us to safely process documents without checking if they were seen before.

Currently this property is maintained in memory and is handed off from the primary to any replica during the recovery process.

This commit takes a more natural approach and stores it in the lucene commit, using the same semantics (no retry op with a higher time stamp is part of this commit). This means that the knowledge is transferred during the file copy and also means that we don't need to worry about crazy situations where an original append only request arrives at the engine after a retry was processed *and* the engine was restarted.
2017-04-18 20:11:32 +02:00
Christoph Büscher 210e101f6d Minor changes in ParsedCardinality 2017-04-18 18:09:45 +02:00
Christoph Büscher bc646cf7ad Adding parsing for InternalValueCount 2017-04-18 18:09:39 +02:00
Christoph Büscher 5f96972b04 Adding parsing for InternalAvg 2017-04-18 17:48:11 +02:00
Christoph Büscher 695b2858f4 Adding parsing for InternalSum 2017-04-18 17:44:05 +02:00
Christoph Büscher 75fdc9449f Adding parsing for InternalMax and InternalMin 2017-04-18 17:38:44 +02:00
Simon Willnauer ab9884b2e9 Remove leniency when merging fetched hits in a search response phase (#24158)
Today when we merge hits we have a hard check to prevent AIOOB exceptions
that simply skips an expected search hit. This can only happen if there is a
bug in the code which should be turned into a hard exception or an assertion
triggered. This change adds an assertion and removes the lenient check for the
fetched hits.
2017-04-18 17:19:57 +02:00
Tanguy Leroux 829dd068d6 [Test] Use appropriate DocValueFormats in Aggregations tests (#24155)
Some aggregations (like Min, Max etc) use a wrong DocValueFormat in
tests (like IP or GeoHash). We should not test aggregations that expect
a numeric value with a DocValueFormat like IP. Such wrong DocValueFormat
can also prevent the aggregation from being rendered as ToXContent, and this
will be an issue for the High Level Rest Client tests which expect to be
able to parse back aggregations.
2017-04-18 17:03:32 +02:00
Tanguy Leroux c1ba6997ff AbstractParsedPercentiles should use Percentile class (#24160)
Now that the Percentile interface has been merged with the InternalPercentile
class in core (#24154) the AbstractParsedPercentiles should use it.

This commit also changes InternalPercentilesRanksTestCase so that it now
tests the iterator obtained from parsed percentiles ranks aggregations.

Adding this new test raised an issue in the iterators where key and
value are "swapped" in internal implementations when building the
iterators (see InternalTDigestPercentileRanks.Iter constructor that
accepts the `keys` as the first parameter named `values`, each key
being mapped to the `value` field of Percentile class). This is because
percentiles ranks aggs invert percentiles/values compared to the
percentiles aggs.

* Add assume in InternalAggregationTestCase

* Update after Luca review
2017-04-18 16:54:17 +02:00
Christoph Büscher 8f540346a9 Tests: Fixing typo in class name of InternalGlobalTests
Renaming from InternalGlogbalTests -> InternalGlobalTests
2017-04-18 16:27:15 +02:00
Adrien Grand 4632661bc7 Upgrade to a Lucene 7 snapshot (#24089)
We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now a function of the number of docs that have a value rather than the total number of docs in the index.

Some notes about the change:
 - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors
 - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed
 - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge
2017-04-18 15:17:21 +02:00
Tanguy Leroux 67a9696e55 Merge remote-tracking branch 'origin/master' into feature/client_aggs_parsing 2017-04-18 14:55:58 +02:00
Tanguy Leroux f217eb8ad8 Merge Percentile class with interface (#24154)
This commit merges the Percentile interface with the InternalPercentile
class, as we don't need to maintain both.
2017-04-18 14:47:18 +02:00
Martijn van Groningen edada2581e [TEST] Added unittests for InternalSampler 2017-04-18 14:31:58 +02:00
Yannick Welsch 0b2cb68f6f [TEST] Randomly add and remove no_master blocks in IndicesClusterStateServiceRandomUpdatesTests
Checks that IndicesClusterStateService stays consistent with incoming cluster states that contain no_master blocks (especially
discovery.zen.no_master_block=all which disables state persistence). In particular this checks that active shards which have no in-memory data
structures on a node are failed.
2017-04-18 14:27:54 +02:00
Martijn van Groningen ac41fb2c4a [TEST] Added test for GeoCentroidAggregator and
made constructors of GeoCentroidAggregator, GeoCentroidAggregatorFactory and InternalGeoCentroid package protected.
2017-04-18 13:54:31 +02:00
Tanguy Leroux c0036d8516 Add parsing for percentiles ranks (#23974)
This commit adds the logic for parsing the percentiles ranks aggregations.
2017-04-18 10:19:30 +02:00
Tanguy Leroux 81dbdb239f [Test] Add unit tests for InternalTDigestPercentilesTests (#24090) 2017-04-18 09:48:35 +02:00
Chris Earle 12c8423ec9 Warn on not enough masters during election (#20063)
This changes the trace level logging to warn, and adds the needed number to the message as well.

My fear is that it may get noisy, but this is an issue that you want to be noisy.
2017-04-17 22:18:28 -04:00
Jason Tedor 34eda1a1a8 Do not set path.data in environment if not set
When preparing the final settings in the environment, we unconditionally
set path.data even if path.data was not explicitly set. This confounds
detection for whether or not path.data was explicitly set, and this is
trappy. This commit adds logic to only set path.data in the final
settings if path.data was explicitly set, and provides a test case that
fails without this logic.

Relates #24132
2017-04-17 10:43:13 -04:00
Jason Tedor f7ebe9d18f Preserve multiple translog generations
Today when a flush is performed, the translog is committed and if there
are no outstanding views, only the current translog generation is
preserved. Yet for the purpose of sequence numbers, we need stronger
guarantees than this. This commit migrates the preservation of translog
generations to keep the minimum generation that would be needed to
recover after the local checkpoint.

Relates #24015
2017-04-17 08:51:54 -04:00
Jason Tedor 8033c576b7 Detect remnants of path.data/default.path.data bug
In Elasticsearch 5.3.0 a bug was introduced in the merging of default
settings when the target setting existed as an array. When this bug
concerns path.data and default.path.data, we ended up in a situation
where the paths specified in both settings would be used to write index
data. Since our packaging sets default.path.data, users that configure
multiple data paths via an array and use the packaging are subject to
having shards land in paths in default.path.data when that is very
likely not what they intended.

This commit is an attempt to rectify this situation. If path.data and
default.path.data are configured, we check for the presence of indices
there. If we find any, we log messages explaining the situation and fail
the node.

Relates #24099
2017-04-17 07:03:46 -04:00
jaymode a8be0a5836 Cat APIs should not close the stream obtained from the channel
The cat APIs and rest tables would obtain a stream from the RestChannel, which happened to be a
ReleasableBytesStreamOutput. These APIs used the stream to write content to, closed the stream,
and then tried to send a response. After #23941 was merged, closing the stream meant that the bytes
were released for use elsewhere. This caused occasional corruption of the response when the bytes
were used prior to the response being sent.

This commit changes these two usages to wrap the stream obtained from the channel in a flush on
close stream so that the bytes are still reserved until the message is sent.
2017-04-15 14:57:00 -04:00
Jason Tedor cd8e059885 Do not produce empty IDs in simple versioning test
Empty IDs are rejected during indexing, so we should not randomly
produce them during tests. This commit modifies the simple versioning
tests to no longer produce empty IDs.
2017-04-15 12:15:45 -04:00
Jason Tedor 972bdc09ee Reject empty IDs
When indexing a document via the bulk API where IDs can be explicitly
specified, we currently accept an empty ID. This is problematic because
such a document can not be obtained via the get API. Instead, we should
reject these requests, as accepting them could be a dangerous form of
leniency. Additionally, we already have a way of specifying
auto-generated IDs and that is to not explicitly specify an ID so we do
not need a second way. This commit rejects the individual requests where
ID is specified but empty.

Relates #24118
2017-04-15 10:36:03 -04:00
Boaz Leskes ecf81688fb Use sequence numbers to identify out of order delivery in replicas & recovery (#24060)
Internal indexing requests in Elasticsearch may be processed out of order and repeatedly. This is important during recovery and due to concurrency in replicating requests between primary and replicas. As such, a replica/recovering shard needs to be able to identify that an incoming request contains information that is old and thus need not be processed. The current logic is based on external version. This is sadly not sufficient. This PR moves the logic to rely on sequence numbers and primary terms, which give the semantics we need.

Relates to #10708
2017-04-14 21:46:17 +02:00
Jason Tedor 09efdc3151 Improve performance of extracting warning value
When building headers for a REST response, we de-duplicate the warning
headers based on the actual warning value. The current implementation of
this uses a capturing regular expression that is prone to excessive
backtracking. In cases where a request involves a large number of warnings,
this extraction can be a severe performance penalty. An example where
this can arise is a bulk indexing request that utilizes a deprecated
feature (e.g., using deprecated forms of boolean values). This commit is
an attempt to address this performance regression. We already know the
format of the warning header, so we do not need to use a regular
expression to parse it but rather can parse it by hand to extract the
warning value. This gains back the vast majority of the performance lost
due to the usage of a deprecated feature. There is still a performance
loss due to logging the deprecation message but we do not address that
concern in this commit.

Relates #24114
2017-04-14 12:18:00 -04:00
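The hand-written extraction described in 09efdc3151 above might look like the Python sketch below. It assumes the header follows the HTTP `Warning` layout `code agent "text" "date"` and that the quoted date is always present; the function names and the sample header are illustrative, not taken from the Elasticsearch source.

```python
def extract_warning_value(header: str) -> str:
    """Extract the quoted warning text without using a regular expression."""
    first = header.index('"')                       # opens the warning text
    last = header.rindex('"')                       # closes the date
    date_open = header.rindex('"', 0, last)         # opens the date
    text_close = header.rindex('"', 0, date_open)   # closes the warning text
    return header[first + 1:text_close]


def add_warning(headers, new_header):
    """Append `new_header` unless a header with the same warning value exists."""
    value = extract_warning_value(new_header)
    if all(extract_warning_value(h) != value for h in headers):
        headers.append(new_header)
```

Locating the quotes from both ends of the string is a fixed number of linear scans, so there is no backtracking, and quotes escaped inside the warning text do not confuse the extraction.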
Jay Modi 30ab8739a6 Closing a ReleasableBytesStreamOutput closes the underlying BigArray (#23941)
This commit makes closing a ReleasableBytesStreamOutput release the underlying BigArray so
that we can use try-with-resources with these streams and avoid leaking memory by not returning
the BigArray. As part of this change, the ReleasableBytesStreamOutput adds protection to only
release the BigArray once.

In order to make some of the changes cleaner, the ReleasableBytesStream interface has been
removed. The BytesStream interface is changed to an abstract class so that we can use it as a
usable return type for a new method, Streams#flushOnCloseStream. This new method wraps a
given stream and overrides the close method so that the stream is simply flushed and not closed.
This behavior is used in the TcpTransport when compression is used with a
ReleasableBytesStreamOutput as we need to close the compressed stream to ensure all of the data
is written from this stream. Closing the compressed stream will try to close the underlying stream
but we only want to flush so that all of the written bytes are available.

Additionally, an error message method added in the BytesRestResponse did not use a builder
provided by the channel and instead created its own JSON builder. This changes that method to use
the channel builder and in turn the bytes stream output that is managed by the channel.

Note, this commit differs from 6bfecdf921 in that it updates
ReleasableBytesStreamOutput to handle the case of the BigArray decreasing in size, which changes
the reference to the BigArray. When the reference is changed, the releasable needs to be updated
otherwise there could be a leak of bytes and corruption of data in unrelated streams.

This reverts commit afd45c1432, which reverted #23572.
2017-04-14 10:50:31 -04:00
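Two ideas from 30ab8739a6 above, releasing the backing storage at most once and wrapping a stream so that closing it only flushes, can be sketched in Python as follows. The class and function names are hypothetical and the sketch does not model BigArrays; it only shows the close semantics.

```python
import io


class ReleasableBuffer:
    """A byte buffer that releases its backing storage at most once."""

    def __init__(self, on_release):
        self._buf = io.BytesIO()
        self._on_release = on_release  # hypothetical callback returning the storage
        self._released = False

    def write(self, data):
        return self._buf.write(data)

    def flush(self):
        self._buf.flush()

    def getvalue(self):
        return self._buf.getvalue()

    def close(self):
        if not self._released:  # guard against releasing twice
            self._released = True
            self._on_release()


class FlushOnClose:
    """Wrap a stream so that close() only flushes the underlying stream."""

    def __init__(self, stream):
        self._stream = stream

    def write(self, data):
        return self._stream.write(data)

    def flush(self):
        self._stream.flush()

    def close(self):
        self._stream.flush()  # deliberately does not close the wrapped stream


def write_and_close(out, payload):
    """Stand-in for a consumer (such as a compressing stream) that closes its target."""
    out.write(payload)
    out.close()
```

Passing `FlushOnClose(buffer)` to a consumer that closes its target keeps the written bytes reserved until the owner closes the buffer itself, which is the situation the commit describes for compressed transport messages.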
javanna 5ccb4a0bbd fix typo in ParsedCardinality comment and add //norelease comment on DocValueFormat dep 2017-04-14 11:22:49 +02:00
Yannick Welsch e3aa2a89f9 [TEST] Wait in OldIndexBackwardsCompatibilityIT for cluster to be fully initialized
There are test failures that suggest that the import of dangling indices is happening too early, before the dangling indices are ready to be consumed.
This commit adds an ensureGreen() at the end of cluster initialization to make sure that no cluster state updates are happening while the dangling
indices are prepared on-disk.
2017-04-14 11:02:55 +02:00
Ali Beyad 5e54c0261a [TEST] fixes InternalTopHitsTests test to initialize the SearchHits
maxScore to Float.NaN if there is no max score, as that is what Lucene's
TopDocs does
2017-04-13 18:27:42 -04:00
Igor Motov cce321a560 Task Management: Make TaskInfo parsing forwards compatible (#24073)
TaskInfo is stored as a part of TaskResult and therefore can be read by nodes with an older version. If we add any additional information to TaskInfo (for #23250, for example), nodes with an older version should be able to ignore it, otherwise they will not be able to read TaskResults stored by newer nodes.
2017-04-13 16:16:01 -04:00
Tim Brooks ffaac5a08a Simplify BulkProcessor handling and retry logic (#24051)
This commit collapses the SyncBulkRequestHandler and
AsyncBulkRequestHandler into a single BulkRequestHandler. The new
handler executes a bulk request and waits for completion if the
BulkProcessor was configured with a concurrentRequests setting of 0.
Otherwise the execution happens asynchronously.

As part of this change the Retry class has been refactored.
withSyncBackoff and withAsyncBackoff have been replaced with two
versions of withBackoff. One method takes a listener that will be
called on completion. The other method returns a future that will be
completed when the request completes.
2017-04-13 14:48:52 -05:00
Jason Tedor 99e0268e0a Remove support for default settings
Today Elasticsearch allows default settings to be used only if the
actual setting is not set. These settings are trappy, and the complexity
invites bugs. This commit removes support for default settings with the
exception of default.path.data, default.path.conf, and default.path.logs
which are maintained to support packaging. A follow-up will remove
support for these as well.

Relates #24093
2017-04-13 14:25:45 -04:00
Jason Tedor 32b2caad42 Correct handling of default and array settings
In Elasticsearch 5.3.0 a bug was introduced in the merging of default
settings when the target setting existed as an array. This arose due to
the fact that when a target setting is an array, the setting key is
broken into key.0, key.1, ..., key.n, one for each element of the
array. When settings are replaced by default.key, we are looking for the
target key but not the target key.0. This leads to key, and key.0, ...,
key.n being present in the constructed settings object. This commit
addresses two issues here. The first is that we fix the merging of the
keys so that when we try to merge default.key, we also check for the
presence of the flattened keys. The second is that when we try to get a
setting value as an array from a settings object, we check whether or
not the backing map contains the top-level key as well as the flattened
keys. This latter check would have caught the first bug. For kicks, we
add some tests.
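
The flattened-key check described above can be sketched as follows. This is a minimal illustration over a plain map, not the actual Settings implementation; the class name DefaultSettingsMergeSketch and its methods are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: settings are modelled as a flat Map<String, String>,
// with array values stored as key.0, key.1, ..., key.n.
final class DefaultSettingsMergeSketch {

    private static final String DEFAULT_PREFIX = "default.";

    /** True if the key is present either directly or as a flattened array (key.0). */
    static boolean hasKeyOrFlattened(Map<String, String> settings, String key) {
        return settings.containsKey(key) || settings.containsKey(key + ".0");
    }

    /** Applies default.key only when neither key nor key.0 is already set. */
    static Map<String, String> mergeDefaults(Map<String, String> settings) {
        Map<String, String> merged = new HashMap<>();
        // Copy the explicitly configured settings first.
        for (Map.Entry<String, String> e : settings.entrySet()) {
            if (!e.getKey().startsWith(DEFAULT_PREFIX)) {
                merged.put(e.getKey(), e.getValue());
            }
        }
        // Then fill in defaults whose target is absent in both forms.
        for (Map.Entry<String, String> e : settings.entrySet()) {
            if (e.getKey().startsWith(DEFAULT_PREFIX)) {
                String target = e.getKey().substring(DEFAULT_PREFIX.length());
                if (!hasKeyOrFlattened(settings, target)) {
                    merged.put(target, e.getValue());
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("default.path.data", "/default");
        settings.put("path.data.0", "/a");
        settings.put("path.data.1", "/b");
        Map<String, String> merged = mergeDefaults(settings);
        System.out.println(merged.containsKey("path.data")); // prints false
    }
}
```

Without the key.0 check, the merged map would contain path.data alongside path.data.0 and path.data.1, which is the inconsistent state the commit message describes.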

Relates #24074
2017-04-13 06:34:58 -04:00
Tanguy Leroux 7f730c9489 Merge remote-tracking branch 'origin/master' into feature/client_aggs_parsing 2017-04-13 09:24:03 +02:00
Ryan Ernst fb3a281755 Build: Switch jna dependency to an elastic version (#24081)
This new version of jna is rebuilt from the official release of jna, but
with native libs linked against older glibc in order to support all
platforms elasticsearch supports.

closes #23640
2017-04-13 00:17:50 -07:00
Boaz Leskes 215a9b2df9 fix CategoryContextMappingTests compilation bugs 2017-04-13 09:15:10 +02:00
Boaz Leskes 342e745fc7 testConcurrentGetAndSetOnPrimary - fix a race condition between indexing and updating value map
Currently the map can lag behind what's actually in Lucene, which causes assertions about adding/removing values to fail
2017-04-13 09:03:09 +02:00
Nilabh Sagar ec421974b9 Allow different data types for category in Context suggester (#23491)
The "category" in the context suggester can be a String, Number or Boolean. However, with the changes in version 5 this fails and only String is accepted. This is a problem for existing users of Elasticsearch who migrate to a higher version, as their existing mappings and queries will fail, as described in #22358.

This PR fixes the above mentioned issue and allows users to migrate seamlessly.

Closes #22358
2017-04-12 23:43:29 -07:00
Ryan Ernst c19044ddf6 Restrict build info loading to ES jar, not any jar (#24049)
This change makes the build info initialization only try to load a jar
manifest if it is the elasticsearch jar. Anything else (eg a repackaged
ES for use of transport client in an uber jar) will contain "Unknown"
for the build info as it does for tests currently.

fixes #21955
2017-04-12 23:22:43 -07:00
Jason Tedor 12b46bdbc4 Remove more hidden file leniency from plugins
This commit removes one more instance of leniency from the plugin
service which skips hidden files in the plugins directory.

Relates #23982
2017-04-12 22:23:42 -04:00
Jason Tedor edd16fa27e Register error listener in evil logger tests
This test needs an error listener registered since we configure logging
here.
2017-04-12 21:23:05 -04:00
Jason Tedor a1c2fe9e3a Detect using logging before configuration
It can easily happen that we touch a logger before logging is configured
due to chains of static initializers and other such scenarios. This
commit adds detection for this mechanism that will fail startup if we
touch a logger before logging is configured. This is a bug that will
cause builds to fail.

Relates #24076
2017-04-12 21:13:08 -04:00
Nik Everett 31c8903492 Add version constant for 5.5 (#24075)
This is required in master now that #24071 is in or else we fail during BWC testing because the 5.x branch contains 5.5 but the build thinks it should contain 5.4.
2017-04-12 16:29:47 -04:00
Zachary Tong 1fd50bc54d Add unit tests for NestedAggregator (#24054)
Add unit tests for NestedAggregator, change class visibilities

Relates to #22278
2017-04-12 15:59:51 -04:00
Nik Everett e99f90fb46 Add more debugging information to rethrottles
I'm still trying to track down failures like:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+dockeralpine-periodic/1180/console

It looks like a task is hanging but I'm not sure why. So this
adds more logging for next time.
2017-04-12 08:37:31 -04:00
Christoph Büscher cf89fb86b5 Merge branch 'master' into feature/client_aggs_parsing 2017-04-12 12:00:40 +02:00
Christoph Büscher 356532816a Adding ParsedCardinality (#23973)
Adding parsing of InternalCardinality xContent output. Parsing method will return a new
implementation of the Cardinality interface, ParsedCardinality.
2017-04-12 11:58:57 +02:00
Christoph Büscher 1847bbac4d Tests: Use random analyzer only on string fields in Match/MultiMatchBuilderTests
Currently we can run into test errors by accidentally using e.g. a "simple"
analyzer on a numeric field which might lead to number parsing errors. While
these errors are correct, we should avoid these combinations in our regular
tests.
2017-04-12 11:32:48 +02:00
Ryan Ernst 1207103b6d S3 Repository: Eagerly load static settings (#23910)
The S3 repository has many levels of settings it looks at to create a
repository, and these settings were read at repository creation time.
This meant secure settings like access and secret keys had to be
available after node construction. This change makes setting loading for
everything except repository level settings eager, so that secure settings
can be stashed, and the keystore can once again be closed after
bootstrapping the node is complete.
2017-04-11 15:42:56 -07:00
Jason Tedor b4c3bb5d21 Reject duplicate settings on the command line
Today Elasticsearch and other CLI tools that rely on environment aware
command leniently accept duplicate settings with the last one
winning. This commit removes this leniency.

Relates #24053
2017-04-11 18:30:05 -04:00
Tim Brooks cf6b03c8f4 Wildcard cluster names for cross cluster search (#23985)
This is related to #23893. This commit allows users to use wildcards for
cluster names when executing a cross cluster search.

So instead of defining every cluster such as:

GET one:*,two:*,three:*/_search

A user could just search:

GET *:*/_search

As ":" characters are currently allowed in index names, if the text
up to the first ":" does not match a defined cluster name, the entire
string is treated as an index name.
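
The resolution rule above can be sketched as follows. This is a simplified illustration, not the actual cross cluster search code; the class RemoteIndexGroupingSketch, the empty-string marker for the local cluster, and the regex-based wildcard matching are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Pattern;

// Hypothetical sketch of grouping "cluster:index" expressions by cluster.
final class RemoteIndexGroupingSketch {

    /** Assumed marker for indices that resolve to the local cluster. */
    static final String LOCAL = "";

    static Map<String, List<String>> group(Set<String> clusters, String... expressions) {
        Map<String, List<String>> result = new LinkedHashMap<>();
        for (String expr : expressions) {
            int sep = expr.indexOf(':');
            List<String> matched = new ArrayList<>();
            if (sep > 0) {
                String clusterPattern = expr.substring(0, sep);
                for (String cluster : clusters) {
                    if (simpleMatch(clusterPattern, cluster)) {
                        matched.add(cluster);
                    }
                }
            }
            if (matched.isEmpty()) {
                // No defined cluster matches: the whole string is an index name.
                result.computeIfAbsent(LOCAL, k -> new ArrayList<>()).add(expr);
            } else {
                String index = expr.substring(sep + 1);
                for (String cluster : matched) {
                    result.computeIfAbsent(cluster, k -> new ArrayList<>()).add(index);
                }
            }
        }
        return result;
    }

    /** Matches a pattern in which '*' stands for any run of characters. */
    static boolean simpleMatch(String pattern, String value) {
        StringBuilder regex = new StringBuilder();
        String[] literals = pattern.split("\\*", -1);
        for (int i = 0; i < literals.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(literals[i]));
        }
        return value.matches(regex.toString());
    }

    public static void main(String[] args) {
        Set<String> clusters = new LinkedHashSet<>(Arrays.asList("one", "two", "three"));
        System.out.println(group(clusters, "*:logs", "four:idx"));
    }
}
```

Here "*:logs" fans out to every configured cluster, while "four:idx" is kept whole as a local index name because no cluster named "four" is defined.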
2017-04-11 13:56:26 -05:00
Lee Hinman 5cace8e48a Remove shadow replicas
Resolves #22024
2017-04-11 11:26:26 -06:00
Simon Willnauer e30a275bfe Add a dedicated TransportRemoteInfoAction for consistency (#24040)
All our actions that are invoked from rest actions have corresponding
transport actions. This adds the transport action for RestRemoteClusterInfoAction
for consistency.

Relates to #23969
2017-04-11 14:40:37 +02:00
Yannick Welsch 88a54f14c7 Trigger replica recovery restarts by master when primary relocation completes (#23926)
When a primary relocation completes while there are ongoing replica recoveries, the recoveries for these replicas need to be restarted (as a new primary is in charge of replicating changes). Before this commit, the need for a recovery restart was detected by the data nodes that had the replicas, by checking on each cluster state update if the recovery process had completed before the recovery source changed. That code had a race, however, which could lead to a not-fully recovered shard exposing itself as started (see #23904).

This commit takes a different approach: When the primary relocation completes and the master updates the cluster state to move the primary shard from relocating to started, it will reinitialize all initializing replica shards, by giving them a fresh allocation id. Data nodes that have the replica shard will simply detect that the allocation id changed and restart the recovery process (instead of trying to determine the need to restart based on ongoing recoveries).

Note: Removal of the code in IndicesClusterStateService that checks whether the recovery source has changed will not be backported to the 5.x branch. This ensures backward compatibility for the situation where the master node is older and does not have the code changes that have been introduced in this PR.

Closes #23904
2017-04-11 11:21:57 +02:00
Colin Goodheart-Smithe 0114f0061c Removes version 2.x constants from Version (#24011)
* Removes version 2.x constants from Version

Closes #21887

* Addresses review comments
2017-04-11 08:31:22 +01:00
Simon Willnauer f22e0dc30b Add cross-cluster search remote cluster info API (#23969)
This commit adds an API to discover information like seed nodes,
http addresses and connection status of a configured remote cluster.

Closes #23925
2017-04-11 09:24:40 +02:00
Nik Everett 16a2048416 Remove real time from tests (#24025)
The `AsyncBulkByScrollActionTests` were brittle because they used the
current time. That was a mistake. This removes the current time from
the test, instead adding it to the parameters passed in to the
appropriate methods. This means that we take the current time slightly
earlier in all cases, but that shouldn't make a difference.
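
The general idea of passing the time in as a parameter can be sketched as follows. This is not the code from `AsyncBulkByScrollActionTests`; the class InjectedTimeSketch and the method remainingDelayNanos are hypothetical names used only to show the pattern.

```java
// Hypothetical sketch: the method receives "now" as an argument instead of
// calling System.nanoTime() itself, so a test can supply a fixed value.
final class InjectedTimeSketch {

    /**
     * Returns how long to wait before the next batch, given when the last
     * batch started, the required gap between batches, and the current time.
     */
    static long remainingDelayNanos(long lastBatchStartNanos, long requiredGapNanos, long nowNanos) {
        long elapsed = nowNanos - lastBatchStartNanos;
        return Math.max(0L, requiredGapNanos - elapsed);
    }

    public static void main(String[] args) {
        // Production-style call: the caller reads the clock once and passes it in.
        long start = System.nanoTime();
        System.out.println(remainingDelayNanos(start, 1_000_000L, System.nanoTime()));

        // Test-style call: fully deterministic.
        System.out.println(remainingDelayNanos(1000L, 500L, 1200L)); // prints 300
    }
}
```

Because the clock is read by the caller, the time is taken slightly earlier than it would be inside the method, which matches the trade-off the commit message mentions.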

Closes #24005

Example failure:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+nfs/161/consoleFull
2017-04-10 17:55:02 -04:00
Ryan Ernst 65f7a76630 Settings: Add secure file setting to keystore (#24001)
Some systems like GCE rely on a plaintext file containing credentials.
Rather than extract the information out of that credentials file and
store each piece individually in the keystore, it is cleaner to just
store the entire file.

This commit adds support to the keystore wrapper for secure file
settings. These are settings that contain an entire file that would
normally be stored on the local filesystem. Retrieving the file returns
an input stream to the file contents. This also adds an `add-file`
command to the keystore cli.

In order to support both strings and files as values for settings, the
metadata format of the keystore has also been updated (with backcompat)
to keep a map of setting name to type.
2017-04-10 13:10:42 -07:00
Simon Willnauer a61fb3f708 Remove support for lucene versions without checksums (#24021)
We are still carrying some legacy code that deals with lucene indices
that don't have checksums. Yet we have not supported these indices
for a while now; in fact, such an index has been unsupported since
version 5.0. This commit removes all the special handling and leniency involved.
2017-04-10 18:16:34 +02:00
Martijn van Groningen 887f3ed8dc inner_hits: Replace `NestedChildrenQuery` with `ParentChildrenBlockJoinQuery`.
Closes #24009
2017-04-10 17:36:45 +02:00
Lee Hinman 53d4d747a6 Mark IndexWithShadowReplicasIT as AwaitsFix
Relates to #24007 and #23906
2017-04-10 09:32:20 -06:00
Simon Willnauer 040b86a76b Set shard count limit to unlimited (#24012)
Now that we have incremental reduce functions for topN and aggregations
we can set the default for `action.search.shard_count.limit` to unlimited.
Users can still restrict this setting, while by default we execute
across all shards matching the search request's index pattern.
2017-04-10 17:09:21 +02:00
javanna f538d7b8d6 Merge branch 'master' into feature/client_aggs_parsing 2017-04-10 14:46:57 +02:00
Luca Cavanna 2c545c064d Move getProperty method out of MultiBucketsAggregation.Bucket interface (#23988)
The getProperty method is an internal method needed to run pipeline aggregations and retrieve info by path from the aggs tree. It is not needed in the MultiBucketsAggregation.Bucket interface, which is returned to users running aggregations from the transport client. The method is moved to the InternalMultiBucketAggregation class as that's where it belongs.
2017-04-10 13:35:01 +02:00
Luca Cavanna 93f159429f Remove getProperty method from Aggregations interface and impl (#23972)
The `getProperty` method is an internal method needed to run pipeline aggregations and retrieve info by path from the aggs tree. It is not needed in the `Aggregations` interface, which is returned to users running aggregations from the transport client. Furthermore, the method is currently unused by pipeline aggs too, as only InternalAggregation#getProperty is used. It can then be removed.
2017-04-10 12:31:45 +02:00
Luca Cavanna b283c8b768 Move aggs CommonFields and TYPED_KEYS_DELIMITER from InternalAggregation to Aggregation (#23987)
These will be shared between internal objects and objects exposed through high level REST client, so they should be moved from internal classes.
2017-04-10 12:30:02 +02:00
Luca Cavanna 9db8a266e6 Un-deprecate NamedXContentRegistry.Entry constructor that takes a context (#23986)
We deprecated this method in the past because we thought it was a temporary thing that could go away over time. We radically trimmed down the usages of a context while parsing when we got rid of the ParseFieldMatcher, but the usages that are left are legit and we will hardly get rid of them. Also, working on aggs parsing we will need a context to carry around the aggregation name that gets parsed through XContentParser#namedObject .
2017-04-10 12:28:56 +02:00
Yannick Welsch 12471c4f76 [TEST] Fix wait condition on testMultipleNodesShutdownNonMasterNodes
After two nodes are being stopped and two more are joining the cluster, we first have to wait on the cluster to consist of the right nodes before
waiting on green status, otherwise we might get a green status for a cluster with dead nodes.
2017-04-10 11:38:56 +02:00
Jim Ferenczi 9b3c85dd88 Deprecate _field_stats endpoint (#23914)
_field_stats has evolved quite a lot to become a multi-purpose API capable of retrieving the field capabilities and the min/max value for a field.
In the meantime a more focused API called `_field_caps` has been added. This endpoint is a good replacement for _field_stats since it can
retrieve the field capabilities by just looking at the field mapping (no lookup in the index structures).
Also, the recent improvement made to range queries makes the _field_stats API obsolete, since these queries are now rewritten per shard based on the min/max found for the field.
This means that a range query that does not match any document in a shard can return quickly and can be cached efficiently.
For these reasons this change deprecates _field_stats. The deprecation should happen in 5.4 but we won't remove this API in 6.x yet, which is why
this PR is made directly to 6.0.
The rest tests have also been adapted to not throw an error while this change is backported to 5.4.
2017-04-10 10:10:16 +02:00
Simon Willnauer 1f40f8a2d2 Introduce incremental reduction of TopDocs (#23946)
This commit adds support for incremental top N reduction if the number of
expected shards in the search request is high enough. The changes here
also clean up more code in SearchPhaseController to make the separation
between values that are the same on each search result and values that
are per response. The reduced search phase result doesn't hold an arbitrary
result to obtain values like `from`, `size` or sort values which is now
cleanly encapsulated.
2017-04-10 09:37:52 +02:00
Boaz Leskes b636ca79d5 Engine: version logic on replicas should not be hard coded (#23998)
The refactoring in #23711 hardcoded version logic for replica to assume monotonic versions. Sadly that's wrong for `FORCE` and `VERSION_GTE`. Instead we should use the methods in VersionType to detect conflicts.

Note - once replicas use sequence numbers for out of order delivery, this logic goes away.
2017-04-09 22:04:12 +02:00
Boaz Leskes f0df5e64d8 InternalEngineTests: fix a potential NPE in assertOpsOnPrimary
assertOpsOnPrimary may inherit a situation where the document exists but it doesn't the last indexed value.
This could cause an NPE.
2017-04-09 21:21:00 +02:00
Jason Tedor 61c5976aee Upgrade to Log4j 2.8.2
This commit upgrades the Log4j dependencies from version 2.7 to version
2.8.2. This release includes a fix for a case where Log4j could lose
exceptions in the presence of a security manager.

Relates #23995
2017-04-09 07:19:16 -04:00
Jason Tedor 5c8d5677a4 Suppress ExtrasFS in plugins service tests
The ExtrasFS filesystem creates extra directories when creating temp
directories during tests to ensure that Lucene does not care about extra
files. These extra files get in our way in the plugins service tests
because some of these tests are counting only on certain directories
existing. This commit suppresses the ExtrasFS filesystem for the plugins
service tests, and fixes a test that was passing for the wrong reason
(because of the existence of an extra directory from ExtrasFS).
2017-04-08 20:42:18 -04:00
Jason Tedor 9056e0cb49 Remove hidden file leniency from plugin service
This commit removes some leniency from the plugin service which skips
hidden files in the plugins directory. We really want to ensure the
integrity of the plugin folder, so hasta la vista leniency.

Relates #23982
2017-04-08 18:22:44 -04:00
javanna 12e8a45de7 remove some more TODOs from ParsedAggregation 2017-04-08 00:41:32 +02:00
javanna 9e7b020578 Remove TODO on un-deprecating NamedXContentRegistry.Entry ctor that takes a context 2017-04-08 00:28:49 +02:00
javanna 8464b4755f [TEST] replace FakeRestRequest usage with ToXContent.MapParams 2017-04-08 00:18:56 +02:00
Ryan Ernst 73b8aad9a3 Settings: Disallow secure setting to exist in normal settings (#23976)
This commit removes the "legacy" feature of secure settings, which set up
a parallel setting that was a fallback in the insecure
elasticsearch.yml. This was previously used to allow the new secure
setting name to be that of the old setting name, but is now not in use
due to other refactorings. It is much cleaner to just have all secure
settings use new setting names. If in the future we want to reuse the
previous setting name, once support for the insecure settings has been
removed, we can then rename the secure setting.  This also adds a test
for the behavior.
2017-04-07 14:18:06 -07:00
javanna 306ef086c5 Align ParsedAggregation meta to InternalAggregation behaviour
Empty meta gets printed out, which means that if the request contains an empty meta object, that is returned with the response as well. On the other hand null, meaning when the object is not in the request, is not printed out. ParsedAggregation used to not print out empty metadata, and didn't allow the null value. Aligned behaviour to the existing behaviour from InternalAggregation.
2017-04-07 21:54:46 +02:00
Simon Willnauer 0c465b1931 Add comment why we check for null fetch results during merge 2017-04-07 21:00:19 +02:00
Jason Tedor 457a76c1c6 Fix import order in Spawner
The imports are not in alphabetical order in Spawner.java and this is a
crime that is rectified by this commit.
2017-04-07 14:52:22 -04:00
javanna 39e791291e Merge branch 'master' into feature/client_aggs_parsing 2017-04-07 15:43:32 +02:00
Yannick Welsch a3cceb8a00 [TEST] Fix testMultipleNodesShutdownNonMasterNodes to wait for the right nodes to rejoin the cluster
This test was sporadically failing for the following reason:
- 4 nodes (nodes 0, 1, 2, and 3) running with `minimum_master_nodes` set to 3
- we stop 2 nodes (node 0 and 3)
- wait for cluster block to be in place on all nodes
- start 2 nodes (node 4 and node 5) and do a `prepareHealth().setWaitForNodes("4")`
- then do a search request

The search request runs into the `ClusterBlockException` as the `prepareHealth().setWaitForNodes("4")` check succeeds on a cluster state that has
nodes 1, 2, 3, and 4, i.e., only one of the two new nodes has joined the cluster and only one of the two dead nodes was removed by the master
(removing the dead nodes only happens after there are again `minimum_master_nodes` nodes in the cluster).

This commit fixes the issue by reusing a method from InternalTestCluster that checks that the right nodes have rejoined the cluster.
2017-04-07 15:26:21 +02:00
Jim Ferenczi 0821fa23ff Restore special case for wildcard on _all query to rewrite to a match all query (#23967)
This change restores the rewrite to a match all query that we used to apply on wildcard query * on the query_string parser before #23433.
2017-04-07 15:15:43 +02:00
Yannick Welsch 8522b43ce7 [TEST] Take cluster state batching into account in testNodeFailuresAreProcessedOnce
The test assumes that two nodes leaving the cluster results in two cluster state updates on the master, which is invalidated by cluster state
batching.
2017-04-07 14:43:38 +02:00
Christoph Büscher 4f94aa8a6a Tests: Fix highlighter fields order in TopHitsTests (#23968)
Shuffling xContent breaks the order of the highlighter fields in the
internal list if the highlighter doesn't use the array syntax. In other tests we
avoid shuffling this json level, but since this is done in the base test for
aggregations we should ensure the highlight builder uses the array syntax here.
2017-04-07 14:24:32 +02:00
javanna 420fa8c400 Add ParsedAggregation as base Aggregation impl for high level client
ParsedAggregation is the base Aggregation implementation for the high level client, which parses aggs responses into java objects.
2017-04-07 11:11:06 +02:00
Luca Cavanna e156dbaf42 Move getProperty method out of Aggregation interface (#23949)
The `getProperty` method is an internal method needed to run pipeline aggregations and retrieve info by path from the aggs tree. It is not needed in the `Aggregation` interface, which is  returned to users running aggregations from the transport client. The method is moved to the InternalAggregation class as that's where it belongs.
2017-04-07 10:55:35 +02:00
Luca Cavanna 13cf8aaa52 [TEST] fix shuffling of xContent keys (#23929)
ESTestCase has methods to shuffle xContent keys given a builder or a parser. Shuffling wasn't actually doing what was expected but rather reordering the keys in their natural ordering, hence the output was always the same at every run. Corrected that and added tests, also fixed a couple of tests that were affected by this fix.
2017-04-07 10:20:32 +02:00
Ali Beyad 480cfe3fe0 Fixes snapshot status on failed snapshots (#23833)
If a snapshot is taken on multiple indices, and some of them are "good"
indices that don't contain any corruption or failures, and some of them
are "bad" indices that contain missing shards or corrupted shards, and
if the snapshot request is set to partial=false (meaning don't take a
snapshot if there are any failures), then the good indices will not be
snapshotted either.  Previously, when getting the status of such a
snapshot, a 500 error would be thrown, because the snap-*.dat blob for
the shards in the good index could not be found.

This commit fixes the problem by reporting shards of good indices as
failed due to a failed snapshot, instead of throwing the
NoSuchFileException.

Closes #23716
2017-04-06 20:54:21 -04:00
Jay Modi 495bf21b46 Preserve response headers when creating an index (#23950)
This commit preserves the response headers when creating an index and updating settings for an
index.

Closes #23947
2017-04-06 20:38:09 +01:00
Jim Ferenczi 042f7566e8 update Version.V_5_3_1_UNRELEASED to the latest bugfix release of Lucene:6_4_2 2017-04-06 10:03:17 +02:00
Jim Ferenczi 38009efedd Disable graph analysis at query time for shingle and cjk filters producing tokens of different size (#23920)
This change disables graph analysis of token streams containing shingle or cjk filters that produce shingles or ngrams of different sizes. The graph analysis is disabled for phrase and boolean queries.

Closes #23918
2017-04-06 08:55:00 +02:00
Tim Brooks 5b1fbe5e6c Decouple BulkProcessor from client implementation (#23373)
This commit modifies the BulkProcessor to be decoupled from the
client implementation. Instead it just takes a
BiConsumer<BulkRequest, ActionListener<BulkResponse>> that executes
the BulkRequest.
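
The shape of this decoupling can be sketched as follows. The types below are stand-ins, not the Elasticsearch classes: Listener replaces ActionListener, a List<String> replaces BulkRequest, an Integer replaces BulkResponse, and Batcher is a hypothetical, much simplified processor.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical sketch: the batcher knows nothing about a client. It only
// receives a BiConsumer that executes a batch and reports to a listener.
final class DecoupledBulkSketch {

    /** Stand-in for a response listener. */
    interface Listener<T> {
        void onResponse(T response);
        void onFailure(Exception e);
    }

    static final class Batcher {
        private final BiConsumer<List<String>, Listener<Integer>> executor;
        private final int batchSize;
        private final List<String> pending = new ArrayList<>();
        private int acknowledged;
        private int failures;

        Batcher(BiConsumer<List<String>, Listener<Integer>> executor, int batchSize) {
            this.executor = executor;
            this.batchSize = batchSize;
        }

        void add(String item) {
            pending.add(item);
            if (pending.size() >= batchSize) {
                flush();
            }
        }

        void flush() {
            if (pending.isEmpty()) {
                return;
            }
            List<String> batch = new ArrayList<>(pending);
            pending.clear();
            // Hand the batch to whatever executor was supplied.
            executor.accept(batch, new Listener<Integer>() {
                @Override
                public void onResponse(Integer count) {
                    acknowledged += count;
                }

                @Override
                public void onFailure(Exception e) {
                    failures++;
                }
            });
        }

        int acknowledged() {
            return acknowledged;
        }

        int failures() {
            return failures;
        }
    }

    public static void main(String[] args) {
        // Any executor works: here a fake one that acknowledges every item.
        Batcher batcher = new Batcher((batch, listener) -> listener.onResponse(batch.size()), 2);
        batcher.add("a");
        batcher.add("b");
        batcher.add("c");
        batcher.flush();
        System.out.println(batcher.acknowledged()); // prints 3
    }
}
```

A design like this lets the same processor be driven by different client implementations, or by a fake executor in tests, since only the function that executes a request is passed in.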
2017-04-05 12:12:43 -05:00
Lee Hinman 0257a7b97a Only re-parse operation if a mapping update was needed
When executing an index operation on the primary shard,
`TransportShardBulkAction` first parses the document, sees if there are any
mapping updates that need to be applied, and then updates the mapping on the
master node. It then re-parses the document to make sure that the mappings have
been applied and propagated.

This adds a check that skips the second parsing of the document in the event
there was not a mapping update applied in the first case.

Fixes a performance regression introduced in #23665
2017-04-05 09:29:44 -06:00
Adrien Grand d5d0f140d6 The `filter` and `significant_terms` aggregations should parse the `filter` as a filter, not as a query. (#23797)
This is important for some queries like `bool`, which are parsed differently
depending on whether we want to get a query or a filter.
2017-04-05 16:46:21 +02:00
Simon Willnauer adccdbb3cf Simplify sorted top docs merging in SearchPhaseController (#23881)
Today we have several code paths to merge top docs based on the number of
search results returned from the shards. If there is only a single shard
holding any hits we take a different code path with quite some complexity, while
if there are more than one the code is basically duplicated to save the
creation of a dense array of top docs which can be large if there are many results.
This commit removes the need for the dense array and in turn the justification for
the optimization. This commit introduces a single code path to merge top docs.
2017-04-05 14:49:35 +02:00
Boaz Leskes 75b4f408e0 Refactor InternalEngine's index/delete flow for better clarity (#23711)
The InternalEngine Index/Delete methods (plus satellites like version loading from Lucene) have accumulated some cruft over the years, making it hard to clearly follow the code flows for various use cases (primary indexing/recovery/replicas etc). This PR refactors those methods for better readability. The methods are broken up into smaller sub methods, albeit at the price of less code reuse.

To support the refactoring I have considerably beefed up the versioning tests.

This PR is a spin-off from #23543 , which made it clear this is needed.
2017-04-05 14:43:01 +02:00
Boaz Leskes c89fdd938e ZenDiscovery - only validate min_master_nodes values if local node is master (#23915)
The purpose of this validation is to make sure that the master doesn't step down
due to a change in master nodes, which also means that there is no way to revert
an accidental change. Since we validate using the current cluster state (and
not the one from which the settings come) we have to be careful and only
validate if the local node is already a master. Doing so all the time causes
subtle issues. For example, a node that joins a cluster has no nodes in its
current cluster state. When it receives a cluster state from the master with
a dynamic minimum master nodes setting in it, we must make sure we don't reject it.

Closes #23695
2017-04-05 14:31:32 +02:00
Jason Tedor 24127bf416 Remove hardcoded ports from SingleNodeDiscoveryIT
SingleNodeDiscoveryIT uses a hardcoded port for the purpose of binding
two nodes within the limited port range that an unconfigured unicast zen
ping hosts list would try to discover another node on. This commit at
least removes this hardcoding for the first node to come up, although
still tries to bind the second node to the limited port range after the
first node has bound.
2017-04-05 08:17:33 -04:00
Luca Cavanna 318d365b12 [TEST] make sure that fromXContent doesn't rely on keys ordering (#23901)
We shuffle the keys before we parse our responses for the high level client so that we make sure we never rely on keys ordering.
2017-04-05 11:12:34 +02:00
Jason Tedor afd45c1432 Revert "Closing a ReleasableBytesStreamOutput closes the underlying BigArray (#23572)"
This reverts commit 6bfecdf921.
2017-04-04 20:33:51 -04:00
Jay Modi 6bfecdf921 Closing a ReleasableBytesStreamOutput closes the underlying BigArray (#23572)
This commit makes closing a ReleasableBytesStreamOutput release the underlying BigArray so
that we can use try-with-resources with these streams and avoid leaking memory by not returning
the BigArray. As part of this change, the ReleasableBytesStreamOutput adds protection to only release the BigArray once.
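
The release-only-once protection combined with try-with-resources can be sketched as follows. This is a generic illustration, not the ReleasableBytesStreamOutput code; the class ReleaseOnceSketch and the use of an AtomicBoolean guard are assumptions.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: closing releases the underlying resource, and a
// guard makes repeated close() calls harmless.
final class ReleaseOnceSketch implements AutoCloseable {

    private final Runnable releaseAction;
    private final AtomicBoolean released = new AtomicBoolean(false);

    ReleaseOnceSketch(Runnable releaseAction) {
        this.releaseAction = releaseAction;
    }

    @Override
    public void close() {
        // Only the first caller flips the flag and runs the release.
        if (released.compareAndSet(false, true)) {
            releaseAction.run();
        }
    }

    public static void main(String[] args) {
        int[] releases = {0};
        try (ReleaseOnceSketch stream = new ReleaseOnceSketch(() -> releases[0]++)) {
            stream.close(); // explicit close inside the block
        } // try-with-resources closes again; the second call is ignored
        System.out.println(releases[0]); // prints 1
    }
}
```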

In order to make some of the changes cleaner, the ReleasableBytesStream interface has been
removed. The BytesStream interface is changed to an abstract class so that we can use it as a
usable return type for a new method, Streams#flushOnCloseStream. This new method wraps a
given stream and overrides the close method so that the stream is simply flushed and not closed.
This behavior is used in the TcpTransport when compression is used with a
ReleasableBytesStreamOutput as we need to close the compressed stream to ensure all of the data
is written from this stream. Closing the compressed stream will try to close the underlying stream
but we only want to flush so that all of the written bytes are available.

Additionally, an error message method added in the BytesRestResponse did not use a builder
provided by the channel and instead created its own JSON builder. This changes that method to use the channel builder and in turn the bytes stream output that is managed by the channel.
2017-04-04 17:01:30 +01:00
Jason Tedor 3136ed1490 Rename random ASCII helper methods
This commit renames the random ASCII helper methods in ESTestCase. This
is because this method ultimately uses the random ASCII methods from
randomized runner, but these methods actually only produce random
strings generated from [a-zA-Z].

Relates #23886
2017-04-04 11:04:18 -04:00
Jason Tedor a01f77210a Fix Javadocs for BootstrapChecks#enforceLimits
This commit adds a description for a parameter that was added to
BootstrapChecks#enforceLimits(BoundTransportAddress, String) without the
Javadocs having been updated.
2017-04-04 09:42:19 -04:00
Jason Tedor 51b5dbffb7 Disable bootstrap checks for single-node discovery
While there are use-cases where a single-node is in production, there
are also use-cases for starting a single-node that binds transport to an
external interface where the node is not in production (for example, for
testing the transport client against a node started in a Docker
container). It's tricky to balance the desire to always enforce the
bootstrap checks when a node might be in production with the need for
the community to perform testing in situations that would trip the
bootstrap checks. This commit enables some flexibility for these
users. By setting the discovery type to "single-node", we disable the
bootstrap checks independently of how transport is bound. While this
sounds like a hole in the bootstrap checks, the bootstrap checks can
already be avoided in the single-node use-case by binding only HTTP but
not transport. For users that are genuinely in production on a
single-node use-case with transport bound to an external interface, they
can set the system property "es.enable.bootstrap.checks" to force
running the bootstrap checks. It would be a mistake for them not to do
this.

Relates #23598
2017-04-04 09:39:04 -04:00
Jim Ferenczi c14be20744 Add unit tests for the missing aggregator (#23895)
* Add unit tests for the missing aggregator

Relates #22278
2017-04-04 14:37:33 +02:00
Jim Ferenczi a04350f0dd Add a property to mark setting as final (#23872)
This change adds a setting property that sets the value of a setting as final.
Updating a final setting is prohibited in any context, for instance an index setting
marked as final must be set at index creation and will refuse any update even if the index is closed.
This change also marks the setting `index.number_of_shards` as Final and the special casing for refusing the updates on this setting has been removed.
2017-04-04 12:35:48 +02:00
Jason Tedor 71293a89bf Introduce single-node discovery
This commit adds a single node discovery type. With this discovery type,
a node will elect itself as master and never form a cluster with another
node.

Relates #23595
2017-04-04 03:02:58 -04:00
Jason Tedor 3bd2efa177 Await termination after shutting down executors
When terminating an executor service or a thread pool, we first
shutdown. Then, we do a timed await termination. If the await
termination fails because there are still tasks running, we then
shutdown now. However, this method does not wait for actively executing
tasks to terminate, so we should again wait for termination of these
tasks before returning. This commit does that.
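
The sequence described above can be sketched roughly as follows. This is a
minimal illustration using only java.util.concurrent, not the code from this
commit; the class name ExecutorTermination and the method terminate are
hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, named for illustration only.
final class ExecutorTermination {
    /**
     * Shuts the executor down gracefully, then forcefully if needed.
     * Returns true if the executor terminated within one of the two waits.
     */
    static boolean terminate(ExecutorService executor, long timeout, TimeUnit unit) {
        executor.shutdown(); // stop accepting new tasks
        try {
            if (executor.awaitTermination(timeout, unit)) {
                return true;
            }
            // Interrupts running tasks, but does not wait for them to finish.
            executor.shutdownNow();
            // Wait again so actively executing tasks can react to the interrupt.
            return executor.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
            executor.shutdownNow();
            return false;
        }
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        executor.submit(() -> {
            try {
                Thread.sleep(60_000); // long task, ended early by shutdownNow()
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println(terminate(executor, 100, TimeUnit.MILLISECONDS));
    }
}
```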

Relates #23889
2017-04-04 03:01:00 -04:00
Jason Tedor 6234a49fb3 Fix initialization issue in ElasticsearchException
If a test touches ElasticsearchExceptionHandle before the class
initializer for ElasticsearchException has run, a circular class
initialization problem can arise. Namely, the class initializer for
ElasticsearchExceptionHandle depends on the class initializer for
ElasticsearchExceptionHandle which depends on the class initializer for
all the classes that extend ElasticsearchException, but these classes
can not be loaded because ElasticsearchException has not finished its
class initializer. There are tests that can trigger this before
ElasticsearchException has been loaded due to an unlucky ordering of
test execution. This commit addresses this issue by making
ElasticsearchExceptionHandle private, and then exposing methods that
provide the necessary values from ElasticsearchExceptionHandle. Touching
these methods will force the class initializer for
ElasticsearchException to run first.
2017-04-04 00:33:00 -04:00
Boaz Leskes 48b0121f60 SpecificMasterNodesIT shouldn't use autoMinMasterNodes
as it tweaks the `discovery.initial_state_timeout` setting.
2017-04-03 16:23:17 +02:00
Boaz Leskes 40eb68c95a testRestorePersistentSettings doesn't need to mess with discovery settings 2017-04-03 16:23:17 +02:00
Colin Goodheart-Smithe 8482503f9b
Adds tests for cardinality and filter aggregations
Relates to #22278
2017-04-03 10:09:27 +01:00
Colin Goodheart-Smithe cad4fcd9c9
Revert "Adds tests for cardinality and filter aggregations (#23826)"
This reverts commit 058869ed54.
2017-04-03 09:45:16 +01:00
Colin Goodheart-Smithe 058869ed54 Adds tests for cardinality and filter aggregations (#23826)
* Adds tests for cardinality and filter aggregations

Relates to #22278

* addresses review comments
2017-04-03 09:39:03 +01:00
Jim Ferenczi 7316b663e2 Replace custom sort field with SortedSetSortField and SortedNumericSortField when possible (#23827)
Currently for field sorting we always use a custom sort field and a custom comparator source.
Though for numeric fields this custom sort field could be replaced with a standard SortedNumericSortField unless
the field is nested especially since we removed the FieldData for numerics.
We can also use a SortedSetSortField for string sort based on doc_values when the field is not nested.

This change replaces IndexFieldData#comparatorSource with IndexFieldData#sortField that returns a Sorted{Set,Numeric}SortField when possible or a custom
 sort field when the field sort spec is not handled by the SortedSortFields.
2017-04-03 09:57:26 +02:00
Simon Willnauer bdb1cabe71 Prevent nodes from joining if newer indices exist in the cluster (#23843)
Today we prevent nodes from joining when indices exist that are too old.
Yet, the opposite can happen too since Lucene / Elasticsearch is not forward
compatible when it comes to indices. We won't let nodes join the cluster once
there are indices in the cluster state that are newer than the node's version.
This prevents forward compatibility issues which we never test against. Yet,
this will not prevent rolling restarts or anything like this since indices
are always created with the minimum node version in the cluster such that an index
can only get the version of the higher nodes once all nodes are upgraded to this version.
2017-04-03 09:52:09 +02:00
Simon Willnauer 998eeb7687 Synchronized CollapseTopFieldDocs with Lucene's relatives (#23854)
TopDocs et al. got additional parameters to incrementally reduce
top docs. In order to add incremental reduction `CollapseTopFieldDocs`
needs to have the same properties.
2017-04-03 09:50:44 +02:00
Jason Tedor 7082baaed9 Stricter parsing of remote node attribute
This commit enables stricter parsing of the remote node attribute,
instead of leniently parsing values that are not "true" as false.
2017-04-01 13:18:46 -04:00
Jason Tedor 38b3fec885 Fix cross-cluster remote node gateway attributes
Remote nodes in cross-cluster search can be marked as eligible for
acting as a gateway node via a remote node attribute setting. For example,
if search.remote.node.attr is set to "gateway", only nodes that have
node.attr.gateway set to "true" can be connected to for cross-cluster
search. Unfortunately, there is a bug in the handling of these
attributes due to the use of a dangerous method
Boolean#getBoolean(String) which obtains the system property with
specified name as a boolean. We are not looking at system properties
here, but node settings. This commit fixes this situation, and adds a
test. A follow-up will ban the use of Boolean#getBoolean.
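
The pitfall can be illustrated with plain JDK calls. The sketch below is not
the code changed by this commit; the class GatewayAttribute and the attribute
map are hypothetical stand-ins for node settings. It contrasts
Boolean#getBoolean(String), which looks up a JVM system property with the
given name, with Boolean#parseBoolean(String), which parses the value itself.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical stand-in for the node attribute lookup, for illustration only.
final class GatewayAttribute {
    /** Buggy variant: treats the attribute VALUE as the NAME of a system property. */
    static boolean isGatewayBuggy(Map<String, String> nodeAttributes, String attribute) {
        return Boolean.getBoolean(nodeAttributes.getOrDefault(attribute, "false"));
    }

    /** Corrected variant: parses the attribute value itself. */
    static boolean isGateway(Map<String, String> nodeAttributes, String attribute) {
        return Boolean.parseBoolean(nodeAttributes.getOrDefault(attribute, "false"));
    }

    public static void main(String[] args) {
        Map<String, String> attributes = Collections.singletonMap("gateway", "true");
        // Unless the JVM was started with a system property literally named "true",
        // the buggy variant reports that the node is not a gateway.
        System.out.println(isGatewayBuggy(attributes, "gateway"));
        System.out.println(isGateway(attributes, "gateway"));
    }
}
```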

Relates #23863
2017-04-01 13:04:51 -04:00
Jim Ferenczi ee68e75332 FieldCapabilitiesRequest should implement Replaceable since it accepts index patterns 2017-03-31 20:21:06 +02:00
Alexander Reelsen f720767cbc Cleanup: Remove unused FieldMappers class (#23851)
This class is unused, so it can be removed.
2017-03-31 18:26:59 +02:00
Nik Everett ba62229f47 Fix FieldCapabilities compilation in Eclipse (#23855)
Eclipse can't deal with the generics, maybe because of the fixed but
unreleased https://bugs.eclipse.org/bugs/show_bug.cgi?id=511750
2017-03-31 12:10:15 -04:00
Tanguy Leroux 28099162ab Cluster stats should not render empty http/transport types (#23735)
This commit changes the ClusterStatsNodes.NetworkTypes so that it does
not print out empty field names when no Transport or HTTP type is defined:

```
{
"network_types": {
        ...
        "http_types": {
          "": 2
        }
      }
}
```

is now rendered as:

```
{
"network_types": {
        ...
        "http_types": {
        }
      }
}
```
2017-03-31 17:13:27 +02:00
Simon Willnauer 135eae42b9 Cleanup SearchPhaseController interface (#23844)
SearchPhaseController is tightly coupled to AtomicArray which makes
non-dense representations of results very difficult. This commit removes
the coupling and cuts over to Collection rather than List to ensure no
order or random access lookup is implied.
2017-03-31 16:25:15 +02:00
Jim Ferenczi a8250b26e7 Add FieldCapabilities (_field_caps) API (#23007)
This change introduces a new API called `_field_caps` that retrieves the capabilities of specific fields.

Example:

````
GET t,s,v,w/_field_caps?fields=field1,field2
````
... returns:
````
{
   "fields": {
      "field1": {
         "string": {
            "searchable": true,
            "aggregatable": true
         }
      },
      "field2": {
         "keyword": {
            "searchable": false,
            "aggregatable": true,
            "non_searchable_indices": ["t"],
            "indices": ["t", "s"]
         },
         "long": {
            "searchable": true,
            "aggregatable": false,
            "non_aggregatable_indices": ["v"],
            "indices": ["v", "w"]
         }
      }
   }
}
````

In this example `field1` has the same type `text` across the requested indices `t`, `s`, `v`, `w`.
Conversely, `field2` is defined with two conflicting types, `keyword` and `long`.
Note that `_field_caps` does not treat this case as an error but rather returns the list of unique types seen for this field.
2017-03-31 15:34:46 +02:00
Colin Goodheart-Smithe 9f66b8cd38 Improves disabled fielddata error message (#23841)
Closes #22768
2017-03-31 10:01:07 +01:00
Simon Willnauer 5badf68bd9 Add infrastructure to mark contexts as system contexts (#23830)
Today we have no way to mark an execution as internal. This commit adds
a simple thread context header that allows executing code in a system context.
This allows intercepting code to make better decisions down the road when
it gets to authentication.
2017-03-31 10:47:10 +02:00
Tim Brooks 5fa80a6521 Pass exception from sendMessage to listener (#23559)
This commit changes the listener passed to sendMessage from a Runnable
to an ActionListener.

This change also removes IOException from the sendMessage signature.
That signature is misleading as it allows implementers to assume an
exception will be thrown in case of failure. That does not happen due
to Netty's async nature.
2017-03-30 15:08:23 -05:00
Jason Tedor 48357e43d3 Honor update request timeout
When executing an update request, the request timeout is not transferred
to the index/delete request executed on behalf of the update
request. This leads to update requests not timing out when they should
(e.g., if not all shards are available when the request specifies
wait_for_shards=all with a small timeout). This commit causes the
index/delete requests to honor the update request timeout.

Relates #23825
2017-03-30 14:38:34 -04:00
Christoph Büscher b92371a4dc Tests: Add base tests for InternalSimpleValue and InternalDerivative (#23799)
As an addition to #22278 we should probably also have base tests for InternalSimpleValue
and InternalDerivative.
2017-03-30 20:23:49 +02:00
Simon Willnauer 4125f012b9 Streamline shard index availability in all SearchPhaseResults (#23788)
Today we have the shard target and the target request ID available in SearchPhaseResults.
Yet, the coordinating node maintains a shard index to reference the request, response tuples
internally which is also used in many other classes to reference back from fetch results to
query results etc. Today this shard index is implicitly passed via the index in AtomicArray
which causes an undesirable dependency on this interface.
This commit moves the shard index into the SearchPhaseResult and removes some dependencies
on AtomicArray. Further removals will follow in the future. The most important refactoring here
is the removal of AtomicArray.Entry which used to be created for every element in the atomic array
to maintain the shard index during result processing. This caused an unnecessary indirection, dependency
and potentially thousands of unnecessary objects in every search phase.
2017-03-30 14:32:42 +02:00
Jim Ferenczi 3b559e01be Fixed sliced search tests that rely on BytesRef.hashCode output 2017-03-30 10:14:41 +02:00
David Causse a49e1c0062 Use a fixed seed for computing term hashCode in TermsSliceQuery (#23795)
I think this query should not use the hashCode provided by BytesRef#hashCode().
It uses StringHelper#GOOD_FAST_HASH_SEED which is initialized in a static
block to System.currentTimeMillis().
Running this query on different replicas may return inconsistent results.

Using a fixed seed should guarantee that the docs are sliced consistently
across replicas.

Fixes #23096
2017-03-30 10:10:32 +02:00
Lee Hinman c8081bde91 Further refactor and extend testing for `TransportShardBulkAction`
This moves `updateReplicaRequest` to `createPrimaryResponse` and separates the
translog updating to be a separate function so that the function purpose is more
easily understood (and testable).

It also separates the logic for `MappingUpdatePerformer` into two functions,
`updateMappingsIfNeeded` and `verifyMappings` so they don't do too much in a
single function. This allows finer-grained error testing for when a mapping
fails to parse or be applied.

Finally, it separates parsing and version validation for
`executeIndexRequestOnReplica` into a separate
method (`prepareIndexOperationOnReplica`) and adds a test for it.

Relates to #23359
2017-03-29 10:56:51 -06:00
Jason Tedor 72824609df Add lower bound for translog generation threshold
The translog already occupies 43 bytes on disk when empty. If the
translog generation threshold is below this, the flush thread can get
stuck in an infinite loop repeatedly rolling the generation. This commit
adds a lower bound on the translog generation to avoid this problem,
however we keep the lower bound small for convenience in testing.

Relates #23779
2017-03-28 14:11:50 -04:00
Ali Beyad 2d3c2a4800 Adds backwards compatibility index and repository for v5.3.0 2017-03-28 14:02:07 -04:00
Ali Beyad c675d92a56 Adds v5.3.1 to the version constants 2017-03-28 13:04:28 -04:00
Ali Beyad 2120086d82 Adds pattern keyword marker filter support (#23600)
This commit adds support for the pattern keyword marker filter in
Lucene.  Previously, the keyword marker filter in Elasticsearch
supported specifying a keywords set or a path to a set of keywords.
This commit exposes the regular expression pattern based keyword marker
filter also available in Lucene, so that any token matching the pattern
specified by the `keywords_pattern` setting is excluded from being
stemmed by any stemming filters.

Closes #4877
2017-03-28 11:13:34 -04:00
Dimitris Athanasiou 34f116eae3 Require explicit query in _delete_by_query API (#23632)
As the query of a search request defaults to match_all,
calling _delete_by_query without an explicit query may
result in deleting all data.

In order to protect users against falling into that
pitfall, this commit adds a check to require the explicit
setting of a query.

Closes #23629
2017-03-28 15:44:57 +01:00
Ali Beyad 8359dd05c9 Adds boolean similarity to Elasticsearch (#23637)
This commit adds the boolean similarity scoring from Lucene to
Elasticsearch.  The boolean similarity provides a means to specify that
a field should not be scored with typical full-text ranking algorithms,
but rather just whether the query terms match the document or not.
Boolean similarity scores a query term equal to its query boost only.
Boolean similarity is available as a default similarity option and thus
a field can be specified to have boolean similarity by declaring in its
mapping:
    "similarity": "boolean"

Closes #6731
2017-03-28 10:17:23 -04:00
Stuart Neivandt 3caf887632 Improve error handling for epoch format parser with time zone (#23689)
Change the error response when using a non-UTC time zone for range queries with epoch_millis
or epoch_second formats to an illegal argument exception. The goal is to provide a better
explanation of why the query has failed. The current behavior is to respond with a parse exception.

Closes #22621
2017-03-28 14:45:20 +02:00
Jason Tedor 742d929b56 Validate top-level keys when parsing mget requests
Today, when parsing mget requests, we silently ignore keys in the top
level that do not match "docs" or "ids". This commit addresses this
situation by throwing an exception if any other key occurs here, and
providing the names of valid keys.

Relates #23746
2017-03-28 08:27:31 -04:00
Jason Tedor 4f2dfb6819 Fix serialization for plugin info
This commit fixes the serialization for plugin info. Namely, the
serialization incorrectly specified the backwards compatibility version
as strictly after version 5.4.0, whereas it should be on or after
version 5.4.0.
2017-03-27 21:04:31 -04:00
Jason Tedor b54a9e9c83 Introduce translog generation rolling
This commit introduces a maximum size for a translog generation and
automatically rolls the translog when a generation exceeds the threshold
into a new generation. This threshold is configurable per index and
defaults to sixty-four megabytes. We introduce this constraint as
sequence numbers will require keeping around more than the current
generation (to ensure that we can rollback to the global
checkpoint). Without keeping the size of generations under control,
having to keep old generations around could consume excessive disk
space. A follow-up will enable commits to trim previous generations
based on the global checkpoint.

Relates #23606
2017-03-27 16:43:54 -04:00
Jason Tedor defd0452e7 Modify permissions dialog for plugins
This commit modifies the handling of plugins that require special
permissions to cover a case that was not previously covered.

Relates #23742
2017-03-27 15:52:45 -04:00
Christoph Büscher fc8cb417e7 FuzzyQueryBuilder should error when parsing array of values (#23762)
Closes #23759
2017-03-27 17:02:01 +02:00
Marios Trivyzas 4f694a3312 Remove obsolete index setting `index.version.minimum_compatible`. (#23593) 2017-03-27 15:59:48 +02:00
Jim Ferenczi 0e95c90e9f Upgrade to Lucene 6.5.0 (#23750) 2017-03-27 15:57:54 +02:00
Jason Tedor a6c4234575 Add early-access check
The OpenJDK project provides early-access builds of upcoming
releases. These early-access builds are not suitable for
production. These builds sometimes end up on systems due to aggressive
packaging (e.g., Ubuntu). This commit adds a bootstrap check to ensure
these early-access builds are not being used in production.

Relates #23743
2017-03-24 14:52:50 -04:00
Christoph Büscher 396785ccb1 Tests: Lower expected precision for InternalAvgTests
Closes #23723
2017-03-24 12:18:07 +01:00
Christoph Büscher e7a8e69900 Test: Check that parsing SearchHit without _type/_id works (#23715)
The hit object can be very small e.g. when using "stored_fields": ["_none_"],
this adds a test that checks that we can still parse back the object.

* also check type/id null
2017-03-24 10:14:52 +01:00
Luca Cavanna c379b9bdc8 Use ParseField for aggs CommonFields rather than String (#23717)
With this change we remove a TODO from CommonFields. Also this will be useful when parsing aggs response for the high level REST client.
2017-03-23 17:20:40 +01:00
AdityaJNair 63757efe9c Remove DocumentMapper#parse(String index, String type, String id, BytesReference source) (#23706)
Removed `parse(String index, String type, String id, BytesReference source)` in DocumentMapper.java and replaced all of its use in Test files with `parse(SourceToParse source)`.

`parse(String index, String type, String id, BytesReference source)` was only used in test files and never in the main code, so it was removed. All of the test files that used it were then modified to use the `parse(SourceToParse source)` method that exists in DocumentMapper.java.
2017-03-23 11:01:09 -04:00
Boaz Leskes 6577503b37 AbstractSearchAsyncAction: fix potential NPE in debug logging 2017-03-23 09:00:38 +01:00
Ali Beyad 2df39689fc Fixes snapshot deletion handling on in-progress snapshot failure (#23703)
This commit fixes an issue manifested in the
SharedClusterSnapshotRestoreIT#testGetSnapshotsRequest where a delete
request on a snapshot encounters an in-progress snapshot, so it first
tries to abort the snapshot.  During the aborting process, an exception
is thrown which is handled by the snapshot listener's onSnapshotFailure
method.  This method retries the delete snapshot request, only to
encounter that the snapshot is missing, throwing an exception.  It is
possible that the snapshot failure resulted in the snapshot never having
been written to the repository, and hence, there is nothing to delete.
This commit handles the SnapshotMissingException by logging it and
notifying the listener of the missing snapshot.

Closes #23663
2017-03-22 21:02:23 -04:00
Igor Motov f927a2708d Make it possible to validate a query on all shards instead of a single random shard (#23697)
This is especially useful when we rewrite the query because the result of the rewrite can be very different on different shards. See #18254 for example.
2017-03-22 17:39:21 -04:00
Stefan Gorgiovski 798c19dd7f Deprecate request_cache for clear-cache (#23638)
It is called `request` now.
2017-03-22 08:28:04 -04:00
Luca Cavanna c6b881b53e Share XContent rendering code in terms aggs (#23680)
The output of the different implementations of terms aggs is always very similar. The toXContent methods for each of those classes, though, were duplicating almost the same code multiple times. This commit centralizes the code for rendering XContent to a single place, which can be reused from the different terms aggs implementations.
2017-03-22 12:28:13 +01:00
Ryan Ernst b31ed6a75c Plugins: Add plugin cli specific exit codes (#23599)
We currently use POSIX exit codes in all of our CLIs. However, POSIX
only suggests these exit codes are standard across tools. It does not
prescribe particular uses for codes outside of that range. This commit
adds 2 exit codes specific to plugin installation to make distinguishing
an incorrectly built plugin and a plugin already existing easier.

closes #15295
2017-03-21 13:56:00 -07:00
Ryan Ernst 111e703cde Plugins: Output better error message when existing plugin is incompatible (#23562)
This commit catches the underlying failure when trying to list plugin
information when a plugin is incompatible with the current version of
elasticsearch. This could happen when elasticsearch is upgraded but old
plugins still exist. With this change, all plugins will be output,
instead of failing at the first out of date plugin.

closes #20691
2017-03-21 13:45:27 -07:00
Nik Everett bc65be2a65 Reindex: wait for cleanup before responding (#23677)
Changes reindex and friends to wait until the entire request has
been "cleaned up" before responding. "Clean up" in this context
is clearing the scroll and (for reindex-from-remote) shutting
down the client. Failures to clean up are still only logged, not
returned to the user.

Closes #23653
2017-03-21 15:33:39 -04:00
Ryan Ernst f8453aca57 Packaging: Remove classpath ordering hack (#23596)
After the removal of the joda time hack we used to have, we can clean up
the codebase handling in security, jarhell and plugins to be more picky
about uniqueness. This was originally in #18959 which was never merged.

closes #18959
2017-03-21 12:12:16 -07:00
Ali Beyad e72d287382 [TEST] Properly cleans up failing restore test
The SharedClusterSnapshotRestoreIT#testDataFileCorruptionDuringRestore
would fail sporadically because it tried to simulate restoring a
corrupted index.  The test would wait until the restore is finished (and
marked as failed) before exiting.  However, in the background, the
cluster still continues to retry allocation of the failed shards,
despite the restore operation being marked as completed, which in turn
generates cluster states to process.  The end of every ESIntegTestCase
verifies that none of the nodes have any pending cluster states to
process.  Hence, this check sometimes fails on this particular test.

This commit solves the issue by ensuring the index is deleted before
exiting the test.
2017-03-21 14:02:42 -04:00
Christoph Büscher 889f0cbc40 Add unit tests for ReverseNestedAggregator (#23651)
Relates to #22278
2017-03-21 13:11:25 +01:00
Jason Tedor 7b17689458 Search took time should use a relative clock
Search took time uses an absolute clock to measure elapsed time, and
then tries to deal with the complexities of using an absolute clock for
this purpose. Instead, we should use a high-precision monotonic relative
clock that is designed exactly for measuring elapsed time. This commit
modifies the search infrastructure to use a relative clock for measuring
took time, but still provides an absolute clock for the components of
search that require a real clock (e.g., index name expression
resolution, etc.).
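
The distinction can be sketched with JDK clocks. The example below is an
illustration under assumptions, not the search infrastructure changed here:
the class TookTime is hypothetical, and the relative clock is injected as a
LongSupplier (System::nanoTime in production) so that elapsed time never
depends on wall-clock adjustments.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.LongSupplier;

// Hypothetical timer, named for illustration only.
final class TookTime {
    private final LongSupplier relativeNanos;
    private final long startNanos;

    TookTime(LongSupplier relativeNanos) {
        this.relativeNanos = relativeNanos;
        this.startNanos = relativeNanos.getAsLong();
    }

    /** Elapsed time since construction, measured on the relative clock. */
    long tookMillis() {
        return TimeUnit.NANOSECONDS.toMillis(relativeNanos.getAsLong() - startNanos);
    }

    public static void main(String[] args) throws InterruptedException {
        // System.nanoTime() is monotonic within a JVM and meant for measuring
        // elapsed time; System.currentTimeMillis() can jump when the wall clock
        // is adjusted.
        TookTime took = new TookTime(System::nanoTime);
        Thread.sleep(20);
        System.out.println("took " + took.tookMillis() + "ms");
    }
}
```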

Relates #23662
2017-03-20 18:48:51 -04:00
Ali Beyad ce08594008 Adds toString() to snapshot operations in progress
A better toString() is added for snapshot operations in progress in the
cluster state and logging has been increased to help debug
SharedClusterSnapshotRestoreIT tests.
2017-03-20 16:45:23 -04:00
Nikiforos Botis e8b915d010 Comment and blank line cleanups (#23647) 2017-03-20 09:36:33 -04:00
Jordan Kiang d010cad503 Fix MapperService StackOverflowError (#23605)
MapperService#parentTypes is rewrapped in an UnmodifiableSet in MapperService#internalMerge every time the cluster state is updated. After thousands of updates the collection is wrapped so deeply that calling a method on it generates a StackOverflowError.

Closes #23604
2017-03-20 03:53:35 -07:00
Jason Tedor 2eafe8310e Format RemovePluginCommand to 100-column limit
This commit formats RemovePluginCommand.java to the 100-column limit and
removes this file from the list of suppressions.
2017-03-19 22:50:13 -04:00
Alex Lattas f9d6924f7d Add Javadocs for RemovePluginCommand#execute
This commit adds Javadocs for RemovePluginCommand#execute, the actual
implementation of the remove plugin command.

Relates #23644
2017-03-19 22:11:50 -04:00
Jason Tedor b23adb6d15 Avoid overflow when computing total FS stats
When adding filesystem stats from individual filesystems, free and
available can overflow. This commit guards against this by adjusting
these situations to Long.MAX_VALUE.
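
A guard of this kind might look like the sketch below. It is an assumption
about the general technique rather than the code in this commit; the class
FsTotals and the method addSaturating are hypothetical, and the operands are
assumed to be non-negative byte counts.

```java
// Hypothetical helper, named for illustration only.
final class FsTotals {
    /** Adds two non-negative byte counts, clamping to Long.MAX_VALUE on overflow. */
    static long addSaturating(long total, long value) {
        long sum = total + value;
        // For non-negative operands, a negative sum means the addition wrapped around.
        return sum < 0 ? Long.MAX_VALUE : sum;
    }

    public static void main(String[] args) {
        System.out.println(addSaturating(1L, 2L));
        System.out.println(addSaturating(Long.MAX_VALUE, 1L) == Long.MAX_VALUE);
    }
}
```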

Relates #23641
2017-03-18 20:02:18 -04:00
Jason Tedor 44d75db9e2 Upgrade from JNA 4.2.2 to JNA 4.4.0
This commit upgrades the JNA dependency from version 4.2.2 to version
4.4.0.

Relates #23636
2017-03-17 21:06:16 -04:00
Igor Motov 1bd66136d7 Task Manager should be able to support non-transport tasks (#23619)
Currently the task manager is tied to the transport and can only create tasks based on TransportRequests. This commit enables task manager to support tasks created by non-transport services such as the persistent tasks service.
2017-03-17 19:29:18 -04:00
Jim Ferenczi b8c352fc3f Add support for fragment_length in the unified highlighter (#23431)
* Add support for fragment_length in the unified highlighter

This commit introduces a new break iterator (a BoundedBreakIterator) designed for the unified highlighter
 that is able to limit the size of fragments produced by generic break iterators like `sentence`.
The `unified` highlighter now supports `boundary_scanner` which can be `words` or `sentence`.
The `sentence` mode will use the bounded break iterator in order to limit the size of the sentence to `fragment_length`.
When sentences bigger than `fragment_length` are produced, this mode will break the sentence at the next word boundary **after**
 `fragment_length` is reached.
2017-03-17 18:10:13 +01:00
Jason Tedor c462d7d486 Clear the interrupt flag before joining
This commit changes the method for checking the interrupt status of a
thread that is intentionally interrupted during
AdapterActionFutureTests#testInteruption. Namely, we want to check and
clear the interrupt status before joining on the interrupting thread. If
we do not clear the status, when we lose a race where the interrupting
thread is not yet finished, an interrupted exception will be thrown when
we try to join on it. Clearing the interrupted status on the main thread
addresses this issue.
2017-03-17 12:46:23 -04:00
Jason Tedor 90929f77ca Adapter action future should restore interrupts
When a thread blocking on an adapter action future is interrupted, we
throw an illegal state exception. This is documented, but it is rude to
not restore the interrupt flag. This commit restores the interrupt flag
in this situation, and adds a test.
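
The pattern can be sketched as below. This is a minimal illustration, not the
adapter action future itself: the class InterruptRestoring is hypothetical and
a CountDownLatch stands in for the blocking future.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical helper, named for illustration only.
final class InterruptRestoring {
    /** Blocks on the latch; on interruption, restores the flag before rethrowing. */
    static void await(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            // Throwing InterruptedException clears the interrupt flag, so set it
            // again to let callers further up the stack observe the interruption.
            Thread.currentThread().interrupt();
            throw new IllegalStateException("future got interrupted", e);
        }
    }

    public static void main(String[] args) {
        Thread.currentThread().interrupt(); // simulate an interrupt arriving
        try {
            await(new CountDownLatch(1));
        } catch (IllegalStateException e) {
            // Thread.interrupted() reports the restored flag and clears it.
            System.out.println("flag restored: " + Thread.interrupted());
        }
    }
}
```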

Relates #23618
2017-03-17 01:05:48 -04:00
Christoph Büscher 96a92da682 CompletionSuggestionContext#toQuery() should also consider text if prefix/regex missing (#23451)
In cases where the user specifies only the `text` option on the top level
suggest element (either via REST or the java api), this gets transferred to the
`text` property in the SuggestionSearchContext. CompletionSuggestionContext
currently requires prefix or regex to be specified, otherwise errors. We should
use the global `text` property as a fallback if neither prefix nor regex is provided.

Closes #23340
2017-03-16 21:36:18 +01:00
Ryan Ernst cb16ed1e26 Fix num docs to be positive in bucket deferring collector test 2017-03-15 15:43:07 -07:00
Ryan Ernst d808e751f4 Mapping: Fix NPE with scaled floats stats when field is not indexed (#23528)
This fixes an NPE in finding scaled float stats. The type of min/max
methods on the wrapped long stats returns a boxed type, but in the case
this is null, the unbox done for the FieldStats.Double ctor primitive
types will cause the NPE. These methods would have null for min/max when
the field exists, but does not actually have points values.
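
The failure mode is ordinary auto-unboxing of null. The sketch below is
illustrative only; the class ScaledStats and its methods are hypothetical
stand-ins, not the FieldStats code touched by this commit.

```java
// Hypothetical helper, named for illustration only.
final class ScaledStats {
    /** Unguarded: unboxing a null Long for the division throws NullPointerException. */
    static double unguardedMin(Long rawMin, double scalingFactor) {
        return rawMin / scalingFactor;
    }

    /** Guarded: propagates "no value" as null instead of unboxing it. */
    static Double scaledMin(Long rawMin, double scalingFactor) {
        return rawMin == null ? null : rawMin / scalingFactor;
    }

    public static void main(String[] args) {
        System.out.println(scaledMin(250L, 100.0));
        System.out.println(scaledMin(null, 100.0));
        try {
            unguardedMin(null, 100.0);
        } catch (NullPointerException e) {
            System.out.println("NPE from unboxing null");
        }
    }
}
```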

fixes #23487
2017-03-15 15:14:32 -07:00
Jason Tedor f7b8128f92 Enable explicitly enforcing bootstrap checks
This commit adds a system property that enables end-users to explicitly
enforce the bootstrap checks, independently of the binding of the
transport protocol. This can be useful for single-node production
systems that do not bind the transport protocol (and thus the bootstrap
checks would not be enforced).

Relates #23585
2017-03-15 10:36:17 -07:00
Sönke Liebau 326d6456fe Minor spelling corrections in documentation of ClusterState (#23592) 2017-03-15 10:20:18 -07:00
Boaz Leskes c0cafa786b UnicastZenPing shouldn't ping the address of the local node (#23567)
Pinging the local node address doesn't really add to discovering other nodes. It just pollutes the logs with unneeded information.
2017-03-14 07:02:42 -07:00
Jay Modi 3200da0327 Provide a method to retrieve a closeable char[] from a SecureString (#23389)
This change adds a new method that returns the underlying char[] of a SecureString and the ability
to clone the SecureString so that the original SecureString is not vulnerable to modification.
Closing the cloned SecureString will wipe the char[] that backs the clone but the original SecureString remains unaffected.

Additionally, while making a separate change I found that SecureSettings will fail when diff is called on them and there
is no fallback setting. Given the idea behind SecureSetting, I think that diff should just be a no-op and I have
implemented this here as well.
2017-03-13 19:50:55 -07:00
Christoph Büscher a8117a2d77 Tests: fix GeoHashGridAggregatorTests expectations (#23556)
Currently GeoHashGridAggregatorTests#testWithSeveralDocs increases the expected
document count per hash for each geo point added to a document. When points
added to the same doc fall into one bucket (one hash cell) the document should
only be counted once.

Closes #23555
2017-03-13 09:54:50 -07:00
Christoph Büscher 21dcd4f4ca Tests: Check that GetResponse.toString() outputs json xcontent (#23545) 2017-03-13 09:54:29 -07:00
Martijn van Groningen 78a48b102f
[INNER HITS] Changed DisMaxQueryBuilder to extract inner hits from leaf queries.
Closes #23482
2017-03-12 16:21:03 -07:00
Martijn van Groningen b01070a390
[TEST] Added unit tests for diversified sampler aggregator. 2017-03-12 16:14:47 -07:00
Jason Tedor c51ef0b2ca Honor max concurrent searches in multi-search
A previous change to the multi-search request execution to avoid stack
overflows regressed on limiting the number of concurrent search requests
from a batched multi-search request. In particular, the replacement of
the tail-recursive call with a loop could asynchronously fire off all of
the remaining search requests in the batch while max concurrent search
requests are already executing. This commit attempts to address this
issue by taking a more careful approach to the initial problem of
recursive calls. The cause of the initial problem was due to the possibility
of individual requests completing on the same thread as invoked the
search action execution. This can happen, for example, in cases when an
individual request does not resolve to any shards. To address this
problem, when an individual request completes we check if it completed
on the same thread as fired off the request. In this case, we loop and
otherwise safely recurse. Sadly, there was a unit test to check that the
maximum number of concurrent search requests was not exceeded, but that
test was broken while modifying the test to reproduce a case that led to
the possibility of stack overflow. As such, we randomize whether or not
search actions execute on the same thread as the thread that invoked the
action.
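
The loop-or-recurse decision described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `AsyncAction`, `drain` and `executeNext` are hypothetical names, and the sketch runs a single chain of requests rather than up to max concurrent searches.

```java
import java.util.Queue;
import java.util.concurrent.atomic.AtomicInteger;

final class SameThreadLoopSketch {

    /** Hypothetical stand-in for an asynchronous search action. */
    interface AsyncAction {
        void execute(int request, Runnable onDone);
    }

    /** Starts the chain and returns how many requests had completed when the calling thread stopped. */
    static int drain(Queue<Integer> requests, AsyncAction action) {
        AtomicInteger completed = new AtomicInteger();
        executeNext(requests, action, completed);
        return completed.get();
    }

    private static void executeNext(Queue<Integer> requests, AsyncAction action, AtomicInteger completed) {
        Integer request;
        while ((request = requests.poll()) != null) {
            final Thread caller = Thread.currentThread();
            // Written and read only on the caller thread, so a plain array cell is enough.
            final boolean[] completedOnCaller = {false};
            action.execute(request, () -> {
                completed.incrementAndGet();
                if (Thread.currentThread() == caller) {
                    // Same thread: do not recurse, let the loop below pick up the next request.
                    completedOnCaller[0] = true;
                } else {
                    // Different thread: we are on a fresh stack, so recursing is safe.
                    executeNext(requests, action, completed);
                }
            });
            if (!completedOnCaller[0]) {
                // The callback runs (or ran) on another thread and continues the chain there.
                return;
            }
        }
    }
}
```

With an action that completes inline, for example `(request, onDone) -> onDone.run()`, the stack depth stays constant no matter how many requests are queued.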

Relates #23538
2017-03-12 00:45:40 -08:00
Jason Tedor 2a26ae1d6a Fall back to non-atomic move when removing plugins
When plugins are installed on a union filesystem (for example, inside a
Docker container), removing them can fail because we attempt an atomic
move which will not work if the plugin is not installed in the top
layer. This commit modifies removing a plugin to fall back to a
non-atomic move in cases when the underlying filesystem does not support
atomic moves.
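
A minimal sketch of such a fallback, using only the JDK's `java.nio.file` API. The class and method names are illustrative and not taken from the plugin removal code.

```java
import java.io.IOException;
import java.nio.file.AtomicMoveNotSupportedException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class MoveWithFallbackSketch {

    /** Tries an atomic move first and falls back to a plain move if the filesystem refuses. */
    static void move(Path source, Path target) throws IOException {
        try {
            Files.move(source, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (AtomicMoveNotSupportedException e) {
            // For example a union filesystem where the source is not in the top layer.
            Files.move(source, target);
        }
    }
}
```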

Relates #23548
2017-03-11 19:46:01 -08:00
Jason Tedor 3d82549d8e Avoid stack overflow in multi-search
Today when handling a multi-search request, we asynchronously execute as
many search requests as the minimum of the number of search requests in
the multi-search request and the maximum number of concurrent
requests. When these search requests return, we poll more search
requests from a queue of search requests from the original multi-search
request. The implementation of this was recursive, and if the number of
requests in the multi-search request was large, a stack overflow could
arise due to the recursive invocation. This commit replaces this
recursive implementation with a simple iterative implementation.

Relates #23527
2017-03-09 15:19:05 -08:00
Ryan Ernst 9488985775 Test: Upgrade randomized runner to 2.5.0 (#23513)
This commit upgrades to the newest version of randomized runner. There
is a new additional check that allows ensuring the working directory
for each child jvm is empty. By default, this check will fail the test
run. However, for elasticsearch, we default to wipe the directory. For
example, if you previously told the runner to not wipe the directory, in
order to investigate a failure, the wipe option will delete this data
upon re-running the test.
2017-03-09 11:56:43 -08:00
Jason Tedor ae6331f27e Handle existence of cgroup version 2 hierarchy
When parsing the control groups to which the Elasticsearch process
belongs, we extract a map from subsystems to paths by parsing
/proc/self/cgroup. This file contains colon-delimited entries of the
form hierarchy-ID:subsystem-list:cgroup-path. For control group version
1 hierarchies, the subsystem-list is a comma-delimited list of the
subsystems for that hierarchy. For control group version 2 hierarchies
(which can only exist on Linux kernels since version 4.5), the
subsystem-list is an empty string. The previous parsing of
/proc/self/cgroup incorrectly accounted for this possibility (a +
instead of a * in a regular expression). This commit addresses this
issue, adds a test case that covers this possibility, and simplifies the
code that parses /proc/self/cgroup.
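
To illustrate the `+` versus `*` distinction, here is a small sketch of parsing such lines. It is an assumption-laden example, not the parser from this commit: the class name, the exact regular expression, and the choice to skip entries with an empty subsystem list are all illustrative.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CgroupParseSketch {

    // hierarchy-ID:subsystem-list:cgroup-path
    // "[^:]*" (rather than "[^:]+") also accepts the empty subsystem list of a version 2 hierarchy.
    private static final Pattern LINE = Pattern.compile("^\\d+:([^:]*):(.*)$");

    /** Maps each version 1 subsystem to its cgroup path. */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> subsystemToPath = new HashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.matches()) {
                throw new IllegalStateException("invalid cgroup line: " + line);
            }
            String subsystems = m.group(1);
            if (subsystems.isEmpty()) {
                continue; // version 2 hierarchy: no subsystems listed (skipping is this sketch's choice)
            }
            for (String subsystem : subsystems.split(",")) {
                subsystemToPath.put(subsystem, m.group(2));
            }
        }
        return subsystemToPath;
    }
}
```

A line such as `0::/docker/abc` matches the pattern above but would be rejected if the middle group required at least one character.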

Relates #23493
2017-03-06 14:12:26 -08:00
Tanguy Leroux 1ceb6d04c7 [Test] Fix BulkResponseTests 2017-03-03 09:34:10 +01:00
Jay Modi 01502893eb HTTP transport stashes the ThreadContext instead of the RestController (#23456)
Previously, the RestController would stash the context prior to copying headers. However, there could be deprecation
log messages logged and in turn warning headers being added to the context prior to the stashing of the context. These
headers in the context would then be removed from the request and also leaked back into the calling thread's context.

This change moves the stashing of the context to the HttpTransport so that the network threads' context isn't
accidentally populated with warning headers and to ensure the headers added early on in the RestController are not
excluded from the response.
2017-03-02 14:44:01 -05:00
Ali Beyad 577d2a6a1d Adds cluster state size to /_cluster/state response (#23440)
This commit adds the size of the cluster state to the response for the
get cluster state API call (GET /_cluster/state).  The size that is
returned is the size of the full cluster state in bytes when compressed.
This is the same size of the full cluster state when serialized to
transmit over the network.  Specifying the ?human flag displays the
compressed size in a more human friendly manner.  Note that even if the
cluster state request filters items from the cluster state (so a subset
of the cluster state is returned), the size that is returned is the
compressed size of the entire cluster state.

Closes #3415
2017-03-02 14:20:29 -05:00
Christoph Büscher d02b6f58fa Tests: Adapt ExistsQueryBuilderTests to changes in ExistQueryBuilder#toQuery() (#23462)
Recent changes in the Lucene query that the ExistsQueryBuilder creates broke
this test.
2017-03-02 18:27:30 +01:00
Kunal Kapoor 32d292b3c2 Added types options to DeleteByQueryRequest (#23265)
Add types setter and getter to `DeleteByQueryRequest`, which delegate to the inner `SearchRequest`.
2017-03-02 16:52:21 +01:00
Jim Ferenczi 6519e1207c Fix query_string_query to transform "foo:*" in an exists query on the field name (#23433)
Currently "foo:*" is parsed as prefix query on the field `foo` unless the field is defined in `default_field` or `fields`.
This commit fixes this behavior, "foo:*" is now rewritten to an exists query on the field name.
This change also removes the assumption that "_all:*" should return all docs.

relates #23356
2017-03-02 16:26:27 +01:00
Tanguy Leroux 15c936ec02 Correctly parse BulkItemResponse.Failure's status (#23432)
Today the status is lost when parsing back a BulkItemResponse.Failure. This commit changes the BulkItemResponse.Failure parsing method so that it correctly instantiates a failure with the parsed status instead of relying on the parsed ElasticsearchException (which always returns an internal server error status).
2017-03-02 12:37:07 +01:00
Jim Ferenczi 22870cc6c0 Fix centroids equality tests in TDigestState#equals 2017-03-02 11:27:04 +01:00
Tanguy Leroux 5a668c4add Tests: Add unit test for SignificantLongTerms and SignificantStringTerms (#23428)
Relates to #22278
2017-03-02 10:48:29 +01:00
Jim Ferenczi 1228084c1c Fix tests on InternalAggregation that rely on equals/hashCode. 2017-03-02 10:38:16 +01:00
Martijn van Groningen 1d3f6c463c
[INGEST] Added processor type and tag header to error when processor type isn't available on node.
[INGEST] Accumulate any potential other processor parse errors before failing instead of failing upon first processor parsing error.
2017-03-02 08:37:03 +01:00
Nik Everett a54daade33 Tests InternalSingleBucketAggregation subclasses (#23388)
Adds a common base class for testing subclasses of
`InternalSingleBucketAggregation`. They are so similar they
call into question the utility of having all of these classes.
We maybe could just use `InternalSingleBucketAggregation` in
all those cases.... But for now, let's test the classes!

Relates to #22278
2017-03-01 16:50:52 -05:00
Tanguy Leroux e71d9c1960 Tests: Add unit test for InternalDateHistogram (#23402)
Relates to #22278
2017-03-01 16:08:16 +01:00
Adrien Grand 64c90346c6 Avoid adding unnecessary nested filters when ranges are used. (#23427)
The code was testing `PointRangeQuery` however we now use the
`IndexOrDocValuesQuery` in field mappers. This makes the test generate queries
through mappers so that we test the actual queries that would be generated.
2017-03-01 14:34:59 +01:00
Adrien Grand 3134d6b520 Add unit tests to percentile ranks aggregations. (#23240)
Relates #22278
2017-03-01 13:57:40 +01:00
Yannick Welsch c7edaba4e8 [TEST] Fix race condition when blocking cluster state processing during primary relocation
Two tests were periodically failing. What both tests are doing is starting a relocation of a shard from one node to another. Once the
recovery process is started, the test blocks cluster state processing on the relocation target using the BlockClusterStateProcessing disruption. The test then indefinitely
waits for the relocation to complete. The stack dump shows that the relocation is stuck in the PeerRecoveryTargetService.waitForClusterState method, waiting for the relocation target node to have at least the same cluster state
version as the relocation source.

The reason why it gets stuck is the following race:
1) The test code executes a reroute command that relocates a shard from one node to another
2) Relocation target node starts applying the clusterstate with relocation info, starting the recovery process.
4) Recovery is super fast and quickly goes to the waitForClusterState method, which wants to ensure that the cluster state that is
applied on the relocation target is at least as new as the one on the relocation source. The relocation source has already applied the
cluster state but the relocation target is still in the process of applying it. The waitForClusterState method thus uses a
ClusterObserver to wait for the next cluster state. Internally this means submitting a task with priority HIGH to the cluster service.
5) Cluster state application is about to finish on the relocation target. As one of the last steps, it acks to the master which makes the
reroute command return successfully.
6) The test code then blocks cluster state processing on the relocation target by submitting a cluster state update task (with priority
IMMEDIATE) that blocks execution.

If the task that is submitted in step 6 is handled before the one in step 4 by ClusterService's thread executor, cluster state
processing becomes blocked and prevents the cluster state observer from observing the applied cluster state.
2017-03-01 11:55:48 +01:00
Martijn van Groningen 524d7f592d
[TEST] Added unit tests for GeoHashGridAggregator and InternalGeoHashGrid
Part of #22278
2017-03-01 10:33:57 +01:00
Adrien Grand b388389ada Remove support for the include/pattern syntax. (#23141)
Relates #22933
2017-03-01 10:00:38 +01:00
Lee Hinman fd991f32f9 Refactor TransportShardBulkAction and add unit tests
This refactors the `TransportShardBulkAction` to split it apart and make it
unit-testable, and then it also adds unit tests that use these methods.

In particular, this makes `executeBulkItemRequest` shorter and more readable
2017-02-28 19:53:18 -07:00
Jason Tedor 7ce06aeb8c Fix date format in warning headers
This commit fixes the date format in warning headers. There is some
confusion around whether or not RFC 1123 requires two-digit
days. However, the warning header specification very clearly relies on a
format that requires two-digit days. This commit removes the usage of
RFC 1123 date/time format from Java 8, which allows for one-digit days,
in favor of a format that forces two-digit days (it's otherwise
identical to RFC 1123 format, it is just fixed width).
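
The difference can be seen with the JDK alone. The sketch below is illustrative: the pattern string and class name are assumptions, not the formatter this commit introduces.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

final class WarningDateSketch {

    // Same shape as RFC 1123, but "dd" always prints two digits for the day of month.
    private static final DateTimeFormatter FIXED_WIDTH =
        DateTimeFormatter.ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.ENGLISH);

    /** Fixed-width variant: the day of month is zero-padded. */
    static String fixedWidth(ZonedDateTime time) {
        return FIXED_WIDTH.format(time.withZoneSameInstant(ZoneOffset.UTC));
    }

    /** The JDK's built-in RFC 1123 formatter, which prints one-digit days without padding. */
    static String jdkRfc1123(ZonedDateTime time) {
        return DateTimeFormatter.RFC_1123_DATE_TIME.format(time.withZoneSameInstant(ZoneOffset.UTC));
    }
}
```

For 1 March 2017 the built-in formatter prints the day as `1`, while the fixed-width pattern prints `01`.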

Relates #23418
2017-02-28 20:28:07 -05:00
Ryan Ernst 019263d664 Revert "Internal: Change version constant names for already released versions (#23416)"
This reverts commit dc0e93ed62.
2017-02-28 14:45:13 -08:00
Ryan Ernst dc0e93ed62 Internal: Change version constant names for already released versions (#23416)
We have many version constants in master that have already been
released, but are still marked (by naming convention) as unreleased.
This commit renames those version constants.
2017-02-28 13:05:44 -08:00
Christoph Büscher 98b7023318 Fix checkstyle LineLength issues in UpdateShardAllocationSettingsIT 2017-02-28 20:14:28 +01:00
David Pilato 0ae6bc36fa Create version constants for next bug fix version v5.2.3 2017-02-28 19:11:48 +01:00
Ali Beyad b58eb5f051 [TEST] removes unused SameShardRouting test 2017-02-28 12:58:27 -05:00
Ali Beyad 5e2e45cad9 Makes the same_shard host dynamically updatable (#23397)
Previously, cluster.routing.allocation.same_shard.host was not a dynamic
setting and could not be updated after startup.  This commit changes the
behavior to allow the setting to be dynamically updatable.  The
documentation already states that the setting is dynamic so no
documentation changes are required.

Closes #22992
2017-02-28 12:48:54 -05:00
Christoph Büscher 5be7f6a76f Tests: fixing line length limit in ScriptedMetricAggregatorTests 2017-02-28 16:56:43 +01:00
Christoph Büscher a522deb6b5 Tests: Add unit test for InternalScriptedMetricAggregator (#23404)
Relates to #22278
2017-02-28 16:43:12 +01:00
Colin Goodheart-Smithe 406d2f7a64
Fixes the per term error in the terms aggregation
When multiple reduce phases were needed the per term error got lost in subsequent reduces in some situations:

When a previous reduce phase had calculated a non-zero error for a particular bucket, we were not accounting for this error in subsequent reduce phases and instead were relying on the overall error for the agg, which meant we were implicitly assuming that all shards that made up that aggregation had returned the term. This is plainly not true, so we need to make sure the per term error for the aggregation is used when calculating the error for that term in the new reduced aggregation.
2017-02-28 13:56:35 +00:00
Jim Ferenczi 17a0b4e69c Fix merge scheduler test that depends on the number of processors 2017-02-28 11:43:35 +01:00
Jim Ferenczi 0c03d0056c #23391: simplify setting fallback (missing in the squash commit) 2017-02-28 11:34:02 +01:00
Jim Ferenczi d27a55866c Fix merge scheduler config settings (#23391)
Change Setting#get(Settings, Settings) to fallback only if the setting is present in the secondary.
This is needed to fix settings that rely on other settings.
Replace the integration test with unit tests.
2017-02-28 10:21:27 +01:00
Christoph Büscher 084cb38207 Tests: Add unit test for InternalScriptedMetric (#23330)
Relates to #22278
2017-02-28 09:40:11 +01:00
Jason Tedor c61cc5f617 Fix deprecation escaping tests
This commit fixes an off-by-one error in the deprecation escaping tests.
2017-02-27 17:19:13 -05:00
Martijn van Groningen 73fb945980
Changed TaskOperationFailure#getCause() return type from Throwable to Exception. 2017-02-27 21:31:19 +01:00
Jim Ferenczi 5c84640126 Upgrade to lucene-6.5.0-snapshot-d00c5ca (#23385)
Lucene upgrade
2017-02-27 18:39:04 +01:00
Jason Tedor 577e6a5e14 Correct warning header to be compliant
The warning header used by Elasticsearch for delivering deprecation
warnings has a specific format (RFC 7234, section 5.5). The format
specifies that the warning header should be of the form

    warn-code warn-agent warn-text [warn-date]

Here, the warn-code is a three-digit code which communicates various
meanings. The warn-agent is a string used to identify the source of the
warning (either a host:port combination, or some other identifier). The
warn-text is quoted string which conveys the semantic meaning of the
warning. The warn-date is an optional quoted date that can be in a few
different formats.

This commit corrects the warning header within Elasticsearch to follow
this specification. We use the warn-code 299 which means a
"miscellaneous persistent warning." For the warn-agent, we use the
version of Elasticsearch that produced the warning. The warn-text is
unchanged from what we deliver today, but is wrapped in quotes as
specified (this is important as a problem that exists today is that
multiple warnings can not be split by comma to obtain the individual
warnings as the warnings might themselves contain commas). For the
warn-date, we use the RFC 1123 format.
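
As a rough illustration of the resulting header value, the sketch below assembles the four parts. The class name, the agent string used in the usage note, and the escaping rules are assumptions made for the example, not the code from this commit.

```java
final class WarningHeaderSketch {

    /** Builds: warn-code warn-agent "warn-text" "warn-date", using warn-code 299. */
    static String format(String agent, String text, String date) {
        return "299 " + agent + " \"" + escape(text) + "\" \"" + date + "\"";
    }

    // Backslash-escape backslashes and double quotes so the text stays a single quoted string.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }
}
```

For instance, `format("Elasticsearch-x.y.z", "a, b", "Mon, 27 Feb 2017 17:14:21 GMT")` keeps the comma inside the quoted warn-text, so a client can tell it apart from a comma separating two warnings.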

Relates #23275
2017-02-27 12:14:21 -05:00
Luca Cavanna 2fb0466f66 Convert suggestion response parsing to use NamedXContentRegistry (#23355)
We recently added parsing code to parse suggesters responses into java api objects. This was done using a switch based on the type of the returned suggestion. We can now replace the switch with using NamedXContentRegistry, which will also be used for aggs parsing.
2017-02-27 15:42:25 +01:00
Colin Goodheart-Smithe 1ceaef0de6
Fixes terms error count for multiple reduce phases
Previously when multiple reduces occurred for the terms aggregation we would add up the errors for the aggregations but would not take into account the errors that had already been calculated for the previous reduce phases.

This change corrects that by adding the previously created errors to the new error value.

Closes #23286
2017-02-27 13:44:18 +00:00
javanna 261f31f5b7 [TEST] move filters aggs wrapper query builder rewriting test to integ tests
This test makes little sense when sent from the REST layer, as WrapperQueryBuilder is supposed to be used from the Java api. Also, providing the inner query as base64 string will work only for string formats and break for binary formats like SMILE and CBOR, which doesn't play well with randomizing content type in our REST tests.
2017-02-27 12:27:03 +01:00
Simon Willnauer b8e2d12b23 Factor out filling of TopDocs in SearchPhaseController (#23380)
Previously this code was duplicated across the 3 different topdocs variants
we have. It also had no real unittest (where we tested with holes in the results)
which caused a sneaky bug where the comparison used `result.size()` vs `results.size()`
causing several NPEs downstream. This change adds a static method to fill the top docs
that is shared across all variants and adds a unittest that would have caught the issue
very quickly.

Closes #19356
Closes #23357
2017-02-27 11:44:41 +01:00
Christoph Büscher 641c88dc29 Prevent negative `from` parameter in SearchSourceBuilder (#23358)
This prevents later errors like the one reported in #23324 and throws an
IllegalArgumentException early instead.
2017-02-27 09:45:10 +01:00
Boaz Leskes 396b8b371c reduce the number of iterations in testPrimaryRelocationWhileIndexing and flush every 5
Without flushing, the translog doubles its size on every recovery
2017-02-26 19:15:33 +01:00
Boaz Leskes 9088ddd09c rollback unneeded change in testNotifyOnDisconnect 2017-02-26 12:30:57 +01:00
Boaz Leskes aa49ba949c disable sampling in testNotifyOnDisconnect
the background activity confuses the test
2017-02-26 12:20:34 +01:00
Ryan Ernst 48548f6c3d CLI: Fix prompting for yes/no to handle console returning null (#23320)
Console.readText may return null in certain cases. This commit fixes a
bug in Terminal.promptYesNo which assumed a non-null return value.  It
also adds a test for this, and modifies mock terminal to be able to
handle null input values.
2017-02-24 20:20:17 -08:00
Ali Beyad 9a9259184a Handle snapshot repository's missing index.latest
If index.latest does not exist, the repository is on an older
version so simply return the empty repository's generation id.
2017-02-24 10:14:22 -05:00
Christoph Büscher 9270f8a873 Adding equals/hashCode to MainResponse (#23352) 2017-02-24 14:58:03 +01:00
Jay Modi 5490cb52b0 Always restore the ThreadContext for operations delayed due to a block (#23349)
The IndexShardOperationsLock has a mechanism to delay operations if there is currently a block on the lock. These
delayed operations are executed when the block is released and are executed by a different thread. When the different
thread executes the operations, the ThreadContext is that of the thread that was blocking operations. In order to
preserve the ThreadContext, we need to store it and wrap the listener when the operation is delayed.
2017-02-24 08:21:04 -05:00
Christoph Büscher b9eb1bba65 Add unit tests for ParentToChildAggregator (#23305)
Adds unit tests for the `children` aggregation.
This change also adds the ability to mock MapperService in subtests of
AggregatorTestCase.
2017-02-24 10:29:31 +01:00
Jim Ferenczi 63bdd01eb7 Expose WordDelimiterGraphTokenFilter (#23327)
This change exposes the new Lucene graph based word delimiter token filter in the analysis filters.
Unlike the `word_delimiter` this token filter named `word_delimiter_graph` correctly handles multi terms expansion at query time.

Closes #23104
2017-02-24 00:53:38 +01:00
Shai Erera eeac6d27f2 Add BreakIteratorBoundaryScanner support for FVH (#23248)
This commit adds a boundary_scanner property to the search highlight
request so the user can specify different boundary scanners:

* `chars` (default,  current behavior)
* `word` Use a WordBreakIterator
* `sentence` Use a SentenceBreakIterator

This commit also adds "boundary_scanner_locale" to define which locale
should be used when scanning the text.
2017-02-23 23:32:22 +01:00
Ali Beyad 25a9a7ee3a Prioritize listing index-N blobs over index.latest in reading snapshots (#23333)
There are two ways to determine the latest index-N blob that contains
the truth of the contents of the repository: (1) list all index-N blobs
and figure out what the latest value of N is, and (2) read the
index.latest blob, which contains the latest value of N explicitly.
Note that the index.latest blob is not written atomically and can be
re-written, as opposed to the index-N blobs which are never re-written
(to create an updated index blob, index-{N+1} is written).

Previously, the latest index-N was determined by first trying to read
the index.latest blob and if that blob was missing (it was deleted
before being re-written and in between deleting it and re-writing it,
the system crashed), then all index-N blobs were listed to pick the
highest N value.

For non-read-only repositories, this could produce race conditions with
the file system.  In particular, it is possible that the index.latest
blob is being read in order to serve a read request (e.g. get snapshots)
and while doing so, an attempt is made to delete the index.latest blob
and re-write it in order to finalize a snapshot operation.  On some file
systems (e.g. Windows), it is forbidden to delete a file while it is
open for reading by another process/thread.

This commit changes the priority so that figuring out the latest index-N
blob is first done by listing all index-N blobs and determining the
latest N value.  If that fails because the repository does not
support listing blobs (e.g. the URL repository), then the index.latest
blob is read.  This is safe because in read-only repositories that do
not support listing blobs, the index.latest blob is never deleted and
then re-written, so the aforementioned issue does not arise.
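
The new priority order can be sketched as below. This is a simplified illustration: the class and method names are hypothetical, `UnsupportedOperationException` stands in for however a repository signals that it cannot list blobs, and `-1` is a placeholder for an empty repository.

```java
import java.util.Collection;
import java.util.OptionalLong;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

final class LatestIndexBlobSketch {

    private static final String PREFIX = "index-";

    /** Highest N among blobs named index-N, or empty if there is none. */
    static OptionalLong latestListed(Collection<String> blobNames) {
        long latest = -1;
        for (String name : blobNames) {
            if (!name.startsWith(PREFIX)) {
                continue; // e.g. index.latest or snapshot blobs
            }
            try {
                latest = Math.max(latest, Long.parseLong(name.substring(PREFIX.length())));
            } catch (NumberFormatException e) {
                // Not an index-N blob; ignore it.
            }
        }
        return latest < 0 ? OptionalLong.empty() : OptionalLong.of(latest);
    }

    /** Lists blobs first; reads index.latest only if listing is not supported. */
    static long resolve(Supplier<Collection<String>> listBlobs, LongSupplier readIndexLatest) {
        try {
            OptionalLong listed = latestListed(listBlobs.get());
            return listed.isPresent() ? listed.getAsLong() : -1L; // -1: placeholder for "no index blob yet"
        } catch (UnsupportedOperationException e) {
            return readIndexLatest.getAsLong();
        }
    }
}
```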
2017-02-23 15:44:12 -05:00
sabi0 09b3c7f270 Do not create String instances in 'Strings' methods accepting StringBuilder (#22907) 2017-02-23 10:57:34 -08:00
Christoph Büscher 8b1b152e91 Remove abstract InternalMetricsAggregation class (#23326)
This class doesn't seem to do much other than to group together
certain types of aggregations.
2017-02-23 18:03:40 +01:00
Simon Willnauer 2f3f9b9961 Remove unnecessary result sorting in SearchPhaseController (#23321)
In order to use lucene's utilities to merge top docs the results
need to be passed in a dense array where the index corresponds to the shard index in
the result list. Yet, we were sorting results before merging them just to order them
in the incoming order again for the above mentioned reason. This change removes the
obsolete sort and prevents unnecessary materializing of results.
2017-02-23 13:48:54 +01:00
Simon Willnauer 771fd1f4ea Fix SamplerAggregatorTests to have stable and predictable docIds
Closes #23315
2017-02-23 08:08:38 +01:00
Ryan Ernst 18f57c05cf Script: Fix value of `ctx._now` to be current epoch time in milliseconds (#23175)
In update scripts, `ctx._now` uses the same milliseconds value used by the
rest of the system to calculate deltas. However, that time is not
actually epoch milliseconds, as it is derived from `System.nanoTime()`.
This change reworks the estimated time thread in ThreadPool which this
time is based on to make available both the relative time, as well as
absolute milliseconds (epoch) which may be used with calendar system. It
also renames the EstimatedTimeThread to a more apt CachedTimeThread.
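
The distinction between the two clocks can be sketched as follows. This is not the ThreadPool implementation: the class is a hypothetical holder that some background thread would refresh periodically.

```java
import java.util.concurrent.TimeUnit;

final class CachedTimeSketch {

    private volatile long relativeMillis;
    private volatile long absoluteMillis;

    /** Would be called periodically by a background thread. */
    void refresh() {
        // nanoTime has an arbitrary origin: only differences between two readings are meaningful.
        relativeMillis = TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
        // currentTimeMillis is milliseconds since the epoch and can be used with calendars.
        absoluteMillis = System.currentTimeMillis();
    }

    /** For computing deltas only. */
    long relativeTimeInMillis() {
        return relativeMillis;
    }

    /** Epoch milliseconds, suitable for a value such as ctx._now. */
    long absoluteTimeInMillis() {
        return absoluteMillis;
    }
}
```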

closes #23169
2017-02-22 15:11:02 -08:00
Lee Hinman 77d641216a Handle long overflow when adding paths' totals
From #23093, we fixed the issue where a filesystem can be so large that it
overflows and returns a negative number. However, there is another issue when
adding a path as a sub-path to another `FsInfo.Path` object, when adding the
totals the values can still overflow.

This adds the same safety to return `Long.MAX_VALUE` instead of the negative
number, as well as a test exercising the logic.
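
A minimal sketch of that safety check, assuming both inputs are non-negative sizes. The class and method names are illustrative.

```java
final class SaturatingAddSketch {

    /** Adds two non-negative sizes; returns Long.MAX_VALUE if the sum overflows. */
    static long addNonNegative(long a, long b) {
        long sum = a + b;
        // With non-negative inputs, an overflow always wraps around to a negative value.
        return sum < 0 ? Long.MAX_VALUE : sum;
    }
}
```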
2017-02-22 13:04:34 -07:00
Yannick Welsch 0f88f21535 Don't set local node on cluster state used for node join validation (#23311)
When a node wants to join a cluster, it sends a join request to the master. The master then sends a join validation request to the node. This checks that the node can deserialize the current cluster state that exists on the master and that it can thus handle all the indices that are currently in the cluster (see #21830).

The current code can trip an assertion as it does not take the cluster state as is but sets itself as the local node on the cluster state. This can result in an inconsistent DiscoveryNodes object as the local node is not yet part of the cluster state and a node with same id but different address can still exist in the cluster state. Also another node with the same address but different id can exist in the cluster state if multiple nodes are run on the same machine and ports have been swapped after node crashes/restarts.
2017-02-22 20:27:27 +01:00
Lee Hinman 6f1ed8a3d1
[TEST] Add additional logging to IndicesStoreIntegrationIT.testIndexCleanup 2017-02-22 10:11:05 -07:00
Luca Cavanna 495b24655b Update indices settings api to support CBOR and SMILE format (#23309)
Also expand testing on the different ways to provide index settings and remove dead code around ability to provide settings as query string parameters

Closes #23242
2017-02-22 17:51:10 +01:00
javanna f2acf466aa Convert script/template objects to json format
Elasticsearch accepts multiple content-type formats, hence scripts can be stored/provided in json, yaml, cbor or smile. Yet the format that should be used internally is json. This is a problem mainly around search templates, as they only support json out of the four content-types, so instead of maintaining the content-type of the request we should rather convert the scripts/templates to json.

 Binary formats were not previously supported. If you stored a template in yaml format, you'd get back an error "No encoder found for MIME type [application/yaml]" when trying to execute it. With this commit the request content-type is independent from the template, which always gets converted to json internally. That is transparent to users and doesn't affect the content type of the response obtained when executing the template.
2017-02-22 16:20:53 +01:00
Simon Willnauer 5c1924ad19 Remove BWC layer for number of reduce phases (#23303)
Both PRs below have been backported to 5.4 such that we can enable
BWC tests of this feature as well as remove version dependent serialization
for search request / responses.

Relates to #23288
Relates to #23253
2017-02-22 15:03:09 +01:00
mms-programming d31e41547a Handle BlobPath's trailing separator case (#23091) 2017-02-22 09:04:55 +01:00
Areek Zillur 148be11f26 Make document write requests immutable (#23038)
* Make document write requests immutable

Previously, write requests were mutated at the
transport level to update request version, version type
and sequence no before replication.
Now that all write requests go through the shard bulk
transport action, we can use the primary response stored
in item level bulk requests to pass the updated version,
sequence no. to replicas.

* incorporate feedback

* minor cleanup

* Add bwc test to ensure correct index version propagates to replica

* Fix bwc for propagating write operation versions

* Add assertion on replica request version type

* fix tests using internal version type for replica op

* Fix assertions to assert version type in replica and recovery

* add bwc tests for version checks in concurrent indexing

* incorporate feedback
2017-02-21 17:41:22 -05:00
Simon Willnauer ca38e88148 Remove assertion that relies on all shards being successful
The assertion that if there are buffered aggs at least one incremental
reduce phase should have happened doesn't hold if there are shard failures.
This commit removes this assertion.

Relates to #23288
2017-02-21 22:41:49 +01:00
Nik Everett 7475175957 Adds unit test for sampler aggregation (#23243)
* Adds unit test for sampler aggregation

Relates to #22278
2017-02-21 12:51:47 -05:00
Jim Ferenczi 0ff6356b7e Revert "Never reduce the same agg twice"
This change reverts 5e4ba4a60e
Incremental reduction of aggs should also work with a single aggregation now that InternalTopHits.equals
 is fixed.
2017-02-21 18:48:28 +01:00
Simon Willnauer ce625ebdcc Expose `batched_reduce_size` via `_search` (#23288)
In #23253 we added the ability to incrementally reduce search results.
This change exposes the parameter to control the batch size and therefore
the memory consumption of a large search request.
2017-02-21 18:36:59 +01:00
Jim Ferenczi 1ba9770037 Fix comparison of double in InternalTopHits
InternalTopHits uses "==" to compare hit scores and fails when score is NaN.
This commit changes the comparison to always use Double.compare.
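
The difference is a property of the JDK and can be shown in isolation. The class below is only an illustration and not part of InternalTopHits.

```java
final class ScoreEqualitySketch {

    /** IEEE 754 comparison: NaN is never equal to anything, including itself. */
    static boolean equalsWithOperator(double a, double b) {
        return a == b;
    }

    /** Double.compare treats NaN as equal to NaN, which is what an equals() method needs. */
    static boolean equalsWithCompare(double a, double b) {
        return Double.compare(a, b) == 0;
    }
}
```

Note that `Double.compare` also distinguishes `0.0` from `-0.0`, which `==` does not.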

Relates #23253
2017-02-21 18:18:44 +01:00
Simon Willnauer 5e4ba4a60e Never reduce the same agg twice
Some randomization caused reduction of the same agg multiple times
which causes issues on some aggregations.

Relates to #23253
2017-02-21 17:55:44 +01:00
Simon Willnauer 489f38918d Fix incremental reduce randomization in base tests cases
We can and should randomly reduce down to a single result before
we passing the aggs to the final reduce. This commit changes the logic
to do that and ensures we don't trip the assertions the previous imple tripped.

Relates to #23253
2017-02-21 17:13:46 +01:00
Nik Everett 74c33823ab Comment 2017-02-21 10:43:29 -05:00
Nik Everett 0dee1f85e6 Remove closeAgg 2017-02-21 10:31:42 -05:00
Tanguy Leroux 3a0fc526bb UpdateRequest implements ToXContent (#23289)
This commit changes UpdateRequest so that it implements the ToXContentObject interface.
2017-02-21 15:20:15 +01:00
Jim Ferenczi cc865cbc96 Add unit tests for stats and extended stats aggregations (#23287)
Add tests for InternalStats, InternalExtendedStats and StatsAggregator/ExtendedStatsAggregator

Relates #22278
2017-02-21 15:14:54 +01:00
Simon Willnauer f933f80902 First step towards incremental reduction of query responses (#23253)
Today all query results are buffered up until we have received responses from
all shards. This can hold on to a significant amount of memory if the number of
shards is large. This commit adds a first step towards incrementally reducing
aggregation results once a configurable (per search request) number of responses
has been received. If enough query results have been received and buffered, all
aggregation responses received so far are reduced and released to be GCed.
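
The buffering idea can be sketched generically. This is an illustration under stated assumptions, not the SearchPhaseController code: the class name is hypothetical and a `BinaryOperator` stands in for the aggregation reduce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BinaryOperator;

final class BatchedReduceSketch<T> {

    private final int batchSize;
    private final BinaryOperator<T> reducer;
    private final List<T> buffer = new ArrayList<>();
    private int intermediateReduces = 0;

    BatchedReduceSketch(int batchSize, BinaryOperator<T> reducer) {
        if (batchSize < 2) {
            throw new IllegalArgumentException("batchSize must be >= 2");
        }
        this.batchSize = batchSize;
        this.reducer = reducer;
    }

    /** Buffers one shard result; once batchSize entries are buffered, folds them into one. */
    synchronized void consume(T shardResult) {
        buffer.add(shardResult);
        if (buffer.size() == batchSize) {
            T partial = buffer.stream().reduce(reducer).get();
            buffer.clear(); // the individual results are no longer referenced here
            buffer.add(partial);
            intermediateReduces++;
        }
    }

    /** Final reduce over whatever is still buffered. */
    synchronized T finish() {
        return buffer.stream().reduce(reducer).orElseThrow(() -> new IllegalStateException("no results"));
    }

    synchronized int intermediateReduces() {
        return intermediateReduces;
    }
}
```

At most `batchSize` entries are held at any time, so memory use is bounded by the batch size rather than by the number of shards.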
2017-02-21 13:02:48 +01:00
Tanguy Leroux 39ed76c58b Add parsing method to bulk response (#23234)
This commit adds the `fromXContent()` parsing method to BulkResponse.
2017-02-21 10:49:40 +01:00
Tanguy Leroux c88eb00b83 Add javadoc for DocWriteResponse.Builders (#23267) 2017-02-21 10:19:01 +01:00
Martin Scholz 24bf18b610 Upgrade HDRHistogram to 2.1.9 (#23254) 2017-02-21 08:50:26 +01:00
Martin Scholz 3e292d5245 Migrate TermsQuery to TermInSetQuery (#23229) 2017-02-21 08:49:43 +01:00
Jim Ferenczi 1ff5b318be Fix for IpRangeAggregatorTests#testRanges
Handle null from/to ranges.

Closes #23272
2017-02-20 21:16:14 +01:00
Jason Tedor 4c2bd5feab Introduce sequence-number-aware translog
Today, the relationship between Lucene and the translog is rather
simple: every document not in Lucene is guaranteed to be in the
translog. We need a stronger guarantee from the translog though, namely
that it can replay all operations after a certain sequence number. For
this to be possible, the translog has to be made sequence-number aware. As
a first step, we introduce the min and max sequence numbers into the
translog so that each generation knows the possible range of operations
contained in the generation. This will enable future work to keep around
all generations containing operations after a certain sequence number
(e.g., the global checkpoint).
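
As a rough sketch of why a per-generation range helps (hypothetical names throughout, not the translog's real classes): once each generation records its min and max sequence numbers, deciding which generations must be retained for a given checkpoint is a simple comparison.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a translog generation that tracks the range of
// sequence numbers it contains.
class TranslogGenerationSketch {
    static final long NO_OPS = -1; // assumed sentinel for "no operations yet"

    final long generation;
    long minSeqNo = NO_OPS;
    long maxSeqNo = NO_OPS;

    TranslogGenerationSketch(long generation) {
        this.generation = generation;
    }

    // Record an operation; operations need not arrive in sequence-number order.
    void add(long seqNo) {
        minSeqNo = (minSeqNo == NO_OPS) ? seqNo : Math.min(minSeqNo, seqNo);
        maxSeqNo = Math.max(maxSeqNo, seqNo);
    }

    // True if this generation could hold an operation above the given seq no.
    boolean mayContainOpsAfter(long seqNo) {
        return maxSeqNo > seqNo;
    }

    // Generations that must be kept to replay everything after the checkpoint.
    static List<Long> generationsToKeep(List<TranslogGenerationSketch> generations, long checkpoint) {
        List<Long> keep = new ArrayList<>();
        for (TranslogGenerationSketch gen : generations) {
            if (gen.mayContainOpsAfter(checkpoint)) {
                keep.add(gen.generation);
            }
        }
        return keep;
    }
}
```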

Relates #22822
2017-02-20 15:05:24 -05:00
Jason Tedor 15f5810774 Mark IP range aggregator test as awaits fix
This test reliably fails with the seed 4AC319F8A6B0329B.
2017-02-20 14:42:16 -05:00
Christoph Büscher ea7deace5d Adding fromXContent to Suggest and Suggestion class (#23226)
A follow up to #23202, this adds parsing from xContent and tests to the four Suggestion implementations
and the top level suggest element to be used later when parsing the entire SearchResponse.
2017-02-20 15:45:10 +01:00
Christoph Büscher ea9d51114c Tests: Add unit test for InternalChildren (#23261)
Relates to #22278
2017-02-20 14:02:56 +01:00
Jim Ferenczi 76d6b872dd Add unit tests for GeoBoundsAggregator/InternalGeoBounds (#23259)
* Add unit tests for GeoBoundsAggregator/InternalGeoBounds

Relates #22278
2017-02-20 12:04:30 +01:00
Jim Ferenczi 69b1463f7c Add unit tests for BinaryRangeAggregator/InternalBinaryRange (#23255)
* Add unit tests for BinaryRangeAggregator/InternalBinaryRange

Relates #22278
2017-02-20 11:55:48 +01:00
Tanguy Leroux 872412f645 [Tests] Cleans up DocWriteResponse parsing tests (#23233)
This commit cleans up some parsing tests added from the High Level Rest Client: IndexResponseTests, DeleteResponseTests, UpdateResponseTests, BulkItemResponseTests.

These tests are now more uniform with the other test-from-to-XContent tests we have: they now shuffle the XContent fields before parsing, the asserting method for parsed objects does not use a Map<String, Object> anymore, and buggy equals/hashCode methods in ShardInfo and ShardInfo.Failure have been removed.
2017-02-20 09:45:33 +01:00
Nik Everett d9c37ce195 Adds unit test for sampler aggregation
Relates to #22278
2017-02-17 16:16:04 -05:00
Nik Everett d1de9574ea Checkstyle: Fix link lengths in sampler aggregation 2017-02-17 15:03:57 -05:00
Jay Modi b234644035 Enforce Content-Type requirement on the rest layer and remove deprecated methods (#23146)
This commit enforces the requirement of Content-Type for the REST layer and removes the deprecated methods in transport
requests and their usages.

While doing this, it turns out that there are many places where *Entity classes are used from the apache http client
libraries and many of these usages did not specify the content type. The methods that do not specify a content type
explicitly have been added to forbidden apis to prevent more of these from entering our code base.

Relates #19388
2017-02-17 14:45:41 -05:00
Adrien Grand 3bd1d46fc7 Add unit tests for terms aggregation objects. (#23149)
Relates #22278
2017-02-17 18:01:40 +01:00
javanna 578853f264 Remove stale comment about setting routing before parent
Order does not matter anymore since we merged #15371
2017-02-17 17:10:53 +01:00
Yuhao Bi 576e698613 Minor fix of _cat output (#23211) (#23213)
One line was missing a trailing "\n"
2017-02-17 10:46:20 +01:00
Jason Tedor 00a8b8799f Fix control group pattern
The file /proc/self/cgroup lists the control groups to which the process
belongs. This file is a colon separated list of three fields:
 1. a hierarchy ID number
 2. a comma-separated list of hierarchies
 3. the pathname of the control group in the hierarchy

The regex pattern for this contains a bug for the second field. It
allows one or two entries in the comma-separated list, but not
more. This commit fixes the pattern to allow one or more entries in the
comma-separated list.
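
The difference can be sketched with two patterns. Both are illustrative reconstructions from the description above, not necessarily the exact expressions in the code base; `CgroupLineSketch` and its members are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CgroupLineSketch {
    // Assumed shape of the bug: the optional group admits at most one
    // additional comma-separated entry in the second field.
    static final Pattern ONE_OR_TWO = Pattern.compile("\\d+:([^:,]+(?:,[^:,]+)?):(/.*)");

    // Repeating the group admits one or more entries.
    static final Pattern ONE_OR_MORE = Pattern.compile("\\d+:([^:,]+(?:,[^:,]+)*):(/.*)");

    // Returns the control group path (third field) or null if the line does not match.
    static String path(Pattern pattern, String line) {
        Matcher matcher = pattern.matcher(line);
        return matcher.matches() ? matcher.group(2) : null;
    }
}
```

A line such as `2:cpu,cpuacct,cpuset:/` has three entries in its second field, so only the second pattern matches it.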

Relates #23219
2017-02-16 15:31:18 -05:00
Christoph Büscher 268d15ec4c Adding fromXContent to Suggestion.Entry and subclasses (#23202)
This adds parsing from xContent to Suggestion.Entry and its subclasses for Terms-, Phrase-
and CompletionSuggestion.Entry.
2017-02-16 17:59:55 +01:00
markharwood 1cd1ff6010 Test fix - faulty assumptions about when exceptions are thrown in relation to number of failing shards. (#23205)
Search exceptions are thrown only when all shards report failure. Fix changes assertion logic to reflect this.

Closes #23203
2017-02-16 13:48:17 +00:00
Jason Tedor 0a5917d182 Fix get HEAD requests
Get HEAD requests incorrectly return a content-length header of 0. This
commit addresses this by removing the special handling for get HEAD
requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.

Relates #23186
2017-02-15 13:07:29 -05:00
Christoph Büscher 458ca09e70 Fix checkstyle issue with modifier order in DocWriteResponse 2017-02-15 17:53:39 +01:00
Tanguy Leroux e8d669f50c Add parsing methods to BulkItemResponse (#22859)
This commit adds a parsing method to the BulkItemResponse class. In order to do that, the way DocWriteResponses are parsed has to be changed: ConstructingObjectParser/ObjectParser is removed in favor of a simpler and more readable way to parse these objects.

DocWriteResponse now provides the parseInnerToXContent() method that can be used by subclasses (IndexResponse, UpdateResponse and DeleteResponse) to parse the current token/field and potentially update a DocWriteResponseBuilder. The DocWriteResponseBuilder is a simple POJO used
to contain parsed values. It can be passed around from one parsing method to another parsing method. For example, this is what is done in IndexResponse: an IndexResponseBuilder is created in IndexResponse.fromXContent(), it gets passed to IndexResponse.parseXContentFields() that
parses fields specific to IndexResponse (like "created") and updates the context, delegating to DocWriteResponse.parseInnerToXContent() the parsing of any other field. Once all XContent is parsed, IndexResponse.fromXContent() uses the method
IndexResponseBuilder.build() to create the new instance of IndexResponse.

This behavior allows parsing code to be reused across the class hierarchy while keeping the current behavior. It also allows other objects like BulkItemResponse to reuse the same parsing code to parse DocWriteResponses.

Finally, IndexResponseTests, UpdateResponseTests and DeleteResponseTests have been updated to introduce some random shuffling of fields before the XContent is parsed in order to ensure that the parsing code does not rely on field order.
2017-02-15 17:33:10 +01:00
Christoph Büscher b963144254 Add xcontent parsing to completion suggestion option (#23071)
This adds parsing from xContent to the CompletionSuggestion.Entry.Option.
The completion suggestion option also inlines the xContent rendering of the
contained SearchHit, so in order to reuse the SearchHit parser this also changes
the way SearchHit is parsed from using a loop-based parser to using a
ConstructingObjectParser that creates an intermediate map representation and
then later uses this output to create either a single SearchHit or use it with
additional fields defined in the parser for the completion suggestion option.
2017-02-15 16:52:17 +01:00
Jim Ferenczi 3c26754f87 Add BWC index for new released version 5.2.1 2017-02-15 11:14:37 +01:00
Jim Ferenczi f1aaa71a7f Create version constants for next bug fix version v5.2.2 2017-02-15 11:13:09 +01:00
Ryan Ernst 048c87d8a5 Improve setting deprecation message (#23156)
This change modifies the deprecation log message emitted when a setting
is found which is deprecated. The new message indicates docs for the
deprecated settings can be found in the breaking changes docs for the
next major version.

closes #22849
2017-02-14 21:33:13 -08:00
Jason Tedor 6ac1cb660b Cleanup RestGetIndicesAction.java
This commit is just a code cleanup of RestGetIndicesAction.java. For
example, we remove an unnecessary class, remove some unnecessary local
variables, and simplify some code flow.

Relates #23129
2017-02-14 16:51:27 -05:00
Jason Tedor 673754b1d5 Fix get source HEAD requests
Get source HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for get
source HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.

Relates #23151
2017-02-14 16:37:22 -05:00
Martijn van Groningen cab43707dc [percolator] Removed old 2.x bwc logic. 2017-02-14 22:17:17 +01:00
Areek Zillur e178dc5493 Add request version asserting during replica operation (#23167) 2017-02-14 15:40:55 -05:00
Simon Willnauer a7a3729596 Add ExpandSearchPhase as a successor for the FetchSearchPhase (#23165)
Now that we have more flexible search phases we should move the rather
hacky integration of the collapse feature as a real search phase that can
be tested and used by itself. This commit adds a new ExpandSearchPhase
including a unittest for the phase. It's integrated into the fetch phase
as an optional successor.
2017-02-14 17:14:17 +01:00
Adrien Grand 8d6a41f671 Nested queries should avoid adding unnecessary filters when possible. (#23079)
When nested objects are present in the mappings, many queries get deoptimized
due to the need to exclude documents that are not in the right space. For
instance, a filter is applied to all queries that prevents them from matching
non-root documents (`+*:* -_type:__*`). Moreover, a filter is applied to all
child queries of `nested` queries in order to make sure that the child query
only matches child documents (`_type:__nested_path`), which is required by
`ToParentBlockJoinQuery` (the Lucene query behind Elasticsearch's `nested`
queries).

These additional filters slow down `nested` queries. In 1.7-, the cost was
somehow amortized by the fact that we cached filters very aggressively. However,
this has proven to be a significant source of slow downs since 2.0 for users
of `nested` mappings and queries, see #20797.

This change makes the filtering a bit smarter. For instance if the query is a
`match_all` query, then we need to exclude nested docs. However, if the query
is `foo: bar` then it may only match root documents since `foo` is a top-level
field, so no additional filtering is required.

Another improvement is to use a `FILTER` clause on all types rather than a
`MUST_NOT` clause on all nested paths when possible since `FILTER` clauses
are more efficient.

Here are some examples of queries and how they get rewritten:

```
"match_all": {}
```

This query gets rewritten to `ConstantScore(+*:* -_type:__*)` on master and
`ConstantScore(_type:AutomatonQuery {\norg.apache.lucene.util.automaton.Automaton@4371da44})`
with this change. The automaton is the complement of `_type:__*` so it matches
the same documents, but is faster since it is now a positive clause. Simplistic
performance testing on a 10M index where each root document has 5 nested
documents on average gave a latency of 420ms on master and 90ms with this change
applied.

```
"term": {
  "foo": {
    "value": "0"
  }
}
```

This query is rewritten to `+foo:0 #(ConstantScore(+*:* -_type:__*))^0.0` on
master and `foo:0` with this change: we do not need to filter nested docs out
since the query cannot match nested docs. While doing performance testing in
the same conditions as above, response times went from 250ms to 50ms.

```
"nested": {
  "path": "nested",
  "query": {
    "term": {
      "nested.foo": {
        "value": "0"
      }
    }
  }
}
```

This query is rewritten to
`+ToParentBlockJoinQuery (+nested.foo:0 #_type:__nested) #(ConstantScore(+*:* -_type:__*))^0.0`
on master and `ToParentBlockJoinQuery (nested.foo:0)` with this change. The
top-level filter (`-_type:__*`) could be removed since `nested` queries only
match documents of the parent space, as well as the child filter
(`#_type:__nested`) since the child query may only match nested docs since the
`nested` object has both `include_in_parent` and `include_in_root` set to
`false`. While doing performance testing in the same conditions as above,
response times went from 850ms to 270ms.
2017-02-14 16:05:19 +01:00
Adrien Grand a969dad43e Integrate IndexOrDocValuesQuery. (#23119)
This gives Lucene the choice to use index/point-based queries or
doc-values-based queries depending on which one is more efficient. This commit
integrates this feature for:
 - long/integer/short/byte/double/float/half_float/scaled_float ranges,
 - date ranges,
 - geo bounding box queries,
 - geo distance queries.
2017-02-14 15:57:12 +01:00
Jun Ohtani 12bbe6e660 Merge pull request #23161 from johtani/support_keyword_to_analyze_api
[Analyze]Support Keyword type in Analyze API
2017-02-14 23:22:32 +09:00
Christoph Büscher abc8cd6c5f Remove unused sourceAsBytes field in SearchHit 2017-02-14 14:08:38 +01:00
Simon Willnauer aef0665ddb Detach SearchPhases from AbstractSearchAsyncAction (#23118)
Today all search phases are inner classes of AbstractSearchAsyncAction or one of its
subclasses. This makes unit testing of these classes practically impossible. This commit
extracts `DfsQueryPhase` and `FetchSearchPhase` out of the code that composes the actual
query execution types and moves most of the fan-out and collect code into an `InitialSearchPhase`
class that can be used to build initial search phases (phases that retry on shards). This will
make modification to these classes simpler and allows to easily compose or add new search phases
down the road if additional roundtrips are required.
2017-02-14 12:34:25 +01:00
Jun Ohtani 34ebb88650 [Analyze]Support Keyword type in Analyze API
Add comment and clarify
2017-02-14 17:56:36 +09:00
Jun Ohtani 4d823d69f4 [Analyze]Support Keyword type in Analyze API 2017-02-14 16:41:16 +09:00
Jason Tedor 5343b87502 Handle bad HTTP requests
When Netty decodes a bad HTTP request, it marks the decoder result on
the HTTP request as a failure, and reroutes the request to GET
/bad-request. This leads to puzzling responses when a bad request
is sent to Elasticsearch (if an index named "bad-request" does not exist
then it produces an index not found exception and otherwise responds
with the index settings for the index named "bad-request"). This commit
addresses this by inspecting the decoder result on the HTTP request and
dispatching the request to a bad request handler preserving the initial
cause of the bad request and providing an error message to the client.

Relates #23153
2017-02-13 17:39:25 -05:00
Jay Modi 61e383813d Make the version of the remote node accessible on a transport channel (#23019)
This commit adds a new method to the TransportChannel that provides access to the version of the
remote node that the response is being sent on and that the request came from. This is helpful
for serialization of data attached as headers.
2017-02-13 15:15:57 -05:00
Lee Hinman b42d47770c Fix total disk bytes returning negative value (#23093)
* Fix total disk bytes returning negative value

This adds a workaround for JDK-8162520 -
https://bugs.openjdk.java.net/browse/JDK-8162520

Some filesystems can be so large that they return a negative value for their
free/used/available disk bytes due to being larger than `Long.MAX_VALUE`.

This adds protection for our `FsProbe` implementation and adds a test that it
does the right thing.
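
A minimal sketch of this kind of guard, assuming the workaround simply clamps an overflowed (negative) reading to `Long.MAX_VALUE`; the class and method names are illustrative and may not match the real `FsProbe` code.

```java
class DiskBytesSketch {
    // A negative byte count can only come from overflow (see JDK-8162520),
    // so report the largest representable value instead.
    static long adjustForHugeFilesystems(long bytes) {
        return bytes < 0 ? Long.MAX_VALUE : bytes;
    }
}
```

Ordinary non-negative readings pass through unchanged, so the guard is safe to apply to every free/used/available value.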
2017-02-13 11:20:15 -07:00
jaymode d8d03f45c2
Fix communication with 5.3.0 nodes
This commit fixes communication with 5.3.0 nodes to send XContentType to these nodes since #22691 was backported to the
5.3 branch.
2017-02-13 13:15:51 -05:00
Jason Tedor 9dff5e2af7 Properly encode location header
Today when trying to encode the location header to ASCII, we rely on the
Java URI API. This API requires a proper URI which blows up whenever the
URI contains, for example, a space (which can happen if the type, ID, or
routing contain a space). This commit addresses this issue by properly
encoding the URI. Additionally, we remove the need to create a URI
simplifying the code flow.
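
The failure mode and one possible remedy can be sketched as follows. This is an illustration only: the commit's actual encoding routine is not shown here, and `LocationEncodingSketch` with its methods is hypothetical. It percent-encodes each path segment with `URLEncoder` and rewrites `+` to `%20`, since `URLEncoder` targets form encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

class LocationEncodingSketch {
    // Percent-encode a single path segment. A literal '+' in the input is
    // emitted as %2B by URLEncoder, so any remaining '+' stands for a space.
    static String encodeSegment(String segment) {
        try {
            return URLEncoder.encode(segment, "UTF-8").replace("+", "%20");
        } catch (UnsupportedEncodingException e) {
            throw new AssertionError("UTF-8 is always supported", e);
        }
    }

    // Build the header value directly from encoded segments, without a URI object.
    static String location(String index, String type, String id) {
        return "/" + encodeSegment(index) + "/" + encodeSegment(type) + "/" + encodeSegment(id);
    }

    // Shows why the raw form is a problem: java.net.URI rejects unencoded spaces.
    static boolean isValidUri(String candidate) {
        try {
            new URI(candidate);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }
}
```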

Relates #23133
2017-02-13 09:34:52 -05:00
Tanguy Leroux de94c1253a Expose WriteRequest.RefreshPolicy string representation (#23106)
This commit changes the RefreshPolicy enum so that string representation are exposed. This will help the high level rest client to simply use  refreshPolicy.getValue() to get the corresponding parameter value of a given refresh policy.
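
A sketch of what such an enum can look like. The constant names and string values below mirror the values accepted by the `refresh` request parameter and should be read as assumptions for illustration, as should the `parse` helper.

```java
// Hypothetical stand-in for WriteRequest.RefreshPolicy.
enum RefreshPolicySketch {
    NONE("false"),
    IMMEDIATE("true"),
    WAIT_UNTIL("wait_for");

    private final String value;

    RefreshPolicySketch(String value) {
        this.value = value;
    }

    // String form suitable for use as a request parameter value.
    String getValue() {
        return value;
    }

    // Reverse lookup from the parameter value.
    static RefreshPolicySketch parse(String value) {
        for (RefreshPolicySketch policy : values()) {
            if (policy.value.equals(value)) {
                return policy;
            }
        }
        throw new IllegalArgumentException("Unknown refresh policy value [" + value + "]");
    }
}
```

A REST client can then pass `policy.getValue()` straight into its request parameters instead of switching over the enum constants.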
2017-02-13 10:49:46 +01:00
Boaz Leskes 29ea3059fc Allow a cluster state applier to register an observer and wait for a better state (#23132)
#21817 introduced the notion of a cluster state applier and banned appliers from sampling the cluster state directly (as it is not applied yet). Testing has exposed one exceptional use case: if an applier wants to spawn off a follow-up, it may need to wait for a specific new cluster state (for example, the shard started action, called by the IndicesClusterStateService, may run into trouble connecting to the master and wait for a new master to be elected). This requires creating an observer which, in turn, samples the cluster state.

An example failure can be seen at https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+periodic/1701/console

This commit allows creating an observer from a cluster state applier. The observer is adapted to exclude any potential old cluster state in its logic.
2017-02-12 14:58:22 +02:00
Jason Tedor 0f21ed5b70 Fix template HEAD requests
Template HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for
template HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.

Relates #23130
2017-02-11 18:30:16 -05:00
Lee Hinman 13446937a5 Remove action.allow_id_generation setting (#23120)
This was an undocumented and unsettable setting that allowed id generation.

Resolves #23088
2017-02-10 14:04:40 -07:00
Jim Ferenczi 1ba73d9797 Fix GraphQuery expectation after Lucene upgrade to 6.5 (#23117)
GraphQueries are now generated as simple clauses in BooleanQuery. So for instance a multi terms synonym will generate
 a GraphQuery but only for the side paths, the other part of the query will not be impacted. This means that we cannot apply
 `minimum_should_match` or `cutoff_frequency` on GraphQuery anymore (only ES 5.3 does that because we generate all possible paths if a query has at least one multi terms synonym).
Starting in 5.4 multi terms synonym will now be treated as a single term when `minimum_should_match` is computed and will be ignored when `cutoff_frequency` is set.
Fixes #23102
2017-02-10 18:20:00 +01:00
sabi0 09c7c5c82f Limit IndexRequest toString() length (#22832)
Limits the length of `IndexRequest#toString` which also limits the size of the task description generated for `IndexRequest`s. If the document being written is larger than 2kb we skip logging the _source entirely. This is because truncating the source is tricky and it isn't worth it.
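
A minimal sketch of the rule described above, assuming a 2048-byte threshold and a placeholder message; the constant name, the placeholder wording and the `describe` helper are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class SourceDescriptionSketch {
    static final int MAX_SOURCE_LENGTH_IN_TOSTRING = 2048; // assumed value for "2kb"

    // Render the request, omitting the source entirely when it is too large
    // rather than trying to truncate it.
    static String describe(String index, String type, String id, byte[] source) {
        String shownSource;
        if (source.length > MAX_SOURCE_LENGTH_IN_TOSTRING) {
            shownSource = "n/a, actual length: [" + source.length + " bytes], max length: "
                + MAX_SOURCE_LENGTH_IN_TOSTRING + " bytes";
        } else {
            shownSource = new String(source, StandardCharsets.UTF_8);
        }
        return "index {[" + index + "][" + type + "][" + id + "], source[" + shownSource + "]}";
    }
}
```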
2017-02-10 10:42:08 -05:00
Sebastian 976da87e8f Fix some Javadoc typos (#23111) 2017-02-10 15:53:30 +01:00
Jason Tedor a6158398dd Fix index HEAD requests
Index HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for index
HEAD requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.

Relates #23112
2017-02-10 09:44:01 -05:00
Jason Tedor 7ac44656df Fix alias HEAD requests
Alias HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for alias
HEAD requests, and just relying on the general mechanism that exists for
handling HEAD requests in the REST layer.

Relates #23094
2017-02-10 09:19:35 -05:00
Adrien Grand 709cc9ba65 Upgrade to lucene-6.5.0-snapshot-f919485. (#23087) 2017-02-10 15:08:47 +01:00
Jay Modi 7018b6ac6f Add BulkProcessor methods with XContentType parameter (#23078)
This commit adds methods to the BulkProcessor that accept bytes and a XContentType to avoid content type detection. The
methods that do not accept XContentType with bytes have been deprecated by this commit.

Relates #22691
2017-02-10 08:59:37 -05:00
Jason Tedor 4f2b4724be Cleanup RestGetAliasesAction.java
This commit is just a code cleanup of RestGetAliasesAction.java. For
example, we remove an unnecessary class, simplify a convenience method,
and simplify some code flow.

Relates #23095
2017-02-10 08:37:05 -05:00
Tanguy Leroux e2e5937455 Use `typed_keys` parameter to prefix suggester names by type in search responses (#23080)
This pull request reuses the typed_keys parameter added in #22965, but this time it applies it to suggesters. When set to true, the suggester names in the search response will be prefixed with a prefix that reflects their type.
2017-02-10 10:53:38 +01:00
Boaz Leskes e0c8a6a3eb Relax WaitActiveShardCountIT check of exception messages
So it wouldn't depend on BulkShardRequest.toString()
2017-02-09 23:14:09 +02:00
Areek Zillur 990918a655 fix failing tests for BulkShardRequest.tostring 2017-02-09 15:34:22 -05:00
Boaz Leskes 033defee9a fix BulkShardRequestTests after changes to BulkShardRequest.toString 2017-02-09 21:05:21 +02:00
Boaz Leskes cd1cb41603 Move EvilPeerRecoveryIT to a unit test in RecoveryDuringReplicationTests (#22900)
EvilPeerRecoveryIT checks the scenario where recovery is happening while there are ongoing indexing operations that have already been assigned a seq#. This is fairly hard to achieve and the test jumps through a couple of hoops via the plugin infra to achieve that. This PR extends the unit test infra to allow for those hoops to happen in unit tests. This allows the test to be moved to RecoveryDuringReplicationTests.

Relates to #22484
2017-02-09 20:14:03 +02:00
Jim Ferenczi 94087b3274 Removes ExpandCollapseSearchResponseListener, search response listeners and blocking calls
This change removes the SearchResponseListener that was used by the ExpandCollapseSearchResponseListener to expand collapsed hits.
The removal of SearchResponseListener is not a breaking change because it was never released.
This change also replaces the blocking call in ExpandCollapseSearchResponseListener with a single asynchronous multi search request. The parallelism of the expand request can be set via CollapseBuilder#max_concurrent_group_searches

Closes #23048
2017-02-09 18:06:10 +01:00
Boaz Leskes 33915aefd8 Improve BulkShardRequest.toString when it has only 1 internal request
Now that we use bulk for single item indexing, this is often the case. Having an indicator of the id of the indexed document helps debugging.

It now looks like this `BulkShardRequest to [[test][0]] containing [index {[test][type][AVojzy9ZxfWASZ-ysmN7], source[{"auto":true}]}]`
2017-02-09 18:59:49 +02:00
Luca Cavanna 90ea778c17 Cluster allocation explain to never return empty response body (#23054)
Empty response bodies should only be sent for HEAD requests, otherwise we should always send back info about the exception that was thrown. Removed some manual exception handling in the REST action that should be rather bubbled up and handled by our rest action infra like every other rest action does.
2017-02-09 17:46:39 +01:00
Luca Cavanna 9f60924ed5 Remove redundant reads of human flag (#23074)
The human flag is centrally handled in RestChannel, no need to have Rest actions manually read it and set it to the builder
2017-02-09 14:58:01 +01:00
Christoph Büscher b85fa54ee7 Tests: Renaming InternalSearchHitsTests to SearchHitsTests
The class under test changed its name from InternalSearchHit(s) to just
SearchHit(s), renaming the tests accordingly.
2017-02-09 14:17:21 +01:00
Tanguy Leroux 3553522328 Add parameter to prefix aggs name with type in search responses (#22965)
This pull request adds a new parameter to the REST Search API named `typed_keys`. When set to true, the aggregation names in the search response will be prefixed with a prefix that reflects the internal type of the aggregation.

Here is a simple example:
```
GET /_search?typed_keys
{
    "aggs": {
        "tweets_per_user": {
            "terms": {
                "field": "user"
            }
        }
    },
    "size": 0
}
```

And the response:

```
{
    "aggs": {
        "sterms:tweets_per_user": {
            ...
        }
    }
}
```

This parameter is intended to make life easier for REST clients that could parse back the prefix and could detect the type of the aggregation to parse. It could also be implemented for suggesters.
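
On the client side, recovering the type could look roughly like this. The sketch assumes the `:` delimiter shown in the example response above and takes it as a parameter in case the delimiter differs; `TypedKeySketch` and `split` are hypothetical names.

```java
class TypedKeySketch {
    // Splits "type<delimiter>name" at the first delimiter.
    // Returns {type, name}; type is null when the key carries no prefix.
    static String[] split(String key, char delimiter) {
        int position = key.indexOf(delimiter);
        if (position < 0) {
            return new String[] { null, key };
        }
        return new String[] { key.substring(0, position), key.substring(position + 1) };
    }
}
```

A client would then pick the parser for the aggregation based on the first element, for example a string terms parser for `sterms`.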
2017-02-09 11:19:04 +01:00
Simon Willnauer e02d5563f4 Harden ops counting in AbstractSearchAsyncAction (#23045)
Today we account for too many responses with an `IllegalStateException` in
`AbstractSearchAsyncAction`. While this is something that should never happen,
we should rather assert that we always have less than or equal to the number of
expected ops when waiting for responses.
2017-02-09 09:30:13 +01:00
Luca Cavanna b5f5356c4a Remove getDefaultScriptingLanguage from QueryParseContext (#23043)
The method is not needed anymore, was needed only when we supported setting a legacy default lang, which was removed with #21607

Relates to #21607
2017-02-09 09:03:26 +01:00
Nik Everett f7071325c4 Fix generics on LeafDocLookup (#23060)
All the warnings were upsetting me. This doesn't change behavior.
2017-02-08 18:59:24 -05:00
Christoph Büscher e09f3ecbb3 Add xcontent parsing to suggestion options (#23018)
This adds parsing from xContent to Suggestion.Entry.Option and
TermSuggestion.Entry.Option.
2017-02-08 19:03:12 +01:00
Jay Modi 7f3769c745 Remove ldjson support and document ndjson for bulk/msearch (#23049)
This commit removes support for the `application/x-ldjson` Content-Type header as this was only used in the first draft
of the spec and had very little uptake. Additionally, the docs for bulk and msearch have been updated to specifically
call out ndjson and mention that the newline character may be preceded by a carriage return.

Finally, the bulk request handling of the carriage return has been improved to remove this character from the source.
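
The carriage return handling can be sketched as follows. This is a simplified illustration over a `String` body with a hypothetical `NdjsonLinesSketch` class, not the bulk request parser itself.

```java
import java.util.ArrayList;
import java.util.List;

class NdjsonLinesSketch {
    // Split a newline delimited body into lines, dropping a carriage return
    // that directly precedes the newline so it does not end up in the source.
    static List<String> splitLines(String body) {
        List<String> lines = new ArrayList<>();
        int start = 0;
        while (start < body.length()) {
            int end = body.indexOf('\n', start);
            if (end < 0) {
                end = body.length(); // last line without a trailing newline
            }
            int lineEnd = end;
            if (lineEnd > start && body.charAt(lineEnd - 1) == '\r') {
                lineEnd--;
            }
            lines.add(body.substring(start, lineEnd));
            start = end + 1;
        }
        return lines;
    }
}
```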

Closes #23025
2017-02-08 11:55:50 -05:00
Simon Willnauer df932ef68f Fix line len 2017-02-08 16:41:41 +01:00
Simon Willnauer d45761e488 Fork off a search thread before sending back fetched responses
This is just a temporary fix until #23048 is fixed. FieldCollapsing
is executing blocking calls on a network thread which causes potential deadlocks
and trips assertions.

Relates to #23048
2017-02-08 15:27:08 +01:00
Simon Willnauer ecb01c15b9 Fold InternalSearchHits and friends into their interfaces (#23042)
We have a bunch of interfaces that have only a single implementation
for 6 years now. These interfaces are pretty useless from a SW development
perspective and only add unnecessary abstractions. They also require
lots of casting in many places where we expect that there is only one
concrete implementation. This change removes the interfaces, makes
all of the classes final and removes the duplicate `foo` `getFoo` accessors
in favor of `getFoo` from these classes.
2017-02-08 14:40:08 +01:00
Simon Willnauer 2d6d871f5c Raise a phase failure if fetch phase gets rejected 2017-02-08 12:52:18 +01:00
Boaz Leskes 0161edae10 MasterFaultDetection can start after the initial cluster state has been processed and the NodeConnectionService connect to the new master (#23037)
After the first cluster state from a new master is processed, NodeConnectionService guarantees we connect to the new master. This removes the need to explicitly connect to the master in the MasterFaultDetection code making it simpler and bypasses the assertion triggered due to the blocking operation on the cluster state thread.

Relates to #22828
2017-02-08 13:49:06 +02:00
Simon Willnauer a8b376670c Separate reduce (aggs, suggest and profile) from merging fetched hits (#23017)
Today we carry on all search results including aggs, suggest and profile results
until we have successfully fetched all hits for the search request. This can potentially
hold on to a large amount of memory if there are heavy aggregations involved. With
this change aggs and profiles are entirely consumed and released for GC before the fetch
phase is executing. This is a first step towards reducing results on-the-fly if the number
of non-empty responses is large.
2017-02-08 10:11:51 +01:00
Yannick Welsch 9154686623 Remove legacy primary shard allocation mode based on versions (#23016)
Elasticsearch v5.0.0 uses allocation IDs to safely allocate primary shards whereas prior versions of ES used a version-based mode instead. Elasticsearch v5 still has support for version-based primary shard allocation as it needs to be able to load 2.x shards. ES v6 can drop the legacy support.
2017-02-08 10:00:55 +01:00
Boaz Leskes a512ab32fb Increase time out tolerance in NoMasterNodeIT.
see https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-intake/746/console
2017-02-08 08:50:26 +02:00
Lee Hinman b3c27a7fdd Disallow include_in_all for 6.0+ indices
Since `_all` is now deprecated and cannot be set for new indices, we should also
disallow any field that has the `include_in_all` parameter set.

Resolves #22923
2017-02-07 19:31:51 -07:00
Tim Brooks fcc568fd8d Add methods requiring connect to forbidden apis (#22964)
This is related to #22116. This commit adds calls that require
SocketPermission connect to forbidden APIs.

The following calls are now forbidden:

- java.net.URL#openStream()
- java.net.URLConnection#connect()
- java.net.URLConnection#getInputStream()
- java.net.Socket#connect(java.net.SocketAddress)
- java.net.Socket#connect(java.net.SocketAddress, int)
- java.nio.channels.SocketChannel#open(java.net.SocketAddress)
- java.nio.channels.SocketChannel#connect(java.net.SocketAddress)
2017-02-07 14:41:50 -06:00
Boaz Leskes ba06c14a97 TransportService.connectToNode should validate remote node ID (#22828)
#22194 gave us the ability to open low level temporary connections to remote nodes based on their address. With this use case out of the way, actual full blown connections should validate the node on the other side, making sure we speak to who we think we speak to. This helps in cases where multiple nodes are started on the same host and a quick node restart causes them to swap addresses, which in turn can cause confusion down the road.
2017-02-07 22:11:32 +02:00
Tim Brooks adc1184dd0 Fix broken test in FileSystemUtilsTests
Commit ee84ce09d7 changed an exception
message without changing the corresponding test. This commit fixes the
related test.
2017-02-07 12:50:07 -06:00
Tim Brooks ee84ce09d7 Allow openFileURLStream(URL) to open jars
This is related to #23020. There are some cases where this method
might be called with a URL to a file inside a jar. This commit allows
this method to read URLs with a protocol of 'jar:/'.
2017-02-07 11:42:27 -06:00
Ryan Ernst 470ad1ae4a Settings: Add secure settings validation on startup (#22894)
Secure settings from the elasticsearch keystore were not yet validated.
This change improves support in Settings so that secure settings more
seamlessly blend in with normal settings, allowing the existing settings
validation to work. Note that the setting names are still not validated
(yet) when using the elasticsearch-keystore tool.
2017-02-07 09:34:41 -08:00
Tim Brooks 27b7d9bd8d Add FileSystemUtil method to read 'file:/' URLs (#23020)
As part of #22116 we are going to forbid usage of api
java.net.URL#openStream(). However, in a number of places across the
code base we use this method to read files from the local filesystem. This commit
introduces a helper method openFileURLStream(URL url) to read files
from URLs. It validates the URL to ensure that only file:/
URLs are read.

Additionally, this commit removes the unneeded method
FileSystemUtil.newBufferedReader(URL, Charset). This method used the
openStream() method which will soon be forbidden. Instead we use
Files.newBufferedReader(Path, Charset).
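
A rough sketch of such a helper, assuming it rejects every protocol other than `file` and opens the stream through `java.nio.file.Files`; the class name and exception message are illustrative, not the committed implementation.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Paths;

class FileUrlSketch {
    // Opens a stream for a file: URL without going through URL#openStream().
    static InputStream openFileURLStream(URL url) throws IOException {
        String protocol = url.getProtocol();
        if (!"file".equals(protocol)) {
            throw new IllegalArgumentException("Invalid protocol [" + protocol + "], must be [file]");
        }
        try {
            return Files.newInputStream(Paths.get(url.toURI()));
        } catch (URISyntaxException e) {
            throw new IOException("URL cannot be converted to a path: " + url, e);
        }
    }
}
```

Because the protocol check happens before any I/O, a non-file URL fails fast and never triggers a network connection.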
2017-02-07 10:24:22 -06:00
Jay Modi c898e8ab83 Add support for newline delimited JSON Content-Type (#22947)
This commit adds support for the newline delimited JSON Content-Type, which is how
the bulk, multi-search, and multi-search template APIs expect data to be formatted. The
`elasticsearch-js` client has also been using this content type for these types of requests.

Closes #22943
2017-02-07 09:20:06 -05:00
Simon Willnauer dc659feeb4 Add a setting to disable remote cluster connections on a node (#23005)
Today either all nodes in the cluster connect to remote clusters or only nodes
that have remote clusters configured in their node config. To allow global remote
cluster configuration but restrict connections to a set of nodes in the cluster
this change adds a new setting `search.remote.connect` (defaults to `true`) to allow
to disable remote cluster connections on a per node basis.
2017-02-07 09:59:24 +01:00
Nik Everett 0d6e622242 Make dates be ReadableDateTimes in scripts (#22948)
Instead of longs. If you want millis since epoch you can call doc.date_field.value.millis.

Relates to #22875
2017-02-06 16:44:56 -05:00
Nicholas Knize 1c9fdfd1b3 Remove GeoPointFieldMapper abstraction
In order to support the evolving GeoPoint encodings in Lucene 5 and 6, ES 2.x and 5.x implement an abstraction layer to the GeoPointFieldMapper classes. As of 5.x the geo_point field mapper settled on using Lucene's more performant LatLonPoint field type and deprecated all other encodings. In 6.0 all encodings except LatLonPoint have been removed, rendering this abstraction layer useless. This commit removes the abstraction layer and renames the LatLonPointFieldMapper back to GeoPointFieldMapper to maintain consistency with ES field naming.
2017-02-06 14:17:21 -06:00
Christoph Büscher 033f03109f [Tests] Adding tests for AvgAggregator and InternalAvg (#23000) 2017-02-06 20:05:40 +01:00
Ali Beyad 42a9f95fde This commit changes the exception type thrown when trying to (#22921)
create a snapshot with a name that already exists in the repository.
Instead of throwing a SnapshotCreateException, which results in a
generic 500 status code, a duplicate snapshot name will throw a
InvalidSnapshotNameException, which will result in a 400 status code
(bad request).
2017-02-06 11:39:59 -06:00
Adrien Grand eb26e1a292 Add unit tests to histogram aggregations. (#22961) 2017-02-06 18:18:21 +01:00
Simon Willnauer f09c4e1cdb Expose `search.highlight.term_vector_multi_value` as a node level setting (#22999)
This setting was missed in the great settings refactoring and should be exposed
via node level settings.
2017-02-06 18:17:34 +01:00
Simon Willnauer 7513c6e4eb Remove QUERY_AND_FETCH search type (#22996)
`QUERY_AND_FETCH` has been treated as an internal optimization for 2 major
versions. This commit removes the search type and its implementation details and
folds the optimization in the case of a single shard into the search controller such
that every search with a single shard (non DFS) will receive this optimization.
2017-02-06 17:10:03 +01:00
Boaz Leskes 5e7d22357f Connect to new nodes concurrently (#22984)
When a node receives a new cluster state from the master, it opens up connections to any new node in the cluster state. That has always been done serially on the cluster state thread but it has been a long standing TODO to do this concurrently, which is done by this PR.

This is a spin-off of #22828, where an extra handshake is done whenever connecting to a node, which may slow down connecting. Also, the handshake is done in a blocking fashion, which triggers assertions w.r.t. blocking requests on the cluster state thread. Instead of adding an exception, I opted to implement concurrent connections, which both sidesteps the assertion and compensates for the extra handshake.
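The general idea of connecting to new nodes concurrently and waiting for all attempts to finish can be sketched like this. It is only an illustration under stated assumptions: `ConcurrentConnector`, `connectAll`, and the `connect` callback are hypothetical names and do not reflect the actual Elasticsearch classes touched by this PR.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

// Hypothetical sketch: connect to every node on a pool, then wait for all attempts.
public final class ConcurrentConnector {
    private ConcurrentConnector() {}

    /** Returns the nodes whose connection attempt failed. */
    public static Set<String> connectAll(List<String> nodes, Consumer<String> connect, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Set<String> failed = ConcurrentHashMap.newKeySet();
        CountDownLatch latch = new CountDownLatch(nodes.size());
        try {
            for (String node : nodes) {
                pool.execute(() -> {
                    try {
                        connect.accept(node); // may block, e.g. on a handshake
                    } catch (RuntimeException e) {
                        failed.add(node);
                    } finally {
                        latch.countDown();
                    }
                });
            }
            // The caller still waits, but for the slowest node rather than the sum of all.
            latch.await();
        } finally {
            pool.shutdown();
        }
        return failed;
    }
}
```

Because the blocking work runs on pool threads, the calling thread only waits on the latch, which is how a concurrent approach can compensate for a slower per-connection handshake.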
2017-02-06 16:32:41 +01:00
Martijn van Groningen e4663d6263 added comment 2017-02-06 15:16:16 +01:00
Martijn van Groningen c8d470f190 Change `org.elasticsearch.bootstrap.JNAKernel32Library$SizeT` constructor's modifier to public.
Otherwise `NativeMappedConverter` can't construct this class.

Closes #22991
2017-02-06 15:16:16 +01:00
Christoph Büscher d02170b277 Add parsing from xContent to MainResponse (#22934)
Add parsing from xContent to MainResponse
2017-02-06 12:30:42 +01:00
Yannick Welsch 6f6596cfb5 Revert "Reduce log-level of IndexPrimaryRelocationIT to hunt Heisenbug"
This reverts commit d0fa6a9bd8.
2017-02-06 11:40:39 +01:00
Adrien Grand 76f779486b 5.2.1 is now on Lucene 6.4.1 too. 2017-02-06 10:02:31 +01:00
Adrien Grand c8496fc4f4 Upgrade to Lucene 6.4.1. (#22978) 2017-02-06 09:28:43 +01:00
Martijn van Groningen 9201ee82f6 [TEST] Added unit tests for sum aggs.
Relates to #22278
2017-02-06 08:32:10 +01:00
Lee Hinman 39e7c30912 Change certain replica failures not to fail the replica shard
This changes the way that replica failures are handled such that not all
failures will cause the replica shard to be failed or marked as stale.

In some cases such as refresh operations, or global checkpoint syncs, it is
"okay" for the operation to fail without the shard being failed (because no data
is out of sync). In these cases, instead of failing the shard we should simply
fail the operation, and, in the event it is a user-facing operation, return a
5xx response code including the shard-specific failures.

This was accomplished by having two forms of the `Replicas` proxy, one that is
for non-write operations that does not fail the shard, and one that is for write
operations that will fail the shard when an operation fails.

Relates to #10708
2017-02-03 14:39:46 -07:00
Nik Everett 70e3cce904 Fix name of `enable_position_increments` (#22895)
It was accidentally renamed `enabled_position_increment` in the cleanups
for 5.0. This adds `enable_position_increment` as a deprecated alias
so it will continue to work.
2017-02-03 16:28:27 -05:00
Nicholas Knize b1a6b227e1 Remove deprecated geo query parameters, and GeoPointDistanceRangeQuery
This commit removes the following queries and parameters (which were deprecated in 5.0):

* GeoPointDistanceRangeQuery
* coerce, and ignore_malformed for GeoBoundingBoxQuery, GeoDistanceQuery, GeoPolygonQuery, and GeoDistanceSort
2017-02-03 10:08:00 -06:00
Tim Brooks f70188ac58 Remove connect SocketPermissions from core (#22797)
This is related to #22116. Core no longer needs `SocketPermission`
`connect`.

This permission is relegated to these modules/plugins:
- transport-netty4 module
- reindex module
- repository-url module
- discovery-azure-classic plugin
- discovery-ec2 plugin
- discovery-gce plugin
- repository-azure plugin
- repository-gcs plugin
- repository-hdfs plugin
- repository-s3 plugin

And for tests:
- mocksocket jar
- rest client
- httpcore-nio jar
- httpasyncclient jar
2017-02-03 09:39:56 -06:00
Jason Tedor 9a0b216c36 Upgrade checkstyle to version 7.5
This commit upgrades the checkstyle configuration from version 5.9 to
version 7.5, the latest version as of today. The main enhancement
obtained via this upgrade is better detection of redundant modifiers.

Relates #22960
2017-02-03 09:46:44 -05:00
Jason Tedor 01871e4def Fix compilation in RecoverySourceHandlerTests
This error arose after the signature of a method was changed.
2017-02-03 08:48:29 -05:00
Jason Tedor 6e9940283b Avoid losing ops in file-based recovery
When a primary is relocated from an old node to a new node, it can have
ops in its translog that do not have a sequence number assigned. When a
file-based recovery is started, this can lead to skipping these ops when
replaying the translog due to a bug in the recovery logic. This commit
addresses this bug and adds a test in the BWC tests.

Relates #22945
2017-02-03 08:11:57 -05:00
Jim Ferenczi 4876448e39 Consilify get-field-mapping docs (#22936)
This change also removes the reference to the difference between full name and index name.
They are always the same since 2.x and `name` does not refer anymore to `author.name` automatically.
A simple pattern must be used instead.
Remove redundant code that checks the field name twice.
2017-02-03 10:04:31 +01:00
Yannick Welsch d0fa6a9bd8 Reduce log-level of IndexPrimaryRelocationIT to hunt Heisenbug 2017-02-03 09:46:07 +01:00
Chris Buonocore 365d33efe3 Handle missing plugin name in remove command
Today if a user invokes the remove plugin command without specifying the
name of a plugin to remove, we arrive at a null pointer exception. This
commit adds logic to cleanly handle this situation and provide clear
feedback to the user.

Relates #22930
2017-02-02 19:39:56 -05:00
Nicholas Knize 58c34f0da9 Fix NPE in RangeFieldMapper.doXContentBody
RangeFieldMapper.doXContentBody should only serialize format and locale when type is set to 'date_range'.

closes #22925
2017-02-02 16:52:33 -06:00
Nik Everett 71b2655bb3 Fix outdated example in javadoc
The code changed and the example in the javadoc didn't.

Closes #22932
2017-02-02 14:28:48 -05:00
Areek Zillur ba8ad397a1 Use bulk action internally for update action (#22915)
Currently, update action internally uses deprecated index and delete
transport actions. As of #21964, these transport actions were deprecated
in favour of using single item bulk request. In this commit, update action
uses single item bulk action.
2017-02-02 14:21:53 -05:00
Jay Modi 7520a107be Optionally require a valid content type for all rest requests with content (#22691)
This change adds a strict mode for xcontent parsing on the rest layer. The strict mode will be off by default for 5.x and in a separate commit will be enabled by default for 6.0. The strict mode, which can be enabled by setting `http.content_type.required: true` in 5.x, will require that all incoming rest requests have a valid and supported content type header before the request is dispatched. In the non-strict mode, the Content-Type header will be inspected and if it is not present or not valid, we will continue with auto detection of content like we have done previously.

The content type header is parsed to the matching XContentType value with the only exception being for plain text requests. This value is then passed on with the content bytes so that we can reduce the number of places where we need to auto-detect the content type.

As part of this, many transport requests and builders were updated to provide methods that
accepted the XContentType along with the bytes and the methods that would rely on auto-detection have been deprecated.

In the non-strict mode, deprecation warnings are issued whenever a request with body doesn't provide the Content-Type header.

See #19388
2017-02-02 14:07:13 -05:00
Nicholas Knize b41d5747f0 Reduce GeoDistance insanity
GeoDistance query, sort, and scripts make use of a crazy GeoDistance enum for handling 4 different ways of computing geo distance: SLOPPY_ARC, ARC, FACTOR, and PLANE. Only two of these are necessary: ARC, PLANE. This commit removes SLOPPY_ARC, and FACTOR and cleans up the way Geo distance is computed.
2017-02-02 12:39:42 -06:00
Igor Motov c34b63dadd Expand AbstractSerializingTestCase and AbstractWireSerializingTestCase to test diff serialization
This commit adds two additional test cases that can be used to verify correct diff serialization in addition to binary and xcontent serialization.
2017-02-02 12:19:53 -05:00
Tanguy Leroux f86fd62821 Parse elasticsearch exception's root causes (#22924)
This commit changes the ElasticsearchException.failureFromXContent() method so that it now parses root causes, which were ignored before, and adds them as suppressed exceptions of the returned exception.
2017-02-02 17:00:16 +01:00
Nik Everett dacc150934 Expose multi-valued dates to scripts and document painless's date functions (#22875)
Implemented by wrapping an array of reused `MutableDateTime`s that
we grow when needed. The `MutableDateTime`s are reused when we
move to the next document.

Also improves the error message returned when attempting to modify
the `ScriptDocValues`, removes a couple of allocations, and documents
that the date functions are available in Painless.

Relates to #22162
2017-02-01 21:57:07 -05:00
Ali Beyad 9f97eec12e Fixes the Version constants for 5.2.0, 5.2.1, and 5.3.0 to
have the correct Lucene version (6.4.0)
2017-02-01 12:47:36 -05:00
Lee Hinman 8d83edc4a5 Disallow introducing illegal object mappings (double '..')
This disallows object mappings that would accidentally create something like
`foo..bar`, which is then unparsable for the `bar` field as it does not know
what its parent is.

Resolves #22794
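A check of this kind can be sketched as below. This is not the validation added by the commit; `FieldPathValidator` and its error message are hypothetical and only illustrate rejecting a path such as `foo..bar` that contains an empty segment.

```java
// Hypothetical sketch: reject dotted field paths that contain an empty segment.
public final class FieldPathValidator {
    private FieldPathValidator() {}

    public static void validate(String fullPath) {
        // A limit of -1 keeps empty segments, including leading and trailing ones.
        for (String part : fullPath.split("\\.", -1)) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException(
                    "field path [" + fullPath + "] contains an empty segment");
            }
        }
    }
}
```

With this sketch, `validate("foo.bar")` passes while `validate("foo..bar")`, `validate(".foo")`, and `validate("foo.")` all throw.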
2017-02-01 09:03:20 -07:00
Ali Beyad 5c1410d031 Removes premature addition of v5.4.0 constant 2017-02-01 10:43:53 -05:00
Tanguy Leroux 3cfcd1acb7 [TEST] Fix BytesRestResponseTests.testNoErrorFromXContent 2017-02-01 10:34:42 +01:00
Boaz Leskes 5bf9cb9f70 remove await fix from testCannotAllocateStaleReplicaExplanation 2017-02-01 10:31:26 +01:00
Boaz Leskes 06b8a1ada7 fix testCannotAllocateStaleReplicaExplanation node management
The test tried to create a situation where a stale replica is the only shard available. It did so by stopping the node with the replica, indexing some, stopping the primary node, starting a new node. This is flawed because the newly started node may reuse the data path of the primary node and things go back to green. Instead we should make sure that the replica is on the path that will be selected when the new node is started (i.e., the path with the smaller ordinal)
2017-02-01 10:30:42 +01:00
Tanguy Leroux c74679b6b9 Add parsing method to BytesRestResponse's error (#22873)
This commit adds a BytesRestResponse.errorFromXContent() method to parse the error returned by BytesRestResponse. It returns a ElasticsearchStatusException instance.
2017-02-01 10:11:17 +01:00
Ali Beyad f436a06971 [TEST] adds AwaitsFix on two failing tests 2017-01-31 22:42:23 -05:00
Ali Beyad 6f2222f8fb [TEST] fix node allocation result check in explain API test 2017-01-31 20:11:01 -05:00
Ali Beyad 547eb5c22f Include stale replica shard info when explaining an unassigned primary (#22826)
Currently, if a previously allocated shard has no in-sync copy in the
cluster, but there is a stale replica copy, the explain API does not
include information about the stale replica copies in its output.  This
commit includes any shard copy information available (even for stale
copies) when explaining an unassigned primary shard that was previously
allocated in the cluster.

This situation can arise as follows: imagine an index with 1 primary and
1 replica and a cluster with 2 nodes.  If the node holding the replica
is shut down, and data continues to be indexed, only the primary will
have the latest data and the replica that has gone offline will be
marked as stale.  Now, suppose the node holding the primary is shut
down.  There are no copies of the shard data in the cluster.  Now, start
the first stopped node (holding the stale replica) back up.  The cluster
is red because there is no in-sync copy available.  Running the explain
API before would inform the user that there is no valid shard copy in
the cluster for that shard, but it would not provide any information
about the existence of the stale replica that exists on the restarted
node.  With this commit, the explain API provides information about all
the stale replica copies when explaining the unassigned primary.
2017-01-31 16:31:55 -06:00
Ali Beyad c223457ba1 Adds v5.2.1 and v5.4.0 constants and bwc index for 5.2.0 2017-01-31 17:12:02 -05:00
Jack Conradson 3d2626c4c6 Change Namespace for Stored Script to Only Use Id (#22206)
Currently, stored scripts use a namespace of (lang, id) to be put, get, deleted, and executed. This is not necessary since the lang is stored with the stored script. A user should only have to specify an id to use a stored script. This change makes that possible while keeping backwards compatibility with the previous namespace of (lang, id). Anywhere the previous namespace is used will log deprecation warnings.

The new behavior is the following:

When a user specifies a stored script, that script will be stored under both the new namespace and old namespace.

Take for example script 'A' with lang 'L0' and data 'D0'. If we add script 'A' to the empty set, the scripts map will be ["A" -- D0, "A#L0" -- D0]. If a script 'A' with lang 'L1' and data 'D1' is then added, the scripts map will be ["A" -- D1, "A#L1" -- D1, "A#L0" -- D0].

When a user deletes a stored script, that script will be deleted from both the new namespace (if it exists) and the old namespace.

Take for example a scripts map with {"A" -- D1, "A#L1" -- D1, "A#L0" -- D0}. If a script is removed specified by an id 'A' and lang null then the scripts map will be {"A#L0" -- D0}. To remove the final script, the deprecated namespace must be used, so an id 'A' and lang 'L0' would need to be specified.

When a user gets/executes a stored script, if the new namespace is used then the script will be retrieved/executed using only 'id', and if the old namespace is used then the script will be retrieved/executed using 'id' and 'lang'
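The put, delete, and get behaviour described above can be modelled with a small map-based sketch. `StoredScriptsSketch` and its methods are hypothetical and are not the classes changed by this commit. The `id#lang` key format follows the example in the message; removing the id-only entry when the deprecated namespace names the same language is an assumption.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical model of the two namespaces: "id" (new) and "id#lang" (deprecated).
public final class StoredScriptsSketch {
    /** A stored script keeps its own language, so an id alone is enough to find it. */
    static final class Script {
        final String lang;
        final String data;
        Script(String lang, String data) { this.lang = lang; this.data = data; }
    }

    private final Map<String, Script> scripts = new TreeMap<>();

    public void put(String id, String lang, String data) {
        Script script = new Script(lang, data);
        scripts.put(id, script);               // new namespace, overwrites any previous script
        scripts.put(id + "#" + lang, script);  // deprecated namespace
    }

    public String get(String id, String lang) {
        Script script = scripts.get(lang == null ? id : id + "#" + lang);
        return script == null ? null : script.data;
    }

    public void delete(String id, String lang) {
        Script current = scripts.get(id);
        if (lang == null) {
            if (current != null) {
                scripts.remove(id);
                scripts.remove(id + "#" + current.lang);
            }
        } else {
            scripts.remove(id + "#" + lang);
            // Assumption: the id-only entry goes too when it holds the same language.
            if (current != null && current.lang.equals(lang)) {
                scripts.remove(id);
            }
        }
    }

    public Set<String> keys() { return scripts.keySet(); }
}
```

Adding 'A' with 'L0' and then 'A' with 'L1' leaves the keys `A`, `A#L0`, `A#L1`. Deleting by id 'A' alone leaves only `A#L0`, which matches the walkthrough in the message.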
2017-01-31 13:27:02 -08:00
Boaz Leskes eb36b82de4 Seq Number based recovery should validate last lucene commit max seq# (#22851)
The seq#-based recovery logic relies on rolling back lucene to remove any operations above the global checkpoint. This part of the plan is not implemented yet but we have to have these guarantees. Instead we should make the seq# logic validate that the last commit point (and the only one we have) maintains the invariant and, if not, fall back to file-based recovery.

 This commit adds a test that creates a situation where rollback is needed (primary failover with ops in flight) and fixes another issue that was surfaced by it - if a primary can't serve a seq#-based recovery request and does a file copy, it still used the incoming `startSeqNo` as a filter.

 Relates to #22484 & #10708
2017-01-31 20:27:31 +01:00
Ryan Ernst 29f63c78cc Internal: Convert empty and size checks of settings to not use getAsMap() (#22890)
With the new secure settings, methods like getAsMap() no longer work
correctly as a means of checking for empty settings, or the total size.
This change converts the existing uses of that method to use methods
directly on Settings. Note this does not update the implementations to
account for SecureSettings, as that will require a followup which
changes how secure settings work.
2017-01-31 10:44:09 -08:00
Jim Ferenczi f6d38d480a Integrate UnifiedHighlighter (#21621)
* Integrate UnifiedHighlighter

This change integrates the Lucene highlighter called "unified" in the list of supported highlighters for ES.
This highlighter can extract offsets from either postings, term vectors, or via re-analyzing text.
The best strategy is picked automatically at query time and depends on the field and the query to highlight.
2017-01-31 19:06:03 +01:00
Ryan Ernst a4f6edec52 Settings: Fix settings reading to account for defaults (#22871)
In #22762, settings preparation during bootstrap was changed slightly to
account for SecureSettings, by starting with a fresh settings builder
after reading the initial configuration. However, the defaults from
system properties were never re-read. This change fixes that bug (which
was never released).

closes #22861
2017-01-30 14:42:40 -08:00
Christoph Büscher 4e613139dc Ensure fixed serialization order of InnerHitBuilder (#22820)
Usually the order in which we serialize sets and maps of things doesn't matter,
but since InnerHitBuilder is part of SearchSourceBuilder, which is in turn used
as a cache key in its bytes serialization, we need to ensure the order of all
these fields when writing them to an output stream.

This adds tests and makes sure we iterate over the scriptField set and the
childInnerHits map in a fixed order.

Closes #22808
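Why a fixed iteration order matters for byte-level cache keys can be shown with a small sketch. `OrderedMapWriter` is hypothetical and uses plain `java.io` streams rather than the Elasticsearch stream classes. It sorts the entries by key before writing, so two equal maps always produce identical bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: serialize a map in key order so equal maps give equal bytes.
public final class OrderedMapWriter {
    private OrderedMapWriter() {}

    public static byte[] serialize(Map<String, String> map) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(map.size());
            // Copying into a TreeMap fixes the order regardless of the source map type.
            for (Map.Entry<String, String> entry : new TreeMap<>(map).entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeUTF(entry.getValue());
            }
        }
        return bytes.toByteArray();
    }
}
```

Two maps holding the same entries but built in a different insertion order serialize to the same byte array here, which is the property a bytes-based cache key needs.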
2017-01-30 19:28:55 +01:00
Alex Bumbu 41abf6e81d Add used memory amount to CircuitBreakingException message (#22521) 2017-01-28 16:54:13 +00:00
Nik Everett e042c77301 Add tests for reducing top hits (#22837)
Also adds many `equals` and `hashCode` implementations and moves
the failure printing in `MatchAssertion` into a common spot and
exposes it over `assertEqualsWithErrorMessageFromXContent` which
does an object equality test but then uses `toXContent` to print
the differences.

Relates to #22278
2017-01-27 20:54:11 -05:00
Boaz Leskes b1f0d8f4cf await fix testRecoveryWaitsForOps 2017-01-27 23:26:23 +01:00
Boaz Leskes 204df2a199 fix @TestLogging annotation 2017-01-27 23:15:51 +01:00
Simon Willnauer e946ec0c33 Don't convert source to UTF-8 it might not be valid UTF-8 2017-01-27 22:31:44 +01:00
Nik Everett 2e48fb8294 Move delete by query helpers into core (#22810)
This moves the building blocks for delete by query into core. This
should enable two things:
1. Plugins other than reindex to implement "bulk by scroll" style
operations.
2. Plugins to directly call delete by query. Those plugins should
be careful to make sure that task cancellation still works, but
this should be possible.

Notes:
1. I've mostly just moved classes and moved around tests methods.
2. I haven't been super careful about cohesion between these core
classes and reindex. They are quite interconnected because I wanted
to make the change as mechanical as possible.

Closes #22616
2017-01-27 16:09:18 -05:00
Ryan Ernst aad51d44ab S3 repository: Add named configurations (#22762)
* S3 repository: Add named configurations

This change implements named configurations for s3 repository as
proposed in #22520. The access/secret key secure settings which were
added in #22479 are reverted, and the only secure settings are those
with the new named configs. All other previously used settings for the
connection are deprecated.

closes #22520
2017-01-27 10:42:45 -08:00
Boaz Leskes 0f58f3f34b fix TestLogging instructions for the removal of single doc indexing actions + add some seq# related info 2017-01-27 19:21:11 +01:00
Nik Everett e36e5fc994 Remove annotation
Not allowed.
2017-01-27 12:37:35 -05:00
Nik Everett 8abd4101eb Add tests for reducing top hits
Also adds many `equals` and `hashCode` implementations and moves
the failure printing in `MatchAssertion` into a common spot and
exposes it over `assertEqualsWithErrorMessageFromXContent` which
does an object equality test but then uses `toXContent` to print
the differences.

Relates to #22278
2017-01-27 12:32:17 -05:00
Jason Tedor 930282e161 Introduce sequence-number-based recovery
This commit introduces sequence-number-based recovery. When a replica
has fallen out of sync, rather than performing a file-based recovery we
first attempt to replay operations since the last local checkpoint on
the replica. To do this, at the start of recovery the replica tells the
primary what its local checkpoint is. The primary will then wait for all
operations between that local checkpoint and the current maximum
sequence number to complete; this is to ensure that there are no gaps in
the operations that will be replayed from the primary to the
replica. This is a best-effort attempt as we currently have no
guarantees on the primary that these operations will be available; if we
are not able to replay all operations in the desired range, we just
fallback to file-based recovery. Later work will strengthen the
guarantees.

Relates #22484
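The decision between replaying operations and falling back to a file copy can be sketched as follows. Everything here is an assumption made for illustration: `RecoveryPlanner`, its `plan` method, and the idea of passing the available operations as a set are not taken from the commit.

```java
import java.util.Set;

// Hypothetical sketch of the best-effort choice between the two recovery modes.
public final class RecoveryPlanner {
    public enum Plan { SEQUENCE_NUMBER_BASED, FILE_BASED }

    private RecoveryPlanner() {}

    public static Plan plan(long replicaLocalCheckpoint, long primaryMaxSeqNo, Set<Long> availableOps) {
        // Every operation after the replica's local checkpoint must be replayable.
        for (long seqNo = replicaLocalCheckpoint + 1; seqNo <= primaryMaxSeqNo; seqNo++) {
            if (!availableOps.contains(seqNo)) {
                return Plan.FILE_BASED; // a gap means we cannot replay safely
            }
        }
        return Plan.SEQUENCE_NUMBER_BASED;
    }
}
```

If the replica's local checkpoint is 2 and the primary's maximum sequence number is 5, operations 3, 4, and 5 must all be available; a missing operation 4 forces the file-based path.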
2017-01-27 08:16:38 -08:00
Simon Willnauer 417c93c570 First step towards separating individual search phases (#22802)
At this point AbstractSearchAsyncAction is just a base-class for the first phase of a search where we have multiple replicas
for each shardID. If one of them is not available we move to the next one. Yet, once we passed that first stage we have to work with
the shards we succeeded on the initial phase.
Unfortunately, subsequent phases are not fully detached from the initial phase since they are all non-static inner classes.
In future changes this will be changed to detach the inner classes to test them in isolation and to simplify their creation.
The AbstractSearchAsyncAction should be final and it should just get a factory for the next phase instead of requiring subclasses
etc.
2017-01-27 15:53:41 +01:00
Igor Motov b068814d10 Fix hanging cancelling task with no children
Cancelling tasks with no cancellable children can cause the cancellation operation to hang. This commit fixes this issue.
2017-01-27 08:03:02 -05:00
Tanguy Leroux ea7077fb1b Add parsing method for ElasticsearchException.generateFailureXContent() (#22815)
This commit adds a ElasticsearchException.failureFromXContent() that can be used to parse the result of ElasticsearchException.generateFailureXContent().
2017-01-27 10:12:58 +01:00
Nik Everett 1baa884ab7 Fix TophitsAggregatorTests
It needs a DirectoryReader so it has to be careful.

Closes #22818
2017-01-26 14:08:30 -05:00
Nik Everett f8c28711be Merge some equivalent interfaces (#22816)
Remove `FromXContent` and use `CheckedFunction` instead.
Remove `FromXContentWithContext` and use `ContentParser` instead.
2017-01-26 13:15:29 -05:00
Simon Willnauer a475323aa1 Invalidate cached query results if query timed out (#22807)
Today we cache query results even if the query timed out. This is obviously
problematic since results are not complete. Yet, the decision if a query timed
out or not happens too late to simply not cache the result since if we'd just throw
an exception all currently waiting requests with the same request / cache key would
fail with the same exception without the option to access the result or to re-execute.
Instead, this change will allow the request to enter the cache but invalidates it immediately.
Concurrent requests might not get executed and may return the timed-out result, which is not
strictly correct but acceptable since identical requests will likely time out as well. As a
side effect we won't hammer the node with concurrent slow searches but rather execute only one
of them and return the briefly cached result.

Closes #22789
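The "enter the cache, then invalidate immediately" approach can be sketched with a plain concurrent map. `TimeoutAwareCache` and `Result` are hypothetical names, and the real request cache is more involved. The sketch only shows that waiters on the same key share one computation while a timed-out result does not stay cached.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: cache every result, but drop timed-out ones right away.
public final class TimeoutAwareCache {
    public static final class Result {
        public final String value;
        public final boolean timedOut;
        public Result(String value, boolean timedOut) {
            this.value = value;
            this.timedOut = timedOut;
        }
    }

    private final ConcurrentHashMap<String, Result> cache = new ConcurrentHashMap<>();

    public Result getOrCompute(String key, Function<String, Result> loader) {
        // Callers racing on the same key share a single loader invocation.
        Result result = cache.computeIfAbsent(key, loader);
        if (result.timedOut) {
            // Invalidate so later requests re-execute instead of reusing partial results.
            cache.remove(key, result);
        }
        return result;
    }

    public boolean isCached(String key) {
        return cache.containsKey(key);
    }
}
```

A complete result stays cached and the loader runs once, while a timed-out result is returned to the current callers and then evicted.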
2017-01-26 16:45:29 +01:00
Tanguy Leroux 1fa2734566 [TEST] Fix ElasticsearchExceptionTests
Some test failures can happen in ElasticsearchExceptionTests, this commit fixes them.
2017-01-26 16:33:56 +01:00
Tanguy Leroux be96278c95 Add parsing method for ElasticsearchException.generateThrowableXContent() (#22783)
The output of the ElasticsearchException.generateThrowableXContent() method can be parsed back by the ElasticsearchException.fromXContent() method.

This commit adds unit tests in the style of the to-and-from-xcontent tests we already have for other parsing methods. It also relaxes the strict parsing of ElasticsearchException.fromXContent() so that it does not throw an exception when custom metadata and headers are parsed, as long as they are either strings or arrays of strings. Every other type is ignored at parsing time.
2017-01-26 15:17:07 +01:00
Simon Willnauer f128b7a7fe Improve connection closing in `RemoteClusterConnection` (#22804)
Some tests verify that all connections have been closed but due to the
async / concurrent nature of `RemoteClusterConnection` there are situations
where we notify listeners that trigger tests to finish before we actually
closed all connections. The race is very very small and has no impact on the
code correctness. This commit documents and improves the way we close
connections to ensure tests won't fail with false positives.

Closes #22803
2017-01-26 13:58:26 +01:00
Simon Willnauer 281250dec9 Remove DFS_QUERY_AND_FETCH as a search type (#22787)
This commit removes the search type `dfs_query_and_fetch` without a
replacement. We don't allow to use this type via REST since 2.x
but still keep it around for no particular reason. There were no users
complaining about the availability. This should now be removed from the
codebase. `query_and_fetch` is still used internally to save a roundtrip
if there is only one shard but it can't be used via the rest interface.
2017-01-26 09:14:44 +01:00
Tim Brooks 719e75bb3f Add repository-url module and move URLRepository (#22752)
This is related to #22116. URLRepository requires SocketPermission
connect. This commit introduces a new module called "repository-url"
where URLRepository will reside. With the new module, permissions can
be removed from core.
2017-01-25 17:09:25 -06:00
Nik Everett d704a880e7 Add tests for top_hits aggregation (#22754)
Add unit tests for `TopHitsAggregator` and convert some snippets in
docs for `top_hits` aggregation to `// CONSOLE`.

Relates to #22278
Relates to #18160
2017-01-25 16:15:50 -05:00
Martijn van Groningen f6ed39aa08 Merge branch 'pr/22772' 2017-01-25 17:15:24 +01:00
Martijn van Groningen 81e40e3139 [TEST] Added this for 93a28b0acf submitted via #22772 2017-01-25 17:08:17 +01:00
Jason Tedor cb822b4670 Fix typo in comment in OsProbe.java
This commit fixes a silly typo in a comment relating to cgroups in
OsProbe.java.
2017-01-25 06:30:51 -05:00
Colin Goodheart-Smithe a9135cd636 RangeQuery WITHIN case now normalises query (#22431)
Previous to this change, when the range query was rewritten to an unbounded range (`[* TO *]`) it maintained the timezone and format for the query. This means that queries with different timezones and formats which are rewritten to unbounded range queries actually end up as different entries in the search request cache.

This is inefficient and unnecessary so this change nulls the timezone and format in the rewritten query so that regardless of the timezone or format the rewritten query will be the same.

Although this does not fix #22412 (since it deals with the WITHIN case rather than the INTERSECTS case) it is born from the same arguments
2017-01-25 10:37:15 +00:00
Boaz Leskes ed94f75a15 Remove EngineClosedException
All usage has been removed in https://github.com/elastic/elasticsearch/pull/22631, which is back ported to 5.x. This means 6.x will never get it on the wire and we can remove it
2017-01-25 11:00:50 +01:00
javanna 5103b76610 update version checks in ElasticsearchException serialization code
5.3.0 is the first version that contains the split from headers to metadata, updated the check to reflect that. It was previously after to be able to commit to master first, and only after that backport the change. Otherwise master tests would have failed until the change was backported.
2017-01-24 20:40:17 +01:00
Lee Hinman 304296ea6a Fix BulkItemResponse serialization for 6.x <-> 5.3.x
Previously the behavior where the `OpType` byte was serialized was only in
master, but it was recently backported to 5.x, so the serialization version
checks need to be updated as well.
2017-01-24 12:04:11 -07:00
srgclr 93a28b0acf skip parentid if child document is an orphan
#22770
2017-01-24 17:49:53 +00:00
Luca Cavanna 47c0e13a3b Stop returning "es." internal exception headers as http response headers (#22703)
move "es." internal headers to separate metadata set in ElasticsearchException and stop returning them as response headers

Closes #17593

* [TEST] remove ESExceptionTests, move its methods to ElasticsearchExceptionTests or ExceptionSerializationTests
2017-01-24 16:12:45 +01:00
Jason Tedor bcffc6fa49 Add hack for Docker cgroups
Docker cgroups are mounted in the wrong place (i.e., inconsistently with
/proc/self/cgroup). This commit adds an undocumented hack for working
around this, for now.

Relates #22757
2017-01-24 06:36:03 -05:00
Christoph Büscher 59aefe5a38 Include human readable responses in response parsing tests (#22717)
As a follow up to #22649, this changes the recent tests for parsing parts of search
responses to randomly set the humanReadable() flag of the XContentBuilder that
is used to render the responses. This should help to test that we can parse back
those classes if the user specifies `?human=true` in the request url.
2017-01-24 11:17:58 +01:00
Jim Ferenczi b0c2a5da30 Remove unused field in CollapseBuilder 2017-01-24 09:26:23 +01:00
Jim Ferenczi 868b12b548 Add BWC tests for field collapsing
Field collapsing is supported from version 5.3
2017-01-24 08:34:16 +01:00
Nik Everett 2e399e5505 Rename constant
It deserves a new name after the cleanup in #22749
2017-01-23 16:30:42 -05:00
javanna 8065531236 [TEST] add test to verify that the SMILE format works within the _bulk api 2017-01-23 19:40:24 +01:00
Nik Everett ee264c6957 Fix parsing for `max_determinized_states` (#22749)
There was a typo in the `ParseField` declaration. I know
we want to port these parsers to `ObjectParser` eventually
but I don't have the energy for that today and want to get
this fixed.

Closes #22722
2017-01-23 11:57:43 -05:00
Jim Ferenczi e48bc2eed7 Add field collapsing for search request (#22337)
* Add top hits collapsing to search request

The field collapsing is done with a custom top docs collector that "collapses" search hits with the same field value.
The distributed aspect is resolved using the two passes that the regular search uses. The first pass "collapses" the top hits, then the coordinating node merges/collapses the top hits from each shard.

```
GET _search
{
   "collapse": {
      "field": "category"
   }
}
```
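
The two-pass collapsing described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `CollapseSketch`, `Hit`, `collapse` and `search` are hypothetical names, shards are modelled as plain in-memory lists, and none of this is the actual collector from the commit.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of two-pass field collapsing; not Elasticsearch code.
final class CollapseSketch {
    static final class Hit {
        final String id;
        final String key;   // value of the collapse field, e.g. "category"
        final double score;

        Hit(String id, String key, double score) {
            this.id = id;
            this.key = key;
            this.score = score;
        }
    }

    // Keeps the best-scoring hit per collapse key, sorted by descending score, truncated to size.
    static List<Hit> collapse(List<Hit> hits, int size) {
        Map<String, Hit> best = new HashMap<>();
        for (Hit hit : hits) {
            best.merge(hit.key, hit, (a, b) -> a.score >= b.score ? a : b);
        }
        List<Hit> result = new ArrayList<>(best.values());
        result.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed().thenComparing(h -> h.id));
        return result.size() > size ? new ArrayList<>(result.subList(0, size)) : result;
    }

    // First pass: collapse on every shard. Second pass: merge and collapse again on the coordinator.
    static List<Hit> search(List<List<Hit>> shards, int size) {
        List<Hit> merged = new ArrayList<>();
        for (List<Hit> shard : shards) {
            merged.addAll(collapse(shard, size));
        }
        return collapse(merged, size);
    }
}
```

Collapsing per shard before merging is sufficient here because the best hit of any group that makes the global top `size` is also within the top `size` groups of the shard that holds it.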

This change also adds an ExpandCollapseSearchResponseListener that intercepts the search response and expands collapsed hits using the CollapseBuilder#innerHit options.
The retrieval of each inner_hits is done by sending a query to all shards filtered by the collapse key.

```
GET _search
{
   "collapse": {
      "field": "category",
      "inner_hits": {
         "size": 2
      }
   }
}
```
2017-01-23 16:33:51 +01:00
Tanguy Leroux 11164b394b Add unit tests for ValueCountAggregator and InternalValueCount (#22741)
Adds unit tests for the value count aggregator.

Relates #22278
2017-01-23 16:24:55 +01:00
Simon Willnauer 27b5c2ad54 Pass `forceExecution` flag to transport interceptor (#22739)
To effectively allow a plugin to intercept a transport handler it needs
to know if the handler must be executed even if there is a rejection on the
thread pool in the case the wrapper forks a thread to execute the actual handler.
2017-01-23 11:04:27 +01:00
Alexander Reelsen 6159ca28ae Version: Add missing releases from 2.x in Version.java (#22594) 2017-01-23 09:53:21 +01:00
Tim Brooks a4ac29c005 Add single static instance of SpecialPermission (#22726)
This commit adds a SpecialPermission constant and uses that constant
as opposed to introducing new instances everywhere.

Additionally, this commit introduces a single static method to check that
the current code has permission. This avoids all the duplicated access
blocks that exist currently.
2017-01-21 12:03:52 -06:00
Simon Willnauer 3ad6d6ebcc Simplify InternalEngine#innerIndex (#22721)
Today `InternalEngine#innerIndex` is a pretty big method (> 150 SLoC). This
commit merges `#index` and `#innerIndex` and splits it up into smaller contained
methods.
2017-01-21 08:51:35 +01:00
Jim Ferenczi 8028578305 Upgrade to Lucene 6.4.0 (#22724)
* Upgrade to Lucene 6.4.0

`ValueSource`s are now converted to `DoubleValueSource`s using the Lucene adapter made for the migration to the new API in 6.4.0.
2017-01-21 04:48:01 +01:00
Igor Motov cfb415de7b Fix broken TaskInfo.toString()
Related to #22387
2017-01-20 20:57:42 -05:00
Tim Brooks d86f97c428 Add CheckedSupplier and CheckedRunnable to core (#22725)
Introduce CheckedSupplier and CheckedRunnable functional interfaces
into core. These offer a checked version of the Supplier and Runnable
interfaces for use with lambda apis.
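
A minimal sketch of what such checked functional interfaces could look like. The commit message does not show the actual signatures, so the generic parameters and the `CheckedExamples` helper below are assumptions made for illustration only.

```java
import java.io.IOException;

// Assumed shape: a Supplier whose get() may throw a checked exception of type E.
@FunctionalInterface
interface CheckedSupplier<R, E extends Exception> {
    R get() throws E;
}

// Assumed shape: a Runnable whose run() may throw a checked exception of type E.
@FunctionalInterface
interface CheckedRunnable<E extends Exception> {
    void run() throws E;
}

// Hypothetical helper showing how a lambda API can accept these interfaces.
final class CheckedExamples {
    static <R, E extends Exception> R call(CheckedSupplier<R, E> supplier) throws E {
        return supplier.get();
    }

    static <E extends Exception> void execute(CheckedRunnable<E> runnable) throws E {
        runnable.run();
    }

    // The lambda body throws IOException, which java.util.function.Supplier cannot declare.
    static String readOrFail(boolean fail) throws IOException {
        return CheckedExamples.<String, IOException>call(() -> {
            if (fail) {
                throw new IOException("simulated failure");
            }
            return "ok";
        });
    }
}
```

The checked exception type is carried as a type parameter, so callers still see `IOException` in the signature instead of having it wrapped in an unchecked exception.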
2017-01-20 19:17:44 -06:00
Ali Beyad 3bf06d1440 Fixes retrieval of the latest snapshot index blob (#22700)
This commit ensures that the index.latest blob is first examined to
determine the latest index-N blob id, before attempting to list all
index-N blobs and picking the blob with the highest N.
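
The lookup order described above can be sketched as follows. This is a hedged illustration, not the repository code: the blob container is modelled as a plain `Map`, the class and method names are hypothetical, and storing the generation in `index.latest` as decimal text is an assumption of this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.OptionalLong;

// Hypothetical sketch: prefer index.latest, fall back to listing index-N blobs.
final class LatestIndexBlob {
    static final String LATEST = "index.latest";
    static final String PREFIX = "index-";

    static OptionalLong latestGeneration(Map<String, byte[]> blobs) {
        byte[] latest = blobs.get(LATEST);
        if (latest != null) {
            try {
                // Assumption: the generation N is stored as decimal text.
                String text = new String(latest, StandardCharsets.UTF_8).trim();
                return OptionalLong.of(Long.parseLong(text));
            } catch (NumberFormatException e) {
                // Unreadable index.latest: fall through to listing.
            }
        }
        // Fallback: list all index-N blobs and pick the highest N.
        return blobs.keySet().stream()
            .filter(name -> name.startsWith(PREFIX))
            .map(name -> name.substring(PREFIX.length()))
            .filter(suffix -> suffix.matches("\\d{1,18}"))
            .mapToLong(Long::parseLong)
            .max();
    }
}
```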

It also fixes the MockRepository#move so that tests are able to handle
non-atomic moves.  This is done by adding a special setting to the
MockRepository that requires the test to specify if it can handle
non-atomic moves.  If so, then the MockRepository#move operation will be
non-atomic to allow testing against such repositories.
2017-01-20 17:00:46 -06:00
Jim Ferenczi 4ec4bad908 Fix script score function that combines _score and weight (#22713)
The weight factor function does not check if the delegate score function needs to access the score of the query.
This results in a _score equal to 0 for all score functions that set a weight.
This change modifies the WeightFactorFunction#needsScore to delegate the call to its underlying score function.
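
The delegation fix can be illustrated with a small sketch. The `ScoreFunction` interface and `ScoreDoublingFunction` below are hypothetical stand-ins, and the `WeightFactorFunction` class only borrows the name from the commit message; this is not the Elasticsearch implementation.

```java
// Hypothetical stand-in for a score function abstraction.
interface ScoreFunction {
    double score(int docId, double subQueryScore);

    // Whether the function needs the query's _score to be computed.
    boolean needsScore();
}

// A function that reads the query score, as a script using _score would.
final class ScoreDoublingFunction implements ScoreFunction {
    @Override
    public double score(int docId, double subQueryScore) {
        return subQueryScore * 2.0;
    }

    @Override
    public boolean needsScore() {
        return true;
    }
}

// Wraps another function and multiplies its result by a weight.
final class WeightFactorFunction implements ScoreFunction {
    private final double weight;
    private final ScoreFunction delegate;

    WeightFactorFunction(double weight, ScoreFunction delegate) {
        this.weight = weight;
        this.delegate = delegate;
    }

    @Override
    public double score(int docId, double subQueryScore) {
        return weight * delegate.score(docId, subQueryScore);
    }

    @Override
    public boolean needsScore() {
        // Delegate instead of returning a fixed value; otherwise the
        // query score is never computed and the delegate sees 0.
        return delegate.needsScore();
    }
}
```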

Fix #21483
2017-01-20 19:50:57 +01:00
Nik Everett 6265ef1c1b Deguice rest handlers (#22575)
There are presently 7 ctor args used in any rest handlers:
* `Settings`: Every handler uses it to initialize a logger and
  some other strange things.
* `RestController`: Every handler registers itself with it.
* `ClusterSettings`: Used by `RestClusterGetSettingsAction` to
  render the default values for cluster settings.
* `IndexScopedSettings`: Used by `RestGetSettingsAction` to get
  the default values for index settings.
* `SettingsFilter`: Used by a few handlers to filter returned
  settings so we don't expose stuff like passwords.
* `IndexNameExpressionResolver`: Used by `_cat/indices` to
  filter the list of indices.
* `Supplier<DiscoveryNodes>`: Used to enrich the response
  by handlers that list tasks.

We probably want to reduce these arguments over time but
switching construction away from guice gives us tighter
control over the list of available arguments.

These parameters are passed to plugins using
`ActionPlugin#initRestHandlers` which is expected to build and
return the handlers immediately. This felt simpler than
returning a reference to the ctors given all the different
possible args.

Breaks java plugins by moving rest handlers off of guice.
2017-01-20 11:48:51 -05:00
Simon Willnauer 824beea89d Fix handling of document failure exception in InternalEngine (#22718)
Today we try to be smart and make a generic decision if an exception should
be treated as a document failure but in some cases concurrency in the index writer
makes this decision very difficult since we don't have a consistent state in the case
another thread is currently failing the IndexWriter/InternalEngine due to a tragic event.

This change simplifies the exception handling and makes specific decisions about document failures
rather than using a generic heuristic. This prevents exceptions that should have failed the engine
from being treated as document failures when the engine backed out of failing because some other
thread had already taken over the failure procedure but had not finished yet.
2017-01-20 16:55:00 +01:00
markharwood f01784205f New AdjacencyMatrix aggregation
Similar to the Filters aggregation but only supports "keyed" filter buckets and automatically "ANDs" pairs of filters to produce a form of adjacency matrix.
The intersection of buckets "A" and "B" is named "A&B" (the choice of separator is configurable). Empty intersection buckets are removed from the final results.
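
As a rough illustration of the pairwise "AND" and the bucket naming, here is a sketch that works on in-memory sets of document ids. The class and method names are hypothetical and the sketch covers only the pairwise intersections, not the aggregation framework itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of pairwise filter intersections; not the aggregator code.
final class AdjacencySketch {
    // filters maps a filter name to the ids of the documents it matches.
    static Map<String, Set<Integer>> intersections(Map<String, Set<Integer>> filters, String separator) {
        // Sort names so that each pair is produced once, in a stable order.
        List<String> names = new ArrayList<>(new TreeSet<>(filters.keySet()));
        Map<String, Set<Integer>> result = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            for (int j = i + 1; j < names.size(); j++) {
                Set<Integer> both = new TreeSet<>(filters.get(names.get(i)));
                both.retainAll(filters.get(names.get(j)));
                // Empty intersection buckets are dropped from the result.
                if (!both.isEmpty()) {
                    result.put(names.get(i) + separator + names.get(j), both);
                }
            }
        }
        return result;
    }
}
```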

Closes #22169
2017-01-20 15:49:31 +00:00
Tim Brooks bc16162d21 Remove accept SocketPermissions from core (#22622)
This is related to #22116. Core no longer needs SocketPermission 
accept. This permission is relegated to the transport-netty4 module 
and (for tests) to the mocksocket jar.
2017-01-20 09:27:45 -06:00
Tanguy Leroux 239ed0c912 Add unit tests for DateHistogramAggregator (#22714)
Adds unit tests for the date histogram aggregator.

Relates #22278
2017-01-20 14:18:30 +01:00
Christoph Büscher 54105f3ddd Add parsing from xContent to ShardSearchFailure (#22699)
In preparation for being able to parse SearchResponse from its rest
representation, this adds fromXContent to ShardSearchFailure.
2017-01-20 12:49:54 +01:00
Yannick Welsch 1f0e0a2170 Close InputStream when receiving cluster state in PublishClusterStateAction (#22711)
Not closing the InputStream will leak native memory as the DeflateCompressor/Inflater won't be closed.
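
A generic way to avoid this kind of leak is try-with-resources. The sketch below uses the JDK's `InflaterInputStream` rather than Elasticsearch's compressor classes, and `CompressedStateReader` is a hypothetical name, so treat it as an illustration of the pattern only.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical sketch: always close the decompressing stream after reading.
final class CompressedStateReader {
    static byte[] compress(byte[] raw) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (DeflaterOutputStream deflater = new DeflaterOutputStream(out)) {
            deflater.write(raw);
        }
        return out.toByteArray();
    }

    static byte[] readAndClose(byte[] compressed) throws IOException {
        // Closing the stream releases the Inflater it created, and with it the native memory.
        try (InputStream in = new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
            return out.toByteArray();
        }
    }
}
```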
2017-01-20 12:26:07 +01:00
Boaz Leskes 5d806bf93e Index creation and setting update may not return deprecation logging (#22702)
Those services validate their settings before submitting an AckedClusterStateUpdateTask to the cluster state service. An acked cluster state update may be completed by a networking thread when the last ack is received. As such, it needs special care to make sure that thread context headers are handled correctly.
2017-01-20 10:14:13 +01:00
David Pilato fc4dc5ef21 Fix comment 2017-01-20 10:13:13 +01:00
David Pilato ad5b8def26 Merge branch 'pr/delete-from-xcontent' 2017-01-20 09:16:34 +01:00
Lee Hinman eb8a41ef94 Add missing serialization BWC for disk usage estimates
Relates to #22081
2017-01-19 15:37:06 -07:00
Lee Hinman 4eb32e9d86 Expose disk usage estimates in nodes stats
This exposes the least and most used disk usage estimates within the "fs" nodes
stats output:

```json
GET /_nodes/stats/fs?pretty&human
{
  "nodes" : {
    "34fPVU0uQ_-wWitDzDXX_g" : {
      "fs" : {
        "timestamp" : 1481238723550,
        "total" : {
          "total" : "396.1gb",
          "total_in_bytes" : 425343254528,
          "free" : "140.6gb",
          "free_in_bytes" : 151068725248,
          "available" : "120.5gb",
          "available_in_bytes" : 129438912512
        },
        "least_usage_estimate" : {
          "path" : "/home/hinmanm/es/elasticsearch/distribution/build/cluster/run node0/elasticsearch-6.0.0-alpha1-SNAPSHOT/data/nodes/0",
          "total" : "396.1gb",
          "total_in_bytes" : 425343254528,
          "available" : "120.5gb",
          "available_in_bytes" : 129438633984,
          "used_disk_percent" : 69.56842912023208
        },
        "most_usage_estimate" : {
          "path" : "/home/hinmanm/es/elasticsearch/distribution/build/cluster/run node0/elasticsearch-6.0.0-alpha1-SNAPSHOT/data/nodes/0",
          "total" : "396.1gb",
          "total_in_bytes" : 425343254528,
          "available" : "120.5gb",
          "available_in_bytes" : 129438633984,
          "used_disk_percent" : 69.56842912023208
        },
        "data" : [{...}],
        "io_stats" : {...}
      }
    }
  }
}
```

Resolves #8686
Resolves #22081
2017-01-19 13:56:52 -07:00
Jason Tedor 9781b88a38 Fix deprecation logging for lenient booleans
This commit fixes an issue with deprecation logging for lenient
booleans. The underlying issue is that adding deprecation logging for
lenient booleans added a static deprecation logger to the Settings
class. However, the Settings class is initialized very early and in CLI
tools can be initialized before logging is initialized. This leads to
status logger error messages. Additionally, the deprecation logging for
a lot of the settings does not provide useful context (for example, in
the token filter factories, the deprecation logging only produces the
name of the setting, but gives no context which token filter factory it
comes from). This commit addresses both of these issues by changing the
call sites to push a deprecation logger through to the lenient boolean
parsing.

Relates #22696
2017-01-19 12:30:33 -05:00
David Pilato 5be8bd76e2 Also test found field
And optimize imports
2017-01-19 17:28:31 +01:00
Tim Brooks 3deae99a34 Fix incorrect args order passed to createAggregator
This commit fixes a compile issue where the arguments are passed to
createAggregator in the incorrect order.
2017-01-19 10:08:38 -06:00
Christoph Büscher e03554070c Add parsing from xContent to SearchProfileShardResults and nested classes (#22649)
In preparation for being able to parse SearchResponse from its rest representation
for the java rest client, this adds fromXContent to SearchProfileShardResults and its
nested classes.
2017-01-19 16:29:10 +01:00
Jim Ferenczi b781a4a176 Add unit tests for FiltersAggregator (#22678)
Adds unit tests for the `filters` aggregation.
This change also adds a helper to search and reduce any aggregator in a unit test.
This is done by dividing a single searcher into sub-searchers, one for each segment.

Relates #22278
2017-01-19 16:22:48 +01:00
David Pilato 0315dcc306 Use now common methods with index/update
Brought by #22229
2017-01-19 16:10:13 +01:00
Jim Ferenczi 3d54258de2 Don't register search response listener in transport clients
Small fix for https://github.com/elastic/elasticsearch/pull/22682
2017-01-19 16:08:24 +01:00
David Pilato 718a6b9be7 Add fromxcontent methods to delete response
This commit adds the parsing fromXContent() methods to the DeleteResponse class.

It's a pale copy of what has been done in #22229.
2017-01-19 15:59:24 +01:00
Nicholas Knize b006636aaf unmute FieldStatsIntegrationIT.testGeoPointNotIndexed, fix already pushed 2017-01-19 08:44:00 -06:00
Nicholas Knize 88c78833f0 Mute FieldStatsIntegrationIT.testGeoPointNotIndexed, for now 2017-01-19 08:38:17 -06:00
Jim Ferenczi d145d459ae Fix NPE on FieldStats with mixed cluster on version pre/post 5.2 (#22688)
* Fix NPE on FieldStats with mixed cluster on version pre/post 5.2

In 5.2 the FieldStats API can return null min/max values.
These values cannot be deserialized by a node with a pre-5.2 version, so if such a node
is picked to coordinate a FieldStats request in a mixed cluster, an NPE can be thrown.
This change prevents the NPE by removing the non serializable FieldStats object directly in the field stats shard request.
The filtered fields will not be present in the response when a node pre 5.2 acts as a coordinating node.
2017-01-19 14:20:07 +01:00
Tanguy Leroux 833284cae2 Add parsing methods for UpdateResponse (#22586)
This commit adds the fromXContent() method to the UpdateResponse class, so that it can be used with the high level rest client.
2017-01-19 12:49:45 +01:00
Jim Ferenczi 21dae1924f Add the ability to define search response listeners in plugins (#22682)
This change is a simple adaptation of https://github.com/elastic/elasticsearch/pull/19587 for the current state of master.
It allows defining search response listeners in the form of `BiConsumer<SearchRequest, SearchResponse>`s in a search plugin.
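
The listener shape can be sketched as below. `Request`, `Response` and `ListenerSketch` are hypothetical stand-ins and not the Elasticsearch `SearchRequest`/`SearchResponse` classes or the plugin API; the sketch only shows how `BiConsumer` listeners might be registered and invoked once a response is available.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical sketch of response listeners expressed as BiConsumers.
final class ListenerSketch {
    static final class Request {
        final String query;

        Request(String query) {
            this.query = query;
        }
    }

    static final class Response {
        final int totalHits;

        Response(int totalHits) {
            this.totalHits = totalHits;
        }
    }

    private final List<BiConsumer<Request, Response>> listeners = new ArrayList<>();

    void register(BiConsumer<Request, Response> listener) {
        listeners.add(listener);
    }

    // Builds a fake response, then lets every listener observe request and response.
    Response execute(Request request) {
        Response response = new Response(request.query.length());
        for (BiConsumer<Request, Response> listener : listeners) {
            listener.accept(request, response);
        }
        return response;
    }
}
```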
2017-01-19 12:48:45 +01:00
Daniel Mitterdorfer ce765f7ad2 Use a proper boolean in FieldStatsIntegrationIT#testGeoPointNotIndexed() 2017-01-19 08:33:08 +01:00
Daniel Mitterdorfer aece89d6a1 Make boolean conversion strict (#22200)
This PR removes all leniency in the conversion of Strings to booleans: "true"
is converted to the boolean value `true`, "false" is converted to the boolean
value `false`. Everything else raises an error.
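
The strict rule is small enough to sketch directly. `StrictBooleans.parseStrict` is a hypothetical helper, not the method changed by this PR, and the exception type and message are assumptions.

```java
// Hypothetical sketch of strict String-to-boolean conversion.
final class StrictBooleans {
    static boolean parseStrict(String value) {
        if ("true".equals(value)) {
            return true;
        }
        if ("false".equals(value)) {
            return false;
        }
        // Anything else, including "on", "1", "TRUE" or null, is rejected.
        throw new IllegalArgumentException(
            "Failed to parse value [" + value + "]: only [true] or [false] are allowed");
    }
}
```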
2017-01-19 07:59:18 +01:00
Nicholas Knize 51e80e7176 remove unnecessary text from exception message 2017-01-18 14:51:56 -06:00
Nicholas Knize 84e4f91253 Add geo_point to FieldStats
This commit adds a new GeoPoint class to FieldStats for computing field stats over geo_point field types.
2017-01-18 14:37:03 -06:00
Nik Everett 1fe74a6b4b Better error when can't auto create index (#22488)
Changes the error message when `action.auto_create_index` or
`index.mapper.dynamic` forbids automatic creation of an index
from `no such index` to one of:
* `no such index and [action.auto_create_index] is [false]`
* `no such index and [index.mapper.dynamic] is [false]`
* `no such index and [action.auto_create_index] contains [-<pattern>] which forbids automatic creation of the index`
* `no such index and [action.auto_create_index] ([all patterns]) doesn't match`

This should make it more clear *why* there is `no such index`.

Closes #22435
2017-01-18 15:18:32 -05:00
Ali Beyad cd52065871 [TEST] testAckedIndexing waits for all nodes to stabilize
testAckedIndexing now waits for all nodes to stabilize in the cluster
state through an assertBusy before final validation that all documents
are found in their respective shards in the cluster.  Before, what could
happen is that the ensureGreen check passes but only after that is a
ping failure from the network disruption processed by the master,
thereby rendering the cluster RED again.  This assertBusy waits up to 30
seconds for all nodes to have stabilized and all get document actions to
succeed.
2017-01-18 13:51:25 -05:00
Michael McCandless 1d1bdd476c Finish exposing FlattenGraphTokenFilter (#22667) 2017-01-18 11:05:34 -05:00
Nik Everett e71b26f480 Improve unit test coverage of aggs (#22668)
Add tests for `GlobalAggregator`, `MaxAggregator`, and `InternalMax`.

Relates to #22278
2017-01-18 10:33:45 -05:00
Simon Willnauer 24e2847af2 Streamline foreign stored context restore and allow to preserve response headers (#22677)
Today we do not preserve response headers if they are present on a transport protocol
response. While preserving these headers is not always desired, in most cases we
should pass on these headers to have consistent results for deprecation headers etc.
Yet, this hasn't been much of a problem since most of the deprecations are detected early,
i.e. on the coordinating node, such that this bug wasn't uncovered until #22647.

This commit allows optionally preserving headers when a context is restored and also streamlines
the context restore, since it frequently leaked into the caller's thread context when the caller's
context wasn't restored again.
2017-01-18 16:17:54 +01:00