Commit Graph

7885 Commits

Author SHA1 Message Date
Colin Goodheart-Smithe d4a6ba8ec9 No longer add illegal content type option to stored search templates (#24251)
When parsing StoredSearchScript we were adding a content type option that was forbidden (by a check that threw an exception) by the parser that is used to parse the template when we read it from the cluster state. This stopped Elasticsearch from starting after stored search templates had been added.

This change no longer adds the content type option to the StoredScriptSource object when parsing from the put search template request. This is safe because the StoredScriptSource content is always JSON when it is stored in the cluster state, since we do a conversion to JSON before this point.

Also removes the check for the content type in the options when parsing StoredScriptSource so users who already have stored scripts can start Elasticsearch.

Closes #24227
2017-04-22 13:37:04 -04:00
Ryan Ernst 473e98981b Scripts: Remove unnecessary executable shortcut (#24264)
ScriptService has two executable methods, one which takes a
CompiledScript, which is similar to search, and one that takes a raw
Script and both compiles and returns an ExecutableScript for it. The
latter is not needed, and the call sites which used one or the other
were mixed. This commit removes the extra executable method in favor of
callers first calling compile, then executable.
2017-04-21 17:53:03 -07:00
Ryan Ernst aadc33d260 Scripts: Remove unwrap method from executable scripts (#24263)
The unwrap method was left over from supporting JavaScript and Python. Since
those languages are removed in 6.0, this commit removes the unwrap
feature from scripts.
2017-04-21 17:50:22 -07:00
Nik Everett 447f307ebb Fix _bulk response when it can't create an index (#24048)
Before #22488 when an index couldn't be created during a `_bulk`
operation we'd do all the *other* actions and return the index
creation error on each failing action. In #22488 we accidentally
changed it so that we now reject the entire bulk request if a single
action cannot create an index that it must create to run. This
reverts to the old behavior while still keeping the nicer
error messages. Instead of failing the entire request we now only
fail the portions of the request that can't work because the index
doesn't exist.

Closes #24028
2017-04-21 18:56:04 -04:00
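The per-item failure behavior described in 447f307ebb can be sketched roughly as follows. This is an illustration only, not Elasticsearch's bulk code: `BulkSketch`, `run`, and the `creatable` set are hypothetical names invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration: every action targets an index. Actions whose index
// cannot be created fail individually; all other actions still run.
class BulkSketch {
    /** Returns one result string per action, in request order. */
    static List<String> run(List<String> targetIndices, Set<String> creatable) {
        List<String> results = new ArrayList<>();
        for (String index : targetIndices) {
            if (creatable.contains(index)) {
                results.add("ok:" + index);
            } else {
                // Only this item fails; the rest of the request proceeds.
                results.add("failed:" + index + ":index could not be created");
            }
        }
        return results;
    }
}
```

The point of the sketch is that the failure is attached to the item rather than thrown for the whole request.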
Jason Tedor fe91c72151 Use a marker file when removing a plugin
Today when removing a plugin, we attempt to move the plugin directory to
a temporary directory and then delete that directory from the
filesystem. We do this to avoid a plugin being in a half-removed
state. We previously tried an atomic move, and fell back to a non-atomic
move if that failed. Atomic moves can fail on union filesystems when the
plugin directory is not in the top layer of the
filesystem. Interestingly, the regular move can fail as well. This is
because when the JDK is executing such a move, it first tries to rename
the source directory to the target directory and if this fails with
EXDEV (as in the case of an atomic move failing), it falls back to
copying the source to the target, and then attempts to rmdir the
source. The bug here is that the JDK never deleted the contents of the
source so the rmdir will always fail (except in the case of an empty
directory).

Given all this silliness, we were inspired to find a different
strategy. The strategy is simple. We will add a marker file to the
plugin directory that indicates the plugin is in a state of
removal. This file will be the last file out the door during removal. If
this file exists during startup, we fail startup.

Relates #24252
2017-04-21 15:50:44 -04:00
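The marker-file strategy from fe91c72151 might look roughly like the sketch below. It is not the actual plugin remover: the class name, the method names, and the marker file name `.removing-plugin` are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

class MarkerFileRemoval {
    // Hypothetical marker name, chosen for this sketch only.
    static final String MARKER = ".removing-plugin";

    static void remove(Path pluginDir) throws IOException {
        Path marker = pluginDir.resolve(MARKER);
        // 1. Drop the marker first so an interrupted removal is detectable.
        if (!Files.exists(marker)) {
            Files.createFile(marker);
        }
        // 2. Delete everything except the marker and the plugin directory itself.
        Files.walkFileTree(pluginDir, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) throws IOException {
                if (!file.equals(marker)) {
                    Files.delete(file);
                }
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc) throws IOException {
                if (exc != null) {
                    throw exc;
                }
                if (!dir.equals(pluginDir)) {
                    Files.delete(dir);
                }
                return FileVisitResult.CONTINUE;
            }
        });
        // 3. The marker is the last file out the door, then the empty directory.
        Files.delete(marker);
        Files.delete(pluginDir);
    }

    /** Startup check: a surviving marker means a removal did not complete. */
    static void checkOnStartup(Path pluginDir) {
        if (Files.exists(pluginDir.resolve(MARKER))) {
            throw new IllegalStateException(
                "found marker file in [" + pluginDir + "]; plugin removal did not complete");
        }
    }
}
```

No move across filesystems is involved, so the EXDEV fallback described in the commit message never comes into play.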
Simon Willnauer 2ca7072b24 Fill missing sequence IDs up to max sequence ID when recovering from store (#24238)
Today we might promote a primary and recover from store where after translog
recovery the local checkpoint is still behind the maximum sequence ID seen.
To fill the holes in the sequence ID history this PR adds a utility method
that fills up all missing sequence IDs up to the maximum seen sequence ID
with no-ops.

Relates to #10708
2017-04-21 20:28:00 +02:00
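As a rough illustration of the gap filling described in 2ca7072b24, the sketch below models the sequence-number history as a plain set. `SeqNoGapFiller` and `fillGaps` are hypothetical names, and adding a number to the set stands in for indexing a no-op.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class SeqNoGapFiller {
    /**
     * Returns the sequence numbers in (localCheckpoint, maxSeqNo] that are missing
     * from `seen`, and adds them to `seen` as if a no-op had been indexed for each.
     */
    static List<Long> fillGaps(Set<Long> seen, long localCheckpoint, long maxSeqNo) {
        List<Long> filled = new ArrayList<>();
        for (long seqNo = localCheckpoint + 1; seqNo <= maxSeqNo; seqNo++) {
            if (seen.add(seqNo)) { // add() returns true only if seqNo was missing
                filled.add(seqNo);
            }
        }
        return filled;
    }
}
```

After the call, every sequence number up to the maximum is accounted for, so the local checkpoint can advance to the maximum.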
Ryan Ernst ba48674695 Build: Move plugin cli and tests to distribution tool (#24220)
The plugin cli currently resides inside the elasticsearch jar. This
commit moves it into a plugin-cli jar. This change alone is a no-op;
it does not change anything about what is loaded at runtime. But it will
allow easier testing (with fixtures in the future to test ES or maven
installation), as well as eventually not loading these classes when
starting elasticsearch.
2017-04-21 09:25:58 -07:00
Boaz Leskes badb2be066 Peer Recovery: remove maxUnsafeAutoIdTimestamp hand off (#24243)
With #24149, it is now stored in the Lucene commit and is implicitly transferred in the file phase of the recovery.
2017-04-21 17:31:50 +02:00
Ali Beyad 63e5aff5d6 Adds version 5.3.2 and backwards compatibility indices for 5.3.1 2017-04-21 10:48:41 -04:00
Tanguy Leroux 480bf0996d Add utility method to parse named XContent objects with typed prefix (#24240)
This commit adds a XContentParserUtils.parseTypedKeysObject() method
that can be used to parse named XContent objects identified by a field
name containing a type identifier, a delimiter and the name of the
object to parse.
2017-04-21 15:41:27 +02:00
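The typed-key convention mentioned in 480bf0996d (type identifier, delimiter, object name) can be illustrated with a small splitter. This is not the `XContentParserUtils` implementation; the `TypedKeys` class is hypothetical, and the `#` delimiter is an assumption made for the example.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

class TypedKeys {
    // Assumed delimiter for this sketch.
    static final char DELIMITER = '#';

    /** Splits "type#name" into (type, name); rejects keys without both parts. */
    static Map.Entry<String, String> split(String fieldName) {
        int pos = fieldName.indexOf(DELIMITER);
        if (pos <= 0 || pos == fieldName.length() - 1) {
            throw new IllegalArgumentException(
                "expected <type>" + DELIMITER + "<name> but got [" + fieldName + "]");
        }
        return new SimpleEntry<>(fieldName.substring(0, pos), fieldName.substring(pos + 1));
    }
}
```

A parser would use the type part to look up the right named parser and the name part as the object's name.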
Tanguy Leroux 251b6d452b MultiBucketsAggregation.Bucket should not extend Writeable (#24216)
The MultiBucketsAggregation.Bucket interface extends Writeable, forcing
all implementation classes to implement writeTo(). This commit removes
the Writeable from the interface and moves it down to the InternalBucket
implementation.
2017-04-21 15:29:53 +02:00
Yannick Welsch c2deb1c81d Don't expose cleaned-up tasks as pending in PrioritizedEsThreadPoolExecutor (#24237)
Changes in #24102 exposed the following oddity: PrioritizedEsThreadPoolExecutor.getPending() can return Pending entries where pending.task == null. This can happen for example when tasks are added to the pending list while they are in the clean up phase, i.e. TieBreakingPrioritizedRunnable#runAndClean has run already, but afterExecute has not removed the task yet. Instead of safeguarding consumers of the API (as was done before #24102) this changes the executor to not count these tasks as pending at all.
2017-04-21 15:25:19 +02:00
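A minimal sketch of the idea in c2deb1c81d, assuming a queue entry whose `task` reference is nulled during cleanup. `PendingSketch` and `Entry` are hypothetical stand-ins, not the executor's real types.

```java
import java.util.ArrayList;
import java.util.List;

class PendingSketch {
    /** Hypothetical holder: `task` is nulled once the runnable has run and been cleaned up. */
    static final class Entry {
        volatile Runnable task;

        Entry(Runnable task) {
            this.task = task;
        }
    }

    /** Returns only the tasks that are still genuinely pending. */
    static List<Runnable> getPending(List<Entry> queue) {
        List<Runnable> pending = new ArrayList<>();
        for (Entry entry : queue) {
            Runnable task = entry.task; // read once; it may be nulled concurrently
            if (task != null) {
                pending.add(task);
            }
        }
        return pending;
    }
}
```

Filtering inside the executor means consumers of the pending list never have to guard against a null task themselves.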
Colin Goodheart-Smithe 3c7c4bc824 Adds declareNamedObjects methods to ConstructingObjectParser (#24219)
* Adds declareNamedObjects methods to ConstructingObjectParser

* Addresses review comments
2017-04-21 09:50:30 +01:00
Christoph Büscher c8ad26edc9 Tests: Extend InternalStatsTests (#24212)
Currently we don't test for count = 0, which will make a difference when adding
tests for parsing for the high level rest client. The min/max/sum values should
also be tested with negative values and on a larger range.
2017-04-21 10:38:09 +02:00
Adrien Grand 81b64ed587 IndicesQueryCache should delegate the scorerSupplier method. (#24209)
Otherwise the range improvements that we did on range queries would not work.
This is similar to https://issues.apache.org/jira/browse/LUCENE-7749.
2017-04-21 10:33:02 +02:00
Adrien Grand f322f537e4 Speed up parsing of large `terms` queries. (#24210)
The addition of the normalization feature on keywords slowed down the parsing
of large `terms` queries since all terms now have to go through normalization.
However this can be avoided in the default case that the analyzer is a
`keyword` analyzer since all that normalization will do is a UTF8 conversion.
Using `Analyzer.normalize` for that is a bit overkill and could be skipped.
2017-04-21 10:32:33 +02:00
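The shortcut described in f322f537e4 amounts to skipping the normalizer when it would only perform a UTF-8 conversion. The sketch below shows that shape; `TermNormalization`, the `isKeywordAnalyzer` flag, and the `normalizer` callback are hypothetical stand-ins for the real field type and `Analyzer.normalize`.

```java
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

class TermNormalization {
    /** Converts a query term to the bytes that would be looked up in the index. */
    static byte[] toIndexedBytes(String term, boolean isKeywordAnalyzer, UnaryOperator<String> normalizer) {
        if (isKeywordAnalyzer) {
            // Fast path: nothing to normalize, only encode.
            return term.getBytes(StandardCharsets.UTF_8);
        }
        // Slow path: run the (potentially expensive) normalizer first.
        return normalizer.apply(term).getBytes(StandardCharsets.UTF_8);
    }
}
```

For a `terms` query with many thousands of values, avoiding one normalizer call per term is where the saving comes from.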
Jim Ferenczi a4365971a0 [TEST] make sure that the random query_string query generator defines a default_field or a list of fields 2017-04-21 02:56:26 +02:00
Fabien Baligand 4a45579506 token_count type : add an option to count tokens (fix #23227) (#24175)
Add option "enable_position_increments" with default value true.
If option is set to false, indexed value is the number of tokens
(not position increments count)
2017-04-21 00:53:28 +02:00
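The difference between the two counting modes in 4a45579506 shows up when a token filter leaves gaps. The sketch below assumes the token stream is given as an array of position increments; `TokenCountSketch` is a hypothetical helper, not the mapper's code.

```java
class TokenCountSketch {
    /**
     * With position increments enabled, gaps left by removed tokens are counted;
     * with them disabled, each emitted token counts exactly once.
     */
    static int count(int[] positionIncrements, boolean enablePositionIncrements) {
        int count = 0;
        for (int increment : positionIncrements) {
            count += enablePositionIncrements ? increment : 1;
        }
        return count;
    }
}
```

For example, if a stop filter drops the first word of a three-word text, the two surviving tokens might carry increments 2 and 1: the default mode yields 3, while `enable_position_increments: false` yields 2.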
Jim Ferenczi 525101b64d Query string default field (#24214)
Currently any `query_string` query that uses a wildcard field with no matching field is rewritten with the `_all` field.

For instance:
````
#creating test doc
PUT testing/t/1
{
  "test": {
    "field_one": "hello",
    "field_two": "world"
  }
}
#searching abc.* (does not exist) -> hit
GET testing/t/_search
{
  "query": {
    "query_string": {
      "fields": [
        "abc.*"
      ],
      "query": "hello"
    }
  }
}
````

This bug first appeared in 5.0 after the query refactoring and impacts only users that use `_all` as default field.
Indices created in 6.x will not have this problem since `_all` is deactivated in this version.

This change fixes this bug by returning a MatchNoDocsQuery for any term that expands to an empty list of fields.
2017-04-20 22:12:20 +02:00
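The fix in 525101b64d can be pictured as follows: expand the field pattern against the mapped fields, and if nothing matches, produce a query that matches no documents instead of falling back to `_all`. The sketch is hypothetical throughout (`FieldExpansion`, `expand`, `rewrite`) and represents queries as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class FieldExpansion {
    /** Expands a simple `*` wildcard pattern against the known field names. */
    static List<String> expand(String pattern, List<String> mappedFields) {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            // `*` matches any run of characters; everything else is literal.
            regex.append(c == '*' ? ".*" : Pattern.quote(String.valueOf(c)));
        }
        Pattern compiled = Pattern.compile(regex.toString());
        List<String> matches = new ArrayList<>();
        for (String field : mappedFields) {
            if (compiled.matcher(field).matches()) {
                matches.add(field);
            }
        }
        return matches;
    }

    /** Returns a textual stand-in for the rewritten query. */
    static String rewrite(String pattern, List<String> mappedFields, String text) {
        List<String> fields = expand(pattern, mappedFields);
        if (fields.isEmpty()) {
            return "MatchNoDocsQuery"; // no fallback to a default field
        }
        StringBuilder query = new StringBuilder();
        for (String field : fields) {
            if (query.length() > 0) {
                query.append(" | ");
            }
            query.append(field).append(':').append(text);
        }
        return query.toString();
    }
}
```

With the fields from the example above (`test.field_one`, `test.field_two`), the pattern `abc.*` expands to nothing and so no longer produces a hit.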
Luca Cavanna 82c678b5c7 Make Aggregations an abstract class rather than an interface (#24184)
Some of the base methods that don't have to do with the reduce phase and serialization can be moved to the base class, which is no longer an interface. This will be reusable by the high level REST client further down the road. It also simplifies things, as having an interface with a single implementor is not that helpful.
2017-04-20 21:31:34 +02:00
Areek Zillur 077a6c3ee7 [TEST] ensure expected sequence no and version are set when index/delete engine operation has a document failure 2017-04-20 13:38:52 -04:00
Yannick Welsch 22e0795990 Extract batch executor out of cluster service (#24102)
Refactoring that extracts the task batching functionality from ClusterService and makes it a reusable component that can be tested in isolation.
2017-04-20 17:28:43 +02:00
Tanguy Leroux 55a879ee8d Align behavior of HDR percentiles iterator with percentile() method (#24206) 2017-04-20 12:37:33 +02:00
Nik Everett caf376c8af Start building analysis-common module (#23614)
Start moving built in analysis components into the new analysis-common
module. The goal of this project is:
1. Remove core's dependency on lucene-analyzers-common.jar which should
shrink the dependencies for transport client and high level rest client.
2. Prove that analysis plugins can do all the "built in" things by moving all
"built in" behavior to a plugin.
3. Force tests not to depend on any oddball analyzer behavior. If tests
need anything more than the standard analyzer they can use the mock
analyzer provided by Lucene's test infrastructure.
2017-04-19 18:51:34 -04:00
Jason Tedor 4796557a30 Add primary term to doc write response
This commit adds the primary term to the doc write response.

Relates #24171
2017-04-19 14:44:22 -04:00
Ryan Ernst c7e9231a86 Plugins: Remove leniency for missing plugins dir (#24173)
This leniency was left in after plugin installer refactoring for 2.0
because some tests still relied on it. However, the need for this
leniency no longer exists.
2017-04-19 09:09:34 -07:00
Christoph Büscher a9657a5a09 Add BucketMetricValue interface (#24188)
Unlike other implementations of InternalNumericMetricsAggregation.SingleValue,
the InternalBucketMetricValue aggregation currently doesn't implement a
specialized interface that exposes the `keys()` method. This change adds this so
that clients can access the keys via the interface.
2017-04-19 16:27:33 +02:00
Jim Ferenczi f05af0a382 Enable index-time sorting (#24055)
This change adds an index setting to define how the documents should be sorted inside each Segment.
It allows any numeric, date, boolean or keyword field inside a mapping to be used to sort the index on disk.
It is not allowed to use `nested` fields inside an index that defines an index sort, since `nested` fields rely on the original sort of the index.
This change does not add early termination capabilities in the search layer. This will be added in a follow-up.

Relates #6720
2017-04-19 14:36:11 +02:00
Boaz Leskes 8758c541b3 ElectMasterService.hasEnoughMasterNodes should return false if no masters were found
This is a regression introduced in #20063
2017-04-19 09:52:06 +02:00
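The corrected check from 8758c541b3 might be sketched like this. It is an assumption-laden illustration: `ElectMasterSketch` is a hypothetical class, and treating a `minimumMasterNodes` value below 1 as "not configured" is an assumption of the sketch.

```java
class ElectMasterSketch {
    /**
     * @param masterEligibleFound number of master-eligible nodes discovered
     * @param minimumMasterNodes  configured quorum size; values below 1 are
     *                            treated as "not configured" in this sketch
     */
    static boolean hasEnoughMasterNodes(int masterEligibleFound, int minimumMasterNodes) {
        // Zero candidates can never be enough, even when no minimum is configured.
        return masterEligibleFound > 0
            && (minimumMasterNodes < 1 || masterEligibleFound >= minimumMasterNodes);
    }
}
```

The leading `masterEligibleFound > 0` guard is the part that corresponds to the regression fix: without it, an unconfigured minimum would report "enough" with no masters at all.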
Tanguy Leroux 741c031384 [Test] Add unit tests for InternalHDRPercentilesTests (#24157)
Related to #22278
2017-04-19 09:37:01 +02:00
Areek Zillur 4f773e2dbb Replicate write failures (#23314)
* Replicate write failures

Currently, when a primary write operation fails after generating
a sequence number, the failure is not communicated to the replicas.
Ideally, every operation which generates a sequence number on primary
should be recorded in all replicas.

In this change, a sequence number is associated with write operation
failure. When a failure with an assigned sequence number arrives at a
replica, the failure cause and sequence number are recorded in the translog
and the sequence number is marked as completed via executing `Engine.noOp`
on the replica engine.

* use zlong to serialize seq_no

* Incorporate feedback

* track write failures in translog as a noop in primary

* Add tests for replicating write failures.

Test that document failure (w/ seq no generated) are recorded
as no-op in the translog for primary and replica shards

* Update to master

* update shouldExecuteOnReplica comment

* rename indexshard noop to markSeqNoAsNoOp

* remove redundant conditional

* Consolidate possible replica action for bulk item request
depending on its primary execution

* remove bulk shard result abstraction

* fix failure handling logic for bwc

* add more tests

* minor fix

* cleanup

* incorporate feedback

* incorporate feedback

* add assert to remove handling noop primary response when 5.0 nodes are not supported
2017-04-19 01:23:54 -04:00
Jason Tedor 9e0ebc5965 Rename variable in translog simple commit test
This commit renames a variable for clarity in the translog simple commit
test.
2017-04-18 23:43:25 -04:00
Jason Tedor 20181dd0ad Strengthen translog commit with open view test
This commit strengthens an assertion in the translog commit with open
view test.
2017-04-18 23:41:55 -04:00
Jason Tedor 180d1f2219 Stronger check in translog prepare and commit test
This commit strengthens an assertion in the translog prepare commit and
commit test.
2017-04-18 23:37:54 -04:00
Jason Tedor 23b224a5a9 Fix translog prepare commit and commit test
This test was terribly, horribly, no goodly, and badly broken; it's
amazing it ever passed, so this commit fixes it.
2017-04-18 23:32:47 -04:00
Boaz Leskes edff30f82a Engine: store maxUnsafeAutoIdTimestamp in commit (#24149)
The `maxUnsafeAutoIdTimestamp` timestamp is a safety marker guaranteeing that no retried indexing operation with a higher auto-generated ID timestamp was processed by the engine. This allows us to safely process documents without checking if they were seen before.

Currently this property is maintained in memory and is handed off from the primary to any replica during the recovery process.

This commit takes a more natural approach and stores it in the lucene commit, using the same semantics (no retry op with a higher time stamp is part of this commit). This means that the knowledge is transferred during the file copy and also means that we don't need to worry about crazy situations where an original append only request arrives at the engine after a retry was processed *and* the engine was restarted.
2017-04-18 20:11:32 +02:00
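The safety marker described in edff30f82a can be modelled as a single monotonically increasing timestamp that is written into, and read back from, the commit's user data. The sketch below is a simplified model under stated assumptions: `AutoIdTimestampSketch`, `canAppendBlindly`, and the `COMMIT_KEY` name are hypothetical, and the commit user data is a plain map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

class AutoIdTimestampSketch {
    // Hypothetical key under which the marker is stored in the commit user data.
    static final String COMMIT_KEY = "max_unsafe_auto_id_timestamp";

    private final AtomicLong maxUnsafe;

    /** Restores the marker from the last commit, or -1 if none was stored. */
    AutoIdTimestampSketch(Map<String, String> commitUserData) {
        String stored = commitUserData.get(COMMIT_KEY);
        this.maxUnsafe = new AtomicLong(stored == null ? -1L : Long.parseLong(stored));
    }

    /** Returns true if the operation can be appended without a duplicate check. */
    boolean canAppendBlindly(long autoIdTimestamp, boolean isRetry) {
        if (isRetry) {
            // A retry means an operation with this timestamp (or an older one)
            // may already be present, so raise the marker and check explicitly.
            maxUnsafe.accumulateAndGet(autoIdTimestamp, Math::max);
            return false;
        }
        return autoIdTimestamp > maxUnsafe.get();
    }

    /** User data to store alongside the next commit. */
    Map<String, String> commitUserData() {
        Map<String, String> data = new HashMap<>();
        data.put(COMMIT_KEY, Long.toString(maxUnsafe.get()));
        return data;
    }
}
```

Because the marker travels with the commit, a restarted engine or a replica that copied the files starts with the same marker, which is the property the commit message relies on.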
Simon Willnauer ab9884b2e9 Remove leniency when merging fetched hits in a search response phase (#24158)
Today when we merge hits we have a hard check to prevent AIOOB exceptions
that simply skips an expected search hit. This can only happen if there is a
bug in the code which should be turned into a hard exception or an assertion
triggered. This change adds an assertion and removes the lenient check for the
fetched hits.
2017-04-18 17:19:57 +02:00
Tanguy Leroux 829dd068d6 [Test] Use appropriate DocValueFormats in Aggregations tests (#24155)
Some aggregations (like Min, Max etc) use a wrong DocValueFormat in
tests (like IP or GeoHash). We should not test aggregations that expect
a numeric value with a DocValueFormat like IP. Such wrong DocValueFormat
can also prevent the aggregation to be rendered as ToXContent, and this
will be an issue for the High Level Rest Client tests which expect to be
able to parse back aggregations.
2017-04-18 17:03:32 +02:00
Christoph Büscher 8f540346a9 Tests: Fixing typo in class name of InternalGlobalTests
Renaming from InternalGlogbalTests -> InternalGlobalTests
2017-04-18 16:27:15 +02:00
Adrien Grand 4632661bc7 Upgrade to a Lucene 7 snapshot (#24089)
We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now a function of the number of docs that have a value rather than the total number of docs in the index.

Some notes about the change:
 - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors
 - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed
 - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge
2017-04-18 15:17:21 +02:00
Tanguy Leroux f217eb8ad8 Merge Percentile class with interface (#24154)
This commit merges the Percentile interface with the InternalPercentile
class, as we don't need to maintain both.
2017-04-18 14:47:18 +02:00
Martijn van Groningen edada2581e [TEST] Added unittests for InternalSampler 2017-04-18 14:31:58 +02:00
Yannick Welsch 0b2cb68f6f [TEST] Randomly add and remove no_master blocks in IndicesClusterStateServiceRandomUpdatesTests
Checks that IndicesClusterStateService stays consistent with incoming cluster states that contain no_master blocks (especially
discovery.zen.no_master_block=all which disables state persistence). In particular this checks that active shards which have no in-memory data
structures on a node are failed.
2017-04-18 14:27:54 +02:00
Martijn van Groningen ac41fb2c4a [TEST] Added test for GeoCentroidAggregator and
made constructors of GeoCentroidAggregator, GeoCentroidAggregatorFactory and InternalGeoCentroid package protected.
2017-04-18 13:54:31 +02:00
Tanguy Leroux 81dbdb239f [Test] Add unit tests for InternalTDigestPercentilesTests (#24090) 2017-04-18 09:48:35 +02:00
Chris Earle 12c8423ec9 Warn on not enough masters during election (#20063)
This changes the trace level logging to warn, and adds the needed number to the message as well.

My fear is that it may get noisy, but this is an issue that you want to be noisy.
2017-04-17 22:18:28 -04:00
Jason Tedor 34eda1a1a8 Do not set path.data in environment if not set
When preparing the final settings in the environment, we unconditionally
set path.data even if path.data was not explicitly set. This confounds
detection for whether or not path.data was explicitly set, and this is
trappy. This commit adds logic to only set path.data in the final
settings if path.data was explicitly set, and provides a test case that
fails without this logic.

Relates #24132
2017-04-17 10:43:13 -04:00
Jason Tedor f7ebe9d18f Preserve multiple translog generations
Today when a flush is performed, the translog is committed and if there
are no outstanding views, only the current translog generation is
preserved. Yet for the purpose of sequence numbers, we need stronger
guarantees than this. This commit migrates the preservation of translog
generations to keep the minimum generation that would be needed to
recover after the local checkpoint.

Relates #24015
2017-04-17 08:51:54 -04:00
Jason Tedor 8033c576b7 Detect remnants of path.data/default.path.data bug
In Elasticsearch 5.3.0 a bug was introduced in the merging of default
settings when the target setting existed as an array. When this bug
concerns path.data and default.path.data, we ended up in a situation
where the paths specified in both settings would be used to write index
data. Since our packaging sets default.path.data, users that configure
multiple data paths via an array and use the packaging are subject to
having shards land in paths in default.path.data when that is very
likely not what they intended.

This commit is an attempt to rectify this situation. If path.data and
default.path.data are configured, we check for the presence of indices
there. If we find any, we log messages explaining the situation and fail
the node.

Relates #24099
2017-04-17 07:03:46 -04:00
jaymode a8be0a5836 Cat APIs should not close the stream obtained from the channel
The cat APIs and rest tables would obtain a stream from the RestChannel, which happened to be a
ReleasableBytesStreamOutput. These APIs used the stream to write content to, closed the stream,
and then tried to send a response. After #23941 was merged, closing the stream meant that the bytes
were released for use elsewhere. This caused occasional corruption of the response when the bytes
were used prior to the response being sent.

This commit changes these two usages to wrap the stream obtained from the channel in a flush on
close stream so that the bytes are still reserved until the message is sent.
2017-04-15 14:57:00 -04:00
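A flush-on-close wrapper of the kind described in a8be0a5836 can be sketched with the JDK's `FilterOutputStream`. This is an illustration of the pattern, not the class Elasticsearch uses; `FlushOnCloseStream` is a name chosen for the example.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

class FlushOnCloseStream extends FilterOutputStream {
    FlushOnCloseStream(OutputStream delegate) {
        super(delegate);
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        // Forward in bulk; FilterOutputStream's default writes byte by byte.
        out.write(b, off, len);
    }

    @Override
    public void close() throws IOException {
        // Deliberately do not close the delegate: its bytes must stay
        // reserved until the response has actually been sent.
        flush();
    }
}
```

Code that writes a table and then closes its stream (for example in a try-with-resources block) keeps working, while ownership of the underlying buffer stays with whoever sends the response.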