8417 Commits

Author SHA1 Message Date
Boaz Leskes d20cd6afcb ESIndexLevelReplicationTestCase.ReplicationAction#execute should send exceptions to its listener rather than bubble them up
This is how TRA works as well.
2017-06-22 23:37:08 +02:00
Boaz Leskes fb8c767737 testRecoveryAfterPrimaryPromotion shouldn't flush the replica with extra operations
We don't yet have lucene rollbacks, so we can't bake those in
2017-06-22 23:24:43 +02:00
Simon Willnauer 59b625121b Ensure `InternalEngineTests.testConcurrentWritesAndCommits` doesn't pile up commits (#25367)
`InternalEngineTests.testConcurrentWritesAndCommits` can be very heavy on disks
if threads are slow and the main thread keeps on pulling commit points holding on
to many many segments. This commit adds some quadratic backoff to not pile up too many
commits and to make sure indexing threads can make progress. This also now doesn't do
busy waiting but waits on a latch with a timeout.

Closes #25110
2017-06-22 21:50:11 +02:00
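The back-off described in 59b625121b can be sketched roughly as follows. This is a minimal illustration, not the test's actual code: the class name, the `backoffMillis` helper, and the 10 ms base delay are all assumptions made for the example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

class QuadraticBackoffSketch {

    // Hypothetical helper: the wait grows with the square of the attempt number.
    static long backoffMillis(int attempt, long baseMillis) {
        return baseMillis * attempt * attempt;
    }

    // Blocks on the latch with a growing timeout instead of busy waiting.
    // Returns true once the latch reaches zero, false if every attempt times
    // out or the waiting thread is interrupted.
    static boolean awaitWithBackoff(CountDownLatch done, int maxAttempts, long baseMillis) {
        try {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                if (done.await(backoffMillis(attempt, baseMillis), TimeUnit.MILLISECONDS)) {
                    return true;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        }
        return false;
    }

    public static void main(String[] args) {
        CountDownLatch done = new CountDownLatch(1);
        new Thread(done::countDown).start(); // stands in for an indexing thread finishing
        System.out.println(awaitWithBackoff(done, 5, 10));
    }
}
```

Waiting on the latch lets the main thread wake up as soon as the workers finish, while the growing timeout spaces out whatever it does between waits.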
Simon Willnauer a077fa9b07 [TEST] Add debug logging if an unexpected exception is thrown 2017-06-22 21:19:39 +02:00
Igor Motov e6e5ae6202 TemplateUpgraders should be called during rolling restart (#25263)
In #24379 we added the ability to upgrade templates on full cluster startup. This PR invokes the same update procedure when a new node first joins the cluster, so that templates are also updated during a rolling cluster restart.

Closes #24680
2017-06-22 14:55:28 -04:00
Jason Tedor 8dcb1f5c7c Initialize max unsafe auto ID timestamp on shrink
When shrinking an index we initialize its max unsafe auto ID timestamp
to the maximum of the max unsafe auto ID timestamps on the source
shards.

Relates #25356
2017-06-22 11:14:25 -04:00
Boaz Leskes d963882053 Enable a long translog retention policy by default (#25294)
#25147 added the translog deletion policy but didn't enable it by default. This PR enables a default retention of 512MB (the same as the maximum size of the current translog) and an age of 12 hours (i.e., after 12 hours all translog files will be deleted). This increases the chance of an ops-based recovery, even if the primary flushed or the replica was offline for a few hours.

In order to see which parts of the translog are committed into lucene the translog stats are extended to include information about uncommitted operations.

Views now include all translog ops and guarantee, as before, that those will not go away. Snapshotting a view allows filtering out generations that are not relevant based on a specific sequence number.

Relates to #10708
2017-06-22 17:08:14 +02:00
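A size-and-age retention rule like the one enabled in d963882053 could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual TranslogDeletionPolicy: the class and method names are hypothetical, files are modelled as parallel arrays ordered oldest first, and the treatment of a file that crosses the size limit (it is dropped) is a guess. It also ignores the files required for recovery from Lucene, which are kept regardless.

```java
class TranslogRetentionSketch {

    // sizes[i] is the size in bytes and ages[i] the age in milliseconds of the
    // i-th translog file, ordered oldest first. Returns the index of the oldest
    // file to retain, or sizes.length if nothing qualifies for extra retention.
    static int firstRetained(long[] sizes, long[] ages, long maxSizeBytes, long maxAgeMillis) {
        long total = 0;
        int first = sizes.length;
        // Walk from the newest file backwards, keeping files while both limits hold.
        for (int i = sizes.length - 1; i >= 0; i--) {
            total += sizes[i];
            if (total > maxSizeBytes || ages[i] > maxAgeMillis) {
                break;
            }
            first = i;
        }
        return first;
    }

    public static void main(String[] args) {
        long[] sizes = {100, 200, 300};
        long[] ages = {3000, 2000, 1000};
        System.out.println(firstRetained(sizes, ages, 500, 2500)); // prints 1
    }
}
```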
Simon Willnauer 29e80eea40 Remove `index.mapping.single_type=false` from core/tests (#25331)
This change cleans up core tests to not use `index.mapping.single_type=false`
but instead where applicable use a single type or mark the index as created
with a pre 6.x version.

Relates to #24961
2017-06-22 16:48:16 +02:00
Jason Tedor 97a2c4523d Get short path name for native controllers
Due to limitations with CreateProcessW on Windows (ultimately used by
ProcessBuilder) with respect to maximum path lengths, we need to get the
short path name for any native controllers before trying to start them
in case the absolute path exceeds the maximum path length. This commit
uses JNA to invoke the necessary Windows API for this to start the
native controller using the short path.

To be precise about the limitation here, the MSDN docs for
CreateProcessW say for the command line parameter:

>The command line to be executed. The maximum length of this string is
>32,768 characters, including the Unicode terminating null character. If
>lpApplicationName is NULL, the module name portion of lpCommandLine is
>limited to MAX_PATH characters.

This is exactly how the Windows implementation of Process in the JDK
invokes CreateProcessW: with the executable name (lpApplicationName) set
to NULL.

Relates #25344
2017-06-22 07:59:58 -04:00
Yannick Welsch e41eae9f05 Live primary-replica resync (no rollback) (#24841)
Adds a replication task that streams all operations from the primary's global checkpoint to all replicas.
2017-06-22 13:35:34 +02:00
Adrien Grand 44e9c0b947 Upgrade to lucene-7.0.0-snapshot-ad2cb77. (#25349)
Most notable changes:
 - better update concurrency: LUCENE-7868
 - TopDocs.totalHits is now a long: LUCENE-7872
 - QueryBuilder does not remove the boolean query around multi-term synonyms:
   LUCENE-7878
 - removal of Fields: LUCENE-7500

For the `TopDocs.totalHits` change, this PR relies on the fact that the encodings
of vInts and vLongs are compatible: you can write and read with any of them as
long as the value can be represented by a positive int.
2017-06-22 12:35:33 +02:00
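The compatibility that 44e9c0b947 relies on can be illustrated with a self-contained sketch of a 7-bits-per-byte variable-length encoding. The method names mirror the commit's terminology, but the implementation is written for this example and is not Elasticsearch's or Lucene's stream code.

```java
import java.io.ByteArrayOutputStream;

class VarIntCompatSketch {

    // Writes an int, 7 bits per byte, low-order bits first; the high bit of
    // each byte marks that more bytes follow.
    static byte[] writeVInt(int value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
        return out.toByteArray();
    }

    // Same scheme for a long.
    static byte[] writeVLong(long value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        out.write((int) value);
        return out.toByteArray();
    }

    // Decodes bytes produced by either writer into a long.
    static long readVLong(byte[] bytes) {
        long result = 0;
        int shift = 0;
        for (byte b : bytes) {
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return result;
    }

    public static void main(String[] args) {
        int totalHits = 1_000_000;
        // Written as an int, read back as a long.
        System.out.println(readVLong(writeVInt(totalHits)) == totalHits); // prints true
    }
}
```

In this sketch the two writers produce identical bytes for any non-negative int. A negative int would not round-trip (5 bytes as an int, 10 as a long), which matches the commit's restriction to values representable as a positive int.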
Jason Tedor cc67d027de Initialize sequence numbers on a shrunken index
Bringing together shards in a shrunken index means that we need to
address the start of history for the shrunken index. The problem here is
that sequence numbers before the maximum of the maximum sequence numbers
on the source shards can collide in the target shards in the shrunken
index. To address this, we set the maximum sequence number and the local
checkpoint on the target shards to this maximum of the maximum sequence
numbers. This enables correct document-level semantics for documents
indexed before the shrink, and history on the shrunken index will
effectively start from here.

Relates #25321
2017-06-21 13:40:45 -04:00
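The rule in cc67d027de reduces to taking a maximum over the source shards. A minimal sketch, assuming each source shard's maximum sequence number is available as a `long` (the class and method names are hypothetical):

```java
import java.util.Arrays;

class ShrinkHistorySketch {

    // Returns the value that, per the commit message, becomes both the maximum
    // sequence number and the local checkpoint on each target shard.
    static long startOfHistory(long[] sourceMaxSeqNos) {
        return Arrays.stream(sourceMaxSeqNos)
                .max()
                .orElseThrow(() -> new IllegalArgumentException("no source shards"));
    }

    public static void main(String[] args) {
        System.out.println(startOfHistory(new long[] {41, 97, 63})); // prints 97
    }
}
```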
Nik Everett 4bbb7e828b Port most snapshot/restore static bwc tests to qa:full-cluster-restart (#25296)
Ports all of RepositoryUpgradabilityIT to qa:full-cluster-restart and ports as much of RestoreBackwardsCompatIT as possible into qa:full-cluster-restart.
2017-06-21 13:26:03 -04:00
Nik Everett bec1a49a54 Javadoc: ThreadPool doesn't reject while shutdown (#23678)
It caught me off guard yesterday that our executors won't always
reject when the ThreadPool is shut down.
2017-06-21 12:21:48 -04:00
Tanguy Leroux 49ebd65548 Add backward compatibility indices for 5.4.2 2017-06-21 10:42:26 +02:00
Tanguy Leroux 8274cd67ab Add version v5.4.2 after release 2017-06-21 10:23:32 +02:00
Alexander Reelsen 68423989da IndexMetaData: Add internal format index setting (#25292)
This setting is supposed to ease index upgrades as it allows you
to check for a new setting called `index.internal.version` which
can be used to check before upgrading indices.
2017-06-21 09:30:46 +02:00
Simon Willnauer 86a544de3b Ensure we never read from a closed MockSecureSettings object (#25322)
If secure settings are closed after the node has been constructed
no key-store access is permitted. We should also try to be as close as possible
to the real behavior if we mock secure settings. This change also adds
the same behavior as bootstrap has to InternalTestCluster to ensure we fail
if we try to read from secure settings after the node has been constructed.
2017-06-21 08:14:38 +02:00
Simon Willnauer 406a15e7a9 Fix settings serialization to not serialize secure settings or not take the total size into account (#25323) 2017-06-21 08:13:56 +02:00
Jason Tedor 1f14d042f6 Initialize primary term for shrunk indices
Today when an index is shrunk, the primary terms for its shards start
from one. Yet, this is a problem as the index will already contain
assigned sequence numbers across primary terms. To ensure document-level
sequence number semantics, the primary terms of the target shards must
start from the maximum of all the shards in the source index. This
commit causes this to be the case.

Relates #25307
2017-06-20 15:12:39 -04:00
Guillaume Le Floch 93e29d290f Tests: Refactor NodeTests settings (#25309)
This pull request aims to use the method baseSettings already present in the class.
2017-06-20 15:17:52 +02:00
Jun Ohtani 62d1969595 Parse synonyms with the same analysis chain (#8049)
* [Analysis] Parse synonyms with the same analysis chain

Synonym Token Filter / Synonym Graph Filter tokenize synonyms with whatever tokenizer and token filters appear before it in the chain.

Close #7199
2017-06-20 21:50:33 +09:00
Nik Everett 3261586cac Tweak reindex cancel logic and add many debug logs (#25256)
I'm still trying to hunt down rare failures in the cancelation tests
for reindex and friends. Here is the latest:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=ubuntu/876/console

It doesn't show much, other than that one of the tasks didn't kill
itself when asked to cancel.

So I'm going a bit crazy with debug logging so that the next time this
comes up I can trace exactly what happened.

Additionally, this tweaks the logic around how rethrottles were
performed around cancel. Previously we set the `requestsPerSecond`
to `0` when we cancelled the task. That was the "old way" to set them
to inifity which was the intent. This switches that from `0` to
`Float.MAX_VALUE` which is the "new way" to set the `requestsPerSecond`
to infinity. I don't know that this is much better, but it feels better.
2017-06-19 18:46:42 -04:00
Jay Modi 0d6c47fe14 Keystore CLI should use the AddFileKeyStoreCommand for files (#25298)
This commit fixes a typo in the KeyStoreCli class. The add-file command was incorrectly set to use
the AddStringKeyStoreCommand instead of the AddFileKeyStoreCommand.
2017-06-19 12:43:26 -06:00
Yannick Welsch 1a20760d79 Simplify IndexShard indexing and deletion methods (#25249)
Indexing or deleting documents through the IndexShard interface is quite complex and error-prone. It requires multiple calls, e.g. first prepareIndexOnPrimary, then checking whether mapping updates have occurred, then doing the actual indexing using index(...), etc. Currently each consumer of the interface (local recovery, peer recovery, replication) has additional custom checks built around it to deal with mapping updates, some of which are even inconsistent. This commit aims to reduce the complexity by exposing a simpler interface on IndexShard. There are no more prepare*** methods and the mapping complexity is also hidden, while callers can still implement custom logic to deal with mapping updates.
2017-06-19 20:11:54 +02:00
David Kyle d1be2ecfdb Initialise empty lists in BaseTaskResponse constructor (#25290)
* Initialise empty lists in BaseTaskResponse constructor

* Remove little used default constructor which leaves uninitialised members
2017-06-19 16:37:21 +01:00
Luca Cavanna d9ec2a23c5 Remove (deprecated) support for '+' in index expressions (#25274)
Relates to #24515
2017-06-19 15:19:17 +02:00
Tanguy Leroux e4f4886d40 [Test] Extend parsing checks for DocWriteResponses (#25257)
This commit changes the parsing logic of DocWriteResponse, ReplicationResponse
and GetResult so that it skips any unknown additional fields (for forward compatibility
reasons). This affects the IndexResponse, UpdateResponse, DeleteResponse and
GetResponse objects.
2017-06-19 13:19:09 +02:00
Martijn van Groningen bcaa413b0b test: Port the remaining old indices search tests to full cluster restart qa module
Also tweaked the qa module's gradle file to actually run bwc tests against all index compat versions.

Relates to #24939
2017-06-19 12:27:24 +02:00
Simon Willnauer dc02b32650 Simplify connection closing and cleanups in TcpTransport (#25250)
Today we maintain a map of open connections in order to close them when
a low level channel gets closed or handles a failure. We also spawn a thread due to some
tricky concurrency issues especially with respect to netty since the listener might
be called on a transport / boss thread. Executions on those threads must not be blocking
since otherwise we will likely deadlock the event processing which adds to the
complexity of the concurrency model in this class.

This change associates the connection with the close callback that every channel invokes
once it's closed which allows us to remove the connections map. A relaxed non-blocking
concurrency model in the connection close listener allows cleaning up connected nodes without
blocking on any lock.
2017-06-19 09:19:45 +02:00
Boaz Leskes 7291aba8ae enable debug logging for testMasterFailoverDuringIndexingWithMappingChanges 2017-06-18 22:40:13 +02:00
Jason Tedor 4c28e781dd Fix failing delete index test
This test is failing because delete /{index} requests no longer support
index matching an alias. This commit removes testing such requests against
aliases.

Closes #25284
2017-06-18 15:32:43 -04:00
Christoph Büscher 3f9f713b44 Add AwaitsFix on IndicesRequestIT due to #25284 2017-06-18 18:56:41 +02:00
Christoph Büscher e99ced06cc [Tests] Check that parsing aggregations works in a forward compatible way (#25219)
This change adds tests for the aggregation parsing that try to simulate that we
can parse existing aggregations in a forward compatible way in the future,
ignoring potential newly added fields or substructures to the xContent response.
2017-06-17 13:06:31 +02:00
Ali Beyad 0c697348f4 Adds AwaitsFix on snapshot test failing due to #25281 2017-06-16 16:57:01 -04:00
Simon Willnauer f18b0d293c Move TransportStats accounting into TcpTransport (#25251)
Today TcpTransport is the de-facto base-class for transport implementations.
All the callbacks we have in TransportServiceAdaptor are no longer necessary
since we can simply have the logic inside the base class itself. This change
moves the stats metrics directly into TcpTransport, removing the need for low level
bytes sent / received callbacks.
2017-06-16 22:34:11 +02:00
Nik Everett ecc87f613f Move pre-configured "keyword" tokenizer to the analysis-common module (#24863)
Moves the keyword tokenizer to the analysis-common module. The keyword tokenizer is special because it is used by CustomNormalizerProvider so I pulled it out into its own PR. To get the move to work I've reworked the lookup from static to one using the AnalysisRegistry. This seems safe enough.

Part of #23658.
2017-06-16 11:48:15 -04:00
Luca Cavanna b5cea6980b Delete index API to work only against concrete indices (#25268)
With #23997 we introduced a new internal index option that allows resolving index expressions only against concrete indices while ignoring aliases. This index option was applied to IndicesAliasesRequest, so that the index part of alias actions would only be resolved against concrete indices.

The same is done in this commit for the delete index request. Deleting aliases has always been confusing, as some users expect it to only remove the alias from the index (which has its own specific API). Even worse, in the case of filtered aliases, deleting an alias may leave users with the expectation that only the documents that match the filter are deleted, which was never the case. To address all this confusion, the delete index API now works only against concrete indices. Wildcard expressions are resolved only against concrete indices, as if aliases didn't exist. If one tries to delete against an alias, an IndexNotFoundException is thrown regardless of whether the alias exists or not, as a concrete index with such a name doesn't exist.

Closes #2318
2017-06-16 17:46:01 +02:00
Boaz Leskes 9ddea539f5 Introduce translog size and age based retention policies (#25147)
This PR extends the TranslogDeletionPolicy to allow keeping the translog files longer than what is needed for recovery from lucene. Specifically, we allow specifying the total size of the files and their maximum age (i.e., keep up to 512MB but no longer than 12 hours). This will allow making ops based recoveries more common. 

Note that the default size and age are still set to 0, maintaining current behavior. This is needed as the other components in the system are not yet ready for a longer translog retention. I will adapt those in follow up PRs.

Relates to #10708
2017-06-16 09:09:51 +02:00
Ali Beyad 350125ed2a Improves snapshot logging and snapshot deletion error handling (#25264)
This commit does two things:
  1. Adds logging at the DEBUG level for when the index-N blob is
  updated.
  2. When attempting to delete a snapshot, if the snapshot was not found
  in the repository data, an exception is now thrown instead of silently
  ignoring the lack of presence of the snapshot in the repository data.
2017-06-15 19:43:19 -04:00
Christoph Büscher d3442f7d0c Add unit test for PathHierarchyTokenizerFactory (#24984) 2017-06-15 19:18:33 +02:00
Guillaume Le Floch a9014dfcc5 Deprecate tribe service
This commit deprecates the tribe service so that deprecation log
messages are delivered if a tribe node is configured.

Relates #24598
2017-06-15 12:41:05 -04:00
Martijn van Groningen 428e70758a Moved more token filters to analysis-common module.
The following token filters were moved: `edge_ngram`, `ngram`, `uppercase`, `lowercase`, `length`, `flatten_graph` and `unique`.

Relates to #23658
2017-06-15 18:28:31 +02:00
Jim Ferenczi 2a78b0a19f [Test] Make sure that SearchAfterSortedDocQueryTests uses a single threaded searcher 2017-06-15 18:13:38 +02:00
markharwood 7a3155368c Test fix - removed superfluous assertion (#25247)
Closes #25245
2017-06-15 16:29:25 +01:00
Martijn van Groningen fe02829aac test: Ported more OldIndexBackwardsCompatibilityIT tests to full cluster restart qa tests. (#25173)
Relates to #24939
2017-06-15 14:48:06 +02:00
Adrien Grand 1b90c46a53 Allow reader wrappers to have different live docs but the same cache key.
Relates to #19856
2017-06-15 13:51:46 +02:00
Boaz Leskes 648b4717a4 move assertBusy to use CheckException (#25246)
We use assertBusy in many places where the underlying code throws exceptions. Currently we need to wrap those exceptions in a RuntimeException which is ugly.
2017-06-15 13:24:07 +02:00
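The idea behind 648b4717a4 can be sketched as a retry helper whose callback is allowed to throw checked exceptions. The names below (`CheckedBlock`, this `assertBusy` signature) are assumptions for illustration and are not the test framework's actual API.

```java
import java.util.concurrent.TimeUnit;

class AssertBusySketch {

    // Hypothetical functional interface: like Runnable, but may throw checked exceptions.
    @FunctionalInterface
    interface CheckedBlock {
        void run() throws Exception;
    }

    // Re-runs the check until it stops throwing AssertionError or the time is up.
    // Checked exceptions thrown by the check propagate unchanged, so callers
    // no longer need to wrap them in a RuntimeException.
    static void assertBusy(CheckedBlock check, long maxWaitMillis) throws Exception {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        long sleepMillis = 1;
        while (true) {
            try {
                check.run();
                return;
            } catch (AssertionError e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // out of time: surface the last failure
                }
            }
            Thread.sleep(sleepMillis);
            sleepMillis = Math.min(sleepMillis * 2, 100);
        }
    }
}
```

With a helper of this shape, a call such as `assertBusy(() -> someMethodThatThrowsIOException(), 10_000)` compiles without a try/catch inside the lambda; the method name in that call is a placeholder.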
Tanguy Leroux 27f1206999 Use SPI in High Level Rest Client to load XContent parsers (#25098)
This commit adds a NamedXContentProvider interface that can 
be implemented by plugins or modules using Java's SPI feature 
in order to provide additional NamedXContent parsers to external
applications like the Java High Level Rest Client.
2017-06-15 12:50:02 +02:00
Adrien Grand 5a6fa62844 Speed up PK lookups at index time. (#19856)
At index time Elasticsearch needs to look up the version associated with the
`_id` of the document that is being indexed, which is often the bottleneck for
indexing.

While reviewing the output of the `jfr` telemetry from a Rally benchmark, I saw
that significant time was spent in `ConcurrentHashMap#get` and `ThreadLocal#get`.
The reason is that we cache lookup objects per thread and segment, and for every
indexed document, we first need to look up the cache associated with this
segment (`ConcurrentHashMap#get`) and then get a state that is local to the
current thread (`ThreadLocal#get`). So if you are indexing N documents per
second and have S segments, both these methods will be called N*S times per
second.

This commit changes version lookup to use a cache per index reader rather than
per segment. While this makes cache entries shorter-lived, we now only need
to do one call to `ConcurrentHashMap#get` and `ThreadLocal#get` per indexed
document.
2017-06-15 10:17:42 +02:00
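The N*S versus N arithmetic in 5a6fa62844 can be made concrete with a small counting sketch. It is not the actual version lookup code: the cache layout, the class and method names, and the use of plain `Object` keys are assumptions, and `computeIfAbsent` stands in for the `ConcurrentHashMap#get` mentioned in the commit.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

class VersionLookupCacheSketch {

    // Counts how often the shared map is consulted.
    static final AtomicLong mapGets = new AtomicLong();

    // One thread-local lookup state per cache key (a segment or a whole reader).
    static final ConcurrentHashMap<Object, ThreadLocal<Object>> CACHE = new ConcurrentHashMap<>();

    static Object lookupState(Object cacheKey) {
        mapGets.incrementAndGet();
        return CACHE.computeIfAbsent(cacheKey, k -> ThreadLocal.withInitial(Object::new)).get();
    }

    // Old scheme: every indexed document consults the cache once per segment.
    static long lookupsPerSegment(int docs, Object[] segmentKeys) {
        long before = mapGets.get();
        for (int d = 0; d < docs; d++) {
            for (Object segment : segmentKeys) {
                lookupState(segment);
            }
        }
        return mapGets.get() - before;
    }

    // New scheme: every indexed document consults the cache once per index reader.
    static long lookupsPerReader(int docs, Object readerKey) {
        long before = mapGets.get();
        for (int d = 0; d < docs; d++) {
            lookupState(readerKey);
        }
        return mapGets.get() - before;
    }

    public static void main(String[] args) {
        Object[] segments = {new Object(), new Object(), new Object(), new Object(), new Object()};
        System.out.println(lookupsPerSegment(100, segments));     // prints 500
        System.out.println(lookupsPerReader(100, new Object()));  // prints 100
    }
}
```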