1753 Commits

Author SHA1 Message Date
Yannick Welsch 47ada69c46
Zen2: Move most integration tests to Zen2 (#35678)
Zen2 is now feature-complete enough to run most ESIntegTestCase tests. The changes in this PR
are as follows:
- ClusterSettingsIT is adapted to not be Zen1 specific anymore (it was using Zen1 settings).
- Some of the integration tests require persistent storage of the cluster state, which is not fully
implemented yet (see #33958). These tests keep running with Zen1 for now but will be switched
over as soon as that is fully implemented.
- Some very few integration tests are not running yet with Zen2 for other reasons, depending on
some of the other open points in #32006.
2018-11-19 21:15:29 +01:00
Andrey Ershov f9ecd0c49e
[Zen2] Write manifest file (#35049)
An Elasticsearch node is responsible for storing cluster metadata.
There are 2 types of metadata: global metadata and index metadata. 
`GatewayMetaState` implements `ClusterStateApplier` and receives all 
`ClusterStateChanged` events and is responsible for storing modified 
metadata to disk. 

When a new `ClusterStateChanged` event is received, `GatewayMetaState`
checks whether global metadata has changed and, if so, writes the new
global metadata to disk. After that, `GatewayMetaState` checks whether index
metadata has changed or new indices have been assigned to this node and,
if so, writes the new index metadata to disk. Atomicity of global
metadata and index metadata writes is ensured by the `MetaDataStateFormat`
class.

Unfortunately, there is no atomicity when more than one piece of metadata changes
(global and index, or metadata for two indices), and atomicity is
important for Zen2 correctness.
This commit adds atomicity by introducing the notion of a manifest file,
represented by the `MetaState` class. `MetaState` contains pointers to the
current metadata.
More precisely, it stores the global state generation as a long and a map from
`Index` to index metadata generation as a long. Atomicity of writes for the
manifest file is ensured by the `MetaStateFormat` class.

The algorithm for writing changes to disk is as follows:

1. Write the global metadata state file to disk and remember
its generation.
2. For each new or changed index, write its state file to disk and remember
its generation. For each unchanged index, use the generation from the
previous manifest file. If an index is removed or this node is no longer
responsible for it, forget about the index.
3. Create a `MetaState` object using the previously remembered generations and
write it to disk.
4. Remove old state files for global metadata, index metadata and
the manifest.

Additionally, the new implementation relies on enhanced `MetaDataStateFormat`
failure semantics: `applyClusterState` throws IOException, whose
descendant `WriteStateException` could be (and should be in Zen2)
explicitly handled.
2018-11-19 19:49:44 +01:00
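The four manifest-writing steps described in #35049 can be sketched roughly as follows. This is a minimal Python illustration, not the Elasticsearch Java code: the file naming scheme, the JSON encoding, and the names `write_atomically` and `write_cluster_state` are all assumptions made for the example, and it assumes every newly assigned index appears in `changed`.

```python
import json
import os
import tempfile
from pathlib import Path


def write_atomically(path: Path, payload: dict) -> None:
    """Write JSON to a temp file in the same directory, fsync, then rename."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
    with os.fdopen(fd, "w") as handle:
        json.dump(payload, handle)
        handle.flush()
        os.fsync(handle.fileno())
    os.replace(tmp_name, path)  # the rename makes the new file visible in one step


def write_cluster_state(root: Path, gen: int, global_meta: dict,
                        changed: dict, assigned: set, previous: dict) -> dict:
    # Step 1: write global metadata and remember its generation.
    write_atomically(root / f"global-{gen}.json", global_meta)

    # Step 2: new/changed indices get a new file; unchanged ones reuse the
    # generation from the previous manifest; unassigned ones are forgotten.
    index_gens = {}
    for name in assigned:
        if name in changed:
            write_atomically(root / f"index-{name}-{gen}.json", changed[name])
            index_gens[name] = gen
        else:
            index_gens[name] = previous["indices"][name]

    # Step 3: the manifest points at the generations remembered above.
    manifest = {"global": gen, "indices": index_gens}
    write_atomically(root / f"manifest-{gen}.json", manifest)

    # Step 4: remove every state file the new manifest no longer references.
    keep = {f"global-{gen}.json", f"manifest-{gen}.json"}
    keep |= {f"index-{n}-{g}.json" for n, g in index_gens.items()}
    for path in list(root.iterdir()):
        if path.name not in keep:
            path.unlink()
    return manifest
```

Because old files are only removed after the new manifest is on disk, a reader that follows the latest complete manifest sees either the old set of generations or the new one, never a mixture.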
David Turner 86ef041539
[Zen2] Introduce ClusterBootstrapService (#35488)
Today, the bootstrapping of a Zen2 cluster is driven externally, requiring
something else to wait for discovery to converge and then to inject the initial
configuration. This is hard to use in some situations, such as REST tests.

This change introduces the `ClusterBootstrapService` which brings the bootstrap
retry logic within each node and allows it to be controlled via an (unsafe)
node setting.
2018-11-15 20:09:22 +00:00
David Turner 135c3f0f07 Merge branch 'master' into zen2 2018-11-15 08:24:26 +00:00
Hendrik Muhs fc774a3776
add ES 6.5.1 (on master) (#35549)
add ES 6.5.1
2018-11-14 21:08:21 +01:00
Tanguy Leroux c8c8ce2374
Extract RunOnce into a dedicated class (#35489)
This commit extracts the static inner class RunOnce from
WorkerBulkByScrollTaskState so that it can be reused
elsewhere.
2018-11-14 17:33:04 +01:00
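The idea behind a run-once wrapper such as the one extracted in #35489 can be sketched in Python as below. This is an illustration of the pattern only; the method and attribute names are assumptions and do not reproduce the Java class.

```python
import threading


class RunOnce:
    """Wraps a callable so that it executes at most once, even across threads."""

    def __init__(self, delegate):
        self._delegate = delegate
        self._lock = threading.Lock()
        self._has_run = False

    def run(self):
        # Flip the flag under the lock so only one caller wins the race.
        with self._lock:
            if self._has_run:
                return
            self._has_run = True
        self._delegate()

    @property
    def has_run(self) -> bool:
        return self._has_run
```

Calling `run()` a second time is a no-op, which makes the wrapper convenient for cleanup callbacks that may be triggered from several code paths.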
Andrey Ershov 045fdd0d3b Merge master into zen2 2018-11-14 15:37:13 +03:00
Tanguy Leroux bbe50e7a86
Remove LoggingRunnable class (#35486)
This commit removes the unused LoggingRunnable class.
2018-11-14 10:12:25 +01:00
Tanguy Leroux 31567cefb4
[RCI] Check blocks while having index shard permit in TransportReplicationAction (#35332)
Today, the TransportReplicationAction checks the global level blocks and 
the index level blocks before routing the operation to the primary, in the 
ReroutePhase, and it happens at the very beginning of the transport 
replication action execution. For the upcoming rework of the Close Index 
API and in order to deal with primary relocation, we'll need to also check 
for blocks before executing the operation on the primary (while holding a 
permit) but before routing to the new primary.

This pull request changes the AsyncPrimaryAction so that it checks the
replication action's blocks before executing the operation locally or before
routing the primary action to the new primary shard. The check is done
while holding a PrimaryShardReference.

Related to #33888
2018-11-14 09:43:55 +01:00
Hendrik Muhs 5c84708ee5 test: expose error message on failure 2018-11-14 08:25:41 +01:00
Christoph Büscher d8b1c23e1d
Remove Comparable interface from ScoreAccessor (#35519)
The way ScoreAccessor implements `compareTo()` is problematic because it doesn't
completely follow the Comparable contract, specifically symmetry (if x is a
ScoreAccessor and y any Number then x.compareTo(y) works, but y.compareTo(x)
generally does not even compile). Fortunately we don't seem to use the fact that
ScoreAccessor is a Comparable anywhere, so we can simply remove it.
2018-11-14 05:58:05 +01:00
David Turner 229637fd7e
[Zen2] Remove duplicate discovered peers (#35505)
Today the `PeerFinder` probes each address it obtains, identifies the node to
which it just connected, and then returns all such nodes. However, this can
lead to duplicates if a node manages to connect to another node via two
distinct addresses.  This causes bootstrapping to fail since
`BootstrapConfiguration#resolve` forbids duplicates.

This change alters the behaviour of the `PeerFinder` to remove duplicates in
this situation.
2018-11-13 22:30:36 +00:00
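The duplicate removal described in #35505 amounts to keying discovered nodes by identity rather than by the address that was probed. A minimal Python sketch, where `DiscoveredNode` and `distinct_peers` are hypothetical names used only for illustration:

```python
from typing import Iterable, List, NamedTuple


class DiscoveredNode(NamedTuple):
    node_id: str   # identity reported by the node after connecting
    address: str   # address that was probed to reach it


def distinct_peers(probed: Iterable[DiscoveredNode]) -> List[DiscoveredNode]:
    """Keep the first result per node identity, preserving discovery order."""
    by_id = {}
    for node in probed:
        by_id.setdefault(node.node_id, node)
    return list(by_id.values())
```

With this, a node reachable at two distinct addresses contributes a single entry, so a later step that forbids duplicates no longer fails.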
Vladimir Dolzhenko 9728119b82 [CI] AllocationIdIT testFailedRecoveryOnAllocateStalePrimaryRequiresAnotherAllocateStalePrimary failure
Closes #35504
2018-11-13 20:57:50 +01:00
David Turner 8e40a2bbe2
[Zen2] Introduce vote withdrawal (#35446)
When shutting down half or more of the master-eligible nodes, their votes must
first be explicitly withdrawn to ensure that the cluster doesn't lose its
quorum. This works via _voting tombstones_, stored in the cluster state, which
tell the reconfigurator to remove nodes from the voting configuration.

This change introduces voting tombstones to the cluster state, together with
transport APIs for adding and removing them, and makes use of these APIs in
`InternalTestCluster` to support tests which remove at least half of the
master-eligible nodes at once (e.g. shrinking from two master-eligible nodes to
one).
2018-11-13 19:32:32 +00:00
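The reason votes must be withdrawn first (#35446) is plain majority arithmetic. The sketch below assumes a strict-majority quorum rule over the voting configuration; `has_quorum` is a hypothetical helper, not an Elasticsearch API.

```python
def has_quorum(voting_config: set, live_nodes: set) -> bool:
    """True if strictly more than half of the voting nodes are alive."""
    return len(voting_config & live_nodes) * 2 > len(voting_config)


# Two master-eligible nodes, both voting.
config = {"a", "b"}

# Stopping "b" without withdrawing its vote leaves 1 of 2 votes: no quorum.
stopped_abruptly = has_quorum(config, live_nodes={"a"})

# Withdrawing b's vote first shrinks the configuration to {"a"};
# stopping "b" afterwards leaves 1 of 1 votes: quorum is kept.
withdrawn_first = has_quorum(config - {"b"}, live_nodes={"a"})
```

Here `stopped_abruptly` is `False` and `withdrawn_first` is `True`, which is the two-nodes-to-one case mentioned in the commit message.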
David Turner 0e1a12122c Merge branch 'master' into zen2 2018-11-13 15:25:35 +00:00
David Turner fbd3cab410
[Zen2] Remove AbstractComponent usage (#35483)
AbstractComponent was deprecated in #35140 and is looking like it will be
removed at some point by #34888. Today all it does is provide a logger. This
change removes the usages of AbstractComponent that live solely in the zen2
feature branch to avoid some future merge pain, and replaces it where necessary
with some directly-created loggers.
2018-11-13 15:20:49 +00:00
Simon Willnauer 3229dfc4de
Allow efficient can_match phases on frozen indices (#35431)
This change adds a special caching reader that caches all relevant
values for a range query to rewrite correctly in a can_match phase
without actually opening the underlying directory reader. This
allows frozen indices to be filtered with can_match and in turn
searched with wildcards in an efficient way since it allows us to
exclude shards that won't match based on their date-ranges without
opening their directory readers.

Relates to #34352
Depends on #34357
2018-11-13 14:53:55 +01:00
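The shard exclusion described in #35431 boils down to a range-overlap test against cached per-shard bounds. A minimal Python sketch under that assumption; `CachedRange` and `can_match` are hypothetical names, and the real rewrite logic handles more than closed integer ranges.

```python
from typing import NamedTuple


class CachedRange(NamedTuple):
    """Minimum and maximum field value for a shard, cached without opening its reader."""
    min_value: int
    max_value: int


def can_match(cached: CachedRange, query_from: int, query_to: int) -> bool:
    """A shard can match only if its value range overlaps the query range."""
    return cached.min_value <= query_to and query_from <= cached.max_value
```

A shard whose cached date range lies entirely outside the queried range can be skipped without ever opening its directory reader.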
Christoph Büscher 0a6614a03a
Correct implemented interface of ParsedReverseNested (#35455)
The ParsedReverseNested implementation should implement the ReverseNested
interface and not the Nested interface. Although this is an empty marker
interface it is confusing and can lead to casting errors. Also adding a test to
check that both ParsedNested and ParsedReverseNested implement the correct
interface.

Closes #35449
2018-11-13 10:34:29 +01:00
Jason Tedor a18b599d64
Handle OS pretty name on old OS without OS release (#35453)
Some very old versions of Linux do not have /etc/os-release, for
example old Red Hat-like OSes. This commit adds a fallback for handling
the pretty name on these OSes.
2018-11-12 19:31:12 -05:00
Tim Brooks 71cfb730f6
Register remote cluster compress setting (#35464)
This is a follow up to #35357. That commit failed to register the new
cluster.remote.cluster_name.transport.compress setting with
`ClusterSettings`. This commit fixes that.
2018-11-12 16:07:42 -07:00
Igor Motov e7896bcefc
Geo: enables coerce support in WKT polygon parser (#35414)
The WKT parser now automatically closes open polygons, similar to the GeoJSON
parser, if the coerce flag in the mapping is set to true.

Closes #35059
2018-11-12 09:40:04 -10:00
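Closing an open polygon under `coerce`, as described in #35414, can be sketched as follows. This is a Python illustration with a hypothetical `close_ring` helper; rejecting open rings when `coerce` is false is an assumption about the non-coerce behaviour.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def close_ring(points: List[Point], coerce: bool) -> List[Point]:
    """Return a closed ring; append the first point if coerce allows it."""
    if points and points[0] != points[-1]:
        if not coerce:
            raise ValueError("polygon ring is not closed")
        return points + [points[0]]
    return points
```

For example, the open ring `(0 0, 1 0, 1 1)` becomes `(0 0, 1 0, 1 1, 0 0)` when `coerce` is true.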
Jason Tedor 40ca62c298
Address handling of OS pretty name on some OS (#35451)
Some OS (e.g., Oracle Linux Server 6.9) have a trailing space at the end
of the PRETTY_NAME line in /etc/os-release. This commit addresses this
by accounting for this trailing space when extracting the pretty name.
2018-11-12 14:27:57 -05:00
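Extracting the pretty name while tolerating a trailing space (#35451) might look like the sketch below. It is a Python illustration rather than the actual implementation; the function name and the `"Linux"` default used when no `PRETTY_NAME` line exists are assumptions.

```python
from typing import Iterable


def pretty_name(os_release_lines: Iterable[str], fallback: str = "Linux") -> str:
    """Return the PRETTY_NAME value, ignoring surrounding whitespace and quotes."""
    prefix = "PRETTY_NAME="
    for line in os_release_lines:
        if line.startswith(prefix):
            # strip() first so a trailing space after the closing quote is dropped
            value = line[len(prefix):].strip().strip("\"'")
            return value or fallback
    return fallback
```

Stripping whitespace before removing the quotes is what makes a line such as `PRETTY_NAME="Oracle Linux Server 6.9" ` parse cleanly.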
Yannick Welsch d2ff01af13
Zen2: Add basic Zen1 transport-level BWC (#35443)
Implements serialization compatibility between Zen1 and Zen2 transport action, allowing a Zen1 node to join a fully formed Zen2 cluster and vice-versa.
2018-11-12 19:31:10 +01:00
Nick Knize 2591f66a33
upgrade to lucene-8.0.0-snapshot-6d9c714052 (#35428) 2018-11-12 10:48:27 -06:00
Yannick Welsch fe29b18c26 Fix compilation 2018-11-12 11:05:11 +01:00
Yannick Welsch 4e6c58c942 Merge remote-tracking branch 'elastic/master' into zen2 2018-11-12 10:03:59 +01:00
Christoph Büscher 09cac321e7
Upgrade to Joda 2.10.1 (#35410)
This version contains a bugfix that allows us to reenable one of our muted tests
in DateTimeUnitTests.

Closes #33749
2018-11-12 10:02:41 +01:00
Tim Brooks ba478827ad
Improve MockTcpTransport memory usage (#35402)
The MockTcpTransport is not friendly with regard to memory usage. It must
allocate multiple byte arrays for every message. This improves the
memory situation by failing fast if the message is improperly formatted.
Additionally, it uses reusable big arrays for at least half of the
allocated byte arrays.
2018-11-09 10:12:49 -07:00
David Turner f69a5c9b3c Fix compile error introduced by conflict in previous two commits 2018-11-09 15:50:11 +00:00
Jim Ferenczi 7054e289fa
Add trace log of the request for the query and fetch phases (#34479)
This change adds a logger for the query and fetch phases that prints all requests
before their execution at the trace level. This will help debugging cases where an issue
occurs during the execution since only completed queries are logged by the slow logs.
2018-11-09 09:41:51 +01:00
Tim Brooks bccc99c2be
Fix TcpTransport compression test (#35396)
This commit fixes an assertion in the TcpTransportTests compression
test.
2018-11-08 18:04:48 -07:00
Tim Brooks 93c2c604e5
Move compression config to ConnectionProfile (#35357)
This is related to #34483. It introduces a namespaced setting for
compression that allows users to configure compression on a per remote
cluster basis. The transport.tcp.compress remains as a fallback
setting. If transport.tcp.compress is set to true, then all requests
and responses are compressed. If it is set to false, only requests to
clusters based on the cluster.remote.cluster_name.transport.compress
setting are compressed. However, after this change regardless of any
local settings, responses will be compressed if the request that is
received was compressed.
2018-11-08 10:37:59 -07:00
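The compression rules stated in #35357 can be written down as two small decision functions. This Python sketch follows the commit message literally; the function names are hypothetical, and `remote_compress` stands for the value of the per-cluster setting (or `None` when it is unset).

```python
from typing import Optional


def compress_request(global_compress: bool, remote_compress: Optional[bool]) -> bool:
    """Decide whether an outgoing request is compressed."""
    if global_compress:
        # transport.tcp.compress=true: everything is compressed
        return True
    # otherwise only the per-remote-cluster setting can enable compression
    return bool(remote_compress)


def compress_response(request_was_compressed: bool, global_compress: bool) -> bool:
    """Responses mirror the request, regardless of local settings."""
    return request_was_compressed or global_compress
```

The second function captures the last sentence of the commit message: a node replies compressed to a compressed request even if its own settings would not compress.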
Jason Tedor 5c2a5f2e37
Adjust BWC version on OS pretty name
This commit adjusts the BWC version for the OS pretty name field on OsInfo
now that this field has been backported to the 6.x development branch.
2018-11-08 12:24:10 -05:00
Jason Tedor 730ec1ddfb
Add more detailed OS name on Linux (#35352)
Today our OS information returned in node stats only returns a
high-level name of the OS (e.g., "Linux"). Yet, for some uses this is
too high-level and knowing at a finer level of granularity the
underlying OS can be useful. This commit extracts the pretty name on
Linux from /etc/os-release. This pretty name usually includes the Linux
vendor and the Linux vendor version number (e.g., Fedora 28).
2018-11-08 12:16:58 -05:00
Yannick Welsch c315ead0ac
Zen2: Add diff-based publishing (#35290)
Enables diff-based publishing, which is an optimization where only the changing parts of the cluster
state are published to the nodes in the cluster, falling back to full cluster state publishing if the
receiver does not have the previous cluster state.
2018-11-08 17:16:09 +01:00
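Diff-based publishing with a fallback to the full state (#35290) can be illustrated with a toy model. In this Python sketch the cluster state is a dict with a `version` and a `data` map; `make_diff`, `Receiver`, and `publish` are hypothetical names and do not correspond to Elasticsearch classes.

```python
import copy


def make_diff(old: dict, new: dict) -> dict:
    """Describe only what changed between two states."""
    changed = {k: v for k, v in new["data"].items()
               if k not in old["data"] or old["data"][k] != v}
    removed = [k for k in old["data"] if k not in new["data"]]
    return {"from": old["version"], "to": new["version"],
            "changed": changed, "removed": removed}


class Receiver:
    def __init__(self):
        self.state = None

    def apply_full(self, state: dict) -> None:
        self.state = copy.deepcopy(state)

    def apply_diff(self, diff: dict) -> bool:
        # A diff is only usable if we hold the state it was computed against.
        if self.state is None or self.state["version"] != diff["from"]:
            return False
        data = {k: v for k, v in self.state["data"].items()
                if k not in diff["removed"]}
        data.update(diff["changed"])
        self.state = {"version": diff["to"], "data": data}
        return True


def publish(receiver: Receiver, old, new: dict) -> str:
    """Try the diff first; fall back to sending the full state."""
    if old is not None and receiver.apply_diff(make_diff(old, new)):
        return "diff"
    receiver.apply_full(new)
    return "full"
```

A receiver that has just joined holds no previous state, so its first publication is always a full one; subsequent publications can use diffs.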
David Turner 6885a7cb0f
Introduce transport API for cluster bootstrapping (#34961)
- Introduces a transport API for bootstrapping a Zen2 cluster
- Introduces a transport API for requesting the set of nodes that a
  master-eligible node has discovered and for waiting until this comprises the
  expected number of nodes.
- Alters ESIntegTestCase to use these APIs when forming a cluster, rather than
  injecting the initial configuration directly.
2018-11-08 16:09:37 +00:00
Christoph Büscher 113af7996c
Make limit on number of expanded fields configurable (#35284)
In #26541 we introduced a hard limit of 1024 on the number of fields a query can
be expanded to. Instead of using a hard limit, we should make this
configurable. This change removes the hard limit check and uses the existing
`max_clause_count` setting instead.

Closes #34778
2018-11-08 17:04:40 +01:00
Daniel Mitterdorfer 6980feddd2
Remove unused class MemoryCircuitBreaker
The class `MemoryCircuitBreaker` is unused so we remove all its traces
from the code base.

Relates #35367
2018-11-08 15:33:24 +01:00
David Turner 77789a733d Merge branch 'master' into 2018-11-08-merge-master 2018-11-08 13:38:18 +00:00
Alpar Torok 518e0de078 Mute test #35365 2018-11-08 12:27:40 +02:00
Christoph Büscher 14b811446f
Preserve `date_histogram` format when aggregating on unmapped fields (#35254)
Currently when aggregating on an unmapped date field (e.g. using a
date_histogram) we don't preserve the aggregation's `format` setting but instead
use the default format. This can lead to losing the aggregation's `format` when
aggregating over several indices where some of them contain unmapped date fields
and are encountered first in the reduce phase.

Related to #31760
2018-11-08 10:22:25 +01:00
Jim Ferenczi 891fdda68e
Allow unmapped fields in composite aggregations (#35331)
Today the `composite` aggregation throws an error if a source targets an
unmapped field and `missing_bucket` is set to false. Documents without a
value for a source cannot produce any bucket if `missing_bucket` is not
activated so the error is a shortcut to say that the response will be empty.
However this is not consistent with the `terms` aggregation, which accepts
unmapped fields by default even if the response is also guaranteed to be empty.
This commit removes this restriction: if a source contains an unmapped field
we now return an empty response (no buckets).

Closes #35317
2018-11-08 09:30:52 +01:00
Tanguy Leroux 1703a61fec
[RCI] Add IndexShardOperationPermits.asyncBlockOperations(ActionListener<Releasable>) (#34902)
The current implementation of asyncBlockOperations() can be used to
execute some code once all indexing operation permits have been acquired,
then releases all permits immediately after the code execution. This
immediate release is not suitable for tasks that need to keep all
permits over multiple execution steps.

This commit adds a new asyncBlockOperations() that exposes a Releasable,
making it possible to acquire all permits and only release them all
when needed by closing the Releasable. The existing blockOperations()
method has been modified to delegate permit acquisition/releasing to this new
method.

Relates to #33888
2018-11-08 09:23:33 +01:00
Jason Tedor 4f4fc3b8f8
Replicate index settings to followers (#35089)
This commit uses the index settings version so that a follower can
replicate index settings changes as needed from the leader.

Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
2018-11-07 21:20:51 -05:00
Ryan Ernst a4d979cfc8 Scripting: Add back lookup vars in score script (#34833)
The lookup vars under params (namely _fields and _source) were
inadvertently removed when scoring scripts were converted to using
script contexts. This commit adds them back, along with deprecation
warnings for those that should not be used.
2018-11-07 15:09:09 -08:00
Nhat Nguyen ed8732b161
Use soft-deleted docs to resolve strategy for engine operation (#35230)
A CCR test failure shows that the approach in #34474 is flawed.
Restoring the LocalCheckpointTracker from an index commit can cause both
FollowingEngine and InternalEngine to incorrectly ignore some deletes.

Here is a small scenario illustrating the problem:

1. Delete doc with seq=1 => engine will add a delete tombstone to Lucene

2. Flush a commit consisting of only the delete tombstone

3. Index doc with seq=0  => engine will add that doc to Lucene but soft-deleted

4. Restart an engine with the commit (step 2); the engine will fill its
LocalCheckpointTracker with the delete tombstone in the commit

5. Replay the local translog in reverse order: index#0 then delete#1

6. When processing index#0, an engine will add it into Lucene as a live doc
and advance the local checkpoint to 1 (seq#1 was restored from the
commit - step 4).

7. When processing delete#1, an engine will skip it because seq_no=1 is
less than or equal to the local checkpoint.

We should have zero documents after recovering from translog, but here we
have one.

Since all operations after the local checkpoint of the safe commit are
retained, we should find them if the look-up considers also soft-deleted
documents. This PR fills the disparity between the version map and the
local checkpoint tracker by taking soft-deleted documents into account
while resolving strategy for engine operations.

Relates #34474
Relates #33656
2018-11-07 15:26:30 -05:00
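The scenario in #35230 can be replayed with a toy model of the local checkpoint. This Python sketch is a simplification for illustration: `LocalCheckpointTracker` here only tracks the highest contiguous sequence number, `replay` is a hypothetical function, and the `lookup_soft_deleted` flag stands in for the fix of consulting soft-deleted documents (including tombstones) when resolving an operation.

```python
class LocalCheckpointTracker:
    """Tracks the highest seq_no below which every operation was processed."""

    def __init__(self):
        self.processed = set()
        self.checkpoint = -1

    def mark_processed(self, seq_no: int) -> None:
        self.processed.add(seq_no)
        while self.checkpoint + 1 in self.processed:
            self.checkpoint += 1


def replay(translog, commit_tombstones: dict, lookup_soft_deleted: bool) -> set:
    """Replay (kind, doc_id, seq_no) operations and return the live doc ids."""
    tracker = LocalCheckpointTracker()
    for seq_no in commit_tombstones.values():
        tracker.mark_processed(seq_no)            # step 4: restore from commit
    live_docs = set()
    for kind, doc_id, seq_no in translog:         # steps 5 to 7
        if seq_no <= tracker.checkpoint:
            continue                              # treated as already processed
        # With the fix, a tombstone with a higher seq_no makes the op stale.
        newer_seen = lookup_soft_deleted and commit_tombstones.get(doc_id, -1) > seq_no
        if kind == "index" and not newer_seen:
            live_docs.add(doc_id)
        elif kind == "delete":
            live_docs.discard(doc_id)
        tracker.mark_processed(seq_no)
    return live_docs


ops = [("index", "doc", 0), ("delete", "doc", 1)]
tombstones = {"doc": 1}
```

Calling `replay(ops, tombstones, False)` leaves one live document, reproducing the bug, while `replay(ops, tombstones, True)` leaves none, which is the expected outcome.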
Martijn van Groningen 8de3c6e618
Ignore date ranges containing 'now' when pre-processing a percolator query (#35160)
Today when a percolator query contains a date range, the query
analyzer extracts that range so that at search time the `percolate` query
can efficiently exclude percolator queries that are never going to match.

The problem is that if 'now' is used, it is evaluated at index time.
So the idea is to rewrite date ranges with 'now' to a match all query,
so that the query analyzer can't extract it and the `percolate` query
is then able to evaluate 'now' at query time.
2018-11-07 20:41:27 +01:00
Simon Willnauer 0cc0fd2d15
Add a frozen engine implementation (#34357)
This change adds a `frozen` engine that allows lazily opening a directory reader
on a read-only shard. The engine wraps general purpose searchers in a LazyDirectoryReader
that also allows releasing and resetting the underlying index readers after any and before
secondary search phases.

Relates to #34352
2018-11-07 20:23:35 +01:00
Vladimir Dolzhenko f789d49fb3
Put a fake allocation id on allocate stale primary command (#34140)
Removes the fake allocation id after recovery is done.

Relates to #33432
2018-11-07 20:18:11 +01:00
Simon Willnauer 2131e119d7
Apply `ignore_throttled` also to concrete indices (#35335)
Today we only apply `ignore_throttled` to expansions from wildcards,
date math expressions and aliases. Yet, this is tricky since we might
have resolved certain expressions in pre-filter steps like security.
It's more consistent to apply this logic to all expressions including
concrete indices.

Relates to #34354
2018-11-07 18:43:27 +01:00