Commit Graph

2093 Commits

Author SHA1 Message Date
David Turner aec44fecbc Decouple DiskThresholdMonitor & ClusterInfoService (#44105)
Today the `ClusterInfoService` requires the `DiskThresholdMonitor` at
construction time so that it can notify it when nodes report changes in their
disk usage, but this is awkward to construct: the `DiskThresholdMonitor`
requires a `RerouteService` which requires an `AllocationService` which comees
from the `ClusterModule` which requires the `ClusterInfoService`.

Today we break the cycle with a `LazilyInitializedRerouteService` which is
itself a little ugly. This commit replaces this with a more traditional
subject/observer relationship between the `ClusterInfoService` and the
`DiskThresholdMonitor`.
2019-07-09 18:43:32 +01:00
David Turner 268971db03 Wait for blackholed connection before discovery (#44077)
Since #42636 we no longer treat connections specially when simulating a
blackholed connection. This means that at the end of the safety phase we may
have just started a connection attempt which will time out, but the default
timeout is 30 seconds, much longer than the 2 seconds we normally allow for
post-safety-phase discovery. This commit adds time for such a connection
attempt to time out.

It also fixes some spurious logging of `this` that now refers to an object with
an unhelpful `toString()` implementation introduced in #42636.

Fixes #44073
2019-07-09 10:59:53 +01:00
Armin Braun 9eac5ceb1b
Dry up inputstream to bytesreference (#43675) (#44094)
* Dry up Reading InputStream to BytesReference
* Dry up spots where we use the same pattern to get from an InputStream to a BytesReferences
2019-07-09 09:18:25 +02:00
David Turner 6dce458ecc Randomise retention lease expiry time (#44067)
In today's test suite indices mostly use the default value of `12h` for the
`index.soft_deletes.retention_lease.period` setting, which in the context of
the test suite essentially means "never expires". In fact, the tests should all
behave correctly even if the lease period is much shorter; tests that rely on
leases not expiring should configure their indices appropriately.

This commit randomises the lease expiry time for those indices created during
tests which do not set a specific value for this setting.
2019-07-08 18:29:27 +02:00
Alpar Torok 0c8294e633 Make sure the clean task doesn't break test fixtures (#43641)
Use a dedicated fixture dir.
2019-07-08 17:58:27 +03:00
Armin Braun afe81fd625
Some Cleanup in Test Framework (#44039) (#44059)
* Remove some obvious dead code
* Move assert methods that were only used in a single test class to the child they belong to
* Inline some redundant methods
2019-07-08 14:15:31 +02:00
Armin Braun af9b98e81c
Recursively Delete Unreferenced Index Directories (#42189) (#44051)
* Use ability to list child "folders" in the blob store to implement recursive delete on all stale index folders when cleaning up instead of using the diff between two `RepositoryData` instances to cover aborted deletes
* Runs after ever delete operation
* Relates  #13159 (fixing most of this issues caused by unreferenced indices, leaving some meta files to be cleaned up only)
2019-07-08 10:55:39 +02:00
Nhat Nguyen 9089820d8f Enable indexing optimization using sequence numbers on replicas (#43616)
This PR enables the indexing optimization using sequence numbers on
replicas. With this optimization, indexing on replicas should be faster
and use less memory as it can forgo the version lookup when possible.
This change also deactivates the append-only optimization on replicas.

Relates #34099
2019-07-05 22:12:08 -04:00
Yannick Welsch 504a43d43a Move ConnectionManager to async APIs (#42636)
This commit converts the ConnectionManager's openConnection and connectToNode methods to
async-style. This will allow us to not block threads anymore when opening connections. This PR also
adapts the cluster coordination subsystem to make use of the new async APIs, allowing to remove
some hacks in the test infrastructure that had to account for the previous synchronous nature of the
connection APIs.
2019-07-05 20:40:22 +02:00
Yannick Welsch d090fa514f Use unique ports per test worker (#43983)
* Use unique ports per test worker

* Add test for system property

* check presence of tests.gradle

* Revert "check presence of tests.gradle"

This reverts commit 2fee7512a28f95c94c5bf7a3312e808f918a9510.
2019-07-05 11:02:28 +02:00
Jim Ferenczi cdf55cb5c5 Refactor index engines to manage readers instead of searchers (#43860)
This commit changes the way we manage refreshes in the index engines.
Instead of relying on a SearcherManager, this change uses a ReaderManager that
creates ElasticsearchDirectoryReader when needed. Searchers are now created on-demand
(when acquireSearcher is called) from the current ElasticsearchDirectoryReader.
It also slightly changes the Engine.Searcher to extend IndexSearcher in order
to simplify the usage in the consumer.
2019-07-04 22:49:43 +02:00
Alan Woodward 4b99255fed Add name() method to TokenizerFactory (#43909)
This brings TokenizerFactory into line with CharFilterFactory and TokenFilterFactory,
and removes the need to pass around tokenizer names when building custom analyzers.

As this means that TokenizerFactory is no longer a functional interface, the commit also
adds a factory method to TokenizerFactory to make construction simpler.
2019-07-04 11:28:55 +01:00
Armin Braun be20fb80e4
Recursive Delete on BlobContainer (#43281) (#43920)
This is a prerequisite of #42189:

* Add directory delete method to blob container specific to each implementation:
  * Some notes on the implementations:
       * AWS + GCS: We can simply exploit the fact that both AWS and GCS return blobs lexicographically ordered which allows us to simply delete in the same order that we receive the blobs from the listing request. For AWS this simply required listing without the delimiter setting (so we get a deep listing) and for GCS the same behavior is achieved by not using the directory mode on the listing invocation. The nice thing about this is, that even for very large numbers of blobs the memory requirements are now capped nicely since we go page by page when deleting.
       * For Azure I extended the parallelization to the listing calls as well and made it work recursively. I verified that this works with thread count `1` since we only block once in the initial thread and then fan out to a "graph" of child listeners that never block.
       * HDFS and FS are trivial since we have directory delete methods available for them
* Enhances third party tests to ensure the new functionality works (I manually ran them for all cloud providers)
2019-07-03 17:14:57 +02:00
Armin Braun 455b12a4fb
Add Ability to List Child Containers to BlobContainer (#42653) (#43903)
* Add Ability to List Child Containers to BlobContainer (#42653)

* Add Ability to List Child Containers to BlobContainer
* This is a prerequisite of #42189
2019-07-03 11:30:49 +02:00
Armin Braun 826f38cd70
Enable Parallel Deletes in Azure Repository (#42783) (#43886)
* Parallel deletes via private thread pool
2019-07-03 09:28:39 +02:00
Zachary Tong ea1794832f Add RareTerms aggregation (#35718)
This adds a `rare_terms` aggregation.  It is an aggregation designed
to identify the long-tail of keywords, e.g. terms that are "rare" or
have low doc counts.

This aggregation is designed to be more memory efficient than the
alternative, which is setting a terms aggregation to size: LONG_MAX
(or worse, ordering a terms agg by count ascending, which has
unbounded error).

This aggregation works by maintaining a map of terms that have
been seen. A counter associated with each value is incremented
when we see the term again.  If the counter surpasses a predefined
threshold, the term is removed from the map and inserted into a cuckoo
filter.  If a future term is found in the cuckoo filter we assume it
was previously removed from the map and is "common".

The map keys are the "rare" terms after collection is done.
2019-07-01 10:30:02 -04:00
Nhat Nguyen 598e00a689 Make peer recovery send file info step async (#43792)
Relates #36195
2019-07-01 08:40:45 -04:00
Ryan Ernst 3a2c698ce0
Rename Action to ActionType (#43778)
Action is a class that encapsulates meta information about an action
that allows it to be called remotely, specifically the action name and
response type. With recent refactoring, the action class can now be
constructed as a static constant, instead of needing to create a
subclass. This makes the old pattern of creating a singleton INSTANCE
both misnamed and lacking a common placement.

This commit renames Action to ActionType, thus allowing the old INSTANCE
naming pattern to be TYPE on the transport action itself. ActionType
also conveys that this class is also not the action itself, although
this change does not rename any concrete classes as those will be
removed organically as they are converted to TYPE constants.

relates #34389
2019-06-30 22:00:17 -07:00
David Turner fca7a19713 Avoid parallel reroutes in DiskThresholdMonitor (#43381)
Today the `DiskThresholdMonitor` limits the frequency with which it submits
reroute tasks, but it might still submit these tasks faster than the master can
process them if, for instance, each reroute takes over 60 seconds. This causes
a problem since the reroute task runs with priority `IMMEDIATE` and is always
scheduled when there is a node over the high watermark, so this can starve any
other pending tasks on the master.

This change avoids further updates from the monitor while its last task(s) are
still in progress, and it measures the time of each update from the completion
time of the reroute task rather than its start time, to allow a larger window
for other tasks to run.

It also now makes use of the `RoutingService` to submit the reroute task, in
order to batch this task with any other pending reroutes. It enhances the
`RoutingService` to notify its listeners on completion.

Fixes #40174
Relates #42559
2019-06-30 16:54:16 +01:00
Nhat Nguyen 55b3ec8d7b Make peer recovery clean files step async (#43787)
Relates #36195
2019-06-29 18:30:51 -04:00
Albert Zaharovits 5e17bc5dcc
Consistent Secure Settings #40416
Introduces a new `ConsistentSecureSettingsValidatorService` service that exposes
a single public method, namely `allSecureSettingsConsistent`. The method returns
`true` if the local node's secure settings (inside the keystore) are equal to the
master's, and `false` otherwise. Technically, the local node has to have exactly
the same secure settings - setting names should not be missing or in surplus -
for all `SecureSetting` instances that are flagged with the newly introduced
`Property.Consistent`. It is worth highlighting that the `allSecureSettingsConsistent`
is not a consensus view across the cluster, but rather the local node's perspective
in relation to the master.
2019-06-29 23:26:17 +03:00
Jim Ferenczi 7ca69db83f Refactor IndexSearcherWrapper to disallow the wrapping of IndexSearcher (#43645)
This change removes the ability to wrap an IndexSearcher in plugins. The IndexSearcherWrapper is replaced by an IndexReaderWrapper and allows to wrap the DirectoryReader only. This simplifies the creation of the context IndexSearcher that is used on a per request basis. This change also moves the optimization that was implemented in the security index searcher wrapper to the ContextIndexSearcher that now checks the live docs to determine how the search should be executed. If the underlying live docs is a sparse bit set the searcher will compute the intersection
betweeen the query and the live docs instead of checking the live docs on every document that match the query.
2019-06-28 16:28:02 +02:00
Christoph Büscher 2cc7f5a744
Allow reloading of search time analyzers (#43313)
Currently changing resources (like dictionaries, synonym files etc...) of search
time analyzers is only possible by closing an index, changing the underlying
resource (e.g. synonym files) and then re-opening the index for the change to
take effect.

This PR adds a new API endpoint that allows triggering reloading of certain
analysis resources (currently token filters) that will then pick up changes in
underlying file resources. To achieve this we introduce a new type of custom
analyzer (ReloadableCustomAnalyzer) that uses a ReuseStrategy that allows
swapping out analysis components. Custom analyzers that contain filters that are
markes as "updateable" will automatically choose this implementation. This PR
also adds this capability to `synonym` token filters for use in search time
analyzers.

Relates to #29051
2019-06-28 09:55:40 +02:00
Nhat Nguyen ce8771feb7 Do not use MockInternalEngine in GatewayIndexStateIT (#43716)
GatewayIndexStateIT#testRecoverBrokenIndexMetadata replies on the
flushing on shutdown. This behaviour, however, can be randomly disabled
in MockInternalEngine.

Closes #43034
2019-06-27 18:28:04 -04:00
Yannick Welsch 6744344ef2 Handle situation where only voting-only nodes are bootstrapped (#43628)
Adds support for the situation where only voting-only nodes are bootstrapped. In that case, they will
still try to become elected and bring full master nodes into the cluster.
2019-06-27 18:10:15 +02:00
David Roberts c5beb05f77 [ML][DataFrame] Consider data frame templates internal in REST tests (#43692)
The data frame index template pattern was not in the list
considered as internal and therefore not needing cleanup
after every test.
2019-06-27 14:40:30 +01:00
Christoph Büscher 36360358b2 Move query builder caching check to dedicated tests (#43238)
Currently `AbstractQueryTestCase#testToQuery` checks the search context cachable
flag. This is a bit fragile due to the high randomization of query builders
performed by this general test. Also we might only rarely check the
"interesting" cases because they rarely get generated when fully randomizing the
query builder.

This change moved the general checks out ot #testToQuery and instead adds
dedicated cache tests for those query builders that exhibit something other than
the default behaviour.

Closes #43200
2019-06-27 14:56:29 +02:00
Yannick Welsch 2049f715b3 Add voting-only master node (#43410)
A voting-only master-eligible node is a node that can participate in master elections but will not act
as a master in the cluster. In particular, a voting-only node can help elect another master-eligible
node as master, and can serve as a tiebreaker in elections. High availability (HA) clusters require at
least three master-eligible nodes, so that if one of the three nodes is down, then the remaining two
can still elect a master amongst them-selves. This only requires one of the two remaining nodes to
have the capability to act as master, but both need to have voting powers. This means that one of
the three master-eligible nodes can be made as voting-only. If this voting-only node is a dedicated
master, a less powerful machine or a smaller heap-size can be chosen for this node. Alternatively, a
voting-only non-dedicated master node can play the role of the third master-eligible node, which
allows running an HA cluster with only two dedicated master nodes.

Closes #14340

Co-authored-by: David Turner <david.turner@elastic.co>
2019-06-26 08:07:56 +02:00
Zachary Tong 63fef5a31e Add scripting support to AggregatorTestCase (#43494)
This refactors AggregatorTestCase to allow testing mock scripts.
The main change is to QueryShardContext.  This was previously mocked,
but to get the ScriptService you have to invoke a final method
which can't be mocked.

Instead, we just create a mostly-empty QueryShardContext and populate
the fields that are needed for testing.  It also introduces a few
new helper methods that can be overridden to change the default
behavior a bit.

Most tests should be able to override getMockScriptService() to supply
a ScriptService to the context, which is later used by the aggs.
More complicated tests can override queryShardContextMock() as before.

Adds a test to MaxAggregatorTests to test out the new functionality.
2019-06-25 11:52:12 -04:00
Yannick Welsch d45f12799c Sync global checkpoint on pending in-sync shards (#43526)
At the end of a peer recovery the primary wants to mark the replica as in-sync. For that the
persisted local checkpoint of the replica needs to have caught up with the global checkpoint on the
primary. If translog durability is set to ASYNC, this means that information about the persisted local
checkpoint can lag on the primary and might need to be explicitly fetched through a global
checkpoint sync action. Unfortunately, that action will only be triggered after 30 seconds, and, even
worse, will only run based on what the in-sync shard copies say (see
IndexShard.maybeSyncGlobalCheckpoint). As the replica has not been marked as in-sync yet, it is
not taken into consideration, and the primary might have its global checkpoint equal to the max seq
no, so it thinks nothing needs to be done.

Closes #43486
2019-06-24 18:35:57 +02:00
Armin Braun 1053a89b79
Log Blocked IO Thread State (#43424) (#43447)
* Let's log the state of the thread to find out if it's dead-locked or just stuck after being suspended
* Relates #43392
2019-06-20 22:31:36 +02:00
Yannick Welsch 29d76baf7d Increase timeout for assertSeqNos
Helps with tests that do async translog syncing
2019-06-20 19:06:49 +02:00
Zachary Tong a8a81200d0 Better support for unmapped fields in AggregatorTestCase (#43405)
AggregatorTestCase will NPE if only a single, null MappedFieldType
is provided (which is required to simulate an unmapped field).  While
it's possible to test unmapped fields by supplying other, non-related
field types... that's clunky and unnecessary.  AggregatorTestCase
just needs to filter out null field types when setting up.
2019-06-20 11:31:49 -04:00
Armin Braun 7d1983a7e3
Fix Operation Timestamps in Tests (#43155) (#43419)
* For the issue in #43086 we were running into inactive shards because the random timestamps previously used would randomly make `org.elasticsearch.index.shard.IndexShard#checkIdle` see an incorrect+huge  inactive time
* Also fixed one other spot in tests that passed `ms` instead of `ns` for the same timestamp on an index op to correctly use relative `ns`
* Closes #43086
2019-06-20 16:36:17 +02:00
David Turner c8eb09f158 Fail connection attempts earlier in tests (#43320)
Today the `DisruptibleMockTransport` always allows a connection to a node to be
established, and then fails requests sent to that node such as the subsequent
handshake. Since #42342, we log handshake failures on an open connection as a
warning, and this makes the test logs rather noisy. This change fails the
connection attempt first, avoiding these unrealistic warnings.
2019-06-20 14:45:24 +01:00
Armin Braun 5af9387fad Fix Stuck IO Thread Logging Time Precision (#42882)
* The precision of the timestamps we get from the cached time thread is only 200ms by default resulting in a number of needless ~200ms slow network thread execution logs
  * Fixed by making the warn threshold a function of the precision of the cached time thread found in the settings
2019-06-20 14:26:20 +02:00
Yannick Welsch 7f8e1454ab Advance checkpoints only after persisting ops (#43205)
Local and global checkpoints currently do not correctly reflect what's persisted to disk. The issue is
that the local checkpoint is adapted as soon as an operation is processed (but not fsynced yet). This
leaves room for the history below the global checkpoint to still change in case of a crash. As we rely
on global checkpoints for CCR as well as operation-based recoveries, this has the risk of shard
copies / follower clusters going out of sync.

This commit required changing some core classes in the system:

- The LocalCheckpointTracker keeps track now not only of the information whether an operation has
been processed, but also whether that operation has been persisted to disk.
- TranslogWriter now keeps track of the sequence numbers that have not been fsynced yet. Once
they are fsynced, TranslogWriter notifies LocalCheckpointTracker of this.
- ReplicationTracker now keeps track of the persisted local and persisted global checkpoints of all
shard copies when in primary mode. The computed global checkpoint (which represents the
minimum of all persisted local checkpoints of all in-sync shard copies), which was previously stored
in the checkpoint entry for the local shard copy, has been moved to an extra field.
- The periodic global checkpoint sync now also takes async durability into account, where the local
checkpoints on shards only advance when the translog is asynchronously fsynced. This means that
the previous condition to detect inactivity (max sequence number is equal to global checkpoint) is
not sufficient anymore.
- The new index closing API does not work when combined with async durability. The shard
verification step is now requires an additional pre-flight step to fsync the translog, so that the main
verify shard step has the most up-to-date global checkpoint at disposition.
2019-06-20 11:12:38 +02:00
Mark Vieira 0867ea75b7
Properly format reproduction lines for test methods that contain periods (#43255) 2019-06-18 09:17:47 -07:00
Nhat Nguyen 0c5086d2f3 Rebuild version map when opening internal engine (#43202)
With this change, we will rebuild the live version map and local
checkpoint using documents (including soft-deleted) from the safe commit
when opening an internal engine. This allows us to safely prune away _id
of all soft-deleted documents as the version map is always in-sync with
the Lucene index.

Relates #40741
Supersedes #42979
2019-06-17 18:08:09 -04:00
Martijn Laarman 8b1b9f8ab9
Introduce stability description to the REST API specification (#38413) (#43278)
* introduce state to the REST API specification

* change state over to stability

* CCR is no GA updated to stable

* SQL is now GA so marked as stable

* Introduce `internal` as state for API's, marks stable in terms of lifetime but unstable in terms of guarantees on its output format since it exposes internal representations

* make setting a wrong stability value, or not setting it at all an error that causes the YAML test suite to fail

* update spec files to be explicit about their stability state

* Document the fact that stability needs to be defined

Otherwise the YAML test runner will fail (with a nice exception message)

* address check style violations

* update rest spec unit tests to include stability

* found one more test spec file not declaring stability, made sure stability appears after documentation everywhere

* cluster.state is stable, mark response in some way to denote its a key value format that can be changed during minors

* mark data frame API's as beta

* remove internal and private as states for an API

* removed the wrong enum values in the Stability Enum in the previous commit

(cherry picked from commit 61c34bbd92f8f7e5f22fa411c6b682b0ebd8a99d)
2019-06-17 16:57:13 +02:00
Henning Andersen ba15d08e14
Allow cluster access during node restart (#42946) (#43272)
This commit modifies InternalTestCluster to allow using client() and
other operations inside a RestartCallback (onStoppedNode typically).
Restarting nodes are now removed from the map and thus all
methods now return the state as if the restarting node does not exist.

This avoids various exceptions stemming from accessing the stopped
node(s).
2019-06-17 15:04:17 +02:00
Christoph Büscher 7af23324e3 SimpleQ.S.B and QueryStringQ.S.B tests should avoid `now` in query (#43199)
Currently the randomization of the q.b. in these tests can create query strings
that can cause caching to be disabled for this query if we query all fields and
there is a date field present. This is pretty much an anomaly that we shouldn't
generally test for in the "testToQuery" tests where cache policies are checked.

This change makes sure we don't create offending query strings so the cache
checks never hit these cases and adds a special test method to check this edge
case.

Closes #43112
2019-06-14 11:21:48 +02:00
Alpar Torok 7cc6dca697 Remove explicily enabled build fixture task 2019-06-14 10:42:08 +03:00
Jason Tedor 5bc3b7f741
Enable node roles to be pluggable (#43175)
This commit introduces the possibility for a plugin to introduce
additional node roles.
2019-06-13 15:15:48 -04:00
Yogesh Gaikwad 4ae1e30a98
Enable krb5kdc-fixture, kerberos tests mount urandom for kdc container (#41710) (#43178)
Infra has fixed #10462 by installing `haveged` on CI workers.
This commit enables the disabled fixture and tests, and mounts
`/dev/urandom` for the container so there is enough
entropy required for kdc.
Note: hdfs-repository tests have been disabled, will raise a separate issue for it.

Closes #40624 Closes #40678
2019-06-13 13:02:16 +10:00
Alan Woodward 9de1c69c28 IndexAnalyzers doesn't need to extend AbstractIndexComponent (#43149)
AIC doesn't add anything here, and it removes the need to pass index settings
to the constructor.
2019-06-12 17:48:31 +01:00
Henning Andersen 6108899a2d Fix unresponsive network simulation (#42579)
Unresponsive network simulation would throw away requests. However, then
we no longer have any guarantees that a transport action either succeeds
or fails, which could lead to hangs (example: unclosed IndexShard
permits).

Closes #42244
2019-06-11 17:54:48 +02:00
Nhat Nguyen f2e66e22eb Increase waiting time when check retention locks (#42994)
WriteActionsTests#testBulk and WriteActionsTests#testIndex sometimes
fail with a pending retention lock. We might leak retention locks when
switching to async recovery. However, it's more likely that ongoing
recoveries prevent the retention lock from releasing.

This change increases the waiting time when we check for no pending
retention lock and also ensures no ongoing recovery in
WriteActionsTests.

Closes #41054
2019-06-10 17:58:37 -04:00
Jason Tedor 915d2f2daa
Refactor put mapping request validation for reuse (#43005)
This commit refactors put mapping request validation for reuse. The
concrete case that we are after here is the ability to apply effectively
the same framework to indices aliases requests. This commit refactors
the put mapping request validation framework to allow for that.
2019-06-09 10:19:04 -04:00
henryptung 61b62125b8 Wire query cache into sorting nested-filter computation (#42906)
Don't use Lucene's default query cache when filtering in sort.

Closes #42813
2019-06-06 21:16:58 +02:00