Commit Graph

2552 Commits

Author SHA1 Message Date
Stuart Tettemer 376932a47d
Scripting: split out compile limits and caching (#52498) (#52652)
Phase 1 of adding compilation limits per context.
* Refactor rate limiting and caching into separate class,
  `ScriptCache`,  which will be used per context.
* Disable compilation limit for certain tests.

Backport of 0866031
Refs: #50152
2020-02-21 12:10:51 -07:00
Armin Braun 0a09e15959
Add Caching for RepositoryData in BlobStoreRepository (#52341) (#52566)
Cache latest `RepositoryData` on heap when it's absolutely safe to do so (i.e. when the repository is in strictly consistent mode).

`RepositoryData` can safely be assumed to not grow to a size that would cause trouble because we often have at least two copies of it loaded at the same time when doing repository operations. Also, concurrent snapshot API status requests currently load it independently of each other and so on, making it safe to cache on heap and assume as "small" IMO.

The benefits of this move are:
* Much faster repository status API calls
   * listing all snapshot names becomes instant
   * Other operations are sped up massively too because they mostly operate in two steps: load repository data then load multiple other blobs to get the additional data
* Additional cloud cost savings
* Better resiliency, saving another spot where an IO issue could break the snapshot
* We can simplify a number of spots in the current code that currently pass around the repository data in tricky ways to avoid loading it multiple times in follow ups.
2020-02-21 10:20:07 +01:00
Armin Braun 4bb780bc37
Refactor Inflexible Snapshot Repository BwC (#52365) (#52557)
* Refactor Inflexible Snapshot Repository BwC (#52365)

Transport the version to use for  a snapshot instead of whether to use shard generations in the snapshots in progress entry. This allows making upcoming repository metadata changes in a flexible manner in an analogous way to how we handle serialization BwC elsewhere.
Also, exposing the version at the repository API level will make it easier to do BwC relevant changes in derived repositories like source only or encrypted.
2020-02-21 09:14:34 +01:00
Przemysław Witek 7cd997df84
[ML] Make ml internal indices hidden (#52423) (#52509) 2020-02-19 14:02:32 +01:00
Tim Brooks a742c58d45
Extract a ConnectionManager interface (#51722)
Currently we have three different implementations representing a
`ConnectionManager`. There is the basic `ConnectionManager` which
holds all connections for a cluster. And a remote connection manager
which support proxy behavior. And a stubbable connection manager for
tests. The remote and stubbable instances use the delegate pattern,
so this commit extracts an interface for them all to implement.
2020-02-18 09:19:24 -07:00
David Turner 3d57a78deb Add extra logging for investigation into #52000 (#52472)
It looks like #52000 is caused by a slowdown in cluster state application
(maybe due to #50907) but I would like to understand the details to ensure that
there's nothing else going on here too before simply increasing the timeout.
This commit enables some relevant `DEBUG` loggers and also captures stack
traces from all threads rather than just the three hottest ones.
2020-02-18 13:02:33 +00:00
Rory Hunter 4fb399396c Fix mixed threadlocal and supplied random use (#52459)
Closes #52450. `setRandomIndexSettings(...)` in `ESIntegTestCase` was
using both a thread-local and a supplied source of randomness. Fix this
method to only use the supplied source.
2020-02-18 10:45:23 +00:00
Nhat Nguyen bdb2e72ea4
Fix timeout in testDowngradeRemoteClusterToBasic (#52322)
- ESCCRRestTestCase#ensureYellow does not work well with assertBusy
- Increases timeout to 60s

Closes #52036
2020-02-17 15:05:42 -05:00
Nik Everett 146def8caa
Implement top_metrics agg (#51155) (#52366)
The `top_metrics` agg is kind of like `top_hits` but it only works on
doc values so it *should* be faster.

At this point it is fairly limited in that it only supports a single,
numeric sort and a single, numeric metric. And it only fetches the "very
topest" document worth of metric. We plan to support returning a
configurable number of top metrics, requesting more than one metric and
more than one sort. And, eventually, non-numeric sorts and metrics. The
trick is doing those things fairly efficiently.

Co-Authored by: Zachary Tong <zach@elastic.co>
2020-02-14 11:19:11 -05:00
Julie Tibshirani 0d7165a40b Standardize naming of fetch subphases. (#52171)
This commit makes the names of fetch subphases more consistent:
* Now the names end in just 'Phase', whereas before some ended in
  'FetchSubPhase'. This matches the query subphases like AggregationPhase.
* Some names include 'fetch' like FetchScorePhase to avoid ambiguity about what
  they do.
2020-02-13 13:00:46 -08:00
Nik Everett 2dac36de4d
HLRC support for string_stats (#52163) (#52297)
This adds a builder and parsed results for the `string_stats`
aggregation directly to the high level rest client. Without this the
HLRC can't access the `string_stats` API without the elastic licensed
`analytics` module.

While I'm in there this adds a few of our usual unit tests and
modernizes the parsing.
2020-02-12 19:25:05 -05:00
Marios Trivyzas dac720d7a1
Add a cluster setting to disallow expensive queries (#51385) (#52279)
Add a new cluster setting `search.allow_expensive_queries` which by
default is `true`. If set to `false`, certain queries that have
usually slow performance cannot be executed and an error message
is returned.

- Queries that need to do linear scans to identify matches:
  - Script queries
- Queries that have a high up-front cost:
  - Fuzzy queries
  - Regexp queries
  - Prefix queries (without index_prefixes enabled
  - Wildcard queries
  - Range queries on text and keyword fields
- Joining queries
  - HasParent queries
  - HasChild queries
  - ParentId queries
  - Nested queries
- Queries on deprecated 6.x geo shapes (using PrefixTree implementation)
- Queries that may have a high per-document cost:
  - Script score queries
  - Percolate queries

Closes: #29050
(cherry picked from commit a8b39ed842c7770bd9275958c9f747502fd9a3ea)
2020-02-12 22:56:14 +01:00
Gordon Brown d48ce12920
Convert ILM and SLM histories into hidden indices (#51456)
Modifies SLM's and ILM's history indices to be hidden indices for added
protection against accidental querying and deletion, and improves
IndexTemplateRegistry to handle upgrading index templates.

Also modifies the REST test cleanup to delete hidden indices.
2020-02-11 14:18:55 -07:00
David Turner 00b9098250 Ignore timeouts with single-node discovery (#52159)
Today we use `cluster.join.timeout` to prevent nodes from waiting indefinitely
if joining a faulty master that is too slow to respond, and
`cluster.publish.timeout` to allow a faulty master to detect that it is unable
to publish its cluster state updates in a timely fashion. If these timeouts
occur then the node restarts the discovery process in an attempt to find a
healthier master.

In the special case of `discovery.type: single-node` there is no point in
looking for another healthier master since the single node in the cluster is
all we've got. This commit suppresses these timeouts and instead lets the node
wait for joins and publications to succeed no matter how long this might take.
2020-02-11 14:15:01 +00:00
Armin Braun 90eb6a020d Remove Redundant Loading of RepositoryData during Restore (#51977) (#52108)
We can just put the `IndexId` instead of just the index name into the recovery soruce and
save one load of `RepositoryData` on each shard restore that way.
2020-02-09 21:44:18 +01:00
Julie Tibshirani 337d73a7c6 Rename MapperService#fullName to fieldType.
The new name more accurately describes what the method returns.
2020-02-07 10:35:53 -08:00
Martijn Laarman f626700757 Introduces validation to restrict API's to single namespace (#51982)
To be merged after #51981 lands

(cherry picked from commit 196eaf8ac1ccc777f5bd81469d1148bc7ec299dd)
2020-02-06 17:17:03 +01:00
Armin Braun 3be70f64d8
Fix GCS Mock Http Handler JDK Bug (#51933) (#51941)
There is an open JDK bug that is causing an assertion in the JDK's
http server to trip if we don't drain the request body before sending response headers.
See https://bugs.openjdk.java.net/browse/JDK-8180754
Working around this issue here by always draining the request at the beginning of the handler.

Fixes #51446
2020-02-05 15:37:06 +01:00
Maria Ralli 8d3e73b3a0 Add host address to BindTransportException message (#51269)
When bind fails, show the host address in addition to the port. This
helps debugging cases with wrong "network.host" values.

Closes #48001
2020-02-04 17:13:19 +00:00
James Rodewig 4ea7297e1e
[DOCS] Change http://elastic.co -> https (#48479) (#51812)
Co-authored-by: Jonathan Budzenski <jon@budzenski.me>
2020-02-03 09:50:11 -05:00
Ryan Ernst 21224caeaf Remove comparison to true for booleans (#51723)
While we use `== false` as a more visible form of boolean negation
(instead of `!`), the true case is implied and the true value does not
need to explicitly checked. This commit converts cases that have slipped
into the code checking for `== true`.
2020-01-31 16:35:43 -08:00
Adrien Grand 915a931e93
Bucket aggregation circuit breaker optimization. (#46751) (#51730)
Co-authored-by: Howard <danielhuang@tencent.com>
2020-01-31 11:30:51 +01:00
Ioannis Kakavas 72c5bc9630
Disable diagnose trust mananer only when needed (#51683)
Do not try to set xpack.ssl.security.diagnose.trust
when it is not registered
2020-01-30 21:35:03 +02:00
Armin Braun 7914c1a734
Optimize GCS Mock (#51593) (#51594)
This test was still very GC heavy in Java 8 runs in particular
which seems to slow down request processing to the point of timeouts
in some runs.
This PR completely removes the large number of O(MB) `byte[]` allocations
that were happening in the mock http handler which cuts the allocation rate
by about a factor of 5 in my local testing for the GC heavy `testSnapshotWithLargeSegmentFiles`
run.

Closes #51446
Closes #50754
2020-01-29 11:06:05 +01:00
Jim Ferenczi 77f4aafaa2 Expose the logic to cancel task when the rest channel is closed (#51423)
This commit moves the logic that cancels search requests when the rest channel is closed
to a generic client that can be used by other APIs. This will be useful for any rest action
that wants to cancel the execution of a task if the underlying rest channel is closed by the
client before completion.

Relates #49931
Relates #50990
Relates #50990
2020-01-28 22:55:42 +01:00
Armin Braun aae93a7578
Allow Repository Plugins to Filter Metadata on Create (#51472) (#51542)
* Allow Repository Plugins to Filter Metadata on Create

Add a hook that allows repository plugins to filter the repository metadata
before it gets written to the cluster state.
2020-01-28 18:33:26 +01:00
Yannick Welsch fa212fe60b Stricter checks of setup and teardown in docs tests (#51430)
Adds extra checks due to 7.x backport
2020-01-28 16:52:23 +01:00
Jake Landis 1197ebb9e9 Additional assertions for SLM restart tests (#48428)
Adds the ability to conditionally wipe SLM policies
and ensures that you can read what you wrote across
restarts.
2020-01-28 16:52:23 +01:00
Yannick Welsch f6686345c9 Avoid unnecessary setup and teardown in docs tests (#51430)
The docs tests have recently been running much slower than before (see #49753).

The gist here is that with ILM/SLM we do a lot of unnecessary setup / teardown work on each
test. Compounded with the slightly slower cluster state storage mechanism, this causes the
tests to run much slower.

In particular, on RAMDisk, docs:check is taking

ES 7.4: 6:55 minutes
ES master: 16:09 minutes
ES with this commit: 6:52 minutes

on SSD, docs:check is taking

ES 7.4: ??? minutes
ES master: 32:20 minutes
ES with this commit: 11:21 minutes
2020-01-28 16:52:23 +01:00
William Brafford 9efa5be60e
Password-protected Keystore Feature Branch PR (#51123) (#51510)
* Reload secure settings with password (#43197)

If a password is not set, we assume an empty string to be
compatible with previous behavior.
Only allow the reload to be broadcast to other nodes if TLS is
enabled for the transport layer.

* Add passphrase support to elasticsearch-keystore (#38498)

This change adds support for keystore passphrases to all subcommands
of the elasticsearch-keystore cli tool and adds a subcommand for
changing the passphrase of an existing keystore.
The work to read the passphrase in Elasticsearch when
loading, which will be addressed in a different PR.

Subcommands of elasticsearch-keystore can handle (open and create)
passphrase protected keystores

When reading a keystore, a user is only prompted for a passphrase
only if the keystore is passphrase protected.

When creating a keystore, a user is allowed (default behavior) to create one with an
empty passphrase

Passphrase can be set to be empty when changing/setting it for an
existing keystore

Relates to: #32691
Supersedes: #37472

* Restore behavior for force parameter (#44847)

Turns out that the behavior of `-f` for the add and add-file sub
commands where it would also forcibly create the keystore if it
didn't exist, was by design - although undocumented.
This change restores that behavior auto-creating a keystore that
is not password protected if the force flag is used. The force
OptionSpec is moved to the BaseKeyStoreCommand as we will presumably
want to maintain the same behavior in any other command that takes
a force option.

*  Handle pwd protected keystores in all CLI tools  (#45289)

This change ensures that `elasticsearch-setup-passwords` and
`elasticsearch-saml-metadata` can handle a password protected
elasticsearch.keystore.
For setup passwords the user would be prompted to add the
elasticsearch keystore password upon running the tool. There is no
option to pass the password as a parameter as we assume the user is
present in order to enter the desired passwords for the built-in
users.
For saml-metadata, we prompt for the keystore password at all times
even though we'd only need to read something from the keystore when
there is a signing or encryption configuration.

* Modify docs for setup passwords and saml metadata cli (#45797)

Adds a sentence in the documentation of `elasticsearch-setup-passwords`
and `elasticsearch-saml-metadata` to describe that users would be
prompted for the keystore's password when running these CLI tools,
when the keystore is password protected.

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>

* Elasticsearch keystore passphrase for startup scripts (#44775)

This commit allows a user to provide a keystore password on Elasticsearch
startup, but only prompts when the keystore exists and is encrypted.

The entrypoint in Java code is standard input. When the Bootstrap class is
checking for secure keystore settings, it checks whether or not the keystore
is encrypted. If so, we read one line from standard input and use this as the
password. For simplicity's sake, we allow a maximum passphrase length of 128
characters. (This is an arbitrary limit and could be increased or eliminated.
It is also enforced in the keystore tools, so that a user can't create a
password that's too long to enter at startup.)

In order to provide a password on standard input, we have to account for four
different ways of starting Elasticsearch: the bash startup script, the Windows
batch startup script, systemd startup, and docker startup. We use wrapper
scripts to reduce systemd and docker to the bash case: in both cases, a
wrapper script can read a passphrase from the filesystem and pass it to the
bash script.

In order to simplify testing the need for a passphrase, I have added a
has-passwd command to the keystore tool. This command can run silently, and
exit with status 0 when the keystore has a password. It exits with status 1 if
the keystore doesn't exist or exists and is unencrypted.

A good deal of the code-change in this commit has to do with refactoring
packaging tests to cleanly use the same tests for both the "archive" and the
"package" cases. This required not only moving tests around, but also adding
some convenience methods for an abstraction layer over distribution-specific
commands.

* Adjust docs for password protected keystore (#45054)

This commit adds relevant parts in the elasticsearch-keystore
sub-commands reference docs and in the reload secure settings API
doc.

* Fix failing Keystore Passphrase test for feature branch (#50154)

One problem with the passphrase-from-file tests, as written, is that
they would leave a SystemD environment variable set when they failed,
and this setting would cause elasticsearch startup to fail for other
tests as well. By using a try-finally, I hope that these tests will fail
more gracefully.

It appears that our Fedora and Ubuntu environments may be configured to
store journald information under /var rather than under /run, so that it
will persist between boots. Our destructive tests that read from the
journal need to account for this in order to avoid trying to limit the
output we check in tests.

* Run keystore management tests on docker distros (#50610)

* Add Docker handling to PackagingTestCase

Keystore tests need to be able to run in the Docker case. We can do this
by using a DockerShell instead of a plain Shell when Docker is running.

* Improve ES startup check for docker

Previously we were checking truncated output for the packaged JDK as
an indication that Elasticsearch had started. With new preliminary
password checks, we might get a false positive from ES keystore
commands, so we have to check specifically that the Elasticsearch
class from the Bootstrap package is what's running.

* Test password-protected keystore with Docker (#50803)

This commit adds two tests for the case where we mount a
password-protected keystore into a Docker container and provide a
password via a Docker environment variable.

We also fix a logging bug where we were logging the identifier for an
array of strings rather than the contents of that array.

* Add documentation for keystore startup prompting (#50821)

When a keystore is password-protected, Elasticsearch will prompt at
startup. This commit adds documentation for this prompt for the archive,
systemd, and Docker cases.

Co-authored-by: Lisa Cawley <lcawley@elastic.co>

* Warn when unable to upgrade keystore on debian (#51011)

For Red Hat RPM upgrades, we warn if we can't upgrade the keystore. This
commit brings the same logic to the code for Debian packages. See the
posttrans file for gets executed for RPMs.

* Restore handling of string input

Adds tests that were mistakenly removed. One of these tests proved
we were not handling the the stdin (-x) option correctly when no
input was added. This commit restores the original approach of
reading stdin one char at a time until there is no more (-1, \r, \n)
instead of using readline() that might return null

* Apply spotless reformatting

* Use '--since' flag to get recent journal messages

When we get Elasticsearch logs from journald, we want to fetch only log
messages from the last run. There are two reasons for this. First, if
there are many logs, we might get a string that's too large for our
utility methods. Second, when we're looking for a specific message or
error, we almost certainly want to look only at messages from the last
execution.

Previously, we've been trying to do this by clearing out the physical
files under the journald process. But there seems to be some contention
over these directories: if journald writes a log file in between when
our deletion command deletes the file and when it deletes the log
directory, the deletion will fail.

It seems to me that we might be able to use journald's "--since" flag to
retrieve only log messages from the last run, and that this might be
less likely to fail due to race conditions in file deletion.

Unfortunately, it looks as if the "--since" flag has a granularity of
one-second. I've added a two-second sleep to make sure that there's a
sufficient gap between the test that will read from journald and the
test before it.

* Use new journald wrapper pattern

* Update version added in secure settings request

Co-authored-by: Lisa Cawley <lcawley@elastic.co>
Co-authored-by: Ioannis Kakavas <ikakavas@protonmail.com>
2020-01-28 05:32:32 -05:00
Ioannis Kakavas 4f3548fbd7
Disable diagnostic trust manager in tests (#51501)
This commit sets `xpack.security.ssl.diagnose.trust` to false in all
of our tests when running in FIPS 140 mode and when settings objects
are used to create an instance of the SSLService. This is needed
in 7.x because setting xpack.security.ssl.diagnose.trust to true
wraps SunJSSE TrustManager with our own DiagnosticTrustManager and
this is not allowed when SunJSSE is in FIPS mode.
An alternative would be to set xpack.security.fips.enabled to
true which would also implicitly disable
xpack.security.ssl.diagnose.trust but would have additional effects
(would require that we set PBKDF2 for password hashing algorithm in
all test clusters, would prohibit using JKS keystores in nodes even
if relevant tests have been muted in FIPS mode etc.)

Relates: #49900
Resolves: #51268
2020-01-28 10:17:35 +02:00
Ioannis Kakavas ee202a642f
Enable tests in FIPS 140 in JDK 11 (#49485)
This change changes the way to run our test suites in 
JVMs configured in FIPS 140 approved mode. It does so by:

- Configuring any given runtime Java in FIPS mode with the bundled
policy and security properties files, setting the system
properties java.security.properties and java.security.policy
with the == operator that overrides the default JVM properties
and policy.

- When runtime java is 11 and higher, using BouncyCastle FIPS 
Cryptographic provider and BCJSSE in FIPS mode. These are 
used as testRuntime dependencies for unit
tests and internal clusters, and copied (relevant jars)
explicitly to the lib directory for testclusters used in REST tests

- When runtime java is 8, using BouncyCastle FIPS 
Cryptographic provider and SunJSSE in FIPS mode. 

Running the tests in FIPS 140 approved mode doesn't require an
additional configuration either in CI workers or locally and is
controlled by specifying -Dtests.fips.enabled=true
2020-01-27 11:14:52 +02:00
Przemysław Witek 8703b885c2
Move TupleMatchers class to org.elasticsearch.test.hamcrest package (#51359) (#51395) 2020-01-24 11:10:54 +01:00
Nhat Nguyen 1ca5dd13de
Flush instead of synced-flush inactive shards (#51365)
If all nodes are on 7.6, we prefer to perform a normal flush 
instead of synced flush when a shard becomes inactive.

Backport of #49126
2020-01-23 17:20:57 -05:00
Henning Andersen 8c8d0dbacc Revert "Workaround for JDK 14 EA FileChannel.map issue (#50523)" (#51323)
This reverts commit c7fd24ca1569a809b499caf34077599e463bb8d6.

Now that JDK-8236582 is fixed in JDK 14 EA, we can revert the workaround.

Relates #50523 and #50512
2020-01-23 11:27:02 +01:00
Nhat Nguyen 157b352b47 Exclude nested documents in LuceneChangesSnapshot (#51279)
LuceneChangesSnapshot can be slow if nested documents are heavily used.
Also, it estimates the number of operations to be recovered in peer
recoveries inaccurately. With this change, we prefer excluding the
nested non-root documents in a Lucene query instead.
2020-01-22 17:33:28 -05:00
Armin Braun d9ad66965c
Fix Rest Tests Failing to Cleanup Rollup Jobs (#51246) (#51294)
* Fix Rest Tests Failing to Cleanup Rollup Jobs

If the rollup jobs index doesn't exist for some reason (like running against a 6.x cluster)
we should just assume the jobs have been cleaned up and move on.

Closes #50819
2020-01-22 11:38:20 +01:00
Przemyslaw Gomulka 0513d8dca3
Add SPI jvm option to SystemJvmOptions (#50916)
Adding back accidentally removed jvm option that is required to enforce
start of the week = Monday in IsoCalendarDataProvider.
Adding a `feature` to yml test in order to skip running it in JDK8

commit that removed it 398c802
commit that backports SystemJvmOptions c4fbda3
relates 7.x backport of code that enforces CalendarDataProvider use #48349
2020-01-21 09:02:21 +01:00
Jay Modi 107989df3e
Introduce hidden indices (#51164)
This change introduces a new feature for indices so that they can be
hidden from wildcard expansion. The feature is referred to as hidden
indices. An index can be marked hidden through the use of an index
setting, `index.hidden`, at creation time. One primary use case for
this feature is to have a construct that fits indices that are created
by the stack that contain data used for display to the user and/or
intended for querying by the user. The desire to keep them hidden is
to avoid confusing users when searching all of the data they have
indexed and getting results returned from indices created by the
system.

Hidden indices have the following properties:
* API calls for all indices (empty indices array, _all, or *) will not
  return hidden indices by default.
* Wildcard expansion will not return hidden indices by default unless
  the wildcard pattern begins with a `.`. This behavior is similar to
  shell expansion of wildcards.
* REST API calls can enable the expansion of wildcards to hidden
  indices with the `expand_wildcards` parameter. To expand wildcards
  to hidden indices, use the value `hidden` in conjunction with `open`
  and/or `closed`.
* Creation of a hidden index will ignore global index templates. A
  global index template is one with a match-all pattern.
* Index templates can make an index hidden, with the exception of a
  global index template.
* Accessing a hidden index directly requires no additional parameters.

Backport of #50452
2020-01-17 10:09:01 -07:00
Yannick Welsch dc47b380c8 Block too many concurrent mapping updates (#51038)
Ensures that there are not too many concurrent dynamic mapping updates going out from the
data nodes to the master.

Closes #50670
2020-01-15 16:02:11 +01:00
Nik Everett fc5fde7950
Add "did you mean" to ObjectParser (#50938) (#50985)
Check it out:
```
$ curl -u elastic:password -HContent-Type:application/json -XPOST localhost:9200/test/_update/foo?pretty -d'{
  "dac": {}
}'

{
  "error" : {
    "root_cause" : [
      {
        "type" : "x_content_parse_exception",
        "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?"
      }
    ],
    "type" : "x_content_parse_exception",
    "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?"
  },
  "status" : 400
}
```

The tricky thing about implementing this is that x-content doesn't
depend on Lucene. So this works by creating an extension point for the
error message using SPI. Elasticsearch's server module provides the
"spell checking" implementation.
s
2020-01-14 17:53:41 -05:00
Armin Braun 16c07472e5
Track Snapshot Version in RepositoryData (#50930) (#50989)
* Track Snapshot Version in RepositoryData (#50930)

Add tracking of snapshot versions to RepositoryData to make BwC logic more efficient.
Follow up to #50853
2020-01-14 18:15:07 +01:00
Armin Braun 6e8ea7aaa2
Work around JVM Bug in LongGCDisruptionTests (#50731) (#50974)
There is a JVM bug causing `Thread#suspend` calls to randomly take
multiple seconds breaking these tests that call the method numerous times
in a loop. Increasing the timeout would will not work since we may call
`suspend` tens if not hundreds of times and even a small number of them
experiencing the blocking will lead to multiple minutes of waiting.

This PR detects the specific issue by timing the `Thread#suspend` calls and
skips the remainder of the test if it timed out because of the JVM bug.

Closes #50047
2020-01-14 17:13:21 +01:00
Tim Brooks d8510be3d9
Revert "Send cluster name and discovery node in handshake (#48916)" (#50944)
This reverts commit 0645ee88e2.
2020-01-14 09:53:13 -06:00
Yannick Welsch 91d7b446a0 Warn on slow metadata performance (#50956)
Has the new cluster state storage layer emit warnings in case metadata performance is very
slow.

Relates #48701
2020-01-14 15:04:28 +01:00
Yannick Welsch 22ba759e1f
Move metadata storage to Lucene (#50928)
* Move metadata storage to Lucene (#50907)

Today we split the on-disk cluster metadata across many files: one file for the metadata of each
index, plus one file for the global metadata and another for the manifest. Most metadata updates
only touch a few of these files, but some must write them all. If a node holds a large number of
indices then it's possible its disks are not fast enough to process a complete metadata update before timing out. In severe cases affecting master-eligible nodes this can prevent an election
from succeeding.

This commit uses Lucene as a metadata storage for the cluster state, and is a squashed version
of the following PRs that were targeting a feature branch:

* Introduce Lucene-based metadata persistence (#48733)

This commit introduces `LucenePersistedState` which master-eligible nodes
can use to persist the cluster metadata in a Lucene index rather than in
many separate files.

Relates #48701

* Remove per-index metadata without assigned shards (#49234)

Today on master-eligible nodes we maintain per-index metadata files for every
index. However, we also keep this metadata in the `LucenePersistedState`, and
only use the per-index metadata files for importing dangling indices. However
there is no point in importing a dangling index without any shard data, so we
do not need to maintain these extra files any more.

This commit removes per-index metadata files from nodes which do not hold any
shards of those indices.

Relates #48701

* Use Lucene exclusively for metadata storage (#50144)

This moves metadata persistence to Lucene for all node types. It also reenables BWC and adds
an interoperability layer for upgrades from prior versions.

This commit disables a number of tests related to dangling indices and command-line tools.
Those will be addressed in follow-ups.

Relates #48701

* Add command-line tool support for Lucene-based metadata storage (#50179)

Adds command-line tool support (unsafe-bootstrap, detach-cluster, repurpose, & shard
commands) for the Lucene-based metadata storage.

Relates #48701

* Use single directory for metadata (#50639)

Earlier PRs for #48701 introduced a separate directory for the cluster state. This is not needed
though, and introduces an additional unnecessary cognitive burden to the users.

Co-Authored-By: David Turner <david.turner@elastic.co>

* Add async dangling indices support (#50642)

Adds support for writing out dangling indices in an asynchronous way. Also provides an option to
avoid writing out dangling indices at all.

Relates #48701

* Fold node metadata into new node storage (#50741)

Moves node metadata to uses the new storage mechanism (see #48701) as the authoritative source.

* Write CS asynchronously on data-only nodes (#50782)

Writes cluster states out asynchronously on data-only nodes. The main reason for writing out
the cluster state at all is so that the data-only nodes can snap into a cluster, that they can do a
bit of bootstrap validation and so that the shard recovery tools work.
Cluster states that are written asynchronously have their voting configuration adapted to a non
existing configuration so that these nodes cannot mistakenly become master even if their node
role is changed back and forth.

Relates #48701

* Remove persistent cluster settings tool (#50694)

Adds the elasticsearch-node remove-settings tool to remove persistent settings from the on
disk cluster state in case where it contains incompatible settings that prevent the cluster from
forming.

Relates #48701

* Make cluster state writer resilient to disk issues (#50805)

Adds handling to make the cluster state writer resilient to disk issues. Relates to #48701

* Omit writing global metadata if no change (#50901)

Uses the same optimization for the new cluster state storage layer as the old one, writing global
metadata only when changed. Avoids writing out the global metadata if none of the persistent
fields changed. Speeds up server:integTest by ~10%.

Relates #48701

* DanglingIndicesIT should ensure node removed first (#50896)

These tests occasionally failed because the deletion was submitted before the
restarting node was removed from the cluster, causing the deletion not to be
fully acked. This commit fixes this by checking the restarting node has been
removed from the cluster.

Co-authored-by: David Turner <david.turner@elastic.co>

* fix tests

Co-authored-by: David Turner <david.turner@elastic.co>
2020-01-14 09:35:43 +01:00
Nhat Nguyen fb32a55dd5 Deprecate synced flush (#50835)
A normal flush has the same effect as a synced flush on Elasticsearch 
7.6 or later. It's deprecated in 7.6 and will be removed in 8.0.

Relates #50776
2020-01-13 19:54:38 -05:00
Nhat Nguyen 05f97d5e1b Revert "Deprecate synced flush (#50835)"
This reverts commit 1a32d7142a.
2020-01-13 11:41:03 -05:00
Nhat Nguyen 1a32d7142a
Deprecate synced flush (#50835)
A normal flush has the same effect as a synced flush on Elasticsearch 
7.6 or later. It's deprecated in 7.6 and will be removed in 8.0.

Relates #50776
2020-01-13 10:58:29 -05:00
Armin Braun f70e8f6ab5
Fix Snapshot Repository Corruption in Downgrade Scenarios (#50692) (#50797)
* Fix Snapshot Repository Corruption in Downgrade Scenarios (#50692)

This PR introduces test infrastructure for downgrading a cluster while interacting with a given repository.
It fixes the fact that repository metadata in the new format could be written while there's still older snapshots in the repository that require the old-format metadata to be restorable.
2020-01-09 21:21:13 +01:00
Armin Braun 4a7e09f624
Enforce Logging of Errors in GCS Rest RetriesTests (#50761) (#50783)
It's impossible to tell why #50754 fails without this change.
We're failing to close the `exchange` somewhere and there is no
write timeout in the GCS SDK (something to look into separately)
only a read timeout on the socket so if we're failing on an assertion without
reading the full request body (at least into the read-buffer) we're locking up
waiting forever on `write0`.

This change ensure the `exchange` is closed in the tests where we could lock up
on a write and logs the failure so we can find out what broke #50754.
2020-01-09 10:46:07 +01:00
Armin Braun a725896c92
Fix and Reenable SnapshotTool Minio Tests (#50736) (#50745)
This solves half of the problem in #46813 by moving the S3
tests to using the shared minio fixture so we at least have
some non-3rd-party, constantly running coverage on these tests.
2020-01-08 16:33:36 +01:00
Adrien Grand 31158ab3d5
Add per-field metadata. (#50333)
This PR adds per-field metadata that can be set in the mappings and is later
returned by the field capabilities API. This metadata is completely opaque to
Elasticsearch but may be used by tools that index data in Elasticsearch to
communicate metadata about fields with tools that then search this data. A
typical example that has been requested in the past is the ability to attach
a unit to a numeric field.

In order to not bloat the cluster state, Elasticsearch requires that this
metadata be small:
 - keys can't be longer than 20 chars,
 - values can only be numbers or strings of no more than 50 chars - no inner
   arrays or objects,
 - the metadata can't have more than 5 keys in total.

Given that metadata is opaque to Elasticsearch, field capabilities don't try to
do anything smart when merging metadata about multiple indices, the union of
all field metadatas is returned.

Here is how the meta might look like in mappings:

```json
{
  "properties": {
    "latency": {
      "type": "long",
      "meta": {
        "unit": "ms"
      }
    }
  }
}
```

And then in the field capabilities response:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms" ]
      }
    }
  }
}
```

When there are no conflicts, values are arrays of size 1, but when there are
conflicts, Elasticsearch includes all unique values in this array, without
giving ways to know which index has which metadata value:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms", "ns" ]
      }
    }
  }
}
```

Closes #33267
2020-01-08 16:21:18 +01:00
Rory Hunter b1ff74f652 New setting to prevent automatically importing dangling indices (#49174)
Introduce a new static setting, `gateway.auto_import_dangling_indices`, which prevents dangling indices from being automatically imported. Part of #48366.
2020-01-08 13:39:20 +01:00
Armin Braun d0d48311f4
Faster and Simpler GCS REST Mock (#50706) (#50707)
* Faster and Simpler GCS REST Mock

I reworked the GCS mock a little to use less copying+allocation,
log the full request body on failure to read a multi-part request
and generally be a little simpler and easy to follow to track down
the remaining issues that are causing almost daily failures from this
class's multi-part request parsing that can't be reproduced locally.
2020-01-07 20:17:46 +01:00
David Turner 2039cc813b Fix testDelayVariabilityAppliesToFutureTasks (#50667)
This test seems to be bogus as it was confusing a nominal execution time with a
delay (i.e. an elapsed time). This commit reworks the test to address this.

Fixes #50650
2020-01-07 10:01:03 +00:00
Armin Braun 72a405fafb
Fix GCS Mock Broken Handling of some Blobs (#50666) (#50671)
* Fix GCS Mock Broken Handling of some Blobs

We were incorrectly handling blobs starting in `\r\n` which broke
tests randomly when blobs started on these.

Relates #49429
2020-01-06 19:27:57 +01:00
Nhat Nguyen b71490b06b
Deprecate indices without soft-deletes (#50502) (#50634)
Soft-deletes will be enabled for all indices in 8.0. Hence, we should
deprecate new indices without soft-deletes in 7.x.

Backport of #50502
2020-01-06 08:44:30 -05:00
Henning Andersen 312bf44601 Workaround for JDK 14 EA FileChannel.map issue (#50523)
FileChannel.map provokes static initialization of ExtendedMapMode in
JDK14 EA, which needs elevated privileges.

Relates #50512
2020-01-06 12:18:49 +01:00
Nik Everett 2362c430cd
Clean up wire test case a bit (#50627) (#50632)
* Adds JavaDoc to `AbstractWireTestCase` and
`AbstractWireSerializingTestCase` so it is more obvious you should prefer
the latter if you have a choice
* Moves the `instanceReader` method out of `AbstractWireTestCase` becaue
it is no longer used.
* Marks a bunch of methods final so it is more obvious which classes are
for what.
* Cleans up the side effects of the above.
2020-01-05 16:20:38 -05:00
Martijn van Groningen 10ed1ae1d2
Add remote info to the HLRC (#50483)
The additional change to the original PR (#49657), is that `org.elasticsearch.client.cluster.RemoteConnectionInfo` now parses the initial_connect_timeout field as a string instead of a TimeValue instance.

The reason that this is needed is because that the initial_connect_timeout field in the remote connection api is serialized for human consumption, but not for parsing purposes.
Therefore the HLRC can't parse it correctly (which caused test failures in CI, but not in the PR CI
:( ). The way this field is serialized needs to be changed in the remote connection api, but that is a breaking change. We should wait making this change until rest api versioning is introduced.

Co-Authored-By: j-bean <anton.shuvaev91@gmail.com>

Co-authored-by: j-bean <anton.shuvaev91@gmail.com>
2019-12-24 15:11:58 +01:00
Nhat Nguyen 33204c2055 Use peer recovery retention leases for indices without soft-deletes (#50351)
Today, the replica allocator uses peer recovery retention leases to
select the best-matched copies when allocating replicas of indices with
soft-deletes. We can employ this mechanism for indices without
soft-deletes because the retaining sequence number of a PRRL is the
persisted global checkpoint (plus one) of that copy. If the primary and
replica have the same retaining sequence number, then we should be able
to perform a noop recovery. The reason is that we must be retaining
translog up to the local checkpoint of the safe commit, which is at most
the global checkpoint of either copy). The only limitation is that we
might not cancel ongoing file-based recoveries with PRRLs for noop
recoveries. We can't make the translog retention policy comply with
PRRLs. We also have this problem with soft-deletes if a PRRL is about to
expire.

Relates #45136
Relates #46959
2019-12-23 22:04:07 -05:00
Nhat Nguyen 1dc98ad617 Ensure global checkpoint was advanced and synced
We need to make sure that the global checkpoints and peer recovery
retention leases were advanced to the max_seq_no and synced; otherwise,
we can risk expiring some peer recovery retention leases because of the
file-based recovery threshold.

Relates #49448
2019-12-23 21:10:30 -05:00
Lee Hinman c3c9ccf61f
[7.x] Add ILM histore store index (#50287) (#50345)
* Add ILM histore store index (#50287)

* Add ILM histore store index

This commit adds an ILM history store that tracks the lifecycle
execution state as an index progresses through its ILM policy. ILM
history documents store output similar to what the ILM explain API
returns.

An example document with ALL fields (not all documents will have all
fields) would look like:

```json
{
  "@timestamp": 1203012389,
  "policy": "my-ilm-policy",
  "index": "index-2019.1.1-000023",
  "index_age":123120,
  "success": true,
  "state": {
    "phase": "warm",
    "action": "allocate",
    "step": "ERROR",
    "failed_step": "update-settings",
    "is_auto-retryable_error": true,
    "creation_date": 12389012039,
    "phase_time": 12908389120,
    "action_time": 1283901209,
    "step_time": 123904107140,
    "phase_definition": "{\"policy\":\"ilm-history-ilm-policy\",\"phase_definition\":{\"min_age\":\"0ms\",\"actions\":{\"rollover\":{\"max_size\":\"50gb\",\"max_age\":\"30d\"}}},\"version\":1,\"modified_date_in_millis\":1576517253463}",
    "step_info": "{... etc step info here as json ...}"
  },
  "error_details": "java.lang.RuntimeException: etc\n\tcaused by:etc etc etc full stacktrace"
}
```

These documents go into the `ilm-history-1-00000N` index to provide an
audit trail of the operations ILM has performed.

This history storage is enabled by default but can be disabled by setting
`index.lifecycle.history_index_enabled` to `false.`

Resolves #49180

* Make ILMHistoryStore.putAsync truly async (#50403)

This moves the `putAsync` method in `ILMHistoryStore` never to block.
Previously due to the way that the `BulkProcessor` works, it was possible
for `BulkProcessor#add` to block executing a bulk request. This was bad
as we may be adding things to the history store in cluster state update
threads.

This also moves the index creation to be done prior to the bulk request
execution, rather than being checked every time an operation was added
to the queue. This lessens the chance of the index being created, then
deleted (by some external force), and then recreated via a bulk indexing
request.

Resolves #50353
2019-12-20 12:33:36 -07:00
Jim Ferenczi 2acafd4b15
Optimize composite aggregation based on index sorting (#48399) (#50272)
Co-authored-by: Daniel Huang <danielhuang@tencent.com>

This is a spinoff of #48130 that generalizes the proposal to allow early termination with the composite aggregation when leading sources match a prefix or the entire index sort specification.
In such case the composite aggregation can use the index sort natural order to early terminate the collection when it reaches a composite key that is greater than the bottom of the queue.
The optimization is also applicable when a query other than match_all is provided. However the optimization is deactivated for sources that match the index sort in the following cases:
  * Multi-valued source, in such case early termination is not possible.
  * missing_bucket is set to true
2019-12-20 12:32:37 +01:00
Stuart Tettemer 689df1f28f
Scripting: ScriptFactory not required by compile (#50344) (#50392)
Avoid backwards incompatible changes for 8.x and 7.6 by removing type
restriction on compile and Factory.  Factories may optionally implement
ScriptFactory.  If so, then they can indicate determinism and thus
cacheability.

**Backport**

Relates: #49466
2019-12-19 12:50:25 -07:00
Stuart Tettemer 06a24f09cf
Scripting: Cache script results if deterministic (#50106) (#50329)
Cache results from queries that use scripts if they use only
deterministic API calls.  Nondeterministic API calls are marked in the
whitelist with the `@nondeterministic` annotation.  Examples are
`Math.random()` and `new Date()`.

Refs: #49466
2019-12-18 13:00:42 -07:00
Armin Braun 55cc5432d6
Fix S3 Repo Tests Incomplete Reads (#50268) (#50275)
We need to read in a loop here. A single read to a huge byte array will
only read 16k max with the S3 SDK so if the blob we're trying to fully
read is larger we close early and fail the size comparison.
Also, drain streams fully when checking existence to avoid S3 SDK warnings.
2019-12-17 15:33:09 +01:00
Yannick Welsch 1f981580aa Simplify InternalTestCluster.fullRestart (#50218)
With node ordinals gone, there's no longer a need for such a complicated full cluster restart
procedure (as we can now uniquely associate nodes to data folders).

Follow-up to #41652
2019-12-17 14:33:01 +01:00
Armin Braun 2e7b1ab375
Use ClusterState as Consistency Source for Snapshot Repositories (#49060) (#50267)
Follow up to #49729

This change removes falling back to listing out the repository contents to find the latest `index-N` in write-mounted blob store repositories.
This saves 2-3 list operations on each snapshot create and delete operation. Also it makes all the snapshot status APIs cheaper (and faster) by saving one list operation there as well in many cases.
This removes the resiliency to concurrent modifications of the repository as a result and puts a repository in a `corrupted` state in case loading `RepositoryData` failed from the assumed generation.
2019-12-17 10:55:13 +01:00
Armin Braun 761d6e8e4b
Remove BlobContainer Tests against Mocks (#50194) (#50220)
* Remove BlobContainer Tests against Mocks

Removing all these weird mocks as asked for by #30424.
All these tests are now part of real repository ITs and otherwise left unchanged if they had
independent tests that didn't call the `createBlobStore` method previously.
The HDFS tests also get added coverage as a side-effect because they did not have an implementation
of the abstract repository ITs.

Closes #30424
2019-12-16 11:37:09 +01:00
Nhat Nguyen df46848fb0 Migrate peer recovery from translog to retention lease (#49448)
Since 7.4, we switch from translog to Lucene as the source of history
for peer recoveries. However, we reduce the likelihood of
operation-based recoveries when performing a full cluster restart from
pre-7.4 because existing copies do not have PPRL.

To remedy this issue, we fallback using translog in peer recoveries if
the recovering replica does not have a peer recovery retention lease,
and the replication group hasn't fully migrated to PRRL.

Relates #45136
2019-12-15 10:24:39 -05:00
Nhat Nguyen c151a75dfe Use retention lease in peer recovery of closed indices (#48430)
Today we do not use retention leases in peer recovery for closed indices
because we can't sync retention leases on closed indices. This change
allows that ability and adjusts peer recovery to use retention leases
for all indices with soft-deletes enabled.

Relates #45136

Co-authored-by: David Turner <david.turner@elastic.co>
2019-12-15 10:24:34 -05:00
Armin Braun c73930988b
Remove Unused Delete Endpoint from GCS Mock (#50128) (#50134)
Follow up to #50024: we're not using the single-delete any more so no need to have a mock endpoint for it
2019-12-12 14:18:06 +01:00
Armin Braun 2a186c8148
Disable LongGCDisruptionTests on JDK11+12 (#50097) (#50126)
See discussion in #50047 (comment).
There are reproducible issues with Thread#suspend in Jdk11 and Jdk12 for me locally
and we have one failure for each on CI.
Jdk8 and Jdk13 are stable though on CI and in my testing so I'd selectively disable
this test here to keep the coverage. We aren't using suspend in production code so the
JDK bug behind this does not affect us.

Closes #50047
2019-12-12 11:40:44 +01:00
Armin Braun 6eee41e253
Remove Unused Single Delete in BlobStoreRepository (#50024) (#50123)
* Remove Unused Single Delete in BlobStoreRepository

There are no more production uses of the non-bulk delete or the delete that throws
on missing so this commit removes both these methods.
Only the bulk delete logic remains. Where the bulk delete was derived from single deletes,
the single delete code was inlined into the bulk delete method.
Where single delete was used in tests it was replaced by bulk deleting.
2019-12-12 11:17:46 +01:00
Armin Braun 0fae4065ef
Better Logging GCS Blobstore Mock (#50102) (#50124)
* Better Logging GCS Blobstore Mock

Two things:
1. We should just throw a descriptive assertion error and figure out why we're not reading a multi-part instead of
returning a `400` and failing the tests that way here since we can't reproduce these 400s locally.
2. We were missing logging the exception on a cleanup delete failure that coincides with the `400` issue in tests.

Relates #49429
2019-12-12 11:17:22 +01:00
Armin Braun d19c8db4e4
Fix GCS Mock Batch Delete Behavior (#50034) (#50084)
Batch deletes get a response for every delete request, not just those that actually hit an existing blob.
The fact that we only responded for existing blobs leads to a degenerate response that throws a parse exception if a batch delete only contains non-existant blobs.
2019-12-11 17:40:25 +01:00
Przemyslaw Gomulka 81ff2d0f0d
Allow skipping ranges of versions backport(#50014) (#50028)
Multiple version ranges are allowed to be used in section skip in yml
tests. This is useful when a bugfix was backported to latest versions
and all previous releases contain a wire breaking bug.
examples:
6.1.0 - 6.3.0, 6.6.0 - 6.7.9, 7.0 -
- 7.2, 8.0.0 -
backport #50014
2019-12-10 16:43:41 +01:00
Armin Braun ac2774c9fa
Use Cluster State to Track Repository Generation (#49729) (#49976)
Step on the road to #49060.

This commit adds the logic to keep track of a repository's generation
across repository operations. See changes to package level Javadoc for the concrete changes in the distributed state machine.

It updates the write side of new repository generations to be fully consistent via the cluster state. With this change, no `index-N` will be overwritten for the same repository ever. So eventual consistency issues around conflicting updates to the same `index-N` are not a possibility any longer.

With this change the read side will still use listing of repository contents instead of relying solely on the cluster state contents.
The logic for that will be introduced in #49060. This retains the ability to externally delete the contents of a repository and continue using it afterwards for the time being. In #49060 the use of listing to determine the repository generation will be removed in all cases (except for full-cluster restart) as the last step in this effort.
2019-12-09 09:02:57 +01:00
Stuart Tettemer 17cda5b2c0
Scripting: Groundwork for caching script results (#49895) (#49944)
In order to cache script results in the query shard cache, we need to
check if scripts are deterministic.  This change adds a default method
to the script factories, `isResultDeterministic() -> false` which is
used by the `QueryShardContext`.

Script results were never cached and that does not change here.  Future
changes will implement this method based on whether the results of the
scripts are deterministic or not and therefore cacheable.

Refs: #49466

**Backport**
2019-12-06 15:08:05 -07:00
Zachary Tong fec882a457 Decouple pipeline reductions from final agg reduction (#45796)
Historically only two things happened in the final reduction:
empty buckets were filled, and pipeline aggs were reduced (since it
was the final reduction, this was safe).  Usage of the final reduction
is growing however.  Auto-date-histo might need to perform
many reductions on final-reduce to merge down buckets, CCS
may need to side-step the final reduction if sending to a
different cluster, etc

Having pipelines generate their output in the final reduce was
convenient, but is becoming increasingly difficult to manage
as the rest of the agg framework advances.

This commit decouples pipeline aggs from the final reduction by
introducing a new "top level" reduce, which should be called
at the beginning of the reduce cycle (e.g. from the SearchPhaseController).
This will only reduce pipeline aggs on the final reduce after
the non-pipeline agg tree has been fully reduced.

By separating pipeline reduction into their own set of methods,
aggregations are free to use the final reduction for whatever
purpose without worrying about generating pipeline results
which are non-reducible
2019-12-05 16:11:54 -05:00
Stuart Tettemer 426c7a5e8f
Scripting: add available languages & contexts API (#49652) (#49815)
Adds `GET /_script_language` to support Kibana dynamic scripting
language selection.

Response contains whether `inline` and/or `stored` scripts are
enabled as determined by the `script.allowed_types` settings.

For each scripting language registered, such as `painless`,
`expression`, `mustache` or custom, available contexts for the language
are included as determined by the `script.allowed_contexts` setting.

Response format:
```
{
  "types_allowed": [
    "inline",
    "stored"
  ],
  "language_contexts": [
    {
      "language": "expression",
      "contexts": [
        "aggregation_selector",
        "aggs"
        ...
      ]
    },
    {
      "language": "painless",
      "contexts": [
        "aggregation_selector",
        "aggs",
        "aggs_combine",
        ...
      ]
    }
...
  ]
}
```

Fixes: #49463 

**Backport**
2019-12-04 16:18:22 -07:00
Armin Braun 996cddd98b
Stop Copying Every Http Request in Message Handler (#44564) (#49809)
* Copying the request is not necessary here. We can simply release it once the response has been generated and a lot of `Unpooled` allocations that way
* Relates #32228
   * I think the issue that preventet that PR  that PR from being merged was solved by #39634 that moved the bulk index marker search to ByteBuf bulk access so the composite buffer shouldn't require many additional bounds checks  (I'd argue the bounds checks we add, we save when copying the composite buffer)
* I couldn't neccessarily reproduce much of a speedup from this change, but I could reproduce a very measureable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)
2019-12-04 08:41:42 +01:00
Yannick Welsch fbb92f527a Replicate write actions before fsyncing them (#49746)
This commit fixes a number of issues with data replication:

- Local and global checkpoints are not updated after the new operations have been fsynced, but
might capture a state before the fsync. The reason why this probably went undetected for so
long is that AsyncIOProcessor is synchronous if you index one item at a time, and hence working
as intended unless you have a high enough level of concurrent indexing. As we rely in other
places on the assumption that we have an up-to-date local checkpoint in case of synchronous
translog durability, there's a risk for the local and global checkpoints not to be up-to-date after
replication completes, and that this won't be corrected by the periodic global checkpoint sync.
- AsyncIOProcessor also has another "bad" side effect here: if you index one bulk at a time, the
bulk is always first fsynced on the primary before being sent to the replica. Further, if one thread
is tasked by AsyncIOProcessor to drain the processing queue and fsync, other threads can
easily pile more bulk requests on top of that thread. Things are not very fair here, and the thread
might continue doing a lot more fsyncs before returning (as the other threads pile more and
more on top), which blocks it from returning as a replication request (e.g. if this thread is on the
primary, it blocks the replication requests to the replicas from going out, and delaying
checkpoint advancement).

This commit fixes all these issues, and also simplifies the code that coordinates all the after
write actions.
2019-12-03 12:22:46 +01:00
Mayya Sharipova 7cf170830c
Optimize sort on numeric long and date fields. (#49732)
This rewrites long sort as a `DistanceFeatureQuery`, which can
efficiently skip non-competitive blocks and segments of documents.
Depending on the dataset, the speedups can be 2 - 10 times.

The optimization can be disabled with setting the system property
`es.search.rewrite_sort` to `false`.

Optimization is skipped when an index has 50% or more data with
the same value.

Optimization is done through:
1. Rewriting sort as `DistanceFeatureQuery` which can
efficiently skip non-competitive blocks and segments of documents.

2. Sorting segments according to the primary numeric sort field(#44021)
This allows to skip non-competitive segments.

3. Using collector manager.
When we optimize sort, we sort segments by their min/max value.
As a collector expects to have segments in order,
we can not use a single collector for sorted segments.
We use collectorManager, where for every segment a dedicated collector
will be created.

4. Using Lucene's shared TopFieldCollector manager
This collector manager is able to exchange minimum competitive
score between collectors, which allows us to efficiently skip
the whole segments that don't contain competitive scores.

5. When index is force merged to a single segment, #48533 interleaving
old and new segments allows for this optimization as well,
as blocks with non-competitive docs can be skipped.

Backport for #48804


Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
2019-11-29 15:37:40 -05:00
Armin Braun 813b49adb4
Make BlobStoreRepository Aware of ClusterState (#49639) (#49711)
* Make BlobStoreRepository Aware of ClusterState (#49639)

This is a preliminary to #49060.

It does not introduce any substantial behavior change to how the blob store repository
operates. What it does is to add all the infrastructure changes around passing the cluster service to the blob store, associated test changes and a best effort approach to tracking the latest repository generation on all nodes from cluster state updates. This brings a slight improvement to the consistency
by which non-master nodes (or master directly after a failover) will be able to determine the latest repository generation. It does not however do any tricky checks for the situation after a repository operation
(create, delete or cleanup) that could theoretically be used to get even greater accuracy to keep this change simple.
This change does not in any way alter the behavior of the blobstore repository other than adding a better "guess" for the value of the latest repo generation and is mainly intended to isolate the actual logical change to how the
repository operates in #49060
2019-11-29 14:57:47 +01:00
Armin Braun 90e9d61f2b
Optimize GoogleCloudStorageHttpHandler (#49677) (#49707)
Removing a lot of needless buffering and array creation
to reduce the significant memory usage of tests using this.
The incoming stream from the `exchange` is already buffered
so there is no point in adding a ton of additional buffers
everywhere.
2019-11-29 11:17:47 +01:00
Jim Ferenczi 496bb9e2ee
Add a listener to track the progress of a search request locally (#49471) (#49691)
This commit adds a function in NodeClient that allows to track the progress
of a search request locally. Progress is tracked through a SearchProgressListener
that exposes query and fetch responses as well as partial and final reduces.
This new method can be used by modules/plugins inside a node in order to track the
progress of a local search request.

Relates #49091
2019-11-28 18:23:09 +01:00
Jim Ferenczi d6445fae4b Add a cluster setting to disallow loading fielddata on _id field (#49166)
This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false` any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field for future versions but it will be set to `true` when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when https://github.com/elastic/elasticsearch/issues/26472
is implemented.

Closes #43599
2019-11-28 09:35:28 +01:00
Armin Braun 3862400270
Remove Redundant EsBlobStoreTestCase (#49603) (#49605)
All the implementations of `EsBlobStoreTestCase` use the exact same
bootstrap code that is also used by their implementation of
`EsBlobStoreContainerTestCase`.
This means all tests might as well live under `EsBlobStoreContainerTestCase`
saving a lot of code duplication. Also, there was no HDFS implementation for
`EsBlobStoreTestCase` which is now automatically resolved by moving the tests over
since there is a HDFS implementation for the container tests.
2019-11-26 20:57:19 +01:00
Armin Braun 495b543e63
Improve Stability of GCS Mock API (#49592) (#49597)
Same as #49518 pretty much but for GCS.
Fixing a few more spots where input stream can get closed
without being fully drained and adding assertions to make sure
it's always drained.
Moved the no-close stream wrapper to production code utilities since
there's a number of spots in production code where it's also useful
(will reuse it there in a follow-up).
2019-11-26 16:53:51 +01:00
Nhat Nguyen d2e92a1791 EngineTestCase#getDocIds should use internal reader (#49564)
We do not guarantee that EngineTestCase#getDocIds is called after the 
engine has been externally refreshed. Hence, we trip an assertion
assertSearcherIsWarmedUp.

CI: https://gradle-enterprise.elastic.co/s/pm2at5qmfm2iu

Relates #48605
2019-11-25 21:07:30 -05:00
Armin Braun a5fa86ed97
Improve Stability of Mock APIs (#49518) (#49524)
This commit ensures that even for requests that are known to be empty body
we at least attempt to read one bytes from the request body input stream.
This is done to work around the behavior in `sun.net.httpserver.ServerImpl.Dispatcher#handleEvent`
that will close a TCP/HTTP connection that does not have the `eof` flag (see `sun.net.httpserver.LeftOverInputStream#isEOF`)
set on its input stream. As far as I can tell the only way to set this flag is to do a read when there's no more bytes buffered.
This fixes the numerous connection closing issues because the `ServerImpl` stops closing connections that it thinks
weren't fully drained.

Also, I removed a now redundant drain loop in the Azure handler as well as removed the connection closing in the error handler's
drain action (this shouldn't have an effect but makes things more predictable/easier to reason about IMO).

I would suggest merging this and closing related issue after verifying that this fixes things on CI.

The way to locally reproduce the issues we're seeing in tests is to make the retry timings more aggressive in e.g. the azure tests
and move them to single digit values. This makes the retries happen quickly enough that they run into the async connecting closing
of allegedly non-eof connections by `ServerImpl` and produces the exact kinds of failures we're seeing currently.

Relates #49401, #49429
2019-11-25 10:28:55 +01:00
Nhat Nguyen 8260cba629 Increase timeout while checking for no snapshotted commit (#49461)
If some replica is performing a file-based recovery, then the check 
assertNoSnapshottedIndexCommit would fail. We should increase the
timeout for this check so that we can wait until all recoveries done
or aborted.

Closes #49403
2019-11-24 15:12:34 -05:00
Armin Braun 231d079bf8
Fix Azure Mock Issues (#49377) (#49381)
Fixing a few small issues found in this code:
1. We weren't reading the request headers but the response headers when checking for blob existence in the mocked single upload path
2. Error code can never be `null` removed the dead code that resulted
3. In the logging wrapper we weren't checking for `Throwable` so any failing assertions in the http mock would not show up since they
run on a thread managed by the mock http server
2019-11-21 19:57:50 +01:00
Yannick Welsch 420825c3b5 Strengthen validateClusterFormed check (#49248)
Strengthens the validateClusterFormed check that is used by the test infrastructure to make
sure that nodes are properly connected and know about each other. Is used in situations where
the cluster is scaled up and down, and where there previously was a network disruption that has
been healed.

Closes #49243
2019-11-21 17:38:12 +01:00
Armin Braun df8d7b213b Add Logging to Mock Repo API Server (#49409)
While we log exception in the handler, we may still miss exceptions
hgiher up the execution chain. This adds logging of exceptions to all
operations on the IO loop including connection establishment.

Relates #49401
2019-11-21 11:33:57 +01:00
Tanguy Leroux f753fa2265 HttpHandlers should return correct list of objects (#49283)
This commit fixes the server side logic of "List Objects" operations
of Azure and S3 fixtures. Until today, the fixtures were returning a "
flat" view of stored objects and were not correctly handling the
delimiter parameter. This causes some objects listing to be wrongly
interpreted by the snapshot deletion logic in Elasticsearch which
relies on the ability to list child containers of BlobContainer (#42653)
to correctly delete stale indices.

As a consequence, the blobs were not correctly deleted from the
 emulated storage service and stayed in heap until they got garbage
collected, causing CI failures like #48978.

This commit fixes the server side logic of Azure and S3 fixture when
listing objects so that it now return correct common blob prefixes as
expected by the snapshot deletion process. It also adds an after-test
check to ensure that tests leave the repository empty (besides the
root index files).

Closes #48978
2019-11-20 09:26:42 +01:00
Jay Modi eed4cd25eb
ThreadPool and ThreadContext are not closeable (#43249) (#49273)
This commit changes the ThreadContext to just use a regular ThreadLocal
over the lucene CloseableThreadLocal. The CloseableThreadLocal solves
issues with ThreadLocals that are no longer needed during runtime but
in the case of the ThreadContext, we need it for the runtime of the
node and it is typically not closed until the node closes, so we miss
out on the benefits that this class provides.

Additionally by removing the close logic, we simplify code in other
places that deal with exceptions and tracking to see if it happens when
the node is closing.

Closes #42577
2019-11-19 13:15:16 -07:00
Armin Braun 0acba44a2e
Make Repository.getRepositoryData an Async API (#49299) (#49312)
This API call in most implementations is fairly IO heavy and slow
so it is more natural to be async in the first place.
Concretely though, this change is a prerequisite of #49060 since
determining the repository generation from the cluster state
introduces situations where this call would have to wait for other
operations to finish. Doing so in a blocking manner would break
`SnapshotResiliencyTests` and waste a thread.
Also, this sets up the possibility to in the future make use of async IO
where provided by the underlying Repository implementation.

In a follow-up `SnapshotsService#getRepositoryData` will be made async
as well (did not do it here, since it's another huge change to do so).
Note: This change for now does not alter the threading behaviour in any way (since `Repository#getRepositoryData` isn't forking) and is purely mechanical.
2019-11-19 16:49:12 +01:00
Tanguy Leroux ca4f55f2e4
Add docker-compose fixtures for S3 integration tests (#49107) (#49229)
Similarly to what has been done for Azure (#48636) and GCS (#48762),
this committ removes the existing Ant fixture that emulates a S3 storage
service in favor of multiple docker-compose based fixtures.

The goals here are multiple: be able to reuse a s3-fixture outside of the
repository-s3 plugin; allow parallel execution of integration tests; removes
the existing AmazonS3Fixture that has evolved in a weird beast in
dedicated, more maintainable fixtures.

The server side logic that emulates S3 mostly comes from the latest
HttpHandler made for S3 blob store repository tests, with additional
features extracted from the (now removed) AmazonS3Fixture:
authentication checks, session token checks and improved response
errors. Chunked upload request support for S3 object has been added
too.

The server side logic of all tests now reside in a single S3HttpHandler class.

Whereas AmazonS3Fixture contained logic for basic tests, session token
tests, EC2 tests or ECS tests, the S3 fixtures are now dedicated to each
kind of test. Fixtures are inheriting from each other, making things easier
to maintain.
2019-11-18 05:56:59 -05:00
markharwood c3745b03ee
Search optimisation - add canMatch early aborts for queries on "_index" field (#49158)
Make queries on the “_index” field fast-fail if the target shard is an index that doesn’t match the query expression. Part of the “canMatch” phase optimisations.

Closes #48473
2019-11-15 16:50:32 +00:00
Jay Modi b6ec066ca9
ESIntegTestCase always cleans up static fields (#49105) (#49108)
ESIntegTestCase has logic to clean up static fields in a method
annotated with `@AfterClass` so that these fields do not trigger the
StaticFieldsInvariantRule. However, during the exceptional close of the
test cluster, this cleanup can be missed. The StaticFieldsInvariantRule
always runs and will attempt to inspect the size of the static fields
that were not cleaned up. If the `currentCluster` field of
ESIntegTestCase references an InternalTestCluster, this could hold a
reference to an implementation of a `Path` that comes from the
`sun.nio.fs` package, which the security manager will deny access to.
This casues additional noise to be generated since the
AccessControlException will cause the StaticFieldsInvariantRule to fail
and also be reported along with the actual exception that occurred.

This change clears the static fields of ESIntegTestCase in a finally
block inside the `@AfterClass` method to prevent this unnecessary noise.

Closes #41526
2019-11-15 09:39:57 -07:00
Rory Hunter c46a0e8708
Apply 2-space indent to all gradle scripts (#49071)
Backport of #48849. Update `.editorconfig` to make the Java settings the
default for all files, and then apply a 2-space indent to all `*.gradle`
files. Then reformat all the files.
2019-11-14 11:01:23 +00:00
Henning Andersen 66f0c8900f
Fix Transport Stopped Exception (#48930) (#49035)
When a node shuts down, `TransportService` moves to stopped state and
then closes connections. If a request is done in between, an exception
was thrown that was not retried in replication actions. Now throw a
wrapped `NodeClosedException` exception instead, which is correctly
handled in replication action. Fixed other usages too.

Relates #42612
2019-11-13 18:48:05 +01:00
Tanguy Leroux 20fc1dbe18
Move MinIO fixture in its own project (#49036)
This commit moves the MinIO docker-compose fixture from the
:plugins:repository-s3 to its own :test:minio-fixture Gradle project.
2019-11-13 10:03:59 -05:00
Yannick Welsch 2dfa0133d5 Always use primary term from primary to index docs on replica (#47583)
Ensures that we always use the primary term established by the primary to index docs on the
replica. Makes the logic around replication less brittle by always using the operation primary
term on the replica that is coming from the primary.
2019-11-13 12:13:45 +01:00
Tanguy Leroux 1903505a3f Log exceptions thrown by HttpHandlers in repository integration tests (#48991)
This commit changes the ESMockAPIBasedRepositoryIntegTestCase so 
that HttpHandler are now wrapped in order to log any exceptions that 
could be thrown when executing the server side logic in repository 
integration tests.
2019-11-12 20:14:30 +01:00
Tim Brooks 0645ee88e2
Send cluster name and discovery node in handshake (#48916)
This commits sends the cluster name and discovery naode in the transport
level handshake response. This will allow us to stop sending the
transport service level handshake request in the 8.0-8.x release cycle.
It is necessary to start sending this in 7.x so that 8.0 is guaranteed
to be communicating with a version that sends the required information.
2019-11-11 18:42:02 -05:00
Yannick Welsch 87862868c6 Allow realtime get to read from translog (#48843)
The realtime GET API currently has erratic performance in case where a document is accessed
that has just been indexed but not refreshed yet, as the implementation will currently force an
internal refresh in that case. Refreshing can be an expensive operation, and also will block the
thread that executes the GET operation, blocking other GETs to be processed. In case of
frequent access of recently indexed documents, this can lead to a refresh storm and terrible
GET performance.

While older versions of Elasticsearch (2.x and older) did not trigger refreshes and instead opted
to read from the translog in case of realtime GET API or update API, this was removed in 5.0
(#20102) to avoid inconsistencies between values that were returned from the translog and
those returned by the index. This was partially reverted in 6.3 (#29264) to allow _update and
upsert to read from the translog again as it was easier to guarantee consistency for these, and
also brought back more predictable performance characteristics of this API. Calls to the realtime
GET API, however, would still always do a refresh if necessary to return consistent results. This
means that users that were calling realtime GET APIs to coordinate updates on client side
(realtime GET + CAS for conditional index of updated doc) would still see very erratic
performance.

This PR (together with #48707) resolves the inconsistencies between reading from translog and
index. In particular it fixes the inconsistencies that happen when requesting stored fields, which
were not available when reading from translog. In case where stored fields are requested, this
PR will reparse the _source from the translog and derive the stored fields to be returned. With
this, it changes the realtime GET API to allow reading from the translog again, avoid refresh
storms and blocking the GET threadpool, and provide overall much better and predictable
performance for this API.
2019-11-09 17:47:50 +01:00
Nhat Nguyen ff6c121eb9 Closed shard should never open new engine (#47186)
We should not open new engines if a shard is closed. We break this
assumption in #45263 where we stop verifying the shard state before
creating an engine but only before swapping the engine reference.
We can fail to snapshot the store metadata or checkIndex a closed shard
if there's some IndexWriter holding the index lock.

Closes #47060
2019-11-08 23:40:34 -05:00
Yannick Welsch af887be3e5 Hide orphaned tasks from follower stats (#48901)
CCR follower stats can return information for persistent tasks that are in the process of being cleaned up. This is problematic for tests where CCR follower indices have been deleted, but their persistent follower task is only cleaned up asynchronously afterwards. If one of the following tests then accesses the follower stats, it might still get the stats for that follower task.

In addition, some tests were not cleaning up their auto-follow patterns, leaving orphaned patterns behind. Other tests cleaned up their auto-follow patterns. As always the same name was used, it just depended on the test execution order whether this led to a failure or not. This commit fixes the offensive tests, and will also automatically remove auto-follow-patterns at the end of tests, like we do for many other features.

Closes #48700
2019-11-08 13:56:53 +01:00
Tanguy Leroux 8a14ea5567
Add docker-composed based test fixture for GCS (#48902)
Similarly to what has be done for Azure in #48636, this commit
adds a new :test:fixtures:gcs-fixture project which provides two
docker-compose based fixtures that emulate a Google Cloud
Storage service.

Some code has been extracted from existing tests and placed
into this new project so that it can be easily reused in other
projects.
2019-11-07 13:27:22 -05:00
Armin Braun d83e374062
Bound Linearizability Check in CoordinatorTests (#48751) (#48853)
Same as #44444 but for the coordinator tests.
Closes #48742
2019-11-04 21:36:17 +01:00
Armin Braun a22f6fbe3c
Cleanup Redundant Futures in Recovery Code (#48805) (#48832)
Follow up to #48110 cleaning up the redundant future
uses that were left over from that change.
2019-11-02 17:28:12 +01:00
Tanguy Leroux 989467ca1e
Add docker-compose based test fixture for Azure (#48736)
This commit adds a new :test:fixtures:azure-fixture project which 
provides a docker-compose based container that runs a AzureHttpFixture 
Java class that emulates an Azure Storage service.

The logic to emulate the service is extracted from existing tests and 
placed in AzureHttpHandler into the new project so that it can be 
easily reused. The :plugins:repository-azure project is an example 
of such utilization.

The AzureHttpFixture fixture is just a wrapper around AzureHttpHandler 
and is now executed within the docker container. 

The :plugins:repository-azure:qa:microsoft-azure project uses the new 
test fixture and the existing AzureStorageFixture has been removed.
2019-10-31 10:43:43 +01:00
Armin Braun 52e5ceb321
Restore from Individual Shard Snapshot Files in Parallel (#48110) (#48686)
Make restoring shard snapshots run in parallel on the `SNAPSHOT` thread-pool.
2019-10-30 14:36:30 +01:00
Tanguy Leroux 24f6985235 Reduce allocations when draining HTTP requests bodies in repository tests (#48541)
In repository integration tests, we drain the HTTP request body before 
returning a response. Before this change this operation was done using
Streams.readFully() which uses a 8kb buffer to read the input stream, it
 now uses a 1kb for the same operation. This should reduce the allocations 
made during the tests and speed them up a bit on CI.

Co-authored-by: Armin Braun <me@obrown.io>
2019-10-29 09:15:06 +01:00
Rory Hunter 30389c6660
Improve SAML tests resiliency to auto-formatting (#48517)
Backport of #48452.

The SAML tests have large XML documents within which various parameters
are replaced. At present, if these test are auto-formatted, the XML
documents get strung out over many, many lines, and are basically
illegible.

Fix this by using named placeholders for variables, and indent the
multiline XML documents.

The tests in `SamlSpMetadataBuilderTests` deserve a special mention,
because they include a number of certificates in Base64. I extracted
these into variables, for additional legibility.
2019-10-27 16:06:23 +00:00
Tim Brooks f5f1072824
Multiple remote connection strategy support (#48496)
* Extract remote "sniffing" to connection strategy (#47253)

Currently the connection strategy used by the remote cluster service is
implemented as a multi-step sniffing process in the
RemoteClusterConnection. We intend to introduce a new connection strategy
that will operate in a different manner. This commit extracts the
sniffing logic to a dedicated strategy class. Additionally, it implements
dedicated tests for this class.

Additionally, in previous commits we moved away from a world where the
remote cluster connection was mutable. Instead, when setting updates are
made, the connection is torn down and rebuilt. We still had methods and
tests hanging around for the mutable behavior. This commit removes those.

* Introduce simple remote connection strategy (#47480)

This commit introduces a simple remote connection strategy which will
open remote connections to a configurable list of user supplied
addresses. These addresses can be remote Elasticsearch nodes or
intermediate proxies. We will perform normal clustername and version
validation, but otherwise rely on the remote cluster to route requests
to the appropriate remote node.

* Make remote setting updates support diff strategies (#47891)

Currently the entire remote cluster settings infrastructure is designed
around the sniff strategy. As we introduce an additional conneciton
strategy this infrastructure needs to be modified to support it. This
commit modifies the code so that the strategy implementations will tell
the service if the connection needs to be torn down and rebuilt.

As part of this commit, we will wait 10 seconds for new clusters to
connect when they are added through the "update" settings
infrastructure.

* Make remote setting updates support diff strategies (#47891)

Currently the entire remote cluster settings infrastructure is designed
around the sniff strategy. As we introduce an additional conneciton
strategy this infrastructure needs to be modified to support it. This
commit modifies the code so that the strategy implementations will tell
the service if the connection needs to be torn down and rebuilt.

As part of this commit, we will wait 10 seconds for new clusters to
connect when they are added through the "update" settings
infrastructure.
2019-10-25 09:29:41 -06:00
Tim Brooks c0b545f325
Make BytesReference an interface (#48486)
BytesReference is currently an abstract class which is extended by
various implementations. This makes it very difficult to use the
delegation pattern. The implication of this is that our releasable
BytesReference is a PagedBytesReference type and cannot be used as a
generic releasable bytes reference that delegates to any reference type.
This commit makes BytesReference an interface and introduces an
AbstractBytesReference for common functionality.
2019-10-24 15:39:30 -06:00
Igor Motov bdbc353dea Geo: improve handling of out of bounds points in linestrings (#47939)
Brings handling of out of bounds points in linestrings in line with
points. Now points with latitude above 90 and below -90 are handled
the same way as for points by adjusting the longitude by moving it by
180 degrees.

Relates to #43916
2019-10-23 14:17:44 -04:00
Armin Braun 7215201406
Track Shard-Snapshot Index Generation at Repository Root (#48371)
This change adds a new field `"shards"` to `RepositoryData` that contains a mapping of `IndexId` to a `String[]`. This string array can be accessed by shard id to get the generation of a shard's shard folder (i.e. the `N` in the name of the currently valid `/indices/${indexId}/${shardId}/index-${N}` for the shard in question).

This allows for creating a new snapshot in the shard without doing any LIST operations on the shard's folder. In the case of AWS S3, this saves about 1/3 of the cost for updating an empty shard (see #45736) and removes one out of two remaining potential issues with eventually consistent blob stores (see #38941 ... now only the root `index-${N}` is determined by listing).

Also and equally if not more important, a number of possible failure modes on eventually consistent blob stores like AWS S3 are eliminated by moving all delete operations to the `master` node and moving from incremental naming of shard level index-N to uuid suffixes for these blobs.

This change moves the deleting of the previous shard level `index-${uuid}` blob to the master node instead of the data node allowing for a safe and consistent update of the shard's generation in the `RepositoryData` by first updating `RepositoryData` and then deleting the now unreferenced `index-${newUUID}` blob.
__No deletes are executed on the data nodes at all for any operation with this change.__

Note also: Previous issues with hanging data nodes interfering with master nodes are completely impossible, even on S3 (see next section for details).

This change changes the naming of the shard level `index-${N}` blobs to a uuid suffix `index-${UUID}`. The reason for this is the fact that writing a new shard-level `index-` generation blob is not atomic anymore in its effect. Not only does the blob have to be written to have an effect, it must also be referenced by the root level `index-N` (`RepositoryData`) to become an effective part of the snapshot repository.
This leads to a problem if we were to use incrementing names like we did before. If a blob `index-${N+1}` is written but due to the node/network/cluster/... crashes the root level `RepositoryData` has not been updated then a future operation will determine the shard's generation to be `N` and try to write a new `index-${N+1}` to the already existing path. Updates like that are problematic on S3 for consistency reasons, but also create numerous issues when thinking about stuck data nodes.
Previously stuck data nodes that were tasked to write `index-${N+1}` but got stuck and tried to do so after some other node had already written `index-${N+1}` were prevented form doing so (except for on S3) by us not allowing overwrites for that blob and thus no corruption could occur.
Were we to continue using incrementing names, we could not do this. The stuck node scenario would either allow for overwriting the `N+1` generation or force us to continue using a `LIST` operation to figure out the next `N` (which would make this change pointless).
With uuid naming and moving all deletes to `master` this becomes a non-issue. Data nodes write updated shard generation `index-${uuid}` and `master` makes those `index-${uuid}` part of the `RepositoryData` that it deems correct and cleans up all those `index-` that are unused.

Co-authored-by: Yannick Welsch <yannick@welsch.lu>
Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>
2019-10-23 10:58:26 +01:00
Tanguy Leroux 4790ee4c32 Reenable azure repository tests and remove some randomization in http servers (#48283)
Relates #47948
Relates #47380
2019-10-23 09:06:50 +02:00
Armin Braun 8a02a5fc7d
Simplify Shard Snapshot Upload Code (#48155) (#48345)
The code here was needlessly complicated when it
enqueued all file uploads up-front. Instead, we can
go with a cleaner worker + queue pattern here by taking
the max-parallelism from the threadpool info.

Also, I slightly simplified the rethrow and
listener (step listener is pointless when you add the callback in the next line)
handling it since I noticed that we were needlessly rethrowing in the same
code and that wasn't worth a separate PR.
2019-10-22 17:17:09 +01:00
Armin Braun dc08feadc6
Remove Redundant Version Param from Repository APIs (#48231) (#48298)
This parameter isn't used by any implementation
2019-10-21 16:20:45 +02:00
Ignacio Vera b1224fca8c
upgrade to Lucene-8.3.0-snapshot-25968e3b75e (#48227) 2019-10-21 08:21:09 +02:00
Alpar Torok cc26e30281
Increase timeout for yml tests (#48237)
Some of these are larger than what can complete in the regular timeout.

Closes #48212
2019-10-18 11:14:15 -07:00
jimczi b858e19bcc Revert #46598 that breaks the cachability of the sub search contexts. 2019-10-15 09:40:59 +02:00
Alpar Torok fbbe04b801 Add a verifyVersions to the test FW (#47192)
The test FW has a method to check that it's implementation of getting
index and wire compatible versions as well as reasoning about which
version is released or not produces the same rezults as the simillar
implementation in the build.

This PR adds the `verifyVersions` task to the test FW so we have one
task to check everything related to versions.
2019-10-10 11:23:56 +03:00
Armin Braun 302e09decf
Simplify some Common ActionRunnable Uses (#47799) (#47828)
Especially in the snapshot code there's a lot
of logic chaining `ActionRunnables` in tricky
ways now and the code is getting hard to follow.
This change introduces two convinience methods that
make it clear that a wrapped listener is invoked with
certainty in some trickier spots and shortens the code a bit.
2019-10-09 23:29:50 +02:00
Hendrik Muhs 5e0e54f455
[Transform] move root endpoint to _transform with BWC layer (#47127) (#47682)
move the main endpoint to /_transform/ from /_data_frame/transforms/ with providing backwards compatibility and deprecation warnings
2019-10-08 08:59:01 +02:00
Alpar Torok 2b16d7bcf8
Backport testclusters all (#47565)
* Bwc testclusters all (#46265)

Convert all bwc projects to testclusters

* Fix bwc versions config

* WIP fix rolling upgrade

* Fix bwc tests on old versions

* Fix rolling upgrade
2019-10-04 16:12:53 +03:00
Ryan Ernst f32692208e
Add explanations to script score queries (#46693) (#47548)
While function scores using scripts do allow explanations, they are only
creatable with an expert plugin. This commit improves the situation for
the newer script score query by adding the ability to set the
explanation from the script itself.

To set the explanation, a user would check for `explanation != null` to
indicate an explanation is needed, and then call
`explanation.set("some description")`.
2019-10-03 21:05:05 -07:00
Nhat Nguyen 5e4732f2bb Limit number of retaining translog files for peer recovery (#47414)
Today we control the extra translog (when soft-deletes is disabled) for
peer recoveries by size and age. If users manually (force) flush many
times within a short period, we can keep many small (or empty) translog
files as neither the size or age condition is reached. We can protect
the cluster from running out of the file descriptors in such a situation
by limiting the number of retaining translog files.
2019-10-03 20:45:29 -04:00
Yannick Welsch 99d2fe295d Use optype CREATE for single auto-id index requests (#47353)
Changes auto-id index requests to use optype CREATE, making it compliant with our docs.
This will also make these auto-id index requests compatible with the new "create-doc" index
privilege (which is based on the optype), the default optype is changed to create, just as it is
already documented.
2019-10-02 14:16:52 +02:00
Henning Andersen b5a2afccb2 MockSearchService concurrency fix (#47139)
Fixed MockSearchService concurrency, assertNoInFlightContext could
have false negative result (rarely).

Split out from #46060
Closes #47048
2019-10-02 12:33:18 +02:00
Tanguy Leroux f5c5411fe8
Differentiate base paths in repository integration tests (#47284) (#47300)
This commit change the repositories base paths used in Azure/S3/GCS
integration tests so that they don't conflict with each other when tests
 run in parallel on real storage services.

Closes #47202
2019-10-01 08:39:55 +02:00
Armin Braun 3d23cb44a3
Speed up Snapshot Finalization (#47283) (#47309)
As a result of #45689 snapshot finalization started to
take significantly longer than before. This may be a
little unfortunate since it increases the likelihood
of failing to finalize after having written out all
the segment blobs.
This change parallelizes all the metadata writes that
can safely run in parallel in the finalization step to
speed the finalization step up again. Also, this will
generally speed up the snapshot process overall in case
of large number of indices.

This is also a nice to have for #46250 since we add yet
another step (deleting of old index- blobs in the shards
to the finalization.
2019-09-30 23:28:59 +02:00
Yannick Welsch 9dc90e41fc Remove "force" version type (#47228)
It's been deprecated long ago and can be removed.

Relates to #20377

Closes #19769
2019-09-30 11:58:34 +02:00
Rory Hunter 53a4d2176f
Convert most awaitBusy calls to assertBusy (#45794) (#47112)
Backport of #45794 to 7.x. Convert most `awaitBusy` calls to
`assertBusy`, and use asserts where possible. Follows on from #28548 by
@liketic.

There were a small number of places where it didn't make sense to me to
call `assertBusy`, so I kept the existing calls but renamed the method to
`waitUntil`. This was partly to better reflect its usage, and partly so
that anyone trying to add a new call to awaitBusy wouldn't be able to find
it.

I also didn't change the usage in `TransportStopRollupAction` as the
comments state that the local awaitBusy method is a temporary
copy-and-paste.

Other changes:

  * Rework `waitForDocs` to scale its timeout. Instead of calling
    `assertBusy` in a loop, work out a reasonable overall timeout and await
    just once.
  * Some tests failed after switching to `assertBusy` and had to be fixed.
  * Correct the expect templates in AbstractUpgradeTestCase.  The ES
    Security team confirmed that they don't use templates any more, so
    remove this from the expected templates. Also rewrite how the setup
    code checks for templates, in order to give more information.
  * Remove an expected ML template from XPackRestTestConstants The ML team
    advised that the ML tests shouldn't be waiting for any
    `.ml-notifications*` templates, since such checks should happen in the
    production code instead.
  * Also rework the template checking code in `XPackRestTestHelper` to give
    more helpful failure messages.
  * Fix issue in `DataFrameSurvivesUpgradeIT` when upgrading from < 7.4
2019-09-29 12:21:46 +01:00
Tim Brooks e11c56760d
Fix bind failure logging for mock transport (#47150)
Currently the MockNioTransport uses a custom exception handler for
server channel exceptions. This means that bind failures are logged at
the warn level. This commit modifies the transport to use the common
TcpTransport exception handler which will log exceptions at the correct
level.
2019-09-27 13:53:48 -06:00
Nhat Nguyen 444b47ce88 Relax maxSeqNoOfUpdates assertion in FollowingEngine (#47188)
We disable MSU optimization if the local checkpoint is smaller than
max_seq_no_of_updates. Hence, we need to relax the MSU assertion in
FollowingEngine for that scenario. Suppose the leader has three
operations: index-0, delete-1, and index-2 for the same doc Id. MSU on
the leader is 1 as index-2 is an append. If the follower applies index-0
then index-2, then the assertion is violated.

Closes #47137
2019-09-27 14:00:20 -04:00
Tanguy Leroux 42ae76ab7c Injected response errors in Azure repository tests should have a body (#47176)
The Azure SDK client expects server errors to have a body, 
something that looks like:

<?xml version="1.0" encoding="utf-8"?>  
<Error>  
  <Code>string-value</Code>  
  <Message>string-value</Message>  
</Error> 

I've forgot to add such errors in Azure tests and that triggers 
some NPE in the client like the one reported in #47120.

Closes #47120
2019-09-27 09:43:29 +02:00
Jim Ferenczi 73a09b34b8
Replace SearchContextException with SearchException (#47046)
This commit removes the SearchContextException in favor of a simpler
SearchException that doesn't leak the SearchContext.

Relates #46523
2019-09-26 14:21:23 +02:00
Tanguy Leroux 95e2ca741e
Remove unused private methods and fields (#47154)
This commit removes a bunch of unused private fields and unused
private methods from the code base.

Backport of (#47115)
2019-09-26 12:49:21 +02:00
David Turner 45c7783018
Warn on slow metadata persistence (#47130)
Today if metadata persistence is excessively slow on a master-ineligible node
then the `ClusterApplierService` emits a warning indicating that the
`GatewayMetaState` applier was slow, but gives no further details. If it is
excessively slow on a master-eligible node then we do not see any warning at
all, although we might see other consequences such as a lagging node or a
master failure.

With this commit we emit a warning if metadata persistence takes longer than a
configurable threshold, which defaults to `10s`. We also emit statistics that
record how much index metadata was persisted and how much was skipped since
this can help distinguish cases where IO was slow from cases where there are
simply too many indices involved.

Backport of #47005.
2019-09-26 07:40:54 +01:00
Tim Brooks 4f47e1f169
Extract proxy connection logic to specialized class (#47138)
Currently the logic to check if a connection to a remote discovery node
exists and otherwise create a proxy connection is mixed with the
collect nodes, cluster connection lifecycle, and other
RemoteClusterConnection logic. This commit introduces a specialized
RemoteConnectionManager class which handles the open connections.
Additionally, it reworks the "round-robin" proxy logic to create the list
of potential connections at connection open/close time, opposed to each
time a connection is requested.
2019-09-25 15:58:18 -06:00
David Turner ac920e8e64 Assert no exceptions during state application (#47090)
Today we log and swallow exceptions during cluster state application, but such
an exception should not occur. This commit adds assertions of this fact, and
updates the Javadocs to explain it.

Relates #47038
2019-09-25 12:32:51 +01:00
Tim Brooks 6720c56bdd
Set netty system properties in BuildPlugin (#45881)
Currently in production instances of Elasticsearch we set a couple of
system properties by default. We currently do not apply all of these
system properties in tests. This commit applies these properties in the
tests.
2019-09-24 10:49:36 -06:00
David Turner 6943a3101f
Cut PersistedState interface from GatewayMetaState (#46655)
Today `GatewayMetaState` implements `PersistedState` but it's an error to use
it as a `PersistedState` before it's been started, or if the node is
master-ineligible. It also holds some fields that are meaningless on nodes that
do not persist their states. Finally, it takes responsibility for both loading
the original cluster state and some of the high-level logic for writing the
cluster state back to disk.

This commit addresses these concerns by introducing a more specific
`PersistedState` implementation for use on master-eligible nodes which is only
instantiated if and when it's appropriate. It also moves the fields and
high-level persistence logic into a new `IncrementalClusterStateWriter` with a
more appropriate lifecycle.

Follow-up to #46326 and #46532
Relates #47001
2019-09-24 12:31:13 +01:00
Julie Tibshirani 9124c94a6c
Add support for aliases in queries on _index. (#46944)
Previously, queries on the _index field were not able to specify index aliases.
This was a regression in functionality compared to the 'indices' query that was
deprecated and removed in 6.0.

Now queries on _index can specify an alias, which is resolved to the concrete
index names when we check whether an index matches. To match a remote shard
target, the pattern needs to be of the form 'cluster:index' to match the
fully-qualified index name. Index aliases can be specified in the following query
types: term, terms, prefix, and wildcard.
2019-09-23 13:21:37 -07:00
Jim Ferenczi 08f28e642b Replace SearchContext with QueryShardContext in query builder tests (#46978)
This commit replaces the SearchContext used in AbstractQueryTestCase with
a QueryShardContext in order to reduce the visibility of search contexts.

Relates #46523
2019-09-23 20:24:02 +02:00
Luca Cavanna d4d1182677 update _common.json format (#46872)
API spec now use an object for the documentation field. _common was not updated yet. This commit updates _common.json and its corresponding parser.

Closes #46744

Co-Authored-By: Tomas Della Vedova <delvedor@users.noreply.github.com>
2019-09-23 17:01:29 +02:00
Yannick Welsch 9638ca20b0 Allow dropping documents with auto-generated ID (#46773)
When using auto-generated IDs + the ingest drop processor (which looks to be used by filebeat
as well) + coordinating nodes that do not have the ingest processor functionality, this can lead
to a NullPointerException.

The issue is that markCurrentItemAsDropped() is creating an UpdateResponse with no id when
the request contains auto-generated IDs. The response serialization is lenient for our
REST/XContent format (i.e. we will send "id" : null) but the internal transport format (used for
communication between nodes) assumes for this field to be non-null, which means that it can't
be serialized between nodes. Bulk requests with ingest functionality are processed on the
coordinating node if the node has the ingest capability, and only otherwise sent to a different
node. This means that, in order to reproduce this, one needs two nodes, with the coordinating
node not having the ingest functionality.

Closes #46678
2019-09-19 16:46:33 +02:00
Tanguy Leroux 3ae51f25dd Move testSnapshotWithLargeSegmentFiles to ESMockAPIBasedRepositoryIntegTestCase (#46802)
This commit moves the common test testSnapshotWithLargeSegmentFiles 
to the ESMockAPIBasedRepositoryIntegTestCase base class.
2019-09-18 15:41:30 +02:00
Armin Braun f983b67fdc
Add Assertion About Leaking index-N to Repo Tests (#46774) (#46801)
This adds an assert to make sure we're not leaking
index-N blobs on the shard level to the repo consistency checks.
It is ok to have a single redundant index-N blob in a failure scenario
but additional index-N should always be cleaned up before adding more.
2019-09-18 13:15:56 +02:00
Tanguy Leroux 4db37801d0 Add resumable uploads support to GCS repository integration tests (#46562)
This commit adds support for resumable uploads to the internal HTTP 
server used in GoogleCloudStorageBlobStoreRepositoryTests. This 
way we can also test the behavior of the Google's client when the 
service returns server errors in response to resumable upload requests.

The BlobStore implementation for GCS has the choice between 2 
methods to upload a blob: resumable and multipart. In the current 
implementation, the client executes a resumable upload if the blob 
size is larger than LARGE_BLOB_THRESHOLD_BYTE_SIZE, 
otherwise it executes a multipart upload. This commit makes this 
logic overridable in tests, allowing to randomize the decision of 
using one method or the other.

The commit add support for single request resumable uploads 
and chunked resumable uploads (the blob is uploaded into multiple 
2Mb chunks; each chunk being a resumable upload). For this last 
case, this PR also adds a test testSnapshotWithLargeSegmentFiles 
which makes it more probable that a chunked resumable upload is 
executed.
2019-09-18 09:33:05 +02:00
Armin Braun 2c70d403fc
Reenable+Fix testMasterShutdownDuringFailedSnapshot (#46303) (#46747)
Reenable this test since it was fixed by #45689 in production
code (specifically, the fact that we write the `snap-` blobs
without overwrite checks now).
Only required adding the assumed blocking on index file writes
to test code to properly work again.

* Closes #25281
2019-09-17 18:09:48 +02:00
Armin Braun b00de8edf3
Ensure SAS Tokens in Test Use Minimal Permissions (#46112) (#46628)
There were some issues with the Azure implementation requiring
permissions to list all containers ue to a container exists
check. This was caught in CI this time, but going forward we
should ensure that CI is executed using a token that does not
allow listing containers.

Relates #43288
2019-09-17 15:40:11 +02:00
Armin Braun b0f09b279f
Make Snapshot Logic Write Metadata after Segments (#45689) (#46764)
* Write metadata during snapshot finalization after segment files to prevent outdated metadata in case of dynamic mapping updates as explained in #41581
* Keep the old behavior of writing the metadata beforehand in the case of mixed version clusters for BwC reasons
   * Still overwrite the metadata in the end, so even a mixed version cluster is fixed by this change if a newer version master does the finalization
* Fixes #41581
2019-09-17 13:09:39 +02:00
Przemysław Witek e49be611ad
[7.x] Add audit messages for Data Frame Analytics (#46521) (#46738) 2019-09-16 21:21:38 +02:00
Nhat Nguyen 5465c8d095 Increase timeout for relocation tests (#46554)
There's nothing wrong in the logs from these failures. I think 30
seconds might not be enough to relocate shards with many documents as CI
is quite slow. This change increases the timeout to 60 seconds for these
relocation tests. It also dumps the hot threads in case of timed out.

Closes #46526
Closes #46439
2019-09-12 16:34:01 -04:00
Jim Ferenczi 4407f3af1b Delay the creation of SubSearchContext to the FetchSubPhase (#46598)
This change delays the creation of the SubSearchContext for nested and parent/child inner_hits
to the fetch sub phase in order to ensure that a SearchContext can built entirely from a
QueryShardContext. This commit also adds a validation step to the inner hits builder that ensures that we fail the request early if the inner hits path is invalid.

Relates #46523
2019-09-12 14:52:15 +02:00
Jim Ferenczi 23bf310c84 Replace the SearchContext with QueryShardContext when building aggregator factories (#46527)
This commit replaces the `SearchContext` with the `QueryShardContext` when building aggregator factories. Aggregator factories are part of the `SearchContext` so they shouldn't require a `SearchContext` to create them.
The main changes here are the signatures of `AggregationBuilder#build` that now takes a `QueryShardContext` and `AggregatorFactory#createInternal` that passes the `SearchContext` to build the `Aggregator`.

Relates #46523
2019-09-11 16:43:30 +02:00
Armin Braun 41633cb9b5
More Efficient Ordering of Shard Upload Execution (#42791) (#46588)
* More Efficient Ordering of Shard Upload Execution (#42791)
* Change the upload order of of snapshots to work file by file in parallel on the snapshot pool instead of merely shard-by-shard
* Inspired by #39657
* Cleanup BlobStoreRepository Abort and Failure Handling (#46208)
2019-09-11 13:59:20 +02:00
Jim Ferenczi 425b1a77e8 Add more context to QueryShardContext (#46584)
This change adds an IndexSearcher and the node's BigArrays in the QueryShardContext.
It's a spin off of #46527 as this change is required to allow aggregation builder to solely use the
query shard context.

Relates #46523
2019-09-11 12:24:51 +02:00
David Turner 6c67b53932
Load metadata at start time not construction time (#46326)
Today we load the metadata from disk while constructing the node. However there
is no real need to do so, and this commit moves that code to run later while
the node is starting instead.
2019-09-10 11:15:10 +01:00
Tanguy Leroux 88bed09119 Mutualize code in cloud-based repository integration tests (#46483)
This commit factors out some common code between the cloud-based
repository integration tests that were recently improved.

Relates #46376
2019-09-09 16:02:14 +02:00
Armin Braun 1bb1c77885
Increase REST-Test Client Timeout to 60s (#46455) (#46461)
We are seeing requests take more than the default 30s
which leads to requests being retried and returning
unexpected failures like e.g. "index already exists"
because the initial requests that timed out, worked
out functionally anyway.
=> double the timeout to reduce the likelihood of
the failures described in #46091
=> As suggested in the issue, we should in a follow-up
turn off retrying all-together probably
2019-09-07 07:40:16 +02:00
Tanguy Leroux 28974b5723 Replace mocked client in GCSBlobStoreRepositoryTests by HTTP server (#46255)
This commit removes the usage of MockGoogleCloudStoragePlugin in
GoogleCloudStorageBlobStoreRepositoryTests and replaces it by a
HttpServer that emulates the Storage service. This allows the repository
tests to use the real Google's client under the hood in tests and will allow
us to test the behavior of the snapshot/restore feature for GCS repositories
by simulating random server-side internal errors.

The HTTP server used to emulate the Storage service is intentionally simple
and minimal to keep things understandable and maintainable. Testing full
client options on the server side (like authentication, chunked encoding
etc) remains the responsibility of the GoogleCloudStorageFixture.
2019-09-05 10:37:37 +02:00
Alpar Torok d709a5c193 Quote the task name in reproduction line printer (#46266)
Some tasks have `#` for instance that doesn't play well with some shells
( e.x. zsh )
2019-09-04 12:22:58 +03:00
Lee Hinman 57f322f85e Move MockRespository into test framework (#46298)
This moves the `MockRespository` class into `test/framework/src/main` so
it can be used across all modules and plugins in tests.
2019-09-03 16:21:10 -06:00
David Turner d340530a47 Avoid overshooting watermarks during relocation (#46079)
Today the `DiskThresholdDecider` attempts to account for already-relocating
shards when deciding how to allocate or relocate a shard. Its goal is to stop
relocating shards onto a node before that node exceeds the low watermark, and
to stop relocating shards away from a node as soon as the node drops below the
high watermark.

The decider handles multiple data paths by only accounting for relocating
shards that affect the appropriate data path. However, this mechanism does not
correctly account for _new_ relocating shards, which are unwittingly ignored.
This means that we may evict far too many shards from a node above the high
watermark, and may relocate far too many shards onto a node causing it to blow
right past the low watermark and potentially other watermarks too.

There are in fact two distinct issues that this PR fixes. New incoming shards
have an unknown data path until the `ClusterInfoService` refreshes its
statistics. New outgoing shards have a known data path, but we fail to account
for the change of the corresponding `ShardRouting` from `STARTED` to
`RELOCATING`, meaning that we fail to find the correct data path and treat the
path as unknown here too.

This PR also reworks the `MockDiskUsagesIT` test to avoid using fake data paths
for all shards. With the changes here, the data paths are handled in tests as
they are in production, except that their sizes are fake.

Fixes #45177
2019-08-29 12:40:55 +01:00
Rory Hunter 3666bcfbd8
Handle multiple loopback addresses (#46061)
AbstractSimpleTransportTestCase.testTransportProfilesWithPortAndHost
expects a host to only have a single IPv4 loopback address, which isn't
necessarily the case. Allow for >= 1 address.

Backport of #45901.
2019-08-29 09:45:51 +01:00
Tanguy Leroux 9e14ffa8be Few clean ups in ESBlobStoreRepositoryIntegTestCase (#46068) 2019-08-28 16:29:46 +02:00
Luca Cavanna 267183998e [TEST] wait for http channels to be closed in ESIntegTestCase (#45977)
We recently added a check to `ESIntegTestCase` in order to verify that
no http channels are being tracked when we close clusters and the
REST client. Close listeners though are invoked asynchronously, hence
this check may fail if we assert before the close listener that removes
the channel from the map is invoked.

With this commit we add an `assertBusy` so we try and wait for the map
to be empty.

Closes #45914
Closes #45955
2019-08-27 14:00:24 +02:00
Nhat Nguyen f2e8b17696 Do not create engine under IndexShard#mutex (#45263)
Today we create new engines under IndexShard#mutex. This is not ideal
because it can block the cluster state updates which also execute under
the same mutex. We can avoid this problem by creating new engines under
a separate mutex.

Closes #43699
2019-08-26 17:18:29 -04:00
Tanguy Leroux 8e66df9925 Move testRetentionLeasesClearedOnRestore (#45896) 2019-08-23 13:43:40 +02:00
Jason Tedor de6b6fd338
Add node.processors setting in favor of processors (#45885)
This commit namespaces the existing processors setting under the "node"
namespace. In doing so, we deprecate the existing processors setting in
favor of node.processors.
2019-08-22 22:18:37 -04:00
Armin Braun bfddaaa2ae
Acknowledge Indices Were Wiped Successfully in REST Tests (#45832) (#45842)
In internal test clusters tests we check that wiping all indices was acknowledged
but in REST tests we didn't.
This aligns the behavior in both kinds of tests.
Relates #45605 which might be caused by unacked deletes that were just slow.
2019-08-22 17:19:51 +02:00
Luca Cavanna a47ade3e64 Cancel search task on connection close (#43332)
This PR introduces a mechanism to cancel a search task when its corresponding connection gets closed. That would relief users from having to manually deal with tasks and cancel them if needed. Especially the process of finding the task_id requires calling get tasks which needs to call every node in the cluster.

The implementation is based on associating each http channel with its currently running search task, and cancelling the task when the previously registered close listener gets called.
2019-08-22 10:43:20 +02:00
Armin Braun 824f1090a9
Disable testTimeoutPerConnection on Windows (#45785) (#45818)
* It appears this test that is specific to how the BSD network stack works
does randomly fail on Windows => disabling it since it's not clear that it
should work on Windows in a stable way
* Fixes #45777
2019-08-22 06:06:09 +02:00
William Brafford 2b549e7342
CLI tools: write errors to stderr instead of stdout (#45586)
Most of our CLI tools use the Terminal class, which previously did not provide methods for writing to standard output. When all output goes to standard out, there are two basic problems. First, errors and warnings are "swallowed" in pipelines, making it hard for a user to know when something's gone wrong. Second, errors and warnings are intermingled with legitimate output, making it difficult to pass the results of interactive scripts to other tools.

This commit adds a second set of print commands to Terminal for printing to standard error, with errorPrint corresponding to print and errorPrintln corresponding to println. This leaves it to developers to decide which output should go where. It also adjusts existing commands to send errors and warnings to stderr.

Usage is printed to standard output when it's correctly requested (e.g., bin/elasticsearch-keystore --help) but goes to standard error when a command is invoked incorrectly (e.g. bin/elasticsearch-keystore list-with-a-typo | sort).
2019-08-21 14:46:07 -04:00
Armin Braun 6aaee8aa0a
Repository Cleanup Endpoint (#43900) (#45780)
* Repository Cleanup Endpoint (#43900)

* Snapshot cleanup functionality via transport/REST endpoint.
* Added all the infrastructure for this with the HLRC and node client
* Made use of it in tests and resolved relevant TODO
* Added new `Custom` CS element that tracks the cleanup logic.
Kept it similar to the delete and in progress classes and gave it
some (for now) redundant way of handling multiple cleanups but only allow one
* Use the exact same mechanism used by deletes to have the combination
of CS entry and increment in repository state ID provide some
concurrency safety (the initial approach of just an entry in the CS
was not enough, we must increment the repository state ID to be safe
against concurrent modifications, otherwise we run the risk of "cleaning up"
blobs that just got created without noticing)
* Isolated the logic to the transport action class as much as I could.
It's not ideal, but we don't need to keep any state and do the same
for other repository operations
(like getting the detailed snapshot shard status)
2019-08-21 17:59:49 +02:00
Gordon Brown ecb3ebd796
Clean SLM and ongoing snapshots in test framework (#45564)
Adjusts the cluster cleanup routine in ESRestTestCase to clean up SLM
test cases, and optionally wait for all snapshots to be deleted.

Waiting for all snapshots to be deleted, rather than failing if any are
in progress, is necessary for tests which use SLM policies because SLM
policies may be in the process of executing when the test ends.
2019-08-16 14:17:34 -06:00
Igor Motov 98c850c08b
Geo: Change order of parameter in Geometries to lon, lat 7.x (#45618)
Changes the order of parameters in Geometries from lat, lon to lon, lat
and moves all Geometry classes are moved to the
org.elasticsearch.geomtery package.

Backport of #45332

Closes #45048
2019-08-16 14:42:02 -04:00
Luca Cavanna c31cddf27e
Update the schema for the REST API specification (#42346)
* Update the REST API specification

This patch updates the REST API spefication in JSON files to better encode deprecated entities,
to improve specification of URL paths, and to open up the schema for future extensions.

Notably, it changes the `paths` from a list of strings to a list of objects, where each
particular object encodes all the information for this particular path: the `parts` and the `methods`.

Among the benefits of this approach is eg. encoding the difference between using the `PUT` and `POST`
methods in the Index API, to either use a specific document ID, or let Elasticsearch generate one.

Also `documentation` becomes an object that supports an `url` and also a `description` which is a
new field.

* Adapt YAML runner to new REST API specification format

The logic for choosing the path to use when running tests has been
simplified, as a consequence of the path parts being listed under each
path in the spec. The special case for create and index has been removed.

Also the parsing code has been hardened so that errors are thrown earlier
when the structure of the spec differs from what expected, and their
error messages should be more helpful.
2019-08-16 14:40:00 +02:00
Alpar Torok 4a67645e5d Use dynamic ports for ESSingleNodeTestCase too
Extends #45601 to cover all tests.
2019-08-16 09:17:19 +03:00
Armin Braun 73e266b2fd
Fix Failures when Closing Indices in EsBlobStoreRepositoryIntegTestCase (#45532) (#45614)
* Same issue as in #44754 as far as I can see: in case of async translog persistence we randomly fail to close
* Closes #45335 
* Closes #45334
2019-08-15 19:45:17 +02:00
Alpar Torok 03a1645bc6 Use dynamic port ranges for ExternalTestCluster (#45601)
Moves methods added in #44213 and uses them to configure the port range
for `ExternalTestCluster` too.
These were still using `9300-9400` ( teh default ) and running into
races.
2019-08-15 16:40:12 +03:00
Nick Knize 647a8308c3
[SPATIAL] Backport new ShapeFieldMapper and ShapeQueryBuilder to 7x (#45363)
* Introduce Spatial Plugin (#44389)

Introduce a skeleton Spatial plugin that holds new licensed features coming to 
Geo/Spatial land!

* [GEO] Refactor DeprecatedParameters in AbstractGeometryFieldMapper (#44923)

Refactor DeprecatedParameters specific to legacy geo_shape out of
AbstractGeometryFieldMapper.TypeParser#parse.

* [SPATIAL] New ShapeFieldMapper for indexing cartesian geometries (#44980)

Add a new ShapeFieldMapper to the xpack spatial module for
indexing arbitrary cartesian geometries using a new field type called shape.
The indexing approach leverages lucene's new XYShape field type which is
backed by BKD in the same manner as LatLonShape but without the WGS84
latitude longitude restrictions. The new field mapper builds on and
extends the refactoring effort in AbstractGeometryFieldMapper and accepts
shapes in either GeoJSON or WKT format (both of which support non geospatial
geometries).

Tests are provided in the ShapeFieldMapperTest class in the same manner
as GeoShapeFieldMapperTests and LegacyGeoShapeFieldMapperTests.
Documentation for how to use the new field type and what parameters are
accepted is included. The QueryBuilder for searching indexed shapes is
provided in a separate commit.

* [SPATIAL] New ShapeQueryBuilder for querying indexed cartesian geometry (#45108)

Add a new ShapeQueryBuilder to the xpack spatial module for
querying arbitrary Cartesian geometries indexed using the new shape field
type.

The query builder extends AbstractGeometryQueryBuilder and leverages the
ShapeQueryProcessor added in the previous field mapper commit.

Tests are provided in ShapeQueryTests in the same manner as
GeoShapeQueryTests and docs are updated to explain how the query works.
2019-08-14 16:35:10 -05:00
Nhat Nguyen 4fcf7bbd07 Do not hold writeLock while verifying Lucene/translog
We should not hold Engine#writeLock while executing
assertConsistentHistoryBetweenTranslogAndLuceneIndex
for this check might acquire Engine#readLock.

Relates #45461
2019-08-13 16:16:06 -04:00
Nhat Nguyen 24514275c7 Get max_seq_no after snapshot translog and Lucene (#45461)
We should capture max_seq_no after snapshotting translog and Lucene;
otherwise, that max_seq_no can be smaller some operation in translog or
Lucene. With this change, we also hold the Engine#writeLock during this
check so that no indexing can happen.

Closes #45454
2019-08-13 16:16:06 -04:00
Nhat Nguyen 25c6102101 Trim local translog in peer recovery (#44756)
Today, if an operation-based peer recovery occurs, we won't trim
translog but leave it as is. Some unacknowledged operations existing in
translog of that replica might suddenly reappear when it gets promoted.
With this change, we ensure trimming translog above the starting
sequence number of phase 2. This change can allow us to read translog
forward.
2019-08-10 22:59:02 -04:00
Armin Braun 12ed6dc999
Only retain reasonable history for peer recoveries (#45208) (#45355)
Today if a shard is not fully allocated we maintain a retention lease for a
lost peer for up to 12 hours, retaining all operations that occur in that time
period so that we can recover this replica using an operations-based recovery
if it returns. However it is not always reasonable to perform an
operations-based recovery on such a replica: if the replica is a very long way
behind the rest of the replication group then it can be much quicker to perform
a file-based recovery instead.

This commit introduces a notion of "reasonable" recoveries. If an
operations-based recovery would involve copying only a small number of
operations, but the index is large, then an operations-based recovery is
reasonable; on the other hand if there are many operations to copy across and
the index itself is relatively small then it makes more sense to perform a
file-based recovery. We measure the size of the index by computing its number
of documents (including deleted documents) in all segments belonging to the
current safe commit, and compare this to the number of operations a lease is
retaining below the local checkpoint of the safe commit. We consider an
operations-based recovery to be reasonable iff it would involve replaying at
most 10% of the documents in the index.

The mechanism for this feature is to expire peer-recovery retention leases
early if they are retaining so much history that an operations-based recovery
using that lease would be unreasonable.

Relates #41536
2019-08-09 01:56:32 +02:00
Tim Brooks af908efa41
Disable netty direct buffer pooling by default (#44837)
Elasticsearch does not grant Netty reflection access to get Unsafe. The
only mechanism that currently exists to free direct buffers in a timely
manner is to use Unsafe. This leads to the occasional scenario, under
heavy network load, that direct byte buffers can slowly build up without
being freed.

This commit disables Netty direct buffer pooling and moves to a strategy
of using a single thread-local direct buffer for interfacing with sockets.
This will reduce the memory usage from networking. Elasticsearch
currently derives very little value from direct buffer usage (TLS,
compression, Lucene, Elasticsearch handling, etc all use heap bytes). So
this seems like the correct trade-off until that changes.
2019-08-08 15:10:31 -06:00
David Turner 355713b9ca
Improve slow logging in MasterService (#45241)
Adds a tighter threshold for logging a warning about slowness in the
`MasterService` instead of relying on the cluster service's 30-second warning
threshold. This new threshold applies to the computation of the cluster state
update in isolation, so we get a warning if computing a new cluster state
update takes longer than 10 seconds even if it is subsequently applied quickly.
It also applies independently to the length of time it takes to notify the
cluster state tasks on completion of publication, in case any of these
notifications holds up the master thread for too long.

Relates #45007
Backport of #45086
2019-08-06 17:01:49 +01:00
Jason Tedor 5b1b146099
Normalize environment paths (#45179)
This commit applies a normalization process to environment paths, both
in how they are stored internally, also their settings values. This
normalization is done via two means:
 - we make the paths absolute
 - we remove redundant name elements from the path (what Java calls
   "normalization")

This change ensures that when we compare and refer to these paths within
the system, we are using a common ground. For example, prior to the
change if the data path was relative, we would not compare it correctly
to paths from disk usage. This is because the paths in disk usage were
being made absolute.
2019-08-06 06:04:30 -04:00
Yannick Welsch 7aeb2fe73c Add per-socket keepalive options (#44055)
Uses JDK 11's per-socket configuration of TCP keepalive (supported on Linux and Mac), see
https://bugs.openjdk.java.net/browse/JDK-8194298, and exposes these as transport settings.
By default, these options are disabled for now (i.e. fall-back to OS behavior), but we would like
to explore whether we can enable them by default, in particular to force keepalive configurations
that are better tuned for running ES.
2019-08-06 10:45:44 +02:00
Zachary Tong 3df1c76f9b Allow pipeline aggs to select specific buckets from multi-bucket aggs (#44179)
This adjusts the `buckets_path` parser so that pipeline aggs can
select specific buckets (via their bucket keys) instead of fetching
the entire set of buckets.  This is useful for bucket_script in
particular, which might want specific buckets for calculations.

It's possible to workaround this with `filter` aggs, but the workaround
is hacky and probably less performant.

- Adjusts documentation
- Adds a barebones AggregatorTestCase for bucket_script
- Tweaks AggTestCase to use getMockScriptService() for reductions and
pipelines.  Previously pipelines could just pass in a script service
for testing, but this didnt work for regular aggs.  The new
getMockScriptService() method fixes that issue, but needs to be used
for pipelines too.  This had a knock-on effect of touching MovFn,
AvgBucket and ScriptedMetric
2019-08-05 12:18:40 -04:00
David Turner 13a167051f
Remove fileBasedRecovery flag (#45146)
Today `RecoveryTarget#prepareForTranslogOperations` takes a boolean flag
indicating whether the recovery is file-based or not. This was used in 6.x to
bootstrap some commit data that were missing in indices created in 5.x:

b506955f8d/server/src/main/java/org/elasticsearch/indices/recovery/RecoveryTarget.java (L298-L300)

This flag no longer has any effect, so this commit removes it.

Backport of #45131 to 7.x.
2019-08-05 08:17:40 +01:00
Tim Brooks 984ba82251
Move nio channel initialization to event loop (#45155)
Currently in the transport-nio work we connect and bind channels on the
a thread before the channel is registered with a selector. Additionally,
it is at this point that we set all the socket options. This commit
moves these operations onto the event-loop after the channel has been
registered with a selector. It attempts to set the socket options for a
non-server channel at registration time. If that fails, it will attempt
to set the options after the channel is connected. This should fix
#41071.
2019-08-02 17:31:31 -04:00
David Turner 9ff320d967
Use index for peer recovery instead of translog (#45137)
Today we recover a replica by copying operations from the primary's translog.
However we also retain some historical operations in the index itself, as long
as soft-deletes are enabled. This commit adjusts peer recovery to use the
operations in the index for recovery rather than those in the translog, and
ensures that the replication group retains enough history for use in peer
recovery by means of retention leases.

Reverts #38904 and #42211
Relates #41536
Backport of #45136 to 7.x.
2019-08-02 15:00:43 +01:00
Armin Braun 9450505d5b
Stop Passing Around REST Request in Multiple Spots (#44949) (#45109)
* Stop Passing Around REST Request in Multiple Spots

* Motivated by #44564
  * We are currently passing the REST request object around to a large number of places. This works fine since we simply copy the full request content before we handle the rest itself which is needlessly hard on GC and heap.
  * This PR removes a number of spots where the request is passed around needlessly. There are many more spots to optimize in follow-ups to this, but this one would already enable bypassing the request copying for some error paths in a follow up.
2019-08-02 07:31:38 +02:00
David Turner c088bafbbc Wait for events in waitForRelocation (#45074)
Adds a `waitForEvents(Priority.LANGUID)` to the cluster health request in
`ESIntegTestCase#waitForRelocation()` to deal with the case that this health
request returns successfully despite the fact that there is a pending reroute task which
will relocate another shard.

Relates #44433
Fixes #45003
2019-08-01 13:47:39 +01:00
Nhat Nguyen 979d0a71c7 Remove leniency during replay translog in peer recovery (#44989)
This change removes leniency in InternalEngine during replaying translog
in peer recovery.
2019-07-30 13:25:15 -04:00
Armin Braun 548c767b6b
S3 3rd Party Test Goal (#44799) (#45004)
* Create S3 Third Party Test Task that Covers the S3 CLI Tool
* Adjust snapshot cli test tool tests to work with real S3
  * Build adjustment
  * Clean up repo path before testing
* Dedup the logic for asserting path contents by using the correct utility method here that somehow became unused
2019-07-30 17:16:41 +02:00
David Turner 55f1dd8da6 Close nodes properly in Coordinator tests (#44967)
Today closing a `ClusterNode` in an `AbstractCoordinatorTestCase` uses
`onNode()` so has no effect if the node is not in the current list of nodes.
It also discards the `Runnable` it creates without having run it, so has no
effect anyway.

This commit makes these tests much stricter about properly closing the nodes
started during `Coordinator` tests, by tracking the persisted states that are
opened, and adds an assertion to catch the trappy requirement that the closing
node still belongs to the cluster.
2019-07-30 11:47:36 +01:00
Andrey Ershov 5a0bd696fc
Snapshot tool S3 cleanup 7.x backport (#44575)
Backport of #44551
2019-07-30 11:02:08 +02:00
Nhat Nguyen 4813728783 Remove leniency in reset engine from translog (#44711)
Replaying operations from the local translog must never fail as those
operations were processed successfully on the primary before and the
mapping is up to update already. This change removes leniency during
resetting engine from translog in IndexShard and InternalEngine.
2019-07-29 16:31:45 -04:00
Yannick Welsch 8653c33838 Fix testBlockingIncomingRequests (#44939)
Adapted test to take non-blocking nature into account.
2019-07-29 16:37:53 +02:00
Yannick Welsch 24873dd3e3 Do not block transport thread on startup (#44939)
We currently block the transport thread on startup, which has caused test failures. I think this is
some kind of deadlock situation. I don't think we should even block a transport thread, and
there's also no need to do so. We can just reject requests as long we're not fully set up. Note
that the HTTP layer is only started much later (after we've completed full start up of the
transport layer), so that one should be completely unaffected by this.

Closes #41745
2019-07-29 11:35:17 +02:00
Jason Tedor 6ea2b5dec0
Deprecate setting processors to more than available (#44889)
Today the processors setting is permitted to be set to more than the
number of processors available to the JVM. The processors setting
directly sizes the number of threads in the various thread pools, with
most of these sizes being a linear function in the number of
processors. It doesn't make any sense to set processors very high as the
overhead from context switching amongst all the threads will overwhelm,
and changing the setting does not control how many physical CPU
resources there are on which to schedule the additional threads. We have
to draw a line somewhere and this commit deprecates setting processors
to more than the number of available processors. This is the right place
to draw the line given the linear growth as a function of processors in
most of the thread pools, and that some are capped at the number of
available processors already.
2019-07-26 17:06:44 +09:00
Yannick Welsch 0ce841915c Add Clone Index API (#44267)
Adds an API to clone an index. This is similar to the index split and shrink APIs, just with the
difference that the number of primary shards is kept the same. In case where the filesystem
provides hard-linking capabilities, this is a very cheap operation.

Indexing cloning can be done by running `POST my_source_index/_clone/my_target_index` and it
supports the same options as the split and shrink APIs.

Closes #44128
2019-07-25 22:02:28 +02:00
Andrei Stefan 2633d11eb7
Switch from using docvalue_fields to extracting values from _source (#44062) (#44804)
* Switch from using docvalue_fields to extracting values from _source
where applicable. Doing this means parsing the _source and handling the
numbers parsing just like Elasticsearch is doing it when it's indexing
a document.
* This also introduces a minor limitation: aliases type of fields that
are NOT part of a tree of sub-fields will not be able to be retrieved
anymore. field_caps API doesn't shed any light into a field being an
alias or not and at _source parsing time there is no way to know if a
root field is an alias or not. Fields of the type "a.b.c.alias" can be
extracted from docvalue_fields, only if the field they point to can be
extracted from docvalue_fields. Also, not all fields in a hierarchy of
fields can be evaluated to being an alias.

(cherry picked from commit 8bf8a055e38f00df5f49c8d97f632f69d6e00c2c)
2019-07-25 10:02:41 +03:00
Igor Motov f9943a3e53 Geo: deprecate ShapeBuilder in QueryBuilders (#44715)
Removes unnecessary now timeline decompositions from shape builders
and deprecates ShapeBuilders in QueryBuilder in favor of libs/geo
shapes.

Relates to #40908
2019-07-24 14:27:58 -04:00
Armin Braun d8be9244f9
Fix Repository Cleanup Test Correctness (#44738) (#44751)
* The tests were creating the corruption and asserting its existence not on the repository base path but on a clean path.
As a result the consistency assertion on the repository wouldn't see the corruption ever an pass even if the cleanup was broken for repositories that have a non-root base path
2019-07-24 16:03:37 +02:00
Armin Braun 818103ff1e
Fix testRetentionLeasesClearedOnRestore (#44754) (#44766)
* Fix this test randomly failing when running into async translog persistence edge case and failing to successfully close index
* Also, slightly improve debug logging on close failure
* Closes #44681
2019-07-23 21:29:07 +02:00
Igor Motov 9338fc8536 GEO: Switch to using GeoTestUtil to generate random geo shapes (#44635)
Switches to more robust way of generating random test geometries by
reusing lucene's GeoTestUtil. Removes duplicate random geometry
generators by moving them to the test framework.

Closes #37278
2019-07-23 14:30:41 -04:00
Mayya Sharipova 972a49312c Fix testQuotedQueryStringWithBoost test (#43385)
Add more logging to indexRandom

Seems that asynchronous indexing from indexRandom sometimes indexes
the same document twice, which will mess up the expected score calculations.

For example, indexing:
{ "index" : {"_id" : "1" } }
{"important" :"phrase match", "less_important": "nothing important"}
{ "index" : {"_id" : "2" } }
{"important" :"nothing important", "less_important" :"phrase match"}
Produces the expected scores: 13.8 for doc1, and 1.38 for doc2

indexing:
{ "index" : {"_id" : "1" } }
{"important" :"phrase match", "less_important": "nothing important"}
{ "index" : {"_id" : "2" } }
{"important" :"nothing important", "less_important" :"phrase match"}
{ "index" : {"_id" : "3" } }
{"important" :"phrase match", "less_important": "nothing important"}
Produces scores: 9.4 for doc1, and 1.96 for doc2 which are found in the
error logs.

Relates to #43144
2019-07-22 08:44:31 -04:00
Ryan Ernst 4c05d25ec7
Convert Transport Request/Response to Writeable (#44636) (#44654)
This commit converts all remaining TransportRequest and
TransportResponse classes to implement Writeable, and disallows
Streamable implementations.

relates #34389
2019-07-20 11:25:58 -07:00
Ryan Ernst f4ee2e9e91
Convert direct implementations of Streamable to Writeable (#44605) (#44646)
This commit converts Streamable to Writeable for direct implementations.

relates #34389
2019-07-20 08:32:29 -07:00
Ryan Ernst f193d14764
Convert remaining Action Response/Request to writeable.reader (#44528) (#44607)
This commit converts readFrom to ctor with StreamInput on the remaining
ActionResponse and ActionRequest classes.

relates #34389
2019-07-19 13:33:38 -07:00
Lee Hinman fe2ef66e45 Expose index age in ILM explain output (#44457)
* Expose index age in ILM explain output

This adds the index's age to the ILM explain output, for example:

```
{
  "indices" : {
    "ilm-000001" : {
      "index" : "ilm-000001",
      "managed" : true,
      "policy" : "full-lifecycle",
      "lifecycle_date" : "2019-07-16T19:48:22.294Z",
      "lifecycle_date_millis" : 1563306502294,
      "age" : "1.34m",
      "phase" : "hot",
      "phase_time" : "2019-07-16T19:48:22.487Z",
      ... etc ...
    }
  }
}
```

This age can be used to tell when ILM will transition the index to the
next phase, based on that phase's `min_age`.

Resolves #38988

* Expose age in getters and in HLRC
2019-07-18 15:33:45 -06:00
Andrey Ershov ef6ddd15c6 Revert "Snapshot tool: S3 orphaned files
cleanup (#44551)"

This reverts commit 09edeeb3
2019-07-18 17:21:45 +02:00
Andrey Ershov 6f5327ba45 Fix BlobStoreTestUtil 2019-07-18 17:00:23 +02:00
Andrey Ershov 09edeeb38e Snapshot tool: S3 orphaned files cleanup (#44551)
A tool to work with snapshots.
Co-authored by @original-brownbear.
This commit adds snapshot tool and the single command cleanup, that
cleans up orphaned files for S3.
Snapshot tool lives in x-pack/snapshot-tool.

(cherry picked from commit fc4aed44dd975d83229561090f957a95cc76b287)
2019-07-18 16:38:00 +02:00
David Turner 452f7f67a0
Defer reroute when starting shards (#44539)
Today we reroute the cluster as part of the process of starting a shard, which
runs at `URGENT` priority. In large clusters, rerouting may take some time to
complete, and this means that a mere trickle of shard-started events can cause
starvation for other, lower-priority, tasks that are pending on the master.

However, it isn't really necessary to perform a reroute when starting a shard,
as long as one occurs eventually. This commit removes the inline reroute from
the process of starting a shard and replaces it with a deferred one that runs
at `NORMAL` priority, avoiding starvation of higher-priority tasks.

Backport of #44433 and #44543.
2019-07-18 14:10:40 +01:00
Nhat Nguyen 51180af91d Make peer recovery send file chunks async (#44468)
Relates #44040
Relates #36195
2019-07-17 22:25:43 -04:00
Nhat Nguyen 458f24c46a Reenable accounting circuit breaker (#44495)
We have a new Lucene 8.2 snapshot on master and 7.x; hence we can
re-enable the accounting on these branches.

Relates #30290
2019-07-17 22:25:43 -04:00
Jason Tedor 39c5f98de7
Introduce test issue logging (#44477)
Today we have an annotation for controlling logging levels in
tests. This annotation serves two purposes, one is to control the
logging level used in tests, when such control is needed to impact and
assert the behavior of loggers in tests. The other use is when a test is
failing and additional logging is needed. This commit separates these
two concerns into separate annotations.

The primary motivation for this is that we have a history of leaving
behind the annotation for the purpose of investigating test failures
long after the test failure is resolved. The accumulation of these stale
logging annotations has led to excessive disk consumption. Having
recently cleaned this up, we would like to avoid falling into this state
again. To do this, we are adding a link to the test failure under
investigation to the annotation when used for the purpose of
investigating test failures. We will add tooling to inspect these
annotations, in the same way that we have tooling on awaits fix
annotations. This will enable us to report on the use of these
annotations, and report when stale uses of the annotation exist.
2019-07-18 05:33:33 +09:00
Yannick Welsch f78e64e3e2 Terminate linearizability check early on large histories (#44444)
Large histories can be problematic and have the linearizability checker occasionally run OOM. As it's
very difficult to bound the size of the histories just right, this PR will let it instead run for 10 seconds
on large histories and then abort.

Closes #44429
2019-07-17 18:51:25 +02:00
Armin Braun c8db0e9b7e
Remove blobExists Method from BlobContainer (#44472) (#44475)
* We only use this method in one place in production code and can replace that with a read -> remove it to simplify the interface
   * Keep it as an implementation detail in the Azure repository
2019-07-17 11:56:02 +02:00
Tim Brooks 0a352486e8
Isolate nio channel registered from channel active (#44388)
Registering a channel with a selector is a required operation for the
channel to be handled properly. Currently, we mix the registeration with
other setup operations (ip filtering, SSL initiation, etc). However, a
fail to register is fatal. This PR modifies how registeration occurs to
immediately close the channel if it fails.

There are still two clear loopholes for how a user can interact with a
channel even if registration fails. 1. through the exception handler.
2. through the channel accepted callback. These can perhaps be improved
in the future. For now, this PR prevents writes from proceeding if the
channel is not registered.
2019-07-16 17:18:57 -06:00
Nhat Nguyen 301c8daf4c Revert "Make peer recovery send file chunks async (#44040)"
This reverts commit a2b4687d89.
2019-07-16 14:18:35 -04:00
Nhat Nguyen a2b4687d89 Make peer recovery send file chunks async (#44040) 2019-07-16 10:43:46 -04:00
Lee Hinman fb0461ac76
[7.x] Add Snapshot Lifecycle Management (#44382)
* Add Snapshot Lifecycle Management (#43934)

* Add SnapshotLifecycleService and related CRUD APIs

This commit adds `SnapshotLifecycleService` as a new service under the ilm
plugin. This service handles snapshot lifecycle policies by scheduling based on
the policies defined schedule.

This also includes the get, put, and delete APIs for these policies

Relates to #38461

* Make scheduledJobIds return an immutable set

* Use Object.equals for SnapshotLifecyclePolicy

* Remove unneeded TODO

* Implement ToXContentFragment on SnapshotLifecyclePolicyItem

* Copy contents of the scheduledJobIds

* Handle snapshot lifecycle policy updates and deletions (#40062)

(Note this is a PR against the `snapshot-lifecycle-management` feature branch)

This adds logic to `SnapshotLifecycleService` to handle updates and deletes for
snapshot policies. Policies with incremented versions have the old policy
cancelled and the new one scheduled. Deleted policies have their schedules
cancelled when they are no longer present in the cluster state metadata.

Relates to #38461

* Take a snapshot for the policy when the SLM policy is triggered (#40383)

(This is a PR for the `snapshot-lifecycle-management` branch)

This commit fills in `SnapshotLifecycleTask` to actually perform the
snapshotting when the policy is triggered. Currently there is no handling of the
results (other than logging) as that will be added in subsequent work.

This also adds unit tests and an integration test that schedules a policy and
ensures that a snapshot is correctly taken.

Relates to #38461

* Record most recent snapshot policy success/failure (#40619)

Keeping a record of the results of the successes and failures will aid
troubleshooting of policies and make users more confident that their
snapshots are being taken as expected.

This is the first step toward writing history in a more permanent
fashion.

* Validate snapshot lifecycle policies (#40654)

(This is a PR against the `snapshot-lifecycle-management` branch)

With the commit, we now validate the content of snapshot lifecycle policies when
the policy is being created or updated. This checks for the validity of the id,
name, schedule, and repository. Additionally, cluster state is checked to ensure
that the repository exists prior to the lifecycle being added to the cluster
state.

Part of #38461

* Hook SLM into ILM's start and stop APIs (#40871)

(This pull request is for the `snapshot-lifecycle-management` branch)

This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also
manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are
cancelled.

Relates to #38461

* Add tests for SnapshotLifecyclePolicyItem (#40912)

Adds serialization tests for SnapshotLifecyclePolicyItem.

* Fix improper import in build.gradle after master merge

* Add human readable version of modified date for snapshot lifecycle policy (#41035)

* Add human readable version of modified date for snapshot lifecycle policy

This small change changes it from:

```
...
"modified_date": 1554843903242,
...
```

To

```
...
"modified_date" : "2019-04-09T21:05:03.242Z",
"modified_date_millis" : 1554843903242,
...
```

Including the `"modified_date"` field when the `?human` field is used.

Relates to #38461

* Fix test

* Add API to execute SLM policy on demand (#41038)

This commit adds the ability to perform a snapshot on demand for a policy. This
can be useful to take a snapshot immediately prior to performing some sort of
maintenance.

```json
PUT /_ilm/snapshot/<policy>/_execute
```

And it returns the response with the generated snapshot name:

```json
{
  "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug"
}
```

Note that this does not allow waiting for the snapshot, and the snapshot could
still fail. It *does* record this information into the cluster state similar to
a regularly trigged SLM job.

Relates to #38461

* Add next_execution to SLM policy metadata (#41221)

* Add next_execution to SLM policy metadata

This adds the next time a snapshot lifecycle policy will be executed when
retriving a policy's metadata, for example:

```json
GET /_ilm/snapshot?human
{
  "production" : {
    "version" : 1,
    "modified_date" : "2019-04-15T21:16:21.865Z",
    "modified_date_millis" : 1555362981865,
    "policy" : {
      "name" : "<production-snap-{now/d}>",
      "schedule" : "*/30 * * * * ?",
      "repository" : "repo",
      "config" : {
        "indices" : [
          "foo-*",
          "important"
        ],
        "ignore_unavailable" : true,
        "include_global_state" : false
      }
    },
    "next_execution" : "2019-04-15T21:16:30.000Z",
    "next_execution_millis" : 1555362990000
  },
  "other" : {
    "version" : 1,
    "modified_date" : "2019-04-15T21:12:19.959Z",
    "modified_date_millis" : 1555362739959,
    "policy" : {
      "name" : "<other-snap-{now/d}>",
      "schedule" : "0 30 2 * * ?",
      "repository" : "repo",
      "config" : {
        "indices" : [
          "other"
        ],
        "ignore_unavailable" : false,
        "include_global_state" : true
      }
    },
    "next_execution" : "2019-04-16T02:30:00.000Z",
    "next_execution_millis" : 1555381800000
  }
}
```

Relates to #38461

* Fix and enhance tests

* Figured out how to Cron

* Change SLM endpoint from /_ilm/* to /_slm/* (#41320)

This commit changes the endpoint for snapshot lifecycle management from:

```
GET /_ilm/snapshot/<policy>
```

to:

```
GET /_slm/policy/<policy>
```

It mimics the ILM path only using `slm` instead of `ilm`.

Relates to #38461

* Add initial documentation for SLM (#41510)

* Add initial documentation for SLM

This adds the initial documentation for snapshot lifecycle management.

It also includes the REST spec API json files since they're sort of
documentation.

Relates to #38461

* Add `manage_slm` and `read_slm` roles (#41607)

* Add `manage_slm` and `read_slm` roles

This adds two more built in roles -

`manage_slm` which has permission to perform any of the SLM actions, as well as
stopping, starting, and retrieving the operation status of ILM.

`read_slm` which has permission to retrieve snapshot lifecycle policies as well
as retrieving the operation status of ILM.

Relates to #38461

* Add execute to the test

* Fix ilm -> slm typo in test

* Record SLM history into an index (#41707)

It is useful to have a record of the actions that Snapshot Lifecycle
Management takes, especially for the purposes of alerting when a
snapshot fails or has not been taken successfully for a certain amount of
time.

This adds the infrastructure to record SLM actions into an index that
can be queried at leisure, along with a lifecycle policy so that this
history does not grow without bound.

Additionally,
SLM automatically setting up an index + lifecycle policy leads to
`index_lifecycle` custom metadata in the cluster state, which some of
the ML tests don't know how to deal with due to setting up custom
`NamedXContentRegistry`s.  Watcher would cause the same problem, but it
is already disabled (for the same reason).

* High Level Rest Client support for SLM (#41767)

* High Level Rest Client support for SLM

This commit add HLRC support for SLM.

Relates to #38461

* Fill out documentation tests with tags

* Add more callouts and asciidoc for HLRC

* Update javadoc links to real locations

* Add security test testing SLM cluster privileges (#42678)

* Add security test testing SLM cluster privileges

This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm`
cluster privileges.

Relates to #38461

* Don't redefine vars

*  Add Getting Started Guide for SLM  (#42878)

This commit adds a basic Getting Started Guide for SLM.

* Include SLM policy name in Snapshot metadata (#43132)

Keep track of which SLM policy in the metadata field of the Snapshots
taken by SLM. This allows users to more easily understand where the
snapshot came from, and will enable future SLM features such as
retention policies.

* Fix compilation after master merge

* [TEST] Move exception wrapping for devious exception throwing

Fixes an issue where an exception was created from one line and thrown in another.

* Fix SLM for the change to AcknowledgedResponse

* Add Snapshot Lifecycle Management Package Docs (#43535)

* Fix compilation for transport actions now that task is required

* Add a note mentioning the privileges needed for SLM (#43708)

* Add a note mentioning the privileges needed for SLM

This adds a note to the top of the "getting started with SLM"
documentation mentioning that there are two built-in privileges to
assist with creating roles for SLM users and administrators.

Relates to #38461

* Mention that you can create snapshots for indices you can't read

* Fix REST tests for new number of cluster privileges

* Mute testThatNonExistingTemplatesAreAddedImmediately (#43951)

* Fix SnapshotHistoryStoreTests after merge

* Remove overridden newResponse functions that have been removed

* Fix compilation for backport

* Fix get snapshot output parsing in test

* [DOCS] Add redirects for removed autogen anchors (#44380)

* Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 07:37:13 -06:00
Armin Braun 099d52f3b0
Prevent Confusing Blocked Thread Warnings in MockNioTransport (#44356) (#44376)
* Prevent Confusing Blocked Thread Warnings in MockNioTransport

* We can run into a race where the stacktrace collection and subsequent logging happens after the thread has already unblocked thus logging a confusing stacktrace of wherever the transport thread was after it became unblocked
* Fixed this by comparing whether or not the recorded timestamp is still the same before and after the stacktrace was recorded and not logging if it already changed
2019-07-16 04:40:50 +02:00
David Turner 86ee8eab3f Allow RerouteService to reroute at lower priority (#44338)
Today the `BatchedRerouteService` submits its delayed reroute task at `HIGH`
priority, but in some cases a lower priority would be more appropriate. This
commit adds the facility to submit delayed reroute tasks at different
priorities, such that each submitted reroute task runs at a priority no lower
than the one requested. It does not change the fact that all delayed reroute
tasks are submitted at `HIGH` priority, but at least it makes this explicit.
2019-07-15 17:41:39 +01:00
Christoph Büscher 835b7a120d Fix AnalyzeAction response serialization (#44284)
Currently we loose information about whether a token list in an AnalyzeAction
response is null or an empty list, because we write a 0 value to the stream in
both cases and deserialize to a null value on the receiving side. This change
fixes this so we write an additional flag indicating whether the value is null
or not, followed by the size of the list and its content.

Closes #44078
2019-07-14 10:35:11 +02:00
Przemyslaw Gomulka e23ecc5838
JSON logging refactoring and X-Opaque-ID support backport(#41354) (#44178)
This is a refactor to current JSON logging to make it more open for extensions
and support for custom ES log messages used inDeprecationLogger IndexingSlowLog , SearchSLowLog
We want to include x-opaque-id in deprecation logs. The easiest way to have this as an additional JSON field instead of part of the message is to create a custom DeprecatedMessage (extends ESLogMEssage)

These messages are regular log4j messages with a text, but also carry a map of fields which can then populate the log pattern. The logic for this lives in ESJsonLayout and ESMessageFieldConverter.

Similar approach can be used to refactor IndexingSlowLog and SearchSlowLog JSON logs to contain fields previously only present as escaped JSON string in a message field.

closes #41350
 backport #41354
2019-07-12 16:53:27 +02:00
Yannick Welsch 068286ca4b Remove RemoteClusterConnection.ConnectedNodes (#44235)
This instead exposes the set of connected nodes on ConnectionManager.
2019-07-12 14:54:21 +02:00
Armin Braun 6c02cf0241
Fix InternalTestCluster StopRandomNode Assertion (#44258) (#44265)
* The assertion added in #44214 is tripped by tests running dedicated
test clusters per test needlessly.This breaks existing tests like the one in #44245.
* Closes #44245
2019-07-12 13:18:55 +02:00
David Turner 735c897ec6
Avoid counting votes from master-ineligible nodes (#43688)
Today if a master-eligible node is converted to a master-ineligible node it may
remain in the voting configuration, meaning that the master node may count its
publish responses as an indication that it has properly persisted the cluster
state. However master-ineligible nodes do not properly persist the cluster
state, so it is not safe to count these votes.

This change adjusts `CoordinationState` to take account of this from a safety
point of view, and also adjusts the `Coordinator` to prevent such nodes from
joining the cluster. Instead, it triggers a reconfiguration to remove from the
voting configuration a node that now appears to be master-ineligible before
processing its join.

Backport of #43688, see #44260.
2019-07-12 11:30:52 +01:00
Alpar Torok 8d35583c43 Fix port range allocation with large worker IDs (#44213)
* Fix port range allocation with large worker IDs

Relates to #43983

The IDs gradle uses are incremented for the lifetime of the daemon which
can result in port ranges that are outside the valid range.
This change implements a modulo based formula to wrap the port ranges
when the IDs get too large.

Adresses #44134 but #44157 is also required to be able to close it.
2019-07-12 11:04:57 +03:00
Yogesh Gaikwad 91c342a888
fix and enable repository-hdfs secure tests (#44044) (#44199)
Due to recent changes are done for converting `repository-hdfs` to test
clusters (#41252), the `integTestSecure*` tasks did not depend on
`secureHdfsFixture` which when running would fail as the fixture
would not be available. This commit adds the dependency of the fixture
to the task.

The `secureHdfsFixture` is a `AntFixture` which is spawned a process.
Internally it waits for 30 seconds for the resources to be made available.
For my local machine, it took almost 45 seconds to be available so I have
added the wait time as an input to the `AntFixture` defaults to 30 seconds
 and set it to 60 seconds in case of secure hdfs fixture.

The integ test for secure hdfs was disabled for a long time and so
the changes done in #42090 to fix the tests are also done in this commit.
2019-07-12 12:44:01 +10:00
Armin Braun 2768662822
Cleanup Stale Root Level Blobs in Sn. Repository (#43542) (#44226)
* Cleans up all root level temp., snap-%s.dat, meta-%s.dat blobs that aren't referenced by any snapshot to deal with dangling blobs left behind by delete and snapshot finalization failures
   * The scenario that get's us here is a snapshot failing before it was finalized or a delete failing right after it wrote the updated index-(N+1) that doesn't reference a snapshot anymore but then fails to remove that snapshot
   * Not deleting other dangling blobs since that don't follow the snap-, meta- or tempfile naming schemes to not accidentally delete blobs not created by the snapshot logic
* Follow up to #42189
  * Same safety logic, get list of all blobs before writing index-N blobs, delete things after index-N blobs was written
2019-07-11 19:35:15 +02:00
Armin Braun 5f22370b6b
Fix ShrinkIndexIT (#44214) (#44223)
* Fix ShrinkIndexIT

* Move this test suit to cluster scope. Currently, `testShrinkThenSplitWithFailedNode` stops a random node which randomly turns out to be the only shared master node so the cluster reset fails on account of the fact that no shared master node survived.
* Closes #44164
2019-07-11 17:58:00 +02:00