Commit Graph

54218 Commits

Author SHA1 Message Date
Tanguy Leroux eac99dd594
SnapshotShardSizeInfo should prefer default value when provided (#63390) (#63394)
In #61906 we agreed on always providing the default value 
ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE 
when the SnasphotInfoService failed to retrieve the exact 
size for a given snapshot shard. The motivation was to 
allow the shard allocation to move forward in case of 
failures (so that the unassigned shard does not get stuck 
in an unassigned state for too long) while relying on the 
fallback values for shard sizes.

Sadly a bug in the 
SnapshotShardSizeInfo#getShardSize(ShardRouting, long) 
makes the default value to be ignored when the snapshot 
shard size retrieval previously failed, returning 
ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE 
instead of the provided default value. With DiskThresholdDecider 
also not relying on the provided default value this triggers 
some assertion like in #63376 which helped us to spot the bug.

Closes ##63376
2020-10-07 13:53:05 +02:00
David Kyle e4f37d96f4
Unmute notifications mapping upgrade test (#63388)
fixed by #63063
2020-10-07 12:51:18 +01:00
Tanguy Leroux 581490d83c
Fix DiskThresholdDeciderIT.testHighWatermarkNotExceeded (#63112) (#63385)
The first refreshDiskUsage() refreshes the ClusterInfo update which in turn 
calls listeners like DiskThreshMonitor. This one triggers a reroute as 
expected and turns an internal checkInProgress flag before submitting 
a cluster state update to relocate shards (the internal flag is toggled 
again once the cluster state update is processed).

In the test I suspect that the second refreshDiskUsage() may complete 
before DiskThreshMonitor's internal flag is set back to its initial state, 
resulting in the second ClusterInfo update to be ignored and message 
like "[node_t0] skipping monitor as a check is already in progress" to 
be logged. Adding another wait for languid events to be processed 
before executing the second refreshDiskUsage() should help here.

Closes #62326
2020-10-07 11:27:25 +02:00
Hendrik Muhs d45f7de3fb [Transform] Add test logging regarding conflict on start (#63383)
add extra logging for investigation of #63365
2020-10-07 10:17:31 +02:00
Przemyslaw Gomulka eadd69e1e4
Deprecate week_year in favour of weekyear date format backport(63307) (#63308)
week_year is misleading as the formatter only has a weekyear. A field
corresponding to 'Y'. 'weekyear' should be used instead

relates #60707
backports https://github.com/elastic/elasticsearch/pull/63307
2020-10-07 09:16:27 +02:00
Tim Vernum c30c5555c5
Mute DeleteExpiredDataIT deleteExpired NoThrottle (#63381)
Mutes test method DeleteExpiredDataIT.testDeleteExpiredDataNoThrottle

Relates: #63379
Backport of: #63380
2020-10-07 17:43:52 +11:00
Tim Vernum eeb45b4a74
[Backport 7.x] Add example settings to sample security realm (#63301)
This change adds configurable settings to the `CustomRealm` in the QA
project as the correct declaration and use of settings can be a source
of confusion in custom realms.

The "username" "password" and "roles" are now all configurable, which
demonstrates the use of a simple string setting ("username") a secure
setting ("password") and a more complex list setting ("roles").

Backport of: #62287
2020-10-07 17:40:24 +11:00
Stuart Tettemer 8a61b95a0f
Scripting: JSON parsing and writing in watcher (#63278) (#63377)
Co-authored-by: Honza Král
Co-authored-by: Jack Conradson
Backport of: f43e52d
2020-10-06 23:39:40 -05:00
Stuart Tettemer 7f4f70f557
Scripting: Augment String with Hash support in Watcher (#63346) (#63375)
Strings in the watcher context may use the `.sha1()` and `.sha256()`
augmentation added for ingest.

Ref: #59633, #59671
Fixes: #61244
 Backport of: 380ee6f
2020-10-06 22:10:27 -05:00
Tim Brooks 855ab31f84
Add ess marker to indexing_pressure.memory.limit (#61127)
Adds marker indicating this setting is supported on Cloud.
2020-10-06 21:01:51 -06:00
Tim Brooks dd4b0d85fe
Write translog operation bytes to byte stream (#63298)
Currently we add translog operation bytes to an array list and flush
them on the next write. Unfortunately, this does not currently play well
with our byte pooling which means each operation is backed, at minimum,
by a 16KB array. This commit improves memory efficiency for small
operations by serializing the operations to an output stream.
2020-10-06 20:55:44 -06:00
Gordon Brown 15edc39d9b
Update logstash_admin role for system indices (#63368)
This PR updates the `logstash_admin` role to include the recently-added Logstash Pipeline Management APIs, as well as access to the `.logstash*` index pattern.

Co-authored-by: William Brafford <williamrandolphbrafford@gmail.com>
2020-10-06 20:43:36 -06:00
Tim Brooks 64bbbaeef1
Do not block Translog add on file write (#63374)
Currently a TranslogWriter add operation is synchronized. This operation
adds the bytes to the file output stream buffer and issues a write
system call if the buffer is filled. This happens every 8KB which means
that we routinely block other add calls on system writes.

This commit modifies the add operation to simply place the operation in
an array list. The array list if flushed when the sync call occurs or
when 1MB is buffered.
2020-10-06 20:40:15 -06:00
Mayya Sharipova f2ba62b894
Upgrade to lucene- 8.7.0-snapshot-66c49a35402 (#63372)
This includes fixing a bug in doc iteration during sort optimization

Backport for #63349
2020-10-06 22:38:58 -04:00
Ryan Ernst dd5ec83c99
Remove ml cpp notice check from oss distributions (#63370)
The ML cpp notice only exists with default distributions, but the check
task exists on all archive distributions. This commit avoids creating
the task for distributions that don't have ML.

closes #63128
2020-10-06 18:05:25 -07:00
Dawid Weiss dbcbdcc029
Set context class loader for plugin initialization (#63185)
Plugins are loaded in isolated child class loaders of the root class loader. However, some libraries depend on the context class loader being set. This commit sets the context class loader for the duration of calling each plugins constructor.

relates #52320

Co-authored-by: Ryan Ernst <ryan@iernst.net>
2020-10-06 18:00:21 -07:00
Julie Tibshirani f17ca18dfa
Make array value parsing flag more robust. (#63371)
When constructing a value fetcher, the 'parsesArrayValue' flag must match
`FieldMapper#parsesArrayValue`. However there is nothing in code or tests to
help enforce this.

This PR reworks the value fetcher constructors so that `parsesArrayValue` is
'false' by default. Just as for `FieldMapper#parsesArrayValue`, field types must
explicitly set it to true and ensure the behavior is covered by tests.

Follow-up to #62974.
2020-10-06 17:49:25 -07:00
Jack Conradson 2b838d1ea6 fix expression type for null safe operator (#63367)
An invalid void expression type from a null safe operator caused ClassFormatError for the script Map 
x= ['0': 0]; x?.0 > 1. This change sets and propagates the correct expression type for the null safe 
operator to be written out.
2020-10-06 16:02:56 -07:00
Rene Groeschke 1d0f46cc35
Fix openjdk download on mac (#63309)
- the wrong platform classifier (mac) was used when downloading
openjdk on mac. This fixes it to use osx

closes #63353
2020-10-06 15:04:15 -07:00
James Rodewig 26a157da7b
[DOCS] Update snowball links (#63351) (#63355) 2020-10-06 16:21:37 -04:00
Gordon Brown 5c8b0662df
Deprecate REST access to System Indices (#63274) (Original #60945)
This PR adds deprecation warnings when accessing System Indices via the REST layer. At this time, these warnings are only enabled for Snapshot builds by default, to allow projects external to Elasticsearch additional time to adjust their access patterns.

Deprecation warnings will be triggered by all REST requests which access registered System Indices, except for purpose-specific APIs which access System Indices as an implementation detail a few specific APIs which will continue to allow access to system indices by default:

- `GET _cluster/health`
- `GET {index}/_recovery`
- `GET _cluster/allocation/explain`
- `GET _cluster/state`
- `POST _cluster/reroute`
- `GET {index}/_stats`
- `GET {index}/_segments`
- `GET {index}/_shard_stores`
- `GET _cat/[indices,aliases,health,recovery,shards,segments]`

Deprecation warnings for accessing system indices take the form:
```
this request accesses system indices: [.some_system_index], but in a future major version, direct access to system indices will be prevented by default
```
2020-10-06 13:41:40 -06:00
James Rodewig 3e548592b6
[DOCS] Update link to Snowball documentation (#63305) (#63348)
The current link points to an obsolete site, which is no longer maintained.

Co-authored-by: Stefan Walter <67258699+rd-stefan-walter@users.noreply.github.com>
2020-10-06 13:41:06 -04:00
Lee Hinman dd99125193
[7.x] Add assert that raw and readable xcontent field names are different (#63332) (#63343)
This adds asserts that will catch the case where we accidentally provide the same raw and readable
field name in xcontent.
2020-10-06 11:32:41 -06:00
Adam Locke 4307b1d607
[DOCS] Updating permissions language for RPM install packages (#63277) (#63344)
* Updating permissions language for RPM install packages.

* Fix typo
2020-10-06 13:09:50 -04:00
Lisa Cawley 8f76c89cd3
[7.x][DOCS] Add feature_importance_baseline to get trained model API (#63279) (#63336)
Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
2020-10-06 10:08:34 -07:00
Adam Locke 4f314eeb9c
Updating certificate location instructions. (#63334) (#63340) 2020-10-06 12:51:22 -04:00
Tanguy Leroux 87076c32e2
Determine shard size before allocating shards recovering from snapshots (#61906) (#63337)
Determines the shard size of shards before allocating shards that are
recovering from snapshots. It ensures during shard allocation that the
target node that is selected as recovery target will have enough free
disk space for the recovery event. This applies to regular restores,
CCR bootstrap from remote, as well as mounting searchable snapshots.

The InternalSnapshotInfoService is responsible for fetching snapshot
shard sizes from repositories. It provides a getShardSize() method
to other components of the system that can be used to retrieve the
latest known shard size. If the latest snapshot shard size retrieval
failed, the getShardSize() returns
ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE. While
we'd like a better way to handle such failures, returning this value
allows to keep the existing behavior for now.

Note that this PR does not address an issues (we already have today)
where a replica is being allocated without knowing how much disk
space is being used by the primary.

Co-authored-by: Yannick Welsch <yannick@welsch.lu>
2020-10-06 18:37:05 +02:00
Julie Tibshirani 733e89d7ed Make sure that IdFieldType#isAggregatable is accurate. (#62903)
Before, it always returned 'true' even when the setting
"indices.id_field_data.enabled" was false.

Fixes #62897.
2020-10-06 09:33:44 -07:00
Igor Motov 2405162c39
Mute RegressionIT.testAliasFields test (#63339)
It fails quite frequently in 7.x.

Relates to #63268
2020-10-06 12:18:12 -04:00
David Kyle ea32b4ab82
[ML] Audit message when nightly maintenance times out (#63252) (#63330)
During deletion of old ml data set the delete by query timeout to 8 hours and
audit a job message when the nightly maintenance task times out.
2020-10-06 16:19:37 +01:00
Hendrik Muhs 058c55da6a [Transform] disallow field and script being empty for group sources (#63313)
fail validation earlier when field and script are both missing in a group source
2020-10-06 16:59:02 +02:00
Rene Groeschke 09f7cff612
Fix syncing expanded distributions in gradle build (#63285) (#63328) 2020-10-06 16:51:12 +02:00
István Zoltán Szabó a3a373b67f
[DOCS] Adds delta and offset parameters to Evaluate DFA API docs (#63317) (#63329) 2020-10-06 16:49:08 +02:00
Dan Hermann 7a59ae8fa2
[7.x] Allow_duplicates option for append processor (#61916) (#63257) 2020-10-06 09:03:47 -05:00
Rene Groeschke a3252af5c0
Cleanup on integtest distribution setup (7.x backport) (#63189)
* Cleanup on integtest distribution setup (#62937)

- Simplify build task and archive base name calculation
- Move integ test zip project only setup into integ test zip build script

* Fix merge
2020-10-06 15:58:42 +02:00
Rene Groeschke 8144106ace
Build local unreleased bwc versions more efficient for tests (7.x backport) (#63188)
* Wire local unreleased bwc versions more efficient for tests (#62473)

For testing against the local distribution we already avoid the packaging/unpackaging 
cycle of es distributions when setting up test clusters. This PR adopts the usage of the
expanded created distributions for unreleased bwc versions (versions that are checkout 
from a branch and build from source in the :distribution:bwc:minor / :distribution:bwc:bugfix). 
This makes the setup of bwc based cross version tests a bit faster by avoiding 
the unpackaging overhead. We still assemble both in the bwcBuild tasks atm 
which will be addressed in a later issue.

This reworks the :distribution:bwc project:

- Convert all the custom logic from build script logic (groovy) into gradle binary plugins (java)
- Tried to make the bwc setup logic a bit more readable
- Add basic functional test coverage for the bwc logic this PR tweaked.
- Extracted a general internal BWC Git plugin out of the bwc setup plugin to improve maintenance 
and testability
- Changed the InternalDistributionPlugin to resolve the extracted distro instead on relying 
on unpacking the distribution archive

* Fix java8 incompatibility
* Fix extension calculation for 6.8.* distribution
2020-10-06 15:58:13 +02:00
Yang Wang abf9b885b4
Bulk invalidate API keys using a list of IDs (#63224) (#63320)
Add a new ids field to the API of invalidating API keys so that it supports bulk
invalidation with a list of IDs.
Note the existing id field is kept as is and it is an error if both id and ids are specified.
2020-10-07 00:49:21 +11:00
Yang Wang bbfa2f1303 Fix test failure due to missing client action 2020-10-07 00:45:30 +11:00
Armin Braun a8dbab23a5
Increase Timeout in testDynamicRestoreThrottling (#63300) (#63324)
Even if we increase the limit it might not take effect straight away if a thread is
blocked on a long wait in `org.elasticsearch.index.snapshots.blobstore.RateLimitingInputStream#maybePause`.
Let's increase the limit a little and see if that deals with the remaining failures for good and stop burning
cycles busy asserting a future completion.

Closes #63246
2020-10-06 15:27:05 +02:00
Benjamin Trent a72d7cc76a
[ML] prefer secondary auth headers on data frame analytics _explain (#63281) (#63323)
We should prefer secondary auth headers when calling _explain
2020-10-06 09:15:29 -04:00
Luca Cavanna ca68298e89
Remove MapperService argument from IndexFieldData.Builder#build (#63197) (#63311)
MapperService carries a lot of weight and is only used to determine if loading of field data for the id field is enabled, which can be done in a different way.
2020-10-06 15:04:23 +02:00
Yang Wang 7969fbb4ab
Cache API key doc to reduce traffic to the security index (#59376) (#63319)
Getting the API key document form the security index is the most time consuing part
of the API Key authentication flow (>60% if index is local and >90% if index is remote).
This traffic is now avoided by caching added with this PR.

Additionally, we add a cache invalidator registry so that clearing of different caches will
be managed in a single place (requires follow-up PRs).
2020-10-06 23:49:23 +11:00
Armin Braun 2aa80f9ee3
Dry up Searchable Snapshots ITs (#63190) (#63321)
Just a few spots where we can dry up these tests using the snapshot test infrastructure
in core that I found while studying the existing searchable snapshot tests.
2020-10-06 14:41:11 +02:00
Christoph Büscher 82096d3971
Enable SourceLookup to leverage sequential stored fields reader (#63035) (#63316)
In #62509 we already plugged faster sequential access for stored fields in the fetch phase.
This PR now adds using the potentially better field reader also in SourceLookup.
Rally exeriments are showing that this speeds up e.g. when runtime fields that are using
"_source" are added e.g. via "docvalue_fields" or are used in queries or aggs.

Closes #62621
2020-10-06 14:34:39 +02:00
Mayya Sharipova bea0ead08a
Fix fields retrieval on unsinged_long field (#63310)
This fixes fields retrieval on unsigned_long field

1) For docvalue_fields a custom UnsignedLongLeafFieldData::getLeafValueFetcher
is implemented that correctly retrieves doc values.

2) For stored fields, an error was fixed in UnsignedLongFieldMapper
 how stored values were stored. Before they were incorrectly
stored in the shifted format, now they are stored as original
values in String format.

Relates to #60050
Backport for #63119
2020-10-06 06:37:31 -04:00
David Kyle 8f4ef40f78
[ML] Auditor ensures template is installed before writes (#63286)
The ML auditors should not write if the latest template is not present. 
Instead a PUT template request is made and the writes queued up
2020-10-06 11:20:37 +01:00
Alan Woodward 7405af8060
Convert TypeFieldType to a constant field type (#63214)
In 6x and 7x, indexes can have only one type, which means that we can rework
all queries against the type field to use a ConstantFieldType. This has already
been done in master with the removal of the TypeFieldMapper, but we still need
that class in 7x to deal with nested documents. This commit leaves
TypeFieldMapper in place, but refactors TypeFieldType to extend
ConstantFieldType and consolidates deprecation warnings within that class.

It also incidentally removes the requirement to pass a MapperService to
IndexFieldData.Builder#build, which should allow #63197 to be backported.
2020-10-06 10:27:37 +01:00
Armin Braun d7f6812d78
Improve Snapshot Abort Efficiency (#62173) (#63297)
There is no need to let snapshots that haven't yet written anything to the repo
finalize with `FAILED`. When we still had the `INIT` state we would also just remove
these snapshots from the state without any further action.

This is not just a theoretical optimization. Currently, the situation of having a lot of
queued up snapshots is fairly complicated to resolve when all the queued shards move to aborted
since it is now necessary to execute tasks on the `SNAPSHOT` pool (that might be very busy) to
remove the snapshot from the CS (including a number of redundant CS updates and repo writes
for finalizing these snapshots before deleting them right away after).
2020-10-06 05:14:25 +02:00
Nhat Nguyen 25fbc01459 Retry CCR shard follow task when no seed node left (#63225)
If the connection between clusters is disconnected or the leader cluster
is offline, then CCR shard-follow tasks can stop with "no seed node
left". CCR should retry on this error.
2020-10-05 21:56:56 -04:00
Armin Braun 5c3a4c13dd
Clone Snapshot API (#61839) (#63291)
Snapshot clone API. Complete except for some TODOs around documentation (and adding HLRC support).

backport of #61839, #63217, #63037
2020-10-06 01:52:25 +02:00