Commit Graph

5689 Commits

Author SHA1 Message Date
Julie Tibshirani ae2fc4118d Add factory methods for common value fetchers. (#63438)
This PR adds factory methods for the most common implementations:
* `SourceValueFetcher.identity` to pass through the source value untouched.
* `SourceValueFetcher.toString` to simply convert the source value to a string.
2020-10-08 12:14:53 -07:00
Julie Tibshirani a506705569 Small fixes to flattened field value fetching. (#63443)
* Remove FlatObjectFieldTypeTests, as it's redundant.
* Do not apply null_value when fetching root-level values.
* Remove a TODO in favor of opening an issue.
2020-10-08 11:52:54 -07:00
Benjamin Trent a9be4181c6
[ML] fix grabbing the doc value limit setting in _explain (#63402) (#63471)
Getting the doc value settings shouldn't use the API callers headers. We only use this value internally.
2020-10-08 08:53:26 -04:00
Mayya Sharipova e022b78198
Upgrade to lucene-8.7.0-snapshot-5c4168d (#63466)
This disables sort optim on _doc, which may still be unstable.
Backport for #63444
2020-10-08 08:20:43 -04:00
Costin Leau 2ab5f226c4 EQL: Avoid filtering on tiebreakers (#63415)
Do not filter by tiebreaker while searching sequence matches as
it's not monotonic and thus can filter out valid data.
Add handling for data 'near' the boundary that has the same timestamp
but different tie-breaker and thus can be just outside the window.

Fix #62781
Relates #63215

(cherry picked from commit 36f834600d4d9ded0fb7b1440274b2e597733770)
(cherry picked from commit 72a2ce825f3bfd13f87423ba7f3c739ea64c57f6)
2020-10-08 13:50:41 +03:00
David Roberts a9d541561f [ML] Unmute DeleteExpiredDataIT.testDeleteExpiredDataNoThrottle (#63408)
This test appears to work again following the Lucene bug fix
that was integrated in #63395
2020-10-08 09:11:29 +01:00
Przemysław Witek bd761cce1d
[ML] Validate that AucRoc has the data necessary to be calculated (#63302) (#63454) 2020-10-08 09:52:15 +02:00
Luca Cavanna 659988a77f
Remove runtime fields (#63418)
We are not going to release runtime fields with 7.10, hence we are removing them from the 7.10 branch.
2020-10-07 20:39:41 +02:00
Mayya Sharipova e236ea43e9 Upgrade to lucene-8.7.0-snapshot-e914862 (#63401)
Backport for: #63395
2020-10-07 09:45:14 -04:00
Alan Woodward 88b45dfa61
Convert TextFieldMapper to parametrized form (#63269) (#63392)
As a result of this, we can remove a chunk of code from TypeParsers as well. Tests
for search/index mode analyzers have moved into their own file. This commit also
rationalises the serialization checks for parameters into a single SerializerCheck
interface that takes the values includeDefaults, isConfigured and the value
itself.

Relates to #62988
2020-10-07 13:26:25 +01:00
Hendrik Muhs d45f7de3fb [Transform] Add test logging regarding conflict on start (#63383)
add extra logging for investigation of #63365
2020-10-07 10:17:31 +02:00
Tim Vernum c30c5555c5
Mute DeleteExpiredDataIT deleteExpired NoThrottle (#63381)
Mutes test method DeleteExpiredDataIT.testDeleteExpiredDataNoThrottle

Relates: #63379
Backport of: #63380
2020-10-07 17:43:52 +11:00
Stuart Tettemer 8a61b95a0f
Scripting: JSON parsing and writing in watcher (#63278) (#63377)
Co-authored-by: Honza Král
Co-authored-by: Jack Conradson
Backport of: f43e52d
2020-10-06 23:39:40 -05:00
Stuart Tettemer 7f4f70f557
Scripting: Augment String with Hash support in Watcher (#63346) (#63375)
Strings in the watcher context may use the `.sha1()` and `.sha256()`
augmentation added for ingest.

Ref: #59633, #59671
Fixes: #61244
 Backport of: 380ee6f
2020-10-06 22:10:27 -05:00
Gordon Brown 15edc39d9b
Update logstash_admin role for system indices (#63368)
This PR updates the `logstash_admin` role to include the recently-added Logstash Pipeline Management APIs, as well as access to the `.logstash*` index pattern.

Co-authored-by: William Brafford <williamrandolphbrafford@gmail.com>
2020-10-06 20:43:36 -06:00
Mayya Sharipova f2ba62b894
Upgrade to lucene- 8.7.0-snapshot-66c49a35402 (#63372)
This includes fixing a bug in doc iteration during sort optimization

Backport for #63349
2020-10-06 22:38:58 -04:00
Julie Tibshirani f17ca18dfa
Make array value parsing flag more robust. (#63371)
When constructing a value fetcher, the 'parsesArrayValue' flag must match
`FieldMapper#parsesArrayValue`. However there is nothing in code or tests to
help enforce this.

This PR reworks the value fetcher constructors so that `parsesArrayValue` is
'false' by default. Just as for `FieldMapper#parsesArrayValue`, field types must
explicitly set it to true and ensure the behavior is covered by tests.

Follow-up to #62974.
2020-10-06 17:49:25 -07:00
Gordon Brown 5c8b0662df
Deprecate REST access to System Indices (#63274) (Original #60945)
This PR adds deprecation warnings when accessing System Indices via the REST layer. At this time, these warnings are only enabled for Snapshot builds by default, to allow projects external to Elasticsearch additional time to adjust their access patterns.

Deprecation warnings will be triggered by all REST requests which access registered System Indices, except for purpose-specific APIs which access System Indices as an implementation detail a few specific APIs which will continue to allow access to system indices by default:

- `GET _cluster/health`
- `GET {index}/_recovery`
- `GET _cluster/allocation/explain`
- `GET _cluster/state`
- `POST _cluster/reroute`
- `GET {index}/_stats`
- `GET {index}/_segments`
- `GET {index}/_shard_stores`
- `GET _cat/[indices,aliases,health,recovery,shards,segments]`

Deprecation warnings for accessing system indices take the form:
```
this request accesses system indices: [.some_system_index], but in a future major version, direct access to system indices will be prevented by default
```
2020-10-06 13:41:40 -06:00
Tanguy Leroux 87076c32e2
Determine shard size before allocating shards recovering from snapshots (#61906) (#63337)
Determines the shard size of shards before allocating shards that are
recovering from snapshots. It ensures during shard allocation that the
target node that is selected as recovery target will have enough free
disk space for the recovery event. This applies to regular restores,
CCR bootstrap from remote, as well as mounting searchable snapshots.

The InternalSnapshotInfoService is responsible for fetching snapshot
shard sizes from repositories. It provides a getShardSize() method
to other components of the system that can be used to retrieve the
latest known shard size. If the latest snapshot shard size retrieval
failed, the getShardSize() returns
ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE. While
we'd like a better way to handle such failures, returning this value
allows to keep the existing behavior for now.

Note that this PR does not address an issues (we already have today)
where a replica is being allocated without knowing how much disk
space is being used by the primary.

Co-authored-by: Yannick Welsch <yannick@welsch.lu>
2020-10-06 18:37:05 +02:00
Igor Motov 2405162c39
Mute RegressionIT.testAliasFields test (#63339)
It fails quite frequently in 7.x.

Relates to #63268
2020-10-06 12:18:12 -04:00
David Kyle ea32b4ab82
[ML] Audit message when nightly maintenance times out (#63252) (#63330)
During deletion of old ml data set the delete by query timeout to 8 hours and
audit a job message when the nightly maintenance task times out.
2020-10-06 16:19:37 +01:00
Hendrik Muhs 058c55da6a [Transform] disallow field and script being empty for group sources (#63313)
fail validation earlier when field and script are both missing in a group source
2020-10-06 16:59:02 +02:00
Yang Wang abf9b885b4
Bulk invalidate API keys using a list of IDs (#63224) (#63320)
Add a new ids field to the API of invalidating API keys so that it supports bulk
invalidation with a list of IDs.
Note the existing id field is kept as is and it is an error if both id and ids are specified.
2020-10-07 00:49:21 +11:00
Yang Wang bbfa2f1303 Fix test failure due to missing client action 2020-10-07 00:45:30 +11:00
Benjamin Trent a72d7cc76a
[ML] prefer secondary auth headers on data frame analytics _explain (#63281) (#63323)
We should prefer secondary auth headers when calling _explain
2020-10-06 09:15:29 -04:00
Luca Cavanna ca68298e89
Remove MapperService argument from IndexFieldData.Builder#build (#63197) (#63311)
MapperService carries a lot of weight and is only used to determine if loading of field data for the id field is enabled, which can be done in a different way.
2020-10-06 15:04:23 +02:00
Yang Wang 7969fbb4ab
Cache API key doc to reduce traffic to the security index (#59376) (#63319)
Getting the API key document form the security index is the most time consuing part
of the API Key authentication flow (>60% if index is local and >90% if index is remote).
This traffic is now avoided by caching added with this PR.

Additionally, we add a cache invalidator registry so that clearing of different caches will
be managed in a single place (requires follow-up PRs).
2020-10-06 23:49:23 +11:00
Armin Braun 2aa80f9ee3
Dry up Searchable Snapshots ITs (#63190) (#63321)
Just a few spots where we can dry up these tests using the snapshot test infrastructure
in core that I found while studying the existing searchable snapshot tests.
2020-10-06 14:41:11 +02:00
Mayya Sharipova bea0ead08a
Fix fields retrieval on unsinged_long field (#63310)
This fixes fields retrieval on unsigned_long field

1) For docvalue_fields a custom UnsignedLongLeafFieldData::getLeafValueFetcher
is implemented that correctly retrieves doc values.

2) For stored fields, an error was fixed in UnsignedLongFieldMapper
 how stored values were stored. Before they were incorrectly
stored in the shifted format, now they are stored as original
values in String format.

Relates to #60050
Backport for #63119
2020-10-06 06:37:31 -04:00
David Kyle 8f4ef40f78
[ML] Auditor ensures template is installed before writes (#63286)
The ML auditors should not write if the latest template is not present. 
Instead a PUT template request is made and the writes queued up
2020-10-06 11:20:37 +01:00
Nhat Nguyen 25fbc01459 Retry CCR shard follow task when no seed node left (#63225)
If the connection between clusters is disconnected or the leader cluster
is offline, then CCR shard-follow tasks can stop with "no seed node
left". CCR should retry on this error.
2020-10-05 21:56:56 -04:00
Armin Braun e91936512a
Refactor SnapshotsInProgress State Transitions (#60517) (#63266)
The copy constructors previously used were hard to read and the exact state changes
were not obvious at all.
Refactored those into a number of named constructors instead, added additional assertions
and moved the snapshot abort logic into `SnapshotsInProgress`.
2020-10-06 00:03:42 +02:00
Armin Braun 860791260d
Implement Shard Snapshot Clone Logic (#62771) (#63260)
First part of the snapshot clone logic that implements the snapshot clone functionality on
the repository level.
2020-10-05 22:55:52 +02:00
Costin Leau d027e24b31 EQL: Remove match functions (#63275)
Since match (for matching regex) is not currently in use remove it for
now.

Close #63263

(cherry picked from commit 6abd531cf457f3c5686f59709647bed3276e3c6b)
2020-10-05 23:30:41 +03:00
Costin Leau 6856306dcf EQL: Remove wildcard functionality from : (#63276)
Restrict : operator to only case insensitive matching on strings

Close #63262

(cherry picked from commit bc02e77150cdd85594dfac4f03d8aeb85aaddbb3)
2020-10-05 23:30:41 +03:00
Andrei Stefan 76bba601ab
Remove case_sensitive request option (#63218) (#63244)
Make EQL case sensitive by default and adapt some of the string functions
Remove the case sensitive option from Between string function
Add case_insensitive option to term and wildcard queries usage

(cherry picked from commit 7550e0664c8c2f1f13519036c759b1e76345551f)
2020-10-05 22:04:42 +03:00
Nhat Nguyen 1a6837883a Upgrade to Lucene-8.7.0-snapshot-77396dbf339 (#63222)
Includes LUCENE-9554, which exposes the pendingNumDocs from IndexWriter.
2020-10-05 14:39:30 -04:00
Armin Braun cf75abb021
Optimize XContentParserUtils.ensureExpectedToken (#62691) (#63253)
We only ever use this with `XContentParser` no need to make it inline
worse by forcing the lambda and hence dynamic callsite here.
=> Extraced the exception formatting code path that is likely very cold
to a separate method and removed the lambda usage in hot loops by simplifying
the signature here.
2020-10-05 19:08:32 +02:00
Armin Braun de6eeecbd3
Dry up Snapshot Integ Tests some More (#62856) (#63248)
* Just some obvious drying up of these super complex tests.
* Mainly just shortening the diff of #61839 here by moving test utilities
to the abstract test case.
Also, making use of the now available functionality to simplify existing tests
and improve logging in them.
2020-10-05 18:33:59 +02:00
Benjamin Trent 1e63313c19
[ML] adds feature_importance_baseline object to model metadata (#63172) (#63237)
this adds the new field `feature_importance_baseline` and allows it to be optionally be included in the model's metadata.

Related to: https://github.com/elastic/ml-cpp/pull/1522
2020-10-05 09:33:38 -04:00
Marios Trivyzas 19650e860a
EQL: [Test] Add a test for `identifier` as eventType (#63227) (#63235)
Add a unit test to verify that an identifier surrounded with backquotes
is not a valid syntax for eventType value, as eventType is
schemantically a string literal and not a field identifier.

Follows: #63169
(cherry picked from commit ff12c1340b3890ac52251f31259fa9a719d9eacc)
2020-10-05 15:23:08 +02:00
Costin Leau 1047d67199 Revert "EQL: Avoid filtering on tiebreakers (#63215)"
This reverts commit efd2243886.
2020-10-05 15:55:59 +03:00
Costin Leau 8c4503bcc3 EQL: Change default indices options (#63192)
Ignore by default unavailable indices (same as ES) and verify that
allowNoIndices is set to false since at least one index is required
for validating the query.

Fix #62986

(cherry picked from commit fd75ac27223cd1b699b8d9c311dc401a39f9e0c8)
2020-10-05 14:21:56 +03:00
Costin Leau b67d2274ae QL: Optimize regexs without patterns as equality (#63216)
If a QL regex doesn't contain any pattern, convert it to Equals.

Close #63196

(cherry picked from commit e22a843124290aaacd0e80d7ae9b883e5ec2431e)
2020-10-05 14:21:42 +03:00
Costin Leau efd2243886 EQL: Avoid filtering on tiebreakers (#63215)
Do not filter by tiebreaker while searching sequence matches as
it's not monotonic and thus can filter out valid data.

Fix #62781

(cherry picked from commit 4d62198df70f3b70f8b6e7730e888057652c18a8)
2020-10-05 14:21:30 +03:00
Costin Leau 4f593bdd69 EQL: Make queries using Point-In-Time rely on index filtering (#63161)
Point-In-Time queries cannot be ran on individual indices but on all.
Thus all PIT queries move their index from the request level to a filter
so this condition is fulfilled while keeping the query scoped
accordingly.

Fix #63132

(cherry picked from commit c8eb4f724d5dcc0fcc172c6219ecfbc1dc1fbbae)
2020-10-05 14:21:09 +03:00
Alan Woodward 01950bc80f
Move FieldMapper#valueFetcher to MappedFieldType (#62974) (#63220)
For runtime fields, we will want to do all search-time interaction with
a field definition via a MappedFieldType, rather than a FieldMapper, to
avoid interfering with the logic of document parsing. Currently, fetching
values for runtime scripts and for building top hits responses need to
call a method on FieldMapper. This commit moves this method to
MappedFieldType, incidentally simplifying the current call sites and freeing
us up to implement runtime fields as pure MappedFieldType objects.
2020-10-04 14:54:59 +01:00
Jason Tedor 1c136bb7fc
Add tier preference when mounting (#63204)
This commit adds a tier preference when mounting a searchable
snapshot. This sets a preference that a searchable snapshot is mounted
to a node with the cold role if one exists, then the warm role, then the
hot role, assuming that no other allocation rules are in place. This
means that by default, searchable snapshots are mounted to a node with
the cold role.

Note that depending on how we implement frozen functionality of
searchable snapshots (not pre-cached/not fully-cached), we might need to
adjust this to prefer frozen if mounting a not pre-cached/fully-cached
searchable snapshot versus mounting a pre-cached/fully-cached searchable
snapshot. This is a later concern since neither this nor the frozen role
are implemented currently.
2020-10-03 07:33:36 -04:00
Nhat Nguyen 4ef8673fdd Fix testRestartAfterCompletion (#63211)
We need to complete the search before closing the iterator, which 
internally closes the point in time; otherwise, the search will fail
with a missing context error.

Closes #62451
2020-10-02 18:14:42 -04:00
Martijn van Groningen 0b6e2b8f16
Fix enrich policy test bug.
Backport #63182 to 7.x branch.

The `randomEnrichPolicy(...)` helper method stores the policy and creates the source indices.
If a source index already exists, because it was creates for a random policy created earlier then
skipping the source index fails, but that is ignored and the test continues. However if the policy
has a match field that doesn't exist in the previous random policy then the mapping is never updated
and the put policy api fails with the fact that the match field can't be found.

This pr fixes that by execute a put mapping call in the event that the source index already exists.

Closes #63126
2020-10-02 19:34:39 +02:00