5762 Commits

Author SHA1 Message Date
Armin Braun
2aa80f9ee3
Dry up Searchable Snapshots ITs (#63190) (#63321)
Just a few spots where we can dry up these tests using the snapshot test infrastructure
in core that I found while studying the existing searchable snapshot tests.
2020-10-06 14:41:11 +02:00
Mayya Sharipova
bea0ead08a
Fix fields retrieval on unsinged_long field (#63310)
This fixes fields retrieval on unsigned_long field

1) For docvalue_fields a custom UnsignedLongLeafFieldData::getLeafValueFetcher
is implemented that correctly retrieves doc values.

2) For stored fields, an error was fixed in UnsignedLongFieldMapper
 how stored values were stored. Before they were incorrectly
stored in the shifted format, now they are stored as original
values in String format.

Relates to #60050
Backport for #63119
2020-10-06 06:37:31 -04:00
David Kyle
8f4ef40f78
[ML] Auditor ensures template is installed before writes (#63286)
The ML auditors should not write if the latest template is not present. 
Instead a PUT template request is made and the writes queued up
2020-10-06 11:20:37 +01:00
Nhat Nguyen
25fbc01459 Retry CCR shard follow task when no seed node left (#63225)
If the connection between clusters is disconnected or the leader cluster
is offline, then CCR shard-follow tasks can stop with "no seed node
left". CCR should retry on this error.
2020-10-05 21:56:56 -04:00
Armin Braun
e91936512a
Refactor SnapshotsInProgress State Transitions (#60517) (#63266)
The copy constructors previously used were hard to read and the exact state changes
were not obvious at all.
Refactored those into a number of named constructors instead, added additional assertions
and moved the snapshot abort logic into `SnapshotsInProgress`.
2020-10-06 00:03:42 +02:00
Armin Braun
860791260d
Implement Shard Snapshot Clone Logic (#62771) (#63260)
First part of the snapshot clone logic that implements the snapshot clone functionality on
the repository level.
2020-10-05 22:55:52 +02:00
Costin Leau
d027e24b31 EQL: Remove match functions (#63275)
Since match (for matching regex) is not currently in use remove it for
now.

Close #63263

(cherry picked from commit 6abd531cf457f3c5686f59709647bed3276e3c6b)
2020-10-05 23:30:41 +03:00
Costin Leau
6856306dcf EQL: Remove wildcard functionality from : (#63276)
Restrict : operator to only case insensitive matching on strings

Close #63262

(cherry picked from commit bc02e77150cdd85594dfac4f03d8aeb85aaddbb3)
2020-10-05 23:30:41 +03:00
Andrei Stefan
76bba601ab
Remove case_sensitive request option (#63218) (#63244)
Make EQL case sensitive by default and adapt some of the string functions
Remove the case sensitive option from Between string function
Add case_insensitive option to term and wildcard queries usage

(cherry picked from commit 7550e0664c8c2f1f13519036c759b1e76345551f)
2020-10-05 22:04:42 +03:00
Nhat Nguyen
1a6837883a Upgrade to Lucene-8.7.0-snapshot-77396dbf339 (#63222)
Includes LUCENE-9554, which exposes the pendingNumDocs from IndexWriter.
2020-10-05 14:39:30 -04:00
Armin Braun
cf75abb021
Optimize XContentParserUtils.ensureExpectedToken (#62691) (#63253)
We only ever use this with `XContentParser` no need to make it inline
worse by forcing the lambda and hence dynamic callsite here.
=> Extraced the exception formatting code path that is likely very cold
to a separate method and removed the lambda usage in hot loops by simplifying
the signature here.
2020-10-05 19:08:32 +02:00
Armin Braun
de6eeecbd3
Dry up Snapshot Integ Tests some More (#62856) (#63248)
* Just some obvious drying up of these super complex tests.
* Mainly just shortening the diff of #61839 here by moving test utilities
to the abstract test case.
Also, making use of the now available functionality to simplify existing tests
and improve logging in them.
2020-10-05 18:33:59 +02:00
Benjamin Trent
1e63313c19
[ML] adds feature_importance_baseline object to model metadata (#63172) (#63237)
this adds the new field `feature_importance_baseline` and allows it to be optionally be included in the model's metadata.

Related to: https://github.com/elastic/ml-cpp/pull/1522
2020-10-05 09:33:38 -04:00
Marios Trivyzas
19650e860a
EQL: [Test] Add a test for identifier as eventType (#63227) (#63235)
Add a unit test to verify that an identifier surrounded with backquotes
is not a valid syntax for eventType value, as eventType is
schemantically a string literal and not a field identifier.

Follows: #63169
(cherry picked from commit ff12c1340b3890ac52251f31259fa9a719d9eacc)
2020-10-05 15:23:08 +02:00
Costin Leau
1047d67199 Revert "EQL: Avoid filtering on tiebreakers (#63215)"
This reverts commit efd2243886f772bf31ae822497b1d545ff0544c9.
2020-10-05 15:55:59 +03:00
Costin Leau
8c4503bcc3 EQL: Change default indices options (#63192)
Ignore by default unavailable indices (same as ES) and verify that
allowNoIndices is set to false since at least one index is required
for validating the query.

Fix #62986

(cherry picked from commit fd75ac27223cd1b699b8d9c311dc401a39f9e0c8)
2020-10-05 14:21:56 +03:00
Costin Leau
b67d2274ae QL: Optimize regexs without patterns as equality (#63216)
If a QL regex doesn't contain any pattern, convert it to Equals.

Close #63196

(cherry picked from commit e22a843124290aaacd0e80d7ae9b883e5ec2431e)
2020-10-05 14:21:42 +03:00
Costin Leau
efd2243886 EQL: Avoid filtering on tiebreakers (#63215)
Do not filter by tiebreaker while searching sequence matches as
it's not monotonic and thus can filter out valid data.

Fix #62781

(cherry picked from commit 4d62198df70f3b70f8b6e7730e888057652c18a8)
2020-10-05 14:21:30 +03:00
Costin Leau
4f593bdd69 EQL: Make queries using Point-In-Time rely on index filtering (#63161)
Point-In-Time queries cannot be ran on individual indices but on all.
Thus all PIT queries move their index from the request level to a filter
so this condition is fulfilled while keeping the query scoped
accordingly.

Fix #63132

(cherry picked from commit c8eb4f724d5dcc0fcc172c6219ecfbc1dc1fbbae)
2020-10-05 14:21:09 +03:00
Alan Woodward
01950bc80f
Move FieldMapper#valueFetcher to MappedFieldType (#62974) (#63220)
For runtime fields, we will want to do all search-time interaction with
a field definition via a MappedFieldType, rather than a FieldMapper, to
avoid interfering with the logic of document parsing. Currently, fetching
values for runtime scripts and for building top hits responses need to
call a method on FieldMapper. This commit moves this method to
MappedFieldType, incidentally simplifying the current call sites and freeing
us up to implement runtime fields as pure MappedFieldType objects.
2020-10-04 14:54:59 +01:00
Jason Tedor
1c136bb7fc
Add tier preference when mounting (#63204)
This commit adds a tier preference when mounting a searchable
snapshot. This sets a preference that a searchable snapshot is mounted
to a node with the cold role if one exists, then the warm role, then the
hot role, assuming that no other allocation rules are in place. This
means that by default, searchable snapshots are mounted to a node with
the cold role.

Note that depending on how we implement frozen functionality of
searchable snapshots (not pre-cached/not fully-cached), we might need to
adjust this to prefer frozen if mounting a not pre-cached/fully-cached
searchable snapshot versus mounting a pre-cached/fully-cached searchable
snapshot. This is a later concern since neither this nor the frozen role
are implemented currently.
2020-10-03 07:33:36 -04:00
Nhat Nguyen
4ef8673fdd Fix testRestartAfterCompletion (#63211)
We need to complete the search before closing the iterator, which 
internally closes the point in time; otherwise, the search will fail
with a missing context error.

Closes #62451
2020-10-02 18:14:42 -04:00
Martijn van Groningen
0b6e2b8f16
Fix enrich policy test bug.
Backport #63182 to 7.x branch.

The `randomEnrichPolicy(...)` helper method stores the policy and creates the source indices.
If a source index already exists, because it was creates for a random policy created earlier then
skipping the source index fails, but that is ignored and the test continues. However if the policy
has a match field that doesn't exist in the previous random policy then the mapping is never updated
and the put policy api fails with the fact that the match field can't be found.

This pr fixes that by execute a put mapping call in the event that the source index already exists.

Closes #63126
2020-10-02 19:34:39 +02:00
Benjamin Trent
752ee0288e
[7.x] [ML] optimize delete expired snapshots (#63134) (#63200)
* [ML] optimize delete expired snapshots (#63134)

When deleting expired snapshots, we do an individual delete action per snapshot per job.

We should instead gather the expired snapshots and delete them in a single call.

This commit achieves this and a side-effect is there is less audit log spam on nightly cleanup

closes https://github.com/elastic/elasticsearch/issues/62875
2020-10-02 13:24:36 -04:00
Marios Trivyzas
3cac996373
EQL: Fix syntax for event type (#63169) (#63194)
Event type is actually a string value for event.category which can
contain any kind of characters, or start with a digit, which currently
is not supported, so we introduce the possibility to be able to use the
usual syntax of " and """ for strings and raw strings.

Make the grammar a bit cleaner by using the identifier only where it's
actually an identifier in terms of query scemantics.

Fixes: #62933
(cherry picked from commit 306e1d76da3db652db57f11f847705b3995609ff)
2020-10-02 17:28:13 +02:00
markharwood
bfb3071539
Wildcard field - add normalisation of ngram tokens to reduce disk space. (#63120) (#63193)
Adds normalisation of ngram tokens to reduce disk space.
All punctuation becomes / char and for A-Z0-9 chars turn even codepoints to prior odd e.g. aab becomes aaa

Closes #62817
2020-10-02 16:24:27 +01:00
Przemysław Witek
5370f270d7
[7.x] [ML] Ensure data frame analytics jobs don't run on a node that's too new (#62749) (#63175) 2020-10-02 17:19:58 +02:00
Marios Trivyzas
9cf0722fe6
SQL: Fix exception when using CAST on inexact field (#62943) (#63187)
Currently, CAST will use the first keyword subfield of a text field for
an expression in WHERE clause that gets translated to a painless script
which will lead to an exception thrown:
```
"root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "org.elasticsearch.index.mapper.TextFieldMapper$TextFieldType.fielddataBuilder(TextFieldMapper.java:759)",
          "org.elasticsearch.index.fielddata.IndexFieldDataService.getForField(IndexFieldDataService.java:116)",
          "org.elasticsearch.index.query.QueryShardContext.lambda$lookup$0(QueryShardContext.java:308)",
          "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:101)",
          "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:98)",
          "java.security.AccessController.doPrivileged(Native Method)",
          "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:98)",
          "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:41)",
          "org.elasticsearch.xpack.sql.expression.function.scalar.whitelist.InternalSqlScriptUtils.docValue(InternalSqlScriptUtils.java:79)",
          "InternalSqlScriptUtils.cast(InternalSqlScriptUtils.docValue(doc,params.v0),params.v1)",
          "                                                                      ^---- HERE"
        ],
        "script": "InternalSqlScriptUtils.cast(InternalSqlScriptUtils.docValue(doc,params.v0),params.v1)",
        "lang": "painless"
      }
    ],
```

Instead of allowing a painless translation using the first underlying
keyword silently, which can be confusing, we detect such usage and throw\
an error early.

Relates to #60178

(cherry picked from commit 7402e8267ba564e52dc672c25b262824b6048b40)
2020-10-02 16:42:59 +02:00
Joe Gallo
d172a18c95 Tidy up some ILM and SLM packages (#63146)
Very minor refactoring, just moving some ILM and SLM classes around to decrease
the total number of packages.
2020-10-02 09:30:24 -04:00
Martijn van Groningen
300e525138
Fix querying a data stream name in _index field. (#63178)
Backport #63170 to 7.x branch.

The _index field is a special field that allows using queries against the name of an index or alias.
Data stream names were not included, this pr fixes that by changing SearchIndexNameMatcher
(which used via IndexFieldMapper) to also include data streams.
2020-10-02 15:29:20 +02:00
Marios Trivyzas
7d74fb8577
EQL: Replace ?"..." with """...""" for unescaped strings (#62539) (#63174)
Use triple double quotes enclosing a string literal to interpret it
as unescaped, in order to use `?` for marking query params and avoid
user confusion. `?` also usually implies regex expressions.

Any character inside the `"""` beginning-closing markings is considered
raw and the only thing that is not permitted is the `"""` sequence itself.
If a user wants to use that, needs to resort to the normal `"` string literal
and use proper escaping.

Relates to #61659

(cherry picked from commit d87c2ca2eacab5552bca1e520d33cf71da40bcfd)
2020-10-02 14:58:50 +02:00
Benjamin Trent
cfcf973259
[7.x] [ML] renames */inference* apis to */trained_models* (#63097) (#63136)
* [ML] renames */inference* apis to */trained_models* (#63097)

This commit renames all `inference` CRUD APIs to `trained_models`.

This aligns with internal terminology, documentation, and use-cases.
2020-10-02 07:34:28 -04:00
Benjamin Trent
535f8a434b
Revert "[ML] adding baseline field to total_feature_importance objects (#63098) (#63125)" (#63144)
This reverts commit 95242ecceed6f20a559aca15e39b3a5de1ac3e34.
2020-10-02 07:03:15 -04:00
Luca Cavanna
a42a516b67 Shorten runtime field type class names (#63123)
In the codebase there is the non-written convention that classes that extend `MappedFieldType` are generally called `*FieldType`. With this commit we adopt the same convention for runtime field types which allows us to shorten their names by removing the `Mapped` portion which is implicit.
2020-10-02 11:25:25 +02:00
Ioannis Kakavas
e91f66e22f
Ensure domain_name setting for AD realm is present (#61983) (#63159)
We would only check for a null value and not for an empty string so
that meant that we were not actually enforcing this mandatory
setting. This commits ensures we check for both and fail 
accordingly if necessary, on startup
2020-10-02 12:16:08 +03:00
David Kyle
279f951700
[ML] Set parent task Id on ml expired data removers (#62854) (#62966)
Setting the parent task Id (of the delete expired data action) on the ML
expired data removers makes it easier to track and cancel long running
tasks
2020-10-02 10:14:10 +01:00
Christoph Büscher
4c7c540ca1 Update version field yml test skip version (#63139) 2020-10-02 10:01:27 +02:00
Costin Leau
614f4c13a5 EQL: Introduce case-sensitive equality (#63121)
Introduce : operator for doing case insensitive string comparisons.
Recognizes "*" for wildcard matches in string literals.
Restricted only to string types.

Relates #62941

(cherry picked from commit 201e577e65f26a9b958a6197fe6c7268da39de29)
2020-10-02 00:23:08 +03:00
Igor Motov
fc13b72cea
Extract histogramFieldDocValues into an utility class (#63100) (#63148)
This function will be needed in the upcoming rate aggs tests.
2020-10-01 15:44:37 -04:00
Marios Trivyzas
3ad4b00c7e
EQL: Clean grammar from fork (#63094) (#63138)
Since `fork` is not used, is undocumented in Python EQL and
there is no plan at the moment to implement it in the future,
removing it  from the grammar. User will get parsing exceptions
instead of higher level messages about unsupported features
which can lead to wrong expectations.

(cherry picked from commit f6a0f8f01c1b1893bab86629d1de73e9f9dae8dc)
2020-10-01 21:14:41 +02:00
Lee Hinman
f0f0da2188
[7.x] Add telemetry for data tiers (#63031) (#63140)
Backports the following commits to 7.x:

    Add telemetry for data tiers (#63031)
2020-10-01 12:37:32 -06:00
Benjamin Trent
95242eccee
[ML] adding baseline field to total_feature_importance objects (#63098) (#63125)
This adds a new `baseline` field to the feature importance values. 

This field contains the baseline importance for a given feature and class.
2020-10-01 09:48:07 -04:00
Dimitris Athanasiou
46c3973400
[7.x][ML] Remove direct access to system index from filter_crud REST test (#63111) (#63115)
This test accesses system indices for 2 reasons.

First, it creates a filter that has a different type. This was done
to assert that filter is not returned from the APIs. However,
now that access to the `.ml-meta` index is restricted,
it is not really a concern.

Second, it creates a `.ml-meta` index without mappings to test
the get API does not fail due to lack of mappings on a sorted field,
namely the `filter_id`. Once again, this test is less useful once
system indices have restricted access.

Relates #62501

Backport of #63111
2020-10-01 15:15:34 +03:00
Costin Leau
c2992ea287 EQL: Fix NPE from incorrect use of ids search (#63032)
This fixes a bug introduced when moving from mget to ids query. While
mget returns all the ids given, id query is a search query and thus by
default returns only 10 documents.
The fix correctly sets the expected size so all the information is
returned inside the response.

Fix #63030

(cherry picked from commit 09ba85548a0142a1fe8376efea9cc4e7764a207c)
2020-10-01 13:49:58 +03:00
Hendrik Muhs
e001b4c021 [Transform] fix time rounding in TransformContinuousIT (#63113)
fix a time rounding problem in the test, due to rounding down to epoch
seconds instead of epoch millis

fixes #62951
2020-10-01 11:43:50 +02:00
Ignacio Vera
ba5574935e
Remove dependency of Geometry queries with mapped type names (#63077) (#63110)
It extracts the query capabilities from AbstractGeometryFieldType into two new interfaces, GeoshapeQueryable and ShapeQueryable. Those interfaces are implemented by the final mappers.
2020-10-01 10:49:12 +02:00
Howard
8c6e197f51 Remove allocation id from engine (#62680)
We no longer need the allocation id in Engine.
2020-09-30 15:28:27 -04:00
Marios Trivyzas
f69d268500
SQL: Allow skip of bwc tests on check task (#62936) (#63089)
Bwc tests can consume much time to build and to run so it's nice to be
able to skip them when running the `check` task on the SQL module.

Introduce a new task `checkNoBwc` so one can use:
```
./gradlew -p x-pack/plugin/sql checkNoBwc
```
to skip them.

(cherry picked from commit a52e1846f338f6869273181c6f248579581fa68c)
2020-09-30 20:03:19 +02:00
Marios Trivyzas
0ebaf8a3ec
EQL: Allow escaped backquote in identifiers (#62932) (#63082)
Previously, backquote couldn't not be used inside an escaped identifier,
e.g.:
```
`my`identifier` = "some_value"
```
was not allowed. Introduce escaping of the backquote by using a
double backquote:
```
`my``identifier` = "some_value"
```

(cherry picked from commit 49514121486f42a58674b3e5901de4021fda5c15)
2020-09-30 19:10:09 +02:00
Alan Woodward
675d18f6ea Convert dense/sparse vector field mappers to Parametrized form (#62992)
Also adds a proper MapperTestCase test for dense vectors.

Relates to #62988
2020-09-30 16:55:28 +01:00