Commit Graph

54218 Commits

Author SHA1 Message Date
markharwood bfb3071539
Wildcard field - add normalisation of ngram tokens to reduce disk space. (#63120) (#63193)
Adds normalisation of ngram tokens to reduce disk space.
All punctuation becomes the / char, and for A-Z0-9 chars even codepoints are turned into the prior odd codepoint, e.g. aab becomes aaa
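
A minimal sketch of the folding rule described above, assuming ASCII letters and digits of either case count as "A-Z0-9" (the commit's own example is lowercase) and that every other character is treated as punctuation. The class and method names are hypothetical; this is not the actual wildcard field code.

```java
// Hypothetical illustration of the ngram token normalisation; not the real mapper code.
final class NgramNormalizerSketch {

    // Folds one character: letters and digits with an even codepoint map to
    // the preceding odd codepoint; anything else becomes '/'.
    static char normalize(char c) {
        boolean letterOrDigit =
            (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
        if (!letterOrDigit) {
            return '/';
        }
        return (c % 2 == 0) ? (char) (c - 1) : c;
    }

    // Applies the per-character folding to a whole token.
    static String normalize(String token) {
        StringBuilder sb = new StringBuilder(token.length());
        for (int i = 0; i < token.length(); i++) {
            sb.append(normalize(token.charAt(i)));
        }
        return sb.toString();
    }
}
```

Folding pairs of adjacent codepoints into one roughly halves the alphabet of indexed ngrams, which is where the disk saving would come from; any false positives this introduces would have to be filtered by a later verification step.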

Closes #62817
2020-10-02 16:24:27 +01:00
Przemysław Witek 5370f270d7
[7.x] [ML] Ensure data frame analytics jobs don't run on a node that's too new (#62749) (#63175) 2020-10-02 17:19:58 +02:00
Marios Trivyzas 9cf0722fe6
SQL: Fix exception when using CAST on inexact field (#62943) (#63187)
Currently, CAST will use the first keyword subfield of a text field for
an expression in WHERE clause that gets translated to a painless script
which will lead to an exception thrown:
```
"root_cause": [
      {
        "type": "script_exception",
        "reason": "runtime error",
        "script_stack": [
          "org.elasticsearch.index.mapper.TextFieldMapper$TextFieldType.fielddataBuilder(TextFieldMapper.java:759)",
          "org.elasticsearch.index.fielddata.IndexFieldDataService.getForField(IndexFieldDataService.java:116)",
          "org.elasticsearch.index.query.QueryShardContext.lambda$lookup$0(QueryShardContext.java:308)",
          "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:101)",
          "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:98)",
          "java.security.AccessController.doPrivileged(Native Method)",
          "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:98)",
          "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:41)",
          "org.elasticsearch.xpack.sql.expression.function.scalar.whitelist.InternalSqlScriptUtils.docValue(InternalSqlScriptUtils.java:79)",
          "InternalSqlScriptUtils.cast(InternalSqlScriptUtils.docValue(doc,params.v0),params.v1)",
          "                                                                      ^---- HERE"
        ],
        "script": "InternalSqlScriptUtils.cast(InternalSqlScriptUtils.docValue(doc,params.v0),params.v1)",
        "lang": "painless"
      }
    ],
```

Instead of allowing a painless translation using the first underlying
keyword silently, which can be confusing, we detect such usage and throw
an error early.

Relates to #60178

(cherry picked from commit 7402e8267ba564e52dc672c25b262824b6048b40)
2020-10-02 16:42:59 +02:00
James Rodewig 099e5d00cc
[DOCS] EQL: Reorganize EQL syntax sections (#63179) (#63184) 2020-10-02 10:25:32 -04:00
nitin2goyal c9baadd19b Fix to actually throttle indexing when throttling is activated (#61768)
In #22721, the decision to throttle indexing was inadvertently flipped,
so that until this commit we throttled indexing during recovery but
never throttled user-initiated indexing requests. This commit
fixes that to throttle user-initiated indexing requests and never
throttle recovery requests.

Closes #61959
2020-10-02 15:50:31 +02:00
James Rodewig 322a6b3655
[DOCS] Corrected track_total_hits def (#62830) (#63181)
Co-authored-by: John Berryman <jnbrymn@github.com>
2020-10-02 09:46:16 -04:00
Joe Gallo d172a18c95 Tidy up some ILM and SLM packages (#63146)
Very minor refactoring, just moving some ILM and SLM classes around to decrease
the total number of packages.
2020-10-02 09:30:24 -04:00
Martijn van Groningen 300e525138
Fix querying a data stream name in _index field. (#63178)
Backport #63170 to 7.x branch.

The _index field is a special field that allows using queries against the name of an index or alias.
Data stream names were not included; this PR fixes that by changing SearchIndexNameMatcher
(which is used via IndexFieldMapper) to also include data streams.
2020-10-02 15:29:20 +02:00
Armin Braun 1663dc7cf8
Fix GCS Repo Cleanup Tool Exception Handling (#63168)
We recently upgraded the SDK, which resulted in the storage exception now being
wrapped, so we must unwrap it to check whether it's a 404 or not.
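
A rough sketch of the unwrapping idea, assuming the status code lives on an exception somewhere in the cause chain. `StatusExceptionSketch` is a hypothetical stand-in for the SDK's storage exception, not its real type.

```java
// Hypothetical stand-in for the SDK's storage exception; only the status code matters here.
final class StatusExceptionSketch extends RuntimeException {
    final int code;

    StatusExceptionSketch(int code, String message) {
        super(message);
        this.code = code;
    }
}

final class UnwrapSketch {

    // Walks the cause chain and reports whether any link is a 404 status exception.
    static boolean isNotFound(Throwable t) {
        int depth = 0;
        // The depth bound guards against cyclic cause chains.
        for (Throwable cur = t; cur != null && depth < 32; cur = cur.getCause(), depth++) {
            if (cur instanceof StatusExceptionSketch && ((StatusExceptionSketch) cur).code == 404) {
                return true;
            }
        }
        return false;
    }
}
```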

Closes #63091
2020-10-02 15:26:39 +02:00
Armin Braun 022a3ef831
Split Tests out of SharedClusterSnapshotRestoreIT (#63130) (#63176)
Splitting some tests out of this class that has become a catch-all
for random snapshot related tests into either existing suites that fit
better for these tests or one of two new suites to prevent timeouts
in extreme cases (e.g. `WindowsFS` + many nodes + multiple data paths per node).
No other changes to tests were made whatsoever.

Closes #61541
2020-10-02 15:26:22 +02:00
Marios Trivyzas 7d74fb8577
EQL: Replace ?"..." with """...""" for unescaped strings (#62539) (#63174)
Use triple double quotes enclosing a string literal to interpret it
as unescaped, in order to use `?` for marking query params and avoid
user confusion. `?` also usually implies regex expressions.

Any character inside the `"""` beginning-closing markings is considered
raw and the only thing that is not permitted is the `"""` sequence itself.
If a user wants to use that, they need to resort to the normal `"` string literal
and use proper escaping.
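
As an illustration of the rule above, a small sketch that extracts the raw content of a `"""..."""` literal and rejects an embedded `"""` sequence. The class name and error messages are hypothetical; the real EQL parser is grammar-based and is not shown here.

```java
// Hypothetical illustration of the raw-string rule; not the actual EQL parser.
final class RawStringSketch {

    private static final String QUOTES = "\"\"\"";

    // Returns the content between the triple quotes exactly as written.
    static String unquoteRaw(String literal) {
        if (literal.length() < 6 || !literal.startsWith(QUOTES) || !literal.endsWith(QUOTES)) {
            throw new IllegalArgumentException("not a triple-quoted literal");
        }
        String body = literal.substring(3, literal.length() - 3);
        // The only sequence that is not permitted inside the markers.
        if (body.contains(QUOTES)) {
            throw new IllegalArgumentException("raw string must not contain \"\"\"");
        }
        return body; // no escape processing at all
    }
}
```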

Relates to #61659

(cherry picked from commit d87c2ca2eacab5552bca1e520d33cf71da40bcfd)
2020-10-02 14:58:50 +02:00
Benjamin Trent cfcf973259
[7.x] [ML] renames */inference* apis to */trained_models* (#63097) (#63136)
* [ML] renames */inference* apis to */trained_models* (#63097)

This commit renames all `inference` CRUD APIs to `trained_models`.

This aligns with internal terminology, documentation, and use-cases.
2020-10-02 07:34:28 -04:00
Benjamin Trent 535f8a434b
Revert "[ML] adding `baseline` field to total_feature_importance objects (#63098) (#63125)" (#63144)
This reverts commit 95242eccee.
2020-10-02 07:03:15 -04:00
Luca Cavanna a42a516b67 Shorten runtime field type class names (#63123)
In the codebase there is an unwritten convention that classes that extend `MappedFieldType` are generally called `*FieldType`. With this commit we adopt the same convention for runtime field types, which allows us to shorten their names by removing the `Mapped` portion, which is implicit.
2020-10-02 11:25:25 +02:00
Ioannis Kakavas e91f66e22f
Ensure domain_name setting for AD realm is present (#61983) (#63159)
We would only check for a null value and not for an empty string so
that meant that we were not actually enforcing this mandatory
setting. This commit ensures we check for both and fail
accordingly, if necessary, on startup.
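
The difference between the old and new check can be sketched as follows. The method name, exception type, and message are assumptions made for illustration only.

```java
// Hypothetical illustration of validating a mandatory setting; not the realm's actual code.
final class DomainNameCheckSketch {

    // Rejects both a missing (null) and an empty value, so the setting is truly mandatory.
    static String requireDomainName(String value) {
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("missing [domain_name] setting");
        }
        return value;
    }
}
```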
2020-10-02 12:16:08 +03:00
David Kyle 279f951700
[ML] Set parent task Id on ml expired data removers (#62854) (#62966)
Setting the parent task Id (of the delete expired data action) on the ML
expired data removers makes it easier to track and cancel long running
tasks
2020-10-02 10:14:10 +01:00
Ioannis Kakavas d9d024c17f
Update bcfips in plugin-cli (#63149) (#63157)
In 63099 we updated the bcfips version we use in tests to 1.0.2.
However, we bundle bcfips and bcpg-fips in plugin-cli and we should
update this too.
2020-10-02 11:41:26 +03:00
Rafi Estrada 7c122498bd [Docs] Correct typo (#63102) 2020-10-02 10:16:44 +02:00
Christoph Büscher 4c7c540ca1 Update version field yml test skip version (#63139) 2020-10-02 10:01:27 +02:00
Przemyslaw Gomulka ee500c10b9
[doc] Rounding range query rules backport(#63109) (#63155)
Documentation explaining the defaulting of missing fields when using the date math parser.
relates #62268
2020-10-02 09:40:01 +02:00
Przemyslaw Gomulka eb630e599d
Allow passing versioned media types to 7.x server (#63071)
7.x client can pass media type with a version which will return a 7.x
version of the api in ES 8.
In ES server 7 this media type should be accepted, but it serves the same
version of the API (7.x)
relates #61427
2020-10-02 09:17:11 +02:00
Costin Leau 614f4c13a5 EQL: Introduce case-insensitive equality (#63121)
Introduce : operator for doing case insensitive string comparisons.
Recognizes "*" for wildcard matches in string literals.
Restricted only to string types.
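
A sketch of what such a comparison could look like, assuming `*` matches any run of characters and everything else is compared literally without regard to case. This is a regex-based stand-in with hypothetical names, not the operator's actual implementation.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of case-insensitive matching with "*" wildcards.
final class InsensitiveEqualsSketch {

    static boolean matches(String value, String pattern) {
        // Split on '*' (keeping empty trailing parts) and quote the literal pieces.
        String[] parts = pattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE | Pattern.DOTALL;
        return Pattern.compile(regex.toString(), flags).matcher(value).matches();
    }
}
```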

Relates #62941

(cherry picked from commit 201e577e65f26a9b958a6197fe6c7268da39de29)
2020-10-02 00:23:08 +03:00
Jake Landis d864d14832
[7.x] Introduce changes related to yamlRestCompatTest (#62985) (#63105)
Backport for #62985 that includes the related changes, but
not the actual plugin for yamlRestCompatTest. The plugin
is not necessary in 7.x; backporting the relevant changes
helps keep the 7.x code in line with master.
2020-10-01 15:29:57 -05:00
William Brafford 6899ce6309
System index auto-creation should not be disabled by user settings (#62984) (#63147)
* Add System Indices check to AutoCreateIndex

By default, Elasticsearch auto-creates indices when a document is
submitted to a non-existent index. There is a setting that allows users
to disable this behavior. However, this setting should not apply to
system indices, so that Elasticsearch modules and plugins are able to
use auto-create behavior whether or not it is exposed to users.

This commit constructs the AutoCreateIndex object with a reference to
the SystemIndices object so that we bypass the check for the user-facing
autocreate setting when it's a system index that is being autocreated.

We also modify the logic in TransportBulkAction to make sure that if a
system index is included in a bulk request, we don't skip the
autocreation step.
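
The decision described above can be sketched roughly as follows. The set of names stands in for the SystemIndices object, whose real matching logic is not shown in this log; all names here are hypothetical.

```java
import java.util.Set;

// Hypothetical illustration of bypassing the user-facing setting for system indices.
final class AutoCreateSketch {

    private final boolean userAutoCreateEnabled; // the user-facing auto-create setting
    private final Set<String> systemIndexNames;  // stand-in for the SystemIndices reference

    AutoCreateSketch(boolean userAutoCreateEnabled, Set<String> systemIndexNames) {
        this.userAutoCreateEnabled = userAutoCreateEnabled;
        this.systemIndexNames = systemIndexNames;
    }

    boolean shouldAutoCreate(String index, boolean indexExists) {
        if (indexExists) {
            return false; // nothing to create
        }
        if (systemIndexNames.contains(index)) {
            return true; // system indices ignore the user setting
        }
        return userAutoCreateEnabled;
    }
}
```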
2020-10-01 16:26:07 -04:00
Igor Motov fc13b72cea
Extract histogramFieldDocValues into a utility class (#63100) (#63148)
This function will be needed in the upcoming rate aggs tests.
2020-10-01 15:44:37 -04:00
Marios Trivyzas 3ad4b00c7e
EQL: Clean grammar from `fork` (#63094) (#63138)
Since `fork` is not used, is undocumented in Python EQL and
there is no plan at the moment to implement it in the future,
remove it from the grammar. Users will get parsing exceptions
instead of higher level messages about unsupported features
which can lead to wrong expectations.

(cherry picked from commit f6a0f8f01c1b1893bab86629d1de73e9f9dae8dc)
2020-10-01 21:14:41 +02:00
Lee Hinman f0f0da2188
[7.x] Add telemetry for data tiers (#63031) (#63140)
Backports the following commits to 7.x:

    Add telemetry for data tiers (#63031)
2020-10-01 12:37:32 -06:00
Igor Motov 6a9cde2918
Add support for x_opaque_id to _cat/tasks (#63036) (#63135)
Adds an optional column with support for x_opaque_id to _cat/tasks API.

Closes #61118
2020-10-01 13:17:46 -04:00
Tim Brooks 7f6d1981a1
Transfer network bytes to smaller buffer (#62673)
Currently we read in 64KB blocks from the network. When TLS is not
enabled, these bytes are normally passed all the way to the application
layer (some exceptions: compression). For the HTTP layer this means that
these bytes can live throughout the entire lifecycle of an indexing
request.

The problem is that if the reads from the socket are small, this means
that 64KB buffers can be consumed by 1KB or smaller reads. If the socket
buffer or TCP buffer sizes are small, this leads to massive memory
waste. It has been identified as a major source of OOMs on coordinating
nodes as Elasticsearch easily exhausts the heap for these network bytes.

This commit resolves the problem by placing a handler after the TLS
handler to copy these bytes to a more appropriate buffer size as
necessary. This comes after TLS, because TLS is a framing layer which
often resolves this problem for us (the 64KB buffer will be decoded
into a more appropriate buffer size). However, this extra handler will
solve it for the non-TLS pipelines.
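
The copy step can be illustrated with plain `java.nio` buffers. This is only a sketch of the idea, not the pipeline handler itself; the "less than half full" threshold and all names are assumptions.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of moving a small read out of a large read buffer.
final class BufferCopySketch {

    static final int READ_BUFFER_SIZE = 64 * 1024;

    // Expects a buffer that is ready for reading (position..limit holds the bytes read).
    // Small reads are copied into an exactly sized buffer so the large one can be reused
    // instead of being pinned for the lifetime of the request.
    static ByteBuffer rightSize(ByteBuffer readBuffer) {
        int readable = readBuffer.remaining();
        if (readable >= readBuffer.capacity() / 2) {
            return readBuffer; // mostly full, copying would not save much
        }
        ByteBuffer copy = ByteBuffer.allocate(readable);
        copy.put(readBuffer.duplicate()); // duplicate() leaves the original position untouched
        copy.flip();
        return copy;
    }
}
```

With a 1KB read into a 64KB buffer, the retained buffer shrinks from 65536 bytes to 1024 bytes.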
2020-10-01 10:39:24 -06:00
Jake Landis 294f40de72
[7.x] Update TESTING.asciidoc for recent REST test changes (#62841) (#62895)
* Remove reference to Runner (no longer valid)
* Remove tests.rest (no longer valid)
* Add reference to javaRestTest
* Brief mention of qa tests
2020-10-01 11:02:29 -05:00
Jake Landis 0795f4b898
[7.x] Add network from MaxMind Geo ASN database (#61676) (#62898)
This adds the network property from the MaxMind Geo ASN database. 
This enables analysis of IP data based on the subnets that MaxMind have 
previously identified for ASN networks.

closes #60942

Co-authored-by: Peter Ansell <p_ansell@yahoo.com>
2020-10-01 11:01:44 -05:00
Benjamin Trent 95242eccee
[ML] adding `baseline` field to total_feature_importance objects (#63098) (#63125)
This adds a new `baseline` field to the feature importance values. 

This field contains the baseline importance for a given feature and class.
2020-10-01 09:48:07 -04:00
Dan Hermann fbf552d24c
Add country_name to the default properties of geoip ingest processor (#62915) (#63124) 2020-10-01 08:47:51 -05:00
Dimitris Athanasiou 46c3973400
[7.x][ML] Remove direct access to system index from filter_crud REST test (#63111) (#63115)
This test accesses system indices for 2 reasons.

First, it creates a filter that has a different type. This was done
to assert that filter is not returned from the APIs. However,
now that access to the `.ml-meta` index is restricted,
it is not really a concern.

Second, it creates a `.ml-meta` index without mappings to test
the get API does not fail due to lack of mappings on a sorted field,
namely the `filter_id`. Once again, this test is less useful once
system indices have restricted access.

Relates #62501

Backport of #63111
2020-10-01 15:15:34 +03:00
Ioannis Kakavas 2ea3073a5e
Use BCFIPS 1.0.2 in our CI (#63116)
Bouncy Castle's BC-FJA-1.0.2 has been certified for a while now
but we had noticed that it seems to be rather entropy hungry and
ES would start very slowly (and tests would take forever)
because of blocking calls to /dev/random.

We verified that this is resolved when enabling hw RNG or a
software one like haveged. While rng-tools should be suggested for
production uses, our ephemeral workers have haveged installed
which should work just fine for CI.

Backport of 63099
2020-10-01 14:48:53 +03:00
Costin Leau c2992ea287 EQL: Fix NPE from incorrect use of ids search (#63032)
This fixes a bug introduced when moving from mget to ids query. While
mget returns all the ids given, the ids query is a search query and thus by
default returns only 10 documents.
The fix correctly sets the expected size so all the information is
returned inside the response.

Fix #63030

(cherry picked from commit 09ba85548a0142a1fe8376efea9cc4e7764a207c)
2020-10-01 13:49:58 +03:00
Hendrik Muhs e001b4c021 [Transform] fix time rounding in TransformContinuousIT (#63113)
fix a time rounding problem in the test, due to rounding down to epoch
seconds instead of epoch millis
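
To illustrate the kind of mismatch described, a small sketch comparing a timestamp rounded down to epoch seconds with the same timestamp kept at millisecond precision. The names are hypothetical and this is not the test's actual code.

```java
import java.time.Instant;

// Hypothetical illustration of losing sub-second precision.
final class RoundingSketch {

    // Rounds down to whole seconds, then expresses the result in milliseconds.
    static long viaEpochSeconds(Instant t) {
        return t.getEpochSecond() * 1000L;
    }

    // Keeps millisecond precision.
    static long viaEpochMillis(Instant t) {
        return t.toEpochMilli();
    }
}
```

For a timestamp of 1601545430999 ms the two values differ by 999 ms, enough to place a document in a different time bucket than expected.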

fixes #62951
2020-10-01 11:43:50 +02:00
Ignacio Vera ba5574935e
Remove dependency of Geometry queries with mapped type names (#63077) (#63110)
It extracts the query capabilities from AbstractGeometryFieldType into two new interfaces, GeoshapeQueryable and ShapeQueryable. Those interfaces are implemented by the final mappers.
2020-10-01 10:49:12 +02:00
James Rodewig 700bfb156d
[DOCS] EQL: date_nanos timestamp is not supported (#63101) (#63103) 2020-09-30 17:45:00 -04:00
Howard 8c6e197f51 Remove allocation id from engine (#62680)
We no longer need the allocation id in Engine.
2020-09-30 15:28:27 -04:00
Marios Trivyzas f69d268500
SQL: Allow skip of bwc tests on `check` task (#62936) (#63089)
Bwc tests can consume much time to build and to run so it's nice to be
able to skip them when running the `check` task on the SQL module.

Introduce a new task `checkNoBwc` so one can use:
```
./gradlew -p x-pack/plugin/sql checkNoBwc
```
to skip them.

(cherry picked from commit a52e1846f338f6869273181c6f248579581fa68c)
2020-09-30 20:03:19 +02:00
Marios Trivyzas 0ebaf8a3ec
EQL: Allow escaped backquote in identifiers (#62932) (#63082)
Previously, a backquote couldn't be used inside an escaped identifier,
e.g.:
```
`my`identifier` = "some_value"
```
was not allowed. Introduce escaping of the backquote by using a
double backquote:
```
`my``identifier` = "some_value"
```
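
A sketch of how a doubled backquote could be unescaped when reading such an identifier. The class name and error messages are hypothetical, and the real handling lives in the EQL grammar and parser rather than in a helper like this.

```java
// Hypothetical illustration of unescaping a backquoted identifier.
final class BackquoteSketch {

    static String unquoteIdentifier(String quoted) {
        int n = quoted.length();
        if (n < 2 || quoted.charAt(0) != '`' || quoted.charAt(n - 1) != '`') {
            throw new IllegalArgumentException("identifier is not backquoted");
        }
        String body = quoted.substring(1, n - 1);
        StringBuilder sb = new StringBuilder(body.length());
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '`') {
                // A backquote inside the identifier must be doubled.
                if (i + 1 >= body.length() || body.charAt(i + 1) != '`') {
                    throw new IllegalArgumentException("unescaped backquote in identifier");
                }
                i++; // skip the second backquote of the pair
            }
            sb.append(c);
        }
        return sb.toString();
    }
}
```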

(cherry picked from commit 49514121486f42a58674b3e5901de4021fda5c15)
2020-09-30 19:10:09 +02:00
James Rodewig e91e5ff6d7
[DOCS] Document escaped backticks for identifiers (#63079) (#63084) 2020-09-30 12:26:20 -04:00
Alan Woodward 4fe09b4bf0 Convert test field mappers to parametrized forms (#63018)
Relates to #62988
2020-09-30 16:59:35 +01:00
Alan Woodward 675d18f6ea Convert dense/sparse vector field mappers to Parametrized form (#62992)
Also adds a proper MapperTestCase test for dense vectors.

Relates to #62988
2020-09-30 16:55:28 +01:00
Lisa Cawley 3838fe1fd4 [DOCS] Add experimental tag to inference processor and bucket aggregation (#63023) 2020-09-30 08:51:26 -07:00
István Zoltán Szabó 0655d9e8ac
[DOCS] Adds limitation item about using scripts in transforms (#63021) (#63075) 2020-09-30 16:25:48 +02:00
James Rodewig e179b89085
[DOCS] Clarify that v2.0+ hyphenation files aren't supported (#60579) (#63073)
Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com>

Co-authored-by: jgkirschbaum <juergen.kirschbaum@gmail.com>
2020-09-30 09:28:44 -04:00
Mayya Sharipova abfae16517 Unmute test DeleteByQueryConcurrentTests
Unmute DeleteByQueryConcurrentTests
testConcurrentDeleteByQueriesOnDifferentDocs test.

LUCENE-9449 introduced a bug in sorting on _doc,
which resulted in the failure of this test. As the Lucene bug
has been fixed, this re-enables the test.

Closes #62609
2020-09-30 09:06:42 -04:00
James Rodewig 803f1ec897
[DOCS] Updated target_field description of the json ingest processor (#61968) (#63068)
Co-authored-by: Dan Hermann <danhermann@users.noreply.github.com>
Co-authored-by: Jakob Reiter <jakommo@users.noreply.github.com>
2020-09-30 09:04:59 -04:00