Commit Graph

4750 Commits

Author SHA1 Message Date
Albert Zaharovits cc1fce96ba
Add a new async search security origin (#52141)
This commit adds a new security origin, and an associated reserved user
and role, named `_async_search`, which can be used by internal clients to
manage the `.async-search-*` restricted index namespace.
2020-02-11 19:58:06 +02:00
James Rodewig d68a4ec82e
[7.x] Permit EQL feature flag in release builds (#52201) (#52214)
7.x backport of #52201

Provides a path to set register the EQL feature flag in release builds.
This enables EQL in release builds so that release docs tests pass.

Release docs tests do not have infrastructure in place to only register
snippets from included portions of the docs, they instead include all
docs snippets.

Since EQL can not be enabled in release builds, this meant that the EQL
snippets fail in the release docs tests.

This adds the ability to enable EQL in the release docs tests. This
system property will be removed when EQL is ready for release.
2020-02-11 11:49:49 -05:00
Hendrik Muhs 098380e483 Percentiles aggregation validation checks for range (#51871)
disallow to specify percentile out of range [0,100]. This also fixes a problem in transform by failing
validation if an invalid percentile configuration is used.
2020-02-11 17:25:39 +01:00
David Roberts d1d9c40e71 [ML] Switch poor categorization audit warning to use status field (#52195)
In #51146 a rudimentary check for poor categorization was added to
7.6.

This change replaces that warning based on a Java-side check with
a new one based on the categorization_status field that the ML C++
sets.  categorization_status was added in 7.7 and above by #51879,
so this new warning based on more advanced conditions will also be
in 7.7 and above.

Closes #50749
2020-02-11 15:33:27 +00:00
David Roberts 473468d763 [ML] Better error when persistent task assignment disabled (#52014)
Changes the misleading error message when attempting to open
a job while the "cluster.persistent_tasks.allocation.enable"
setting is set to "none" to a clearer message that names the
setting.

Closes #51956
2020-02-11 15:23:21 +00:00
Igor Motov 667e1a5225
Add Boxplot Aggregation (#52174)
Adds a `boxplot` aggregation that calculates min, max, medium and the first
and the third quartiles of the given data set.

Closes #33112
2020-02-11 09:38:17 -05:00
Marios Trivyzas 204d086266 SQL: Fix issue with timezone when paginating (#52101)
Previously, when the specified (or default) fetchSize led to
subsequent HTTP requests and the usage of cursors, those subsequent
were no longer using the client timezone specified in the initial
SQL query. As a consequence, Even though the query is executed once
(with the correct timezone) the processing of the query results by
the HitExtractors in the next pages was done using the default
timezone Z. This could lead to incorrect results.

Fix the issue by correctly using the initially specified timezone,
which is found in the deserialisation of the cursor string.

Fixes: #51258
(cherry picked from commit 8f7afbdeb9295999b48a6c36db5b31cbe0cee432)
2020-02-11 15:27:56 +01:00
Yang Wang 16ba59e9d1
Expose more authentication info to ingest pipeline (#51305) (#52119)
The changes add more granularity for identiying the data ingestion user.
The ingest pipeline can now be configure to record authentication realm and
type. It can also record API key name and ID when one is in use. 
This improves traceability when data are being ingested from multiple agents
and will become more relevant with the incoming support of required
pipelines (#46847)

Resolves: #49106
2020-02-11 23:05:01 +11:00
Tim Vernum b0b1b13311
Extract class to store Authentication in context (#52183)
This change extracts the code that previously existed in the
"Authentication" class that was responsible for reading and writing
authentication objects to/from the ThreadContext.

This is needed to support multiple authentication objects under
separate keys.

This refactoring highlighted that there were a large number of places
where we extracted the Authentication/User objects from the thread
context, in a variety of ways. These have been consolidated to rely on
the SecurityContext object.

Backport of: #52032
2020-02-11 20:59:06 +11:00
Dimitris Athanasiou 6086fadf00
[7.x][ML] Prepare to hold additional stats in DF Analytics task (#52134) (#52187)
Refactors `DataFrameAnalyticsTask` to hold a `StatsHolder` object.
That just has a `ProgressTracker` for now but this is paving the
way to add additional stats like memory usage, analysis stats, etc.

Backport #52134
2020-02-11 11:18:45 +02:00
Martijn van Groningen c14e4666df
Wait for watcher to be started prior to rolling upgrade tests. (#52186)
Backport: #52139

In the rolling upgrade tests, watcher is manually executed,
in rare scenarios this happens before watcher is started,
resulting in the manual execution to fail.

Relates to #33185
2020-02-11 09:39:20 +01:00
Dimitris Athanasiou cbebc26f50
[7.x][ML] Retry persisting DF Analytics results (#52048) (#52160)
Employs `ResultsPersisterService` from `DataFrameRowsJoiner` in order
to add retries when a data frame analytics job is persisting the results
to the destination data frame.

Backport of #52048
2020-02-11 09:55:00 +02:00
Andrei Stefan 2f1631d9d0
Telemetry data initial implementation (#51715) (#52175)
(cherry picked from commit f1d1cceacaacf226fcd2459f34689843b822fe4b)
2020-02-11 09:15:47 +02:00
Marios Trivyzas 6b600855a9
SQL: Make parsing of date more lenient (#52137)
Make the parsing of date more lenient

- as an escaped literal: `{d '2020-02-10[[T| ]10:20[:30][.123456789][tz]]'}`
- cast a string to a date: `CAST(2020-02-10[[T| ]10:20[:30][.123456789][tz]]' AS DATE)`

Closes: #49379
(cherry picked from commit 5863b27500d5e7f6cdd8c6c62b09b84e53ca724a)
2020-02-10 21:47:00 +01:00
Julie Tibshirani 28a8db730f In FieldTypeLookup, factor out flat object field logic. (#52091)
Currently, the logic for looking up `flattened` field types lives in the
top-level `FieldTypeLookup`. This PR moves it into a dedicated class
`DynamicKeyFieldTypeLookup`.
2020-02-10 10:44:02 -08:00
Bogdan Pintea 7b58ed0dd7
Fix milliseconds handling in intervals (#51675) (#52156)
This fixes:

- the parsing of milliseconds in intervals: everything past the . used to be converted as-is to milliseconds, with no normalisation of the unit; thus, a value of .23 ended up as 23 millis in the interval, instead of 230.
- the printing of a trailing .0, in case the interval lacks the fractional part;
- tests generating a random millisecond value used to simply print it in the string about to be evaluated without a necessary front-filling of 0[s], where the amount was below 100/10.

(The combination of first and last issues above, plus statistical "luck" made the incorrect handling pass the tests.)

(cherry picked from commit 4de8c64f63ee37c1bcfdb9b9d3a07d09be243222)
2020-02-10 19:24:26 +01:00
Lee Hinman 37a2e9bac6
[7.x] Allow forcemerge in the hot phase for ILM policies (#520… (#52083)
* Allow forcemerge in the hot phase for ILM policies

This commit changes the `forcemerge` action to also be allowed in the `hot` phase for policies. The
forcemerge will occur after a rollover, and allows users to take advantage of higher disk speeds for
performing the force merge (on a separate node type, for example).

On caveat with this is that a `forcemerge` in the `hot` phase *MUST* be accompanied by a `rollover`
action. ILM validates policies to ensure this is the case.

Resolves #43165

* Use anyMatch instead of findAny in validation

* Make randomTimeseriesLifecyclePolicy single-pass
2020-02-10 08:54:49 -07:00
Przemysław Witek c7cc383d33
[7.x] Update persistent state document in the index the document belongs to (#51751) (#52145) 2020-02-10 16:32:34 +01:00
Martijn van Groningen c77b80f01e
Unmute smoke test monitoring with watcher. (#52140)
Backport of #51490
2020-02-10 15:13:32 +01:00
Nhat Nguyen 864e9d875d Bubble up exception in follow task in ccr tests (#52085)
It's perfectly fine if a bulk request on the follower hits 
IndexShardClosedException in some CCR tests because we sometimes 
close some follower shards while the follow-task is replicating operations.
Instead of failing the test immediately, this commit bubbles up that
failure to the shard follow task.

Closes #52052
2020-02-10 08:27:04 -05:00
Marios Trivyzas 27265f032a SQL: Enhance timestamp escaped literal parsing (#52097)
Allow also whitespace ` ` (together with `T`) as a separator between
date and time parts of the timestamp string. E.g.:
```
{ts '2020-02-08 12.10.45'}
```
or
```
{ts '2020-02-08T12.10.45'}
```

Fixes: #46069
(cherry picked from commit 07c977023fb8ceab5991c359a6cbfe07beaad9bb)
2020-02-10 11:24:55 +01:00
Tim Vernum 4e4815355a Mute DocumentSubsetBitsetCacheTests.testCacheUnderConcurrentAccess (#52135)
Test does not always complete in expected time.

Relates: #51914
Backport of: #52122
2020-02-10 21:19:18 +11:00
Andrei Stefan fa4dcd50d9 Extract common optimization rules for QL (#52054) (#52132)
(cherry picked from commit ee43115531234c2d955193ce0c9c268e1f02ab43)
2020-02-10 11:48:45 +02:00
Ignacio Vera 80e3c97210 Upgrade to lucene-8.5.0-snapshot-d62f6307658 (#52039) (#52130) 2020-02-10 10:13:22 +01:00
David Roberts 1cefafdd14 [ML] Add new categorization stats to model_size_stats (#52009)
This change adds support for the following new model_size_stats
fields:

- categorized_doc_count
- total_category_count
- frequent_category_count
- rare_category_count
- dead_category_count
- categorization_status

Backport of #51879
2020-02-10 09:10:50 +00:00
Jay Modi 3edadfefd0 RestHandlers declare handled routes (#52123)
This commit changes how RestHandlers are registered with the
RestController so that a RestHandler no longer needs to register itself
with the RestController. Instead the RestHandler interface has new
methods which when called provide information about the routes
(method and path combinations) that are handled by the handler
including any deprecated and/or replaced combinations.

This change also makes the publication of RestHandlers safe since they
no longer publish a reference to themselves within their constructors.

Closes #51622

Co-authored-by: Jason Tedor <jason@tedor.me>

Backport of #51950
2020-02-09 22:48:32 -07:00
Ioannis Kakavas 8c0b49cd32 Adjust jarHell and 3rd party audit exclusions (#51733) (#51766)
Now that the FIPS 140 security provider is simply a test dependency
we don't need the thirdPartyAudit exceptions, but plugin-cli and
transport-netty4 do need jarHell disabled as they use the non fips
BouncyCastle security provider as a test dependency too.
2020-02-10 07:38:59 +02:00
Nhat Nguyen dc143d59c8 Increase shard inactive time to 1h in upgrade tests (#52051)
Similar to the fix in #51651, this commit increases the shard inactive 
timeout for x-pack.

Closes #52031
2020-02-09 23:25:21 -05:00
Tim Vernum d5c015062d
Don't allow null User.principal (#52049)
Some parts of the User class (e.g. equals/hashCode) assumed that
principal could never be null, but the constructor didn't enforce
that.

This adds a null check into the constructor and fixes a few tests that
relied on being able to pass in null usernames.

Backport of: #51988
2020-02-10 12:23:55 +11:00
Jason Tedor 2b99291187
Add autoscaling feature flag in release REST tests (#52096)
The REST tests for autoscaling either need to be skipped in a
non-snapshot build, or alternatively, the feature flag registered so
that autoscaling can be enabled. We prefer the latter approach, as it
allows us to also test autoscaling in non-snapshot builds incrementally,
instead of at the end of development as autoscaling prepares for
release. This commit registers the autoscaling feature flag in REST
tests for non-snapshot builds.
2020-02-09 15:49:01 -05:00
Armin Braun 90eb6a020d Remove Redundant Loading of RepositoryData during Restore (#51977) (#52108)
We can just put the `IndexId` instead of just the index name into the recovery soruce and
save one load of `RepositoryData` on each shard restore that way.
2020-02-09 21:44:18 +01:00
Marios Trivyzas 3e7f939f63
SQL: [Tests] Add more tests for aggs and literals (#52086)
Add some more tests where more than one literal is selected,
unaliased and aliased.

Follows: #42121
(cherry picked from commit 405271d408a233e697eb2e9ded3005a71f4df5e7)
2020-02-09 18:01:05 +01:00
Costin Leau 214beed90f QL: move query AST from SQL to QL (#52069)
(cherry picked from commit 59368968b698652352be1bb2a60d5a357a01b978)
2020-02-08 23:10:51 +02:00
Jason Tedor 8b1d2c5b95
Permit autoscaling feature flag in release builds (#52088)
This commit provides a path to set register the autoscaling feature flag
in release builds, and therefore enabling autoscaling in release
builds. The primary reason that we add this is so that our release docs
tests can pass. Our release docs tests do not have infrastructure in
place to only register snippets from included portions of the docs, they
instead include all docs snippets. Since autoscaling can not be enabled
in release builds, this meant that the autoscaling snippets would fail
in the release docs tests. To address then, we need the ability to
enable autoscaling in the release docs tests which we can now do with
the system property added here. This system property will be removed
when autoscaling is ready for release.
2020-02-07 21:40:51 -05:00
Benjamin Trent dffcd021df
[7.x] [ML] Add bwc serialization unit test scaffold (#51889) (#52061)
* [ML] Add bwc serialization unit test scaffold (#51889)

Adds new `AbstractBWCSerializationTestCase` which provides easy scaffolding for BWC serialization unit tests.

These are no replacement for true BWC tests (which execute actual old code). These tests do provide some good coverage for the current code when serializing to/from old versions.

* removing unnecessary override for 7.series branch

* adding necessary import

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-07 17:17:11 -05:00
Benjamin Trent c6111eb90e
[ML][Inference] adding number_samples to TreeNode (#51937) (#52060)
in preparation for feature importance and split information gain, adding `number_samples` field to `TreeNode` definition.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-07 17:04:58 -05:00
Julie Tibshirani 337d73a7c6 Rename MapperService#fullName to fieldType.
The new name more accurately describes what the method returns.
2020-02-07 10:35:53 -08:00
Emanuele Sabellico 282e919607 SQL: [Tests] Add integ tests for selecting a literal and an aggregate (#42121)
The related issue regarding aggregation queries where some literals
are also selected together with aggregate function has been fixed
with #49570. Add integration tests to verify the behavior.

Relates to: #41411

(cherry picked from commit 9f414a8d05c75e1a9f8250084f6dcd634d5d78d8)
2020-02-07 19:00:15 +01:00
Albert Zaharovits 4add82d966 Mute CoreFullClusterRestartIT testRecovery (#52038)
Relates #52031
2020-02-07 13:35:43 +02:00
David Kyle 8f10a7c6ca [ML] Make Ensemble feature names optional (#51996)
The featureNames field is requisite in individual models but is not required by the Ensemble.
2020-02-07 10:08:37 +00:00
Armin Braun 91e938ead8
Add Trace Logging of REST Requests (#51684) (#52015)
Being able to trace log all REST requests to a node would make debugging
a number of issues a lot easier.
2020-02-07 09:03:20 +01:00
Jason Tedor 25daf5f1e1
Add autoscaling API skelton (#51564)
The main purpose of this commit is to add a single autoscaling REST
endpoint skeleton, for the purpose of starting to build out the build
and testing infrastructure that will surround it. For example, rather
than commiting a fully-functioning autoscaling API, we introduce here
the skeleton so that we can start wiring up the build and testing
infrastructure, establish security roles/permissions, an so on. This
way, in a forthcoming PR that introduces actual functionality, that PR
will be smaller and have less distractions around that sort of
infrastructure.
2020-02-06 21:55:01 -05:00
Andrei Stefan 488944f4a1
SQL: Handle uberjar scenario where the ES jdbc driver file is bundled in another jar (#51856) (#52024)
(cherry picked from commit 6247b0793c9db19a8a9fa6f0164cc14d0debed6e)
2020-02-07 04:15:59 +02:00
Benjamin Trent 846f87a26e
[ML] allow close/stop for jobs/datafeeds with missing configs (#51888) (#51997)
If the configs are removed (by some horrific means), we should still allow tasks to be cleaned up easily.

Datafeeds and jobs with missing configs are now visible in their respective _stats calls and can be stopped/closed.
2020-02-06 12:10:18 -05:00
Hendrik Muhs 03fb5cdaae fallback to float if source type is scaled_float for mapping deduction (#51990)
fallback to float if source type is scaled_float for mapping deduction of min/max aggregation

fixes #51780
2020-02-06 17:27:26 +01:00
Martijn Laarman 898dd0b9cc Cat.ml.* introduces an additional depths to namespace API's (#51981)
Not all clients support this e.g if the java high level rest client were
to map this it would look like `client.cat().ml().api()` which hinders
discoverability.

(cherry picked from commit 21cdabf09dc8305ce2f5e3b6cb193f67137d8bdb)
2020-02-06 13:16:59 +01:00
Jim Ferenczi 0f333c89b9
Always rewrite search shard request outside of the search thread pool (#51708) (#51979)
This change ensures that the rewrite of the shard request is executed in the network thread or in the refresh listener when waiting for an active shard. This allows queries that rewrite to match_no_docs to bypass the search thread pool entirely even if the can_match phase was skipped (pre_filter_shard_size > number of shards). Coordinating nodes don't have the ability to create empty responses so this change also ensures that at least one shard creates a full empty response while the other can return null ones. This is needed since creating true empty responses on shards require to create concrete aggregators which would be too costly to build on a network thread. We should move this functionality to aggregation builders in a follow up but that would be a much bigger change.
This change is also important for #49601 since we want to add the ability to use the result of other shards to rewrite the request of subsequent ones. For instance if the first M shards have their top N computed, the top worst document in the global queue can be pass to subsequent shards that can then rewrite to match_no_docs if they can guarantee that they don't have any document better than the provided one.
2020-02-06 10:53:11 +01:00
Lisa Cawley 53bd88ea8c [DOCS] Adds tip for elastic built-in user (#51891) 2020-02-05 18:56:23 -08:00
Jason Tedor 12473c2bcb
Log failure when cleaning shard follow task (#51971)
When clenaing a shard follow task after an index has been deleted, an
exception can occur submitting the complete persistent task
action. However, this exception message is not logged. This commit
addresses this by including the exception that led to the failure in the
log message.
2020-02-05 20:48:00 -05:00
Tanguy Leroux d86a7ad6d2 Give more time to AutoFollowIT tests (#51938)
AutoFollowIT tests are regularly failing on CI because they rely 
on how cluster state updates are processed within the integration 
clusters. We tried to limit this in #49141 by moving to latches 
instead of waiting for assertions to pass but there are still some 
places were it still need to wait for the cluster state updates to 
be processed and auto-follow stats to be updated.

This commit gives more time to assertBusy() that verifies the 
AutoFollowStats (up to 60 seconds) and also always log the 
auto-follow stats in case the assertions failed.

Closes #48982
2020-02-05 15:57:27 +01:00
Costin Leau bd6d9e063c EQL: Add missing commit messages for #51940
* EQL: Plug query params into the AstBuilder (#51886)

As the eventType is customizable, plug that into the parser based on the
given request.

(cherry picked from commit 5b4a3a3c07eacbc339cbd4c05a3621d056cc8d60)

* EQL: Add field resolution and verification (#51872)

Add basic field resolution inside the Analyzer and a basic Verifier to
check for any unresolved fields.

(cherry picked from commit 7087358ae2fb212811d480ec8641a46167946c82)

* EQL: Introduce basic execution pipeline (#51809)

Add main classes that form the 'execution' pipeline are added - most of
them have no functionality; the purpose of this PR is to add flesh out
the contract between the various moving parts so that work can start on
them independently.

(cherry picked from commit 9a1bae50a49af7fe8467b74b154c0d82c6bb9a19)

* EQL: Add AstBuilder to convert to QL tree (#51558)

* EQL: Add AstBuilder visitors
* EQL: Add tests for wildcards and sets
* EQL: Fix licensing
* EQL: Fix ExpressionTests.java license
* EQL: Cleanup imports
* EQL: PR feedback and remove LiteralBuilder
* EQL: Split off logical plan from expressions
* EQL: Remove stray import
* EQL: Add predicate handling for set checks
* EQL: Remove commented out dead code
* EQL: Remove wildcard test, wait until analyzer

(cherry picked from commit a462700f9c8e1fb977d62d42eb0077403b8fa98b)

* EQL grammar updates and tests (#49658)

* EQL: Additional tests and grammar updates
* EQL: Add backtick escaped identifiers
* EQL: Adding keywords to language
* EQL: Add checks for unsupported syntax
* EQL: Testing updates and PR feedback
* EQL: Add string escapes
* EQL: Cleanup grammar for identifier
* EQL: Remove tabs from .eql tests

(cherry picked from commit 6f1890bf2d52cabdfd1e7848fb481cf54b895f25)
2020-02-05 16:53:42 +02:00
Costin Leau 6ff0e411a8
EQL: backport updates to 7.x (#51940) 2020-02-05 16:45:58 +02:00
Benjamin Trent 79f143907a
[7.x] [ML] add _cat/ml/trained_models API (#51529) (#51936)
* [ML] add _cat/ml/trained_models API (#51529)

This adds _cat/ml/trained_models.
2020-02-05 08:26:44 -05:00
Marios Trivyzas 64f9a2089b SQL: [Tests] add tests for literals and GROUP BY (#51878)
Add unit and integration tests where literals are SELECTed
in combination with GROUP BY and possibly aggregate functions.

Relates to #41411 and #34583
which have been fixed.

(cherry picked from commit b97f1ca12675d6ea4772c60578922fe1cc2409ee)
2020-02-05 12:55:56 +01:00
Ignacio Vera ababd730f6
Histogram field: Use #name() instead of #simpleName() when generating doc values (#51920) (#51927) 2020-02-05 12:35:49 +01:00
Yannick Welsch 60c93b6df5 Increase scroll timeout for upgrade test (#51912)
Bumps the timeout already bumped in #50195, which was insufficient.
2020-02-05 11:13:58 +01:00
Adrien Grand ad9d2f1922
Move analysis/mappings stats to cluster-stats. (#51875)
Closes #51138
2020-02-05 11:02:25 +01:00
debadair c0156cbb5d
Backporting updates to ILM org, overview, & GS (#51898)
* [DOCS] Align with ILM API docs (#48705)

* [DOCS] Reconciled with Snapshot/Restore reorg

* [DOCS] Split off ILM overview to a separate topic. (#51287)

* [DOCS} Split off overview to a separate topic.

* [DOCS] Incorporated feedback from @jrodewig.

* [DOCS] Edit ILM GS tutorial (#51513)

* [DOCS] Edit ILM GS tutorial

* [DOCS] Incorporated review feedback from @andreidan.

* [DOCS] Removed test link & fixed anchor & title.

* Update docs/reference/ilm/getting-started-ilm.asciidoc

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

* Fixed glossary merge error.

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-02-04 16:45:18 -08:00
Lee Hinman 0be61a3662
[7.x] Adding best_compression (#49974) (763480ee) (#51819)
* Adding best_compression (#49974)

This commit adds a `codec` parameter to the ILM `forcemerge` action. When setting the codec to `best_compression` ILM will close the index, then update the codec setting, re-open the index, and finally perform a force merge.

* Fix ForceMergeAction toSteps construction (#51825)

There was a duplicate force merge step and the test continued to fail. This commit clarifies the
`toStep` method and changes the `assertBestCompression` method for better readability.

Resolves #51822

* Update version constants

Co-authored-by: Sivagurunathan Velayutham <sivadeva.93@gmail.com>
2020-02-04 14:15:43 -07:00
Julie Tibshirani 38ce428831
Create a class to hold field capabilities for one index. (#51844)
Currently, the same class `FieldCapabilities` is used both to represent the
capabilities for one index, and also the merged capabilities across indices. To
help clarify the logic, this PR proposes to create a separate class
`IndexFieldCapabilities` for the capabilities in one index. The refactor will
also help when adding `source_path` information in #49264, since the merged
source path field will have a different structure from the field for a single index.

Individual changes:
* Add a new class IndexFieldCapabilities.
* Remove extra constructor from FieldCapabilities.
* Combine the add and merge methods in FieldCapabilities.Builder.
2020-02-04 11:24:57 -08:00
Hendrik Muhs b7aace44f3 mark transform API's stable (#51862)
mark transform API's stable, meaning making transform GA for the next minor release
2020-02-04 16:13:47 +01:00
David Roberts 9d55c45b5a [ML] Improve multiline_start_pattern for CSV in find_file_structure (#51737)
The work to switch file upload over to treating delimited files
like semi-structured text and using the ingest pipeline for CSV
parsing makes the multi-line start pattern used for delimited
files much more critical than it used to be.

Previously it was always based on the time field, even if that
was towards the end of the columns, and no multi-line pattern
was created if no timestamp was detected.

This change improves the multi-line start pattern by:

1. Never creating a multi-line pattern if the sample contained
   only single line records.  This improves the import
   efficiency in a common case.
2. Choosing the leftmost field that has a well-defined pattern,
   whether that be the time field or a boolean/numeric field.
   This reduces the risk of a field with newlines occurring
   earlier, and also means the algorithm doesn't automatically
   fail for data without a timestamp.
2020-02-04 12:37:48 +00:00
Hendrik Muhs c2b08bb72f [Transform] add support for percentile aggs (#51808)
make transform ready for multi value aggregations and add support for percentile

fixes #51663
2020-02-04 12:02:20 +01:00
Hendrik Muhs 5d5f3ce256 [Transform] improve irrecoverable error detection
treat resource not found and illegal argument exceptions as irrecoverable error

relates #50135
2020-02-04 10:36:35 +01:00
Benjamin Trent d293980a09
[7.x] [ML] add GET _cat/ml/datafeeds (#51500) (#51829)
* [ML] add GET _cat/ml/datafeeds (#51500)

This adds GET _cat/ml/datafeeds && _cat/ml/datafeeds/{datafeed_id}

* fixing for java8 compilation
2020-02-03 17:16:33 -05:00
Jonathan Budzenski 8fa4a40bdf [rest spec] fill in documentation links for security.{put,delete}_privileges (#48482)
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-02-03 10:53:50 -06:00
Jochem Wichers Hoeth 8aaca45922 Fix section header in Get API key information API doc (#51807) 2020-02-03 07:36:18 -08:00
James Rodewig 4ea7297e1e
[DOCS] Change http://elastic.co -> https (#48479) (#51812)
Co-authored-by: Jonathan Budzenski <jon@budzenski.me>
2020-02-03 09:50:11 -05:00
Dan Hermann 4083eae0b7
[7.x] Secure password for monitoring HTTP exporter (#51775)
Adds a secure and reloadable SECURE_AUTH_PASSWORD setting to allow keystore entries in the form "xpack.monitoring.exporters.*.auth.secure_password" to securely supply passwords for monitoring HTTP exporters. Also deprecates the insecure `AUTH_PASSWORD` setting.
2020-02-03 07:42:30 -06:00
Andrei Dan 81388051d8
Reenable testWhenUserLimitedByOnlyAliasOfIndexCanWriteToIndexWhichWasRolledoverByILMPolicy (#51768) (#51801)
We suspect the flakiness could’ve come from the fact that the rollover
step used to create the new index and roll the write alias to the new
index in separate cluster state updates. So the assertion that the
rolled index exists could’ve passed in the test but, before the
alias was rolled over to the new index, the subsequent write we execute
in the test (namely
`indexDocs("test_user", "x-pack-test-password", "foo_alias", 1)`)
would’ve sent the new document to the source index (ie. foo-logs-000001)

This would see the source index containing 3 documents and the rolled
index (foo-logs-000002) 0 documents.

However, we fixed this and the rollover step executes the “create index
and roll alias” in one single cluster update, so this situation should
not occur anymore.

(cherry picked from commit 834261c4fe7dd93f437eeec43c00d01ff2279f86)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-02-03 11:54:00 +00:00
David Roberts d5d8fb26fa [TEST] Remove obsolete test trace logging from NetworkDisruptionIT (#51746)
The issue this logging was added to fix (#49908) was closed in
December and the problem has not recurred so this logging is no
longer needed.
2020-02-03 11:25:53 +00:00
Karel Minarik 050c4d4c89
Fixes for the REST specification (#51791)
* REST: Test: Fix the `accept_enterprise` parameter for Get License API (#51527)

The Get License API specifies the `accept_enterprise` parameter as a `boolean`:

0ca5cb8cb6/x-pack/plugin/src/test/resources/rest-api-spec/api/license.get.json (L22-L27)

In the test, a `string` is passed however, which makes the test compilation fail in the Go client.

(cherry picked from commit e2a2169b3d44592057c143253bb56375ed3e4268)

* Fix the SQL API documentation in REST specification (#51534)

This patch fixes the SQL REST API documentation to conform to the current schema.

(cherry picked from commit c8b6a849852699883086a6ada42279f2f68d7e07)

* Fix the "slices" parameter for the Delete By Query API in the REST specification (#51535)

This patch updates the `type` parameter in the Delete By Query API: according to
[the documentation](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-delete-by-query.html#docs-delete-by-query-slice),
it can be set to "auto", but the type in the documentation allows only numerical values.

This prevents people from setting the parameter to "auto" eg. in the Go client,
which generates source from the specification, and sets the corresponding Go
type as number.

The patch uses the `|` notation, which we have discussed previously for encoding
a "polymorphic" parameter like this.

Related: https://github.com/elastic/go-elasticsearch/issues/77

* Fix the Enrich API documentation in REST specification (#51528)

This patch fixes the REST API documentation for the Enrich APIs to conform to the current schema.

(cherry picked from commit 59f28f4f2feeba3f6d2f0b632410577eacb28121)
2020-02-02 15:28:08 +01:00
Hendrik Muhs ed170cc548
[Transform] Fix stats can return old state information if security is enabled (#51732) (#51738)
do index refresh of the internal transform index with the system user
instead of using the calling user which does not have sufficient rights
if security is enabled

fixes #51728
2020-02-01 19:34:58 +01:00
Ryan Ernst 21224caeaf Remove comparison to true for booleans (#51723)
While we use `== false` as a more visible form of boolean negation
(instead of `!`), the true case is implied and the true value does not
need to explicitly checked. This commit converts cases that have slipped
into the code checking for `== true`.
2020-01-31 16:35:43 -08:00
Lee Hinman 4594a210bf
[7.x] Fix SnapshotLifecycleRestIT.testFullPolicySnapshot (#517… (#51778)
* Fix SnapshotLifecycleRestIT.testFullPolicySnapshot

This previously was missing some key information in the output of the failure. This captures that
information and adds logging at each step so we can determine the cause *if* it fails again.

Resolves #50358
2020-01-31 15:38:28 -07:00
Aleksandr Maus d4f6f38150
EQL: Fix #51541: [CI] unknown setting [xpack.eql.enabled] in release-tests (#51699) (#51770)
Fixes #51541
Co-authored-by: Igor Motov <igor@motovs.org>
2020-01-31 15:14:27 -05:00
Dimitris Athanasiou 55b5c8f703
[7.x][ML] Remove index.unassigned.node_left.delayed_timeout setting from M… (#51740) (#51764)
This setting was introduced with the purpose of reducing the time took by
tests that shut nodes down. Tests like `MlDistributedFailureIT` and
`NetworkDisruptionIT`. However, it is unfortunate to have to set the value
to an explicit value in production. In addition, and most important, the dynamically
choosing the value for this setting makes it impossible to adopt static index template configs
that we register via `IndexTemplateRegistry`, which we need to use in order to start
registering ILM policies for the ML indices.

This commit removes this setting from our templates. I run the tests a few times and could
not see execution time differing significantly.

Backport of #51740
2020-01-31 20:28:29 +02:00
Andrei Dan 5ca51562ec
Fix testThatNonExistingTemplatesAreAddedImmediately (#51668) (#51752)
This addresses another race condition that could yield this test flaky.

(cherry picked from commit d20d90aceb2b687239654d6f013f61f7f4cc1512)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-31 18:18:00 +00:00
Andrei Dan 20f47b14b0
Fix SnapshotLifecycleServiceTests.testPolicyCRUD (#51653) (#51755)
(cherry picked from commit 8f9a87fa576a8a1c6ea3efb29bf1296d50d89ace)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-31 18:17:38 +00:00
Przemko Robakowski 227621dd13
Change index.lifecycle.step.master_timeout to indices.lifecycle.step.master_timeout (#51744) (#51761)
* Change index.lifecycle.step.master_timeout to indices.lifecycle.step.master_timeout

This changes setting name from `index.lifecycle.step.master_timeout` to
`indices.lifecycle.step.master_timeout` to avoid confusion about its scope.
`index.*` settings are recognized as index level settings, this one is node level.

Reletes to #51698
2020-01-31 18:56:50 +01:00
Lee Hinman deefc85d60
[7.x] Stop policy on last PhaseCompleteStep instead of Termina… (#51758)
Currently when an ILM policy finishes its execution, the index moves into the `TerminalPolicyStep`,
denoted by a completed/completed/completed phase/action/step lifecycle execution state.

This commit changes the behavior so that the index lifecycle execution state halts at the last
configured phase's `PhaseCompleteStep`, so for instance, if an index were configured with a policy
containing a `hot` and `cold` phase, the index would stop at the `cold/complete/complete`
`PhaseCompleteStep`. This allows an ILM user to update the policy to add any later phases and have
indices configured to use that policy pick up execution at the newly added "later" phase. For
example, if a `delete` phase were added to the policy specified about, the index would then move
from `cold/complete/complete` into the `delete` phase.

Relates to #48431
2020-01-31 10:36:41 -07:00
Mayya Sharipova 42b885f050
Upgrade to lucene-8.5.0-snapshot-3333ce7da6d (#51749)
Backport for #51327
2020-01-31 11:20:15 -05:00
Benjamin Trent e372854d43
[ML][Inference] Fix model pagination with models as resources (#51573) (#51736)
This adds logic to handle paging problems when the ID pattern + tags reference models stored as resources. 

Most of the complexity comes from the issue where a model stored as a resource could be at the start, or the end of a page or when we are on the last page.
2020-01-31 07:52:19 -05:00
Yang Wang 77b00fc0c0
Add warnings for invalid realm order config (#51195) (#51515)
The changes are to help users prepare for migration to next major
release (v8.0.0) regarding to the break change of realm order config.

Warnings are added for when:
* A realm does not have an order config
* Multiple realms have the same order config

The warning messages are added to both deprecation API and loggings.
The main reasons for doing this are: 1) there is currently no automatic relay
between the two; 2) deprecation API is under basic and we need logging
for OSS.
2020-01-31 12:32:37 +11:00
Gordon Brown 10c8179351
Use exclusions list instead of fake system indices (#51586)
This commit switches the strategy for managing dot-prefixed indices that
should be hidden indices from using "fake" system indices to an explicit
exclusions list that must be updated when those indices are converted to
hidden indices.
2020-01-30 16:31:27 -07:00
Mark Vieira 8d2370bf00
Always use bundled JDK for external cluster nodes when BWC testing (#51505) (#51701) 2020-01-30 14:35:43 -08:00
Bogdan Pintea f1173aaa48
SQL: Add optimisations for not-equalities (#51088) (#51700)
* Optimize not-equalities in con-/disjunctions

This commit adds optimisations of not-equalities in conjunctions and
disjunctions:
* for conjunctions, the not-equality can be optimized away when applied
together with a range or inequality, in case the not-equality point
falls outside the domain of the later condition; if its on the boarder,
it will modify the bound, to simply exclude the equality, if present;
otherwise no optimisation can be applied;
* for disjunctions, the not-equals could filter away the ranges and
inequalities, unless these include an equality on the bound, in which
case the entire condition becomes always true, but this would influence
the score() function, so it's been omitted;

* fix aggregations of inequalities in ranges

This commit fixes the loop that aggregates inequalities into ranges:
- it won't advance the outer loop index in case of a merge, since the
current element is removed;
- it will break the inner loop, since comparision against the element
selected in the outer loop can't continue, as it had been removed.



(cherry picked from commit 789724ac2cc726de603849b4eeb8194da7528bcc)
2020-01-30 23:29:39 +01:00
Lee Hinman b9faa0733d
[7.x] Rename ILM history index enablement setting (#51698) (#51705)
* Rename ILM history index enablement setting

The previous setting was `index.lifecycle.history_index_enabled`, this commit changes it to
`indices.lifecycle.history_index_enabled` to indicate this is not an index-level setting (it's node
level).
2020-01-30 15:27:44 -07:00
Benjamin Trent 1380dd439a
[7.x] [ML][Inference] Fix weighted mode definition (#51648) (#51695)
* [ML][Inference] Fix weighted mode definition (#51648)

Weighted mode inaccurately assumed that the "max value" of the input values would be the maximum class value. This does not make sense. 

Weighted Mode should know how many classes there are. Hence the new parameter `num_classes`. This indicates what the maximum class value to be expected.
2020-01-30 15:33:25 -05:00
Nhat Nguyen 1cba5d7c4b Force flush in FrozenEngine#testSearchers (#51635)
We need to force flush to make the last commit safe; otherwise, we might 
fail to open FrozenEngine. Note that we force flush before closing a
shard.

Closes #51620
2020-01-30 14:48:45 -05:00
Benjamin Trent 2a2a0941af
[ML][Inference] stream inflate to parser + throw when byte limit is reached (#51644) (#51679)
Three fixes for when the `compressed_definition` is utilized on PUT

* Update the inflate byte limit to be the minimum of 10% the max heap, or 1GB (what it was previously)
* Stream data directly to the JSON parser, so if it is invalid, we don't have to inflate the whole stream to find out
* Throw when the maximum bytes are reach indicating that is why the request was rejected
2020-01-30 10:16:14 -05:00
Marios Trivyzas f373020349 SQL: Fix ORDER BY YEAR() function (#51562)
Previously, if YEAR() was used as and ORDER BY argument without being
wrapped with another scalar (e.g. YEAR(birth_date) + 10), no script
ordering was used but instead the underlying field (e.g. birth_date)
was used instead as a performance optimisation. This works correctly if
YEAR() is the only ORDER BY arg but if further args are used as tie
breakers for the ordering wrong results are produced. This is because
2 rows with the different birth_date but on the same year are not tied
as the underlying ordering is on birth_date and not on the
YEAR(birth_date), and the following ORDER BY args are ignored.

Remove this optimisation for YEAR() to avoid incorrect results in
such cases.

As a consequence another bug is revealed: scalar functions on top
of nested fields produce scripted sorting/filtering which is not yet
supported. In such cases no error was thrown but instead all values for
such nested fields were null and were passed to the script implementing
the sorting/filtering, producing incorrect results.

Detect such cases and throw a validation exception.

Fixes: #51224
(cherry picked from commit f41efd6753dc3650a7eabb3e07b02b3b32c5704c)
2020-01-30 15:29:36 +01:00
Martijn van Groningen f7e2082378
Backport: unmute rolling upgrade watcher tests and (#51664)
set watcher logger to debug level.

These tests haven't run in such a long time,
we first need to get a better picture how/if
these tests fail today.

Backport of #51478
See #33185
2020-01-30 14:01:30 +01:00
Marios Trivyzas 285a167c34 SQL: Verify Full-Text Search functions not allowed in SELECT (#51568)
Add a verification that full-text search functions `MATCH()` and `QUERY()`
are not allowed in the SELECT clause, so that a nice error message is
returned to the user early instead of an "ugly" exception.

Fixes: #47446
2020-01-30 13:14:38 +01:00
Albert Zaharovits f25b6cc2eb
Add new 'maintenance' index privilege #50643
This commit creates a new index privilege named `maintenance`.
The privilege grants the following actions: `refresh`, `flush` (also synced-`flush`),
and `force-merge`. Previously the actions were only under the `manage` privilege
which in some situations was too permissive.

Co-authored-by: Amir H Movahed <arhd83@gmail.com>
2020-01-30 11:59:11 +02:00
Henning Andersen 149b68d850 [ML] Fix possible race condition starting datafeed (#51646)
Datafeeds being closed while starting could result in and NPE. This was
handled as any other failure, masking out the NPE. However, this
conflicts with the changes in #50886.

Related to #50886 and #51302
2020-01-30 08:23:45 +01:00
Lisa Cawley 28f2f3dd02 [DOCS] Minor fixes in transform documentation (#51633) 2020-01-29 16:58:18 -08:00
Aleksandr Maus 0d21d9e2c5
EQL: Enable QA/rest integration tests for snapshot builds only (#51624) (#51645)
* Related to #51541: [CI] unknown setting [xpack.eql.enabled] in release-tests
2020-01-29 16:38:52 -05:00
Julie Tibshirani 9dcc3ef7e6
Always use one shard in vector REST tests. (#51643)
This PR tries to address the intermittent vector test failures on 7.x by making
sure we create indices with one shard.

The fix is based on this theory as to what's happening:
* On 7.x, the default number of shards is 1, but in REST tests we randomly use
2 in order to cover the multiple shards case. In the failing test run, we use 2
shards and all documents end up on only one shard.
* During a search, the response from the empty shard doesn't produce
deprecation warnings because  we never try to execute the script. If not all
shard responses contain the warning headers, then certain deprecation warnings
can be lost (due to the bug described in #33936).

Addresses #50716.
Relates to #50061.
2020-01-29 12:24:41 -08:00
Przemysław Witek 683170b007
Increase the number of indexed documents to increase a chance that there are at least 2 training rows. (#51607) (#51615) 2020-01-29 17:17:19 +01:00