Commit Graph

4530 Commits

Author SHA1 Message Date
David Kyle 0ac03ac5e7
[ML] Add parsers for inference configuration classes (#51300) 2020-01-22 17:03:01 +00:00
David Kyle ca4b90a001
[ML] Calculate results and snapshot retention using latest bucket timestamps (#51061) (#51301)
The retention period is calculated relative to the last bucket result or snapshot
time rather than wall clock
2020-01-22 14:52:33 +00:00
Dimitris Athanasiou 59687a9384
[7.x][ML] Validate classification dependent_variable cardinality is at lea… (#51232) (#51309)
Data frame analytics classification currently only supports 2 classes for the
dependent variable. We were checking that the field's cardinality is not higher
than 2 but we should also check it is not less than that as otherwise the process
fails.

Backport of #51232
2020-01-22 16:51:16 +02:00
Benjamin Trent 2a73e849d6
[ML][Inference] fixing ingest IT tests (#51267) (#51311)
Converts InferenceIngestIT into a `ESRestTestCase`.

closes #51201
2020-01-22 09:50:17 -05:00
David Roberts 932c63297f [ML] Fix possible race condition when starting datafeed (#51302)
The ID of the datafeed's associated job was being obtained
frequently by looking up the datafeed task in a map that
was being modified in other threads.  This could lead to
NPEs if the datafeed stopped running at an unexpected time.

This change reduces the number of places where a datafeed's
associated job ID is looked up to avoid the possibility of
failures when the datafeed's task is removed from the map
of running tasks during multi-step operations in other
threads.

Fixes #51285
2020-01-22 11:40:39 +00:00
Przemysław Witek bfcfcdee33
[7.x] Do not copy mapping from dependent variable to prediction field in regression analysis (#51227) (#51288) 2020-01-22 12:36:24 +01:00
Andrei Dan 421aa14972
ILM: Make UpdateSettingsStep retryable (#51235) (#51298)
This makes the UpdateSettingsStep retryable. This step updates settings needed
during the execution of ILM actions (mark indexes as read-only, change
allocation configurations, mark indexing complete, etc)

As the index updates are idempotent in nature (PUT requests and are applied only
if the values have changed) and the settings values are seldom user-configurable
(aside from the allocate action) the testing for this change goes along the
lines of artificially simulating a setting update failure on a particular value
update, which is followed by a successful step execution (a retry) in an
environment outside of ILM (the step executions are triggered manually).

(cherry picked from commit 8391b0aba469f39532bfc2796b76148167dc0289)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-22 11:02:26 +00:00
Andrei Dan 123266714b
ILM wait for active shards on rolled index in a separate step (#50718) (#51296)
After we rollover the index we wait for the configured number of shards for the
rolled index to become active (based on the index.write.wait_for_active_shards
setting which might be present in a template, or otherwise in the default case,
for the primaries to become active).
This wait might be long due to disk watermarks being tripped, replicas not
being able to spring to life due to cluster nodes reconfiguration and others
and, the RolloverStep might not complete successfully due to this inherent
transient situation, albeit the rolled index having been created.

(cherry picked from commit 457a92fb4c68c55976cc3c3e2f00a053dd2eac70)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-22 11:01:52 +00:00
Hendrik Muhs af76ae4ab9 [Transform] Add yml test suite for testing remote clusters (CCS) (#51033)
add a test suite for remote clusters features and add test cases for transform
2020-01-22 11:19:02 +01:00
Ioannis Kakavas a76321437c
Truncate SAML Response in trace log (#51237) (#51283)
When not truncated, a long SAML response XML document can fill max
line length and mask the actual exception message that the trace
statement is meant to inform about.
The same XML Document is also printed in full on trace level in
SamlRequestHandler#parseSamlMessage() so there is no loss of
information
2020-01-22 09:56:39 +02:00
Nhat Nguyen 5d4bbdcc50 Use conditional doc type in testFrozenIndexAfterRestarted 2020-01-21 12:57:58 -05:00
Nik Everett ca15a3f5a8
Add "did you mean" to unknown queries (#51177) (#51254)
This replaces the message we return for unknown queries with the standard
one that we use for unknown fields from `ObjectParser`. This is nice
because it includes "did you mean". One day we might convert parsing
queries to using object parser, but that looks complex. This change is
much smaller and seems useful.
2020-01-21 12:45:52 -05:00
Benjamin Trent a9b2bc525e
[ML] address two edge cases for categorization.GrokPatternCreator#findBestGrokMatchFromExamples (#51168) (#51255)
There are two edge cases that can be ran into when example input is matched in a weird way.

1. Recursion depth could continue many many times, resulting in a HUGE runtime cost. I put a limit of 10 recursions (could be adjusted I suppose). 
2. If there are no "fixed regex bits", exploring the grok space would result in a fence-post error during runtime (with assertions turned off)
2020-01-21 10:29:29 -05:00
Martijn van Groningen 6b5b26a595
Protects against NPE:
2> REPRODUCE WITH: ./gradlew ':x-pack:plugin:watcher:test' --tests "org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.testTransformFields" -Dtests.seed=26754396AB9C1A30 -Dtests.security.manager=true -Dtests.locale=lv-LV -Dtests.timezone=America/Dominica -Dcompiler.java=13 -Druntime.java=8
  2> java.lang.NullPointerException
        at __randomizedtesting.SeedInfo.seed([26754396AB9C1A30:B2A3CA27E260803B]:0)
        at org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.lambda$testTransformFields$1(HistoryTemplateTransformMappingsTests.java:85)
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
        at java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1628)
        at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
        at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
        at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
        at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
        at org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.lambda$testTransformFields$2(HistoryTemplateTransformMappingsTests.java:88)
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:892)
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:877)
        at org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.testTransformFields(HistoryTemplateTransformMappingsTests.java:74)
2020-01-21 15:42:22 +01:00
Nik Everett 788836ea3f
Revert "Begin moving date_histogram to offset rounding (backport of #50873) (#50978)" (#51239)
This reverts commit 9a3d4db840. It was
subtly broken in ways we didn't have tests for.
2020-01-21 08:50:02 -05:00
David Roberts 0fa7db9a95 [ML] Make datafeeds work with nanosecond time fields (#51180)
Allows ML datafeeds to work with time fields that have
the "date_nanos" type _and make use of the extra precision_.
(Previously datafeeds only worked with time fields that were
exact multiples of milliseconds.  So datafeeds would work
with "date_nanos" only if the extra precision over "date" was
not used.)

Relates #49889
2020-01-21 09:59:50 +00:00
Nhat Nguyen 43ed244a04
Account soft-deletes in FrozenEngine (#51192) (#51229)
Currently, we do not exclude soft-deleted documents when opening index
reader in the FrozenEngine.

Backport of #51192
2020-01-20 17:07:29 -05:00
Adrien Grand 1a73d8329c Disable xpack/15_basic/Usage stats for mappings.
Relates #51127
2020-01-20 18:05:26 +01:00
Andrei Stefan 2908b7e5fc
SQL: add support for passing query parameters in REST API calls (#51029) (#51222)
* REST PreparedStatement-like query parameters are now supported in the form of an array of non-object, non-array values where ES SQL parser will try to infer the data type of the value being passed as parameter.

(cherry picked from commit 45b8bf619aecb1c03d7bc0cf06928dcc36005a66)
2020-01-20 16:40:19 +02:00
Andrei Stefan 543cc85b78
Add trace logging for responses coming from server (#50530) (#51221)
(cherry picked from commit 38eb485deffa175c7eb0b55a42a3e309f8a9802d)
2020-01-20 16:39:46 +02:00
Andrei Stefan df36169220
SQL: change the way unsupported data types fields are handled (#50823) (#51220)
The hierarchy of fields/sub-fields under a field that is of an
unsupported data type will be marked as unsupported as well. Until this
change, the behavior was to set the unsupported data type field's
hierarchy as empty.

Example, considering the following hierarchy of fields/sub-fields
a -> b -> c -> d, if b would be of type "foo", then b, c and d will
be marked as unsupported.

(cherry picked from commit 7adb286c4c485b9e781f88b0a2f98cab9ec5b7e2)
2020-01-20 16:23:43 +02:00
Hendrik Muhs 51134d9738 check custom meta data to avoid NPE (#51163)
check custom meta data to avoid NPE, fixes a problem introduced in #51072

fixes #51153
2020-01-20 13:53:42 +01:00
Tim Vernum a0ca82422c
Mute TimeSeriesLifecycleActionsIT.waitForSnapshot (#51208)
This test was recently un-muted, but is still failing

Relates: #50781
Backport of: #51203
2020-01-20 20:19:29 +11:00
Nik Everett 977b53ab91
Fix flaky usage tracking test (#51169) (#51179)
We added tracking of index feature usage in #51031 but due to some copy
and paste errors the test fails on some seeds. This fixes those errors.
2020-01-17 16:53:13 -05:00
Jason Tedor 9ce4d2b901
Initial autoscaling commit (#51161)
This commit merely adds the skeleton for the autoscaling project, adding
the basics to include the autoscaling module in the default
distribution, opt-in to code formatting, and a placeholder for the docs.
2020-01-17 15:31:12 -05:00
Lee Hinman 731c96b507
[7.x] Use separate policies for tests in SnapshotLifecycleRest… (#51181)
These policies store statistics, but since stats updating is asynchronous, it's
possible for the update from one test to bleed into a separate one. This change
switches the tests to use separate policy ids so that their stats are tracked
independently. It also relaxes the checking constraint in one of the tests.

Hopefully this:
Resolves #48531
Resolves #48017
2020-01-17 13:26:40 -07:00
Jay Modi 107989df3e
Introduce hidden indices (#51164)
This change introduces a new feature for indices so that they can be
hidden from wildcard expansion. The feature is referred to as hidden
indices. An index can be marked hidden through the use of an index
setting, `index.hidden`, at creation time. One primary use case for
this feature is to have a construct that fits indices that are created
by the stack that contain data used for display to the user and/or
intended for querying by the user. The desire to keep them hidden is
to avoid confusing users when searching all of the data they have
indexed and getting results returned from indices created by the
system.

Hidden indices have the following properties:
* API calls for all indices (empty indices array, _all, or *) will not
  return hidden indices by default.
* Wildcard expansion will not return hidden indices by default unless
  the wildcard pattern begins with a `.`. This behavior is similar to
  shell expansion of wildcards.
* REST API calls can enable the expansion of wildcards to hidden
  indices with the `expand_wildcards` parameter. To expand wildcards
  to hidden indices, use the value `hidden` in conjunction with `open`
  and/or `closed`.
* Creation of a hidden index will ignore global index templates. A
  global index template is one with a match-all pattern.
* Index templates can make an index hidden, with the exception of a
  global index template.
* Accessing a hidden index directly requires no additional parameters.

Backport of #50452
2020-01-17 10:09:01 -07:00
Jay Modi 96e8f67425
Upgrade to the latest OWASP HTML sanitizer (#50765) (#51166)
This commit upgrades the OWASP HTML sanitizer used by watcher to the
latest version and also upgrades guava, which it depends on. The guava
upgrade also requires the addition of a new dependency that guava
itself requires as of version 27.0. The sanitizer's behavior has changed to
re-write these templated values with a comment that results in this output
`{<!-- -->{ctx.metadata.name}}`. This would be an issue if we attempted to
sanitize the template, but the code that uses the sanitizer runs the rendered
string through the sanitizer, which means that the templated values have
been replaced already.

Relates #50395
2020-01-17 10:00:33 -07:00
Ioannis Kakavas 4fc865e579
Don't fallback to anonymous for tokens/apikeys (#51042) (#51159)
This commit changes our behavior so that when we receive a
request with an invalid/expired/wrong access token or API Key
we do not fallback to authenticating as the anonymous user even if
anonymous access is enabled for Elasticsearch.
2020-01-17 18:56:02 +02:00
David Roberts 295665b1ea [ML] Add audit warning for 1000 categories found early in job (#51146)
If 1000 different category definitions are created for a job in
the first 100 buckets it processes then an audit warning will now
be created.  (This will cause a yellow warning triangle in the
ML UI's jobs list.)

Such a large number of categories suggests that the field that
categorization is working on is not well suited to the ML
categorization functionality.
2020-01-17 16:28:45 +00:00
Przemysław Witek da73c9104e
[ML] Fix tests randomly failing on CI (#51142) (#51150) 2020-01-17 14:58:58 +01:00
Dimitris Athanasiou b70ebdeb96
[7.x][ML] DF Analytics _explain API should skip object fields (#51115) (#51147)
Object fields cannot be used as features. At the moment _explain
API includes them and even worse it allows it does not error when
an object field is excluded. This creates the expectation to the
user that all children fields will also be excluded while it's not
the case.

This commit omits object fields from the _explain API and also
adds an error if an object field is included or excluded.

Backport of #51115
2020-01-17 14:02:59 +02:00
Przemysław Witek b1a526d5e9
[7.x] [ML] Update DFA progress document in the index the document belongs to (#51111) (#51117) 2020-01-17 08:12:54 +01:00
Hendrik Muhs 13343b15c9 [Transform] Improve force stop robustness in case of an error (#51072)
If a transform config got lost (e.g. because the internal index disappeared) tasks could not be
stopped using transform API. This change makes it possible to stop transforms without a config,
meaning to remove the background task. In order to do so force must be set to true.
2020-01-17 07:42:21 +01:00
Ioannis Kakavas d0554fd317
Fail gracefully on invalid token strings (#51014) (#51096)
When we receive a request with an Authorization header that contains
a Bearer token that is not generated by us or that is malformed in
some way, attempting to decode it as one of our own might cause a
number of exceptions that are not IOExceptions. This commit ensures
that we catch and log these too and call onResponse with `null, so
that we can return 401 instead of 500.

Resolves: #50497
2020-01-16 17:00:17 +02:00
Florian Kelbert 584cb0d926 [DOCS] Correctly read total hits inside watcher config (#50614)
With elastic/elasticsearch#35848, users can now retrieve total hits as an integer when the `rest_total_hits_as_int` query parameter is `true`. This is the default value.

This updates several snippet examples in the Watcher docs that used a workaround to get a total hits integer.
2020-01-16 09:43:25 -05:00
Bogdan Pintea fb65ef3f2d
SQL: Extend the optimisations for equalities (#50792) (#51098)
* Extend the optimizations for equalities

This commit supplements the optimisations of equalities in conjunctions
and disjunctions:
* for conjunctions, the existing optimizations with ranges are extended
with not-equalities and inequalities; these lead to a fast resolution,
the conjunction either being evaluate to a FALSE, or the non-equality
conditions being dropped as superfluous;
* optimisations for disjunctions are added to be applied against ranges,
inequalities and not-equalities; these lead to disjunction either
becoming TRUE or the equality being dropped, either as superfluous or
merged into a range/inequality.

* Adress review notes

* Fix the bug around wrongly optimizing 'a=2 OR a!=?', which only yields
TRUE for same values in equality and inequality.
* Var renamings, code style adjustments, comments corrections.

* Address further review comments. Extend optim.

- fix a few code comments;
- extend the Equals OR NotEquals optimitsation (a=2 OR a!=5 -> a!=5);
- extend the Equals OR Range optimisation on limits equality (a=2 OR
  2<=a<5 -> 2<=a<5);
- in case an equality is being removed in a conjunction, the rest of
  possible optimisations to test is now skipped.

* rename one var for better legiblity

- s/rmEqual/removeEquals

(cherry picked from commit 62e7c6a010f10cd7893ee5c99bad8b8d2a693436)
2020-01-16 14:32:34 +01:00
Tom Veasey 32ec934b15
[7.x][ML] Assert top classes are ordered by score (#51028)
Backport #51003.
2020-01-16 12:23:15 +00:00
markharwood ff0a45f882
Fix NPE in PinnedQuery call to DisjunctionMaxScorer. (#51047) (#51064)
Fix NPE in PinnedQuery call to DisjunctionMaxScorer. (#51047)
Added test and fix that tests for score type.
Closes #51034
2020-01-16 10:41:43 +00:00
Rory Hunter 80d925e225
Auto-format buildSrc (#51043)
Backport / reimplementation of #50786 on 7.x.

Opt-in `buildSrc` for automatic formatting. This required a config tweak
in order to pick up all the Java sources, and as a result more files are
now found in the Enrich plugin, that were previously missed.

I also moved the 2 Java files in `buildSrc/src/main/groovy` into the Java
directory, which required some follow-up changes.
2020-01-16 10:26:27 +00:00
Adrien Grand 45d7bdcfd7
Add analysis components and mapping types to the usage API. (#51062)
Knowing about used analysis components and mapping types would be incredibly
useful in order to know which ones may be deprecated or should get more love.

Some field types also act as a proxy to know about feature usage of some APIs
like the `percolator` or `completion` fields types for percolation and the
completion suggester, respectively.
2020-01-16 09:56:41 +01:00
Tim Vernum ac6602a156
Fix windows newline issue in test (#51082)
Fixes HttpCertificateCommandTests.testTextFileSubstitutions on Windows

Backport of: #51030
2020-01-16 17:01:58 +11:00
Yang Wang c1a6d5d9ff
Encrypt generated key with AES (#51019) (#51076)
Replace DES with AES to align with modern encryption standards
Backport also fixs Files.readString API that is not available in Java 8

Resolves: #50843
2020-01-16 14:47:21 +11:00
Lee Hinman 2d1c28a45d
[7.x] Fix AllocateRoutedStepTests reusing keys for random valu… (#51058)
In these tests there was a very small chance that keys could collide,
which causes test failures.

Resolves #49307
2020-01-15 11:36:34 -07:00
Lee Hinman e395cf3419
Guard against null settings in CCRIndexLifecycleIT (#51008) (#51054)
It's possible that the index could return no settings and thus throw a
`NullPointerException`.

I wasn't able to reproduce the original issue, but this should guard
against in the future.

Resolves #50646

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-01-15 11:21:18 -07:00
Lee Hinman ad60f0015e
Address failures in SnapshotLifecycleRestIT.testFullPolicySnapshot (#51013)
This test failed a couple of different ways, related to timing, as well
as concurrent snapshots, and also naming.

This commit splits the giant `assertBusy` into separate parts so that we don't
perform ~5 different requests and tests in the same loop. It also gives each
test a unique repository so that no other test can accidentally re-use
snapshots.

Resolves #50358 (hopefully!)
2020-01-15 09:47:41 -07:00
Rory Hunter 2f069d8f3f
Tweak formatter config for long generic lines (#51027)
Backport of #50909. The current formatting config allows some long
generic declarations to break the 140 character limit. Tweak the config
to wrap such lines.
2020-01-15 13:17:37 +00:00
David Roberts 1536c3e622 [TEST] Increase ML distributed test job open timeout (#50998)
There have been occasional failures, presumably due to
too many tests running in parallel, caused by jobs taking
around 15 seconds to open.  (You can see the job open
successfully during the cleanup phase shortly after the
failure of the test in these cases.)  This change increases
the wait time from 10 seconds to 20 seconds to reduce the
risk of this happening.
2020-01-15 08:58:55 +00:00
Martijn van Groningen e76c3d4d32
Tidy up enrich processors: (#50957)
* Fix generics usages.
* Sealed match processor class.
2020-01-15 08:51:22 +01:00
Tomas Della Vedova 5b6fa79fd8
[ML] Removed key value from the catch regex test (#50977) (#51021) 2020-01-15 08:50:59 +01:00