Commit Graph

4427 Commits

Author SHA1 Message Date
Benjamin Trent 060e0a6277
[ML][Inference] Add support for models shipped as resources (#50680) (#50700)
This adds support for models that are shipped as resources in the ML plugin. The first of which is the `lang_ident` model.
2020-01-07 09:21:59 -05:00
Hendrik Muhs 98ca9500e8
implement a workaround for remote cluster validation (#50460)
In 7.x an internal API used for validating remote cluster does not throw, see #50420 for the 
details. This change implements a workaround for remote cluster validation, only for 7.x branches.

fixes #50420
2020-01-07 13:51:51 +01:00
Przemysław Witek 4116452d90
Implement testStopAndRestart for ClassificationIT (#50585) (#50698) 2020-01-07 13:41:37 +01:00
David Roberts 35453e2b0e [ML] Improve uniqueness of result document IDs (#50644)
Switch from a 32 bit Java hash to a 128 bit Murmur hash for
creating document IDs from by/over/partition field values.
The 32 bit Java hash was not sufficiently unique, and could
produce identical numbers for relatively common combinations
of by/partition field values such as L018/128 and L017/228.

Fixes #50613
2020-01-07 10:24:45 +00:00
David Roberts 46d600c446 [ML] Fix off-by-one error in ml_classic tokenizer end offset (#50655)
The end offset of a tokenizer is supposed to point one past the
end of the input, not to the end character of the input.  The
ml_classic tokenizer was erroneously doing the latter.
2020-01-07 10:14:59 +00:00
Lee Hinman 552edd862e
[7.x] Add aditional logging for ILM history store tests (#5062… (#50678)
* Add aditional logging for ILM history store tests (#50624)

These tests use the same index name, making it hard to read logs when
diagnosing the failures. Additionally more information about the current
state of the index could be retrieved when failing.

This changes these two things in the hope of capturing more data about
why this fails on some CI nodes but not others.

Relates to #50353
2020-01-06 15:24:24 -07:00
Nik Everett 7fd84a03a0
Drop references to deprecated logger (#50474) (#50681)
This drops all remaining references to `BaseRestHandler.logger` which
has been deprecated for something like a year now. I replaced all of the
references with locally declared loggers which is so much less spooky
action at a distance to me.
2020-01-06 16:34:07 -05:00
Benjamin Trent 06cea5136e
[ML] construct new random generator on each persistence call (#50657) (#50684)
Sharing a random generator may cause test failures as non-threadsafe random generators are periodically utilized in tests (see: https://github.com/elastic/elasticsearch/issues/50651)

This change constructs a calls `Randomness.get()` within the  `bulkIndexWithRetry` method so that the returned `Random` object is only used in a single thread. Before, the member variable could have been used between threads, which caused test failures.
2020-01-06 16:26:29 -05:00
Benjamin Trent 5ab9e75e28
[7.x] [ML][Inference] lang_ident model (#50292) (#50675)
* [ML][Inference] lang_ident model (#50292)

This PR contains a java port of Google's CLD3 compact NN model https://github.com/google/cld3

The ported model is formatted to fit within our inference model formatting and stored as a resource in the `:xpack:ml:` plugin and is under basic license.

The model is broken up into two major parts:
- Preprocessing through the custom embedding (based on CLD3's embedding layer)
- Pushing the embedded text through the two layers of fully connected shallow NN. 

Main differences between this port and CLD3:
- We take advantage of Java's internal Unicode handling where possible (i.e. codepoints, characters, decoders, etc.)
- We do not trim down input text by removing duplicated tokens
- We do not encode doubles/floats as longs/integers.
2020-01-06 16:24:03 -05:00
Benjamin Trent f52af7977d
[ML][Inference] minor cleanup for inference (#50444) (#50676) 2020-01-06 14:05:04 -05:00
Nik Everett 1b28af489f
Fix bare warnings on RollupJobTests (#50633) (#50677)
Silences some ugly warnings.
2020-01-06 14:03:30 -05:00
Albert Zaharovits 9ae3cd2a78
Add 'monitor_snapshot' cluster privilege (#50489) (#50647)
This adds a new cluster privilege `monitor_snapshot` which is a restricted
version of `create_snapshot`, granting the same privileges to view
snapshot and repository info and status but not granting the actual
privilege to create a snapshot.

Co-authored-by: j-bean <anton.shuvaev91@gmail.com>
2020-01-06 13:15:55 +02:00
Martijn van Groningen 0f2d26bdca
Unmute 'Test url escaping with url mustache function' webhook watcher test (#50439)
Some changes had to be made in order to make the test pass due to the removal or types.
Added some more assertions. The failure description in this comment [0] indicates that the rest handler couldn't be found. The test passes now.
I plan to merge this into master and see how CI reacts, if it handles this change well then I will also unmute this test in 7 dot x branch.

Also check watch count after stopping watcher in test teardown and
disabled slm in smoke test watcher qa test.

Relates to #41172

0: https://github.com/elastic/elasticsearch/issues/41172#issuecomment-496993976
2020-01-06 10:43:55 +01:00
Nik Everett 2362c430cd
Clean up wire test case a bit (#50627) (#50632)
* Adds JavaDoc to `AbstractWireTestCase` and
`AbstractWireSerializingTestCase` so it is more obvious you should prefer
the latter if you have a choice
* Moves the `instanceReader` method out of `AbstractWireTestCase` becaue
it is no longer used.
* Marks a bunch of methods final so it is more obvious which classes are
for what.
* Cleans up the side effects of the above.
2020-01-05 16:20:38 -05:00
Nik Everett 45663ac1a8
Use Void context on parsers where possible (#50573) (#50617)
*Most* of our parsing can be done without passing any extra context into
the parser that isn't already part of the xcontent stream. While I was
looking around at the places that *do* need a context I found a few
places that were declared to need a context but don't actually need it.
2020-01-03 13:28:55 -05:00
Christoph Büscher 6c8868e955 Mute TimeSeriesLifecycleActionsIT.testHistoryIsWrittenWithSuccess
Also muting TimeSeriesLifecycleActionsIT.testHistoryIsWrittenWithFailure.

Tracked in #50353
2020-01-03 18:32:03 +01:00
Andrei Dan 3c971f2911
ILM retryable async action steps (#50522) (#50591)
This adds support for retrying AsyncActionSteps by triggering the async
step after ILM was moved back on the failed step (the async step we'll
be attempting to run after the cluster state reflects ILM being moved
back on the failed step).

This also marks the RolloverStep as retryable and adds an integration
test where the RolloverStep is failing to execute as the rolled over
index already exists to test that the async action RolloverStep is
retried until the rolled over index is deleted.

(cherry picked from commit 8bee5f4cb58a1242cc2ef4bc0317dae6c8be49d3)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-01-03 16:19:58 +02:00
Dimitris Athanasiou ca0828ba07
[7.x][ML] Implement force deleting a data frame analytics job (#50553) (#50589)
Adds a `force` parameter to the delete data frame analytics
request. When `force` is `true`, the action force-stops the
jobs and then proceeds to the deletion. This can be used in
order to delete a non-stopped job with a single request.

Closes #48124

Backport of #50553
2020-01-03 13:46:02 +02:00
Przemysław Witek 8917c05df8
[7.x] Synchronize processInStream.close() call (#50581) 2020-01-03 10:23:51 +01:00
Lee Hinman 0d78aa2708
Don't dump a stacktrace for invalid patterns when executing elasticsearch-croneval (#49744) (#50578)
Co-authored-by: bellengao <gbl_long@163.com>
2020-01-02 16:57:51 -07:00
Nik Everett b36a8ab141
Make some ObjectParsers final (#50471) (#50556)
We have about 800 `ObjectParsers` in Elasticsearch, about 700 of which
are final. This is *probably* the right way to declare them because in
practice we never mutate them after they are built. And we certainly
don't change the static reference. Anyway, this adds `final` to a bunch
of these parsers, mostly the ones in xpack and their "paired" parsers in
the high level rest client. I picked these just to have somewhere to
break the up the change so it wouldn't be huge.

I found the non-final parsers with this:
```
diff \
  <(find . -type f -name '*.java' -exec grep -iHe 'static.*PARSER\s*=' {} \+ | sort) \
  <(find . -type f -name '*.java' -exec grep -iHe 'static.*final.*PARSER\s*=' {} \+ | sort) \
  2>&1 | grep '^<'
```
2020-01-02 10:47:38 -05:00
Przemysław Witek 4ecabe496f
Mute testStopAndRestart test case (#50551) 2020-01-02 15:28:20 +01:00
Christoph Büscher 1599af8428 Fix type conversion problem in Eclipse (#50549)
Eclipse 4.13 shows a type mismatch error in the affected line because it cannot
correctly infer the boolean return type for the method call. Assigning return
value to a local variable resolves this problem.
2020-01-02 14:29:20 +01:00
Lisa Cawley 8869f2b9b2 [DOCS] Adds intro for OIDC realm (#50485) 2019-12-30 07:05:28 -08:00
Tim Vernum cad0f6bf28
Do not load SSLService in plugin contructor (#50519)
XPackPlugin created an SSLService within the plugin contructor.
This has 2 negative consequences:

1. The service may be constructed based on a partial view of settings.
   Other plugins are free to add setting values via the
   additionalSettings() method, but this (necessarily) happens after
   plugins have been constructed.

2. Any exceptions thrown during the plugin construction are handled
   differently than exceptions thrown during "createComponents".
   Since SSL configurations exceptions are relatively common, it is
   far preferable for them to be thrown and handled as part of the
   createComponents flow.

This commit moves the creation of the SSLService to
XPackPlugin.createComponents, and alters the sequence of some other
steps to accommodate this change.

Backport of: #49667
2019-12-30 14:42:32 +11:00
James Rodewig 3f7f31b6b0 [DOCS] Fix search request body links (#50500)
PR #44238 changed several links related to the Elasticsearch search request body API. This updates several places still using outdated links or anchors.

This will ultimately let us remove some redirects related to those link changes.
2019-12-26 14:31:09 -05:00
James Rodewig ef467cc6f5 [DOCS] Remove unneeded redirects (#50476)
The docs/reference/redirects.asciidoc file stores a list of relocated or
deleted pages for the Elasticsearch Reference documentation.

This prunes several older redirects that are no longer needed and
don't require work to fix broken links in other repositories.
2019-12-26 08:29:28 -05:00
Orhan Toy 6a3d1a077e [DOCS] Fixes "enables you to" typos (#50225) 2019-12-23 14:39:14 -05:00
Armin Braun cec02da0ac
Fix Source Only Snapshot REST Test Failure (#50456) (#50459)
We are matching on the exact number of shards in this test, but may run into
snapshotting more than the single index created in it due to auto-created indices like
`.watcher`.
Fixed by making the test only take a snapshot of the single index used by this test.

Closes #50450
2019-12-23 12:24:08 +01:00
Igor Motov 339d10c16f Geo: Switch generated GeoJson type names to camel case (#50400)
Switches generated GeoJson type names to camel case
to conform to the standard.

Closes #49568
2019-12-20 15:37:22 -05:00
Lee Hinman c3c9ccf61f
[7.x] Add ILM histore store index (#50287) (#50345)
* Add ILM histore store index (#50287)

* Add ILM histore store index

This commit adds an ILM history store that tracks the lifecycle
execution state as an index progresses through its ILM policy. ILM
history documents store output similar to what the ILM explain API
returns.

An example document with ALL fields (not all documents will have all
fields) would look like:

```json
{
  "@timestamp": 1203012389,
  "policy": "my-ilm-policy",
  "index": "index-2019.1.1-000023",
  "index_age":123120,
  "success": true,
  "state": {
    "phase": "warm",
    "action": "allocate",
    "step": "ERROR",
    "failed_step": "update-settings",
    "is_auto-retryable_error": true,
    "creation_date": 12389012039,
    "phase_time": 12908389120,
    "action_time": 1283901209,
    "step_time": 123904107140,
    "phase_definition": "{\"policy\":\"ilm-history-ilm-policy\",\"phase_definition\":{\"min_age\":\"0ms\",\"actions\":{\"rollover\":{\"max_size\":\"50gb\",\"max_age\":\"30d\"}}},\"version\":1,\"modified_date_in_millis\":1576517253463}",
    "step_info": "{... etc step info here as json ...}"
  },
  "error_details": "java.lang.RuntimeException: etc\n\tcaused by:etc etc etc full stacktrace"
}
```

These documents go into the `ilm-history-1-00000N` index to provide an
audit trail of the operations ILM has performed.

This history storage is enabled by default but can be disabled by setting
`index.lifecycle.history_index_enabled` to `false.`

Resolves #49180

* Make ILMHistoryStore.putAsync truly async (#50403)

This moves the `putAsync` method in `ILMHistoryStore` never to block.
Previously due to the way that the `BulkProcessor` works, it was possible
for `BulkProcessor#add` to block executing a bulk request. This was bad
as we may be adding things to the history store in cluster state update
threads.

This also moves the index creation to be done prior to the bulk request
execution, rather than being checked every time an operation was added
to the queue. This lessens the chance of the index being created, then
deleted (by some external force), and then recreated via a bulk indexing
request.

Resolves #50353
2019-12-20 12:33:36 -07:00
Lisa Cawley 2106a7b02a
[7.x][DOCS] Updates ML links (#50387) (#50409) 2019-12-20 10:01:19 -08:00
Benjamin Trent 71ff330c4e
[ML][Inference] updates specs with new params + docs (#50373) (#50441) 2019-12-20 12:13:45 -05:00
Martijn van Groningen 9646f3abad
Disable slm in AbstractWatcherIntegrationTestCase (#50422)
SLM isn't required tests extending from this base class and
only add noise during test suite teardown.

Closes #50302
2019-12-20 15:51:46 +01:00
Przemysław Witek 3e3a93002f
[7.x] Fix accuracy metric (#50310) (#50433) 2019-12-20 15:34:38 +01:00
Przemysław Witek 14d95aae46
[7.x] Make each analysis report desired field mappings to be copied (#50219) (#50428) 2019-12-20 15:10:33 +01:00
Przemysław Witek 5bb668b866
[7.x] Get rid of maxClassesCardinality internal parameter (#50418) (#50423) 2019-12-20 14:24:23 +01:00
Hendrik Muhs 40bce49a7f mute SourceDestValidatorTests.testRemoteSourceDoesNotExist 2019-12-20 11:25:43 +01:00
Hendrik Muhs 7c10e9b0e7 [Transform] improve checkpoint reporting (#50369)
fixes empty checkpoints, re-factors checkpoint info creation (moves builder) and always reports
last change detection

relates #43201
relates #50018
2019-12-20 10:49:53 +01:00
Hendrik Muhs de14092ad2 [Transform] refactor source and dest validation to support CCS (#50018)
refactors source and dest validation, adds support for CCS, makes resolve work like reindex/search, allow aliased dest index with a single write index.

fixes #49988
fixes #49851
relates #43201
2019-12-20 10:49:53 +01:00
Marios Trivyzas f1a6b675f7 SQL: Fix issue with CAST and NULL checking. (#50371)
Previously, during expression optimisation, CAST would be considered
nullable if the casted expression resulted to a NULL literal, and would
be always non-nullable otherwise. As a result if CASE was wrapped by a
null check function like IS NULL or IS NOT NULL it was simplified to
TRUE/FALSE, eliminating the actual casting operation. So in case of an
expression with an erroneous casting like CAST('foo' AS DATETIME) IS NULL
it would be simplified to FALSE instead of throwing an Exception signifying
the attempt to cast 'foo' to a DATETIME type.

CAST now always returns Nullability.UKNOWN except from the case that
its result evaluated to a constant NULL, where it returns Nullability.TRUE.
This way the IS NULL/IS NOT NULL don't get simplified to FALSE/TRUE
and the CAST actually gets evaluated resulting to a thrown Exception.

Fixes: #50191
(cherry picked from commit 671e07a931cd828661e226cba22a5d38804a17a5)
2019-12-20 10:24:35 +02:00
Tim Brooks cb73fb0f9b
Backport remote proxy mode stats and naming (#50402)
* Update remote cluster stats to support simple mode (#49961)

Remote cluster stats API currently only returns useful information if
the strategy in use is the SNIFF mode. This PR modifies the API to
provide relevant information if the user is in the SIMPLE mode. This
information is the configured addresses, max socket connections, and
open socket connections.

* Send hostname in SNI header in simple remote mode (#50247)

Currently an intermediate proxy must route conncctions to the
appropriate remote cluster when using simple mode. This commit offers
a additional mechanism for the proxy to route the connections by
including the hostname in the TLS SNI header.

* Rename the remote connection mode simple to proxy (#50291)

This commit renames the simple connection mode to the proxy connection
mode for remote cluster connections. In order to do this, the mode specific
settings which we namespaced by their mode (ex: sniff.seed and
proxy.addresses) have been reverted.

* Modify proxy mode to support a single address (#50391)

Currently, the remote proxy connection mode uses a list setting for the
proxy address. This commit modifies this so that the setting is
proxy_address and only supports a single remote proxy address.
2019-12-19 18:02:48 -07:00
Stuart Tettemer 689df1f28f
Scripting: ScriptFactory not required by compile (#50344) (#50392)
Avoid backwards incompatible changes for 8.x and 7.6 by removing type
restriction on compile and Factory.  Factories may optionally implement
ScriptFactory.  If so, then they can indicate determinism and thus
cacheability.

**Backport**

Relates: #49466
2019-12-19 12:50:25 -07:00
Przemysław Witek cc4bc797f9
[7.x] Implement `precision` and `recall` metrics for classification evaluation (#49671) (#50378) 2019-12-19 18:55:05 +01:00
Igor Motov c77ca98928 Geo: Switch generated WKT to upper case (#50285)
Switches generated WKT to upper case to
conform to the standard recommendation.

Relates #49568
2019-12-18 17:29:08 -05:00
Dimitris Athanasiou d3c83cd55a
[7.x][ML] Refresh state index before completing data frame analytics job (#50322) (#50324)
In order to ensure any persisted model state is searchable by the moment
the job reports itself as `stopped`, we need to refresh the state index
before completing.

This should fix the occasional failures we see in #50168 and #50313 where
the model state appears missing.

Closes #50168
Closes #50313

Backport of #50322
2019-12-18 22:19:59 +00:00
Benjamin Trent 4396a1f78b
[ML][Inference] fix support for nested fields (#50258) (#50335)
This fixes support for nested fields

We now support fully nested, fully collapsed, or a mix of both on inference docs.

ES mappings allow the `_source` to be any combination of nested objects + dot delimited fields.
So, we should do our best to find the best path down the Map for the desired field.
2019-12-18 15:47:06 -05:00
Jason Tedor 7c5a3bcf6d
Always consume the body in has privileges (#50298)
Our REST infrastructure will reject requests that have a body where the
body of the request is never consumed. This ensures that we reject
requests on endpoints that do not support having a body. This requires
cooperation from the REST handlers though, to actually consume the body,
otherwise the REST infrastructure will proceed with rejecting the
request. This commit addresses an issue in the has privileges API where
we would prematurely try to reject a request for not having a username,
before consuming the body. Since the body was not consumed, the REST
infrastructure would instead reject the request as a bad request.
2019-12-18 08:30:53 -05:00
Dimitris Athanasiou 447bac27d2
[7.x][ML] Delete unused data frame analytics state (#50243) (#50280)
This commit adds removal of unused data frame analytics state
from the _delete_expired_data API (and in extend th ML daily
maintenance task). At the moment the potential state docs
include the progress document and state for regression and
classification analyses.

Backport of #50243
2019-12-18 12:30:11 +00:00
Yannick Welsch 82086929d7 Increase timeout on FollowIndexSecurityIT.testAutoFollowPatterns (#50282)
This test was causing test failures on slow CI runs.

Closes #50279
2019-12-18 10:37:11 +01:00