Commit Graph

450 Commits

Author SHA1 Message Date
Przemysław Witek 85d55e30d0
Add test that proves _timing_stats document is deleted when the job is deleted (#45840) (#45854) 2019-08-23 07:03:09 +02:00
Przemysław Witek 2ed19b2c81
Put error message from inside the process into the exception that is thrown when the process doesn't start correctly. (#45846) (#45875) 2019-08-23 07:02:50 +02:00
Benjamin Trent 8e3c54fff7
[7.x] [ML] Adding data frame analytics stats to _usage API (#45820) (#45872)
* [ML] Adding data frame analytics stats to _usage API (#45820)

* [ML] Adding data frame analytics stats to _usage API

* making the size of analytics stats 10k

* adjusting backport
2019-08-22 15:15:41 -05:00
Przemysław Witek 7512337922
[7.x] Allow the user to specify 'query' in Evaluate Data Frame request (#45775) (#45825) 2019-08-22 11:14:26 +02:00
Przemysław Witek bf701b83d2
Shorten field names in EstimateMemoryUsageResponse (#45719) (#45772) 2019-08-21 12:45:09 +02:00
Przemysław Witek c6709f0979
Mute tests affected by renaming fields in Estimate memory usage response (#45743) (#45766) 2019-08-21 09:57:23 +02:00
Dimitris Athanasiou d5c3d9b50f
[7.x][ML] Do not skip rows with missing values for regression (#45751) (#45754)
Regression analysis support missing fields. Even more, it is expected
that the dependent variable has missing fields to the part of the
data frame that is not for training.

This commit allows to declare that an analysis supports missing values.
For such analysis, rows with missing values are not skipped. Instead,
they are written as normal with empty strings used for the missing values.

This also contains a fix to the integration test.

Closes #45425
2019-08-21 08:15:38 +03:00
Benjamin Trent ba7b677618
[ML] better handle empty results when evaluating regression (#45745) (#45759)
* [ML] better handle empty results when evaluating regression

* adding new failure test to ml_security black list

* fixing equality check for regression results
2019-08-20 17:37:04 -05:00
Dimitris Athanasiou 49edf9e5b5
[7.x][ML] Remove timeout on waiting for DF analytics result processor to complete (#45724) (#45733)
We cannot know how long the analysis will take to complete thus we should not have
a timeout. Note that if the process crashes, the result processor will pick the
exception due to the stream closing.

Closes #45723
2019-08-20 17:21:40 +03:00
Przemysław Witek b37ebd1adf
Prepare the codebase for new Auditor subclasses (#45716) (#45731) 2019-08-20 16:03:50 +02:00
Przemysław Witek 80dd0a0948
Get rid of EstimateMemoryUsageRequest and EstimateMemoryUsageAction.Request. (#45718) (#45725) 2019-08-20 15:49:17 +02:00
Przemysław Witek 7bc8400222
Call the new _estimate_memory_usage API endpoint on df analytics _start (#45536) (#45701) 2019-08-19 21:37:55 +02:00
Igor Motov 98c850c08b
Geo: Change order of parameter in Geometries to lon, lat 7.x (#45618)
Changes the order of parameters in Geometries from lat, lon to lon, lat
and moves all Geometry classes are moved to the
org.elasticsearch.geomtery package.

Backport of #45332

Closes #45048
2019-08-16 14:42:02 -04:00
Przemysław Witek df574e5168
[7.x] Implement ml/data_frame/analytics/_estimate_memory_usage API endpoint (#45188) (#45510) 2019-08-14 08:26:03 +02:00
Armin Braun 90803a5caf
Reenable Integ Tests in native-multi-node-tests (#45482) (#45496)
* Reenable Integ Tests in native-multi-node-tests

* The tests broken here were likely fixed by #45463 => let's reenable them and see if things run fine again
* Relates #45405, #45455
2019-08-13 15:55:54 +02:00
Przemysław Witek 1aed388a24
Add view_index_metadata to roles.yml and remove as many df analytics test cases from build.gradle blacklist as possible. (#45451) (#45465) 2019-08-13 08:31:58 +02:00
Mark Vieira 7e3379444b
Fix build failure due to unknown task and disable test conventions
(cherry picked from commit 8ed84bc5cef9bcfae6c817059f764d97e4451a4a)
2019-08-12 09:18:39 -07:00
Przemyslaw Gomulka 421e9b8e8b
Mute integ tests in native-multi-node-tests (#45457)
Tracked at #45405
2019-08-12 17:42:24 +02:00
Przemyslaw Gomulka d11ae08467
Muting ForecastIT.testOverflowToDisk (#45435) (#45438)
awaits #45405
2019-08-12 11:01:32 +02:00
Dimitris Athanasiou d02d6e40c2 [ML] Mute regression integ test
Relates #45425
2019-08-12 10:59:24 +03:00
Armin Braun a9e1402189
Remove Settings from BaseRestRequest Constructor (#45418) (#45429)
* Resolving the todo, cleaning up the unused `settings` parameter
* Cleaning up some other minor dead code in affected classes
2019-08-12 05:14:45 +02:00
Dimitris Athanasiou 27497ff75f
[7.x][ML] Add regression analysis to DF analytics (#45292) (#45388)
This commit adds a first draft of a regression analysis
to data frame analytics. There is high probability that
the exact syntax might change.

This commit adds the new analysis type and its parameters as
well as appropriate validation. It also modifies the extractor
and the fields detector to be able to handle categorical fields
as regression analysis supports them.
2019-08-09 19:31:13 +03:00
Jason Tedor 9a142ff25c
Introduce formal node ML role (#45174)
This commit builds on the ability for plugins to introduce new roles to
add a formal node ML role.
2019-08-06 13:00:05 -04:00
David Roberts a1f0285f0e [TEST] Only test US locale in day/month order test in FIPS JVM (#45141)
In the FIPS JVM the JVM default locale seems to leak into places
where it should be overridden. This change skips assertions
in TimestampFormatFinderTests.testGuessIsDayFirstFromLocale
that may be impacted.

Fixes #45140
2019-08-02 15:04:47 +01:00
David Roberts f617585dbd [ML] Improve CSV header row detection in find_file_structure (#45099)
When doing a fieldwise Levenshtein distance comparison
between CSV rows, this change ignores all fields that
have long values, not just the longest field.

This approach works better for CSV formats that have
multiple freeform text fields rather than just a single
"message" field.

Fixes #45047
2019-08-02 09:08:21 +01:00
Dimitris Athanasiou 8a6675b994
[7.x][ML] Check dest index is empty when starting DF analytics (#45094) (#45112)
If one tries to start a DF analytics job that has already run,
the result will be that the task will fail after reindexing the
dest index from the source index. The results of the prior run
will be gone and the task state is not properly set to failed
with the failure reason.

This commit improves the behavior in this scenario. First, we
set the task state to `failed` in a set of failures that were
missed. Second, a validation is added that if the destination
index exists, it must be empty.
2019-08-02 00:19:48 +03:00
Przemysław Witek 6c87845fc1
Persist DatafeedTimingStats with RefreshPolicy.NONE by default (#44940) (#45079) 2019-08-01 14:36:59 +02:00
Dimitris Athanasiou aef419c0b0
[7.x][ML] Catch any error thrown while closing data frame analytics process (#44958) (#44968)
In case closing the process throws an exception we should be catching
it no matter its type. The process may have terminated because of a
fatal error in which case closing the process will throw a server
error, not an `IOException`. If this happens we fail to mark the
persistent task as failed and the task gets in limbo.
2019-07-29 21:59:10 +03:00
Benjamin Trent 3b514f0dae
[ML] update Instant serialization (#44765) (#44954)
* [ML] update Instant serialization

* addressing PR comments

* removing unused import
2019-07-29 13:06:56 -05:00
Dimitris Athanasiou 9dd527328a
[ML] Outlier detection should only fetch docs that have the analyzed … (#44944) (#44959)
As data frame rows with missing values for analyzed fields are skipped,
we can be more efficient by including a query that only picks documents
that have values for all analyzed fields. Besides improving the number
of documents we go through, we also provide a more accurate measurement
of how many rows we need which reduces the memory requirements.

This also adds an integration test that runs outlier detection on data
with missing fields.
2019-07-29 18:23:56 +03:00
Luca Cavanna a3cc32da64 TaskListener#onFailure to accept Exception instead of Throwable (#44946)
TaskListener accepts today Throwable in its onFailure method. Though
looking at where it is called (TransportAction), it can never be
notified of a Throwable.

This commit changes the signature of TaskListener#onFailure so that it
accepts an `Exception` rather than a `Throwable` as second argument.
2019-07-29 16:47:19 +02:00
David Kyle d05f12dadb [ML] Close any opened pipes if there is an error connecting to the process (#44869) 2019-07-29 10:48:31 +01:00
Przemysław Witek 79121ea127
[7.x] Implement exponential average search time per hour statistics. (#44683) (#44897) 2019-07-26 15:56:34 +02:00
Przemysław Witek 8bb8543fdf
Treat PostDataActionResponse.DataCounts.bucketCount as incremental rather than absolute (total). (#44803) (#44856) 2019-07-25 20:46:56 +02:00
Przemysław Witek 53f409e5ae
Add result_type field to TimingStats and DatafeedTimingStats documents (#44812) (#44841) 2019-07-25 10:11:55 +02:00
Andrei Stefan 2633d11eb7
Switch from using docvalue_fields to extracting values from _source (#44062) (#44804)
* Switch from using docvalue_fields to extracting values from _source
where applicable. Doing this means parsing the _source and handling the
numbers parsing just like Elasticsearch is doing it when it's indexing
a document.
* This also introduces a minor limitation: aliases type of fields that
are NOT part of a tree of sub-fields will not be able to be retrieved
anymore. field_caps API doesn't shed any light into a field being an
alias or not and at _source parsing time there is no way to know if a
root field is an alias or not. Fields of the type "a.b.c.alias" can be
extracted from docvalue_fields, only if the field they point to can be
extracted from docvalue_fields. Also, not all fields in a hierarchy of
fields can be evaluated to being an alias.

(cherry picked from commit 8bf8a055e38f00df5f49c8d97f632f69d6e00c2c)
2019-07-25 10:02:41 +03:00
Alpar Torok b34ac66d96
Mute multiple tests on Windows (7.x) (#44676)
* Mute failing test

tracked in #44552

* mute EvilSecurityTests

tracking in #44558

* Fix line endings in ESJsonLayoutTests

* Mute failing ForecastIT  test on windows

Tracking in #44609

* mute BasicRenormalizationIT.testDefaultRenormalization

tracked in #44613

* fix mute testDefaultRenormalization

* Increase busyWait timeout windows is slow

* Mute failure unconfigured node name

* mute x-pack internal cluster test windows

tracking #44610

* Mute JvmErgonomicsTests on windows

Tracking #44669

* mute SharedClusterSnapshotRestoreIT testParallelRestoreOperationsFromSingleSnapshot

Tracking #44671

* Mute NodeTests on Windows

Tracking #44256
2019-07-22 11:32:29 +03:00
David Kyle 0fc091f166
Enable XLint warnings for ML (#44346)
Removes the warning suppression -Xlint:-deprecation,-rawtypes,-serial,-try,-unchecked.
Many warnings were unchecked warnings in the test code often because of the use of mocks.
These are suppressed with @SuppressWarning
2019-07-18 09:33:37 +01:00
Tal Levy a5ad59451c
migrate more ML actions off of using Request suppliers (#44462) (#44529)
many classes still use the Streamable constructors of HandledTransportAction,
this commit moves more of those classes to the new Writeable constructors.

relates #34389.
2019-07-17 20:28:29 -07:00
Ryan Ernst 0755a13c9f
Convert AcknowledgedRequest to Writeable.Reader (#44412) (#44454)
This commit adds constructors to AcknolwedgedRequest subclasses to
implement Writeable.Reader, and ensures all future subclasses implement
the same.

relates #34389
2019-07-17 11:17:36 -07:00
Tal Levy 901310a826
[7.x] Migrate ML Actions to use writeable ActionType (#44302) (#44391)
* Migrate ML Actions to use writeable ActionType (#44302)

This commit converts all the StreamableResponseActionType
actions in the ML core module to be ActionType and leverage
the Writeable infrastructure.
2019-07-16 12:41:10 -07:00
Przemysław Witek 34bf6bcec0
Treat big changes in searchCount as significant and persist the document after such changes (#44413) (#44424) 2019-07-16 16:15:32 +02:00
Lee Hinman fb0461ac76
[7.x] Add Snapshot Lifecycle Management (#44382)
* Add Snapshot Lifecycle Management (#43934)

* Add SnapshotLifecycleService and related CRUD APIs

This commit adds `SnapshotLifecycleService` as a new service under the ilm
plugin. This service handles snapshot lifecycle policies by scheduling based on
the policies defined schedule.

This also includes the get, put, and delete APIs for these policies

Relates to #38461

* Make scheduledJobIds return an immutable set

* Use Object.equals for SnapshotLifecyclePolicy

* Remove unneeded TODO

* Implement ToXContentFragment on SnapshotLifecyclePolicyItem

* Copy contents of the scheduledJobIds

* Handle snapshot lifecycle policy updates and deletions (#40062)

(Note this is a PR against the `snapshot-lifecycle-management` feature branch)

This adds logic to `SnapshotLifecycleService` to handle updates and deletes for
snapshot policies. Policies with incremented versions have the old policy
cancelled and the new one scheduled. Deleted policies have their schedules
cancelled when they are no longer present in the cluster state metadata.

Relates to #38461

* Take a snapshot for the policy when the SLM policy is triggered (#40383)

(This is a PR for the `snapshot-lifecycle-management` branch)

This commit fills in `SnapshotLifecycleTask` to actually perform the
snapshotting when the policy is triggered. Currently there is no handling of the
results (other than logging) as that will be added in subsequent work.

This also adds unit tests and an integration test that schedules a policy and
ensures that a snapshot is correctly taken.

Relates to #38461

* Record most recent snapshot policy success/failure (#40619)

Keeping a record of the results of the successes and failures will aid
troubleshooting of policies and make users more confident that their
snapshots are being taken as expected.

This is the first step toward writing history in a more permanent
fashion.

* Validate snapshot lifecycle policies (#40654)

(This is a PR against the `snapshot-lifecycle-management` branch)

With the commit, we now validate the content of snapshot lifecycle policies when
the policy is being created or updated. This checks for the validity of the id,
name, schedule, and repository. Additionally, cluster state is checked to ensure
that the repository exists prior to the lifecycle being added to the cluster
state.

Part of #38461

* Hook SLM into ILM's start and stop APIs (#40871)

(This pull request is for the `snapshot-lifecycle-management` branch)

This change allows the existing `/_ilm/stop` and `/_ilm/start` APIs to also
manage snapshot lifecycle scheduling. When ILM is stopped all scheduled jobs are
cancelled.

Relates to #38461

* Add tests for SnapshotLifecyclePolicyItem (#40912)

Adds serialization tests for SnapshotLifecyclePolicyItem.

* Fix improper import in build.gradle after master merge

* Add human readable version of modified date for snapshot lifecycle policy (#41035)

* Add human readable version of modified date for snapshot lifecycle policy

This small change changes it from:

```
...
"modified_date": 1554843903242,
...
```

To

```
...
"modified_date" : "2019-04-09T21:05:03.242Z",
"modified_date_millis" : 1554843903242,
...
```

Including the `"modified_date"` field when the `?human` field is used.

Relates to #38461

* Fix test

* Add API to execute SLM policy on demand (#41038)

This commit adds the ability to perform a snapshot on demand for a policy. This
can be useful to take a snapshot immediately prior to performing some sort of
maintenance.

```json
PUT /_ilm/snapshot/<policy>/_execute
```

And it returns the response with the generated snapshot name:

```json
{
  "snapshot_name" : "production-snap-2019.04.09-rfyv3j9qreixkdbnfuw0ug"
}
```

Note that this does not allow waiting for the snapshot, and the snapshot could
still fail. It *does* record this information into the cluster state similar to
a regularly trigged SLM job.

Relates to #38461

* Add next_execution to SLM policy metadata (#41221)

* Add next_execution to SLM policy metadata

This adds the next time a snapshot lifecycle policy will be executed when
retriving a policy's metadata, for example:

```json
GET /_ilm/snapshot?human
{
  "production" : {
    "version" : 1,
    "modified_date" : "2019-04-15T21:16:21.865Z",
    "modified_date_millis" : 1555362981865,
    "policy" : {
      "name" : "<production-snap-{now/d}>",
      "schedule" : "*/30 * * * * ?",
      "repository" : "repo",
      "config" : {
        "indices" : [
          "foo-*",
          "important"
        ],
        "ignore_unavailable" : true,
        "include_global_state" : false
      }
    },
    "next_execution" : "2019-04-15T21:16:30.000Z",
    "next_execution_millis" : 1555362990000
  },
  "other" : {
    "version" : 1,
    "modified_date" : "2019-04-15T21:12:19.959Z",
    "modified_date_millis" : 1555362739959,
    "policy" : {
      "name" : "<other-snap-{now/d}>",
      "schedule" : "0 30 2 * * ?",
      "repository" : "repo",
      "config" : {
        "indices" : [
          "other"
        ],
        "ignore_unavailable" : false,
        "include_global_state" : true
      }
    },
    "next_execution" : "2019-04-16T02:30:00.000Z",
    "next_execution_millis" : 1555381800000
  }
}
```

Relates to #38461

* Fix and enhance tests

* Figured out how to Cron

* Change SLM endpoint from /_ilm/* to /_slm/* (#41320)

This commit changes the endpoint for snapshot lifecycle management from:

```
GET /_ilm/snapshot/<policy>
```

to:

```
GET /_slm/policy/<policy>
```

It mimics the ILM path only using `slm` instead of `ilm`.

Relates to #38461

* Add initial documentation for SLM (#41510)

* Add initial documentation for SLM

This adds the initial documentation for snapshot lifecycle management.

It also includes the REST spec API json files since they're sort of
documentation.

Relates to #38461

* Add `manage_slm` and `read_slm` roles (#41607)

* Add `manage_slm` and `read_slm` roles

This adds two more built in roles -

`manage_slm` which has permission to perform any of the SLM actions, as well as
stopping, starting, and retrieving the operation status of ILM.

`read_slm` which has permission to retrieve snapshot lifecycle policies as well
as retrieving the operation status of ILM.

Relates to #38461

* Add execute to the test

* Fix ilm -> slm typo in test

* Record SLM history into an index (#41707)

It is useful to have a record of the actions that Snapshot Lifecycle
Management takes, especially for the purposes of alerting when a
snapshot fails or has not been taken successfully for a certain amount of
time.

This adds the infrastructure to record SLM actions into an index that
can be queried at leisure, along with a lifecycle policy so that this
history does not grow without bound.

Additionally,
SLM automatically setting up an index + lifecycle policy leads to
`index_lifecycle` custom metadata in the cluster state, which some of
the ML tests don't know how to deal with due to setting up custom
`NamedXContentRegistry`s.  Watcher would cause the same problem, but it
is already disabled (for the same reason).

* High Level Rest Client support for SLM (#41767)

* High Level Rest Client support for SLM

This commit add HLRC support for SLM.

Relates to #38461

* Fill out documentation tests with tags

* Add more callouts and asciidoc for HLRC

* Update javadoc links to real locations

* Add security test testing SLM cluster privileges (#42678)

* Add security test testing SLM cluster privileges

This adds a test to `PermissionsIT` that uses the `manage_slm` and `read_slm`
cluster privileges.

Relates to #38461

* Don't redefine vars

*  Add Getting Started Guide for SLM  (#42878)

This commit adds a basic Getting Started Guide for SLM.

* Include SLM policy name in Snapshot metadata (#43132)

Keep track of which SLM policy in the metadata field of the Snapshots
taken by SLM. This allows users to more easily understand where the
snapshot came from, and will enable future SLM features such as
retention policies.

* Fix compilation after master merge

* [TEST] Move exception wrapping for devious exception throwing

Fixes an issue where an exception was created from one line and thrown in another.

* Fix SLM for the change to AcknowledgedResponse

* Add Snapshot Lifecycle Management Package Docs (#43535)

* Fix compilation for transport actions now that task is required

* Add a note mentioning the privileges needed for SLM (#43708)

* Add a note mentioning the privileges needed for SLM

This adds a note to the top of the "getting started with SLM"
documentation mentioning that there are two built-in privileges to
assist with creating roles for SLM users and administrators.

Relates to #38461

* Mention that you can create snapshots for indices you can't read

* Fix REST tests for new number of cluster privileges

* Mute testThatNonExistingTemplatesAreAddedImmediately (#43951)

* Fix SnapshotHistoryStoreTests after merge

* Remove overridden newResponse functions that have been removed

* Fix compilation for backport

* Fix get snapshot output parsing in test

* [DOCS] Add redirects for removed autogen anchors (#44380)

* Switch <tt>...</tt> in javadocs for {@code ...}
2019-07-16 07:37:13 -06:00
Przemysław Witek 3f3a3d3f2b
[7.x] Add DatafeedTimingStats.average_search_time_per_bucket_ms and TimingStats.total_bucket_processing_time_ms stats (#44125) (#44404) 2019-07-16 12:51:29 +02:00
Ryan Ernst 7e06888bae
Convert testclusters to use distro download plugin (#44253) (#44362)
Test clusters currently has its own set of logic for dealing with
finding different versions of Elasticsearch, downloading them, and
extracting them. This commit converts testclusters to use the
DistributionDownloadPlugin.
2019-07-15 17:53:05 -07:00
Ryan Ernst 59658daef9
Separate streamable based master node actions (#44313)
This commit creates new base classes for master node actions whose
response types still implement Streamable. This simplifies both finding
remaining classes to convert, as well as creating new master node
actions that use Writeable for their responses.

relates #34389
2019-07-15 09:20:20 -07:00
Armin Braun d73e2f9c56
HLRC: Fix '+' Not Correctly Encoded in GET Req. (#33164) (#44324)
* HLRC: Fix '+' Not Correctly Encoded in GET Req.

* Encode `+` correctly as `%2B` in URL paths
* Keep encoding `+` as space in URL parameters
* Closes #33077
2019-07-15 10:21:54 +02:00
Ryan Ernst 1dcf53465c Reorder HandledTransportAction ctor args (#44291)
This commit moves the Supplier variant of HandledTransportAction to have
a different ordering than the Writeable.Reader variant. The Supplier
version is used for the legacy Streamable, and currently having the
location of the Writeable.Reader vs Supplier in the same place forces
using casts of Writeable.Reader to select the correct super constructor.
This change in ordering allows easier migration to Writeable.Reader.

relates #34389
2019-07-12 13:45:09 -07:00
Przemysław Witek dd5f4ae00e
Update .ml-config mappings before indexing job, datafeed or df analytics config (#44216) (#44273) 2019-07-12 16:49:48 +02:00
Przemysław Witek 40d3c60d7a
Make testDatafeedTimingStats_DatafeedJobIdUpdated test easier to debug (#44206) (#44268) 2019-07-12 13:52:26 +02:00