Commit Graph

1246 Commits

Author SHA1 Message Date
markharwood 217e41ab6c
Search - added HLRC support for PinnedQueryBuilder (#45779) (#45853)
Added HLRC support for PinnedQueryBuilder

Related  #44074
2019-08-23 09:22:17 +01:00
Tim Vernum f94e4a9151
Set security index refresh interval to 1s (#45888)
The security indices were being created without specifying the
refresh interval, which means it would inherit a value from any
templates that exists.
However, certain security functionality depends on being able to
wait_for refresh, and causes errors (e.g. in Kibana) if that time
exceeds 30s.

This commit changes the security indices configuration to always be
created with a 1s refresh interval. This prevents any templates from
inadvertantly interfering with the proper functioning of security.
It is possible for an administrator to explicitly change the refresh
interval after the indices have been created.

Backport of: #45434
2019-08-23 12:41:37 +10:00
Tim Vernum 029725fc35
Add SSL/TLS settings for watcher email (#45836)
This change adds a new SSL context

    xpack.notification.email.ssl.*

that supports the standard SSL configuration settings (truststore,
verification_mode, etc). This SSL context is used when configuring
outbound SMTP properties for watcher email notifications.

Backport of: #45272
2019-08-23 10:13:51 +10:00
Benjamin Trent 8e3c54fff7
[7.x] [ML] Adding data frame analytics stats to _usage API (#45820) (#45872)
* [ML] Adding data frame analytics stats to _usage API (#45820)

* [ML] Adding data frame analytics stats to _usage API

* making the size of analytics stats 10k

* adjusting backport
2019-08-22 15:15:41 -05:00
Benjamin Trent e50a78cf50
[ML-DataFrame] version data frame transform internal index (#45375) (#45837)
Adds index versioning for the internal data frame transform index. Allows for new indices to be created and referenced, `GET` requests now query over the index pattern and takes the latest doc (based on INDEX name).
2019-08-22 11:46:30 -05:00
Przemysław Witek 7512337922
[7.x] Allow the user to specify 'query' in Evaluate Data Frame request (#45775) (#45825) 2019-08-22 11:14:26 +02:00
Gordon Brown 47b1e2b3d0
[7.x] Use rollover for SLM's history indices (#45686)
Following our own guidelines, SLM should use rollover instead of purely
time-based indices to keep shard counts low. This commit implements lazy
index creation for SLM's history indices, indexing via an alias, and
rollover in the built-in ILM policy.
2019-08-21 13:42:11 -06:00
Przemysław Witek bf701b83d2
Shorten field names in EstimateMemoryUsageResponse (#45719) (#45772) 2019-08-21 12:45:09 +02:00
Dimitris Athanasiou d5c3d9b50f
[7.x][ML] Do not skip rows with missing values for regression (#45751) (#45754)
Regression analysis support missing fields. Even more, it is expected
that the dependent variable has missing fields to the part of the
data frame that is not for training.

This commit allows to declare that an analysis supports missing values.
For such analysis, rows with missing values are not skipped. Instead,
they are written as normal with empty strings used for the missing values.

This also contains a fix to the integration test.

Closes #45425
2019-08-21 08:15:38 +03:00
Benjamin Trent ba7b677618
[ML] better handle empty results when evaluating regression (#45745) (#45759)
* [ML] better handle empty results when evaluating regression

* adding new failure test to ml_security black list

* fixing equality check for regression results
2019-08-20 17:37:04 -05:00
Przemysław Witek b37ebd1adf
Prepare the codebase for new Auditor subclasses (#45716) (#45731) 2019-08-20 16:03:50 +02:00
Przemysław Witek 80dd0a0948
Get rid of EstimateMemoryUsageRequest and EstimateMemoryUsageAction.Request. (#45718) (#45725) 2019-08-20 15:49:17 +02:00
Benjamin Trent 88641a08af
[ML][Data frame] fixing failure state transitions and race condition (#45627) (#45656)
* [ML][Data frame] fixing failure state transitions and race condition (#45627)

There is a small window for a race condition while we are flagging a task as failed.

Here are the steps where the race condition occurs:
1. A failure occurs
2. Before `AsyncTwoPhaseIndexer` calls the `onFailure` handler it does the following:
   a. `finishAndSetState()` which sets the IndexerState to STARTED
   b. `doSaveState(...)` which attempts to save the current state of the indexer
3. Another trigger is fired BEFORE `onFailure` can fire, but AFTER `finishAndSetState()` occurs.

The trick here is that we will eventually set the indexer to failed, but possibly not before another trigger had the opportunity to fire. This could obviously cause some weird state interactions. To combat this, I have put in some predicates to verify the state before taking actions. This is so if state is indeed marked failed, the "second trigger" stops ASAP.

Additionally, I move the task state checks INTO the `start` and `stop` methods, which will now require a `force` parameter. `start`, `stop`, `trigger` and `markAsFailed` are all `synchronized`. This should gives us some guarantees that one will not switch states out from underneath another.

I also flag the task as `failed` BEFORE we successfully write it to cluster state, this is to allow us to make the task fail more quickly. But, this does add the behavior where the task is "failed" but the cluster state does not indicate as much. Adding the checks in `start` and `stop` will handle this "real state vs cluster state" race condition. This has always been a problem for `_stop` as it is not a master node action and doesn’t always have the latest cluster state.

closes #45609

Relates to #45562

* [ML][Data Frame] moves failure state transition for MT safety (#45676)

* [ML][Data Frame] moves failure state transition for MT safety

* removing unused imports
2019-08-20 07:30:17 -05:00
Benjamin Trent fde5dae387
[ML][Data Frame] adjusting change detection workflow (#45511) (#45580)
* [ML][Data Frame] adjusting change detection workflow

* adjusting for PR comment

* disallowing null as an argument value
2019-08-14 17:26:24 -05:00
Nick Knize 647a8308c3
[SPATIAL] Backport new ShapeFieldMapper and ShapeQueryBuilder to 7x (#45363)
* Introduce Spatial Plugin (#44389)

Introduce a skeleton Spatial plugin that holds new licensed features coming to 
Geo/Spatial land!

* [GEO] Refactor DeprecatedParameters in AbstractGeometryFieldMapper (#44923)

Refactor DeprecatedParameters specific to legacy geo_shape out of
AbstractGeometryFieldMapper.TypeParser#parse.

* [SPATIAL] New ShapeFieldMapper for indexing cartesian geometries (#44980)

Add a new ShapeFieldMapper to the xpack spatial module for
indexing arbitrary cartesian geometries using a new field type called shape.
The indexing approach leverages lucene's new XYShape field type which is
backed by BKD in the same manner as LatLonShape but without the WGS84
latitude longitude restrictions. The new field mapper builds on and
extends the refactoring effort in AbstractGeometryFieldMapper and accepts
shapes in either GeoJSON or WKT format (both of which support non geospatial
geometries).

Tests are provided in the ShapeFieldMapperTest class in the same manner
as GeoShapeFieldMapperTests and LegacyGeoShapeFieldMapperTests.
Documentation for how to use the new field type and what parameters are
accepted is included. The QueryBuilder for searching indexed shapes is
provided in a separate commit.

* [SPATIAL] New ShapeQueryBuilder for querying indexed cartesian geometry (#45108)

Add a new ShapeQueryBuilder to the xpack spatial module for
querying arbitrary Cartesian geometries indexed using the new shape field
type.

The query builder extends AbstractGeometryQueryBuilder and leverages the
ShapeQueryProcessor added in the previous field mapper commit.

Tests are provided in ShapeQueryTests in the same manner as
GeoShapeQueryTests and docs are updated to explain how the query works.
2019-08-14 16:35:10 -05:00
Benjamin Trent 0c343d8443
[7.x] [ML][Transforms] adjusting stats.progress for cont. transforms (#45361) (#45551)
* [ML][Transforms] adjusting stats.progress for cont. transforms (#45361)

* [ML][Transforms] adjusting stats.progress for cont. transforms

* addressing PR comments

* rename fix

* Adjusting bwc serialization versions
2019-08-14 13:08:27 -05:00
Przemysław Witek df574e5168
[7.x] Implement ml/data_frame/analytics/_estimate_memory_usage API endpoint (#45188) (#45510) 2019-08-14 08:26:03 +02:00
Yogesh Gaikwad 471d940c44
Refactor cluster privileges and cluster permission (#45265) (#45442)
The current implementations make it difficult for
adding new privileges (example: a cluster privilege which is
more than cluster action-based and not exposed to the security
administrator). On the high level, we would like our cluster privilege
either:
- a named cluster privilege
  This corresponds to `cluster` field from the role descriptor
- or a configurable cluster privilege
  This corresponds to the `global` field from the role-descriptor and
allows a security administrator to configure them.

Some of the responsibilities like the merging of action based cluster privileges
are now pushed at cluster permission level. How to implement the predicate
(using Automaton) is being now enforced by cluster permission.

`ClusterPermission` helps in enforcing the cluster level access either by
performing checks against cluster action and optionally against a request.
It is a collection of one or more permission checks where if any of the checks
allow access then the permission allows access to a cluster action.

Implementations of cluster privilege must be able to provide information
regarding the predicates to the cluster permission so that can be enforced.
This is enforced by making implementations of cluster privilege aware of
cluster permission builder and provide a way to specify how the permission is
to be built for a given privilege.

This commit renames `ConditionalClusterPrivilege` to `ConfigurableClusterPrivilege`.
`ConfigurableClusterPrivilege` is a renderable cluster privilege exposed
as a `global` field in role descriptor.

Other than this there is a requirement where we would want to know if a cluster
permission is implied by another cluster-permission (`has-privileges`).
This is helpful in addressing queries related to privileges for a user.
This is not just simply checking of cluster permissions since we do not
have access to runtime information (like request object).
This refactoring does not try to address those scenarios.

Relates #44048
2019-08-13 09:06:18 +10:00
Armin Braun a9e1402189
Remove Settings from BaseRestRequest Constructor (#45418) (#45429)
* Resolving the todo, cleaning up the unused `settings` parameter
* Cleaning up some other minor dead code in affected classes
2019-08-12 05:14:45 +02:00
Benjamin Trent fac1a6f8e8
[ML][Data Frame] have DataFrameTransformConfigUpdate#apply set Version (#45391) (#45400) 2019-08-09 14:32:49 -05:00
Hendrik Muhs bf4da6c6ad
[ML-DataFrame] fix starting a batch data frame after stopping at runtime (#45340) (#45381)
fix loading of next checkpoint after data frame transform has been stopped/started within one run

closes #45339
2019-08-09 20:30:11 +02:00
Dimitris Athanasiou 27497ff75f
[7.x][ML] Add regression analysis to DF analytics (#45292) (#45388)
This commit adds a first draft of a regression analysis
to data frame analytics. There is high probability that
the exact syntax might change.

This commit adds the new analysis type and its parameters as
well as appropriate validation. It also modifies the extractor
and the fields detector to be able to handle categorical fields
as regression analysis supports them.
2019-08-09 19:31:13 +03:00
David Roberts 14545f8958
[ML-DataFrame] Combine task_state and indexer_state in _stats (#45324)
This commit replaces task_state and indexer_state in the
data frame _stats output with a single top level state
that combines the two. It is defined as:

- failed if what's currently reported as task_state is failed
- stopped if there is no persistent task
- Otherwise what's currently reported as indexer_state

Backport of #45276
2019-08-08 16:24:26 +01:00
Benjamin Trent 5db9982f71
[7.x] [ML][Data Frame] Add update transform api endpoint (#45154) (#45279)
* [ML][Data Frame] Add update transform api endpoint (#45154)

This adds the ability to `_update` stored data frame transforms. All mutable fields are applied when the next checkpoint starts. The exception being `description`.

This PR contains all that is necessary for this addition:
* HLRC
* Docs
* Server side
2019-08-07 10:37:35 -05:00
Lee Hinman c7ec0b8431 Include in-progress snapshot for a policy with get SLM policy… (#45245)
This commit adds the "in_progress" key to the SLM get policy API,
returning a policy that looks like:

```json
{
  "daily-snapshots" : {
    "version" : 1,
    "modified_date" : "2019-08-05T18:41:48.778Z",
    "modified_date_millis" : 1565030508778,
    "policy" : {
      "name" : "<production-snap-{now/d}>",
      "schedule" : "0 30 1 * * ?",
      "repository" : "repo",
      "config" : {
        "indices" : [
          "foo-*",
          "important"
        ],
        "ignore_unavailable" : true,
        "include_global_state" : false
      },
      "retention" : {
        "expire_after" : "10m"
      }
    },
    "last_success" : {
      "snapshot_name" : "production-snap-2019.08.05-oxctmnobqye3luim4uejhg",
      "time_string" : "2019-08-05T18:42:23.257Z",
      "time" : 1565030543257
    },
    "next_execution" : "2019-08-06T01:30:00.000Z",
    "next_execution_millis" : 1565055000000,
    "in_progress" : {
      "name" : "production-snap-2019.08.05-oxctmnobqye3luim4uejhg",
      "uuid" : "t8Idqt6JQxiZrzp0Vt7z6g",
      "state" : "STARTED",
      "start_time" : "2019-08-05T18:42:22.998Z",
      "start_time_millis" : 1565030542998
    }
  }
}
```

These are only visible while the snapshot is being taken (or failed),
since it reads from the cluster state rather than from the repository
itself.
2019-08-07 08:29:49 -06:00
Tom Callahan a7a419bee8 Change Ldap SDK License to LGPL-2.1 (#45116)
We currently use the unboundid ldap SDK, which is triply licensed under
GPL-2.0, LGPL-2.1, and the "UnboundID LDAP SDK Free Use License". We currently
identify the license as the latter, but LGPL-2.1 is the one we should be using
per our policy.
2019-08-06 16:48:09 -04:00
Zachary Tong 422aca9a5d Fix Rollup job creation to work with templates (#43943)
The PutJob API accidentally used an "expert" API of CreateIndexRequest.
That API is semi-lenient to syntax; a type could be omitted and the
request would work as expected.  But if a type was omitted it would
not merge with templates correctly, leading to index creation that
only has the template and not the requested mappings in the request.

This commit refactors the PutJob API to:

- Include the type name
- Use a less "expert" API in an attempt to future proof against errors
- Uses an XContentBuilder instead of string replacing, removes json template
2019-08-06 10:53:44 -04:00
Christoph Büscher 3366726ad1 Enable reloading of synonym_graph filters (#45135)
Reloading of synonym_graph filter doesn't work currently because the search time
AnalysisMode doesn't get propagated to the TokenFilterFactory emitted by the
graph filters getChainAwareTokenFilterFactory() method. This change fixes that.

Closes #45127
2019-08-02 15:33:42 +02:00
Tim Vernum e21d58541a
Improve errors when TLS files cannot be read (#45122)
This change improves the exception messages that are thrown when the
system cannot read TLS resources such as keystores, truststores,
certificates, keys or certificate-chains (CAs).

This change specifically handles:

- Files that do not exist
- Files that cannot be read due to file-system permissions
- Files that cannot be read due to the ES security-manager

Backport of: #44787
2019-08-02 12:29:43 +10:00
Tim Vernum 590777150f
Explicitly fail if a realm only exists in keystore (#45091)
There are no realms that can be configured exclusively with secure
settings. Every realm that supports secure settings also requires one
or more non-secure settings.
However, sometimes a node will be configured with entries in the
keystore for which there is nothing in elasticsearch.yml - this may be
because the realm we removed from the yml, but not deleted from the
keystore, or it could be because there was a typo in the realm name
which has accidentially orphaned the keystore entry.

In these cases the realm building would fail, but the error would not
always be clear or point to the root cause (orphaned keystore
entries). RealmSettings would act as though the realm existed, but
then fail because an incorrect combination of settings was provided.

This change causes realm building to fail early, with an explicit
message about incorrect keystore entries.

Backport of: #44471
2019-08-02 12:28:59 +10:00
Mayya Sharipova 0c68765088
Adds usage stats for vectors (#45023)
Example of usage:

_xpack/usage

"vectors": {
    "available": true,
    "enabled": true,
    "dense_vector_fields_count" : 1,
    "sparse_vector_fields_count" : 1,
    "dense_vector_dims_avg_count" : 100
}
Backport for #44512
2019-07-31 12:32:41 -04:00
Lee Hinman 598c4e72f9
[7.x] Rename indexlifecycle to ilm and snapshotlifecycle to sl… (#44977)
* Rename indexlifecycle to ilm and snapshotlifecycle to slm (#44917)

As a followup to #44725 and #44608, which renamed the packages within
the x-pack project, this renames the packages within the core x-pack
project. It also renames 'snapshotlifecycle' within the HLRC to slm.

* Fix one more import
2019-07-29 15:51:14 -06:00
Benjamin Trent 3b514f0dae
[ML] update Instant serialization (#44765) (#44954)
* [ML] update Instant serialization

* addressing PR comments

* removing unused import
2019-07-29 13:06:56 -05:00
Gordon Brown d4b2d21339
Add option to filter ILM explain response (#44777)
In order to make it easier to interpret the output of the ILM Explain
API, this commit adds two request parameters to that API:

- `only_managed`, which causes the response to only contain indices
  which have `index.lifecycle.name` set
- `only_errors`, which causes the response to contain only indices in an
  ILM error state

"Error state" is defined as either being in the `ERROR` step or having
`index.lifecycle.name` set to a policy that does not exist.
2019-07-26 11:57:38 -04:00
Przemysław Witek 79121ea127
[7.x] Implement exponential average search time per hour statistics. (#44683) (#44897) 2019-07-26 15:56:34 +02:00
Przemysław Witek 8bb8543fdf
Treat PostDataActionResponse.DataCounts.bucketCount as incremental rather than absolute (total). (#44803) (#44856) 2019-07-25 20:46:56 +02:00
Hendrik Muhs 2ca6306452 do not assert on indexer state (#44854)
remove the unreliable check for the state change

fixes #44813
2019-07-25 16:39:24 +02:00
David Roberts b2e969f4ba
[ML-DataFrame] Remove ID field from data frame indexer stats (#44848)
This is a followup to #44350. The indexer stats used to
be persisted standalone, but now are only persisted as
part of a state-and-stats document. During the review
of #44350 it was decided that we'll stick with this
design, so there will never be a need for an indexer
stats object to store its transform ID as it is stored
on the enclosing document. This PR removes the indexer
stats document ID.

Backport of #44768
2019-07-25 15:19:32 +01:00
Przemysław Witek 53f409e5ae
Add result_type field to TimingStats and DatafeedTimingStats documents (#44812) (#44841) 2019-07-25 10:11:55 +02:00
Gordon Brown 2ac54da60a
Fix swapped variables in error message (#44300)
The alias name and index were in the incorrect order in this error
message. This commit corrects the order.
2019-07-24 11:38:15 -04:00
Przemysław Witek 26da573e94
[ML] [7.x] Only emit deprecation warning if there was actual change of a datafeed's job_id. (#44755)
* Only emit deprecation warning if there was actual change of a datafeed's job_id.
* Add @Deprecated annotation to DatafeedUpdate.Builder#setJobId method
2019-07-24 10:03:25 +02:00
David Roberts caf9411a72
[ML] Improve response format of data frame stats endpoint (#44743)
This change adjusts the data frame transforms stats
endpoint to return a structure that is easier to
understand.

This is a breaking change for clients of the data frame
transforms stats endpoint, but the feature is in beta so
stability is not guaranteed.

Backport of #44350
2019-07-23 18:00:50 +01:00
Przemysław Witek 16c8e18013
Deprecate the ability to update datafeed's job_id. (#44691) (#44742) 2019-07-23 14:48:56 +02:00
Jason Tedor 5878bde8dc
Rename SLM package to slm (#44608)
This commit renames the SLM package from snapshotlifecycle to slm. We
have all come to know index lifecycle management as ILM, the APIs and
settings use ilm, and it would be nice of the package did too. For SLM,
let's use slm for all of these including the package name from the
beginning.
2019-07-23 07:35:06 +09:00
Benjamin Trent 4456850a8e
[7.x] [ML][Data Frame] Add optional defer_validation param to PUT (#44455) (#44697)
* [ML][Data Frame] Add optional defer_validation param to PUT (#44455)

* [ML][Data Frame] Add optional defer_validation param to PUT

* addressing PR comments

* reverting bad replace

* addressing pr comments

* Update put-transform.asciidoc

* Update put-transform.asciidoc

* Update put-transform.asciidoc

* adjusting for backport

* fixing imports

* [DOCS] Fixes formatting in  create data frame transform API
2019-07-22 15:12:55 -05:00
Benjamin Trent 06e21f7902
[7.x] [ML][Data Frame] adding force delete (#44590) (#44696)
* [ML][Data Frame] adding force delete (#44590)

* [ML][Data Frame] adding force delete

* Update delete-transform.asciidoc

* adjusting for backport
2019-07-22 13:13:25 -05:00
Hendrik Muhs 4387d81e5b add a test to check excecution flow (#44481)
add a test for the execution flow of a2p indexer
2019-07-22 17:07:11 +02:00
Tal Levy 7c84636029
Remove StreamOutput #writeOptionalStreamable and #writeStreamableList (#44602) (#44643)
remove usages of writeOptionalStreamable and writeStreambaleList

relates #34389.
2019-07-19 15:55:53 -07:00
Benjamin Trent 2e303fc5f7
[ML][Data Frame] adding dynamic cluster setting for failure retries (#44577) (#44639)
This adds a new dynamic cluster setting `xpack.data_frame.num_transform_failure_retries`.

This setting indicates how many times non-critical failures should be retried before a data frame transform is marked as failed and should stop executing. At the time of this commit; Min: 0, Max: 100, Default: 10
2019-07-19 16:17:39 -05:00
Ryan Ernst f193d14764
Convert remaining Action Response/Request to writeable.reader (#44528) (#44607)
This commit converts readFrom to ctor with StreamInput on the remaining
ActionResponse and ActionRequest classes.

relates #34389
2019-07-19 13:33:38 -07:00