Commit Graph

5939 Commits

Author SHA1 Message Date
Hendrik Muhs 95c99ca887 [Transform] Fix Regression: continuous transform can fail for (date) histogram group_by(#60196)
do not create change collector if group_by configuration does not support change detection

fixes #60125
2020-07-27 14:50:03 +02:00
Dimitris Athanasiou 439b7f7e59
[7.x][ML] DFA result processor should only skip rows and model chunks on cancel (#60113) (#60193)
When the job is force-closed or shutting down due to a fatal error we clean
up all cancellable job operations. This includes cancelling the results processor.
However, this means that we might not persist objects that are written from the
process like stats, memory usage, etc.

In hindsight, we do not gain from cancelling the results processor in its
entirety. It makes more sense to skip row results and model chunks but keep
stats and instrumentation about the job as the latter may contain useful information
to understand what happened to the job.

Backport of #60113
2020-07-27 13:42:46 +03:00
David Roberts 89466eefa5
Don't require separate privilege for internal detail of put pipeline (#60190)
Putting an ingest pipeline used to require that the user calling
it had permission to get nodes info as well as permission to
manage ingest.  This was due to an internal implementaton detail
that was not visible to the end user.

This change alters the behaviour so that a user with the
manage_pipeline cluster privilege can put an ingest pipeline
regardless of whether they have the separate privilege to get
nodes info.  The internal implementation detail now runs as
the internal _xpack user when security is enabled.

Backport of #60106
2020-07-27 10:44:48 +01:00
Dan Hermann 88e8f691af
Update index privileges doc to include data streams (#59139) (#60169) 2020-07-24 07:52:36 -05:00
Lisa Cawley 2665bfffce
[DOCS] Fix security links in machine learning APIs (#60098) (#60152) 2020-07-23 16:43:10 -07:00
Nhat Nguyen bc65b3a590 Increase timeout in AutoFollowIT (#60004)
It can take more than 10 seconds to auto-follow and create a follow-task
on a slow CI. This commit increases timeout in AutoFollowIT by replacing
assertBusy with assertLongBusy.

Closes #59952
2020-07-23 16:36:53 -04:00
Nhat Nguyen 0fe4d5df67 Increase timeout testFollowIndexWithConcurrentMappingChanges
Fixes #59273
2020-07-23 16:22:58 -04:00
Albert Zaharovits 2eaf5e1c25
[DOCS] Mapping updates are deprecated for ingestion privileges (#60024)
This PR contains the deprecation notice that `create`, `create_doc`, `index` and
`write` ingest privileges do not permit mapping updates in version 8. It also
updates the docs description of said privileges. 

This should've been part of #58784
2020-07-23 19:49:23 +03:00
James Rodewig 988e8c8fc6
[DOCS] Swap `[float]` for `[discrete]` (#60134)
Changes instances of `[float]` in our docs for `[discrete]`.

Asciidoctor prefers the `[discrete]` tag for floating headings:
https://asciidoctor.org/docs/asciidoc-asciidoctor-diffs/#blocks
2020-07-23 12:42:33 -04:00
Dimitris Athanasiou 6b9a362ec2
[7.x][ML] Skip test inference if DFA task has been stopped (#60116) (#60127)
If the job is stopped before starting inference on test data, we
should skip inference entirely.

Backport of #60116
2020-07-23 18:34:09 +03:00
Dan Hermann ca25f6ae6f
Include the resolve index action in the view_index_metadata privilege (#59785) (#60112) 2020-07-23 08:13:56 -05:00
Dan Hermann fe12217c7f
[7.x] Move REST specs for data streams (#60111) 2020-07-23 08:10:54 -05:00
Albert Zaharovits 3ad3a7d268 DOCS audit attributes for API Key authn (#60033)
This PR describes the new audit entry attributes api_key.id,
api_key.name and authentication.type, as well as the meaning of
existing attributes when authentication is performed using API keys.

This should've been part of #58928
2020-07-23 15:51:40 +03:00
Armin Braun ebb6677815
Formalize and Streamline Buffer Sizes used by Repositories (#59771) (#60051)
Due to complicated access checks (reads and writes execute in their own access context) on some repositories (GCS, Azure, HDFS), using a hard coded buffer size of 4k for restores was needlessly inefficient.
By the same token, the use of stream copying with the default 8k buffer size  for blob writes was inefficient as well.

We also had dedicated, undocumented buffer size settings for HDFS and FS repositories. For these two we would use a 100k buffer by default. We did not have such a setting for e.g. GCS though, which would only use an 8k read buffer which is needlessly small for reading from a raw `URLConnection`.

This commit adds an undocumented setting that sets the default buffer size to `128k` for all repositories. It removes wasteful allocation of such a large buffer for small writes and reads in case of HDFS and FS repositories (i.e. still using the smaller buffer to write metadata) but uses a large buffer for doing restores and uploading segment blobs.

This should speed up Azure and GCS restores and snapshots in a non-trivial way as well as save some memory when reading small blobs on FS and HFDS repositories.
2020-07-22 21:06:31 +02:00
James Rodewig 74a34777d1
[DOCS] Fix outdated Kibana UI refs and screenshots in security docs (#60023) (#60059) 2020-07-22 13:08:22 -04:00
Larry Gregory a686ccc9b2
[Backport][7.x] Introduce reserved_ml_apm_user kibana privilege (#59854) (#60047) 2020-07-22 11:06:10 -04:00
Jay Modi c8ef2e18f7
Thread safe clean up of LocalNodeModeListeners (#60007)
This commit continues on the work in #59801 and makes other
implementors of the LocalNodeMasterListener interface thread safe in
that they will no longer allow the callbacks to run on different
threads and possibly race each other. This also helps address other
issues where these events could be queued to wait for execution while
the service keeps moving forward thinking it is the master even when
that is not the case.

In order to accomplish this, the LocalNodeMasterListener no longer has
the executorName() method to prevent future uses that could encounter
this surprising behavior.

Each use was inspected and if the class was also a
ClusterStateListener, the implementation of LocalNodeMasterListener
was removed in favor of a single listener that combined the logic. A
single listener is used and there is currently no guarantee on execution
order between ClusterStateListeners and LocalNodeMasterListeners,
so a future change there could cause undesired consequences. For other
classes, the implementations of the callbacks were inspected and if the
operations were lightweight, the overriden executorName method was
removed to use the default, which runs on the same thread.

Backport of #59932
2020-07-22 08:02:18 -06:00
Dimitris Athanasiou 7e652ca873
[7.x][ML] Include same fields during test inference as in training (#… (#60034)
In #58877, when we switched test inference on java, we just
use the doc's `_source` as features. However, this could be
missing out on features that were used during training,
e.g. alias fields, etc.

This commit addresses this by extracting fields to use as
features during inference the same way they are extracted
in `DataFrameDataExtractor` when they are used for training.

Backport of #59963
2020-07-22 12:54:13 +03:00
David Roberts 7358f9fb05 [ML] Mute ForecastIT.testOverflowToDisk in EAR builds (#60040)
Due to https://github.com/elastic/elasticsearch/issues/58806
2020-07-22 10:17:37 +01:00
James Baiera 1c1a4297e0
Track backing indices in data streams stats from cluster state (#59817) (#60015)
If shard level results are incomplete in the data streams stats call, it is possible to get inaccurate 
counts of the number of backing indices, despite this data being accurate and available in the 
cluster state.
2020-07-21 23:21:33 -04:00
James Rodewig 401e12dc2b
[DOCS] Fix data stream docs (#59818) (#60010) 2020-07-21 17:04:13 -04:00
James Baiera b3363cf8f9
[7.x] Remove unneeded rest params from Data Stream Stats (#59575) (#59661)
This PR removes the expand_wildcards and forbid_closed_indices parameters from the Data 
Streams Stats REST endpoint. These options are required for broadcast requests, but are not 
needed for anything in terms of resolving data streams. Instead, we just set a default set of 
IndicesOptions on the transport request.
2020-07-21 15:59:16 -04:00
James Rodewig b302b09b85
[DOCS] Reformat snippets to use two-space indents (#59973) (#59994) 2020-07-21 15:49:58 -04:00
Armin Braun 5613e4b00b
Increase Timeout in testSLMRetentionAfterRestore (#59979) (#59991)
This test failed by hitting the 10s default busy assert timeout.
Given how involved the retention run is (multiple disk reads, CS updates etc.)
we should have a higher timeout here.

Also, removed the pointless delete call for the snapshot that we just asserted is gone,
 at the end of the test.

Closes #59956
2020-07-21 18:19:18 +02:00
James Rodewig 4d646ca819
[DOCS] Fix typo in LDAP config docs (#59953) (#59974)
Co-authored-by: AndyHunt66 <andrew.hunt@elastic.co>
2020-07-21 10:48:08 -04:00
Nik Everett 6f6076e208
Drop some params from IndexFieldData.Builder (backport of #59934) (#59972)
We never used the `IndexSettings` parameter and we only used the
`MappedFieldType` parameter to get the name of the field which we
already know everywhere where we build the `IFD.Builder`. This allows us
to drop a fair bit of ceremony from a couple of tests.
2020-07-21 10:28:59 -04:00
Przemysław Witek 283a1f605c
Rename binary_soft_classification evaluation to outlier_detection (#59951) (#59970) 2020-07-21 15:15:04 +02:00
Yannick Welsch 07784a0b16 CCR recoveries using wrong setting for chunk sizes (#59597)
The default chunk size for CCR file-based recoveries was wrongly set to 40MB instead of 1MB.
2020-07-21 13:56:06 +02:00
Tal Levy c9ac4bf7c8 Reduce memory usage of GeoGridTiler tests (#59921)
This PR further reduces the memory footprint of the
testGeoHashGridCircuitBreaker test such that only
0.26% of the randomized runs result in memory usage of between
500kb-1mb. where most of that those that are in that range
produce ~650kb of usage. Before, 3% of the runs would use
> 50mb of memory resulting in OOMs in CI

Closes #59853.
2020-07-20 15:45:39 -07:00
Jay Modi 515b53d297
Fix race in SLM master/cluster state listeners (#59896)
This change fixes two possible race conditions in SLM related to
how local master changes and cluster state events are observed. When
implementing the `LocalNodeMasterListener` interface, it is only
recommended to execute on a separate threadpool if the operations are
heavy and would block the cluster state thread. SLM specified that the
listeners should run in the Snapshot thread pool, but the operations
in the listener were lightweight. This had the side effect of causing
master changes to be delayed if the Snapshot threads were all busy and
could also potentially cause the `onMaster` and `offMaster` calls to
race if both were queued and then executed concurrently. Additionally,
the `SnapshotLifecycleService` is also a `ClusterStateListener` and
there is currently no order of operations guarantee between
`LocalNodeMasterListeners` and `ClusterStateListeners` so this could
lead to incorrect behavior.

The resolution for these two issues is that the
SnapshotRetentionService now specifies the `SAME` executor for its
implementation of the `LocalNodeMasterListener` interface. The
`SnapshotLifecycleService` is no longer a `LocalNodeMasterListener` and
instead tracks local master changes in its `ClusterStateListner`.

Backport of #59801
2020-07-20 09:59:46 -06:00
Nik Everett fcd8b5fe6e
Fix top_metrics when metric is missing (backport of #59471) (#59881)
This fixes a null pointer exception when the metric is missing for the
latest document returned by `top_metrics`.

Closes #58926
2020-07-20 10:42:58 -04:00
Rene Groeschke e31ebc96f9
Enforce fail on deprecated gradle usage (7.x backport) (#59758)
* Enforce fail on deprecated gradle usage (#59598)
* Fix branch specific deprecated gradle api usages
* Fix archiveVersion property usage
2020-07-20 08:52:30 +02:00
Albert Zaharovits 3ffb20bdfc
Fix DLS/FLS permission for the submit async search action (#59693)
The submit async search action should not populate the thread context
DLS/FLS permission set, because it is not currently authorised as an "indices request"
and hence the permission set that it builds is incomplete and it overrides the
DLS/FLS permission set of the actual spawned search request (which is built correctly).
2020-07-20 09:37:26 +03:00
Costin Leau 9cc80621c3 EQL: Fix matching of tail/desc queries (#59827)
When dealing with tail queries, data is returned descending for the base
criterion yet the rest of the queries are ascending. This caused a
problem during insertion since while in a page, the data is ASC, between
pages the blocks of data is DESC.
This caused incorrectly sorting inside a SequenceGroup which led to
incorrect results.

Further more in case of limit, since the data in a page is ASC, early
return is not possible neither is desc matching. Thus the page needs to
be consumed first before finding the final results.
A future improvement could be to keep only the top N results dropping
the rest during insertion time.

(cherry picked from commit 77c88da054a1ce662a264f72cde5986d4ce37e3a)
2020-07-19 00:49:16 +03:00
Lee Hinman 8c7d414a3b
[7.x] Fix retrieving data stream stats for a DS with multiple backing indices (#59806) (#59810)
Backports the following commits to 7.x:

    Fix retrieving data stream stats for a DS with multiple backing indices (#59806)
2020-07-17 16:56:07 -06:00
Nik Everett 95e6e4a452
Small cleanup for IndexFieldData (#59724) (#59800)
This drops `IndexComponent` from `IndexFieldData` because it wasn't
doing anything other than forcing us to perform a bunch of ceremony to
build them.
2020-07-17 13:38:15 -04:00
Tal Levy c9ab7bb651
Fix bug in circuit-breaker check for geoshape grid aggregations (#57962) (#59741)
There was a bug in the geoshape circuit-breaker check where the
hash values array was being allocated before its new size was
accounted for by the circuit breaker.

Fixes #57847.
2020-07-17 09:26:00 -07:00
Benjamin Trent b7f30fc929
[7.x] Adding new `require_alias` option to indexing requests (#58917) (#59769)
* Adding new `require_alias` option to indexing requests (#58917)

This commit adds the `require_alias` flag to requests that create new documents.

This flag, when `true` prevents the request from automatically creating an index. Instead, the destination of the request MUST be an alias.

When the flag is not set, or `false`, the behavior defaults to the `action.auto_create_index` settings.

This is useful when an alias is required instead of a concrete index.

closes https://github.com/elastic/elasticsearch/issues/55267
2020-07-17 10:24:58 -04:00
Alan Woodward b29d368b52
Convert DateFieldMapper to parametrized format (#59429) (#59759)
This commit makes DateFieldMapper extend ParametrizedFieldMapper,
declaring its parameters explicitly. As well as changes to DateFieldMapper
itself, there are some changes to dynamic mapping code to ensure that
dynamically detected date formats are passed through to new date mapper
builders.
2020-07-17 12:46:18 +01:00
Andrei Dan 301d61a98e
Tests: fix TimeSeriesDataStreamsIT.testShrinkActionInPolicyWithoutHotPhase (#59603) (#59689)
The ILM policy for the source and shrunk indices run separately (ie. they
are two separate managed indices). This fixes the test which exhibited some
flakiness by allowing some time for the ILM policy for the source index
to finish executing.

(cherry picked from commit c78d5e8499fc5ca2ca1314f97bcc6b55ba06e2e7)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-07-17 11:26:06 +01:00
Andrei Stefan d513e1090f
Do not create the index, if it's already there (#59745) (#59747)
(cherry picked from commit d097447d257efdf0a36b1157e1f177aed86ecca1)
2020-07-17 11:38:30 +03:00
Tanguy Leroux 4827fec1cf
Revert "Mute AzureSearchableSnapshotsIT (#58775)" (#59749)
This reverts commit 74a78b3a7b.
2020-07-17 10:02:46 +02:00
Martijn van Groningen 74c9402912
Re-enable data stream bwc tests (#59734)
after backporting #59503
Backport of #59732 yo 7.x
2020-07-16 23:59:52 +02:00
Martijn van Groningen 0096238df1
Replaced _data_stream_timestamp meta field's 'path' option with 'enabled' option (#59727)
Backport #59503 to 7.x

and adjusted exception messages.

Relates to #59076
2020-07-16 22:29:40 +02:00
Igor Motov 2408803fad
Adds hard_bounds to histogram aggregations (#59175) (#59656)
Adds a hard_bounds parameter to explicitly limit the buckets that a histogram
can generate. This is especially useful in case of open ended ranges that can
produce a very large number of buckets.
2020-07-16 15:31:53 -04:00
Martijn van Groningen 4089cbd767
Ignore multiple matching templates warning in specific tests. (#59692) (#59715)
Closes #59679
2020-07-16 20:07:38 +02:00
Marios Trivyzas c7efbc1b83
SQL: Implement DATE_PARSE function for parsing strings into DATE values (#57391) (#59699)
Implement DATE_PARSE(<date_str>, <pattern_str>) function
which allows to parse a date string according to the specified
pattern into a date object. The patterns allowed are those of
java.time.format.DateTimeFormatter.

Closes #54962

Co-authored-by: Marios Trivyzas <matriv@users.noreply.github.com>
Co-authored-by: Patrick Jiang(白泽) <dreamlike.sky@foxmail.com>

(cherry picked from commit 647a413d9b21bd3938f1716bb19f8407e1334125)
2020-07-16 17:24:30 +02:00
Benjamin Trent a28547c4b4
[7.x] [ML] add new `custom` field to trained model processors (#59542) (#59700)
* [ML] add new `custom` field to trained model processors (#59542)

This commit adds the new configurable field `custom`.

`custom` indicates if the preprocessor was submitted by a user or automatically created by the analytics job.

Eventually, this field will be used in calculating feature importance. When `custom` is true, the feature importance for
the processed fields is calculated. When `false` the current behavior is the same (we calculate the importance for the originating field/feature).

This also adds new required methods to the preprocessor interface. If users are to supply their own preprocessors
in the analytics job configuration, we need to know the input and output field names.
2020-07-16 10:57:38 -04:00
Nik Everett 343053c0a7 Fix compilation in Eclipse (backport #59675)
Eclipse was confused by #59583. It can't see a the public inner
interface within the superclass. This time. Usually that is fine, but
the Eclipse gods don't like this particular code, I guess.
2020-07-16 08:25:12 -04:00
David Kyle c349fdcb89 Mute RegressionIT testWithDataStream (#59687)
For #59664
2020-07-16 09:45:29 +01:00