Commit Graph

3375 Commits

Author SHA1 Message Date
Przemysław Witek 65a584b6fb
[7.x] Report timing stats as part of the Job stats response (#42709) (#43193) 2019-06-14 09:03:14 +02:00
Marios Trivyzas 3c73602524 SQL: Fix wrong results when sorting on aggregate (#43154)
- Previously, when shorting on an aggregate function the bucket
processing ended early when the explicit (LIMIT XXX) or the impliciti
limit of 512 was reached. As a consequence, only a set of grouping
buckets was processed and the results returned didn't reflect the global
ordering.

- Previously, the priority queue shorting method had an inverse
comparison check and the final response from the priority queue was also
returned in the inversed order because of the calls to the `pop()`
method.

Fixes: #42851

(cherry picked from commit 19909edcfdf5792b38c1363b07379783ebd0e6c4)
2019-06-13 21:59:20 +02:00
Jason Tedor 5bc3b7f741
Enable node roles to be pluggable (#43175)
This commit introduces the possibility for a plugin to introduce
additional node roles.
2019-06-13 15:15:48 -04:00
Ryan Ernst c3ce3f6891 Add native code info to ML info api (#43172)
The machine learning feature of xpack has native binaries with a
different commit id than the rest of code. It is currently exposed in
the xpack info api. This commit adds that commit information to the ML
info api, so that it may be removed from the info api.
2019-06-13 11:38:58 -07:00
Przemyslaw Gomulka 8f7cd84422
Disable x-pack:qa:kerberos-tests due to failures (#43208)
relates #40678
2019-06-13 20:19:17 +02:00
Alpar Torok 4ba94a5051 Testclusters: convert ccr tests (#42313) 2019-06-13 19:19:36 +03:00
Martijn van Groningen c8e6474eef
Changes required for merging in 7.x branch. 2019-06-13 16:58:27 +02:00
Martijn van Groningen 1f3db7eb3e
Merge remote-tracking branch 'es/7.x' into enrich-7.x 2019-06-13 16:49:38 +02:00
David Roberts 43665183c2 [ML] Restrict detection of epoch timestamps in find_file_structure (#43188)
Previously 10 digit numbers were considered candidates to be
timestamps recorded as seconds since the epoch and 13 digit
numbers as timestamps recorded as milliseconds since the epoch.

However, this meant that we could detect these formats for
numbers that would represent times far in the future.  As an
example ISBN numbers starting with 9 were detected as milliseconds
since the epoch since they had 13 digits.

This change tweaks the logic for detecting such timestamps to
require that they begin with 1 or 2.  This means that numbers
that would represent times beyond about 2065 are no longer
detected as epoch timestamps.  (We can add 3 to the definition
as we get closer to the cutoff date.)
2019-06-13 13:15:41 +01:00
Alpar Torok 167e51335d Convert ILM tests to use testclusters (#43076)
Also improove the error message when bin scripts are not found
2019-06-13 12:24:48 +03:00
Alpar Torok eb7a8bb4a4 Testclusters: graph (#43033)
Convert x-pack  graph to use testClusters
2019-06-13 09:50:59 +03:00
Simon Willnauer f70141c862 Only load FST off heap if we are actually using mmaps for the term dictionary (#43158)
Given the significant performance impact that NIOFS has when term dicts are
loaded off-heap this change enforces FstLoadMode#AUTO that loads term dicts
off heap only if the underlying index input indicates a memory map.

Relates to #43150
2019-06-13 07:54:02 +02:00
Yogesh Gaikwad 4ae1e30a98
Enable krb5kdc-fixture, kerberos tests mount urandom for kdc container (#41710) (#43178)
Infra has fixed #10462 by installing `haveged` on CI workers.
This commit enables the disabled fixture and tests, and mounts
`/dev/urandom` for the container so there is enough
entropy required for kdc.
Note: hdfs-repository tests have been disabled, will raise a separate issue for it.

Closes #40624 Closes #40678
2019-06-13 13:02:16 +10:00
Benjamin Trent ec50d4d281
[ML][Data Frame] write a warning audit on bulk index failures (#43106) (#43171)
* [ML][Data Frame] write a warning audit on bulk index failures

* adding failure message and moving to use volalitile
2019-06-12 14:50:17 -05:00
Benjamin Trent aff4795441
[ML][Data Frame] cleaning up tests since tasks are cancelled onfinish (#43136) (#43166)
* [ML][Data Frame] cleaning up usage test since tasks are cancelled onfinish

* Update DataFrameUsageIT.java

* Fixing additional test, waiting for task to complete

* removing unused import

* unmuting test
2019-06-12 14:39:38 -05:00
Benjamin Trent b110164bf4
[ML][Data Frame] add the src priv check for view_index_metadata (#43118) (#43161) 2019-06-12 13:22:46 -05:00
Benjamin Trent f13f55ede3
[ML][Data Frame] change failure count reset logic (#43064) (#43159) 2019-06-12 13:22:34 -05:00
David Kyle 597ae5c7b8 [ML DataFrame] Reject Data Frame Ids containing upper case characters (#43145) 2019-06-12 18:13:18 +01:00
James Baiera 9d56a0365f Limit a enrich policy execution to only one at a time (#42535)
Add a keyed lock mechanism to the policy executor to ensure that an enrich policy
can only have one execution happening at a time.
2019-06-12 10:45:10 -04:00
Yannick Welsch 110f0c5b7e Mute testDataFrameTransformCrud
Relates to #43139
2019-06-12 14:12:01 +02:00
Dimitris Athanasiou b28e006f7c
[ML] Lock down extraction method when possible (#43104) (#43140) 2019-06-12 14:07:17 +03:00
Luca Cavanna afeda1a7b9 Split search in two when made against throttled and non throttled searches (#42510)
When a search on some indices takes a long time, it may cause problems to other indices that are being searched as part of the same search request and being written to as well, because their search context needs to stay open for a long time. This is especially a problem when searching against throttled and non-throttled indices as part of the same request. The problem can be generalized though: this may happen whenever read-only indices are searched together with indices that are being written to. Search contexts staying open for a long time is only an issue for indices that are being written to, in practice.

This commit splits the search in two sub-searches: one for read-only indices, and one for ordinary indices. This way the two don't interfere with each other. The split is done only when size is greater than 0, no scroll is provided and query_then_fetch is used as search type. Otherwise, the search executes like before. Note that the returned num_reduce_phases reflect the number of reduction phases that were run. If the search is split in two, there are three reductions: one non-final for each search, and a final one that merges the results of the previous two.

Closes #40900
2019-06-12 11:25:03 +02:00
Nhat Nguyen 5692be2161 Fix timing issue in CcrRetentionLeaseIT (#43054)
In these tests, we sleep for a small multiple of the renew interval,
then check that the retention leases are not changed. If a renewal
request takes longer than that interval because of GC or slow CI, then
the retention leases are not the same as before sleep. With this change,
we relax to assert that we eventually stop the renewable process.

Closes #39509
2019-06-11 18:03:16 -04:00
Benjamin Trent 7ff3d86cf0
[ML][Data Frame] adding dest.index and id validations (#43053) (#43109)
* [ML][Data Frame] adding dest.index and id validations

* adjusting message format

* Adjusting id validity pattern

* Update DataFrameStrings.java
2019-06-11 15:55:18 -05:00
Benjamin Trent e384bf0276
[ML-DataFrame] stop task at completion of data frame function (#42955) (#43114)
* stop data frame task after it finishes

* test auto stop

* adapt tests

* persist the state correctly and move stop into listener

* Calling `onStop` even if persistence fails, changing `stop` to rely on doSaveState
2019-06-11 15:55:02 -05:00
Ryan Ernst 172cd4dbfa Remove description from xpack feature sets (#43065)
The description field of xpack featuresets is optionally part of the
xpack info api, when using the verbose flag. However, this information
is unnecessary, as it is better left for documentation (and the existing
descriptions describe anything meaningful). This commit removes the
description field from feature sets.
2019-06-11 09:22:58 -07:00
David Roberts d3136f99e6 [ML] Fix race condition when closing time checker (#43098)
The tests for the ML TimeoutChecker rely on threads
not being interrupted after the TimeoutChecker is
closed.  This change ensures this by making the
close() and setTimeoutExceeded() methods synchronized
so that the code inside them cannot execute
simultaneously.

Fixes #43097
2019-06-11 16:39:17 +01:00
Nhat Nguyen 5d3849215b CCR should not replicate private/internal settings (#43067)
With this change, CCR will not replicate internal or private settings to
follower indices.

Closes #41268
2019-06-11 06:59:09 -04:00
Martijn Laarman cb7ce865b7
remove path from rest-api-spec (#41452) (#43084)
(cherry picked from commit f5fde1d0843d2f0f53d3b9a15b9cfc8b94471ab7)
2019-06-11 12:52:36 +02:00
Ioannis Kakavas 1776d6e055 Refresh remote JWKs on all errors (#42850)
It turns out that key rotation on the OP, can manifest as both
a BadJWSException and a BadJOSEException in nimbus-jose-jwt. As
such we cannot depend on matching only BadJWSExceptions to
determine if we should poll the remote JWKs for an update.

This has the side-effect that a remote JWKs source will be polled
exactly one additional time too for errors that have to do with
configuration, or for errors that might be caused by not synched
clocks, forged JWTs, etc. ( These will throw a BadJWTException
which extends BadJOSEException also )
2019-06-11 11:01:54 +03:00
Benjamin Trent 79052050bf
[ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds (#42969) (#43069)
* [ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds

* only supporting doc_values for geo_point fields

* moving validation into GeoPointField ctor
2019-06-10 21:52:53 -05:00
Benjamin Trent eadfe05587
[ML] Changes slice specification to auto. See #42996 (#43039) (#43070) 2019-06-10 21:52:22 -05:00
Nhat Nguyen 53eb630700 Fix NPE in CcrRetentionLeaseIT (#43059)
The retention leases stats is null if the processing shard copy is being
closed. In this the case, we should check against null then retry to
avoid failing a test.

Closes #41237
2019-06-10 17:58:37 -04:00
Nhat Nguyen f2e66e22eb Increase waiting time when check retention locks (#42994)
WriteActionsTests#testBulk and WriteActionsTests#testIndex sometimes
fail with a pending retention lock. We might leak retention locks when
switching to async recovery. However, it's more likely that ongoing
recoveries prevent the retention lock from releasing.

This change increases the waiting time when we check for no pending
retention lock and also ensures no ongoing recovery in
WriteActionsTests.

Closes #41054
2019-06-10 17:58:37 -04:00
Nhat Nguyen 4191df6e1d Unmute IndexFollowingIT#testFollowIndex
Fixed in #41987
2019-06-10 17:58:37 -04:00
Benjamin Trent 1ddc4c8fc6
[ML][Data Frame] Removes slice specification from DBQ. See #42996 (#43036) (#43055) 2019-06-10 13:40:55 -05:00
Dimitris Athanasiou 76a92b49a8
[ML] Get resources action should be lenient when sort field is unmapped (#42991) (#43046)
Get resources action sorts on the resource id. When there are no resources at
all, then it is possible the index does not contain a mapping for the resource
id field. In that case, the search api fails by default.

This commit adjusts the search request to ignore unmapped fields.

Closes elastic/kibana#37870
2019-06-10 19:50:19 +03:00
David Roberts bf5d56053a
[TEST] Adding a BWC test for ML categorization config (#42988)
This test coverage was previously missing.

Backport of #42981
2019-06-10 15:39:28 +01:00
Alan Woodward 8e23e4518a Move construction of custom analyzers into AnalysisRegistry (#42940)
Both TransportAnalyzeAction and CategorizationAnalyzer have logic to build
custom analyzers for index-independent analysis. A lot of this code is duplicated,
and it requires the AnalysisRegistry to expose a number of internal provider
classes, as well as making some assumptions about when analysis components are
constructed.

This commit moves the build logic directly into AnalysisRegistry, reducing the
registry's API surface considerably.
2019-06-10 14:33:25 +01:00
David Turner 68339f90e9 Mute AutodetectMemoryLimitIT#testTooManyPartitions
Relates #43013
2019-06-10 09:20:36 +01:00
Andrei Stefan 036f9c4a55 SQL: cover the Integer type when extracting values from _source (#42859)
* Take into consideration a wider range of Numbers when extracting the
values from source, more specifically - BigInteger and BigDecimal.

(cherry picked from commit 561b8d73dd7b03c50242e4e3f0128b2142959176)
2019-06-10 09:25:56 +03:00
Jason Tedor 63bad28005
Do not allow modify aliases on followers (#43017)
Now that aliases are replicated by a follower from its leader, this
commit prevents directly modifying aliases on follower indices.
2019-06-09 22:53:54 -04:00
Jason Tedor 915d2f2daa
Refactor put mapping request validation for reuse (#43005)
This commit refactors put mapping request validation for reuse. The
concrete case that we are after here is the ability to apply effectively
the same framework to indices aliases requests. This commit refactors
the put mapping request validation framework to allow for that.
2019-06-09 10:19:04 -04:00
Benjamin Trent 553c73b22d
[ML][Data Frame] allow null values for aggs with sparse data (#42966) (#42998)
* [ML][Data Frame] allow null values for aggs with sparse data

* Making classes static, memory allocation optimization
2019-06-07 15:43:06 -05:00
Benjamin Trent 755ba72896
[ML][Data frame] make sure that fields exist when creating progress (#42943) (#42984) 2019-06-07 10:13:18 -05:00
jalvar08 b77be89c9a Remove Comma in Example (#41873)
The comma is there in error as there are no other parameter after 'value'
2019-06-07 08:39:27 -04:00
Henning Andersen dea935ac31
Reindex max_docs parameter name (#42942)
Previously, a reindex request had two different size specifications in the body:
* Outer level, determining the maximum documents to process
* Inside the source element, determining the scroll/batch size.

The outer level size has now been renamed to max_docs to
avoid confusion and clarify its semantics, with backwards compatibility and
deprecation warnings for using size.
Similarly, the size parameter has been renamed to max_docs for
update/delete-by-query to keep the 3 interfaces consistent.

Finally, all 3 endpoints now support max_docs in both body and URL.

Relates #24344
2019-06-07 12:16:36 +02:00
Tim Vernum 090d42d3e6
Permit API Keys on Basic License (#42973)
Kibana alerting is going to be built using API Keys, and should be
permitted on a basic license.

This commit moves API Keys (but not Tokens) to the Basic license

Relates: elastic/kibana#36836
Backport of: #42787
2019-06-07 14:18:05 +10:00
Tim Brooks 667c613d9e
Remove `nonApplicationWrite` from `SSLDriver` (#42954)
Currently, when the SSLEngine needs to produce handshake or close data,
we must manually call the nonApplicationWrite method. However, this data
is only required when something triggers the need (starting handshake,
reading from the wire, initiating close, etc). As we have a dedicated
outbound buffer, this data can be produced automatically. Additionally,
with this refactoring, we combine handshake and application mode into a
single mode. This is necessary as there are non-application messages that
are sent post handshake in TLS 1.3. Finally, this commit modifies the
SSLDriver tests to test against TLS 1.3.
2019-06-06 17:44:40 -04:00
henryptung 61b62125b8 Wire query cache into sorting nested-filter computation (#42906)
Don't use Lucene's default query cache when filtering in sort.

Closes #42813
2019-06-06 21:16:58 +02:00
Gordon Brown e35b240a43
Fix hang in test for "too many fields" dep. check (#42909)
This commit fixes a rare case where the method to randomly generate a
very large mapping could enter an infinite loop.
2019-06-06 08:28:32 -06:00
Benjamin Trent 1f1868a294
[ML][Data Frame] pull state and states for indexer from index (#42856) (#42935)
* [ML][Data Frame] pull state and states for indexer from index

* Update DataFrameTransformTask.java
2019-06-06 08:51:11 -05:00
Hendrik Muhs ba8bd8dfbe configure auto expand for dataframe indexes (#42924)
creates the dataframe destination index with auto expand for replicas (0-1)
2019-06-06 14:13:26 +02:00
Ioannis Kakavas ba5b1d3249 Mute failing testPerformActionAttrsRequestFails (#42933) 2019-06-06 14:28:50 +03:00
Ioannis Kakavas 6ee578c6eb Mute testEnableDisableBehaviour (#42929) 2019-06-06 14:01:07 +03:00
David Roberts 40c827a3b8 [ML] Close sample stream in find_file_structure endpoint (#42896)
A static code analysis revealed that we are not closing
the input stream in the find_file_structure endpoint.
This actually makes no difference in practice, as the
particular InputStream implementation in this case is
org.elasticsearch.common.bytes.BytesReferenceStreamInput
and its close() method is a no-op.  However, it is good
practice to close the stream anyway.
2019-06-06 11:03:45 +01:00
David Roberts b202a59f88 [ML] Add earliest and latest timestamps to field stats (#42890)
This change adds the earliest and latest timestamps into
the field stats for fields of type "date" in the output of
the ML find_file_structure endpoint.  This will enable the
cards for date fields in the file data visualizer in the UI
to be made to look more similar to the cards for date
fields in the index data visualizer in the UI.
2019-06-06 08:58:35 +01:00
Hendrik Muhs 280a2c9401 [ML-DataFrame] reduce log spam: do not trigger indexer if state is indexing or stopping (#42849)
reduce log spam: do not trigger indexer if state is indexing or stopping
2019-06-06 07:47:54 +02:00
Hendrik Muhs 5cd503a1e0 [ML-DataFrame] increase the scheduler interval to 10s (#42845)
increases the scheduler interval to fire less frequently, namely changing it from 1s to 10s. The scheduler interval is used for retrying after an error condition.
2019-06-06 07:47:38 +02:00
Gordon Brown 6eb4600e93
Add custom metadata to snapshots (#41281)
Adds a metadata field to snapshots which can be used to store arbitrary
key-value information. This may be useful for attaching a description of
why a snapshot was taken, tagging snapshots to make categorization
easier, or identifying the source of automatically-created snapshots.
2019-06-05 17:30:31 -06:00
James Baiera 1300183001
NullPointerException when creating a watch with Jira action (#41922) (#42081) (#42873)
NullPointerException when secured_url does not use proper scheme in jira action. 
This commit will handle Expection and display proper message.
2019-06-05 16:03:07 -04:00
Przemyslaw Gomulka ab5bc83597
Deprecation info for joda-java migration on 7.x (#42659)
Some clusters might have been already migrated to version 7 without being warned about the joda-java migration changes.
Deprecation api on that version will give them guidance on what patterns need to be changed.
relates. This change is using the same logic like in 6.8 that is: verifying the pattern is from the incompatible set ('y'-Y', 'C', 'Z' etc), not from predifined set, not prefixed with 8. AND was also created in 6.x. Mappings created in 7.x are considered migrated and should not generate warnings

There is no pipeline check (present on 6.8) as it is impossible to verify when the pipeline was created, and therefore to make sure the format is depracated or not
#42010
2019-06-05 19:50:04 +02:00
Przemyslaw Gomulka cfdb1b771e
Enable console audit logs for docker backport#42671 #42887
Enable audit logs in docker by creating console appenders for audit loggers.
also rename field @timestamp to timestamp and add field type with value audit

The docker build contains now two log4j configuration for oss or default versions. The build now allows override the default configuration.

Also changed the format of a timestamp from ISO8601 to include time zone as per this discussion #36833 (comment)

closes #42666
backport#42671
2019-06-05 17:15:37 +02:00
Benjamin Trent 293f306b9a
[ML][Data Frame] forcing that no ptask => STOPPED state (#42800) (#42860)
* [ML][Data Frame] forcing that no ptask => STOPPED state

* Addressing side-effect, early exit for stop when stopped
2019-06-05 07:09:34 -05:00
David Roberts d5baedb789 [ML] Change dots in CSV column names to underscores (#42839)
Dots in the column names cause an error in the ingest
pipeline, as dots are special characters in ingest pipeline.
This PR changes dots into underscores in CSV field names
suggested by the ML find_file_structure endpoint _unless_
the field names are specifically overridden.  The reason for
allowing them in overrides is that fields that are not
mentioned in the ingest pipeline can contain dots.  But it's
more consistent that the default behaviour is to replace
them all.

Fixes elastic/kibana#26800
2019-06-05 11:28:33 +01:00
Simon Willnauer ebec118ccf Bring back ExecutionException after backport 2019-06-05 12:10:02 +02:00
Simon Willnauer 41a9f3ae3b Use reader attributes to control term dict memory useage (#42838)
This change makes use of the reader attributes added in LUCENE-8671
to ensure that `_id` fields are always on-heap for best update performance
and term dicts are generally off-heap on Read-Only engines.

Closes #38390
2019-06-05 11:01:06 +02:00
Jason Tedor 78be3dde25
Enable testing against JDK 13 EA builds (#40829)
This commit adds JDK 13 to the CI rotation for testing. For now, we will
be testing against JDK 13 EA builds.
2019-06-04 20:54:24 -04:00
Jason Tedor 117df87b2b
Replicate aliases in cross-cluster replication (#42875)
This commit adds functionality so that aliases that are manipulated on
leader indices are replicated by the shard follow tasks to the follower
indices. Note that we ignore write indices. This is due to the fact that
follower indices do not receive direct writes so the concept is not
useful.

Relates #41815
2019-06-04 20:36:24 -04:00
Jason Tedor aad1b3a2a0
Fix version parsing in various tests (#42871)
This commit fixes the version parsing in various tests. The issue here is that
the parsing was relying on java.version. However, java.version can contain
additional characters such as -ea for early access builds. See JEP 233:

Name                            Syntax
------------------------------  --------------
java.version                    $VNUM(\-$PRE)?
java.runtime.version            $VSTR
java.vm.version                 $VSTR
java.specification.version      $VNUM
java.vm.specification.version   $VNUM

Instead, we want java.specification.version.
2019-06-04 18:22:20 -04:00
Mark Vieira e44b8b1e2e
[Backport] Remove dependency substitutions 7.x (#42866)
* Remove unnecessary usage of Gradle dependency substitution rules (#42773)

(cherry picked from commit 12d583dbf6f7d44f00aa365e34fc7e937c3c61f7)
2019-06-04 13:50:23 -07:00
Chris Cho 1514d1a1be Change shard allocation filter property and api (#42602)
The current example is not working and a bit confused. This change tries
to match it with the sample of the watcher blog.
2019-06-04 12:30:39 -04:00
Ioannis Kakavas 440ec4d9f5
[Backport 7.x] OpenID Connect realm guide (#42836)
This commit adds a configuration guide for the newly introduced
OpenID Connect realm. The guide is similar to the style of the
SAML Guide and shares certain parts where applicable (role mapping)
It also contains a short section on how the realm can be used for
authenticating users without Kibana.

Co-Authored-By: Lisa Cawley <lcawley@elastic.co>

Backport of #41423 and #42555
2019-06-04 14:08:41 +03:00
Tim Vernum 928f49992f
Don't require TLS for single node clusters (#42830)
This commit removes the TLS cluster join validator.

This validator existed to prevent v6.x nodes (which mandated
TLS) from joining an existing cluster of v5.x nodes (which did
not mandate TLS) unless the 6.x node (and by implication the
5.x nodes) was configured to use TLS.

Since 7.x nodes cannot talk to 5.x nodes, this validator is no longer
needed.

Removing the validator solves a problem where single node clusters
that were bound to local interfaces were incorrectly requiring TLS
when they recovered cluster state and joined their own cluster.

Backport of: #42826
2019-06-04 19:48:37 +10:00
Tim Vernum 8de3a88205
Log the status of security on license change (#42741)
Whether security is enabled/disabled is dependent on the combination
of the node settings and the cluster license.

This commit adds a license state listener that logs when the license
change causes security to switch state (or to be initialised).

This is primarily useful for diagnosing cluster formation issues.

Backport of: #42488
2019-06-04 14:25:43 +10:00
Tim Vernum 9035e61825
Detect when security index is closed (#42740)
If the security index is closed, it should be treated as unavailable
for security purposes.

Prior to 8.0 (or in a mixed cluster) a closed security index has
no routing data, which would cause a NPE in the cluster change
handler, and the index state would not be updated correctly.
This commit fixes that problem

Backport of: #42191
2019-06-04 14:25:20 +10:00
Benjamin Trent 87cc6a974c
[ML] [Data Frame] adding and modifying auditor messages (#42722) (#42818)
* [ML] [Data Frame] adding and modifying auditor messages

* Update DataFrameTransformTask.java
2019-06-03 19:49:58 -05:00
James Baiera 415f1a484f Enrich validate nested mappings (#42452)
Ensures that fields retained in an enrich index from a source are not contained
under a nested field. It additionally makes sure that key fields exist, and that
value fields are checked if they are present. The policy runner test has also
been expanded with some faulty mapping test cases.
2019-06-03 16:02:31 -04:00
David Roberts b61202b0a8 [ML] Add a limit on line merging in find_file_structure (#42501)
When analysing a semi-structured text file the
find_file_structure endpoint merges lines to form
multi-line messages using the assumption that the
first line in each message contains the timestamp.
However, if the timestamp is misdetected then this
can lead to excessive numbers of lines being merged
to form massive messages.

This commit adds a line_merge_size_limit setting
(default 10000 characters) that halts the analysis
if a message bigger than this is created.  This
prevents significant CPU time being spent subsequently
trying to determine the internal structure of the
huge bogus messages.
2019-06-03 13:45:51 +01:00
Benjamin Trent 0253927ec4
[ML Data Frame] Refactor stop logic (#42644) (#42763)
* Revert "invalid test"

This reverts commit 9dd8b52c13c716918ff97e6527aaf43aefc4695d.

* Testing

* mend

* Revert "[ML Data Frame] Mute Data Frame tests"

This reverts commit 5d837fa312b0e41a77a65462667a2d92d1114567.

* Call onStop and onAbort outside atomic update

* Don’t update CS

* Tidying up

* Remove invalid test that asserted logic that has been removed

* Add stopped event

* Revert "Add stopped event"

This reverts commit 02ba992f4818bebd838e1c7678bd2e1cc090bfab.

* Adding check for STOPPED in saveState
2019-06-03 06:53:44 -05:00
David Roberts 10aca87389 [ML] Better detection of binary input in find_file_structure (#42707)
This change helps to prevent the situation where a binary
file uploaded to the find_file_structure endpoint is
detected as being text in the UTF-16 character set, and
then causes a large amount of CPU to be spent analysing
the bogus text structure.

The approach is to check the distribution of zero bytes
between odd and even file positions, on the grounds that
UTF-16BE or UTF16-LE would have a very skewed distribution.
2019-06-03 12:47:22 +01:00
Alan Woodward 2129d06643 Create client-only AnalyzeRequest/AnalyzeResponse classes (#42197)
This commit clones the existing AnalyzeRequest/AnalyzeResponse classes
to the high-level rest client, and adjusts request converters to use these new
classes.

This is a prerequisite to removing the Streamable interface from the internal
server version of these classes.
2019-06-03 09:46:36 +01:00
James Rodewig f51f8ed04c [DOCS] Remove unneeded options from `[source,sql]` code blocks (#42759)
In AsciiDoc, `subs="attributes,callouts,macros"` options were required
to render `include-tagged::` in a code block.

With elastic/docs#827, Elasticsearch Reference documentation migrated
from AsciiDoc to Asciidoctor.

In Asciidoctor, the `subs="attributes,callouts,macros"` options are no
longer needed to render `include-tagged::` in a code block. This commit
removes those unneeded options.

Resolves #41589
2019-05-31 13:05:13 -04:00
Benjamin Trent f22dcfb9da
[ML] [Data Frame] nesting group_by fields like other aggs (#42718) (#42760) 2019-05-31 10:55:35 -05:00
David Roberts 87ca762573 [ML] Add Kibana application privilege to data frame admin/user roles (#42757)
Data frame transforms are restricted by different roles to ML, but
share the ML UI.  To prevent the ML UI being hidden for users who
only have the data frame admin or user role, it is necessary to add
the ML Kibana application privilege to the backend data frame roles.
2019-05-31 15:41:59 +01:00
Jay Modi e687fd58fc
Re-enable token bwc tests (#42727)
This commit re-enables token bwc tests that run as part of the rolling
upgrade tests. These tests were muted while #42651 was being
backported.
2019-05-31 08:03:10 -06:00
Przemyslaw Gomulka d5061a151a
Remove suppresions for "unchecked" for hamcrest varargs methods Backport(41528) #42749
In hamcrest 2.1 warnings for unchecked varargs were fixed by hamcrest using @SafeVarargs for those matchers where this warning occurred.
This PR is aimed to remove these annotations when Matchers.contains ,Matchers.containsInAnyOrder or Matchers.hasItems was used
backport #41528
2019-05-31 13:58:49 +02:00
Przemysław Witek f6779de2b7
Increase maximum forecast interval to 10 years. (#41082) (#42710)
Increase the maximum duration to ~10 years (3650 days).
2019-05-31 06:19:47 +02:00
James Baiera 215170b6c3 Merge branch '7.x' into enrich-7.x 2019-05-30 16:13:06 -04:00
Jason Tedor 371cb9a8ce
Remove Log4j 1.2 API as a dependency (#42702)
We had this as a dependency for legacy dependencies that still needed
the Log4j 1.2 API. This appears to no longer be necessary, so this
commit removes this artifact as a dependency.

To remove this dependency, we had to fix a few places where we were
accidentally relying on Log4j 1.2 instead of Log4j 2 (easy to do, since
both APIs were on the compile-time classpath).

Finally, we can remove our custom Netty logger factory. This was needed
when we were on Log4j 1.2 and handled logging in our own unique
way. When we migrated to Log4j 2 we could have dropped this
dependency. However, even then Netty would still pick up Log4j 1.2 since
it was on the classpath, thus the advantage to removing this as a
dependency now.
2019-05-30 16:08:07 -04:00
Mark Vieira c1816354ed
[Backport] Improve build configuration time (#42674) 2019-05-30 10:29:42 -07:00
Benjamin Trent b5527b3278
[ML] [Data Frame] add support for weighted_avg agg (#42646) (#42714) 2019-05-30 12:05:35 -05:00
Jay Modi 711de2f59a
Make hashed token ids url safe (#42651)
This commit changes the way token ids are hashed so that the output is
url safe without requiring encoding. This follows the pattern that we
use for document ids that are autogenerated, see UUIDs and the
associated classes for additional details.
2019-05-30 10:44:41 -06:00
Ioannis Kakavas 7cabe8acc9 Fix refresh remote JWKS logic (#42662)
This change ensures that:

- We only attempt to refresh the remote JWKS when there is a
signature related error only ( BadJWSException instead of the
geric BadJOSEException )
- We do call OpenIDConnectAuthenticator#getUserClaims upon
successful refresh.
- We test this in OpenIdConnectAuthenticatorTests.

Without this fix, when using the OpenID Connect realm with a remote
JWKSet configured in `op.jwks_path`, the refresh would be triggered
for most configuration errors ( i.e. wrong value for `op.issuer` )
and the kibana wouldn't get a response and timeout since
`getUserClaims` wouldn't be called because
`ReloadableJWKSource#reloadAsync` wouldn't call `onResponse` on the
future.
2019-05-30 18:08:30 +03:00
Ioannis Kakavas 24a794fd6b Fix testTokenExpiry flaky test (#42585)
Test was using ClockMock#rewind passing the amount of nanoseconds
in order to "strip" nanos from the time value. This was intentional
as the expiration time of the UserToken doesn't have nanosecond
precision.
However, ClockMock#rewind doesn't support nanos either, so when it's
called with a TimeValue, it rewinds the clock by the TimeValue's
millis instead. This was causing the clock to go enough millis
before token expiration time and the test was passing. Once every
few hundred times though, the TimeValue by which we attempted to
rewind the clock only had nanos and no millis, so rewind moved the
clock back just a few millis, but still after expiration time.

This change moves the clock explicitly to the same instant as expiration,
using clock.setTime and disregarding nanos.
2019-05-30 07:53:56 +03:00
Igor Motov d2f9ccbe18 Geo: Refactor libs/geo parsers (#42549)
Refactors the WKT and GeoJSON parsers from an utility class into an
instantiatable objects. This is a preliminary step in
preparation for moving out coordinate validators from Geometry
constructors. This should allow us to make validators plugable.
2019-05-29 20:07:27 -04:00
David Kyle c5a410f68b [ML Data Frame] Set DF task state when stopping (#42516)
Set the state to stopped prior to persisting
2019-05-29 16:39:44 +01:00
Hendrik Muhs 345ff21ae5 [ML-DataFrame] rewrite start and stop to answer with acknowledged (#42589)
rewrite start and stop to answer with acknowledged

fixes #42450
2019-05-29 11:14:32 +02:00
Michael Basnight 77eed9e6a0 Add enrich policy GET API (#41384)
This commit wires up the Rest calls and Transport calls for GET enrich
policy, as well as tests and rest spec additions.
2019-05-28 23:19:23 -05:00
Michael Basnight be60125a4e Merge branch '7.x' into enrich-7.x 2019-05-28 18:32:18 -05:00