Commit Graph

369 Commits

Author SHA1 Message Date
Jason Tedor 1f1a035def
Remove stale test logging annotations (#43403)
This commit removes some very old test logging annotations that appeared
to be added to investigate test failures that are long since closed. If
these are needed, they can be added back on a case-by-case basis with a
comment associating them to a test failure.
2019-06-19 22:58:22 -04:00
Lee Hinman d81ce9a647 Return 0 for negative "free" and "total" memory reported by the OS (#42725)
* Return 0 for negative "free" and "total" memory reported by the OS

We've had a situation where the MX bean reported negative values for the
free memory of the OS, in those rare cases we want to return a value of
0 rather than blowing up later down the pipeline.

In the event that there is a serialization or creation error with regard
to memory use, this adds asserts so the failure will occur as soon as
possible and give us a better location for investigation.

Resolves #42157

* Fix test passing in invalid memory value

* Fix another test passing in invalid memory value

* Also change mem check in MachineLearning.machineMemoryFromStats

* Add background documentation for why we prevent negative return values

* Clarify comment a bit more
2019-06-19 10:35:48 -06:00
Przemysław Witek 86b58d9ff3
Rename AutoDetectResultProcessor* to AutodetectResultProcessor* for consistency with other classes where the spelling is "Autodetect" (#43359) (#43366) 2019-06-19 15:31:26 +02:00
Jason Tedor fa09113080 Remove trace logging for ML native multi-node tests
This trace logging looks like it was copy/pasted from another test,
where the logging in that test was only added to investigate a test
failure. This commit removes the trace logging.
2019-06-18 22:28:27 -04:00
Igor Motov 9f7d1ff2de Geo: Add coerce support to libs/geo WKT parser (#43273)
Adds support for coercing not closed polygons and ignoring Z value
to libs/geo WKT parser.

Closes #43173
2019-06-18 14:41:01 -04:00
Alpar Torok 94930d0e84 Testclusters: convert ml qa tests (#43229)
* Testclusters: convert ml qa tests

This PR converts the ML tests to use testclusters.
2019-06-18 11:55:11 +03:00
David Roberts da97325790 [ML] Speed up persistent task rechecks in ML failover tests (#43291)
The ML failover tests sometimes need to wait for jobs to be
assigned to new nodes following a node failure.  They wait
10 seconds for this to happen.  However, if the node that
failed was the master node and a new master was elected then
this 10 seconds might not be long enough as a refresh of the
memory stats will delay job assignment.  Once the memory
refresh completes the persistent task will be assigned when
the next cluster state update occurs or after the periodic
recheck interval, which defaults to 30 seconds.  Rather than
increase the length of the wait for assignment to 31 seconds,
this change decreases the periodic recheck interval to 1
second.

Fixes #43289
2019-06-18 09:19:20 +01:00
David Roberts 3effe264da [ML] Fix problem with lost shards in distributed failure test (#43153)
We were stopping a node in the cluster at a time when
the replica shards of the .ml-state index might not
have been created.  This change moves the wait for
green status to a point where the .ml-state index
exists.

Fixes #40546
Fixes #41742

Forward port of #43111
2019-06-17 09:28:56 +01:00
Przemysław Witek b2613a123d
[7.x] Report exponential_avg_bucket_processing_time which gives more weight to recent buckets (#43189) (#43263) 2019-06-17 08:58:26 +02:00
David Roberts 3928c624a3 [ML] Close sample stream in post_data endpoint (#43235)
A static code analysis revealed that we are not closing
the input stream in the post_data endpoint.  This
actually makes no difference in practice, as the
particular InputStream implementation in this case is
org.elasticsearch.common.bytes.BytesReferenceStreamInput
and its close() method is a no-op. However, it is good
practice to close the stream anyway.
2019-06-14 17:54:54 +01:00
Przemysław Witek 65a584b6fb
[7.x] Report timing stats as part of the Job stats response (#42709) (#43193) 2019-06-14 09:03:14 +02:00
Jason Tedor 5bc3b7f741
Enable node roles to be pluggable (#43175)
This commit introduces the possibility for a plugin to introduce
additional node roles.
2019-06-13 15:15:48 -04:00
Ryan Ernst c3ce3f6891 Add native code info to ML info api (#43172)
The machine learning feature of xpack has native binaries with a
different commit id than the rest of code. It is currently exposed in
the xpack info api. This commit adds that commit information to the ML
info api, so that it may be removed from the info api.
2019-06-13 11:38:58 -07:00
David Roberts 43665183c2 [ML] Restrict detection of epoch timestamps in find_file_structure (#43188)
Previously 10 digit numbers were considered candidates to be
timestamps recorded as seconds since the epoch and 13 digit
numbers as timestamps recorded as milliseconds since the epoch.

However, this meant that we could detect these formats for
numbers that would represent times far in the future.  As an
example ISBN numbers starting with 9 were detected as milliseconds
since the epoch since they had 13 digits.

This change tweaks the logic for detecting such timestamps to
require that they begin with 1 or 2.  This means that numbers
that would represent times beyond about 2065 are no longer
detected as epoch timestamps.  (We can add 3 to the definition
as we get closer to the cutoff date.)
2019-06-13 13:15:41 +01:00
Dimitris Athanasiou b28e006f7c
[ML] Lock down extraction method when possible (#43104) (#43140) 2019-06-12 14:07:17 +03:00
Ryan Ernst 172cd4dbfa Remove description from xpack feature sets (#43065)
The description field of xpack featuresets is optionally part of the
xpack info api, when using the verbose flag. However, this information
is unnecessary, as it is better left for documentation (and the existing
descriptions describe anything meaningful). This commit removes the
description field from feature sets.
2019-06-11 09:22:58 -07:00
David Roberts d3136f99e6 [ML] Fix race condition when closing time checker (#43098)
The tests for the ML TimeoutChecker rely on threads
not being interrupted after the TimeoutChecker is
closed.  This change ensures this by making the
close() and setTimeoutExceeded() methods synchronized
so that the code inside them cannot execute
simultaneously.

Fixes #43097
2019-06-11 16:39:17 +01:00
Benjamin Trent 79052050bf
[ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds (#42969) (#43069)
* [ML] Adding support for geo_shape, geo_centroid, geo_point in datafeeds

* only supporting doc_values for geo_point fields

* moving validation into GeoPointField ctor
2019-06-10 21:52:53 -05:00
Benjamin Trent eadfe05587
[ML] Changes slice specification to auto. See #42996 (#43039) (#43070) 2019-06-10 21:52:22 -05:00
Dimitris Athanasiou 76a92b49a8
[ML] Get resources action should be lenient when sort field is unmapped (#42991) (#43046)
Get resources action sorts on the resource id. When there are no resources at
all, then it is possible the index does not contain a mapping for the resource
id field. In that case, the search api fails by default.

This commit adjusts the search request to ignore unmapped fields.

Closes elastic/kibana#37870
2019-06-10 19:50:19 +03:00
Alan Woodward 8e23e4518a Move construction of custom analyzers into AnalysisRegistry (#42940)
Both TransportAnalyzeAction and CategorizationAnalyzer have logic to build
custom analyzers for index-independent analysis. A lot of this code is duplicated,
and it requires the AnalysisRegistry to expose a number of internal provider
classes, as well as making some assumptions about when analysis components are
constructed.

This commit moves the build logic directly into AnalysisRegistry, reducing the
registry's API surface considerably.
2019-06-10 14:33:25 +01:00
David Turner 68339f90e9 Mute AutodetectMemoryLimitIT#testTooManyPartitions
Relates #43013
2019-06-10 09:20:36 +01:00
Henning Andersen dea935ac31
Reindex max_docs parameter name (#42942)
Previously, a reindex request had two different size specifications in the body:
* Outer level, determining the maximum documents to process
* Inside the source element, determining the scroll/batch size.

The outer level size has now been renamed to max_docs to
avoid confusion and clarify its semantics, with backwards compatibility and
deprecation warnings for using size.
Similarly, the size parameter has been renamed to max_docs for
update/delete-by-query to keep the 3 interfaces consistent.

Finally, all 3 endpoints now support max_docs in both body and URL.

Relates #24344
2019-06-07 12:16:36 +02:00
David Roberts 40c827a3b8 [ML] Close sample stream in find_file_structure endpoint (#42896)
A static code analysis revealed that we are not closing
the input stream in the find_file_structure endpoint.
This actually makes no difference in practice, as the
particular InputStream implementation in this case is
org.elasticsearch.common.bytes.BytesReferenceStreamInput
and its close() method is a no-op.  However, it is good
practice to close the stream anyway.
2019-06-06 11:03:45 +01:00
David Roberts b202a59f88 [ML] Add earliest and latest timestamps to field stats (#42890)
This change adds the earliest and latest timestamps into
the field stats for fields of type "date" in the output of
the ML find_file_structure endpoint.  This will enable the
cards for date fields in the file data visualizer in the UI
to be made to look more similar to the cards for date
fields in the index data visualizer in the UI.
2019-06-06 08:58:35 +01:00
David Roberts d5baedb789 [ML] Change dots in CSV column names to underscores (#42839)
Dots in the column names cause an error in the ingest
pipeline, as dots are special characters in ingest pipeline.
This PR changes dots into underscores in CSV field names
suggested by the ML find_file_structure endpoint _unless_
the field names are specifically overridden.  The reason for
allowing them in overrides is that fields that are not
mentioned in the ingest pipeline can contain dots.  But it's
more consistent that the default behaviour is to replace
them all.

Fixes elastic/kibana#26800
2019-06-05 11:28:33 +01:00
Mark Vieira e44b8b1e2e
[Backport] Remove dependency substitutions 7.x (#42866)
* Remove unnecessary usage of Gradle dependency substitution rules (#42773)

(cherry picked from commit 12d583dbf6f7d44f00aa365e34fc7e937c3c61f7)
2019-06-04 13:50:23 -07:00
David Roberts b61202b0a8 [ML] Add a limit on line merging in find_file_structure (#42501)
When analysing a semi-structured text file the
find_file_structure endpoint merges lines to form
multi-line messages using the assumption that the
first line in each message contains the timestamp.
However, if the timestamp is misdetected then this
can lead to excessive numbers of lines being merged
to form massive messages.

This commit adds a line_merge_size_limit setting
(default 10000 characters) that halts the analysis
if a message bigger than this is created.  This
prevents significant CPU time being spent subsequently
trying to determine the internal structure of the
huge bogus messages.
2019-06-03 13:45:51 +01:00
David Roberts 10aca87389 [ML] Better detection of binary input in find_file_structure (#42707)
This change helps to prevent the situation where a binary
file uploaded to the find_file_structure endpoint is
detected as being text in the UTF-16 character set, and
then causes a large amount of CPU to be spent analysing
the bogus text structure.

The approach is to check the distribution of zero bytes
between odd and even file positions, on the grounds that
UTF-16BE or UTF16-LE would have a very skewed distribution.
2019-06-03 12:47:22 +01:00
David Roberts 48dc0dca57 [ML] Use map and filter instead of flatMap in find_file_structure (#42534)
Using map and filter avoids the garbage from all the
Stream.of calls that flatMap necessitated. Performance
is better when there are masses of fields.
2019-05-24 20:12:06 +01:00
David Roberts 34de68b007 [ML] Fix possible race condition when closing an opening job (#42506)
This change fixes a race condition that would result in an
in-memory data structure becoming out-of-sync with persistent
tasks in cluster state.

If repeated often enough this could result in it being
impossible to open any ML jobs on the affected node, as the
master node would think the node had capacity to open another
job but the chosen node would error during the open sequence
due to its in-memory data structure being full.

The race could be triggered by opening a job and then closing
it a tiny fraction of a second later.  It is unlikely a user
of the UI could open and close the job that fast, but a script
or program calling the REST API could.

The nasty thing is, from the externally observable states and
stats everything would appear to be fine - the fast open then
close sequence would appear to leave the job in the closed
state.  It's only later that the leftovers in the in-memory
data structure might build up and cause a problem.
2019-05-24 20:11:58 +01:00
David Roberts f472186b9f [ML] Improve file structure finder timestamp format determination (#41948)
This change contains a major refactoring of the timestamp
format determination code used by the ML find file structure
endpoint.

Previously timestamp format determination was done separately
for each piece of text supplied to the timestamp format finder.
This had the drawback that it was not possible to distinguish
dd/MM and MM/dd in the case where both numbers were 12 or less.
In order to do this sensibly it is best to look across all the
available timestamps and see if one of the numbers is greater
than 12 in any of them.  This necessitates making the timestamp
format finder an instantiable class that can accumulate evidence
over time.

Another problem with the previous approach was that it was only
possible to override the timestamp format to one of a limited
set of timestamp formats.  There was no way out if a file to be
analysed had a timestamp that was sane yet not in the supported
set.  This is now changed to allow any timestamp format that can
be parsed by a combination of these Java date/time formats:
yy, yyyy, M, MM, MMM, MMMM, d, dd, EEE, EEEE, H, HH, h, mm, ss,
a, XX, XXX, zzz
Additionally S letter groups (fractional seconds) are supported
providing they occur after ss and separated from the ss by a dot,
comma or colon.  Spacing and punctuation is also permitted with
the exception of the question mark, newline and carriage return
characters, together with literal text enclosed in single quotes.

The full list of changes/improvements in this refactor is:

- Make TimestampFormatFinder an instantiable class
- Overrides must be specified in Java date/time format - Joda
  format is no longer accepted
- Joda timestamp formats in outputs are now derived from the
  determined or overridden Java timestamp formats, not stored
  separately
- Functionality for determining the "best" timestamp format in
  a set of lines has been moved from TextLogFileStructureFinder
  to TimestampFormatFinder, taking advantage of the fact that
  TimestampFormatFinder is now an instantiable class with state
- The functionality to quickly rule out some possible Grok
  patterns when looking for timestamp formats has been changed
  from using simple regular expressions to the much faster
  approach of using the Shift-And method of sub-string search,
  but using an "alphabet" consisting of just 1 (representing any
  digit) and 0 (representing non-digits)
- Timestamp format overrides are now much more flexible
- Timestamp format overrides that do not correspond to a built-in
  Grok pattern are mapped to a %{CUSTOM_TIMESTAMP} Grok pattern
  whose definition is included within the date processor in the
  ingest pipeline
- Grok patterns that correspond to multiple Java date/time
  patterns are now handled better - the Grok pattern is accepted
  as matching broadly, and the required set of Java date/time
  patterns is built up considering all observed samples
- As a result of the more flexible acceptance of Grok patterns,
  when looking for the "best" timestamp in a set of lines
  timestamps are considered different if they are preceded by
  a different sequence of punctuation characters (to prevent
  timestamps far into some lines being considered similar to
  timestamps near the beginning of other lines)
- Out-of-the-box Grok patterns that are considered now include
  %{DATE} and %{DATESTAMP}, which have indeterminate day/month
  ordering
- The order of day/month in formats with indeterminate day/month
  order is determined by considering all observed samples (plus
  the server locale if the observed samples still do not suggest
  an ordering)

Relates #38086
Closes #35137
Closes #35132
2019-05-24 09:10:08 +01:00
Dimitris Athanasiou a6eb20ad35
[ML] Include node name when native controller cannot start process (#42225) (#42338)
This adds the node name where we fail to start a process via the native
controller to facilitate debugging as otherwise it might not be known
to which node the job was allocated.
2019-05-22 12:42:04 +03:00
Yannick Welsch 770d8e9e39 Remove usage of max_local_storage_nodes in test infrastructure (#41652)
Moves the test infrastructure away from using node.max_local_storage_nodes, allowing us in a
follow-up PR to deprecate this setting in 7.x and to remove it in 8.0.

This also changes the behavior of InternalTestCluster so that starting up nodes will not automatically
reuse data folders of previously stopped nodes. If this behavior is desired, it needs to be explicitly
done by passing the data path from the stopped node to the new node that is started.
2019-05-22 11:04:55 +02:00
Ed Savage d97f4d5e28 [ML][TEST] Fix limits in AutodetectMemoryLimitIT (#42279)
Re-enable muted tests and accommodate recent backend changes
that result in higher memory usage being reported for a job
at the start of its life-cycle
2019-05-21 18:44:47 +01:00
Dimitris Athanasiou a4e6fb4dd2
[ML] Fix logger declaration in ML plugins (#42222) (#42238)
This corrects what appears to have been a copy-paste error
where the logger for `MachineLearning` and `DataFrame` was wrongly
set to be that of `XPackPlugin`.
2019-05-21 18:03:24 +03:00
Zachary Tong 6ae6f57d39
[7.x Backport] Force selection of calendar or fixed intervals (#41906)
The date_histogram accepts an interval which can be either a calendar
interval (DST-aware, leap seconds, arbitrary length of months, etc) or
fixed interval (strict multiples of SI units). Unfortunately this is inferred
by first trying to parse as a calendar interval, then falling back to fixed
if that fails.

This leads to confusing arrangement where `1d` == calendar, but
`2d` == fixed.  And if you want a day of fixed time, you have to
specify `24h` (e.g. the next smallest unit).  This arrangement is very
error-prone for users.

This PR adds `calendar_interval` and `fixed_interval` parameters to any
code that uses intervals (date_histogram, rollup, composite, datafeed, etc).
Calendar only accepts calendar intervals, fixed accepts any combination of
units (meaning `1d` can be used to specify `24h` in fixed time), and both
are mutually exclusive.

The old interval behavior is deprecated and will throw a deprecation warning.
It is also mutually exclusive with the two new parameters. In the future the
old dual-purpose interval will be removed.

The change applies to both REST and java clients.
2019-05-20 12:07:29 -04:00
Ed Savage 840af87a74 [ML] Temporarily muting failing tests
Muting a number of AutoDetectMemoryLimitIT tests to give CI a chance to
settle before easing in required backend changes.

relates elastic/ml-cpp#486
relates #42086
2019-05-19 08:29:50 -04:00
Ed Savage a68b04e47b [ML] Improve hard_limit audit message (#42086)
Improve the hard_limit memory audit message by reporting how many bytes
over the configured memory limit the job was at the point of the last
allocation failure.

Previously the model memory usage was reported, however this was
inaccurate and hence of limited use -  primarily because the total
memory used by the model can decrease significantly after the models
status is changed to hard_limit but before the model size stats are
reported from autodetect to ES.

While this PR contains the changes to the format of the hard_limit audit
message it is dependent on modifications to the ml-cpp backend to
send additional data fields in the model size stats message. These
changes will follow in a subsequent PR. It is worth noting that this PR
must be merged prior to the ml-cpp one, to keep CI tests happy.
2019-05-17 17:40:08 -04:00
David Roberts 226df35d96 [ML] Improve message misformation error in file structure finder (#42175)
This change replaces the extremely unfriendly message
"Number of messages analyzed must be positive" in the
case where the sample lines were incorrectly grouped
into just one message to an error that more helpfully
explains the likely root cause of the problem.
2019-05-16 18:29:38 +01:00
Benjamin Trent bf5a40c754
[ML] relax set upgrade mode test to match what is guaranteed (#41958) (#41979)
* [ML] relax set upgrade mode test to match what is guaranteed

* removing unused import
2019-05-09 14:28:50 -05:00
Jason Tedor d7fd51a84e
Provide names for all artifact repositories (#41857)
This commit adds a name for each Maven and Ivy repository used in the
build.
2019-05-07 06:35:28 -04:00
Ryan Ernst 6fd8924c5a Switch run task to use real distro (#41590)
The run task is supposed to run elasticsearch with the given plugin or
module. However, for modules, this is most realistic if using the full
distribution. This commit changes the run setup to use the default or
oss as appropriate.
2019-05-06 12:34:07 -07:00
Jason Tedor f4da98ca3d
Use a proper repository for ml-cpp artifacts (#41817)
This switches the strategy used to download machine learning artifacts
from a manual download through S3 to using an Ivy repository on top of
S3. This gives us all the benefits of Gradle dependency resolution
including local caching.
2019-05-04 12:44:19 -04:00
Jason Tedor 241c4ef97a
Use https for artifact locations
This commit switches to using https for some artifact locations.
2019-05-03 16:15:48 -04:00
Benjamin Trent a92c06ae09
[ML] Refactor NativeStorageProvider to enable reuse (#41414) (#41746)
* [ML] Refactor NativeStorageProvider to enable reuse

Moves `NativeStorageProvider` as a machine learning component
so that it can be reused for other job types. Also, we now
pass the persistent task description as unique identifier which
avoids conflicts between jobs of different type but with same ids.

* Adding nativeStorageProvider as component

Since `TransportForecastJobAction` is expected to get injected a `NativeStorageProvider` class, we need to make sure that it is a constructed component, as it does not have a zero parametered, public ctor.
2019-05-02 09:46:22 -05:00
Jason Tedor 7f3ab4524f
Bump 7.x branch to version 7.2.0
This commit adds the 7.2.0 version constant to the 7.x branch, and bumps
BWC logic accordingly.
2019-05-01 13:38:57 -04:00
Tom Veasey b3f4533e1c [ML] Update for model selection change and disable temporarily (#41482) (#41682) 2019-04-30 15:47:54 -05:00
Yogesh Gaikwad c0d40ae4ca
Remove deprecated stashWithOrigin calls and use the alternative (#40847) (#41562)
This commit removes the deprecated `stashWithOrigin` and
modifies its usage to use the alternative.
2019-04-28 21:25:42 +10:00
Christoph Büscher 52495843cc [Docs] Fix common word repetitions (#39703) 2019-04-25 20:47:47 +02:00