Commit Graph

5559 Commits

Author SHA1 Message Date
Przemysław Witek 6b5f49d097
[7.x] Introduce ModelPlotConfig. annotations_enabled setting (#57539) (#57641) 2020-06-04 15:15:35 +02:00
Benjamin Trent ea9b8b9d41
[ML] fix setting forecasts to failed method (#57654) (#57656) 2020-06-04 08:54:46 -04:00
Rene Groeschke 751f16858b
Remove duplicate ssl setup in sql/qa projects (#57319) (#57643)
* Remove duplicate ssl setup in sql/qa projects
* Fix enforcement of task instances
* Use static data for cert generation
* Move ssl testing logic into a plugin
* Document test cert creation
2020-06-04 14:53:23 +02:00
Marios Trivyzas 5f8442d1f4
SQL: Improve performances of LTRIM/RTRIM (#57603)
Change custom stripping leading and trailing whitespaces implementation
to substantially improves performance:
```
Benchmark                         Mode  Cnt      Score     Error  Units
StringTrim.testWithStringBuilder  avgt   25  82547.575 ±  66.244  ns/op (existing impl)
StringTrim.testWithSubstring      avgt   25   1398.762 ± 101.152  ns/op (new impl)
StringTrim.testWithJavaStrip      avgt   25   1186.120 ±  10.374  ns/op (for reference)
```
Java's string stripLeading()/stripTrailing() not available to all supported JDKs.

Enhanced LENGTH unit tests and compine a couple of LTRIM/RTRIM integ
tests.

Relates to: #57594
(partially cherry picked from commit ee7868d68733f195dc46926a7eab3d9dd7033ef4)

Co-authored-by: Bogdan Pintea <bogdan.pintea@elastic.co>
2020-06-04 13:43:49 +02:00
Igor Motov 8d7f389f3a
Increase search.max_buckets to 65,535 (#57042)
Increases the default search.max_buckets limit to 65,535, and only counts
buckets during reduce phase.

Closes #51731
2020-06-03 15:35:41 -04:00
Julie Tibshirani e0a15e8dc4
Remove the 'array value parser' marker interface. (#57571) (#57622)
This PR replaces the marker interface with the method
FieldMapper#parsesArrayValue. I find this cleaner and it will help with the
fields retrieval work (#55363).

The refactor also ensures that only field mappers can declare they parse array
values. Previously other types like ObjectMapper could implement the marker
interface and be passed array values, which doesn't make sense.
2020-06-03 11:30:14 -07:00
Marios Trivyzas a674844893
SQL: Implement TRIM function (#57518) (#57593)
Add `TRIM` function which combines the functionality of both
`LTRIM` and `RTRIM` by stripping both leading and trailing
whitespaces.

Refers to #41195

(cherry picked from commit 6c86c919e12f0c4cb5e39d129aa65ab3e274268f)
2020-06-03 15:19:48 +02:00
Ioannis Kakavas 64583f7ec4
Mute EmailSslTests test case in fips (#57576) (#57577)
We test expected TLS failures by catching SSLException, but other
security providers ( i.e. BCFIPS ) might throw a different one. In
this case, BCFIPS throws org.bouncycastle.tls.TlsFatalAlert
2020-06-03 11:23:31 +03:00
Marios Trivyzas 634936e3be
SQL: [Tests] Enable tests which have been fixed (#57526) (#57538)
Enable integration tests for issues that have been fixed
over time.

(cherry picked from commit 117759ee152bcfb0043e5af3a784302ca31f6b8c)
2020-06-02 23:38:33 +02:00
Nik Everett 2a27c411fb
Same memory when geo aggregations are not on top (#57483) (#57551)
Saves memory when the `geotile_grid` and `geohash_grid` are not on the
top level by using the `LongKeyedBucketOrds` we built in #55873.
2020-06-02 16:21:50 -04:00
Dan Hermann 97a51272b0
Fix incorrect log warning when exporting monitoring via HTTP without authentication (#57552) 2020-06-02 15:03:55 -05:00
Mark Tozzi e50f514092
IndexFieldData should hold the ValuesSourceType (#57373) (#57532) 2020-06-02 12:16:53 -04:00
Armin Braun ba2d70d8eb
Serialize Outbound Messages on IO Threads (#56961) (#57080)
Almost every outbound message is serialized to buffers of 16k pagesize.
We were serializing these messages off the IO loop (and retaining the concrete message
instance as well) and would then enqueue it on the IO loop to be dealt with as soon as the
channel is ready.
1. This would cause buffers to be held onto for longer than necessary, causing less reuse on average.
2. If a channel was slow for some reason, not only would concrete message instances queue up for it, but also 16k of buffers would be reserved for each message until it would be written+flushed physically.

With this change, the serialization happens on the event loop which effectively limits the number of buffers that `N` IO-threads will ever use so long as messages are small and channels writable.
Also, this change dereferences the reference to the concrete outbound message as soon as it has been serialized to save some more on GC.

This reduces the GC time for a default PMC run by about 50% in experiments (3 nodes, 2G heap each, loopback ... obvious caveat is that GC isn't that heavy in the first place with recent changes but still a measurable gain).
I also expect it to be helpful for master node stability by causing less of a spike if master is e.g. hit by a large number of requests that are processed batched (e.g. shard snapshot status updates) and responded to in a short time frame all at once.

Obviously, the downside to this change is that it introduces more latency on the IO loop for the serialization. But since we read all of these messages on the IO loop as well I don't see it as much of a qualitative change really and the more predictable buffer use seems much more valuable relatively.
2020-06-02 16:15:18 +02:00
Rene Groeschke 8584da40af
Move classes from build scripts to buildSrc (#57197) (#57512)
* Move classes from build scripts to buildSrc

- move Run task
- move duplicate SanEvaluator

* Remove :run workaround

* Some little cleanup on build scripts on the way
2020-06-02 15:33:53 +02:00
Andrei Dan bd188f4a21
[7.x] ILM: add support for rolling over data streams (#57295) (#57515)
As the datastream information is stored in the `ClusterState.Metadata` we exposed
the `Metadata` to the `AsyncWaitStep#evaluateCondition` method in order for
the steps to be able to identify when a managed index is part of a DataStream.

If a managed index is part of a DataStream the rollover target is the DataStream
name and the highest generation index is the write index (ie. the rolled index).

(cherry picked from commit 6b410dfb78f3676fce1b7401f1628c1ca6fbd45a)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-06-02 11:55:23 +01:00
Przemysław Witek ea6cfb7c3d
[7.x] Make Annotation a result type (#56342) (#57508) 2020-06-02 11:56:41 +02:00
Tanguy Leroux b4a2cd810a
Use 3rd party task to run integration tests on external service (#56588)
Backport of #56587 for 7.x
2020-06-02 11:26:58 +02:00
Marios Trivyzas 52c555e286
SQL: Make CASTing string to DATETIME more lenient (#57451) (#57509)
Some BI tools (i.e. Tableau) would try to cast strings where the time
part is separated from the date part with a whitespace instead of `T`.
Adjust type conversion used by CAST to support this.

(cherry picked from commit 0e18321e7ad9f779c42855efbf93f171b9128a5e)
2020-06-02 10:54:03 +02:00
Marios Trivyzas b8a13de20f
SQL: Implement TOP as an alternative to LIMIT (#57428) (#57507)
Add basic support for `TOP X` as a synonym to LIMIT X which is used
by [MS-SQL server](https://docs.microsoft.com/en-us/sql/t-sql/queries/top-transact-sql?view=sql-server-ver15),
e.g.:

```
SELECT TOP 5 a, b, c FROM test
```

TOP in SQL server also supports the `PERCENTAGE` and `WITH TIES`
keywords which this implementation doesn't.

Don't allow usage of both TOP and LIMIT in the same query.

Refers to #41195

(cherry picked from commit 2f5ab81b9ad884434d1faa60f4391f966ede73e8)
2020-06-02 10:53:42 +02:00
Przemysław Witek ceb4b29b98
Introduce Annotation.event field (#57144) (#57453) 2020-06-01 20:42:25 +02:00
Mark Tozzi 1f500583b1
Clean up Aggregator Supplier Boiler Plate (#57442) (#57452) 2020-06-01 14:21:07 -04:00
Zachary Tong daaf5a3dcc
Fix assertion catching in aggregation supported type test (#56466) (#57382)
At some point, we changed the supported-type test to also catch
assertion errors.  This has the side effect of also catching the
`fail()` call inside the try-catch, which silently smothered some
failures.

This modifies the test to throw at the end of the try-catch
block to prevent from accidentally catching itself.

Catching the AssertionError is convenient because there are other locations
that do throw an assertion in tests (due to hitting an assertion
before the exception is thrown) so I think we should keep it around.

Also includes a variety of fixes to other tests which were failing
but being silently smothered.
2020-06-01 12:10:05 -04:00
David Kyle 064093c4d4 Fix compilation after backport of #57278 2020-06-01 12:03:13 +01:00
Przemysław Witek 72ad9a4548
[7.x] Make AnnotationPersister use bulk requests instead of indexing individual documents (#57278) (#57354) 2020-06-01 12:05:09 +02:00
David Roberts 9fdf1722e6
[TEST] Fix more allowed warnings for composable template rename (#57398)
Should have been done in #57232
2020-05-31 18:14:48 +01:00
Benjamin Trent 34f1e0b6bb
[7.x] [ML] mark forecasts for force closed/failed jobs as failed (#57143) (#57374)
* [ML] mark forecasts for force closed/failed jobs as failed (#57143)

forecasts that are still running should be marked as failed/finished in the following scenarios:

- Job is force closed
- Job is re-assigned to another node.

Forecasts are not "resilient". Their execution does not continue after a node failure. Consequently, forecasts marked as STARTED or SCHEDULED should be flagged as failed. These forecasts can then be deleted.

Additionally, force closing a job kills the native task directly. This means that if a forecast was running, it is not allowed to complete and could still have the status of `STARTED` in the index.

relates to https://github.com/elastic/elasticsearch/issues/56419
2020-05-29 14:48:10 -04:00
Benjamin Trent 35d5126cea
[7.x] [ML] adds new for_export flag to GET _ml/inference API (#57351) (#57368)
* [ML] adds new for_export flag to GET _ml/inference API (#57351)

Adds a new boolean flag, `for_export` to the `GET _ml/inference/<model_id>` API.

This flag is useful for moving models between clusters.
2020-05-29 14:01:08 -04:00
Benjamin Trent 15aba60c02
[7.x] Add new circuitbreaker plugin and refactor CircuitBreakerService (#55695) (#57359)
* Add new circuitbreaker plugin and refactor CircuitBreakerService (#55695)

This commit lays the ground work for plugins supplying their own circuit breakers.

It adds a new interface: `CircuitBreakerPlugin`.

This interface provides methods for providing custom child CircuitBreaker objects. There are also facilities for allowing dynamic settings for the custom breakers.

With the refactor, circuit breakers are no longer replaced on setting changes. Instead, the two mutable settings themselves are `volatile`. Plugins that want to use their custom circuit breaker should keep a reference of their constructed breaker.
2020-05-29 12:13:46 -04:00
Benjamin Trent c8374dc9f3
[ML] add max_model_memory parameter to forecast request (#57254) (#57355)
This adds a max_model_memory setting to forecast requests. 
This setting can take a string value that is formatted according to byte sizes (i.e. "50mb", "150mb").

The default value is `20mb`.

There is a HARD limit at `500mb` which will throw an error if used.

If the limit is larger than 40% the anomaly job's configured model limit, the forecast limit is reduced to be strictly lower than that value. This reduction is logged and audited.

related native change: https://github.com/elastic/ml-cpp/pull/1238

closes: https://github.com/elastic/elasticsearch/issues/56420
2020-05-29 11:16:08 -04:00
Marios Trivyzas b2651323fd
SQL: Implement TIME_PARSE function for parsing strings into TIME values (#55223) (#57342)
Implement TIME_PARSE(<time_str>, <pattern_str>) function
which allows to parse a time string according to the specified
pattern into a time object. The patterns allowed are those of
java.time.format.DateTimeFormatter.

Closes #54963

Co-authored-by: Andrei Stefan <astefan@users.noreply.github.com>
Co-authored-by: Patrick Jiang(白泽) <patrickjiang0530@gmail.com>

(cherry picked from commit 1fe1188d449cad7d0782a202372edc52a4014135)
2020-05-29 15:48:37 +02:00
Dan Hermann 6b0d707671
[7.x] Do not report negative values for swap sizes (#57353) 2020-05-29 08:11:47 -05:00
Martijn van Groningen 04ef39da77
Change cluster info actions to be able to resolve data streams. (#57343)
Backport of #56878 to 7.x branch.

With this change the following APIs will be able to resolve data streams:
get index, get mappings and ilm explain APIs.

Relates to #53100
2020-05-29 12:17:53 +02:00
Dimitris Athanasiou 322f953060
[7.x][ML] Anomaly detection jobs should allow missing values for geo fields (#57300) (#57338)
Allows geo fields (`geo_point`, `geo_shape`) to have missing values.
Fixes a bug where such missing values would result in an error.

Closes #57299

Backport of #57300
2020-05-29 13:06:16 +03:00
Benjamin Trent 24d605e41e
[ML] fixing GET _ml/inference so size param is respected (#57303) (#57308)
`size` was previously ignored when grabbing full trained model configs. 

closes https://github.com/elastic/elasticsearch/issues/57298
2020-05-28 15:45:26 -04:00
Martijn van Groningen 225ccd1cfa
Ensure template exists when creating data stream (#57275)
Backporting #56888 to 7.x branch.

Limit the creation of data streams only for namespaces that have a composable template with a data stream definition.

This way we ensure that mappings/settings have been specified and will be used at data stream creation and data stream rollover.

Also remove `timestamp_field` parameter from create data stream request and
let the create data stream api resolve the timestamp field
from the data stream definition snippet inside a composable template.

Relates to #53100
2020-05-28 15:08:25 +02:00
Marios Trivyzas fdac9e99fa
SQL: Fix unecessary evaluation for CASE/IIF (#57159) (#57262)
Previously, `CASE` and `IIF` when translated to painless scripts
(used in GROUP BY, HAVING, WHERE) a custom `caseFunction`
registered in the `InternalSqlScriptUtils` was used. This function
received and array of arbitrary length:
```[condition1, result1, condition2, result2, ... elseResult]```

Painless doesn't know of the context and therefore is evaluating
all conditions and results before invoking the `caseFunction` on them.
As a consequence, erroneous result expressions (i.e. division by 0)
where always evaluated despite of the guarding condition.

Replace the `caseFunction` with painless `<cond> ? <res1> : <res2>`
expressions to properly guard the result expressions and only evaluate
the one for which its guarding condition evaluates to true (or of course
the elseResult).

As a bonus, this approach includes performance benefits since we avoid
unnecessary evaluations of both conditions and result expressions.

Fixes: #49672
(cherry picked from commit 9584b345d89f797bfb658212b928b9812804f02f)
2020-05-28 11:30:14 +02:00
Tim Vernum 408250dcc4
Fix smtp.ssl.trust setting for watcher email (#57268)
The ssl.trust setting for Watcher provides a list of hostnames that
should be automatically trusted for SSL hostname verification. It was
accidentally broken when we added the full ssl.* settings for email
notifications (see #45272)

This commit corrects this, so the setting is once again respected,
as long as none of the other ssl settings are configured for email
notifications.

Resolves: #52153
Backport of: #56090
2020-05-28 17:34:13 +10:00
Ryan Ernst fdb8573413
Convert remaining compilerJavaHome reference 2020-05-27 17:04:04 -07:00
Ryan Ernst beb1d0c338
Remove compiler java version flag (#57237)
This commit removes the compiler.java setting from the build. It was
originally added when Gradle was far behind support for the latest jdk,
but is no longer applicable as we don't have any need to update the
supported compile version before gradle supports the newer version. Note
that the runtime version changing support still exists here, this only
ensures we use the same jdk to compile as we use to run gradle.
2020-05-27 16:33:38 -07:00
David Roberts d139a79ef6
[7.x][ML] Fix monitoring if orphaned anomaly detector persistent tasks exist (#57240)
Since #51888 the ML job stats endpoint has returned entries for
jobs that have a persistent task but not job config. Such
orphaned tasks caused monitoring to fail.

This change ignores any such corrupt jobs for monitoring purposes.

Backport of #57235
2020-05-27 22:59:11 +01:00
James Baiera 3b73ce3112
Fix enrich coordinator to reject documents instead of deadlocking (#56247) (#57179)
This PR removes the blocking call to insert ingest documents into a queue in the
coordinator. It replaces it with an offer call which will throw a rejection exception
in the event that the queue is full. This prevents deadlocks of the write threads
when the queue fills to capacity and there are more than one enrich processors
in a pipeline.
2020-05-27 15:32:13 -04:00
Lee Hinman c0f732b9f6
[7.x] Rename template V2 classes to ComposableTemplate (#57183) (#57232)
Backports the following commits to 7.x:

    Rename template V2 classes to ComposableTemplate (#57183)
2020-05-27 11:01:59 -06:00
AndyHunt66 6760c69783 [DOCS] Fix formatting of create API key API docs (#57138) 2020-05-27 08:34:51 -04:00
Tal Levy 81060820e9 Fix NormalizerAgg test searcher wrapping (#57171)
The searcher was randomly wrapping its reader as slow, parallel, or filtered.
This was causing casting issues in the normalizer tests. By removing the
wrapping, the problem goes away.

Closes #57164
2020-05-26 13:25:19 -07:00
Benjamin Trent decc6277f9
[ML] allow unran/incomplete forecasts to be deleted for stopped/failed jobs (#57152) (#57172)
If a job is NOT opened, forecasts should be able to be deleted, no matter their state.

This also fixes a bug with expanding forecast IDs. We should check for wildcard `*` and `_all` when expanding the ids

closes https://github.com/elastic/elasticsearch/issues/56419
2020-05-26 15:44:22 -04:00
Bogdan Pintea 74b2c8a770 Change error message for comp against fields (#57126)
Change the error message wording for comparisons against fields in
filtering (s/variables/fields).

(cherry picked from commit d9a1cb50940d0a98fd75b9c0123ca6e1d862f65d)
2020-05-26 17:57:51 +02:00
Bogdan Pintea 0c379e334a SQL: update the JLine dependency to 3.14.1 (#57111)
* Update the JLine dependency to 3.14.1

Update the JLine dependency from 3.10.0 to 3.14.1.

(cherry picked from commit c2d9b74046fa5ddb54604da3afa7887cc38548a1)
2020-05-26 17:56:34 +02:00
markharwood b2bc6071fd
Add regex query support to wildcard field (approach 2) (#55548) (#57141)
Backport of #55548

Adds equivalence for keyword field to the wildcard field. Regex, fuzzy, wildcard and prefix queries are all supported.
All queries use an approximation query backed by an automaton-based verification queries.

Closes #54275
2020-05-26 16:55:59 +01:00
markharwood 1d74549d7f
Wildcard field - add support for null field with test (#57047) (#57139)
Backport of #57047
2020-05-26 16:07:49 +01:00
David Kyle 571477d0ad
[7.x] Fix delete_expired_data/nightly maintenance when many model snapshots need deleting (#57041) (#57136)
Fix delete_expired_data/nightly maintenance when 
many model snapshots need deleting (#57041)

The queries performed by the expired data removers pull back entire 
documents when only a few fields are required. For ModelSnapshots in 
particular this is a problem as they contain quantiles which may be 
100s of KB and the search size is set to 10,000.

This change makes the search more efficient by only requesting the 
fields needed to work out which expired data should be deleted.
2020-05-26 10:56:42 +01:00