52078 Commits

Author SHA1 Message Date
David Kyle
39020f3900
HLRC for delete expired data by job Id (#57722) (#57975)
High level rest client changes for #57337
2020-06-12 09:44:17 +01:00
Martijn van Groningen
c8031c6f99
Add data stream support to the reindex api. (#57970)
Backport of #57870 to 7.x branch.

This change now also copies the op_type from the reindex request's destination index request to the actual index request being used in the bulk request.

For ensuring no document exists, the op_type create doesn't need to be copied, since Versions.MATCH_DELETED will copied from the 'mainRequest.getDestination().version()'.
The `version()` method on IndexRequest only returns Versions.MATCH_DELETED if op_type=create and no specific version has been specified.

However in order to be able to index into a data stream, the op_type must be create. So in order to support that the op_type must be copied from the reindex request's destination index request to the actual index request being used in the bulk request.

Relates to #53100 and #57788
2020-06-12 09:54:37 +02:00
Rene Groeschke
5226fef321 Update Gradle wrapper to 6.5 (#57580) (#57653)
* Update Gradle wrapper to 6.5
* Fix groovy incompatibility issue after gradle update
* Fix Gstring String incompatibility
2020-06-12 08:38:16 +02:00
Ryan Ernst
3bc2601ba3
Re-enable packaging tests for windows (#58010)
This commit fixes the gc logfile name for windows on java 8, and re-enables windows testing of the archive tests.

closes #50825
2020-06-11 16:26:24 -07:00
James Rodewig
bf90b6f221 [DOCS] Remove extra word from data stream docs 2020-06-11 17:44:59 -04:00
Mark Tozzi
36f551bdb4
Make ValuesSourceConfig behave like a config object (#57762) (#58012) 2020-06-11 17:23:55 -04:00
Igor Motov
5138c0c045
Fix missing null values for std_deviation_bounds in ext. stats aggs (#58000)
Adds missing null values for std_deviation_bounds in extended stats aggs and
improves null handling in parsed extended stats.
2020-06-11 16:23:20 -04:00
James Rodewig
1814b66a69 [DOCS] Fix typos in data stream docs 2020-06-11 16:21:09 -04:00
Benjamin Trent
2881995a45
[ML] adding new inference model size estimate handling from native process (#57930) (#57999)
Adds support for reading in `model_size_info` objects.

These objects contain numeric values indicating the model definition size and complexity.

Additionally, these objects are not stored or serialized to any other node. They are to be used for calculating and storing model metadata. They are much smaller on heap than the true model definition and should help prevent the analytics process from using too much memory.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-06-11 15:59:23 -04:00
Lee Hinman
ffc3c77f75
[7.x] Disallow deletion of composable template if in use by data stream (#57957) (#57994)
Backports the following commits to 7.x:

    Disallow deletion of composable template if in use by data stream (#57957)
2020-06-11 13:51:56 -06:00
Mark Vieira
d9e547dbd3
Revert "Re-enable windows archives packaging tests (#57955)"
This reverts commit 573c6279af7adbe25c11733401b47e7139cf4168.
2020-06-11 11:58:56 -07:00
Lisa Cawley
7442808869 [DOCS] Rename monitoring collection from internal to legacy (#56395) 2020-06-11 10:21:01 -07:00
Jim Ferenczi
4c6bfe32a7 Fix possible NPE on search phase failure (#57952)
When a search phase fails, we release the context of all successful shards.
Successful shards that rewrite the request to match none will not create any context
since #. This change ensures that we don't try to release a `null` context on these
successful shards.

Closes #57945
2020-06-11 18:54:16 +02:00
James Rodewig
c36df27730
[DOCS] Reformat pattern_replace token filter (#57699) (#57995)
Changes:

* Rewrites description and adds Lucene link
* Adds analyze example
* Adds parameter definitions
* Adds custom analyzer example
2020-06-11 12:19:38 -04:00
Yannick Welsch
85b0b540f0 Fix refresh behavior in MockDiskUsagesIT (#57926)
Ensures that InternalClusterInfoService's internally cached stats are refreshed whenever the
shard size or disk usage function (to mock out disk usage) are overridden.

Closes #57888
2020-06-11 17:38:12 +02:00
James Rodewig
6fc8317f07
[DOCS] Reformat data streams intro and overview (#57954) (#57993)
Changes:

* Updates 'Data streams' intro page to focus on problem solution and
  benefits.

* Adds 'Data streams overview' page to cover conceptual information,
  based on existing content in the 'Data streams' intro.

* Adds diagrams for data streams and search/indexing request examples.

* Moves API jump list and API docs to a new 'Data streams APIs' section.
  Links to these APIs will be available through tutorials.

* Add xrefs to existing docs for concepts like generation, write index,
  and append-only.
2020-06-11 11:32:09 -04:00
James Rodewig
4e738f60f8 [DOCS] Fix typo in data stream docs 2020-06-11 11:30:00 -04:00
James Rodewig
d534862d41
[DOCS] Move search API's docvalue_fields examples (#57760) (#57989)
Changes:

* Condenses and relocates the `docvalue_fields` example to the 'Run a search'
   page.
* Adds docs for the `docvalue_fields` request body parameter.
* Updates several related xrefs.

Co-authored-by: debadair <debadair@elastic.co>
2020-06-11 11:25:04 -04:00
David Turner
f950c121bb Hide AlreadyClosedException on IndexCommit release (#57986)
Today `InternalEngine#releaseIndexCommit` fails with an
`AlreadyClosedException` if the engine is closed before the index commit is
released. This can happen if, for example, a node leaves and rejoins the
cluster and acquires an index commit for replica shard allocation concurrently
with shutting the shard down.

There's no need to fail the operation like this: if the engine is shut down
then we will clean up the unreferenced files when it's restarted (or if it's
allocated elsewhere) so we can suppress an `AlreadyClosedException` in this
case. This commit does so.

Fixes #57797
2020-06-11 15:41:50 +01:00
David Turner
9b52a250f8 Add admonition to cluster state instability note (#57985)
We document that the cluster state API is an internal representation which may
change, but apparently not emphatically enough. This commit adds a `NOTE:`
admonition to this paragraph.
2020-06-11 15:32:18 +01:00
Alan Woodward
16e230dcb8 Update to lucene snapshot e7c625430ed (#57981)
Includes LUCENE-9148 and LUCENE-9398, which splits the BKD metadata, index and data into separate files and keeps the index off-heap.
2020-06-11 14:51:53 +01:00
Yannick Welsch
34fc52dbf3 Fix PersistedClusterStateServiceTests.testSlowLogging (#57971)
The range in the last writeDurationMillis selection could be empty, as it could prior to the call be set to 1.
2020-06-11 15:47:34 +02:00
Igor Motov
947573f309
Added standard deviation / variance sampling to extended stats (#49782) (#57947)
Per 49554 I added standard deviation sampling and variance sampling to the extended stats interface.
 
Closes #49554

Co-authored-by: Igor Motov <igor@motovs.org>

Co-authored-by: andrewjohnson2 <aj114114@gmail.com>
2020-06-11 09:19:44 -04:00
Nik Everett
da72a3a51d
Speed up reducing auto_date_histo with a time zone (backport of #57933) (#57958)
When reducing `auto_date_histogram` we were using `Rounding#round`
which is quite a bit more expensive than
```
Rounding.Prepared prepared = rounding.prepare(min, max);
long result = prepared.round(date);
```
when rounding to a non-fixed time zone like `America/New_York`. This
stops using the former and starts using the latter.

Relates to #56124
2020-06-11 09:15:12 -04:00
David Roberts
54d4f2a623 [ML] Refresh annotations index on job flush and close (#57979)
Now that annotations are part of the anomaly detection job results
the annotations index should be refreshed on flushing and closing
the job so that flush and close continue to fulfil their contracts
that immediately after returning all results the job generated up
to that point are searchable.
2020-06-11 12:29:04 +01:00
David Kyle
b87b147704
Add models for search to ModelLoadingService (#57592) (#57919)
ModelLoadingService only caches models if they are referenced by an 
ingest pipeline. For models used in search we want to always cache the
models and rely on TTL to evict them. Additionally when an ingest 
pipeline is deleted the model it references should not be evicted if 
it is used in search.
2020-06-11 10:48:37 +01:00
David Kyle
2905a2f623
Use Search After job iterators (#57875) (#57923)
Search after is a better choice for the delete expired data iterators
where processing takes a long time as unlike scroll a context does not
have to be kept alive. Also changes the delete expired data endpoint to
404 if the job is unknown
2020-06-11 10:06:18 +01:00
Ryan Ernst
573c6279af
Re-enable windows archives packaging tests (#57955)
This commit re-enables windows testing for archives packaging tests.
These were disabled previously because of constant failure due to
windows file locks, but the failure does not occur outside of CI, so
they are being re-enabled to further investigate the failure.

relates #50825
2020-06-10 15:13:33 -07:00
Costin Leau
ff0ea62cb8 EQL: Fix casing for tiebreaker field (#57943)
Use tiebreaker instead of tieBreaker

(cherry picked from commit 3c774948a5d5e10fac267cb9a54f5d0559a00c1d)
2020-06-11 00:10:19 +03:00
Nik Everett
0a2bd10758
Save memory when parent and child are not on top (#57892) (#57944)
Reworks the `parent` and `child` aggregation are not at the top level
using the optimization from #55873. Instead of wrapping all
non-top-level `parent` and `child` aggregators we now handle being a
child aggregator in the aggregator, specifically by adding recording
which global ordinals show up in the parent and then checking if they
match the child.
2020-06-10 16:25:10 -04:00
James Rodewig
9eb8085ac0
[DOCS] Reformat data stream tutorial docs (#57883) (#57946)
Creates a new page for a 'Set up a data stream' tutorial, based on
existing content in 'Data streams'.

Also adds tutorials for:

* Configuring an ILM policy for a data stream
* Indexing documents to a data stream
* Searching a data stream
* Manually rolling over a data stream
2020-06-10 14:03:46 -04:00
Albert Zaharovits
c57ccd99f7
Just log 401 stacktraces (#55774)
Ensure stacktraces of 401 errors for unauthenticated users are logged
but not returned in the response body.
2020-06-10 20:39:32 +03:00
James Rodewig
44c3bb29e2 [DOCS] EQL: Correct EQL search API's size param def
The `size` parameter can be used to limit matching events or sequences.
2020-06-10 10:12:54 -04:00
Valeriy Khakhutskyy
c0f368bbf3
[7.x][ML] Adjust assertion for job case memory usage estimates (#57929)
Since we change the memory estimates for data frame analytics jobs from worst case to a realistic case, the strict less-than assertion in the test does not hold anymore. I replaced it with a less-or-equal-than assertion.

Backport or #57882
2020-06-10 15:17:16 +02:00
Aleksandr Maus
ec60335496
EQL: implement case sensitivity for indexOf and endsWith string functions (#57707) (#57908)
* EQL: implement case sensitivity for indexOf and endsWith string functions
2020-06-10 08:55:49 -04:00
Armin Braun
85f5c4192b
Improve Test Coverage for Old Repository Metadata Formats (#57915) (#57922)
Use the the hack used in `CorruptedBlobStoreRepositoryIT` in more snapshot
failure tests to verify that BwC repository metadata is handled properly
in these so far not-test-covered scenarios.
Also, some minor related dry-up of snapshot tests.

Relates #57798
2020-06-10 13:27:01 +02:00
Andrei Dan
9f280621ba
[7.x] ILM add data stream support to searchable snapshot action (#57873) (#57916)
(cherry picked from commit 34856a90532c6c62a53817bb395399c8a8c17c0f)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-06-10 10:16:57 +01:00
Yannick Welsch
80f221e920
Use clean thread context for transport and applier service (#57792) (#57914)
Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and
also that thread contexts are not leaked). Moves the ClusterApplierService to use the system
context (same as we do for MasterService), which allows to remove a hack from
TemplateUgradeService and makes it clearer that applying CS updates is fully executing under
system context.
2020-06-10 10:30:28 +02:00
Armin Braun
fe85bdbe6f
Fix Remote Recovery Being Retried for Removed Nodes (#57608) (#57913)
If a node is disconnected we retry. It does not make sense
to retry the recovery if the node is removed from the cluster though.
=> added a CS listener that cancels the recovery for removed nodes

Also, we were running the retry on the `SAME` pool which for each retry will
be the scheduler pool. Since the error path of the listener we use here
will do blocking operations when closing the resources used by the recovery
we can't use the `SAME` pool here since not all exceptions go to the `ActionListenerResponseHandler`
threading like e.g. `NodeNotConnectedException`.

Closes #57585
2020-06-10 09:41:52 +02:00
Armin Braun
d579420452
Stop Serializing Exceptions in SnapshotInfo (#57866) (#57898)
In ff9e8c622427d42a2d87b4ceb298d043ae3c4e6a we changed the format
used when serializing snapshot failures in the cluster state and
`SnapshotInfo`. This turned them from a short string holding all the
nested exception messages into a multi kb stacktrace in many cases.
This is not great if you snapshot a large number of shards that all fail
for example and massively blows up the size of the GET snapshots response
if there are snapshots with failures in there.
This change reverts to the format used for exceptions before the above commit.

Also, this change short circuits logging and serialization of the failure
for an aborted snapshot where we don't care about the specific message at all
and aligns the message to "aborted" in all cases (current if we aborted before any IO,
it would have been "aborted" and an exception when aborting later during IO).
2020-06-10 08:41:03 +02:00
Hendrik Muhs
95bd7b63b0 [Transform] fix page size return in cat transform, add dps (#57871)
fixes the page size reported after moving page size to settings(#56007) and
adds documents per second(throttling) to the output.

fixes #56498
2020-06-10 08:10:25 +02:00
Russ Cam
f51f9b19c7 Mark Component and Index template APIs as experimental (#57910)
This commit marks the Component Template and
Index Template APIs as experimental.

(cherry picked from commit a85f2bede8eb632e3837ac7630f8dfdf46da6b52)
2020-06-10 14:07:09 +10:00
Yang Wang
72a6441a88
Revert "Resolve anonymous roles and deduplicate roles during authentication (#53453) (#55995)" (#57858)
This reverts commit 84a2f1adf217c8da2817a7d36d9f5694ae85b4b0.
2020-06-10 10:42:52 +10:00
Simon
18fc4395c6 [DOCS] Fix incorrect AD realm setting (#57520) 2020-06-09 16:56:19 -07:00
Jake Landis
a370d5eead
[7.x] Ensure Joni warning are logged at debug (#57302) (#57897)
When Joni, the regex engine that powers grok emits a warning it
does so by default to System.err. System.err logs are all bucketed
together in the server log at WARN level. When Joni emits a warning,
it can be extremely verbose, logging a message for each execution
again that pattern. For ingest node that means for every document
that is run that through Grok. Fortunately, Joni provides a call
back hook to push these warnings to a custom location.

This commit implements Joni's callback hook to push the Joni warning
to the Elasticsearch server logger (logger.org.elasticsearch.ingest.common.GrokProcessor)
at debug level. Generally these warning indicate a possible issue with
the regular expression and upon creation of the Grok processor will
do a "test run" of the expression and log the result (if any) at WARN 
level. This WARN level log should only occur on pipeline creation which 
is a much lower frequency then every document. 

Additionally, the documentation is updated with instructions for how
to set the logger to debug level.
2020-06-09 17:06:29 -05:00
Gordon Brown
aab6317260
[7.x] Include hidden indices in snapshots by default (#57325)
Previously, hidden indices were not included in snapshots by default, unless
specified using one of the usual methods for doing so: naming indices directly,
using index patterns starting with a ., or specifying expand_wildcards to
a value that includes hidden (e.g. all or hidden,open).

This commit changes the default expand_wildcards value to include hidden
indices.
2020-06-09 16:01:52 -06:00
Yannick Welsch
9eec819c5b Revert "Use clean thread context for transport and applier service (#57792)"
This reverts commit 259be236cfa85aa670cba55075c97a47311fe91a.
2020-06-09 22:24:54 +02:00
Yannick Welsch
8199956937 Revert "Assert on request headers only (#57792)"
This reverts commit b5d3565214dcca3aacbc04c69b0f54098a198d9f.
2020-06-09 22:24:35 +02:00
Henning Andersen
1e8e115ae1 Rollover avoid heavy lifting in dry-run/validation (#57894)
Fixed two newly introduced issues with rollover:
1. Using auto-expand replicas, rollover could result in unexpected log
messages on future indexes.
2. It did a reroute and other heavy work on the network thread.

Closes #57706
Supersedes #57865
Relates #53965
2020-06-09 22:07:30 +02:00
Costin Leau
439205d1ea EQL: Introduce tie breaker support (#57787)
Allow a field inside the data to be used as a tie breaker for events
that have the same timestamp.
The field is optional by default.
If used, the tie-breaker always requires a non-null value since it is
used inside `search_after` which requires a non-null value.

Fix #56824

(cherry picked from commit e5719ecb474b32730d93afdbb6834a32b0b2df8b)
2020-06-09 22:50:19 +03:00