4955 Commits

Author SHA1 Message Date
Dan Hermann
911d46370e
Prohibit clone, shrink, and split on a data stream's write index 2020-06-16 10:53:20 -05:00
Lee Hinman
03ce0f8a4d
[7.x] Normalized prefix for rollover API (#57271) (69e1c066) (#58171)
* Normalized prefix for rollover API (#57271)

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Lee Hinman <lee@writequit.org>

It fixes the issue #53388
by normalizing prefix at index creation request itself

* Fix compilation for backport

Co-authored-by: Gaurav Chandani <chngau@amazon.com>
2020-06-16 09:22:10 -06:00
Francisco Fernández Castaño
a5bc5ae030
Don't log on RetentionLeaseSync error handler (#58157)
After an index has been deleted it may take some time to cancel all the
maintenance tasks such as RetentionLeaseSync, it's possible that the
task is already executing before the cancellation. This commit just
avoids logging a warning message for those scenarios.

Closes #57864

Backport of (#58098)
2020-06-16 14:04:32 +02:00
Yannick Welsch
e046b0a8fa Fix realtime get of numeric fields (#58121)
Using realtime get on numeric fields when reading from the translog would yield a ClassCastException.

Closes #57462
2020-06-16 09:16:26 +02:00
Tal Levy
69d5e044af
Add optional description parameter to ingest processors. (#57906) (#58152)
This commit adds an optional field, `description`, to all ingest processors
so that users can explain the purpose of the specific processor instance.

Closes #56000.
2020-06-15 19:27:57 -07:00
Stuart Tettemer
71a42dbde9
[7.x] Rely on the computeIfAbsent logic to prevent duplicated compilation of scripts (#55467) (#58123)
Instead of serializing compilation using a plain lock / mutex combined with a double check, rely on the computeIfAbsent logic to prevent duplicated compilation of scripts. Made checkCompilationLimit to be thread-safe and lock free.

Backport: 865acad

Co-authored-by: Michael Bischoff <michael.bischoff@elastic.co>
2020-06-15 12:01:22 -06:00
markharwood
03dd73dc0d
Fix for wildcard fields that returned ByteRefs not Strings to scripts. (#58060) (#58109)
This need some reorg of BinaryDV field data classes to allow specialisation of scripted doc values.
Moved common logic to a new abstract base class and added a new subclass to return string-based representations to scripts.

Closes #58044
2020-06-15 14:52:56 +01:00
Dan Hermann
8a910443c4
Add ignore_empty_value parameter in set ingest processor (#57030) (#58108) 2020-06-15 08:35:08 -05:00
Armin Braun
1a48983a56
Fix Running TranslogOps on CS Thread (#58056) (#58076)
We should fork off from the CS thread to run this even if it's a rare
condition.
2020-06-13 17:00:49 +02:00
Nik Everett
a5571eb1a8
Save memory when rare_terms is not on top (backport of #57948) (#58069)
This uses the optimization that we started making in #55873 for
`rare_terms` to save a bit of memory when that aggregation is not on the
top level.
2020-06-12 17:47:10 -04:00
Dan Hermann
17f3318732
[7.x] Resolve index API (#58037) 2020-06-12 15:41:32 -05:00
Mayya Sharipova
8bd0147ba7
Correct how meta-field is defined for pre 7.8 hits (#57951)
We keep a static list of meta-fields: META_FIELDS_BEFORE_7_8
as it was before.
This is done to ensure the backwards compatability with pre 7.8 nodes.

Closes #57831
2020-06-12 09:39:53 -04:00
Armin Braun
5662281562
Fix ExtraFS Breaking SharedClusterSnapshotRestoreIT (#58026) (#58040)
If `ExtraFS` decides to put `extra0/0` into the indices folder
then the previous logic in this test would have interpreted the `0`
as shard `0` of index `extra0` and fail to list its contents (since it's a file
and not an actual shard directory).

=> simplified the logic to use actually referenced `IndexId` for iterating over indices
instead.
2020-06-12 15:27:48 +02:00
Martijn van Groningen
01d8bb8cfa
Enforce valid field mapping exists for timestamp_field in templates. (#58036)
Backport of #57741 to 7.x branch.

Relates to #53100
2020-06-12 15:24:42 +02:00
Armin Braun
a5a251d8c0
Handle Rejections when Scheduling RetryableAction (#58033) (#58039)
Scheduling on the threadpool will throw if the scheduler is already
shut down. Handled by treating the rejection like any other non-retryable
exception.

Closes #58021
2020-06-12 15:23:02 +02:00
Nik Everett
d6c8d9415d
Give significance lookups their own home (backport of #57903) (#57959)
This moves the code to look up significance heuristics information like
background frequency and superset size out of
`SignificantTermsAggregatorFactory` and into its own home so that it is
easier to pass around. This will:
1. Make us feel better about ourselves for not passing around the
   factory, which is really *supposed* to be a throw away thing.
2. Abstract the significance lookup logic so we can reuse it for the
   `significant_text` aggregation.
3. Make if very simple to cache the background frequencies which should
   speed up when the agg is a sub-agg. We had done this for numerics
   but not string-shaped significant terms.
2020-06-12 09:21:19 -04:00
Martijn van Groningen
f4199f2ee0
Prohibit append-only writes targeting backing indices directly. (#58025)
Backport of #57788 to 7.x branch.

Append-only writes can only target the corresponding data stream.

Relates to #53100
2020-06-12 13:17:55 +02:00
Armin Braun
db03e7c93b
Exclude WindowsFS from SharedClusterSnapshotRestoreIT (#58020) (#58023)
Same as #52488 but for a different test suite

Closes #58019
2020-06-12 10:49:03 +02:00
Mark Tozzi
36f551bdb4
Make ValuesSourceConfig behave like a config object (#57762) (#58012) 2020-06-11 17:23:55 -04:00
Igor Motov
5138c0c045
Fix missing null values for std_deviation_bounds in ext. stats aggs (#58000)
Adds missing null values for std_deviation_bounds in extended stats aggs and
improves null handling in parsed extended stats.
2020-06-11 16:23:20 -04:00
Lee Hinman
ffc3c77f75
[7.x] Disallow deletion of composable template if in use by data stream (#57957) (#57994)
Backports the following commits to 7.x:

    Disallow deletion of composable template if in use by data stream (#57957)
2020-06-11 13:51:56 -06:00
Jim Ferenczi
4c6bfe32a7 Fix possible NPE on search phase failure (#57952)
When a search phase fails, we release the context of all successful shards.
Successful shards that rewrite the request to match none will not create any context
since #. This change ensures that we don't try to release a `null` context on these
successful shards.

Closes #57945
2020-06-11 18:54:16 +02:00
Yannick Welsch
85b0b540f0 Fix refresh behavior in MockDiskUsagesIT (#57926)
Ensures that InternalClusterInfoService's internally cached stats are refreshed whenever the
shard size or disk usage function (to mock out disk usage) are overridden.

Closes #57888
2020-06-11 17:38:12 +02:00
David Turner
f950c121bb Hide AlreadyClosedException on IndexCommit release (#57986)
Today `InternalEngine#releaseIndexCommit` fails with an
`AlreadyClosedException` if the engine is closed before the index commit is
released. This can happen if, for example, a node leaves and rejoins the
cluster and acquires an index commit for replica shard allocation concurrently
with shutting the shard down.

There's no need to fail the operation like this: if the engine is shut down
then we will clean up the unreferenced files when it's restarted (or if it's
allocated elsewhere) so we can suppress an `AlreadyClosedException` in this
case. This commit does so.

Fixes #57797
2020-06-11 15:41:50 +01:00
Alan Woodward
16e230dcb8 Update to lucene snapshot e7c625430ed (#57981)
Includes LUCENE-9148 and LUCENE-9398, which splits the BKD metadata, index and data into separate files and keeps the index off-heap.
2020-06-11 14:51:53 +01:00
Yannick Welsch
34fc52dbf3 Fix PersistedClusterStateServiceTests.testSlowLogging (#57971)
The range in the last writeDurationMillis selection could be empty, as it could prior to the call be set to 1.
2020-06-11 15:47:34 +02:00
Igor Motov
947573f309
Added standard deviation / variance sampling to extended stats (#49782) (#57947)
Per 49554 I added standard deviation sampling and variance sampling to the extended stats interface.
 
Closes #49554

Co-authored-by: Igor Motov <igor@motovs.org>

Co-authored-by: andrewjohnson2 <aj114114@gmail.com>
2020-06-11 09:19:44 -04:00
Nik Everett
da72a3a51d
Speed up reducing auto_date_histo with a time zone (backport of #57933) (#57958)
When reducing `auto_date_histogram` we were using `Rounding#round`
which is quite a bit more expensive than
```
Rounding.Prepared prepared = rounding.prepare(min, max);
long result = prepared.round(date);
```
when rounding to a non-fixed time zone like `America/New_York`. This
stops using the former and starts using the latter.

Relates to #56124
2020-06-11 09:15:12 -04:00
Albert Zaharovits
c57ccd99f7
Just log 401 stacktraces (#55774)
Ensure stacktraces of 401 errors for unauthenticated users are logged
but not returned in the response body.
2020-06-10 20:39:32 +03:00
Armin Braun
85f5c4192b
Improve Test Coverage for Old Repository Metadata Formats (#57915) (#57922)
Use the the hack used in `CorruptedBlobStoreRepositoryIT` in more snapshot
failure tests to verify that BwC repository metadata is handled properly
in these so far not-test-covered scenarios.
Also, some minor related dry-up of snapshot tests.

Relates #57798
2020-06-10 13:27:01 +02:00
Yannick Welsch
80f221e920
Use clean thread context for transport and applier service (#57792) (#57914)
Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and
also that thread contexts are not leaked). Moves the ClusterApplierService to use the system
context (same as we do for MasterService), which allows to remove a hack from
TemplateUgradeService and makes it clearer that applying CS updates is fully executing under
system context.
2020-06-10 10:30:28 +02:00
Armin Braun
fe85bdbe6f
Fix Remote Recovery Being Retried for Removed Nodes (#57608) (#57913)
If a node is disconnected we retry. It does not make sense
to retry the recovery if the node is removed from the cluster though.
=> added a CS listener that cancels the recovery for removed nodes

Also, we were running the retry on the `SAME` pool which for each retry will
be the scheduler pool. Since the error path of the listener we use here
will do blocking operations when closing the resources used by the recovery
we can't use the `SAME` pool here since not all exceptions go to the `ActionListenerResponseHandler`
threading like e.g. `NodeNotConnectedException`.

Closes #57585
2020-06-10 09:41:52 +02:00
Armin Braun
d579420452
Stop Serializing Exceptions in SnapshotInfo (#57866) (#57898)
In ff9e8c622427d42a2d87b4ceb298d043ae3c4e6a we changed the format
used when serializing snapshot failures in the cluster state and
`SnapshotInfo`. This turned them from a short string holding all the
nested exception messages into a multi kb stacktrace in many cases.
This is not great if you snapshot a large number of shards that all fail
for example and massively blows up the size of the GET snapshots response
if there are snapshots with failures in there.
This change reverts to the format used for exceptions before the above commit.

Also, this change short circuits logging and serialization of the failure
for an aborted snapshot where we don't care about the specific message at all
and aligns the message to "aborted" in all cases (current if we aborted before any IO,
it would have been "aborted" and an exception when aborting later during IO).
2020-06-10 08:41:03 +02:00
Gordon Brown
aab6317260
[7.x] Include hidden indices in snapshots by default (#57325)
Previously, hidden indices were not included in snapshots by default, unless
specified using one of the usual methods for doing so: naming indices directly,
using index patterns starting with a ., or specifying expand_wildcards to
a value that includes hidden (e.g. all or hidden,open).

This commit changes the default expand_wildcards value to include hidden
indices.
2020-06-09 16:01:52 -06:00
Yannick Welsch
9eec819c5b Revert "Use clean thread context for transport and applier service (#57792)"
This reverts commit 259be236cfa85aa670cba55075c97a47311fe91a.
2020-06-09 22:24:54 +02:00
Yannick Welsch
8199956937 Revert "Assert on request headers only (#57792)"
This reverts commit b5d3565214dcca3aacbc04c69b0f54098a198d9f.
2020-06-09 22:24:35 +02:00
Henning Andersen
1e8e115ae1 Rollover avoid heavy lifting in dry-run/validation (#57894)
Fixed two newly introduced issues with rollover:
1. Using auto-expand replicas, rollover could result in unexpected log
messages on future indexes.
2. It did a reroute and other heavy work on the network thread.

Closes #57706
Supersedes #57865
Relates #53965
2020-06-09 22:07:30 +02:00
Jake Landis
fff0a106c9
[7.x] Support if_seq_no and if_primary_term for ingest (#55430) (#57768)
Allow for optimistic concurrency control during ingest by checking the
sequence number and primary term. This is accomplished by defining
_if_seq_no and _if_primary_term in the pipeline, similarly to _version
and _version_type.

Closes #41255
Co-authored-by: Maria Ralli <mariai.ralli@gmail.com>
2020-06-09 14:20:26 -05:00
Andrei Dan
3945712c72
[7.x] ILM add data stream support to the Shrink action (#57616) (#57884)
The shrink action creates a shrunken index with the target number of shards.
This makes the shrink action data stream aware. If the ILM managed index is
part of a data stream the shrink action will make sure to swap the original
managed index with the shrunken one as part of the data stream's backing
indices and then delete the original index.

(cherry picked from commit 99aeed6acf4ae7cbdd97a3bcfe54c5d37ab7a574)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-06-09 19:45:22 +01:00
Jim Ferenczi
ea696198e9 Fix rounding composite aggs on sorted index (#57867)
This commit fixes a bug on the composite aggregation when the index
is sorted and the primary composite source needs to round values (date_histo).
In such case, we cannot take into account the subsequent sources even if they
match the index sort because the rounding of the primary sort value may break
the original index order.

Fixes #57849
2020-06-09 20:41:45 +02:00
Nik Everett
44a79d1739
Deprecte Rounding#round (#57845) (#57893)
This deprecates `Rounding#round` and `Rounding#nextRoundingValue` in
favor of calling
```
Rounding.Prepared prepared = rounding.prepare(min, max);
...
prepared.round(val)

```

because it is always going to be faster to prepare once. There
are going to be some cases where we won't know what to prepare *for*
and in those cases you can call `prepareForUnknown` and stil be faster
than calling the deprecated method over and over and over again.

Ultimately, this is important because it doesn't look like there is an
easy way to cache `Rounding.Prepared` or any of its precursors like
`LocalTimeOffset.Lookup`. Instead, we can just build it at most once per
request.

Relates to #56124
2020-06-09 14:30:56 -04:00
Tim Brooks
2630c80b5d
Fix IndexRecoveryIT transient error test (#57826)
Currently it is possible for a transient network error to disrupt the
start recovery request from the remote to source node. This disruption
is racy with the recovery occurring on the source node. It is possible
for the source node to finish and clear its recovery. When this occurs,
the recovery cannot be reestablished and the "no two start" assertion
is tripped. This commit fixes this issue by allowing two starts if the
finalize request has been received.

Fixes #57416.
2020-06-09 10:49:38 -06:00
Tim Brooks
8119b96517
Fix stalled send translog ops request (#57859)
Currently, the translog ops request is reentrent when there is a mapping
update. The impact of this is that a translog ops ends up waiting on the
pre-existing listener and it is never completed. This commit fixes this
by introducing a new code path to avoid the idempotency logic.
2020-06-09 10:46:34 -06:00
Tim Brooks
c17121428e
Fix translog ops action name in channel listener (#57854)
The action name is passed to the `ChannelListener` and is used for
logging purposes. Currently, we are using the incorrect action name for
the translog ops listener. This commit fixes the issue.
2020-06-09 10:38:58 -06:00
Lee Hinman
cb2ce3736a
[7.x] Make noop template updates be cluster state noops (#57851) (#57880)
Backports the following commits to 7.x:

    Make noop template updates be cluster state noops (#57851)
2020-06-09 09:26:06 -06:00
Dan Hermann
b501b282f8
Change default backing index naming scheme 2020-06-09 09:31:34 -05:00
Nik Everett
e7cc2448d2
Save memory when string terms are not on top (#57758) (#57876)
This reworks string flavored implementations of the `terms` aggregation
to save memory when it is under another bucket by dropping the usage of
`asMultiBucketAggregator`.
2020-06-09 10:26:29 -04:00
Yannick Welsch
b5d3565214 Assert on request headers only (#57792)
Only assert that actual request headers are empty, as default headers might still be there when stashing the context.
2020-06-09 14:08:25 +02:00
Yannick Welsch
259be236cf Use clean thread context for transport and applier service (#57792)
Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and
also that thread contexts are not leaked). Moves the ClusterApplierService to use the system
context (same as we do for MasterService), which allows to remove a hack from
TemplateUgradeService and makes it clearer that applying CS updates is fully executing under
system context.
2020-06-09 12:32:28 +02:00
Tim Brooks
9eaee3da8d
Fix exception check in RecoveryRequestTrackerTests (#57493)
Currently we check that exceptions are the same in the recovery request
tracker test. This is inconsistent because the future wraps the
exception in a new instance. This commit fixes the test by comparing a
random exception message.

Fixes #57199
2020-06-08 15:42:48 -06:00