Commit Graph

4878 Commits

Author SHA1 Message Date
Julie Tibshirani b1161cba35 Rename SearchContext#smartNameFieldType. (#58203)
The concept of a 'smart name' doesn't make sense now that there are no mapping
types.
2020-06-17 10:38:32 -07:00
Tim Brooks 2074412d79
Retry failed replication due to transient errors (#56230)
Currently a failed replication action will fail an entire replica. This
includes when replication fails due to potentially short lived transient
issues such as network disruptions or circuit breaking errors.

This commit implements retries using the retryable action.
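
A minimal sketch of the retry idea, assuming a blocking call and exponential backoff. This is not the actual retryable action used by the commit, which is asynchronous; `RetrySketch`, `callWithRetry`, and `isTransient` are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

class RetrySketch {
    // Hypothetical classification: only wrapped I/O failures count as transient here.
    static boolean isTransient(RuntimeException e) {
        return e instanceof UncheckedIOException;
    }

    // Runs the action, retrying transient failures with a doubling delay
    // until it succeeds or maxAttempts is reached.
    static <T> T callWithRetry(Supplier<T> action, int maxAttempts, long initialDelayMillis) {
        long delay = initialDelayMillis;
        for (int attempt = 1; ; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                // Non-transient failures and the final attempt propagate to the caller.
                if (!isTransient(e) || attempt >= maxAttempts) {
                    throw e;
                }
            }
            try {
                Thread.sleep(delay);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting to retry", ie);
            }
            delay *= 2;
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = callWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new UncheckedIOException(new IOException("simulated network disruption"));
            }
            return "replicated";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The point of the sketch is only that a short-lived failure no longer fails the whole operation on the first error; anything not classified as transient still fails immediately.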
2020-06-17 10:17:30 -06:00
Luca Cavanna 5ddea03de7 Remove needless termsQuery implementation from StringFieldType (#57609)
The base class `TermBasedFieldType` already implements exactly the same `termsQuery` method, hence there is no need to override it.
2020-06-17 18:04:49 +02:00
GeChenxin a96f526de1 Add index name to refresh mapping task (#57598) 2020-06-17 10:49:36 -04:00
Armin Braun 41af7f5455
Fix Typo in Snapshot Abort Test (#58238) (#58247)
Forgot the brackets here in #58214 so in the rare case where the
first update seen by the listener doesn't match it will still remove
itself and never be invoked again -> timeout.
2020-06-17 14:53:39 +02:00
Nik Everett ab2c6d9696
Save memory when auto_date_histogram is not on top (backport of #57304) (#58190)
This builds an `auto_date_histogram` aggregator that natively aggregates
from many buckets and uses it when the `auto_date_histogram` used to use
`asMultiBucketAggregator` which should save a significant amount of
memory in those cases. In particular, this happens when
`auto_date_histogram` is a sub-aggregator of a multi-bucketing aggregator
like `terms` or `histogram` or `filters`. For the most part we preserve
the original implementation when `auto_date_histogram` only collects from
a single bucket.

It isn't possible to "just port the aggregator" without taking a pretty
significant performance hit because we used to rewrite all of the
buckets every time we switched to a coarser and coarser rounding
configuration. Without some major surgery to how to delay sub-aggs
we'd end up rewriting the delay list zillions of times if there are many
buckets.

The multi-bucket version of the aggregator has a "budget" of "wasted"
buckets and only rewrites all of the buckets when we exceed that budget.
Now that we don't rebucket every time we increase the rounding we can no
longer get an accurate count of the number of buckets! So instead the
aggregator uses an estimate of the number of buckets to trigger switching
to a coarser rounding. This estimate is likely to be *terrible* when
buckets are far apart compared to the rounding. So it also uses the
difference between the first and last bucket to trigger switching to a
coarser rounding. Which covers for the shortcomings of the bucket
estimation technique pretty well. It also causes the aggregator to emit
fewer buckets in cases where they'd be reduced together on the
coordinating node. This is wonderful! But probably fairly rare.

All of that does buy us some speed improvements when the aggregator is
a child of a multi-bucket aggregator:
Without metrics or time zone: 25% faster
With metrics: 15% faster
With time zone: 22% faster

Relates to #56487
2020-06-17 08:48:41 -04:00
Jason Tedor b78b3edeea
Upgrade to JNA 5.5.0 (#58183)
This commit bumps our JNA dependency from 4.5.1 to 5.5.0, so that we are
now on the latest maintained line, and pick up a large collection of bug
fixes that have accumulated.
2020-06-17 07:35:08 -04:00
Ignacio Vera b6585f2b51
Add new extensions for Lucene86 points codec to FsDirectoryFactory (#58226) (#58233) 2020-06-17 12:55:33 +02:00
Armin Braun 85be78b624
Fix Snapshot Abort Not Waiting for Data Nodes (#58214) (#58228)
This was a really subtle bug that we introduced a long time ago.
If a shard snapshot is in aborted state but hasn't started snapshotting on a node
we can only send the failed notification for it if the shard was actually supposed
to execute on the local node.
Without this fix, if shard snapshots were spread out across at least two data nodes
(so that each data node does not have all the primaries) the abort would actually
never wait on the data nodes. This isn't a big deal with uuid shard generations
but could lead to potential corruption on S3 when using numeric shard generations
(albeit very unlikely now that we have the 3 minute wait there).
Another negative side-effect of this bug was that master would receive a lot more
shard status update messages for aborted shards since each data node not assigned
a primary would send one message for that primary.
2020-06-17 11:39:50 +02:00
Armin Braun c2b416ee31
Fix DanglingIndicesIT Failures from MasterNotDiscoveredException (#58215) (#58221)
The dangling indices action is not a proper master node action so it does not
retry when executed while the cluster hasn't fully formed yet.
Since we use node restarts when setting up the dangling indices state we need
to manually ensure a fully formed cluster before moving on with the tests to avoid
failures.
2020-06-17 10:34:08 +02:00
Stuart Tettemer 01795d1925
Revert "Scripting: Deprecate general cache settings (#55753)" (#58201)
This reverts commit 88e8b34fc2.
2020-06-16 14:58:18 -06:00
Rory Hunter 03369e0980
Implement dangling indices API (#58176)
Backport of #50920. Part of #48366. Implement an API for listing,
importing and deleting dangling indices.

Co-authored-by: David Turner <david.turner@elastic.co>
2020-06-16 21:50:38 +01:00
Stuart Tettemer 88e8b34fc2
Scripting: Deprecate general cache settings (#55753)
Backport: ef543b0
2020-06-16 13:06:59 -06:00
Alan Woodward c6acc7c976 Correctly deal with aliases when retrieving lucene FieldType 2020-06-16 18:06:37 +01:00
Alan Woodward 12a3f6dfca
MappedFieldType should not extend FieldType (#58160)
MappedFieldType is a combination of two concerns:

* an extension of lucene's FieldType, defining how a field should be indexed
* a set of query factory methods, defining how a field should be searched

We want to break these two concerns apart. This commit is a first step to doing this, breaking
the inheritance relationship between MappedFieldType and FieldType. MappedFieldType
instead has a series of boolean flags defining whether or not the field is searchable or
aggregatable, and FieldMapper has a separate FieldType passed to its constructor defining
how indexing should be done.

Relates to #56814
2020-06-16 16:56:43 +01:00
Dan Hermann 911d46370e
Prohibit clone, shrink, and split on a data stream's write index 2020-06-16 10:53:20 -05:00
Lee Hinman 03ce0f8a4d
[7.x] Normalized prefix for rollover API (#57271) (69e1c066) (#58171)
* Normalized prefix for rollover API (#57271)

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
Co-authored-by: Lee Hinman <lee@writequit.org>

It fixes the issue #53388
by normalizing prefix at index creation request itself

* Fix compilation for backport

Co-authored-by: Gaurav Chandani <chngau@amazon.com>
2020-06-16 09:22:10 -06:00
Francisco Fernández Castaño a5bc5ae030
Don't log on RetentionLeaseSync error handler (#58157)
After an index has been deleted it may take some time to cancel all the
maintenance tasks such as RetentionLeaseSync, so it's possible that the
task is already executing before the cancellation. This commit just
avoids logging a warning message for those scenarios.

Closes #57864

Backport of (#58098)
2020-06-16 14:04:32 +02:00
Yannick Welsch e046b0a8fa Fix realtime get of numeric fields (#58121)
Using realtime get on numeric fields when reading from the translog would yield a ClassCastException.

Closes #57462
2020-06-16 09:16:26 +02:00
Tal Levy 69d5e044af
Add optional description parameter to ingest processors. (#57906) (#58152)
This commit adds an optional field, `description`, to all ingest processors
so that users can explain the purpose of the specific processor instance.

Closes #56000.
2020-06-15 19:27:57 -07:00
Stuart Tettemer 71a42dbde9
[7.x] Rely on the computeIfAbsent logic to prevent duplicated compilation of scripts (#55467) (#58123)
Instead of serializing compilation using a plain lock / mutex combined with a double check, rely on the computeIfAbsent logic to prevent duplicated compilation of scripts. Made checkCompilationLimit thread-safe and lock-free.
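
A minimal sketch of the `computeIfAbsent` pattern, assuming a cache keyed by script source. `CompileCacheSketch`, `getOrCompile`, and the `compiler` function are hypothetical stand-ins, not the script service's real types. `ConcurrentHashMap#computeIfAbsent` applies the mapping function at most once per key, so concurrent callers asking for the same script share one compilation.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class CompileCacheSketch {
    private final ConcurrentHashMap<String, String> cache = new ConcurrentHashMap<>();
    // Hypothetical "compiler": turns source text into a compiled form.
    private final Function<String, String> compiler;
    // Counts how often the compiler actually ran, for illustration only.
    final AtomicInteger compilations = new AtomicInteger();

    CompileCacheSketch(Function<String, String> compiler) {
        this.compiler = compiler;
    }

    String getOrCompile(String source) {
        // No explicit lock or double check: the map runs the function
        // at most once for a key that is not yet present.
        return cache.computeIfAbsent(source, s -> {
            compilations.incrementAndGet();
            return compiler.apply(s);
        });
    }

    public static void main(String[] args) {
        CompileCacheSketch cache = new CompileCacheSketch(s -> "compiled(" + s + ")");
        cache.getOrCompile("doc['a'].value");
        cache.getOrCompile("doc['a'].value");
        System.out.println("compilations: " + cache.compilations.get());
    }
}
```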

Backport: 865acad

Co-authored-by: Michael Bischoff <michael.bischoff@elastic.co>
2020-06-15 12:01:22 -06:00
markharwood 03dd73dc0d
Fix for wildcard fields that returned ByteRefs not Strings to scripts. (#58060) (#58109)
This needs some reorg of BinaryDV field data classes to allow specialisation of scripted doc values.
Moved common logic to a new abstract base class and added a new subclass to return string-based representations to scripts.

Closes #58044
2020-06-15 14:52:56 +01:00
Dan Hermann 8a910443c4
Add ignore_empty_value parameter in set ingest processor (#57030) (#58108) 2020-06-15 08:35:08 -05:00
Rene Groeschke 01e9126588
Remove deprecated usage of testCompile configuration (#57921) (#58083)
* Remove usage of deprecated testCompile configuration
* Replace testCompile usage by testImplementation
* Make testImplementation non transitive by default (as we did for testCompile)
* Update CONTRIBUTING about using testImplementation for test dependencies
* Fail on testCompile configuration usage
2020-06-14 22:30:44 +02:00
Armin Braun 1a48983a56
Fix Running TranslogOps on CS Thread (#58056) (#58076)
We should fork off from the CS thread to run this even if it's a rare
condition.
2020-06-13 17:00:49 +02:00
Nik Everett a5571eb1a8
Save memory when rare_terms is not on top (backport of #57948) (#58069)
This uses the optimization that we started making in #55873 for
`rare_terms` to save a bit of memory when that aggregation is not on the
top level.
2020-06-12 17:47:10 -04:00
Dan Hermann 17f3318732
[7.x] Resolve index API (#58037) 2020-06-12 15:41:32 -05:00
Mayya Sharipova 8bd0147ba7
Correct how meta-field is defined for pre 7.8 hits (#57951)
We keep a static list of meta-fields: META_FIELDS_BEFORE_7_8
as it was before.
This is done to ensure backwards compatibility with pre 7.8 nodes.

Closes #57831
2020-06-12 09:39:53 -04:00
Armin Braun 5662281562
Fix ExtraFS Breaking SharedClusterSnapshotRestoreIT (#58026) (#58040)
If `ExtraFS` decides to put `extra0/0` into the indices folder
then the previous logic in this test would have interpreted the `0`
as shard `0` of index `extra0` and fail to list its contents (since it's a file
and not an actual shard directory).

=> simplified the logic to use actually referenced `IndexId` for iterating over indices
instead.
2020-06-12 15:27:48 +02:00
Martijn van Groningen 01d8bb8cfa
Enforce valid field mapping exists for timestamp_field in templates. (#58036)
Backport of #57741 to 7.x branch.

Relates to #53100
2020-06-12 15:24:42 +02:00
Armin Braun a5a251d8c0
Handle Rejections when Scheduling RetryableAction (#58033) (#58039)
Scheduling on the threadpool will throw if the scheduler is already
shut down. Handled by treating the rejection like any other non-retryable
exception.
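
A minimal sketch of the handling described above, using the JDK's `ScheduledExecutorService` as a stand-in for the threadpool. `ScheduleRejectionSketch`, `scheduleRetry`, and the `onFailure` callback are hypothetical names. With the default rejection policy, scheduling on a scheduler that is already shut down throws `RejectedExecutionException`, which is routed to the same failure path as any other non-retryable exception.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

class ScheduleRejectionSketch {
    // Tries to schedule the next retry; a rejection ends the retry loop
    // by reporting the failure instead of escaping to the caller.
    static void scheduleRetry(ScheduledExecutorService scheduler, Runnable retry,
                              long delayMillis, Consumer<Exception> onFailure) {
        try {
            scheduler.schedule(retry, delayMillis, TimeUnit.MILLISECONDS);
        } catch (RejectedExecutionException e) {
            onFailure.accept(e);
        }
    }

    public static void main(String[] args) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.shutdown(); // simulate a node that is shutting down
        List<Exception> failures = new ArrayList<>();
        scheduleRetry(scheduler, () -> System.out.println("retrying"), 10, failures::add);
        System.out.println("failures reported: " + failures.size());
    }
}
```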

Closes #58021
2020-06-12 15:23:02 +02:00
Nik Everett d6c8d9415d
Give significance lookups their own home (backport of #57903) (#57959)
This moves the code to look up significance heuristics information like
background frequency and superset size out of
`SignificantTermsAggregatorFactory` and into its own home so that it is
easier to pass around. This will:
1. Make us feel better about ourselves for not passing around the
   factory, which is really *supposed* to be a throw away thing.
2. Abstract the significance lookup logic so we can reuse it for the
   `significant_text` aggregation.
3. Make it very simple to cache the background frequencies which should
   speed up when the agg is a sub-agg. We had done this for numerics
   but not string-shaped significant terms.
2020-06-12 09:21:19 -04:00
Martijn van Groningen f4199f2ee0
Prohibit append-only writes targeting backing indices directly. (#58025)
Backport of #57788 to 7.x branch.

Append-only writes can only target the corresponding data stream.

Relates to #53100
2020-06-12 13:17:55 +02:00
Armin Braun db03e7c93b
Exclude WindowsFS from SharedClusterSnapshotRestoreIT (#58020) (#58023)
Same as #52488 but for a different test suite

Closes #58019
2020-06-12 10:49:03 +02:00
Mark Tozzi 36f551bdb4
Make ValuesSourceConfig behave like a config object (#57762) (#58012) 2020-06-11 17:23:55 -04:00
Igor Motov 5138c0c045
Fix missing null values for std_deviation_bounds in ext. stats aggs (#58000)
Adds missing null values for std_deviation_bounds in extended stats aggs and
improves null handling in parsed extended stats.
2020-06-11 16:23:20 -04:00
Lee Hinman ffc3c77f75
[7.x] Disallow deletion of composable template if in use by data stream (#57957) (#57994)
Backports the following commits to 7.x:

    Disallow deletion of composable template if in use by data stream (#57957)
2020-06-11 13:51:56 -06:00
Jim Ferenczi 4c6bfe32a7 Fix possible NPE on search phase failure (#57952)
When a search phase fails, we release the context of all successful shards.
Successful shards that rewrite the request to match none will not create any context
since #. This change ensures that we don't try to release a `null` context on these
successful shards.

Closes #57945
2020-06-11 18:54:16 +02:00
Yannick Welsch 85b0b540f0 Fix refresh behavior in MockDiskUsagesIT (#57926)
Ensures that InternalClusterInfoService's internally cached stats are refreshed whenever the
shard size or disk usage function (to mock out disk usage) are overridden.

Closes #57888
2020-06-11 17:38:12 +02:00
David Turner f950c121bb Hide AlreadyClosedException on IndexCommit release (#57986)
Today `InternalEngine#releaseIndexCommit` fails with an
`AlreadyClosedException` if the engine is closed before the index commit is
released. This can happen if, for example, a node leaves and rejoins the
cluster and acquires an index commit for replica shard allocation concurrently
with shutting the shard down.

There's no need to fail the operation like this: if the engine is shut down
then we will clean up the unreferenced files when it's restarted (or if it's
allocated elsewhere) so we can suppress an `AlreadyClosedException` in this
case. This commit does so.

Fixes #57797
2020-06-11 15:41:50 +01:00
Alan Woodward 16e230dcb8 Update to lucene snapshot e7c625430ed (#57981)
Includes LUCENE-9148 and LUCENE-9398, which splits the BKD metadata, index and data into separate files and keeps the index off-heap.
2020-06-11 14:51:53 +01:00
Yannick Welsch 34fc52dbf3 Fix PersistedClusterStateServiceTests.testSlowLogging (#57971)
The range in the last writeDurationMillis selection could be empty, as it could prior to the call be set to 1.
2020-06-11 15:47:34 +02:00
Igor Motov 947573f309
Added standard deviation / variance sampling to extended stats (#49782) (#57947)
Per #49554, I added standard deviation sampling and variance sampling to the extended stats interface.
 
Closes #49554

Co-authored-by: Igor Motov <igor@motovs.org>

Co-authored-by: andrewjohnson2 <aj114114@gmail.com>
2020-06-11 09:19:44 -04:00
Nik Everett da72a3a51d
Speed up reducing auto_date_histo with a time zone (backport of #57933) (#57958)
When reducing `auto_date_histogram` we were using `Rounding#round`
which is quite a bit more expensive than
```
Rounding.Prepared prepared = rounding.prepare(min, max);
long result = prepared.round(date);
```
when rounding to a non-fixed time zone like `America/New_York`. This
stops using the former and starts using the latter.

Relates to #56124
2020-06-11 09:15:12 -04:00
Albert Zaharovits c57ccd99f7
Just log 401 stacktraces (#55774)
Ensure stacktraces of 401 errors for unauthenticated users are logged
but not returned in the response body.
2020-06-10 20:39:32 +03:00
Armin Braun 85f5c4192b
Improve Test Coverage for Old Repository Metadata Formats (#57915) (#57922)
Use the hack used in `CorruptedBlobStoreRepositoryIT` in more snapshot
failure tests to verify that BwC repository metadata is handled properly
in these so far not-test-covered scenarios.
Also, some minor related dry-up of snapshot tests.

Relates #57798
2020-06-10 13:27:01 +02:00
Yannick Welsch 80f221e920
Use clean thread context for transport and applier service (#57792) (#57914)
Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and
also that thread contexts are not leaked). Moves the ClusterApplierService to use the system
context (same as we do for MasterService), which allows removing a hack from
TemplateUpgradeService and makes it clearer that applying CS updates is fully executing under
system context.
2020-06-10 10:30:28 +02:00
Armin Braun fe85bdbe6f
Fix Remote Recovery Being Retried for Removed Nodes (#57608) (#57913)
If a node is disconnected we retry. It does not make sense
to retry the recovery if the node is removed from the cluster though.
=> added a CS listener that cancels the recovery for removed nodes

Also, we were running the retry on the `SAME` pool which for each retry will
be the scheduler pool. Since the error path of the listener we use here
will do blocking operations when closing the resources used by the recovery
we can't use the `SAME` pool here since not all exceptions go to the `ActionListenerResponseHandler`
threading like e.g. `NodeNotConnectedException`.

Closes #57585
2020-06-10 09:41:52 +02:00
Armin Braun d579420452
Stop Serializing Exceptions in SnapshotInfo (#57866) (#57898)
In ff9e8c622427d42a2d87b4ceb298d043ae3c4e6a we changed the format
used when serializing snapshot failures in the cluster state and
`SnapshotInfo`. This turned them from a short string holding all the
nested exception messages into a multi kb stacktrace in many cases.
This is not great if you snapshot a large number of shards that all fail
for example and massively blows up the size of the GET snapshots response
if there are snapshots with failures in there.
This change reverts to the format used for exceptions before the above commit.
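
A minimal sketch of the difference in size, assuming a failure is rendered as one short string built from the nested exception messages rather than a full stack trace. The exact format used in the cluster state is not shown here; `FailureMessageSketch`, `shortMessage`, and the `; nested: ` separator are hypothetical.

```java
class FailureMessageSketch {
    // Walks the cause chain and joins "Type[message]" entries into one line,
    // so the result grows with the nesting depth, not with the stack depth.
    static String shortMessage(Throwable failure) {
        StringBuilder sb = new StringBuilder();
        for (Throwable t = failure; t != null; t = t.getCause()) {
            if (sb.length() > 0) {
                sb.append("; nested: ");
            }
            sb.append(t.getClass().getSimpleName())
              .append('[').append(t.getMessage()).append(']');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Exception failure = new RuntimeException("shard snapshot failed",
            new IllegalStateException("blob store unavailable"));
        System.out.println(shortMessage(failure));
    }
}
```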

Also, this change short circuits logging and serialization of the failure
for an aborted snapshot where we don't care about the specific message at all
and aligns the message to "aborted" in all cases (currently, if we aborted before any IO,
the message would have been "aborted", but an exception when aborting later during IO).
2020-06-10 08:41:03 +02:00
Gordon Brown aab6317260
[7.x] Include hidden indices in snapshots by default (#57325)
Previously, hidden indices were not included in snapshots by default, unless
specified using one of the usual methods for doing so: naming indices directly,
using index patterns starting with a ., or specifying expand_wildcards to
a value that includes hidden (e.g. all or hidden,open).

This commit changes the default expand_wildcards value to include hidden
indices.
2020-06-09 16:01:52 -06:00