Commit Graph

5012 Commits

Author SHA1 Message Date
Armin Braun 483386136d
Move all Snapshot Master Node Steps to SnapshotsService (#56365) (#59373)
This refactoring has three motivations:

1. Separate all master node steps during snapshot operations from all data node steps in code.
2. Set up next steps in concurrent repository operations and general improvements by centralizing tracking of each shard's state in the repository in `SnapshotsService` so that operations for each shard can be linearized efficiently (i.e. without having to inspect the full snapshot state for all shards on every cluster state update, allowing us to track more in memory and only fall back to inspecting the full CS on master failover like we do in the snapshot shards service).
    * This PR already contains some best effort examples of this, but obviously this could be way improved upon still (just did not want to do it in this PR for complexity reasons)
3. Make the `SnapshotsService` less expensive on the CS thread for large snapshots
2020-07-12 22:19:07 +02:00
Dan Hermann e01d73c737
[7.x] Data stream admin actions are now index-level actions 2020-07-10 14:36:18 -05:00
Stuart Tettemer 4c04fd1e05
Scripting: Unlimited compilation rate for ingest (#59268)
* `ingest` and `processor_conditional` default to unlimited compilation rate

Refs: #50152
2020-07-09 16:34:47 -05:00
Stuart Tettemer 94e213dd5f
Scripting: Per context stats in `script` in _nodes/stats (#59266)
Updated `_nodes/stats`:
 * Update `script` in `_nodes/stats` to include stats per context:

```
      "script": {
        "compilations": 1,
        "cache_evictions": 0,
        "compilation_limit_triggered": 0,
        "contexts":[
          {
            "context": "aggregation_selector",
            "compilations": 0,
            "cache_evictions": 0,
            "compilation_limit_triggered": 0
          },

```

Refs: #50152
Backport: #59625
2020-07-09 15:30:50 -05:00
Alan Woodward f4caadd239 MappedFieldType no longer requires equals/hashCode/clone (#59212)
With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer
have any cause to compare MappedFieldType instances. This means that we can remove all equals
and hashCode implementations, and in addition we no longer need the clone implementations which
were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes,
which will be particularly useful for the runtime fields project.
2020-07-09 21:05:10 +01:00
Dan Hermann c26d2b5fa5
Data stream support for indices shard stores API 2020-07-09 13:11:45 -05:00
Nik Everett 28ef997953
Improve vwh's distant bucket handling (#59094) (#59248)
This modifies the `variable_width_histogram`'s distant bucket handling
to:
1. Properly handle integer overflows
2. Recalculate the average distance when new buckets are added on the
   ends. This should slow down the rate at which we build extra buckets
   as we build more of them.

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-07-09 12:14:46 -04:00
Przemko Robakowski c870d6e570
[7.x] Restart tests with data streams (#58330) (#59303)
* Restart tests with data streams (#58330)
2020-07-09 17:52:20 +02:00
David Turner d56fc72ee5 Fix node health-check-related test failures (#59277)
In #52680 we introduced a new health check mechanism. This commit fixes
up some sporadic related test failures, and improves the behaviour of
the `FollowersChecker` slightly in the case that no retries are
configured.

Closes #59252
Closes #59172
2020-07-09 12:46:12 +01:00
David Turner c80a9e2ec2 Skip unnecessary directory iteration (#59007)
Today `NodeEnvironment#findAllShardIds` enumerates the index directories
in each data path in order to find one with a specific name. Since we
already know the name of the folder we seek we can construct the path
directly and avoid this directory listing. This commit does that.
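
The difference between the two approaches can be sketched with plain `java.nio.file` calls. This is a minimal illustration, not the `NodeEnvironment` code: the class `IndexDirLookup`, its method names, and the directory layout are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; not part of Elasticsearch.
final class IndexDirLookup {
    private IndexDirLookup() {}

    // Before: list every entry under the parent directory and compare names.
    static Path findByListing(Path indicesDir, String folderName) {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(indicesDir)) {
            for (Path entry : entries) {
                if (Files.isDirectory(entry) && entry.getFileName().toString().equals(folderName)) {
                    return entry;
                }
            }
            return null;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // After: the folder name is already known, so resolve it and check that it exists.
    static Path findDirectly(Path indicesDir, String folderName) {
        Path candidate = indicesDir.resolve(folderName);
        return Files.isDirectory(candidate) ? candidate : null;
    }
}
```

Both methods return the same path for an existing folder, but the direct variant costs one existence check however many index folders the data path holds.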
2020-07-09 11:56:41 +01:00
Alan Woodward 67a27e2b9d Add declarative parameters to FieldMappers (#58663)
The FieldMapper infrastructure currently has a bunch of shared parameters, many of which
are only applicable to a subset of the 41 mapper implementations we ship with. Merging,
parsing and serialization of these parameters are spread around the class hierarchy, with
much repetitive boilerplate code required. It would be much easier to reason about these
things if we could declare the parameter set of each FieldMapper directly in the implementing
class, and share the parsing, merging and serialization logic instead.

This commit is a first effort at introducing a declarative parameter style. It adds a new FieldMapper
subclass, ParametrizedFieldMapper, and refactors two mappers, Boolean and Binary, to use it.
Parameters are declared on Builder classes, with the declaration including the parameter name,
whether or not it is updateable, a default value, how to parse it from mappings, and how to
extract it from another mapper at merge time. Builders have a getParameters method, which
returns a list of the declared parameters; this is then used for parsing, merging and serialization.
Merging is achieved by constructing a new Builder from the existing Mapper, and merging in
values from the merging Mapper; conflicts are all caught at this point, and if none exist then a new,
merged, Mapper can be built from the Builder. This allows all values on the Mapper to be final.

Other mappers can be gradually migrated to this new style, and once they have all been refactored
we can merge ParametrizedFieldMapper and FieldMapper entirely.
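
A rough sketch of the declarative style described above, heavily simplified. The class names `ParamSketch` and `BooleanMapperBuilderSketch`, the chosen parameters, and their updateable flags are assumptions for illustration and do not mirror the real `ParametrizedFieldMapper` API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical, simplified model of one declared mapper parameter.
final class ParamSketch<T> {
    final String name;
    final boolean updateable;
    private final Function<Object, T> parser;
    private T value;

    ParamSketch(String name, boolean updateable, T defaultValue, Function<Object, T> parser) {
        this.name = name;
        this.updateable = updateable;
        this.value = defaultValue;
        this.parser = parser;
    }

    T get() {
        return value;
    }

    void parse(Object raw) {
        value = parser.apply(raw);
    }

    // Take the incoming value, or record a conflict if a non-updateable parameter would change.
    void merge(ParamSketch<T> incoming, List<String> conflicts) {
        if (updateable || Objects.equals(value, incoming.value)) {
            value = incoming.value;
        } else {
            conflicts.add("Cannot update parameter [" + name + "] from [" + value + "] to [" + incoming.value + "]");
        }
    }
}

// Hypothetical builder: parameters are declared once and drive both parsing and merging.
final class BooleanMapperBuilderSketch {
    final ParamSketch<Boolean> docValues =
        new ParamSketch<>("doc_values", false, true, o -> Boolean.parseBoolean(o.toString()));
    final ParamSketch<Float> boost =
        new ParamSketch<>("boost", true, 1.0f, o -> Float.parseFloat(o.toString()));

    List<ParamSketch<?>> getParameters() {
        return Arrays.asList(docValues, boost);
    }

    BooleanMapperBuilderSketch parse(Map<String, Object> mapping) {
        for (ParamSketch<?> p : getParameters()) {
            if (mapping.containsKey(p.name)) {
                p.parse(mapping.get(p.name));
            }
        }
        return this;
    }

    // All conflicts are collected here, before any merged mapper would be built.
    List<String> merge(BooleanMapperBuilderSketch incoming) {
        List<String> conflicts = new ArrayList<>();
        docValues.merge(incoming.docValues, conflicts);
        boost.merge(incoming.boost, conflicts);
        return conflicts;
    }
}
```

In this sketch an empty conflict list means the builder now holds the merged values and a new immutable mapper could be built from it.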
2020-07-09 11:43:21 +01:00
Ignacio Vera 1ad00d1ceb
Add Support in geo_match enrichment policy for any type of geometry (#59276)
geo_match enrichment works currently only with points. This change adds the ability to
use any type of geometry.
2020-07-09 11:41:41 +02:00
Nhat Nguyen 6a0f7411e2 Do not release safe commit with CancellableThreads (#59182)
We are leaking a FileChannel in #39585 if we release a safe commit with 
CancellableThreads. Although it is a bug in Lucene where we do not close
a FileChannel if we failed to create a NIOFSIndexInput, I think it's
safer if we release a safe commit using the generic thread pool instead.

Closes #39585
Relates #45409
2020-07-08 13:51:48 -04:00
Nhat Nguyen 00c859bfca Fix testSendSnapshotSendsOps
We need to use a concurrent collection to keep track of the shipped operations
as they can arrive concurrently since #58018.
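
A minimal sketch of the kind of change involved, assuming the test records each shipped operation from a callback that may now run on several threads. The class `ShippedOpsTracker` and its methods are hypothetical, not the actual test code.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical tracker for illustration; not the actual test code.
final class ShippedOpsTracker {
    // Thread-safe queue: concurrent add() calls do not lose elements,
    // which a plain ArrayList cannot guarantee.
    private final Queue<Long> shipped = new ConcurrentLinkedQueue<>();

    void onBatchSent(long firstSeqNo, int size) {
        for (int i = 0; i < size; i++) {
            shipped.add(firstSeqNo + i);
        }
    }

    int count() {
        return shipped.size();
    }

    // Ships the given number of batches from several threads and returns how many ops were recorded.
    static int runDemo(int batches, int batchSize) {
        ShippedOpsTracker tracker = new ShippedOpsTracker();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int b = 0; b < batches; b++) {
            final long first = (long) b * batchSize;
            pool.execute(() -> tracker.onBatchSent(first, batchSize));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("batches did not complete in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return tracker.count();
    }
}
```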

Relates #58018
2020-07-08 12:25:33 -04:00
Martijn van Groningen 17bd559253
Fix the timestamp field of a data stream to @timestamp (#59210)
Backport of #59076 to 7.x branch.

The commit makes the following changes:
* The timestamp field of a data stream definition in a composable
  index template can only be set to '@timestamp'.
* Removed the custom data stream timestamp field validation in favor of the validation in `TimestampFieldMapper`;
  the only remaining check is that the _timestamp field mapping has been defined on a backing index of a data stream.
* Moved code that injects _timestamp meta field mapping from `MetadataCreateIndexService#applyCreateIndexRequestWithV2Template(...)` method
  to `MetadataIndexTemplateService#collectMappings(...)` method.
* Fixed a bug (#58956) that causes timestamp field validation to be performed
  for each template instead of for the final mappings that are created.
* Only apply the _timestamp meta field if the index is created as part of a data stream or data stream rollover;
  this fixes a docs test, where a regular index creation matches (logs-*) with a template with a data stream definition.

Relates to #58642
Relates to #53100
Closes #58956
Closes #58583
2020-07-08 17:30:46 +02:00
Nik Everett a29d3515a2
Improve cardinality measure used to build aggs (#56533) (#59107)
This makes a `parentCardinality` available to every `Aggregator`'s ctor
so it can make intelligent choices about how it collects bucket values.
This replaces `collectsFromSingleBucket` and is similar to it but:
1. It supports `NONE`, `ONE`, and `MANY` values and is generally
   extensible if we decide we can use more precise counts.
2. It is more accurate. `collectsFromSingleBucket` assumed that all
   sub-aggregations live under multi-bucket aggregations. This is
   normally true but `parentCardinality` is properly carried forward
   for single bucket aggregations like `filter` and for multi-bucket
   aggregations configured with a single bucket, like `range` with a
   single range.
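
To make the three-valued bound concrete, here is a small sketch of how such a cardinality could be carried down the aggregation tree. The enum `ParentCardinalitySketch`, its `multiply` method, and the strategy strings are hypothetical and only illustrate the idea.

```java
// Hypothetical three-valued upper bound on how many owning buckets an aggregator collects from.
enum ParentCardinalitySketch {
    NONE, ONE, MANY;

    // Bound seen by a child when this level creates at most bucketsPerParent buckets
    // for each owning bucket.
    ParentCardinalitySketch multiply(int bucketsPerParent) {
        if (bucketsPerParent < 0) {
            throw new IllegalArgumentException("bucketsPerParent must be >= 0");
        }
        if (this == NONE || bucketsPerParent == 0) {
            return NONE;
        }
        if (this == ONE && bucketsPerParent == 1) {
            return ONE; // e.g. `filter`, or `range` with a single range
        }
        return MANY;
    }

    // Example of a choice an aggregator constructor could make from the bound.
    String collectionStrategy() {
        switch (this) {
            case NONE:
                return "collect nothing";
            case ONE:
                return "single-bucket fast path";
            default:
                return "track values per owning bucket";
        }
    }
}
```

Under this sketch a sub-aggregation below a top-level `filter` still sees `ONE`, which a boolean such as `collectsFromSingleBucket` could only express by special-casing.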

While I was touching every aggregation I renamed `doCreateInternal` to
`createMapped` because that seemed like a much better name and it was
right there, next to the change I was already making.

Relates to #56487

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-07-08 08:42:23 -04:00
Dan Hermann 90c8d3fc9d
IndexNameExpressionResolver::dataStreamNames should support exclusions 2020-07-08 07:35:52 -05:00
Armin Braun 9268b25789
Add Check for Metadata Existence in BlobStoreRepository (#59141) (#59216)
In order to ensure that we do not write a broken piece of `RepositoryData`
because the physical repository generation was moved ahead more than one step
by erroneous concurrent writing to a repository we must check whether or not
the current assumed repository generation exists in the repository physically.
Without this check we run the risk of writing on top of stale cached repository data.

Relates #56911
2020-07-08 14:25:01 +02:00
Tim Brooks 3700bd1c08
Fix assertion in testCollectNodes test (#58948)
Currently we assert that the reason we fail collecting nodes in this
test is due to the fact that no seeds are available or no connections
could be established to cluster_2. However, the collection could fail if
we cannot establish connections to cluster_1. This commit adds that as
an acceptable assertion.
2020-07-07 21:37:10 -06:00
Nhat Nguyen ef5c397c0f
Sending operations concurrently in peer recovery (#58018)
Today, we send operations in phase2 of peer recoveries batch by batch
sequentially. Normally that's okay as we should have a fairly small number of
operations in phase 2 due to the file-based threshold. However, if
phase1 takes a lot of time and we are actively indexing, then phase2 can
have a lot of operations to replay.

With this change, we will send multiple batches concurrently (defaults
to 1) to reduce the recovery time.
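
One common way to cap the number of in-flight batches is a semaphore, sketched below. This is not the recovery source handler's implementation: the class `BoundedBatchSender` and its methods are hypothetical, and a counter update stands in for the network call.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sender for illustration; not the recovery source handler.
final class BoundedBatchSender {
    private final Semaphore permits;
    private final AtomicInteger inFlight = new AtomicInteger();
    private final AtomicInteger maxInFlight = new AtomicInteger();
    private final AtomicInteger opsSent = new AtomicInteger();

    BoundedBatchSender(int maxConcurrentBatches) {
        this.permits = new Semaphore(maxConcurrentBatches);
    }

    // Sends all batches while keeping at most maxConcurrentBatches in flight.
    int sendAll(List<List<String>> batches) {
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            for (List<String> batch : batches) {
                permits.acquire(); // wait until an in-flight slot frees up
                pool.execute(() -> {
                    try {
                        maxInFlight.accumulateAndGet(inFlight.incrementAndGet(), Math::max);
                        opsSent.addAndGet(batch.size()); // stand-in for the network call
                    } finally {
                        inFlight.decrementAndGet();
                        permits.release();
                    }
                });
            }
            pool.shutdown();
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("batches did not complete in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
        return opsSent.get();
    }

    int maxInFlight() {
        return maxInFlight.get();
    }
}
```

With a limit of 1 the sketch degenerates to the old sequential behaviour, which matches a default of 1.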

Backport of #58018
2020-07-07 22:03:31 -04:00
Lee Hinman b832fe30ab
[7.x] Validate Data Streams reference a template on composable template update (#59106) (#59193)
This commit adds validation that when a composable index template is updated, that the number
of unreferenced data streams does not increase. While it is still possible to have data streams
without a backing template (through snapshot restoration), this reduces the chance of getting
into that scenario.

Relates to #53100
2020-07-07 15:38:27 -06:00
Tim Brooks b1c3ad8f59
Fix race in RecoveryRequestTrackerTests (#59187)
Currently in the recovery request tracker tests we place the futures
into the future map on the GENERIC thread. It is possible that the test
has already advanced past the point where we block on these futures
before they are placed in the map. This introduces other potential
failures as we expect all futures have been completed. This commit fixes
the test by placing the futures in the map prior to dispatching.
2020-07-07 15:10:31 -06:00
Nik Everett d536854879 Fix test bug in auto_date_histo
The test would try to prepare a `Rounding` even when there aren't any
buckets. This would fail because there is no range over which to prepare
the rounding. It turns out that we don't need the rounding in that case
so we just use `null` then.

Closes #59131
2020-07-07 15:39:48 -04:00
Andrei Dan 24c6a30e2b
[7.9] GET data stream API returns additional information (#59128) (#59177)
* GET data stream API returns additional information (#59128)

This adds the data stream's index template, the configured ILM policy
(if any) and the health status of the data stream to the GET _data_stream
response.

Restoring a data stream from a snapshot could install a data stream that
doesn't match any composable templates. This also makes the `template`
field in the `GET _data_stream` response optional.

(cherry picked from commit 0d9c98a82353b088c782b6a04c44844e66137054)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-07-07 20:30:09 +01:00
Nhat Nguyen de6ac6aea6 Fix recovery stage transition with sync_id (#57754)
If the recovery source is on an old node (before 7.2), then the recovery
target won't have the safe commit after phase1 because the recovery
source does not send the global checkpoint in the clean_files step. And
if the recovery fails and retries, then the recovery stage won't
transition properly. If a sync_id is used in peer recovery, then the
clean_files step won't be executed to move the stage to TRANSLOG.

Relates ##7187
Closes #57708
2020-07-07 12:00:37 -04:00
Rene Groeschke a896df53ac
Remove misc dependency related deprecation warnings (7.x backport) (#59122)
* Fix dependency related deprecations (#58892)
* Fix classpath setup for forbiddenapi usage
2020-07-07 17:10:31 +02:00
Nik Everett eb169ae226
Fix lookup support in adjacency matrix (backport of #59099) (#59108)
This request:
```
POST /_search
{
  "aggs": {
    "a": {
      "adjacency_matrix": {
        "filters": {
          "1": {
            "terms": { "t": { "index": "lookup", "id": "1", "path": "t" } }
          }
        }
      }
    }
  }
}
```

Would fail with a 500 error and a message like:
```
{
  "error": {
    "root_cause": [
      {
        "type": "illegal_state_exception",
        "reason":"async actions are left after rewrite"
      }
    ]
  }
}
```

This fixes that by moving the query rewrite phase from a synchronous
call on the data nodes into the standard aggregation rewrite phase which
can properly handle the asynchronous actions.
2020-07-07 10:28:20 -04:00
David Turner 46c8d00852
Remove nodes with read-only filesystems (#52680) (#59138)
Today we do not allow a node to start if its filesystem is readonly, but
it is possible for a filesystem to become readonly while the node is
running. We don't currently have any infrastructure in place to make
sure that Elasticsearch behaves well if this happens. A node that cannot
write to disk may be poisonous to the rest of the cluster.

With this commit we periodically verify that nodes' filesystems are
writable. If a node fails these writability checks then it is removed
from the cluster and prevented from re-joining until the checks start
passing again.
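
A writability check of this kind can be as simple as writing, syncing and deleting a small temporary file. The sketch below assumes that approach; the class `WritabilityProbe`, the probe file name, and the boolean result are illustrative choices, not the actual health check service.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical probe for illustration; not the actual health check service.
final class WritabilityProbe {
    private WritabilityProbe() {}

    // Returns true if a small file can be written, synced and deleted under dataPath.
    static boolean isWritable(Path dataPath) {
        Path probe = dataPath.resolve(".writability_probe.tmp");
        try {
            try (FileChannel channel = FileChannel.open(probe,
                    StandardOpenOption.CREATE,
                    StandardOpenOption.TRUNCATE_EXISTING,
                    StandardOpenOption.WRITE)) {
                ByteBuffer buffer = ByteBuffer.wrap(new byte[] { 1 });
                while (buffer.hasRemaining()) {
                    channel.write(buffer);
                }
                channel.force(true); // make sure the write actually reaches the disk
            }
            Files.delete(probe);
            return true;
        } catch (IOException e) {
            // A read-only or otherwise broken filesystem surfaces as an IOException here.
            return false;
        }
    }
}
```

A scheduler could call `isWritable` for each data path at a fixed interval and report the node as unhealthy while any call returns `false`.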

Closes #45286

Co-authored-by: Bukhtawar Khan <bukhtawar7152@gmail.com>
2020-07-07 14:00:02 +01:00
Francisco Fernández Castaño 1ced3f0eb3
Extract recovery files details to its own class (#59121)
Backport of #59039
2020-07-07 12:35:57 +02:00
Ignacio Vera 5cc6457ed8
upgrade to lucene-8.6.0-snapshot-6a715e2ecc3 (#59091) (#59120) 2020-07-07 12:07:41 +02:00
Armin Braun d6d6df16bb
Share IT Infrastructure between Core Snapshot and SLM ITs (#59082) (#59119)
For #58994 it would be useful to be able to share test infrastructure.
This PR shares `AbstractSnapshotIntegTestCase` for that purpose, dries up SLM tests
accordingly and adds a shared and efficient (compared to the previous implementations)
way of waiting for no running snapshot operations to the test infrastructure to dry things up further.
2020-07-07 12:04:41 +02:00
David Turner ef2f0d1f67 Inline no-op IndicesModule#getEngineFactories (#59051)
This method was introduced in #31183 but it has no effect and is never
overridden so this commit removes it.
2020-07-07 09:15:20 +01:00
Francisco Fernández Castaño 0752a86fe5
Enforce higher priority for RepositoriesService ClusterStateApplier (#59040)
* Enforce higher priority for RepositoriesService ClusterStateApplier

This avoids shards allocation failures when the repository instance
comes in the same ClusterState update as the shard allocation.

Backport of #58808
2020-07-07 09:51:08 +02:00
Howard 00ed31d000 Remove IndexShardRoutingTable#primaryAsList (#59044) 2020-07-07 07:34:32 +01:00
Nik Everett be13dea113 Drop a TODO from the terms aggregator (#59100)
We did it in #56487.
2020-07-06 17:46:06 -04:00
Nik Everett eff5f4d234
Add pipeline aggregations to the rewrite phase (backport #58878) (#59081)
This allows pipeline aggregations to participate in the up-front rewrite
phase for searches, in particular, it allows them to load data that they
need asynchronously.

Relates to #58193

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-07-06 15:13:45 -04:00
Nhat Nguyen e827d2ed92 Fix testRestoreLocalHistoryFromTranslogOnPromotion (#58745)
If the global checkpoint equals max_seq_no, then we won't reset an engine 
(as all operations are safe), and max_seqno_of_updates_or_deletes 
won't advance to max_seq_no.

Closes #58163
2020-07-06 12:19:45 -04:00
Andrei Dan 2d516d7bcc
[7.x] Search all (_all, *) resolves data streams too (#58869) (#59058)
Part of the original PR was merged by #59028

(cherry picked from commit 2598327726124d8a86333f79cdc45bf6a4297dbc)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-07-06 14:19:15 +01:00
Dan Hermann 550dcb0ca6
[7.x] Delete data stream API accepts multiple names (#59064) 2020-07-06 08:06:10 -05:00
Armin Braun 722d94688b
Fix MinimumMasterNodesIT Test (#59054) (#59057)
Tiny oversight in dee9e048bdcc5ba59f20d2554e989015463df05a caused
the `otherNodes` collection to incorrectly contain `master` here.
2020-07-06 13:00:15 +02:00
Armin Braun 62eabdac6e
Dry up Snapshot ITs further (#59035) (#59052)
Some more obvious cleaning up of the snapshot ITs.

follow up to #58818
2020-07-06 12:26:42 +02:00
Martijn van Groningen f0dd9b4ace
Add data stream timestamp validation via metadata field mapper (#59002)
Backport of #58582 to 7.x branch.

This commit adds a new metadata field mapper that validates
that a document has exactly a single timestamp value in the data stream timestamp field and
that the timestamp field mapping only has `type`, `meta` or `format` attributes configured.
Other attributes can affect the guarantee that an index with this meta field mapper has a
useable timestamp field.

The MetadataCreateIndexService inserts a data stream timestamp field mapper whenever
a new backing index of a data stream is created.

Relates to #53100
2020-07-06 11:32:33 +02:00
Armin Braun 49857cc35d
Dry up Master Disconnect Disruption Tests (#58953) (#59050)
Dry up tests that use a disruption that isolates the master from all other nodes.
Also, turn disruption types that have neither parameters nor state into constants
to make things a little clearer.
2020-07-06 11:04:24 +02:00
Nhat Nguyen 62763b177d Implement toString for BulkByScrollTask (#59042)
We should implement "toString" of BulkByScrollTask.StatusOrException 
to have a meaningful log message when a reindex task completes.
2020-07-05 22:06:56 -04:00
Armin Braun 071d8b2c1c
Deduplicate Empty InternalAggregations (#58386) (#59032)
Working through a heap dump for an unrelated issue I found that we can easily rack up
tens of MBs of duplicate empty instances in some cases.
I moved to a static constructor to guard against that in all cases.
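
The pattern is roughly the following: hide the constructor behind a static factory that returns one shared instance for the empty case. The class `AggResultsSketch` is a hypothetical stand-in, not `InternalAggregations` itself.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration; not InternalAggregations itself.
final class AggResultsSketch {
    // Single shared instance for the empty case.
    static final AggResultsSketch EMPTY = new AggResultsSketch(Collections.emptyList());

    private final List<String> names;

    // Private constructor: callers must go through from(), so duplicates of EMPTY cannot be created.
    private AggResultsSketch(List<String> names) {
        this.names = names;
    }

    static AggResultsSketch from(List<String> names) {
        if (names.isEmpty()) {
            return EMPTY;
        }
        return new AggResultsSketch(Collections.unmodifiableList(new ArrayList<>(names)));
    }

    List<String> names() {
        return names;
    }
}
```

This only works safely because the instances are immutable; a shared mutable empty instance would leak state between callers.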
2020-07-04 14:02:16 +02:00
Dan Hermann 7c43cbca82
[7.x] Ignore matching data streams if include_data_streams is false (#59028) 2020-07-03 14:51:32 -05:00
Dan Hermann c1781bc7e7
[7.x] Add include_data_streams flag for authorization (#59008) 2020-07-03 12:58:39 -05:00
Dan Hermann 5e7746d3bd
[7.x] Mirror privileges over data streams to their backing indices (#58991) 2020-07-03 06:33:38 -05:00
Armin Braun d22dd437f1
Fix Two Common Zero Len Array Instantiations (#58944) (#58993)
Two spots I found in which we commonly instantiate a non-trivial number of zero length arrays.
2020-07-03 09:18:14 +02:00
Nhat Nguyen 65645217bc Handle IOException while checking translog corruption
We can hit an IOException while reading a translog header after corrupting it.

Relates #58866
2020-07-02 22:38:05 -04:00