Commit Graph

104 Commits

Author SHA1 Message Date
Alan Woodward 19ba6c39d2 Migrate CompletionFieldMapper to parametrized format (#59291)
This adds some optional extra configuration to Parameter:

* custom serialization (to handle analyzers)
* deprecated parameter names
* parameter validation
2020-07-13 12:43:15 +01:00
Armin Braun 08b54feaaf
Remove Snapshot INIT Step (#55918) (#59374)
With #55773 the snapshot INIT state step has become obsolete. We can set up the snapshot directly in one single step to simplify the state machine.

This is a big help for building concurrent snapshots because it allows us to establish a deterministic order of operations between snapshot create and delete operations since all of their entries now contain a repository generation. With this change simple queuing up of snapshot operations can and will be added in a follow-up.
2020-07-13 13:41:09 +02:00
Dan Hermann e01d73c737
[7.x] Data stream admin actions are now index-level actions 2020-07-10 14:36:18 -05:00
Dan Hermann c26d2b5fa5
Data stream support for indices shard stores API 2020-07-09 13:11:45 -05:00
Przemko Robakowski c870d6e570
[7.x] Restart tests with data streams (#58330) (#59303)
* Restart tests with data streams (#58330)
2020-07-09 17:52:20 +02:00
Martijn van Groningen 17bd559253
Fix the timestamp field of a data stream to @timestamp (#59210)
Backport of #59076 to 7.x branch.

The commit makes the following changes:
* The timestamp field of a data stream definition in a composable
  index template can only be set to '@timestamp'.
* Removed custom data stream timestamp field validation and reuse the validation from `TimestampFieldMapper` and
  instead only check that the _timestamp field mapping has been defined on a backing index of a data stream.
* Moved code that injects _timestamp meta field mapping from `MetadataCreateIndexService#applyCreateIndexRequestWithV2Template58956(...)` method
  to `MetadataIndexTemplateService#collectMappings(...)` method.
* Fixed a bug (#58956) that cases timestamp field validation to be performed
  for each template and instead of the final mappings that is created.
* only apply _timestamp meta field if index is created as part of a data stream or data stream rollover,
this fixes a docs test, where a regular index creation matches (logs-*) with a template with a data stream definition.

Relates to #58642
Relates to #53100
Closes #58956
Closes #58583
2020-07-08 17:30:46 +02:00
Armin Braun 9268b25789
Add Check for Metadata Existence in BlobStoreRepository (#59141) (#59216)
In order to ensure that we do not write a broken piece of `RepositoryData`
because the phyiscal repository generation was moved ahead more than one step
by erroneous concurrent writing to a repository we must check whether or not
the current assumed repository generation exists in the repository physically.
Without this check we run the risk of writing on top of stale cached repository data.

Relates #56911
2020-07-08 14:25:01 +02:00
Andrei Dan 24c6a30e2b
[7.9] GET data stream API returns additional information (#59128) (#59177)
* GET data stream API returns additional information (#59128)

This adds the data stream's index template, the configured ILM policy
(if any) and the health status of the data stream to the GET _data_stream
response.

Restoring a data stream from a snapshot could install a data stream that
doesn't match any composable templates. This also makes the `template`
field in the `GET _data_stream` response optional.

(cherry picked from commit 0d9c98a82353b088c782b6a04c44844e66137054)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-07-07 20:30:09 +01:00
Nhat Nguyen de6ac6aea6 Fix recovery stage transition with sync_id (#57754)
If the recovery source is on an old node (before 7.2), then the recovery
target won't have the safe commit after phase1 because the recovery
source does not send the global checkpoint in the clean_files step. And
if the recovery fails and retries, then the recovery stage won't
transition properly. If a sync_id is used in peer recovery, then the
clean_files step won't be executed to move the stage to TRANSLOG.

Relates ##7187
Closes #57708
2020-07-07 12:00:37 -04:00
Armin Braun d6d6df16bb
Share IT Infrastructure between Core Snapshot and SLM ITs (#59082) (#59119)
For #58994 it would be useful to be able to share test infrastructure.
This PR shares `AbstractSnapshotIntegTestCase` for that purpose, dries up SLM tests
accordingly and adds a shared and efficient (compared to the previous implementations)
way of waiting for no running snapshot operations to the test infrastructure to dry things up further.
2020-07-07 12:04:41 +02:00
Andrei Dan 2d516d7bcc
[7.x] Search all (_all, *) resolves data streams too (#58869) (#59058)
Part of the original PR was merged by #59028

(cherry picked from commit 2598327726124d8a86333f79cdc45bf6a4297dbc)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-07-06 14:19:15 +01:00
Dan Hermann 550dcb0ca6
[7.x] Delete data stream API accepts multiple names (#59064) 2020-07-06 08:06:10 -05:00
Armin Braun 722d94688b
Fix MinimumMasterNodesIT Test (#59054) (#59057)
Tiny oversight in dee9e048bdcc5ba59f20d2554e989015463df05a caused
the `otherNodes` collection to incorrectly contain `master` here.
2020-07-06 13:00:15 +02:00
Armin Braun 62eabdac6e
Dry up Snapshot ITs further (#59035) (#59052)
Some more obvious cleaning up of the snapshot ITs.

follow up to #58818
2020-07-06 12:26:42 +02:00
Martijn van Groningen f0dd9b4ace
Add data stream timestamp validation via metadata field mapper (#59002)
Backport of #58582 to 7.x branch.

This commit adds a new metadata field mapper that validates,
that a document has exactly a single timestamp value in the data stream timestamp field and
that the timestamp field mapping only has `type`, `meta` or `format` attributes configured.
Other attributes can affect the guarantee that an index with this meta field mapper has a
useable timestamp field.

The MetadataCreateIndexService inserts a data stream timestamp field mapper whenever
a new backing index of a data stream is created.

Relates to #53100
2020-07-06 11:32:33 +02:00
Armin Braun 49857cc35d
Dry up Master Disconnect Disruption Tests (#58953) (#59050)
Dry up tests that use a disruption that isolates the master from all other nodes.
Also, turn disruption types that have neither parameters nor state into constants
to make things a little clearer.
2020-07-06 11:04:24 +02:00
Dan Hermann 7c43cbca82
[7.x] Ignore matching data streams if include_data_streams is false (#59028) 2020-07-03 14:51:32 -05:00
Tim Brooks dc9e364ff2
Count coordinating and primary bytes as write bytes (#58984)
This is a follow-up to #57573. This commit combines coordinating and
primary bytes under the same "write" bucket. Double accounting is
prevented by only accounting the bytes at either the reroute phase or
the primary phase. TransportBulkAction calls execute directly, so the
operations handler is skipped and the bytes are not double accounted.
2020-07-02 19:48:19 -06:00
Mark Vieira 8fca312a3a
Mute WriteMemoryLimitsIT.testWriteBytesAreIncremented 2020-07-02 16:58:23 -07:00
Tim Brooks 9d1bf383d0
Add test assertions to ensure write bytes released (#58970)
This is a follow-up to #57573. This commit ensures that the bytes marked
in WriteMemoryLimits are released by any test using an internal test
cluster.
2020-07-02 17:38:23 -06:00
Tim Brooks 1ef2cd7f1a
Add memory tracking to queued write operations (#58957)
Currently we do not track the memory consuming by in-process write
operations.

This commit adds a mechanism to track write operation memory usage.
2020-07-02 14:14:57 -06:00
Lee Hinman e32623ef52
[7.x] Add test for component templates updated after cluster restart (#58883) (#58914)
This commit adds an integration test that component templates used to form a composite template can
still be updated after a cluster restart.

In #58643 an issue arose where mappings were causing problems because of the way we unwrap `_doc` in
template mappings. This was also related to the mappings being merged manually rather than using the
`MapperService` to do the merging. #58643 was fixed in 7.9 and master with the #58521 change, since
mappings now are read and digested by the actual mapper service.

This test passes for 7.x and master, and I intend to open a separate PR including this test for
7.8.1 along with a bug fix for #58643. This test is to ensure we don't have any regression in the
future.
2020-07-02 08:23:34 -06:00
Armin Braun 62152852dc
Cleanup Duplication in Snapshot ITs (#58818) (#58915)
Just a few obvious static cleanups of duplication to push back against the ever increasing complexity of these tests.
2020-07-02 16:00:01 +02:00
Lee Hinman d3d03fc1c6
[7.x] Add default composable templates for new indexing strategy (#57629) (#58757)
Backports the following commits to 7.x:

    Add default composable templates for new indexing strategy (#57629)
2020-07-01 09:32:32 -06:00
Andrei Dan f7dc09340b
Prohibit custom _routing for index requests targetting a data stream (#58749) (#58831)
This prohibits the use of a custom _routing when the index/bulk requests are targetting a data stream.
Using a custom _routing when targetting a backing index is still permitted.

Relates to #53100

(cherry picked from commit ece6b7a318a8bd3a010499189f31fc5e3a012d4f)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-07-01 14:54:18 +01:00
David Turner 822b7421ce Forbid read-only-allow-delete block in blocks API (#58727)
The read-only-allow-delete block is not really under the user's control
since Elasticsearch adds/removes it automatically. This commit removes
support for it from the new API for adding blocks to indices that was
introduced in #58094.
2020-07-01 13:18:26 +01:00
Martijn van Groningen a0df96befb
Add data stream support to put mapping and update index settings APIs. (#58758)
Backport of #58231 to 7.x branch.

Change update index setting and put mapping api
to execute on all backing indices if data stream is targeted.

Relates #53100
2020-07-01 13:32:21 +02:00
Yannick Welsch 15c85b29fd
Account for recovery throttling when restoring snapshot (#58658) (#58811)
Restoring from a snapshot (which is a particular form of recovery) does not currently take recovery throttling into account
(i.e. the `indices.recovery.max_bytes_per_sec` setting). While restores are subject to their own throttling (repository
setting `max_restore_bytes_per_sec`), this repository setting does not allow for values to be configured differently on a
per-node basis. As restores are very similar in nature to peer recoveries (streaming bytes to the node), it makes sense to
configure throttling in a single place.

The `max_restore_bytes_per_sec` setting is also changed to default to unlimited now, whereas previously it was set to
`40mb`, which is the current default of `indices.recovery.max_bytes_per_sec`). This means that no behavioral change
will be observed by clusters where the recovery and restore settings were not adapted.

Relates https://github.com/elastic/elasticsearch/issues/57023

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-07-01 12:19:29 +02:00
David Turner 3a234d2669
Account for remaining recovery in disk allocator (#58800)
Today the disk-based shard allocator accounts for incoming shards by
subtracting the estimated size of the incoming shard from the free space on the
node. This is an overly conservative estimate if the incoming shard has almost
finished its recovery since in that case it is already consuming most of the
disk space it needs.

This change adds to the shard stats a measure of how much larger each store is
expected to grow, computed from the ongoing recovery, and uses this to account
for the disk usage of incoming shards more accurately.

Backport of #58029 to 7.x

* Picky picky

* Missing type
2020-07-01 10:12:44 +01:00
Dan Hermann 1c2a726731
Data stream support for search shards API (#58486) (#58765) 2020-06-30 17:59:51 -05:00
Dan Hermann cae49b0fd7
[7.x] Add data stream support to open index API (#58767) 2020-06-30 14:30:32 -05:00
Dan Hermann a84ff81743
Data stream support for get field mappings API (#58488) (#58766) 2020-06-30 13:45:04 -05:00
Armin Braun b52a764143
Fix NPE in SnapshotService CS Application (#58680) (#58735)
In the unlikely corner case of deleting a relocation (hence `WAITING`) primary shard's
index during a partial snapshot, we would throw an NPE when checking if there's any external
changes to process.
2020-06-30 15:20:49 +02:00
Yannick Welsch b885cbff1a
Add index block api (#58716)
Adds an API for putting an index block in place, which also ensures for write blocks that, once successfully returning to
the user, all shards of the index are properly accounting for the block, for example that all in-flight writes to an index have
been completed after adding the write block.

This API allows coordinating more complex workflows, where it is crucial that an index is no longer receiving writes after
the API completes, useful for example when marking an index as read-only during an upgrade in order to reindex its
documents.
2020-06-30 14:06:52 +02:00
Armin Braun 95d85f29f8
Fix Snapshots Capturing Incomplete Datastreams (#58630) (#58656)
Only snapshot datastreams that are recorded in `SnapshotInfo` and clean those
that aren't from the snapshotted metadata.
Do not restore all datastreams by default when restoring global metadata, use the same
mechanics used for indices here.

Closes #58544
2020-06-29 12:51:40 +02:00
Armin Braun 4f2f257b12
Fix DataStream Handling on Restore of Global Metadata (#58631) (#58649)
When restoring a global metadata snapshot we were overwriting the correctly
adjusted data streams in the metadata when looping over all custom values.

Closes #58496
2020-06-29 10:58:41 +02:00
Armin Braun 090211f768
Fix Incorrect Snapshot Shar Status for DONE Shards in Running Snapshots (#58390) (#58593)
Minor bugs/inconsistencies:

If a shard hasn't changed at all we were reporting `0` for total size and total file count
while it was ongoing.

If a data node restarts/drops out during snapshot creation the fallback logic did not load the correct statistic from the repository but just created a status with `0` counts from the snapshot state in the CS. Added a fallback to reading from the repository in this case.
2020-06-26 16:11:30 +02:00
Nik Everett 5f52bc4c9f
Fix two scripted_metric bugs (backport of #58547) (#58565)
Fixes two bugs introduced by #57627:
1. We were not properly letting go of memory from the request breaker
   when the aggregation finished.
2. We no longer supported totally arbitrary stuff produced by the init
   script because we *assumed* that it'd be ok to run the script once
   and clone its results. Sadly, cloning can't clone *anything* that the
   init script can make, like `String` arrays. This runs the init script
   once for every new bucket so we don't need to clone.
2020-06-25 16:16:10 -04:00
Armin Braun 468e559ff7
Fix Memory Leak From Master Failover During Snapshot (#58511) (#58560)
If we failed over while the data nodes were doing their work
we would never resolve the listener and leak it.
This change fails all listeners if master fails over.
2020-06-25 20:43:08 +02:00
Jason Tedor 52ad5842a9
Introduce node.roles setting (#58512)
Today we have individual settings for configuring node roles such as
node.data and node.master. Additionally, roles are pluggable and we have
used this to introduce roles such as node.ml and node.voting_only. As
the number of roles is growing, managing these becomes harder for the
user. For example, to create a master-only node, today a user has to
configure:
 - node.data: false
 - node.ingest: false
 - node.remote_cluster_client: false
 - node.ml: false

at a minimum if they are relying on defaults, but also add:
 - node.master: true
 - node.transform: false
 - node.voting_only: false

If they want to be explicit. This is also challenging in cases where a
user wants to have configure a coordinating-only node which requires
disabling all roles, a list which we are adding to, requiring the user
to keep checking whether a node has acquired any of these roles.

This commit addresses this by adding a list setting node.roles for which
a user has explicit control over the list of roles that a node has. If
the setting is configured, the node has exactly the roles in the list,
and not any additional roles. This means to configure a master-only
node, the setting is merely 'node.roles: [master]', and to configure a
coordinating-only node, the setting is merely: 'node.roles: []'.

With this change we deprecate the existing 'node.*' settings such as
'node.data'.
2020-06-25 14:14:51 -04:00
Armin Braun 9e4c5d1dde
Cleaner Handling of Snapshot Related null Custom Values in CS (#58382) (#58501)
Add the ability to get a custom value while specifying a default and use it throughout the
codebase to get rid of the `null` edge case and shorten the code a little.
2020-06-24 17:24:44 +02:00
David Turner 796cb9e9ca Reword INDEX_READ_ONLY_ALLOW_DELETE_BLOCK message (#58410)
Users are perennially confused by the message they get when writing to
an index is blocked due to excessive disk usage:

    TOO_MANY_REQUESTS/12/index read-only / allow delete (api)

Of course this is technically accurate but it is hard to join the dots
from this message to "your disk was too full" without some searching of
forums and documentation. Additionally in #50166 we changed the status
code to today's `429` from the previous `403` which changed the message
from the one that's widely documented elsewhere:

    FORBIDDEN/12/index read-only / allow delete (api)

Since #42559 we've considered this block to be under the sole control of
the disk-based shard allocator, and we have seen no evidence to suggest
that anyone is applying this block manually. Therefore this commit
adjusts this block's message to indicate that it's caused by a lack of
disk space.
2020-06-24 10:22:11 +01:00
Martijn van Groningen 7dda9934f9
Keep track of timestamp_field mapping as part of a data stream (#58400)
Backporting #58096 to 7.x branch.
Relates to #53100

* use mapping source direcly instead of using mapper service to extract the relevant mapping details
* moved assertion to TimestampField class and added helper method for tests
* Improved logic that inserts timestamp field mapping into an mapping.
If the timestamp field path consisted out of object fields and
if the final mapping did not contain the parent field then an error
occurred, because the prior logic assumed that the object field existed.
2020-06-22 17:46:38 +02:00
Przemko Robakowski a44dad9fbb
[7.x] Add support for snapshot and restore to data streams (#57675) (#58371)
* Add support for snapshot and restore to data streams (#57675)

This change adds support for including data streams in snapshots.
Names are provided in indices field (the same way as in other APIs), wildcards are supported.
If rename pattern is specified it renames both data streams and backing indices.
It also adds test to make sure SLM works correctly.

Closes #57127

Relates to #53100

* version fix

* compilation fix

* compilation fix

* remove unused changes

* compilation fix

* test fix
2020-06-19 22:41:51 +02:00
Andrei Dan 30e777856f
[7.x] Validate alias operations don't target data streams (#58327) (#58337)
This adds validation to make sure alias operations (add, remove, remove index)
don't target data streams or the backing indices.

(cherry picked from commit 816448990e464a02f3960f12f6f6644a8cce36a4)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-06-18 20:23:07 +01:00
Alan Woodward 4b8cf2af6a
Add serialization test for FieldMappers when include_defaults=true (#58235) (#58328)
Fixes a bug in TextFieldMapper serialization when index is false, and adds a
base-class test to ensure that all field mappers are tested against all variations
with defaults both included and excluded.

Fixes #58188
2020-06-18 15:46:04 +01:00
Christoph Büscher ba0b046909 Fix test compilation issue 2020-06-18 11:36:11 +02:00
Christoph Büscher 31d8e03954 Prevent BigInteger serialization errors in term queries (#57987)
When a numeric value in e.g. a `term` query doesn't fit into a long, it
curerently gets parsed to a BigInteger object, that the various term query
builders store untouched. This leads to serialization errors when these queries
are sent across the wire. Instead we can convert to a string representation
early on, since that is what we store e.g. when indexing big integers into
`keyword` fields anyway.

Closes #57917
2020-06-18 11:17:12 +02:00
Jim Ferenczi 82db0b575c
Allow index filtering in field capabilities API (#57276) (#58299)
This change allows to use an `index_filter` in the
field capabilities API. Indices are filtered from
the response if the provided query rewrites to `match_none`
on every shard:

````
GET metrics-*
{
  "index_filter": {
    "bool": {
      "must": [
        "range": {
          "@timestamp": {
            "gt": "2019"
          }
        }
      }
  }
}
````

The filtering is done on a best-effort basis, it uses the can match phase
to rewrite queries to `match_none` instead of fully executing the request.
The first shard that can match the filter is used to create the field
capabilities response for the entire index.

Closes #56195
2020-06-18 10:23:26 +02:00
Tim Brooks 2074412d79
Retry failed replication due to transient errors (#56230)
Currently a failed replication action will fail an entire replica. This
includes when replication fails due to potentially short lived transient
issues such as network distruptions or circuit breaking errors.

This commit implements retries using the retryable action.
2020-06-17 10:17:30 -06:00