OpenSearch

Commit Graph

Author	SHA1	Message	Date
Tanguy Leroux	9f5e95505b	Also abort ongoing file restores when snapshot restore is aborted (#62441 ) (#62607 ) Today when a snapshot restore is aborted (for example when the index is explicitly deleted) while the restoration of the files from the repository has already started, the file restores are not interrupted. It means that Elasticsearch will continue to read the files from the repository and will continue to write them to disk until all files are restored; the store will then be closed and files will be deleted from disk at some point but this can take a while. This will also take some slots in the SNAPSHOT thread pool. The Recovery API won't show any files actively being recovered; the only notable indicator would be the active threads in the SNAPSHOT thread pool. This commit adds a check before reading a file to restore and before writing bytes on disk so that a closing store can be detected more quickly and the file recovery process aborted. This way the file restores just stop and for most of the repository implementations it means that no more bytes are read (see #62370 for S3), finishing threads in the SNAPSHOT thread pool more quickly too.	2020-09-18 14:04:58 +02:00
Ignacio Vera	6a3d731be1	Only call reduce on a single InternalAggregation when needed (#62525 ) (#62594 ) Adds a new abstract method in InternalAggregation that flags the framework if it needs to reduce on a single InternalAggregation.	2020-09-18 08:43:58 +02:00
Ryan Ernst	ede62d722f	Skip release build tests for external test modules (#62579 ) The tests don't make sense for release builds. closes #62435	2020-09-17 13:08:17 -07:00
Alan Woodward	91e2330529	Warn on badly-formed null values for date and IP field mappers (#62487 ) In #57666 we changed when null_value was parsed for ip and date fields. Previously, the null value was stored as a string, and parsed into a date or InetAddress whenever a document containing a null value was encountered. Now, the values are parsed when the mappings are built, which means that bad values are detected up front; if you try to add a mapping with a badly-formed ip or date for a null_value, the mapping will be rejected. This causes problems for upgrades in the case when you have a badly-formed null_value in a pre-7.9 cluster. This commit fixes the upgrade case by changing the logic to only log a warning on the badly-formed value, replicating the earlier behaviour. Fixes #62363	2020-09-17 16:38:08 +01:00
Martijn van Groningen	11cef15b83	Ignore 404 when wiping data streams. (#62492 ) Backport of #62484 to 7.x branch. It is possible in mixed version clusters (nodes prior to 7.10) that a 404 is returned when wiping all data streams. This is because there are no data streams and the coordinator node is on a version that doesn't mark the delete request for wildcard usage.	2020-09-17 11:04:05 +02:00
Nik Everett	24a24d050a	Implement fields fetch for runtime fields (backport of #61995 ) (#62416 ) This implements the `fields` API in `_search` for runtime fields using doc values. Most of that implementation is stolen from the `docvalue_fields` fetch sub-phase, just moved into the same API that the `fields` API uses. At this point the `docvalue_fields` fetch phase looks like a special case of the `fields` API. While I was at it I moved the "which doc values sub-implementation should I use for fetching?" question from a bunch of `instanceof`s to a method on `LeafFieldData` so we can be much more flexible with what is returned and we're not forced to extend certain classes just to make the fetch phase happy. Relates to #59332	2020-09-15 20:24:10 -04:00
Jim Ferenczi	4eea602d2d	Add a snapshot test module to delay shard aggregations (#62082 ) (#62359 ) This change adds an aggregation that can be used to delay the query phase execution on shards with a configurable time: { "aggs": { "delay": { "shard_delay": { "value": "30s" }, "aggs": { "host": { "terms": { "field": "hostname" } } } } } } This test module is built on top of #61954 so the aggregation will be available only within snapshots since this module is not meant to be used in production. Closes #54159	2020-09-15 13:52:38 +02:00
Lee Hinman	6b2af30a62	[7.x] Add "synthetics-*-*" templates for synthetics fleet data (#62193 ) (#62346 ) * Add "synthetics-*-*" templates for synthetics fleet data For the Elastic Agent we currently have `logs` and `metrics`, however, synthetic data doesn't belong with those and thus we should have a place for it to live. This would be data reported from heartbeat and under the 'monitoring' category. This commit adds a composable index template for `synthetics-*-*` indices similar to the work in #56709 and #57629. Resolves #61665	2020-09-14 17:14:34 -06:00
Alan Woodward	5358cee29c	Cut over more mapping tests to MapperServiceTestCase (#62312 ) Shaves a few more seconds off the build.	2020-09-14 16:00:37 +01:00
Nhat Nguyen	aafb2cb812	Support point in time cross cluster search (#61827 ) This commit integrates point in time into cross cluster search. Relates #61062 Closes #61790	2020-09-10 19:25:48 -04:00
Nhat Nguyen	035f0638f4	Support point in time in async_search (#61560 ) This commit integrates point in time into async search and ensures that it works correctly with security enabled. Relates #61062	2020-09-10 19:25:48 -04:00
Nhat Nguyen	2eb1e8bc84	Make keep alive of point in time optional in search (#62184 ) A search request should not be required to extend the keep_alive of a point in time. This change makes that parameter optional.	2020-09-10 19:25:48 -04:00
Luca Cavanna	44bd4a6004	Fix point in time toXContent impl (#62080 ) PointInTimeBuilder is a ToXContentObject yet it does not print out a whole object (it is rather a fragment). Also, when it is printed out as part of SearchSourceBuilder, an error is thrown because pit should be wrapped into its own object. This commit fixes this and adds tests for it.	2020-09-10 19:25:47 -04:00
Nhat Nguyen	3d69b5c41e	Introduce point in time APIs in x-pack basic (#61062 ) This commit introduces a new API that manages point-in-times in x-pack basic. Elasticsearch pit (point in time) is a lightweight view into the state of the data as it existed when initiated. A search request by default executes against the most recent point in time. In some cases, it is preferred to perform multiple search requests using the same point in time. For example, if refreshes happen between search_after requests, then the results of those requests might not be consistent as changes happening between searches are only visible to the more recent point in time. A point in time must be opened before being used in search requests. The `keep_alive` parameter tells Elasticsearch how long it should keep a point in time around. ``` POST /my_index/_pit?keep_alive=1m ``` The response from the above request includes an `id`, which should be passed to the `id` of the `pit` parameter of search requests. ``` POST /_search { "query": { "match" : { "title" : "elasticsearch" } }, "pit": { "id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", "keep_alive": "1m" } } ``` Point-in-times are automatically closed when the `keep_alive` has elapsed. However, keeping point-in-times has a cost; hence, point-in-times should be closed as soon as they are no longer used in search requests. ``` DELETE /_pit { "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA=" } ``` #### Notable works in this change: - Move the search state to the coordinating node: #52741 - Allow searches with a specific reader context: #53989 - Add the ability to acquire readers in IndexShard: #54966 Relates #46523 Relates #26472 Co-authored-by: Jim Ferenczi <jimczi@apache.org>	2020-09-10 19:25:47 -04:00
Martijn van Groningen	81b89fe3ba	Change yaml test suite testcase to automatically delete all data streams after each yaml test (#62214 ) Backporting #62205 to 7.x branch. This is similar to what happens for indices. Initially we decided to let each test clean up the data streams it created. The reason behind this was that client yaml test runners would need to be modified to do this too and, because data streams were new, we waited with that and let each test clean up the data stream it created. However we sometimes have very hard to debug test failures, because many tests fail because another test failed midway and didn't clean up the data streams it created. Given that, and that data streams have existed in the code base for a while now, we should automatically delete all data streams after each yaml test. Relates to #62190 * preserve data streams for rolling upgrade yaml tests	2020-09-10 15:10:57 +02:00
Alan Woodward	5f05eef7e3	Convert some more mapping tests to MapperServiceTestCase (#62089 ) We don't need to extend ESSingleNodeTestCase for all these tests.	2020-09-08 17:51:40 +01:00
Francisco Fernández Castaño	2bb5716b3d	Add repositories metering API (#62088 ) This pull request adds a new set of APIs that allows tracking the number of requests performed by the different registered repositories. In order to avoid losing data, the repository statistics are archived after the repository is closed for a configurable retention period `repositories.stats.archive.retention_period`. The API exposes the statistics for the active repositories as well as the modified/closed repositories. Backport of #60371	2020-09-08 14:01:04 +02:00
David Turner	3389d5ccb2	Introduce integ tests for high disk watermark (#60460 ) An important goal of the disk threshold decider is to ensure that nodes use less disk space than the high watermark, and to take action if a node ever exceeds this watermark. Today we do not have any integration-style tests of this high-level behaviour. This commit introduces a small test harness that can adjust the apparent size of the disk and verify that the disk threshold decider moves shards around in response. Co-authored-by: Yannick Welsch <yannick@welsch.lu>	2020-09-07 14:39:39 +02:00
Luca Cavanna	0c8b438577	Add support for runtime fields (#61776 ) This commit includes the work that has been done on the runtime fields feature branch until now. The high level tasks are listed in #59332. The tasks that have not yet been completed can be worked on after merging the feature branch. We are adding a new x-pack plugin called runtime-fields that plugs in a custom mapper which allows defining runtime fields based on a script. The changes included in this commit that were made outside of the x-pack/plugin/runtime-fields directory are minimal and revolve around 1) making the ScriptService available while parsing index mappings so that the scripts associated with runtime fields can be compiled 2) sharing code to manipulate ranges etc. as it can be reused in runtime fields. Co-authored-by: Nik Everett <nik9000@gmail.com>	2020-09-07 09:14:53 +02:00
Ryan Ernst	6d3b691048	Add snapshot only test modules (#61954 ) This commit adds external test modules. These are modules meant for external systems to test edge cases in elasticsearch, but only within snapshots. They are not meant to be used in production, so protections are also added from their accidental inclusion in release builds. Note that this commit does not actually add any new modules, it only adds the infrastructure for the new modules, under `test/external-modules`.	2020-09-04 16:35:18 -07:00
Alan Woodward	af01ccee93	Add specific test for serializing all mapping parameter values (#61844 ) (#61877 ) This commit adds a test to MapperTestCase that explicitly checks that a mapper can serialize all its default values, and that this serialization can then be re-parsed. Note that the test is disabled for non-parametrized mappers as their serialization may in some cases output parameters that are not accepted. Gradually moving all mappers to parametrized form will address this. The commit also contains a fix to keyword mappers, which were not correctly serializing the similarity parameter; this partially addresses #61563. It also enables `null` as a value for `null_value` on `scaled_float`, as a follow-up to #61798	2020-09-03 09:20:26 +01:00
Alan Woodward	d59343b4ba	Allow [null] values in [null_value] (#61798 ) (#61807 ) Several field mappers have a null_value parameter, that allows you to specify a placeholder value to insert into a document if the incoming value for that field is null. The default value for this is always null, meaning "add no placeholder". However, we explicitly bar users from setting this parameter directly to null (done in #7978, in order to fix an NPE). This exclusion means that if a mapper is serialized with include_defaults, then we either need to special-case null_value to ensure that it is not output when it holds the default value, or we find that the resulting serialized form cannot be used to create a mapping. This stops us doing some useful generic testing of mappers. This commit permits null as a parameter value for null_value, and changes the tests to check that it is a) permissible and b) applied without throwing errors. As part of the testing changes, a new base class MapperServiceTestCase is refactored from MapperTestCase, holding the various helper methods related to building mappings but not the single-mapper specific abstract methods. Closes #58823	2020-09-02 10:42:19 +01:00
Tim Brooks	e573fa9abc	Add data.path fast path for FilePermission (#61302 ) The recursive data.path FilePermission check is an extremely hot codepath in Elasticsearch. Unfortunately the FilePermission check in Java is extremely allocation heavy. As it iterates through different file permissions, it allocates byte arrays for each Path component that must be compared. This PR improves the situation by adding the recursive data.path FilePermission in its own PermissionsCollection object which is checked first.	2020-09-01 12:03:22 -06:00
Rory Hunter	ff6c071275	Implement deprecation logging using log4j (#61629 ) Backport of #61474. Part of #46106. Simplify the implementation of deprecation logging by relying on log4j more completely, and implementing additional behaviour through custom appenders and filters.	2020-08-31 12:42:04 +01:00
Luca Cavanna	f769821bc8	Pass SearchLookup supplier through to fielddataBuilder (#61430 ) (#61638 ) Runtime fields need to have a SearchLookup available, when building their fielddata implementations, so that they can look up other fields, runtime or not. To achieve that, we add a Supplier<SearchLookup> argument to the existing MappedFieldType#fielddataBuilder method. As we introduce the ability to look up other fields while building fielddata for mapped fields, we implicitly add the ability for a field to require other fields. This requires some protection mechanism that detects dependency cycles to prevent stack overflow errors. With this commit we also introduce detection for cycles, as well as a limit on the depth of the references for a runtime field. Note that we also plan on introducing cycles detection at compile time, so the runtime cycles detection is a last resort to prevent stack overflow errors but we hope that we can reject runtime fields from being registered in the mappings when they create a cycle in their definition. Note that this commit does not introduce any production implementation of runtime fields, but is rather a pre-requisite to merge the runtime fields feature branch. This is a breaking change for MapperPlugins that plug in a mapper, as the signature of MappedFieldType#fielddataBuilder changes from taking a single argument (the index name), to also accept a Supplier<SearchLookup>. Relates to #59332 Co-authored-by: Nik Everett <nik9000@gmail.com>	2020-08-27 18:09:56 +02:00
David Turner	411965d392	Allow background cluster state update in tests (#61455 ) Today the `CoordinatorTests` run the publication process as a single atomic action; however in production it appears possible that another master may be elected, publish its state, then fail, then we win another election, all in between the time we sampled our previous cluster state and started to publish the one we first thought of. This violates the `assertClusterStateConsistency()` assertion that verifies the cluster state update event matches the states we actually published and applied. This commit adjusts the tests to run the publication process more asynchronously so as to allow time for this behaviour to occur. This should eventually result in a reproduction of the failure in #61437 that will let us analyse what's really going on there and help us fix it.	2020-08-27 11:22:58 +01:00
David Turner	e14d9c9514	Introduce cache index for searchable snapshots (#61595 ) If a searchable snapshot shard fails (e.g. its node leaves the cluster) we want to be able to start it up again on a different node as quickly as possible to avoid unnecessarily blocking or failing searches. It isn't feasible to fully restore such shards in an acceptably short time. In particular we would like to be able to deal with the `can_match` phase of a search ASAP so that we can skip unnecessary waiting on shards that may still be warming up but which are not required for the search. This commit solves this problem by introducing a system index that holds much of the data required to start a shard. Today(*) this means it holds the contents of every file with size <8kB, and the first 4kB of every other file in the shard. This system index acts as a second-level cache, behind the first-level node-local disk cache but in front of the blob store itself. Reading chunks from the index is slower than reading them directly from disk, but faster than reading them from the blob store, and is also replicated and accessible to all nodes in the cluster. (*) the exact heuristics for what we should put into the system index are still under investigation and may change in future. This second-level cache is populated when we attempt to read a chunk which is missing from both levels of cache and must therefore be read from the blob store. We also introduce `SearchableSnapshotsBlobStoreCacheIntegTests` which verify that we do not hit the blob store more than necessary when starting up a shard that we've seen before, whether due to a node restart or because a snapshot was mounted multiple times. Backport of #60522 Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>	2020-08-27 06:38:32 +01:00
Przemyslaw Gomulka	9f566644af	Do not create two loggers for DeprecationLogger backport(#58435 ) (#61530 ) DeprecationLogger's constructor should not create two loggers. It was taking parent logger instance, changing its name with a .deprecation prefix and creating a new logger. Most of the time parent logger was not needed. It was causing Log4j to unnecessarily cache the unused parent logger instance. depends on #61515 backports #58435	2020-08-26 16:04:02 +02:00
Nik Everett	87cf81e179	Migrate some more mapper test cases (#61507 ) (#61552 ) Migrate some more mapper test cases from `ESSingleNodeTestCase` to `MapperTestCase`.	2020-08-25 15:27:26 -04:00
Przemyslaw Gomulka	f3f7d25316	Header warning logging refactoring backport(#55941 ) (#61515 ) Splitting DeprecationLogger into two. HeaderWarningLogger - responsible for adding response warning headers and ThrottlingLogger - responsible for limiting the duplicated log entries for the same key (previously deprecateAndMaybeLog). Introducing a ThrottlingAndHeaderWarningLogger which is a base for other common logging usages where both response warning headers and logging throttling were needed. relates #55699 relates #52369 backports #55941	2020-08-25 16:35:54 +02:00
Armin Braun	f22ddf822e	Some Optimizations around BytesArray (#61183 ) (#61511 ) * Faster `equals` for `BytesArray` which is nice since with this change we use it for the search cache * Lighter `StreamInput` for `BytesArray` that should save memory and some indirection relative to the one on the abstract bytes reference * Lighter `writeTo` implementation * Build a `BytesArray` instead of a PagedBytesReference whenever possible to save indirection and memory	2020-08-25 07:13:39 +02:00
Benjamin Trent	6ffcc02fb9	Muting test o.e.t.t.ESTestCaseTests.testRandomDateFormatterPattern (#61497 )	2020-08-24 13:58:09 -04:00
Nik Everett	f3b6d49ae1	Migrate server mapper tests to new MapperTestCase (#61378 ) (#61490 ) This continues #61301, migrating all of the mappers in `server` to the new `MapperTestCase` which is nicer than `FieldMapperTestCase` because it doesn't depend on all of Elasticsearch.	2020-08-24 13:33:35 -04:00
Armin Braun	af2e2782eb	Stop Needlessly Copying Bytes in XContent Parsing (#61447 ) (#61469 ) Wrapping a `BytesArray` in a `StreamInput` for deserialization is inefficient. This forces Jackson to internally buffer (i.e. copy) all bytes from the `BytesArray` before deserializing, adding overhead for copying the bytes and managing the buffers. This commit fixes a number of spots where `BytesArray` is the most common type of `BytesReference` to special case this type and parse it more efficiently. Also improves parsing `String`s to use the more efficient direct `String` parsing APIs.	2020-08-24 15:49:15 +02:00
Armin Braun	22509c95f8	Fix Blackholed Connection Behavior in DisruptableMockTransport (#61310 ) (#61381 ) It is not realistic to drop messages without eventually failing. To retain the coverage of long pauses this PR adjusts the blackholed behavior to fail a send after 24h (which is assumed to be longer than any timeout in the system) instead of never. Closes #61034	2020-08-21 07:54:56 +02:00
Julie Tibshirani	997c73ec17	Correct how field retrieval handles multifields and copy_to. (#61391 ) Before when a value was copied to a field through a parent field or `copy_to`, we parsed it using the `FieldMapper` from the source field. Instead we should parse it using the target `FieldMapper`. This ensures that we apply the appropriate mapping type and options to the copied value. To implement the fix cleanly, this PR refactors the value parsing strategy. Now instead of looking up values directly, field mappers produce a helper object `ValueFetcher`. The value fetchers are responsible for almost all aspects of fetching, including looking up the right paths in the _source. The PR is fairly big but each commit can be reviewed individually. Fixes #61033.	2020-08-20 15:53:35 -07:00
Julie Tibshirani	85ad328df7	Ensure fetch fields aren't dropped when rewriting search. (#61390 ) Previously we didn't retain the requested fields when performing a shallow copy of the search source. This meant that when a search was rewritten, we could drop the requested fields and fail to return them in the response.	2020-08-20 14:58:58 -07:00
Alan Woodward	a3a0c63ccf	Convert NumberFieldMapper to parametrized form (#61092 ) (#61376 ) In addition, this commit converts ScaledFloatFieldMapper as it was relying on a number of static values taken from NumberFieldMapper that had changed or been removed.	2020-08-20 16:43:26 +01:00
Nik Everett	9789e6d154	Migrate some field mapper tests to ESTestCase (#61301 ) (#61346 ) This switches a few tests for field mappers from `ESSingleNodeTestCase` to `ESTestCase` because, in general, we prefer to avoid `ESSingleNodeTestCase` when we can because it is slow and "big". "Big" here means that it pulls in an entire node, making it difficult to reason about what you are testing.	2020-08-19 15:43:49 -04:00
Nik Everett	5e723c5cc2	Weaken random date formatter test assertion `ESTestCase#testRandomDateFormatterPattern` previously asserted that round tripping `millis -> text -> millis` wouldn't lose any precision. But some date formats don't include the time of day so, of course, this could lose precision. This replaces that with an assertion that `text -> millis -> text` doesn't lose precision. Which should be true for any sane date format. Really, we're just trying to make sure that the random date formats that we return are fairly sane.	2020-08-18 16:45:38 -04:00
Nik Everett	1b7bbafd81	Add method to make random DateFormatter pattern (backport of #60613 ) (#61213 ) Adds a method to make a random date `DateFormatter` pattern. We expect this'll be useful for runtime fields to compare their formatting with the standard date field.	2020-08-17 10:57:52 -04:00
David Turner	b21cb7f466	Reduce allocations when persisting cluster state (#61159 ) Today we allocate a new `byte[]` for each document written to the cluster state. Some of these documents may be quite large. We need a buffer that's at least as large as the largest document, but there's no need to use a fresh buffer for each document. With this commit we re-use the same `byte[]` much more, only allocating it afresh if we need a larger one, and using the buffer needed for one round of persistence as a hint for the size needed for the next one.	2020-08-17 13:45:31 +01:00
Jay Modi	f0128ae074	Canonicalize client name in krb5kdc-fixture (#61119 ) This commit changes the value for client name canonicalization to true in the krb5.conf template file. This is done as a means to workaround JDK-8246193 which has made it into some builds of JDK8. Closes #61050	2020-08-13 14:58:08 -06:00
Lee Hinman	e3df64a429	[7.x] Add data tiers (hot, warm, cold, frozen) as custom node roles (#60994 ) (#61045 ) This commit adds the `data_hot`, `data_warm`, `data_cold`, and `data_frozen` node roles to the x-pack plugin. These roles are intended to be the base for the formalization of data tiers in Elasticsearch. These roles all act as data nodes (meaning shards can be allocated to them). Nodes with the existing `data` role act as though they have all of the roles configured (it is a hot, warm, cold, and frozen node). This also includes a custom `AllocationDecider` that allows the user to configure the following settings on a cluster level: - `cluster.routing.allocation.require._tier` - `cluster.routing.allocation.include._tier` - `cluster.routing.allocation.exclude._tier` And in index settings: - `index.routing.allocation.require._tier` - `index.routing.allocation.include._tier` - `index.routing.allocation.exclude._tier` Relates to #60848	2020-08-12 11:06:23 -06:00
Yannick Welsch	25404cbe3d	Provide option to allow writes when master is down (#60605 ) Elasticsearch currently blocks writes by default when a master is unavailable. The cluster.no_master_block setting allows a user to change this behavior to also block reads when a master is unavailable. This PR introduces a way to still allow writes when a master is offline. Writes will continue to work as long as routing table changes are not needed (as those require the master for consistency), and as long as dynamic mapping updates are not required (as again, these require the master for consistency). Eventually we should switch the default of cluster.no_master_block to this new mode.	2020-08-12 16:56:45 +02:00
Armin Braun	3a046e125d	Speed up MockSinglePrioritizingExecutor (#61011 ) (#61012 ) Found this while checking if I can speed up SnapshotResiliencyTests to get more coverage/time. Turns out throwing a new instance here on every task was taking 9% of the CPU wall-time in those tests. With this change it's 4% of the overall.	2020-08-12 12:24:04 +02:00
Nik Everett	664ba0a80a	Fix the parent join aggregator test case (#60991 ) The test was putting parent and child documents into different segments which is unrealistic and was causing errors. Closes #60980	2020-08-11 17:53:15 -04:00
Armin Braun	3e2dfc6eac	Remove GCS Bucket Exists Check (#60899 ) (#60914 ) Same as https://github.com/elastic/elasticsearch/pull/43288 for GCS. We don't need to do the bucket exists check before using the repo, that just needlessly increases the necessary permissions for using the GCS repository.	2020-08-11 09:54:27 +02:00
Jim Ferenczi	f30f1f04e2	Replace AggregatorTestCase#search with AggregatorTestCase#searchAndReduce (#60816 ) This commit removes the ability to test the top level result of an aggregator before it runs the final reduce. All aggregator tests that use AggregatorTestCase#search are rewritten with AggregatorTestCase#searchAndReduce in order to ensure that we test the final output (the one sent to the end user) rather than an intermediary result that could be different. This change also removes spurious commits triggered on top of a random index writer. These commits slow down the tests and are redundant with the commits that the random index writer performs.	2020-08-10 17:23:00 +02:00
Andrei Dan	235e5ed3ea	[7.x] ILM: add force-merge step to searchable snapshots action (#60819 ) (#60882 ) This adds a force-merge step to the searchable snapshot action, enabled by default, but parameterizable using the `force_merge_index` optional boolean. eg. ``` PUT _ilm/policy/my_policy { "policy": { "phases": { "cold": { "actions": { "searchable_snapshot" : { "snapshot_repository" : "backing_repo", "force_merge_index": true } } } } } } ``` (cherry picked from commit d0a17b2d35f1b083b574246bdbf3e1929471a4a9) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-08-10 13:45:11 +01:00
Dan Hermann	9d96128c7e	Fix warning handler used in DataStreamsUpgradeIT (#59960 ) (#60682 )	2020-08-04 16:23:46 -05:00
Yannick Welsch	9e24a54382	Clean existing index folder when loading searchable snapshot (#60122 ) Closing a regular index and mounting a snapshot-backed index into that existing index does not clean the existing index folders of those preexisting shards. This PR removes the existing Lucene / translog files once the searchable snapshot shard is starting up. Future PRs will make reuse of the existing index files to populate the cache.	2020-08-03 13:19:11 +02:00
Armin Braun	204efe9387	Add Repository Setting to Disable Writing index.latest (#60448 ) (#60576 ) Writing the `index.latest` blob is unnecessary unless the contents of the repository are to be used as a URL-repository. Also, in some edge cases, the fact that `index.latest` is the only blob in the repository that regularly gets overwritten was causing compatibility issues with some backing blobstores (Azure no-overwrite policy, Hitachi S3 equivalent). => this commit changes behavior to make snapshots not fail if writing `index.latest` fails and adds a setting to disable writing `index.latest`.	2020-08-03 11:11:24 +02:00
Armin Braun	8c7eae15ba	Increase Timeout in testSnapshotRestore (#60532 ) (#60538 ) It seems this test only fails with `FsRepository` and mostly just barely times out (takes just a little over 30s to go green). I think just increasing the timeout should be fine as a fix here as it's a little interesting to check larger amounts of data in this test generally speaking. Closes #39299	2020-07-31 21:53:58 +02:00
Rene Groeschke	ed4b70190b	Replace immediate task creations by using task avoidance api (#60071 ) (#60504 ) - Replace immediate task creations by using task avoidance api - One step closer to #56610 - Still many tasks are created during configuration phase. Tackled in separate steps	2020-07-31 13:09:04 +02:00
Julie Tibshirani	dfd7f226f0	Clarify SourceLookup sharing across fetch subphases. (#60484 ) The `SourceLookup` class provides access to the _source for a particular document, specified through `SourceLookup#setSegmentAndDocument`. Previously the search context contained a single `SourceLookup` that was shared between different fetch subphases. It was hard to reason about its state: is `SourceLookup` set to the expected document? Is the _source already loaded and available? Instead of using a global source lookup, the fetch hit context now provides access to a lookup that is set to load from the hit document. This refactor closes #31000, since the same `SourceLookup` is no longer shared between the 'fetch _source phase' and script execution.	2020-07-30 13:22:31 -07:00
Mark Tozzi	970a0c8957	[7.x] Aggregation tests for Wildcard Field (#58507 ) (#60423 )	2020-07-30 08:56:21 -04:00
Julie Tibshirani	5359417ec3	Minor clean-up around search highlight context. (#60422 ) * Rename SearchContextHighlight -> SearchHighlightContext. * Rename HighlighterContext to FieldHighlightContext. * Make the search highlight context immutable. * Avoid storing SearchHighlightContext on HighlighterContext.	2020-07-29 11:39:17 -07:00
Julie Tibshirani	c7bfb5de41	Add search `fields` parameter to support high-level field retrieval. (#60258 ) This feature adds a new `fields` parameter to the search request, which consults both the document `_source` and the mappings to fetch fields in a consistent way. The PR merges the `field-retrieval` feature branch. Addresses #49028 and #55363.	2020-07-28 10:58:20 -07:00
Yannick Welsch	ffe114b890	Set specific keepalive options by default on supported platforms (#59278 ) keepalives tell any intermediate devices that the connection remains alive, which helps with overzealous firewalls that are killing idle connections. keepalives are enabled by default in Elasticsearch, but use system defaults for their configuration, which often do not have reasonable defaults (e.g. 7200s for TCP_KEEP_IDLE) in the context of distributed systems such as Elasticsearch. This PR sets the socket-level keep_alive options for network.tcp.{keep_idle,keep_interval} to 5 minutes on configurations that support it (>= Java 11 & (MacOS || Linux)) and where the system defaults are set to something higher than 5 minutes. This helps keep the connections alive while not interfering with system defaults or user-specified settings unless they are deemed to be set too high by providing better out-of-the-box defaults.	2020-07-28 11:10:04 +02:00
David Turner	bf7e53a91e	Remove node-level canAllocate override (#59389 ) Today there is a node-level `canAllocate` override which the balancer uses to ignore certain nodes to which it is certain no more shards can be allocated. In fact this override only ignores nodes which have hit the rarely-used `cluster.routing.allocation.total_shards_per_node` limit, so this optimization doesn't have a meaningful impact on real clusters. This commit removes this unnecessary fast path from the balancer, and also removes all the machinery needed to support it.	2020-07-23 08:48:59 +01:00
Jay Modi	c8ef2e18f7	Thread safe clean up of LocalNodeModeListeners (#60007 ) This commit continues on the work in #59801 and makes other implementors of the LocalNodeMasterListener interface thread safe in that they will no longer allow the callbacks to run on different threads and possibly race each other. This also helps address other issues where these events could be queued to wait for execution while the service keeps moving forward thinking it is the master even when that is not the case. In order to accomplish this, the LocalNodeMasterListener no longer has the executorName() method to prevent future uses that could encounter this surprising behavior. Each use was inspected and if the class was also a ClusterStateListener, the implementation of LocalNodeMasterListener was removed in favor of a single listener that combined the logic. A single listener is used and there is currently no guarantee on execution order between ClusterStateListeners and LocalNodeMasterListeners, so a future change there could cause undesired consequences. For other classes, the implementations of the callbacks were inspected and if the operations were lightweight, the overridden executorName method was removed to use the default, which runs on the same thread. Backport of #59932	2020-07-22 08:02:18 -06:00
Nik Everett	49f365ddfd	Fix bug in deep pipeline agg serialization (#59984 ) In #54716 I removed pipeline aggregators from the aggregation result tree and caused us to read them from the request. This saves a bunch of round trip bytes, which is neat. But there was a bug in the backwards compatibility logic. You see, we still have to give the pipeline aggregations to nodes older than 7.8 over the wire because that is how they know what pipelines to run. They have the pipelines in the request but they don't read them. They use the ones in the response tree. Anyway, we had a bug where we were never sending pipelines defined two levels down. So while you are upgrading the pipeline wouldn't run. Sometimes. If the data node of the "first" result was post-7.8 and the coordinating node was pre-7.8. This fixes the bug.	2020-07-21 16:03:15 -04:00
Nik Everett	6f6076e208	Drop some params from IndexFieldData.Builder (backport of #59934 ) (#59972 ) We never used the `IndexSettings` parameter and we only used the `MappedFieldType` parameter to get the name of the field which we already know everywhere where we build the `IFD.Builder`. This allows us to drop a fair bit of ceremony from a couple of tests.	2020-07-21 10:28:59 -04:00
Armin Braun	cefaa17c52	Simplify CheckSumBlobStoreFormat and make it more Reusable (#59888 ) (#59950 ) Refactored `CheckSumBlobStoreFormat` so it can more easily be reused in other functionality (i.e. upcoming repair logic). Simplified away constant `failIfAlreadyExists` parameter and removed the atomic write method and its tests. The atomic write method was only used in a single spot and that spot has now been adjusted to work the same way writing root level metadata works.	2020-07-21 11:20:56 +02:00
Alan Woodward	b29d368b52	Convert DateFieldMapper to parametrized format (#59429 ) (#59759 ) This commit makes DateFieldMapper extend ParametrizedFieldMapper, declaring its parameters explicitly. As well as changes to DateFieldMapper itself, there are some changes to dynamic mapping code to ensure that dynamically detected date formats are passed through to new date mapper builders.	2020-07-17 12:46:18 +01:00
Martijn van Groningen	0096238df1	Replaced _data_stream_timestamp meta field's 'path' option with 'enabled' option (#59727 ) Backport #59503 to 7.x and adjusted exception messages. Relates to #59076	2020-07-16 22:29:40 +02:00
Martijn van Groningen	4089cbd767	Ignore multiple matching templates warning in specific tests. (#59692 ) (#59715 ) Closes #59679	2020-07-16 20:07:38 +02:00
Martijn van Groningen	2a89e13e43	Move data stream transport and rest action to xpack (#59593 ) Backport of #59525 to 7.x branch. * Actions are moved to xpack core. * Transport and rest actions are moved to the data-streams module. * Removed data streams methods from Client interface. * Adjusted tests to use client.execute(...) instead of data stream specific methods. * Only attempt to delete all data streams if xpack is installed in rest tests. * Now that ds apis are in xpack and ESIntegTestCase no longer deletes all ds, do that in the MlNativeIntegTestCase class for ml tests.	2020-07-15 16:50:44 +02:00
Armin Braun	2dd086445c	Enable Fully Concurrent Snapshot Operations (#56911 ) (#59578 ) Enables fully concurrent snapshot operations: * Snapshot create- and delete operations can be started in any order * Delete operations wait for snapshot finalization to finish, are batched as much as possible to improve efficiency and once enqueued in the cluster state prevent new snapshots from starting on data nodes until executed * We could be even more concurrent here in a follow-up by interleaving deletes and snapshots on a per-shard level. I decided not to do this for now since it seemed not worth the added complexity yet. Due to batching+deduplicating of deletes the pain of having a delete stuck behind a long-running snapshot seemed manageable (dropped client connections + resulting retries don't cause issues due to deduplication of delete jobs, batching of deletes allows enqueuing more and more deletes even if a snapshot blocks for a long time that will all be executed in essentially constant time (due to bulk snapshot deletion, deleting multiple snapshots is mostly about as fast as deleting a single one)) * Snapshot creation is completely concurrent across shards, but per shard snapshots are linearized for each repository as are snapshot finalizations See updated JavaDoc and added test cases for more details and illustration on the functionality. Some notes: The queuing of snapshot finalizations and deletes and the related locking/synchronization is a little awkward in this version but can be much simplified with some refactoring. The problem is that snapshot finalizations resolve their listeners on the `SNAPSHOT` pool while deletes resolve the listener on the master update thread. With some refactoring both of these could be moved to the master update thread, effectively removing the need for any synchronization around the `SnapshotService` state. I didn't do this refactoring here because it's a fairly large change and not necessary for the functionality but plan to do so in a follow-up. This change allows for completely removing any trickery around synchronizing deletes and snapshots from SLM and 100% does away with SLM errors from collisions between deletes and snapshots. Snapshotting a single index in parallel to a long running full backup will execute without having to wait for the long running backup as required by the ILM/SLM use case of moving indices to "snapshot tier". Finalizations are linearized but ordered according to which snapshot saw all of its shards complete first	2020-07-15 03:42:31 +02:00
Armin Braun	06d94cbb2a	Fix TODO about Spurious FAILED Snapshots (#58994 ) (#59576 ) There is no point in writing out snapshots that contain no data that can be restored whatsoever. It may have made sense to do so in the past when there was an `INIT` snapshot step that wrote data to the repository that would've otherwise become unreferenced, but in the current day state machine without the `INIT` step there is no point in doing so.	2020-07-15 00:54:30 +02:00
Armin Braun	e1014038e9	Simplify Repository.finalizeSnapshot Signature (#58834 ) (#59574 ) Many of the parameters we pass into this method were only used to build the `SnapshotInfo` instance to write. This change simplifies the signature. Also, it seems less error prone to build `SnapshotInfo` in `SnapshotsService` instead of relying on the fact that each repository implementation will build the correct `SnapshotInfo`.	2020-07-15 00:14:28 +02:00
Martijn van Groningen	35ae3d19db	Remove data stream feature flag (#59572 ) so that it can be used in the next minor release (7.9.0). Backport of #59504 to 7.x branch. Closes #53100	2020-07-14 23:50:41 +02:00
Armin Braun	d456f7870a	Deduplicate Index Metadata in BlobStore (#50278 ) (#59514 ) This PR introduces two new fields into `RepositoryData` (index-N) to track the blob name of `IndexMetaData` blobs and their content via setting generations and uuids. This is used to deduplicate the `IndexMetaData` blobs (`meta-{uuid}.dat` in the indices folders under `/indices`) so that new metadata for an index is only written to the repository during a snapshot if that same metadata can't be found in another snapshot. This saves one write per index in the common case of unchanged metadata, thus saving cost and making snapshot finalization drastically faster if many indices are being snapshotted at the same time. The implementation is mostly analogous to that for shard generations in #46250 and piggy backs on the BwC mechanism introduced in that PR (which means this PR needs adjustments if it doesn't go into `7.6`). Relates to #45736 as it improves the efficiency of snapshotting unchanged indices. Relates to #49800 as it has the potential to make loading the index metadata for multiple snapshots of the same index concurrently much more efficient, speeding up future concurrent snapshot deletes	2020-07-14 22:18:42 +02:00
Tim Brooks	408a07f96a	Separate coordinating and primary bytes in stats (#59487 ) Currently we combine coordinating and primary bytes into a single bucket for indexing pressure stats. This makes sense for rejection logic. However, for metrics it would be useful to separate them.	2020-07-14 12:37:06 -06:00
Tim Brooks	623df95a32	Adding indexing pressure stats to node stats API (#59467 ) We have recently added internal metrics to monitor the amount of indexing occurring on a node. These metrics introduce back pressure to indexing when memory utilization is too high. This commit exposes these stats through the node stats API.	2020-07-13 17:23:42 -06:00
Mark Vieira	dc7d4c615c	Ensure fixture runtime dependencies are built before starting containers (#59474 )	2020-07-13 15:58:01 -07:00
Armin Braun	64c5f70a2d	Remove Needless Context Switches on Loading RepositoryData (#56935 ) (#59452 ) We don't need to switch to the generic or snapshot pool for loading cached repository data (i.e. most of the time in normal operation). This makes `executeConsistentStateUpdate` less heavy if it has to retry and lowers the chance of having to retry in the first place. Also, this change allowed simplifying a few other spots in the codebase where we would fork off to another pool just to load repository data.	2020-07-13 21:38:29 +02:00
Martijn van Groningen	b1b7bf3912	Make data streams a basic licensed feature. (#59392 ) Backport of #59293 to 7.x branch. * Create new data-stream xpack module. * Move TimestampFieldMapper to the new module, this results in storing a composable index template with data stream definition only to work with default distribution. This way data streams can only be used with default distribution, since a data stream can currently only be created if a matching composable index template exists with a data stream definition. * Renamed `_timestamp` meta field mapper to `_data_stream_timestamp` meta field mapper. * Add logic to put composable index template api to fail if `_data_stream_timestamp` meta field mapper isn't registered. So that a more understandable error is returned when attempting to store a template with data stream definition via the oss distribution. In a follow up the data stream transport and rest actions can be moved to the xpack data-stream module.	2020-07-13 17:26:46 +02:00
Alan Woodward	f4caadd239	MappedFieldType no longer requires equals/hashCode/clone (#59212 ) With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer have any cause to compare MappedFieldType instances. This means that we can remove all equals and hashCode implementations, and in addition we no longer need the clone implementations which were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes, which will be particularly useful for the runtime fields project.	2020-07-09 21:05:10 +01:00
Przemko Robakowski	c870d6e570	[7.x] Restart tests with data streams (#58330 ) (#59303 ) * Restart tests with data streams (#58330)	2020-07-09 17:52:20 +02:00
Lee Hinman	bb1c53a0f5	Allow warnings about 'global' template in upgrade tests (#59242 ) These tests sometimes install a template so they can be compatible with older versions, but they run afoul of the occasionally installed "global" template which changes the default number of shards. This commit adds `allowedWarnings` and allows these warnings to be present, but doesn't fail if they are not (since the global template is only randomly installed). Resolves #58807 Resolves #58258	2020-07-08 13:40:55 -06:00
Nhat Nguyen	e50a0330ec	Remove random of recovery chunk size setting The recovery chunk size setting was injected in #58018, but too aggressively and broke several tests. This change removes that random injection. Relates #58018	2020-07-08 15:29:37 -04:00
Martijn van Groningen	17bd559253	Fix the timestamp field of a data stream to @timestamp (#59210 ) Backport of #59076 to 7.x branch. The commit makes the following changes: * The timestamp field of a data stream definition in a composable index template can only be set to '@timestamp'. * Removed custom data stream timestamp field validation and reuse the validation from `TimestampFieldMapper` and instead only check that the _timestamp field mapping has been defined on a backing index of a data stream. * Moved code that injects _timestamp meta field mapping from `MetadataCreateIndexService#applyCreateIndexRequestWithV2Template58956(...)` method to `MetadataIndexTemplateService#collectMappings(...)` method. * Fixed a bug (#58956) that causes timestamp field validation to be performed for each template instead of for the final mappings that are created. * Only apply _timestamp meta field if index is created as part of a data stream or data stream rollover; this fixes a docs test, where a regular index creation matches (logs-*) with a template with a data stream definition. Relates to #58642 Relates to #53100 Closes #58956 Closes #58583	2020-07-08 17:30:46 +02:00
Armin Braun	c66b80b9fa	Disable WindowsFS in MockAPITests (#59163 ) (#59214 ) Turns out these tests sometimes run very slow on `WindowsFS` as well so disabling it here. Closes #59133	2020-07-08 14:47:40 +02:00
Nik Everett	a29d3515a2	Improve cardinality measure used to build aggs (#56533 ) (#59107 ) This makes a `parentCardinality` available to every `Aggregator`'s ctor so it can make intelligent choices about how it collects bucket values. This replaces `collectsFromSingleBucket` and is similar to it but: 1. It supports `NONE`, `ONE`, and `MANY` values and is generally extensible if we decide we can use more precise counts. 2. It is more accurate. `collectsFromSingleBucket` assumed that all sub-aggregations live under multi-bucket aggregations. This is normally true but `parentCardinality` is properly carried forward for single bucket aggregations like `filter` and for multi-bucket aggregations configured in single-bucket form, like `range` with a single range. While I was touching every aggregation I renamed `doCreateInternal` to `createMapped` because that seemed like a much better name and it was right there, next to the change I was already making. Relates to #56487 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-07-08 08:42:23 -04:00
Armin Braun	9268b25789	Add Check for Metadata Existence in BlobStoreRepository (#59141 ) (#59216 ) In order to ensure that we do not write a broken piece of `RepositoryData` because the physical repository generation was moved ahead more than one step by erroneous concurrent writing to a repository, we must check whether or not the current assumed repository generation exists in the repository physically. Without this check we run the risk of writing on top of stale cached repository data. Relates #56911	2020-07-08 14:25:01 +02:00
Nhat Nguyen	ef5c397c0f	Sending operations concurrently in peer recovery (#58018 ) Today, we send operations in phase2 of peer recoveries batch by batch sequentially. Normally that's okay as we should have a fairly small number of operations in phase 2 due to the file-based threshold. However, if phase1 takes a lot of time and we are actively indexing, then phase2 can have a lot of operations to replay. With this change, we will send multiple batches concurrently (defaults to 1) to reduce the recovery time. Backport of #58018	2020-07-07 22:03:31 -04:00
David Turner	46c8d00852	Remove nodes with read-only filesystems (#52680 ) (#59138 ) Today we do not allow a node to start if its filesystem is readonly, but it is possible for a filesystem to become readonly while the node is running. We don't currently have any infrastructure in place to make sure that Elasticsearch behaves well if this happens. A node that cannot write to disk may be poisonous to the rest of the cluster. With this commit we periodically verify that nodes' filesystems are writable. If a node fails these writability checks then it is removed from the cluster and prevented from re-joining until the checks start passing again. Closes #45286 Co-authored-by: Bukhtawar Khan <bukhtawar7152@gmail.com>	2020-07-07 14:00:02 +01:00
Armin Braun	d6d6df16bb	Share IT Infrastructure between Core Snapshot and SLM ITs (#59082 ) (#59119 ) For #58994 it would be useful to be able to share test infrastructure. This PR shares `AbstractSnapshotIntegTestCase` for that purpose, dries up SLM tests accordingly and adds a shared and efficient (compared to the previous implementations) way of waiting for no running snapshot operations to the test infrastructure to dry things up further.	2020-07-07 12:04:41 +02:00
Dan Hermann	550dcb0ca6	[7.x] Delete data stream API accepts multiple names (#59064 )	2020-07-06 08:06:10 -05:00
Armin Braun	49857cc35d	Dry up Master Disconnect Disruption Tests (#58953 ) (#59050 ) Dry up tests that use a disruption that isolates the master from all other nodes. Also, turn disruption types that have neither parameters nor state into constants to make things a little clearer.	2020-07-06 11:04:24 +02:00
Armin Braun	071d8b2c1c	Deduplicate Empty InternalAggregations (#58386 ) (#59032 ) Working through a heap dump for an unrelated issue I found that we can easily rack up tens of MBs of duplicate empty instances in some cases. I moved to a static constructor to guard against that in all cases.	2020-07-04 14:02:16 +02:00
David Kyle	f6a0c2c59d	[7.x] Pipeline Inference Aggregation (#58965 ) Adds a pipeline aggregation that loads a model and performs inference on the input aggregation results.	2020-07-03 09:29:04 +01:00
Tim Brooks	dc9e364ff2	Count coordinating and primary bytes as write bytes (#58984 ) This is a follow-up to #57573. This commit combines coordinating and primary bytes under the same "write" bucket. Double accounting is prevented by only accounting the bytes at either the reroute phase or the primary phase. TransportBulkAction calls execute directly, so the operations handler is skipped and the bytes are not double accounted.	2020-07-02 19:48:19 -06:00
Tim Brooks	9d1bf383d0	Add test assertions to ensure write bytes released (#58970 ) This is a follow-up to #57573. This commit ensures that the bytes marked in WriteMemoryLimits are released by any test using an internal test cluster.	2020-07-02 17:38:23 -06:00
Tim Brooks	1ef2cd7f1a	Add memory tracking to queued write operations (#58957 ) Currently we do not track the memory consumed by in-process write operations. This commit adds a mechanism to track write operation memory usage.	2020-07-02 14:14:57 -06:00
Ryan Ernst	d825d4352c	Eagerly compile condition script at processor creation (#58882 ) Ingest script processors were changed to eagerly compile their scripts when the ingest pipeline is saved, but conditional scripts were missed. This commit adds eager compilation to ingest conditional scripts, which will help surface errors before runtime, as well as adds tests for each case we might encounter between inline and stored script compilation failures. closes #58864	2020-07-02 11:10:20 -07:00
Lee Hinman	d3d03fc1c6	[7.x] Add default composable templates for new indexing strategy (#57629 ) (#58757 ) Backports the following commits to 7.x: Add default composable templates for new indexing strategy (#57629)	2020-07-01 09:32:32 -06:00
Alan Woodward	3ba16e0f39	Move MappedFieldType#getSearchAnalyzer and #getSearchQuoteAnalyzer to TextSearchInfo (#58830 ) Analyzers are specific to text searching, and so should be in TextSearchInfo rather than on the generic MappedFieldType. Backport of #58639	2020-07-01 14:52:14 +01:00

2683 Commits