OpenSearch

Commit Graph

Author	SHA1	Message	Date
David Turner	bf7e53a91e	Remove node-level canAllocate override (#59389 ) Today there is a node-level `canAllocate` override which the balancer uses to ignore certain nodes to which it is certain no more shards can be allocated. In fact this override only ignores nodes which have hit the rarely-used `cluster.routing.allocation.total_shards_per_node` limit, so this optimization doesn't have a meaningful impact on real clusters. This commit removes this unnecessary fast path from the balancer, and also removes all the machinery needed to support it.	2020-07-23 08:48:59 +01:00
Armin Braun	43a6ff5eb1	Optimize some Spots around Closing Resources (#60049 ) (#60096 ) The single element `close` calls go through a very inefficient path that includes creating a one element list. `releaseOnce` is only called with a single non-null input in production in two spots, so there is no need for varargs or any complexity here. `ReleasableBytesStreamOutput` does not require any `releaseOnce` wrapping because that kind of logic is already implemented in `org.elasticsearch.common.util.AbstractArray` (which we were wrapping here).	2020-07-23 08:49:06 +02:00
Julie Tibshirani	aa57bbd422	Consolidate validation for 'docvalue_fields'. (#60065 ) This improves modularity and also fixes some issues when `docvalue_fields` is used within `inner_hits` or the `top_hits` agg: * We previously didn't resolve wildcards in field names. * We also forgot to enforce the limit `index.max_docvalue_fields_search`.	2020-07-22 17:26:58 -07:00
Armin Braun	ebb6677815	Formalize and Streamline Buffer Sizes used by Repositories (#59771 ) (#60051 ) Due to complicated access checks (reads and writes execute in their own access context) on some repositories (GCS, Azure, HDFS), using a hard coded buffer size of 4k for restores was needlessly inefficient. By the same token, the use of stream copying with the default 8k buffer size for blob writes was inefficient as well. We also had dedicated, undocumented buffer size settings for HDFS and FS repositories. For these two we would use a 100k buffer by default. We did not have such a setting for e.g. GCS though, which would only use an 8k read buffer which is needlessly small for reading from a raw `URLConnection`. This commit adds an undocumented setting that sets the default buffer size to `128k` for all repositories. It removes wasteful allocation of such a large buffer for small writes and reads in case of HDFS and FS repositories (i.e. still using the smaller buffer to write metadata) but uses a large buffer for doing restores and uploading segment blobs. This should speed up Azure and GCS restores and snapshots in a non-trivial way as well as save some memory when reading small blobs on FS and HDFS repositories.	2020-07-22 21:06:31 +02:00
Tim Brooks	ba01540d7e	Implement human readable indexing pressure stats (#60058 ) The indexing pressure stats do not currently have human readable variants. This commit adds human readable variants and updates the documentation.	2020-07-22 12:07:59 -06:00
Jay Modi	c8ef2e18f7	Thread safe clean up of LocalNodeModeListeners (#60007 ) This commit continues on the work in #59801 and makes other implementors of the LocalNodeMasterListener interface thread safe in that they will no longer allow the callbacks to run on different threads and possibly race each other. This also helps address other issues where these events could be queued to wait for execution while the service keeps moving forward thinking it is the master even when that is not the case. In order to accomplish this, the LocalNodeMasterListener no longer has the executorName() method to prevent future uses that could encounter this surprising behavior. Each use was inspected and if the class was also a ClusterStateListener, the implementation of LocalNodeMasterListener was removed in favor of a single listener that combined the logic. A single listener is used and there is currently no guarantee on execution order between ClusterStateListeners and LocalNodeMasterListeners, so a future change there could cause undesired consequences. For other classes, the implementations of the callbacks were inspected and if the operations were lightweight, the overridden executorName method was removed to use the default, which runs on the same thread. Backport of #59932	2020-07-22 08:02:18 -06:00
Luca Cavanna	702c997819	ParametrizedFieldMapper to run validators against default value (#60042 ) Sometimes there is the need to make a field required in the mappings, and validate that a value has been provided for it. This can be done through a validator when using ParametrizedFieldMapper, but validators need to run also when a value for a field has not been specified. Relates to #59332	2020-07-22 14:12:38 +02:00
Armin Braun	c06c9fb966	Fix BwC Snapshot INIT Path (#60006 ) There were two subtle bugs here from backporting #56911 to 7.x. 1. We passed `null` for the `shards` map which isn't nullable any longer when creating `SnapshotsInProgress.Entry`, fixed by just passing an empty map like the `null` handling did in the past. 2. The removal of a failed `INIT` state snapshot from the cluster state tried removing it from the finalization loop (the set of repository names that are currently finalizing). This will trip an assertion since the snapshot failed before its repository was put into the set. I made the logic ignore the set in case we remove a failed `INIT` state snapshot to restore the old logic to exactly as it was before the concurrent snapshots backport to be on the safe side here. Also, added tests that explicitly call the old code paths because as can be seen from initially missing this, the BwC tests will only run in the configuration new version master, old version nodes every so often and having a deterministic test for the old state machine seems the safest bet here. Closes #59986	2020-07-22 10:09:55 +02:00
Jake Landis	55216dabb4	[7.x] Per processor description for verbose simulate (#58207 ) (#60008 ) For ingest node processors a per processor description was recently added. This commit displays that description in the verbose output of the pipeline simulation. related #57906	2020-07-21 17:32:45 -05:00
Nik Everett	49f365ddfd	Fix bug in deep pipeline agg serialization (#59984 ) In #54716 I removed pipeline aggregators from the aggregation result tree and caused us to read them from the request. This saves a bunch of round trip bytes, which is neat. But there was a bug in the backwards compatibility logic. You see, we still have to give the pipeline aggregations to nodes older than 7.8 over the wire because that is how they know what pipelines to run. They have the pipelines in the request but they don't read them. They use the ones in the response tree. Anyway, we had a bug where we were never sending pipelines defined two levels down. So while you are upgrading the pipeline wouldn't run. Sometimes. If the data node of the "first" result was post-7.8 and the coordinating node was pre-7.8. This fixes the bug.	2020-07-21 16:03:15 -04:00
David Turner	dde568caf7	Fix scheduling of ClusterInfoService#refresh (#59880 ) Today the `InternalClusterInfoService` uses the `LocalNodeMasterListener` interface to start/stop its operations. Since the `onMaster` and `offMaster` methods are called on the `MANAGEMENT` threadpool, there's no guarantee that they run in the correct sequence, which could result in an elected master failing to regularly update the cluster info. Since this service is also a `ClusterStateListener` we may as well drop the usage of the `LocalNodeMasterListener` interface and simply update the status of the local node on the applier thread in `clusterChanged` to ensure consistency. Additionally, today the `InternalClusterInfoService` uses a simple flag to track whether the local node is the elected master or not. If the node stops being the master and then starts again within a few seconds then the scheduled updates from the old mastership might carry on running in addition to the ones for the new mastership. This commit addresses that by tracking the identity of the scheduled update job and creating a new job for each mastership.	2020-07-21 17:14:49 +01:00
Alan Woodward	a0ad1a196b	Wrap up building parametrized TypeParsers (#59977 ) The TypeParser implementations of all ParametrizedFieldMapper descendant classes are essentially the same - stateless, requiring the construction of a Builder object, and calling parse on it before returning it. We can make this easier (and less error-prone) to implement by wrapping the logic up into a final class, which takes a function to produce the Builder from a name and parser context.	2020-07-21 16:00:11 +01:00
Nik Everett	6f6076e208	Drop some params from IndexFieldData.Builder (backport of #59934 ) (#59972 ) We never used the `IndexSettings` parameter and we only used the `MappedFieldType` parameter to get the name of the field which we already know everywhere where we build the `IFD.Builder`. This allows us to drop a fair bit of ceremony from a couple of tests.	2020-07-21 10:28:59 -04:00
Luca Cavanna	5e17f00ecf	Tweak toXContent implementation of ParametrizedFieldMapper (#59968 ) ParametrizedFieldMapper overrides `toXContent` from `FieldMapper`, yet it could override `doXContentBody` and rely on the `toXContent` from the base class. Additionally, this allows making `doXContentBody` final. `toXContent` is still overridden, but only to make it final.	2020-07-21 16:01:51 +02:00
Przemyslaw Gomulka	19fe3e511f	Deprecate camel case date format backport(#59555 ) (#59948 ) Camel case date formats are deprecated and snake case should be used instead. backports #59555	2020-07-21 15:56:44 +02:00
Armin Braun	e37bfe8a5f	Stop Checking if Segment Data Blob Exists before Write (#59905 ) (#59971 ) With uuid named segment data blobs there is no reason to ensure no overwrites are happening for these blobs when writing. On the contrary, at least on Azure this check can conflict with the SDK's retrying and cause upload failures randomly.	2020-07-21 15:23:42 +02:00
Yannick Welsch	07784a0b16	CCR recoveries using wrong setting for chunk sizes (#59597 ) The default chunk size for CCR file-based recoveries was wrongly set to 40MB instead of 1MB.	2020-07-21 13:56:06 +02:00
Armin Braun	cefaa17c52	Simplify CheckSumBlobStoreFormat and make it more Reusable (#59888 ) (#59950 ) Refactored `CheckSumBlobStoreFormat` so it can more easily be reused in other functionality (i.e. upcoming repair logic). Simplified away constant `failIfAlreadyExists` parameter and removed the atomic write method and its tests. The atomic write method was only used in a single spot and that spot has now been adjusted to work the same way writing root level metadata works.	2020-07-21 11:20:56 +02:00
Armin Braun	5b92596fad	Cleanup and Optimize Multiple Serialization Spots (#59626 ) (#59936 ) Follow up to #59606 using some of the new infrastructure and making similar cleanups (and due to at times better handling of size hints and empty collections also optimizations in the stream utility methods this also means speedups) in various spots in the core codebase.	2020-07-21 10:06:56 +02:00
Julie Tibshirani	8647872a1e	Simplify structure for parsing points. (#59938 ) Previously we constructed a GeometryFormat object and delegated point parsing to it. This wasn't a good fit conceptually because each GeometryFormat instance didn't represent a distinct point format.	2020-07-20 17:11:43 -07:00
Nik Everett	b2ca19484a	Allocate slightly less per bucket (#59740 ) (#59873 ) This replaces the data structure that we use to resolve bucket ids in bucketing aggs that are inside other bucketing aggs. This replaces the "legoed together" data structure with a purpose built `LongLongHash` with semantics similar to `LongHash`, except that it has two `long`s as keys instead of one. The microbenchmarks show a fairly substantial performance gain on the hot path, around 30%. Rally's higher level benchmarks show anywhere from 0 to 7% speed improvements. Not as much as I'd hoped, but nothing to sneeze at. And, after all, we are allocating slightly less data per owningBucketOrd, which is always nice.	2020-07-20 10:43:11 -04:00
Stéphane Campinas	bcebdfe5b1	fix handling of alias filter in SearchService#canMatch (#59368 ) The check against the alias filter should be done after the request is rewritten. Close #59367	2020-07-20 16:25:15 +02:00
David Turner	b75207a09f	Remove sporadic min/max usage estimates from stats (#59755 ) Today `GET _nodes/stats/fs` includes `{least,most}_usage_estimate` fields for some nodes. These fields have rather strange semantics. They are only reported on the elected master and on nodes that have been the elected master since they were last restarted; when a node stops being the elected master these stats remain in place but we stop updating them so they may become arbitrarily stale. This means that these statistics are pretty meaningless and impossible to use correctly. Even if they were kept up to date they're never reported for data-only nodes anyway, despite the fact that data nodes are the ones where we care most about disk usage. The information needed to compute the path with the least/most available space is already provided in the rest of the stats output, so we can treat the inclusion of these stats as a bug and fix it by simply removing them in this commit. Since these stats were always optional and mostly omitted (for opaque reasons) this is not considered a breaking change.	2020-07-20 15:22:04 +01:00
Lee Hinman	8c7d414a3b	[7.x] Fix retrieving data stream stats for a DS with multiple backing indices (#59806 ) (#59810 ) Backports the following commits to 7.x: Fix retrieving data stream stats for a DS with multiple backing indices (#59806)	2020-07-17 16:56:07 -06:00
Nik Everett	514b2f3414	Clean up a few of vwh's rough edges (#59341 ) (#59807 ) This cleans up a few rough edges in the `variable_width_histogram`, mostly found by @wwang500: 1. Setting its tuning parameters in an unexpected order could cause the request to fail. 2. We checked that the maximum number of buckets was both less than 50000 and MAX_BUCKETS. This drops the 50000. 3. Fixes a divide by 0 that can occur if the `shard_size` is 1. 4. Fixes a divide by 0 that can occur if the `shard_size * 3` overflows a signed int. 5. Requires `shard_size * 3 / 4` to be at least `buckets`. If it is less than `buckets` we will very consistently return fewer buckets than requested. For the most part we expect folks to leave it at the default. If they change it, we expect it to be much bigger than `buckets`. 6. Allocate a smaller `mergeMap` when initially bucketing requests that don't use the entire `shard_size * 3 / 4`. It's just a waste. 7. Default `shard_size` to `10 * buckets` rather than `100`. It looks like that was our intention the whole time. And it feels like it'd keep the algorithm humming along more smoothly. 8. Default the `initial_buffer` to `min(10 * shard_size, 50000)` like we've documented it rather than `5000`. Like the point above, this feels like the right thing to do to keep the algorithm happy. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-07-17 15:16:09 -04:00
Lee Hinman	f6b08a3115	[7.x] Allow simulating existing composable index template (#59733 ) (#59798 ) Backports the following commits to 7.x: Allow simulating existing composable index template (#59733)	2020-07-17 13:10:07 -06:00
Nik Everett	95e6e4a452	Small cleanup for IndexFieldData (#59724 ) (#59800 ) This drops `IndexComponent` from `IndexFieldData` because it wasn't doing anything other than forcing us to perform a bunch of ceremony to build them.	2020-07-17 13:38:15 -04:00
Tal Levy	c9ab7bb651	Fix bug in circuit-breaker check for geoshape grid aggregations (#57962 ) (#59741 ) There was a bug in the geoshape circuit-breaker check where the hash values array was being allocated before its new size was accounted for by the circuit breaker. Fixes #57847.	2020-07-17 09:26:00 -07:00
Christoph Büscher	f4ff5fe93b	Add `zero_terms_query` support to `match_phrase_prefix` (#58822 ) (#59784 ) Currently `match_phrase_prefix` doesn't support `zero_terms_query` like the other match-type queries. This change adds this support. Closes #58468	2020-07-17 17:23:23 +02:00
Benjamin Trent	b7f30fc929	[7.x] Adding new `require_alias` option to indexing requests (#58917 ) (#59769 ) * Adding new `require_alias` option to indexing requests (#58917) This commit adds the `require_alias` flag to requests that create new documents. This flag, when `true` prevents the request from automatically creating an index. Instead, the destination of the request MUST be an alias. When the flag is not set, or `false`, the behavior defaults to the `action.auto_create_index` settings. This is useful when an alias is required instead of a concrete index. closes https://github.com/elastic/elasticsearch/issues/55267	2020-07-17 10:24:58 -04:00
Alan Woodward	65f6fb8e94	Shortcut mapping update if the incoming mapping version is the same as the current mapping version (#59517 ) (#59772 ) Currently, when we apply a cluster state change to a shard on a non-master node, we check to see if the mappings need to be updated by comparing the decompressed serialized mappings from the update against the serialized version of the shard's existing mappings. However, we already have a much simpler way of checking this, by comparing mapping versions on the index metadata of the old and new states. This commit adds a shortcut to MapperService.updateMappings() that compares these mapping versions, and ignores the merge if they are equal.	2020-07-17 14:53:09 +01:00
Alan Woodward	b29d368b52	Convert DateFieldMapper to parametrized format (#59429 ) (#59759 ) This commit makes DateFieldMapper extend ParametrizedFieldMapper, declaring its parameters explicitly. As well as changes to DateFieldMapper itself, there are some changes to dynamic mapping code to ensure that dynamically detected date formats are passed through to new date mapper builders.	2020-07-17 12:46:18 +01:00
Przemko Robakowski	790fbbcd87	[7.x] Fix handling of final pipelines when destination is changed (#59522 ) (#59746 ) * Fix handling of final pipelines when destination is changed (#59522) This change fixes the handling of final pipelines if the destination index is changed during a pipeline run: * final pipelines can't change the destination anymore; an exception is thrown if they try to * if a request/default pipeline changes the destination, the final pipeline from the old index won't be executed * if a request/default pipeline changes the destination and the new index has a final pipeline, it will be executed * the default pipeline from the new index won't be executed. Additionally, TransportBulkAction.resolvePipelines was moved to IngestService as it's needed for resolving pipelines from the new index. Tests were moved accordingly. Closes #57968	2020-07-17 11:13:48 +02:00
Tim Brooks	b6e6a8c090	Fix replication operation transient retry test (#58205 ) After the work to retry transient replication failures, the local and global checkpoint test metadata can be incremented on a different thread than the test thread. This appears to introduce an extremely rare scenario where this data is not visible for later test assertions. This commit fixes the issue by using synchronized maps.	2020-07-16 16:01:47 -06:00
Martijn van Groningen	0096238df1	Replaced _data_stream_timestamp meta field's 'path' option with 'enabled' option (#59727 ) Backport #59503 to 7.x and adjusted exception messages. Relates to #59076	2020-07-16 22:29:40 +02:00
Igor Motov	2408803fad	Adds hard_bounds to histogram aggregations (#59175 ) (#59656 ) Adds a hard_bounds parameter to explicitly limit the buckets that a histogram can generate. This is especially useful in case of open ended ranges that can produce a very large number of buckets.	2020-07-16 15:31:53 -04:00
Alan Woodward	10be10c99b	Migrate CompletionFieldMapper to parametrized format (#59691 ) This adds a number of new optional parameters to Parameter, including: * custom serialization (to handle analyzers) * deprecated parameter names * parameter validation * allowing default values to be based on the values of other parameters We preserve the previous serialization format of CompletionFieldMapper, always emitting most fields, in order to meet mapping checks in mixed version clusters, where the mapper service will check that mappings have been correctly parsed and updated by checking their serialized outputs.	2020-07-16 19:15:00 +01:00
Howard	c0d429863c	remove unused cluster name in environment. (backport of #59605 ) (#59681 ) removes an unused variable	2020-07-16 09:25:55 -04:00
Nik Everett	343053c0a7	Fix compilation in Eclipse (backport #59675 ) Eclipse was confused by #59583. It can't see the public inner interface within the superclass. This time. Usually that is fine, but the Eclipse gods don't like this particular code, I guess.	2020-07-16 08:25:12 -04:00
Alan Woodward	27067de699	Make MappedFieldType#meta final (#59383 ) The MappedFieldType#updateMeta method was used for testing equality checks, but we no longer need these after #59212 , so we can remove this method and make meta final.	2020-07-16 09:45:55 +01:00
Przemysław Witek	df4fea79cb	Add a "verbose" option to the data frame analytics stats endpoint (#59589 ) (#59621 )	2020-07-16 09:51:31 +02:00
Armin Braun	6db481f49e	Fix ConcurrentSnapshotsIT.testEquivalentDeletesAreDeduplicated (#59611 ) (#59653 ) Trying to queue up snapshot deletes by blocking the delete of the latest index-N doesn't work here. The first delete will block on the delete operation but only do so after having already written the updated repository data. Since that repository data will contain no snapshots, the subsequent deletes for `*` will just fall through and complete instead of queue up. => Fixed by simply waiting on all files on master so that we block before updating the repository data and get to test the queueing of equivalent operations closes #59608	2020-07-16 09:28:36 +02:00
Nhat Nguyen	b599f7a9c0	Fix estimate size of translog operations (#59206 ) Make sure that the estimateSize method includes all fields of translog operations.	2020-07-16 00:19:30 -04:00
Julie Tibshirani	2b70758a05	Correct type parametrization in geo mappers. (#59583 ) Previously the concrete type parameters for the MappedFieldType didn't always match those for the FieldMapper. This PR updates the mappers so that the type parameters always match, which makes the design easier to follow.	2020-07-15 14:10:47 -07:00
Boice Huang	ef26c1739b	fix typo in Exception Response in GeoJson (#59270 )	2020-07-15 20:15:18 +01:00
Boice Huang	07a58d915d	Fix typo in AggregationProfiler (#59269 )	2020-07-15 20:14:19 +01:00
Armin Braun	cc7093645c	Cleanup some Serialization Code around Snapshots (#59532 ) (#59606 ) A number of obvious possible simplifications that also improve efficiency in some cases (better empty collection handling and size hint use). Also, added a shortcut for writing and reading immutable open maps that can be used to dry up additional spots.	2020-07-15 20:40:43 +02:00
David Turner	67e7c3f60e	Fix failing test introduced in #59601	2020-07-15 17:44:27 +01:00
Rory Hunter	b8d73a1e7e	Default gateway.auto_import_dangling_indices to false (#59302 ) Backport of #58898. Part of #48366. Now that there is a dedicated API for dangling indices, the auto-import behaviour can default to off. Also add a note to the breaking changes for 7.9.0.	2020-07-15 17:10:42 +01:00
David Turner	691759fb1f	Validate snapshot UUID during restore (#59601 ) Today when mounting a searchable snapshot we obtain the snapshot/index UUIDs and then assume that these are the UUIDs used during the subsequent restore. If you concurrently delete the snapshot and replace it with one with the same name then this assumption is violated, with chaotic consequences. This commit introduces a check that ensures that the snapshot UUID does not change during the mount process. If the snapshot remains in place then the index UUID necessarily does not change either. Relates #50999	2020-07-15 16:23:20 +01:00
Martijn van Groningen	2a89e13e43	Move data stream transport and rest action to xpack (#59593 ) Backport of #59525 to 7.x branch. * Actions are moved to xpack core. * Transport and rest actions are moved to the data-streams module. * Removed data streams methods from Client interface. * Adjusted tests to use client.execute(...) instead of data stream specific methods. * only attempt to delete all data streams if xpack is installed in rest tests * Now that ds apis are in xpack and ESIntegTestCase no longer deletes all ds, do that in the MlNativeIntegTestCase class for ml tests.	2020-07-15 16:50:44 +02:00
Rory Hunter	2e05ce5f88	Bump version to 7.10.0	2020-07-15 11:56:45 +01:00
Ignacio Vera	f8037abf47	upgrade to lucene-8.6.0 release (#59596 ) (#59599 )	2020-07-15 12:40:57 +02:00
David Turner	0c2510dc68	Don't request cluster metadata in _cat/shards impl (#59548 ) Today `GET _cat/shards` requests the nodes, routing table, and metadata from the cluster state, but it does not use any information from the metadata portion of the response. Metadata includes things like mappings and templates that may be substantial in size. This commit drops the unnecessary metadata portion of this cluster state request.	2020-07-15 10:14:48 +01:00
Francisco Fernández Castaño	66ef1cdad7	Add the possibility to inject a custom RecoveryState factory to IndexStorePlugin implementations (#59124 ) Add a custom factory for recovery state into IndexStorePlugin that allows different implementors to provide their own RecoveryState implementation. Backport of #59038	2020-07-15 11:11:07 +02:00
Armin Braun	96f52a028f	Fix Snapshot not Starting in Partial Snapshot Corner Case (#59428 ) (#59584 ) We were not handling the case where during a partial snapshot all shards would enter a failed state right off the bat. Closes #59384	2020-07-15 07:59:22 +02:00
Armin Braun	2dd086445c	Enable Fully Concurrent Snapshot Operations (#56911 ) (#59578 ) Enables fully concurrent snapshot operations: * Snapshot create- and delete operations can be started in any order * Delete operations wait for snapshot finalization to finish, are batched as much as possible to improve efficiency and once enqueued in the cluster state prevent new snapshots from starting on data nodes until executed * We could be even more concurrent here in a follow-up by interleaving deletes and snapshots on a per-shard level. I decided not to do this for now since it seemed not worth the added complexity yet. Due to batching+deduplicating of deletes the pain of having a delete stuck behind a long-running snapshot seemed manageable (dropped client connections + resulting retries don't cause issues due to deduplication of delete jobs, batching of deletes allows enqueuing more and more deletes even if a snapshot blocks for a long time that will all be executed in essentially constant time (due to bulk snapshot deletion, deleting multiple snapshots is mostly about as fast as deleting a single one)) * Snapshot creation is completely concurrent across shards, but per shard snapshots are linearized for each repository as are snapshot finalizations See updated JavaDoc and added test cases for more details and illustration on the functionality. Some notes: The queuing of snapshot finalizations and deletes and the related locking/synchronization is a little awkward in this version but can be much simplified with some refactoring. The problem is that snapshot finalizations resolve their listeners on the `SNAPSHOT` pool while deletes resolve the listener on the master update thread. With some refactoring both of these could be moved to the master update thread, effectively removing the need for any synchronization around the `SnapshotService` state. I didn't do this refactoring here because it's a fairly large change and not necessary for the functionality but plan to do so in a follow-up. This change allows for completely removing any trickery around synchronizing deletes and snapshots from SLM and 100% does away with SLM errors from collisions between deletes and snapshots. Snapshotting a single index in parallel to a long running full backup will execute without having to wait for the long running backup as required by the ILM/SLM use case of moving indices to "snapshot tier". Finalizations are linearized but ordered according to which snapshot saw all of its shards complete first	2020-07-15 03:42:31 +02:00
Armin Braun	06d94cbb2a	Fix TODO about Spurious FAILED Snapshots (#58994 ) (#59576 ) There is no point in writing out snapshots that contain no data that can be restored whatsoever. It may have made sense to do so in the past when there was an `INIT` snapshot step that wrote data to the repository that would've otherwise become unreferenced, but in the current day state machine without the `INIT` step there is no point in doing so.	2020-07-15 00:54:30 +02:00
Armin Braun	e1014038e9	Simplify Repository.finalizeSnapshot Signature (#58834 ) (#59574 ) Many of the parameters we pass into this method were only used to build the `SnapshotInfo` instance to write. This change simplifies the signature. Also, it seems less error prone to build `SnapshotInfo` in `SnapshotsService` instead of relying on the fact that each repository implementation will build the correct `SnapshotInfo`.	2020-07-15 00:14:28 +02:00
Armin Braun	16a47e0d08	Simplify SnapshotsInProgress Construction (#58893 ) (#59573 ) With parallel snapshots incoming (but also in isolation) it makes sense to clean up `SnapshotsInProgress` construction. We don't need to pre-compute the waiting shards for every entry. We rarely use this information (only on routing changes) and in the one spot we did we now simply spend the extra cycles for looping over all shards instead of just the waiting ones once per routing change tops instead of on every change to `SnapshotsInProgress` (moreover, we would burn the cycles for looping on all nodes even though only the current master cares about the information). In addition to that change I removed some dead code constructors and slightly optimized deserialization.	2020-07-15 00:00:53 +02:00
Martijn van Groningen	35ae3d19db	Remove data stream feature flag (#59572 ) so that it can be used in the next minor release (7.9.0). Backport of #59504 to 7.x branch. Closes #53100	2020-07-14 23:50:41 +02:00
Armin Braun	68a199f75f	Minor Cleanup Dead Code Snapshotting (#57716 ) (#59569 ) * Use consistent cluster state instead in state update * Remove dead loop in tests * Remove some dead exception ctors Just three trivial/random things I found.	2020-07-14 23:13:14 +02:00
James Baiera	5f7e7e9410	[7.x] Data Stream Stats API (#58707 ) (#59566 ) This API reports on statistics important for data streams, including the number of data streams, the number of backing indices for those streams, the disk usage for each data stream, and the maximum timestamp for each data stream	2020-07-14 16:57:46 -04:00
Mark Tozzi	ed2c29f102	If no perBucketSample has been allocated for the parent bucket return a doc count of 0 (#59360 ) (#59567 ) Co-authored-by: Fabio Corneti <info@corneti.com>	2020-07-14 16:56:29 -04:00
Armin Braun	d456f7870a	Deduplicate Index Metadata in BlobStore (#50278 ) (#59514 ) This PR introduces two new fields into `RepositoryData` (index-N) to track the blob name of `IndexMetaData` blobs and their content via setting generations and uuids. This is used to deduplicate the `IndexMetaData` blobs (`meta-{uuid}.dat` in the indices folders under `/indices`) so that new metadata for an index is only written to the repository during a snapshot if that same metadata can't be found in another snapshot. This saves one write per index in the common case of unchanged metadata thus saving cost and making snapshot finalization drastically faster if many indices are being snapshotted at the same time. The implementation is mostly analogous to that for shard generations in #46250 and piggy backs on the BwC mechanism introduced in that PR (which means this PR needs adjustments if it doesn't go into `7.6`). Relates to #45736 as it improves the efficiency of snapshotting unchanged indices Relates to #49800 as it has the potential to make loading the index metadata for multiple snapshots of the same index concurrently much more efficient, speeding up future concurrent snapshot deletes	2020-07-14 22:18:42 +02:00
Tim Brooks	408a07f96a	Separate coordinating and primary bytes in stats (#59487 ) Currently we combine coordinating and primary bytes into a single bucket for indexing pressure stats. This makes sense for rejection logic. However, for metrics it would be useful to separate them.	2020-07-14 12:37:06 -06:00
Tim Brooks	a46e5e0f04	Increase default write queue size (#59464 ) This commit increases the default write queue size to 10000. This is to allow a greater number of pending indexing requests. This work is safe as we have added additional memory limits. Relates to #59263.	2020-07-14 10:35:25 -06:00
Tim Brooks	1a24916fef	Enable replication retries on 7.9+ (#59546 ) Currently the work to support replication retries is present on 7.9. This commit enables these retries by setting the replication timeout to 60s.	2020-07-14 10:35:05 -06:00
Dan Hermann	e54b4a729f	[7.x] Adds write_index_only option to put mapping API (#59539 )	2020-07-14 10:34:08 -05:00
Luca Cavanna	af2f85be15	Consolidate script parsing from object (7.x) (#59509 ) The update by query action parses a script from an object (map or string). We will need to do the same for runtime fields as they are parsed as part of mappings (#59391). This commit moves the existing parsing of a script from an object from RestUpdateByQueryAction to the Script class. It also adds tests and adjusts some error messages that are incorrect. Also, options were not parsed before and they are now, and unsupported fields now trigger a deprecation warning.	2020-07-14 17:08:29 +02:00
Mark Tozzi	b357c1b77a	[7.x] Fix NPE when building exception messages for aggregations (#59156 ) (#59334 )	2020-07-14 09:37:44 -04:00
Andrei Dan	7dcdaeae49	Default to @timestamp in composable template datastream definition (#59317 ) (#59516 ) This makes the data_stream timestamp field specification optional when defining a composable template. When there isn't one specified it will default to `@timestamp`. (cherry picked from commit 5609353c5d164e15a636c22019c9c17fa98aac30) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-07-14 12:36:54 +01:00
Andrei Dan	4180333bbc	[7.x] Composable templates: add a default mapping for @timestamp (#59244 ) (#59510 ) This adds a low precedence mapping for the `@timestamp` field with type `date`. This will aid with the bootstrapping of data streams as a timestamp mapping can be omitted when nanos precision is not needed. (cherry picked from commit 4e72f43d62edfe52a934367ce9809b5efbcdb531) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-07-14 11:29:33 +01:00
Armin Braun	0e3d87ab54	Add Assertions on CS Application in Snapshot Logic (#58681 ) (#59511 ) Relates to #58680. Bugs like that should not only show up in logs but ideally also get caught in tests. We expect to never see exceptions in these two spots.	2020-07-14 12:16:42 +02:00
Armin Braun	81e96954d0	Improve Efficiency of SnapshotsService CS Apply (#56874 ) (#59508 ) This change removes the redundant submitting of two separate cluster state updates for the node configuration changes and routing changes that affect snapshots. Since we submitted the task to deal with node configuration changes every time on master fail-over, we could also move the BwC cleanup loop that removes `INIT` state snapshots as well as snapshots that have all their shards completed into this cluster state update task. Aside from improving efficiency overall, this change has the fortunate side effect of moving all snapshot finalization to the CS update thread. This is helpful for concurrent snapshots since it makes it very natural and straightforward to order snapshot finalizations by exploiting that they are all initiated on the same thread.	2020-07-14 11:49:09 +02:00
Tim Brooks	623df95a32	Adding indexing pressure stats to node stats API (#59467 ) We have recently added internal metrics to monitor the amount of indexing occurring on a node. These metrics introduce back pressure to indexing when memory utilization is too high. This commit exposes these stats through the node stats API.	2020-07-13 17:23:42 -06:00
Tim Brooks	68d56fa7db	Implement rejections in `WriteMemoryLimits` (#59451 ) This commit adds rejections when the indexing memory limits are exceeded for primary or coordinating operations. The number of bytes allowed for indexing is controlled by a new setting `indexing_limits.memory.limit`.	2020-07-13 14:34:50 -06:00
Mark Tozzi	eb0b28dd1d	Move getPointReaderOrNull into AggregatorBase (#58769 ) (#59455 )	2020-07-13 16:31:33 -04:00
Armin Braun	64c5f70a2d	Remove Needless Context Switches on Loading RepositoryData (#56935 ) (#59452 ) We don't need to switch to the generic or snapshot pool for loading cached repository data (i.e. most of the time in normal operation). This makes `executeConsistentStateUpdate` less heavy if it has to retry and lowers the chance of having to retry in the first place. Also, this change allowed simplifying a few other spots in the codebase where we would fork off to another pool just to load repository data.	2020-07-13 21:38:29 +02:00
Armin Braun	bde92fc5fc	Remove Needless Context Switch From Snapshot Finalization (#56871 ) (#59443 ) No need to do any switch to the `SNAPSHOT` pool here, the blob store repo handles all its writes async on the `SNAPSHOT` pool so we're just needlessly context-switching to enqueue those tasks there. Also cleaned up the source only repository (the only override to `finalizeSnapshot`) to make it clear that no IO is happening there and we don't need to run it on the `SNAPSHOT` pool either.	2020-07-13 20:11:07 +02:00
Armin Braun	31be3a3645	More Efficient Snapshot State Handling (#56669 ) (#59430 ) Follow up to #56365. Instead of redundantly checking snapshots for completion over and over, just track the completed snapshots in the CS updates that complete them instead of looping over the same snapshot entries over and over. Also, in the batched snapshot shard status updates, only check for completion of a snapshot entry if it isn't already finalizing.	2020-07-13 18:58:04 +02:00
Christos Soulios	3868bcc7b8	[7.x] Histogram integration on Histogram field type (#59431 ) Backports #58930 to 7.x Implements histogram aggregation over histogram fields as requested in #53285.	2020-07-13 19:36:33 +03:00
Henning Andersen	adf6083dd0	Enhance real memory circuit breaker with G1 GC (#58674 ) (#59394 ) Using G1 GC, Elasticsearch can in rare cases push heap usage above the real memory circuit breaker limit and keep it there for an extended period. This situation will persist until the next young GC. The circuit breaking itself hinders that from occurring in a timely manner since it breaks all requests before real work is done. This commit gently nudges G1 to do a young GC and then double checks that heap usage is still above the real memory circuit breaker limit before throwing the circuit breaker exception. Related to #57202	2020-07-13 17:41:09 +02:00
Martijn van Groningen	b1b7bf3912	Make data streams a basic licensed feature. (#59392 ) Backport of #59293 to 7.x branch. * Create new data-stream xpack module. * Move TimestampFieldMapper to the new module, this results in storing a composable index template with data stream definition only to work with default distribution. This way data streams can only be used with default distribution, since a data stream can currently only be created if a matching composable index template exists with a data stream definition. * Renamed `_timestamp` meta field mapper to `_data_stream_timestamp` meta field mapper. * Add logic to put composable index template api to fail if `_data_stream_timestamp` meta field mapper isn't registered. So that a more understandable error is returned when attempting to store a template with data stream definition via the oss distribution. In a follow up the data stream transport and rest actions can be moved to the xpack data-stream module.	2020-07-13 17:26:46 +02:00
Alan Woodward	bd01fd107c	Revert "Migrate CompletionFieldMapper to parametrized format (#59291 )" This reverts commit `19ba6c39d2`.	2020-07-13 14:16:09 +01:00
Armin Braun	4e574a7136	Remove Dead Code from Closed Index Snapshot Logic (#56764 ) (#59398 ) The code path for closed indices is dead code here ever since #39644 because `shards(currentState, indexIds, ...)` does not set `MISSING` on a closed index's shard that is assigned any longer. Before that change it would always set `MISSING` for a closed index's shard even if it was assigned. => simplified the code accordingly.	2020-07-13 14:49:16 +02:00
David Turner	3fb9dccc22	Fix FSHealthServiceTests on Windows (#59387 ) In #52680 we introduced a new health check mechanism. This commit fixes up some related test failures on Windows caused by erroneously assuming that all paths begin with `/`. Closes #59380	2020-07-13 12:43:45 +01:00
Alan Woodward	19ba6c39d2	Migrate CompletionFieldMapper to parametrized format (#59291 ) This adds some optional extra configuration to Parameter: * custom serialization (to handle analyzers) * deprecated parameter names * parameter validation	2020-07-13 12:43:15 +01:00
Armin Braun	08b54feaaf	Remove Snapshot INIT Step (#55918 ) (#59374 ) With #55773 the snapshot INIT state step has become obsolete. We can set up the snapshot directly in one single step to simplify the state machine. This is a big help for building concurrent snapshots because it allows us to establish a deterministic order of operations between snapshot create and delete operations since all of their entries now contain a repository generation. With this change simple queuing up of snapshot operations can and will be added in a follow-up.	2020-07-13 13:41:09 +02:00
Alan Woodward	c810a4a12e	Continue to accept unused 'universal' params in <8.0 indexes (#59381 ) We have a number of parameters which are universally parsed by almost all mappers, whether or not they make sense. Migrating the binary and boolean mappers to the new style of declaring their parameters explicitly has meant that these universal parameters stopped being accepted, which would break existing mappings. This commit adds some extra logic to ParametrizedFieldMapper that checks for the existence of these universal parameters, and issues a warning on 7x indexes if it finds them. Indexes created in 8.0 and beyond will throw an error. Fixes #59359	2020-07-13 11:15:56 +01:00
David Kyle	7dcd943e1d	Mute FsHealthServiceTests testFailsHealthOnIOException (#59382 ) For #59380	2020-07-13 09:48:07 +01:00
Armin Braun	483386136d	Move all Snapshot Master Node Steps to SnapshotsService (#56365 ) (#59373 ) This refactoring has three motivations: 1. Separate all master node steps during snapshot operations from all data node steps in code. 2. Set up next steps in concurrent repository operations and general improvements by centralizing tracking of each shard's state in the repository in `SnapshotsService` so that operations for each shard can be linearized efficiently (i.e. without having to inspect the full snapshot state for all shards on every cluster state update, allowing us to track more in memory and only fall back to inspecting the full CS on master failover like we do in the snapshot shards service). * This PR already contains some best effort examples of this, but obviously this could be way improved upon still (just did not want to do it in this PR for complexity reasons) 3. Make the `SnapshotsService` less expensive on the CS thread for large snapshots	2020-07-12 22:19:07 +02:00
Dan Hermann	e01d73c737	[7.x] Data stream admin actions are now index-level actions	2020-07-10 14:36:18 -05:00
Stuart Tettemer	4c04fd1e05	Scripting: Unlimited compilation rate for ingest (#59268 ) * `ingest` and `processor_conditional` default to unlimited compilation rate Refs: #50152	2020-07-09 16:34:47 -05:00
Stuart Tettemer	94e213dd5f	Scripting: Per context stats in `script` in _nodes/stats (#59266 ) Updated `_nodes/stats`: * Update `script` in `_nodes/stats` to include stats per context: ``` "script": { "compilations": 1, "cache_evictions": 0, "compilation_limit_triggered": 0, "contexts":[ { "context": "aggregation_selector", "compilations": 0, "cache_evictions": 0, "compilation_limit_triggered": 0 }, ``` Refs: #50152 Backport: #59625	2020-07-09 15:30:50 -05:00
Alan Woodward	f4caadd239	MappedFieldType no longer requires equals/hashCode/clone (#59212 ) With the removal of mapping types and the immutability of FieldTypeLookup in #58162, we no longer have any cause to compare MappedFieldType instances. This means that we can remove all equals and hashCode implementations, and in addition we no longer need the clone implementations which were required for equals/hashcode testing. This greatly simplifies implementing new MappedFieldTypes, which will be particularly useful for the runtime fields project.	2020-07-09 21:05:10 +01:00
Dan Hermann	c26d2b5fa5	Data stream support for indices shard stores API	2020-07-09 13:11:45 -05:00
Nik Everett	28ef997953	Improve vwh's distant bucket handling (#59094 ) (#59248 ) This modifies the `variable_width_histogram`'s distant bucket handling to: 1. Properly handle integer overflows 2. Recalculate the average distance when new buckets are added on the ends. This should slow down the rate at which we build extra buckets as we build more of them. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-07-09 12:14:46 -04:00
Przemko Robakowski	c870d6e570	[7.x] Restart tests with data streams (#58330 ) (#59303 ) * Restart tests with data streams (#58330)	2020-07-09 17:52:20 +02:00
David Turner	d56fc72ee5	Fix node health-check-related test failures (#59277 ) In #52680 we introduced a new health check mechanism. This commit fixes up some sporadic related test failures, and improves the behaviour of the `FollowersChecker` slightly in the case that no retries are configured. Closes #59252 Closes #59172	2020-07-09 12:46:12 +01:00

5153 Commits