OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nik Everett	ab2c6d9696	Save memory when auto_date_histogram is not on top (backport of #57304 ) (#58190 ) This builds an `auto_date_histogram` aggregator that natively aggregates from many buckets and uses it when the `auto_date_histogram` used to use `asMultiBucketAggregator` which should save a significant amount of memory in those cases. In particular, this happens when `auto_date_histogram` is a sub-aggregator of a multi-bucketing aggregator like `terms` or `histogram` or `filters`. For the most part we preserve the original implementation when `auto_date_histogram` only collects from a single bucket. It isn't possible to "just port the aggregator" without taking a pretty significant performance hit because we used to rewrite all of the buckets every time we switched to a coarser and coarser rounding configuration. Without some major surgery to how to delay sub-aggs we'd end up rewriting the delay list zillions of times if there are many buckets. The multi-bucket version of the aggregator has a "budget" of "wasted" buckets and only rewrites all of the buckets when we exceed that budget. Now that we don't rebucket every time we increase the rounding we can no longer get an accurate count of the number of buckets! So instead the aggregator uses an estimate of the number of buckets to trigger switching to a coarser rounding. This estimate is likely to be terrible when buckets are far apart compared to the rounding. So it also uses the difference between the first and last bucket to trigger switching to a coarser rounding. Which covers for the shortcomings of the bucket estimation technique pretty well. It also causes the aggregator to emit fewer buckets in cases where they'd be reduced together on the coordinating node. This is wonderful! But probably fairly rare. 
All of that does buy us some speed improvements when the aggregator is a child of a multi-bucket aggregator: 25% faster without metrics or time zone, 15% faster with metrics, and 22% faster with a time zone. Relates to #56487	2020-06-17 08:48:41 -04:00
Jason Tedor	b78b3edeea	Upgrade to JNA 5.5.0 (#58183 ) This commit bumps our JNA dependency from 4.5.1 to 5.5.0, so that we are now on the latest maintained line, and pick up a large collection of bug fixes that have accumulated.	2020-06-17 07:35:08 -04:00
Ignacio Vera	b6585f2b51	Add new extensions for Lucene86 points codec to FsDirectoryFactory (#58226 ) (#58233 )	2020-06-17 12:55:33 +02:00
Armin Braun	85be78b624	Fix Snapshot Abort Not Waiting for Data Nodes (#58214 ) (#58228 ) This was a really subtle bug that we introduced a long time ago. If a shard snapshot is in aborted state but hasn't started snapshotting on a node we can only send the failed notification for it if the shard was actually supposed to execute on the local node. Without this fix, if shard snapshots were spread out across at least two data nodes (so that each data node does not have all the primaries) the abort would actually never wait on the data nodes. This isn't a big deal with uuid shard generations but could lead to potential corruption on S3 when using numeric shard generations (albeit very unlikely now that we have the 3 minute wait there). Another negative side-effect of this bug was that master would receive a lot more shard status update messages for aborted shards since each data node not assigned a primary would send one message for that primary.	2020-06-17 11:39:50 +02:00
Armin Braun	c2b416ee31	Fix DanglingIndicesIT Failures from MasterNotDiscoveredException (#58215 ) (#58221 ) The dangling indices action is not a proper master node action so it does not retry when executed while the cluster hasn't fully formed yet. Since we use node restarts when setting up the dangling indices state we need to manually ensure a fully formed cluster before moving on with the tests to avoid failures.	2020-06-17 10:34:08 +02:00
Stuart Tettemer	01795d1925	Revert "Scripting: Deprecate general cache settings (#55753 )" (#58201 ) This reverts commit `88e8b34fc2`.	2020-06-16 14:58:18 -06:00
Rory Hunter	03369e0980	Implement dangling indices API (#58176 ) Backport of #50920. Part of #48366. Implement an API for listing, importing and deleting dangling indices. Co-authored-by: David Turner <david.turner@elastic.co>	2020-06-16 21:50:38 +01:00
Stuart Tettemer	88e8b34fc2	Scripting: Deprecate general cache settings (#55753 ) Backport: ef543b0	2020-06-16 13:06:59 -06:00
Alan Woodward	c6acc7c976	Correctly deal with aliases when retrieving lucene FieldType	2020-06-16 18:06:37 +01:00
Alan Woodward	12a3f6dfca	MappedFieldType should not extend FieldType (#58160 ) MappedFieldType is a combination of two concerns: * an extension of lucene's FieldType, defining how a field should be indexed * a set of query factory methods, defining how a field should be searched We want to break these two concerns apart. This commit is a first step to doing this, breaking the inheritance relationship between MappedFieldType and FieldType. MappedFieldType instead has a series of boolean flags defining whether or not the field is searchable or aggregatable, and FieldMapper has a separate FieldType passed to its constructor defining how indexing should be done. Relates to #56814	2020-06-16 16:56:43 +01:00
Dan Hermann	911d46370e	Prohibit clone, shrink, and split on a data stream's write index	2020-06-16 10:53:20 -05:00
Lee Hinman	03ce0f8a4d	[7.x] Normalized prefix for rollover API (#57271 ) (69e1c066) (#58171 ) * Normalized prefix for rollover API (#57271) Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Lee Hinman <lee@writequit.org> It fixes the issue #53388 by normalizing prefix at index creation request itself * Fix compilation for backport Co-authored-by: Gaurav Chandani <chngau@amazon.com>	2020-06-16 09:22:10 -06:00
Francisco Fernández Castaño	a5bc5ae030	Don't log on RetentionLeaseSync error handler (#58157 ) After an index has been deleted it may take some time to cancel all the maintenance tasks such as RetentionLeaseSync, it's possible that the task is already executing before the cancellation. This commit just avoids logging a warning message for those scenarios. Closes #57864 Backport of (#58098)	2020-06-16 14:04:32 +02:00
Yannick Welsch	e046b0a8fa	Fix realtime get of numeric fields (#58121 ) Using realtime get on numeric fields when reading from the translog would yield a ClassCastException. Closes #57462	2020-06-16 09:16:26 +02:00
Tal Levy	69d5e044af	Add optional description parameter to ingest processors. (#57906 ) (#58152 ) This commit adds an optional field, `description`, to all ingest processors so that users can explain the purpose of the specific processor instance. Closes #56000.	2020-06-15 19:27:57 -07:00
Stuart Tettemer	71a42dbde9	[7.x] Rely on the computeIfAbsent logic to prevent duplicated compilation of scripts (#55467 ) (#58123 ) Instead of serializing compilation using a plain lock / mutex combined with a double check, rely on the computeIfAbsent logic to prevent duplicated compilation of scripts. Made checkCompilationLimit to be thread-safe and lock free. Backport: 865acad Co-authored-by: Michael Bischoff <michael.bischoff@elastic.co>	2020-06-15 12:01:22 -06:00
markharwood	03dd73dc0d	Fix for wildcard fields that returned ByteRefs not Strings to scripts. (#58060 ) (#58109 ) This needs some reorg of BinaryDV field data classes to allow specialisation of scripted doc values. Moved common logic to a new abstract base class and added a new subclass to return string-based representations to scripts. Closes #58044	2020-06-15 14:52:56 +01:00
Dan Hermann	8a910443c4	Add ignore_empty_value parameter in set ingest processor (#57030 ) (#58108 )	2020-06-15 08:35:08 -05:00
Rene Groeschke	01e9126588	Remove deprecated usage of testCompile configuration (#57921 ) (#58083 ) * Remove usage of deprecated testCompile configuration * Replace testCompile usage by testImplementation * Make testImplementation non transitive by default (as we did for testCompile) * Update CONTRIBUTING about using testImplementation for test dependencies * Fail on testCompile configuration usage	2020-06-14 22:30:44 +02:00
Armin Braun	1a48983a56	Fix Running TranslogOps on CS Thread (#58056 ) (#58076 ) We should fork off from the CS thread to run this even if it's a rare condition.	2020-06-13 17:00:49 +02:00
Nik Everett	a5571eb1a8	Save memory when rare_terms is not on top (backport of #57948 ) (#58069 ) This uses the optimization that we started making in #55873 for `rare_terms` to save a bit of memory when that aggregation is not on the top level.	2020-06-12 17:47:10 -04:00
Dan Hermann	17f3318732	[7.x] Resolve index API (#58037 )	2020-06-12 15:41:32 -05:00
Mayya Sharipova	8bd0147ba7	Correct how meta-field is defined for pre 7.8 hits (#57951 ) We keep a static list of meta-fields: META_FIELDS_BEFORE_7_8 as it was before. This is done to ensure backwards compatibility with pre 7.8 nodes. Closes #57831	2020-06-12 09:39:53 -04:00
Armin Braun	5662281562	Fix ExtraFS Breaking SharedClusterSnapshotRestoreIT (#58026 ) (#58040 ) If `ExtraFS` decides to put `extra0/0` into the indices folder then the previous logic in this test would have interpreted the `0` as shard `0` of index `extra0` and fail to list its contents (since it's a file and not an actual shard directory). => simplified the logic to use actually referenced `IndexId` for iterating over indices instead.	2020-06-12 15:27:48 +02:00
Martijn van Groningen	01d8bb8cfa	Enforce valid field mapping exists for timestamp_field in templates. (#58036 ) Backport of #57741 to 7.x branch. Relates to #53100	2020-06-12 15:24:42 +02:00
Armin Braun	a5a251d8c0	Handle Rejections when Scheduling RetryableAction (#58033 ) (#58039 ) Scheduling on the threadpool will throw if the scheduler is already shut down. Handled by treating the rejection like any other non-retryable exception. Closes #58021	2020-06-12 15:23:02 +02:00
Nik Everett	d6c8d9415d	Give significance lookups their own home (backport of #57903 ) (#57959 ) This moves the code to look up significance heuristics information like background frequency and superset size out of `SignificantTermsAggregatorFactory` and into its own home so that it is easier to pass around. This will: 1. Make us feel better about ourselves for not passing around the factory, which is really supposed to be a throw away thing. 2. Abstract the significance lookup logic so we can reuse it for the `significant_text` aggregation. 3. Make it very simple to cache the background frequencies, which should speed things up when the agg is a sub-agg. We had done this for numerics but not string-shaped significant terms.	2020-06-12 09:21:19 -04:00
Martijn van Groningen	f4199f2ee0	Prohibit append-only writes targeting backing indices directly. (#58025 ) Backport of #57788 to 7.x branch. Append-only writes can only target the corresponding data stream. Relates to #53100	2020-06-12 13:17:55 +02:00
Armin Braun	db03e7c93b	Exclude WindowsFS from SharedClusterSnapshotRestoreIT (#58020 ) (#58023 ) Same as #52488 but for a different test suite Closes #58019	2020-06-12 10:49:03 +02:00
Mark Tozzi	36f551bdb4	Make ValuesSourceConfig behave like a config object (#57762 ) (#58012 )	2020-06-11 17:23:55 -04:00
Igor Motov	5138c0c045	Fix missing null values for std_deviation_bounds in ext. stats aggs (#58000 ) Adds missing null values for std_deviation_bounds in extended stats aggs and improves null handling in parsed extended stats.	2020-06-11 16:23:20 -04:00
Lee Hinman	ffc3c77f75	[7.x] Disallow deletion of composable template if in use by data stream (#57957 ) (#57994 ) Backports the following commits to 7.x: Disallow deletion of composable template if in use by data stream (#57957)	2020-06-11 13:51:56 -06:00
Jim Ferenczi	4c6bfe32a7	Fix possible NPE on search phase failure (#57952 ) When a search phase fails, we release the context of all successful shards. Successful shards that rewrite the request to match none will not create any context since #. This change ensures that we don't try to release a `null` context on these successful shards. Closes #57945	2020-06-11 18:54:16 +02:00
Yannick Welsch	85b0b540f0	Fix refresh behavior in MockDiskUsagesIT (#57926 ) Ensures that InternalClusterInfoService's internally cached stats are refreshed whenever the shard size or disk usage function (to mock out disk usage) are overridden. Closes #57888	2020-06-11 17:38:12 +02:00
David Turner	f950c121bb	Hide AlreadyClosedException on IndexCommit release (#57986 ) Today `InternalEngine#releaseIndexCommit` fails with an `AlreadyClosedException` if the engine is closed before the index commit is released. This can happen if, for example, a node leaves and rejoins the cluster and acquires an index commit for replica shard allocation concurrently with shutting the shard down. There's no need to fail the operation like this: if the engine is shut down then we will clean up the unreferenced files when it's restarted (or if it's allocated elsewhere) so we can suppress an `AlreadyClosedException` in this case. This commit does so. Fixes #57797	2020-06-11 15:41:50 +01:00
Alan Woodward	16e230dcb8	Update to lucene snapshot e7c625430ed (#57981 ) Includes LUCENE-9148 and LUCENE-9398, which splits the BKD metadata, index and data into separate files and keeps the index off-heap.	2020-06-11 14:51:53 +01:00
Yannick Welsch	34fc52dbf3	Fix PersistedClusterStateServiceTests.testSlowLogging (#57971 ) The range in the last writeDurationMillis selection could be empty, as it could prior to the call be set to 1.	2020-06-11 15:47:34 +02:00
Igor Motov	947573f309	Added standard deviation / variance sampling to extended stats (#49782 ) (#57947 ) Per 49554 I added standard deviation sampling and variance sampling to the extended stats interface. Closes #49554 Co-authored-by: Igor Motov <igor@motovs.org> Co-authored-by: andrewjohnson2 <aj114114@gmail.com>	2020-06-11 09:19:44 -04:00
Nik Everett	da72a3a51d	Speed up reducing auto_date_histo with a time zone (backport of #57933 ) (#57958 ) When reducing `auto_date_histogram` we were using `Rounding#round` which is quite a bit more expensive than `Rounding.Prepared prepared = rounding.prepare(min, max); long result = prepared.round(date);` when rounding to a non-fixed time zone like `America/New_York`. This stops using the former and starts using the latter. Relates to #56124	2020-06-11 09:15:12 -04:00
Albert Zaharovits	c57ccd99f7	Just log 401 stacktraces (#55774 ) Ensure stacktraces of 401 errors for unauthenticated users are logged but not returned in the response body.	2020-06-10 20:39:32 +03:00
Armin Braun	85f5c4192b	Improve Test Coverage for Old Repository Metadata Formats (#57915 ) (#57922 ) Use the hack used in `CorruptedBlobStoreRepositoryIT` in more snapshot failure tests to verify that BwC repository metadata is handled properly in these so far not-test-covered scenarios. Also, some minor related dry-up of snapshot tests. Relates #57798	2020-06-10 13:27:01 +02:00
Yannick Welsch	80f221e920	Use clean thread context for transport and applier service (#57792 ) (#57914 ) Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and also that thread contexts are not leaked). Moves the ClusterApplierService to use the system context (same as we do for MasterService), which allows us to remove a hack from TemplateUpgradeService and makes it clearer that applying CS updates is fully executing under system context.	2020-06-10 10:30:28 +02:00
Armin Braun	fe85bdbe6f	Fix Remote Recovery Being Retried for Removed Nodes (#57608 ) (#57913 ) If a node is disconnected we retry. It does not make sense to retry the recovery if the node is removed from the cluster though. => added a CS listener that cancels the recovery for removed nodes Also, we were running the retry on the `SAME` pool which for each retry will be the scheduler pool. Since the error path of the listener we use here will do blocking operations when closing the resources used by the recovery we can't use the `SAME` pool here since not all exceptions go to the `ActionListenerResponseHandler` threading like e.g. `NodeNotConnectedException`. Closes #57585	2020-06-10 09:41:52 +02:00
Armin Braun	d579420452	Stop Serializing Exceptions in SnapshotInfo (#57866 ) (#57898 ) In ff9e8c622427d42a2d87b4ceb298d043ae3c4e6a we changed the format used when serializing snapshot failures in the cluster state and `SnapshotInfo`. This turned them from a short string holding all the nested exception messages into a multi kb stacktrace in many cases. This is not great if you snapshot a large number of shards that all fail for example and massively blows up the size of the GET snapshots response if there are snapshots with failures in there. This change reverts to the format used for exceptions before the above commit. Also, this change short circuits logging and serialization of the failure for an aborted snapshot where we don't care about the specific message at all and aligns the message to "aborted" in all cases (currently, if we aborted before any IO, it would have been "aborted", and an exception when aborting later during IO).	2020-06-10 08:41:03 +02:00
Gordon Brown	aab6317260	[7.x] Include hidden indices in snapshots by default (#57325 ) Previously, hidden indices were not included in snapshots by default, unless specified using one of the usual methods for doing so: naming indices directly, using index patterns starting with a ., or specifying expand_wildcards to a value that includes hidden (e.g. all or hidden,open). This commit changes the default expand_wildcards value to include hidden indices.	2020-06-09 16:01:52 -06:00
Yannick Welsch	9eec819c5b	Revert "Use clean thread context for transport and applier service (#57792 )" This reverts commit `259be236cf`.	2020-06-09 22:24:54 +02:00
Yannick Welsch	8199956937	Revert "Assert on request headers only (#57792 )" This reverts commit `b5d3565214`.	2020-06-09 22:24:35 +02:00
Henning Andersen	1e8e115ae1	Rollover avoid heavy lifting in dry-run/validation (#57894 ) Fixed two newly introduced issues with rollover: 1. Using auto-expand replicas, rollover could result in unexpected log messages on future indexes. 2. It did a reroute and other heavy work on the network thread. Closes #57706 Supersedes #57865 Relates #53965	2020-06-09 22:07:30 +02:00
Jake Landis	fff0a106c9	[7.x] Support `if_seq_no` and `if_primary_term` for ingest (#55430 ) (#57768 ) Allow for optimistic concurrency control during ingest by checking the sequence number and primary term. This is accomplished by defining _if_seq_no and _if_primary_term in the pipeline, similarly to _version and _version_type. Closes #41255 Co-authored-by: Maria Ralli <mariai.ralli@gmail.com>	2020-06-09 14:20:26 -05:00
Andrei Dan	3945712c72	[7.x] ILM add data stream support to the Shrink action (#57616 ) (#57884 ) The shrink action creates a shrunken index with the target number of shards. This makes the shrink action data stream aware. If the ILM managed index is part of a data stream the shrink action will make sure to swap the original managed index with the shrunken one as part of the data stream's backing indices and then delete the original index. (cherry picked from commit 99aeed6acf4ae7cbdd97a3bcfe54c5d37ab7a574) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-09 19:45:22 +01:00
Jim Ferenczi	ea696198e9	Fix rounding composite aggs on sorted index (#57867 ) This commit fixes a bug on the composite aggregation when the index is sorted and the primary composite source needs to round values (date_histo). In such case, we cannot take into account the subsequent sources even if they match the index sort because the rounding of the primary sort value may break the original index order. Fixes #57849	2020-06-09 20:41:45 +02:00
Nik Everett	44a79d1739	Deprecate Rounding#round (#57845 ) (#57893 ) This deprecates `Rounding#round` and `Rounding#nextRoundingValue` in favor of calling `Rounding.Prepared prepared = rounding.prepare(min, max); ... prepared.round(val)` because it is always going to be faster to prepare once. There are going to be some cases where we won't know what to prepare for and in those cases you can call `prepareForUnknown` and still be faster than calling the deprecated method over and over and over again. Ultimately, this is important because it doesn't look like there is an easy way to cache `Rounding.Prepared` or any of its precursors like `LocalTimeOffset.Lookup`. Instead, we can just build it at most once per request. Relates to #56124	2020-06-09 14:30:56 -04:00
Tim Brooks	2630c80b5d	Fix IndexRecoveryIT transient error test (#57826 ) Currently it is possible for a transient network error to disrupt the start recovery request from the remote to source node. This disruption is racy with the recovery occurring on the source node. It is possible for the source node to finish and clear its recovery. When this occurs, the recovery cannot be reestablished and the "no two start" assertion is tripped. This commit fixes this issue by allowing two starts if the finalize request has been received. Fixes #57416.	2020-06-09 10:49:38 -06:00
Tim Brooks	8119b96517	Fix stalled send translog ops request (#57859 ) Currently, the translog ops request is reentrant when there is a mapping update. The impact of this is that a translog ops request ends up waiting on the pre-existing listener and it is never completed. This commit fixes this by introducing a new code path to avoid the idempotency logic.	2020-06-09 10:46:34 -06:00
Tim Brooks	c17121428e	Fix translog ops action name in channel listener (#57854 ) The action name is passed to the `ChannelListener` and is used for logging purposes. Currently, we are using the incorrect action name for the translog ops listener. This commit fixes the issue.	2020-06-09 10:38:58 -06:00
Lee Hinman	cb2ce3736a	[7.x] Make noop template updates be cluster state noops (#57851 ) (#57880 ) Backports the following commits to 7.x: Make noop template updates be cluster state noops (#57851)	2020-06-09 09:26:06 -06:00
Dan Hermann	b501b282f8	Change default backing index naming scheme	2020-06-09 09:31:34 -05:00
Nik Everett	e7cc2448d2	Save memory when string terms are not on top (#57758 ) (#57876 ) This reworks string flavored implementations of the `terms` aggregation to save memory when it is under another bucket by dropping the usage of `asMultiBucketAggregator`.	2020-06-09 10:26:29 -04:00
Yannick Welsch	b5d3565214	Assert on request headers only (#57792 ) Only assert that actual request headers are empty, as default headers might still be there when stashing the context.	2020-06-09 14:08:25 +02:00
Yannick Welsch	259be236cf	Use clean thread context for transport and applier service (#57792 ) Adds assertions to Netty to make sure that its threads are not polluted by thread contexts (and also that thread contexts are not leaked). Moves the ClusterApplierService to use the system context (same as we do for MasterService), which allows us to remove a hack from TemplateUpgradeService and makes it clearer that applying CS updates is fully executing under system context.	2020-06-09 12:32:28 +02:00
Tim Brooks	9eaee3da8d	Fix exception check in RecoveryRequestTrackerTests (#57493 ) Currently we check that exceptions are the same in the recovery request tracker test. This is inconsistent because the future wraps the exception in a new instance. This commit fixes the test by comparing a random exception message. Fixes #57199	2020-06-08 15:42:48 -06:00
Lee Hinman	fe2eaf0d03	[7.x] Throw exception on duplicate mappings metadata fields (#57839 ) In #57701 we changed mappings merging so that duplicate fields specified in mappings caused an exception during validation. This change makes the same exception thrown when metadata fields are duplicated. This will allow us to be strict currently with plans to make the merging more fine-grained in a later release.	2020-06-08 14:21:18 -06:00
Tim Brooks	952cf770ed	Reestablish peer recovery after network errors (#57827 ) Currently a network disruption will fail a peer recovery. This commit adds network errors as retryable actions for the source node. Additionally, it adds sequence numbers to the recovery request to ensure that the requests are idempotent. Additionally it adds a reestablish recovery action. The target node will attempt to reestablish an existing recovery after a network failure. This is necessary to ensure that the retries occurring on the source node provide value in bidirectional failures.	2020-06-08 14:17:52 -06:00
Lee Hinman	6e8cf0973f	[7.x] Disallow merging existing mapping field definitions in templates (#57701 ) (#57822 ) Backports the following commits to 7.x: Disallow merging existing mapping field definitions in templates (#57701)	2020-06-08 12:56:09 -06:00
Armin Braun	0987c0a5f3	Fix Broken Numeric Shard Generations in RepositoryData (#57813 ) (#57821 ) Fix broken numeric shard generations when reading them from the wire or physically from the physical repository. This should be the cheapest way to clean up broken shard generations in a BwC and safe-to-backport manner for now. We can potentially further optimize this by also not doing the checks on the generations based on the versions we see in the `RepositoryData` but I don't think it matters much since we will read `RepositoryData` from cache in almost all cases. Closes #57798	2020-06-08 18:36:56 +02:00
Nik Everett	ee0ce8ffaf	Fix a bug with missing fields in sig_terms (#57757 ) When you run a `significant_terms` aggregation on a field and it is mapped but there aren't any values for it then the count of the documents that match the query on that shard still have to be added to the overall doc count. I broke that in #57361. This fixes that. Closes #57402	2020-06-08 10:07:14 -04:00
Mayya Sharipova	70e63a365a	Refactor how to determine if a field is metafield (#57378 ) (#57771 ) Previously, a static method of MapperService, isMetadataField, was used to determine whether a field is a meta-field. This method was using an outdated static list of meta-fields. This PR instead changes this method to an instance method that is also aware of meta-fields in all registered plugins. Related #38373, #41656 Closes #24422	2020-06-08 09:16:18 -04:00
Andrei Dan	1b84e93d83	[7.x] DataStream creation validation allows for prefixed indices (#57750 ) (#57799 ) We want to validate the DataStreams on creation to make sure the future backing indices would not clash with existing indices in the system (so we can always rollover the data stream). This changes the validation logic to allow for a DataStream to be created with a backing index that has a prefix (eg. `shrink-foo-000001`) even if the former backing index (`foo-000001`) exists in the system. The new validation logic will look for potential index conflicts with indices in the system that have the counter in the name greater than the data stream's generation. This ensures that the `DataStream`'s future rollovers are safe because for a `DataStream` `foo` of generation 4, we will look for standalone indices in the form of `foo-%06d` with the counter greater than 4 (ie. validation will fail if `foo-000006` exists in the system), but will also allow replacing a backing index with an index named by prefixing the backing index it replaces. (cherry picked from commit 695b242d69f0dc017e732b63737625adb01fe595) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-06-08 13:31:52 +01:00
Armin Braun	004eb8bd7e	Fix Bug With RepositoryData Caching (#57785 ) (#57800 ) * Fix Bug With RepositoryData Caching This fixes a really subtle bug with caching `RepositoryData` that can corrupt a repository. We were caching `RepositoryData` serialized in the newest metadata format. This led to a confusing situation where numeric shard generations would be cached in `ShardGenerations` that were not written to the repository because the repository or cluster did not yet support `ShardGenerations`. In the case where shard generations are not actually supported yet, these cached numeric generations are not safe and there are multiple scenarios where they would be incorrect, leading to the repository trying to read shard level metadata from index-N that don't exist. This commit makes it so that cached metadata is always in the same format as the metadata in the repository. Relates #57798	2020-06-08 13:16:45 +02:00
Luca Cavanna	7a06a13d99	Add description to submit and get async search, as well as cancel tasks (#57745 ) This makes it easier to debug where such tasks come from in case they are returned from the get tasks API. Also renamed the last occurrence of waitForCompletion to waitForCompletionTimeout in get async search request.	2020-06-08 11:17:29 +02:00
Armin Braun	619e4f8c02	Make BackgroundIndexer more Efficient (#57781 ) (#57789 ) Improve efficiency of background indexer by allowing to add an assertion for failures while they are produced to prevent queuing them up. Also, add non-blocking stop to the background indexer so that when stopping multiple indexers we don't needlessly continue indexing on some indexers while stopping another one. Closes #57766	2020-06-08 10:18:47 +02:00
Nik Everett	3b1dfa3b5d	Remove deprecated wrapper from scripted_metric (backport of #57627 ) (#57763 ) This removes the deprecated `asMultiBucketAggregator` wrapper from `scripted_metric`. Unlike most other such removals, this isn't likely to save much memory. But it does make the internals of the aggregator slightly less twisted. Relates to #56487	2020-06-05 16:14:28 -04:00
Martijn van Groningen	f170b52e64	Backing indices should use composable template matching with the corresponding data stream name (#57728 ) Backport of #57640 to 7.x branch. Composable templates with exact matches, can match with the data stream name, but not with the backing index name. Also if the backing index naming scheme changes, then a composable template may never match with a backing index. In that case mappings and settings may not get applied.	2020-06-05 18:38:22 +02:00
Dan Hermann	3fe93e24a6	[7.x] Prohibit closing the write index for a data stream (#57740 )	2020-06-05 11:14:43 -05:00
Jake Landis	459ab9a0b2	[7.x] Ensure type exists for all monitoring configuration (#57399 ) (#57704 ) #47711 and #47246 helped to validate that monitoring settings are rejected at time of setting the monitoring settings. Otherwise an invalid monitoring setting can find its way into the cluster state and result in an exception thrown [1] on the cluster state application (thereby causing significant issues). Some additional monitoring settings have been identified that can result in invalid cluster state that also result in exceptions thrown on cluster state application. All settings require a type of either http or local to be applicable. When a setting is changed, the exporters are automatically updated with the new settings. However, if the old or new settings lack a type setting, an exception will be thrown (since exporters are always of type 'http' or 'local'). Arguably we shouldn't blindly create and destroy new exporters on each monitoring setting update, but the lifecycle of the exporters is a bit outside the scope of what this PR is trying to address. This commit introduces a similar methodology to check for validity as #47711 and #47246 but this time for ALL (including non-http) settings. Monitoring settings are not useful unless there is an exporter with a type defined. The type is used as a dependent setting, such that it must exist to set the value. This ensures that when any monitoring settings change, they can only get added to cluster state if the type exists. If the type exists (and the other validations pass) then the exporters will get re-built and the cluster state remains valid. Tests have been included to ensure that all dynamic monitoring settings have the type as dependent settings. [1] org.elasticsearch.common.settings.SettingsException: missing exporter type for [found-user-defined] exporter at org.elasticsearch.xpack.monitoring.exporter.Exporters.initExporters(Exporters.java:126) ~[?:?]	2020-06-05 10:47:11 -05:00
Tanguy Leroux	0e57528d5d	Remove more //NORELEASE (#57517 ) We agreed on removing the following //NORELEASE tags.	2020-06-05 15:34:06 +02:00
Gordon Brown	5a4e5a1e9d	Handle `cluster.max_shards_per_node` in YAML config (#57234 ) Prior to this commit, `cluster.max_shards_per_node` is not correctly handled when it is set via the YAML config file, only when it is set via the Cluster Settings API. This commit refactors how the limit is implemented, both to enable correctly handling the setting in the YAML and to more effectively centralize the logic used to enforce the limit. The logic used to apply the limit, as well as the setting value, has been moved to the new `ShardLimitValidator`.	2020-06-04 14:02:21 -06:00
Nik Everett	98c379c507	Merge remaining sig_terms into terms (#57397 ) (#57687 ) Merges the remaining implementation of `significant_terms` into `terms` so that we can more easily make them work properly without `asMultiBucketAggregator` which should save memory and speed them up. Relates #56487	2020-06-04 14:32:32 -04:00
Mark Vieira	9b0f5a1589	Include vendored code notices in distribution notice files (#57017 ) (#57569 ) (cherry picked from commit 627ef279fd29f8af63303bcaafd641aef0ffc586)	2020-06-04 10:34:24 -07:00
Armin Braun	80d1b12fa3	Restore ThreadContext after Serializing OutboundMessage (#57659 ) (#57681 ) Stash the current context before restoring the stored context on the IO thread so that its thread context does not get polluted. Closes #57554	2020-06-04 17:55:26 +02:00
David Turner	fc4dd6d681	Timeout health API on busy master (#57587 ) Today `GET _cluster/health?wait_for_events=...&timeout=...` will wait indefinitely for the master to process the pending cluster health task, ignoring the specified timeout. This could take a very long time if the master is overloaded. This commit fixes this by adding a timeout to the pending cluster health task.	2020-06-04 13:39:22 +01:00
William Brafford	7de6d97363	Version bump for 7.7.1 release (#57619 )	2020-06-03 16:38:25 -04:00
Igor Motov	8d7f389f3a	Increase search.max_buckets to 65,535 (#57042 ) Increases the default search.max_buckets limit to 65,535, and only counts buckets during reduce phase. Closes #51731	2020-06-03 15:35:41 -04:00
Julie Tibshirani	e0a15e8dc4	Remove the 'array value parser' marker interface. (#57571 ) (#57622 ) This PR replaces the marker interface with the method FieldMapper#parsesArrayValue. I find this cleaner and it will help with the fields retrieval work (#55363). The refactor also ensures that only field mappers can declare they parse array values. Previously other types like ObjectMapper could implement the marker interface and be passed array values, which doesn't make sense.	2020-06-03 11:30:14 -07:00
Nik Everett	7fd94f7d0f	Test: Protect auto_date_histo from 0 buckets The test for `auto_date_histogram` was trying to round `Long.MAX_VALUE` if there were 0 buckets. That doesn't work. Also, this replaces all of the class variables created to make consistent random results when testing `InternalAutoDateHistogram` with the newer `randomResultsToReduce` which is a little simpler to understand.	2020-06-03 12:51:22 -04:00
Christos Soulios	67abde326e	[7.x] Introduce v6.8.11 (#57600 )	2020-06-03 19:10:16 +03:00
Nhat Nguyen	5097071230	Increase timeout for GlobalCheckpointSyncIT (#57567 ) The test failed when it was running with 4 replicas and 3 indexing threads. The recovering replicas can prevent the global checkpoint from advancing. This commit increases the timeout to 60 seconds for this suite and the check for no inflight requests. Closes #57204	2020-06-03 08:50:02 -04:00
Nik Everett	2a27c411fb	Save memory when geo aggregations are not on top (#57483 ) (#57551 ) Saves memory when the `geotile_grid` and `geohash_grid` are not on the top level by using the `LongKeyedBucketOrds` we built in #55873.	2020-06-02 16:21:50 -04:00
Zachary Tong	79ac69cfa3	[7.x Backport] Prevent SigTerms/SigText from running on fields they do not support (#57485 ) SigTerms cannot run on fields that are not searchable, and SigText cannot run on fields that do not have analyzers. Both of these situations fail today with an esoteric exception, so this just formalizes the constraint by throwing an IllegalArgumentException up front. In practice, the only affected field seems to be the `binary` field, which is neither searchable nor has a default analyzer (e.g. even numeric and keyword fields have a default analyzer despite not being tokenized). Adds supported-type tests, and makes some changes to the test itself to allow testing sigtext (indexing _source). Also a few tweaks to the test to avoid bad randomization (negative numbers, etc).	2020-06-02 16:03:37 -04:00
Nik Everett	97c06816a4	Fix an optimization in terms agg (backport #57438 ) (#57547 ) When the `terms` agg runs against strings and uses global ordinals it has an optimization when it collects segments that only ever have a single value for the particular string. This is very common. But I broke it in #57241. This fixes that optimization and adds `debug` information that you can use to see how often we collect segments of each type. And adds a test to make sure that I don't break the optimization again. We also had a specialization for when there isn't a filter on the terms to aggregate. I had removed that specialization in #57241 which resulted in some slowdown as well. This adds it back but in a more clear way. And, hopefully, a way that is marginally faster when there is a filter. Closes #57407	2020-06-02 14:57:45 -04:00
Mark Tozzi	e50f514092	IndexFieldData should hold the ValuesSourceType (#57373 ) (#57532 )	2020-06-02 12:16:53 -04:00
Armin Braun	ba2d70d8eb	Serialize Outbound Messages on IO Threads (#56961 ) (#57080 ) Almost every outbound message is serialized to buffers of 16k pagesize. We were serializing these messages off the IO loop (and retaining the concrete message instance as well) and would then enqueue it on the IO loop to be dealt with as soon as the channel is ready. 1. This would cause buffers to be held onto for longer than necessary, causing less reuse on average. 2. If a channel was slow for some reason, not only would concrete message instances queue up for it, but also 16k of buffers would be reserved for each message until it would be written+flushed physically. With this change, the serialization happens on the event loop which effectively limits the number of buffers that `N` IO-threads will ever use so long as messages are small and channels writable. Also, this change dereferences the reference to the concrete outbound message as soon as it has been serialized to save some more on GC. This reduces the GC time for a default PMC run by about 50% in experiments (3 nodes, 2G heap each, loopback ... obvious caveat is that GC isn't that heavy in the first place with recent changes but still a measurable gain). I also expect it to be helpful for master node stability by causing less of a spike if master is e.g. hit by a large number of requests that are processed batched (e.g. shard snapshot status updates) and responded to in a short time frame all at once. Obviously, the downside to this change is that it introduces more latency on the IO loop for the serialization. But since we read all of these messages on the IO loop as well I don't see it as much of a qualitative change really and the more predictable buffer use seems much more valuable relatively.	2020-06-02 16:15:18 +02:00
Armin Braun	9bc9d01b84	Do not Block Snapshot Thread Pool Fully During Restore or Snapshot (#57360 ) (#57511 ) Allow for a fairer distribution of snapshot and restore operations to enable parallel snapshots and improve behaviour for parallel snapshot + restore. Closes #55803	2020-06-02 11:45:55 +02:00
Ryan Ernst	7aad4f6470	Store parsed mapping settings in IndexSettings (#57492 ) There are several mapping settings that are currently re-parsed every time they are read. This can be quite frequent, for example within every document ingestion. This commit moves the parsed versions of these mapping settings to be stored in IndexSettings, just as other index settings are already. closes #57395	2020-06-01 16:45:36 -07:00
Mark Tozzi	1f500583b1	Clean up Aggregator Supplier Boiler Plate (#57442 ) (#57452 )	2020-06-01 14:21:07 -04:00
Nik Everett	c6c0b1a968	Optimize `routingNodes` variable in AddIncrementallyTests (#57140 ) (#57447 ) The `routingNodes` variable is unused. Replace `clusterState.getRoutingNodes()` with `routingNodes`. Co-authored-by: Boice Huang <boicehuang@tencent.com>	2020-06-01 14:13:45 -04:00
Zachary Tong	daaf5a3dcc	Fix assertion catching in aggregation supported type test (#56466 ) (#57382 ) At some point, we changed the supported-type test to also catch assertion errors. This has the side effect of also catching the `fail()` call inside the try-catch, which silently smothered some failures. This modifies the test to throw at the end of the try-catch block to prevent it from accidentally catching itself. Catching the AssertionError is convenient because there are other locations that do throw an assertion in tests (due to hitting an assertion before the exception is thrown) so I think we should keep it around. Also includes a variety of fixes to other tests which were failing but being silently smothered.	2020-06-01 12:10:05 -04:00
Armin Braun	59570eaa7d	Fix Local Translog Recovery not Updating Safe Commit in Edge Case (#57350 ) (#57380 ) In case the local checkpoint in the latest commit is less than the last processed local checkpoint we would recover 0 ops and hence not commit again. This would lead to the logic in `IndexShard#recoverLocallyUpToGlobalCheckpoint` not seeing the latest local checkpoint when it reloads the safe commit from the store and thus cause inefficient recoveries because the recoveries would work from a lower than possible local checkpoint. Closes #57010	2020-05-30 09:28:50 +02:00
Nik Everett	d6a3704932	Fold some of sig_terms into terms (backport of #57361 ) (#57386 ) This merges the global-ordinals-based implementation for `significant_terms` into the global-ordinals-based implementation of `terms`, removing a bunch of copy and pasted code that is subtly different across the two implementations and replacing it with an explicit `ResultStrategy` with nice stuff like Javadoc. The actual behavior is mostly unchanged, though I was able to remove a redundant copy of bytes representing the string from the result construction phase of `significant_terms`. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-05-29 22:51:11 -04:00
Nik Everett	f52e779806	Fix casting of scaled_float in sorts (#57207 ) (#57385 ) Previously we'd get a `ClassCastException` when you tried to use `numeric_type` on `scaled_float`. Oops! This cleans up the CCE and moves some code around so the casting actually works.	2020-05-29 18:06:04 -04:00

4923 Commits