OpenSearch

Commit Graph

Author	SHA1	Message	Date
Julie Tibshirani	b2d3c3f6f9	Fix bug where fvh fragments could be loaded from wrong doc (#66142 ) This PR fixes a regression where fvh fragments could be loaded from the wrong document _source. Some `FragmentsBuilder` implementations contain a `SourceLookup` to load from _source. The lookup should be positioned to load from the current hit document. However, since `FragmentsBuilder` are cached and shared across hits, the lookup is never updated to load from the new documents. This means we accidentally load _source from a different document. The regression was introduced in #60179, which started storing `SourceLookup` on `FragmentsBuilder`. Fixes #65533.	2020-12-09 17:52:58 -08:00
Francisco Fernández Castaño	55246d8d9b	[7.10] Bump version after 7.10.1 release	2020-12-09 16:11:29 +01:00
Lee Hinman	8cbb9612d0	[7.10] Create AllocationDeciders in the main method of the ILM step (#65037 ) (8ac30f9a) (#66070 ) Backports the following commits to 7.x: Create AllocationDeciders in the main method of the ILM step (#65037) (8ac30f9)	2020-12-08 16:56:25 -07:00
Gordon Brown	fb65fd8723	[7.10] Correctly determine defaults of settings which depend on other settings (#65989 ) This commit adjusts the behavior when calculating the diff between two `AbstractScopedSettings` objects, so that the default values of settings whose default values depend on the values of other settings are correctly calculated. Previously, when calculating the diff, the default value of a depended setting would be calculated based on the default value of the setting(s) it depends on, rather than the current value of those settings.	2020-12-08 13:21:00 -07:00
Tanguy Leroux	16fae5d66d	Also reroute after shard snapshot size fetch failure (#66008 ) In #61906 we added the possibility for the master node to fetch the size of a shard snapshot before allocating the shard to a data node with enough disk space to host it. When merging this change we agreed that any failure during size fetching should not prevent the shard from being allocated. Sadly it does not work as expected: the service only triggers reroutes when fetching the size succeeds but never when it fails. It means that a shard might stay unassigned until another cluster state update triggers a new allocation (as in #64372). More sadly, the test I wrote was wrong as it explicitly triggered a reroute. This commit changes the InternalSnapshotsInfoService so that it also triggers a reroute when fetching the snapshot shard size failed, ensuring that the allocation can move forward by using an UNAVAILABLE_EXPECTED_SHARD_SIZE shard size. This unknown shard size is kept around in the snapshot info service until no corresponding unassigned shards need the information. Backport of #65436	2020-12-08 12:10:37 +01:00
Przemko Robakowski	eaab5c65e0	Allow more legit cases in Metadata.Builder.validateDataStreams (#65791 ) (#65938 ) This change simplifies logic and allows more legit cases in Metadata.Builder.validateDataStreams. It will only show conflict on names that are in form of .ds-<data stream name>-<[0-9]+> and will allow any names like .ds-<data stream name>-something-else-<[0-9]+>. This fixes a problem with rollover when you have 2 data streams with names like a and a-b - currently if a-b has generation greater than a you won't be able to rollover a anymore. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-12-07 19:54:46 +01:00
Nhat Nguyen	26d67c1662	Ensure notify when proxy connections disconnect (#65697 ) TransportService doesn't respond to the pending requests of proxy connections when the underlying connections get disconnected because proxy connections do not override the getCacheKey method. Some CCS requests would never be completed because of this bug.	2020-12-03 14:53:17 -05:00
Armin Braun	745f527fea	Deduplicate Index Meta Generations when Deserializing (#65619 ) (#65666 ) These strings are quite long individually and will be repeated potentially up to the number of snapshots in the repository times. Since these make up more than half of the size of the repository metadata and are likely the same for all snapshots the savings from deduplicating them can make up for more than half the size of `RepositoryData` easily in most real-world cases.	2020-12-01 12:34:35 +01:00
Armin Braun	f8f08ba3a7	Fix NPE in ClusterInfoService (#65654 ) (#65659 ) Store stats can be `null` if e.g. the shard was already closed when the stats were retrieved. Don't record those shards in the sizes map to fix an NPE in this case.	2020-12-01 10:33:36 +01:00
Armin Braun	6bbeedc932	Reset Deflater/Inflater after Use in DeflateCompressor (#65617 ) (#65646 ) We should reset after use, not before reuse. Otherwise we keep the input buffers on these objects around for a long time and they can grow to O(MB).	2020-12-01 02:44:36 +01:00
Przemko Robakowski	bb0fcb150b	Fix TranslogTests.testTotalTests when n=0 (#65632 ) When n=0 in TranslogTests.testTotalTests we never update earliestLastModifiedAge so it fails comparison with default value of total.getEarliestLastModifiedAge() which is 0. In this change we always check this special case and then select n>0 Closes #65629	2020-11-30 18:35:55 -05:00
Howard	0137c1679b	Fix the earliest last modified age of translog stats (#64753 ) Currently translog's `earliest_last_modified_age` field is always 0 in `_nodes/stats` response.	2020-11-30 17:34:55 -05:00
Alan Woodward	fb84b6710d	Restore use of default search and search_quote analyzers (#65491 ) (#65562 ) In the refactoring of TextFieldMapper, we lost the ability to define a default search or search_quote analyzer in index settings. This commit restores that ability, and adds some more comprehensive testing. Fixes #65434	2020-11-26 18:34:59 +00:00
Ioannis Kakavas	f6921af885	Revert "Gracefully handle exceptions from Security Providers (#65464 ) (#65554 )" This reverts commit `12ba9e3e16`. This commit was mechanically backported to 7.10 while it shouldn't have been.	2020-11-26 17:11:34 +02:00
Ioannis Kakavas	12ba9e3e16	Gracefully handle exceptions from Security Providers (#65464 ) (#65554 ) In certain situations, such as when configured in FIPS 140 mode, the Java security provider in use might throw a subclass of java.lang.Error. We currently do not catch these and as a result the JVM exits, shutting down elasticsearch. This commit attempts to address this by catching subclasses of Error that might be thrown for instance when a PBKDF2 implementation is used from a Security Provider in FIPS 140 mode, with the password input being less than 14 bytes (112 bits). - In our PBKDF2 family of hashers, we catch the Error and throw an ElasticsearchException while creating or verifying the hash. We throw on verification instead of simply returning false on purpose so that the message bubbles up and the cause becomes obvious (otherwise it would be indistinguishable from a wrong password). - In KeyStoreWrapper, we catch the Error in order to wrap and re-throw a GeneralSecurityException with a helpful message. This can happen when using any of the keystore CLI commands, when the node starts or when we attempt to reload secure settings. - In the `elasticsearch-users` tool, we catch the ElasticsearchException that the Hasher class re-throws and throw an appropriate UserException. Tests are missing because it's not trivial to set CI in fips approved mode right now, and thus any tests would need to be muted. There is a parallel effort in #64024 to enable that and tests will be added in a followup.	2020-11-26 17:04:34 +02:00
Ioannis Kakavas	b4b4483e24	Do not interpret SecurityException in KeystoreAwareCommand (#65366 ) (#65486 ) KeyStoreAwareCommand attempted to deduce whether an error occurred because of a wrong password by checking the cause of the SecurityException that KeyStoreWrapper.decrypt() throws. Checking for AEADBadTagException was wrong because that exception could be (and usually is) wrapped in an IOException. Furthermore, since we are doing the check already in KeyStoreWrapper, we can just return the message of the SecurityException to the user directly, as we do in other places.	2020-11-26 13:12:18 +02:00
Jim Ferenczi	88993e763f	Fix handling of null values in geo_point (#65307 ) A bug was introduced in 7.10 that causes explicit `null` values to be indexed in the _field_names field. This change fixes this bug for newly ingested data but `null` values ingested with 7.10 will continue to match `exists` query so a reindex is required. Fixes #65306	2020-11-24 11:00:37 +01:00
Jim Ferenczi	359b89a19b	Fix cacheability of custom LongValuesSource in TermsSetQueryBuilder (#65367 ) (#65389 ) This change fixes the equals and hashCode methods of the custom FieldValuesSource that is used internally to extract the value from a doc value field. Using the field data instance to check equality prevented the query from being cached in previous versions. Switching to the field name should make the query eligible for caching again.	2020-11-23 22:21:01 +01:00
Jay Modi	1a13a0b10f	Watcher understands hidden expand wildcard value (#65372 ) Watcher has a search template that stores indices options to be used as part of a search during watch execution, but this was not updated to be aware of hidden indices and the `hidden` expand_wildcards option. This change makes use of the `IndicesOptions#toXContent` method in Watcher, which already handles the new value. Additionally, the XContent parsing is moved to the IndicesOptions class so that we will be less likely to miss updating this in the future. Closes #65148 Backport of #65332	2020-11-23 09:17:49 -07:00
Nik Everett	56605e4d9a	Fixup reduceRandom tests (#65263 ) In aa1ea96b8698aa12bed1c4e8d704882a2a639791 I made all `testReduceRandom` tests for aggs mimic production more precisely. More precisely, they pick the correct "lead" result when performing partial reduction. This is great, but, sadly, some tests assumed that we always reduced against the "first" aggregator. This fixes those tests. Closes #65163	2020-11-20 13:10:34 -05:00
James Rodewig	feca22729c	[DOCS] Remove duplicated word in replica shard allocator comment (#65295 ) (#65317 ) Co-authored-by: Howard <danielhuang@tencent.com>	2020-11-20 12:25:52 -05:00
Jay Modi	893e1a5282	Fix date math hidden index resolution (#65278 ) This commit updates the IndexAbstractionResolver so that hidden indices are properly resolved when date math is in use and when we are checking if the index is visible. Closes #65157 Backport of #65236	2020-11-19 12:40:14 -07:00
Julie Tibshirani	5495032b00	Remove unused method Analysis#isNoStopwords.	2020-11-17 16:34:33 -08:00
Sylvain Wallez	b2475f9ccf	Fix parsing RareTerms aggregation response in RestHighLevelClient (#65144 ) Backport of #64454 - Add LongRareTerms and StringRareTerms to the DefaultNamedXContents, ensure that the response of RareTerms aggregation can be parsed correctly. - Add testSearchWithRareTermsAgg method to test the response of RareTerms aggregation can be parsed correctly. - Add some test code to ensure the AggregationsTests can execute successfully. Co-authored-by: bellengao <gbl_long@163.com>	2020-11-17 17:43:51 +01:00
Julie Tibshirani	3974c3b066	Move the shared fetch cache to highlighting. (#65105 ) The cache is only used by highlighters, so it can be scoped to only the highlighting context.	2020-11-16 18:54:32 -08:00
Mark Vieira	afd12fddaf	Remove reference to 7.9.4 release which won't happen	2020-11-16 10:31:36 -08:00
Przemysław Witek	de668ab84b	[7.10] [ML] Extract dependent variable's mapping correctly in case of a multi-field (#63813 ) (#64287 )	2020-11-16 10:34:58 +01:00
Alan Woodward	caf143f4a5	Unused boost parameter should not throw mapping exception (#64999 ) (#65014 ) We were correctly dealing with boosts that had an effect, but mappers that had a silently accepted but ignored boost parameter were throwing an error instead of continuing to ignore the boost but emitting a warning. Fixes #64982	2020-11-12 19:28:32 +00:00
James Rodewig	75b4af5833	[DOCS] Fix plugins service comment typo (#64902 ) (#64933 ) Co-authored-by: Howard <danielhuang@tencent.com>	2020-11-11 10:30:44 -05:00
Daniel Mitterdorfer	a6302d2169	Mute RolloverIT#testRolloverWithClosedIndexInAlias (#64925 ) Relates #64921	2020-11-11 14:33:48 +01:00
Andrei Dan	cd35122e48	Bump versions after 7.10 release (#64856 )	2020-11-11 13:08:16 +00:00
Tim Brooks	f96dccd1ec	Propagate rejected execution during bulk actions (#64886 ) Currently a rejected execution exception can be swallowed when async actions return during transport bulk actions. This includes scenarios where we went async to perform ingest pipelines or index creation. This commit resolves the issue by propagating a rejected exception.	2020-11-10 12:16:40 -07:00
Nhat Nguyen	207e4b00f9	Busily assert in testCreateSearchContextFailure (#64243 ) If a background refresh is running, then the refCount assertion will fail as Engine#refreshIsNeeded can increase the refCount by 2. Closes #64052	2020-11-10 11:51:41 -05:00
Armin Braun	d173ba6b2d	Fix NPE in toString of FailedShard (#64770 ) (#64779 ) The concatenation took precedence over the null check, leading to an NPE because `null` was passed to `ExceptionsHelper.stackTrace(failure))`.	2020-11-09 17:02:11 +01:00
David Turner	33f703ef1f	Fix up roles after rolling upgrade (#64693 ) Node roles vary by version, and new roles are suppressed for BWC. This means we can receive a join from a node that's already in the cluster but with a different set of roles: the node didn't change roles, but the cluster state came via an older master. This commit ensures that we properly process a join from such a node to ensure that the roles are correct. Closes #62840	2020-11-06 12:33:09 +00:00
Armin Braun	51e9d6f227	Revert Serializing Outbound Transport Messages on IO Threads (#64632 ) (#64654 ) Serializing outbound transport message on the IO loop was introduced in https://github.com/elastic/elasticsearch/pull/56961. Unfortunately it turns out that this is incompatible with assumptions made by CCR code here: `f22ddf822e/x-pack/plugin/ccr/src/main/java/org/elasticsearch/xpack/ccr/action/repositories/GetCcrRestoreFileChunkAction.java (L60-L61)` and that are not easy to work around on short notice. Reverting this move (as a temporary solution; it is still a valuable change long-term) is therefore raised as a blocker, since it seriously affects the stability of the initial phase of CCR following by causing corrupted bytes to be sent to the follower.	2020-11-05 16:29:12 +01:00
Jim Ferenczi	9e4105ec37	Validate PIT on _msearch (#63167 ) This change ensures that we validate point in times provided by individual search requests in _msearch. Relates #63132	2020-11-05 15:38:28 +01:00
Jim Ferenczi	3e2fa09666	Fix merging of terms aggregation with compound order (#64469 ) This change fixes a bug introduced in #61779 that uses a compound order to compare buckets when merging. The bug is triggered when the compound order uses a primary sort ordered by key (asc or desc). This commit ensures that we always extract the primary sort when comparing keys during merging. The PR is marked as no-issue since the bug has not been released in any official version.	2020-11-05 12:05:19 +01:00
markharwood	1fb6206fbc	SignificantText aggregation had include/exclude logic back to front (#64520 ) (#64538 ) Backport bugfix. SignificantText aggregation had include/exclude logic back to front. Added test. Closes #64519	2020-11-03 16:43:03 +00:00
Ignacio Vera	4851bc7bae	Upgrade to Lucene-8.7.0 (#64532 ) (#64537 )	2020-11-03 16:57:04 +01:00
Ignacio Vera	156c931745	LinearCounting recompute size tripping assertion (#64465 ) (#64531 ) Guard recomputeSize method from out of bounds exception	2020-11-03 15:52:48 +01:00
James Rodewig	4a64134718	[DOCS] Fix typo in IndexService.java (#64034 ) (#64447 ) Co-authored-by: mushaoqiong <mushaoqiong@126.com>	2020-11-02 08:16:29 -05:00
Armin Braun	dad3b26560	Fix Typo in Repository Exception Message (#64412 ) (#64434 ) Missing space fixed.	2020-10-30 21:10:17 +01:00
Jason Tedor	fedaa3be05	Remove mute from testDiscoveryNodeRoleWithOldVersion This commit removes a mute on DiscoveryNodeTest#testDiscoveryNodeRoleWithOldVersion after a fix was pushed in `6b119a43c1`. Relates #64385	2020-10-29 22:37:38 -04:00
Jason Tedor	6b119a43c1	Fix version in testDiscoveryNodeRoleWithOldVersion This commits fixes the version when reading from the stream in DiscoveryNodeTests#testDiscoveryNodeRoleWithOldVersion. Closes #64385	2020-10-29 22:36:14 -04:00
Yang Wang	533b929e6c	[Test] Mute DiscoveryNodeTests.testDiscoveryNodeRoleWithOldVersion The issue is tracked at https://github.com/elastic/elasticsearch/issues/64385	2020-10-30 13:28:52 +11:00
Jason Tedor	1126ba4df8	Serialize can contain data with roles (#64324 ) This commit internalizes whether or not a role represents the ability to contain data. In the future, this will let us remove the compatibility role notion.	2020-10-29 20:44:39 -04:00
Jason Tedor	827dd39a12	Filter node.roles setting in transport client (#64276 ) This commit filters out the node.roles setting from the transport client, since the transport client does not take on these roles.	2020-10-28 16:24:14 -04:00
Jason Tedor	5d42c2b06e	Deprecate the no-jdk distributions (#64275 ) This commit adds logging to indicate that the no-jdk distributions are deprecated and will be removed in a future release.	2020-10-28 10:35:23 -04:00
Nik Everett	0c47d49784	Make sure non-collecting aggs include sub-aggs (backport of #64214 ) (#64247 ) Now that we're consistently using `can_match` to filter which shards we run on we can get this confusing case: 1. You have a search with, say, a range and a sub-agg. 2. That search has a query that `can_match` can recognize will match no docs. On any shard. 3. So we dutifully run it on a single shard so it can produce the "empty" aggs. 4. The shard we pick happens to not have the target of the range mapped. 5. This kicks in the special range aggregator that doesn't collect any documents. 6. Before this commit, that range aggregator also never produced any sub-aggs. So, without this change, it was quite possible for a search that happened to match no documents to "throw away" the sub-aggs of a range and a few other aggs. We've had this problem for a long, long time but it is more confusing now because `can_match` is really kicking in and causing us to see cases where it looks like you are targeting a lot of shards but you really are only targeting a couple. It used to be that to get the "no sub-aggs" behavior you had to explicitly target only shards that didn't map the target field of the `range` agg. And, like, in that case it isn't too bad because you targeted a sort of degenerate shard. But now that `can_match` is doing its thing you can end up with the confusing steps above. It took me several hours to track down what was happening, even though I know how the individual pieces of all of this work. It took four hours to figure out how they fit together in this case.... Anyway! This replaces all the aggregator implementations that throw out the sub-aggregators with ones that keep them. I think this'll be less confusing in the future. Closes #64142	2020-10-28 08:38:05 -04:00
Jason Tedor	78c741ab32	Log whether or not we are using the bundled JDK (#64255 ) This commit adds logging to indicate whether or not we are using the bundled JDK. We distinguish between using a distribution that bundles the JDK versus using a distribution that does not bundle the JDK.	2020-10-28 07:10:47 -04:00
Armin Braun	2983584ef6	Fix #invariant Assertion in CacheFile (#64180 ) (#64264 ) Fix #invariant Assertion in CacheFile closes #64141	2020-10-28 10:22:47 +01:00
Armin Braun	a697d5edae	Don't Generate an Index Setting History UUID unless it's Supported (#64164 ) (#64213 ) In 7.x we can't just by default generate this setting as it might not be supported by data nodes that are assigned shards for an older version in mixed version clusters. Closes #64152	2020-10-28 09:03:09 +01:00
Jason Tedor	dfc8ae48cc	Fix using bundled JDK detection on macOS (#64236 ) This commit fixes an issue with the detection on macOS for whether or not the bundled JDK is being used. The logic between macOS and non-macOS is different because the JDK has a different directory structure on macOS versus non-macOS. However, due to notarization issues, we changed the top-level directory from jdk to jdk.app, yet never updated this detection logic to account for that. Ideally, we would have a packaging test that asserts that we have the behavior here correct, and it maintains over time. Alas, we do not currently have packaging tests on macOS.	2020-10-27 16:47:02 -04:00
Nhat Nguyen	566d1fd459	Return the same point in time in search response (#64188 ) With this change, we will always return the same point in time in a search response as its input until we implement the retry mechanism for the point in times.	2020-10-27 10:17:44 -04:00
Jim Ferenczi	e34014eb6a	Fix sorted query when date_nanos is used as the numeric_type (#64183 ) The formatting of the global bottom value does not take the resolution of the provided numeric_type into account. This change fixes this bug by providing the resolution directly in the doc value format if the numeric_type is provided as `date_nanos`. Closes #63719	2020-10-27 11:00:23 +01:00
Armin Braun	e02561476e	Fix Broken Clone Snapshot CS Update (#64116 ) (#64159 ) We must not remove the snapshot from the initializing set in the `timeout` getter. This was a plain oversight/mistake and went unnoticed. It can lead to the removal of a valid snapshot clone from the cluster state in rare circumstances (e.g. when a node concurrently joins the cluster or a routing change happens as it did in the linked test failure). Closes #64115	2020-10-26 14:32:42 +01:00
Armin Braun	96407268a0	Fix Background Merge Breaking Snapshot Restore Test (#63579 ) (#64129 ) If we run into a background merge between creating the snapshot and closing the index then with compound files we could be in a situation where we get zero file reuse on restore. Force merging before the snapshot gives us a single segment that won't change down the line so the restore always sees file reuse from the close index. Closes #63476	2020-10-26 09:34:43 +01:00
Armin Braun	bdea16301d	Fix testMasterFailoverDuringCloneStep1 (#63580 ) (#64127 ) Assuming the clone failed when the request failed is not sufficient. There are failure modes where the request fails but the clone still works out because the data node resent the request after the first clone had already been failed and removed from the cluster state when master was restarted. Closes #63473	2020-10-26 09:30:09 +01:00
Marios Trivyzas	9b8ea63cd2	[7.10] Bump version after 7.9.3 release (#63818 )	2020-10-22 17:49:21 +02:00
Przemyslaw Gomulka	bab426be2c	[7.10] add 6.8.14 version (#63824 ) adding 6.8.14 after version 6.8.13 release	2020-10-22 16:51:01 +02:00
Armin Braun	e0f73c96f7	Fix testStartCloneWithSuccessfulShardSnapshotPendingFinalization (#63966 ) (#64000 ) We have to wait for no more operations here not for `1`. This mostly worked because the test thread would add the listener quickly enough so that it sees the state where either the snapshot or clone but not both have already finished but randomly the test thread would be slow and time out on a state without snapshots in it.	2020-10-21 15:33:12 +02:00
markharwood	b933bd9f45	Search - make term/prefix/wildcard/regex query parsing more lenient (#63926 ) Remove errors when case_insensitive flag set to false Closes #63893	2020-10-21 13:33:19 +01:00
Henning Andersen	ddd897f747	Fix test timeout for health on master failover (#63455 ) testHealthOnMasterFailover could timeout on some of the health requests in the case where an index is added, since the recovery leads to extended test run time. Closes #62690	2020-10-21 14:31:53 +02:00
Nik Everett	8d30766a7d	Fix scripted metric BWC serialization (backport of #63821 ) (#63897 ) We had an error when serializing fully reduced scripted metrics. Small typo and severe lack of tests..... Anyway, this fixes the one-character typo and adds a bunch more tests.	2020-10-20 13:15:26 -04:00
Ignacio Vera	d0f5066310	Upgrade to lucene-8.7.0-snapshot-72d8528c3a6 (#63912 ) (#63928 ) (#63933 )	2020-10-20 15:08:06 +02:00
Tanguy Leroux	b2e07076a0	Add snapshot shard size based test in DiskThresholdDeciderTests (#63913 ) This commit adds a test in DiskThresholdDeciderTests that verifies the allocation of a snapshot recovery source based shard in the situation where the snapshot shard size was successfully provided by the SnapshotInfoService introduced in #61906 and when the service failed to provide the size. Relates #61906	2020-10-20 14:59:00 +02:00
Jim Ferenczi	3423f214dd	Composite aggregation must check live docs when the index is sorted (#63864 ) This change ensures that the live docs are checked in the composite aggregator when the index is sorted.	2020-10-20 11:40:28 +02:00
Armin Braun	1880bcdc09	Add REST Test for Snapshot Clone API (#63863 ) (#63881 ) Adds snapshot clone REST tests and HLRC support for the API.	2020-10-20 09:48:03 +02:00
Nik Everett	5583db5a73	Fix broken parent and child aggregator (backport #63811 ) (#63892 ) In #57892 I broke some sub-aggregations inside of the `parent` and `child` aggregator, specifically any sub-aggregations that do work in the `postCollect` phase. This fixes it by delaying the post collect phase of aggs under `parent` and `child` until `beforeBuildingBuckets` because, well, we haven't done any collection until after that phase.	2020-10-19 13:05:22 -04:00
Mayya Sharipova	c0c1a7a9a6	Apply boost only once for distance_feature query (#63767 ) Currently if distance_feature query contains boost, it incorrectly gets applied twice: in AbstractQueryBuilder::toQuery and we also pass this boost to Lucene's LongPoint.newDistanceFeatureQuery. As a result we get incorrect scores. This fixes this error to ensure that boost is applied only once. Closes #63691	2020-10-16 10:02:55 -04:00
Ioannis Kakavas	364511395d	[7.10] Move RestRequestFilter to core (#63507 ) Move RestRequestFilter to core so that Rest requests outside xpack can use it to filter fields and expand its usage. Backport of #63507	2020-10-16 13:57:52 +03:00
Tanguy Leroux	7ea44d20c3	Try to fix DiskThresholdDeciderIT (#63614 ) (#63721 ) This is another attempt to fix #62326 as my previous attempts failed (#63112, #63385).	2020-10-16 09:20:54 +02:00
Jay Modi	822fea9889	Fix threadpool setting test for system_write (#63706 ) This commit fixes the UpdateThreadPoolSettingsTests to be aware of the hard limit on the maximum size of the system_write executor. This executor has a hard limit that matches the write executor, which is the number of allocated processors. Closes #63131 Backport #63700	2020-10-14 14:57:43 -06:00
James Rodewig	ac2b668016	[DOCS] Fix AbstractDiffable typo (#59034 ) (#63668 ) Co-authored-by: Howard <danielhuang@tencent.com>	2020-10-14 09:56:56 -04:00
Armin Braun	424b313784	Adapt Shard Generation Assertion for 7.x (#63625 ) (#63642 ) In 7.x we can have `null` generations so we need to adjust the `assert` accordingly. See e.g. failure https://gradle-enterprise.elastic.co/s/dgypleytdotfu/tests/:server:internalClusterTest/org.elasticsearch.snapshots.ConcurrentSnapshotsIT/testConcurrentSnapshotWorksWithOldVersionRepo	2020-10-14 06:57:25 +02:00
Nhat Nguyen	9015b50e1b	Check docs limit before indexing on primary (#63273 ) Today indexing to a shard with 2147483519 documents will fail that shard. We should check the number of documents and reject the write requests instead. Closes #51136	2020-10-13 17:39:08 -04:00
Lee Hinman	7371e51583	[7.10] Add DiscoveryNodeRole compatibility role for bwc tier serialization (#63581 ) (#63613 ) Backports the following commits to 7.10: Add DiscoveryNodeRole compatibility role for bwc tier serialization (#63581)	2020-10-13 09:17:15 -06:00
Armin Braun	f70391c6cc	Fix Broken Snapshot State Machine in Corner Case (#63534 ) (#63608 ) This fixes a gap in testing and a bug that can occur in various forms: When we would start a snapshot or clone related to a shard that was done snapshotting/cloning but its overall operation was not yet finalized at the time of starting the operation, we would base the operation off of the wrong generation. This would not cause a corrupted repo, but would cause the operation to be `PARTIAL`. This commit fixes the state machine to take into account the correct generation in this case. Closes #63498	2020-10-13 16:05:34 +02:00
James Rodewig	845ccc2264	[DOCS] Fix dup word in ShardRouting hashcode method. (#63452 ) (#63583 ) Co-authored-by: Howard <danielhuang@tencent.com>	2020-10-13 09:05:19 -04:00
Tanguy Leroux	8499924e51	InternalSnapshotsInfoService should also remove failed snapshot shard size infos (#63492 ) (#63592 ) Relates #61906	2020-10-13 10:42:38 +02:00
Julie Tibshirani	9e52513c7b	Add support for missing value fetchers. (#63585 ) This PR implements value fetching for the following field types: * `text` phrase and prefix subfields * `search_as_you_type`, plus its subfields * `token_count`, which is implemented by fetching doc values Supporting these types helps ensure that retrieving all fields through `"fields": ["*"]` doesn't fail because of unsupported value fetchers.	2020-10-12 17:34:21 -07:00
Tim Brooks	56092b1a9f	Flush translog writer before adding new operation (#63505 ) Currently we flush the Translog buffer when a new operation causes the buffer to breach 1MB. This introduces a scenario where an exception is thrown AFTER the writer has accepted the operation. To avoid this, this commit flushes the Translog in an #add call before adding a new operation. This fixes #63299.	2020-10-09 10:02:55 -06:00
Julie Tibshirani	ae2fc4118d	Add factory methods for common value fetchers. (#63438 ) This PR adds factory methods for the most common implementations: * `SourceValueFetcher.identity` to pass through the source value untouched. * `SourceValueFetcher.toString` to simply convert the source value to a string.	2020-10-08 12:14:53 -07:00
Julie Tibshirani	c6b915c8e6	Make TextFieldMapper.FAST_PHRASE_SUFFIX private.	2020-10-08 11:45:53 -07:00
Tanguy Leroux	943fcaf970	Simplify reroute counting in InternalSnapshotsInfoServiceTests (#63416 ) (#63491 ) Closes #63352	2020-10-08 18:20:07 +02:00
Dan Hermann	85886e71c2	Handle error conditions when simulating ingest pipelines with verbosity enabled (#63327 ) (#63484 )	2020-10-08 09:21:05 -05:00
Przemyslaw Gomulka	d7391bc040	[7.10] Fix incorrect use of Format.equals instead of matches backport#63462 #63463 closes #63459 backports #63462	2020-10-08 15:35:13 +02:00
Christoph Büscher	517d3e4336	Mute DiskThresholdDeciderIT.testHighWatermarkNotExceeded	2020-10-08 15:14:50 +02:00
Mayya Sharipova	e022b78198	Upgrade to lucene-8.7.0-snapshot-5c4168d (#63466 ) This disables sort optim on _doc, which may still be unstable. Backport for #63444	2020-10-08 08:20:43 -04:00
Christoph Büscher	564823b00f	Muting parts of JavaJodaTimeDuellingTests	2020-10-08 11:50:47 +02:00
Alan Woodward	c4726a2cec	Don't emit separate warnings for type filters (#63391 ) #63214 made TypeFieldType a constant field, and fixed things so that it always emits deprecation warnings whenever it is referenced in a query or aggregation. However, it also emits warnings when it is used to build a type filter through the search context; this is unnecessary, as warnings are already emitted by the REST layer when types are specified as part of the URL, and it is causing failures in some BWC tests. This commit adds a specialised typeFilter method to TypeFieldType to handle this case without emitting any extra warnings. It also removes an unused duplicate TypeFieldType class that resulted from a backport merge error. Fixes #63366	2020-10-07 15:56:39 +01:00
Mayya Sharipova	e236ea43e9	Upgrade to lucene-8.7.0-snapshot-e914862 (#63401 ) Backport for: #63395	2020-10-07 09:45:14 -04:00
Alan Woodward	88b45dfa61	Convert TextFieldMapper to parametrized form (#63269 ) (#63392 ) As a result of this, we can remove a chunk of code from TypeParsers as well. Tests for search/index mode analyzers have moved into their own file. This commit also rationalises the serialization checks for parameters into a single SerializerCheck interface that takes the values includeDefaults, isConfigured and the value itself. Relates to #62988	2020-10-07 13:26:25 +01:00
Przemyslaw Gomulka	5534a60fa0	strict_date_optional_time_nanos with width 1 on nanos part (#63117 ) (#63387 ) This formatter should allow parsing fraction of a second with minimum width of 1. The same is allowed for strict_date_optional_time closes #61357	2020-10-07 14:12:04 +02:00
Armin Braun	244f1a60f9	Selectively Add ClusterState Listeners Depending on Node Roles (#63223 ) (#63396 ) We were not consistent in checking for node roles before adding listeners. In some cases we did check the necessity of a CS listener and in others we did not. This commit fixes a number of cases of redundant listeners that don't apply to all node roles.	2020-10-07 14:11:43 +02:00
Tanguy Leroux	eac99dd594	SnapshotShardSizeInfo should prefer default value when provided (#63390 ) (#63394 ) In #61906 we agreed on always providing the default value ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE when the SnapshotInfoService failed to retrieve the exact size for a given snapshot shard. The motivation was to allow the shard allocation to move forward in case of failures (so that the unassigned shard does not get stuck in an unassigned state for too long) while relying on the fallback values for shard sizes. Sadly a bug in SnapshotShardSizeInfo#getShardSize(ShardRouting, long) causes the default value to be ignored when the snapshot shard size retrieval previously failed, returning ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE instead of the provided default value. With DiskThresholdDecider also not relying on the provided default value, this triggers assertions like the one in #63376, which helped us to spot the bug. Closes #63376	2020-10-07 13:53:05 +02:00
Tanguy Leroux	581490d83c	Fix DiskThresholdDeciderIT.testHighWatermarkNotExceeded (#63112 ) (#63385 ) The first refreshDiskUsage() refreshes the ClusterInfo update which in turn calls listeners like DiskThreshMonitor. This one triggers a reroute as expected and turns an internal checkInProgress flag before submitting a cluster state update to relocate shards (the internal flag is toggled again once the cluster state update is processed). In the test I suspect that the second refreshDiskUsage() may complete before DiskThreshMonitor's internal flag is set back to its initial state, resulting in the second ClusterInfo update to be ignored and message like "[node_t0] skipping monitor as a check is already in progress" to be logged. Adding another wait for languid events to be processed before executing the second refreshDiskUsage() should help here. Closes #62326	2020-10-07 11:27:25 +02:00
Przemyslaw Gomulka	eadd69e1e4	Deprecate week_year in favour of weekyear date format backport(63307) (#63308 ) week_year is misleading as the formatter only has a weekyear. A field corresponding to 'Y'. 'weekyear' should be used instead relates #60707 backports https://github.com/elastic/elasticsearch/pull/63307	2020-10-07 09:16:27 +02:00
Tim Brooks	dd4b0d85fe	Write translog operation bytes to byte stream (#63298 ) Currently we add translog operation bytes to an array list and flush them on the next write. Unfortunately, this does not currently play well with our byte pooling which means each operation is backed, at minimum, by a 16KB array. This commit improves memory efficiency for small operations by serializing the operations to an output stream.	2020-10-06 20:55:44 -06:00
Tim Brooks	64bbbaeef1	Do not block Translog add on file write (#63374 ) Currently a TranslogWriter add operation is synchronized. This operation adds the bytes to the file output stream buffer and issues a write system call if the buffer is filled. This happens every 8KB which means that we routinely block other add calls on system writes. This commit modifies the add operation to simply place the operation in an array list. The array list is flushed when the sync call occurs or when 1MB is buffered.	2020-10-06 20:40:15 -06:00
Mayya Sharipova	f2ba62b894	Upgrade to lucene-8.7.0-snapshot-66c49a35402 (#63372 ) This includes fixing a bug in doc iteration during sort optimization. Backport for #63349	2020-10-06 22:38:58 -04:00
Dawid Weiss	dbcbdcc029	Set context class loader for plugin initialization (#63185 ) Plugins are loaded in isolated child class loaders of the root class loader. However, some libraries depend on the context class loader being set. This commit sets the context class loader for the duration of calling each plugin's constructor. relates #52320 Co-authored-by: Ryan Ernst <ryan@iernst.net>	2020-10-06 18:00:21 -07:00
Julie Tibshirani	f17ca18dfa	Make array value parsing flag more robust. (#63371 ) When constructing a value fetcher, the 'parsesArrayValue' flag must match `FieldMapper#parsesArrayValue`. However there is nothing in code or tests to help enforce this. This PR reworks the value fetcher constructors so that `parsesArrayValue` is 'false' by default. Just as for `FieldMapper#parsesArrayValue`, field types must explicitly set it to true and ensure the behavior is covered by tests. Follow-up to #62974.	2020-10-06 17:49:25 -07:00
Gordon Brown	5c8b0662df	Deprecate REST access to System Indices (#63274 ) (Original #60945 ) This PR adds deprecation warnings when accessing System Indices via the REST layer. At this time, these warnings are only enabled for Snapshot builds by default, to allow projects external to Elasticsearch additional time to adjust their access patterns. Deprecation warnings will be triggered by all REST requests which access registered System Indices, except for purpose-specific APIs which access System Indices as an implementation detail and a few specific APIs which will continue to allow access to system indices by default: - `GET _cluster/health` - `GET {index}/_recovery` - `GET _cluster/allocation/explain` - `GET _cluster/state` - `POST _cluster/reroute` - `GET {index}/_stats` - `GET {index}/_segments` - `GET {index}/_shard_stores` - `GET _cat/[indices,aliases,health,recovery,shards,segments]` Deprecation warnings for accessing system indices take the form: ``` this request accesses system indices: [.some_system_index], but in a future major version, direct access to system indices will be prevented by default ```	2020-10-06 13:41:40 -06:00
Tanguy Leroux	87076c32e2	Determine shard size before allocating shards recovering from snapshots (#61906 ) (#63337 ) Determines the shard size of shards before allocating shards that are recovering from snapshots. It ensures during shard allocation that the target node that is selected as recovery target will have enough free disk space for the recovery event. This applies to regular restores, CCR bootstrap from remote, as well as mounting searchable snapshots. The InternalSnapshotInfoService is responsible for fetching snapshot shard sizes from repositories. It provides a getShardSize() method to other components of the system that can be used to retrieve the latest known shard size. If the latest snapshot shard size retrieval failed, getShardSize() returns ShardRouting.UNAVAILABLE_EXPECTED_SHARD_SIZE. While we'd like a better way to handle such failures, returning this value allows us to keep the existing behavior for now. Note that this PR does not address an issue (that we already have today) where a replica is being allocated without knowing how much disk space is being used by the primary. Co-authored-by: Yannick Welsch <yannick@welsch.lu>	2020-10-06 18:37:05 +02:00
Julie Tibshirani	733e89d7ed	Make sure that IdFieldType#isAggregatable is accurate. (#62903 ) Before, it always returned 'true' even when the setting "indices.id_field_data.enabled" was false. Fixes #62897.	2020-10-06 09:33:44 -07:00
Dan Hermann	7a59ae8fa2	[7.x] Allow_duplicates option for append processor (#61916 ) (#63257 )	2020-10-06 09:03:47 -05:00
Armin Braun	a8dbab23a5	Increase Timeout in testDynamicRestoreThrottling (#63300 ) (#63324 ) Even if we increase the limit it might not take effect straight away if a thread is blocked on a long wait in `org.elasticsearch.index.snapshots.blobstore.RateLimitingInputStream#maybePause`. Let's increase the limit a little and see if that deals with the remaining failures for good and stop burning cycles busy asserting a future completion. Closes #63246	2020-10-06 15:27:05 +02:00
Luca Cavanna	ca68298e89	Remove MapperService argument from IndexFieldData.Builder#build (#63197 ) (#63311 ) MapperService carries a lot of weight and is only used to determine if loading of field data for the id field is enabled, which can be done in a different way.	2020-10-06 15:04:23 +02:00
Armin Braun	2aa80f9ee3	Dry up Searchable Snapshots ITs (#63190 ) (#63321 ) Just a few spots where we can dry up these tests using the snapshot test infrastructure in core that I found while studying the existing searchable snapshot tests.	2020-10-06 14:41:11 +02:00
Christoph Büscher	82096d3971	Enable SourceLookup to leverage sequential stored fields reader (#63035 ) (#63316 ) In #62509 we already plugged faster sequential access for stored fields in the fetch phase. This PR now adds using the potentially better field reader also in SourceLookup. Rally experiments show that this speeds things up, e.g. when runtime fields that use "_source" are added via "docvalue_fields" or are used in queries or aggs. Closes #62621	2020-10-06 14:34:39 +02:00
Alan Woodward	7405af8060	Convert TypeFieldType to a constant field type (#63214 ) In 6x and 7x, indexes can have only one type, which means that we can rework all queries against the type field to use a ConstantFieldType. This has already been done in master with the removal of the TypeFieldMapper, but we still need that class in 7x to deal with nested documents. This commit leaves TypeFieldMapper in place, but refactors TypeFieldType to extend ConstantFieldType and consolidates deprecation warnings within that class. It also incidentally removes the requirement to pass a MapperService to IndexFieldData.Builder#build, which should allow #63197 to be backported.	2020-10-06 10:27:37 +01:00
Armin Braun	d7f6812d78	Improve Snapshot Abort Efficiency (#62173 ) (#63297 ) There is no need to let snapshots that haven't yet written anything to the repo finalize with `FAILED`. When we still had the `INIT` state we would also just remove these snapshots from the state without any further action. This is not just a theoretical optimization. Currently, the situation of having a lot of queued up snapshots is fairly complicated to resolve when all the queued shards move to aborted since it is now necessary to execute tasks on the `SNAPSHOT` pool (that might be very busy) to remove the snapshot from the CS (including a number of redundant CS updates and repo writes for finalizing these snapshots before deleting them right away after).	2020-10-06 05:14:25 +02:00
Nhat Nguyen	25fbc01459	Retry CCR shard follow task when no seed node left (#63225 ) If the connection between clusters is disconnected or the leader cluster is offline, then CCR shard-follow tasks can stop with "no seed node left". CCR should retry on this error.	2020-10-05 21:56:56 -04:00
Armin Braun	5c3a4c13dd	Clone Snapshot API (#61839 ) (#63291 ) Snapshot clone API. Complete except for some TODOs around documentation (and adding HLRC support). backport of #61839, #63217, #63037	2020-10-06 01:52:25 +02:00
Armin Braun	e91936512a	Refactor SnapshotsInProgress State Transitions (#60517 ) (#63266 ) The copy constructors previously used were hard to read and the exact state changes were not obvious at all. Refactored those into a number of named constructors instead, added additional assertions and moved the snapshot abort logic into `SnapshotsInProgress`.	2020-10-06 00:03:42 +02:00
Armin Braun	860791260d	Implement Shard Snapshot Clone Logic (#62771 ) (#63260 ) First part of the snapshot clone logic that implements the snapshot clone functionality on the repository level.	2020-10-05 22:55:52 +02:00
Nhat Nguyen	1a6837883a	Upgrade to Lucene-8.7.0-snapshot-77396dbf339 (#63222 ) Includes LUCENE-9554, which exposes the pendingNumDocs from IndexWriter.	2020-10-05 14:39:30 -04:00
Nik Everett	7f07deb8d8	Skip broken test In #63242 we changed how we build `nextRoundingValue` to, well, be correct. But the old `org.elasticsearch.common.rounding.Rounding` implementation didn't get the fix. Which is fine, because that method on that implementation doesn't receive any use outside of tests. In fact, it is entirely removed in master. Anyway, now that the two implementations produce different values we really can't go around asserting that they produce the same values now can we? Well, we were! This skips that assertion if we know `nextRoundingValue` is implemented differently. Closes #63256	2020-10-05 14:25:53 -04:00
Stuart Tettemer	791a9d5102	Scripting: enable regular expressions by default (#63029 ) (#63272 ) * Setting `script.painless.regex.enabled` has a new option, `use-factor`, the default. This defaults to using regular expressions but limiting the complexity of the regular expressions. In addition to `use-factor`, the setting can be `true`, as before, which enables regular expressions without limiting them. `false` totally disables regular expressions, which was the old default. * New setting `script.painless.regex.limit-factor`. This limits regular expression complexity by limiting the number of characters a regular expression can consider based on input length. The default is `6`, so a regular expression can consider `6` * input length number of characters. With input `foobarbaz` (length `9`), for example, the regular expression can consider `54` (`6 * 9`) characters. This reduces the impact of exponential backtracking in Java's regular expression engine. * add `@inject_constant` annotation to whitelist. This annotation signals that a compiler setting will be injected at the beginning of a whitelisted method. The format is `argnum=settingname`: `1=foo_setting 2=bar_setting`. Argument numbers must start at one and must be sequential. * Augment `Pattern.split(CharSequence)`, `Pattern.split(CharSequence, int)`, `Pattern.splitAsStream(CharSequence)` and `Pattern.matcher(CharSequence)` to take the value of `script.painless.regex.limit-factor` as an injected parameter, limiting as explained above when this setting is in use. Fixes: #49873 Backport of: 93f29a4	2020-10-05 13:17:47 -05:00
Armin Braun	cf75abb021	Optimize XContentParserUtils.ensureExpectedToken (#62691 ) (#63253 ) We only ever use this with `XContentParser`, so there is no need to make it inline worse by forcing the lambda and hence dynamic callsite here. => Extracted the exception formatting code path that is likely very cold to a separate method and removed the lambda usage in hot loops by simplifying the signature here.	2020-10-05 19:08:32 +02:00
Armin Braun	51d0ed1bf3	Prepare Snapshot Shard State Update Logic For Clone Logic (#62617 ) (#63255 ) Small refactoring to shorten the diff with the clone logic in #61839: * Since clones will create a different kind of shard state update that isn't the same request sent by the snapshot shards service (and cannot be the same request because we have no `ShardId`) base the shard state updates on a different class that can be extended to be general enough to accommodate shard clones as well. * Make the update executor a singleton (can't make it an inline lambda as that would break CS update batching because the executor is used as a map key but this change still makes it crystal clear that there's no internal state to the executor) * Make shard state update responses a singleton (can't use TransportResponse.Empty because we need an action response but still it makes it clear that there's no actual response with content here)	2020-10-05 18:54:01 +02:00
Armin Braun	de6eeecbd3	Dry up Snapshot Integ Tests some More (#62856 ) (#63248 ) * Just some obvious drying up of these super complex tests. * Mainly just shortening the diff of #61839 here by moving test utilities to the abstract test case. Also, making use of the now available functionality to simplify existing tests and improve logging in them.	2020-10-05 18:33:59 +02:00
David Roberts	a522e932e8	Mute RoundingDuelTests.testSerialization Due to https://github.com/elastic/elasticsearch/issues/63256	2020-10-05 17:22:40 +01:00
Armin Braun	89de9fdcf7	Cleanup Blobstore Repository Metadata Serialization (#62727 ) (#63249 ) Follow ups to #62684 making use of shorter utility for corruption checks.	2020-10-05 17:44:27 +02:00
Nik Everett	461475f9e9	Make Rounding.nextRoundingValue consistent (backport #62983 ) (#63242 ) "interval" style roundings were implementing `nextRoundingValue` in a fairly inconsistent way - it'd produce a value, but sometimes that value would be the same as the previous rounding value. This makes it consistently the next value that `rounding` would make.	2020-10-05 10:38:20 -04:00
Armin Braun	d13c1f5058	Fix Overly Strict Assertion in BlobStoreRepository (#63061 ) (#63236 ) As long as `bestEffortConsistency` is `true`, the value of `latestKnownRepoGen` can be updated as a result of reads. We can only assert that `latestKnownRepoGen` and cluster state move in lock-step if `bestEffortConsistency` was `false` before updating the metadata generation as well as after. Closes #62877	2020-10-05 14:06:57 +02:00
Yannick Welsch	b4a1199e87	Uniquely associate term with update task during election (#62212 ) There is a small race when processing the cluster state that is used to establish a newly elected leader as master of the cluster: it can pick the term in its master state update task from a different (newer) election. This trips an assertion in `Coordinator.publish(...)` where we claim that the term on the state allows to uniquely define the pre-state but this isn't so. There are no bad consequences of this race since such a publication fails later on anyway. This PR fixes things so that the assertion holds true by improving the handling of terms during cluster state processing by associating each master state update task that is used to establish a newly elected leader with the correct corresponding term from its election. It also explicitly handles the case where the pre-state that is used as base state has already superseded the current state. As a nice side-effect, join batching now only happens based on the same term. Closes #61437	2020-10-05 11:46:10 +01:00
Armin Braun	106695bec8	Fix Race in ClusterApplierService Shutdown (#62944 ) (#63228 ) The iteration over `timeoutClusterStateListeners` starts when the CS applier thread is still running. This can lead to entries being added to it that never get their listener resolved on shutdown and thus leak that listener as observed in a stuck test in #62863. Since `listener.onClose()` is idempotent we can just call it if we run into a stopped service on the CS thread to avoid the race with certainty (because the iteration in `doStop` starts after the stopped state has been set). Closes #62863	2020-10-05 12:35:42 +02:00
Alan Woodward	01950bc80f	Move FieldMapper#valueFetcher to MappedFieldType (#62974 ) (#63220 ) For runtime fields, we will want to do all search-time interaction with a field definition via a MappedFieldType, rather than a FieldMapper, to avoid interfering with the logic of document parsing. Currently, fetching values for runtime scripts and for building top hits responses need to call a method on FieldMapper. This commit moves this method to MappedFieldType, incidentally simplifying the current call sites and freeing us up to implement runtime fields as pure MappedFieldType objects.	2020-10-04 14:54:59 +01:00
nitin2goyal	c9baadd19b	Fix to actually throttle indexing when throttling is activated (#61768 ) In #22721, the decision to throttle indexing was inadvertently flipped, so that we until this commit throttle indexing during recovery but never throttle user initiated indexing requests. This commit fixes that to throttle user initiated indexing requests and never throttle recovery requests. Closes #61959	2020-10-02 15:50:31 +02:00
Martijn van Groningen	300e525138	Fix querying a data stream name in _index field. (#63178 ) Backport #63170 to 7.x branch. The _index field is a special field that allows using queries against the name of an index or alias. Data stream names were not included; this PR fixes that by changing SearchIndexNameMatcher (which is used via IndexFieldMapper) to also include data streams.	2020-10-02 15:29:20 +02:00
Armin Braun	022a3ef831	Split Tests out of SharedClusterSnapshotRestoreIT (#63130 ) (#63176 ) Splitting some tests out of this class that has become a catch-all for random snapshot related tests into either existing suites that fit better for these tests or one of two new suites to prevent timeouts in extreme cases (e.g. `WindowsFS` + many nodes + multiple data paths per node). No other changes to tests were made whatsoever. Closes #61541	2020-10-02 15:26:22 +02:00
Przemyslaw Gomulka	eb630e599d	Allow passing versioned media types to 7.x server (#63071 ) A 7.x client can pass a media type with a version, which will return a 7.x version of the API in ES 8. In ES server 7 this media type should be accepted, but it serves the same version of the API (7x). relates #61427	2020-10-02 09:17:11 +02:00
William Brafford	6899ce6309	System index auto-creation should not be disabled by user settings (#62984 ) (#63147 ) * Add System Indices check to AutoCreateIndex By default, Elasticsearch auto-creates indices when a document is submitted to a non-existent index. There is a setting that allows users to disable this behavior. However, this setting should not apply to system indices, so that Elasticsearch modules and plugins are able to use auto-create behavior whether or not it is exposed to users. This commit constructs the AutoCreateIndex object with a reference to the SystemIndices object so that we bypass the check for the user-facing autocreate setting when it's a system index that is being autocreated. We also modify the logic in TransportBulkAction to make sure that if a system index is included in a bulk request, we don't skip the autocreation step.	2020-10-01 16:26:07 -04:00
Igor Motov	6a9cde2918	Add support for x_opaque_id to _cat/tasks (#63036 ) (#63135 ) Adds an optional column with support for x_opaque_id to _cat/tasks API. Closes #61118	2020-10-01 13:17:46 -04:00
Ignacio Vera	ba5574935e	Remove dependency of Geometry queries with mapped type names (#63077 ) (#63110 ) It extracts the query capabilities from AbstractGeometryFieldType into two new interfaces, GeoshapeQueryable and ShapeQueryable. Those interfaces are implemented by the final mappers.	2020-10-01 10:49:12 +02:00
Howard	8c6e197f51	Remove allocation id from engine (#62680 ) We no longer need the allocation id in Engine.	2020-09-30 15:28:27 -04:00
Alan Woodward	4fe09b4bf0	Convert test field mappers to parametrized forms (#63018 ) Relates to #62988	2020-09-30 16:59:35 +01:00
Tanguy Leroux	b099bfb789	InternalClusterInfoService should not ignore hidden indices (#62995 ) (#63048 ) Today InternalClusterInfoService ignores hidden indices when retrieving shard stats of the cluster. This can lead to suboptimal shard allocation decisions as the size of shards are taken into account when allocating new shards or rebalancing existing shards.	2020-09-30 11:02:57 +02:00
Ignacio Vera	8e67ec8647	Add equals and hashcode implementation to KnownCardinalityUpperBound (#62930 ) (#63045 )	2020-09-30 09:14:56 +02:00
Alan Woodward	2f5a813589	Convert all FieldMappers in mapper-extras to parametrized form (#62938 ) (#63034 ) This converts RankFeatureFieldMapper, RankFeaturesFieldMapper, SearchAsYouTypeFieldMapper and TokenCountFieldMapper to parametrized forms. It also adds a TextParams utility class to core containing functions that help declare text parameters - mainly shared between SearchAsYouTypeFieldMapper and KeywordFieldMapper at the moment, but it will come in handy when we convert TextFieldMapper and friends. Relates to #62988	2020-09-29 20:50:34 +01:00
Mayya Sharipova	4c8c3c8df6	Upgrade lucene to lucene-8.7.0-snapshot-3b59906 (#62978 ) Backport for #62970	2020-09-28 16:52:31 -04:00
Armin Braun	2247ab3295	Make TransportNodesAction finishHim Execute on Configured Executor (#62753 ) (#62955 ) Currently, `finishHim` can either execute on the specified executor (in the less likely case that the local node request is the last to arrive) or on a transport thread. In case of e.g. `org.elasticsearch.action.admin.cluster.stats.TransportClusterStatsAction` this leads to an expensive execution that deserializes all mapping metadata in the cluster running on the transport thread and destabilizing the cluster. In case of this transport action it was specifically moved to the `MANAGEMENT` thread to avoid the high cost of processing the stats requests on the nodes during fan-out but that did not cover the final execution on the node that received the initial request. This PR adds the ability to optionally specify the executor for the final step of the nodes request execution and uses that to work around the issue for the slow `TransportClusterStatsAction`. Note: the specific problem that motivated this PR is essentially the same as https://github.com/elastic/elasticsearch/pull/57937 where we moved the execution off the transport and on the management thread as a fix as well.	2020-09-28 18:35:35 +02:00
Alan Woodward	a3ba24123e	Refactor PointParser to not take FieldMapper as a parameter (#62950 ) Passing FieldMappers to point parsing functions makes trying to build source-only fields from MappedFieldTypes more complicated. This small refactoring changes things so that the relevant parsing and factory functions from AbstractGeometryFieldMapper are instead passed as lambdas to the PointParser constructor.	2020-09-28 13:45:13 +01:00
Hendrik Muhs	4d43fa8816	Make Noderesolver robust against null values (#62893 ) make node resolving more robust by ignoring null values. This is a bug in the usage of this class, however you don't want NPE's in prod. The root cause might be a corner case. Because silencing the root cause is bad, the assert causes a fail if assertions are enabled relates #62847	2020-09-28 13:31:21 +02:00
Armin Braun	21e534e0e6	Fix RareClusterStateIT Publication Cancel (#62662 ) (#62914 ) We have to make sure the applier and not the accept state versions align here. Otherwise we can get into the situation where the data node is so slow to process one version that the next one arrives, gets rejected and the request returns with ack `false` and we fail the assertion that the put mapping request didn't complete. Closes #62446	2020-09-25 21:57:55 +02:00
Tim Brooks	43a4882951	Move CorsHandler to server (#62007 ) Currently we duplicate our specialized cors logic in all transport plugins. This is unnecessary as it could be implemented in a single place. This commit moves the logic to server. Additionally it fixes a bug where we are incorrectly closing http channels on early Cors responses.	2020-09-24 16:32:59 -06:00
Mayya Sharipova	54064a1eec	Unsigned long 64bits(#62892 ) Introduce 64-bit unsigned long field type This field type supports - indexing of integer values from [0, 18446744073709551615] - precise queries (term, range) - precise sort and terms aggregations - other aggregations are based on conversion of long values to double and can be imprecise for large values. Backport for #60050 Closes #32434	2020-09-24 16:51:47 -04:00
Alan Woodward	e28750b001	Add parameter update and conflict tests to MapperTestCase (#62828 ) (#62902 ) This commit adds a mechanism to MapperTestCase that allows implementing test classes to check that their parameters can be updated, or throw conflict errors as advertised. Child classes override the registerParameters method and tell the passed-in UpdateChecker class about their parameters. Simple conflicts can be checked, using the existing minimal mappings as a base to compare against, or alternatively a particular initial mapping can be provided to check edge cases (eg, norms can be updated from true to false, but not vice versa). Updates are registered with a predicate that checks that the update has in fact been applied to the resulting FieldMapper. Fixes #61631	2020-09-24 20:38:12 +01:00
Jim Ferenczi	78a93dc18f	Request-level circuit breaker support on coordinating nodes (#62884 ) This commit allows the coordinating node to account for the memory used to perform partial and final reduce of aggregations in the request circuit breaker. The search coordinator adds the memory that it used to save and reduce the results of shard aggregations in the request circuit breaker. Before any partial or final reduce, the memory needed to reduce the aggregations is estimated and a CircuitBreakingException is thrown if it exceeds the maximum memory allowed in this breaker. This size is estimated as roughly 1.5 times the size of the serialized aggregations that need to be reduced. This estimation can be completely off for some aggregations but it is corrected with the real size after the reduce completes. If the reduce is successful, we update the circuit breaker to remove the size of the source aggregations and replace the estimation with the serialized size of the newly reduced result. As a follow up we could trigger partial reduces based on the memory accounted in the circuit breaker instead of relying on a static number of shard responses. A simpler follow up that could be done in the mean time is to [reduce the default batch reduce size](https://github.com/elastic/elasticsearch/issues/51857) of blocking search request to a more sane number. Closes #37182	2020-09-24 18:59:28 +02:00
Dan Hermann	cd584d49dc	Bump version after 7.9.2 release	2020-09-24 10:48:57 -05:00
Martijn van Groningen	8ca33feffd	Fail with correct error if first backing index exists when auto creating data stream (#62862 ) Backport #62825 to 7.x branch. Today, if a data stream is auto-created but an index with the same name as the first backing index already exists, that error is ignored internally. As a result, later in the execution of a bulk request, the bulk item fails because the data stream hasn't been auto-created. This situation can only occur if an index with the same name as the future first backing index is created prior to the creation of the data stream. Co-authored-by: Dan Hermann <danhermann@users.noreply.github.com>	2020-09-24 17:16:34 +02:00
Nik Everett	ce24115ba3	Speed up date_histogram by precomputing ranges (backport of #61467 ) (#62880 ) A few of us were talking about ways to speed up the `date_histogram` using the index for the timestamp rather than the doc values. To do that we'd have to pre-compute all of the "round down" points in the index. It turns out that just precomputing those values speeds up rounding fairly significantly: ``` Benchmark (count) (interval) (range) (zone) Mode Cnt Score Error Units before 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 96461080.982 ± 616373.011 ns/op before 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 130598950.850 ± 1249189.867 ns/op after 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 52311775.080 ± 107171.092 ns/op after 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 54800134.968 ± 373844.796 ns/op ``` That's a 46% speed up when there isn't a time zone and a 58% speed up when there is. This doesn't work for every time zone; specifically, those that have two midnights in a single day due to daylight savings time will produce wonky results. So they don't get the optimization. Second, this requires a few expensive computations up front to make the transition array. And if the transition array is too large then we give up and use the original mechanism, throwing away all of the work we did to build the array. This seems appropriate for most usages of `round`, but this change uses it for all usages of `round`. That seems ok for now, but it might be worth investigating in a follow up. I ran a macrobenchmark as well which showed an 11% performance improvement. BUT the benchmark wasn't tuned for my desktop so it overwhelmed it and might have produced "funny" results. 
I think it is pretty clear that this is an improvement, but know the measurement is weird: ``` Before: \| Min Throughput \| hourly_agg \| 0.11 \| ops/s \| \| Median Throughput \| hourly_agg \| 0.11 \| ops/s \| \| Max Throughput \| hourly_agg \| 0.11 \| ops/s \| \| 50th percentile latency \| hourly_agg \| 650623 \| ms \| \| 90th percentile latency \| hourly_agg \| 821478 \| ms \| \| 99th percentile latency \| hourly_agg \| 859780 \| ms \| \| 100th percentile latency \| hourly_agg \| 864030 \| ms \| \| 50th percentile service time \| hourly_agg \| 9268.71 \| ms \| \| 90th percentile service time \| hourly_agg \| 9380 \| ms \| \| 99th percentile service time \| hourly_agg \| 9626.88 \| ms \| \|100th percentile service time \| hourly_agg \| 9884.27 \| ms \| \| error rate \| hourly_agg \| 0 \| % \| After: \| Min Throughput \| hourly_agg \| 0.12 \| ops/s \| \| Median Throughput \| hourly_agg \| 0.12 \| ops/s \| \| Max Throughput \| hourly_agg \| 0.12 \| ops/s \| \| 50th percentile latency \| hourly_agg \| 519254 \| ms \| \| 90th percentile latency \| hourly_agg \| 653099 \| ms \| \| 99th percentile latency \| hourly_agg \| 683276 \| ms \| \| 100th percentile latency \| hourly_agg \| 686611 \| ms \| \| 50th percentile service time \| hourly_agg \| 8371.41 \| ms \| \| 90th percentile service time \| hourly_agg \| 8407.02 \| ms \| \| 99th percentile service time \| hourly_agg \| 8536.64 \| ms \| \|100th percentile service time \| hourly_agg \| 8538.54 \| ms \| \| error rate \| hourly_agg \| 0 \| % 
\| ```	2020-09-24 11:03:47 -04:00
Daniel Mitterdorfer	00ce1d7e4b	Mute failing test in IndexRecoveryIT (#62865 ) (#62868 ) Relates #62863	2020-09-24 15:16:40 +02:00
Daniel Mitterdorfer	aec7c65af4	Mute DiskThresholdDeciderIT (#62858 ) (#62859 ) Relates #62326	2020-09-24 13:24:11 +02:00
Julie Tibshirani	f971146de4	Rename FieldValueRetriever -> FieldFetcher. (#62795 ) (#62836 ) The name `FieldFetcher` fits better with the 'fetch' terminology we use elsewhere, for example `FetchFieldsPhase` and `ValueFetcher`. This PR also moves the construction of the fetcher off the context and onto `FetchFieldsPhase`, which feels like a more natural place for it, and fixes a TODO in javadocs.	2020-09-23 10:12:23 -07:00
Nhat Nguyen	38c8a55df8	Better UUID for reader context (#62799 ) We can use a single and stronger UUID for all reader contexts created by the same SearchService. Backport of #62715	2020-09-23 12:50:18 -04:00
Julie Tibshirani	7ba0c95191	Mute ClusterHealthIT.testHealthOnMasterFailover while we await a fix.	2020-09-23 09:17:45 -07:00
Alan Woodward	7984e4e89f	Fix test bug in SpanMultiTermQueryBuilderTests (#62833 ) This test checks to see if the index has been created before version 6.4, in which case index prefixes are unavailable and so it expects to see a span multi-term wrapper. However, the production code doesn't bother with checking for versions, because if the field in question is configured with index_prefixes then it knows that it must have been created post 6.4 (you can't merge in a new index_prefixes configuration). This commit alters the test to remove the random version checks, as we know we will always have a prefix field available in this scenario. Fixes #58199	2020-09-23 17:02:12 +01:00
Martijn van Groningen	0baefc8ddc	Always validate that only a create op is allowed in bulk api for data streams (#62820 ) Backport #62766 to 7.x branch. The bulk api caches the resolved concrete indices when resolving the user provided index name into the actual index name. The validation that prevents write ops other than create from being executed in a data stream was only performed if the result wasn't cached. In case of cached resolutions, the validation never occurs. The validation would be skipped for all bulk items for a data stream after a create operation for that same data stream. This commit ensures that the validation is always performed for all bulk items (whether the concrete index resolution has been cached or not). Closes #62762	2020-09-23 16:27:54 +02:00
Armin Braun	a754fd8020	Fix CoordinatorTests.testLogsMessagesIfPublicationDelayed (#62815 ) (#62822 ) We need to account for an additional `DEFAULT_DELAY_VARIABILITY` timeout for the lag detector task to be executed after it's scheduled. Closes #62383	2020-09-23 14:23:28 +02:00
Christoph Büscher	29074e7055	Add case insensitive prefix and wildcard to 'version' field (#62754 ) (#62782 ) This change adds support for the recently introduced case insensitivity flag for wildcard and prefix queries. Since version field values are encoded differently we need to adapt our own AutomatonQuery variation to add both cases if case insensitivity is turned on.	2020-09-23 11:48:34 +02:00
Ignacio Vera	81645ec2cc	nextSetBit should check if the underlying array contains the current word (#62805 ) (#62812 ) This is a recent addition and it is missing a check as the underlying array can be smaller than the numBits capacity.	2020-09-23 11:17:26 +02:00
Luca Cavanna	862fab06d3	Share same existsQuery impl throughout mappers (#57607 ) Most of our field types have the same implementation for their `existsQuery` method which relies on doc_values if present, otherwise it queries norms if available or uses a term query against the _field_names meta field. This standard implementation is repeated in many different mappers. There are field types that only query doc_values, because they always have them, and field types that always query _field_names, because they never have norms nor doc_values. We could apply the same standard logic to all of these field types as `MappedFieldType` has the knowledge about what data structures are available. This commit introduces a standard implementation that does the right thing depending on the data structure that is available. With that only field types that require a different behaviour need to override the existsQuery method. At the same time, this no longer forces subclasses to override `existsQuery`, which could be forgotten when needed. To address this we introduced a new test method in `MapperTestCase` that verifies the `existsQuery` being generated and its consistency with the available data structures.	2020-09-23 11:00:53 +02:00
Luca Cavanna	5ca86d541c	Move stored flag from TextSearchInfo to MappedFieldType (#62717 ) (#62770 )	2020-09-23 09:40:34 +02:00
Nhat Nguyen	663b85b98f	Make keep alive optional in PointInTimeBuilder (#62720 ) Remove the keepAlive parameter from the constructor of PointInTimeBuilder as it's optional.	2020-09-22 18:52:54 -04:00
Jay Modi	cb1dc5260f	Dedicated threadpool for system index writes (#62792 ) This commit adds a dedicated threadpool for system index write operations. The dedicated resources for system index writes serve as a means to ensure that user activity does not block important system operations from occurring such as the management of users and roles. Backport of #61655	2020-09-22 15:31:38 -06:00
Benjamin Trent	77bfb32635	[7.x] [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694 ) (#62784 ) * [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694) * [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls global parameters, outside of the global index, are ignored for internal callers in certain cases. If the internal caller is adding requests via the following methods: ``` - BulkRequest#add(IndexRequest) - BulkRequest#add(UpdateRequest) - BulkRequest#add(DocWriteRequest) - BulkRequest#add(DocWriteRequest[]) ``` It is better to specifically set the desired parameters on the requests before they are added to the bulk request object. This commit addresses this issue for the ML plugin * unmuting test	2020-09-22 15:07:08 -04:00
Rory Hunter	3f856d1c81	Prioritise recovery of system index shards (#62640 ) Closes #61660. When ordering shards for recovery, ensure system index shards are ordered first so that their recovery will be started first. Note that I rewrote PriorityComparatorTests to use IndexMetadata instead of its local IndexMeta POJO.	2020-09-22 15:48:27 +01:00
markharwood	a0df0fb074	Search - add case insensitive flag for "term" family of queries #61596 (#62661 ) Backport of fe9145f Closes #61546	2020-09-22 13:56:51 +01:00
Armin Braun	0d5250c99b	Add Trace Logging to File Restore (#62755 ) (#62761 ) Requested by the performance team and generally potentially useful to log each file at `TRACE` like we do for snapshot create.	2020-09-22 14:44:40 +02:00
Amogh Mishra	bc6bea5924	Remove node from cluster when node locks broken (#61400 ) In #52680 we introduced a mechanism that will allow nodes to remove themselves from the cluster if they locally determine themselves to be unhealthy. The only check today is that their data paths are all empirically writeable. This commit extends this check to consider a failure of `NodeEnvironment#assertEnvIsLocked()` to be an indication of unhealthiness. Closes #58373	2020-09-22 10:08:41 +01:00
Armin Braun	aa0dc56412	Ensure MockRepository is Unblocked on Node Close (#62711 ) (#62748 ) `RepositoriesService#doClose` was never called which led to mock repositories not unblocking until the `ThreadPool` interrupts all threads. Thus stopping a node that is blocked on a mock repository operation wastes `10s` in each test that does it (which is quite a few as it turns out).	2020-09-22 11:00:18 +02:00
Armin Braun	4bdbc39e9f	Fix testQueuedSnapshotOperationsAndBrokenRepoOnMasterFailOverMultiple (#62713 ) (#62747 ) There are possible retries here that work out if both the snapshot and the delete operation are retried when master shuts down and hits the unlikely case of the retried delete executing before the retried snapshot, making both operations pass. Closes #62686	2020-09-22 10:42:11 +02:00
Luca Cavanna	9ae29713fd	Dense vector field type minor fixes (#62631 ) The dense vector field is not aggregatable although it produces fielddata through its BinaryDocValuesField. It should pass up hasDocValues set to true to its parent class in its constructor, and return isAggregatable false. Same for the sparse vector field (only in 7.x). This may not have consequences today, but it will be important once we try to share the same exists query implementation throughout all of the mappers with #57607.	2020-09-22 10:40:51 +02:00
Ignacio Vera	265387f348	override needsScore() on ValueCountAggregator (#62683 ) (#62745 )	2020-09-22 08:47:16 +02:00
Yang Wang	897d2e8a02	Fix ccs permission for search with a scroll id (#62053 ) (#62695 ) CCS with remote indices only does not require any privileges on the local cluster. This PR ensures that search with scroll follows the permission model.	2020-09-22 11:49:40 +10:00
Jim Ferenczi	1fc78d430b	Fix terms aggregation ordering after the final reduce (#62732 ) This commit ensures that the final order of the terms aggregations is registered correctly after the final reduce. This bug was introduced in #62028 which is not released yet so this PR is marked as a non-issue. This issue was discovered when running a terms aggregation under an auto-date histogram. In such a case, the auto-date histogram may run multiple final reduces to merge buckets together. This change makes sure that running multiple final reduces doesn't create duplicates but it doesn't fix the fact that the final reduce may prune the list of terms prematurely. This other bug is tracked separately in #62731.	2020-09-22 00:03:04 +02:00
Nhat Nguyen	f9f4d87437	Remove invalid assertion in SearchService (#62675 ) This assertion does not always hold because there can be a race between `putReaderContext` and `afterIndexRemoved` when an index is deleted. Closes #62624	2020-09-21 16:29:00 -04:00
Ignacio Vera	cadd5dc53f	Fix bug when initializing HyperLogLogPlusPlusSparse (#62602 ) (#62702 ) This is a follow up of #62480 where we are oversizing one array when initialising. In addition it prevents a possible CircuitBreaker leak during initialisation.	2020-09-21 17:30:40 +02:00
Armin Braun	13e28b85ff	Speed up RepositoryData Serialization (#62684 ) (#62703 ) Make serializing `RepositoryData` a little faster and split up/document the code for it a little as well given how massive this method has gotten at this point.	2020-09-21 17:29:56 +02:00
Dan Hermann	a06339ffae	Fix NPE when deleting multiple backing indices on a data stream (#62274 ) (#62708 )	2020-09-21 10:26:47 -05:00
Alan Woodward	1dde4983f6	Convert ConstantKeywordFieldMapper to parametrized form (#62688 ) As part of the conversion, adds the ability to customize merge validation - in this case, we allow an update to the constant value if it is currently set to null, but refuse further updates once it has been set once. This commit also converts ParametrizedMapperTests to use MapperServiceTestCase.	2020-09-21 15:22:56 +01:00
Henning Andersen	0c4cfe4c44	Cardinality request breaker leak (#62685 ) If HyperLogLogPlusPlus failed during construction, it would not release already allocated resources, causing the request circuit breaker to not be adjusted down. Closes #62439	2020-09-21 15:54:04 +02:00
Christoph Büscher	803f78ef05	Add field type for version strings (#59773 ) (#62692 ) This PR adds a new 'version' field type that allows indexing string values representing software versions similar to the ones defined in the Semantic Versioning definition (semver.org). The field behaves very similarly to a 'keyword' field but allows efficient sorting and range queries that take into account the special ordering needed for version strings. For example, the main version parts are sorted numerically (ie 2.0.0 < 11.0.0) whereas this wouldn't be possible with 'keyword' fields today. Valid version values are similar to the Semantic Versioning definition, with the notable exception that in addition to the "main" version consisting of major.minor.patch, we allow fewer or more than three numeric identifiers, i.e. "1.2" or "1.4.6.123.12" are treated as valid too. Relates to #48878	2020-09-21 14:25:42 +02:00
Alan Woodward	178b25fc4b	Fix standard filter BWC check to allow for caching bug (#62649 ) The `standard` tokenfilter was removed by #33310, and should have been unusable in any indexes created since 7.0. However, a caching bug fixed by #51092 meant that it was still possible in certain circumstances to create indexes referencing the standard filter in versions up to 7.5.2. Our checks in AnalysisModule still refer to 7.0.0, however, meaning that a cluster that contains one of these rogue indexes cannot be upgraded. This commit adjusts the AnalysisModule checks so that we only refuse to build a mapping referring to standard filter if the index created version is 7.6 or later. Fixes #62644	2020-09-21 10:12:55 +01:00
Henning Andersen	9a77f41e55	Fix cluster health when closing (#61709 ) When master shuts down its cluster service, a waiting health request would fail rather than fail over to a new master.	2020-09-19 10:02:36 +02:00
Luca Cavanna	00272ea877	Remove cache key renderer argument from IndicesRequestCache (#62534 ) In the context of a recurring test failure tracked by #32827, we added trace logging and an extra cache key renderer argument to IndicesRequestCache#getOrCompute (see #39475 and #34180). We addressed the issue with #54071, but the extra argument was left behind, with a NORELEASE comment saying it should be removed. With this commit, we remove the extra cache key renderer argument and the corresponding log lines which are not so useful without it. Closes #55837	2020-09-19 00:24:02 +02:00
Lee Hinman	4a08928c47	[7.x] Add index.routing.allocation.include._tier_preference setting (#62589 ) (#62667 ) This commit adds the `index.routing.allocation.prefer._tier` setting to the `DataTierAllocationDecider`. This special-purpose allocation setting lets a user specify a preference-based list of tiers for an index to be assigned to. For example, if the setting were set to: ``` "index.routing.allocation.prefer._tier": "data_hot,data_warm,data_content" ``` If the cluster contains any nodes with the `data_hot` role, the decider will only allow them to be allocated on the `data_hot` node(s). If there are no `data_hot` nodes, but there are `data_warm` and `data_content` nodes, then the index will be allowed to be allocated on `data_warm` nodes. This allows us to specify an index's preference for tier(s) without causing the index to be unassigned if no nodes of a preferred tier are available. Subsequent work will change the ILM migration to make additional use of this setting. Relates to #60848	2020-09-18 15:41:36 -06:00
Christos Soulios	6a298970fd	[7.x] Allow metadata fields in the _source (#62616 ) Backports #61590 to 7.x So far we don't allow metadata fields in the document _source. However, in the case of the _doc_count field mapper (#58339) we want to be able to set This PR adds a method to the metadata field parsers that exposes if the field can be included in the document source or not. This way each metadata field can configure if it can be included in the document _source	2020-09-18 19:56:41 +03:00
Alan Woodward	17aabaed15	Fix warning on boost docs and warning message on non-implementing fieldmappers	2020-09-18 16:45:08 +01:00
Alan Woodward	43ace5f80d	Emit deprecation warnings when boosts are defined in mappings (#62623 ) We removed index-time boosting back in 5.x, and we no longer document the 'boost' parameter on any of our mapping types. However, it is still possible to define an index-time boost on a field mapper for a surprisingly large number of field types, and they even have an effect (sometimes, on some queries). As a first step in finally removing all traces of index time boosting, this commit emits a deprecation warning whenever a boost parameter is found on a mapping definition.	2020-09-18 15:40:53 +01:00
Igor Motov	260c11d89e	Add an additional cancellation check to the fetch phase (#62577 ) (#62587 ) In #62357 we introduced an additional optimization that allows us to skip most of the fetch phase early if no results are found. This change caused some cancellation test failures that were relying on definitive cancellation during the fetch phase. This commit adds an additional quick cancellation check at the very beginning of the fetch phase to make the cancellation process more deterministic. Fixes #62530	2020-09-18 10:00:36 -04:00
Ignacio Vera	18a52f7477	Use BitArray instead of FixedBitSet for collecting ordinals in Cardinality Aggregator (#62600 ) (#62619 ) Changes the way we collect ordinals in the Cardinality aggregation from Lucene FixedBitSet to BitArray. The benefit is that BitArray is tracked by our Circuit breakers so it is safer.	2020-09-18 14:16:31 +02:00
Tanguy Leroux	9f5e95505b	Also abort ongoing file restores when snapshot restore is aborted (#62441 ) (#62607 ) Today when a snapshot restore is aborted (for example when the index is explicitly deleted) while the restoration of the files from the repository has already started the file restores are not interrupted. It means that Elasticsearch will continue to read the files from the repository and will continue to write them to disk until all files are restored; the store will then be closed and files will be deleted from disk at some point but this can take a while. This will also take some slots in the SNAPSHOT thread pool. The Recovery API won't show any files actively being recovered, the only notable indicator would be the active threads in the SNAPSHOT thread pool. This commit adds a check before reading a file to restore and before writing bytes on disk so that a closing store can be detected more quickly and the file recovery process aborted. This way the file restores just stop and for most of the repository implementations it means that no more bytes are read (see #62370 for S3), finishing threads in the SNAPSHOT thread pool more quickly too.	2020-09-18 14:04:58 +02:00
Armin Braun	73d19271a9	Fix Races in testQueuedSnapshotOperationsAndBrokenRepoOnMasterFailOverMultipleRepos (#62431 ) (#62614 ) This test (in part) verifies that snapshot creation is not retried on master fail-over once a snapshot has been started already. Unless we wait for the snapshot creation to show up in the cluster state before failing the master node though, we could run into a race where the snapshot wasn't yet in the cluster state and a retry goes through successfully.	2020-09-18 12:20:23 +02:00
Przemyslaw Gomulka	d87268a264	Round up parsers should be based on a list of parsers backport(#62290 ) (#62604 ) A date formatter can be created with a list of parsers, which are iterated during parsing; the first one that passes returns the parsed date. DateMathParser should do the same: when created from a list of non-rounding parsers, it should also iterate over all of them. At the moment it only takes the first element. Closes #62207	2020-09-18 12:03:20 +02:00
Adrien Grand	4de8579455	Upgrade to lucene-8.7.0-snapshot-830bd186a8d. (#62596 )	2020-09-18 09:51:34 +02:00
David Turner	06d5d360f9	Tidy up fillInStackTrace implementations (#62555 ) Removes the unnecessary `synchronized` introduced in #62433 and adjusts the others to return `this` not `null` as required by the parent method's Javadocs.	2020-09-18 08:29:48 +01:00
Ignacio Vera	6a3d731be1	Only call reduce on a single InternalAggregation when needed (#62525 ) (#62594 ) Adds a new abstract method in InternalAggregation that flags the framework if it needs to reduce on a single InternalAggregation.	2020-09-18 08:43:58 +02:00
Nhat Nguyen	0127b71901	Adjust keep alive assertion in ShardSearchRequest (#62582 ) Relates #62184	2020-09-17 16:09:54 -04:00
Lee Hinman	9bb7ce0b22	[7.x] Allocate new indices on "hot" or "content" tier depending on data stream inclusion (#62338 ) (#62557 ) Backports the following commits to 7.x: Allocate new indices on "hot" or "content" tier depending on data stream inclusion (#62338)	2020-09-17 13:29:23 -06:00
Martijn van Groningen	5f643433c6	Prohibit the usage of create index api in namespaces managed by data stream templates (#62574 ) Backport of #62527 to 7.x branch. This commit adds validation that prohibits the creation of regular indices in the namespace of templates with data streams enabled. It shouldn't be possible to create ordinary indices when the name of the index matches with a composable index template that enables data streams. Auto creation has logic that creates data streams instead of regular indices. However validation logic for the create index api was missing.	2020-09-17 20:10:42 +02:00
Jim Ferenczi	df93b31b15	Faster sequential access for stored fields (#62509 ) (#62573 ) Spinoff of #61806. Today retrieving stored fields at search time is optimized for random access, so we make no effort to keep state in order to avoid decompressing the same data multiple times when two documents are in the same compressed block. This strategy is acceptable when retrieving a top N sorted by score since there is no guarantee that documents will be on the same block. However, we have some use cases where the documents to retrieve might be completely sequential: scrolls or normal searches sorted by document id, and queries on runtime fields that extract from _source. This commit exposes a sequential stored fields reader in the custom leaf reader that we use at search time. That allows us to leverage the merge instances of stored fields readers that are optimized for sequential access. This change focuses on the fetch phase for now and leverages the merge instances for stored fields only if all documents to retrieve are adjacent. Applying the same logic in the source lookup of runtime fields should be trivial but will be done in a follow up. The speedup on queries sorted by doc id is significant. 
I played with the scroll task of the http_logs rally track on my laptop and had the following result: \| Metric \| Task \| Baseline \| Contender \| Diff \| Unit \| \|--------------------------------------------------------------:\|-------:\|------------:\|------------:\|---------:\|--------:\| \| Total Young Gen GC \| \| 0.199 \| 0.231 \| 0.032 \| s \| \| Total Old Gen GC \| \| 0 \| 0 \| 0 \| s \| \| Store size \| \| 17.9704 \| 17.9704 \| 0 \| GB \| \| Translog size \| \| 2.04891e-06 \| 2.04891e-06 \| 0 \| GB \| \| Heap used for segments \| \| 0.820332 \| 0.820332 \| 0 \| MB \| \| Heap used for doc values \| \| 0.113979 \| 0.113979 \| 0 \| MB \| \| Heap used for terms \| \| 0.37973 \| 0.37973 \| 0 \| MB \| \| Heap used for norms \| \| 0.03302 \| 0.03302 \| 0 \| MB \| \| Heap used for points \| \| 0 \| 0 \| 0 \| MB \| \| Heap used for stored fields \| \| 0.293602 \| 0.293602 \| 0 \| MB \| \| Segment count \| \| 541 \| 541 \| 0 \| \| \| Min Throughput \| scroll \| 12.7872 \| 12.8747 \| 0.08758 \| pages/s \| \| Median Throughput \| scroll \| 12.9679 \| 13.0556 \| 0.08776 \| pages/s \| \| Max Throughput \| scroll \| 13.4001 \| 13.5705 \| 0.17046 \| pages/s \| \| 50th percentile latency \| scroll \| 524.966 \| 251.396 \| -273.57 \| ms \| \| 90th percentile latency \| scroll \| 577.593 \| 271.066 \| -306.527 \| ms \| \| 100th percentile latency \| scroll \| 664.73 \| 272.734 \| -391.997 \| ms \| \| 50th percentile service time \| scroll \| 522.387 \| 248.776 \| -273.612 \| ms \| \| 90th percentile service time \| scroll \| 573.118 \| 267.79 \| -305.328 \| ms \| \| 100th percentile service time \| scroll \| 660.642 \| 268.963 \| -391.678 \| ms \| \| error rate \| scroll \| 0 \| 0 \| 0 \| % \| Closes #62024	2020-09-17 19:58:18 +02:00
Alan Woodward	5421a743a7	Move SearchLookup into FetchContext (#62549 ) FetchSubPhase#getProcessor currently takes a SearchLookup parameter. This however is only needed by a couple of subphases, and will almost certainly change in future as we want to simplify how fetch phases retrieve values for individual hits. To future-proof against further signature changes, this commit moves the SearchLookup reference into FetchContext instead.	2020-09-17 17:39:02 +01:00
Alan Woodward	e3e3aef3d8	Load version metadata even when stored fields are disabled (#62533 ) Currently we throw an error if stored fields are disabled, but hit version metadata is requested on a search. This doesn't make much sense, as the version information is stored in docvalues and so has no connection with stored fields. This commit removes the link between the two, allowing version metadata to be loaded even when stored fields are disabled in a request. Fixes #62456	2020-09-17 17:39:02 +01:00
Alan Woodward	91e2330529	Warn on badly-formed null values for date and IP field mappers (#62487 ) In #57666 we changed when null_value was parsed for ip and date fields. Previously, the null value was stored as a string, and parsed into a date or InetAddress whenever a document containing a null value was encountered. Now, the values are parsed when the mappings are built, which means that bad values are detected up front; if you try and add a mapping with a badly-parsed ip or date for a null_value, the mapping will be rejected. This causes problems for upgrades in the case when you have a badly-formed null_value in a pre-7.9 cluster. This commit fixes the upgrade case by changing the logic to only log a warning on the badly formed value, replicating the earlier behaviour. Fixes #62363	2020-09-17 16:38:08 +01:00
Ignacio Vera	901000891a	Fix test error in InternalCardinalityTests#testEqualsAndHashcode (#62542 ) (#62554 ) Make sure the new HLL++ is different to the original one	2020-09-17 17:09:13 +02:00
Alan Woodward	63afc61b08	Introduce FetchContext (#62357 ) We currently pass a SearchContext around to share configuration among FetchSubPhases. With the introduction of runtime fields, it would be useful to start storing some state on this context to be shared between different subphases (for example, stored fields or search lookups can be loaded lazily but referred to by many different subphases). However, SearchContext is a very large and unwieldy class, and adding more methods or state here feels like a bridge too far. This commit introduces a new FetchContext class that exposes only those methods on SearchContext that are required for fetch phases. This reduces the API surface area for fetch phases considerably, and should give us some leeway to add further state.	2020-09-17 09:57:43 +01:00
Adrien Grand	e0a4a94985	Speed up merging when source is disabled. (#62443 ) (#62474 ) The CodecReader wrapper we use to remove the `_recovery_source` field doesn't override `StoredFieldsReader#getMergeInstance`, which has the undesired side-effect of preventing the wrapped stored fields reader from optimizing merging.	2020-09-17 10:53:31 +02:00
David Turner	62dcc5b1ae	Suppress stack in VersionConflictEngineException (#62433 ) `VersionConflictEngineException` is thrown on the hot path for updates, but stack traces are expensive to compute and transport and rarely useful for this kind of exception. This commit avoids computing the stack trace for these exceptions.	2020-09-17 09:40:07 +01:00
Adrien Grand	9a8225bbc1	Upgrade to lucene-8.7.0-snapshot-9cd3af50f80. (#62450 ) (#62476 ) This new snapshot contains the following JIRAs that we're interested in: - [LUCENE-9525](https://issues.apache.org/jira/browse/LUCENE-9525) Better handling of small documents. This should improve retrieval times when documents are less than ~1kB. - [LUCENE-9510](https://issues.apache.org/jira/browse/LUCENE-9510) Faster flushes when index sorting is enabled by not compressing the temporary files that store stored fields and term vectors.	2020-09-17 10:28:20 +02:00
Armin Braun	5112c17319	Add WARN Logging on Slow Transport Message Handling (#62444 ) (#62521 ) Add simple WARN logging on slow inbound TCP messages.	2020-09-17 10:12:20 +02:00
David Turner	14aec44cd8	Log if recovery affected by disconnect (#62437 ) Today we only emit `DEBUG` logs if the source disconnects from the target during a recovery. This deserves to be noisier by default since it should be rare and may help users identify other problems with their network or with their shard movements. This commit promotes this message to `INFO`. There's no need for `WARN` since these days we will normally resume the recovery where it left off.	2020-09-17 08:22:40 +01:00
Ignacio Vera	2d3ca9c155	Introduce a sparse HyperLogLogPlusPlus class for cloning and serializing low cardinality buckets (#62480 ) (#62520 ) Reduces the memory footprint of an HLL++ structure that uses Linear counting when cloning or deserialising the data structure.	2020-09-17 08:54:50 +02:00
Julie Tibshirani	e1da558206	Remove unused test search context for significant_terms.	2020-09-16 14:27:11 -07:00
Jay Modi	5da922064f	LocalNodeMasterListener is a regular listener (#62485 ) This commit makes the LocalNodeMasterListener interface extend the ClusterStateListener interface and use a default implementation for detecting whether the local node master status changed. Backport of #62422	2020-09-16 11:42:53 -06:00
Tanguy Leroux	8a2e9e66d4	Wait for relocations and disk threshold monitor in DiskThresholdDeciderIT (#62358 ) (#62467 ) Closes #62326	2020-09-16 17:40:20 +02:00
Armin Braun	f6a8599cf8	Don't Start Redundant ConsistentSettingsService (#62283 ) (#62428 ) The consistent settings service is only used in tests so far. No need to start it unless it's actually used.	2020-09-16 09:43:04 +02:00
Ignacio Vera	f3ed641fc7	Adds bucketOrd back to cardinality algorithms (#62389 ) (#62427 )	2020-09-16 08:41:57 +02:00
Nik Everett	24a24d050a	Implement fields fetch for runtime fields (backport of #61995 ) (#62416 ) This implements the `fields` API in `_search` for runtime fields using doc values. Most of that implementation is stolen from the `docvalue_fields` fetch sub-phase, just moved into the same API that the `fields` API uses. At this point the `docvalue_fields` fetch phase looks like a special case of the `fields` API. While I was at it I moved the "which doc values sub-implementation should I use for fetching?" question from a bunch of `instanceof`s to a method on `LeafFieldData` so we can be much more flexible with what is returned and we're not forced to extend certain classes just to make the fetch phase happy. Relates to #59332	2020-09-15 20:24:10 -04:00
Nik Everett	0a7f335215	Speed up writeVInt (backport of #62345 ) (#62419 ) This speeds up `StreamOutput#writeVInt` quite a bit which is nice because it is very commonly called when serializing aggregations. Well, when serializing anything. All "collections" serialize their size as a vint. Anyway, I was examining the serialization speeds of `StringTerms` and this saves about 30% of the write time for that. I expect it'll be useful other places.	2020-09-15 17:14:08 -04:00
Nik Everett	771a8893a6	Add more debugging information for cardinality agg (#62317 ) (#62397 ) This adds two extra bits of info to the profiler: 1. Count of the number of different types of collectors. This lets us figure out if we're using the optimization for segment ordinals. It adds a few more similar counters just for good measure. 2. Profiles the `getLeafCollector` and `postCollection` methods. These are non-trivial for some aggregations, like cardinality.	2020-09-15 13:21:11 -04:00
Armin Braun	ffbc64bd10	Log WARN on Response Deserialization Failure (#62368 ) (#62388 ) We never see this exception in the logs even though it's pretty severe. All we might see is an exception about a transport message not having been read fully from the logic that follows this code. Technically we should probably bubble up the exception but that's a bigger change and needs some careful reasoning; for the time being, this change at least simplifies tracking down deserialization issues in responses.	2020-09-15 18:27:39 +02:00
Adrien Grand	6db8afefc2	Upgrade to lucene-8.7.0-snapshot-cdfdc1e0851. (#62376 ) Upgrade to a new Lucene snapshot that (at least partially) addresses the indexing rate regression when index sorting is enabled. Backport of #62334.	2020-09-15 17:48:07 +02:00
Alan Woodward	f89fa421e2	Remove unnecessary IndexSearcher field on HitContext (#62378 ) FastVectorHighlighter uses the top-level reader to rewrite queries against, which it gets via an IndexSearcher field on HitContext. However, we can already access this top-level reader via HitContext's existing LeafReaderContext field. This commit removes the unnecessary field and constructor parameter, and changes the implementation of topLevelReader to go via ReaderUtils and the leaf reader context.	2020-09-15 15:46:14 +01:00
Christoph Büscher	0ca9829867	Muting CoordinatorTests#testLogsMessagesIfPublicationDelayed	2020-09-15 15:40:51 +02:00
Albert Zaharovits	aeed1c05b0	Ensure authz operation overrides transient authz headers (#61621 ) AuthorizationService#authorize uses the thread context to carry the result of the authorisation as transient headers. The listener argument to the `authorize` method must necessarily observe the header values. This PR makes it so that the authorisation transient headers (`_indices_permissions` and `_authz_info`, but NOT `_originating_action_name`) of the child action override the ones of the parent action. Co-authored-by: Tim Vernum tim@adjective.org	2020-09-15 16:37:38 +03:00
Armin Braun	eae6a3b18e	Fix testMappingVersionAfterDynamicMappingUpdate (#62352 ) (#62360 ) There is a race in this test where the index request will return once the dynamic mapping update has been observed by the cluster state observer internally used by the indexing but not hit all state appliers and thus isn't showing up as the applied state returned by `clusterService.state()` yet.	2020-09-15 11:59:22 +02:00
Alan Woodward	a68f7077c7	Rationalise fetch phase exceptions (#62230 ) We have a special FetchPhaseExecutionException which contains some useful information about which shard and doc a fetch phase has failed in. However, this is not used in many places - currently only the ExplainPhase and the highlighters throw one, and the FetchPhase itself catches IOExceptions and just passes them to the ExceptionsHelper with no extra context. This commit changes FetchPhase to throw FetchPhaseExecutionException if it encounters problems in any of its subphases, and removes the special handling from the explain and highlight phases. It also removes the need to pass shard ids around when building HitContext objects.	2020-09-15 09:28:19 +01:00
Alan Woodward	8089210815	Some small cleanups in TermVectorsService (#62292 ) We removed the use of aggregated stats from term vectors back in #16452, but there is a bunch of dead code left here which can be stripped out.	2020-09-15 09:01:49 +01:00
Ignacio Vera	3536f7f7c2	Initialize BitArray storage as number of bits (#62327 ) (#62354 )	2020-09-15 08:34:22 +02:00
Armin Braun	c81a076f5a	Improve Efficiency of ClusterApplierService Iteration (#62282 ) (#62350 ) The complexity of removing a timeout listener was `O(n)`, which means that with many queued-up CS update tasks (such as an avalanche of dynamic mapping updates) we're dealing with quadratic complexity for timing out N tasks, which was observed to be an issue in practice. This PR makes the complexity of timing out a task `O(1)` and generally simplifies the iteration logic of listeners and appliers to be a little more efficient and inline better.	2020-09-15 05:59:48 +02:00
Julie Tibshirani	f56ce4f39b	Fix failure in InnerHitBuilderTests around 'fields' option. (#62344 ) The case InnerHitBuilderTests#testEqualsAndHashcode creates a copy of the object by serializing + deserializing it, then applies a modification. If the 'fields' list is empty, then deserializing it results in Collections.emptyList. Because this is immutable, then modifying it can throw an UnsupportedOperationException. This PR takes the same approach as for docvalue_fields, where we create a new list instead of trying to add to an empty one.	2020-09-14 15:39:03 -07:00
Julie Tibshirani	4a19bdb2ea	Support the 'fields' option in inner_hits and top_hits. (#62337 ) This PR adds support for the 'fields' option in the following places: * Anytime `inner_hits` is used, for both fetching nested/ child docs and field collapsing * The `top_hits` aggregation Addresses #61949.	2020-09-14 11:51:45 -07:00
David Turner	9acd2fd1fd	Minor cleanups to BytesReferenceStreamInput (#62302 ) Followup to #61681: - reuse the current iterator in `reset()` if possible - simplify some integer-overflow-avoidance in `skip()` - clarify some comments - address some IntelliJ warnings	2020-09-14 17:02:27 +01:00
Christoph Büscher	e2eada2498	Fix disabling `allow_leading_wildcard` (#62300 ) (#62318 ) Disabling the `query_string` query's `allow_leading_wildcard` parameter didn't work after a change probably introduced in #60959, because the various field types' `wildcardQuery` methods don't check the leading characters like QueryParserBase#getWildcardQuery does. This PR adds the missing check before calling the field type's wildcard-generating method. Closes #62267	2020-09-14 17:13:17 +02:00
Alan Woodward	5358cee29c	Cut over more mapping tests to MapperServiceTestCase (#62312 ) Shaves a few more seconds off the build.	2020-09-14 16:00:37 +01:00
Armin Braun	95766da345	Save Some Allocations when Working with ClusterState (#62060 ) (#62303 ) Just a number of obvious spots where we were allocating duplicate empty structures or were otherwise inefficient, which I found while investigating snapshot cluster state update performance.	2020-09-14 15:09:54 +02:00
Armin Braun	875af1c976	Remove Dead Variable in BlobStoreIndexShardSnapshots. (#62285 ) (#62295 ) This was never used. Co-authored-by: Howard <danielhuang@tencent.com>	2020-09-14 13:40:39 +02:00
Luca Cavanna	53bf057a53	[TEST] avoid double null check in TransportSearchActionTests	2020-09-11 10:10:09 +02:00
Nhat Nguyen	aafb2cb812	Support point in time cross cluster search (#61827 ) This commit integrates point in time into cross cluster search. Relates #61062 Closes #61790	2020-09-10 19:25:48 -04:00
Nhat Nguyen	808c8689ac	Always include the matching node when resolving point in time (#61658 ) If shards are relocated to new nodes, then searches with a point in time will fail, although a pit keeps search contexts open. This commit solves this problem by reducing info used by SearchShardIterator and always including the matching nodes when resolving a point in time. Closes #61627	2020-09-10 19:25:48 -04:00
Nhat Nguyen	035f0638f4	Support point in time in async_search (#61560 ) This commit integrates point in time into async search and ensures that it works correctly with security enabled. Relates #61062	2020-09-10 19:25:48 -04:00
Nhat Nguyen	063a6d047c	Release search context when scroll keep_alive is too large (#62179 ) Previously, we closed related search contexts if the keep_alive of a scroll was too large. But we accidentally changed this behavior in #62061.	2020-09-10 19:25:48 -04:00
Nhat Nguyen	2eb1e8bc84	Make keep alive of point in time optional in search (#62184 ) A search request should not be required to extend the keep_alive of a point in time. This change makes that parameter optional.	2020-09-10 19:25:48 -04:00
Jim Ferenczi	3fc35aa76e	Shard Search Scroll failures consistency (#62061 ) Today some uncaught shard failures such as RejectedExecutionException skip the release of the shard context and let subsequent scroll requests access the same shard context again. Depending on how the other shards advanced, this behavior can lead to missing data since scrolls always move forward. In order to avoid hidden data loss, this commit ensures that we always release the context of shard search scroll requests whenever a failure occurs locally. The shard search context will no longer exist in subsequent scroll requests, which will lead to consistent shard failures in the responses. This change also modifies the retry tests of the reindex feature. Reindex retries scroll search requests that contain a shard failure and moves on whenever the failure disappears. That is not compatible with how scrolls work and can lead to missing data as explained above. That means that reindex will now report scroll failures when search rejections happen during the operation instead of skipping documents silently. Finally this change removes an old TODO that was fulfilled with #61062.	2020-09-10 19:25:48 -04:00
Jim Ferenczi	4d528e91a1	Ensure validation of the reader context is executed first (#61831 ) This change makes sure that the reader context is validated (`SearchOperationListener#validateReaderContext`) before any other operation and that it is correctly recycled or removed at the end of the operation. This commit also fixes a race condition bug that would allocate the security reader for scrolls more than once. Relates #61446 Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>	2020-09-10 19:25:48 -04:00

5757 Commits