OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nik Everett	97c06816a4	Fix an optimization in terms agg (backport #57438 ) (#57547 ) When the `terms` agg runs against strings and uses global ordinals it has an optimization when it collects segments that only ever have a single value for the particular string. This is very common. But I broke it in #57241. This fixes that optimization and adds `debug` information that you can use to see how often we collect segments of each type. And adds a test to make sure that I don't break the optimization again. We also had a specialization for when there isn't a filter on the terms to aggregate. I had removed that specialization in #57241 which resulted in some slow down as well. This adds it back but in a more clear way. And, hopefully, a way that is marginally faster when there is a filter. Closes #57407	2020-06-02 14:57:45 -04:00
Mark Tozzi	e50f514092	IndexFieldData should hold the ValuesSourceType (#57373 ) (#57532 )	2020-06-02 12:16:53 -04:00
Armin Braun	ba2d70d8eb	Serialize Outbound Messages on IO Threads (#56961 ) (#57080 ) Almost every outbound message is serialized to buffers of 16k pagesize. We were serializing these messages off the IO loop (and retaining the concrete message instance as well) and would then enqueue it on the IO loop to be dealt with as soon as the channel is ready. 1. This would cause buffers to be held onto for longer than necessary, causing less reuse on average. 2. If a channel was slow for some reason, not only would concrete message instances queue up for it, but also 16k of buffers would be reserved for each message until it would be written+flushed physically. With this change, the serialization happens on the event loop which effectively limits the number of buffers that `N` IO-threads will ever use so long as messages are small and channels writable. Also, this change dereferences the reference to the concrete outbound message as soon as it has been serialized to save some more on GC. This reduces the GC time for a default PMC run by about 50% in experiments (3 nodes, 2G heap each, loopback ... obvious caveat is that GC isn't that heavy in the first place with recent changes but still a measurable gain). I also expect it to be helpful for master node stability by causing less of a spike if master is e.g. hit by a large number of requests that are processed batched (e.g. shard snapshot status updates) and responded to in a short time frame all at once. Obviously, the downside to this change is that it introduces more latency on the IO loop for the serialization. But since we read all of these messages on the IO loop as well I don't see it as much of a qualitative change really and the more predictable buffer use seems much more valuable relatively.	2020-06-02 16:15:18 +02:00
Armin Braun	9bc9d01b84	Do not Block Snapshot Thread Pool Fully During Restore or Snapshot (#57360 ) (#57511 ) Allow for a fairer distribution of snapshot and restore operations to enable parallel snapshots and improve behaviour for parallel snapshot + restore. Closes #55803	2020-06-02 11:45:55 +02:00
Ryan Ernst	7aad4f6470	Store parsed mapping settings in IndexSettings (#57492 ) There are several mapping settings that are currently re-parsed every time they are read. This can be quite frequent, for example within every document ingestion. This commit moves the parsed versions of these mapping settings to be stored in IndexSettings, just as other index settings are already. closes #57395	2020-06-01 16:45:36 -07:00
Mark Tozzi	1f500583b1	Clean up Aggregator Supplier Boiler Plate (#57442 ) (#57452 )	2020-06-01 14:21:07 -04:00
Nik Everett	c6c0b1a968	Optimize `routingNodes` variable in AddIncrementallyTests (#57140 ) (#57447 ) The `routingNodes` variable is unused. Replace `clusterState.getRoutingNodes()` with `routingNodes`. Co-authored-by: Boice Huang <boicehuang@tencent.com>	2020-06-01 14:13:45 -04:00
Zachary Tong	daaf5a3dcc	Fix assertion catching in aggregation supported type test (#56466 ) (#57382 ) At some point, we changed the supported-type test to also catch assertion errors. This has the side effect of also catching the `fail()` call inside the try-catch, which silently smothered some failures. This modifies the test to throw at the end of the try-catch block to prevent from accidentally catching itself. Catching the AssertionError is convenient because there are other locations that do throw an assertion in tests (due to hitting an assertion before the exception is thrown) so I think we should keep it around. Also includes a variety of fixes to other tests which were failing but being silently smothered.	2020-06-01 12:10:05 -04:00
Armin Braun	59570eaa7d	Fix Local Translog Recovery not Updating Safe Commit in Edge Case (#57350 ) (#57380 ) In case the local checkpoint in the latest commit is less than the last processed local checkpoint we would recover 0 ops and hence not commit again. This would lead to the logic in `IndexShard#recoverLocallyUpToGlobalCheckpoint` not seeing the latest local checkpoint when it reloads the safe commit from the store and thus cause inefficient recoveries because the recoveries would work from a lower than possible local checkpoint. Closes #57010	2020-05-30 09:28:50 +02:00
Nik Everett	d6a3704932	Fold some of sig_terms into terms (backport of #57361 ) (#57386 ) This merges the global-ordinals-based implementation for `significant_terms` into the global-ordinals-based implementation of `terms`, removing a bunch of copy and pasted code that is subtly different across the two implementations and replacing it with an explicit `ResultStrategy` with nice stuff like Javadoc. The actual behavior is mostly unchanged, though I was able to remove a redundant copy of bytes representing the string from the result construction phase of `significant_terms`. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-05-29 22:51:11 -04:00
Nik Everett	f52e779806	Fix casting of scaled_float in sorts (#57207 ) (#57385 ) Previously we'd get a `ClassCastException` when you tried to use `numeric_type` on `scaled_float`. Oops! This cleans up the CCE and moves some code around so the casting actually works.	2020-05-29 18:06:04 -04:00
Nik Everett	d5e86d7c4d	Small cleanups for terms aggregator (#57315 ) (#57381 ) This includes a few small cleanups for the `TermsAggregatorFactory`: 1. Removes an unused `DeprecationLogger` 2. Moves the members to right above the ctor. 3. Merges all of the heuristics for picking `SubAggCollectionMode` into a single method.	2020-05-29 16:59:35 -04:00
Nik Everett	4263c25b2f	Save memory when histogram agg is not on top (backport of #57277 ) (#57377 ) This saves some memory when the `histogram` aggregation is not a top level aggregation by dropping `asMultiBucketAggregator` in favor of natively implementing multi-bucket storage in the aggregator. For the most part this just uses the `LongKeyedBucketOrds` that we built the first time we did this.	2020-05-29 15:07:37 -04:00
Benjamin Trent	15aba60c02	[7.x] Add new circuitbreaker plugin and refactor CircuitBreakerService (#55695 ) (#57359 ) * Add new circuitbreaker plugin and refactor CircuitBreakerService (#55695) This commit lays the ground work for plugins supplying their own circuit breakers. It adds a new interface: `CircuitBreakerPlugin`. This interface provides methods for providing custom child CircuitBreaker objects. There are also facilities for allowing dynamic settings for the custom breakers. With the refactor, circuit breakers are no longer replaced on setting changes. Instead, the two mutable settings themselves are `volatile`. Plugins that want to use their custom circuit breaker should keep a reference of their constructed breaker.	2020-05-29 12:13:46 -04:00
Mayya Sharipova	aebb78bf5c	Run sort optimization when from+size>0 (#57250 )	2020-05-29 11:30:35 -04:00
Armin Braun	e4fd78f866	Remove Overly Strict Safety Mechanism in Shard Snapshot Logic (#57227 ) (#57362 ) Unfortunately, we cannot have a safety mechanism like this where we throw whenever we find unreadable data in a shard. This breaks in the case of an older ES version (without shard generations enabled) having failed to snapshot a shard snapshot after writing some data to its path and having finalized it for example. Another example of where we can't support this check is the test I added, if we snapshot an index with a name that already exists in the repository and more shards than the existing index, fail doing that and then retry snapshotting it we will also see unexpected data in the path. We could technically do deeper inspections on the unexpected data but I don't think it's worth it really. In the end if we are unable to read the data here it's broken anyway. By moving to a new `index-` blob in the shard directory I don't see us ever corrupting existing data and since we (by virtue of moving to an empty generation) won't do any incremental work on top of potentially corrupt data we also do not risk creating broken snapshots going forward. => Just logging a warning in this very unlikely case is the best we can do I think	2020-05-29 16:41:57 +02:00
Dan Hermann	6b0d707671	[7.x] Do not report negative values for swap sizes (#57353 )	2020-05-29 08:11:47 -05:00
Henning Andersen	8427d677e9	Reindex and friends fail nicely when max_docs < slices (#54901 ) (#57348 ) When the parameter `max_docs` is less than `slices` in update_by_query, delete_by_query or reindex API, `max_docs` is set to 0 and we throw an action_request_validation_exception with a confusing error message: "maxDocs should be greater than 0...". This change checks whether `max_docs` is less than `slices` and throws an illegal_argument_exception with a clear message. Relates to #52786. Co-authored-by: bellengao <gbl_long@163.com>	2020-05-29 14:30:14 +02:00
Martijn van Groningen	04ef39da77	Change cluster info actions to be able to resolve data streams. (#57343 ) Backport of #56878 to 7.x branch. With this change the following APIs will be able to resolve data streams: get index, get mappings and ilm explain APIs. Relates to #53100	2020-05-29 12:17:53 +02:00
Ignacio Vera	75868ea915	Catch InputCoercionException thrown by Jackson parser (#57287 ) (#57330 ) Jackson 2.10 library has added a new type of error that is thrown when a numeric value is out of range. This error should be caught and handled properly in case the flag ignore_malformed has been set to true.	2020-05-29 09:47:47 +02:00
Nik Everett	b9fe10866e	Make global ords terms simpler to understand (backport of #57241 ) (#57311 ) When the `terms` enum operates on non-numeric data it can collect it via global ordinals. It actually has two separate collection strategies, one "dense" and one "remapping". Each of those strategies has two "iteration" strategies that it uses to build buckets, depending on whether or not we need buckets with `0` docs in them. Previously this was done with several `null` checks and never really explained. This change replaces those checks with two `CollectionStrategy` classes which have good stuff like documentation.	2020-05-28 16:52:35 -04:00
Julie Tibshirani	10e1dc199d	Revert "Remove unused logic from FieldNamesFieldMapper. (#56834 )" This reverts commit `343fb699a4`.	2020-05-28 10:54:10 -07:00
Martijn van Groningen	225ccd1cfa	Ensure template exists when creating data stream (#57275 ) Backporting #56888 to 7.x branch. Limit the creation of data streams only for namespaces that have a composable template with a data stream definition. This way we ensure that mappings/settings have been specified and will be used at data stream creation and data stream rollover. Also remove `timestamp_field` parameter from create data stream request and let the create data stream api resolve the timestamp field from the data stream definition snippet inside a composable template. Relates to #53100	2020-05-28 15:08:25 +02:00
Nhat Nguyen	5b08eaf90c	Fix trimUnsafeCommits for indices created before 6.2 (#57187 ) If an upgraded node is restarted multiple times without flushing a new index commit, then we will wrongly exclude all commits from the starting commits. This bug is reproducible with these minimal steps: (1) create an empty index on 6.1.4 with translog retention disabled, (2) upgrade the cluster to 7.7.0, (3) restart the upgraded cluster. The problem is that the new translog policy can trim the translog without having a new index commit, while the existing commit still refers to the previous translog generation. Closes #57091	2020-05-27 15:08:49 -04:00
Lee Hinman	c0f732b9f6	[7.x] Rename template V2 classes to ComposableTemplate (#57183 ) (#57232 ) Backports the following commits to 7.x: Rename template V2 classes to ComposableTemplate (#57183)	2020-05-27 11:01:59 -06:00
Nik Everett	4d5be7c817	Save memory on numeric sig terms when not top (backport of #56789 ) (#57221 ) This saves memory when running numeric significant terms which are not at the top level by merging its collection into numeric terms and relying on the optimization that we made in #55873.	2020-05-27 12:03:28 -04:00
Przemyslaw Gomulka	0e34b2f42e	SlowLoggers using single logger (#56708 ) Slow loggers should use single shared logger as otherwise when index is deleted the log4j logger will remain reachable (log4j is caching) and will create a memory leak. closes https://github.com/elastic/elasticsearch/issues/56171	2020-05-27 16:38:31 +02:00
Alan Woodward	d6b79bcd95	Remove Mapper.updateFieldType() (#57151 ) When we had multiple mapping types, an update to a field in one type had to be propagated to the same field in all other types. This was done using the Mapper.updateFieldType() method, called at the end of a merge. However, now that we only have a single type per index, this method is unnecessary and can be removed. Relates to #41059 Backport of #56986	2020-05-27 09:21:24 +01:00
Julie Tibshirani	343fb699a4	Remove unused logic from FieldNamesFieldMapper. (#56834 ) This logic is no longer used, now that each field mapper handles adding the `_field_names` fields.	2020-05-26 17:40:36 -07:00
Nik Everett	0fce2b7713	Fix DateHistogramAggregatorTests.testAsSubAgg Closes #57168 by using `AggregatorTestCase#newIndexSearcher` in the `AggregatorTestCase#testCase`. Without that global ordinals will sometimes fail to work.	2020-05-26 15:05:31 -04:00
Mark Vieira	92e127e90d	Mute DateHistogramAggregatorTests.testAsSubAgg (cherry picked from commit 4d050a7a6438a7d102eeef9e03a7d79565bddab7)	2020-05-26 10:57:22 -07:00
Christoph Büscher	56625e35b7	Fix `bool` query behaviour on null value (#56817 ) Until 7.7 we used to ignore `null` values for the `bool` query's `minimum_should_match` parameter and also for the `must`, `must_not`, `should` and `filter` clauses. An internal refactoring has changed this so now we get a parsing error. While `null` should not be a common value here, we should restore the old behaviour for bwc for now. Closes #56812	2020-05-26 16:23:40 +02:00
Armin Braun	184338ed61	Fix Snapshot Javadoc Issues (#57083 ) (#57122 ) Fixing some incorrect JavaDoc and a typo. Co-authored-by: jinwook han <jin942002@naver.com>	2020-05-25 18:05:01 +02:00
Dan Hermann	c5f61fe24c	Handle exceptions when building _cat/indices response	2020-05-25 09:59:24 -05:00
Armin Braun	dde75b0f64	Fix Confusing Exception on Shard Snapshot Abort (#57116 ) (#57117 ) If a partial snapshot has some of its shards aborted because an index got deleted, this can lead to confusing `IllegalStateExceptions` when trying to increment the ref count of the already closed `Store`. Refactored this a little to throw the same exception for aborted shards no matter the timing of the store close and assert that the concurrent store close can in fact only happen when the shard snapshot has already been aborted.	2020-05-25 16:50:11 +02:00
Nhat Nguyen	4511611802	Fix testTrackingChannelTask (#57061 ) A task might not be canceled on disconnection if it is completed before the cancellation is started. We need to relax the assertion in this test. Closes #56746	2020-05-25 09:53:50 -04:00
Armin Braun	5569137ae3	Flatten ReleaseableBytesReference Object Trees (#57092 ) (#57109 ) When slicing a releasable bytes reference we would create a new counter every time and pass the original reference chain to the new slice on every slice invocation. This would lead to extremely deep reference chains and needlessly uses a dedicated counter for every slice when all the slices eventually just refer to the same underlying bytes and `Releasable`. This commit tracks the ref count wrapper with its releasable in a separate object that can be passed around on every slicing, making the slices' tree as flat as the original releasable bytes reference. Also, we were needlessly creating a redundant releasable bytes reference from a releasable bytes-stream-output that we never actually used for releasing (all code that uses it just releases the stream itself instead).	2020-05-25 13:00:37 +02:00
Armin Braun	56401d3f66	Release HTTP Request Body Earlier (#57094 ) (#57110 ) We don't need to hold on to the request body past the beginning of sending the response. There is no need to keep a reference to it until after the response has been sent fully and we can eagerly release it here. Note, this can be optimized further to release the contents even earlier but for now this is an easy increment to saving some memory on the IO pool.	2020-05-25 13:00:19 +02:00
Armin Braun	9fa60f7367	Add History UUID Index Setting (#56930 ) (#57104 ) Prerequisite for #50278 to be able to uniquely identify index metadata by its version fields and UUIDs when restoring into closed indices.	2020-05-25 11:26:03 +02:00
Armin Braun	05c019585e	Fix Test Failure from Incorrect Mapping Conflict Assertion (#57085 ) (#57088 ) I think this is a left-over from #56915 where a change in assertion message didn't make it to this very rare-case assertion.	2020-05-24 09:16:28 +02:00
Nhat Nguyen	d8165a3439	Turn off translog retention only when shard started (#57063 ) We should only turn off the translog retention when a shard is started; otherwise, we can issue unnecessary warn logs.	2020-05-22 09:05:05 -04:00
Jack Conradson	35c546b388	Backports for _source bug fix in scripting (#57068 ) * Update DeprecationMap to DynamicMap (#56149) This renames DeprecationMap to DynamicMap, and changes the deprecation messages Map to accept a Map of String (keys) to Functions (updated values) instead. This creates more flexibility in either logging or updating values from params within a script. This change is required to fix (#52103) in a future PR. * Fix Source Return Bug in Scripting (#56831) This change ensures that when a user returns _source directly no matter where accessed within scripting, the value is a Map of the converted source as opposed to a SourceLookup.	2020-05-21 17:07:38 -07:00
Mayya Sharipova	4cf49bc05e	Don't run sort optimization on size=0 (#57044 ) Sort optimization creates TopFieldCollector that errors when size=0. This ensures that sort optimization is not run when size=0. Closes #56923	2020-05-21 14:52:28 -04:00
Andrei Dan	9af31109fa	Change "apply create index" log level to DEBUG (#56947 ) (#57028 ) These log statements are also logged by every "simulate adding this index" functionality. One of them is the rollover action in ILM which executes rollover dry-runs until the conditions are met, when the actual rollover is executed. This changes the statements log level to DEBUG and changes the phrasing from V1/V2 to legacy/composable templates. (cherry picked from commit 7cc8e1fe7f9731213ac4869fe99853564fbaaba9) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-05-21 12:14:53 +01:00
markharwood	eb8cb31d46	Update Lucene version to 8.6.0-snapshot-9d6c738ffce (#57024 ) Same version as master	2020-05-21 11:28:16 +01:00
David Turner	1dabdd9a20	Close channel on handshake error with old version (#56989 ) (#57025 ) Today a transport response uses the same wire format version as the corresponding request. This mostly works since we mostly know we are communicating with a node with a compatible version. TCP handshakes don't have this guarantee since they use `Version.CURRENT.minimumCompatibilityVersion()` to let us handshake with older nodes. This results in the strange situation of a node of major version `N` responding to a node of major version `N-1` using a wire format of version `N-2`. We put extra effort into the longer BWC requirements for successful responses, but we do not offer the same guarantees for error responses since they may be rather complicated to serialize. This can result in the request sender misinterpreting the response which may have unpredictable consequences. Rather than strengthening the guarantees in this area, this commit simply logs the exception and closes the connection on a handshake error with a node that uses an incompatible wire format. Closes #54337	2020-05-21 11:00:22 +01:00
David Turner	99f7115f22	Revert "Close channel on handshake error with old version (#56989 )" This reverts commit `c81a189da9`.	2020-05-21 09:43:28 +01:00
David Turner	c81a189da9	Close channel on handshake error with old version (#56989 ) Today a transport response uses the same wire format version as the corresponding request. This mostly works since we mostly know we are communicating with a node with a compatible version. TCP handshakes don't have this guarantee since they use `Version.CURRENT.minimumCompatibilityVersion()` to let us handshake with older nodes. This results in the strange situation of a node of major version `N` responding to a node of major version `N-1` using a wire format of version `N-2`. We put extra effort into the longer BWC requirements for successful responses, but we do not offer the same guarantees for error responses since they may be rather complicated to serialize. This can result in the request sender misinterpreting the response which may have unpredictable consequences. Rather than strengthening the guarantees in this area, this commit simply logs the exception and closes the connection on a handshake error with a node that uses an incompatible wire format. Closes #54337	2020-05-21 09:00:08 +01:00
Julie Tibshirani	fb000d6cf4	Simplify range query methods for range types. (#56976 ) For me this is easier to follow. It also avoids parsing the query bounds twice.	2020-05-20 12:02:27 -07:00
Jason Tedor	e690c5a68e	Fix some licenses in our own code (#56978 ) All of these files were written by us, and not sourced from anywhere. Therefore, the license head should be granting licenses to Elasticsearch, rather than to the ASF. This commit addresses them by changing the license to our standard Apache 2.0 license header.	2020-05-20 09:24:31 -04:00
Alan Woodward	18bfbeda29	Move merge compatibility logic from MappedFieldType to FieldMapper (#56915 ) Merging logic is currently split between FieldMapper, with its merge() method, and MappedFieldType, which checks for merging compatibility. The compatibility checks are called from a third class, MappingMergeValidator. This makes it difficult to reason about what is or is not compatible in updates, and even what is in fact updateable - we have a number of tests that check compatibility on changes in mapping configuration that are not in fact possible. This commit refactors the compatibility logic so that it all sits on FieldMapper, and makes it called at merge time. It adds a new FieldMapperTestCase base class that FieldMapper tests can extend, and moves the compatibility testing machinery from FieldTypeTestCase to here. Relates to #56814	2020-05-20 09:43:13 +01:00
Nik Everett	8b9c4eb3e0	Save memory when date_histogram is not on top (#56921 ) (#56960 ) When `date_histogram` is a sub-aggregator it used to allocate a bunch of objects for every one of its parent's buckets. This uses the data structures that we built in #55873 to rework the `date_histogram` aggregator instead of all of the allocation. Part of #56487	2020-05-19 17:36:55 -04:00
Lee Hinman	e208925465	[7.x] Add template simulation API for simulating template composition (#56842 ) (#56924 )	2020-05-19 08:12:21 -06:00
Tim Brooks	57c3a61535	Create HttpRequest earlier in pipeline (#56393 ) Elasticsearch requires that a HttpRequest abstraction be implemented by http modules before server processing. This abstraction controls when underlying resources are released. This commit moves this abstraction to be created immediately after content aggregation. This change will enable follow-up work including moving Cors logic into the server package and tracking bytes as they are aggregated from the network level.	2020-05-18 14:54:01 -06:00
Armin Braun	46e5c37267	Remove Dead Conditional from RoutingTable (#56870 ) (#56914 ) `delta` is always positive here. Co-authored-by: Howard <danielhuang@tencent.com>	2020-05-18 17:18:26 +02:00
David Turner	9ba897fbd6	Random iterations in testDataOnlyNodePersistence (#56906 ) PR #56893 was supposed to randomise the iteration count in `testDataOnlyNodePersistence` but this change was mistakenly omitted. This commit addresses this.	2020-05-18 15:16:22 +01:00
David Turner	64280b489b	Fix testDataOnlyNodePersistence (#56893 ) This test failed if all 1000 top-level `rarely()` calls in the loop returned `false`, because then we would never set the term of the persisted state. This commit fixes this by adding an earlier call to `persistedState#setCurrentTerm`. It also changes the test to clean up the threadpools it starts whether it passes or fails.	2020-05-18 13:57:36 +01:00
Armin Braun	e75a6f13a1	Stop Redundantly Serializing ShardId in BulkShardResponse (#56094 ) (#56866 ) When reading/writing the individual doc responses in the context of a bulk shard response there is no need to serialize the `ShardId` over and over. This can waste a lot of memory when handling large bulk requests.	2020-05-17 10:27:17 +02:00
Armin Braun	31f54c934e	Relax Assertion About SnapshotsService Listeners (#56608 ) (#56863 ) This assertion is too strict. A snapshot will be removed from the cluster state on the CS thread before it is removed from the listeners map on the snapshot thread pool. Throughout the removal from the cluster state and listener map, the snapshot is tracked in `endingSnapshots` though, so we can relax the assertion accordingly and are still able to catch leaked listeners. Closes #56607	2020-05-17 09:17:41 +02:00
Armin Braun	b9614558b9	Fix SnapshotStatusApisIT (#56859 ) (#56861 ) In the unlikely event that the data nodes started snapshotting the shards already (and hence got blocked on the data blobs) before the master has applied the cluster state to its own `SnapshotsService` on the CS applier thread, we can get a `SnapshotMissingException` here which breaks the busy assert loop so we have to deal with it explicitly. Closes #56858	2020-05-16 21:50:25 +02:00
Tim Brooks	195a5247d4	Prevent connection races in testEnsureWeReconnect (#56654 ) Currently it is possible that a sniff connection round is occurring as we enter another test loop in testEnsureWeReconnect. The problem is that once we enter another loop, closing the connection manually can cause this pre-existing connection round to fail. This round failing can fail the test. This commit fixes the issue by ensuring that there are no in-progress connections before entering another loop.	2020-05-15 14:58:46 -06:00
Nik Everett	f3e962707b	Mute TaskManagerTests#testTrackingChannelTask It fails sometimes. Tracked by #56746.	2020-05-15 16:48:33 -04:00
Nik Everett	7b626826eb	Fix sum test It was relying on the compensated sum working but the test framework was dodging it. This forces the accuracy tests to come from a single shard where we get the proper compensated sum. Closes #56757	2020-05-15 16:16:30 -04:00
Jason Tedor	da833d6cd3	Use settings infrastructure for shards and replicas (#56801 ) We get the number of shards and replicas with our bare hands in index metadata, rather than letting the settings infrastructure do the work for us. This commit switches to using the settings infrastructure.	2020-05-15 15:59:30 -04:00
David Turner	a3e845cbad	Suppress cluster UUID logs in 6.8/7.x upgrade (#56835 ) Today a 7.x node logs `cluster UUID set to [...]` on every cluster state update received from a 6.8 master, because 6.8 nodes are not able to commit the cluster UUID properly. We could try and deduplicate these logs somehow, but that would introduce a good deal of complexity. Instead, this commit suppresses these logs entirely when receiving cluster state updates from a 6.8 master.	2020-05-15 19:45:32 +01:00
Dan Hermann	66871c5342	[7.x] Rename endpoint from plural "_data_streams" to singular "_data_stream" (#56825 )	2020-05-15 10:27:53 -05:00
Alan Woodward	d33d13f2be	Simplify generics on Mapper.Builder (#56747 ) Mapper.Builder currently has some complex generics on it to allow fluent builder construction. However, the second parameter, a return type from the build() method, is unnecessary, as we can use covariant return types. This commit removes this second generic parameter.	2020-05-15 12:14:49 +01:00
Ryan Ernst	9fb80d3827	Move publishing configuration to a separate plugin (#56727 ) This is another part of the breakup of the massive BuildPlugin. This PR moves the code for configuring publications to a separate plugin. Most of the time these publications are jar files, but this also supports the zip publication we have for integ tests.	2020-05-14 20:23:07 -07:00
Tal Levy	5e90ff32f7	Add Normalize Pipeline Aggregation (#56399 ) (#56792 ) This aggregation will perform normalizations of metrics for a given series of data in the form of bucket values. The aggregation supports the following normalizations - rescale 0-1 - rescale 0-100 - percentage of sum - mean normalization - z-score normalization - softmax normalization To specify which normalization is to be used, it can be specified in the normalize agg's `normalizer` field. For example: ``` { "normalize": { "buckets_path": <>, "normalizer": "percent" } } ```	2020-05-14 17:40:15 -07:00
Lee Hinman	a73d7d9e2b	[7.x] Don't allow invalid template combinations (#56397 ) (#56795 ) Backports the following commits to 7.x: - Don't allow invalid template combinations (#56397)	2020-05-14 16:20:53 -06:00
Mark Tozzi	b718193a01	Clean up DocValuesIndexFieldData (#56372 ) (#56684 )	2020-05-14 12:42:37 -04:00
Nhat Nguyen	044ee380e8	Use ConcurrentSet in testTrackingChannelTask (#56775 ) We need to use a ConcurrentSet to track the canceled tasks as cancelTaskAndDescendants can be called concurrently. Closes #56746	2020-05-14 12:22:59 -04:00
David Turner	f0c2c25527	AwaitsFix for #56746 (and #56751 )	2020-05-14 12:46:32 +01:00
David Turner	63cc53e512	AwaitsFix for #56757	2020-05-14 12:00:15 +01:00
Martijn van Groningen	b87aeb09f7	Allow more apis to resolve data streams (#56743 ) Backporting #56683 to 7.x branch. Allow get settings, cluster state and field caps apis to resolve data streams.	2020-05-14 10:57:13 +02:00
Nhat Nguyen	ac432f6612	Reduce test load in TaskManagerTests	2020-05-13 23:52:48 -04:00
Nhat Nguyen	566b23c42c	Cancel task and descendants on channel disconnects (#56620 ) If a channel gets disconnected, then we should cancel the tasks associated with that channel as their results won't be retrieved. Closes #56327 Relates #56619 Backport of #56620	2020-05-13 22:09:58 -04:00
Jason Tedor	7c8860b7e6	Update number of replicas when removing setting (#56723 ) We previously rejected removing the number of replicas setting, which prevents users from reverting this setting to its default the natural way. To fix this, we put back the setting with the default value in the cases that the user is trying to remove it. Yet, we also need to do the work of updating the routing table and so on appropriately. This case was missed because when the setting is being removed, we were defaulting to -1 in this code path, which is treated as not being updated. Instead, we must treat the case when we are removing this setting as if the setting is being updated, too. This commit does that.	2020-05-13 20:13:25 -04:00
David Roberts	ab40466bfb	Prevent unexpected native controller output hanging the process (#56685 ) In normal operation native controllers are not expected to write anything to stdout or stderr. However, if due to an error or something unexpected with the environment a native controller does write something to stdout or stderr then it will block if nothing is reading that output. This change makes the stdout and stderr of native controllers reuse the same stdout and stderr as the Elasticsearch JVM (which are by default redirected to es.stdout.log and es.stderr.log) so that if something unexpected is written to native controller output then: 1. The native controller process does not block, waiting for something to read the output 2. We can see what the output was, making it easier to debug obscure environmental problems Backport of #56491 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-05-13 22:57:00 +01:00
Nik Everett	b98b260048	Merge significant_terms into the terms package (backport of #56699 ) (#56715 ) This merges the code for the `significant_terms` agg into the package for the code for the `terms` agg. They are super entangled already, this mostly just admits that to ourselves. Precondition for the terms work in #56487	2020-05-13 17:36:21 -04:00
Luca Cavanna	34410814b9	Don't omit empty arrays when filtering _source (#56527 ) When using source filtering exclusions, empty arrays are not preserved in documents, and no empty arrays are returned if arrays are empty after applying exclusions. We have special treatment to make sure that we preserve empty objects, but the behaviour for arrays is different. It looks like this regression was introduced by #22593, shortly after we refactored source filtering to use automata (#20736). Note that this change affects what the search API returns when using source exclusions, as well as what gets indexed when using source exclusions for the _source field. Closes #23796	2020-05-13 23:24:21 +02:00
Nik Everett	126619ae3c	Add list of deferred aggregations to the profiler (backport of #56208 ) (#56682 ) This adds a few things to the `breakdown` of the profiler: * `histogram` aggregations now contain `total_buckets` which is the count of buckets that they collected. This could be useful when debugging a histogram inside of another bucketing agg that is fairly selective. * All bucketing aggs that can delay their sub-aggregations will now add a list of delayed sub-aggregations. This is useful because we sometimes have fairly involved logic around which sub-aggregations get delayed and this will save you from having to guess. * Aggregations wrapped in the `MultiBucketAggregatorWrapper` can't accurately add anything to the breakdown. Instead the wrapper adds a marker entry `"multi_bucket_aggregator_wrapper": true` so we can quickly pick out such aggregations when debugging. It also fixes a bug where `_count` breakdown entries were contributing to the overall `time_in_nanos`. They didn't add a large amount of time so it is unlikely that this caused a big problem, but I was there. To support the arbitrary breakdown data this reworks the profiler so that the `breakdown` can contain any data that is supported by `StreamOutput#writeGenericValue(Object)` and `XContentBuilder#value(Object)`.	2020-05-13 16:33:22 -04:00
Julie Tibshirani	1ad83c37c4	Use index sort range query when possible. (#56710 ) This PR proposes to use `IndexSortSortedNumericDocValuesRangeQuery` when possible to speed up certain range queries. Points-based queries are already very efficient, the only time this query makes a difference is when the range matches a large number of documents. Relates to #48665.	2020-05-13 13:24:45 -07:00
Jason Tedor	5ca2ea2dde	Allow removing replicas setting on closed indices (#56680 ) This is similar to a previous change that allowed removing the number of replicas settings (so setting it to its default) on open indices. This commit allows the same for closed indices. It is unfortunate that we have separate branches for handling open and closed indices here, but I do not see a clean way to merge these two together without making a rather unnatural method (note that they invoke different methods for doing the settings updates). For now, we leave this as-is even though it led to the miss here.	2020-05-13 15:56:58 -04:00
Mark Vieira	e3be18a443	Add version 6.8.10	2020-05-13 11:27:40 -07:00
Bogdan Pintea	2f0663c490	Add the 7.7.1 Version Add the bumped 7.7 branch new version, 7.7.1	2020-05-13 18:46:07 +02:00
Ignacio Vera	b4521d5183	upgrade to Lucene 8.6.0 snapshot (#56661 )	2020-05-13 14:25:16 +02:00
Jason Tedor	4394235c63	Allow removing index.number_of_replicas setting (#56656 ) Today a user can create an index without setting the index.number_of_replicas setting even though the index metadata requires that the setting has a value. We do this when creating an index by explicitly setting index.number_of_replicas to a default value if one is not provided. However, if a user updates the number of replicas, and then wants to return to the default value, they are naturally inclined to try setting this setting to null, as the agreed upon way to return a setting to its default. Since the index metadata requires that this setting has a non-null value, we blow up when a user attempts to make this change. This is because we are not taking the same action when updating a setting on an index that we take when creating an index. Namely, we are not explicitly setting index.number_of_replicas if the request does not carry a value for this setting. This would happen when nulling the setting, which we want to support. This commit addresses this by setting index.number_of_replicas to the default if the value for this setting is null when updating the settings for an index.	2020-05-13 06:25:43 -04:00
Christoph Büscher	73b64908b2	Fix `time_zone` on `query_string` and date fields (#55881 ) (#56668 ) Currently the `time_zone` parameter in `query_string` queries gets applied correctly only when using the range syntax, e.g. `date:[2020-01-02 TO 2020-01-05]`. When a date field gets searched without explicit range syntax, e.g. `date:"2020-01-01"`, we internally create a range query that uses the specified date as start date and rounds up to the next underspecified units for the end date (e.g. here 2020-01-01T23:59:59) without considering the `time_zone` settings. This change adds a check in QueryStringQueryParser to detect this scenario early where we have access to the time zone information and directly create a range query using it. Closes #55813	2020-05-13 11:20:25 +02:00
Henning Andersen	48a8c7eb88	Ensure search contexts are removed on index delete (#56335 ) (#56617 ) In a race condition, a search context could remain enlisted in SearchService when an index is deleted, potentially causing the index folder to not be cleaned up (for either lengthy searches or scrolls with timeouts > 30 minutes or if the scroll is kept active).	2020-05-13 09:41:02 +02:00
Jake Landis	a56fb6192e	[7.x] Fix ingest simulate verbose on failure with conditional (#56478 ) (#56635 ) If a conditional is added to a processor, and that processor fails, and that processor has an on_failure handler, the full trace of all of the executed processors may not be displayed in simulate verbose. The information is correct, but misses displaying some of the steps used to get there. This happens because a conditional processor is a wrapper around the real processor and a processor with an on_failure handler is also a wrapper around the processor(s). When decorating for simulation we treat compound processor specially, but if a compound processor is wrapped by a conditional processor that compound processor's processors can be missed for decoration resulting in the missing displayed steps. The fix to this is to treat the conditional processor specially and explicitly separate it from the processor it is wrapping. This requires us to keep track of 2 processors: a possible conditional processor and the actual processor it may be wrapping. related: #56004	2020-05-12 15:41:05 -05:00
Armin Braun	0a879b95d1	Save Bounds Checks in BytesReference (#56577 ) (#56621 ) Two spots that allow for some optimization: * We are often creating a composite reference of just a single item in the transport layer => special cased via static constructor to make sure we never do that * Also removed the pointless case of an empty composite bytes ref * `ByteBufferReference` is practically always created from a heap buffer these days so there is no point of dealing with all the bounds checks and extra references to sliced buffers from that and we can just use the underlying array directly	2020-05-12 20:33:45 +02:00
Jason Tedor	f7b8f0b2f4	Adjust warning for heap size bootstrap check (#56565 ) Today the heap size check warns the user about two issues why they might care about the heap size check: resize pauses, and if memory locking is enabled. Yet, we unconditionally make mention of the memory locking reason, even if memory locking is not enabled. This can confuse some users, so we adjust the warning about memory locking to only display if memory locking is enabled.	2020-05-12 14:31:21 -04:00
Martijn van Groningen	0c61bc63e4	Backport: auto create data streams using index templates v2 (#56596 ) Backport: #55377 This commit adds the ability to auto create data streams using index templates v2. Index templates (v2) now have a data_stream field that includes a timestamp field. If it is provided and an index name matches that template, then a data stream (plus first backing index) is auto created. Relates to #53100	2020-05-12 17:01:15 +02:00
Dan Hermann	dfdd7e4fce	Report used memory as zero when total memory cannot be obtained (#56412 )	2020-05-12 07:43:51 -05:00
Ignacio Vera	222ee721ec	Add moving percentiles pipeline aggregation (#55441 ) (#56575 ) Similar to what the moving function aggregation does, except merging windows of percentiles sketches together instead of cumulatively merging final metrics	2020-05-12 11:35:23 +02:00
Martijn van Groningen	7b1f978931	Move data stream test (#56505 ) (#56570 ) Move data stream resolvability test from IndicesOptionsIntegrationIT to DataStreamIT class. Whether a transport action supports data streams is no longer controlled via indices options.	2020-05-12 10:44:13 +02:00
Armin Braun	2d08ef729c	Deduplicate Strings in REST Bulk Request Parsing (#56506 ) (#56568 ) We can save a little memory here since these strings might live for quite a while on the coordinating node.	2020-05-12 09:52:44 +02:00
Ryan Ernst	902fc546bd	Migrate remaining ESIntegTestCases to internalClusterTest (#56479 ) (#56563 ) This commit migrates the ESIntegTestCase tests in x-pack to the internalClusterTest source set.	2020-05-11 21:06:04 -07:00
Nik Everett	137df274ab	Add support for numeric range keys (#56452 ) (#56552 ) This adds support for parsing numbers as range keys. They get converted into a string, but we allow numbers. While I was there I replaced the parser for `Range` with a `ConstructingObjectParser` which will automatically add support for "did you mean" style corrections on errors. Closes #56402	2020-05-11 19:48:59 -04:00

4834 Commits