OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nik Everett	3b1dfa3b5d	Remove deprecated wrapper from scripted_metric (backport of #57627 ) (#57763 ) This removes the deprecated `asMultiBucketAggregator` wrapper from `scripted_metric`. Unlike most other such removals, this isn't likely to save much memory. But it does make the internals of the aggregator slightly less twisted. Relates to #56487	2020-06-05 16:14:28 -04:00
Martijn van Groningen	f170b52e64	Backing indices should use composable template matching with the corresponding data stream name (#57728 ) Backport of #57640 to 7.x branch. Composable templates with exact matches, can match with the data stream name, but not with the backing index name. Also if the backing index naming scheme changes, then a composable template may never match with a backing index. In that case mappings and settings may not get applied.	2020-06-05 18:38:22 +02:00
Dan Hermann	3fe93e24a6	[7.x] Prohibit closing the write index for a data stream (#57740 )	2020-06-05 11:14:43 -05:00
Jake Landis	459ab9a0b2	[7.x] Ensure type exists for all monitoring configuration (#57399 ) (#57704 ) #47711 and #47246 helped to validate that monitoring settings are rejected at the time they are set. Otherwise an invalid monitoring setting can find its way into the cluster state and result in an exception thrown [1] on the cluster state application (thereby causing significant issues). Some additional monitoring settings have been identified that can result in invalid cluster state that also result in exceptions thrown on cluster state application. All settings require a type of either http or local to be applicable. When a setting is changed, the exporters are automatically updated with the new settings. However, if the old or new settings lack a type setting an exception will be thrown (since exporters are always of type 'http' or 'local'). Arguably we shouldn't blindly create and destroy new exporters on each monitoring setting update, but the lifecycle of the exporters is a bit outside the scope of what this PR is trying to address. This commit introduces a similar methodology to check for validity as #47711 and #47246 but this time for ALL (including non-http) settings. Monitoring settings are not useful unless there is an exporter with a type defined. The type is used as a dependent setting, such that it must exist to set the value. This ensures that when any monitoring settings change, they can only get added to cluster state if the type exists. If the type exists (and the other validations pass) then the exporters will get re-built and the cluster state remains valid. Tests have been included to ensure that all dynamic monitoring settings have the type as dependent settings. [1] org.elasticsearch.common.settings.SettingsException: missing exporter type for [found-user-defined] exporter at org.elasticsearch.xpack.monitoring.exporter.Exporters.initExporters(Exporters.java:126) ~[?:?]	2020-06-05 10:47:11 -05:00
Tanguy Leroux	0e57528d5d	Remove more //NORELEASE (#57517 ) We agreed on removing the following //NORELEASE tags.	2020-06-05 15:34:06 +02:00
Gordon Brown	5a4e5a1e9d	Handle `cluster.max_shards_per_node` in YAML config (#57234 ) Prior to this commit, `cluster.max_shards_per_node` is not correctly handled when it is set via the YAML config file, only when it is set via the Cluster Settings API. This commit refactors how the limit is implemented, both to enable correctly handling the setting in the YAML and to more effectively centralize the logic used to enforce the limit. The logic used to apply the limit, as well as the setting value, has been moved to the new `ShardLimitValidator`.	2020-06-04 14:02:21 -06:00
Nik Everett	98c379c507	Merge remaining sig_terms into terms (#57397 ) (#57687 ) Merges the remaining implementation of `significant_terms` into `terms` so that we can more easily make them work properly without `asMultiBucketAggregator` which should save memory and speed them up. Relates #56487	2020-06-04 14:32:32 -04:00
Mark Vieira	9b0f5a1589	Include vendored code notices in distribution notice files (#57017 ) (#57569 ) (cherry picked from commit 627ef279fd29f8af63303bcaafd641aef0ffc586)	2020-06-04 10:34:24 -07:00
Armin Braun	80d1b12fa3	Restore ThreadContext after Serializing OutboundMessage (#57659 ) (#57681 ) Stash the current context before restoring the stored context on the IO thread so that its thread context does not get polluted. Closes #57554	2020-06-04 17:55:26 +02:00
David Turner	fc4dd6d681	Timeout health API on busy master (#57587 ) Today `GET _cluster/health?wait_for_events=...&timeout=...` will wait indefinitely for the master to process the pending cluster health task, ignoring the specified timeout. This could take a very long time if the master is overloaded. This commit fixes this by adding a timeout to the pending cluster health task.	2020-06-04 13:39:22 +01:00
William Brafford	7de6d97363	Version bump for 7.7.1 release (#57619 )	2020-06-03 16:38:25 -04:00
Igor Motov	8d7f389f3a	Increase search.max_buckets to 65,535 (#57042 ) Increases the default search.max_buckets limit to 65,535, and only counts buckets during reduce phase. Closes #51731	2020-06-03 15:35:41 -04:00
Julie Tibshirani	e0a15e8dc4	Remove the 'array value parser' marker interface. (#57571 ) (#57622 ) This PR replaces the marker interface with the method FieldMapper#parsesArrayValue. I find this cleaner and it will help with the fields retrieval work (#55363). The refactor also ensures that only field mappers can declare they parse array values. Previously other types like ObjectMapper could implement the marker interface and be passed array values, which doesn't make sense.	2020-06-03 11:30:14 -07:00
Nik Everett	7fd94f7d0f	Test: Protect auto_date_histo from 0 buckets The test for `auto_date_histogram` was trying to round `Long.MAX_VALUE` if there were 0 buckets. That doesn't work. Also, this replaces all of the class variables created to make consistent random results when testing `InternalAutoDateHistogram` with the newer `randomResultsToReduce` which is a little simpler to understand.	2020-06-03 12:51:22 -04:00
Christos Soulios	67abde326e	[7.x] Introduce v6.8.11 (#57600 )	2020-06-03 19:10:16 +03:00
Nhat Nguyen	5097071230	Increase timeout for GlobalCheckpointSyncIT (#57567 ) The test failed when it was running with 4 replicas and 3 indexing threads. The recovering replicas can prevent the global checkpoint from advancing. This commit increases the timeout to 60 seconds for this suite and the check for no inflight requests. Closes #57204	2020-06-03 08:50:02 -04:00
Nik Everett	2a27c411fb	Save memory when geo aggregations are not on top (#57483 ) (#57551 ) Saves memory when the `geotile_grid` and `geohash_grid` are not on the top level by using the `LongKeyedBucketOrds` we built in #55873.	2020-06-02 16:21:50 -04:00
Zachary Tong	79ac69cfa3	[7.x Backport] Prevent SigTerms/SigText from running on fields they do not support (#57485 ) SigTerms cannot run on fields that are not searchable, and SigText cannot run on fields that do not have analyzers. Both of these situations fail today with an esoteric exception, so this just formalizes the constraint by throwing an IllegalArgumentException up front. In practice, the only affected field seems to be the `binary` field, which is neither searchable nor has a default analyzer (e.g. even numeric and keyword fields have a default analyzer despite not being tokenized). Adds supported-type tests, and makes some changes to the test itself to allow testing sigtext (indexing _source). Also a few tweaks to the test to avoid bad randomization (negative numbers, etc).	2020-06-02 16:03:37 -04:00
Nik Everett	97c06816a4	Fix an optimization in terms agg (backport #57438 ) (#57547 ) When the `terms` agg runs against strings and uses global ordinals it has an optimization when it collects segments that only ever have a single value for the particular string. This is very common. But I broke it in #57241. This fixes that optimization and adds `debug` information that you can use to see how often we collect segments of each type. And adds a test to make sure that I don't break the optimization again. We also had a specialization for when there isn't a filter on the terms to aggregate. I had removed that specialization in #57241 which resulted in some slow down as well. This adds it back but in a more clear way. And, hopefully, a way that is marginally faster when there is a filter. Closes #57407	2020-06-02 14:57:45 -04:00
Mark Tozzi	e50f514092	IndexFieldData should hold the ValuesSourceType (#57373 ) (#57532 )	2020-06-02 12:16:53 -04:00
Armin Braun	ba2d70d8eb	Serialize Outbound Messages on IO Threads (#56961 ) (#57080 ) Almost every outbound message is serialized to buffers of 16k pagesize. We were serializing these messages off the IO loop (and retaining the concrete message instance as well) and would then enqueue it on the IO loop to be dealt with as soon as the channel is ready. 1. This would cause buffers to be held onto for longer than necessary, causing less reuse on average. 2. If a channel was slow for some reason, not only would concrete message instances queue up for it, but also 16k of buffers would be reserved for each message until it would be written+flushed physically. With this change, the serialization happens on the event loop which effectively limits the number of buffers that `N` IO-threads will ever use so long as messages are small and channels writable. Also, this change dereferences the reference to the concrete outbound message as soon as it has been serialized to save some more on GC. This reduces the GC time for a default PMC run by about 50% in experiments (3 nodes, 2G heap each, loopback ... obvious caveat is that GC isn't that heavy in the first place with recent changes but still a measurable gain). I also expect it to be helpful for master node stability by causing less of a spike if master is e.g. hit by a large number of requests that are processed batched (e.g. shard snapshot status updates) and responded to in a short time frame all at once. Obviously, the downside to this change is that it introduces more latency on the IO loop for the serialization. But since we read all of these messages on the IO loop as well I don't see it as much of a qualitative change really and the more predictable buffer use seems much more valuable relatively.	2020-06-02 16:15:18 +02:00
Armin Braun	9bc9d01b84	Do not Block Snapshot Thread Pool Fully During Restore or Snapshot (#57360 ) (#57511 ) Allow for a fairer distribution of snapshot and restore operations to enable parallel snapshots and improve behaviour for parallel snapshot + restore. Closes #55803	2020-06-02 11:45:55 +02:00
Ryan Ernst	7aad4f6470	Store parsed mapping settings in IndexSettings (#57492 ) There are several mapping settings that are currently re-parsed every time they are read. This can be quite frequent, for example within every document ingestion. This commit moves the parsed versions of these mapping settings to be stored in IndexSettings, just as other index settings are already. closes #57395	2020-06-01 16:45:36 -07:00
Mark Tozzi	1f500583b1	Clean up Aggregator Supplier Boiler Plate (#57442 ) (#57452 )	2020-06-01 14:21:07 -04:00
Nik Everett	c6c0b1a968	Optimize `routingNodes` variable in AddIncrementallyTests (#57140 ) (#57447 ) The `routingNodes` variable is unused. Replace `clusterState.getRoutingNodes()` with `routingNodes`. Co-authored-by: Boice Huang <boicehuang@tencent.com>	2020-06-01 14:13:45 -04:00
Zachary Tong	daaf5a3dcc	Fix assertion catching in aggregation supported type test (#56466 ) (#57382 ) At some point, we changed the supported-type test to also catch assertion errors. This has the side effect of also catching the `fail()` call inside the try-catch, which silently smothered some failures. This modifies the test to throw at the end of the try-catch block so that it does not accidentally catch itself. Catching the AssertionError is convenient because there are other locations that do throw an assertion in tests (due to hitting an assertion before the exception is thrown) so I think we should keep it around. Also includes a variety of fixes to other tests which were failing but being silently smothered.	2020-06-01 12:10:05 -04:00
Armin Braun	59570eaa7d	Fix Local Translog Recovery not Updating Safe Commit in Edge Case (#57350 ) (#57380 ) In case the local checkpoint in the latest commit is less than the last processed local checkpoint we would recover 0 ops and hence not commit again. This would lead to the logic in `IndexShard#recoverLocallyUpToGlobalCheckpoint` not seeing the latest local checkpoint when it reloads the safe commit from the store and thus cause inefficient recoveries because the recoveries would work from a lower than possible local checkpoint. Closes #57010	2020-05-30 09:28:50 +02:00
Nik Everett	d6a3704932	Fold some of sig_terms into terms (backport of #57361 ) (#57386 ) This merges the global-ordinals-based implementation for `significant_terms` into the global-ordinals-based implementation of `terms`, removing a bunch of copy and pasted code that is subtly different across the two implementations and replacing it with an explicit `ResultStrategy` with nice stuff like Javadoc. The actual behavior is mostly unchanged, though I was able to remove a redundant copy of bytes representing the string from the result construction phase of `significant_terms`. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-05-29 22:51:11 -04:00
Nik Everett	f52e779806	Fix casting of scaled_float in sorts (#57207 ) (#57385 ) Previously we'd get a `ClassCastException` when you tried to use `numeric_type` on `scaled_float`. Oops! This cleans up the CCE and moves some code around so the casting actually works.	2020-05-29 18:06:04 -04:00
Nik Everett	d5e86d7c4d	Small cleanups for terms aggregator (#57315 ) (#57381 ) This includes a few small cleanups for the `TermsAggregatorFactory`: 1. Removes an unused `DeprecationLogger` 2. Moves the members to right above the ctor. 3. Merges the heuristics for picking `SubAggCollectionMode` into a single method.	2020-05-29 16:59:35 -04:00
Nik Everett	4263c25b2f	Save memory when histogram agg is not on top (backport of #57277 ) (#57377 ) This saves some memory when the `histogram` aggregation is not a top level aggregation by dropping `asMultiBucketAggregator` in favor of natively implementing multi-bucket storage in the aggregator. For the most part this just uses the `LongKeyedBucketOrds` that we built the first time we did this.	2020-05-29 15:07:37 -04:00
Benjamin Trent	15aba60c02	[7.x] Add new circuitbreaker plugin and refactor CircuitBreakerService (#55695 ) (#57359 ) * Add new circuitbreaker plugin and refactor CircuitBreakerService (#55695) This commit lays the ground work for plugins supplying their own circuit breakers. It adds a new interface: `CircuitBreakerPlugin`. This interface provides methods for providing custom child CircuitBreaker objects. There are also facilities for allowing dynamic settings for the custom breakers. With the refactor, circuit breakers are no longer replaced on setting changes. Instead, the two mutable settings themselves are `volatile`. Plugins that want to use their custom circuit breaker should keep a reference of their constructed breaker.	2020-05-29 12:13:46 -04:00
Mayya Sharipova	aebb78bf5c	Run sort optimization when from+size>0 (#57250 )	2020-05-29 11:30:35 -04:00
Armin Braun	e4fd78f866	Remove Overly Strict Safety Mechanism in Shard Snapshot Logic (#57227 ) (#57362 ) Unfortunately, we cannot have a safety mechanism like this where we throw whenever we find unreadable data in a shard. This breaks in the case of an older ES version (without shard generations enabled) having failed to snapshot a shard snapshot after writing some data to its path and having finalized it for example. Another example of where we can't support this check is the test I added, if we snapshot an index with a name that already exists in the repository and more shards than the existing index, fail doing that and then retry snapshotting it we will also see unexpected data in the path. We could technically do deeper inspections on the unexpected data but I don't think it's worth it really. In the end if we are unable to read the data here it's broken anyway. By moving to a new `index-` blob in the shard directory I don't see us ever corrupting existing data and since we (by virtue of moving to an empty generation) won't do any incremental work on top of potentially corrupt data we also do not risk creating broken snapshots going forward. => Just logging a warning in this very unlikely case is the best we can do I think	2020-05-29 16:41:57 +02:00
Dan Hermann	6b0d707671	[7.x] Do not report negative values for swap sizes (#57353 )	2020-05-29 08:11:47 -05:00
Henning Andersen	8427d677e9	Reindex and friends fail nicely when max_docs < slices (#54901 ) (#57348 ) When the parameter `max_docs` is less than `slices` in update_by_query, delete_by_query or reindex API, `max_docs` is set to 0 and we throw an action_request_validation_exception with a confusing error message: "maxDocs should be greater than 0...". This change checks whether `max_docs` is less than `slices` and throws an illegal_argument_exception with a clear message. Relates to #52786. Co-authored-by: bellengao <gbl_long@163.com>	2020-05-29 14:30:14 +02:00
Martijn van Groningen	04ef39da77	Change cluster info actions to be able to resolve data streams. (#57343 ) Backport of #56878 to 7.x branch. With this change the following APIs will be able to resolve data streams: get index, get mappings and ilm explain APIs. Relates to #53100	2020-05-29 12:17:53 +02:00
Ignacio Vera	75868ea915	Catch InputCoercionException thrown by Jackson parser (#57287 ) (#57330 ) The Jackson 2.10 library has added a new type of error that is thrown when a numeric value is out of range. This error should be caught and handled properly in case the flag ignore_malformed has been set to true.	2020-05-29 09:47:47 +02:00
Nik Everett	b9fe10866e	Make global ords terms simpler to understand (backport of #57241 ) (#57311 ) When the `terms` enum operates on non-numeric data it can collect it via global ordinals. It actually has two separate collection strategies, one "dense" and one "remapping". Each of those strategies has two "iteration" strategies that it uses to build buckets, depending on whether or not we need buckets with `0` docs in them. Previously this was done with several `null` checks and never really explained. This change replaces those checks with two `CollectionStrategy` classes which have good stuff like documentation.	2020-05-28 16:52:35 -04:00
Julie Tibshirani	10e1dc199d	Revert "Remove unused logic from FieldNamesFieldMapper. (#56834 )" This reverts commit `343fb699a4`.	2020-05-28 10:54:10 -07:00
Martijn van Groningen	225ccd1cfa	Ensure template exists when creating data stream (#57275 ) Backporting #56888 to 7.x branch. Limit the creation of data streams only for namespaces that have a composable template with a data stream definition. This way we ensure that mappings/settings have been specified and will be used at data stream creation and data stream rollover. Also remove `timestamp_field` parameter from create data stream request and let the create data stream api resolve the timestamp field from the data stream definition snippet inside a composable template. Relates to #53100	2020-05-28 15:08:25 +02:00
Nhat Nguyen	5b08eaf90c	Fix trimUnsafeCommits for indices created before 6.2 (#57187 ) If an upgraded node is restarted multiple times without flushing a new index commit, then we will wrongly exclude all commits from the starting commits. This bug is reproducible with these minimal steps: (1) create an empty index on 6.1.4 with translog retention disabled, (2) upgrade the cluster to 7.7.0, (3) restart the upgraded cluster. The problem is that the new translog policy can trim translog without having a new index commit, while the existing commit still refers to the previous translog generation. Closes #57091	2020-05-27 15:08:49 -04:00
Lee Hinman	c0f732b9f6	[7.x] Rename template V2 classes to ComposableTemplate (#57183 ) (#57232 ) Backports the following commits to 7.x: Rename template V2 classes to ComposableTemplate (#57183)	2020-05-27 11:01:59 -06:00
Nik Everett	4d5be7c817	Save memory on numeric sig terms when not top (backport of #56789 ) (#57221 ) This saves memory when running numeric significant terms which are not at the top level by merging its collection into numeric terms and relying on the optimization that we made in #55873.	2020-05-27 12:03:28 -04:00
Przemyslaw Gomulka	0e34b2f42e	SlowLoggers using single logger (#56708 ) Slow loggers should use a single shared logger, as otherwise when an index is deleted the log4j logger will remain reachable (log4j is caching) and will create a memory leak. closes https://github.com/elastic/elasticsearch/issues/56171	2020-05-27 16:38:31 +02:00
Alan Woodward	d6b79bcd95	Remove Mapper.updateFieldType() (#57151 ) When we had multiple mapping types, an update to a field in one type had to be propagated to the same field in all other types. This was done using the Mapper.updateFieldType() method, called at the end of a merge. However, now that we only have a single type per index, this method is unnecessary and can be removed. Relates to #41059 Backport of #56986	2020-05-27 09:21:24 +01:00
Julie Tibshirani	343fb699a4	Remove unused logic from FieldNamesFieldMapper. (#56834 ) This logic is no longer used, now that each field mapper handles adding the `_field_names` fields.	2020-05-26 17:40:36 -07:00
Nik Everett	0fce2b7713	Fix DateHistogramAggregatorTests.testAsSubAgg Closes #57168 by using `AggregatorTestCase#newIndexSearcher` in the `AggregatorTestCase#testCase`. Without that global ordinals will sometimes fail to work.	2020-05-26 15:05:31 -04:00
Mark Vieira	92e127e90d	Mute DateHistogramAggregatorTests.testAsSubAgg (cherry picked from commit 4d050a7a6438a7d102eeef9e03a7d79565bddab7)	2020-05-26 10:57:22 -07:00
Christoph Büscher	56625e35b7	Fix `bool` query behaviour on null value (#56817 ) Until 7.7 we used to ignore `null` values for the `bool` query's `minimum_should_match` parameter and also for the `must`, `must_not`, `should` and `filter` clauses. An internal refactoring has changed this so now we get a parsing error. While `null` should not be a common value here, we should restore the old behaviour for bwc for now. Closes #56812	2020-05-26 16:23:40 +02:00
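
The `max_docs` versus `slices` check described in commit 8427d677e9 above can be illustrated with a minimal Python sketch. It is not the Elasticsearch implementation (which is Java). The function name `split_max_docs`, the error wording, and the assumption that the per-slice budget comes from integer division are all hypothetical, chosen only to show why a budget smaller than the slice count used to collapse to 0 and why an up-front check gives a clearer error.

```python
def split_max_docs(max_docs: int, slices: int) -> int:
    """Return a per-slice document budget (hypothetical helper, not an ES API).

    Assumes the budget is divided evenly with integer division, so any
    max_docs smaller than slices would round down to 0.
    """
    if slices < 1:
        raise ValueError(f"slices [{slices}] must be at least 1")
    # Reject the request up front instead of letting the budget become 0
    # and fail later with an unrelated "greater than 0" validation error.
    if max_docs < slices:
        raise ValueError(
            f"max_docs [{max_docs}] should be >= slices [{slices}]"
        )
    return max_docs // slices


if __name__ == "__main__":
    print(split_max_docs(10, 5))
    try:
        split_max_docs(3, 5)
    except ValueError as err:
        print(f"rejected: {err}")
```

Without the explicit comparison, `3 // 5` evaluates to `0`, which is the value the commit message says previously produced the confusing validation error.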

4802 Commits