OpenSearch

Commit Graph

Author	SHA1	Message	Date
Dan Hermann	39c4ec6821	[7.x] Create first backing index when creating data stream	2020-04-02 17:19:35 -05:00
Nik Everett	54ea4f4f50	Begin to drop pipeline aggs from the result tree (backport of #54311 ) (#54659 ) Removes pipeline aggregations from the aggregation result tree as they are no longer used. This stops us from building the pipeline aggregators at all on data nodes except for backwards compatibility serialization. This will save a tiny bit of space in the aggregation tree which is lovely, but the biggest benefit is that it is a step towards simplifying pipeline aggregators. This only does about half of the work to remove the pipeline aggs from the tree. Removing all of it would, well, double the size of the change and make it harder to review.	2020-04-02 16:45:12 -04:00
Nik Everett	cc6468a0cb	Fix BWC error on pipeline aggs (#54672 ) I derped out on a last minute bug fix when backporting #54282 and it only causes the tests to fail about half the time. So I didn't catch it until after merging. Great! This fixes it.	2020-04-02 14:51:30 -04:00
Zachary Tong	20d67720aa	Refactor Percentiles/Ranks aggregation builders and factories (#51887 ) (#54537 ) - Consolidates HDR/TDigest factories into a single factory - Consolidates most HDR/TDigest builders into an abstract builder - Deprecates method(), compression(), numSigFig() in favor of a new unified PercentileConfig object - Disallows setting algo options that don't apply to current algo The unified config method carries both the method and algo-specific setting. This provides a mechanism to reject settings that apply to the wrong algorithm. For BWC the old methods are retained but marked as deprecated, and can be removed in future versions. Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>	2020-04-02 10:39:41 -04:00
Nik Everett	a5adac0d1e	Fix pipeline agg serialization for ccs (backport of #54282 ) (#54468 ) This fixes pipeline aggregations used in cross cluster search from an older version of Elasticsearch to a newer version of Elasticsearch. I broke this in #53730 when I was too aggressive in shutting off serialization of pipeline aggs. In particular, this comes up when the coordinating node is pre-7.8.0 and the gateway node is on or after 7.8.0. The fix is another step down the line to remove pipeline aggregators from the aggregation tree. Sort of. It creates a new `List<PipelineAggregator>` member in `InternalAggregation` but it is only used for bwc serialization and it is fed by the mechanism established in #53730 to read the pipelines from the	2020-04-02 10:35:40 -04:00
Nik Everett	b4feda84e8	Add scroll info to search task description (backport of #54606 ) (#54612 ) Right now you can't tell from the task description whether or not the search is a scroll. This adds that information to the description which is super useful if you are trying to debug a cluster that is running out of scroll contexts.	2020-04-02 09:04:49 -04:00
Jason Tedor	18b602280c	Add validation to the usage service (#54617 ) Today the usage service can let in some issues, such as handlers that do not have a name, where the errors do not manifest until later (calling the usage API), or conflicting handlers with the same name. This commit addresses this by adding some validation to the usage service.	2020-04-02 08:56:28 -04:00
Andy Bristol	eb14635f1f	add tests to StatsAggregatorTests (#53768 ) Adds tests for supported ValuesSourceTypes, unmapped fields, scripting, and the missing param. The tests for unmapped fields and scripting are migrated from the StatsIT integration test	2020-04-01 17:07:51 -07:00
Andy Bristol	c87b830d06	migrate tests from MissingIT to agg tests (#53448 ) Move the remaining tests for the missing aggregation into its AggregatorTestCase out of its integration test and remove the IT	2020-04-01 17:05:44 -07:00
Andy Bristol	ec76e7306e	supported field type tests for max agg (#53701 ) Adds test hooks for testing supported ValuesSource types for the max aggregation	2020-04-01 15:24:53 -07:00
Andy Bristol	5d0351ea00	add tests to SumAggregatorTests (#53568 ) This adds tests for supported ValuesSourceTypes, unmapped fields, scripting, and the missing param. The tests for unmapped fields and scripting are migrated from the SumIT integration test	2020-04-01 15:24:21 -07:00
Andy Bristol	62a52465fc	aggregator and yaml tests for missing agg (#53214 ) Tests for unmapped fields, the missing parameter, scripting, and correct ValuesSource types in MissingAggregatorTests. Basic yaml tests for the missing agg For #42949	2020-04-01 15:23:08 -07:00
William Brafford	958e9d1b78	Refactor nodes stats request builders to match requests (#54363 ) (#54604 ) * Refactor nodes stats request builders to match requests (#54363) * Remove hard-coded setters from NodesInfoRequestBuilder * Remove hard-coded setters from NodesStatsRequest * Use static imports to reduce clutter * Remove uses of old info APIs	2020-04-01 17:03:04 -04:00
Gordon Brown	f0cb8a56a9	Handle -1 gc_threshold settings explicitly (#54546 ) Because -1 is technically a valid TimeValue (as a sentinel value), that is now explicitly checked for when validating gc_thresholds. The tests are also adjusted to test this case separately from other negative values.	2020-04-01 13:56:50 -06:00
Mayya Sharipova	bf4857d9e0	Search hit refactoring (#41656 ) (#54584 ) Refactor SearchHit to have separate document and meta fields. This is a part of bigger refactoring of issue #24422 to remove dependency on MapperService to check if a field is metafield. Relates to PR: #38373 Relates to issue #24422 Co-authored-by: sandmannn <bohdanpukalskyi@gmail.com>	2020-04-01 15:19:00 -04:00
jimczi	7787603d56	Add 7.6.3 version	2020-04-01 16:23:28 +02:00
David Turner	6d976e1468	Resolve some coordination-layer TODOs (#54511 ) This commit removes a handful of TODO comments in the cluster coordination layer that no longer apply. Relates #32006	2020-04-01 12:36:18 +01:00
David Turner	5e3b6ab82b	Use VotingConfiguration#of where possible (#54507 ) This resolves a longstanding TODO in the cluster coordination subsystem. Relates #32006	2020-04-01 09:30:42 +01:00
Nhat Nguyen	c2506af8a6	Enable engine debug log for testMaybeFlush Relates #52223	2020-03-31 23:40:14 -04:00
Jason Tedor	63e5f2b765	Rename META_DATA to METADATA This is a follow up to a previous commit that renamed MetaData to Metadata in all of the places. In that commit in master, we renamed META_DATA to METADATA, but lost this on the backport. This commit addresses that.	2020-03-31 17:30:51 -04:00
Jason Tedor	5fcda57b37	Rename MetaData to Metadata in all of the places (#54519 ) This is a simple naming change PR, to fix the fact that "metadata" is a single English word, and for too long we have not followed general naming conventions for it. We are also not consistent about it, for example, METADATA instead of META_DATA if we were trying to be consistent with MetaData (although METADATA is correct when considered in the context of "metadata"). This was a simple find and replace across the code base, only taking a few minutes to fix this naming issue forever.	2020-03-31 17:24:38 -04:00
Zachary Tong	c9db2de41d	[7.x] Comprehensively test supported/unsupported field type:agg combinations (#54451 ) * Comprehensively test supported/unsupported field type:agg combinations (#52493) This adds a test to AggregatorTestCase that allows us to programmatically verify that an aggregator supports or does not support a particular field type. It fetches the list of registered field type parsers, creates a MappedFieldType from the parser and then attempts to run a basic agg against the field. A supplied list of supported VSTypes is then compared against the output (success or exception) and succeeds or fails the test accordingly. Co-Authored-By: Mark Tozzi <mark.tozzi@gmail.com> * Skip fields that are not aggregatable * Use newIndexSearcher() to avoid incompatible readers (#52723) Lucene's `newSearcher()` can generate readers like ParallelCompositeReader which we can't use. We need to instead use our helper `newIndexSearcher`	2020-03-31 14:35:03 -04:00
Jake Landis	9b1fe93363	[7.x] introduce 6.8.9 as a version (#53817 )	2020-03-31 13:03:28 -05:00
Armin Braun	c38e125425	Remove Redundant Documentation on SnapshotsService (#54482 ) (#54505 ) The docs here add nothing compared to those in the package. If anything they are somewhat confusing since they don't give all necessary details to understand the snapshot process. => remove them and link to the complete docs at the package level	2020-03-31 17:07:48 +02:00
Yannick Welsch	597dfa8481	Avoid holding onto bulk items until all completed (#54407 ) Bulk requests currently keep a reference to all bulk item requests until every one of them has completed. There is no need to do so, however, and, in case of large bulks, can mean unnecessary holding onto memory that might be better used elsewhere. More so as different shard-level bulks can complete at different speeds, and one slow shard-level request should not require holding onto every other shard-level request.	2020-03-31 16:19:07 +02:00
Dan Hermann	2ede8662e1	Bump multi-release JARs to Java 11	2020-03-31 06:48:46 -05:00
Tim Brooks	915435bbe4	Fix issue with pipeline releasing bytes early (#54474 ) Currently there is an issue with the InboundPipeline releasing bytes earlier than appropriate. This can lead to the bytes being reused before the message is handled. This commit fixes that issue and adds a test to detect when it is occurring.	2020-03-30 22:39:15 -06:00
Lee Hinman	a3d1945254	[7.x] Add warnings/errors when V2 templates would match same i… (#54449 ) * Add warnings/errors when V2 templates would match same indices… (#54367) * Add warnings/errors when V2 templates would match same indices as V1 With the introduction of V2 index templates, we want to warn users that templates they put in place might not take precedence (because v2 templates are going to "win"). This adds this validation at `PUT` time for both V1 and V2 templates with the following rules: ** When creating or updating a V2 template - If the v2 template would match indices for an existing v1 template or templates, provide a warning (through the deprecation logging so it shows up to the client) as well as logging the warning The v2 warning looks like: ``` index template [my-v2-template] has index patterns [foo-] matching patterns from existing older templates [old-v1-template,match-all-template] with patterns (old-v1-template => [foo],match-all-template => []); this template [my-v2-template] will take precedence during new index creation ``` * When creating a V1 template - If the v1 template is for index patterns of `""` and a v2 template exists, warn that the v2 template may take precedence - If the v1 template is for index patterns other than all indices, and a v2 template exists that would match, throw an error preventing creation of the v1 template * When updating a V1 template (without changing its existing `index_patterns`!) - If the v1 template is for index patterns that would match an existing v2 template, warn that the v2 template may take precedence. 
The v1 warning looks like: ``` template [my-v1-template] has index patterns [] matching patterns from existing index templates [existing-v2-template] with patterns (existing-v2-template => [foo]); this template [my-v1-template] may be ignored in favor of an index template at index creation time ``` And the v1 error looks like: ``` template [my-v1-template] has index patterns [foo] matching patterns from existing index templates [existing-v2-template] with patterns (existing-v2-template => [f]), use index templates (/_index_template) instead ``` Relates to #53101 * Remove v2 index and component templates when cleaning up tests * Finish half-finished comment sentence * Guard template removal and ignore for earlier versions of ES Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> * Also ignore 500 errors when clearing index template v2 templates Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-30 13:25:50 -06:00
Mark Tozzi	529622d4f4	Unit tests for Range and DateRange aggs (#52380 ) (#54455 )	2020-03-30 15:07:43 -04:00
Mark Tozzi	10e0e59561	Tests for agg missing values (#51068 ) (#54452 )	2020-03-30 15:05:38 -04:00
Ryan Ernst	3a24fe9d37	Move keystore-cli to its own tools project (#40787 ) (#54294 ) This commit moves the keystore cli into its own project, so that the test dependencies can be isolated from the rest of server.	2020-03-30 11:20:07 -07:00
Nik Everett	56047f74be	Fix auto_date_histogram serialization bug (#54447 ) This fixes a serialization bug in `auto_date_histogram` that comes up in a cluster mixed between pre-7.3.0 and post-7.3.0. Includes #54429 to keep 7.x looking like master for simpler backports. Closes #54382	2020-03-30 13:49:38 -04:00
Nik Everett	e58ad9fed3	Clean up how pipeline aggs check for multi-bucket (backport of #54161 ) (#54379 ) Pipeline aggregations like `stats_bucket`, `sum_bucket`, and `percentiles_bucket` only operate on aggregations that have multiple buckets. This adds support for those aggregations to `geo_distance`, `ip_range`, `auto_date_histogram`, and `rare_terms`. This all happened because we used a marker interface to mark compatible aggs, `MultiBucketAggregationBuilder`, and it was fairly easy to forget to implement the interface. This replaces the marker interface with an abstract method in `AggregationBuilder`, `bucketCardinality`, which makes you return `NONE`, `ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At this point `ONE` and `NONE` amount to about the same thing, but I suspect that'll be a useful distinction when validating bucket sorts. Closes #53215	2020-03-30 10:44:55 -04:00
Andrei Dan	d5320d9d29	Read the index.number_of_replicas from template so that wait_for_active_shards is interpreted correctly (#54231 ) (#54413 ) This commit takes into account the index.number_of_replicas (defaults to 0 - no replicas- ) value when setting an index template. This change enables the index.wait_for_active_shards value to be interpreted correctly (cherry picked from commit 07026ac3d56dc9fae69467adfda7eaed7ea3ca00) Signed-off-by: Andrei Dan <andrei.dan@elastic.co> Co-authored-by: tninokehoe <62655306+tninokehoe@users.noreply.github.com>	2020-03-30 14:34:49 +01:00
Armin Braun	9392fca36a	Improve Snapshot Abort Behavior (#54256 ) (#54410 ) This commit improves the behavior of aborting snapshots and thereby fixes some extremely rare test failures. Improvements: 1. When aborting a snapshot while it is in the `INIT` stage we do not need to ever delete anything from the repository because nothing is written to the repo during INIT any more (in the past running deletes for these snapshots made sense because we were writing `snap-` and `meta-` blobs during the `INIT` step). 2. Do not try to finalize snapshots that never moved past `INIT`. Same reason as with the first step. If we never moved past `INIT` no data was written to the repo so no need to now write a useless entry for the aborted snapshot to `index-N`. This is especially true, since the reason the snapshot was aborted during `INIT` was a delete call so the useless empty snapshot just added to `index-N` would be removed by the subsequent delete that is still waiting anyway. 3. If, after aborting a snapshot, we wait for it to finish, we should not try deleting it if it failed. If the snapshot failed it means it did not become part of the most recent `RepositoryData` so a delete for it will needlessly fail with a confusing message about that snapshot being missing or concurrent repository modification. I moved to throw the snapshot missing exception here because that seems the most user friendly. This allows the user to simply ignore `404` returns from the delete API when using it to make sure a snapshot is aborted+deleted. Marking this as a non-issue since it doesn't have any negative repercussions other than confusing exceptions on some snapshot aborts. Closes #52843	2020-03-30 15:08:18 +02:00
Jim Ferenczi	12cfdc24b0	Fixed rewrite of time zone without DST (#54398 ) We try to rewrite time zones to fixed offsets in the date histogram aggregation if the data in the shard is within a single transition. However, this optimization is not applied to time zones that don't apply daylight saving changes but had some random transitions in the past (e.g. Australia/Brisbane or Asia/Katmandu). This change fixes the rewrite of such time zones to fixed offsets.	2020-03-30 13:18:57 +02:00
Martijn van Groningen	4b4fbc160d	Refactor AliasOrIndex abstraction. (#54394 ) Backport of #53982 In order to prepare the `AliasOrIndex` abstraction for the introduction of data streams, the abstraction needs to be made more flexible, because currently it really can be only an alias or an index. * Renamed `AliasOrIndex` to `IndexAbstraction`. * Introduced a `IndexAbstraction.Type` enum to indicate what a `IndexAbstraction` instance is. * Replaced the `isAlias()` method that returns a boolean with the `getType()` method that returns the new Type enum. * Moved `getWriteIndex()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface. * Moved `getAliasName()` up from the `IndexAbstraction.Alias` to the `IndexAbstraction` interface and renamed it to `getName()`. * Removed unnecessary casting to `IndexAbstraction.Alias` by just checking the `getType()` method. Relates to #53100	2020-03-30 10:12:16 +02:00
Nhat Nguyen	6e025c12f0	Add debug logging for testRunningTasksCount Relates #53594	2020-03-29 18:34:41 -04:00
Jason Tedor	f0033783db	Deprecate node local storage setting (#54374 ) This setting is not documented and has dubious value since it means there can be nodes in the cluster (non-data and non-master nodes) that do not have persistent node IDs. This does not have any use cases so this commit deprecates the setting.	2020-03-28 14:36:41 -04:00
Jason Tedor	60437b474d	Fix line-length violation in DiscoveryNodeRole This commit fixes a line-length checkstyle violation in DiscoveryNodeRole.java.	2020-03-28 13:06:20 -04:00
Jason Tedor	03cab96b2d	Fix imports in discovery node classes This commit fixes some imports that were leftover after resolving some merge conflicts on a backport.	2020-03-28 12:56:22 -04:00
Jason Tedor	c3be3206ce	Decouple environment from DiscoveryNode (#54373 ) Today Environment is coupled to DiscoveryNode via the node.local_storage setting. This commit decouples Environment from this setting.	2020-03-28 12:52:47 -04:00
Jason Tedor	37b59a357f	Ensure that the output of node roles are sorted (#54376 ) This commit ensures that node roles are sorted by node role name, which makes the output easier to consume, and also makes it easier to rely on the behavior of the output in assertions.	2020-03-28 12:51:21 -04:00
Tim Brooks	2ccddbfa88	Move transport decoding and aggregation to server (#54360 ) Currently all of our transport protocol decoding and aggregation occurs in the individual transport modules. This means that each implementation (test, netty, nio) must implement this logic. Additionally, it means that the entire message has been read from the network before the server package receives it. This commit creates a pipeline in server which can be passed arbitrary bytes to handle. Internally, the pipeline will decode, decompress, and aggregate the messages. Additionally, this allows us to run many megabytes of bytes through the pipeline in tests to ensure that the logic works. This work will enable future work: Circuit breaking or backoff logic based on message type and byte in the content aggregator. Sharing bytes with the application layer using the ref counted releasable network bytes. Improved network monitoring based specifically on channels. Finally, this fixes the bug where we do not circuit break on the correct message size when compression is enabled.	2020-03-27 14:13:10 -06:00
Stuart Tettemer	1630de4a42	Scripting: stats per context in nodes stats (#54008 ) (#54357 ) Adds script cache stats to `_node/stats`. If using the general cache: ``` "script_cache": { "sum": { "compilations": 12, "cache_evictions": 9, "compilation_limit_triggered": 5 } } ``` If using context caches: ``` "script_cache": { "sum": { "compilations": 13, "cache_evictions": 9, "compilation_limit_triggered": 5 }, "contexts": [ { "context": "aggregation_selector", "compilations": 8, "cache_evictions": 6, "compilation_limit_triggered": 3 }, { "context": "aggs", "compilations": 5, "cache_evictions": 3, "compilation_limit_triggered": 2 }, ``` Backport of: 32f46f2 Refs: #50152	2020-03-27 12:26:00 -06:00
Tim Brooks	f5b4020819	Remove netty BytesReference implementations (#54355 ) Elasticsearch has a number of different BytesReference implementations. These implementations can all implement the interface in different ways with subtly different behavior and performance characteristics. On the other hand, the JVM only represents bytes as an array or a direct byte buffer. This commit deletes the specialized Netty implementations and moves to using a generic ByteBuffer reference type. This will allow us to focus on standardizing performance and behavior around a smaller number of implementations that can be used by all components in Elasticsearch.	2020-03-27 11:01:33 -06:00
Lee Hinman	f2cc2b1127	[7.x] Add REST APIs for IndexTemplateV2Metadata CRUD (#54039 ) (#54347 ) * Add REST APIs for IndexTemplateV2Metadata CRUD (#54039) * Add REST APIs for IndexTemplateV2Metadata CRUD This commit adds the get/put/delete APIs for interacting with the now v2 versions of index templates. These APIs are behind the existing `es.itv2_feature_flag_registered` system property feature flag. Relates to #53101 * Add exceptions for HLRC tests * Add skips for 7.x versions * Use index_template instead of template_v2 in action names * Add test for MetaDataIndexTemplateService.addIndexTemplateV2 * Move removal to static method and add test * Add unit tests for request classes (implement hashCode & equals) Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> * Fix compilation Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-27 10:47:22 -06:00
Dan Hermann	1690e78646	Validation for data stream creation	2020-03-27 10:07:46 -05:00
Alan Woodward	461f1307d6	Add XContentHelper.childBytes() method (#54287 ) We have a number of places where we want to read a fairly complex object from XContent, but aren't interested in its contents; for example, mappings are often serialized and deserialized between several objects before they are actually built into a MappingMetaData object. This means that potentially large maps of maps are constructed several times, only to immediately be re-serialized again. This commit adds a new helper method to XContentHelper that reads the children of an xcontent object directly to a BytesReference, serialized via the same xcontenttype as the parent parser, avoiding the construction of intermediary maps or lists.	2020-03-27 14:21:56 +00:00
Armin Braun	14b5daad7c	Fix Snapshot Completion Listener Lost on Master Failover (#54286 ) (#54330 ) * Fix Snapshot Completion Listener Lost on Master Failover If master fails over (or we run into any other exception) while removing the snapshot from the CS, we must still resolve all the completion listeners for the snapshot.	2020-03-27 14:11:13 +01:00
Gordon Brown	0d30b48613	Disallow negative TimeValues (#53913 ) This commit causes negative TimeValues, other than -1 which is sometimes used as a sentinel value, to be rejected during parsing. Also introduces a hack to allow ILM to load policies which were written to the cluster state with a negative min_age, treating those values as 0, which should match the behavior of prior versions.	2020-03-26 13:30:35 -06:00
William Brafford	14204f8381	Use set-based interface for NodesStatsRequest (#53637 ) (#54141 ) The NodesStatsRequest class uses a set of strings for its internal serialization. This commit updates the class's interface so that we no longer use hard-coded getters and setters, but rather methods that add strings directly. For example, the old way of adding "os" metrics to a request would be to call request.os(true). The new way of doing this is to call request.addMetric("os"). For the time being, the canonical list of metrics is an enum in NodesStatsRequest. This will eventually be replaced with something pluggable.	2020-03-26 14:41:49 -04:00
Christoph Büscher	da404bbce2	HLRC: Don't send defaults for SubmitAsyncSearchRequest (#54200 ) (#54266 ) Currently we set the defaults for ccsMinimizeRoundtrips, preFilterShardSize and requestCache on the HLRC SubmitAsyncSearchRequest in the constructor. This is no longer needed since we now only send the parameters along with the rest request that are supported (omitting e.g. ccsMinimizeRoundtrips) and the correct defaults are set on the client side. This change removes setting and sending these defaults where possible, leaving only the overwrite of batchedReduceSize with a default value of 5, since the default used in the vanilla SearchRequest is 512. However, we don't need to send this value along as a request parameter if it's the default since the correct one will be set on the receiving end if no value is specified. Also adding tests for RestSubmitAsyncSearchAction that check the correct defaults are set when parameters are missing on the server side. Backport of #54200	2020-03-26 19:01:17 +01:00
Nik Everett	b6a8de0d89	Fix compilation in Eclipse (backport of #54275 ) (#54284 ) These mock calls cause Eclipse to think that `Exception` can be thrown because `CheckedFunction`'s lower bound is `Exception`. This makes Eclipse happy.	2020-03-26 12:53:13 -04:00
Lee Hinman	3e1fc8a5c9	[7.x] #32478 fixed cluster stats return 8EB (#32480 ) (32972311) (#54273 ) Co-authored-by: Darby.Han <darby.han@navercorp.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: 한우람 <hgword@gmail.com>	2020-03-26 09:12:09 -06:00
Henning Andersen	5bfaa20dd4	Rollover: refactor out cluster state update (#53965 ) (#54269 ) Make it possible to reuse the cluster state update of rollover for simulation purposes by extracting it. Also now run the full rollover in the pre-rollover phase and the actual rollover phase, allowing a dedicated exception in case of concurrent rollovers as well as a more thorough pre-check.	2020-03-26 16:06:13 +01:00
Yannick Welsch	8176f1c7f8	Delegate toXContent logic from ClusterState to its member classes (#54192 ) Backport of #52743 Co-authored-by: Yannick Welsch <yannick@welsch.lu>	2020-03-26 15:08:30 +01:00
Alan Woodward	048797389a	Explicitly set mappings in SearchAfterIT (#54262 ) We have some very occasional failures in SearchAfterIT, where a search throws an exception because a shard does not have the mapping for the requested sort field. The field should have been added in a dynamic mapping update after an index event, but it seems that there can sometimes be a small delay in propagating this update to the shards. This commit changes the test to explicitly define the relevant field at index creation time. Fixes #51900	2020-03-26 13:05:19 +00:00
Jason Tedor	d8f745736b	Clarify the remove keystore command can handle many (#54244 ) The remove keystore command can handle multiple settings. In a few places, we were not consistent about mentioning this. This commit addresses this, in the CLI help, and the docs.	2020-03-26 08:49:43 -04:00
William Brafford	b11960e3e6	Create set-based interface for NodesInfoRequest (#53410 ) (#54223 ) This commit begins the work of removing the "hard-coded" metric getters and setters from the NodesInfoRequest classes. We start by providing new flexible getters and setters. We then update the test classes to remove the old getters, and then remove those getters.	2020-03-26 07:28:09 -04:00
Yannick Welsch	1ba6783780	Schedule commands in current thread context (#54187 ) Changes ThreadPool's schedule method to run the schedule task in the context of the thread that scheduled the task. This is the more sensible default for this method, and eliminates a range of bugs where the current thread context is mistakenly dropped. Closes #17143	2020-03-26 10:07:59 +01:00
Jason Tedor	6af89e62d1	Allow keystore add-file to handle multiple settings (#54240 ) Today the keystore add-file command can only handle adding a single setting/file pair in a single invocation. This incurs the startup costs of the JVM many times, which in some environments can be expensive. This commit teaches the add-file keystore command to accept adding multiple settings in a single invocation.	2020-03-26 00:07:05 -04:00
Jason Tedor	fe8257d981	Allow keystore add to handle multiple settings (#54229 ) Today the keystore add command can only handle adding a single setting/value pair in a single invocation. This incurs the startup costs of the JVM many times, which in some environments can be expensive. This commit teaches the add keystore command to accept adding multiple settings in a single invocation.	2020-03-25 22:58:20 -04:00
Nik Everett	8f40f1435a	Save a little space in agg tree (backport of #53730 ) (#54213 ) This drops the "top level" pipeline aggregators from the aggregation result tree which should save a little memory and a few serialization bytes. Perhaps more importantly, this provides a mechanism by which we can remove all pipelines from the aggregation result tree. This will save quite a bit of space when pipelines are deep in the tree. Sadly, doing this isn't simple because of backwards compatibility. Nodes before 7.7.0 need those pipelines. We provide them by passing a `Supplier<PipelineTree>` into the root of the aggregation tree that we only call if we need to serialize to a version before 7.7.0. This solution works for cross cluster search because we always reduce the aggregations in each remote cluster and then forward them back to the coordinating node. It's quite possible that the coordinating node needs the pipeline (say it is version 7.1.0) and the gateway node in the remote cluster doesn't (version 7.7.0). In that case the data nodes won't send the pipeline aggregations back to the gateway node. Critically, the gateway node will send the pipeline aggregations back to the coordinating node. This is all managed with that `Supplier<PipelineTree>`, but how it is managed is a bit tricky.	2020-03-25 15:51:16 -04:00
Martijn van Groningen	66861a82a1	Data stream should refer to the backing indices using the Index class (#54199 ) Backport of #54189	2020-03-25 20:18:15 +01:00
Bogdan Pintea	77da9dd040	Add version 7.8.0	2020-03-25 18:10:30 +01:00
jimczi	04fabead14	Fix small typo in SearchService#executeQueryPhase Relates #54044	2020-03-25 16:16:30 +01:00
Armin Braun	32fa90c9ba	Fix ClusterHealthIT.testHealthOnMasterFailover (#54170 ) (#54177 ) We can run into a state where there's no more events to wait for temporarily but the cluster still isn't green. I added the wait for green flag to the request so the assertion for green cluster health below doesn't fail. Closes #53457	2020-03-25 14:33:31 +01:00
Armin Braun	0a70250201	Fix BlobStoreIncrementalityIT Assertion (#54149 ) (#54155 ) We are using this assertion for identical shard snapshots for situations where `snapshot1` wasn't the first snapshot for the tested shard. Hence, we can't assume that it will not share any files with previous snapshots. This showed up in failing tests when `snapshot1` was equivalent to a previous snapshot because no documents were deleted from the repo randomly in the failing test but even if documents are deleted there is no guarantee that no files will be shared. => I removed this assertion since it's immaterial for what is tested here anyway. Closes #54034	2020-03-25 12:13:52 +01:00
jimczi	e380a5a8c3	Fix off-by one error in TransportSearchActionTests Closes #54156	2020-03-25 11:41:02 +01:00
Tanguy Leroux	b6e482295b	Mute TransportSearchActionTests.testShouldPreFilterSearchShards (#54158 ) Relates #53873 Relates #54156	2020-03-25 11:25:06 +01:00
Jim Ferenczi	3b4751bdb7	Avoid I/O operations when rewriting shard search request (#54044 ) (#54139 ) This commit ensures that we rewrite the shard request with a short-lived can_match searcher. This is required for frozen indices since the high level rewrite is now performed on a network thread where we don't want to perform I/O. Closes #53985	2020-03-25 09:02:36 +01:00
Jason Tedor	381d7586e4	Introduce formal role for remote cluster client (#54138 ) This commit introduce a formal role for identifying nodes that are capable of making connections to remote clusters. Relates #53924	2020-03-24 21:59:43 -04:00
Henning Andersen	7ce7aff66e	Reindex negative TimeValue fix (#54057 ) Reindex would use timeValueNanos(System.nanoTime()). The intended use for TimeValue is as a duration, not as absolute time. In particular, this could result in negative TimeValues, which are unsupported as of #53913. Modified to use the bare long nano-second value.	2020-03-24 22:29:09 +01:00
Przemko Robakowski	fc498f625a	[7.x] Add validation for component templates (#54023 ) (#54118 ) * Add validation for component templates (#54023) * Add validation for component templates This change adds validation to make sure that settings and mappings are correct in component template. It's done the same way as in index templates - code is reused. Relates to #53101 * Fix checkstyle violation * Update server/src/main/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateService.java Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com> * Update server/src/test/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateServiceTests.java Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com> * Update server/src/test/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateServiceTests.java Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com> * Update server/src/test/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateServiceTests.java Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com> * Update server/src/main/java/org/elasticsearch/cluster/metadata/MetaDataIndexTemplateService.java Co-Authored-By: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> * Adjusted to 7.7 * unused import fixed * npe fixed * change exception type Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>	2020-03-24 22:20:34 +01:00
Jim Ferenczi	025857d949	Fix ShardSearchRequest cache key (#54071 ) This commit ensures that we don't use the non-deterministic canReturnNullResponseIfMatchNoDocs boolean in the cache key of the ShardSearchRequest. The value of this boolean has no influence on the cacheability of the request. Closes #32827	2020-03-24 22:15:52 +01:00
Przemko Robakowski	5594d57727	/_cat/shards support path stats (#53461 ) (#54119 ) * _cat/shards support path stats * fix some style case * fix some style case * fix rest-api-spec cat.shards error * fix rest-api-spec cat.shards bwc error Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: weizijun <weizijun1989@gmail.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-24 20:46:05 +01:00
Tim Brooks	b21b7fb09b	Allow proxy mode server name to be updated (#54107 ) Currently there is a bug where the proxy strategy will not be rebuilt if the server_name is dynamically updated. This commit fixes this issue.	2020-03-24 11:54:23 -06:00
David Turner	21afc788f8	Reduce log level for pipeline failure (#54097 ) Today we log `failed to execute pipeline for a bulk request` at `ERROR` level if an attempt to run an ingest pipeline fails. A failure here is commonly due to an `EsRejectedExecutionException`. We also feed such failures back to the client and record the rejection in the threadpool statistics. In line with #51459 there is no need to log failures within actions so noisily and with such urgency. It is better to leave it up to the client to react accordingly. Typically an `EsRejectedExecutionException` should result in the client backing off and retrying, so a failure here is not normally fatal enough to justify an `ERROR` log at all. This commit reduces the log level for this message to `DEBUG`.	2020-03-24 17:41:35 +00:00
markharwood	6a60f85bba	Wildcard field - add normalizer support (#53851 ) (#54109 ) Backport support for normalisation to wildcard field Closes #53603	2020-03-24 17:37:47 +00:00
Yannick Welsch	e006d1f6cf	Use special XContent registry for node tool (#54050 ) Fixes an issue where the elasticsearch-node command-line tools would not work correctly because PersistentTasksCustomMetaData contains named XContent from plugins. This PR makes it so that the parsing for all custom metadata is skipped, even if the core system would know how to handle it. Closes #53549	2020-03-24 17:40:51 +01:00
Nik Everett	42be39177b	Remove ceremony declaring aggs (backport of #53990 ) (#54099 ) This removes some more ceremony when declaring agg parsers. You no longer need a static `parse` method, instead you can just make the `PARSER` public in most cases. There are still a few aggs with the `parse` method, but those `parse` methods are a little more complex to untangle.	2020-03-24 12:29:52 -04:00
Tim Brooks	caefa78513	Align remote info api with new settings (#54102 ) Currently the remote info api has added a number of possible fields (proxy, num_socket_connections, etc) that are available in proxy mode. These fields are not aligned with what the settings are named. This commit modifies this API to align with the settings.	2020-03-24 10:27:24 -06:00
Christoph Büscher	1c1730facd	Mask wildcard query special characters on keyword queries (#53127 ) (#53512 ) Wildcard queries on keyword fields get normalized, however this normalization step should exclude the two special characters * and ? in order to keep the wildcard query itself intact. Closes #46300	2020-03-24 17:22:29 +01:00
Alan Woodward	39d7d0dc10	Upgrade to lucene 8.5.0 release (#54077 ) Upgrades our lucene dependency to the released 8.5.0 version.	2020-03-24 13:45:50 +00:00
Dan Hermann	30105a5ab5	[7.x] Cluster state and CRUD operations for data streams (#54073 )	2020-03-24 07:58:52 -05:00
Armin Braun	4e462db2ed	Fix BlobStoreIncrementalityIT (#54055 ) (#54060 ) The snapshot stats response list of snapshot statuses is not ordered according to the given list of snapshot names so randomly we could mix up snapshot1 and snapshot2 when asserting on the stats. Fixed by getting each snapshot's stats individually. Closes #54034	2020-03-24 11:46:40 +01:00
muachilin	b33fbe7026	Deprecate alternatives to the hot threads API (#52930 ) This commit deprecates various undocumented alternatives to the hot threads API.	2020-03-23 23:24:40 -04:00
Jim Ferenczi	9e3f7f4575	Add heuristics to compute pre_filter_shard_size when unspecified (#53873 ) (#54007 ) This commit changes the pre_filter_shard_size default from 128 to unspecified. This allows to apply heuristics based on the request and the target indices when deciding whether the can match phase should run or not. When unspecified, this pr runs the can match phase automatically if one of these conditions is met: * The request targets more than 128 shards. * The request contains read-only indices. * The primary sort of the query targets an indexed field. Users can opt-out from this behavior by setting the `pre_filter_shard_size` to a static value. Closes #39835	2020-03-24 02:05:15 +01:00
Nik Everett	4734c645f1	Fix serialization bug for aggs (#54029 ) I created this bug today in #53793. When a `DelayableWriteable` that references an existing object serializes itself it wasn't taking the version of the node on the other side of the wire into account. This fixes that.	2020-03-23 19:00:47 -04:00
Jason Tedor	5c96a7e210	Fix compilation in RemoteClusterServiceTests This commit fixes an issue when a JDK collection convenience method not available in JDK 8 was backported to 7.x.	2020-03-23 18:41:17 -04:00
Jason Tedor	d3cc5bff17	Give helpful message on remote connections disabled (#53690 ) Today when cluster.remote.connect is set to false, and some aspect of the codebase tries to get a remote client, we return a no such remote cluster exception. This can be quite perplexing to users, especially if the remote cluster is actually defined in their cluster state, it is only that the local node is not a remote cluster client. This commit addresses this by providing a dedicated error message when a remote cluster is not available because the local node is not a remote cluster client.	2020-03-23 18:32:38 -04:00
Mark Vieira	70cfedf542	Refactor global build info plugin to leverage JavaInstallationRegistry (#54026 ) This commit removes the configuration time vs execution time distinction with regards to certain BuildParms properties. Because of the cost of determining Java versions for configuration JDK locations we deferred this until execution time. This had two main downsides. First, we had to implement all this build logic in tasks, which required a bunch of additional plumbing and complexity. Second, because some information wasn't known during configuration time, we had to nest any build logic that depended on this in awkward callbacks. We now defer to the JavaInstallationRegistry recently added in Gradle. This utility uses a much more efficient method for probing Java installations vs our jrunscript implementation. This, combined with some optimizations to avoid probing the current JVM as well as deferring some evaluation via Providers when probing installations for BWC builds we can maintain effectively the same configuration time performance while removing a bunch of complexity and runtime cost (snapshotting inputs for the GenerateGlobalBuildInfoTask was very expensive). The end result should be a much more responsive build execution in almost all scenarios. (cherry picked from commit ecdbd37f2e0f0447ed574b306adb64c19adc3ce1)	2020-03-23 15:30:10 -07:00
Mark Vieira	be1b34c3f8	Mute BlobStoreIncrementalityIT.testIncrementalBehaviorOnPrimaryFailover	2020-03-23 15:15:30 -07:00
Nik Everett	b9bfba2c8b	Move pipeline agg validation to coordinating node (backport of #53669 ) (#54019 ) This moves the pipeline aggregation validation from the data node to the coordinating node so that we, eventually, can stop sending pipeline aggregations to the data nodes entirely. In fact, it moves it into the "request validation" stage so multiple errors can be accumulated and sent back to the requester for the entire request. We can't always take advantage of that, but it'll be nice for folks not to have to play whack-a-mole with validation. This is implemented by replacing `PipelineAggregationBuilder#validate` with: ``` protected abstract void validate(ValidationContext context); ``` The `ValidationContext` handles the accumulation of validation failures, provides access to the aggregation's siblings, and implements a few validation utility methods.	2020-03-23 17:22:56 -04:00
Jason Tedor	bc7b995523	Use deprecation logger holder in byte size value (#53928 ) If a setting is touched during bootstrap before logging is configured, and that setting uses a byte size value, the deprecation logger for ByteSizeValue will be initialized. However, this means a logger will be configured before log4j is initialized, which we reject at startup. This commit puts this deprecation logger in a holder pattern so that it is not initialized until first use, which will happen after logging is configured.	2020-03-23 17:06:12 -04:00
Marios Trivyzas	3a3e964956	Reduce performance impact of ExitableDirectoryReader (#53978 ) (#54014 ) Benchmarking showed that the effect of the ExitableDirectoryReader is reduced considerably when checking every 8191 docs. Moreover, set the cancellable task before calling QueryPhase#preProcess() and make sure we don't wrap with an ExitableDirectoryReader at all when lowLevelCancellation is set to false to avoid completely any performance impact. Follows: #52822 Follows: #53166 Follows: #53496 (cherry picked from commit cdc377e8e74d3ca6c231c36dc5e80621aab47c69)	2020-03-23 21:30:34 +01:00
Nik Everett	181bc807be	Try to save memory on aggregations (backport of #53793 ) (#53996 ) This delays deserializing the aggregation response tree until right before we merge the objects.	2020-03-23 15:45:22 -04:00
Dan Hermann	ce31997ab2	disable check for non-snapshot builds for data streams feature flag (#54000 )	2020-03-23 13:29:51 -05:00
Luca Cavanna	932a7e3112	Backport of async search changes (#53976 ) * Get Async Search: omit _clusters section when empty (#53907) The _clusters section is omitted by the search API whenever no remote clusters are searched. Async search should do the same, but Get Async Search returns a deserialized response, hence a weird `_clusters` section with all values set to `0` gets returned instead. In fact the recreated Clusters object is not the same object as the EMPTY constant, yet it has the same content. This commit addresses this by changing the comparison in the `toXContent` method to not print out the section if the number of total clusters is `0`. * Async search: remove version from response (#53960) The goal of the version field was to quickly show when you can expect to find something new in the search response, compared to when nothing has changed. This can also be done by looking at the `_shards` section and `num_reduce_phases` returned with the search response. In fact when there has been one or more additional reduction of the results, you can expect new results in the search response. Otherwise, the `_shards` section could notify of additional failures of shards that have completed the query, but that is not a guarantee that their results will be exposed (only when the following partial reduction is performed their results will be available). That said this commit clarifies this in the docs and removes the version field from the async search response * Async Search: replicas to auto expand from 0 to 1 (#53964) This way single node clusters that are green don't go yellow once async search is used, while all the others still have one replica. * [DOCS] address timing issue in async search docs tests (#53910) The docs snippets for submit async search have proven difficult to test as it is not possible to guarantee that you get a response that is not final, even when providing `wait_for_completion=0`. 
In the docs we want to show though a proper long-running query, and its first response should be partial rather than final. With this commit we adapt the docs snippets to show a partial response, and replace under the hood all that's needed to make the snippets tests succeed when we get a final response. Also, increased the timeout so we always get a final response. Closes #53887 Closes #53891	2020-03-23 19:13:31 +01:00
Ryan Ernst	960d1fb578	Revert "Introduce system index APIs for Kibana (#53035 )" (#53992 ) This reverts commit `c610e0893d`. backport of #53912	2020-03-23 10:29:35 -07:00
Armin Braun	5b9864db2c	Better Incrementality for Snapshots of Unchanged Shards (#52182 ) (#53984 ) Use sequence numbers and force merge UUID to determine whether a shard has changed or not before falling back to comparing files to get incremental snapshots on primary fail-over.	2020-03-23 16:43:41 +01:00
Tanguy Leroux	8b9d6e6dbb	Increase ensureGreen() timeout in CloseWhileRelocatingShardsIT (#53981 ) The test in CloseWhileRelocatingShardsIT failed recently multiple times (3) when waiting for initial indices to become green. Looking at the execution logs from #53544 it appears at the very beginning of the test and when the WindowsFS file system is picked up (which is known to slow down tests). This commit simply increases the timeout for the first ensureGreen() to 60 seconds. If the test continues to fail, we might want to test a larger timeout or disable WindowsFS for this test. Closes #53544	2020-03-23 16:24:25 +01:00
Martijn van Groningen	aef7b89219	Backport: initial data stream commit (#53959 ) This commit adds a data stream feature flag, initial definition of a data stream and the stubs for the data stream create, delete and get APIs. Also simple serialization tests are added and a rest test to test the data stream API stubs. This is a large amount of code and mainly mechanical, but this commit should be straightforward to review, because there isn't any real logic. The data stream transport and rest action are behind the data stream feature flag and are only initialized if the feature flag is enabled. The feature flag is enabled if elasticsearch is built as a snapshot or a release build and the 'es.datastreams_feature_flag_registered' is enabled. The integ-test-zip sets the feature flag if building a release build, otherwise rest tests would fail. Relates to #53100	2020-03-23 12:58:09 +01:00
David Turner	0fb31d9e7a	Allow static cluster.max_voting_config_exclusions (#53717 ) Today we only read `cluster.max_voting_config_exclusions` from the dynamic settings in the cluster metadata, ignoring any value set in `elasticsearch.yml`. This commit addresses this. Closes #53455	2020-03-23 08:38:12 +00:00
Ignacio Vera	efd1838206	Handle properly indexing rectangles that cross the dateline (#53810 ) (#53947 ) When indexing a rectangle that crosses the dateline, we are currently not handling it properly and we index a polygon that does not cross the dateline. This change generates two polygons wrapping the dateline.	2020-03-23 09:12:03 +01:00
Stuart Tettemer	d25c01a373	Scripting: Increase ingest script cache defaults (#53906 ) * Adds ability for contexts to specify their own defaults. * Context defaults are applied if no context-specific or general setting exists. * See 070ea7e for settings keys. * Increases the per-context default for the `ingest` context. * Cache size is doubled, 200 compared to default of 100 * Cache expiration is unchanged at no expiration * Cache max compilation is quintupled, 375/5m instead of 75/5m Backport of: 1b37d4b Refs: #50152	2020-03-20 16:48:50 -06:00
Gordon Brown	10cabbbade	Transition Transforms to using hidden indices for notifications index (#53773 ) This commit changes the Transforms notifications index to be a hidden index, with a hidden alias. This commit also removes the temporary hack in MetaDataCreateIndexService that prevents deprecation warnings for known dot-prefixed index names which are not hidden/system indices, as this was the last index pattern to need that hack.	2020-03-20 15:40:58 -06:00
Stuart Tettemer	ac575b68a9	Scripting: Context script cache unlimited compile (#53769 ) (#53899 ) * Adds "unlimited" compilation rate for context script caches * `script.context.${CONTEXT}.max_compilations_rate` = `unlimited` disables compilation rate limiting for `${CONTEXT}`'s script cache Refs: #50152	2020-03-20 15:14:30 -06:00
Lee Hinman	1f3de2fa7e	Set feature flags for IndexTemplatesV2 in top-level gradle file (#53898 ) Resolves #53892	2020-03-20 14:52:22 -06:00
Gordon Brown	f0674af132	Add isHidden to AliasActions equals/hashcode (#53700 ) This commit adds the `isHidden` flag to the `equals` and `hashCode` methods for `AliasActions`.	2020-03-20 13:59:40 -06:00
David Turner	879e26ec06	Describe STALE_STATE_CONFIG in ClusterFormationFH (#53878 ) We mark cluster states persisted on master-ineligible nodes as potentially-stale using the voting configuration `{STALE_STATE_CONFIG}` which prevents these nodes from being elected as master if they are restarted as master-eligible. Today we do not handle this special voting configuration differently in the `ClusterFormationFailureHandler`, leading to a mysterious message `an election requires a node with id [STALE_STATE_CONFIG]` if the election does not succeed. This commit adds a special case description for this situation to explain better why this node cannot win an election. Closes #53734	2020-03-20 20:02:51 +01:00
Igor Motov	88d50ec583	Fix random failures in InternalTopHitsTests#testReduceRandom (#53832 ) The test was randomly and very rarely failing due to generating the same sort key for multiple records, which was making order of these records in the results nondeterministic. While investigating the test I also found that the data wasn't generated in the way that matches the actual data. Normally, the order of documents in hits and scoreDocs in InternalTopHits should be the same. However, in the test only scoreDocs were sorted which was causing very confusing failure messages. This commit fixes this issue as well. Fixes #53676	2020-03-20 13:35:59 -04:00
David Turner	adfeb50a53	Use consistent threadpools in CoordinatorTests (#53868 ) Today in the `CoordinatorTests` each node uses multiple threadpools. This is mostly fine as they are almost completely stateless, except for the `ThreadContext`: by using multiple threadpools we cannot make assertions that the thread context is/isn't preserved as we expect. This commit consolidates the threadpool instances in use so that each node uses just one.	2020-03-20 16:22:42 +01:00
Alan Woodward	a3f21f24ea	Emit deprecation warning when TermsLookup contains a type (#53731 ) TermsLookup in master no longer accepts a type parameter. We should emit a deprecation warning in 7.x when a terms lookup request includes a type to prepare users for its removal. Relates to #41059	2020-03-20 15:11:31 +00:00
Christoph Büscher	8eacb153df	Add async_search.submit to HLRC #53592 (#53852 ) This commit adds a new AsyncSearchClient to the High Level Rest Client which initially supports the submitAsyncSearch in its blocking and non-blocking flavour. Also adding client side request and response objects and parsing code to parse the xContent output of the client side AsyncSearchResponse together with parsing roundtrip tests and a simple roundtrip integration test. Relates to #49091 Backport of #53592	2020-03-20 13:15:58 +01:00
Alan Woodward	d23112f441	Report parser name and location in XContent deprecation warnings (#53805 ) It's simple to deprecate a field used in an ObjectParser just by adding deprecation markers to the relevant ParseField objects. The warnings themselves don't currently have any context - they simply say that a deprecated field has been used, but not where in the input xcontent it appears. This commit adds the parent object parser name and XContentLocation to these deprecation messages. Note that the context is automatically stripped from warning messages when they are asserted on by integration tests and REST tests, because randomization of xcontent type during these tests means that the XContentLocation is not constant	2020-03-20 11:52:55 +00:00
Jason Tedor	4e6bbf6e3c	Execute retention lease syncs under system context (#53838 ) The retention lease syncs need to occur under the system context, because they are internal actions executed on behalf of the user. Today we are relying on this happening for background syncs by virtue of the fact that the context the syncs are created under is the system context. This is due to these occurring on the cluster state applier thread. However, there are situations where this does not hold such as when a timed out cluster state publication occurs, and the node where the shard is allocated is the elected master node. In that case, the context will be empty due to the fact that we do not reschedule publication under the system context. Currently, doing so runs us into some troubles with losing the existing context, possibly dropping deprecation headers. We could copy that context over when marking the current context as the system context, but the implications of that require some more investigation. For now, we explicitly mark the retention lease syncs as executing under the system context, as this is a situation that we can reason about.	2020-03-20 07:36:12 -04:00
Ryan Ernst	f7143b8d85	Fix Joda compatibility in stream protocol (#53823 ) The JodaCompatibleZonedDateTime is a compatibility object that unions Joda's DateTime and Java's ZonedDateTime, meant for use in scripts. When it was added, we serialized the JCZDT as a Joda DateTime so that when sending to older nodes they could still read the object. However, on newer nodes, we continued also reading this as a Joda DateTime. This commit changes the read side to form a JCZDT. closes #53586	2020-03-19 16:39:20 -07:00
Lee Hinman	c3dee628c7	[7.x] Add IndexTemplateV2 to MetaData (#53753 ) (#53827 ) * Add IndexTemplateV2 to MetaData (#53753) * Add IndexTemplateV2 to MetaData This adds the `IndexTemplateV2` and `IndexTemplateV2Metadata` class to be used for the new implementation of index templates. The new metadata is stored as a `MetaData.Custom` implementation. Relates to #53101 * Add ITV2Metadata unit tests Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> * Update min supported version constant Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-19 15:04:00 -06:00
Mayya Sharipova	2c77c0df65	Fix testIndexHasDuplicateData tests (#49786 ) testIndexHasDuplicateData tests were failing occasionally, due to approximate calculation of BKDReader.estimatePointCount, where if the node is Leaf, the number of points in it was (maxPointsInLeafNode + 1) / 2. As DEFAULT_MAX_POINTS_IN_LEAF_NODE = 1024, for small indexes used in tests, the estimation could be really off. This rewrites tests, to make the max points in leaf node to be a small value to control the tests. Closes #49703	2020-03-19 15:09:23 -04:00
Mark Vieira	3b2b564c91	Improve IntelliJ IDE integration (#53747 ) This commit makes a number of improvements when importing the Elasticsearch project into IntelliJ IDEA. Specifically: - Contributing documentation has been updated to reflect that the 'idea' task should no longer be used and Gradle project import is instead the officially supported way of setting up the project. - Attempts to run the 'idea' task will result in a failure with a message directing folks to our CONTRIBUTING.md document. - The project JDK is explicitly set rather than using whatever JAVA_HOME is. - Gradle build operation delegation is disabled, and test execution is configured to 'choose per test'. - Gradle is configured to inherit the project JDK. - Some code style conventions are automatically configured. - File encoding is explicitly set to UTF-8. - Parallel module compilation is enabled and deprecated feature warnings are disabled. - A remote debug run configuration using listen mode is created. - JUnit runner is configured with required system properties. - License headers are configured such that Apache 2 is the default notice added to all source files with exception of source in /x-pack which will use the Elastic license.	2020-03-19 11:43:33 -07:00
David Turner	7d3ac4f57d	Revert "Apply cluster states in system context (#53785 )" This reverts commit `4178c57410`.	2020-03-19 15:20:36 +00:00
David Turner	4178c57410	Apply cluster states in system context (#53785 ) Today cluster states are sometimes (rarely) applied in the default context rather than system context, which means that any appliers which capture their contexts cannot do things like remote transport actions when security is enabled. There are at least two ways that we end up applying the cluster state in the default context: 1. locally applying a cluster state that indicates that the master has failed 2. the elected master times out while waiting for a response from another node This commit ensures that cluster states are always applied in the system context. Mitigates #53751	2020-03-19 14:48:55 +00:00
Ignacio Vera	4f1b2fd2b1	Add support for distance queries on geo_shape queries (#53466 ) (#53795 ) With the upgrade to Lucene 8.5, LatLonShape field has support for distance queries. This change implements this new feature and removes the limitation.	2020-03-19 15:21:58 +01:00
Dominic Page	b0884baf46	Geo shape query vs geo point backport (#53774 ) Backport to 7x Enable geo_shape query to work on geo_point fields for shapes: circle, polygon, multipolygon, rectangle see: #48928 Co-Authored-By: @iverase	2020-03-19 13:00:36 +01:00
Jim Ferenczi	4b0ae15a9d	Disable distributed sort optimization on scroll requests (#53759 ) This commit disables the sort optimization added in #51852 for scroll requests. Scroll queries keep a state per shard so we cannot modify the request on the first round (submit). This bug was introduced in non-released versions which is why this pr is marked as a non-issue.	2020-03-19 08:11:23 +01:00
Mark Vieira	9b3b08318d	Remove unused import	2020-03-18 21:07:17 -07:00
Jason Tedor	bc5dae2713	Fix compilation in RoutingNode This commit fixes compilation in RoutingNode.java after a backport brought back usage of an API not available in JDK 8.	2020-03-18 22:21:54 -04:00
Jason Tedor	90ab949415	Improve performance of shards limits decider (#53577 ) On clusters with a large number of shards, the shards limits allocation decider can exhibit poor performance leading to timeouts applying cluster state updates. This occurs because for every shard, we do a loop to count the number of shards on the node, and the number of shards for the index of the shard. This is roughly quadratic in the number of shards. This loop is not necessary, since we already have an O(1) method to count the number of non-relocating shards on a node, and with this commit we add some infrastructure to RoutingNode to make counting the number of shards per index O(1).	2020-03-18 20:58:22 -04:00
Stuart Tettemer	cdbee32f55	Scripting: Per-context script cache, default off (#52855 ) (#53756 ) * Adds per context settings: `script.context.${CONTEXT}.cache_max_size` ~ `script.cache.max_size` `script.context.${CONTEXT}.cache_expire` ~ `script.cache.expire` `script.context.${CONTEXT}.max_compilations_rate` ~ `script.max_compilations_rate` * Context cache is used if: `script.max_compilations_rate=use-context`. This value is dynamically updatable, so users can switch back to the general cache if desired. * Settings for context caches take the first value that applies: 1) Context specific settings if set, eg `script.context.ingest.cache_max_size` 2) Correlated general setting is set to the non-default value, eg `script.cache.max_size` 3) Context default The reason for 2's inclusion is to allow an easy transition for users who've customized their general cache settings. Using the general cache settings for the context caches results in higher effective settings, since they are multiplied across the number of contexts. So a general cache max size of 200 will become 200 * # of contexts. However, this behavior will avoid users snapping to a value that is too low for them. Backport of: #52855 Refs: #50152	2020-03-18 14:44:04 -06:00
Jim Ferenczi	8e17322b3a	Shortcut query phase using the results of other shards (#51852 ) (#53659 ) This commit, built on top of #51708, allows to modify shard search requests based on information collected on other shards. It is intended to speed up sorted queries on time-based indices. For queries that are only interested in the top documents, this change will rewrite the shard queries to match none if the bottom sort value computed in prior shards is better than all values in the shard. For queries that mix top documents and aggregations this change will reset the size of the top documents to 0 instead of rewriting to match none. This means that we don't need to keep a search context open for this shard since we know in advance that it doesn't contain any competitive hit.	2020-03-18 17:20:35 +01:00
Nhat Nguyen	1615c4b379	Fix testKeepTranslogAfterGlobalCheckpoint (#53704 ) Read the global checkpoint after flushing as we might advance it while flushing. Closes #53505	2020-03-18 11:24:19 -04:00
Alan Woodward	580bc40c0c	Make it possible to deprecate all variants of a ParseField with no replacement (#53722 ) Sometimes we want to deprecate and remove a ParseField entirely, without replacement; for example, the various places where we specify a _type field in 7x. Currently we can tell users only that a particular field name should not be used, and that another name should be used in its place. This commit adds the ability to say that a field should not be used at all.	2020-03-18 14:16:19 +00:00
Marios Trivyzas	d56dee599a	Increase step between checks for cancellation (#53712 ) The introduction of the ExitableDirectoryReader showed an increase in latencies for range queries using pointvalues. Check for cancellation every 1024 docs instead of every 15 to lower the impact of the check on query performance. Follows: #52822 Fixes: #53496 (cherry picked from commit 6b5fc35e4458e60a7ca5822584ec6a60562f2c01)	2020-03-18 14:52:40 +01:00
Tanguy Leroux	6cc564d677	Restore off-heap loading for term dictionary in ReadOnlyEngine (#53713 ) This is a partial restore of #43158, following decision taken in #51247 Closes #51247	2020-03-18 13:24:34 +01:00
Tianlun Li	e7ae9ae596	Deprecate delaying state recovery for master nodes (#53646 ) It is useful to be able to delay state recovery until enough data nodes have joined the cluster, since this gives the shard allocator a decent opportunity to re-use as much existing data as possible. However we also have the option to delay state recovery until a certain number of master-eligible nodes have joined, and this is unnecessary: we require a majority of master-eligible nodes for state recovery, and there is no advantage in waiting for more. This commit deprecates the unnecessary settings in preparation for their removal. Relates #51806	2020-03-18 10:04:22 +00:00
Lee Hinman	9c0e846db3	[7.x] Add REST API for ComponentTemplate CRUD (#53558 ) (#53681 ) * Add REST API for ComponentTemplate CRUD This adds the Put/Get/DeleteComponentTemplate APIs that allow inserting, retrieving, and removing ComponentTemplateMetadata into the cluster state metadata. These APIs are currently only available behind a feature flag system property - `es.itv2_feature_flag_registered`. Relates to #53101 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-17 13:23:28 -06:00
Ryan Ernst	5c472fcb47	Upgrade jackson to 2.10.3 and GeoIP to 2.13.1 (#53642 ) Re-applies the change from #53523 along with test fixes. closes #53626 closes #53624 closes #53622 closes #53625 Co-authored-by: Nik Everett <nik9000@gmail.com> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> Co-authored-by: Jake Landis <jake.landis@elastic.co>	2020-03-17 10:28:51 -07:00
Alan Woodward	71b703edd1	Rename AtomicFieldData to LeafFieldData (#53554 ) This conforms with lucene's LeafReader naming convention, and matches other per-segment structures in elasticsearch.	2020-03-17 12:30:12 +00:00
Jason Tedor	01d2339883	Invoke response handler on failure to send (#53631 ) Today it can happen that a transport message fails to send (for example, because a transport interceptor rejects the request). In this case, the response handler is never invoked, which can lead to necessary cleanups not being performed. There are two ways to handle this. One is to expect every callsite that sends a message to try/catch these exceptions and handle them appropriately. The other is merely to invoke the response handler to handle the exception, which is already equipped to handle transport exceptions.	2020-03-16 21:28:24 -04:00
Jason Tedor	881d0bfa8a	Add server name to remote info API (#53634 ) This commit adds the configured server_name to the proxy mode info so that it can be exposed in the remote info API.	2020-03-16 21:20:42 -04:00
Luca Cavanna	c3d2417448	Cumulative backport of async search changes (#53635 ) * Submit async search to work only with POST (#53368) Currently the submit async search API can be called using both GET and POST at REST, but given that it submits a call and creates internal state, POST should be the only allowed method. * Refine SearchProgressListener internal API (#53373) The following cumulative improvements have been made: - rename `onReduce` and `notifyReduce` to `onFinalReduce` and `notifyFinalReduce` - add unit test for `SearchShard` - on* methods in `SearchProgressListener` shouldn't need to be public as they should never be called directly, they only need to be overridden hence they can be made protected. They are actually called directly from a test which required some adapting, like making `AsyncSearchTask.Listener` class package private instead of private - Instead of overriding `getProgressListener` in `AsyncSearchTask`, as it feels weird to override a getter method, added a specific method that allows to retrieve the Listener directly without needing to cast it. Made the getter and setter for the listener final in the base class. - rename `SearchProgressListener#searchShards` methods to `buildSearchShards` and make it static given that it accesses no instance members - make `SearchShard` and `SearchShardTask` classes final * Move async search yaml tests to x-pack yaml test folder (#53537) The yaml tests for async search currently sit in its qa folder. There is no reason though for them to live in a separate folder as they don't require particular setup. This commit moves them to the main folder together with the other x-pack yaml tests so that they will be run by the client test runners too. 
* [DOCS] Add temporary redirect for async-search (#53454) The following API spec files contain a link to a not-yet-created async search docs page: * [async_search.delete.json][0] * [async_search.get.json][1] * [async_search.submit.json][2] The Elasticsearch-js client uses these spec files to create their docs. This created a broken link in the Elasticsearch-js docs, which has broken the docs build. This PR adds a temporary redirect for the docs page. This redirect should be removed when the actual API docs are added. [0]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.delete.json [1]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.get.json [2]: https://github.com/elastic/elasticsearch/blob/master/x-pack/plugin/src/test/resources/rest-api-spec/api/async_search.submit.json Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-03-17 00:08:17 +01:00
Nik Everett	9845dbb7d6	Fix sorting agg buckets by doc_count (backport of #53617 ) (#53627 ) I broke sorting aggregations by `doc_count` in #51271 by mixing up true and false. This flips that comparison and adds a few tests to double check that we don't do this again.	2020-03-16 17:35:43 -04:00
Nik Everett	f0beab4041	Stop using round-tripped PipelineAggregators (backport of #53423 ) (#53629 ) This begins to clean up how `PipelineAggregator`s are executed. Previously, we would create the `PipelineAggregator`s on the data nodes and embed them in the aggregation tree. When it came time to execute the pipeline aggregation we'd use the `PipelineAggregator`s that were on the first shard's results. This is inefficient because: 1. The data node needs to make the `PipelineAggregator` only to serialize it and then throw it away. 2. The coordinating node needs to deserialize all of the `PipelineAggregator`s even though it only needs one of them. 3. You end up with many `PipelineAggregator` instances when you only really need one per pipeline. 4. `PipelineAggregator` needs to implement serialization. This begins to undo these by building the `PipelineAggregator`s directly on the coordinating node and using those instead of the `PipelineAggregator`s in the aggregation tree. In a follow up change we'll stop serializing the `PipelineAggregator`s to node versions that support this behavior. And, one day, we'll be able to remove `PipelineAggregator` from the aggregation result tree entirely. Importantly, this doesn't change how pipeline aggregations are declared or parsed or requested. They are still part of the `AggregationBuilder` tree because that makes sense.	2020-03-16 16:15:23 -04:00
Gordon Brown	031932b32f	Allow _cat indices & aliases to use indices options (#53248 ) This commit adjusts the _cat/indices and _cat/aliases APIs to allow specifying indices options, so that these APIs can handle hidden indices/aliases in the same way as other APIs. Also adds the hidden option to the expand_wildcards parameter in the YAML spec for every API that accepts it.	2020-03-16 11:25:05 -06:00
markharwood	2c74f3e22c	Backport of new wildcard field type (#53590 ) * New wildcard field optimised for wildcard queries (#49993) Indexes values using size 3 ngrams and also stores the full original as a binary doc value. Wildcard queries operate by using a cheap approximation query on the ngram field followed up by a more expensive verification query using an automaton on the binary doc values. Also supports aggregations and sorting.	2020-03-16 15:07:13 +00:00
Mayya Sharipova	a906f8a0e4	Highlighters skip ignored keyword values (#53408 ) (#53604 ) Keyword field values with length more than ignore_above are not indexed. But highlighters were still retrieving these values from _source and trying to highlight them. This sometimes led to errors if a field length exceeded max_analyzed_offset. It is also wrong behaviour overall to attempt to highlight something that was ignored during indexing. This PR checks if a keyword value was ignored because of its length, and if yes, skips highlighting it. Backport: #53408 Closes #43800	2020-03-16 11:06:25 -04:00
Jim Ferenczi	e6680be0b1	Add new x-pack endpoints to track the progress of a search asynchronously (#49931 ) (#53591 ) This change introduces a new API in x-pack basic that allows tracking the progress of a search. Users can submit an asynchronous search through a new endpoint called `_async_search` that works exactly the same as the `_search` endpoint but instead of blocking and returning the final response when available, it returns a response after a provided `wait_for_completion` time. ```` GET my_index_pattern/_async_search?wait_for_completion=100ms { "aggs": { "date_histogram": { "field": "@timestamp", "fixed_interval": "1h" } } } ```` If after 100ms the final response is not available, a `partial_response` is included in the body: ```` { "id": "9N3J1m4BgyzUDzqgC15b", "version": 1, "is_running": true, "is_partial": true, "response": { "_shards": { "total": 100, "successful": 5, "failed": 0 }, "total_hits": { "value": 1653433, "relation": "eq" }, "aggs": { ... } } } ```` The partial response contains the total number of requested shards, the number of shards that successfully returned and the number of shards that failed. It also contains the total hits as well as partial aggregations computed from the successful shards. To continue to monitor the progress of the search users can call the get `_async_search` API like the following: ```` GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms ```` That returns a new response that can contain the same partial response as the previous call if the search didn't progress, in which case the returned `version` should be the same. If new partial results are available, the version is incremented and the `partial_response` contains the updated progress. 
Finally, if the response is fully available while or after waiting for completion, the `partial_response` is replaced by a `response` section that contains the usual _search response: ```` { "id": "9N3J1m4BgyzUDzqgC15b", "version": 10, "is_running": false, "response": { "is_partial": false, ... } } ```` Asynchronous searches are stored in a restricted index called `.async-search` if they survive (still running) after the initial submit. Each request has a keep alive that defaults to 5 days but this value can be changed/updated any time: ````` GET my_index_pattern/_async_search?wait_for_completion=100ms&keep_alive=10d ````` The default can be changed when submitting the search, the example above raises the default value for the search to `10d`. ````` GET _async_search/9N3J1m4BgyzUDzqgC15b/?wait_for_completion=100ms&keep_alive=10d ````` The time to live for a specific search can be extended when getting the progress/result. In the example above we extend the keep alive to 10 more days. A background service that runs only on the node that holds the first primary shard of the `async-search` index is responsible for deleting the expired results. It runs every hour but the expiration is also checked by running queries (if they take longer than the keep_alive) and when getting a result. Like a normal `_search`, if the http channel that is used to submit a request is closed before getting a response, the search is automatically cancelled. Note that this behavior is only for the submit API, subsequent GET requests will not cancel if they are closed. Asynchronous searches are not persistent: if the coordinator node crashes or is restarted during the search, the asynchronous search will stop. To know if the search is still running or not the response contains a field called `is_running` that indicates if the task is up or not. It is the responsibility of the user to resume an asynchronous search that didn't reach a final response by re-submitting the query. 
However, final responses and failures are persisted in a system index that allows retrieving a response even if the task finishes. ```` DELETE _async_search/9N3J1m4BgyzUDzqgC15b ```` The response is also not stored if the initial submit action returns a final response. This avoids adding any overhead to queries that complete within the initial `wait_for_completion`. The `.async-search` index is a restricted index (should be migrated to a system index in +8.0) that is accessible only through the async search APIs. These APIs also ensure that only the user that submitted the initial query can retrieve or delete the running search. Note that admins/superusers would still be able to cancel the search task through the task manager like any other tasks. Relates #49091 Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>	2020-03-16 15:31:27 +01:00
David Turner	7e82a4f78c	Do not log no-op reconnections at DEBUG (#53469 ) Today the NodeConnectionsService emits a DEBUG-level log message each time it calls TransportService#connectToNode, which happens for every node in the cluster every ten seconds, and also at every cluster state update. That's a lot of log messages. Most of these calls are no-ops and can be ignored, but if the call was not a no-op then it may be worth investigating further. Since the logs do not distinguish the interesting and uninteresting cases, they are not useful. This commit distinguishes the two cases and pushes the noisy logging for the common no-op case down to TRACE level, leaving only useful and actionable information in the DEBUG-level logs.	2020-03-16 08:56:20 +00:00
Mark Vieira	2f0aca992b	Revert "Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576 )" This reverts commit `b7dbadeea0`.	2020-03-15 18:10:40 -07:00
Jason Tedor	66374b61ca	Remove extra code in allocation commands parsing (#53579 ) This commit removes some code that is duplicated in the parsing of allocation commands in the cluster reroute API.	2020-03-14 18:14:13 -04:00
Jason Tedor	b7dbadeea0	Upgrade to Jackson 2.10.3 and GeoIP2 to 2.13.1 (#53576 ) This commit upgrades our Jackson dependency to 2.10.3 and our GeoIP2 dependency to 2.13.1. Relates #53523	2020-03-14 13:28:06 -04:00
Marios Trivyzas	b6c94fd73e	Fix Term Vectors with artificial docs and keyword fields (#53504 ) (#53550 ) Previously, Term Vectors API was returning empty results for artificial documents with keyword fields. Checking only for `string()` on `IndexableField` is not enough, since for `KeywordFieldType` `binaryValue()` must be used instead. Fixes #53494 (cherry picked from commit 1fc3fe3d32f41eab2101c0536751b7c47e63cc48)	2020-03-13 19:26:14 +01:00
Dan Hermann	fb29c2dccf	Fix ingest pipeline _simulate api with empty docs never returns a res… (#52937 ) (#53547 )	2020-03-13 09:41:14 -05:00
William Brafford	5b718d2565	Use snake case for nodes stats/info metric names (#53446 ) (#53535 ) * Use snake case for nodes stats/info metric names (#53446) The REST API uses "thread_pool" as the name of the thread pool metric. If we use this name internally when we serialize nodes stats and info requests, we won't need to do any fancy logic to check for and switch out "threadPool", which was the previous internal name.	2020-03-13 07:49:14 -04:00
Jim Ferenczi	9dfcc07401	Fix pre-sorting of shards in the can_match phase (#53397 ) This commit fixes a bug on sorted queries with a primary sort field that uses different types in the requested indices. In this scenario the returned min/max values to sort the shards are not comparable so we should avoid the sorting rather than throwing an obscure exception.	2020-03-13 01:28:11 +01:00
Nhat Nguyen	fe2f6b359e	Fix concurrent requests race over scroll context limit (#53449 ) Concurrent search scroll requests can lead to more scroll contexts than the limit.	2020-03-12 17:56:51 -04:00
Lee Hinman	2789fe4179	[7.x] Add ComponentTemplate to MetaData (#53290 ) (#53489 ) * Add ComponentTemplate to MetaData (#53290) * Add ComponentTemplate to MetaData This adds a `ComponentTemplate` datastructure that will be used as part of #53101 (Index Templates v2) to the `MetaData` class. Currently there are no APIs for interacting with this class, so it will always be an empty map (other than in tests). This infrastructure will be built upon to add APIs in a subsequent commit. A `ComponentTemplate` is made up of a `Template`, a version, and a MetaData.Custom class. The `Template` contains similar information to an `IndexTemplateMetaData` object— settings, mappings, and alias configuration. * Update minimal supported version constant Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-03-12 15:33:32 -06:00
Nik Everett	9dcd64c110	Preserve metric types in top_metrics (backport of #53288 ) (#53440 ) This changes the `top_metrics` aggregation to return metrics in their original type. Since it only supports numerics, that means that dates, longs, and doubles will come back as stored, with their appropriate formatter applied.	2020-03-12 17:17:09 -04:00
Jay Modi	0353b804bf	Mute testKeepTranslogAfterGlobalCheckpoint (#53510 ) This change mutes a test that fails reproducibly in InternalEngineTests. Relates #53505	2020-03-12 13:08:13 -06:00
Lee Hinman	67fffe676e	[7.x] Add read/writeOptionalVLong to StreamInput/Output (#5314… (#53491 ) The spirit of StreamInput/StreamOutput is that common I/O patterns should be handled by these classes so that the persistence methods in application classes can be kept short, which facilitates easy visual comparison between read and write methods, and reduces risks of having serialization issues due to mismatched implementations. To this end, this change adds readOptionalVLong and writeOptionalVLong methods to these classes as we have started to build up cases where that conditional/null logic has been implemented directly in the read & write methods. Co-authored-by: Tim Vernum <tim.vernum@elastic.co>	2020-03-12 10:59:31 -06:00
Przemyslaw Gomulka	2438b899eb	Support joda style date patterns in 7.x (#52555 ) If an index was created in version 6 and contains a date field with a joda-style pattern, it should still be possible to search it and insert documents into it. Indices created in 6 whose date pattern starts with 8 should be considered java style.	2020-03-12 08:57:03 +01:00
Nik Everett	9ada508347	Fix date_nanos in composite aggs (backport of #53315 ) (#53347 ) It looks like `date_nanos` fields weren't likely to work properly in composite aggs because composites iterate field values using points and we weren't converting the points into milliseconds. Because the doc values were coming back in milliseconds we ended up getting very confused and just never collecting sub-aggregations. This fixes that by adding a method to `DateFieldMapper.Resolution` called `parsePointAsMillis`, which is similar in name and function to `NumberFieldMapper.NumberType`'s `parsePoint` except that it normalizes to milliseconds, which is what aggs need at the moment. Closes #53168	2020-03-11 13:00:07 -04:00
Nhat Nguyen	1fd56698fa	Adjust wire version for search context id Relates #53143	2020-03-11 11:48:11 -04:00
Nhat Nguyen	6665ebe7ab	Harden search context id (#53143 ) Using a Long alone is not strong enough for the id of search contexts because we reset the id generator whenever a data node is restarted. This can lead to two issues: 1. Fetch phase can fetch documents from another index 2. A scroll search can return documents from another index This commit avoids these issues by adding a UUID to SearchContextId.	2020-03-11 11:48:11 -04:00
David Turner	ac721938c2	Allow joining node to trigger term bump (#53338 ) In rare circumstances it is possible for an isolated node to have a greater term than the currently-elected leader. Today such a node will attempt to join the cluster but will not offer a vote to the leader and will reject its cluster state publications due to their stale term. This situation persists since there is no mechanism for the joining node to inform the leader that its term is stale and a new election is required. This commit adds the current term of the joining node to the join request. Once the join has been validated, the leader will perform another election to increase its term far enough to allow the isolated node to join properly. Fixes #53271	2020-03-11 09:19:44 +00:00
Armin Braun	7189c57b6c	Record Force Merges in Live Commit Data (#52694 ) (#53372 ) * Record Force Merges in live commit data Prerequisite of #52182. Record force merges in the live commit data so two shard states with the same sequence number that differ only in whether or not they have been force merged can be distinguished when creating snapshots.	2020-03-11 06:30:36 +01:00
Nhat Nguyen	24f114766f	Fix doc_stats and segment_stats of ReadOnlyEngine (#53345 ) We can't always have the same segment stats and doc stats between InternalEngine and ReadOnlyEngine if there are some fully deleted segments. ReadOnlyEngine always filters them out. InternalEngine, however, will keep them if peer recovery retention leases exist or the number of the retaining operations is non-zero. This change reverts the fix in #51331 and uses the wrapped reader to calculate the segment stats and doc stats. For the test, we need to disable the extra retaining soft-deletes operations. Closes #51303	2020-03-10 21:51:33 -04:00
Gordon Brown	20bbe5bae4	Fix Rollover handling of hidden aliases (#53146 ) Prior to this commit, rollover did not propagate the `is_hidden` alias property when rolling over an index. This commit ensures that an alias that's rolled over will remain hidden.	2020-03-10 10:56:12 -06:00
Nik Everett	5ce6de2c1a	Simplify SiblingPipelineAggregator (#53144 ) (#53341 ) This removes the `instanceof`s from `SiblingPipelineAggregator` by adding a `rewriteBuckets` method to `InternalAggregation` that can be called to, well, rewrite the buckets. The default implementation of `rewriteBuckets` throws the same exception that was thrown when you attempted to run a `SiblingPipelineAggregator` on an aggregation without buckets. It is overridden by `InternalSingleBucketAggregation` and `InternalMultiBucketAggregation` to correctly rewrite their buckets.	2020-03-10 11:39:10 -04:00
Nik Everett	89c0e1f566	Fix composite agg sort bug (backport of #53296 ) (#53337 ) When a composite aggregation is run against an index with a sort that starts with the "source" fields from the composite but has additional fields it'd blow up while trying to decide if it could use the sort. This changes it to decide that it can use the sort. Closes #52480	2020-03-10 11:32:46 -04:00
Jim Ferenczi	ae6c25b749	Speed up partial reduce of terms aggregations (#53216 ) This change optimizes the merge of terms aggregations by removing the priority queue that was used to collect all the buckets during a non-final reduction. We don't need to keep the result sorted since the merge of buckets in a subsequent reduce can modify the order. I wrote a small micro-benchmark to test the change and the speedups are significant for small merge buffer sizes: ```` ########## Master: Benchmark (bufferSize) (cardinality) (numShards) (topNSize) Mode Cnt Score Error Units TermsReduceBenchmark.reduceTopHits 5 10000 1000 1000 avgt 10 2459,690 ± 198,682 ms/op TermsReduceBenchmark.reduceTopHits 16 10000 1000 1000 avgt 10 1030,620 ± 91,544 ms/op TermsReduceBenchmark.reduceTopHits 32 10000 1000 1000 avgt 10 558,608 ± 44,915 ms/op TermsReduceBenchmark.reduceTopHits 128 10000 1000 1000 avgt 10 287,333 ± 8,342 ms/op TermsReduceBenchmark.reduceTopHits 512 10000 1000 1000 avgt 10 257,325 ± 54,515 ms/op ########## Patch: Benchmark (bufferSize) (cardinality) (numShards) (topNSize) Mode Cnt Score Error Units TermsReduceBenchmark.reduceTopHits 5 10000 1000 1000 avgt 10 805,611 ± 14,630 ms/op TermsReduceBenchmark.reduceTopHits 16 10000 1000 1000 avgt 10 378,851 ± 17,929 ms/op TermsReduceBenchmark.reduceTopHits 32 10000 1000 1000 avgt 10 261,094 ± 10,176 ms/op TermsReduceBenchmark.reduceTopHits 128 10000 1000 1000 avgt 10 241,051 ± 19,558 ms/op TermsReduceBenchmark.reduceTopHits 512 10000 1000 1000 avgt 10 231,643 ± 6,170 ms/op ```` The code for the benchmark can be found [here](). It seems to be up to 3x faster for terms aggregations that return 10,000 unique terms (1000 terms per shard). 
For a cardinality of 100,000 terms, this patch is up to 5x faster: ```` ########## Patch: Benchmark (bufferSize) (cardinality) (numShards) (topNSize) Mode Cnt Score Error Units TermsReduceBenchmark.reduceTopHits 5 100000 1000 1000 avgt 10 12791,083 ± 397,128 ms/op TermsReduceBenchmark.reduceTopHits 16 100000 1000 1000 avgt 10 3974,939 ± 324,617 ms/op TermsReduceBenchmark.reduceTopHits 32 100000 1000 1000 avgt 10 2186,285 ± 267,124 ms/op TermsReduceBenchmark.reduceTopHits 128 100000 1000 1000 avgt 10 914,657 ± 160,784 ms/op TermsReduceBenchmark.reduceTopHits 512 100000 1000 1000 avgt 10 604,198 ± 145,457 ms/op ########## Master: Benchmark (bufferSize) (cardinality) (numShards) (topNSize) Mode Cnt Score Error Units TermsReduceBenchmark.reduceTopHits 5 100000 1000 1000 avgt 10 60696,107 ± 929,944 ms/op TermsReduceBenchmark.reduceTopHits 16 100000 1000 1000 avgt 10 16292,894 ± 783,398 ms/op TermsReduceBenchmark.reduceTopHits 32 100000 1000 1000 avgt 10 7705,444 ± 77,588 ms/op TermsReduceBenchmark.reduceTopHits 128 100000 1000 1000 avgt 10 2156,685 ± 88,795 ms/op TermsReduceBenchmark.reduceTopHits 512 100000 1000 1000 avgt 10 760,273 ± 53,738 ms/op ```` The merge of buckets can also be optimized. Currently we use a hash map to merge buckets coming from different shards, so this can be costly if the number of unique terms is high. Instead, we could always sort the shard terms result by key and perform a merge sort to reduce the results. This would save memory and make the merge more linear in terms of complexity in the coordinating node at the expense of an additional sort in the shards. I plan to test this possible optimization in a follow up. Relates #51857	2020-03-10 14:26:59 +01:00
Nik Everett	e23c3f915f	Save a little space on empty BitArrays (#53243 ) (#53316 ) It doesn't make a whole lot of sense for `BitArray#clear` to grow the underlying storage array just to clear the bit. We already treat indices outside of the storage array as unset. This turns such operations into a noop.	2020-03-10 09:22:19 -04:00
Alan Woodward	5c861cfe6e	Upgrade to final lucene 8.5.0 snapshot (#53293 ) Lucene 8.5.0 release candidates are imminent. This commit upgrades master to use the latest snapshot to check that there are no last-minute bugs or regressions.	2020-03-10 09:32:59 +00:00
Gordon Brown	1cb0a4399d	Fix Get Alias API handling of hidden indices with visible aliases (#53147 ) This commit changes the Get Aliases API to include hidden indices by default - this is slightly different from other APIs, but is necessary to make this API work intuitively.	2020-03-09 16:16:29 -06:00
William Brafford	2bb4b96a7f	Serialize NodesStatsRequest as set of strings (#53235 ) (#53313 ) * Add unit tests before refactoring * Convert boolean fields to set of strings In order to make nodes stats plugins pluggable, we need to make the NodesStatsRequest class capable of carrying a flexible list of metrics rather than a fixed list of boolean flags. This commit changes the internal storage of the class without changing its serialization. * Change serialization of NodesStatsRequest * Set up BWC before merging * Singularize enum name	2020-03-09 18:13:29 -04:00
Jason Tedor	1860c57147	Deprecate the listener thread pool (#53266 ) The listener thread pool is being removed from use in the server codebase. This commit deprecates configuring the listener thread pool.	2020-03-09 16:56:01 -04:00
David Turner	b20f86e450	Clarify JavaDoc for DiscoveryNodes#resolveNodes (#53277 ) Closes #52887	2020-03-09 14:44:29 +00:00
David Turner	52ff341814	Deprecate passing settings in restore requests (#53268 ) Today we accept a `settings` field in snapshot restore requests, but this field is not used. This commit deprecates it.	2020-03-09 12:01:07 +00:00
Christoph Büscher	2fd954a3b7	Fix potential NPE in FuzzyTermsEnum (#53231 ) Under certain circumstances SpanMultiTermQueryWrapper uses SpanBooleanQueryRewriteWithMaxClause as its rewrite method, which in turn tries to get a TermsEnum from the wrapped MultiTermQuery currently using a `null` AttributeSource. While queries TermsQuery or subclasses of AutomatonQuery ignore this argument, FuzzyQuery uses it to create a FuzzyTermsEnum which triggers an NPE when the AttributeSource is not provided. This PR fixes this by supplying an empty AttributeSource instead of a `null` value. Closes #52894	2020-03-09 12:59:08 +01:00
Jason Tedor	5e96d3e59a	Use given executor for global checkpoint listener (#53260 ) Today when notifying a global checkpoint listener, we use the listener thread pool. This commit turns this inside out so that the global checkpoint listener must provide an executor on which to notify the listener.	2020-03-08 13:51:05 -04:00
Jason Tedor	79b67eb3ba	Drop action future that forks on listener executor (#53261 ) This commit drops the dispatching listenable action future that forks to the listener thread pool. This was previously used in the transport client but is no longer used.	2020-03-08 12:36:09 -04:00
Jason Tedor	a0b235888f	Avoid self-suppression on grouped action listener (#53262 ) It can be that a failure is repeated to a grouped action listener. For example, if the same exception, such as a connect transport exception, is the cause of repeated failures. Previously we were unconditionally self-suppressing the exception into the first exception, but self-suppressing is not allowed. Thus, we would throw an exception and the grouped action listener would never complete. This commit addresses this by guarding against self-suppression.	2020-03-08 08:59:57 -04:00
Jason Tedor	c5738ae312	Notify refresh listeners on the calling thread (#53259 ) Today we notify refresh listeners by forking to the listener thread pool and then serially notifying listeners on a thread there. Refreshes are expensive though, so the expectation is that we are executing refreshes on threads that can afford an expensive operation (e.g., not a network thread) and as such, executing listeners that we expect to be cheap on the calling thread is okay. This commit removes the forking of notifying refresh listeners to run directly on the calling thread that executed a refresh.	2020-03-07 13:12:40 -05:00
Gordon Brown	ff9b8bda63	Implement hidden aliases (#52547 ) This commit introduces hidden aliases. These are similar to hidden indices, in that they are not visible by default, unless explicitly specified by name or by indicating that hidden indices/aliases are desired. The new alias property, `is_hidden` is implemented similarly to `is_write_index`, except that it must be consistent across all indices with a given alias - that is, all indices with a given alias must specify the alias as either hidden, or all specify it as non-hidden, either explicitly or by omitting the `is_hidden` property.	2020-03-06 16:02:38 -07:00
Nik Everett	7c9641ef9d	Simplify BucketedSort (#53199 ) (#53240 ) Our lovely `BitArray` compactly stores "flags", lazily growing its underlying storage. It is super useful when you need to store one bit of data for a zillion buckets or documents or something. Usefully, it defaults to `false`. But there is a wrinkle! If you ask it whether or not a bit is set but it hasn't grown its underlying storage array "around" that index then it'll throw an `ArrayIndexOutOfBoundsException`. The per-document use cases tend to show up in order and don't tend to mind this too much. But the use case in aggregations, the per-bucket use case, does. Because buckets are collected out of order all the time. This changes `BitArray` so it'll return `false` if the index is too big for the underlying storage. After all, that index can't have been set or else we would have grown the underlying array. Logically, I believe this makes sense. And it makes my life easy. At the cost of three lines. But this adds an extra test to every call to `get`. I think this is likely ok because it is "very close" to an array index lookup that already runs the same test. So I think it'll end up merged with the array bounds check.	2020-03-06 15:27:51 -05:00
Jay Modi	a81460dbf5	Make watch history indices hidden (#52974 ) This commit updates the template used for watch history indices with the hidden index setting so that new indices will be created as hidden. Relates #50251 Backport of #52962	2020-03-06 09:47:03 -07:00
Christoph Büscher	9e561c2921	Fix AbstractBulkByScrollRequest slices parameter via Rest (#53068 ) Currently the AbstractBulkByScrollRequest accepts slice values of 0 via its `setSlices` method, denoting the "auto" slicing behaviour that is usable by setting the "slices=auto" parameter on rest requests. When using the High Level Rest Client, however, we send the 0 value as an integer, which is then rejected as invalid by `AbstractBulkByScrollRequest#parseSlices`. Instead of making parsing of the rest request more lenient, this PR opts for changing the RequestConverter logic in the client to translate 0 values to "auto" on the rest requests. Closes #53044	2020-03-06 15:38:04 +01:00
William Brafford	d145b5536f	Serialize NodesInfoRequest as a set of strings (#53140 ) (#53202 ) For Node Info to be pluggable, NodesInfoRequest must be able to carry arbitrary strings. This commit reworks the internals of that class to use a set rather than hard-coded boolean fields. NodesInfoRequest defaults to specifying all values. We test for this behavior as we refactor and use random testing for the various combinations of metrics. Add backwards compatibility for transport requests.	2020-03-06 09:07:49 -05:00
Marios Trivyzas	7ddbda4c20	Check for query cancellation during rewrite (#53166 ) (#53203 ) With ExitableDirectoryReader in place, check for query cancellation during QueryPhase#preProcess where the query rewriting takes place. Follows: #52822 (cherry picked from commit 0d38626d8e6e9e2620a7a446b617a2ac42852461)	2020-03-06 11:04:01 +01:00
Alan Woodward	c204137451	Deprecate BoolQueryBuilder's mustNot field (#53125 ) The bool query builder in elasticsearch accepts both must_not and mustNot fields. Given that leniency is abhorrent and must be eschewed, we should deprecate the latter as it doesn't fit with the style of parameters elsewhere in the DSL.	2020-03-06 09:11:34 +00:00
Henning Andersen	2e924e4a83	Fix ClusterDisruptionIT.testAckedIndexing (#53169 ) Use assertBusy when doing reroute after bridged disruption, since it can return non-acked if a node is marked faulty by follower check after disruption ended. Closes #53064	2020-03-06 08:56:55 +01:00
Nhat Nguyen	5476a49833	Revert "upgrade to lucene-snapshot-fa75139efea (#53150 ) (#53151 )" This reverts commit `058113aa42`.	2020-03-05 17:33:00 -05:00
Nhat Nguyen	d456e8ffca	Revert "Mute InternalEngineTests.testVersionOnPrimaryWithConcurrentRefresh" This reverts commit `66788afa67`.	2020-03-05 17:32:18 -05:00
Nhat Nguyen	e9e209ae58	Revert "Mute InternalEngineTests.testRandomOperations" This reverts commit `d1cc2e68d5`.	2020-03-05 17:32:11 -05:00
Nhat Nguyen	dc78cc6131	Revert "Mute InternalEngineTests.testForceMergeWithSoftDeletesRetentionAndRecoverySource" This reverts commit `da8aac9e66`.	2020-03-05 17:31:56 -05:00
Nhat Nguyen	f11ae5fd14	Revert "Mute GatewayMetaStatePersistedStateTests.testDataOnlyNodePersistence" This reverts commit `4452addf10`.	2020-03-05 17:31:38 -05:00
James Baiera	4452addf10	Mute GatewayMetaStatePersistedStateTests.testDataOnlyNodePersistence	2020-03-05 16:44:03 -05:00
James Baiera	da8aac9e66	Mute InternalEngineTests.testForceMergeWithSoftDeletesRetentionAndRecoverySource	2020-03-05 15:55:50 -05:00
James Baiera	d1cc2e68d5	Mute InternalEngineTests.testRandomOperations	2020-03-05 15:09:47 -05:00
James Baiera	66788afa67	Mute InternalEngineTests.testVersionOnPrimaryWithConcurrentRefresh	2020-03-05 15:09:47 -05:00
Mayya Sharipova	7e2a9f58ee	script_score query errors on negative scores (#53133 ) 7.5 and 7.6 had a regression that allowed for script_score queries to have negative scores. We have corrected this regression in #52478. This is an addition to #52478 that adds a test and release notes.	2020-03-05 14:23:39 -05:00
Marios Trivyzas	487d442760	Implement Exitable DirectoryReader (#52822 ) (#53162 ) Implement an Exitable DirectoryReader that wraps the original DirectoryReader so that when a search task is cancelled the DirectoryReaders also stop their work fast. This is useful for expensive operations like wildcard/prefix queries where the DirectoryReaders can spend lots of time and consume resources, as previously their work wouldn't stop even though the original search task was cancelled (e.g. because of timeout or dropped client connection). (cherry picked from commit 67acaf61f33bc5f54e26541514d07e375c202e03)	2020-03-05 14:17:31 +01:00
Nik Everett	28df7ae5ed	Support multiple metrics in `top_metrics` agg (backport of #52965 ) (#53163 ) This adds support for returning multiple metrics to the `top_metrics` agg. It looks like: ``` POST /test/_search?filter_path=aggregations { "aggs": { "tm": { "top_metrics": { "metrics": [ {"field": "v"}, {"field": "m"} ], "sort": {"s": "desc"} } } } } ```	2020-03-05 08:12:01 -05:00
Alan Woodward	3cd4b97618	Remove UnknownNamedObjectException (#53105 ) This was originally thrown from NamedXContentRegistry#parseNamedObject() but that method now throws a NamedObjectNotFoundException, so this is unused.	2020-03-05 10:06:59 +00:00
Ignacio Vera	058113aa42	upgrade to lucene-snapshot-fa75139efea (#53150 ) (#53151 )	2020-03-05 10:04:05 +01:00
Nik Everett	302980e0c4	Remove some ceremony in agg parsing (#53078 ) (#53117 ) With #50871 aggregations should now be parsed directly by an `ObjectParser` or `ConstructingObjectParser` without the need for the ceremonial `parse` method. This removes 9 of those `parse` methods and parses the aggregation directly from their `ObjectParser`.	2020-03-04 13:06:41 -05:00
Tim Brooks	f68917160e	Fix RemoteConnectionManager size() method (#52823 ) Currently the remote connection manager will delegate the size() call to the underlying cluster connection manager. This introduces the possibility that call will return 1 before the nodeConnection method has been triggered to add the connection to the remote connection list. This can cause issues, as the ensureConnected method checks the connection managers size and executes synchronously if the size is > 0. This leads to a potential cluster not connected exception while we are still waiting for the connection opened callback to be triggered. This commit fixes this issue by using the remote connection manager's size to report the connection manager's size. Fixes #52029.	2020-03-04 09:53:22 -07:00
Yannick Welsch	8ab74fea58	[7.x] Add 7.6.2 as version (#53114 )	2020-03-04 10:39:09 -06:00
Jake Landis	f08ed1f69a	[7.x] add 6.8.8 as version (#53021 )	2020-03-04 10:38:07 -06:00
Alan Woodward	dfebbbf862	BoolQueryBuilder uses ObjectParser (#52880 ) This commit removes the hand-rolled x-content parsing logic from BoolQueryBuilder and instead uses an ObjectParser to handle parsing. It also removes the long-deprecated (since version 6) disable_coord parameter.	2020-03-04 15:48:38 +00:00
Zachary Tong	3fcf598b92	Reduce deprecation log noise from DateIntervalWrapper (#52655 ) Converts the deprecations to `deprecatedAndMaybeLog` to reduce the number of times we log deprecations, since some of these could be called at a high frequency (due to unconverted queries, aggs, etc)	2020-03-03 17:08:10 -05:00
Jay Modi	c610e0893d	Introduce system index APIs for Kibana (#53035 ) This commit introduces a module for Kibana that exposes REST APIs that will be used by Kibana for access to its system indices. These APIs are wrapped versions of the existing REST endpoints. A new setting is also introduced since the Kibana system indices' names are allowed to be changed by a user in case multiple instances of Kibana use the same instance of Elasticsearch. Additionally, the ThreadContext has been extended to indicate that the use of system indices may be allowed in a request. This will be built upon in the future for the protection of system indices. Backport of #52385	2020-03-03 14:11:36 -07:00
Nik Everett	7339427af5	Remove some deprecation warnings parsing aggs (backport of #53026 ) (#53072 ) With #50871 aggregations should now be parsed directly by an `ObjectParser` or `ConstructingObjectParser` without the need for the ceremonial `parse` method. This removes 10 of those `parse` methods and parses the aggregation directly from their `ObjectParser`.	2020-03-03 15:27:49 -05:00
Luca Cavanna	8a05b670ca	Address MinAndMax generics warnings (#52642 ) `MinAndMax` encapsulates min and max values for a shard. It uses generics to make sure that the values are of the same type and are also comparable. Though there are warnings whenever this class is currently used, which are addressed with this commit. Relates to #49092	2020-03-03 16:08:10 +01:00
Adrien Grand	cb868d2f5e	Introduce a `constant_keyword` field. (#49713 ) (#53024 ) This field is a specialization of the `keyword` field for the case when all documents have the same value. It typically performs more efficiently than keywords at query time by figuring out whether all or none of the documents match at rewrite time, like `term` queries on `_index`. The name is up for discussion. I liked including `keyword` in it, so that we still have room for a `singleton_numeric` in the future. However I'm unsure whether to call it `singleton`, `constant` or something else, any opinions? For this field there is a choice between 1. accepting values in `_source` when they are equal to the value configured in mappings, but rejecting mapping updates 2. rejecting values in `_source` but then allowing updates to the value that is configured in the mapping This commit implements option 1, so that it is possible to reindex from/to an index that has the field mapped as a keyword with no changes to the source. Backport of #49713	2020-03-03 16:01:47 +01:00
Jason Tedor	a154f9c657	Early return if no global checkpoint listeners (#53036 ) When notifying global checkpoint listeners, we have an opportunity to early return if there are not any registered listeners. This is important since it saves some allocations, and also saves forking some empty work to another thread. This commit adds an early return from notifying listeners if there are not any registered.	2020-03-02 23:28:22 -05:00
Stuart Tettemer	210aab0935	Settings: AffixSettings as validator dependencies (#52973 ) (#52982 ) Allow AffixSetting as validator dependencies. If a validator specifies AffixSettings as a dependency, then `validate(T, Map)` will have the concrete setting in a map. Backport of: #52973, 1e0ba70 Fixes: #52933	2020-02-29 09:38:46 -07:00
Nhat Nguyen	e6755afeeb	Upgrade to Lucene 8.5.0-snapshot-c4475920b08 (#52950 ) (#52977 ) To give LUCENE-9228 more CI cycles	2020-02-29 09:29:16 -05:00
Jay Modi	1cd0eee723	Remove TODO in IndexNameExpressionResolver (#52969 ) This commit removes a TODO in the IndexNameExpressionResolver that indicated the API should use a Set instead of a List. However, this TODO was not completely correct since the ordering of arguments matters due to negations when evaluating wildcards and since we also allow a list of patterns like `,-foo,`, which would have a different meaning even when using a Set with insertion ordering. Relates #52788 Backport of #52963	2020-02-28 13:56:28 -07:00
Adrien Grand	331d4bb0af	HybridDirectory should mmap postings. (#52641 ) (#52873 ) Since version 8.4, `MMapDirectory` has an optimization to read long[] arrays directly in little endian order, which postings leverage. So it'd be more efficient to open postings with `MMapDirectory`. I refactored a bit the existing logic to better explain why every listed file extension is open with `mmap`.	2020-02-28 18:45:46 +01:00
Martijn van Groningen	6aa9aaa2c6	Add validation for dynamic templates (#52890 ) Backport of #51233 to the seven dot x branch. Tries to load a `Mapper` instance for the mapping snippet of a dynamic template. This should catch things like using an analyzer that is undefined or mapping attributes that are unused. This is best effort: * If `{{name}}` placeholder is used in the mapping snippet then validation is skipped. * If `match_mapping_type` is not specified then validation is performed for all mapping types. If parsing succeeds with a single mapping type then the dynamic mapping is considered valid. If it is detected that a dynamic template mapping snippet is invalid at mapping update time then the mapping update is failed for indices created on 8.0.0-alpha1 and later. For indices created on prior versions a deprecation warning is emitted instead. In 7.x clusters the mapping update will never fail in case of an invalid dynamic template mapping snippet and a deprecation warning will always be emitted. Closes #17411 Closes #24419 Co-authored-by: Adrien Grand <jpountz@gmail.com>	2020-02-28 10:35:04 +01:00
Nik Everett	407101c39b	Clean and document sorting with partially built buckets (backport of #52769 ) (#52925 ) The `terms` aggregation can be sorted by the results of its sub-aggregations. Because it uses that sorting for filtering to the top-n it tries not to construct all of the buckets for the child aggregations. This has its own interesting problem around reduction, but they aren't super relevant to this change. This change moves that optimization from the `TermsAggregator` and into the aggregators being sorted on. This should make it more clear what is going on and it unifies this optimization with validating the sort. Finally, this should enable some minor optimizations to save a few comparisons when sorting multi-valued buckets. I'll get those in a follow up because they are now fairly obvious. They probably won't be a huge performance improvement, but it'll be nice anyway.	2020-02-27 17:50:55 -05:00
Nik Everett	1d1956ee93	Add size support to `top_metrics` (backport of #52662 ) (#52914 ) This adds support for returning the top "n" metrics instead of just the very top. Relates to #51813	2020-02-27 16:12:52 -05:00
Lee Hinman	e139d70abe	Remove TODO in MaxSizeCondition (#52854 ) Similar to what we did in #52794, this removes the TODO. Relates again to #52505	2020-02-27 09:29:12 -07:00
Dan Hermann	3c8b46a8c1	[7.x] Handle errors when evaluating if conditions in processors (#52892 )	2020-02-27 09:00:51 -06:00
hezhen Zhang	280d59c724	Append index name for the source of the cluster put-mapping task (#52690 ) Add index name(s) into the source for the cluster state update done when putting mapping. This ensures that the pending tasks API includes information on source indices.	2020-02-27 12:16:24 +01:00
David Turner	52fa465300	Cache completion stats between refreshes (#52872 ) Computing the stats for completion fields may involve a significant amount of work since it walks every field of every segment looking for completion fields. Innocuous-looking APIs like `GET _stats` or `GET _cluster/stats` do this for every shard in the cluster. This repeated work is unnecessary since these stats do not change between refreshes; in many indices they remain constant for a long time. This commit introduces a cache for these stats which is invalidated on a refresh, allowing most stats calls to bypass the work needed to compute them on most shards. Closes #51915 Backport of #51991	2020-02-27 10:01:24 +00:00
Nhat Nguyen	814c275f35	Add more assertions to testMaybeFlush (#52792 ) We aren't able to reproduce or figure out the reason that failed this test. This commit adds more assertions so we can narrow the scope. Relates #52223	2020-02-26 17:08:18 -05:00
Nhat Nguyen	0a15a6bfad	Fix testSeqNoCollision (#52588 ) Adjusts the assertion as we trim translog more eagerly since #52556. Relates #52556 Closes #52148	2020-02-26 17:08:18 -05:00
Nhat Nguyen	87e765609e	Fix testResyncAfterPrimaryPromotion (#52615 ) Adjusts the assertion as we might eagerly clean up translog during resync since #52556 Relates #52556 Closes #52598	2020-02-26 17:08:18 -05:00
Nhat Nguyen	5aa612c275	Fix testRestoreLocalHistoryFromTranslog (#52441 ) Asserts that no new operations are made into the translog since we re-opened the engine. Relates #51905 Closes #52410	2020-02-26 17:08:18 -05:00
Nhat Nguyen	a92bf5ec61	Fix IndexShardIT#testMaybeFlush (#52247 ) Since #51905, we use the local checkpoint of the safe commit to calculate the number of uncommitted operations of a translog stats. If a periodic flush triggered by afterWriteOperation completes before we sync translog, then the last commit is not safe. We also need to sync translog from Engine instead of the translog so that we can advance the safe commit. Relates #51905 Closes #52223	2020-02-26 17:08:18 -05:00
Nhat Nguyen	d7fe135d90	Fix testPrepareIndexForPeerRecovery (#52245 ) Since #51905, we skip translog recovery if the local checkpoint of the safe commit equals to the global checkpoint. This change adjusts the test not to create a new snapshot in that case. Closes #52221 Relates #51905	2020-02-26 17:08:18 -05:00
Yannick Welsch	82ab1bc1ff	Separate translog from index deletion conditions (#52556 ) Separates the translog from the index deletion conditions (allowing the translog to be cleaned up more eagerly), and avoids taking the write lock on the translog if no clean-up is actually necessary.	2020-02-26 17:08:18 -05:00
Nhat Nguyen	db6b9c21c7	Use local checkpoint to calculate min translog gen for recovery (#51905 ) Today we use the translog_generation of the safe commit as the minimum required translog generation for recovery. This approach has a limitation, where we won't be able to clean up translog unless we flush. Reopening an already recovered engine will create a new empty translog, and we leave it there until we force flush. This commit removes the translog_generation commit tag and uses the local checkpoint of the safe commit to calculate the minimum required translog generation for recovery instead. Closes #49970	2020-02-26 17:08:18 -05:00
Dan Hermann	3ffd34617f	Switch to AtomicLong for ingestCurrent metric to prevent negative values (#52581 ) (#52834 )	2020-02-26 13:26:26 -06:00
Jay Modi	07ef8ccff4	Allow dynamic updates for index.hidden setting (#52837 ) This commit changes the `index.hidden` setting from being final to a dynamic setting. While the setting being final allows for easier reasoning about an index, making this setting update-able has more benefits in that we can upgrade existing indices to be hidden and it will enable future features that would dynamically make indices hidden. Backport of #52772	2020-02-26 11:46:29 -07:00
Nik Everett	bfaa487757	Switch pipeline agg parsing to ContextParser (#52776 ) (#52832 ) We've pretty well settled on `ContextParser` for a generic interface to `ObjectParser`-like-things. This switches the interface used for building parsing pipeline aggregations to `ContextParser` which saves a couple of little wrappers around `ObjectParser`.	2020-02-26 12:57:20 -05:00
Tim Brooks	be8d704e2b	Remove seeds dependency for remote cluster settings (#52829 ) Currently 3 remote cluster settings (ping interval, skip unavailable, and compression) have a dependency on the seeds setting being configured. With proxy mode, it is now possible that the seeds setting has not been configured. This commit removes this dependency and adds new validation for these settings.	2020-02-26 10:17:25 -07:00
Adrien Grand	1807f86751	Generalize how queries on `_index` are handled at rewrite time (#52815 ) Generalize how queries on `_index` are handled at rewrite time (#52486) Since this change refactors rewrites, I also took it as an opportunity to address #49254: instead of returning the same queries you would get on a keyword field when a field is unmapped, queries get rewritten to a MatchNoDocsQueryBuilder. This change exposed a couple bugs, like the fact that the percolator doesn't rewrite queries at query time, or that the significant_terms aggregation doesn't rewrite its inner filter, which I fixed. Closes #49254	2020-02-26 15:37:43 +01:00
Luca Cavanna	9e38125464	Clarify when shard iterators get sorted (#52810 ) Currently we have two ways to create a GroupShardsIterator: one that will resort the iterators based on their natural ordering, and another one that will leave them in their original order. This is currently done through two constructors, one that accepts a single argument which does the sorting, and another which accepts a second boolean argument to control whether sorting should happen or not. This second constructor is only called externally to disable the sorting. By introducing a specific method to create a sorted shard iterator we clarify and make it easier to track when we do sort and when we do not as the iterators are externally sorted.	2020-02-26 13:58:20 +01:00
Jim Ferenczi	a73ad248e8	Fix backport of #46731 (#52744 ) This change fixes the incomplete backport of #46731 in 7.x (as of 7.5). We now check if `max_children` is set on the top level nested sort and fail with an exception if it's not the case. Relates #46731 Closes #52202	2020-02-26 10:46:51 +01:00
Sachin Frayne	d3c0a2f013	Improve the error message when loading text fielddata. (#52753 ) Emphasize keyword over fielddata as the preferred way to use String fields for aggregations or sorting.	2020-02-25 15:45:44 -08:00
Lee Hinman	662f21fcea	Remove TODO in MaxAgeCondition serialization (#52794 ) * Remove TODO in MaxAgeCondition serialization This removes the TODO with a message for any future readers regarding the code in question. Resolves #52505	2020-02-25 15:47:36 -07:00
Tim Brooks	c8ef9649e2	Force execution of finish shard bulk request (#51957 ) (#52484 ) Currently the shard bulk request can be rejected by the write threadpool after a mapping update. This introduces a scenario where the mapping listener thread will attempt to finish the request and fsync. This thread can potentially be a transport thread. This commit fixes this issue by forcing the finish action to happen on the write threadpool. Fixes #51904.	2020-02-25 14:37:11 -07:00
Nhat Nguyen	848d3bc153	Revert "Fix testKeepTranslogAfterGlobalCheckpoint" This reverts commit `a88d54eb2d`.	2020-02-25 14:12:35 -05:00
Nhat Nguyen	a88d54eb2d	Fix testKeepTranslogAfterGlobalCheckpoint Read the last synced global checkpoint after flushing as we might advance it during committing. CI: https://gradle-enterprise.elastic.co/s/7o6qengg4gva2	2020-02-25 11:49:24 -05:00
Alan Woodward	638f3e4183	Use ByteBuffersDirectory rather than RAMDirectory (#52768 ) Lucene's RAMDirectory has been deprecated. This commit replaces all uses of RAMDirectory in elasticsearch with the newer ByteBuffersDirectory. Most uses are in tests, but the percolator and painless executor may get some small speedups.	2020-02-25 15:46:35 +00:00
Alan Woodward	18663b0a85	Don't index ranges including NOW in percolator (#52748 ) Currently, date ranges queries using NOW-based date math are rewritten to MatchAllDocs queries when being preprocessed for the percolator. However, since we added the verification step, this can result in incorrect matches when percolator queries are run without scores. This commit changes things to instead wrap date queries that use NOW with a new DateRangeIncludingNowQuery. This is a simple wrapper query that returns its delegate at rewrite time, but it can be detected by the percolator QueryAnalyzer and be dealt with accordingly. This also allows us to remove a method on QueryRewriteContext, and push all logic relating to NOW-based ranges into the DateFieldMapper. Fixes #52617	2020-02-25 12:18:16 +00:00
Ryan Ernst	5fba8cbc7b	Rename local Environment var in Node to avoid confusion (#52602 ) When the Node class is being constructed, an initial environment is passed in with the initial settings for the node. Once the plugin service is initialized, the final Environment+Settings are created, at which point the initial environment should no longer be used. This commit renames the constructor arg to avoid naming clashes with the final environment variable.	2020-02-24 11:14:46 -08:00
Lee Hinman	7d9de8412a	[7.x] fix npe in RestPluginsAction (#52620 ) (de56de9a) (#52721 ) Relates #45321 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com> Co-authored-by: Kaihong.Wang <kyra.wkh@alibaba-inc.com> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-02-24 11:57:01 -07:00
Mayya Sharipova	034b1c0ba3	Correct boost calculation in script_score query (#52478 ) (#52724 ) Before boost in script_score query was wrongly applied only to the subquery. This commit makes sure that the boost is applied to the whole score that comes out of script. Closes #48465	2020-02-24 13:48:21 -05:00
Adrien Grand	f993ef80f8	Move the terms index of `_id` off-heap. (#52518 ) In #42838 we moved the terms index of all fields off-heap except the `_id` field because we were worried it might make indexing slower. In general, the indexing rate is only affected if explicit IDs are used, as otherwise Elasticsearch almost never performs lookups in the terms dictionary for the purpose of indexing. So it's quite wasteful to require the terms index of `_id` to be loaded on-heap for users who have append-only workloads. Furthermore I've been conducting benchmarks when indexing with explicit ids on the http_logs dataset that suggest that the slowdown is low enough that it's probably not worth forcing the terms index to be kept on-heap. Here are some numbers for the median indexing rate in docs/s: \| Run \| Master \| Patch \| \| --- \| ------- \| ------- \| \| 1 \| 45851.2 \| 46401.4 \| \| 2 \| 45192.6 \| 44561.0 \| \| 3 \| 45635.2 \| 44137.0 \| \| 4 \| 46435.0 \| 44692.8 \| \| 5 \| 45829.0 \| 44949.0 \| And now heap usage in MB for segments: \| Run \| Master \| Patch \| \| --- \| ------- \| -------- \| \| 1 \| 41.1720 \| 0.352083 \| \| 2 \| 45.1545 \| 0.382534 \| \| 3 \| 41.7746 \| 0.381285 \| \| 4 \| 45.3673 \| 0.412737 \| \| 5 \| 45.4616 \| 0.375063 \| Indexing rate decreased by 1.8% on average, while memory usage decreased by more than 100x. The `http_logs` dataset contains small documents and has a simple indexing chain. More complex indexing chains, e.g. with more fields, ingest pipelines, etc. would see an even lower decrease of indexing rate.	2020-02-24 18:14:12 +01:00
Alan Woodward	7dc41a3b83	Use BoostQuery rather than FunctionScoreQuery for query-time indices_boost (#52272 ) This is a trivial change, but it should result in a slightly more efficient query boost.	2020-02-24 14:41:46 +00:00
Nik Everett	d26d7721ea	Continue realizing sorting by aggregations (backport of #52298 ) (#52667 ) This drops more of the `instanceof`s from `AggregationPath`. There are still a couple in `AggregationPath`. And I ended up moving two into `BucketsAggregator`, but I think this is still an improvement!	2020-02-23 17:13:55 -05:00
bellengao	02cb5b6c0e	Return 429 status code on read_only_allow_delete index block (#50166 ) We consider index level read_only_allow_delete blocks temporary since the DiskThresholdMonitor can automatically release those when an index is no longer allocated on nodes above high threshold. The rest status has therefore been changed to 429 when encountering this index block to signal retryability to clients. Related to #49393	2020-02-22 16:24:25 +01:00
Jay Modi	8abfda0b59	Rename assertThrows to prevent naming clash (#52651 ) This commit renames ElasticsearchAssertions#assertThrows to assertRequestBuilderThrows and assertFutureThrows to avoid a naming clash with JUnit 4.13+ and static imports of these methods. Additionally, these methods have been updated to make use of expectThrows internally to avoid duplicating the logic there. Relates #51787 Backport of #52582	2020-02-21 13:30:11 -07:00
Stuart Tettemer	376932a47d	Scripting: split out compile limits and caching (#52498 ) (#52652 ) Phase 1 of adding compilation limits per context. * Refactor rate limiting and caching into separate class, `ScriptCache`, which will be used per context. * Disable compilation limit for certain tests. Backport of 0866031 Refs: #50152	2020-02-21 12:10:51 -07:00
Jay Modi	f3f6ff97ee	Single instance of the IndexNameExpressionResolver (#52604 ) This commit modifies the codebase so that our production code uses a single instance of the IndexNameExpressionResolver class. This change is being made in preparation for allowing name expression resolution to be augmented by a plugin. In order to remove some instances of IndexNameExpressionResolver, the single instance is added as a parameter of Plugin#createComponents and PersistentTaskPlugin#getPersistentTasksExecutor. Backport of #52596	2020-02-21 07:50:02 -07:00
markharwood	96d603979b	Upgrade Lucene to 8.5.0-snapshot-b01d7cb (#52584 ) Upgrading 7x to same Lucene 8.5 version used in master	2020-02-21 10:25:03 +00:00
Armin Braun	0a09e15959	Add Caching for RepositoryData in BlobStoreRepository (#52341 ) (#52566 ) Cache latest `RepositoryData` on heap when it's absolutely safe to do so (i.e. when the repository is in strictly consistent mode). `RepositoryData` can safely be assumed to not grow to a size that would cause trouble because we often have at least two copies of it loaded at the same time when doing repository operations. Also, concurrent snapshot API status requests currently load it independently of each other and so on, making it safe to cache on heap and assume as "small" IMO. The benefits of this move are: * Much faster repository status API calls * listing all snapshot names becomes instant * Other operations are sped up massively too because they mostly operate in two steps: load repository data then load multiple other blobs to get the additional data * Additional cloud cost savings * Better resiliency, saving another spot where an IO issue could break the snapshot * We can simplify a number of spots in the current code that currently pass around the repository data in tricky ways to avoid loading it multiple times in follow ups.	2020-02-21 10:20:07 +01:00
Armin Braun	4bb780bc37	Refactor Inflexible Snapshot Repository BwC (#52365 ) (#52557 ) * Refactor Inflexible Snapshot Repository BwC (#52365) Transport the version to use for a snapshot instead of whether to use shard generations in the snapshots in progress entry. This allows making upcoming repository metadata changes in a flexible manner in an analogous way to how we handle serialization BwC elsewhere. Also, exposing the version at the repository API level will make it easier to do BwC relevant changes in derived repositories like source only or encrypted.	2020-02-21 09:14:34 +01:00
Ignacio Vera	107f00a4ec	Add support for multipoint geoshape queries (#52133 ) (#52553 ) Currently multi-point queries are not supported when indexing your data using BKD-backed geoshape strategy. This commit removes this limitation.	2020-02-21 07:45:53 +01:00
Yannick Welsch	d76358c875	Deprecate fixed_auto_queue_size thread pool type (#52399 ) Relates #52280	2020-02-20 11:11:06 +01:00
Yannick Welsch	3afb5ca133	Fix synchronization in ByteSizeCachingDirectory (#52512 ) One particular code place was synchronizing on the wrong object.	2020-02-19 16:10:39 +01:00
Przemysław Witek	7cd997df84	[ML] Make ml internal indices hidden (#52423 ) (#52509 )	2020-02-19 14:02:32 +01:00
Ignacio Vera	8d2261fe47	Refactor GeoShapeIndexer by extracting polygon / line decomposers (#52422 ) (#52506 ) Refactor GeoShapeIndexer. We extract Polygon and Line decomposers which are in charge of breaking a shape around the dateline if needed.	2020-02-19 12:04:29 +01:00
Henning Andersen	9d40277d4c	Deciders should not by default collect yes'es (#52438 ) AllocationDeciders would collect Yes decisions when not asking for debug info. Changed to only include Yes decisions when debug is requested (explain).	2020-02-19 11:18:03 +01:00
Henning Andersen	d4bc3b75dc	Reindex: allow comma separated source indices (#52044 ) Added ability to specify comma separated list of source indices without array. Also fixed so that empty string results in validation error rather than index does not exist. Closes #51949	2020-02-19 09:23:15 +01:00
David Turner	baf184c93f	Avoid using WindowsFS in ClusterRerouteIT (#52488 ) Issue #52000 looks like a case of cluster state updates being slower than expected, but it seems that these slowdowns are relatively rare: most invocations of `testDelayWithALargeAmountOfShards` take well under a minute in CI, but there are occasional failures that take 6+ minutes instead. When it fails like this, cluster state persistence seems generally slow: most are slower than expected, with some small updates even taking over 2 seconds to complete. The failures all have in common that they use `WindowsFS` to emulate Windows' behaviour of refusing to delete files that are still open, by tracking all files (really, inodes) and validating that deleted files are really closed first. There is a suggestion that this is a little slow in the Lucene test framework [1]. To see if we can attribute the slowdown to that common factor, this commit suppresses the use of `WindowsFS` for this test suite. [1] `4a513fa99f/lucene/test-framework/src/java/org/apache/lucene/util/TestRuleTemporaryFilesCleanup.java (L166)`	2020-02-19 07:52:49 +00:00
Tim Brooks	8038f9bba6	Do not lock when generating time based uuid (#52436 ) Currently we lock when generating time based uuids. The lock is implemented to prevent concurrent writes to the last timestamp. The uuid generation is an area of contention when indexing. This commit modifies the code to use atomic compare and set operations to update the last timestamp.	2020-02-18 09:55:51 -07:00
Tim Brooks	7fcd997b39	Do not lock on settings keyset if keys initialized (#52435 ) Every time a setting#exist call is made we lock on the keyset to ensure that it has been initialized. This is a heavyweight operation that should only be done once. This commit moves to a volatile read instead to prevent unnecessary locking.	2020-02-18 09:36:07 -07:00
Tim Brooks	a742c58d45	Extract a ConnectionManager interface (#51722 ) Currently we have three different implementations representing a `ConnectionManager`. There is the basic `ConnectionManager` which holds all connections for a cluster. And a remote connection manager which supports proxy behavior. And a stubbable connection manager for tests. The remote and stubbable instances use the delegate pattern, so this commit extracts an interface for them all to implement.	2020-02-18 09:19:24 -07:00
Benedict Jin	0c4f7dc193	Minor code improvements (#51921 ) Fix some whitespaces, comments and usage of `this.`. (cherry picked from commit 9f59900bf6389172811eb2279c17a2dc7cd9dfdf)	2020-02-18 16:00:05 +01:00
David Turner	3d57a78deb	Add extra logging for investigation into #52000 (#52472 ) It looks like #52000 is caused by a slowdown in cluster state application (maybe due to #50907) but I would like to understand the details to ensure that there's nothing else going on here too before simply increasing the timeout. This commit enables some relevant `DEBUG` loggers and also captures stack traces from all threads rather than just the three hottest ones.	2020-02-18 13:02:33 +00:00
Armin Braun	57d6dd7e31	Fix Non-Verbose Snapshot List Missing Empty Snapshots (#52433 ) (#52456 ) We were not including snapshots without indices in the non-verbose listing because we used the snapshot -> indices mapping to get the snapshots.	2020-02-18 11:37:53 +01:00
Armin Braun	cc628748e1	Optimize FilterStreamInput for Network Reads (#52395 ) (#52403 ) When `FilterStreamInput` wraps a Netty `ByteBuf` based stream it did not forward the bulk primitive reads to the delegate. These are optimized on the delegate but if they're not forwarded then the delegate will be called e.g. 4 times to read an `int`. This happens for essentially all network reads prior to this change because they all run from a `NamedWritableAwareStreamInput`. This also required optimising `BufferedChecksumStreamInput` individually to use bulk reads from the buffer because it implicitly assumed that the filter stream input wouldn't override any of the bulk operations.	2020-02-17 13:07:19 +01:00
Nik Everett	146def8caa	Implement top_metrics agg (#51155 ) (#52366 ) The `top_metrics` agg is kind of like `top_hits` but it only works on doc values so it should be faster. At this point it is fairly limited in that it only supports a single, numeric sort and a single, numeric metric. And it only fetches the "very topest" document worth of metric. We plan to support returning a configurable number of top metrics, requesting more than one metric and more than one sort. And, eventually, non-numeric sorts and metrics. The trick is doing those things fairly efficiently. Co-Authored by: Zachary Tong <zach@elastic.co>	2020-02-14 11:19:11 -05:00
Nik Everett	53b6583fed	Decode max and min optimization more carefully (#52336 ) (#52358 ) Fixes the no-query optimization for `min` and `max` aggregations for `date_nanos` fields by delegating decoding dates "through" their `resolution` member. Closes #52220	2020-02-14 07:07:56 -05:00
Julie Tibshirani	0d7165a40b	Standardize naming of fetch subphases. (#52171 ) This commit makes the names of fetch subphases more consistent: * Now the names end in just 'Phase', whereas before some ended in 'FetchSubPhase'. This matches the query subphases like AggregationPhase. * Some names include 'fetch' like FetchScorePhase to avoid ambiguity about what they do.	2020-02-13 13:00:46 -08:00
Nik Everett	2dac36de4d	HLRC support for string_stats (#52163 ) (#52297 ) This adds a builder and parsed results for the `string_stats` aggregation directly to the high level rest client. Without this the HLRC can't access the `string_stats` API without the elastic licensed `analytics` module. While I'm in there this adds a few of our usual unit tests and modernizes the parsing.	2020-02-12 19:25:05 -05:00
Nik Everett	7efce22f19	Fix a DST error in date_histogram (backport #52016 ) (#52237 ) When `date_histogram` attempts to optimize itself for a particular time zone it checks to see if the entire shard is within the same "transition". Most time zones transition once every six months or thereabouts so the optimization can usually kick in. But it crashes when you attempt to feed it a time zone whose last DST transition was before epoch. The reason for this is a little twisted: before this patch it'd find the next and previous transitions in milliseconds since epoch. Then it'd cast them to `Long`s and pass them into the `DateFieldType` to check if the shard's contents were within the range. The trouble is they are then converted to `String`s which are then parsed back to `Instant`s which are then converted to `long`s. And the parser doesn't like most negative numbers. And everything before epoch is negative. This change removes the `long` -> `Long` -> `String` -> `Instant` -> `long` chain in favor of passing the `long` -> `Instant` -> `long` which avoids the fairly complex parsing code and handles a bunch of interesting edge cases around epoch. And other edge cases around `date_nanos`. Closes #50265	2020-02-12 17:57:04 -05:00
Nhat Nguyen	12cb6dcefe	Fix testFlushOnInactive (#52275 ) We need to reduce the translog sync interval for indices with translog async setting so that we can have the safe commit in the assertBusy interval. This is needed since #51905, where we use the local checkpoint of the safe commit to calculate the number of uncommitted operations of a translog stats. Closes #52251 Relates #51905	2020-02-12 17:19:02 -05:00
Jay Modi	5bcc6fce5c	Remove DeprecationLogger from route objects (#52285 ) This commit removes the need for DeprecatedRoute and ReplacedRoute to have an instance of a DeprecationLogger. Instead the RestController now has a DeprecationLogger that will be used for all deprecated and replaced route messages. Relates #51950 Backport of #52278	2020-02-12 15:05:41 -07:00
Marios Trivyzas	dac720d7a1	Add a cluster setting to disallow expensive queries (#51385 ) (#52279 ) Add a new cluster setting `search.allow_expensive_queries` which by default is `true`. If set to `false`, certain queries that usually have slow performance cannot be executed and an error message is returned. - Queries that need to do linear scans to identify matches: - Script queries - Queries that have a high up-front cost: - Fuzzy queries - Regexp queries - Prefix queries (without index_prefixes enabled) - Wildcard queries - Range queries on text and keyword fields - Joining queries - HasParent queries - HasChild queries - ParentId queries - Nested queries - Queries on deprecated 6.x geo shapes (using PrefixTree implementation) - Queries that may have a high per-document cost: - Script score queries - Percolate queries Closes: #29050 (cherry picked from commit a8b39ed842c7770bd9275958c9f747502fd9a3ea)	2020-02-12 22:56:14 +01:00
Ryan Ernst	c07f46409c	Fix single newline in logging output stream buffer (#52253 ) The buffer in LoggingOutputStream skips flushing when only a newline appears. However, if a windows newline appeared, the buffer length was not reset. This commit resets the length so the \r does not appear in the next logging message. closes #51838	2020-02-12 10:48:55 -08:00
Nhat Nguyen	e098e837f7	Fix testShouldPeriodicallyFlushAfterMerge (#52243 ) MockRandomMergePolicy randomly determines if a segment should use a compound format. This can cause a force merge performing two merges: (1) merging to a single segment, (2) rewriting the new segment using the compound format. If the second merge completes after we have flushed, then it can flip the flag shouldPeriodicallyFlushAfterBigMerge to true. Closes #52205	2020-02-12 11:25:39 -05:00
Gordon Brown	d48ce12920	Convert ILM and SLM histories into hidden indices (#51456 ) Modifies SLM's and ILM's history indices to be hidden indices for added protection against accidental querying and deletion, and improves IndexTemplateRegistry to handle upgrading index templates. Also modifies the REST test cleanup to delete hidden indices.	2020-02-11 14:18:55 -07:00
Nik Everett	86d5211c05	Make sorting by an agg results a real abstraction (#52007 ) (#52212 ) This removes a bunch of `instanceof`s in favor of two new methods on `InternalAggregation`. The default implementations of these methods just throw exceptions explaining that you can't sort on this aggregation. They are overridden by all of the classes that used to have `instanceof` checks against them. I doubt this is really any faster in practice. The real benefit here is that it is a little more obvious that you can sort by the results of an aggregation and it should be much more obvious where to look at how aggregations sort themselves. There are still a bunch more `instanceof`s left in `AggregationPath` but those will wait for a followup change.	2020-02-11 12:58:40 -05:00
Hendrik Muhs	098380e483	Percentiles aggregation validation checks for range (#51871 ) Disallows specifying percentiles outside the range [0,100]. This also fixes a problem in transform by failing validation if an invalid percentile configuration is used.	2020-02-11 17:25:39 +01:00
David Roberts	473468d763	[ML] Better error when persistent task assignment disabled (#52014 ) Changes the misleading error message when attempting to open a job while the "cluster.persistent_tasks.allocation.enable" setting is set to "none" to a clearer message that names the setting. Closes #51956	2020-02-11 15:23:21 +00:00
Zachary Tong	87854573e4	Add version constant for 7.6.1	2020-02-11 09:44:43 -05:00
Igor Motov	667e1a5225	Add Boxplot Aggregation (#52174 ) Adds a `boxplot` aggregation that calculates min, max, median and the first and the third quartiles of the given data set. Closes #33112	2020-02-11 09:38:17 -05:00
David Turner	00b9098250	Ignore timeouts with single-node discovery (#52159 ) Today we use `cluster.join.timeout` to prevent nodes from waiting indefinitely if joining a faulty master that is too slow to respond, and `cluster.publish.timeout` to allow a faulty master to detect that it is unable to publish its cluster state updates in a timely fashion. If these timeouts occur then the node restarts the discovery process in an attempt to find a healthier master. In the special case of `discovery.type: single-node` there is no point in looking for another healthier master since the single node in the cluster is all we've got. This commit suppresses these timeouts and instead lets the node wait for joins and publications to succeed no matter how long this might take.	2020-02-11 14:15:01 +00:00
David Kyle	343ced42be	Mute LoggingOutputStreamTests.testMaxBuffer (#52193 ) Relates to https://github.com/elastic/elasticsearch/issues/51838	2020-02-11 11:46:17 +00:00
Gordon Brown	350288ddf8	Check dot-index rules after template application (#52087 ) Previously, the check of the dot-index rules (namely, that indices with dot-prefixed names should be either hidden indices or system indices) was done before template application, and so only checked for the `index.hidden` setting in the request, ignoring if that setting was set via a template. This commit moves that check to a different method, which is applied after templates have been resolved and applied to the index settings.	2020-02-10 17:01:59 -07:00
Ryan Ernst	88cf8ac0a8	Fix windows empty line in logging capture (#52162 ) This commit fixes another edge case in handling windows newlines in our capture of stdout/stderr to log4j. The case is that the \r appears at the beginning of the buffer when flushing, which would unintentionally be emitted as an empty string. This commit skips the flush if only a \r was found. closes #51838	2020-02-10 13:29:50 -08:00
Julie Tibshirani	28a8db730f	In FieldTypeLookup, factor out flat object field logic. (#52091 ) Currently, the logic for looking up `flattened` field types lives in the top-level `FieldTypeLookup`. This PR moves it into a dedicated class `DynamicKeyFieldTypeLookup`.	2020-02-10 10:44:02 -08:00

4709 Commits