OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-07 21:48:39 +00:00

Author	SHA1	Message	Date
Lisa Cawley	1474606b18	[DOCS] Clarify model snapshot retention properties (#56477 )	2020-05-11 07:43:10 -07:00
James Rodewig	ba67ab3b64	[DOCS] Add reference docs for `search.max_buckets` setting (#56449 ) (#56511 ) Adds reference-style setting documentation for the `search.max_buckets` setting. This setting was previously only documented on the [bucket aggregations][0] page. [0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-bucket.html	2020-05-11 09:45:09 -04:00
Jim Ferenczi	02ab9112a9	Fix spurious failures in AsyncSearchIntegTestCase (#56026 ) Async search integration tests are subject to random failures when: * The test index has more than one replica. * The request cache is used. * Some shards are empty. * The maintenance service starts a garbage collection when the node is closing. They are also slow because the test index is created/populated on each test method. This change refactors these integration tests in order to: * Create the index once for the entire test suite. * Fix the usage of the request cache and replicas. * Ensure that all shards have at least one document. * Increase the delay of the maintenance service garbage collection. Closes #55895 Closes #55988	2020-05-11 15:03:03 +02:00
Martijn van Groningen	9ae09570d8	Allow a number of broadcast transport actions to resolve data streams (#55726 ) (#56502 ) Change TransportBroadcastByNodeAction and TransportBroadcastReplicationAction to be able to resolve data streams by default. Implementations can change this ability. This change allows the following APIs to resolve data streams: flush, refresh (already supported data streams), force merge, clear indices cache, indices stats (already supported data streams), segments, upgrade stats, upgrade, validate query, searchable snapshots stats, clear searchable snapshots cache and reload analyzers. Relates to #53100	2020-05-11 12:48:35 +02:00
Rene Groeschke	c29bc87040	Move bwcVersions extension property to BuildParams (back port) (#56381 ) * Move bwcVersions extension property to BuildParams (#56206) * Fix :qa Task Using Broken BwC Versions Resolution (#56332) Co-authored-by: Armin Braun <me@obrown.io>	2020-05-11 09:39:13 +02:00
István Zoltán Szabó	ebe1e4c4c4	[DOCS] Expands GET DFA stats API docs with new phases (#56407 ) Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-05-11 09:26:15 +02:00
Hendrik Muhs	c0985615aa	[DOC] document transform settings and docs_per_second (#56178 ) add documentation for throttling, added in #56007	2020-05-11 09:23:49 +02:00
Hendrik Muhs	ad54c51467	[7.x] add a basic get index rolling upgrade test (#56322 ) (#56411 ) add a very basic rolling upgrade test for get index, post mortem action of #56274	2020-05-11 07:51:38 +02:00
Nik Everett	2823300bdf	Speed up rounding in auto_date_histogram (#56384 ) (#56486 ) This wires `auto_date_histogram` into the rounding optimization that I built in #55559. This should significantly speed up any `auto_date_histogram`s with `time_zone`s on them.	2020-05-09 11:37:30 -04:00
Nik Everett	2f38aeb5e2	Save memory when numeric terms agg is not top (#55873 ) (#56454 ) Right now all implementations of the `terms` agg allocate a new `Aggregator` per bucket. This uses a bunch of memory. Exactly how much isn't clear but each `Aggregator` ends up making its own objects to read doc values which have non-trivial buffers. And it forces all of its sub-aggregations to do the same. We allocate a new `Aggregator` per bucket for two reasons: 1. We didn't have an appropriate data structure to track the sub-ordinals of each parent bucket. 2. You can only make a single call to `runDeferredCollections(long...)` per `Aggregator` which was the only way to delay collection of sub-aggregations. This change switches the method that builds aggregation results from building them one at a time to building all of the results for the entire aggregator at the same time. It also adds a fairly simplistic data structure to track the sub-ordinals for `long`-keyed buckets. It uses both of those to power numeric `terms` aggregations and removes the per-bucket allocation of their `Aggregator`. This fairly substantially reduces memory consumption of numeric `terms` aggregations that are not the "top level", especially when those aggregations contain many sub-aggregations. It also is a pretty big speed up, especially when the aggregation is under a non-selective aggregation like the `date_histogram`. I picked numeric `terms` aggregations because those have the simplest implementation. At least, I could kind of fit it in my head. And I haven't fully understood the "bytes"-based terms aggregations, but I imagine I'll be able to make similar optimizations to them in follow up changes.	2020-05-08 20:38:53 -04:00
Mark Vieira	0fb9bc5379	Always use archive base name as the pom artifact id (#56447 ) (#56467 )	2020-05-08 16:11:19 -07:00
debadair	6ae7327061	[DOCS] Align with ILM changes. (#55953 ) (#56455 ) * [DOCS] Align with ILM changes. * Apply suggestions from code review Co-authored-by: James Rodewig <james.rodewig@elastic.co> Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com> * Incorporated review comments.	2020-05-08 14:22:27 -07:00
Armin Braun	0a254cf223	Serialize Monitoring Bulk Request Compressed (#56410 ) (#56442 ) Even with changes from #48854 we're still seeing significant (as in tens and hundreds of MB) buffer usage for bulk exports in some cases which destabilizes master nodes. Since we need to know the serialized length of the bulk body we can't do the serialization in a streaming manner. (also it's not easily doable with the HTTP client API we're using anyway). => let's at least serialize on heap in compressed form and decompress as we're streaming to the HTTP connection. For small requests this adds negligible overhead but for large requests this reduces the size of the payload field by about an order of magnitude (empirically determined) which is a massive reduction in size when considering O(100MB) bulk requests.	2020-05-08 23:16:07 +02:00
Jake Landis	95e5e9e598	[7.x] [DOCS] Update default value of index.name.time_format (#56453 ) (#56456 ) Corrects the default value of index.name.time_format	2020-05-08 16:09:42 -05:00
Théophile Helleboid - chtitux	23e419a7aa	SLM PUT: add precision on date math support in indices parameter (#55203 ) It was not clear to me that the `indices` parameter supports date math expressions. I think it is worth stating this explicitly in the documentation.	2020-05-08 15:06:09 -06:00
Dimitris Athanasiou	44ffa388ac	[7.x][ML] Use non-zero timeout when force stopping DF analytics (#56423 ) (#56428 ) We have been using a zero timeout in the case that DF analytics is stopped. This may cause a timeout when we cancel, for example, the reindex task. This commit fixes this by using the default timeout instead. Backport of #56423	2020-05-08 21:12:11 +03:00
Nik Everett	bd4b9dd10e	Speed up time interval rounding around dst (backport #56371 ) (#56396 ) When an index spans a daylight savings time transition we can't use our optimization that rewrites the requested time zone to a fixed time zone and instead we used to fall back to a java.time based rounding implementation. In #55559 we optimized "time unit" rounding. This optimizes "time interval" rounding. The java.time based implementation is about 1650% slower than the rounding implementation for a fixed time zone. This replaces it with a similar optimization that is only about 30% slower than the fixed time zone. The java.time implementation allocates a ton of short lived objects but the optimized implementation doesn't. So it might end up being faster than the microbenchmarks imply.	2020-05-08 13:39:27 -04:00
Nicole Albee	5b708f846c	[DOCS] Fix broken link in the ilm-tutorial. (#56310 ) (#56311 ) (#56446 )	2020-05-08 12:22:39 -05:00
Armin Braun	b18d242300	Fix Simulate Template Endpoint Temporary Index Handling (#56406 ) (#56432 ) Use proper facility for creating temporary index service for the simulation that does not add itself to the `IndicesService` unnecessarily (breaking an assertion about the internal consistency of the cluster state and the `IndicesService`). Closes #56298	2020-05-08 18:05:24 +02:00
David Roberts	9a3924a641	[ML] Adjust list of platforms that have ML native code (#56426 ) Native code is now available for linux-aarch64. Note that it is _not_ currently supported!	2020-05-08 16:22:45 +01:00
Martijn van Groningen	83739b5806	Backport: allow cluster health api to resolve data streams (#56425 ) Backport of: #56413 Allow cluster health api to resolve data streams and automatically remove data streams after each test in test cases extending from `ESIntegTestCase` Relates to #53100	2020-05-08 17:16:25 +02:00
Dimitris Athanasiou	c117ae7a6e	[7.x][ML] Force stopping stopped DF analytics should succeed (#56421 ) (#56424 ) Force stopping a DF analytics job whose config exists and that is stopped should succeed. This was broken by #56360. Closes #56414 Backport of #56421	2020-05-08 18:04:24 +03:00
Tanguy Leroux	8e9b69bfd7	Use snapshot information to build searchable snapshot store MetadataSnapshot (#56289 ) (#56403 ) While investigating possible optimizations to speed up searchable snapshots shard restores, we noticed that Elasticsearch builds the list of shard files on local disk in order to compare it with the list of files contained in the snapshot to restore. This list of files is materialized with a MetadataSnapshot object whose construction involves reading the footer checksum of every file of the shard using the Store.checksumFromLuceneFile() method. Further investigation shows that a MetadataSnapshot object is also created for other types of operations like building the list of files to recover in a peer recovery (and primary shard relocation) or in order to assign a shard to a node. These operations use the Store.getMetadata(IndexCommit) method to build the list of files and checksums. In the case of searchable snapshots, building the MetadataSnapshot object can potentially trigger cache misses, which in turn can cause the download and the writing in cache of the last range of the file in order to check the 16-byte footer. This in turn can cause more evictions. Since searchable snapshots already contain the footer information of every file in BlobStoreIndexShardSnapshot, the checksum can be read directly from there, avoiding the cache entirely when creating a MetadataSnapshot for the operations mentioned above. This commit adds a shortcut to the SearchableSnapshotDirectory.openInput() method - similarly to what already exists for segment infos - so that it creates a specific IndexInput for checksum reading operations.	2020-05-08 14:16:19 +02:00
Dimitris Athanasiou	60b1c67409	[7.x][ML] Allow stopping DF analytics whose config is missing (#56360 ) (#56408 ) It is possible that the config document for a data frame analytics job is deleted from the config index. If that is the case the user is unable to stop a running job because we attempt to retrieve the config and that will throw. This commit changes that. When the request is forced, we do not expand the requested ids based on the existing configs but from the list of running tasks instead. Backport of #56360	2020-05-08 13:54:44 +03:00
Hendrik Muhs	e2e4c3179c	Revert "add a basic get index rolling upgrade test (#56322 )" This reverts commit 0b40886db3e469c0371f5ccefef85329070a4544.	2020-05-08 12:41:15 +02:00
Hendrik Muhs	406d58f28c	Revert "dont use mappings for the test, due to different output in 6.9 (type removal)" This reverts commit b2c70a95daf2e3fc2fc0b05cb01cdee3525240d5.	2020-05-08 12:41:11 +02:00
Hendrik Muhs	b2c70a95da	dont use mappings for the test, due to different output in 6.9 (type removal)	2020-05-08 11:53:47 +02:00
Hendrik Muhs	0b40886db3	add a basic get index rolling upgrade test (#56322 ) add a very basic rolling upgrade test for get index, post mortem action of #56274	2020-05-08 10:53:01 +02:00
Hendrik Muhs	cc35d37788	[Transform] unmute transform upgrade tests (#56296 ) the transform upgrade tests broke due to #56238, but got fixed with #56274 fixes #56269 fixes #56250	2020-05-08 10:48:58 +02:00
Armin Braun	085ff8c404	Add More Trace Logging to BlobStoreRepository (#56336 ) (#56401 ) Adding more trace logging that would be helpful in understanding the precise order of blob-level operations if needed.	2020-05-08 08:31:32 +02:00
István Zoltán Szabó	ceb0b0dba3	[DOCS] Updated screenshots in transform ecommerce example. (#56359 )	2020-05-08 07:46:15 +02:00
Ryan Ernst	582145a493	Upgrade forbidden apis to 3.0 (#56368 ) This commit upgrades forbidden apis to the latest version, which also means we now get task configuration avoidance.	2020-05-07 19:05:07 -07:00
Tal Levy	13944b1bf9	Fix max-int limit for number of points reduced in geo_centroid (#56370 ) A bug in InternalGeoCentroid#reduce existed that summed up the aggregation's long-valued counts into a local integer variable. Since it is definitely possible to reduce more than Integer.MAX_VALUE points, this change simply updates that variable to be a long-valued number. Closes #55992.	2020-05-07 14:30:29 -07:00
Tal Levy	6e0178fb68	Move CumulativeSumPipelineAgg to use ConstructingObjectParser parsing (#55990 ) (#56380 ) As part of #52776, this refactors the aggregation to use the context parser to parse its parameters.	2020-05-07 12:34:54 -07:00
Tim Brooks	b84d1e2577	Improve logging around SniffConnectionStrategy (#56378 ) Currently, the logging around the SniffConnectionStrategy is limited. The log messages are inconsistent and sometimes wrong. This commit cleans up these log messages to describe when connections are happening and what failed if a step fails. Additionally, this commit enables TRACE logging for a problematic test (testEnsureWeReconnect).	2020-05-07 13:11:56 -06:00
Tim Brooks	9d076364d7	Fix testCollectNodes test assertion (#56294 ) Currently when a connection closes a new sniff round begins. The testCollectNodes test closes four transports before triggering the method to collect the remote nodes. This leads to a race where there are a number of reasons the collect nodes call might fail. This commit fixes that issue by changing the test assertion to include a potential failure condition. Fixes #55292.	2020-05-07 11:52:43 -06:00
Nik Everett	ca9bec7c1a	Build: fix eclipse after icTests (#56362 ) We made a small mistake when breaking out the `ESIntegTestCase` subclasses that confused eclipse. This makes it happy again. Poor eclipse! Relates #55896	2020-05-07 13:29:43 -04:00
Dimitris Athanasiou	d064eda2b0	[7.x][ML] Ensure phase progress may only increase (#56339 ) (#56357 ) Due to multi-threading it is possible that phase progress updates written from the c++ process arrive reordered. We can address this by ensuring that progress may only increase. Closes #56282 Backport of #56339	2020-05-07 19:46:58 +03:00
David Turner	8f4af292a7	Hide c.a.a.p.i.BasicProfileConfigFileLoader noise (#56346 ) A recent AWS SDK upgrade has introduced a new source of spurious `WARN` logs when the security manager prevents access to the user's home directory and therefore to their shared client configuration. This is actually the behaviour we want, and it's harmless and handled by the SDK as if the profile config doesn't exist, so this log message is unnecessary noise. This commit suppresses this noisy logging by default. Relates #20313 Closes #56333	2020-05-07 17:00:58 +01:00
James Rodewig	ea76b0c22b	[DOCS] Relocate search API's request body parameters (#56304 ) Changes: * Moves the document request body parameters for the search API from the Request body search page to the Search API reference page. * Relocates a search request body example from the Request body search page to the Search API reference page. * Adds a note to any duplicated query and request body parameters.	2020-05-07 11:00:03 -04:00
William Brafford	691044e67b	Add xpack setting deprecations to deprecation API (#56290 ) * Add xpack setting deprecations to deprecation API The deprecated settings showed up in the deprecation log file by default, but I did not add them to the deprecation API. This commit fixes that. Now if you use one of the deprecated basic feature enablement settings, calling _migration/deprecations will inform you of that fact. * Remove incorrectly backported settings documents It seems that I backported these docs to the wrong place in #56061, in #55980, and in #56167. I hope they're in the right place now. Co-authored-by: debadair <debadair@elastic.co>	2020-05-07 10:28:17 -04:00
Nik Everett	b5e385fa56	Fix auto_date_histogram interval (#56252 ) (#56341 ) `auto_date_histogram` was returning the incorrect `interval` because of a combination of two things: 1. When pipeline aggregations rewrote `auto_date_histogram` we reset the interval to 1. Oops. Fixed that. 2. Every bucket aggregation was rewriting its buckets as though there was a pipeline aggregation even if there aren't any. This is a bit silly so we skip that too. Closes #56116	2020-05-07 10:27:40 -04:00
Nhat Nguyen	bd0e0f41a0	Ensure unregister child node if failed to register task (#56254 ) We fail to unregister the child node in registerAndExecute if the parent task is being canceled. This leads to a bug where a cancel request never completes. Closes #55875 Relates #54312	2020-05-07 10:10:13 -04:00
James Rodewig	8e005db3e6	[DOCS] EQL: Document math functions (#55810 ) (#56337 ) Documents the following EQL functions: * `add` * `divide` * `modulo` * `multiply` * `subtract`	2020-05-07 09:18:43 -04:00
Nik Everett	e35919d3b8	Optimize date_histograms across daylight savings time (backport of #55559 ) (#56334 ) Rounding dates on a shard that contains a daylight savings time transition is currently something like 1400% slower than when a shard contains dates only on one side of the DST transition. And it makes a ton of short lived garbage. This replaces that implementation with one that benchmarks to having around 30% overhead instead of the 1400%. And it doesn't generate any garbage per search hit. Some background: There are two ways to round in ES: * Round to the nearest time unit (Day/Hour/Week/Month/etc) * Round to the nearest time interval (3 days/2 weeks/etc) I'm only optimizing the first one in this change and plan to do the second in a follow up. It turns out that rounding to the nearest unit really is two problems: when the unit rounds to midnight (day/week/month/year) and when it doesn't (hour/minute/second). Rounding to midnight is consistently about 25% faster than rounding to individual hours or minutes. This optimization relies on being able to usually figure out what the minimum and maximum dates are on the shard. This is similar to an existing optimization where we rewrite time zones that aren't fixed (think America/New_York and its daylight savings time transitions) into fixed time zones so long as there isn't a daylight savings time transition on the shard (UTC-5 or UTC-4 for America/New_York). Once I implement time interval rounding the time zone rewriting optimization should no longer be needed. This optimization doesn't come into play for `composite` or `auto_date_histogram` aggs because neither has been migrated to the new `DATE` `ValuesSourceType` which is where that range lookup happens. When they are they will be able to pick up the optimization without much work. I expect this to be substantial for `auto_date_histogram` but less so for `composite` because it deals with fewer values. Note: My 30% overhead figure comes from small numbers of daylight savings time transitions. That overhead grows logarithmically with the number of transitions. When there are two thousand years worth of transitions my algorithm ends up being 250% slower than rounding without a time zone, but java time is 47000% slower at that point, allocating memory as fast as it possibly can.	2020-05-07 09:10:51 -04:00
Armin Braun	3bad5b3c01	Fix Noisy Logging during Snapshot Delete (#56264 ) (#56329 ) We were logging the cleanup of the snap- and meta- blobs for every snapshot delete which is needlessly noisy and confusing to users. We should only log actual stale/unexpected blobs here.	2020-05-07 13:48:53 +02:00
Armin Braun	60b6d4eddc	Increase Timeout in S3 Cooldown Test (#56267 ) (#56323 ) Moving from `5s` to `10s` here because of #56095. This adds `10s` to the overall runtime of the test which should be a reasonable tradeoff for stability. Closes #56095	2020-05-07 11:23:07 +02:00
Tanguy Leroux	6233e32ab3	Fix SearchableSnapshotDirectoryTests.testIndexSearcher() (#56275 ) Closes #56233	2020-05-07 11:12:35 +02:00
Tanguy Leroux	65a061e33a	Fix SearchableSnapshotDirectoryTests.testClearCache (#56277 ) This test sometimes fails when prewarming is enabled because it's possible that some files are cached in background while the test tries to clear the cache. This commit disables prewarming for this test.	2020-05-07 10:59:33 +02:00
Ryan Ernst	33d6a55d1d	Create plugin for internalClusterTest task (#56067 ) This commit creates a new gradle plugin to provide a separate task name and source set for running ESIntegTestCase tests. The only project converted to use the new plugin in this PR is server, as an example. The remaining cases in x-pack will be handled in followups. backport of #55896	2020-05-06 17:20:52 -07:00
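
Illustration for commit 0a254cf223 ("Serialize Monitoring Bulk Request Compressed") in the list above: the message describes keeping the serialized bulk body on heap in compressed form and decompressing it only while streaming to the HTTP connection, because the uncompressed length has to be known up front. The following is a minimal Python sketch of that idea, not the Elasticsearch implementation (which is Java). The function names `serialize_compressed` and `stream_decompressed` and the chunk size are hypothetical and chosen only for this example.

```python
import zlib
from typing import Iterator, Tuple


def serialize_compressed(payload: bytes) -> Tuple[bytes, int]:
    """Hold the body in compressed form, remembering its uncompressed length.

    Hypothetical helper: the length is kept because, as the commit message
    notes, the sender needs to know the serialized size of the body.
    """
    return zlib.compress(payload), len(payload)


def stream_decompressed(compressed: bytes, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield the original bytes piece by piece instead of inflating all at once."""
    inflater = zlib.decompressobj()
    pending = compressed
    while pending:
        # max_length caps the output of this call; input that was not
        # consumed yet is exposed through unconsumed_tail.
        piece = inflater.decompress(pending, chunk_size)
        if piece:
            yield piece
        pending = inflater.unconsumed_tail
    tail = inflater.flush()
    if tail:
        yield tail


if __name__ == "__main__":
    # Assumed sample data: a highly repetitive bulk-style body.
    body = b'{"index":{}}\n{"field":"value"}\n' * 50_000
    compressed, length = serialize_compressed(body)
    print("uncompressed bytes:", length)
    print("compressed bytes held in memory:", len(compressed))
    sent = sum(len(piece) for piece in stream_decompressed(compressed))
    print("bytes streamed:", sent)
```

The trade-off mirrors the one stated in the commit message: a small amount of CPU is spent compressing and decompressing, in exchange for a much smaller buffer being held for large requests. The ratio achieved on the repetitive sample data here says nothing about real monitoring payloads.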


51566 Commits