OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-05 20:48:22 +00:00

Author	SHA1	Message	Date
Armin Braun	ffcbf9ca0c	Fix Snapshots Capturing Incomplete Datastreams (#58630 ) (#58656 ) Only snapshot datastreams that are recorded in `SnapshotInfo` and clean those that aren't from the snapshotted metadata. Do not restore all datastreams by default when restoring global metadata, use the same mechanics used for indices here. Closes #58544	2020-06-29 13:32:21 +02:00
István Zoltán Szabó	13aa8b8d9a	[DOCS] Updates results_field description in the inference processor docs (#58554 )	2020-06-29 13:15:15 +02:00
Armin Braun	95d85f29f8	Fix Snapshots Capturing Incomplete Datastreams (#58630 ) (#58656 ) Only snapshot datastreams that are recorded in `SnapshotInfo` and clean those that aren't from the snapshotted metadata. Do not restore all datastreams by default when restoring global metadata, use the same mechanics used for indices here. Closes #58544	2020-06-29 12:51:40 +02:00
Armin Braun	4f2f257b12	Fix DataStream Handling on Restore of Global Metadata (#58631 ) (#58649 ) When restoring a global metadata snapshot we were overwriting the correctly adjusted data streams in the metadata when looping over all custom values. Closes #58496	2020-06-29 10:58:41 +02:00
Przemysław Witek	3f7c45472e	[7.x] Introduce DataFrameAnalyticsConfig update API (#58302 ) (#58648 )	2020-06-29 10:56:11 +02:00
David Turner	8f82ec0b19	Revert "Suppress searchable snapshots docs in releases (#58556 )" This reverts commit f0c0ee691a0b0da458b99f7b33b7e6a099141556.	2020-06-29 09:21:58 +01:00
David Turner	f0c0ee691a	Suppress searchable snapshots docs in releases (#58556 ) This commit adds conditional logic to the docs to avoid including any docs on searchable snapshots in released versions.	2020-06-29 08:34:11 +01:00
Yang Wang	61fa7f4d22	Change privilege of enrich stats API to monitor (#52027 ) (#52196 ) The remote_monitoring_user user needs to access the enrich stats API. But the request is denied because the API is categorized under admin. The correct privilege should be monitor.	2020-06-29 10:25:33 +10:00
Dimitris Athanasiou	1817b896c9	[7.x][ML] Add status and increased estimate to memory usage (#58588 ) (#58606 ) Adds parsing of `status` and `memory_reestimate_bytes` to data frame analytics `memory_usage`. When the training surpasses the model memory limit, the status will be set to `hard_limit` and `memory_reestimate_bytes` can be used to update the job's limit in order to restart the job. Backport of #58588	2020-06-28 16:27:26 +03:00
Costin Leau	3c81b91474	EQL: Add Head/Tail pipe support (#58536 ) Introduce pipe support, in particular head and tail (which can also be chained). (cherry picked from commit 4521ca3367147d4d6531cf0ab975d8d705f400ea) (cherry picked from commit d6731d659d012c96b19879d13cfc9e1eaf4745a4)	2020-06-27 09:49:14 +03:00
Ryan Ernst	08e75abd4e	Always add Java-9 style file permissions (#46050 ) (#58628 ) Java 9 removed pathname canonicalization, which means that we need to add permissions for the path and also the real path when adding file permissions. Since master requires a minimum runtime of JDK 11, we no longer need conditional logic here to apply this pathname canonicalization with our bare hands. This commit removes that conditional pathname canonicalization. Co-authored-by: Jason Tedor <jason@tedor.me>	2020-06-26 18:19:07 -07:00
James Rodewig	69d8285a28	[DOCS] Add data streams to multi search API docs (#58610 ) (#58622 ) Makes the existing multi search API docs aware of data streams.	2020-06-26 17:32:56 -04:00
Benjamin Trent	7a202b149e	Muting analytics tests (#58617 ) (#58618 )	2020-06-26 16:50:59 -04:00
Nik Everett	67e9d39932	Remove useless aggregation helper (#58571 ) (#58578 ) `descendsFromBucketAggregator` was important before we removed `asMultiBucketAggregator` but now that it is gone `collectsFromSingleBucket` is good enough. Relates to #56487 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-06-26 15:58:44 -04:00
Tanguy Leroux	775fb5d4cf	Allows SparseFileTracker to progressively execute listeners during Gap processing (#58477 ) (#58584 ) Today SparseFileTracker allows to wait for a range to become available before executing a given listener. In the case of searchable snapshot, we'd like to be able to wait for a large range to be filled (ie, downloaded and written to disk) while being able to execute the listener as soon as a smaller range is available. This pull request is an extract from #58164 which introduces a ProgressListenableActionFuture that is used internally by SparseFileTracker. The progressive listenable future allows to register listeners attached to SparseFileTracker.Gap so that they are executed once the Gap is completed (with success or failure) or as soon as the Gap progress reaches a given progress value. This progress value is defined when the tracker.waitForRange() method is called; this method has been modified to accept a range and another listener's range to operate on. Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-06-26 18:26:20 +02:00
James Rodewig	c06c89d3db	[DOCS] Remove `composable index template` refs (#58567 ) (#58612 ) Replaces `composable index template` and `composable template` with `index template` throughout data stream-related docs. `Composable index template` is only used to contrast with legacy index templates.	2020-06-26 11:52:58 -04:00
James Rodewig	b37b318d0d	[DOCS] EQL: Remove references to partial async EQL results (#58548 ) (#58609 ) Removes references to partial results from the async EQL search docs. If an EQL search does not complete during the `wait_for_completion_timeout` timeout period, it returns no results.	2020-06-26 11:11:55 -04:00
Nik Everett	bfe1dc8a56	Mute the PluginCli test case for symlinks (#58607 ) It fails on windows. Tracked in #58605.	2020-06-26 11:01:40 -04:00
James Rodewig	28717d1e02	[DOCS] Fix analyzer page titles (#58362 ) (#58603 ) Changes the titles for analyzer pages to sentence case. Also changes the 'Pattern character filter' page title to sentence case.	2020-06-26 10:17:01 -04:00
Armin Braun	090211f768	Fix Incorrect Snapshot Shard Status for DONE Shards in Running Snapshots (#58390 ) (#58593 ) Minor bugs/inconsistencies: If a shard hasn't changed at all we were reporting `0` for total size and total file count while it was ongoing. If a data node restarts/drops out during snapshot creation the fallback logic did not load the correct statistic from the repository but just created a status with `0` counts from the snapshot state in the CS. Added a fallback to reading from the repository in this case.	2020-06-26 16:11:30 +02:00
James Rodewig	c613e0915a	[DOCS] EQL: Document search API's `tiebreaker_field` param (#57935 ) (#58540 )	2020-06-26 09:25:24 -04:00
James Rodewig	ab29162ab3	[DOCS] Fix tokenizer page titles (#58361 ) (#58598 ) Changes the titles for tokenizer pages to sentence case. Also moves the 'Path hierarchy tokenizer examples' page within the 'Path hierarchy tokenizer' page and adds a related redirect.	2020-06-26 09:24:41 -04:00
Howard	eaa60b7c54	[Docs] Fix return tuple element order (#58463 )	2020-06-26 12:24:54 +02:00
Przemyslaw Gomulka	5149554709	Update format.asciidoc to describe strict_date_optional_time_nanos (#57527 ) (#58581 ) closes #57019	2020-06-26 09:02:08 +02:00
Ryan Ernst	a524800b3e	Migrate plugin packaging tests to java (#58518 ) This commit converts the bats tests for the plugin cli into the java packaging test framework. The new tests only use the example plugin to test the plugin cli. The tests for each individual plugin's contents after being installed are handled by a new unit test for the plugin installer added in #58287.	2020-06-25 14:16:33 -07:00
Nik Everett	d22a242613	Docs: Mark variable_width_histogram experimental (#58574 ) We're tracking this aggregation's experimental-progress in #58573. We'd like a little time to be able to make backwards incompatible changes to the aggregation because we're not 100% sure about the request and response format yet.	2020-06-25 16:54:57 -04:00
James Baiera	89243857ce	Update precommit to filter out project dependencies (#58189 ) (#58572 ) If a project is pulling in an external org.elasticsearch dependency, the dependency report generation would require a license file for the dependency to be present. This would break precommit because a license was present that it did not feel was warranted. This un-reverts the update to the dependenciesInfo task, as well as the JNA license addition.	2020-06-25 16:33:25 -04:00
Lee Hinman	f732003370	[7.x] Fix negative limiting with fewer PARTIAL snapshots than minimum required (#58563 ) (#58569 ) In SLM retention, when a minimum number of snapshots is required for retention, we prefer to remove the oldest snapshots first. To perform this, we limit one of the streams, in a rare case this can cause: ``` [mynode] error during snapshot retention task java.lang.IllegalArgumentException: -5 at java.util.stream.ReferencePipeline.limit(ReferencePipeline.java:469) ~[?:?] at org.elasticsearch.xpack.core.slm.SnapshotRetentionConfiguration.lambda$getSnapshotDeletionPredicate$6(SnapshotRetentionConfiguration.java:195) ~[?:?] at org.elasticsearch.xpack.slm.SnapshotRetentionTask.snapshotEligibleForDeletion(SnapshotRetentionTask.java:245) ~[?:?] at org.elasticsearch.xpack.slm.SnapshotRetentionTask$1.lambda$onResponse$0(SnapshotRetentionTask.java:163) ~[?:?] at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:176) ~[?:?] at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1624) ~[?:?] at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484) ~[?:?] at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474) ~[?:?] at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913) ~[?:?] ``` When certain criteria are met. This commit fixes the negative limiting with `Math.max(0, ...)` and adds a unit test for the behavior. Resolves #58515	2020-06-25 14:16:34 -06:00
Nik Everett	5f52bc4c9f	Fix two scripted_metric bugs (backport of #58547 ) (#58565 ) Fixes two bugs introduced by #57627: 1. We were not properly letting go of memory from the request breaker when the aggregation finished. 2. We no longer supported totally arbitrary stuff produced by the init script because we assumed that it'd be ok to run the script once and clone its results. Sadly, cloning can't clone anything that the init script can make, like `String` arrays. This runs the init script once for every new bucket so we don't need to clone.	2020-06-25 16:16:10 -04:00
Armin Braun	468e559ff7	Fix Memory Leak From Master Failover During Snapshot (#58511 ) (#58560 ) If we failed over while the data nodes were doing their work we would never resolve the listener and leak it. This change fails all listeners if master fails over.	2020-06-25 20:43:08 +02:00
Henning Andersen	38be2812b1	Enhance extensible plugin (#58542 ) Rather than let ExtensiblePlugins know extending plugins' classloaders, we now pass along an explicit ExtensionLoader that loads the extensions asked for. Extensions constructed that way can optionally receive their own Plugin instance in the constructor.	2020-06-25 20:37:56 +02:00
Jason Tedor	52ad5842a9	Introduce node.roles setting (#58512 ) Today we have individual settings for configuring node roles such as node.data and node.master. Additionally, roles are pluggable and we have used this to introduce roles such as node.ml and node.voting_only. As the number of roles is growing, managing these becomes harder for the user. For example, to create a master-only node, today a user has to configure: - node.data: false - node.ingest: false - node.remote_cluster_client: false - node.ml: false at a minimum if they are relying on defaults, but also add: - node.master: true - node.transform: false - node.voting_only: false If they want to be explicit. This is also challenging in cases where a user wants to configure a coordinating-only node which requires disabling all roles, a list which we are adding to, requiring the user to keep checking whether a node has acquired any of these roles. This commit addresses this by adding a list setting node.roles for which a user has explicit control over the list of roles that a node has. If the setting is configured, the node has exactly the roles in the list, and not any additional roles. This means to configure a master-only node, the setting is merely 'node.roles: [master]', and to configure a coordinating-only node, the setting is merely: 'node.roles: []'. With this change we deprecate the existing 'node.*' settings such as 'node.data'.	2020-06-25 14:14:51 -04:00
Igor Motov	20af856abd	[7.x] EQL: Adds an ability to execute an asynchronous EQL search (#58192 ) Adds async support to EQL searches Closes #49638 Co-authored-by: James Rodewig <james.rodewig@elastic.co>	2020-06-25 14:11:57 -04:00
Benjamin Trent	c7ba79bc19	[7.x] [ML] make waiting for renormalization optional for internally flushing job (#58537 ) (#58553 ) * [ML] make waiting for renormalization optional for internally flushing job (#58537) When flushing, datafeeds only need the guarantee that the latest bucket has been handled. But, in addition to this, the typical call to flush waits for renormalization to complete. For large jobs, this can take a fair bit of time (even longer than a bucket length). This causes unnecessary delays in handling data. This commit adds a new internal only flag that allows datafeeds (and forecasting) to skip waiting on renormalization. closes #58395	2020-06-25 12:26:52 -04:00
Jim Ferenczi	6451187e84	Filter empty fields in SearchHit#toXContent (#58418 ) This commit restores the filtering of empty fields during the xcontent serialization of SearchHit. The filtering was removed unintentionally in #41656.	2020-06-25 17:49:03 +02:00
Nik Everett	03e6d1b535	Add Variable Width Histogram Aggregation (backport of #42035 ) (#58440 ) Implements a new histogram aggregation called `variable_width_histogram` which dynamically determines bucket intervals based on document groupings. These groups are determined by running a one-pass clustering algorithm on each shard and then reducing each shard's clusters using an agglomerative clustering algorithm. This PR addresses #9572. The shard-level clustering is done in one pass to minimize memory overhead. The algorithm was lightly inspired by [this paper](https://ieeexplore.ieee.org/abstract/document/1198387). It fetches a small number of documents to sample the data and determine initial clusters. Subsequent documents are then placed into one of these clusters, or a new one if they are an outlier. This algorithm is described in more detail in the aggregation's docs. At reduce time, a [hierarchical agglomerative clustering](https://en.wikipedia.org/wiki/Hierarchical_clustering) algorithm inspired by [this paper](https://arxiv.org/abs/1802.00304) continually merges the closest buckets from all shards (based on their centroids) until the target number of buckets is reached. The final values produced by this aggregation are approximate. Each bucket's min value is used as its key in the histogram. Furthermore, buckets are merged based on their centroids and not their bounds. So it is possible that adjacent buckets will overlap after reduction. Because each bucket's key is its min, this overlap is not shown in the final histogram. However, when such overlap occurs, we set the key of the bucket with the larger centroid to the midpoint between its minimum and the smaller bucket’s maximum: `min[large] = (min[large] + max[small]) / 2`. This heuristic is expected to increase the accuracy of the clustering. Nodes are unable to share centroids during the shard-level clustering phase. In the future, resolving https://github.com/elastic/elasticsearch/issues/50863 would let us solve this issue. It doesn’t make sense for this aggregation to support the `min_doc_count` parameter, since clusters are determined dynamically. The `order` parameter is not supported here to keep this large PR from becoming too complex. Co-authored-by: James Dorfman <jamesdorfman@users.noreply.github.com>	2020-06-25 11:40:47 -04:00
Nik Everett	c7726cc93e	Fix janky test Fixes a test that incorrectly assumed that a list of random values less than or equal to `n` always contained `n`. Oops. Closes #58353	2020-06-25 11:13:29 -04:00
Nik Everett	71adade73a	Return clear error message if aggregation type is invalid (#58255 ) (#58365 ) The main changes are: 1. Catch the `NamedObjectNotFoundException` when parsing aggregation type, and then throw a `ParsingException` with clear error message with hint. 2. Add a unit test method: AggregatorFactoriesTests#testInvalidType(). Closes #58146. Co-authored-by: bellengao <gbl_long@163.com>	2020-06-25 11:08:25 -04:00
Dimitris Athanasiou	c3dfafe0b4	[7.x][ML] Avoid assertion error on empty string feature values for inference (#58541 ) (#58550 ) It is possible for the source document to have an empty string value for a field that is mapped as numeric. We should treat those as missing values and avoid throwing an assertion error. Backport of #58541	2020-06-25 18:07:29 +03:00
Dimitris Athanasiou	5af7071db0	[7.x][ML] Change inference default field name to <dep_var>_prediction… (#58546 ) This changes the default value for the results field of inference applied on models that are trained via a data frame analytics job. Previously, the results field default was `predicted_value`. This commit makes it the same as in the training job itself. The new default field is `<dependent_variable>_prediction`. Apart from making inference consistent with the training job the model came from, it is helpful to preserve the dependent variable name by default as it provides some context to the user that may avoid confusion as to which model results came from. Backport of #58538	2020-06-25 18:03:43 +03:00
David Roberts	1742b1c39e	Cancel persistent task recheck when no longer master (#58539 ) If a persistent task cannot be assigned on the first attempt then the master node will schedule periodic rechecks to see if the assignment requirements have been met. These periodic rechecks should be cancelled if the node ceases to be master. Previously they weren't, leading to exceptions being logged repeatedly. This PR cancels the rechecks on learning that the node is no longer the master. Fixes #58531	2020-06-25 15:51:57 +01:00
William Brafford	958b21d727	Enable TTY password OS tests, plus refactoring (#57759 ) (#58200 ) * Enable TTY password OS tests, plus refactoring (#57759) Two keystore tests were unintentionally ignored when the password-protected keystore work was merged. I've reënabled those tests here. I've also refactored the test methods a little bit to reduce the API surface: instead of having a "startElasticsearchTtyPassword" method and a "startElasticsearchStandardInputPassword" method, I've made a single "startElasticsearch" method with a "useTty" boolean argument. * Separate daemonization and non-daemonization case for tty Centos 6 uses a version of expect that kills the elasticsearch process when it tries to daemonize. I will fix this in future work but for now I'm replacing it with a todo.	2020-06-25 10:49:17 -04:00
Nik Everett	335505c4e1	Drop deprecated aggregator wrapper (backport of #58367 ) (#58448 ) This drops the deprecated and now unused `asMultiBucketAggregator`. It was too easy to use it to make inefficient `Aggregators`. Relates to #56487	2020-06-25 09:31:19 -04:00
Rory Hunter	ebe1d9cdbe	Update rest-api-spec keyword list Follow-up to 35aecf4c9aa. Somehow I missed the fact that there's an ILM API named `retry`, which is a keyword in Ruby. I've removed it from the keywords list.	2020-06-25 09:55:13 +01:00
Rory Hunter	e413de4203	Validate that REST API names do not contain keywords (#58452 ) If an API name (or components of a name) overlaps with a reserved word in the programming language for an ES client, then it's possible that the code that is generated from the API will not compile. This PR adds validation to check for such overlaps.	2020-06-25 09:48:54 +01:00
James Rodewig	c3f4034199	[DOCS] Note that DS timestamp field mapping changes require reindex (#58444 ) (#58517 ) With #58096, data streams now track the timestamp field mapping outside of the template associated with the stream. This means you can no longer update the timestamp field mapping using template changes. This updates the associated data stream docs.	2020-06-24 17:21:26 -04:00
Julie Tibshirani	1f2e05c947	Simplify mapping validation for resizing indices. (#58514 ) When creating a target index from a source index, we don't allow for target mappings to be specified. This PR simplifies the check that the target mappings are empty. This refactor will help when implementing composable template merging, since we no longer need to resolve + check the target mappings when creating an index from a template.	2020-06-24 14:07:19 -07:00
Benjamin Trent	add8ff1ad3	[ML] assume data streams are enabled in data stream tests (#58502 ) (#58508 )	2020-06-24 14:14:48 -04:00
markharwood	837f2643eb	Docs - Added field capabilities breaking change (#58509 )	2020-06-24 18:39:01 +01:00
Chris Roberson	d5899d1765	[Monitoring] APM mapping update (#46244 ) (#58498 ) * Add acm mapping to APM for beats * Add root mapping for APM * Add sourcemap mapping to APM * Fix missing properties * Fix a second missing properties * Add request property to acm * Remove root and sourcemap per review Co-authored-by: Mike Place <mike.place@elastic.co> Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-06-24 13:26:30 -04:00
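
The SLM retention fix in commit f732003370 clamps a possibly negative stream limit with `Math.max(0, ...)`. The following is a minimal sketch of that pattern, not the actual `SnapshotRetentionConfiguration` code: the class `RetentionLimitSketch`, the helper `eligibleForDeletion`, and the use of plain snapshot names are assumptions made for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

final class RetentionLimitSketch {

    // Hypothetical helper: given snapshot names ordered oldest first, return the
    // ones that may be deleted while still keeping at least `minimumCount`.
    static List<String> eligibleForDeletion(List<String> oldestFirst, int minimumCount) {
        // Without the clamp, fewer snapshots than the minimum would produce a
        // negative value, and Stream.limit throws IllegalArgumentException on it.
        long deletable = Math.max(0, oldestFirst.size() - minimumCount);
        return oldestFirst.stream().limit(deletable).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Three snapshots, keep at least two: only the oldest is eligible.
        System.out.println(eligibleForDeletion(List.of("snap-1", "snap-2", "snap-3"), 2));
        // One snapshot, keep at least five: the limit is clamped to zero.
        System.out.println(eligibleForDeletion(List.of("snap-1"), 5));
    }
}
```

Clamping at zero keeps the "fewer snapshots than the minimum" case on the normal code path, so it yields an empty result instead of an exception.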

52314 Commits
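
The overlap heuristic described in the `variable_width_histogram` commit (03e6d1b535), `min[large] = (min[large] + max[small]) / 2`, can be illustrated with a small sketch. The `Bucket` type and the `adjustedKey` method below are hypothetical names introduced for this example and are not taken from the Elasticsearch source; the sketch only assumes that each bucket carries a minimum, a maximum, and a centroid.

```java
final class BucketOverlapSketch {

    // Hypothetical bucket holding the bounds and centroid of one cluster.
    static final class Bucket {
        final double min;
        final double max;
        final double centroid;

        Bucket(double min, double max, double centroid) {
            this.min = min;
            this.max = max;
            this.centroid = centroid;
        }
    }

    // Returns the histogram key for the bucket with the larger centroid.
    // When the two adjacent buckets overlap, the key moves to the midpoint
    // between the larger bucket's min and the smaller bucket's max.
    static double adjustedKey(Bucket small, Bucket large) {
        if (small.max > large.min) {
            return (large.min + small.max) / 2;
        }
        return large.min; // no overlap: the key stays at the bucket's min
    }

    public static void main(String[] args) {
        Bucket small = new Bucket(0.0, 6.0, 3.0);
        Bucket large = new Bucket(4.0, 10.0, 7.0);
        System.out.println(adjustedKey(small, large)); // prints 5.0
    }
}
```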