OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-03-03 17:39:15 +00:00

Author	SHA1	Message	Date
Gordon Brown	f207e5bde9	Add generated benchmark files to gitignore (#51000 ) When building IntelliJ generates several source files related to the benchmarks. This commit adds that path to the gitignore so these don't get accidentally committed.	2020-01-14 16:18:28 -07:00
Nik Everett	fc5fde7950	Add "did you mean" to ObjectParser (#50938 ) (#50985 ) Check it out: ``` $ curl -u elastic:password -HContent-Type:application/json -XPOST localhost:9200/test/_update/foo?pretty -d'{ "dac": {} }' { "error" : { "root_cause" : [ { "type" : "x_content_parse_exception", "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?" } ], "type" : "x_content_parse_exception", "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?" }, "status" : 400 } ``` The tricky thing about implementing this is that x-content doesn't depend on Lucene. So this works by creating an extension point for the error message using SPI. Elasticsearch's server module provides the "spell checking" implementation.	2020-01-14 17:53:41 -05:00
Ryan Ernst	4bdab0e985	Fix windows chown to work with single file (#51004 ) The chown utility for packaging tests works on windows when the given path is a directory, but would fail if the path was a single file. This commit fixes it to handle both cases. relates #50825	2020-01-14 14:40:04 -08:00
Yannick Welsch	4b0581f182	Remove custom metadata tool (#50813 ) Adds a command-line tool to remove broken custom metadata from the cluster state. Relates to #48701	2020-01-14 23:08:33 +01:00
James Rodewig	a290762df1	[DOCS] Document `breakers`, `script`, and `discovery` node stats (#50509 ) Documents the `breakers`, `script`, and `discovery` parameters returned by the `_nodes/stats` API.	2020-01-14 16:51:50 -05:00
Nik Everett	a8aca6b2a0	Switch AggregationSpec to ContextParser (#50871 ) (#50980 ) We seem to have settled on the `ContextParser` interface for parsing stuff, mostly because `ObjectParser` implements it. We don't really need the old `Aggregator.Parser` interface any more because it duplicates `ContextParser` but with the arguments reversed. This adds support to `AggregationSpec` to declare aggregation parsers using `ContextParser`. This should integrate cleanly with `ObjectParser`. It doesn't drop support for `Aggregator.Parser` or change the plugin interface at all so it should be safe to backport to 7.x. And we can remove `Aggregator.Parser` in a follow up which is only targeted to 8.0.	2020-01-14 16:50:52 -05:00
Nik Everett	9a3d4db840	Begin moving date_histogram to offset rounding (backport of #50873 ) (#50978 ) We added a new rounding in #50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset.	2020-01-14 16:50:27 -05:00
Jason Tedor	d5623c8f09	Report progress of multiple plugin installs (#51001 ) When installing multiple plugins at once, this commit changes the behavior to report installed plugins as we go. In the case of failure, we emit a message that we are rolling back any plugins that were installed successfully, and also that they were successfully rolled back. In the case a plugin is not successfully rolled back, we report this clearly too, alerting the user that there might still be state on disk they would have to clean up.	2020-01-14 16:24:41 -05:00
Christoph Büscher	2f13751bad	Deprecate and remove camel-case nGram and edgeNGram tokenizers (#50862 ) (#50991 ) We deprecated and removed the camel-case versions of the nGram and edgeNGram filters a while ago and we should do the same with the nGram and edgeNGram tokenizers. This PR deprecates the use of these names in favour of ngram and edge_ngram in 7. Usage will be disallowed on new indices starting with 8 then.	2020-01-14 21:42:34 +01:00
lcawl	6848dee84b	[DOCS] Fixes typo in keystore command	2020-01-14 11:57:02 -08:00
Tal Levy	9ee2e11181	[7.x] Adds support for geo-bounds filtering in geogrid aggregations (#50996 ) * Adds support for geo-bounds filtering in geogrid aggregations (#50002) It is fairly common to filter the geo point candidates in geohash_grid and geotile_grid aggregations according to some viewable bounding box. This change introduces the option of specifying this filter directly in the tiling aggregation. This is even more relevant to `geo_shape`, where the bounds will restrict the shape to be within the bounds. This optional `bounds` parameter is parsed in an equivalent fashion to the bounds specified in the geo_bounding_box query.	2020-01-14 11:18:46 -08:00
Mark Vieira	2de2a3634e	Add matrix job params as build scan tag Signed-off-by: Mark Vieira <portugee@gmail.com>	2020-01-14 09:54:10 -08:00
Benjamin Trent	72c270946f	[ML][Inference] Adding classification_weights to ensemble models (#50874 ) (#50994 ) * [ML][Inference] Adding classification_weights to ensemble models classification_weights are a way to allow models to prefer specific classification results over others. This might be advantageous if classification value probabilities are a known quantity and can improve model error rates.	2020-01-14 12:40:25 -05:00
Tom Veasey	de5713fa4b	[ML] Disable invalid assertion (#50988 ) Backport #50986.	2020-01-14 17:35:00 +00:00
Jason Tedor	ca9ca68cbe	Allow installing multiple plugins as a transaction (#50924 ) This commit allows the plugin installer to install multiple plugins in a single invocation. The installation will be treated as a transaction, so that all of the plugins are installed successfully, or none of the plugins are installed.	2020-01-14 12:20:54 -05:00
Armin Braun	16c07472e5	Track Snapshot Version in RepositoryData (#50930 ) (#50989 ) * Track Snapshot Version in RepositoryData (#50930) Add tracking of snapshot versions to RepositoryData to make BwC logic more efficient. Follow up to #50853	2020-01-14 18:15:07 +01:00
David Kyle	7f309a18f1	[7.x][ML] Explicitly require a OriginSettingClient in ML results iterators (#50981 ) In classes where the client is used directly rather than through a call to executeAsyncWithOrigin explicitly require the client to be OriginSettingClient rather than using the Client interface. Also remove calls to deprecated ClientHelper.clientWithOrigin() method.	2020-01-14 17:14:39 +00:00
Lisa Cawley	a5a8b60d78	[DOCS] Fix realm chains example (#50568 )	2020-01-14 09:01:45 -08:00
Tim Brooks	6e7478b846	Allow proxy mode server name to be configured (#50951 ) Currently, proxy mode allows a remote cluster connection to be setup by expecting all open connections to be routed through an intermediate proxy. The proxy must use some logic to ensure that the connections end up on the correct remote cluster. One mechanism provided is that the default distribution TLS implementations will forward the host component of the configured address to the remote connection using the SNI extension. This is limiting as it requires that the proxy be configured in a way that always uses a valid hostname as the proxy address. Instead, this commit adds an additional setting to allow the server_name to be configured independently. This allows the proxy address to be specified as an IP literal, but the server_name specified as an arbitrary string which still must be a valid hostname. It also decouples the server_name from the requirement of being a DNS resolvable domain.	2020-01-14 10:57:44 -06:00
Armin Braun	1fe2d76a91	Fix S3 3rd Party Tests (#50983 ) Only load fixtures plugin in snapshot-tool tests if we're actually going to use a fixture because otherwise configuration fails. Closes #50971	2020-01-14 17:46:47 +01:00
Armin Braun	6e8ea7aaa2	Work around JVM Bug in LongGCDisruptionTests (#50731 ) (#50974 ) There is a JVM bug causing `Thread#suspend` calls to randomly take multiple seconds breaking these tests that call the method numerous times in a loop. Increasing the timeout will not work since we may call `suspend` tens if not hundreds of times and even a small number of them experiencing the blocking will lead to multiple minutes of waiting. This PR detects the specific issue by timing the `Thread#suspend` calls and skips the remainder of the test if it timed out because of the JVM bug. Closes #50047	2020-01-14 17:13:21 +01:00
Tim Brooks	d8510be3d9	Revert "Send cluster name and discovery node in handshake (#48916 )" (#50944 ) This reverts commit 0645ee88e2a3fff562a055fba2eaf928653c0db3.	2020-01-14 09:53:13 -06:00
Alan Woodward	4974f56b25	Fix analysis BWC tests - warnings now emitted on index creation	2020-01-14 14:48:40 +00:00
Dimitris Athanasiou	1d8cb3c741	[7.x][ML] Add num_top_feature_importance_values param to regression and classi… (#50914 ) (#50976 ) Adds a new parameter to regression and classification that enables computation of importance for the top most important features. The computation of the importance is based on SHAP (SHapley Additive exPlanations) method. Backport of #50914	2020-01-14 16:46:09 +02:00
Hendrik Muhs	0178c7c5d0	[7.x][Transform] correctly retrieve checkpoints from remote indices (#50903 ) (#50969 ) uses remote client(s) to correctly retrieve index checkpoints from remote clusters	2020-01-14 15:09:14 +01:00
Yannick Welsch	f1c5031766	Fix queuing in AsyncLucenePersistedState (#50958 ) The logic in AsyncLucenePersistedState was flawed, unexpectedly queuing up two update tasks in parallel.	2020-01-14 15:04:28 +01:00
Yannick Welsch	91d7b446a0	Warn on slow metadata performance (#50956 ) Has the new cluster state storage layer emit warnings in case metadata performance is very slow. Relates #48701	2020-01-14 15:04:28 +01:00
Alan Woodward	8c16725a0d	Check for deprecations when analyzers are built (#50908 ) Generally speaking, deprecated analysis components in elasticsearch will issue deprecation warnings when they are first used. However, this means that no warnings are emitted when indexes are created with deprecated components, and users have to actually index a document to see warnings. This makes it much harder to see these warnings and act on them at appropriate times. This is worse in the case where components throw exceptions on upgrade. In this case, users will not be aware of a problem until a document is indexed, instead of at index creation time. This commit adds a new check that pushes an empty string through all user-defined analyzers and normalizers when an IndexAnalyzers object is built for each index; deprecation warnings and exceptions are now emitted when indexes are created or opened. Fixes #42349	2020-01-14 13:52:02 +00:00
Przemysław Witek	9c6ffdc2be	[7.x] Handle nested and aliased fields correctly when copying mapping. (#50918 ) (#50968 )	2020-01-14 14:43:39 +01:00
James Rodewig	f028ab08d1	[DOCS] Use `s` parameter in cat API overview example (#50616 ) Updates a snippet to use the `s` query string parameter rather than piping the output to a separate `sort` command. This ensures the snippet is tested and available in clients other than curl (Kibana console, etc.). Issue was originally raised by @hackaholic in #40926.	2020-01-14 08:22:07 -05:00
David Kyle	69a3626ee1	Mute SnapshotLifecycleRestIT testFullPolicySnapshot Relates to #50358	2020-01-14 13:46:37 +01:00
Florian Kelbert	277798606b	[DOCS] Correctly read total hits inside watcher config Relates to #50611 and #50612	2020-01-14 12:58:52 +01:00
Daniel Mitterdorfer	263083b882	Mute HttpCertificateCommandTests.testTextFileSubstitutions (#50965 ) (#50966 ) Relates #50964	2020-01-14 12:40:34 +01:00
Tim Vernum	2bb7b53e41	Add certutil http command (#50952 ) This adds a new "http" sub-command to the certutil CLI tool. The http command generates certificates/CSRs for use on the http interface of an elasticsearch node/cluster. It is designed to be a guided tool that provides explanations and suggestions for each of the configuration options. The generated zip file output includes extensive "readme" documentation and sample configuration files for core Elastic products. Backport of: #49827	2020-01-14 21:24:21 +11:00
Yannick Welsch	22ba759e1f	Move metadata storage to Lucene (#50928 ) * Move metadata storage to Lucene (#50907) Today we split the on-disk cluster metadata across many files: one file for the metadata of each index, plus one file for the global metadata and another for the manifest. Most metadata updates only touch a few of these files, but some must write them all. If a node holds a large number of indices then it's possible its disks are not fast enough to process a complete metadata update before timing out. In severe cases affecting master-eligible nodes this can prevent an election from succeeding. This commit uses Lucene as a metadata storage for the cluster state, and is a squashed version of the following PRs that were targeting a feature branch: * Introduce Lucene-based metadata persistence (#48733) This commit introduces `LucenePersistedState` which master-eligible nodes can use to persist the cluster metadata in a Lucene index rather than in many separate files. Relates #48701 * Remove per-index metadata without assigned shards (#49234) Today on master-eligible nodes we maintain per-index metadata files for every index. However, we also keep this metadata in the `LucenePersistedState`, and only use the per-index metadata files for importing dangling indices. However there is no point in importing a dangling index without any shard data, so we do not need to maintain these extra files any more. This commit removes per-index metadata files from nodes which do not hold any shards of those indices. Relates #48701 * Use Lucene exclusively for metadata storage (#50144) This moves metadata persistence to Lucene for all node types. It also reenables BWC and adds an interoperability layer for upgrades from prior versions. This commit disables a number of tests related to dangling indices and command-line tools. Those will be addressed in follow-ups. 
Relates #48701 * Add command-line tool support for Lucene-based metadata storage (#50179) Adds command-line tool support (unsafe-bootstrap, detach-cluster, repurpose, & shard commands) for the Lucene-based metadata storage. Relates #48701 * Use single directory for metadata (#50639) Earlier PRs for #48701 introduced a separate directory for the cluster state. This is not needed though, and introduces an additional unnecessary cognitive burden to the users. Co-Authored-By: David Turner <david.turner@elastic.co> * Add async dangling indices support (#50642) Adds support for writing out dangling indices in an asynchronous way. Also provides an option to avoid writing out dangling indices at all. Relates #48701 * Fold node metadata into new node storage (#50741) Moves node metadata to use the new storage mechanism (see #48701) as the authoritative source. * Write CS asynchronously on data-only nodes (#50782) Writes cluster states out asynchronously on data-only nodes. The main reason for writing out the cluster state at all is so that the data-only nodes can snap into a cluster, that they can do a bit of bootstrap validation and so that the shard recovery tools work. Cluster states that are written asynchronously have their voting configuration adapted to a non existing configuration so that these nodes cannot mistakenly become master even if their node role is changed back and forth. Relates #48701 * Remove persistent cluster settings tool (#50694) Adds the elasticsearch-node remove-settings tool to remove persistent settings from the on disk cluster state in cases where it contains incompatible settings that prevent the cluster from forming. Relates #48701 * Make cluster state writer resilient to disk issues (#50805) Adds handling to make the cluster state writer resilient to disk issues. 
Relates to #48701 * Omit writing global metadata if no change (#50901) Uses the same optimization for the new cluster state storage layer as the old one, writing global metadata only when changed. Avoids writing out the global metadata if none of the persistent fields changed. Speeds up server:integTest by ~10%. Relates #48701 * DanglingIndicesIT should ensure node removed first (#50896) These tests occasionally failed because the deletion was submitted before the restarting node was removed from the cluster, causing the deletion not to be fully acked. This commit fixes this by checking the restarting node has been removed from the cluster. Co-authored-by: David Turner <david.turner@elastic.co> * fix tests Co-authored-by: David Turner <david.turner@elastic.co>	2020-01-14 09:35:43 +01:00
Tim Vernum	b02b073a57	Increase Size and lower TTL on DLS BitSet Cache (#50953 ) The Document Level Security BitSet Cache (see #43669) had a default configuration of "small size, long lifetime". However, this is not a very useful default as the cache is most valuable for BitSets that take a long time to construct, which is (generally speaking) the same ones that operate over a large number of documents and contain many bytes. This commit changes the cache to be "large size, short lifetime" so that it can hold bitsets representing billions of documents, but releases memory quickly. The new defaults are 10% of heap, and 2 hours. This also adds some logging when a single BitSet exceeds the size of the cache and when the cache is full. Backport of: #50535	2020-01-14 18:04:02 +11:00
Tim Vernum	33c29fb5a3	Support Client and RoleMapping in custom Realms (#50950 ) Previously custom realms were limited in what services and components they had easy access to. It was possible to work around this because a security extension is packaged within a Plugin, so there were ways to store this components in static/SetOnce variables and access them from the realm, but those techniques were fragile, undocumented and difficult to discover. This change includes key services as an argument to most of the methods on SecurityExtension so that custom realm / role provider authors can have easy access to them. Backport of: #50534	2020-01-14 15:26:41 +11:00
Tim Brooks	50cb770315	Use default profile for remote connections (#50947 ) Currently, the connection manager is configured with a default profile for both the sniff and proxy connection strategies. This profile correctly reflects the expected number of connections (6 for sniff, 18 for proxy). This commit removes the proxy strategy usages of the per connection attempt profile configuration. Additionally, it refactors other unnecessary code around the connection manager. The connection manager now can always be built inside the remote connection.	2020-01-13 21:46:23 -06:00
Tim Vernum	90ba77951a	Fix memory leak in DLS bitset cache (#50946 ) The Document Level Security BitSet cache stores a secondary "lookup map" so that it can determine which cache entries to invalidate when a Lucene index is closed (merged, etc). There was a memory leak because this secondary map was not cleared when entries were naturally evicted from the cache (due to size/ttl limits). This has been solved by adding a cache removal listener and processing those removal events asynchronously. Backport of: #50635	2020-01-14 13:19:05 +11:00
Tim Brooks	27c2eb744e	Fix open/close race in ConnectionManagerTests (#50621 ) Currently we reuse the same test connection for all connection attempts in the testConcurrentConnectsAndDisconnects test. This means that if the connection fails due to a pre-existing connection, the connection will be closed impacting the state of all connection attempts. This commit fixes the test, by returning a unique connection for each attempt. Fixes #49903.	2020-01-13 18:43:18 -07:00
Tim Vernum	1577a0e617	Validate field permissions when creating a role (#50917 ) When creating a role, we do not check if the exceptions for the field permissions are a subset of granted fields. If such a role is assigned to a user then that user's authentication fails for this reason. We added a check to validate role query in #46275 and on the same lines, this commit adds check if the exceptions for the field permissions is a subset of granted fields when parsing the index privileges from the role descriptor. Backport of: #50212 Co-authored-by: Yogesh Gaikwad <bizybot@users.noreply.github.com>	2020-01-14 12:37:45 +11:00
Tim Vernum	c2acb8830a	Add max_resource_units to enterprise license (#50910 ) The enterprise license type must have "max_resource_units" and may not have "max_nodes". This change adds support for this new field, validation that the field is present if-and-only-if the license is enterprise and bumps the license version number to reflect the new field. Includes a BWC layer to return "max_nodes: ${max_resource_units}" in the GET license API. Backport of: #50735	2020-01-14 12:37:05 +11:00
Nhat Nguyen	f0924e6d5b	Remove outdated requirement of CCR (#50859 ) With retention leases, users do not need to set index.soft_deletes.retention.operations. This change removes it from the requirements of CCR	2020-01-13 20:00:23 -05:00
Nhat Nguyen	fb32a55dd5	Deprecate synced flush (#50835 ) A normal flush has the same effect as a synced flush on Elasticsearch 7.6 or later. It's deprecated in 7.6 and will be removed in 8.0. Relates #50776	2020-01-13 19:54:38 -05:00
Przemko Robakowski	a18736b46d	[7.x] ILM action to wait for SLM policy execution (#50454 ) (#50943 ) * ILM action to wait for SLM policy execution (#50454) This change adds a new ILM action to wait for SLM policy execution to ensure that an index has a snapshot before deletion. Closes #45067 * Fix flaky TimeSeriesLifecycleActionsIT#testWaitForSnapshot test This change adds some randomness and a cleanup step to TimeSeriesLifecycleActionsIT#testWaitForSnapshot and testWaitForSnapshotSlmExecutedBefore tests in an attempt to make them stable. Relates to #50781 * Formatting changes * Longer timeout * Fix Map.of in Java8 * Unused import removed	2020-01-14 01:34:33 +01:00
Peter Dyson	4cb525d8d3	[DOCS] Array of index patterns is also valid source indices with transform (#50777 )	2020-01-13 15:46:45 -08:00
Ryan Ernst	86fb06a108	Migrate certgen packaging test from bats (#50880 ) This commit moves the packaging tests for elasticsearch-certgen to java from bats. Although certgen is deprecated, the tests are moved rather than just deleted, and the tests themselves should be easily adaptable to certutil. One note is that the test is simplified to use a single node, rather than the two node test from bats, which was problematic given how the newer distro tests only operate with a single distribution. relates #46005	2020-01-13 13:56:30 -08:00
Lee Hinman	91689e793d	[7.x] Refresh cached phase policy definition if possible on ne… (#50941 ) * Refresh cached phase policy definition if possible on new policy There are some cases when updating a policy does not change the structure in a significant way. In these cases, we can reread the policy definition for any indices using the updated policy. This commit adds this refreshing to the `TransportPutLifecycleAction` to allow this. It allows us to do things like change the configuration values for a particular step, even when on that step (for example, changing the rollover criteria while on the `check-rollover-ready` step). There are more cases where the phase definition can be reread than just the ones checked here (for example, removing an action that has already been passed), and those will be added in subsequent work. Relates to #48431	2020-01-13 14:31:41 -07:00
Lisa Cawley	a82ddfb182	[DOCS] Adds elasticsearch-keystore command reference (#50872 )	2020-01-13 13:08:21 -08:00
Bogdan Pintea	f04b4cbee8	SQL: Optimisation fixes for conjunction merges (#50703 ) (#50933 ) * SQL: Optimisation fixes for conjunction merges This commit fixes the following issues around the way comparisons are merged with ranges in conjunctions: * the decision to include the equality of the lower limit is corrected; * the selection of the upper limit is corrected to use the upper bound of the range; * the list of terms in the conjunction is sorted to have the ranges at the bottom; this allows subsequent binary comparisons to find compatible ranges and potentially be merged away. The end guarantee being that the optimisation takes place irrespective of the order of the conjunction terms in the statement. Some comments are also corrected. * address review observation on anon. comparator Replace anonymous comparator of split AND Expressions with a lambda. (cherry picked from commit 9828cb143a41f1bda1219541f3a8fdc03bf6dd14)	2020-01-13 21:51:29 +01:00
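
The two plugin-installer commits listed above (ca9ca68cbe and d5623c8f09) describe an all-or-nothing install that reports progress as it goes and rolls back on failure. A minimal sketch of that control flow follows, in Python for brevity. It is not the Elasticsearch installer: `install_all`, the `install`/`remove`/`report` callbacks, and the message wording are all hypothetical names chosen for illustration.

```python
from typing import Callable, List


def install_all(
    plugins: List[str],
    install: Callable[[str], None],   # hypothetical: installs one plugin, raises on failure
    remove: Callable[[str], None],    # hypothetical: undoes one successful install
    report: Callable[[str], None] = print,
) -> bool:
    """Install every plugin, or roll back the ones already installed.

    Returns True if all plugins were installed, False if the batch was rolled back.
    """
    installed: List[str] = []
    for name in plugins:
        try:
            install(name)
        except Exception as exc:
            report(f"failed to install {name}: {exc}")
            # Undo earlier installs in reverse order, reporting each outcome.
            for done in reversed(installed):
                try:
                    remove(done)
                    report(f"rolled back {done}")
                except Exception as rollback_exc:
                    # A failed rollback is reported clearly, since state may remain on disk.
                    report(f"could not roll back {done}: {rollback_exc}")
            return False
        installed.append(name)
        report(f"installed {name}")
    return True
```

Passing the install and remove steps in as callbacks keeps the sketch independent of how a real installer unpacks or deletes files; only the ordering (report as you go, undo in reverse, flag failed rollbacks) is taken from the commit messages.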


49482 Commits
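
The "did you mean" commit in the listing (fc5fde7950) appends a suggestion to the unknown-field error when a known field name is close to the one supplied. A rough sketch of the idea follows, in Python. The commit states only that the server module supplies the "spell checking" implementation; the edit-distance ranking, the `max_distance` threshold, and the names `edit_distance` and `unknown_field_message` are assumptions made for this example.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


def unknown_field_message(parser_name: str, field: str,
                          candidates: Iterable[str],
                          max_distance: int = 2) -> str:
    """Build an error message, suggesting the closest known field if any.

    max_distance is an assumed cutoff, not a value taken from the commit.
    """
    scored = sorted((edit_distance(field, c), c) for c in candidates)
    close = [c for d, c in scored if d <= max_distance]
    message = f"[{parser_name}] unknown field [{field}]"
    if close:
        message += f" did you mean [{close[0]}]?"
    return message


# Hypothetical candidate list for the request shown in the commit message.
print(unknown_field_message("UpdateRequest", "dac", ["doc", "script", "upsert"]))
```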