OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-03-01 08:29:09 +00:00

Author	SHA1	Message	Date
James Rodewig	9f1f468cef	[DOCS] Document dynamic discovery settings (#61420 ) (#62002 )	2020-09-04 11:36:34 -04:00
James Rodewig	7e2903d888	[DOCS] Document dynamic index mgmt and buffer settings (#61753 ) (#61996 )	2020-09-04 10:40:55 -04:00
Dimitris Athanasiou	d37f197efd	[7.x][ML] Allow training_percent to be any positive double up to hundred (#61977 ) (#61990 ) This changes the valid range of `training_percent` for regression and classification from [1, 100] to (0, 100]. Backport of #61977	2020-09-04 17:34:14 +03:00
James Rodewig	3396184ff3	[DOCS] Use correct get document API (#61804 ) (#61992 ) The documentation refers to a deprecated get document API call (it uses document `type`). Co-authored-by: Thiago Souza <thiago@elastic.co>	2020-09-04 10:04:33 -04:00
Yannick Welsch	6d08b55d4e	Simplify searchable snapshot shard allocation (#61911 ) Simplifies allocation for snapshot-backed shards by always making the recovery source "from snapshot" for those snapshot-backed shards (instead of "recover from local or from empty store"). Also lets the balancer pick the node to which the snapshot-backed shard is allocated (which takes the number of shards on each node into account, unlike the current implementation, which just picks whatever node we are allowed to allocate to, with no notion of "balancing" at all).	2020-09-04 15:45:00 +02:00
James Rodewig	7863df88e3	[DOCS] Fix typo in URL-based access control docs (#61896 ) (#61986 ) Co-authored-by: George Tseres <george.tseres@gmail.com>	2020-09-04 09:24:48 -04:00
Alan Woodward	66bb1eea98	Improve error messages on bad [format] and [null_value] params for date mapper (#61932 ) Currently, if an incorrectly formatted date is passed as a null_value for a date field mapper configuration, you get a vague error: Failed to parse mapping [_doc]: cannot parse empty date Similarly, if you pass an incorrect format, you get the error: Failed to parse mapping [_doc]: Invalid format [...] This commit improves both these errors by including the mapper name and parameter that are misconfigured. Fixes #61712	2020-09-04 14:13:28 +01:00
Tanguy Leroux	289b1f4ae7	Reduce locking in prewarming (#61837 ) (#61967 ) During prewarming of a Lucene file, a CacheFile is acquired and then locked for the duration of the prewarming, i.e. locked until all parts of the file have been downloaded and written to cache on disk. The locking (executed with CacheFile#fileLock()) is here to prevent the cache file from being evicted while it is prewarming. But holding the lock may take a while for large files, especially since restoring snapshot files now respects the indices.recovery.max_bytes_per_sec setting of 40mb (#58658), and this can have bad consequences like preventing the CacheFile from being evicted, opened or closed. In manual tests this bug slows down various requests like mounting a new searchable snapshot index or deleting an existing one that is still prewarming. This commit reduces the time the lock is held during prewarming so that the read lock is only required when actively writing to the CacheFile.	2020-09-04 15:06:50 +02:00
Mikołaj Przybysz	3e6e81c993	[DOCS] Add line break to get ILM lifecycle API docs (#61892 )	2020-09-04 09:00:42 -04:00
Théophile Helleboid - chtitux	9416a55687	[DOCS] Add jump link for 7.9.1 release notes (#61960 )	2020-09-04 08:56:52 -04:00
Martijn van Groningen	84af9abd76	Fix skip versions fix xpack data stream yaml tests. (#61981 ) Backport of #61926 to 7.x branch. Relates to #61904	2020-09-04 14:53:38 +02:00
Benjamin Trent	cec102a391	[7.x] [ML] adds new n_gram_encoding custom processor (#61578 ) (#61935 ) * [ML] adds new n_gram_encoding custom processor (#61578) This adds a new `n_gram_encoding` feature processor for analytics and inference. The focus of this processor is simple ngram encodings that allow: - multiple ngrams [1..5] - Prefix, infix, suffix	2020-09-04 08:36:50 -04:00
Ioannis Kakavas	7b021bf3fb	Run zulu8 fips CI with BCJSSE instead of SunJSSE (#61857 ) As we figured out in https://github.com/elastic/elasticsearch/issues/61316#issuecomment-685482708 Azul brings back a lot of changes from JDK 11 to their Zulu8 build and this means that we can't run this with SunJSSE in FIPS 140 mode. This change ensures that we configure Zulu8 JDK JVMs in FIPS 140 mode using the Bouncy Castle JSSE FIPS provider instead of the SunJSSE one (as we do for the rest of the Java 8 JVMs). Resolves: #61316	2020-09-04 14:53:43 +03:00
Ignacio Vera	31c026f25c	upgrade to Lucene-8.7.0-snapshot-61ea26a (#61957 ) (#61974 )	2020-09-04 13:46:20 +02:00
Dimitris Athanasiou	bdccab7c7a	[7.x][ML] Add incremental id during data frame analytics reindexing (#61943 ) (#61971 ) Previously, we added a copy of the `_id` during reindexing and sorted the destination index on that. This allowed us to traverse the docs in the destination index in a stable order multiple times and with efficiency. However, the destination index being sorted means we cannot have `nested` typed fields. This is a problem as it does not allow us to provide a good experience with our evaluate API when it comes to computing metrics for specific classes, features, etc. This commit changes the approach so that the resulting destination index allows nested fields. Instead of adding a copy of the `_id` field, we now add an incremental id that we can use to traverse the docs in a stable order. We also ensure we always assign the same incremental id to the same doc from the source indices by sorting on `_seq_no` during reindexing. That, in combination with the reindexing API using scroll, gives us a stable order as scroll uses the (`_index`, `_doc`, shard_id) tuple to resolve ties. The extractor now does not need to scroll. Instead we sort on the incremental id and we do ranged searches to avoid the sort-all-docs overhead. Finally, the `TestDocsIterator` is simply changed to search_after the incremental id. With these changes data frame analytics jobs do not use scroll at any part. Having all these in place, the commit adds the `nested` types to the necessary fields of `classification` and `regression` analyses results. Backport of #61943	2020-09-04 13:24:42 +03:00
Tanguy Leroux	10d14ce101	Enable searchable snapshot feature for all test clusters (#61888 ) (#61965 ) This commit reenables the searchable snapshot feature for integration tests after #61802 which changed some build plugins.	2020-09-04 11:20:24 +02:00
Ioannis Kakavas	6d250e0f44	Add runtimeJavaDetails property in BuildParams (#61901 ) (#61961 ) Relates to #61857	2020-09-04 11:47:44 +03:00
Tim Vernum	cdfb163c7c	Add explicit test for DLS with OIDC metadata (#61955 ) When a user authenticates via OpenID Connect we copy information from the OIDC claims into the user's metadata in a particular format. This commit adds a test that metadata in that format can be used in a mustache template for Document Level Security. Backport of: #60030	2020-09-04 16:21:20 +10:00
Tim Vernum	57efda2865	Add DEBUG logging for undefined role mapping field (#61887 ) A role mapping with the following content: "rules": { "field": { "userid" : "admin" } } will never match because `userid` is not a valid field. The correct field is `username`. This change adds DEBUG logging when an undefined field is referenced. The choice to use DEBUG rather than INFO/WARN is that the set of fields is partially dynamic (e.g. the `metadata.*` fields), so it may be perfectly reasonable to check a field that is not defined for that user. For example this rule: "rules": { "field": { "metadata.ranking" : "A" } } would generate a log message for an unranked user, which would erroneously suggest that such a rule is an error. This DEBUG logging will assist in diagnosing problems, without introducing that confusion. Backport of: #61246 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-09-04 14:19:05 +10:00
Ryan Ernst	d6e17170c3	Simplify adding plugins and modules to testclusters (#61886 ) There are currently half a dozen ways to add plugins and modules for test clusters to use. All of them require the calling project to peek into the plugin or module they want to use to grab its bundlePlugin task, and then both depend on that task, as well as extract the archive path the task will produce. This creates cross project dependencies that are difficult to detect, and if the dependent plugin/module has not yet been configured, the build will fail because the task does not yet exist. This commit makes the plugin and module methods for testclusters symmetric: callers simply add a file provider directly, or a project path that will produce the plugin/module zip. Internally this new variant uses normal configuration/dependencies across projects to get the zip artifact. It also has the added benefit of no longer needing the caller to add to the test task a dependsOn for bundlePlugin task.	2020-09-03 19:37:46 -07:00
James Rodewig	6fc1bb011e	remove xref from heading	2020-09-03 17:49:36 -04:00
Jake Landis	ea1e8ad6ea	[7.x] Fix passing params to template or script failed in watcher (#58559 ) (#61885 ) The main changes are: * Fix custom params are missing when using template or script in watcher's logging action or jira action. * Add yaml tests to test passing params to template or script successfully. Relates to #57625 Co-authored-by: bellengao <gbl_long@163.com>	2020-09-03 15:47:51 -05:00
Costin Leau	99ee87e332	EQL: Revert filter pipe (#61907 ) The current implementation of the filter pipe is incomplete hence why it got reverted. Note this is not a complete revert as some of the improvements of said commit (such as the PostAnalyzer) are useful in general. Relates #61805 (cherry picked from commit 7a7eb66f7d39586c3a3bc00dce49e6c47a23b46a)	2020-09-03 22:31:08 +03:00
Lisa Cawley	3fb6dc05d2	[DOCS] Remove #60900 from release notes (#61944 )	2020-09-03 10:57:00 -07:00
James Rodewig	2a62c8772a	Add release notes for 7.9.1 (#61861 ) (#61937 ) Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com> Co-authored-by: Lisa Cawley <lcawley@elastic.co> Co-authored-by: Martijn Laarman <Mpdreamz@gmail.com> Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-09-03 13:25:52 -04:00
Rene Groeschke	9d4e1f1589	Fix duplicate dir handling in untar transform (#61917 ) (#61933 )	2020-09-03 18:52:48 +02:00
Nik Everett	3d23dcd742	Use standard bit set impl in cardinality (#61816 ) (#61930 ) This replaces a specialized bit set implementation used in cardinality with our standard `BitArray` which works exactly the same way. It's also tracked by `BigArrays` which is great!	2020-09-03 12:37:30 -04:00
Nik Everett	3934e14bc0	Fixup vwhisto test (#60936 ) (#61928 ) This test assumed some random bounds that turned out not to hold in some cases. Closes #60673	2020-09-03 12:37:17 -04:00
James Rodewig	574b177528	[DOCS] Remove 7.9.1 coming tag (#61929 )	2020-09-03 12:30:31 -04:00
Alan Woodward	48870c60c7	Don't spin up a whole node to unit test some data structures (#61923 ) BytesRefHashTests and LongObjectHashMapTests currently extend ESSingleNodeTestCase, which builds an entire node just to run some unit tests over entirely in-memory data structures. This commit converts them both to extend ESTestCase.	2020-09-03 17:19:42 +01:00
Alan Woodward	3a1e0edf0a	Convert DateFieldMapperTests to MapperTestCase (#61920 )	2020-09-03 16:04:02 +01:00
Martijn Laarman	cfa54c08bd	[7.x] Version bump 7.9.1 release	2020-09-03 16:41:58 +02:00
Martijn van Groningen	3d9c12e2d3	Fix data stream wildcard resolution bug in eql search api. (#61910 ) Backport of #61904 to 7.x branch. The eql search api redirects to the search api. For this reason the eql search api could work with concrete data stream names. However, if security is enabled and a data stream name snippet with a wildcard was used, then it could not resolve these expressions. This is because the EqlSearchRequest class didn't override the `includeDataStreams()` method. This PR fixes this, so that the security layer can properly expand data stream name wildcard expressions for the eql search api. This commit also moves the eql data stream test to xpack rest tests, so that the test runs with security enabled. This is required to reproduce the bug. Closes #60828	2020-09-03 16:03:57 +02:00
Tanguy Leroux	c90ee32cdc	Mute ClassificationIT.testTooLowConfiguredMemoryStillStarts (#61915 ) Relates #61913	2020-09-03 15:52:01 +02:00
István Zoltán Szabó	acc9ef52db	[7.x] [DOCS] Adds filter aggregation example link to painless examples (#61890 ) (#61902 ) * [DOCS] Adds filter aggregation example link to painless examples (#61890) * Update docs/reference/transform/painless-examples.asciidoc	2020-09-03 15:32:30 +02:00
Alan Woodward	e2f006eeb4	Merge FetchSubPhase hitsExecute and hitExecute methods (#60907 ) (#61893 ) FetchSubPhase has two 'execute' methods, one which takes all hits to be examined, and one which takes a single HitContext. It's not obvious which one should be implemented by a given sub-phase, or if implementing both is a possibility; nor is it obvious that we first run the hitExecute methods of all subphases, and then subsequently call all the hitsExecute methods. This commit reworks FetchSubPhase to replace these two variants with a processor class, `FetchSubPhaseProcessor`, that is returned from a single `getProcessor` method. This processor class has two methods, `setNextReader()` and `process`. FetchPhase collects processors from all its subphases (if a subphase does not need to execute on the current search context, it can return `null` from `getProcessor`). It then sorts its hits by docid, and groups them by lucene leaf reader. For each reader group, it calls `setNextReader()` on all non-null processors, and then passes each doc id to `process()`. Implementations of fetch sub phases can divide their concerns into per-request, per-reader and per-document sections, and no longer need to worry about sorting docs or dealing with reader slices. FetchSubPhase now provides a FetchSubPhaseExecutor that exposes two methods, setNextReader(LeafReaderContext) and execute(HitContext). The parent FetchPhase collects all these executors together (if a phase should not be executed, then it returns null here); then it sorts hits, and groups them by reader; for each reader it calls setNextReader, and then execute for each hit in turn. Individual sub phases no longer need to concern themselves with sorting docs or keeping track of readers; global structures can be built in getExecutor(SearchContext), per-reader structures in setNextReader and per-doc in execute.	2020-09-03 12:20:55 +01:00
Alan Woodward	af01ccee93	Add specific test for serializing all mapping parameter values (#61844 ) (#61877 ) This commit adds a test to MapperTestCase that explicitly checks that a mapper can serialize all its default values, and that this serialization can then be re-parsed. Note that the test is disabled for non-parametrized mappers as their serialization may in some cases output parameters that are not accepted. Gradually moving all mappers to parametrized form will address this. The commit also contains a fix to keyword mappers, which were not correctly serializing the similarity parameter; this partially addresses #61563. It also enables `null` as a value for `null_value` on `scaled_float`, as a follow-up to #61798	2020-09-03 09:20:26 +01:00
Julie Tibshirani	2a02c6ee36	Remove a redundant section on field data types. (#61821 ) All information in the section is already included in the 'mapping-types' page.	2020-09-02 15:29:48 -07:00
Jake Landis	dbb78e1c45	[7.x] Correct the query dsl for watching elasticsearch version (#58321 ) (#61882 ) The term query should be looking at the cluster_uuid field in elasticsearch_version_mismatch.json. Co-authored-by: bellengao <gbl_long@163.com>	2020-09-02 16:58:21 -05:00
Nik Everett	c19f67ce30	Support longs in BitArray (backport of #61867 ) (#61871 ) We frequently use `long`s with `BitArray` in aggs and right now we have to assert that the `long` fits in an `int`. This adds support for `long` to `BitArray` so we don't need those assertions.	2020-09-02 17:24:31 -04:00
Dan Hermann	e0eafec897	[DOCS] Update tie_breaker defaults for bool_prefix and most_fields query types (#61112 ) (#61881 )	2020-09-02 15:46:38 -05:00
Dimitris Athanasiou	ec405978fc	[7.x][ML] Update reindexing task progress before persisting job progress (#61868 ) (#61875 ) This fixes a bug introduced by #61782. In that PR I thought I could simplify the persistence of progress by using the progress straight from the stats holder in the task instead of calling the get stats action. However, I overlooked that it is then possible to have stale progress for the reindexing task as that is only updated when the get stats API is called. In this commit this is fixed by updating reindexing task progress before persisting the job progress. This seems to be much more lightweight than calling the get stats request. Closes #61852 Backport of #61868	2020-09-02 21:44:18 +03:00
Benjamin Trent	c22415c241	[7.x] [ML] unmute testTooLowConfiguredMemoryStillStarts (#61846 ) (#61869 ) * [ML] unmute testTooLowConfiguredMemoryStillStarts (#61846) Native PR addresses this test failure: https://github.com/elastic/ml-cpp/pull/1465 closes https://github.com/elastic/elasticsearch/issues/61704 closes https://github.com/elastic/elasticsearch/issues/61561	2020-09-02 13:23:23 -04:00
Henning Andersen	867d5f1c68	Search memory leak (#61788 ) (#61862 ) Search could leak memory if global ordinals were calculated as part of a search with low level cancellation enabled. QueryPhase registers a cancellation on the reader that is never removed, which ends up being referenced from the global ordinals cache entry. This keeps an indirect reference to the search context. A significant leak can occur when a heavy aggregation (cardinality for instance) is used and a failure occurs during search, in particular if the pages backing the HyperLogLog++ structure are not recycled when it is closed. This commit also fixes an issue with an unclosed resource and request breaker adjustment in the cardinality aggregation.	2020-09-02 18:51:14 +02:00
Jake Landis	f6b3148e5e	[7.x] Convert second 1/2 x-pack plugins from integTest to [yaml \| java]RestTest or internalClusterTest (#61802 ) (#61856 ) For 1/2 the plugins in x-pack, the integTest task is now a no-op and all of the tests are now executed via a test, yamlRestTest, javaRestTest, or internalClusterTest. This includes the following projects: security, spatial, stack, transform, vectors, voting-only-node, and watcher. A few of the more specialized qa projects within these plugins have not been changed with this PR due to additional complexity which should be addressed separately. related: #60630 related: #56841 related: #59939 related: #55896	2020-09-02 11:20:55 -05:00
Jake Landis	794aac717d	[7.x] Convert first 1/2 x-pack plugins from integTest to [yaml \| java]RestTest or internalClusterTest (#60630 ) (#61855 ) For 1/2 the plugins in x-pack, the integTest task is now a no-op and all of the tests are now executed via a test, yamlRestTest, javaRestTest, or internalClusterTest. This includes the following projects: async-search, autoscaling, ccr, enrich, eql, frozen-indices, data-streams, graph, ilm, mapper-constant-keyword, mapper-flattened, ml. A few of the more specialized qa projects within these plugins have not been changed with this PR due to additional complexity which should be addressed separately. A follow up PR will address the remaining x-pack plugins (this PR is big enough as-is). related: #61802 related: #56841 related: #59939 related: #55896	2020-09-02 11:19:24 -05:00
James Rodewig	6eacb6dd89	[DOCS] Fix keyword xref	2020-09-02 11:47:17 -04:00
James Rodewig	8da4e4ab15	[DOCS] Update shard allocation awareness xref	2020-09-02 11:34:22 -04:00
Dimitris Athanasiou	07ab0beea0	[7.x][ML] Improve handling of exception while starting DFA process (#61838 ) (#61847 ) While starting the data frame analytics process it is possible to get an exception before the process crash handler is in place. In addition, right after starting the process, we check the process is alive to ensure we capture a failed process. However, those exceptions are unhandled. This commit catches any exception thrown while starting the process and sets the task to failed with the root cause error message. I have also taken the chance to remove some unused parameters in `NativeAnalyticsProcessFactory`. Relates #61704 Backport of #61838	2020-09-02 16:32:45 +03:00
Costin Leau	e6dc8054a5	EQL: Introduce filter pipe (#61805 ) Allow filtering through a pipe, across events and sequences. Filter pipes are pushed down to base queries. For now filtering after limit (head/tail) is forbidden as the semantics are still up for debate. Fix #59763 (cherry picked from commit 80569a388b76cecb5f55037fe989c8b6f140761b)	2020-09-02 15:48:51 +03:00

53452 Commits