OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nhat Nguyen	acf84b68cb	Do not wrap soft-deletes reader for segment stats (#51331 ) IndexWriter might not filter out fully deleted segments if retention leases exist or the number of the retaining operations is non-zero. SoftDeletesDirectoryReaderWrapper, however, always filters out fully deleted segments. This change uses the original directory reader when calculating segment stats instead. Relates #51192 Closes #51303	2020-01-23 08:43:06 -05:00
David Kyle	0ac03ac5e7	[ML] Add parsers for inference configuration classes (#51300 )	2020-01-22 17:03:01 +00:00
David Kyle	ca4b90a001	[ML] Calculate results and snapshot retention using latest bucket timestamps (#51061 ) (#51301 ) The retention period is calculated relative to the last bucket result or snapshot time rather than wall clock	2020-01-22 14:52:33 +00:00
Dimitris Athanasiou	59687a9384	[7.x][ML] Validate classification dependent_variable cardinality is at lea… (#51232 ) (#51309 ) Data frame analytics classification currently only supports 2 classes for the dependent variable. We were checking that the field's cardinality is not higher than 2 but we should also check it is not less than that as otherwise the process fails. Backport of #51232	2020-01-22 16:51:16 +02:00
Benjamin Trent	2a73e849d6	[ML][Inference] fixing ingest IT tests (#51267 ) (#51311 ) Converts InferenceIngestIT into a `ESRestTestCase`. closes #51201	2020-01-22 09:50:17 -05:00
David Roberts	932c63297f	[ML] Fix possible race condition when starting datafeed (#51302 ) The ID of the datafeed's associated job was being obtained frequently by looking up the datafeed task in a map that was being modified in other threads. This could lead to NPEs if the datafeed stopped running at an unexpected time. This change reduces the number of places where a datafeed's associated job ID is looked up to avoid the possibility of failures when the datafeed's task is removed from the map of running tasks during multi-step operations in other threads. Fixes #51285	2020-01-22 11:40:39 +00:00
Przemysław Witek	bfcfcdee33	[7.x] Do not copy mapping from dependent variable to prediction field in regression analysis (#51227 ) (#51288 )	2020-01-22 12:36:24 +01:00
Andrei Dan	421aa14972	ILM: Make UpdateSettingsStep retryable (#51235 ) (#51298 ) This makes the UpdateSettingsStep retryable. This step updates settings needed during the execution of ILM actions (mark indexes as read-only, change allocation configurations, mark indexing complete, etc) As the index updates are idempotent in nature (PUT requests and are applied only if the values have changed) and the settings values are seldom user-configurable (aside from the allocate action) the testing for this change goes along the lines of artificially simulating a setting update failure on a particular value update, which is followed by a successful step execution (a retry) in an environment outside of ILM (the step executions are triggered manually). (cherry picked from commit 8391b0aba469f39532bfc2796b76148167dc0289) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-22 11:02:26 +00:00
Andrei Dan	123266714b	ILM wait for active shards on rolled index in a separate step (#50718 ) (#51296 ) After we rollover the index we wait for the configured number of shards for the rolled index to become active (based on the index.write.wait_for_active_shards setting which might be present in a template, or otherwise in the default case, for the primaries to become active). This wait might be long due to disk watermarks being tripped, replicas not being able to spring to life due to cluster nodes reconfiguration and others and, the RolloverStep might not complete successfully due to this inherent transient situation, albeit the rolled index having been created. (cherry picked from commit 457a92fb4c68c55976cc3c3e2f00a053dd2eac70) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-22 11:01:52 +00:00
Ioannis Kakavas	a76321437c	Truncate SAML Response in trace log (#51237 ) (#51283 ) When not truncated, a long SAML response XML document can fill max line length and mask the actual exception message that the trace statement is meant to inform about. The same XML Document is also printed in full on trace level in SamlRequestHandler#parseSamlMessage() so there is no loss of information	2020-01-22 09:56:39 +02:00
Nik Everett	ca15a3f5a8	Add "did you mean" to unknown queries (#51177 ) (#51254 ) This replaces the message we return for unknown queries with the standard one that we use for unknown fields from `ObjectParser`. This is nice because it includes "did you mean". One day we might convert parsing queries to using object parser, but that looks complex. This change is much smaller and seems useful.	2020-01-21 12:45:52 -05:00
Benjamin Trent	a9b2bc525e	[ML] address two edge cases for categorization.GrokPatternCreator#findBestGrokMatchFromExamples (#51168 ) (#51255 ) There are two edge cases that can be run into when example input is matched in a weird way. 1. Recursion depth could continue many many times, resulting in a HUGE runtime cost. I put a limit of 10 recursions (could be adjusted I suppose). 2. If there are no "fixed regex bits", exploring the grok space would result in a fence-post error during runtime (with assertions turned off)	2020-01-21 10:29:29 -05:00
Martijn van Groningen	6b5b26a595	Protects against NPE: 2> REPRODUCE WITH: ./gradlew ':x-pack:plugin:watcher:test' --tests "org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.testTransformFields" -Dtests.seed=26754396AB9C1A30 -Dtests.security.manager=true -Dtests.locale=lv-LV -Dtests.timezone=America/Dominica -Dcompiler.java=13 -Druntime.java=8 2> java.lang.NullPointerException at __randomizedtesting.SeedInfo.seed([26754396AB9C1A30:B2A3CA27E260803B]:0) at org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.lambda$testTransformFields$1(HistoryTemplateTransformMappingsTests.java:85) at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193) at java.util.HashMap$ValueSpliterator.forEachRemaining(HashMap.java:1628) at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482) at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472) at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708) at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234) at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499) at org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.lambda$testTransformFields$2(HistoryTemplateTransformMappingsTests.java:88) at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:892) at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:877) at org.elasticsearch.xpack.watcher.history.HistoryTemplateTransformMappingsTests.testTransformFields(HistoryTemplateTransformMappingsTests.java:74)	2020-01-21 15:42:22 +01:00
Nik Everett	788836ea3f	Revert "Begin moving date_histogram to offset rounding (backport of #50873 ) (#50978 )" (#51239 ) This reverts commit `9a3d4db840`. It was subtly broken in ways we didn't have tests for.	2020-01-21 08:50:02 -05:00
David Roberts	0fa7db9a95	[ML] Make datafeeds work with nanosecond time fields (#51180 ) Allows ML datafeeds to work with time fields that have the "date_nanos" type _and make use of the extra precision_. (Previously datafeeds only worked with time fields that were exact multiples of milliseconds. So datafeeds would work with "date_nanos" only if the extra precision over "date" was not used.) Relates #49889	2020-01-21 09:59:50 +00:00
Nhat Nguyen	43ed244a04	Account soft-deletes in FrozenEngine (#51192 ) (#51229 ) Currently, we do not exclude soft-deleted documents when opening index reader in the FrozenEngine. Backport of #51192	2020-01-20 17:07:29 -05:00
Adrien Grand	1a73d8329c	Disable xpack/15_basic/Usage stats for mappings. Relates #51127	2020-01-20 18:05:26 +01:00
Andrei Stefan	2908b7e5fc	SQL: add support for passing query parameters in REST API calls (#51029 ) (#51222 ) * REST PreparedStatement-like query parameters are now supported in the form of an array of non-object, non-array values where ES SQL parser will try to infer the data type of the value being passed as parameter. (cherry picked from commit 45b8bf619aecb1c03d7bc0cf06928dcc36005a66)	2020-01-20 16:40:19 +02:00
Andrei Stefan	543cc85b78	Add trace logging for responses coming from server (#50530 ) (#51221 ) (cherry picked from commit 38eb485deffa175c7eb0b55a42a3e309f8a9802d)	2020-01-20 16:39:46 +02:00
Andrei Stefan	df36169220	SQL: change the way unsupported data types fields are handled (#50823 ) (#51220 ) The hierarchy of fields/sub-fields under a field that is of an unsupported data type will be marked as unsupported as well. Until this change, the behavior was to set the unsupported data type field's hierarchy as empty. Example, considering the following hierarchy of fields/sub-fields a -> b -> c -> d, if b would be of type "foo", then b, c and d will be marked as unsupported. (cherry picked from commit 7adb286c4c485b9e781f88b0a2f98cab9ec5b7e2)	2020-01-20 16:23:43 +02:00
Hendrik Muhs	51134d9738	check custom meta data to avoid NPE (#51163 ) check custom meta data to avoid NPE, fixes a problem introduced in #51072 fixes #51153	2020-01-20 13:53:42 +01:00
Tim Vernum	a0ca82422c	Mute TimeSeriesLifecycleActionsIT.waitForSnapshot (#51208 ) This test was recently un-muted, but is still failing Relates: #50781 Backport of: #51203	2020-01-20 20:19:29 +11:00
Nik Everett	977b53ab91	Fix flaky usage tracking test (#51169 ) (#51179 ) We added tracking of index feature usage in #51031 but due to some copy and paste errors the test fails on some seeds. This fixes those errors.	2020-01-17 16:53:13 -05:00
Jason Tedor	9ce4d2b901	Initial autoscaling commit (#51161 ) This commit merely adds the skeleton for the autoscaling project, adding the basics to include the autoscaling module in the default distribution, opt-in to code formatting, and a placeholder for the docs.	2020-01-17 15:31:12 -05:00
Lee Hinman	731c96b507	[7.x] Use separate policies for tests in SnapshotLifecycleRest… (#51181 ) These policies store statistics, but since stats updating is asynchronous, it's possible for the update from one test to bleed into a separate one. This change switches the tests to use separate policy ids so that their stats are tracked independently. It also relaxes the checking constraint in one of the tests. Hopefully this: Resolves #48531 Resolves #48017	2020-01-17 13:26:40 -07:00
Jay Modi	107989df3e	Introduce hidden indices (#51164 ) This change introduces a new feature for indices so that they can be hidden from wildcard expansion. The feature is referred to as hidden indices. An index can be marked hidden through the use of an index setting, `index.hidden`, at creation time. One primary use case for this feature is to have a construct that fits indices that are created by the stack that contain data used for display to the user and/or intended for querying by the user. The desire to keep them hidden is to avoid confusing users when searching all of the data they have indexed and getting results returned from indices created by the system. Hidden indices have the following properties: * API calls for all indices (empty indices array, _all, or *) will not return hidden indices by default. Wildcard expansion will not return hidden indices by default unless the wildcard pattern begins with a `.`. This behavior is similar to shell expansion of wildcards. * REST API calls can enable the expansion of wildcards to hidden indices with the `expand_wildcards` parameter. To expand wildcards to hidden indices, use the value `hidden` in conjunction with `open` and/or `closed`. * Creation of a hidden index will ignore global index templates. A global index template is one with a match-all pattern. * Index templates can make an index hidden, with the exception of a global index template. * Accessing a hidden index directly requires no additional parameters. Backport of #50452	2020-01-17 10:09:01 -07:00
Jay Modi	96e8f67425	Upgrade to the latest OWASP HTML sanitizer (#50765 ) (#51166 ) This commit upgrades the OWASP HTML sanitizer used by watcher to the latest version and also upgrades guava, which it depends on. The guava upgrade also requires the addition of a new dependency that guava itself requires as of version 27.0. The sanitizer's behavior has changed to re-write these templated values with a comment that results in this output `{<!-- -->{ctx.metadata.name}}`. This would be an issue if we attempted to sanitize the template, but the code that uses the sanitizer runs the rendered string through the sanitizer, which means that the templated values have been replaced already. Relates #50395	2020-01-17 10:00:33 -07:00
Ioannis Kakavas	4fc865e579	Don't fallback to anonymous for tokens/apikeys (#51042 ) (#51159 ) This commit changes our behavior so that when we receive a request with an invalid/expired/wrong access token or API Key we do not fallback to authenticating as the anonymous user even if anonymous access is enabled for Elasticsearch.	2020-01-17 18:56:02 +02:00
David Roberts	295665b1ea	[ML] Add audit warning for 1000 categories found early in job (#51146 ) If 1000 different category definitions are created for a job in the first 100 buckets it processes then an audit warning will now be created. (This will cause a yellow warning triangle in the ML UI's jobs list.) Such a large number of categories suggests that the field that categorization is working on is not well suited to the ML categorization functionality.	2020-01-17 16:28:45 +00:00
Przemysław Witek	da73c9104e	[ML] Fix tests randomly failing on CI (#51142 ) (#51150 )	2020-01-17 14:58:58 +01:00
Dimitris Athanasiou	b70ebdeb96	[7.x][ML] DF Analytics _explain API should skip object fields (#51115 ) (#51147 ) Object fields cannot be used as features. At the moment the _explain API includes them and, even worse, it does not error when an object field is excluded. This creates the expectation for the user that all child fields will also be excluded, which is not the case. This commit omits object fields from the _explain API and also adds an error if an object field is included or excluded. Backport of #51115	2020-01-17 14:02:59 +02:00
Przemysław Witek	b1a526d5e9	[7.x] [ML] Update DFA progress document in the index the document belongs to (#51111 ) (#51117 )	2020-01-17 08:12:54 +01:00
Hendrik Muhs	13343b15c9	[Transform] Improve force stop robustness in case of an error (#51072 ) If a transform config got lost (e.g. because the internal index disappeared) tasks could not be stopped using transform API. This change makes it possible to stop transforms without a config, meaning to remove the background task. In order to do so force must be set to true.	2020-01-17 07:42:21 +01:00
Ioannis Kakavas	d0554fd317	Fail gracefully on invalid token strings (#51014 ) (#51096 ) When we receive a request with an Authorization header that contains a Bearer token that is not generated by us or that is malformed in some way, attempting to decode it as one of our own might cause a number of exceptions that are not IOExceptions. This commit ensures that we catch and log these too and call onResponse with `null`, so that we can return 401 instead of 500. Resolves: #50497	2020-01-16 17:00:17 +02:00
Bogdan Pintea	fb65ef3f2d	SQL: Extend the optimisations for equalities (#50792 ) (#51098 ) * Extend the optimizations for equalities This commit supplements the optimisations of equalities in conjunctions and disjunctions: * for conjunctions, the existing optimizations with ranges are extended with not-equalities and inequalities; these lead to a fast resolution, the conjunction either being evaluated to FALSE, or the non-equality conditions being dropped as superfluous; * optimisations for disjunctions are added to be applied against ranges, inequalities and not-equalities; these lead to disjunction either becoming TRUE or the equality being dropped, either as superfluous or merged into a range/inequality. * Address review notes * Fix the bug around wrongly optimizing 'a=2 OR a!=?', which only yields TRUE for same values in equality and inequality. * Var renamings, code style adjustments, comments corrections. * Address further review comments. Extend optim. - fix a few code comments; - extend the Equals OR NotEquals optimisation (a=2 OR a!=5 -> a!=5); - extend the Equals OR Range optimisation on limits equality (a=2 OR 2<=a<5 -> 2<=a<5); - in case an equality is being removed in a conjunction, the rest of possible optimisations to test is now skipped. * rename one var for better legibility - s/rmEqual/removeEquals (cherry picked from commit 62e7c6a010f10cd7893ee5c99bad8b8d2a693436)	2020-01-16 14:32:34 +01:00
Tom Veasey	32ec934b15	[7.x][ML] Assert top classes are ordered by score (#51028 ) Backport #51003.	2020-01-16 12:23:15 +00:00
markharwood	ff0a45f882	Fix NPE in PinnedQuery call to DisjunctionMaxScorer. (#51047 ) (#51064 ) Fix NPE in PinnedQuery call to DisjunctionMaxScorer. (#51047) Added test and fix that tests for score type. Closes #51034	2020-01-16 10:41:43 +00:00
Rory Hunter	80d925e225	Auto-format buildSrc (#51043 ) Backport / reimplementation of #50786 on 7.x. Opt-in `buildSrc` for automatic formatting. This required a config tweak in order to pick up all the Java sources, and as a result more files are now found in the Enrich plugin, that were previously missed. I also moved the 2 Java files in `buildSrc/src/main/groovy` into the Java directory, which required some follow-up changes.	2020-01-16 10:26:27 +00:00
Adrien Grand	45d7bdcfd7	Add analysis components and mapping types to the usage API. (#51062 ) Knowing about used analysis components and mapping types would be incredibly useful in order to know which ones may be deprecated or should get more love. Some field types also act as a proxy to know about feature usage of some APIs like the `percolator` or `completion` fields types for percolation and the completion suggester, respectively.	2020-01-16 09:56:41 +01:00
Tim Vernum	ac6602a156	Fix windows newline issue in test (#51082 ) Fixes HttpCertificateCommandTests.testTextFileSubstitutions on Windows Backport of: #51030	2020-01-16 17:01:58 +11:00
Yang Wang	c1a6d5d9ff	Encrypt generated key with AES (#51019 ) (#51076 ) Replace DES with AES to align with modern encryption standards. Backport also fixes Files.readString API that is not available in Java 8 Resolves: #50843	2020-01-16 14:47:21 +11:00
Lee Hinman	2d1c28a45d	[7.x] Fix AllocateRoutedStepTests reusing keys for random valu… (#51058 ) In these tests there was a very small chance that keys could collide, which causes test failures. Resolves #49307	2020-01-15 11:36:34 -07:00
Lee Hinman	e395cf3419	Guard against null settings in CCRIndexLifecycleIT (#51008 ) (#51054 ) It's possible that the index could return no settings and thus throw a `NullPointerException`. I wasn't able to reproduce the original issue, but this should guard against in the future. Resolves #50646 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-01-15 11:21:18 -07:00
Lee Hinman	ad60f0015e	Address failures in SnapshotLifecycleRestIT.testFullPolicySnapshot (#51013 ) This test failed a couple of different ways, related to timing, as well as concurrent snapshots, and also naming. This commit splits the giant `assertBusy` into separate parts so that we don't perform ~5 different requests and tests in the same loop. It also gives each test a unique repository so that no other test can accidentally re-use snapshots. Resolves #50358 (hopefully!)	2020-01-15 09:47:41 -07:00
Rory Hunter	2f069d8f3f	Tweak formatter config for long generic lines (#51027 ) Backport of #50909. The current formatting config allows some long generic declarations to break the 140 character limit. Tweak the config to wrap such lines.	2020-01-15 13:17:37 +00:00
David Roberts	1536c3e622	[TEST] Increase ML distributed test job open timeout (#50998 ) There have been occasional failures, presumably due to too many tests running in parallel, caused by jobs taking around 15 seconds to open. (You can see the job open successfully during the cleanup phase shortly after the failure of the test in these cases.) This change increases the wait time from 10 seconds to 20 seconds to reduce the risk of this happening.	2020-01-15 08:58:55 +00:00
Martijn van Groningen	e76c3d4d32	Tidy up enrich processors: (#50957 ) * Fix generics usages. * Sealed match processor class.	2020-01-15 08:51:22 +01:00
Tomas Della Vedova	5b6fa79fd8	[ML] Removed key value from the catch regex test (#50977 ) (#51021 )	2020-01-15 08:50:59 +01:00
Tim Vernum	e41c0b1224	Deprecating kibana_user and kibana_dashboard_only_user roles (#50963 ) This change adds a new `kibana_admin` role, and deprecates the old `kibana_user` and`kibana_dashboard_only_user`roles. The deprecation is implemented via a new reserved metadata attribute, which can be consumed from the API and also triggers deprecation logging when used (by a user authenticating to Elasticsearch). Some docs have been updated to avoid references to these deprecated roles. Backport of: #46456 Co-authored-by: Larry Gregory <lgregorydev@gmail.com>	2020-01-15 11:07:19 +11:00
Nik Everett	fc5fde7950	Add "did you mean" to ObjectParser (#50938 ) (#50985 ) Check it out: ``` $ curl -u elastic:password -HContent-Type:application/json -XPOST localhost:9200/test/_update/foo?pretty -d'{ "dac": {} }' { "error" : { "root_cause" : [ { "type" : "x_content_parse_exception", "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?" } ], "type" : "x_content_parse_exception", "reason" : "[2:3] [UpdateRequest] unknown field [dac] did you mean [doc]?" }, "status" : 400 } ``` The tricky thing about implementing this is that x-content doesn't depend on Lucene. So this works by creating an extension point for the error message using SPI. Elasticsearch's server module provides the "spell checking" implementation.	2020-01-14 17:53:41 -05:00
Nik Everett	9a3d4db840	Begin moving date_histogram to offset rounding (backport of #50873 ) (#50978 ) We added a new rounding in #50609 that handles offsets to the start and end of the rounding so that we could support `offset` in the `composite` aggregation. This starts moving `date_histogram` to that new offset.	2020-01-14 16:50:27 -05:00
Benjamin Trent	72c270946f	[ML][Inference] Adding classification_weights to ensemble models (#50874 ) (#50994 ) * [ML][Inference] Adding classification_weights to ensemble models classification_weights are a way to allow models to prefer specific classification results over others this might be advantageous if classification value probabilities are a known quantity and can improve model error rates.	2020-01-14 12:40:25 -05:00
Tom Veasey	de5713fa4b	[ML] Disable invalid assertion (#50988 ) Backport #50986.	2020-01-14 17:35:00 +00:00
Armin Braun	16c07472e5	Track Snapshot Version in RepositoryData (#50930 ) (#50989 ) * Track Snapshot Version in RepositoryData (#50930) Add tracking of snapshot versions to RepositoryData to make BwC logic more efficient. Follow up to #50853	2020-01-14 18:15:07 +01:00
David Kyle	7f309a18f1	[7.x][ML] Explicitly require a OriginSettingClient in ML results iterators (#50981 ) In classes where the client is used directly rather than through a call to executeAsyncWithOrigin explicitly require the client to be OriginSettingClient rather than using the Client interface. Also remove calls to deprecated ClientHelper.clientWithOrigin() method.	2020-01-14 17:14:39 +00:00
Dimitris Athanasiou	1d8cb3c741	[7.x][ML] Add num_top_feature_importance_values param to regression and classi… (#50914 ) (#50976 ) Adds a new parameter to regression and classification that enables computation of importance for the top most important features. The computation of the importance is based on SHAP (SHapley Additive exPlanations) method. Backport of #50914	2020-01-14 16:46:09 +02:00
Hendrik Muhs	0178c7c5d0	[7.x][Transform] correctly retrieve checkpoints from remote indices (#50903 ) (#50969 ) uses remote client(s) to correctly retrieve index checkpoints from remote clusters	2020-01-14 15:09:14 +01:00
Przemysław Witek	9c6ffdc2be	[7.x] Handle nested and aliased fields correctly when copying mapping. (#50918 ) (#50968 )	2020-01-14 14:43:39 +01:00
David Kyle	69a3626ee1	Mute SnapshotLifecycleRestIT testFullPolicySnapshot Relates to #50358	2020-01-14 13:46:37 +01:00
Daniel Mitterdorfer	263083b882	Mute HttpCertificateCommandTests.testTextFileSubstitutions (#50965 ) (#50966 ) Relates #50964	2020-01-14 12:40:34 +01:00
Tim Vernum	2bb7b53e41	Add certutil http command (#50952 ) This adds a new "http" sub-command to the certutil CLI tool. The http command generates certificates/CSRs for use on the http interface of an elasticsearch node/cluster. It is designed to be a guided tool that provides explanations and suggestions for each of the configuration options. The generated zip file output includes extensive "readme" documentation and sample configuration files for core Elastic products. Backport of: #49827	2020-01-14 21:24:21 +11:00
Tim Vernum	b02b073a57	Increase Size and lower TTL on DLS BitSet Cache (#50953 ) The Document Level Security BitSet Cache (see #43669) had a default configuration of "small size, long lifetime". However, this is not a very useful default as the cache is most valuable for BitSets that take a long time to construct, which is (generally speaking) the same ones that operate over a large number of documents and contain many bytes. This commit changes the cache to be "large size, short lifetime" so that it can hold bitsets representing billions of documents, but releases memory quickly. The new defaults are 10% of heap, and 2 hours. This also adds some logging when a single BitSet exceeds the size of the cache and when the cache is full. Backport of: #50535	2020-01-14 18:04:02 +11:00
Tim Vernum	33c29fb5a3	Support Client and RoleMapping in custom Realms (#50950 ) Previously custom realms were limited in what services and components they had easy access to. It was possible to work around this because a security extension is packaged within a Plugin, so there were ways to store these components in static/SetOnce variables and access them from the realm, but those techniques were fragile, undocumented and difficult to discover. This change includes key services as an argument to most of the methods on SecurityExtension so that custom realm / role provider authors can have easy access to them. Backport of: #50534	2020-01-14 15:26:41 +11:00
Tim Vernum	90ba77951a	Fix memory leak in DLS bitset cache (#50946 ) The Document Level Security BitSet cache stores a secondary "lookup map" so that it can determine which cache entries to invalidate when a Lucene index is closed (merged, etc). There was a memory leak because this secondary map was not cleared when entries were naturally evicted from the cache (due to size/ttl limits). This has been solved by adding a cache removal listener and processing those removal events asynchronously. Backport of: #50635	2020-01-14 13:19:05 +11:00
Tim Vernum	1577a0e617	Validate field permissions when creating a role (#50917 ) When creating a role, we do not check if the exceptions for the field permissions are a subset of granted fields. If such a role is assigned to a user then that user's authentication fails for this reason. We added a check to validate role query in #46275 and on the same lines, this commit adds check if the exceptions for the field permissions is a subset of granted fields when parsing the index privileges from the role descriptor. Backport of: #50212 Co-authored-by: Yogesh Gaikwad <bizybot@users.noreply.github.com>	2020-01-14 12:37:45 +11:00
Tim Vernum	c2acb8830a	Add max_resource_units to enterprise license (#50910 ) The enterprise license type must have "max_resource_units" and may not have "max_nodes". This change adds support for this new field, validation that the field is present if-and-only-if the license is enterprise and bumps the license version number to reflect the new field. Includes a BWC layer to return "max_nodes: ${max_resource_units}" in the GET license API. Backport of: #50735	2020-01-14 12:37:05 +11:00
Przemko Robakowski	a18736b46d	[7.x] ILM action to wait for SLM policy execution (#50454 ) (#50943 ) * ILM action to wait for SLM policy execution (#50454) This change adds a new ILM action to wait for SLM policy execution to ensure that index has snapshot before deletion. Closes #45067 * Fix flaky TimeSeriesLifecycleActionsIT#testWaitForSnapshot test This change adds some randomness and cleanup step to TimeSeriesLifecycleActionsIT#testWaitForSnapshot and testWaitForSnapshotSlmExecutedBefore tests in an attempt to make them stable. Relates to #50781 * Formatting changes * Longer timeout * Fix Map.of in Java8 * Unused import removed	2020-01-14 01:34:33 +01:00
Lee Hinman	91689e793d	[7.x] Refresh cached phase policy definition if possible on ne… (#50941 ) * Refresh cached phase policy definition if possible on new policy There are some cases when updating a policy does not change the structure in a significant way. In these cases, we can reread the policy definition for any indices using the updated policy. This commit adds this refreshing to the `TransportPutLifecycleAction` to allow this. It allows us to do things like change the configuration values for a particular step, even when on that step (for example, changing the rollover criteria while on the `check-rollover-ready` step). There are more cases where the phase definition can be reread than just the ones checked here (for example, removing an action that has already been passed), and those will be added in subsequent work. Relates to #48431	2020-01-13 14:31:41 -07:00
Bogdan Pintea	f04b4cbee8	SQL: Optimisation fixes for conjunction merges (#50703 ) (#50933 ) * SQL: Optimisation fixes for conjunction merges This commit fixes the following issues around the way comparisons are merged with ranges in conjunctions: * the decision to include the equality of the lower limit is corrected; * the selection of the upper limit is corrected to use the upper bound of the range; * the list of terms in the conjunction is sorted to have the ranges at the bottom; this allows subsequent binary comparisons to find compatible ranges and potentially be merged away. The end guarantee being that the optimisation takes place irrespective of the order of the conjunction terms in the statement. Some comments are also corrected. * address review observation on anon. comparator Replace anonymous comparator of split AND Expressions with a lambda. (cherry picked from commit 9828cb143a41f1bda1219541f3a8fdc03bf6dd14)	2020-01-13 21:51:29 +01:00
Ioannis Kakavas	ba37e3c4a0	Disable DiagnosticTrustManager in FIPS 140 (#49888 ) This commit changes the default behavior for xpack.security.ssl.diagnose.trust when running in a FIPS 140 JVM. More specifically, when xpack.security.fips_mode.enabled is true: - If xpack.security.ssl.diagnose.trust is not explicitly set, the default value of it becomes false and a log message is printed on info level, notifying of the fact that the TLS/SSL diagnostic messages are not enabled when in a FIPS 140 JVM. - If xpack.security.ssl.diagnose.trust is explicitly set, the value of it is honored, even in FIPS mode. This is relevant only for 7.x where we support Java 8 in which SunJSSE can still be used as a FIPS 140 provider for TLS. SunJSSE in FIPS mode, disallows the use of other TrustManager implementations than the one shipped with SunJSSE.	2020-01-13 17:04:23 +02:00
Larry Gregory	cc8aafcfc2	[7.x] - Adding GET/PUT ILM cluster privileges to `kibana_syste… (#50878 ) Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-01-13 08:36:48 -05:00
Benjamin Trent	eb8fd44836	[ML][Inference] minor fixes for created_by, and action permission (#50890 ) (#50911 ) The system created and models we provide now use the `_xpack` user for uniformity with our other features The `PUT` action is now an admin cluster action And XPackClient class now references the action instance.	2020-01-13 07:59:31 -05:00
Albert Zaharovits	4e837599b3	Nit fix test randomInt bound Relates `2b789fa3e6`	2020-01-13 13:28:20 +02:00
Albert Zaharovits	2b789fa3e6	Make .async-search-* a restricted namespace (#50294 ) Hide the `.async-search-*` in Security by making it a restricted index namespace. The namespace is hard-coded. To grant privileges on restricted indices, one must explicitly toggle the `allow_restricted_indices` flag in the indices permission in the role definition. As is the case with any other index, if a certain user lacks all permissions for an index, that index is effectively nonexistent for that user.	2020-01-13 12:20:54 +02:00
Tim Vernum	985c95dcca	Populate OpenIDConnect metadata collections (#50893 ) The OpenIdConnectRealm had a bug which would cause it not to populate User metadata for collections contained in the user JWT claims. This commit fixes that bug. Backport of: #50521	2020-01-13 18:02:22 +11:00
Benjamin Trent	fa116a6d26	[7.x] [ML][Inference] PUT API (#50852 ) (#50887 ) * [ML][Inference] PUT API (#50852) This adds the `PUT` API for creating trained models that support our format. This includes * HLRC change for the API * API creation * Validations of model format and call * fixing backport	2020-01-12 10:59:11 -05:00
Lee Hinman	63472d30c7	[7.x] Fix SLM check for restore in progress (#50868 ) (#50876 ) * Fix SLM check for restore in progress (#50868) * Fix SLM check for restore in progress This commit fixes the check in SLM where the `RestoreInProgress` metadata was checked for existence. Rather than check existence we should instead check the `isEmpty` method. Prior to this, a successful restore for a repository that used SLM retention would prevent SLM retention from running in subsequent invocations, due to SLM thinking that a restore was still running. * Fix 7.x-isms	2020-01-10 14:27:55 -07:00
Julie Tibshirani	3bac1dc414	Adjust the skip version in flattened field telemetry tests. We forgot to adjust the version when backporting the commit to 7.x.	2020-01-10 10:36:41 -08:00
Benjamin Trent	5afa0b71e9	[ML][Inference] Unify top_classes object field names with analytics (#50858 ) (#50861 )	2020-01-10 12:00:37 -05:00
Dimitris Athanasiou	422422a2bc	[7.x][ML] Reuse SourceDestValidator for data frame analytics (#50841 ) (#50850 ) This commit removes validation logic of source and dest indices for data frame analytics and replaces it with using the common `SourceDestValidator` class which is already used by transforms. This way the validations and their messages become consistent while we reduce code. This means that where these validations fail the error messages will be slightly different for data frame analytics. Backport of #50841	2020-01-10 14:24:13 +02:00
Nik Everett	ae40e22452	Drop "funny" functions building parsers (#50715 ) (#50814 ) Replaces the "funny" `Function<String, ConstructingObjectParser<T, Void>>` with a much simpler `ConstructingObjectParser<T, String>`. This makes pretty much all of our object parsers static.	2020-01-09 15:53:03 -05:00
Jake Landis	de6f132887	[7.x] Foreach processor - fork recursive call (#50514 ) (#50773 ) A very large number of recursive calls can cause a stack overflow exception. This commit forks the recursive calls for non-async processors. Once forked, each thread will handle at most 10 recursive calls to help keep the stack size and thread count down to a reasonable size.	2020-01-09 13:21:18 -06:00
Benjamin Trent	cc0e64572a	[ML][Inference][HLRC] Add necessary lang ident classes (#50705 ) (#50794 ) This adds the necessary named XContent classes to the HLRC for the lang ident model. This is so the HLRC can call `GET _ml/inference/lang_ident_model_1?include_definition=true` without XContent parsing errors. The constructors are package private since these classes are used exclusively within the pre-packaged model (and require the specific weights, etc. to be of any use).	2020-01-09 10:33:38 -05:00
Benjamin Trent	3e014d39c2	[Transform] fail to start/put on missing pipeline (#50701 ) (#50795 ) If a pipeline referenced by a transform does not exist, we should not allow the transform to be created. We do allow the pipeline existence check to be skipped with defer_validations, but if the pipeline still does not exist on `_start`, the transform will fail to start. relates: #50135	2020-01-09 10:33:22 -05:00
Martijn van Groningen	f75d99149b	Wrap triggering of a watch inside an assertBusy(...) invocation This test replaces the watch index after watcher got started. This triggers watches being reloaded and while this happens the trigger engine is paused, which disallows watches from being triggered. At this time there are no watches in the .watches index and I think this is just unlucky timing. Reloading of watches happens in the background and the watch state can be started when that happens. For normal schedule trigger engines this is not an issue, because watches that are meant to be triggered are triggered when the engine triggers the next time. However for the mock scheduled trigger engine this is different, because watches are triggered programmatically and there is no retry in this test. I think just adding `timeWarp().trigger("mywatch");` inside an `assertBusy(...)` is the right fix here. If it fails because the mock schedule trigger engine is paused then the test will try again. In the meantime the watches can be reloaded, which then resumes the mock scheduled trigger engine. Closes #50658	2020-01-09 09:05:20 +01:00
Ioannis Kakavas	d2189b9d80	Mute SamlAuthenticatorTests in Azul Zulu (#50779 ) See #49742	2020-01-09 09:41:04 +02:00
Christoph Büscher	b1b4282273	Make Multiplexer inherit filter chains analysis mode (#50662 ) Currently, if an updateable synonym filter is included in a multiplexer filter, it is not reloaded via the _reload_search_analyzers because the multiplexer itself doesn't pass on the analysis mode of the filters it contains, so it's not recognized as "updateable" in itself. Instead we can check and merge the AnalysisMode settings of all filters in the multiplexer and use the resulting mode (e.g. search-time only) for the multiplexer itself, thus making any synonym filters contained in it reloadable. This, of course, will also make the analyzers using the multiplexer usable at search-time only. Closes #50554	2020-01-08 22:12:01 +01:00
Lee Hinman	8dc6e98819	[7.x] Make InitializePolicyContextStep retryable (#50685 ) (#50760 ) This commit makes the "init" ILM step retryable. It also adds a test where an index is created with a non-parsable index name and then fails. Related to #48183	2020-01-08 13:13:57 -07:00
Nhat Nguyen	90e66a7b97	Mute testPolicyCRUD Tracked at #44997	2020-01-08 13:25:40 -05:00
Adrien Grand	4f2299c714	Upgrade to Lucene 8.4.0. (#50518 ) (#50750 )	2020-01-08 18:53:59 +01:00
Lee Hinman	615532b4f8	Mute TimeSeriesLifecycleActionsIT.testHistoryIsWritten* (#50755 ) Related to #50353	2020-01-08 10:35:44 -07:00
Adrien Grand	31158ab3d5	Add per-field metadata. (#50333 ) This PR adds per-field metadata that can be set in the mappings and is later returned by the field capabilities API. This metadata is completely opaque to Elasticsearch but may be used by tools that index data in Elasticsearch to communicate metadata about fields with tools that then search this data. A typical example that has been requested in the past is the ability to attach a unit to a numeric field. In order to not bloat the cluster state, Elasticsearch requires that this metadata be small: - keys can't be longer than 20 chars, - values can only be numbers or strings of no more than 50 chars - no inner arrays or objects, - the metadata can't have more than 5 keys in total. Given that metadata is opaque to Elasticsearch, field capabilities don't try to do anything smart when merging metadata about multiple indices; the union of all field metadata is returned. Here is what the meta might look like in mappings: ```json { "properties": { "latency": { "type": "long", "meta": { "unit": "ms" } } } } ``` And then in the field capabilities response: ```json { "latency": { "long": { "searchable": true, "aggregatable": true, "meta": { "unit": [ "ms" ] } } } } ``` When there are no conflicts, values are arrays of size 1, but when there are conflicts, Elasticsearch includes all unique values in this array, without giving ways to know which index has which metadata value: ```json { "latency": { "long": { "searchable": true, "aggregatable": true, "meta": { "unit": [ "ms", "ns" ] } } } } ``` Closes #33267	2020-01-08 16:21:18 +01:00
Andrei Dan	3915d4c055	Make the UpdateRolloverLifecycleDateStep retryable (#50702 ) (#50730 ) This makes the "update-rollover-lifecycle-date" step, which is part of the rollover action, retryable. It also adds an integration test to check the step is retried and it eventually succeeds. (cherry picked from commit 5bf068522deb2b6cd2563bcf80f34fdbf459c9f2) Signed-off-by: Andrei Dan <andrei.dan@elastic.co>	2020-01-08 11:45:26 +01:00
Christoph Büscher	d8c907d648	Remove _reload_search_analyzer experimental status (#50696 ) Removing the experimental status in the docs and the rest specs.	2020-01-08 10:35:19 +01:00
Tim Vernum	293661d62c	Security should not reload files that haven't changed (#50724 ) In security we currently monitor a set of files for changes: - config/role_mapping.yml (or alternative configured path) - config/roles.yml - config/users - config/users_roles This commit prevents unnecessary reloading when the file change actually doesn't change the internal structure. Backport of: #50207 Co-authored-by: Anton Shuvaev <anton.shuvaev91@gmail.com>	2020-01-08 15:13:47 +11:00
Mayya Sharipova	c1c0b47d5e	Specify the index name in searches (#50717 ) vector REST tests occasionally fail on 7.x because we don't receive the expected response headers with deprecation warnings. This happens as searches were executed against all indices including internal indices, whose shards did not produce expected warnings. This PR ensures that searches are executed only against expected indices. Closes #50716	2020-01-07 17:06:52 -05:00
Benjamin Trent	060e0a6277	[ML][Inference] Add support for models shipped as resources (#50680 ) (#50700 ) This adds support for models that are shipped as resources in the ML plugin. The first of which is the `lang_ident` model.	2020-01-07 09:21:59 -05:00
Hendrik Muhs	98ca9500e8	implement a workaround for remote cluster validation (#50460 ) In 7.x an internal API used for validating remote cluster does not throw, see #50420 for the details. This change implements a workaround for remote cluster validation, only for 7.x branches. fixes #50420	2020-01-07 13:51:51 +01:00
Przemysław Witek	4116452d90	Implement testStopAndRestart for ClassificationIT (#50585 ) (#50698 )	2020-01-07 13:41:37 +01:00
David Roberts	35453e2b0e	[ML] Improve uniqueness of result document IDs (#50644 ) Switch from a 32 bit Java hash to a 128 bit Murmur hash for creating document IDs from by/over/partition field values. The 32 bit Java hash was not sufficiently unique, and could produce identical numbers for relatively common combinations of by/partition field values such as L018/128 and L017/228. Fixes #50613	2020-01-07 10:24:45 +00:00

4004 Commits