OpenSearch

Commit Graph

Author	SHA1	Message	Date
Armin Braun	de6eeecbd3	Dry up Snapshot Integ Tests some More (#62856 ) (#63248 ) * Just some obvious drying up of these super complex tests. * Mainly just shortening the diff of #61839 here by moving test utilities to the abstract test case. Also, making use of the now available functionality to simplify existing tests and improve logging in them.	2020-10-05 18:33:59 +02:00
David Roberts	a522e932e8	Mute RoundingDuelTests.testSerialization Due to https://github.com/elastic/elasticsearch/issues/63256	2020-10-05 17:22:40 +01:00
Adam Locke	83fcaf4fe7	[DOCS] [7.x] Add PGSync as community-supported integration (#63250 ) * Add PGSync as a new community supported tool (#62788) * Removing errant space in Kafka link. Co-authored-by: Tolu Aina <7848930+toluaina@users.noreply.github.com>	2020-10-05 12:02:23 -04:00
Armin Braun	89de9fdcf7	Cleanup Blobstore Repository Metadata Serialization (#62727 ) (#63249 ) Follow ups to #62684 making use of shorter utility for corruption checks.	2020-10-05 17:44:27 +02:00
Armin Braun	509fa46c9e	Fix Broken Exception Handling in Snapshot Cleanup Tool (#63243 ) In the latest version of the GCS SDK the `404` exception is wrapped in an `IOException` making it not pass to the unwrapping added in the previous fix #63168. We can't be handling `IOException` differently here now that GCS uses it for `404`s so I adjusted the exception unwrapping accordingly. Closes #63091	2020-10-05 16:50:47 +02:00
Nik Everett	461475f9e9	Make Rounding.nextRoundingValue consistent (backport #62983 ) (#63242 ) "interval" style roundings were implementing `nextRoundingValue` in a fairly inconsistent way - it'd produce a value, but sometimes that value would be the same as the previous rounding value. This makes it consistently the next value that `rounding` would make.	2020-10-05 10:38:20 -04:00
David Roberts	1b32daf37b	Mute FullClusterRestartIT.testWatcherWithApiKey (#63241 ) Due to https://github.com/elastic/elasticsearch/issues/63088	2020-10-05 15:03:42 +01:00
Nik Everett	77757b28e0	Speed up date_histogram by precomputing ranges (backport of #61467 ) (#62881 ) A few of us were talking about ways to speed up the `date_histogram` using the index for the timestamp rather than the doc values. To do that we'd have to pre-compute all of the "round down" points in the index. It turns out that just precomputing those values speeds up rounding fairly significantly: ``` Benchmark (count) (interval) (range) (zone) Mode Cnt Score Error Units before 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 96461080.982 ± 616373.011 ns/op before 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 130598950.850 ± 1249189.867 ns/op after 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 52311775.080 ± 107171.092 ns/op after 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 54800134.968 ± 373844.796 ns/op ``` That's a 46% speed up when there isn't a time zone and a 58% speed up when there is. This doesn't work for every time zone, specifically those that have two midnights in a single day due to daylight savings time will produce wonky results. So they don't get the optimization. Second, this requires a few expensive computations up front to make the transition array. And if the transition array is too large then we give up and use the original mechanism, throwing away all of the work we did to build the array. This seems appropriate for most usages of `round`, but this change uses it for all usages of `round`. That seems ok for now, but it might be worth investigating in a follow up. I ran a macrobenchmark as well which showed an 11% performance improvement. BUT the benchmark wasn't tuned for my desktop so it overwhelmed it and might have produced "funny" results. I think it is pretty clear that this is an improvement, but I know the measurement is weird: ``` Benchmark (count) (interval) (range) (zone) Mode Cnt Score Error Units before 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 96461080.982 ± 616373.011 ns/op before 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 130598950.850 ± 1249189.867 ns/op after 10000000 calendar month 2000-10-28 to 2000-10-31 UTC avgt 10 52311775.080 ± 107171.092 ns/op after 10000000 calendar month 2000-10-28 to 2000-10-31 America/New_York avgt 10 54800134.968 ± 373844.796 ns/op Before: | Min Throughput | hourly_agg | 0.11 | ops/s | | Median Throughput | hourly_agg | 0.11 | ops/s | | Max Throughput | hourly_agg | 0.11 | ops/s | | 50th percentile latency | hourly_agg | 650623 | ms | | 90th percentile latency | hourly_agg | 821478 | ms | | 99th percentile latency | hourly_agg | 859780 | ms | | 100th percentile latency | hourly_agg | 864030 | ms | | 50th percentile service time | hourly_agg | 9268.71 | ms | | 90th percentile service time | hourly_agg | 9380 | ms | | 99th percentile service time | hourly_agg | 9626.88 | ms | | 100th percentile service time | hourly_agg | 9884.27 | ms | | error rate | hourly_agg | 0 | % | After: | Min Throughput | hourly_agg | 0.12 | ops/s | | Median Throughput | hourly_agg | 0.12 | ops/s | | Max Throughput | hourly_agg | 0.12 | ops/s | | 50th percentile latency | hourly_agg | 519254 | ms | | 90th percentile latency | hourly_agg | 653099 | ms | | 99th percentile latency | hourly_agg | 683276 | ms | | 100th percentile latency | hourly_agg | 686611 | ms | | 50th percentile service time | hourly_agg | 8371.41 | ms | | 90th percentile service time | hourly_agg | 8407.02 | ms | | 99th percentile service time | hourly_agg | 8536.64 | ms | | 100th percentile service time | hourly_agg | 8538.54 | ms | | error rate | hourly_agg | 0 | % | ```	2020-10-05 09:58:24 -04:00
Rene Groeschke	f58ebe58ee	Use services for archive and file operations in tasks (#62968 ) (#63201 ) Referencing a project instance during task execution is discouraged by Gradle and should be avoided. E.g. it is incompatible with Gradle's incubating configuration cache. Instead there are services available to handle archive and filesystem operations in task actions. Brings us one step closer to #57918	2020-10-05 15:52:15 +02:00
Benjamin Trent	1e63313c19	[ML] adds feature_importance_baseline object to model metadata (#63172 ) (#63237 ) This adds the new field `feature_importance_baseline` and allows it to optionally be included in the model's metadata. Related to: https://github.com/elastic/ml-cpp/pull/1522	2020-10-05 09:33:38 -04:00
Ayush Eshan	dab1b14a10	[Docs] Fixed a couple of typos in CONTRIBUTING (#63205 ) Fixed a couple of typos in contribution guide. (cherry picked from commit 71fcaa41d83bfc5388d22c9b07d2800ee19a8fe8)	2020-10-05 16:25:57 +03:00
Marios Trivyzas	19650e860a	EQL: [Test] Add a test for `identifier` as eventType (#63227 ) (#63235 ) Add a unit test to verify that an identifier surrounded with backquotes is not a valid syntax for eventType value, as eventType is semantically a string literal and not a field identifier. Follows: #63169 (cherry picked from commit ff12c1340b3890ac52251f31259fa9a719d9eacc)	2020-10-05 15:23:08 +02:00
Costin Leau	1047d67199	Revert "EQL: Avoid filtering on tiebreakers (#63215 )" This reverts commit `efd2243886`.	2020-10-05 15:55:59 +03:00
David Roberts	ccaec70a84	[ML] Muting mappings upgrade test for .ml-stats (#63234 ) Due to https://github.com/elastic/elasticsearch/issues/61908	2020-10-05 13:22:13 +01:00
Armin Braun	d13c1f5058	Fix Overly Strict Assertion in BlobStoreRepository (#63061 ) (#63236 ) As long as `bestEffortConsistency` is `true`, the value of `latestKnownRepoGen` can be updated as a result of reads. We can only assert that `latestKnownRepoGen` and cluster state move in lock-step if `bestEffortConsistency` was `false` before updating the metadata generation as well as after. Closes #62877	2020-10-05 14:06:57 +02:00
Costin Leau	8c4503bcc3	EQL: Change default indices options (#63192 ) Ignore by default unavailable indices (same as ES) and verify that allowNoIndices is set to false since at least one index is required for validating the query. Fix #62986 (cherry picked from commit fd75ac27223cd1b699b8d9c311dc401a39f9e0c8)	2020-10-05 14:21:56 +03:00
Costin Leau	b67d2274ae	QL: Optimize regexs without patterns as equality (#63216 ) If a QL regex doesn't contain any pattern, convert it to Equals. Close #63196 (cherry picked from commit e22a843124290aaacd0e80d7ae9b883e5ec2431e)	2020-10-05 14:21:42 +03:00
Costin Leau	efd2243886	EQL: Avoid filtering on tiebreakers (#63215 ) Do not filter by tiebreaker while searching sequence matches as it's not monotonic and thus can filter out valid data. Fix #62781 (cherry picked from commit 4d62198df70f3b70f8b6e7730e888057652c18a8)	2020-10-05 14:21:30 +03:00
Costin Leau	4f593bdd69	EQL: Make queries using Point-In-Time rely on index filtering (#63161 ) Point-In-Time queries cannot be run on individual indices but on all. Thus all PIT queries move their index from the request level to a filter so this condition is fulfilled while keeping the query scoped accordingly. Fix #63132 (cherry picked from commit c8eb4f724d5dcc0fcc172c6219ecfbc1dc1fbbae)	2020-10-05 14:21:09 +03:00
Yannick Welsch	b4a1199e87	Uniquely associate term with update task during election (#62212 ) There is a small race when processing the cluster state that is used to establish a newly elected leader as master of the cluster: it can pick the term in its master state update task from a different (newer) election. This trips an assertion in `Coordinator.publish(...)` where we claim that the term on the state allows to uniquely define the pre-state but this isn't so. There are no bad consequences of this race since such a publication fails later on anyway. This PR fixes things so that the assertion holds true by improving the handling of terms during cluster state processing by associating each master state update task that is used to establish a newly elected leader with the correct corresponding term from its election. It also explicitly handles the case where the pre-state that is used as base state has already superseded the current state. As a nice side-effect, join batching now only happens based on the same term. Closes #61437	2020-10-05 11:46:10 +01:00
Armin Braun	106695bec8	Fix Race in ClusterApplierService Shutdown (#62944 ) (#63228 ) The iteration over `timeoutClusterStateListeners` starts when the CS applier thread is still running. This can lead to entries being added to it that never get their listener resolved on shutdown and thus leak that listener as observed in a stuck test in #62863. Since `listener.onClose()` is idempotent we can just call it if we run into a stopped service on the CS thread to avoid the race with certainty (because the iteration in `doStop` starts after the stopped state has been set). Closes #62863	2020-10-05 12:35:42 +02:00
Alan Woodward	01950bc80f	Move FieldMapper#valueFetcher to MappedFieldType (#62974 ) (#63220 ) For runtime fields, we will want to do all search-time interaction with a field definition via a MappedFieldType, rather than a FieldMapper, to avoid interfering with the logic of document parsing. Currently, fetching values for runtime scripts and for building top hits responses need to call a method on FieldMapper. This commit moves this method to MappedFieldType, incidentally simplifying the current call sites and freeing us up to implement runtime fields as pure MappedFieldType objects.	2020-10-04 14:54:59 +01:00
Jason Tedor	1c136bb7fc	Add tier preference when mounting (#63204 ) This commit adds a tier preference when mounting a searchable snapshot. This sets a preference that a searchable snapshot is mounted to a node with the cold role if one exists, then the warm role, then the hot role, assuming that no other allocation rules are in place. This means that by default, searchable snapshots are mounted to a node with the cold role. Note that depending on how we implement frozen functionality of searchable snapshots (not pre-cached/not fully-cached), we might need to adjust this to prefer frozen if mounting a not pre-cached/fully-cached searchable snapshot versus mounting a pre-cached/fully-cached searchable snapshot. This is a later concern since neither this nor the frozen role are implemented currently.	2020-10-03 07:33:36 -04:00
Nhat Nguyen	4ef8673fdd	Fix testRestartAfterCompletion (#63211 ) We need to complete the search before closing the iterator, which internally closes the point in time; otherwise, the search will fail with a missing context error. Closes #62451	2020-10-02 18:14:42 -04:00
Lisa Cawley	69c56d55dc	[DOCS] Clarify BWC of monitoring clusters (#63151 )	2020-10-02 14:09:30 -07:00
Lisa Cawley	4de6104dae	[DOCS] Fix titles for ML APIs (#63152 ) (#63207 )	2020-10-02 14:01:01 -07:00
James Rodewig	ade91a2d9d	[DOCS] EQL: Update syntax for escaped event categories (#63202 ) (#63208 )	2020-10-02 15:19:12 -04:00
James Rodewig	a22b90d3cc	[DOCS] EQL: Replace ?"..." with """...""" for raw strings (#63191 ) (#63198 )	2020-10-02 14:03:58 -04:00
Martijn van Groningen	0b6e2b8f16	Fix enrich policy test bug. Backport #63182 to 7.x branch. The `randomEnrichPolicy(...)` helper method stores the policy and creates the source indices. If a source index already exists, because it was created for a random policy created earlier then skipping the source index fails, but that is ignored and the test continues. However if the policy has a match field that doesn't exist in the previous random policy then the mapping is never updated and the put policy api fails with the fact that the match field can't be found. This pr fixes that by executing a put mapping call in the event that the source index already exists. Closes #63126	2020-10-02 19:34:39 +02:00
Benjamin Trent	752ee0288e	[7.x] [ML] optimize delete expired snapshots (#63134 ) (#63200 ) * [ML] optimize delete expired snapshots (#63134) When deleting expired snapshots, we do an individual delete action per snapshot per job. We should instead gather the expired snapshots and delete them in a single call. This commit achieves this and a side-effect is there is less audit log spam on nightly cleanup closes https://github.com/elastic/elasticsearch/issues/62875	2020-10-02 13:24:36 -04:00
István Zoltán Szabó	8278bdb7de	[DOCS] Updates trained models API docs titles. (#63165 )	2020-10-02 10:16:19 -07:00
Lisa Cawley	57ea5d27ae	[DOCS] Add experimental tag to data frame analytics APIs (#63153 )	2020-10-02 09:44:40 -07:00
Marios Trivyzas	3cac996373	EQL: Fix syntax for event type (#63169 ) (#63194 ) Event type is actually a string value for event.category which can contain any kind of characters, or start with a digit, which currently is not supported, so we introduce the possibility to be able to use the usual syntax of " and """ for strings and raw strings. Make the grammar a bit cleaner by using the identifier only where it's actually an identifier in terms of query semantics. Fixes: #62933 (cherry picked from commit 306e1d76da3db652db57f11f847705b3995609ff)	2020-10-02 17:28:13 +02:00
markharwood	bfb3071539	Wildcard field - add normalisation of ngram tokens to reduce disk space. (#63120 ) (#63193 ) Adds normalisation of ngram tokens to reduce disk space. All punctuation becomes / char and for A-Z0-9 chars turn even codepoints to prior odd e.g. aab becomes aaa Closes #62817	2020-10-02 16:24:27 +01:00
Przemysław Witek	5370f270d7	[7.x] [ML] Ensure data frame analytics jobs don't run on a node that's too new (#62749 ) (#63175 )	2020-10-02 17:19:58 +02:00
Marios Trivyzas	9cf0722fe6	SQL: Fix exception when using CAST on inexact field (#62943 ) (#63187 ) Currently, CAST will use the first keyword subfield of a text field for an expression in WHERE clause that gets translated to a painless script which will lead to an exception thrown: ``` "root_cause": [ { "type": "script_exception", "reason": "runtime error", "script_stack": [ "org.elasticsearch.index.mapper.TextFieldMapper$TextFieldType.fielddataBuilder(TextFieldMapper.java:759)", "org.elasticsearch.index.fielddata.IndexFieldDataService.getForField(IndexFieldDataService.java:116)", "org.elasticsearch.index.query.QueryShardContext.lambda$lookup$0(QueryShardContext.java:308)", "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:101)", "org.elasticsearch.search.lookup.LeafDocLookup$1.run(LeafDocLookup.java:98)", "java.security.AccessController.doPrivileged(Native Method)", "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:98)", "org.elasticsearch.search.lookup.LeafDocLookup.get(LeafDocLookup.java:41)", "org.elasticsearch.xpack.sql.expression.function.scalar.whitelist.InternalSqlScriptUtils.docValue(InternalSqlScriptUtils.java:79)", "InternalSqlScriptUtils.cast(InternalSqlScriptUtils.docValue(doc,params.v0),params.v1)", " ^---- HERE" ], "script": "InternalSqlScriptUtils.cast(InternalSqlScriptUtils.docValue(doc,params.v0),params.v1)", "lang": "painless" } ], ``` Instead of allowing a painless translation using the first underlying keyword silently, which can be confusing, we detect such usage and throw an error early. Relates to #60178 (cherry picked from commit 7402e8267ba564e52dc672c25b262824b6048b40)	2020-10-02 16:42:59 +02:00
James Rodewig	099e5d00cc	[DOCS] EQL: Reorganize EQL syntax sections (#63179 ) (#63184 )	2020-10-02 10:25:32 -04:00
nitin2goyal	c9baadd19b	Fix to actually throttle indexing when throttling is activated (#61768 ) In #22721, the decision to throttle indexing was inadvertently flipped, so that until this commit we throttled indexing during recovery but never throttled user-initiated indexing requests. This commit fixes that to throttle user-initiated indexing requests and never throttle recovery requests. Closes #61959	2020-10-02 15:50:31 +02:00
James Rodewig	322a6b3655	[DOCS] Corrected track_total_hits def (#62830 ) (#63181 ) Co-authored-by: John Berryman <jnbrymn@github.com>	2020-10-02 09:46:16 -04:00
Joe Gallo	d172a18c95	Tidy up some ILM and SLM packages (#63146 ) Very minor refactoring, just moving some ILM and SLM classes around to decrease the total number of packages.	2020-10-02 09:30:24 -04:00
Martijn van Groningen	300e525138	Fix querying a data stream name in _index field. (#63178 ) Backport #63170 to 7.x branch. The _index field is a special field that allows using queries against the name of an index or alias. Data stream names were not included, this pr fixes that by changing SearchIndexNameMatcher (which is used via IndexFieldMapper) to also include data streams.	2020-10-02 15:29:20 +02:00
Armin Braun	1663dc7cf8	Fix GCS Repo Cleanup Tool Exception Handling (#63168 ) We recently upgraded the SDK which resulted in the storage exception to be wrapped now so we must unwrap to check for whether it's a 404 or not. Closes #63091	2020-10-02 15:26:39 +02:00
Armin Braun	022a3ef831	Split Tests out of SharedClusterSnapshotRestoreIT (#63130 ) (#63176 ) Splitting some tests out of this class that has become a catch-all for random snapshot related tests into either existing suites that fit better for these tests or one of two new suites to prevent timeouts in extreme cases (e.g. `WindowsFS` + many nodes + multiple data paths per node). No other changes to tests were made whatsoever. Closes #61541	2020-10-02 15:26:22 +02:00
Marios Trivyzas	7d74fb8577	EQL: Replace ?"..." with """...""" for unescaped strings (#62539 ) (#63174 ) Use triple double quotes enclosing a string literal to interpret it as unescaped, in order to use `?` for marking query params and avoid user confusion. `?` also usually implies regex expressions. Any character inside the `"""` beginning-closing markings is considered raw and the only thing that is not permitted is the `"""` sequence itself. If a user wants to use that, they need to resort to the normal `"` string literal and use proper escaping. Relates to #61659 (cherry picked from commit d87c2ca2eacab5552bca1e520d33cf71da40bcfd)	2020-10-02 14:58:50 +02:00
Benjamin Trent	cfcf973259	[7.x] [ML] renames /inference apis to /trained_models (#63097 ) (#63136 ) * [ML] renames /inference apis to /trained_models (#63097) This commit renames all `inference` CRUD APIs to `trained_models`. This aligns with internal terminology, documentation, and use-cases.	2020-10-02 07:34:28 -04:00
Benjamin Trent	535f8a434b	Revert "[ML] adding `baseline` field to total_feature_importance objects (#63098 ) (#63125 )" (#63144 ) This reverts commit `95242eccee`.	2020-10-02 07:03:15 -04:00
Luca Cavanna	a42a516b67	Shorten runtime field type class names (#63123 ) In the codebase there is the non-written convention that classes that extend `MappedFieldType` are generally called `*FieldType`. With this commit we adopt the same convention for runtime field types which allows us to shorten their names by removing the `Mapped` portion which is implicit.	2020-10-02 11:25:25 +02:00
Ioannis Kakavas	e91f66e22f	Ensure domain_name setting for AD realm is present (#61983 ) (#63159 ) We would only check for a null value and not for an empty string so that meant that we were not actually enforcing this mandatory setting. This commit ensures we check for both and fail accordingly if necessary, on startup	2020-10-02 12:16:08 +03:00
David Kyle	279f951700	[ML] Set parent task Id on ml expired data removers (#62854 ) (#62966 ) Setting the parent task Id (of the delete expired data action) on the ML expired data removers makes it easier to track and cancel long running tasks	2020-10-02 10:14:10 +01:00
Ioannis Kakavas	d9d024c17f	Update bcfips in plugin-cli (#63149 ) (#63157 ) In 63099 we updated the bcfips version we use in tests to 1.0.2. However, we bundle bcfips and bcpg-fips in plugin-cli and we should update this too.	2020-10-02 11:41:26 +03:00
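
The precomputation idea described in commit 77757b28e0 can be illustrated with a small sketch. This is not the `Rounding` code from that change: it is a Python illustration that handles only UTC calendar months, and the names `month_starts`, `round_down` and the `max_points` limit are hypothetical. It builds the array of "round down" points for a known range once, gives up when the array would grow too large, and then rounds each timestamp with a binary search.

```python
import calendar
from bisect import bisect_right
from datetime import datetime, timezone


def month_starts(min_ms, max_ms, max_points=128):
    """Precompute the start (epoch millis, UTC) of every calendar month that
    overlaps [min_ms, max_ms].

    Returns None when more than max_points entries would be needed, so the
    caller can fall back to rounding each value individually. max_points is
    a hypothetical limit chosen for this sketch.
    """
    first = datetime.fromtimestamp(min_ms // 1000, tz=timezone.utc)
    year, month = first.year, first.month
    points = []
    while True:
        # Midnight UTC on the first day of the current month, in millis.
        start_ms = calendar.timegm((year, month, 1, 0, 0, 0)) * 1000
        if start_ms > max_ms:
            break
        if len(points) == max_points:
            return None  # array too large: give up on the optimization
        points.append(start_ms)
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return points


def round_down(points, ts_ms):
    """Round ts_ms down to the closest precomputed month start at or before it."""
    i = bisect_right(points, ts_ms) - 1
    if i < 0:
        raise ValueError("timestamp precedes the precomputed range")
    return points[i]
```

A caller would invoke `month_starts` once per known value range and then call `round_down` for every timestamp, which replaces per-value calendar arithmetic with a lookup in a sorted array.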

54201 Commits