OpenSearch

Commit Graph

Author	SHA1	Message	Date
David Turner	e14d9c9514	Introduce cache index for searchable snapshots (#61595 ) If a searchable snapshot shard fails (e.g. its node leaves the cluster) we want to be able to start it up again on a different node as quickly as possible to avoid unnecessarily blocking or failing searches. It isn't feasible to fully restore such shards in an acceptably short time. In particular we would like to be able to deal with the `can_match` phase of a search ASAP so that we can skip unnecessary waiting on shards that may still be warming up but which are not required for the search. This commit solves this problem by introducing a system index that holds much of the data required to start a shard. Today(*) this means it holds the contents of every file with size <8kB, and the first 4kB of every other file in the shard. This system index acts as a second-level cache, behind the first-level node-local disk cache but in front of the blob store itself. Reading chunks from the index is slower than reading them directly from disk, but faster than reading them from the blob store, and is also replicated and accessible to all nodes in the cluster. (*) the exact heuristics for what we should put into the system index are still under investigation and may change in future. This second-level cache is populated when we attempt to read a chunk which is missing from both levels of cache and must therefore be read from the blob store. We also introduce `SearchableSnapshotsBlobStoreCacheIntegTests` which verify that we do not hit the blob store more than necessary when starting up a shard that we've seen before, whether due to a node restart or because a snapshot was mounted multiple times. Backport of #60522 Co-authored-by: Tanguy Leroux <tlrx.dev@gmail.com>	2020-08-27 06:38:32 +01:00
Ryan Ernst	e60c74240a	Add base precommit task to all java projects (#61439 ) This commit adds java compilation to the base precommit task, and adds that to the java plugin. This further reduces dependence on the build plugin.	2020-08-26 17:21:00 -07:00
Lisa Cawley	6d6f5d4acc	[DOCS] Per-partition categorization (#61506 )	2020-08-26 17:10:01 -07:00
James Rodewig	580ef8eb0c	[DOCS] Document static field cache settings (#61424 ) (#61606 )	2020-08-26 17:29:15 -04:00
Jason Tedor	9840fd1485	Add Lucene 8.6.0 memory leak as a known issue (#61603 ) This commit adds a note to the known issues docs that Lucene 8.6.0 contains a memory leak that manifests in Elasticsearch as a slow memory leak.	2020-08-26 15:45:14 -04:00
James Rodewig	462754e4e6	[DOCS] Reorg field data types page (#61117 ) (#61599 )	2020-08-26 14:24:09 -04:00
James Rodewig	8a6ecd5bfc	[DOCS] Fix EQL syntax admon	2020-08-26 13:39:42 -04:00
James Rodewig	20053bfd8c	[DOCS] Remove dupe EQL fn/pipe TOC	2020-08-26 12:45:09 -04:00
Jay Modi	34c4fc3b91	Remove tasks module to define tasks system index (#61588 ) This commit removes the tasks module that only existed to define the tasks result index, `.tasks`, as a system index. The definition for the tasks results system index descriptor is moved to the `SystemIndices` class with a check that no other plugin or module attempts to define an entry with the same source. Additionally, this change also makes the pattern for the tasks result index a wildcard pattern since we will need this when the index is upgraded (reindex to new name and then alias that to .tasks). Backport of #61540	2020-08-26 09:48:23 -06:00
David Turner	f2dc664228	Remove dead code in EsExecutors (#61574 ) Removes a couple of unused methods.	2020-08-26 16:08:36 +01:00
Dimitris Athanasiou	3ed65eb418	[7.x][ML] Recover data frame extraction search from latest sort key (#61544 ) (#61572 ) If a search failure occurs during data frame extraction we catch the error and retry once. However, we retry another search that is identical to the first one. This means we will re-fetch any docs that were already processed. This may result either in training a model using duplicate data or, in the case of outlier detection, in an error message that the process received more records than it expected. This commit fixes this issue by tracking the latest doc's sort key and then using that in a range query in case we restart the search due to a failure. Backport of #61544 Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>	2020-08-26 17:54:00 +03:00
Benjamin Trent	a6e7a3d65f	[7.x] [ML] write warning if configured memory limit is too low for analytics job (#61505 ) (#61528 ) Backports the following commits to 7.x: [ML] write warning if configured memory limit is too low for analytics job (#61505) Having `_start` fail when the configured memory limit is too low can be frustrating. We should instead warn the user that their job might not run properly if their configured limit is too low. It might be that our estimate is too high, and their configured limit works just fine.	2020-08-26 10:35:38 -04:00
Przemyslaw Gomulka	9f566644af	Do not create two loggers for DeprecationLogger backport(#58435 ) (#61530 ) DeprecationLogger's constructor should not create two loggers. It was taking parent logger instance, changing its name with a .deprecation prefix and creating a new logger. Most of the time parent logger was not needed. It was causing Log4j to unnecessarily cache the unused parent logger instance. depends on #61515 backports #58435	2020-08-26 16:04:02 +02:00
Rene Groeschke	3a8cfdc1f5	Extract distribution archive checks into plugin (7.x backport) (#61567 ) - Added test coverage - Removes build script cluttering - Splits archive building and archive checking logic - only rely on boost for now for ML licenses(tbd) - Use Gradle build-in untar and unzip support * Handle dynamic versions in func tests assertions	2020-08-26 15:04:12 +02:00
Rene Groeschke	fac66a7528	Rework test cluster distribution handling (#61407 ) (#61566 ) Driven by this issue https://github.com/elastic/elasticsearch/pull/60969#issuecomment-674962158 we apply some rework on how we handle distributions in our test cluster setups: - If no custom modules, plugins or extra jar files are declared we do not create a cluster specific distro folder and use the origin distribution folder instead. - If a custom distribution folder is required, we fallback to file copy when hard linking is not supported	2020-08-26 15:03:52 +02:00
Ioannis Kakavas	283eaabc71	[7.x] Refactor SamlAuthenticationIT (#57162 ) (#61568 ) Refactor the tests to not require a mock HTTP Server. This has been the cause of flakiness and removing it doesn't affect the logical coverage of this suite. The "fake UI" is now simulated by an http client that makes the necessary requests to Elasticsearch APIs.	2020-08-26 15:34:56 +03:00
James Rodewig	4701832879	[DOCS] Add 7.9 breaking change for built-in templates (#61549 ) (#61558 )	2020-08-26 08:10:59 -04:00
Przemysław Witek	11c2710e7f	[7.x] [ML] Do not mark the DFA job as FAILED when a failure occurs after the node is shutdown (#61331 ) (#61526 )	2020-08-26 09:53:13 +02:00
lcawl	5fa839b906	[DOCS] Fix typo in update anomaly detection job API	2020-08-25 17:13:38 -07:00
Igor Motov	f70a59971a	[7.x] Add rate aggregation (#61369 ) (#61554 ) Adds a new rate aggregation that can calculate a document rate for buckets of a date_histogram. Closes #60674	2020-08-25 17:39:00 -04:00
debadair	82585107aa	updated shard limit doc (#56496 ) (#61509 ) * updated shard limit doc As the documentation was not so clear. I have updated saying this limit includes open indices with unassigned primaries and replicas count towards the limit. * [DOCS] Incorporated edits. Co-authored-by: Deb Adair <debadair@elastic.co> Co-authored-by: gadekishore <50092970+gadekishore@users.noreply.github.com>	2020-08-25 14:24:47 -07:00
James Rodewig	e0843571c4	[DOCS] Fix typo in search your data docs	2020-08-25 17:01:08 -04:00
Nik Everett	87cf81e179	Migrate some more mapper test cases (#61507 ) (#61552 ) Migrate some more mapper test cases from `ESSingleNodeTestCase` to `MapperTestCase`.	2020-08-25 15:27:26 -04:00
markharwood	8b56441d2b	Search - add case insensitive support for regex queries. (#59441 ) (#61532 ) Backport to add case insensitive support for regex queries. Forks a copy of Lucene’s RegexpQuery and RegExp from Lucene master. This can be removed when 8.7 Lucene is released. Closes #59235	2020-08-25 17:18:59 +01:00
James Rodewig	e3d23c34ab	[DOCS] Document static HTTP settings (#61429 ) (#61536 )	2020-08-25 11:27:05 -04:00
James Rodewig	5ad0ce49e1	[DOCS] Remove response params for #61428 (#61524 ) (#61534 )	2020-08-25 11:17:56 -04:00
Brandon Morelli	fade7408cd	[DOCS] Fix link to quartz crontrigger tutorial (#61531 )	2020-08-25 10:49:00 -04:00
Przemyslaw Gomulka	f3f7d25316	Header warning logging refactoring backport(#55941 ) (#61515 ) Splitting DeprecationLogger into two. HeaderWarningLogger - responsible for adding response warning headers and ThrottlingLogger - responsible for limiting the duplicated log entries for the same key (previously deprecateAndMaybeLog). Introducing a ThrottlingAndHeaderWarningLogger which is a base for other common logging usages where both response warning header and logging throttling was needed. relates #55699 relates #52369 backports #55941	2020-08-25 16:35:54 +02:00
Costin Leau	bff3c7470e	EQL: Replace SearchHit in response with Event (#61428 ) (#61522 ) The building block of the eql response is currently the SearchHit. This is a problem since it is tied to an actual search, and thus has scoring, highlighting, shard information and a lot of other things that are not relevant for EQL. This becomes a problem when doing sequence queries since the response is not generated from one search query and thus there are no SearchHits to speak of. Emulating one is not just conceptually incorrect but also problematic since most of the data is missed or made-up. As such this PR introduces a simple class, Event, that maps nicely to the terminology while hiding the ES internals (the use of SearchHit or GetResult/GetResponse depending on the API used). Fix #59764 Fix #59779 Co-authored-by: Igor Motov <igor@motovs.org> (cherry picked from commit 997376fbe6ef2894038968842f5e0635731ede65)	2020-08-25 17:32:42 +03:00
Armin Braun	f22ddf822e	Some Optimizations around BytesArray (#61183 ) (#61511 ) * Faster `equals` for `BytesArray` which is nice since with this change we use it for the search cache * Lighter `StreamInput` for `BytesArray` that should save memory and some indirection relative to the one on the abstract bytes reference * Lighter `writeTo` implementation * Build a `BytesArray` instead of a PagedBytesReference whenever possible to save indirection and memory	2020-08-25 07:13:39 +02:00
Armin Braun	806dfcfcf7	Speed up Compression Logic by Pooling Resources (#61358 ) (#61495 ) This is mostly motivated by the performance issues we are seeing around the GET mappings REST API which (in case of a large number of indices) will create decompressing streams in a hot loop which takes a significant amount of time for the system calls involved in instantiating deflaters and inflaters. Also, this fixes a leaked deflater when deserializing cached repository data.	2020-08-25 04:01:55 +02:00
Armin Braun	16b932c1dc	Remove Potentially Expensive Use of BytesReference.toBytesRef (#61415 ) (#61503 ) This method might have to materialize all the bytes in a reference into a fresh `byte[]`. Using the stream is much safer and only trivially more expensive + in most cases we now run the fast path via `BytesArray` anyway.	2020-08-24 23:58:21 +02:00
David Kyle	539cf914bc	[ML] handle new model metadata stream from native process (#59725 ) (#61251 ) This adds the serialization handling for the new model_metadata object from the native process. Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>	2020-08-24 15:52:13 -04:00
James Rodewig	2400098a52	[DOCS] Fix typo in profile API docs (#61445 ) (#61501 ) Co-authored-by: James Rodewig <40268737+jrodewig@users.noreply.github.com> Co-authored-by: shashikumarec088 <shashikumarec088@gmail.com>	2020-08-24 15:30:18 -04:00
Nhat Nguyen	baa685c2d9	Fix anchor doc for msearch cancellation paragraph Relates #61418	2020-08-24 15:14:17 -04:00
Nhat Nguyen	f34d3efae7	Add cancellation doc for multi search (#61418 ) Relates #61337	2020-08-24 15:14:05 -04:00
Nhat Nguyen	d47bbbafe0	Cancel multisearch when http connection closed (#61399 ) Relates #61337	2020-08-24 15:12:54 -04:00
Nhat Nguyen	23a0f8b617	Detect and optimize noop of update index settings (#61348 ) This optimization is more relevant in the context of CCR. When a node in the follower cluster leaves, we reallocate the shard-follow tasks on that node to other nodes. The new tasks will overwhelm the follower cluster with many put-mapping, update-settings requests, although most of them are noop. This change detects and optimizes the noop update-settings requests.	2020-08-24 15:08:53 -04:00
James Rodewig	439fa46735	[DOCS] Remove collapsible sections in EQL fn docs (#61498 ) (#61499 )	2020-08-24 14:41:27 -04:00
Benjamin Trent	6ffcc02fb9	Muting test o.e.t.t.ESTestCaseTests.testRandomDateFormatterPattern (#61497 )	2020-08-24 13:58:09 -04:00
Nik Everett	f3b6d49ae1	Migrate server mapper tests to new MapperTestCase (#61378 ) (#61490 ) This continues #61301, migrating all of the mappers in `server` to the new `MapperTestCase` which is nicer than `FieldMapperTestCase` because it doesn't depend on all of Elasticsearch.	2020-08-24 13:33:35 -04:00
James Rodewig	17b5a0d25e	[DOCS] Combine `Search your data` files (#61477 ) (#61486 ) No-op changes to: * Move `Search your data` source files into the same directory * Rename `Search your data` source files based on page ID * Remove unneeded includes * Remove the `Request` dir	2020-08-24 13:08:00 -04:00
Benjamin Trent	1ae2923632	[7.x] [ML] adding docs + hlrc for data frame analysis feature_processors (#61149 ) (#61493 ) * [ML] adding docs + hlrc for data frame analysis feature_processors (#61149) Adds HLRC and some docs for the new feature_processors field in Data frame analytics. Co-authored-by: Przemysław Witek <przemyslaw.witek@elastic.co> Co-authored-by: Lisa Cawley <lcawley@elastic.co>	2020-08-24 12:56:21 -04:00
Armin Braun	d05649bfae	Fix PutPolicyRequestTests.testFromXContent (#61485 ) (#61494 ) We only ever support `JSON` for the query source format in practice. The reason this test worked before is a bug in xcontent parsing that parses empty maps out of streams of the wrong format. Closes #61483	2020-08-24 18:52:05 +02:00
Armin Braun	bb4d97073c	Remove Favicon Special Path in RestController (#61460 ) (#61487 ) It's unnecessary (and adds one string comparison to every request) to special case the favicon so I added it as a normal REST handler to simplify the code.	2020-08-24 18:36:23 +02:00
James Rodewig	2b852388c5	[DOCS] Fix hyphenation for "time series" (#61472 ) (#61481 )	2020-08-24 11:18:07 -04:00
Dimitris Athanasiou	618dd65d5f	[7.x][ML] Add debug logging for field caps request during DF Analytics (#61459 ) (#61478 ) Adds debug logging for the request and the response that is getting field capabilities during a data frame analytics job. Backport of #61459	2020-08-24 18:01:30 +03:00
James Rodewig	5992bb0507	[DOCS] Fix ingest script compilation rate and cache size (#61468 ) (#61479 )	2020-08-24 10:46:44 -04:00
Dimitris Athanasiou	18ca8a6be3	[7.x][ML] Remove redundant logging for creation of annotations index (#61461 ) (#61475 ) This commit removes the log info message "Created ML annotations index and aliases". The message comes in addition to elasticsearch's index creation logging and it does not add to it. In addition, since #61107 that message may be logged multiple times. Backport of #61461	2020-08-24 17:46:29 +03:00
Lisa Cawley	52b12a07c4	[DOCS] Document static machine learning settings (#61382 )	2020-08-24 07:35:38 -07:00

53371 Commits All Branches Search
