OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-19 19:35:02 +00:00

Author	SHA1	Message	Date
Ignacio Vera	3536f7f7c2	Initialize BitArray storage as number of bits (#62327 ) (#62354 )	2020-09-15 08:34:22 +02:00
Armin Braun	c81a076f5a	Improve Efficiency of ClusterApplierService Iteration (#62282 ) (#62350 ) The complexity of removing a timeout listener was `O(n)`, which means that with many queued-up CS update tasks (such as an avalanche of dynamic mapping updates) timing out N tasks has quadratic complexity, which was observed to be an issue in practice. This PR makes the complexity of timing out a task `O(1)` and generally simplifies the iteration logic of listeners and appliers to be a little more efficient and inline better.	2020-09-15 05:59:48 +02:00
Lee Hinman	6b2af30a62	[7.x] Add "synthetics--" templates for synthetics fleet data (#62193 ) (#62346 ) * Add "synthetics--" templates for synthetics fleet data For the Elastic Agent we currently have `logs` and `metrics`, however, synthetic data doesn't belong with those and thus we should have a place for it to live. This would be data reported from heartbeat and under the 'monitoring' category. This commit adds a composable index template for `synthetics--` indices similar to the work in #56709 and #57629. Resolves #61665	2020-09-14 17:14:34 -06:00
Julie Tibshirani	f56ce4f39b	Fix failure in InnerHitBuilderTests around 'fields' option. (#62344 ) The case InnerHitBuilderTests#testEqualsAndHashcode creates a copy of the object by serializing + deserializing it, then applies a modification. If the 'fields' list is empty, then deserializing it results in Collections.emptyList. Because this is immutable, then modifying it can throw an UnsupportedOperationException. This PR takes the same approach as for docvalue_fields, where we create a new list instead of trying to add to an empty one.	2020-09-14 15:39:03 -07:00
Julie Tibshirani	9332a9c74b	Add the fields option to the search API docs. (#62260 )	2020-09-14 13:44:44 -07:00
Julie Tibshirani	4a19bdb2ea	Support the 'fields' option in inner_hits and top_hits. (#62337 ) This PR adds support for the 'fields' option in the following places: * Anytime `inner_hits` is used, for both fetching nested/child docs and field collapsing * The `top_hits` aggregation Addresses #61949.	2020-09-14 11:51:45 -07:00
David Roberts	3d5c13f559	[ML] Add an assertion on annotations mappings to upgrade test (#62331 ) The annotations index is not covered by the comparison between mappings and templates, as it does not use an index template. This commit adds an assertion on annotations index mappings that will fail if the mappings are not upgraded as expected. Backport of #62325	2020-09-14 18:46:35 +01:00
James Rodewig	ec335c7c34	[DOCS] Fix capitalization for several headings (#62324 ) (#62329 )	2020-09-14 12:35:15 -04:00
David Turner	9acd2fd1fd	Minor cleanups to BytesReferenceStreamInput (#62302 ) Followup to #61681: - reuse the current iterator in `reset()` if possible - simplify some integer-overflow-avoidance in `skip()` - clarify some comments - address some IntelliJ warnings	2020-09-14 17:02:27 +01:00
David Roberts	e4275f3749	[ML] Use utility thread pool for memory estimation (#62314 ) The job comms thread pool is intended for the long-running job processes that do anomaly detection or data frame analytics and count towards job count and memory limits. This commit moves the short-lived memory estimation processes to the ML utility thread pool. Although this doesn't matter in most cases, at the limits of scale it could mean that memory estimations would get in the way of starting jobs, or would queue up for an excessive period of time while waiting for jobs to finish.	2020-09-14 16:47:12 +01:00
Lee Hinman	bf9651c635	[7.x] Add "content" tier as new "data_content" role (#62247 ) (#62322 ) Similar to the work in #60994 where we introduced the `data_hot`, `data_warm`, etc node roles. This introduces a new `data_content` node role to be used for the Content tier. Currently this tier is not used anywhere, but subsequent work will use this tier. Relates to #60848	2020-09-14 09:42:57 -06:00
Benjamin Trent	13c193a9fc	[Enrich] add logging for when there are search/bulk failures on _execute (#62313 ) (#62320 ) When calling `_execute` there is a chance that there will be bulk indexing failures or search failures. These will result in the call failing overall. But, no information is provided for troubleshooting the failure. This commit adds logging to indicate the number of failures, and new debug level logging so that failure details can be determined if necessary. closes https://github.com/elastic/elasticsearch/issues/60491	2020-09-14 11:20:13 -04:00
Christoph Büscher	e2eada2498	Fix disabling `allow_leading_wildcard` (#62300 ) (#62318 ) Disabling the `query_string` query's `allow_leading_wildcard` parameter didn't work after a change probably introduced in #60959, because the various field types' `wildcardQuery` methods don't check the leading characters like QueryParserBase#getWildcardQuery does. This PR adds the missing check before calling the field type's wildcard-generating method as well. Closes #62267	2020-09-14 17:13:17 +02:00
Alan Woodward	5358cee29c	Cut over more mapping tests to MapperServiceTestCase (#62312 ) Shaves a few more seconds off the build.	2020-09-14 16:00:37 +01:00
James Rodewig	f4dfdc9d59	[DOCS] Fix typo in rollup groups docs (#62269 ) (#62316 ) Co-authored-by: AndyHunt66 <andrew.hunt@elastic.co>	2020-09-14 10:42:58 -04:00
Varun Sharma	65ec94f8a3	[DOCS] Fix node roles typo (#62307 ) (#62306 )	2020-09-14 10:17:30 -04:00
James Rodewig	3ab28e84c6	[DOCS] EQL: Update keyword family field types (#62254 ) (#62310 ) Updates several keyword/constant keyword references to use any field type in the keyword family.	2020-09-14 09:51:34 -04:00
James Rodewig	af13c9802d	[7.x] [DOCS] Add PIT to search after docs (#61593 ) (#62101 )	2020-09-14 09:13:23 -04:00
Armin Braun	95766da345	Save Some Allocations when Working with ClusterState (#62060 ) (#62303 ) Just a number of obvious spots, found while investigating snapshot cluster state update performance, where we were allocating duplicate empty structures or were otherwise inefficient.	2020-09-14 15:09:54 +02:00
Tanguy Leroux	9e38dd0254	Deprecate Repository Stats API (#62297 ) (#62308 ) This commit deprecates the Repository Stats API added in 7.8.0 as an experimental API behind a feature flag. The goal is to deprecate this API in 7.10.0 and remove it in a follow up PR in 8.0.0. This API is now superseded by the Repositories Metering API.	2020-09-14 14:57:38 +02:00
Armin Braun	875af1c976	Remove Dead Variable in BlobStoreIndexShardSnapshots. (#62285 ) (#62295 ) This was never used. Co-authored-by: Howard <danielhuang@tencent.com>	2020-09-14 13:40:39 +02:00
David Roberts	d8288526d9	[ML] Add null checks for C++ log handler (#62238 ) It has been observed that if the normalizer process fails to connect to the JVM then this causes a null pointer exception as the JVM tries to close the native process object. The accessors and close methods of the native process class that access the C++ log handler should not assume that it connected correctly.	2020-09-14 11:28:26 +01:00
Martijn van Groningen	c88f4174ec	Fix resolve index data streams yaml test. (#62221 ) Closes #62190	2020-09-14 08:43:58 +02:00
Nhat Nguyen	7779c1f703	Ensure to release async search iterator in tests We need to close an async search response iterator to release the related point in time if the test uses pit.	2020-09-12 12:04:10 -04:00
Leaf-Lin	5ea5cc5b54	[DOCS] Fix typo in update by query docs (#62263 ) This page is referring to update by query, not delete by query.	2020-09-11 09:48:24 -04:00
Martijn van Groningen	1bb094a27b	Return 404 when deleting a non existing data stream (#62224 ) Backport of #62059 to 7.x branch. Return a 404 http status code when attempting to delete a non existing data stream. However only return a 404 when targeting a data stream without any wildcards. Closes #62022	2020-09-11 15:36:05 +02:00
Nhat Nguyen	b118697368	Adjust BWC rest version for point in time (#62264 ) Relates #61872	2020-09-11 08:54:11 -04:00
Luca Cavanna	b5e1e652c1	Remove unused import	2020-09-11 10:19:01 +02:00
Luca Cavanna	53bf057a53	[TEST] avoid double null check in TransportSearchActionTests	2020-09-11 10:10:09 +02:00
Luca Cavanna	3d3a1b4bc2	Tweak OpenPointInTimeRequest createTask This commit addresses a super minor misalignment with master, applying exactly the same change that was made as part of #62057, which was backported before point in time APIs were backported.	2020-09-11 10:06:35 +02:00
Nhat Nguyen	aafb2cb812	Support point in time cross cluster search (#61827 ) This commit integrates point in time into cross cluster search. Relates #61062 Closes #61790	2020-09-10 19:25:48 -04:00
Nhat Nguyen	808c8689ac	Always include the matching node when resolving point in time (#61658 ) If shards are relocated to new nodes, then searches with a point in time will fail, although a pit keeps search contexts open. This commit solves this problem by reducing info used by SearchShardIterator and always including the matching nodes when resolving a point in time. Closes #61627	2020-09-10 19:25:48 -04:00
Nhat Nguyen	035f0638f4	Support point in time in async_search (#61560 ) This commit integrates point in time into async search and ensures that it works correctly with security enabled. Relates #61062	2020-09-10 19:25:48 -04:00
Nhat Nguyen	063a6d047c	Release search context when scroll keep_alive is too large (#62179 ) Previously, we closed related search contexts if the keep_alive of a scroll was too large, but we accidentally changed this behavior in #62061.	2020-09-10 19:25:48 -04:00
Nhat Nguyen	2eb1e8bc84	Make keep alive of point in time optional in search (#62184 ) A search request should not be required to extend the keep_alive of a point in time. This change makes that parameter optional.	2020-09-10 19:25:48 -04:00
Jim Ferenczi	3fc35aa76e	Shard Search Scroll failures consistency (#62061 ) Today some uncaught shard failures such as RejectedExecutionException skip the release of the shard context and let subsequent scroll requests access the same shard context again. Depending on how the other shards advanced, this behavior can lead to missing data since scrolls always move forward. In order to avoid hidden data loss, this commit ensures that we always release the context of shard search scroll requests whenever a failure occurs locally. The shard search context will no longer exist in subsequent scroll requests, which will lead to consistent shard failures in the responses. This change also modifies the retry tests of the reindex feature. Reindex retries scroll search requests that contain a shard failure and moves on whenever the failure disappears. That is not compatible with how scrolls work and can lead to missing data as explained above. That means that reindex will now report scroll failures when search rejections happen during the operation instead of silently skipping documents. Finally, this change removes an old TODO that was fulfilled with #61062.	2020-09-10 19:25:48 -04:00
Jim Ferenczi	4d528e91a1	Ensure validation of the reader context is executed first (#61831 ) This change makes sure that the reader context is validated (`SearchOperationListener#validateReaderContext`) before any other operation and that it is correctly recycled or removed at the end of the operation. This commit also fixes a race condition bug that would allocate the security reader for scrolls more than once. Relates #61446 Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>	2020-09-10 19:25:48 -04:00
Luca Cavanna	44bd4a6004	Fix point in time toXContent impl (#62080 ) PointInTimeBuilder is a ToXContentObject yet it does not print out a whole object (it is rather a fragment). Also, when it is printed out as part of SearchSourceBuilder, an error is thrown because pit should be wrapped into its own object. This commit fixes this and adds tests for it.	2020-09-10 19:25:47 -04:00
Nhat Nguyen	3d69b5c41e	Introduce point in time APIs in x-pack basic (#61062 ) This commit introduces a new API that manages point-in-times in x-pack basic. Elasticsearch pit (point in time) is a lightweight view into the state of the data as it existed when initiated. A search request by default executes against the most recent point in time. In some cases, it is preferred to perform multiple search requests using the same point in time. For example, if refreshes happen between search_after requests, then the results of those requests might not be consistent as changes happening between searches are only visible to the more recent point in time. A point in time must be opened before being used in search requests. The `keep_alive` parameter tells Elasticsearch how long it should keep a point in time around. ``` POST /my_index/_pit?keep_alive=1m ``` The response from the above request includes an `id`, which should be passed to the `id` of the `pit` parameter of search requests. ``` POST /_search { "query": { "match" : { "title" : "elasticsearch" } }, "pit": { "id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", "keep_alive": "1m" } } ``` Point-in-times are automatically closed when the `keep_alive` has elapsed. However, keeping point-in-times has a cost; hence, point-in-times should be closed as soon as they are no longer used in search requests. ``` DELETE /_pit { "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA=" } ``` #### Notable works in this change: - Move the search state to the coordinating node: #52741 - Allow searches with a specific reader context: #53989 - Add the ability to acquire readers in IndexShard: #54966 Relates #46523 Relates #26472 Co-authored-by: Jim Ferenczi <jimczi@apache.org>	2020-09-10 19:25:47 -04:00
Nhat Nguyen	87c889f9c9	CCR should retry on CircuitBreakingException (#62013 ) CCR shard follow task can hit CircuitBreakingException on the leader cluster (read changes requests) or the follower cluster (bulk requests). CCR should retry on CircuitBreakingException as it's a transient error.	2020-09-10 17:23:47 -04:00
Nik Everett	ac23380560	Fix some query methods in runtime fields We were missing a few `@Override` annotations in runtime fields which let us drift from the methods we were supposed to override. Oops. This adds them and links the methods.	2020-09-10 17:06:05 -04:00
Armin Braun	e0a81f7d14	Speed up Version Checks (#62216 ) (#62253 ) The `fromId` method would show up in profiling and JIT analysis as not-inlinable because it's too large in many of the contexts it's used in, and it was consuming a surprising amount of cycles for computing the min compat versions. -> extract cold path from `fromId` to make JIT happy and cache minimum compatible versions to fields.	2020-09-10 22:57:06 +02:00
Luca Cavanna	39e59d6edf	Share more query execution code for runtime fields (#62229 ) For runtime fields we have written quite a few Lucene queries that work against runtime values that are the result of the execution of the different script contexts that runtime fields support. They all (but one) share the same main logic: use a two phase iterator, iterate over all documents, and decide whether the current doc matches or not based on what the script returns. I went ahead and shared this bit of code in the base class for all queries on top of runtime fields.	2020-09-10 20:27:49 +02:00
Luca Cavanna	cd9774d8cb	Runtime fields: rename emitValue function to emit (#62191 ) We decided to shorten the emitValue function to emit, given that emit is self-explanatory. Relates to #59332	2020-09-10 20:27:49 +02:00
Armin Braun	25db5acb0d	Simplify TimeValue Serialization (#62023 ) (#62248 ) This can be done without map lookups => less code and much smaller methods => better inlining potentially.	2020-09-10 20:16:21 +02:00
James Rodewig	df3a7c0c8d	[DOCS] Fix ILM force merge codec param (#62243 ) (#62251 )	2020-09-10 14:08:04 -04:00
James Rodewig	2b50d7e170	[DOCS] Fix ILM attribute (#62245 ) (#62249 )	2020-09-10 14:07:31 -04:00
Francisco Fernández Castaño	21303e8e15	Take into account sas tokens while metering put object requests on azure (#62244 ) Backport of #62225 Closes #62208	2020-09-10 19:47:58 +02:00
James Rodewig	09b167c8dd	[DOCS] Add redirects for removed searchable snapshot APIs (#62236 ) (#62237 )	2020-09-10 11:40:24 -04:00
Adam Locke	a8e3796d96	[DOCS] Adding link from SAML docs to ADFS blog. (#62006 ) (#62232 )	2020-09-10 11:15:15 -04:00

53634 Commits