OpenSearch

Commit Graph

Author	SHA1	Message	Date
Martijn van Groningen	aff0c9babc	This commit merges (#48040 ) the enrich-7.x feature branch, which is a backport merge and adds a new ingest processor, named enrich processor, that allows documents being ingested to be enriched with data from other indices. Besides a new enrich processor, this PR adds several APIs to manage an enrich policy. An enrich policy is in charge of making the data from other indices available to the enrich processor in an efficient manner. Related to #32789	2019-10-15 17:31:45 +02:00
jimczi	b858e19bcc	Revert #46598 that breaks the cachability of the sub search contexts.	2019-10-15 09:40:59 +02:00
Martijn van Groningen	cc4b6c43b3	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-15 07:23:47 +02:00
Jim Ferenczi	ef02a736ca	Don't apply the plugin's reader wrapper in can_match phase (#47816 ) This change modifies the local execution of the `can_match` phase to not apply the plugin's reader wrapper (if it is configured) when acquiring the searcher. We must ensure that the phase runs quickly and since we don't know the cost of applying the wrapper it is preferable to avoid it entirely. The can_match phase can afford false positives so it is also safe for the builtin plugins that use this functionality. Closes #46817	2019-10-14 13:07:05 +02:00
Martijn van Groningen	d4901a71d7	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-14 10:27:17 +02:00
Nhat Nguyen	8180cf1e68	Mute testDoNotInfinitelyWaitForMapping Tracked at #47974	2019-10-13 22:06:50 -04:00
Nhat Nguyen	2995d4a9c0	Sequence number based replica allocation (#46959 ) With this change, shard allocation prefers allocating replicas on a node that already has a copy of the shard that is as close as possible to the primary, so that it is as cheap as possible to bring the new replica in sync with the primary. Furthermore, if we find a copy that is identical to the primary then we cancel an ongoing recovery because the new copy which is identical to the primary needs no work to recover as a replica. We no longer need to perform a synced flush before performing a rolling upgrade or full cluster start with this improvement. Closes #46318	2019-10-13 22:06:50 -04:00
Nhat Nguyen	4f06225928	Avoid unneeded refresh with concurrent realtime gets (#47895 ) This change should reduce refreshes for a use-case where we perform multiple realtime gets at the same time on an active index. Currently, we only call refresh if the index operation is still on the versionMap. However, at the time we call refresh, that operation might be already or will be included in the latest reader. Hence, we do not need to refresh. Adding another lock here is not an issue as the refresh is already sequential.	2019-10-13 20:08:21 -04:00
Nhat Nguyen	4c1bb210cb	Force flush in translog retention policy test (#47879 ) If we roll translog but do not index, then a flush without force is a noop. In this case, the number of retained translog files will be higher than the value specified by the retention policy. Closes #4741	2019-10-13 20:08:21 -04:00
Przemyslaw Gomulka	6ab58de7ef	[7.x] Enable ResolverStyle.STRICT for java formatters backport(#46675 ) (#47913 ) Joda was using ResolverStyle.STRICT when parsing. This means that a date will be validated to have a correct year, month-of-year and day-of-month. However, we also want to make it work with Year-Of-Era as Joda used to, hence the custom temporalquery.localdate in DateFormatters.from. Within DateFormatters we use the correct uuuu year instead of yyyy year of era. Worth noting: if yyyy (without an era) is used in code, the parsing result will be a TemporalAccessor which will fail to be converted into LocalDate. We mostly use DateFormatters.from so this takes care of this. If possible the uuuu format should be used.	2019-10-11 21:19:56 +02:00
Christoph Büscher	2ef12c37f5	Add builder for distance_feature to QueryBuilders (#47846 ) The QueryBuilders convenience class is currently missing a shortcut to construct a DistanceFeatureQueryBuilder, which is added here. Closes #47767	2019-10-11 18:20:01 +02:00
Alan Woodward	ec9198d0e2	Adjust Version.V_6_8_4 to refer to Lucene 7.7.2 (#47926 ) 6.8.4 will ship with Lucene 7.7.2, so we need to change our version settings to reflect this. Relates #47901	2019-10-11 17:01:42 +01:00
David Turner	ba62eb3dce	Allow truncation of clean translog (#47866 ) Today the `elasticsearch-shard remove-corrupted-data` tool will only truncate a translog it determines to be corrupt. However there may be other cases in which it is desirable to truncate the translog, for instance if an operation in the translog cannot be replayed for some reason other than corruption. This commit adds a `--truncate-clean-translog` option to skip the corruption check on the translog and blindly truncate it.	2019-10-11 15:48:12 +01:00
Henning Andersen	a0d0866f59	Shrink should not touch max_retries (#47719 ) Shrink would set `max_retries=1` in order to avoid retrying. This however sticks to the shrunk index afterwards, causing issues when a shard copy later fails to allocate just once. Avoiding a retry of a shrink makes sense since there is no new node to allocate to and a retry will likely fail again. However, the downside of having max_retries=1 afterwards outweighs the benefit of not retrying the failed shrink a few times. This change ensures shrink no longer sets max_retries and also makes all resize operations (shrink, clone, split) leave the setting at default value rather than copy it from source.	2019-10-11 14:22:56 +02:00
Przemyslaw Gomulka	0c439fe495	[7.x] Allow partial parsing dates (#47872 ) backport(#46814 ) Enable partial parsing of date part. This is making the behaviour in java.time implementation the same as with joda. 2018, 2018-01 and 2018-01-01 are all valid dates for date_optional_time or strict_date_optional_time closes #45284 closes #47473	2019-10-11 11:17:19 +02:00
Zachary Tong	2de3411c9c	Make sibling pipeline agg ctors protected (#42808 ) SiblingPipelineAggregator is a public interface, but the ctor was package-private. These should be protected so that plugin authors can extend and implement their own sibling pipeline agg.	2019-10-10 12:31:14 -04:00
Martijn van Groningen	102016d571	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-10 14:44:05 +02:00
Jim Ferenczi	bd6e2592a7	Remove the SearchContext from the highlighter context (#47733 ) Today built-in highlighter and plugins have access to the SearchContext through the highlighter context. However most of the information exposed in the SearchContext is not needed and a QueryShardContext would be enough to perform highlighting. This change replaces the SearchContext with the information that is absolutely required by the highlighter: a QueryShardContext and the SearchContextHighlight. This change reduces the exposure of the complex SearchContext and removes the need to clone it in the percolator sub phase. Relates #47198 Relates #46523	2019-10-10 10:34:10 +02:00
Jim Ferenczi	3d334a262b	Ensure that we don't call listener twice when detecting a partial failure in _search (#47694 ) This change fixes a bug that can occur when a shard failure is detected while we build the search response and accept partial failures is set to false. In this case we currently call onFailure on the provided listener but also continue the search as if the failure didn't occur. This can lead to a listener being called twice, once with onFailure and once with onSuccess, which is forbidden by design.	2019-10-10 09:59:49 +02:00
dengweisysu	dc4224fbdf	Sync translog without lock before trim unreferenced readers (#47790 ) This commit is similar to the optimization made in #45765. With this change, we fsync most of the data of the current generation without holding writeLock when trimming unreferenced readers. Relates #45765	2019-10-09 17:56:30 -04:00
Armin Braun	302e09decf	Simplify some Common ActionRunnable Uses (#47799 ) (#47828 ) Especially in the snapshot code there's a lot of logic chaining `ActionRunnables` in tricky ways now and the code is getting hard to follow. This change introduces two convenience methods that make it clear that a wrapped listener is invoked with certainty in some trickier spots and shortens the code a bit.	2019-10-09 23:29:50 +02:00
Igor Motov	12e4e7ef54	Geo: implement proper handling of out of bounds geo points (#47734 ) This is the first iteration in improving the handling of out of bounds geopoints with a latitude outside of the -90 - +90 range and a longitude outside of the -180 - +180 range. Relates to #43916	2019-10-09 20:30:59 +04:00
Igor Motov	f8b8afdc70	Geo: Fixes indexing of linestrings that go around the globe (#47471 ) LINESTRING (0 0, 720 20) is now decomposed into 3 strings: multilinestring ( (0.0 0.0, 180.0 5.0), (-180.0 5.0, 180 15), (-180.0 15.0, 0 20) ) It also fixes issues with linestrings that intersect antimeridian more than 5 times. Fixes #43837 Fixes #43826	2019-10-09 20:30:59 +04:00
Tim Brooks	d18ff24dbe	Fix BulkByScrollResponseTests exception assertions (#45519 ) Currently in the x content serialization tests we compare the exception messages that are serialized. These exceptions messages are not equivalent because the exception often changes when serialized to x content. This commit removes this assertion.	2019-10-09 10:15:58 -06:00
Tim Brooks	02622c1ef9	Fix issues with serializing BulkByScrollResponse (#45357 ) Currently there are two issues with serializing BulkByScrollResponse. First, when deserializing from XContent, indexing exceptions and search exceptions are switched. Additionally, search exceptions do not retain the appropriate RestStatus code, so you must evaluate the status code from the exception. However, the exception class is not always correctly retained when serialized. This commit adds tests in the failure case. Additionally, fixes the swapping of failure types and adds the rest status code to the search failure.	2019-10-09 10:12:14 -06:00
Martijn van Groningen	da1e2ea461	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-09 09:06:13 +02:00
Armin Braun	96b36b5a8c	Make loadShardSnapshot Exceptions Consistent (#47728 ) (#47735 ) Similar to #47507. We are throwing `SnapshotException` when you (and SLM tests) would expect a `SnapshotMissingException` for concurrent snapshot status and snapshot delete operations with a very low probability. Fixed the exception type and added a test for this scenario.	2019-10-08 21:04:51 +02:00
Armin Braun	5cef4752f7	Fix Ex. Handling in SnapshotsService#snapshots (#47507 ) (#47727 ) We're needlessly wrapping a `SnapshotMissingException` which itself is a `SnapshotException` when trying to load a missing snapshot. This leads to failure #47442 which expects a `SnapshotMissingException` in this case. Closes #47442	2019-10-08 17:01:54 +02:00
Henning Andersen	ce91ba7c25	Dangling indices strip aliases (#47581 ) Importing dangling indices with aliases risks breaking functionalities using those aliases. For instance, writing to an alias may break if there is no is_write_index indication on the existing alias and the dangling index import adds a second index to the alias. Or an application could have an assumption about the alias only ever pointing to one index and suddenly seeing the alias also linked to an old index could break it. With this change we strip aliases of the index meta data found before importing a dangling index.	2019-10-08 12:09:30 +02:00
David Turner	bb5f750ab4	Deprecate include_relocations setting (#47443 ) Setting `cluster.routing.allocation.disk.include_relocations` to `false` is a bad idea since it will lead to the kinds of overshoot that were otherwise fixed in #46079. This commit deprecates this setting so it can be removed in the next major release.	2019-10-08 08:19:04 +01:00
Tal Levy	a17f394e27	Geo-Match Enrich Processor (#47243 ) (#47701 ) This commit introduces a geo-match enrich processor that looks up a specific `geo_point` field in the enrich-index for all entries that have a geo_shape match field that meets some specific relation criteria with the input field. For example, the enrich index may contain documents with zipcodes and their respective geo_shape. Ingesting documents with a geo_point field can be enriched with which zipcode they associate according to which shape they are contained within. This commit also refactors some of the MatchProcessor by moving a lot of the shared code to AbstractEnrichProcessor. Closes #42639.	2019-10-07 15:03:46 -07:00
Armin Braun	b669b8f046	Simplify Snapshot Delete Further (#47626 ) (#47644 ) This change removes the special path for deleting the index metadata blobs and moves deleting them to the bulk delete of unreferenced blobs at the end of the snapshot delete process. This saves N RPC calls for a snapshot containing N indices and simplifies the code. Also, this change moves the unreferenced data cleanup up the stack to make it more obvious that any exceptions during this phase will be ignored and not fail the delete request. Lastly, this change removes the needless chaining of first deleting unreferenced data from the snapshot delete and then running the stale data cleanup (that would also run from the cleanup endpoint) and simply fires off the cleanup right after updating the repository data (index-N) in parallel to the other delete operations to speed up the delete some more.	2019-10-07 14:18:41 +02:00
Armin Braun	1359ef73a3	Add IT for Snapshot Issue in 47552 (#47627 ) (#47634 ) * Add IT for Snapshot Issue in 47552 (#47627) Adding a specific integration test that reproduces the problem fixed in #47552. The issue fixed otherwise only reproduces in the snapshot resiliency tests, which are not available in 6.8, where the fix is being backported to as well.	2019-10-07 10:38:19 +02:00
Armin Braun	6bd033931b	Add Consistency Assertion to SnapshotsInProgress (#47598 ) (#47633 ) Assert given input shards and indices are consistent. Also, fixed the equality check for SnapshotsInProgress. Before this change the tests never had more than a single waiting shard per index so they never failed as a result of the waiting shards list not being ordered. Follow up to #47552	2019-10-07 10:37:56 +02:00
Luca Cavanna	736fceb18b	Fold InitialSearchPhase into AbstractSearchAsyncAction (#47182 ) Historically, we have two base classes for search actions that generally need to fan out to multiple shards and then move on to the following phase: InitialSearchPhase and AbstractSearchAsyncAction that extends it. Practically, every search action extends the latter, and there are no direct subclasses of InitialSearchPhase in our codebase. This commit folds InitialSearchPhase into AbstractSearchAsyncAction in the attempt of simplifying things and making the search code running on the coordinating node easier to reason about.	2019-10-07 10:10:04 +02:00
Martijn van Groningen	f2f2304c75	Merge remote-tracking branch 'es/7.x' into enrich-7.x	2019-10-07 10:07:56 +02:00
Armin Braun	22679c7932	Fix Snapshot Corruption in Edge Case (#47552 ) (#47620 ) This fixes shard snapshots not being marked as failures when multiple data-nodes are lost during the snapshot process or shard snapshot failures have occurred before a node left the cluster. The problem was that we were simply not adding any shard entries for completed shards on node-left events. This has no effect for a successful shard, but for a failed shard would lead to that shard not being marked as failed during snapshot finalization. Fixed by correctly keeping track of all previous completed shard states as well in this case. Also, added an assertion that without this fix would trip on almost every run of the resiliency tests and adjusted the serialization of SnapshotsInProgress.Entry so we have a proper assertion message. Closes #47550	2019-10-05 15:01:06 +02:00
Armin Braun	f2d2ca21e2	Cleaner Handling of Store Refcount in BlobStoreRepository (#47560 ) (#47594 ) If a shard gets closed we properly abort its snapshot before closing it. We should in this case make sure to not throw a confusing exception about trying to increment the reference on an already closed shard in the async tasks if the snapshot is already aborted. Also, added an assertion to make sure that aborts are in fact the only situation in which we run into a concurrently closed store.	2019-10-05 09:45:10 +02:00
Gordon Brown	e47bdf760e	Fix Rollover error when alias has closed indices (#47148 ) (#47539 ) Rollover previously requested index stats for all indices in the provided alias, which causes an exception when there is a closed index with that alias. This commit adjusts the IndicesOptions used on the index stats request so that closed indices are ignored, rather than throwing an exception.	2019-10-04 17:40:05 -06:00
Jason Tedor	35ca3d68d7	Validating monitoring hosts setting while parsing (#47571 ) This commit lifts the validation of the monitoring hosts setting into the setting itself, rather than when the setting is used. This prevents a scenario where an invalid value for the setting is accepted, but then later fails while applying a cluster state with the invalid setting.	2019-10-04 17:32:49 -04:00
Mark Tozzi	e404f7ea80	DocValueFormat implementation for date range fields (#47472 ) (#47605 )	2019-10-04 17:21:17 -04:00
Lee Hinman	79376b7219	Set default SLM retention invocation time (#47604 ) This adds a default for the `slm.retention_schedule` setting, setting it to `0 30 1 * * ?` which is 1:30am every day. Having retention unset meant that it would never be invoked and clean up snapshots. We determined it would be better to have a default than never to be run. When coming to a decision, we weighed the option of an absolute time (such as 1:30am) versus a periodic invocation (like every 12 hours). In the end we decided on the absolute time because it has better predictability and consistency than a periodic invocation, which would rely on when the master node were elected or restarted. Relates to #43663	2019-10-04 15:00:20 -06:00
Armin Braun	c1be7a802c	Simplify Snapshot Delete Process (#47439 ) (#47533 ) We don't need to read the SnapshotInfo for a snapshot to determine the indices that need to be updated when it is deleted as the `RepositoryData` contains that information already. This PR makes it so the `RepositoryData` is used to determine which indices to update and also removes the special handling for deleting snapshot metadata and the CS snapshot blob and has those simply be deleted as part of the deleting of other unreferenced blobs in the last step of the delete. This makes the snapshot delete a little faster and more resilient by removing two RPC calls (the separate delete and the get). Also, this shortens the diff with #46250 as a side-effect.	2019-10-04 13:55:16 +02:00
David Roberts	defc97a300	Remove fallback for controller location (#47104 ) This change removes the temporary controller location fallback introduced in #47013. Relates elastic/ml-cpp#593	2019-10-04 09:50:26 +01:00
Ryan Ernst	f32692208e	Add explanations to script score queries (#46693 ) (#47548 ) While function scores using scripts do allow explanations, they are only creatable with an expert plugin. This commit improves the situation for the newer script score query by adding the ability to set the explanation from the script itself. To set the explanation, a user would check for `explanation != null` to indicate an explanation is needed, and then call `explanation.set("some description")`.	2019-10-03 21:05:05 -07:00
Nhat Nguyen	5e4732f2bb	Limit number of retaining translog files for peer recovery (#47414 ) Today we control the extra translog (when soft-deletes is disabled) for peer recoveries by size and age. If users manually (force) flush many times within a short period, we can keep many small (or empty) translog files as neither the size nor age condition is reached. We can protect the cluster from running out of the file descriptors in such a situation by limiting the number of retaining translog files.	2019-10-03 20:45:29 -04:00
Armin Braun	bac119f672	Fix getSnapshotIndexMetaData Exception Behavior (#47488 ) (#47496 ) If we fail to read the global metadata in a snapshot we would throw `SnapshotMissingException` but wouldn't do so for the index metadata. This is breaking SLM tests at a low rate because they use `SnapshotMissingException` thrown from snapshot status APIs to wait for a snapshot being gone. Also, we should be consistent here in general and not leak the `NoSuchFileException` to the transport layer for index meta. Closes #46508	2019-10-03 12:47:50 +02:00
Armin Braun	7549be4489	Fix es.http.cname_in_publish_address Deprecation Logging (#47451 ) Since the property defaulted to `true` this deprecation logging runs every time unless it's set to `false` manually (in which case it should've also logged but didn't). I didn't add a test and removed the tests we had in `7.x` that covered this logging. I did move the check out of the `if (InetAddresses.isInetAddress(hostString) == false) {` condition so this is sort-of covered by the REST tests. IMO, any unit-test of this would be somewhat redundant and would've forced adding a field that just indicates that the deprecated property was used to every instance which seemed pointless. Closes #47436	2019-10-03 11:10:48 +02:00
Alpar Torok	0a14bb174f	Remove eclipse conditionals (#44075 ) * Remove eclipse conditionals We used to have some meta projects with a `-test` prefix because historically eclipse could not distinguish between test and main source-sets and could only use a single classpath. This is no longer the case for the past few Eclipse versions. This PR adds the necessary configuration to correctly categorize source folders and libraries. With this change eclipse can import projects, and the visibility rules are correct, e.g. auto complete doesn't offer classes from test code or `testCompile` dependencies when editing classes in `main`. Unfortunately the cyclic dependency detection in Eclipse doesn't seem to take the difference between test and non test source sets into account, but since we are checking this in Gradle anyhow, it's safe to set to `warning` in the settings. Unfortunately there is no setting to ignore it. This might cause problems when building since Eclipse will probably not know the right order to build things in so more work might be necessary.	2019-10-03 11:55:00 +03:00
Armin Braun	0beb5263b4	Fix Snapshot Finalization not Waiting for Index Metadata (#47445 ) (#47459 ) * Fix Snapshot Finalization not Waiting for Index Metadata We were mixing up the listeners here which led to the final listener that should be called after all the metadata has been written to be called before that. I fixed this by removing the one redundant listener and flattening the logic out. * Closes #47425	2019-10-02 23:26:18 +02:00
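
The "Allow partial parsing dates" commit (0c439fe495) above states that 2018, 2018-01 and 2018-01-01 are all valid inputs for date_optional_time and strict_date_optional_time. A minimal sketch of that acceptance rule follows, in Python rather than the project's Java. The helper name `parse_partial_date` is hypothetical, and filling a missing month or day with 1 is an assumption made for illustration; it is not taken from the commit.

```python
import re
from datetime import date

# Hypothetical helper, not Elasticsearch code: accepts YYYY, YYYY-MM or YYYY-MM-DD.
_PARTIAL_DATE = re.compile(r"(\d{4})(?:-(\d{2})(?:-(\d{2}))?)?")


def parse_partial_date(text: str) -> date:
    """Parse a possibly partial ISO-style date string into a datetime.date."""
    match = _PARTIAL_DATE.fullmatch(text)
    if match is None:
        raise ValueError(f"not a partial ISO date: {text!r}")
    year, month, day = match.groups()
    # Assumption for illustration: a missing month or day defaults to 1.
    # date() itself rejects out-of-range values such as month 13.
    return date(int(year), int(month or 1), int(day or 1))


if __name__ == "__main__":
    for sample in ("2018", "2018-01", "2018-01-01"):
        print(sample, "->", parse_partial_date(sample).isoformat())
```

The nested optional groups mean a day can only appear when a month is present, which mirrors the three forms listed in the commit message.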


3767 Commits