OpenSearch

Commit Graph

Author	SHA1	Message	Date
Nhat Nguyen	6dd0aa54f6	Integrates soft-deletes into Elasticsearch (#33222 ) This PR integrates Lucene soft-deletes(LUCENE-8200) into Elasticsearch. Highlight works in this PR include: - Replace hard-deletes by soft-deletes in InternalEngine - Use _recovery_source if _source is disabled or modified (#31106) - Soft-deletes retention policy based on the global checkpoint (#30335) - Read operation history from Lucene instead of translog (#30120) - Use Lucene history in peer-recovery (#30522) Relates #30086 Closes #29530 --- These works have been done by the whole team; however, these individuals (lexical order) have significant contribution in coding and reviewing: Co-authored-by: Adrien Grand jpountz@gmail.com Co-authored-by: Boaz Leskes b.leskes@gmail.com Co-authored-by: Jason Tedor jason@tedor.me Co-authored-by: Martijn van Groningen martijn.v.groningen@gmail.com Co-authored-by: Nhat Nguyen nhat.nguyen@elastic.co Co-authored-by: Simon Willnauer simonw@apache.org	2018-08-30 22:11:23 -04:00
Lee Hinman	8a2d154bad	Update serialization versions for custom IndexMetaData backport	2018-08-30 15:56:53 -06:00
Igor Motov	001b78f704	Replace IndexMetaData.Custom with Map-based custom metadata (#32749 ) This PR removes the deprecated `Custom` class in `IndexMetaData`, in favor of a `Map<String, DiffableStringMap>` that is used to store custom index metadata. As part of this, there is now no way to set this metadata in a template or create index request (since it's only set by plugins, or dedicated REST endpoints). The `Map<String, DiffableStringMap>` is intended to be a namespaced `Map<String, String>` (`DiffableStringMap` implements `Map<String, String>`, so the signature is more like `Map<String, Map<String, String>>`). This is so we can do things like: ``` java Map<String, String> ccrMeta = indexMetaData.getCustom("ccr"); ``` And then have complete control over the metadata. This also means any plugin/feature that uses this has to manage its own BWC, as the map is just serialized as a map. It also means that if metadata is put in the map that isn't used (for instance, if a plugin were removed), it causes no failures the way an unregistered `Setting` would. The reason I use a custom `DiffableStringMap` here rather than a plain `Map<String, String>` is so the map can be diffed with previous cluster state updates for serialization. Supersedes #32683	2018-08-30 13:57:00 -06:00
Simon Willnauer	af2eaf2a6c	Remove usage of `index.shrink.source.` in 7.x (#33271 ) We cut over to `index.resize.source.` but still have these constants being public in `IndexMetaData`. Those Settings and constants are not needed in 7.x while we still need to keep the keys known to private settings since they might be part of the index settings of old indices. We can remove that in 8.0. Yet, we should remove the settings to make sure they are not used again.	2018-08-30 21:08:35 +02:00
Jim Ferenczi	d0630093cd	Fix serialization of empty field capabilities response (#33263) When no responses are required (no indices match the requested patterns), the empty response throws an NPE in the transport serialization (writeTo).	2018-08-30 18:07:58 +02:00
Jim Ferenczi	1404dd2a42	Fix nested _source retrieval with includes/excludes (#33180 ) If an exclude or an include clause removes an entry to a nested field in the original source at query time, the creation of nested hits fails with an NPE. This change fixes this exception and replaces the nested document source with an empty map. Closes #33163 Closes #33170	2018-08-30 15:15:50 +02:00
David Turner	47859e56ac	Move file-based discovery to core (#33241 ) Today we support a static list of seed hosts in core Elasticsearch, and allow a dynamic list of seed hosts to be provided via a file using the `discovery-file` plugin. In fact the ability to provide a dynamic list of seed hosts is increasingly useful, so this change moves this functionality to core Elasticsearch to avoid the need for a plugin. Furthermore, in order to start up nodes in integration tests we currently assign a known port to each node before startup, which unfortunately sometimes fails if another process grabs the selected port in the meantime. By moving the `discovery-file` functionality into the core product we can use it to avoid this race. This change also moves the expected path to the file from `$ES_PATH_CONF/discovery-file/unicast_hosts.txt` to `$ES_PATH_CONF/unicast_hosts.txt`. An example of this file is not included in distributions. For BWC purposes the plugin still exists, but does nothing more than create the example file in the old location, and issue a warning when it is used. We also continue to support the old location for the file, but warn about its deprecation. Relates #29244 Closes #33030	2018-08-30 06:43:04 +01:00
Armin Braun	cc4d7059bf	Ingest: Add conditional per processor (#32398 ) * Ingest: Add conditional per processor * closes #21248	2018-08-30 03:46:39 +02:00
Jason Tedor	0f22dbb1cc	Apply settings filter to get cluster settings API (#33247 ) Some settings have filters applied to them and we use this in logs and the get nodes info API. For consistency, we should apply this in the get cluster settings API too.	2018-08-29 15:56:13 -04:00
Simon Willnauer	6a0d4b4a77	Remove 6.x transport BWC Layer for `_shrink` (#33236) The shrink action was renamed to `_resize` with the addition of split. This BWC layer is unnecessary on 7.x since 6.latest will always use the resize action.	2018-08-29 16:43:13 +02:00
Luca Cavanna	49109187e2	Remove unsupported group_shard_failures parameter (#33208 ) We have had support for the `group_shard_failures` parameter in our code for a while, since we introduced failures grouping. When we introduced validation of parameters at REST, we seem to have forgotten to expose such parameter. Given that the parameter is effectively not supported for many months now, that no user has complained about that and that grouping is the expected behaviour, this commit removes support for the parameter.	2018-08-29 14:05:41 +02:00
Luca Cavanna	034fdbca28	Update BucketUtils#suggestShardSideQueueSize signature (#33210 ) `BucketUtils#suggestShardSideQueueSize` used to calculate the shard_size based on the number of shards. It returns now a different value only based on whether we are querying a single shard or multiple shards. This commit replaces the numberOfShards argument with a boolean that tells whether we are querying a single shard or not.	2018-08-29 13:51:54 +02:00
Armin Braun	f690b492e7	INGEST: Add Pipeline Processor (#32473 ) * INGEST: Add Pipeline Processor * Adds Processor capable of invoking other pipelines * Closes #31842	2018-08-29 11:03:10 +02:00
Alexander Reelsen	48b388ce82	Core: Add java time xcontent serializers (#33120 ) This ensures that the java time class exposed by painless have proper serialization/string representations. Closes #31853	2018-08-29 10:00:16 +02:00
Alpar Torok	f29f0af7bc	Consider multi release jars when running third party audit (#33206 ) Exclude classes meant for newer versions than what we are auditing against, those classes won't be found. There's no reason to exclude JDK classes from newer versions, with this PR, we will not extract them in the first place.	2018-08-29 09:53:04 +03:00
Mark Tozzi	84b61d0738	Scroll queries asking for rescore are considered invalid (#32918 ) This PR changes our behavior from silently ignoring rescore in a scroll query to instead report to the user that such a query is invalid. Closes #31775	2018-08-28 15:48:23 -04:00
Sohaib Iftikhar	7f5e29ddb2	HLREST: add reindex API (#32679 ) Adds the reindex API to the high level REST client.	2018-08-28 13:02:23 -04:00
Jonathan Little	9d92a87ae6	Remove support for deprecated params._agg/_aggs for scripted metric aggregations (#32979 )	2018-08-28 09:27:43 +01:00
Alpar Torok	2cc611604f	Run Third party audit with forbidden APIs CLI (part3/3) (#33052) The new implementation is functionally equivalent to the old, ant-based one. It parses task standard error to get the missing classes and violations in the same way. I considered re-using ForbiddenApisCliTask, but Gradle makes it hard to build inheritance with tasks that have task actions, since the order of the task actions can't be controlled. This inheritance isn't fully desired either, as the third party audit task is much more opinionated and we don't want to expose some of the configuration. We could probably extract a common base class without any task actions, but it is probably more trouble than it's worth. Closes #31715	2018-08-28 10:03:30 +03:00
Nhat Nguyen	014b3236dc	Ensure to generate identical NoOp for the same failure (#33141 ) We generate slightly different NoOps in InternalEngine and TransportShardBulkAction for the same failure. 1. InternalEngine uses Exception#getFailure to generate a message without the class name: newOp [NoOp{seqNo=1, primaryTerm=1, reason='Contexts are mandatory in context enabled completion field [suggest_context]'}]. 2. TransportShardBulkAction uses Exception#toString to generate a message with the class name: NoOp{seqNo=1, primaryTerm=1, reason='java.lang.IllegalArgumentException: Contexts are mandatory in context enabled completion field [suggest_context]'}. If a write operation fails while a replica is recovering, that replica will possibly receive two different NoOps: one from recovery and one from replication. These two different NoOps will trip TranslogWriter#assertNoSeqNumberConflict assertion. This commit ensures that we generate the same Noop for the same failure. Closes #32986	2018-08-27 15:59:42 -04:00
Luca Cavanna	ed0571e16c	ShardSearchFailure#readFrom to set index and shardId (#33161 ) As part of recent changes made to `ShardOperationFailedException` we introduced `index` and `shardId` members to the base class, but the subclasses are entirely responsible for the serialization of such fields. In the case of `ShardSearchFailure`, we have an additional `SearchShardTarget` instance member which also holds the index and the shardId, hence they get serialized as part of `SearchShardTarget` itself. When de-serializing a `ShardSearchFailure` though, we need to remember to also set the parent class `index` and `shardId` fields otherwise they get lost Relates to #32640	2018-08-27 20:31:27 +02:00
Jason Tedor	318df2a107	Adjust BWC version on mapping version The introduction of mapping version on index metadata has been backported to 6.x. This commit adjusts the BWC version around mapping version to account for this backport.	2018-08-27 13:17:15 -04:00
Jason Tedor	2aef7e0900	Introduce mapping version to index metadata (#33147 ) This commit introduces mapping version to index metadata. This value is monotonically increasing and is updated on mapping updates. This will be useful in cross-cluster replication so that we can request mapping updates from the leader only when there is a mapping update as opposed to the strategy we employ today which is to request a mapping update any time there is an index metadata update. As index metadata updates can occur for many reasons other than mapping updates, this leads to some unnecessary requests and work in cross-cluster replication.	2018-08-27 12:21:11 -04:00
Mikita Karaliou	f1f6d4ed33	Support only string `format` in date, root object & date range (#28117 ) Limit date `format` attribute to String values only. Closes #23650	2018-08-27 12:24:51 +02:00
Daniel Mitterdorfer	06c0055c0f	Have circuit breaker succeed on unknown mem usage With this commit we implement a workaround for https://bugs.openjdk.java.net/browse/JDK-8207200 which is a race condition in the JVM that results in `IllegalArgumentException` to be thrown in rare cases when we determine memory usage via `MemoryMXBean`. As we do not want to fail requests in those cases we always return zero memory usage. Relates #31767 Relates #33125	2018-08-27 07:09:27 +02:00
Jason Tedor	143cd9bbaa	Do not lose default mapper on metadata updates (#33153 ) When applying index metadata updates we run through the mappings updating them if needed. Today if there is not an update to the default mapper, we can lose the default mapping. This means that, for example, if we apply a settings update to an index we will lose the default mapper. This happens because we were not guarding updating the default mapping with a check that the default mapping was updated in the metadata update. When there is no update in the metadata update, we need to continue to preserve the previous default mapping. This commit achieves this by moving the updating of the default mapping under the same guard that we use for updating the default mapping source. We add a test that fails before putting the update under a guard and now passes after moving the update under the guard.	2018-08-26 15:57:52 -04:00
Jason Tedor	f8b07a0d84	Fix a mappings update test (#33146 ) This commit fixes a mappings update test. The test is broken in the sense that it passes, but for the wrong reason. The test here is testing that if we make a mapping update but do not commit that mapping update then the mapper service still maintains the previous document mapper. This was not the case long, long ago when a mapping update would update the in-memory state before the cluster state update was committed. This test was passing, but it was passing because the mapping update was never even updated. It was never even updated because it was encountering a null pointer exception. Of course the in-memory state is not going to be updated in that case, we are simply going to end up with a failed cluster state update. Fixing that leads to another issue which is that the mapping source does not even parse so again we would, of course, end up with the in-memory state not being modified. We fix these issues, assert that the result cluster state task completed successfully, and finally that the in-memory state was not updated since we never committed the resulting cluster state.	2018-08-26 09:36:17 -04:00
Simon Willnauer	3376922e8b	Add proxy support to RemoteClusterConnection (#33062) This adds support for connecting to a remote cluster through a TCP proxy. A remote cluster can be configured with an additional `search.remote.$clustername.proxy` setting. This proxy will be used to connect to remote nodes for every node connection established. We still try to sniff the remote cluster and connect to nodes directly through the proxy, which has to support some kind of routing to these nodes. Yet, this routing mechanism requires the handshake request to include some kind of information about where to route to, which is not yet implemented. The effort to use the hostname and an optional node attribute for routing is tracked in #32517 Closes #31840	2018-08-25 20:41:32 +02:00
Nhat Nguyen	9dad82ece8	TEST: Skip assertSeqNos for closed shards (#33130 ) If a shard was closed, we return null for SeqNoStats. Therefore the assertion assertSeqNos will hit NPE when it verifies a closed shard. This commit skips closed shards in assertSeqNos and enables this assertion in AbstractDisruptionTestCase.	2018-08-24 21:02:13 -04:00
Nik Everett	a023e64801	Checkstyle! Catching your unused imports since 2001.	2018-08-24 14:13:13 -04:00
Jim Ferenczi	70030c18f1	[Test] Fix sporadic failure in MembershipActionTests Rewrite test that require Version.V_5 constants.	2018-08-24 18:40:04 +02:00
Mayya Sharipova	6f1ee76443	Revert "Do NOT allow termvectors on nested fields (#32728 )" This reverts commit `fdff8f3db0`.	2018-08-24 10:12:16 -04:00
Jim Ferenczi	f4e9729d64	Remove unsupported Version.V_5_* (#32937 ) This change removes the es 5x version constants and their usages.	2018-08-24 09:51:21 +02:00
Michael Basnight	8f16696fe1	Add versions 5.6.12 and 6.4.1	2018-08-23 15:49:14 -05:00
Mayya Sharipova	fdff8f3db0	Do NOT allow termvectors on nested fields (#32728 ) Requesting _termvectors on a nested field or any sub-fields of a nested field returns empty results. Closes #21625	2018-08-23 16:46:47 -04:00
Simon Willnauer	f3cfd4504f	Use `addIfAbsent` instead of checking if an element is contained Relates to #32988	2018-08-23 13:43:23 +02:00
Ignacio Vera	d7219c05a2	Search: Support of wildcard on docvalue_fields (#32980 ) * Search: Support of wildcard on docvalue_fields For consistency with stored_fields, docvalue_fields should support the use of wildcards. Documentation of doc values fields is updated accordingly. See also: #26390 Closes #26299	2018-08-23 10:04:00 +02:00
Jim Ferenczi	ffe895e16e	Change query field expansion (#33020) This commit changes the query field expansion for query parsers to not rely on a hardcoded list of field types. Instead we rely on the type of exception that is thrown by MappedFieldType#termQuery to include/exclude an expanded field. Supersedes #31655 Closes #31798	2018-08-23 09:52:48 +02:00
Armin Braun	46247ff1f9	INGEST: Cleanup Redundant Put Method (#33034 )	2018-08-23 07:43:36 +02:00
Luca Cavanna	393eec1482	Set maxScore for empty TopDocs to NaN rather than 0 (#32938) We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.NaN` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with Lucene, which sets `maxScore` to `Float.NaN` when merging empty `TopDocs` (see `TopDocs#merge`).	2018-08-22 17:23:54 +02:00
Jason Tedor	67bfb765ee	Refactor Netty4Utils#maybeDie (#33021 ) In our Netty layer we have had to take extra precautions against Netty catching throwables which prevents them from reaching the uncaught exception handler. This code has taken on additional uses in NIO layer and now in the scheduler engine because there are other components in stack traces that could catch throwables and suppress them from reaching the uncaught exception handler. This commit is a simple cleanup of the iterative evolution of this code to refactor all uses into a single method in ExceptionsHelper.	2018-08-22 10:18:07 -04:00
Simon Willnauer	ead198bf2e	Add settings updater for 2 affix settings (#33050) Today we can only have non-affix settings updated and consumed _together_. Yet, there are use cases where two affix settings depend on each other, which makes using them hard without consuming updates together. Unfortunately, there is no straightforward way to have N settings updated together in a type-safe way; having 2 still serves a large portion of use cases.	2018-08-22 14:13:27 +02:00
Nhat Nguyen	262d3c0783	Allow engine to recover from translog up to a seqno (#33032) This change allows an engine to recover from its local translog up to the given seqno. The extended API can be used in these use cases: When a replica starts following a new primary, it resets its index to the safe commit, then replays its local translog up to the current global checkpoint (see #32867). When a replica starts a peer-recovery, it can initialize the start_sequence_number to the persisted global checkpoint instead of the local checkpoint of the safe commit. A replica will then replay its local translog up to that global checkpoint before accepting remote translog from the primary. This change will increase the chance of operation-based recovery. I will make this in a follow-up. Relates #32867	2018-08-22 07:57:44 -04:00
Simon Willnauer	ffb1a5d5b7	Expose `max_concurrent_shard_requests` in `_msearch` (#33016 ) Today `_msearch` doesn't allow modifying the `max_concurrent_shard_requests` per sub search request. This change adds support for setting this parameter on all sub-search requests in an `_msearch`. Relates to #31877	2018-08-22 08:45:08 +02:00
Julie Tibshirani	67b5a83a9a	Ensure that _exists queries on keyword fields use norms when they're available. (#33006 )	2018-08-21 16:33:42 -07:00
Jim Ferenczi	767c69593c	Fix quoted _exists_ query (#33019 ) This change in the `query_string` query fixes the detection of the special `_exists_` field when it is used with a quoted term. Closes #28922	2018-08-21 22:15:09 +02:00
Jim Ferenczi	8b43e21521	Fix multi fields empty query (#33017 ) This change fixes empty query removal when all fields remove the search term in `simple_query_string`, `multi_match` and `query_string`. Closes #33009	2018-08-21 22:12:53 +02:00
Igor Motov	3973bb4028	Fix north pole overflow error in GeoHashUtils.bbox() (#32891 ) Fixes an overflow error in GeoHashUtils.bbox() calculation of a bounding box for geohashes with maximum precision located next to the north pole.	2018-08-21 14:59:37 -04:00
Jason Tedor	bdfcc326d7	Enable avoiding mmap bootstrap check (#32421) The maximum map count bootstrap check can be a hindrance to users that do not own the underlying platform on which they are executing Elasticsearch. This is because addressing it requires tuning the kernel and a platform provider might not allow this, especially on shared infrastructure. However, this bootstrap check is not needed if mmapfs is not in use. Today we do not have a way for the user to communicate that they are not going to use mmapfs. This commit therefore adds a setting that enables the user to disallow mmapfs. When mmapfs is disallowed, the maximum map count bootstrap check is not enforced. Additionally, we fall back to a different default index store and prevent the explicit use of mmapfs for an index.	2018-08-21 11:02:25 -04:00
Simon Willnauer	92076497e5	Use a dedicated ConnectionManager for RemoteClusterConnection (#32988) This change introduces a dedicated ConnectionManager for every RemoteClusterConnection such that there is no state shared with the TransportService internal ConnectionManager. All connections to a remote cluster are isolated from the TransportService but still use the TransportService and its internal properties like the Transport, tracing and internal listener actions on disconnects etc. This allows a remote cluster connection to have a different lifecycle than a local cluster connection. Also, local discovery code doesn't get notified if there is a disconnect from a remote cluster, and each connection can use its own dedicated connection profile, which allows a reduced set of connections per cluster without conflicting with the local cluster. Closes #31835	2018-08-21 12:43:25 +02:00


1073 Commits