Commit Graph

285 Commits

Author SHA1 Message Date
Martijn van Groningen c8031c6f99
Add data stream support to the reindex api. (#57970)
Backport of #57870 to 7.x branch.

This change now also copies the op_type from the reindex request's destination index request to the actual index request being used in the bulk request.

For ensuring no document exists, the op_type create doesn't need to be copied, since Versions.MATCH_DELETED will copied from the 'mainRequest.getDestination().version()'.
The `version()` method on IndexRequest only returns Versions.MATCH_DELETED if op_type=create and no specific version has been specified.

However in order to be able to index into a data stream, the op_type must be create. So in order to support that the op_type must be copied from the reindex request's destination index request to the actual index request being used in the bulk request.

Relates to #53100 and #57788
2020-06-12 09:54:37 +02:00
Martijn van Groningen 2ac32db607
Move includeDataStream flag from IndicesOptions to IndexNameExpressionResolver.Context (#56151)
Backport of #56034.

Move includeDataStream flag from an IndicesOptions to IndexNameExpressionResolver.Context
as a dedicated field that callers to IndexNameExpressionResolver can set.

Also alter indices stats api to support data streams.
The rollover api uses this api and otherwise rolling over data stream does no longer work.

Relates to #53100
2020-05-04 22:38:33 +02:00
Rory Hunter d66af46724
Always use deprecateAndMaybeLog for deprecation warnings (#55319)
Backport of #55115.

Replace calls to deprecate(String,Object...) with deprecateAndMaybeLog(...),
with an appropriate key, so that all messages can potentially be deduplicated.
2020-04-23 09:20:54 +01:00
David Turner 7941f4a47e Add RepositoriesService to createComponents() args (#54814)
Today we pass the `RepositoriesService` to the searchable snapshots plugin
during the initialization of the `RepositoryModule`, forcing the plugin to be a
`RepositoryPlugin` even though it does not implement any repositories.

After discussion we decided it best for now to pass this in via
`Plugin#createComponents` instead, pending some future work in which plugins
can depend on services more dynamically.
2020-04-16 16:27:36 +01:00
Henning Andersen 7ce7aff66e Reindex negative TimeValue fix (#54057)
Reindex would use timeValueNanos(System.nanoTime()). The intended use
for TimeValue is as a duration, not as absolute time. In particular,
this could result in negative TimeValue's, being unsupported in #53913.
Modified to use the bare long nano-second value.
2020-03-24 22:29:09 +01:00
Alan Woodward d23112f441 Report parser name and location in XContent deprecation warnings (#53805)
It's simple to deprecate a field used in an ObjectParser just by adding deprecation
markers to the relevant ParseField objects. The warnings themselves don't currently
have any context - they simply say that a deprecated field has been used, but not
where in the input xcontent it appears. This commit adds the parent object parser
name and XContentLocation to these deprecation messages.

Note that the context is automatically stripped from warning messages when they
are asserted on by integration tests and REST tests, because randomization of
xcontent type during these tests means that the XContentLocation is not constant
2020-03-20 11:52:55 +00:00
Jim Ferenczi 8e17322b3a
Shortcut query phase using the results of other shards (#51852) (#53659)
This commit, built on top of #51708, allows to modify shard search requests based on informations collected on other shards. It is intended to speed up sorted queries on time-based indices. For queries that are only interested in the top documents.

This change will rewrite the shard queries to match none if the bottom sort value computed in prior shards is better than all values in the shard.
For queries that mix top documents and aggregations this change will reset the size of the top documents to 0 instead of rewriting to match none.
This means that we don't need to keep a search context open for this shard since we know in advance that it doesn't contain any competitive hit.
2020-03-18 17:20:35 +01:00
Jay Modi f3f6ff97ee
Single instance of the IndexNameExpressionResolver (#52604)
This commit modifies the codebase so that our production code uses a
single instance of the IndexNameExpressionResolver class. This change
is being made in preparation for allowing name expression resolution
to be augmented by a plugin.

In order to remove some instances of IndexNameExpressionResolver, the
single instance is added as a parameter of Plugin#createComponents and
PersistentTaskPlugin#getPersistentTasksExecutor.

Backport of #52596
2020-02-21 07:50:02 -07:00
Jay Modi 3edadfefd0 RestHandlers declare handled routes (#52123)
This commit changes how RestHandlers are registered with the
RestController so that a RestHandler no longer needs to register itself
with the RestController. Instead the RestHandler interface has new
methods which when called provide information about the routes
(method and path combinations) that are handled by the handler
including any deprecated and/or replaced combinations.

This change also makes the publication of RestHandlers safe since they
no longer publish a reference to themselves within their constructors.

Closes #51622

Co-authored-by: Jason Tedor <jason@tedor.me>

Backport of #51950
2020-02-09 22:48:32 -07:00
Nik Everett 45663ac1a8
Use Void context on parsers where possible (#50573) (#50617)
*Most* of our parsing can be done without passing any extra context into
the parser that isn't already part of the xcontent stream. While I was
looking around at the places that *do* need a context I found a few
places that were declared to need a context but don't actually need it.
2020-01-03 13:28:55 -05:00
Henning Andersen 1d3feaf18e
Reindex sort deprecation warning take 2 (#49855) (#49899)
Moved the deprecation warning to ReindexValidator to ensure it runs
early and works with resilient reindex. Also check that the warning
is reported back for wait_for_completion=false.

Follow-up to #49458
2019-12-06 09:44:36 +01:00
Henning Andersen 5adb33ec17
Deprecate sorting in reindex (#49458) (#49738)
Reindex sort never gave a guarantee about the order of documents being
indexed into the destination, though it could give a sense of locality
of source data.

It prevents us from doing resilient reindex and other optimizations and
it has therefore been deprecated.

Related to #47567
2019-12-01 19:24:27 +01:00
Henning Andersen 1d745f1e5c Revert "Deprecate sorting in reindex (#49458)"
This reverts commit 27d45c9f1f.
2019-11-29 22:08:19 +01:00
Henning Andersen 27d45c9f1f Deprecate sorting in reindex (#49458)
Reindex sort never gave a guarantee about the order of documents being
indexed into the destination, though it could give a sense of locality
of source data.

It prevents us from doing resilient reindex and other optimizations and
it has therefore been deprecated.

Related to #47567
2019-11-29 21:35:11 +01:00
Henning Andersen 2ac38fd315 Reindex and friends fail on RED shards (#45830)
Reindex, update by query and delete by query would silently disregard
RED/unavailable shards, thus not copying, updating or deleting matching
data in those shards. Now use `allow_partial_search_results=false` to
ensure these operations fail if the search crosses an unavailable chard.

Added the option to explicitly specify `allow_partial_search_results=true`
for reindex only (seemed too strange for update/delete by query).

Relates #45739 and #42612
2019-11-18 21:23:08 +01:00
Tim Brooks 02622c1ef9
Fix issues with serializing BulkByScrollResponse (#45357)
Currently there are two issues with serializing BulkByScrollResponse.
First, when deserializing from XContent, indexing exceptions and search
exceptions are switched. Additionally, search exceptions do no retain
the appropriate RestStatus code, so you must evaluate the status code
from the exception. However, the exception class is not always correctly
retained when serialized.

This commit adds tests in the failure case. Additionally, fixes the
swapping of failure types and adds the rest status code to the search
failure.
2019-10-09 10:12:14 -06:00
Tim Brooks 956df7be92
Reindex task state initialized before reindex (#46043)
Currently the process to execute a reindex process is tightly coupled to
step of initializing the task state. This creates problems when this
process is asynchronous. It is possible that the task state has not been
initialized which prevents follow-up actions such as rethrottle. This
commit separates the task initialization so that it can be executed as a
first step in the persistent reindex process.
2019-08-27 15:28:04 -05:00
Tim Brooks 07f3ddb549
Extract reindexing logic from transport action (#46033)
This commit extracts the reindexing logic from the transport action so
that it can be incorporated into the persistent reindex work without
requiring the usage of the client.
2019-08-27 12:28:37 -05:00
Armin Braun a9e1402189
Remove Settings from BaseRestRequest Constructor (#45418) (#45429)
* Resolving the todo, cleaning up the unused `settings` parameter
* Cleaning up some other minor dead code in affected classes
2019-08-12 05:14:45 +02:00
Henning Andersen d139896b66
Reindex share retry between hit sources (#44203) (#45348)
The client and remote hit sources had each their own retry mechanism,
which would do the same. Supporting resiliency we would have to expand
on the retry mechanisms and as a preparation for that, the retry
mechanism is now shared such that each sub class is only responsible for
sending requests and converting responses/failures to common format.

Part of #42612
2019-08-08 22:01:29 +02:00
Tal Levy 075a3f0e99
remove usage of ActionType#(String) (#44459) (#44526)
this commit removes usage of the deprecated
constructor with a single argument and no Writeable.Reader.

The purpose of this is to reduce the boilerplate necessary for
properly implementing a new action, as well as reducing the
chances of using the incorrect super constructor while classes
are being migrated to Writeable

relates #34389.
2019-07-17 20:28:11 -07:00
Henning Andersen 748a10866d Reindex ScrollableHitSource pump data out (#43864)
Refactor ScrollableHitSource to pump data out and have a simplified
interface (callers should no longer call startNextScroll, instead they
simply mark that they are done with the previous result, triggering a
new batch of data). This eases making reindex resilient, since we will
sometimes need to rerun search during retries.

Relates #43187 and #42612
2019-07-09 11:50:09 +02:00
Ryan Ernst 3a2c698ce0
Rename Action to ActionType (#43778)
Action is a class that encapsulates meta information about an action
that allows it to be called remotely, specifically the action name and
response type. With recent refactoring, the action class can now be
constructed as a static constant, instead of needing to create a
subclass. This makes the old pattern of creating a singleton INSTANCE
both misnamed and lacking a common placement.

This commit renames Action to ActionType, thus allowing the old INSTANCE
naming pattern to be TYPE on the transport action itself. ActionType
also conveys that this class is also not the action itself, although
this change does not rename any concrete classes as those will be
removed organically as they are converted to TYPE constants.

relates #34389
2019-06-30 22:00:17 -07:00
Ryan Ernst 28ab77a023
Add StreamableResponseAction to aid in deprecation of Streamable (#43770)
The Action base class currently works for both Streamable and Writeable
response types. This commit intorduces StreamableResponseAction, for
which only the legacy Action implementions which provide newResponse()
will extend. This eliminates the need for overriding newResponse() with
an UnsupportedOperationException.

relates #34389
2019-06-28 21:40:00 -07:00
Tim Brooks 827f8fcbd5
Move reindex request parsing into request (#43450)
Currently the fromXContent logic for reindex requests is implemented in
the rest action. This is inconsistent with other requests where the
logic is implemented in the request. Additionally, it requires access to
the rest action in order to parse the request. This commit moves the
logic and tests into the ReindexRequest.
2019-06-20 17:49:11 -04:00
Henning Andersen 8b3716553a Remote reindex failure parse fix (#42928)
A search request that partially fails with failures without an index
(index: null) in the failure would cause a parse error in reindex from
remote. This would hide the original exception, making it hard to debug
the root cause. This commit fixes this so that we can tolerate null
index entries in a search failure.
2019-06-13 11:43:00 +02:00
Benjamin Trent 0a95b8c24d
Fixing handling of auto slices in bulk scroll requests (#43050) (#43063)
* Fixing handling of auto slices in bulk scroll requests

* adjusting assertions for tests
2019-06-10 16:47:40 -05:00
Henning Andersen dea935ac31
Reindex max_docs parameter name (#42942)
Previously, a reindex request had two different size specifications in the body:
* Outer level, determining the maximum documents to process
* Inside the source element, determining the scroll/batch size.

The outer level size has now been renamed to max_docs to
avoid confusion and clarify its semantics, with backwards compatibility and
deprecation warnings for using size.
Similarly, the size parameter has been renamed to max_docs for
update/delete-by-query to keep the 3 interfaces consistent.

Finally, all 3 endpoints now support max_docs in both body and URL.

Relates #24344
2019-06-07 12:16:36 +02:00
Tim Brooks d18f511327
Propogate version in reindex from remote search (#42958)
This is related to #31908. In order to use the external version in a
reindex from remote request, the search request must be configured to
request the version (as it is not returned by default). This commit
modifies the search request to request the version. Additionally, it
modifies our current reindex from remote tests to randomly use the
external version_type.
2019-06-06 14:50:06 -04:00
Yannick Welsch 785ae09101 Allow reindexing into write alias (#41677)
Fixes an issue where reindex currently fails if the destination is an alias pointing to multiple indices,
even it is using a write index.

Closes #41667
2019-05-08 09:38:37 +02:00
Henning Andersen b967a97f8e
Reindex from remote deprecation warning (#41005)
If a reindex from remote request contains an index name that is URL
escaped, we now issue a warning to be able to not support this in 8.0.
2019-04-11 12:09:53 +02:00
Henning Andersen 575918e8e6 Reindex from Remote allow date math (#40303)
Previously, reindexing from remote using date math in the source index
name did not work if the math contained / or ,. A workaround was to
then URL escape the index name in the request.

With this change, we now support any index name in the remote request
that the remote source supports, doing the URL escape when sending the
request.

Related to #23533
2019-04-01 19:58:06 +02:00
Boaz Leskes 033ba725af
Remove support for internal versioning for concurrency control (#38254)
Elasticsearch has long [supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) compare and set (a.k.a optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometime do the wrong thing. Here's the relevant excerpt from the resiliency status page:

> When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document 

We recently [introduced](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/optimistic-concurrency-control.html) a new sequence number based approach that doesn't suffer from this dirty reads problem. 

This commit removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach.

Relates to #1078
2019-02-05 20:53:35 +01:00
Henning Andersen 68ed72b923
Handle scheduler exceptions (#38014)
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

This is a continuation of #28667, #36137 and also fixes #37708.
2019-01-31 17:51:45 +01:00
Boaz Leskes 91d7050a5b remove unused parser fields in RemoteResponseParsers 2019-01-31 15:27:42 +01:00
Tim Vernum a8596de31f
Introduce ssl settings to reindex from remote (#37527)
Adds reindex.ssl.* settings for reindex from remote.

This uses the ssl-config/ internal library to parse and load SSL
configuration and files. This is applied when using the low level
rest client to connect to a remote ES node

Relates: #37287
Resolves: #29755
2019-01-31 18:06:05 +11:00
Boaz Leskes 218df3009a
Move update and delete by query to use seq# for optimistic concurrency control (#37857)
The delete and update by query APIs both offer protection against overriding concurrent user changes to the documents they touch. They currently are using internal versioning. This PR changes that to rely on sequences numbers and primary terms.

Relates #37639 
Relates #36148 
Relates #10708
2019-01-29 10:23:05 -05:00
Christoph Büscher 046f86f274
Deprecate use of type in reindex request body (#36823)
Types can be used both in the source and dest section of the body which will
be translated to search and index requests respectively. Adding a deprecation warning
for those cases and removing examples using more than one type in reindex since
support for this is going to be removed.
2019-01-03 10:29:14 +01:00
Michael Basnight a64fea10e2
Enable IPv6 URIs in reindex from remote (#36874)
Reindex from remote was using a custom regex to dermine what URIs were
valid. This commit removes the custom regex and uses the java.net.URI
class instead, allowing IPv6 support without changing the existing
validation around a URI in reindex from remote.
2018-12-20 13:48:35 -06:00
Luca Cavanna 8f04536a35
Add copy constructor to SearchRequest (#36641)
For cross cluster search alternate execution mode (see #32125), we will need to take a search request that spans across multiple clusters (based on index prefixes e.g. cluster1:index, cluster2:index etc.) and split it into multiple search requests to be sent to each cluster. A copy constructor added to `SearchRequest` would make that easy and well maintainable in the future.

Something along the same lines already happens in `BulkByScrollParallelizationHelper`, but the corresponding code went outdated as some new fields were added to `SearchRequest` which were not added to the bulk by scroll code. A copy constructor helps making the task of copying a search request maintainable over time.
2018-12-14 18:30:29 +01:00
Jim Ferenczi 18866c4c0b
Make hits.total an object in the search response (#35849)
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).

Relates #33028
2018-12-05 19:49:06 +01:00
Martijn van Groningen 11935cd480
Replace Streamable w/ Writeable in BaseTasksResponse and subclasses (#36176)
This commit replaces usages of Streamable with Writeable for the
BaseTasksResponse / TransportTasksAction classes and subclasses of
these classes.

Note that where possible response fields were made final.

Relates to #34389
2018-12-05 13:14:10 +01:00
Martijn van Groningen 43773a32a4
Replace Streamable w/ Writeable in BaseTasksRequest and subclasses (#35854)
* Replace Streamable w/ Writeable in BaseTasksRequest and subclasses

This commit replaces usages of Streamable with Writeable for the
BaseTasksRequest / TransportTasksAction classes and subclasses of
these classes.

Relates to #34389
2018-12-03 08:04:29 +01:00
Alpar Torok e0a678f0c4
Remove version.qualified from MainResponse (#35412)
The fully qualified version will be returned as `version.number`
2018-11-29 08:41:39 +02:00
Jim Ferenczi a5f5ceb869
Remove remaining line length violations for o.e.index (#35652)
This commit removes line length violations in the classes under
org.elasticsearch.index.
2018-11-20 08:09:14 +01:00
Alexander Reelsen 409050e8de
Refactor: Remove settings from transport action CTOR (#35208)
As settings are not used in the transport action constructor, this
removes the passing of the settings in all the transport actions.
2018-11-05 13:08:18 +01:00
Ryan Ernst be8475955e
Scripting: Use ParameterMap for deprecated ctx var in update scripts (#34065)
This commit removes the sysprop controlling whether ctx is in params for
update scripts and replaces it with use of the new ParameterMap, which
outputs a deprecation warning whenever params.ctx is used.
2018-09-25 22:08:02 -07:00
Nik Everett f963c29876
Logging: Drop Settings from some logger lookups (#33859)
Drops `Settings` from some of the methods to lookup loggers and
deprecates another logger lookup that takes `Settings` because
`Settings` is no longer required to build a logger.
2018-09-20 10:42:48 -04:00
Sohaib Iftikhar 761e8c461f HLRC: Add delete by query API (#32782)
Adds the delete-by-query API to the High Level REST Client.
2018-09-04 08:56:26 -04:00
Sohaib Iftikhar 389bf67275 HLREST: add update by query API (#32760)
Adds update by query to the high level rest client.
2018-09-02 15:15:00 -04:00