Commit Graph

258 Commits

Author SHA1 Message Date
Nik Everett 2e48fb8294 Move delete by query helpers into core (#22810)
This moves the building blocks for delete by query into core. This
should enabled two thigns:
1. Plugins other than reindex to implement "bulk by scroll" style
operations.
2. Plugins to directly call delete by query. Those plugins should
be careful to make sure that task cancellation still works, but
this should be possible.

Notes:
1. I've mostly just moved classes and moved around tests methods.
2. I haven't been super careful about cohesion between these core
classes and reindex. They are quite interconnected because I wanted
to make the change as mechanical as possible.

Closes #22616
2017-01-27 16:09:18 -05:00
Chris Earle f0f75b187a Support Preemptive Authentication with RestClient (#21336)
This adds the necessary `AuthCache` needed to support preemptive authorization. By adding every host to the cache, the automatically added `RequestAuthCache` interceptor will add credentials on the first pass rather than waiting to do it after _each_ anonymous request is rejected (thus always sending everything twice when basic auth is required).
2017-01-24 11:34:05 -05:00
Jim Ferenczi e48bc2eed7 Add field collapsing for search request (#22337)
* Add top hits collapsing to search request

The field collapsing is done with a custom top docs collector that "collapse" search hits with same field value.
The distributed aspect is resolve using the two passes that the regular search uses. The first pass "collapse" the top hits, then the coordinating node merge/collapse the top hits from each shard.

```
GET _search
{
   "collapse": {
      "field": "category",
   }
}
```

This change also adds an ExpandCollapseSearchResponseListener that intercepts the search response and expands collapsed hits using the CollapseBuilder#innerHit} options.
The retrieval of each inner_hits is done by sending a query to all shards filtered by the collapse key.

```
GET _search
{
   "collapse": {
      "field": "category",
      "inner_hits": {
	"size": 2
      }
   }
}
```
2017-01-23 16:33:51 +01:00
Nik Everett 6265ef1c1b Deguice rest handlers (#22575)
There are presently 7 ctor args used in any rest handlers:
* `Settings`: Every handler uses it to initialize a logger and
  some other strange things.
* `RestController`: Every handler registers itself with it.
* `ClusterSettings`: Used by `RestClusterGetSettingsAction` to
  render the default values for cluster settings.
* `IndexScopedSettings`: Used by `RestGetSettingsAction` to get
  the default values for index settings.
* `SettingsFilter`: Used by a few handlers to filter returned
  settings so we don't expose stuff like passwords.
* `IndexNameExpressionResolver`: Used by `_cat/indices` to
  filter the list of indices.
* `Supplier<DiscoveryNodes>`: Used to fill enrich the response
  by handlers that list tasks.

We probably want to reduce these arguments over time but
switching construction away from guice gives us tighter
control over the list of available arguments.

These parameters are passed to plugins using
`ActionPlugin#initRestHandlers` which is expected to build and
return that handlers immediately. This felt simpler than
returning an reference to the ctors given all the different
possible args.

Breaks java plugins by moving rest handlers off of guice.
2017-01-20 11:48:51 -05:00
Tim Brooks a10aa8aade Add TestWithDependenciesPlugin to build (#22646)
This commit adds a MessyRestTestPlugin to the gradle build. It extends 
StandaloneRestTestPlugin. The main piece of functionality that it adds 
is to copy plugin-metadata from dependencies into the 
generated-resources for the current test source. This is necessary to 
ensure that permissions for dependencies are applied when running the 
tests.

A current limitation is that the permissions are applied differently 
than in the distribution sources. When permissions are granted to all 
depedencies for a module or plugin, the permissions are granted to all 
dependencies on the classpath for tests besides a few hardcoded 
exclusions:
- es core
- es test framework
- lucene test framework
- randomized runner
- junit library
2017-01-19 09:43:53 -06:00
Nik Everett ee5f8c4522 Consolidate some reindex utility classes (#22666)
Everything that extended `AbstractAsyncBulkByScrollAction` also
extended `AbstractAsyncBulkIndexByScrollAction` so this removes
`AbstractAsyncBulkIndexByScrollAction`, merging it into
`AbstractAsyncBulkByScrollAction`.
2017-01-18 16:58:39 -05:00
Nik Everett 1fe74a6b4b Better error when can't auto create index (#22488)
Changes the error message when `action.auto_create_index` or
`index.mapper.dynamic` forbids automatic creation of an index
from `no such index` to one of:
* `no such index and [action.auto_create_index] is [false]`
* `no such index and [index.mapper.dynamic] is [false]`
* `no such index and [action.auto_create_index] contains [-<pattern>] which forbids automatic creation of the index`
* `no such index and [action.auto_create_index] ([all patterns]) doesn't match`

This should make it more clear *why* there is `no such index`.

Closes #22435
2017-01-18 15:18:32 -05:00
Simon Willnauer 24e2847af2 Streamline foreign stored context restore and allow to perserve response headers (#22677)
Today we do not preserve response headers if they are present on a transport protocol
response. While preserving these headers is not always desired, in the most cases we
should pass on these headers to have consistent results for depreciation headers etc.
yet, this hasn't been much of a problem since most of the deprecations are detected early
ie. on the coordinating node such that this bug wasn't uncovered until #22647

This commit allow to optionally preserve headers when a context is restored and also streamlines
the context restore since it leaked frequently into the callers thread context when the callers
context wasn't restored again.
2017-01-18 16:17:54 +01:00
Igor Motov 500548fcda Remove taskManager.registerChildTask
Instead of forcing each task to register all nodes where its children are running, this commit runs cancellation on all nodes. The task cancellation operation doesn't run too frequently, so this optimization doesn't seem to be worth additional complexity of the interface.
2017-01-17 18:07:31 -05:00
Tanguy Leroux f5542ed47f Simplify ElasticsearchException rendering as a XContent (#22611)
This commit tries to simplify the way ElasticsearchException are rendered to xcontent. It adds some documentation and renames and merges some methods. Current behavior is preserved, the goal is to be more readable and centralize everything in the ElasticsearchException class.
2017-01-17 15:44:49 +01:00
Alexander Reelsen f6ee6e420b Indexing: Add shard id to indexing operation listener (#22606)
The IndexingOperationListener interface did not provide any
information about the shard id when a document was indexed.

This commit adds the shard id as the first parameter to all methods
in the IndexingOperationListener.
2017-01-16 09:08:16 +01:00
Zachary Tong 18fdc39b8c Increase visibility of doExecute so it can be used directly (#22614) 2017-01-13 09:42:02 -05:00
javanna 64c3212fdb Remove ParseFieldMatcher usages from IndexSettings 2017-01-12 14:43:35 +01:00
javanna 8072f168a3 Remove ParseFieldMatcher usages from QueryParseContext 2017-01-12 14:43:35 +01:00
Luca Cavanna 0f7d52df68 Remove some more ParseFieldMatcher usages (#22571) 2017-01-12 10:04:10 +01:00
Nik Everett 25a5f1869a Improve error message when reindex-from-remote gets bad json (#22536)
Adds a message about how the remote is unlikely to be Elasticsearch.
This isn't as good as including the whole message from the remote but
we can't do that because we are stream parsing it and we don't want
to mark the whole request.

Closes #22330
2017-01-11 12:55:23 -05:00
Nik Everett abb7d7841f Remove SearchRequestParsers (#22538)
It is empty now that we've moved all the parsing into `namedObject`.
2017-01-11 10:28:14 -05:00
Luca Cavanna 0f391336f5 Clean up SearchShardTarget (#22468)
* unify shard target setter

* Remove indexText member from SearchShardTarget

* Remove duplicated indexName getter from SearchShardTarget

* Remove duplicated shardId getter from SearchShardTarget

* Remove duplicated nodeIde getter from SearchShardTarget

* Rename SearchShardTarget#nodeIdText getter to getNodeIdText

* Remove unused InternalSearchHit#internalSourceRef unused method

* Remove unused InternalSearchHit#internalHighlightFields unused method

* Make SearchShardTarget members final
2017-01-11 10:08:31 +01:00
Nik Everett b71b8acf59 Remove ClusterService from ctors in reindex (#22539)
Moves fetching the local node id into `NodeClient` which is a
fairly useful place to put it so you can generate task ids from
`NodeClient#executeLocally`.
2017-01-10 18:26:06 -05:00
Nik Everett 78bb56671e Fix reindex from remote clearing scroll (#22525)
Reindex-from-remote had a race when it tried to clear the scroll. It
first starts the request to clear the scroll and then submits a task
to the generic threadpool to shutdown the client. These two things
race and, in my experience, closing the scroll generally loses. That
means that most of the time reindex-from-remote isn't clearing the
scrolls that it uses. This isn't the end of the world because we
flush old scroll contexts after a while but this isn't great.

Noticed while experimenting with #22514.
2017-01-10 10:30:23 -05:00
Nik Everett 5ef78fd015 Fix source filtering in reindex-from-remote (#22514)
Reindex-from-remote was accepting source filtering in the request
but ignoring it and setting `_source=true` on the search URI. This
fixes the filtering so it is piped through to the remote node and
adds tests for that.

Closes #22507
2017-01-10 09:00:12 -05:00
Nik Everett 3fb9254b95 Replace Suggesters with namedObject (#22491)
Removes another parser registery type thing in favor of
`XContentParser#namedObject`.
2017-01-09 16:51:08 -05:00
Nik Everett 057194f9ab Fix test under windows
Silly `\r`.
2017-01-09 16:29:59 -05:00
Nik Everett e3f77b4795 Replace AggregatorParsers with namedObject (#22397)
Removes `AggregatorParsers`, replacing all of its functionality with
`XContentParser#namedObject`.

This is the third bit of payoff from #22003, one less thing to pass
around the entire application.
2017-01-09 13:59:38 -05:00
Nik Everett fc1f7c2147 Remove content-type detection from reindex-from-remote (#22504)
If the remote doesn't return a content type then reindex
tried to guess the content-type. This didn't work most
of the time and produced a rather useless error message.
Given that Elasticsearch always returns the content-type
we are dropping content-type detection in favor of just
failing the request if the remote didn't return a content-type.

Closes #22329
2017-01-09 11:50:20 -05:00
Nik Everett f4884e0726 Replace SearchExtRegistry with namedObject (#22492)
This is one of the last things in `SearchRequestParsers`.
2017-01-09 08:35:54 -05:00
javanna 6102523033 remove ParseFieldMatcher usages from Script parsing code 2017-01-05 19:33:04 +01:00
javanna cd6b569286 Remove some usages of ParseFieldMatcher in favour of using ParseField directly
Relates to #19552
Relates to #22130
2016-12-31 09:24:44 +01:00
javanna df2acb3d9d Remove some more usages of ParseFieldMatcher in favour of using ParseField directly
Relates to #19552
Relates to #22130
2016-12-30 18:57:47 +01:00
Nik Everett f5f2149ff2 Remove much ceremony from parsing client yaml test suites (#22311)
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.

I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.
2016-12-22 11:00:34 -05:00
Jason Tedor 7946396fe6 Introduce translog no-op
As the translog evolves towards a full operations log as part of the
sequence numbers push, there is a need for the translog to be able to
represent operations for which a sequence number was assigned, but the
operation did not mutate the index. Examples of how this can arise are
operations that fail after the sequence number is assigned, and gaps in
this history that arise when an operation is assigned a sequence number
but the operation never completed (e.g., a node crash). It is important
that these operations appear in the history so that they can be
replicated and replayed during recovery as otherwise the history will be
incomplete and local checkpoints will not be able to advance. This
commit introduces a no-op to the translog to set the stage for these
efforts.

Relates #22291
2016-12-21 23:08:16 -05:00
Nik Everett 567c65b0d5 Replace IndicesQueriesRegistry (#22289)
* Switch query parsing to namedObject
* Remove IndicesQueriesRegistry
2016-12-21 09:05:14 -05:00
Nik Everett a04dcfb95b Introduce XContentParser#namedObject (#22003)
Introduces `XContentParser#namedObject which works a little like
`StreamInput#readNamedWriteable`: on startup components register
parsers under names and a superclass. At runtime we look up the
parser and call it to parse the object.

Right now the parsers take a context object they use to help with
the parsing but I hope to be able to eliminate the need for this
context as most what it is used for at this point is to move
around parser registries which should be replaced by this method
eventually. I make no effort to do so in this PR because it is
big enough already. This is meant to the a start down a road that
allows us to remove classes like `QueryParseContext`,
`AggregatorParsers`, `IndicesQueriesRegistry`, and
`ParseFieldRegistry`.

The goal here is to reduce the amount of plumbing required to
allow parsing pluggable things. With this you don't have to pass
registries all over the place. Instead you must pass a super
registry to fewer places and use it to wrap the reader. This is
the same tradeoff that we use for NamedWriteable and it allows
much, much simpler binary serialization. We think we want that
same thing for xcontent serialization.

The only parsing actually converted to this method is parsing
`ScoreFunctions` inside of `FunctionScoreQuery`. I chose this
because it is relatively self contained.
2016-12-20 11:05:24 -05:00
Nik Everett 73320566c1 Reindex test: catch exception name instead of reason
It looks like the exception reason can differ in different default
locales, so the build would fail in any non-English locale. This
switches the catch to the name of the exception which shouldn't
vary.
2016-12-20 10:00:14 -05:00
Nik Everett 8de4be9e4d Reinex test: don't fail if iis is running on port 0 2016-12-19 16:44:08 -05:00
Nik Everett 2d71ced221 Properly fail reindex-from-remote if can't detect content type 2016-12-19 12:51:38 -05:00
Nik Everett 749039ad4f Consolidate the last easy parser construction (#22095)
Moves the last of the "easy" parser construction into
`RestRequest`, this time with a new method
`RestRequest#contentParser`. The rest of the production
code that builds `XContentParser` isn't "easy" because it is
exposed in the Transport Client API (a Builder) object.
2016-12-14 15:41:25 -05:00
Nik Everett 872984d21a Continue consolidating `XContentParser` construction in tests (#22145)
Consolidate more parser creation in tests

Moves more parser creation in tests to the `createParser` methods
in `ESTestCase`.
2016-12-13 17:22:39 -05:00
Nik Everett 3adefb7b4a Begin centralizing XContentParser creation into RestRequest (#22041)
To get #22003 in cleanly we need to centralize as much `XContentParser` creation as possible into `RestRequest`. That'll mean we have to plumb the `NamedXContentRegistry` into fewer places.

This removes `RestAction.hasBody`, `RestAction.guessBodyContentType`, and `RestActions.getRestContent`, moving callers over to `RestRequest.hasContentOrSourceParam`, `RestRequest.contentOrSourceParam`, and `RestRequest.contentOrSourceParamParser` and `RestRequest.withContentOrSourceParamParserOrNull`. The idea is to use `withContentOrSourceParamParserOrNull` if you need to handle requests without any sort of body content and to use `contentOrSourceParamParser` otherwise.

I believe the vast majority of this PR to be purely mechanical but I know I've made the following behavioral change (I'll add more if I think of more):
* If you make a request to an endpoint that requires a request body and has cut over to the new APIs instead of getting `Failed to derive xcontent` you'll get `Body required`.
* Template parsing is now non-strict by default. This is important because we need to be able to deprecate things without requests failing.
2016-12-09 20:23:02 -05:00
Nik Everett fc2060ba7e Don't close rest client from its callback (#22061)
If you try to close the rest client inside one of its callbacks then
it blocks itself. The thread pool switches the status to one that
requests a shutdown and then waits for the pool to shutdown. When
another thread attempts to honor the shutdown request it waits
for all the threads in the pool to finish what they are working on.
Thus thread a is waiting on thread b while thread b is waiting
on thread a. It isn't quite that simple, but it is close.

Relates to #22027
2016-12-09 10:39:51 -05:00
Adrien Grand 36f598138a Start using `ObjectParser` for aggs. (#22048)
This is an attempt to start moving aggs parsing to `ObjectParser`. There is
still A LOT to do, but ObjectParser is way better than the way aggregations
parsing works today. For instance in most cases, we reject numbers that are
provided as strings, which we are supposed to accept since some client languages
(looking at you Perl) cannot make sure to use the appropriate types.

Relates to #22009
2016-12-09 09:45:16 +01:00
Ryan Ernst b1cef5fdf8 Remove 2.0 prerelease version constants (#22004)
* Remove 2.0 prerelease version constants

This is a start to addressing #21887. This removes:
* pre 2.0 snapshot format support
* automatic units addition to cluster settings
* bwc check for delete by query in pre 2.0 indexes
2016-12-08 21:48:35 -08:00
Nik Everett ef83dbfbe6 Reindex: Better error message for pipeline in wrong place (#21985)
`_update_by_query` supports specifying the `pipeline` to process the
documents as a url parameter but `_reindex` doesn't. It doesn't because
everything about the `_reindex` request that has to do with writing
the documents is grouped under the `dest` object in the request body.
This changes the response parameter from
`request [_reindex] contains unrecognized parameter: [pipeline]` to
`_reindex doesn't support [pipeline] as a query parmaeter. Specify it in the [dest] object instead.`
2016-12-06 14:55:46 -05:00
Ryan Ernst c8f241f284 Plugins: Remove response action filters (#21950)
Action filters currently have the ability to filter both the request and
response. But the response side was not actually used. This change
removes support for filtering responses with action filters.
2016-12-05 16:14:04 -08:00
Nik Everett 2087234d74 Timeout improvements for rest client and reindex (#21741)
Changes the default socket and connection timeouts for the rest
client from 10 seconds to the more generous 30 seconds.

Defaults reindex-from-remote to those timeouts and make the
timeouts configurable like so:
```
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://otherhost:9200",
      "socket_timeout": "1m",
      "connect_timeout": "10s"
    },
    "index": "source",
    "query": {
      "match": {
        "test": "data"
      }
    }
  },
  "dest": {
    "index": "dest"
  }
}
```

Closes #21707
2016-12-05 10:54:51 -05:00
Igor Motov c391b3fff6 Add proper descriptions to reindex, update-by-query and delete-by-query tasks.
Related to #21768
2016-12-02 21:46:38 -05:00
Nik Everett 0c724b1878 Keep context during reindex's retries (#21941)
* Keep context during reindex's retries

This fixes reindex and friend's retries to keep the context.

* Docs
2016-12-02 13:48:51 -05:00
Jason Tedor 6c45695d52 Add version 5.1.1
This commit removes the version constant for 5.1.0 (due to an
inadvertent release) and adds the version constant for 5.1.1.

Relates #21890
2016-11-30 11:14:17 -05:00
Luca Cavanna 5b8bdba12e Remove subrequests method from CompositeIndicesRequest (#21873) 2016-11-30 15:03:58 +01:00
Adrien Grand 6231009a8f Remove 2.x backward compatibility of mappings. (#21670)
For the record, I also had to remove the geo-hash cell and geo-distance range
queries to make the code compile. These queries already throw an exception in
all cases with 5.x indices, so that does not hurt any more.

I also had to rename all 2.x bwc indices from `index-${version}` to
`unsupported-${version}` to make `OldIndexBackwardCompatibilityIT`
happy.
2016-11-30 13:34:46 +01:00