Commit Graph

451 Commits

Author SHA1 Message Date
Jim Ferenczi e48bc2eed7 Add field collapsing for search request (#22337)
* Add top hits collapsing to search request

The field collapsing is done with a custom top docs collector that "collapse" search hits with same field value.
The distributed aspect is resolve using the two passes that the regular search uses. The first pass "collapse" the top hits, then the coordinating node merge/collapse the top hits from each shard.

```
GET _search
{
   "collapse": {
      "field": "category",
   }
}
```

This change also adds an ExpandCollapseSearchResponseListener that intercepts the search response and expands collapsed hits using the CollapseBuilder#innerHit} options.
The retrieval of each inner_hits is done by sending a query to all shards filtered by the collapse key.

```
GET _search
{
   "collapse": {
      "field": "category",
      "inner_hits": {
	"size": 2
      }
   }
}
```
2017-01-23 16:33:51 +01:00
Nik Everett 6265ef1c1b Deguice rest handlers (#22575)
There are presently 7 ctor args used in any rest handlers:
* `Settings`: Every handler uses it to initialize a logger and
  some other strange things.
* `RestController`: Every handler registers itself with it.
* `ClusterSettings`: Used by `RestClusterGetSettingsAction` to
  render the default values for cluster settings.
* `IndexScopedSettings`: Used by `RestGetSettingsAction` to get
  the default values for index settings.
* `SettingsFilter`: Used by a few handlers to filter returned
  settings so we don't expose stuff like passwords.
* `IndexNameExpressionResolver`: Used by `_cat/indices` to
  filter the list of indices.
* `Supplier<DiscoveryNodes>`: Used to fill enrich the response
  by handlers that list tasks.

We probably want to reduce these arguments over time but
switching construction away from guice gives us tighter
control over the list of available arguments.

These parameters are passed to plugins using
`ActionPlugin#initRestHandlers` which is expected to build and
return that handlers immediately. This felt simpler than
returning an reference to the ctors given all the different
possible args.

Breaks java plugins by moving rest handlers off of guice.
2017-01-20 11:48:51 -05:00
Nik Everett ee5f8c4522 Consolidate some reindex utility classes (#22666)
Everything that extended `AbstractAsyncBulkByScrollAction` also
extended `AbstractAsyncBulkIndexByScrollAction` so this removes
`AbstractAsyncBulkIndexByScrollAction`, merging it into
`AbstractAsyncBulkByScrollAction`.
2017-01-18 16:58:39 -05:00
Nik Everett 1fe74a6b4b Better error when can't auto create index (#22488)
Changes the error message when `action.auto_create_index` or
`index.mapper.dynamic` forbids automatic creation of an index
from `no such index` to one of:
* `no such index and [action.auto_create_index] is [false]`
* `no such index and [index.mapper.dynamic] is [false]`
* `no such index and [action.auto_create_index] contains [-<pattern>] which forbids automatic creation of the index`
* `no such index and [action.auto_create_index] ([all patterns]) doesn't match`

This should make it more clear *why* there is `no such index`.

Closes #22435
2017-01-18 15:18:32 -05:00
Simon Willnauer 24e2847af2 Streamline foreign stored context restore and allow to perserve response headers (#22677)
Today we do not preserve response headers if they are present on a transport protocol
response. While preserving these headers is not always desired, in the most cases we
should pass on these headers to have consistent results for depreciation headers etc.
yet, this hasn't been much of a problem since most of the deprecations are detected early
ie. on the coordinating node such that this bug wasn't uncovered until #22647

This commit allow to optionally preserve headers when a context is restored and also streamlines
the context restore since it leaked frequently into the callers thread context when the callers
context wasn't restored again.
2017-01-18 16:17:54 +01:00
Igor Motov 500548fcda Remove taskManager.registerChildTask
Instead of forcing each task to register all nodes where its children are running, this commit runs cancellation on all nodes. The task cancellation operation doesn't run too frequently, so this optimization doesn't seem to be worth additional complexity of the interface.
2017-01-17 18:07:31 -05:00
Tanguy Leroux f5542ed47f Simplify ElasticsearchException rendering as a XContent (#22611)
This commit tries to simplify the way ElasticsearchException are rendered to xcontent. It adds some documentation and renames and merges some methods. Current behavior is preserved, the goal is to be more readable and centralize everything in the ElasticsearchException class.
2017-01-17 15:44:49 +01:00
Alexander Reelsen f6ee6e420b Indexing: Add shard id to indexing operation listener (#22606)
The IndexingOperationListener interface did not provide any
information about the shard id when a document was indexed.

This commit adds the shard id as the first parameter to all methods
in the IndexingOperationListener.
2017-01-16 09:08:16 +01:00
Zachary Tong 18fdc39b8c Increase visibility of doExecute so it can be used directly (#22614) 2017-01-13 09:42:02 -05:00
javanna 64c3212fdb Remove ParseFieldMatcher usages from IndexSettings 2017-01-12 14:43:35 +01:00
javanna 8072f168a3 Remove ParseFieldMatcher usages from QueryParseContext 2017-01-12 14:43:35 +01:00
Luca Cavanna 0f7d52df68 Remove some more ParseFieldMatcher usages (#22571) 2017-01-12 10:04:10 +01:00
Nik Everett 25a5f1869a Improve error message when reindex-from-remote gets bad json (#22536)
Adds a message about how the remote is unlikely to be Elasticsearch.
This isn't as good as including the whole message from the remote but
we can't do that because we are stream parsing it and we don't want
to mark the whole request.

Closes #22330
2017-01-11 12:55:23 -05:00
Nik Everett abb7d7841f Remove SearchRequestParsers (#22538)
It is empty now that we've moved all the parsing into `namedObject`.
2017-01-11 10:28:14 -05:00
Luca Cavanna 0f391336f5 Clean up SearchShardTarget (#22468)
* unify shard target setter

* Remove indexText member from SearchShardTarget

* Remove duplicated indexName getter from SearchShardTarget

* Remove duplicated shardId getter from SearchShardTarget

* Remove duplicated nodeIde getter from SearchShardTarget

* Rename SearchShardTarget#nodeIdText getter to getNodeIdText

* Remove unused InternalSearchHit#internalSourceRef unused method

* Remove unused InternalSearchHit#internalHighlightFields unused method

* Make SearchShardTarget members final
2017-01-11 10:08:31 +01:00
Nik Everett b71b8acf59 Remove ClusterService from ctors in reindex (#22539)
Moves fetching the local node id into `NodeClient` which is a
fairly useful place to put it so you can generate task ids from
`NodeClient#executeLocally`.
2017-01-10 18:26:06 -05:00
Nik Everett 78bb56671e Fix reindex from remote clearing scroll (#22525)
Reindex-from-remote had a race when it tried to clear the scroll. It
first starts the request to clear the scroll and then submits a task
to the generic threadpool to shutdown the client. These two things
race and, in my experience, closing the scroll generally loses. That
means that most of the time reindex-from-remote isn't clearing the
scrolls that it uses. This isn't the end of the world because we
flush old scroll contexts after a while but this isn't great.

Noticed while experimenting with #22514.
2017-01-10 10:30:23 -05:00
Nik Everett 5ef78fd015 Fix source filtering in reindex-from-remote (#22514)
Reindex-from-remote was accepting source filtering in the request
but ignoring it and setting `_source=true` on the search URI. This
fixes the filtering so it is piped through to the remote node and
adds tests for that.

Closes #22507
2017-01-10 09:00:12 -05:00
Nik Everett 3fb9254b95 Replace Suggesters with namedObject (#22491)
Removes another parser registery type thing in favor of
`XContentParser#namedObject`.
2017-01-09 16:51:08 -05:00
Nik Everett 057194f9ab Fix test under windows
Silly `\r`.
2017-01-09 16:29:59 -05:00
Nik Everett e3f77b4795 Replace AggregatorParsers with namedObject (#22397)
Removes `AggregatorParsers`, replacing all of its functionality with
`XContentParser#namedObject`.

This is the third bit of payoff from #22003, one less thing to pass
around the entire application.
2017-01-09 13:59:38 -05:00
Nik Everett fc1f7c2147 Remove content-type detection from reindex-from-remote (#22504)
If the remote doesn't return a content type then reindex
tried to guess the content-type. This didn't work most
of the time and produced a rather useless error message.
Given that Elasticsearch always returns the content-type
we are dropping content-type detection in favor of just
failing the request if the remote didn't return a content-type.

Closes #22329
2017-01-09 11:50:20 -05:00
Nik Everett f4884e0726 Replace SearchExtRegistry with namedObject (#22492)
This is one of the last things in `SearchRequestParsers`.
2017-01-09 08:35:54 -05:00
javanna 6102523033 remove ParseFieldMatcher usages from Script parsing code 2017-01-05 19:33:04 +01:00
javanna cd6b569286 Remove some usages of ParseFieldMatcher in favour of using ParseField directly
Relates to #19552
Relates to #22130
2016-12-31 09:24:44 +01:00
javanna df2acb3d9d Remove some more usages of ParseFieldMatcher in favour of using ParseField directly
Relates to #19552
Relates to #22130
2016-12-30 18:57:47 +01:00
Nik Everett f5f2149ff2 Remove much ceremony from parsing client yaml test suites (#22311)
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.

I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.
2016-12-22 11:00:34 -05:00
Jason Tedor 7946396fe6 Introduce translog no-op
As the translog evolves towards a full operations log as part of the
sequence numbers push, there is a need for the translog to be able to
represent operations for which a sequence number was assigned, but the
operation did not mutate the index. Examples of how this can arise are
operations that fail after the sequence number is assigned, and gaps in
this history that arise when an operation is assigned a sequence number
but the operation never completed (e.g., a node crash). It is important
that these operations appear in the history so that they can be
replicated and replayed during recovery as otherwise the history will be
incomplete and local checkpoints will not be able to advance. This
commit introduces a no-op to the translog to set the stage for these
efforts.

Relates #22291
2016-12-21 23:08:16 -05:00
Nik Everett 567c65b0d5 Replace IndicesQueriesRegistry (#22289)
* Switch query parsing to namedObject
* Remove IndicesQueriesRegistry
2016-12-21 09:05:14 -05:00
Nik Everett a04dcfb95b Introduce XContentParser#namedObject (#22003)
Introduces `XContentParser#namedObject which works a little like
`StreamInput#readNamedWriteable`: on startup components register
parsers under names and a superclass. At runtime we look up the
parser and call it to parse the object.

Right now the parsers take a context object they use to help with
the parsing but I hope to be able to eliminate the need for this
context as most what it is used for at this point is to move
around parser registries which should be replaced by this method
eventually. I make no effort to do so in this PR because it is
big enough already. This is meant to the a start down a road that
allows us to remove classes like `QueryParseContext`,
`AggregatorParsers`, `IndicesQueriesRegistry`, and
`ParseFieldRegistry`.

The goal here is to reduce the amount of plumbing required to
allow parsing pluggable things. With this you don't have to pass
registries all over the place. Instead you must pass a super
registry to fewer places and use it to wrap the reader. This is
the same tradeoff that we use for NamedWriteable and it allows
much, much simpler binary serialization. We think we want that
same thing for xcontent serialization.

The only parsing actually converted to this method is parsing
`ScoreFunctions` inside of `FunctionScoreQuery`. I chose this
because it is relatively self contained.
2016-12-20 11:05:24 -05:00
Nik Everett 73320566c1 Reindex test: catch exception name instead of reason
It looks like the exception reason can differ in different default
locales, so the build would fail in any non-English locale. This
switches the catch to the name of the exception which shouldn't
vary.
2016-12-20 10:00:14 -05:00
Nik Everett 8de4be9e4d Reinex test: don't fail if iis is running on port 0 2016-12-19 16:44:08 -05:00
Nik Everett 2d71ced221 Properly fail reindex-from-remote if can't detect content type 2016-12-19 12:51:38 -05:00
Nik Everett 749039ad4f Consolidate the last easy parser construction (#22095)
Moves the last of the "easy" parser construction into
`RestRequest`, this time with a new method
`RestRequest#contentParser`. The rest of the production
code that builds `XContentParser` isn't "easy" because it is
exposed in the Transport Client API (a Builder) object.
2016-12-14 15:41:25 -05:00
Nik Everett 872984d21a Continue consolidating `XContentParser` construction in tests (#22145)
Consolidate more parser creation in tests

Moves more parser creation in tests to the `createParser` methods
in `ESTestCase`.
2016-12-13 17:22:39 -05:00
Nik Everett 3adefb7b4a Begin centralizing XContentParser creation into RestRequest (#22041)
To get #22003 in cleanly we need to centralize as much `XContentParser` creation as possible into `RestRequest`. That'll mean we have to plumb the `NamedXContentRegistry` into fewer places.

This removes `RestAction.hasBody`, `RestAction.guessBodyContentType`, and `RestActions.getRestContent`, moving callers over to `RestRequest.hasContentOrSourceParam`, `RestRequest.contentOrSourceParam`, and `RestRequest.contentOrSourceParamParser` and `RestRequest.withContentOrSourceParamParserOrNull`. The idea is to use `withContentOrSourceParamParserOrNull` if you need to handle requests without any sort of body content and to use `contentOrSourceParamParser` otherwise.

I believe the vast majority of this PR to be purely mechanical but I know I've made the following behavioral change (I'll add more if I think of more):
* If you make a request to an endpoint that requires a request body and has cut over to the new APIs instead of getting `Failed to derive xcontent` you'll get `Body required`.
* Template parsing is now non-strict by default. This is important because we need to be able to deprecate things without requests failing.
2016-12-09 20:23:02 -05:00
Nik Everett fc2060ba7e Don't close rest client from its callback (#22061)
If you try to close the rest client inside one of its callbacks then
it blocks itself. The thread pool switches the status to one that
requests a shutdown and then waits for the pool to shutdown. When
another thread attempts to honor the shutdown request it waits
for all the threads in the pool to finish what they are working on.
Thus thread a is waiting on thread b while thread b is waiting
on thread a. It isn't quite that simple, but it is close.

Relates to #22027
2016-12-09 10:39:51 -05:00
Adrien Grand 36f598138a Start using `ObjectParser` for aggs. (#22048)
This is an attempt to start moving aggs parsing to `ObjectParser`. There is
still A LOT to do, but ObjectParser is way better than the way aggregations
parsing works today. For instance in most cases, we reject numbers that are
provided as strings, which we are supposed to accept since some client languages
(looking at you Perl) cannot make sure to use the appropriate types.

Relates to #22009
2016-12-09 09:45:16 +01:00
Ryan Ernst b1cef5fdf8 Remove 2.0 prerelease version constants (#22004)
* Remove 2.0 prerelease version constants

This is a start to addressing #21887. This removes:
* pre 2.0 snapshot format support
* automatic units addition to cluster settings
* bwc check for delete by query in pre 2.0 indexes
2016-12-08 21:48:35 -08:00
Nik Everett ef83dbfbe6 Reindex: Better error message for pipeline in wrong place (#21985)
`_update_by_query` supports specifying the `pipeline` to process the
documents as a url parameter but `_reindex` doesn't. It doesn't because
everything about the `_reindex` request that has to do with writing
the documents is grouped under the `dest` object in the request body.
This changes the response parameter from
`request [_reindex] contains unrecognized parameter: [pipeline]` to
`_reindex doesn't support [pipeline] as a query parmaeter. Specify it in the [dest] object instead.`
2016-12-06 14:55:46 -05:00
Ryan Ernst c8f241f284 Plugins: Remove response action filters (#21950)
Action filters currently have the ability to filter both the request and
response. But the response side was not actually used. This change
removes support for filtering responses with action filters.
2016-12-05 16:14:04 -08:00
Nik Everett 2087234d74 Timeout improvements for rest client and reindex (#21741)
Changes the default socket and connection timeouts for the rest
client from 10 seconds to the more generous 30 seconds.

Defaults reindex-from-remote to those timeouts and make the
timeouts configurable like so:
```
POST _reindex
{
  "source": {
    "remote": {
      "host": "http://otherhost:9200",
      "socket_timeout": "1m",
      "connect_timeout": "10s"
    },
    "index": "source",
    "query": {
      "match": {
        "test": "data"
      }
    }
  },
  "dest": {
    "index": "dest"
  }
}
```

Closes #21707
2016-12-05 10:54:51 -05:00
Igor Motov c391b3fff6 Add proper descriptions to reindex, update-by-query and delete-by-query tasks.
Related to #21768
2016-12-02 21:46:38 -05:00
Nik Everett 0c724b1878 Keep context during reindex's retries (#21941)
* Keep context during reindex's retries

This fixes reindex and friend's retries to keep the context.

* Docs
2016-12-02 13:48:51 -05:00
Jason Tedor 6c45695d52 Add version 5.1.1
This commit removes the version constant for 5.1.0 (due to an
inadvertent release) and adds the version constant for 5.1.1.

Relates #21890
2016-11-30 11:14:17 -05:00
Luca Cavanna 5b8bdba12e Remove subrequests method from CompositeIndicesRequest (#21873) 2016-11-30 15:03:58 +01:00
Adrien Grand 6231009a8f Remove 2.x backward compatibility of mappings. (#21670)
For the record, I also had to remove the geo-hash cell and geo-distance range
queries to make the code compile. These queries already throw an exception in
all cases with 5.x indices, so that does not hurt any more.

I also had to rename all 2.x bwc indices from `index-${version}` to
`unsupported-${version}` to make `OldIndexBackwardCompatibilityIT`
happy.
2016-11-30 13:34:46 +01:00
Jason Tedor 8416b16dfd Improve handling of unreleased versions
Today when handling unreleased versions for backwards compatilibity
support, we scatted version constants across the code base and add some
asserts to support removing these constants when the version in question
is actually released. This commit improves this situation, enabling us
to just add a single unreleased version constant that can be renamed
when the version is actually released. This should make maintenance of
these versions simpler.

Relates #21760
2016-11-23 15:49:05 -05:00
Ryan Ernst 6940b2b8c7 Remove groovy scripting language (#21607)
* Scripting: Remove groovy scripting language

Groovy was deprecated in 5.0. This change removes it, along with the
legacy default language infrastructure in scripting.
2016-11-22 19:24:12 -08:00
Jason Tedor b08a2e1f31 Expose executor service interface from thread pool
This commit exposes the executor service interface from thread
pool. This will enable some high-level concurrency primitives that will
make some code cleaner and simpler.

Relates #21608
2016-11-17 09:18:49 -05:00
Simon Willnauer de04aad994 Remove `modules/transport_netty_3` in favor of `netty_4` (#21590)
We kept `netty_3` as a fallback in the 5.x series but now that master
is 6.0 we don't need this or in other words all issues coming up with
netty 4 will be blockers for 6.0.
2016-11-17 12:44:42 +01:00
Boaz Leskes c9f49039d3 Merge remote-tracking branch 'upstream/master' into feature/seq_no 2016-11-15 10:14:47 +00:00
Ryan Ernst d14c470b89 Remove generics from ActionRequest
closes #21368
2016-11-14 15:32:01 -08:00
Jason Tedor d3417fb022 Merge branch 'master' into feature/seq_no
* master: (516 commits)
  Avoid angering Log4j in TransportNodesActionTests
  Add trace logging when aquiring and releasing operation locks for replication requests
  Fix handler name on message not fully read
  Remove accidental import.
  Improve log message in TransportNodesAction
  Clean up of Script.
  Update Joda Time to version 2.9.5 (#21468)
  Remove unused ClusterService dependency from SearchPhaseController (#21421)
  Remove max_local_storage_nodes from elasticsearch.yml (#21467)
  Wait for all reindex subtasks before rethrottling
  Correcting a typo-Maan to Man-in README.textile (#21466)
  Fix InternalSearchHit#hasSource to return the proper boolean value (#21441)
  Replace all index date-math examples with the URI encoded form
  Fix typos (#21456)
  Adapt ES_JVM_OPTIONS packaging test to ubuntu-1204
  Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431)
  Add VirtualBox version check (#21370)
  Export ES_JVM_OPTIONS for SysV init
  Skip reindex rethrottle tests with workers
  Make forbidden APIs be quieter about classpath warnings (#21443)
  ...
2016-11-10 23:40:33 -05:00
Jack Conradson aeb97ff412 Clean up of Script.
Closes #21321
2016-11-10 09:59:13 -08:00
Nik Everett 4db21db0aa Wait for all reindex subtasks before rethrottling
In the test for reindex and friend's rethrottling feature we were waiting
only for a single reindex sub task to start before rethrottling. This
mostly worked because starting tasks is fast. But it didn't *always work
and CI found that for us. This fixes the test to wait for all subtasks
to start before rethrottling.

I reproduced this locally semi-consistently with some fairly creative
`Thread.sleep` calls and this test fix fixes the issue even with the
sleeps so I'm fairly sure this will work consistently.

Closes #21446
2016-11-10 10:49:25 -05:00
Luca Cavanna bd23921a3a Fix InternalSearchHit#hasSource to return the proper boolean value (#21441)
The method used to be called `isSourceEmpty`, and was renamed to `hasSource`, but the return value never changed. Updated tests and users accordingly.

Closes #21419
2016-11-10 13:13:38 +01:00
Nik Everett b0f5ea3f59 Skip reindex rethrottle tests with workers
They are flakey and spuriously fail the build. I'll hunt down
the cause soon and reenabled but for now they should stop.

Relates #21446
2016-11-09 17:50:09 -05:00
Nik Everett a3bd6d1ad9 Switch reindex with slices error to IAE
If you try to reindex with multiple slices against a node that
doesn't support it we throw an `IllegalArgumentException` so
`assertVersionSerializable` is ok with it and so if this happens
in REST it comes back as a 400 error.
2016-11-08 11:42:07 -05:00
Luca Cavanna 293a3cab01 Rest client: don't reuse that same HttpAsyncResponseConsumer across multiple retries (#21378)
* Rest client: don't reuse that same HttpAsyncResponseConsumer across multiple retries

Turns out that AbstractAsyncResponseConsumer from apache async http client is stateful and cannot be reused across multiple requests. The failover mechanism was mistakenly reusing that same instance, which can be provided by users, across retries in case nodes are down or return 5xx errors. The downside is that we have to change the signature of two public methods, as HttpAsyncResponseConsumer cannot be provided directly anymore, rather its factory needs to be provided which is going to be used to create one instance of the consumer per request attempt.

Up until now we tested our RestClient against multiple nodes only in a mock environment, where we don't really send http requests. In that scenario we can verify that retries etc. work properly but the interaction with the http client library in a real scenario is different and can catch other problems. With this commit we also add an integration test that sends requests to multiple hosts, and some of them may also get stopped meanwhile. The specific test for pathPrefix was also removed as pathPrefix is now randomly applied by default, hence implicitly tested. Moved also a small test method that checked the validity of the path argument to the unit test RestClientSingleHostTests.

Also increase default buffer limit to 100MB and make it required in default consumer

The default buffer limit used to be 10MB but that proved not to be high enough for scroll requests (see reindex from remote). With this commit we increase the limit to 100MB and make it a bit more visibile in the consumer factory.
2016-11-08 16:42:42 +01:00
Ryan Ernst 7a2c984bcc Test: Remove multi process support from rest test runner (#21391)
At one point in the past when moving out the rest tests from core to
their own subproject, we had multiple test classes which evenly split up
the tests to run. However, we simplified this and went back to a single
test runner to have better reproduceability in tests. This change
removes the remnants of that multiplexing support.
2016-11-07 15:07:34 -08:00
Nik Everett a13a050271 Add automatic parallelization support to reindex and friends (#20767)
Adds support for `?slices=N` to reindex which automatically
parallelizes the process using parallel scrolls on `_uid`. Performance
testing sees a 3x performance improvement for simple docs
on decent hardware, maybe 30% performance improvement
for more complex docs. Still compelling, especially because
clusters should be able to get closer to the 3x than the 30%
number.

Closes #20624
2016-11-04 20:59:15 -04:00
Nik Everett a612e5988e Bump reindex-from-remote's buffer to 200mb
It was 10mb and that was causing trouble when folks reindex-from-remoted
with large documents.

We also improve the error reporting so it tells folks to use a smaller
batch size if they hit a buffer size exception. Finally, adds some docs
to reindex-from-remote mentioning the buffer and giving an example of
lowering the size.

Closes #21185
2016-11-01 13:19:28 -04:00
Jack Conradson 512a77a633 Refactor ScriptType to be a top-level class. 2016-10-26 10:21:22 -07:00
Nik Everett 18393a06f3 Fix reindex-from-remote for parent/child from <2.0
Versions before 2.0 needed to be told to return interesting fields
like `_parent`, `_routing`, `_ttl`, and `_timestamp`. And they come
back inside a `fields` block which we need to parse.

Closes #21044
2016-10-21 13:14:33 -04:00
Nik Everett b5da42905f Remove publishAddress from reindex whitelist
Removes the `publishAddress` parameter from the reindex-from-remote
whitelist checking because it isn't in use after #21004.
2016-10-20 12:51:10 -04:00
Nik Everett acf7c7430b Add "simple match" support for reindex-from-remote whitelist
This allows you to whitelist `localhost:*` or `127.0.10.*:9200`.
It explicitly checks for patterns like `*` in the whitelist and
refuses to start if the whitelist would match everything. Beyond
that the user is on their own designing a secure whitelist.
2016-10-18 21:47:21 -04:00
Areek Zillur 481f7909ae Merge branch 'master' into cleanup/transport_bulk 2016-10-11 16:04:47 -04:00
Areek Zillur 0e8b6532ec rename DocumentRequest to DocWriteRequest 2016-10-11 16:00:10 -04:00
Areek Zillur 396f80c963 Revert "rename DocumentRequest to DocumentWriteRequest"
This reverts commit b5079ce009.
2016-10-07 17:50:07 -04:00
Simon Willnauer 7452028e50 Simplify TransportAddress (#20798)
since TransportAddress is now final we can simplify it's interface a bit
and remove methods that are only used in tests or are plain delegates.
2016-10-07 15:56:54 +02:00
Simon Willnauer 194a6b1df0 Remove LocalTransport in favor of MockTcpTransport (#20695)
This change proposes the removal of all non-tcp transport implementations. The
mock transport can be used by default to run tests instead of local transport that has
roughly the same performance compared to TCP or at least not noticeably slower.

This is a master only change, deprecation notice in 5.x will be committed as a
separate change.
2016-10-07 11:27:47 +02:00
Areek Zillur b5079ce009 rename DocumentRequest to DocumentWriteRequest 2016-10-06 05:05:59 -04:00
Areek Zillur bd4a03a426 Merge branch 'master' into cleanup/transport_bulk 2016-10-04 14:06:17 -04:00
Jason Tedor 51d53791fe Remove lenient URL parameter parsing
Today when parsing a request, Elasticsearch silently ignores incorrect
(including parameters with typos) or unused parameters. This is bad as
it leads to requests having unintended behavior (e.g., if a user hits
the _analyze API and misspell the "tokenizer" then Elasticsearch will
just use the standard analyzer, completely against intentions).

This commit removes lenient URL parameter parsing. The strategy is
simple: when a request is handled and a parameter is touched, we mark it
as such. Before the request is actually executed, we check to ensure
that all parameters have been consumed. If there are remaining
parameters yet to be consumed, we fail the request with a list of the
unconsumed parameters. An exception has to be made for parameters that
format the response (as opposed to controlling the request); for this
case, handlers are able to provide a list of parameters that should be
excluded from tripping the unconsumed parameters check because those
parameters will be used in formatting the response.

Additionally, some inconsistencies between the parameters in the code
and in the docs are corrected.

Relates #20722
2016-10-04 12:45:29 -04:00
Areek Zillur 248ac240ed Merge branch 'master' into cleanup/transport_bulk 2016-10-03 16:12:11 -04:00
Jason Tedor 25fd9e26c4 Merge branch 'master' into feature/seq_no
* master: (1199 commits)
  [DOCS] Remove non-valid link to mapping migration document
  Revert "Default `include_in_all` for numeric-like types to false"
  test: add a test with ipv6 address
  docs: clearify that both ip4 and ip6 addresses are supported
  Include complex settings in settings requests
  Add production warning for pre-release builds
  Clean up confusing error message on unhandled endpoint
  [TEST] Increase logging level in testDelayShards()
  change health from string to enum (#20661)
  Provide error message when plugin id is missing
  Document that sliced scroll works for reindex
  Make reindex-from-remote ignore unknown fields
  Remove NoopGatewayAllocator in favor of a more realistic mock (#20637)
  Remove Marvel character reference from guide
  Fix documentation for setting Java I/O temp dir
  Update client benchmarks to log4j2
  Changes the API of GatewayAllocator#applyStartedShards and (#20642)
  Removes FailedRerouteAllocation and StartedRerouteAllocation
  IndexRoutingTable.initializeEmpty shouldn't override supplied primary RecoverySource (#20638)
  Smoke tester: Adjust to latest changes (#20611)
  ...
2016-09-29 00:22:31 +02:00
Nik Everett 370afa371b Make reindex-from-remote ignore unknown fields
reindex-from-remote should ignore unknown fields so it is mostly
future compatible. This makes it ignore unknown fields by adding an
option to `ObjectParser` and `ConstructingObjectParser` that, if
enabled, causes them to ignore unknown fields.

Closes #20504
2016-09-26 00:55:46 +02:00
Luca Cavanna 37489c3274 Add clusterUUID to RestMainAction output (#20503)
Add clusterUUID to RestMainAction output

GET / now returns the clusterUUID as well as part of its output for monitoring purposes
2016-09-15 16:25:17 +02:00
javanna 90ab460fcc move parsing of search ext sections to the coordinating node 2016-09-09 19:10:42 +02:00
javanna 536d13ff11 ProcessInfo to implement Writeable rather than Streamable 2016-09-02 10:23:05 +02:00
Martijn van Groningen a110498ad8 settings: Make `action.auto_create_index` setting a dynamic cluster setting.
Closes #7513
2016-09-01 12:33:30 +02:00
Jason Tedor 76ab02e002 Merge branch 'master' into log4j2
* master:
  Avoid NPE in LoggingListener
  Randomly use Netty 3 plugin in some tests
  Skip smoke test client on JDK 9
  Revert "Don't allow XContentBuilder#writeValue(TimeValue)"
  [docs] Remove coming in 2.0.0
  Don't allow XContentBuilder#writeValue(TimeValue)
  [doc] Remove leftover from CONSOLE conversion
  Parameter improvements to Cluster Health API wait for shards (#20223)
  Add 2.4.0 to packaging tests list
  Docs: clarify scale is applied at origin+offest (#20242)
2016-08-31 16:37:55 -04:00
Jason Tedor 54083f7d6e Randomly use Netty 3 plugin in some tests
When Netty 4 was introduced, it was not the default network
implementation. Some tests were constructed to randomly use Netty 4
instead of the default network implementation. When Netty 4 was made the
default implementation, these tests were not updated. Thus, these tests
are randomly choosing between the default network implementation (Netty
4) and Netty 4. This commit updates these tests to reverse the role of
Netty 3 and Netty 4 so that the randomization is choosing between Netty
3 and the default (again, now Netty 4).

Relates #20265
2016-08-31 15:41:39 -04:00
Jason Tedor abf8a1a3f0 Avoid allocating log parameterized messages
This commit modifies the call sites that allocate a parameterized
message to use a supplier so that allocations are avoided unless the log
level is fine enough to emit the corresponding log message.
2016-08-30 18:17:09 -04:00
Jason Tedor 7da0cdec42 Introduce Log4j 2
This commit introduces Log4j 2 to the stack.
2016-08-30 13:31:24 -04:00
Chris Earle bd0b06440e Add "Async" to the end of each Async RestClient method
This makes it much harder to accidentally miss the Response.
2016-08-26 10:51:33 -04:00
Jim Ferenczi 4682fc34ae Add the ability to disable the retrieval of the stored fields entirely
This change adds a special field named _none_ that allows to disable the retrieval of the stored fields in a search request or in a TopHitsAggregation.

To completely disable stored fields retrieval (including disabling metadata fields retrieval such as _id or _type) use _none_ like this:

````
POST _search
{
   "stored_fields": "_none_"
}
````
2016-08-24 16:40:08 +02:00
Clinton Gormley abc025e18b Fixed the reindex_rethrottle REST tests
The API was renamed from reindex.rethrottle to reindex_rethrottle
2016-08-24 14:55:02 +02:00
Areek Zillur 80ca78479f Make bulk item-level requests implement DocumentRequest interface
Currently, bulk item requests can be any ActionRequest, this commit
restricts bulk item requests to DocumentRequest. This simplifies
handling failures during bulk requests. Additionally, a new enum
is added to DocumentRequest to represent the intended operation
to be performed by a document request. Now, index operation type
also uses the new enum to specify whether the request should
create or index a document.
2016-08-23 10:33:37 -04:00
Nik Everett 312a7d45ba Wait for task to start in reindex test
`RethrottleTests#testReindex` fail in CI:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-intake/1274/console

I was unable to reproduce it locally but it *looks* like a race to start
the task. So I've added a wait for it to start just in case.
2016-08-17 12:08:55 -04:00
Nik Everett 39d8f5f123 Reindex tests should expect the right failure
Reindex intentionally tries to fail the search operation to make sure
that the exception flows back. The exception message changed so we
should catch the appropriate exception.
2016-08-17 10:25:38 -04:00
Nik Everett 34bbd27f84 Fix _update_by_query's ingest pipeline support
It wasn't being serialized so it wasn't working with the transport
client.
2016-08-16 16:03:18 -04:00
Nik Everett 862843ec90 Suppress failing test
This test was failing in the presence of transport clients. This turns
off transport clients while I fix the test so it doesn't fail for
everyone in the mean time.
2016-08-16 15:12:40 -04:00
Ryan Ernst 743d9fd008 Merge branch 'master' into search_parser 2016-08-16 11:28:59 -07:00
Nik Everett fdd50612ae Fix reindex under the transport client
The big change here is cleaning up the `TaskListResponse` so it doesn't
have a breaky `toString` implementation. That was causing the reindex
tests to break.

Also removed `NetworkModule#registerTaskStatus` which is part of the
Plugin API. Use `Plugin#getNamedWriteables` instead.
2016-08-16 12:15:15 -04:00
Ryan Ernst 7fde410586 Internal: Consolidate search parser registries
Parsing a search request is currently split up among a number of
classes, using multiple public static methods, which take multiple
regstries of elements that may appear in the search request like query
parsers and aggregations. This change begins consolidating all this code
by collapsing the registries normally used for parsing search requests
into a single SearchRequestParsers class. It is also made available to
plugin services to enable templating of search requests.  Eventually all
of the actual parsing logic should move to the class, and the registries
should be hidden, but for now they are at least co-located to reduce the
number of objects that must be passed around.
2016-08-16 01:59:24 -07:00
Nik Everett 1452ab4b9f Squash the rest of o.e.rest.action
Squashes all the subpackages of `org.elasticsearch.rest.action` down to
the following:
* `o.e.rest.action.admin` - Administrative actions
* `o.e.rest.action.cat` - Actions that make tables for `grep`ing
* `o.e.rest.action.document` - Actions that act on documents
* `o.e.rest.action.ingest` - Actions that act on ingest pipelines
* `o.e.rest.action.search` - Actions that search

I'm tempted to merge `search` into `document` but the `document`
package feels fairly complete as is and `Suggest` isn't actually always
about documents either....

I'm also tempted to merge `ingest` into `admin.cluster` because the
latter contains the actions for dealing with stored scripts.

I've moved the `o.e.rest.action.support` into `o.e.rest.action`.

I've also added `package-info.java`s to all packges in `o.e.rest`. I
figure if the package is too small to deserve a `package-info.java` file
then it is too small to deserve to be a package....

Also fixes checkstyle in all moved classes.
2016-08-15 21:06:32 -04:00
Igor Motov 10a766704e Rename Task Persistence into Storing Task Results
The term persisted task was used to indicate that a task should store its results upon its completion. We would like to use this term to indicate that a task can survive restart of nodes instead. This commit removes usages of the term "persist" when it means store results.
2016-08-15 10:02:43 -04:00
Nik Everett 9f8f2ea54b Remove ESIntegTestCase#pluginList
It was a useful method in 1.7 when javac's type inference wasn't as
good, but now we can just replace it with `Arrays.asList`.
2016-08-11 15:44:02 -04:00
Luca Cavanna a80a35ebc4 Merge pull request #19961 from javanna/fix/reindex_repleaceable
update and delete by query requests to implement IndicesRequest.Replaceable
2016-08-11 21:10:58 +02:00
javanna 4424d2263f UpdateByQueryRequest to implement IndicesRequest.Replaceable rather than CompositeIndicesRequest
Update by query is a shortcut to search + index. UpdateByQueryRequest gets serialized on the transport layer only when the transport client is used. Given that the request supports wildcards and allows to set its indices, it should implement IndicesRequest.Repleaceable. implementing CompositeIndicesRequest makes little sense as the indices that the request works against depend entirely on the inner search request.
2016-08-11 18:11:26 +02:00
javanna 11d770dde3 DeleteByQueryRequest to implement IndicesRequest.Replaceable
Delete by query is a shortcut to search + delete. DeleteByQueryRequest gets serialized on the transport layer only when the transport client is used. Given that the request supports wildcards and allows to set its indices, it should implement IndicesRequest.Repleaceable
2016-08-11 18:11:26 +02:00
Nik Everett e07e5d66fa Make reindex and lang-javascript compatible
Fixes two issues:
1. lang-javascript doesn't support `executable` with a `null` `vars`
parameters. The parameter is quite nullable.
2. reindex didn't support script engines who's `unwrap` method wasn't
a noop. This didn't come up for lang-groovy or lang-painless because
both of those `unwrap`s were noops. lang-javascript copys all maps that
it `unwrap`s.

This adds fairly low level unit tests for these fixes but dosen't add
an integration test that makes sure that reindex and lang-javascript
play well together. That'd make backporting this difficult and would
add a fairly significant amount of time to the build for a fairly rare
interaction. Hopefully the unit tests will be enough.
2016-08-11 09:54:03 -04:00
Adrien Grand 0d6ac57acf Collapse o.e.index.mapper packages. #19921
I also reduced the visibility of a couple classes and renamed/consolidated some
test classes for consistency, eg. removing the `Simple` prefix or using the
`<Type>FieldMapperTests` convention for testing field mappers.
2016-08-10 17:51:11 +02:00
David Pilato 4d272cc9b2 Merge branch 'master' into fix/19772-toString
# Conflicts:
#	core/src/test/java/org/elasticsearch/action/admin/cluster/node/tasks/TransportTasksActionTests.java
2016-08-09 11:53:29 +02:00
Jason Tedor a62740bbd2 Avoid early initializing Netty
Today when we load the Netty plugins, we indirectly cause several Netty
classes to initialize. This is because we attempt to load some classes
by name, and loading these classes is done in a way that triggers a long
chain of class initializers within Netty. We should not do this, this
can lead to log messages before the logger is loader, and it leads to
initialization in cases when the classes would never be needed (for
example, Netty 3 class initialization is never needed if Netty 4 is
used, and vice versa). This commit avoids this early initialization of
these classes by removing the need for the early loading.

Relates #19819
2016-08-05 14:58:33 -04:00
David Pilato 54603903f3 Remove ListTasksResponse#setDiscoveryNodes 2016-08-04 02:02:51 +02:00
Ali Beyad 6a7d005081 Makes the index.write.wait_for_active_shards setting index-level and
dynamically updatable for both index creation and write operations.
2016-08-01 13:37:05 -04:00
Ali Beyad 25d8eca62d Removes the notion of write consistency level across all APIs in
favor of waiting for active shard copy count (wait_for_active_shards).
2016-08-01 13:35:29 -04:00
Alexander Lin 9ac6389e43 Rename operation to result and reworking responses
* Rename operation to result and reworking responses
* Rename DocWriteResponse.Operation enum to DocWriteResponse.Result

These are just easier to interpret names.

Closes #19664
2016-08-01 10:42:58 -04:00
Nik Everett 303c9faca5 Squash o.e.rest.action.admin.cluster
In an effort to reduce the number of tiny packages we have in the
code base this moves all the files that were in subdirectories of
`org.elasticsearch.rest.action.admin.cluster` into
`org.elasticsearch.rest.action.admin.cluster`.

Also fixes line length in these packages.
2016-07-29 20:31:24 -04:00
Nik Everett 6f24866902 Reindex: Only ask for _version we need it
`_reindex` only needs the `_version` if the `dest` has
`"version_type": "external"`. So it shouldn't ask for it unless it does.

`_update_by_query` and `_delete_by_query` always need the `_version`.

Closes #19135
2016-07-29 17:13:27 -04:00
Alexander Lin 119026b4fb Remove isCreated and isFound from the Java API
This is cleanup work from #19566, where @nik9000 suggested trying to nuke the isCreated and isFound methods. I've combined nuking the two methods with removing UpdateHelper.Operation in favor of DocWriteResponse.Operation here.

Closes #19631.
2016-07-29 14:21:43 -04:00
Nik Everett c9790a1257 Use fewer threads when reindexing-from-remote
Reindex from remote uses the Elasticsearch client which uses apache
httpasyncclient which spins up 5 thread by default, 1 as a dispatcher
and 4 more to handle IO. This changes Reindex's usage so it only spins
up two thread - 1 dispatcher and one to handle io. It also renames the
threads to "es-client-$taskid-$thread_number". That way if we see any
thread sticking around we can trace it back to the task.
2016-07-29 14:13:10 -04:00
Nik Everett fb45f6a8a8 Add authentication to reindex-from-remote
The tests for authentication extend ESIntegTestCase and use a mock
authentication plugin. This way the clients don't have to worry about
running it. Sadly, that means we don't really have good coverage on the
REST portion of the authentication.

This also adds ElasticsearchStatusException, and exception on which you
can set an explicit status. The nice thing about it is that you can
set the RestStatus that it returns to whatever arbitrary status you like
based on the status that comes back from the remote system.
reindex-from-remote then uses it to wrap all remote failures, preserving
the status from the remote Elasticsearch or whatever proxy is between us
and the remove Elasticsearch.
2016-07-27 14:17:41 -04:00
Nik Everett 9270e8b22b Rename client yaml test infrastructure
This makes it obvious that these tests are for running the client yaml
suites. Now that there are other ways of running tests using the REST
client against a running cluster we can't go on calling the shared
client yaml tests "REST tests". They are rest tests, but they aren't
**the** rest tests.
2016-07-26 13:53:44 -04:00
Alexander Lin 8f2882a442 Add _operation field to index, update, delete responses
Performing the bulk request shown in #19267 now results in the following:
```
{"_index":"test","_type":"test","_id":"1","_version":1,"_operation":"create","forced_refresh":false,"_shards":{"total":2,"successful":1,"failed":0},"status":201}
{"_index":"test","_type":"test","_id":"1","_version":1,"_operation":"noop","forced_refresh":false,"_shards":{"total":2,"successful":1,"failed":0},"status":200}
```
2016-07-26 11:16:19 -04:00
Nik Everett a95d4f4ee7 Add Location header and improve REST testing
This adds a header that looks like `Location: /test/test/1` to the
response for the index/create/update API. The requirement for the header
comes from https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

https://tools.ietf.org/html/rfc7231#section-7.1.2 claims that relative
URIs are OK. So we use an absolute path which should resolve to the
appropriate location.

Closes #19079

This makes large changes to our rest test infrastructure, allowing us
to write junit tests that test a running cluster via the rest client.
It does this by splitting ESRestTestCase into two classes:
* ESRestTestCase is the superclass of all tests that use the rest client
to interact with a running cluster.
* ESClientYamlSuiteTestCase is the superclass of all tests that use the
rest client to run the yaml tests. These tests are shared across all
official clients, thus the `ClientYamlSuite` part of the name.
2016-07-25 17:02:40 -04:00
Jason Tedor 2d1b0587dd Introduce Netty 4
This commit adds transport-netty4, a transport and HTTP implementation
based on Netty 4.

Relates #19526
2016-07-22 22:26:35 -04:00
javanna d13a3d3761 Reindex from remote: add fallback in case content-type header is not set
We better read the header, but who knows what can happen, maybe headers are filtered out for some reasons and we don't want to run into an NPE, then we fallback to auto-detection.
2016-07-22 16:46:17 +02:00
javanna bce54cf38d reindex from remote to read content-type header rather than guessing content type based on content 2016-07-19 15:16:45 +02:00
javanna 54fa997545 Reindex from remote: remove async client in favour of using RestClient performRequest async method 2016-07-19 15:16:45 +02:00
javanna 1fbec71243 Rest client: introduce async performRequest method and use async client under the hood for sync requests too
The new method accepts the usual parameters (method, endpoint, params, entity and headers) plus a response listener and an async response consumer. Shortcut methods are also added that don't require params, entity and the async response consumer optional.

There are a few relevant api changes as a consequence of the move to async client that affect sync methods:
- Response doesn't implement Closeable anymore, responses don't need to be closed
- performRequest throws Exception rather than just IOException, as that is the the exception that we get from the FutureCallback#failed method in the async http client
- ssl configuration is a bit simpler, one only needs to call setSSLStrategy from a custom HttpClientConfigCallback, that doesn't end up overridng any other default around connection pooling (it used to happen with the sync client and make ssl configuration more complex)

Relates to #19055
2016-07-19 15:15:58 +02:00
Nik Everett d573541f66 Support requests_per_second=-1 to mean no throttling in reindex
This is entirely on the REST level, Float.POSITIVE_INFINITY is still
how you get no throttling over the transport api.

Closes #19089
2016-07-18 13:05:06 -04:00
Martijn van Groningen e0ebf5da1c Template cleanup:
* Removed `Template` class and unified script & template parsing logic. Templates are scripts, so they should be defined as a script. Unless there will be separate template infrastructure, templates should share as much code as possible with scripts.
* Removed ScriptParseException in favour for ElasticsearchParseException
* Moved TemplateQueryBuilder to lang-mustache module because this query is hard coded to work with mustache only
2016-07-18 10:16:01 +02:00
Ali Beyad 19d0dbcd17 Removes waiting for yellow cluster health upon index (#19460)
creation in the REST tests, as we no longer need it due
to index creation now waiting for active shard copies
before returning (by default, it waits for the primary of
each shard, which is the same as ensuring yellow health).

Relates #19450
2016-07-15 17:18:34 -04:00
Jason Tedor 31c648eee8 Rename transport-netty to transport-netty3
This commit renames the Netty 3 transport module from transport-netty to
transport-netty3. This is to make room for a Netty 4 transport module,
transport-netty4.

Relates #19439
2016-07-14 22:03:14 -04:00
Simon Willnauer c463083537 minor cleanups and an additional BogusPlugin for HttpSmokeTestCase 2016-07-12 17:55:05 +02:00
Simon Willnauer 9cb247287f consolidate security code in on place an allow test based on the jar dependency to opt out of netty internal property setting assertion 2016-07-12 17:41:21 +02:00
Simon Willnauer 4fb79707bd Fix remaining tests that either need access to the netty module or require explict configuration
Some tests still start http implicitly or miss configuring the transport clients correctly.
This commit fixes all remaining tests and adds a depdenceny to `transport-netty` from
`qa/smoke-test-http` and `modules/reindex` since they need an http server running on the nodes.

This also moves all required permissions for netty into it's module and out of core.
2016-07-12 16:29:57 +02:00
Nik Everett 81fcdfcee9 Expose task information from NodeClient
This exposes a method to start an action and return a task from
`NodeClient`. This allows reindex to use the injected `Client` rather
than require injecting `TransportAction`s
2016-07-07 18:02:09 -04:00
Ryan Ernst 4ea5f51a9c Fix reindex NPE when http is disabled 2016-07-05 23:59:29 -07:00
Ryan Ernst f815799bd4 Fix reindex action to depend on HttpServer instead of NodeService for
http info
2016-07-05 22:16:22 -07:00
Jason Tedor d0765d0761 Merge branch 'master' into feature/seq_no
* master: (192 commits)
  [TEST] Fix rare OBOE in AbstractBytesReferenceTestCase
  Reindex from remote
  Rename writeThrowable to writeException
  Start transport client round-robin randomly
  Reword Refresh API reference (#19270)
  Update fielddata.asciidoc
  Fix stored_fields message
  Add missing footer notes in mapper size docs
  Remote BucketStreams
  Add doc values support to the _size field in the mapper-size plugin
  Bump version to 5.0.0-alpha5.
  Update refresh.asciidoc
  Update shrink-index.asciidoc
  Change Debian repository for Vagrant debian-8 box
  [TEST] fix test to account for internal empyt reference optimization
  Upgrade to netty 3.10.6.Final (#19235)
  [TEST] fix histogram test when extended bounds overlaps data
  Remove redundant modifier
  Simplify TcpTransport interface by reducing send code to a single send method (#19223)
  Fix style violation in InstallPluginCommand.java
  ...
2016-07-05 22:01:07 -04:00
Nik Everett b3c015e2bb Reindex from remote
This adds a remote option to reindex that looks like

```
curl -POST 'localhost:9200/_reindex?pretty' -d'{
  "source": {
    "remote": {
      "host": "http://otherhost:9200"
    },
    "index": "target",
    "query": {
      "match": {
        "foo": "bar"
      }
    }
  },
  "dest": {
    "index": "target"
  }
}'
```

This reindex has all of the features of local reindex:
* Using queries to filter what is copied
* Retry on rejection
* Throttle/rethottle
The big advantage of this version is that it goes over the HTTP API
which can be made backwards compatible.

Some things are different:

The query field is sent directly to the other node rather than parsed
on the coordinating node. This should allow it to support constructs
that are invalid on the coordinating node but are valid on the target
node. Mostly, that means old syntax.
2016-07-05 16:13:17 -04:00
Jason Tedor 3343ceeae4 Do not catch throwable
Today throughout the codebase, catch throwable is used with reckless
abandon. This is dangerous because the throwable could be a fatal
virtual machine error resulting from an internal error in the JVM, or an
out of memory error or a stack overflow error that leaves the virtual
machine in an unstable and unpredictable state. This commit removes
catch throwable from the codebase and removes the temptation to use it
by modifying listener APIs to receive instances of Exception instead of
the top-level Throwable.

Relates #19231
2016-07-04 08:41:06 -04:00
Jim Ferenczi afe99fcdcd Restore reverted change now that alpha4 is out:
Rename `fields` to `stored_fields` and add `docvalue_fields`

`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.

Closes #18943
2016-07-04 10:39:49 +02:00
Ryan Ernst 8275ab497b Merge pull request #19170 from rjernst/rest_handler_client
Changed rest handler interface to take NodeClient
2016-06-30 11:00:09 -07:00
Boaz Leskes 09ca6d6ed2 Add a BridgePartition to be used by testAckedIndexing (#19172)
We have long worked to capture different partitioning scenarios in our testing infra. This PR adds a new variant, inspired by the Jepsen blogs, which was forgotten far - namely a partition where one node can still see and be seen by all other nodes. It also updates the resiliency page to better reflect all the work that was done in this area.
2016-06-30 17:58:12 +02:00
Ryan Ernst c762e7aa15 Merge branch 'master' into rest_handler_client 2016-06-30 08:16:25 -07:00
Tanguy Leroux dc53ce929d Document Update/Delete-By-Query with version number zero
Update-By-Query and Delete-By-Query use internal versioning to update/delete documents. But documents can have a version number equal to zero using the external versioning... making the UBQ/DBQ request fail because zero is not a valid version number and they only support internal versioning for now. Sequence numbers might help to solve this issue in the future.
2016-06-30 15:45:14 +02:00
Nik Everett d57b780bb4 Remote TransportRethrottleAction from RestRethrottleAction
Just use the client to call it.
2016-06-30 09:36:31 -04:00
Ryan Ernst c77dc4a82c Merge pull request #19136 from rjernst/script_service_deps
Scripts: Remove ClusterState from compile api
2016-06-29 22:34:40 -07:00
Ryan Ernst 865b951b7d Internal: Changed rest handler interface to take NodeClient
Previously all rest handlers would take Client in their injected ctor.
However, it was only to hold the client around for runtime. Instead,
this can be done just once in the HttpService which handles rest
requests, and passed along through the handleRequest method. It also
should always be a NodeClient, and other types of Clients (eg a
TransportClient) would not work anyways (and some handlers can be
simplified in follow ups like reindex by taking NodeClient).
2016-06-29 18:02:18 -07:00
Nik Everett 8db43c0107 Move RestHandler registration to ActionModule and ActionPlugin
`RestHandler`s are highly tied to actions so registering them in the
same place makes sense.

Removes the need to for plugins to check if they are in transport client
mode before registering a RestHandler - `getRestHandlers` isn't called
at all in transport client mode.

This caused guice to throw a massive fit about the circular dependency
between NodeClient and the allocation deciders. I broke the circular
dependency by registering the actions map with the node client after
instantiation.
2016-06-29 18:31:44 -04:00
Ryan Ernst ecf6101798 Scripts: Remove ClusterState from compile api
Stored scripts are pulled from the cluster state, and the current api
requires passing the ClusterState on each call to compile. However, this
means every user of the ScriptService needs to depend on the
ClusterService. Instead, this change makes the ScriptService a
ClusterStateListener. It also simplifies tests a lot, as they no longer
need to create fake cluster states (except when testing stored scripts).
2016-06-28 13:20:00 -07:00
Yannick Welsch 0515791846 Fix logger usages 2016-06-28 16:51:06 +02:00
Nik Everett fa4844c3f4 Pull actions from plugins
Instead of implementing onModule(ActionModule) to register actions,
this has plugins implement ActionPlugin to declare actions. This is
yet another step in cleaning up the plugin infrastructure.

While I was in there I switched AutoCreateIndex and DestructiveOperations
to be eagerly constructed which makes them easier to use when
de-guice-ing the code base.
2016-06-28 08:36:24 -04:00
Alexander Reelsen ab8ff8909b Tests: Rename task.get to tasks.get
The task.get action got renamed to tasks.get, some tests
did not change this.

Relates #19107
2016-06-28 09:13:19 +02:00
Jim Ferenczi eb1e231a63 Revert "Rename `fields` to `stored_fields` and add `docvalue_fields`"
This reverts commit 2f46f53dc8.
2016-06-27 17:20:32 +02:00
Jason Tedor 112669daed Merge branch 'master' into feature/seq_no
* master: (416 commits)
  docs: removed obsolete information, percolator queries are not longer loaded into jvm heap memory.
  Upgrade JNA to 4.2.2 and remove optionality
  [TEST] Increase timeouts for Rest test client (#19042)
  Update migrate_5_0.asciidoc
  Add ThreadLeakLingering option to Rest client tests
  Add a MultiTermAwareComponent marker interface to analysis factories. #19028
  Attempt at fixing IndexStatsIT.testFilterCacheStats.
  Fix docs build.
  Move templates out of the Search API, into lang-mustache module
  revert - Inline reroute with process of node join/master election (#18938)
  Build valid slices in SearchSourceBuilderTests
  Docs: Convert aggs/misc to CONSOLE
  Docs: migration notes for _timestamp and _ttl
  Group client projects under :client
  [TEST] Add client-test module and make client tests use randomized runner directly
  Move upgrade test to upgrade from version 2.3.3
  Tasks: Add completed to the mapping
  Fail to start if plugin tries broken onModule
  Remove duplicated read byte array methods
  Rename `fields` to `stored_fields` and add `docvalue_fields`
  ...
2016-06-23 11:52:11 -04:00
Jim Ferenczi 2f46f53dc8 Rename `fields` to `stored_fields` and add `docvalue_fields`
`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.

Closes #18943
2016-06-22 17:38:30 +02:00
Nik Everett 5f0292cb81 Fetch result when wait_for_completion
This makes this sequence:
```
curl -XDELETE localhost:9200/source,dest?pretty
for i in $( seq 1 100 ); do
  curl -XPOST localhost:9200/source/test -d'{"test": "test"}'; echo
done
curl localhost:9200/_refresh?pretty

curl -XPOST 'localhost:9200/_reindex?pretty&wait_for_completion=false' -d'{
  "source": {
    "index": "source"
  },
  "dest": {
    "index": "dest"
  }
}'

curl 'localhost:9200/_tasks/Jsyd6d9wSRW-O-NiiKbPcQ:237?wait_for_completion&pretty'
```

Return task *AND* the response to the user.

This also renames "result" to "response" in the persisted task info
to line it up with how we name the objects in Elasticsearch.
2016-06-21 14:18:53 -04:00
Simon Willnauer bdb6dcea3a Cleanup ClusterService dependencies and detached from Guice (#18941)
This change removes some unnecessary dependencies from ClusterService
and cleans up ClusterName creation. ClusterService is now not created
by guice anymore.
2016-06-17 17:07:19 +02:00
Ryan Ernst a4503c2aed Plugins: Remove name() and description() from api
In 2.0 we added plugin descriptors which require defining a name and
description for the plugin. However, we still have name() and
description() which must be overriden from the Plugin class. This still
exists for classpath plugins. But classpath plugins are mainly for
tests, and even then, referring to classpath plugins with their class is
a better idea. This change removes name() and description(), replacing
the name for classpath plugins with the full class name.
2016-06-15 17:12:22 -07:00
Nik Everett e392e0b1df Create get task API that falls back to the .tasks index
This adds a get task API that supports GET /_tasks/${taskId} and
removes that responsibility from the list tasks API. The get task
API supports wait_for_complation just as the list tasks API does
but doesn't support any of the list task API's filters. In exchange,
it supports falling back to the .results index when the task isn't
running any more. Like any good GET API it 404s when it doesn't
find the task.

Then we change reindex, update-by-query, and delete-by-query to
persist the task result when wait_for_completion=false. The leads
to the neat behavior that, once you start a reindex with
wait_for_completion=false, you can fetch the result of the task by
using the get task API and see the result when it has finished.

Also rename the .results index to .tasks.
2016-06-14 13:37:34 -04:00
Simon Willnauer 7379b17e61 Revert "Make random UUIDs reproducible in tests"
This reverts commit a25b8ee1bf.
2016-06-13 11:14:30 +02:00
Simon Willnauer f1d5fd72c8 Revert "Mark field in ReindexSameIndexTests as final"
This reverts commit 6d8692576e.
2016-06-13 11:14:30 +02:00
Nik Everett 387155559e Make TimeValue Writeable instead of Streamable
Writeable is better for immutable objects like TimeValue.

Switch to writeZLong which takes up less space than the original
writeLong in the majority of cases. Since we expect negative
TimeValues we shouldn't use
writeVLong.
2016-06-10 18:24:16 -04:00
Jason Tedor 6d8692576e Mark field in ReindexSameIndexTests as final
This commit restores a final modifier on the field
AutoCreateIndex#AUTO_CREATE_INDEX that was inadvertently removed in
a25b8ee1bf.
2016-06-10 10:20:45 -04:00
Jason Tedor a25b8ee1bf Make random UUIDs reproducible in tests
Today we use a random source of UUIDs for assigning allocation IDs,
cluster IDs, etc. Yet, the source of randomness for this is not
reproducible in tests. Since allocation IDs end up as keys in hash maps,
this means allocation decisions and not reproducible in tests and this
leads to non-reproducible test failures. This commit modifies the
behavior of random UUIDs so that they are reproducible under tests. The
behavior for production code is not changed, we still use a true source
of secure randomness but under tests we just use a reproducible source
of non-secure randomness.

It is important to note that there is a test,
UUIDTests#testThreadedRandomUUID that relies on the UUIDs being truly
random. Thus, we have to modify the setup for this test to use a true
source of randomness. Thus, this is one test that will never be
reproducible but it is intentionally so.

Relates #18808
2016-06-10 10:18:06 -04:00
Nik Everett bd1af34506 [reindex] Extract runnable to inner class
Makes it more readable
2016-06-08 14:00:15 -04:00
Nik Everett 2437313e4e Remove extra logging
The test shouldn't be failing any more.
2016-06-08 13:52:43 -04:00
Nik Everett 5b94c4a25b Fix a race condition in reindex's rethrottle
If you rethrottled the request while is was performing a scroll
request then it wouldn't properly pick up the rethrottle for that
batch. This was causing test failure and might cause issues for
users. The work around is simple though: just issue the rethrottle
again with a slightly faster throttle than the first time.

Caught by:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-os-compatibility/os=centos/525/console
2016-06-08 13:52:43 -04:00
Nik Everett 4b21157906 Remove setRefresh
It has been replaced with `setRefreshPolicy` which has support for
waiting until refresh with `setRefreshPolicy(WAIT_FOR)`.

Related to #1063
2016-06-08 13:50:59 -04:00
Jason Tedor d896886973 Merge branch 'master' into feature/seq_no
* master: (51 commits)
  Switch QueryBuilders to new MatchPhraseQueryBuilder
  Added method to allow creation of new methods on-the-fly.
  more cleanups
  Remove cluster name from data path
  Remove explicit parallel new GC flag
  rehash the docvalues in DocValuesSliceQuery using BitMixer.mix instead of the naive Long.hashCode.
  switch FunctionRef over to methodhandles
  ingest: Move processors from core to ingest-common module.
  Fix some typos (#18746)
  Fix ut
  convert FunctionRef/Def usage to methodhandles.
  Add the ability to partition a scroll in multiple slices. API:
  use painless types in FunctionRef
  Update ingest-node.asciidoc
  compute functional interface stuff in Definition
  Use method name in bootstrap check might fork test
  Make checkstyle happy (add Lookup import, line length)
  Don't hide LambdaConversionException and behave like real javac compiled code when a conversion fails. This works anyways, because fallback is allowed to throw any Throwable
  Pass through the lookup given by invokedynamic to the LambdaMetaFactory. Without it real lambdas won't work, as their implementations are private to script class
  checkstyle have your upper L
  ...
2016-06-07 17:57:53 -04:00
Martijn van Groningen f611f1c99e ingest: Move processors from core to ingest-common module.
Folded grok processor into ingest-common module.

The rest tests have been moved to ingest-common module as well, because these tests don't run in the rest-api-spec module but in the distribution:integ-test-zip module
and adding a test plugin there felt just wrong to me. I think this is ok. I left a tiny ingest rest test behind in that tests with an empty pipeline.

Removed messy tests, these tests were already covered in the rest tests

Added ingest test plugin in test infra so that each module testing integration with ingest doesn't need write its own plugin

Moved reindex ingest tests to qa module

Closes #18490
2016-06-07 17:32:52 +02:00
Jason Tedor da74323141 Register thread pool settings
This commit refactors the handling of thread pool settings so that the
individual settings can be registered rather than registering the top
level group. With this refactoring, individual plugins must now register
their own settings for custom thread pools that they need, but a
dedicated API is provided for this in the thread pool module. This
commit also renames the prefix on the thread pool settings from
"threadpool" to "thread_pool". This enables a hard break on the settings
so that:
 - some of the settings can be given more sensible names (e.g., the max
   number of threads in a scaling thread pool is now named "max" instead
   of "size")
 - change the soft limit on the number of threads in the bulk and
   indexing thread pools to a hard limit
 - the settings names for custom plugins for thread pools can be
   prefixed (e.g., "xpack.watcher.thread_pool.size")
 - remove dynamic thread pool settings

Relates #18674
2016-06-06 22:09:12 -04:00
Jason Tedor a60b8948ba Merge branch 'master' into feature/seq_no
* master: (184 commits)
  Add back pending deletes (#18698)
  refactor matrix agg documentation from modules to main agg section
  Implement ctx.op = "delete" on _update_by_query and _reindex
  Close SearchContext if query rewrite failed
  Wrap lines at 140 characters (:qa projects)
  Remove log file
  painless: Add support for the new Java 9 MethodHandles#arrayLength() factory (see https://bugs.openjdk.java.net/browse/JDK-8156915)
  More complete exception message in settings tests
  Use java from path if JAVA_HOME is not set
  Fix uncaught checked exception in AzureTestUtils
  [TEST] wait for yellow after setup doc tests (#18726)
  Fix recovery throttling to properly handle relocating non-primary shards (#18701)
  Fix merge stats rendering in RestIndicesAction (#18720)
  [TEST] mute RandomAllocationDeciderTests.testRandomDecisions
  Reworked docs for index-shrink API (#18705)
  Improve painless compile-time exceptions
  Adds UUIDs to snapshots
  Add test rethrottle test case for delete-by-query
  Do not start scheduled pings until transport start
  Adressing review comments
  ...
2016-06-06 11:16:22 -04:00
Tanguy Leroux a1172d816c Implement ctx.op = "delete" on _update_by_query and _reindex
closes #18043
2016-06-06 11:11:29 +02:00
Nik Everett f82ab787a5 Add test rethrottle test case for delete-by-query
and remove some type parameters that we don't need that were getting
in the way.
2016-06-02 15:04:18 -04:00
Nik Everett 1b66d4a97f Add more logging to reindex rethrottle
The tests are failing in CI and we can't track down the cause. This should
help!
2016-06-01 12:54:49 -04:00
Tanguy Leroux bf41ac8bf2 Makes DeleteByQueryRequest implements CompositeIndicesRequest 2016-05-25 09:36:07 +02:00
Nik Everett 5e81270509 Add retry test case for delete-by-query
Tests that we retry failed searches, scrolls, and bulks.
2016-05-24 12:02:55 -04:00
Ryan Ernst c7b45b2cc7 Tests: Remove unnecessary Callable variant of assertBusy
The assertBusy method currently has both a Runnable and Callable
version. This has caused confusion with type inference and lambdas
sometimes, in particular with java 9. This change removes the callable
version as nothing was actually using it.
2016-05-23 16:17:43 -07:00
Nik Everett 62ac719a94 Rerwite RetryTests to very carefully block the executors
This reproduces every time. No more randomness! Hurray!
2016-05-23 14:38:40 -04:00
Nik Everett b7817a6306 [reindex] Retry the retry test if it didn't cause retries
The retry test has failed a couple of times in CI because it wasn't able
to cause any retries. Putting it in a bash `while` loop shows that it
eventually does fail that way. The seed "4F6477A9C999CA20" seems especially
good at failing to get retries. It doesn't fail all the time, but more
than most.

This adds a retry to each test case, retrying a maximum of 10 times or
until it causes the retries. I've seen it fail to get retries 7 times
in a row but not go beyond that. Retrying doesn't seem to really hurt
the test runtime all that much. Most of the time is in the startup
cost.

Failing CI build that triggered this:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+periodic/852/console
2016-05-23 14:38:40 -04:00
Tanguy Leroux b0b503035a Update reindex cancel tests 2016-05-23 10:31:14 +02:00
Jason Tedor 6eb96e5fd8 Fix line-length violations in ABBSAT.java
This commit fixes line-length violations in
AsyncBulkByScrollActionTests.java.
2016-05-21 21:26:28 -04:00
Jason Tedor ad7229fe72 Merge branch 'master' into feature/seq_no
* master: (158 commits)
  Document the hack
  Refactor property placeholder use of env. vars
  Force java9 log4j hack in testing
  Fix log4j buggy java version detection
  Make java9 work again
  Don't mkdir directly in deb init script
  Fix env. var placeholder test so it's reproducible
  Remove ScriptMode class in favor of boolean true/false
  [rest api spec] fix doc urls
  Netty request/response tracer should wait for send
  Filter client/server VM options from jvm.options
  [rest api spec] fix url for reindex api docs
  Remove use of a Fields class in snapshot responses that contains x-content keys, in favor of declaring/using the keys directly.
  Limit retries of failed allocations per index (#18467)
  Proxy box method to use valueOf.
  Use the build-in valueOf method instead of the custom one.
  Fixed tests and added a comment to the box method.
  Fix boxing.
  Do not decode path when sending error
  Fix race condition in snapshot initialization
  ...
2016-05-21 21:04:43 -04:00
Nik Everett 223cb6a7f0 [reindex] Mark test awaits fix because it is unstable
Fix coming.
2016-05-19 18:03:36 -04:00
Tanguy Leroux a01ecb20ea Port Delete By Query to Reindex infrastructure
closes #16883
2016-05-19 16:07:50 +02:00
Nik Everett cfb06954ba [reindex] Add assertBusy to test
It has timing issues.
2016-05-17 14:05:35 -04:00
Nik Everett fe4823eae0 Reindex should retry on search failures
This uses the same backoff policy we use for bulk and just retries until
the request isn't rejected.

Instead of `{"retries": 12}` in the response to count retries this now
looks like `{"retries": {"bulk": 12", "search": 1}`.

Closes #18059
2016-05-17 13:58:45 -04:00
Nik Everett f569576c5b Switch default batch size for reindex to 1000 2016-05-16 08:19:29 -04:00
Jason Tedor 15d3d74444 Merge branch 'master' into feature/seq_no
* master: (904 commits)
  Removes unused methods in the o/e/common/Strings class.
  Add note regarding thread stack size on Windows
  painless: restore accidentally removed test
  Documented fuzzy_transpositions in match query
  Add not-null precondition check in BulkRequest
  Build: Make run task you full zip distribution
  Build: More pom generation improvements
  Add test for wrong array index
  Take return type from "after" field.
  painless: build descriptor of array and field load/store in code; fix array index to adapt type not DEF
  Build: Add developer info to generated pom files
  painless: improve exception stacktraces
  painless: Rename the dynamic call site factory to DefBootstrap and make the inner class very short (PIC = Polymorphic Inline Cache)
  Remove dead code.
  Avoid race while retiring executors
  Allow only a single extension for a scripting engine
  Adding REST tests to ensure key_as_string behavior stays consistent
  [test] Set logging to 11 on reindex test
  [TEST] increase logger level until we know what is going on
  Don't allow `fuzziness` for `multi_match` types cross_fields, phrase and phrase_prefix
  ...
2016-05-14 20:23:59 -04:00
Nik Everett 0a300320cd [test] Set logging to 11 on reindex test
It has failures we can't explain and we need logs to be able to do
anything useful with the failures:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+g1gc/359/consoleFull
2016-05-13 11:48:01 -04:00
Tanguy Leroux 6d288dec11 Clean up tests in Reindex module 2016-05-11 09:54:52 +02:00
Tanguy Leroux 8c52e8814b Remove ReindexResponse in favor of BulkIndexByScrollResponse 2016-05-09 17:03:16 +02:00
Adrien Grand 7d8708716e QueryBuilder does not need generics. #18133
QueryBuilder has generics, but those are never used: all call sites use
`QueryBuilder<?>`. Only `AbstractQueryBuilder` needs generics so that the base
class can contain a default implementation for setters that returns `this`.
2016-05-06 08:38:20 +02:00
Nik Everett 230697c202 [reindex] Switch throttle to Float.POSITIVE_INFITINTY/"unlimited"
All other values are errors.

Add java test for throttling. We had a REST test but it only ran against
one node so it didn't catch serialization errors.

Add Simple round trip test for rethrottle request
2016-05-04 16:14:32 -04:00
Ryan Ernst d12a4bb51d Merge pull request #17933 from rjernst/camelcase4
Remove camelCase support
2016-04-22 13:46:43 -07:00
Nik Everett cc1a55423c Reindex: properly mark things as child tasks
Do this by creating a Client subclass that automatically assigns the
parentTask to all requests that come through it. Code that doesn't want
to set the parentTask can call `unwrap` on the Client to get the inner
client instance that doesn't set the parentTask. Reindex uses this for
its ClearScrollRequest so that the request will run properly after the
reindex request has been canceled.
2016-04-22 14:00:11 -04:00
Ryan Ernst 55388590c1 Remove camelCase support
Now that the current uses of magical camelCase support have been
deprecated, we can remove these in master (sans remaining issues like
BulkRequest). This change removes camel case support from ParseField,
query types, analysis, and settings lookup.

see #8988
2016-04-22 09:18:10 -07:00
Nik Everett 51621f9d75 Remove ChildTaskRequest and always pass parentTaskId when building a task
Passing parentTaskId forces the caller to handle the parentTaskId.
2016-04-22 11:26:18 -04:00
Nik Everett ffeb5e38fc Remove parent-less task methods
Callers should explicitly handle parents - either using EMPTY_TASK_ID when
a parent isn't possible or piping parents from the TransportRequest when
possible.
2016-04-22 11:26:18 -04:00
Martijn van Groningen c5ad2e2865 Changed indexed scripts to be stored in the cluster state instead of the `.scripts` index.
Also added max script size soft limit for stored scripts.

Closes #16651
2016-04-22 13:42:55 +02:00
Isabel Drost-Fromm 233fe86ee4 Makes Script type writeable
Used to be Streamable. Left-over of the PROTOTYPE related refactoring by
@nik9000

Closes #17753
2016-04-21 12:40:29 +02:00
Christoph Büscher e06e122f9f Wrap xcontent parser creation in try-with-resource statement where possible 2016-04-18 16:13:56 +02:00
Christoph Büscher cdb36a2b0c Merge pull request #17417
Clean up QueryParseContext and don't hold it inside QueryRewrite/ShardContext
2016-04-18 15:13:53 +02:00
Nik Everett d3b1306069 Reindex: never report negative throttled_until
Just clamp the value at 0. It isn't useful to tell the user "this
thread should have woken 5ms ago".

Closes #17783
2016-04-15 16:53:23 -04:00
Christoph Büscher fbd558382d Clean up QueryParseContext and don't hold it inside QueryRewrite/ShardContext
This change cleans up a few things in QueryParseContext and QueryShardContext
that make it hard to reason about the state of these context objects and are
thus error prone and should be simplified.

Currently the parser that used in QueryParseContext can be set and reset any
time from the outside, which makes reasoning about it hard. This change makes
the parser a mandatory constructor argument removes ability to later set a
different ParseFieldMatcher. If none is provided at construction time, the
one set inside the parser is used. If a ParseFieldMatcher is specified at
construction time, it is implicitely set on the parser that is beeing used.
The ParseFieldMatcher is only kept inside the parser.

Also currently the QueryShardContext historically holds an inner QueryParseContext
(in the super class QueryRewriteContext), that is mainly used to hold the parser
and parseFieldMatcher. For that reason, the parser can also be reset, which leads
to the same problems as above. This change removes the QueryParseContext from
QueryRewriteContext and removes the ability to reset or retrieve it from the
QueryShardContext. Instead, `QueryRewriteContext#newParseContext(parser)` can be
used to create new parse contexts with the given parser from a shard context
when needed.
2016-04-15 17:13:01 +02:00
Christoph Büscher de036d63d8 Rename context.parseFieldMatcher() to context.getParseFieldMatcher 2016-04-15 15:15:32 +02:00
Christoph Büscher 15c59d07b3 Remove ParseFieldMatcher from AbstractXContentParser
Currently we are able to set a ParseFieldMatcher on XContentParsers,
mainly to conveniently carry it around to be available where the
actual parsing happens. This was just recently introduced together
with ObjectParser so that ObjectParser can make use of deprecation
logging and throwing errors while parsing.

This however is trappy because we create parsers in so many places in
the code and it is easy to forget setting the right ParseFieldMatcher.
Instead we should hold the ParseFieldMatcher only in the parse contexts
(e.g. QueryParseContext).

This PR removes the ParseFieldMatcher from XContentParser. ObjectParser
can still make use of it because we can make the otherwise unbounded
`context` type to extend an interface that makes sure contexts used in
ObjectParser can supply a ParseFieldMatcher. Contexts in ObjectParser
are now no longer optional, but it is sufficient to pass in a small
lambda expression in places where no other context is available.

Relates to #17417
2016-04-15 15:14:46 +02:00
Christoph Büscher e15e7f7e6e Remove parser argument from methods where we already pass in a parse context
When we pass down both parser and QueryParseContext to a method, we cannot
make sure that the parser contained in the context and the parser that is
parsed as an argument have the same state. This removes the parser argument
from methods where we currently have both the parser and the parse context
as arguments and instead retrieves the parse from the context inside the
method.
2016-04-14 16:18:05 +02:00
Martijn van Groningen 2928fd6ef3 Cleanup query builder for inner hits construction.
* Inner hits can now only be provided and prepared via setter in the nested, has_child and has_parent query.
* Also made `score_mode` a required constructor parameter.
* Moved has_child's min_child/max_children validation from doToQuery(...) to a setter.
2016-04-14 14:43:21 +02:00
Nik Everett cca3154c43 Rename isSourceEmpty to hasSource
And add a test case for {} to reindex.
2016-04-13 08:19:58 -04:00
Nik Everett c2e745bf3b reindex: Guard against user disabling fields 2016-04-13 08:19:58 -04:00
Nik Everett 0f9804b0e2 reindex: gracefully handle when _source is disabled
Closes #17666
2016-04-13 08:19:58 -04:00
Nik Everett 14d37baa4b [reindex] Don't get rejected
BulkByScrollTaskTest#testDelayAndRethrottle was getting rejected exceptions
every once in a while. This was reproducible ~20% of the time for me. I
added a CyclicBarrier to prevent the test from shutting down the thread pool
before the threads get finished.
2016-03-31 14:50:14 -04:00
Nik Everett 0c762fca35 Fix test mistake 2016-03-31 12:27:35 -04:00
Nik Everett 7f794e7b77 Test for invalid scroll_size 2016-03-31 12:21:32 -04:00
Nik Everett 30a1862339 Remove PROTOTYPE from BulkItemResponse.Failure
Closes #17086
2016-03-31 09:10:36 -04:00
Nik Everett 78ab6c5b7f [reindex] Dynamic throttle!
This allows the user to update the reindex throttle on the fly, with changes
that speed up the throttling being applied immediately and changes that
slow down the throttling being applied during the next batch. This means
that if a user throttles reindex in such a way that it tries to sleep for
16 years and then realizes that they've done something wrong then they
can change the throttle and reindex will wake up again. We don't apply
slow downs immediately so we never get in danger of losing the scan context.

Also, if reindex is canceled while it is sleeping (how it honor throttling)
then it'll immediately wake up and cancel itself.
2016-03-30 16:40:42 -04:00
Adrien Grand 068c788ec8 Disable fielddata on text fields by defaults. #17386
`text` fields will have fielddata disabled by default. Fielddata can still be
enabled on an existing index by setting `fielddata=true` in the mappings.
2016-03-30 14:35:32 +02:00
Clinton Gormley 3087d2b882 Fixed bad YAML in reindex REST test: 50_routing.yaml 2016-03-29 15:03:09 +02:00
Clinton Gormley 52daed0732 Update-by-query rest tests: fixed bad yaml and deleted a client-dependent test 2016-03-29 14:58:29 +02:00
Clinton Gormley 5f24581de3 The reindex body is now required, which changes the exception thrown by the REST test 2016-03-29 14:09:59 +02:00
Clinton Gormley b87beeb05f Rename update-by-query REST tests to update_by_query 2016-03-29 13:13:49 +02:00
Clinton Gormley 97606850e8 Renamed update-by-query REST spec to update_by_query 2016-03-29 11:45:20 +02:00
Nik Everett 0e6141e675 Replace is_true: took with took >= 0
This prevents tests from failing on machines that can finish the request
less than half a millisecond.
2016-03-28 13:03:48 -04:00
Boaz Leskes 91021e3019 merge from master 2016-03-25 15:50:48 +01:00
Nik Everett 93ab4cfc99 Stop using PROTOTYPE in NamedWriteableRegistry
readFrom is confusing because it requires an instance of the type that it
is reading but it doesn't modify it. But we also have (deprecated) methods
named readFrom that *do* modify the instance. The "right" way to implement
the non-modifying readFrom is to delegate to a constructor that takes a
StreamInput so that the read object can be immutable. Now that we have
`@FunctionalInterface`s it is fairly easy to register things by referring
directly to the constructor.

This change modifying NamedWriteableRegistry so that it does that. It keeps
supporting `registerPrototype` which registers objects to be read by
readFrom but deprecates it and delegates it to a new `register` method
that allows passing a simple functional interface. It also cuts Task.Status
subclasses over to using that method.

The start of #17085
2016-03-24 11:26:44 -04:00
Nik Everett 48aaebf23d [reindex] Wait for headers
The test was checking that we'd set the headers properly but in some cases
the request had yet to come in because it was running on another thread.
Now we wait for the headers to show up before failing the test.

Closes #17299
2016-03-24 09:55:49 -04:00
Nik Everett aaa4d57fff [reindex] Don't attempt to refresh on noop
If the user asks for a refresh but their reindex or update-by-query
operation touched no indexes we should just skip the resfresh call
entirely. Without this commit we refresh *all* indexes which is totally
wrong.

Closes #17296
2016-03-23 18:12:40 -04:00
Boaz Leskes 7c8cdf4a71 merged from master 2016-03-22 19:21:28 +01:00
Nik Everett da96b6e41d [reindex] Add thottling support
The throttle is applied when starting the next scroll request so that its
timeout can include the throttle time.
2016-03-22 12:34:14 -04:00
Boaz Leskes 39ae16bc4c merge from master 2016-03-22 11:46:26 +01:00
Boaz Leskes 2d1152ebac Remove ClusterService interface, in favor of it's only production instance #17183
We current have a ClusterService interface, implemented by InternalClusterService and a couple of test classes. Since the decoupling of the transport service and the cluster service, one can construct a ClusterService fairly easily, so we don't need this extra indirection.

Closes #17183
2016-03-21 13:55:10 +01:00
Boaz Leskes 858610d0d1 merge from master 2016-03-19 13:57:40 +01:00
Christoph Büscher 6ddf9ae92f Merge branch 'master' into feature-suggest-refactoring 2016-03-16 15:27:02 +01:00
Nik Everett 7197172047 [reindex] Properly register status
Without this commit fetching the status of a reindex from a node that isn't
coordinating the reindex will fail. This commit properly registers reindex's
status so this doesn't happen. To do so it moves all task status registration
into NetworkModule and creates a method to register other statuses which the
reindex plugin calls.
2016-03-16 07:40:49 -04:00
Christoph Büscher 40f3501d7f Merge branch 'master' into feature-suggest-refactoring 2016-03-14 14:57:57 +01:00
Simon Willnauer 31740e279f Resolve index names to Index instances early
Today index names are often resolved lazily, only when they are really
needed. This can be problematic especially when it gets to mapping updates
etc. when a node sends a mapping update to the master but while the request
is in-flight the index changes for whatever reason we would still apply the update
since we use the name of the index to identify the index in the clusterstate.
The problem is that index names can be reused which happens in practice and sometimes
even in a automated way rendering this problem as realistic.
In this change we resolve the index including it's UUID as early as possible in places
where changes to the clusterstate are possible. For instance mapping updates on a node use a
concrete index rather than it's name and the master will fail the mapping update iff
the index can't be found by it's <name, uuid> tuple.

Closes #17048
2016-03-14 11:08:48 +01:00
Ali Beyad 31dcb3e18b Merge pull request #16873 from abeyad/suggester-wiring-refactoring
Change internal representation of suggesters
2016-03-11 12:34:35 -05:00
Christoph Büscher bbcbba1bf5 Fixing some tests and compile problems in reindex module 2016-03-11 17:58:13 +01:00
Nik Everett ebc12690bc [reindex] Move refresh tests to unit test
The refresh tests were failing rarely due to refreshes happening
automatically on indexes with -1 refresh intervals. This commit moves
the refresh test into a unit test where we can check if it was attempted
so we never get false failures from background refreshes.

It also stopped refresh from being run if the reindex request was canceled.
2016-03-10 17:48:22 -05:00
Nik Everett b2eec96045 [reindex] Make search failure cause rest failure
Indexing failures have caused the reindex http request to fail for a while
now. Both search and indexing failures cause it to abort. But search
failures didn't cause a non-200 response code from the http api. This
fixes that.

Also slips in a fix to some infrequently failing rest tests.

Closes #16037
2016-03-10 13:47:49 -05:00
Nik Everett b8d931d23c [reindex] Timeout if sub-requests timeout
Sadly, it isn't easy to simulate a timeout during an integration test, you
just have to cause one. Groovy's sleep should do the job.
2016-03-10 13:05:23 -05:00
Boaz Leskes 838c7ddd82 fix indexing compilation issue 2016-03-10 12:12:14 +01:00
Nik Everett 38241a5d8b [reindex] Implement CompositeIndicesRequest
Implements CompositeIndicesRequest on UpdateByQueryRequest and
ReindexRequest so that plugins can reason about the request. In both cases
this implementation is imperfect but useful because instead of listing
all requests that make up the request it instead attempts to make dummy
requests that represent the requests that it will later make.
2016-03-09 16:29:23 -05:00
Nik Everett 6d0efae713 Teach list tasks api to wait for tasks to finish
_wait_for_completion defaults to false. If set to true then the API will
wait for all the tasks that it finds to stop running before returning. You
can use the timeout parameter to prevent it from waiting forever. If you
don't set a timeout parameter it'll default to 30 seconds.

Also adds a log message to rest tests if any tasks overrun the test. This
is just a log (instead of failing the test) because lots of tasks are run
by the cluster on its own and they shouldn't cause the test to fail. Things
like fetching disk usage from the other nodes, for example.

Switches the request to getter/setter style methods as we're going that
way in the Elasticsearch code base. Reindex is all getter/setter style.

Closes #16906
2016-03-08 11:53:57 -05:00
Nik Everett 4d6cb34417 [reindex] Add ingest support 2016-03-04 10:05:13 -05:00
Adrien Grand 2b545df372 Fix modules/reindex to not use the string field anymore. 2016-03-03 11:11:00 +01:00
Nik Everett 18e5bb83c5 Disable problematic reindex test
This should get the builds back to normal while we wait on #16914 or
something like it to fix the test properly.
2016-03-02 13:02:29 -05:00
Nik Everett 942eb70956 Revert "Silence reindex's rest tests"
This reverts commit aa0ef84f5a.
2016-03-02 09:17:06 -05:00
Nik Everett aa0ef84f5a Silence reindex's rest tests
They are failing sporadically in CI.
2016-03-02 08:32:30 -05:00
Nik Everett aeed7ee218 Reindex: rename source to searchRequest
This makes the code easier to read for those familiar with the
Elasticsearch code base.
2016-02-29 14:57:16 -05:00
Nik Everett d587a74533 Fix reindex to work with master branch
Stuff changed, reindex's got to change.
2016-02-29 10:04:16 -05:00