Commit Graph

949 Commits

Author SHA1 Message Date
javanna bf390a935e Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-21 17:18:23 +01:00
Martijn van Groningen e3b7e5d75a percolator: Replace percolate api with the new percolator query
Also replaced the PercolatorQueryRegistry with the new PercolatorQueryCache.

The PercolatorFieldMapper stores the rewritten form of each percolator query's xcontent
in a binary doc values field. This makes sure that the query rewrite happens only during
indexing (some queries, for example, fetch shapes or terms in remote indices) and
speeds up the loading of the queries in the percolator query cache.

Because the percolator now works inside the search infrastructure, a number of features
(sorting fields, pagination, fetch features) are available out of the box.

The following feature requests are automatically implemented via this refactoring:

Closes #10741
Closes #7297
Closes #13176
Closes #13978
Closes #11264
Closes #4317
2016-03-21 12:21:50 +01:00
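As a rough illustration of the commit above, the sketch below registers a percolator query and then matches a document against it through the regular search API. This is a sketch only: the host, index name, mapping, and the exact percolate query syntax at this point in the 5.0 development cycle are assumptions and may differ from the final API.

```python
import requests

ES = "http://localhost:9200"  # assumed local dev cluster, security disabled

# Assumes an index "queries" whose "query" field is mapped to hold percolator
# queries; index name and mapping setup are placeholders.
requests.put(f"{ES}/queries/query/1", json={
    "query": {"match": {"message": "error"}}
})

# Because percolation now runs inside the search infrastructure, sorting,
# pagination and the fetch features come for free on this search request.
resp = requests.post(f"{ES}/queries/_search", json={
    "size": 10,
    "query": {
        "percolate": {                     # query name/parameters may differ at this commit
            "field": "query",
            "document": {"message": "disk error on node 3"}
        }
    }
})
print(resp.json())
```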
Clinton Gormley 0543d46c1d Fixed regex in cat.recovery REST test
The time column should accept integer ms or floating point seconds
2016-03-16 17:22:00 +01:00
Simon Willnauer 121e7c8ca4 Add infrastructure to run REST tests on a multi-version cluster
This change adds the infrastructure to run the REST tests on a multi-node
cluster that uses 2 different minor versions of Elasticsearch. It doesn't implement
any dedicated BWC tests but rather leverages the existing REST tests.

Since we don't have a real version to test against, the tests use the current version
until the first minor / RC is released to ensure the infrastructure works.

Relates to #14406
Closes #17072
2016-03-13 10:52:39 +01:00
Jason Tedor f465d98eb3 Add raw recovery progress to cat recovery API
This commit adds fields bytes_recovered and files_recovered to the cat
recovery API. These fields, respectively, indicate the total number of
bytes and files recovered. Additionally, for consistency, some totals
fields and translog recovery fields have been renamed.

Closes #17064
2016-03-11 08:27:09 -05:00
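A minimal sketch of reading the new columns from the cat recovery API, assuming a local cluster at localhost:9200; the column names follow the commit message and could differ slightly in the final API.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Ask only for the columns we care about; bytes_recovered and files_recovered
# are the raw progress fields described in the commit above.
resp = requests.get(
    f"{ES}/_cat/recovery",
    params={"v": "true", "h": "index,shard,stage,bytes_recovered,files_recovered"},
)
print(resp.text)
```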
Nik Everett b8d931d23c [reindex] Timeout if sub-requests timeout
Sadly, it isn't easy to simulate a timeout during an integration test; you
just have to cause one. Groovy's sleep should do the job.
2016-03-10 13:05:23 -05:00
Martijn van Groningen 0bbb84c19a test: 'Test bulk request with default pipeline' may get run first, in which case the total ingest count for pipeline1 is 2. 2016-03-10 15:18:08 +01:00
Martijn van Groningen 2fa33d5c47 Added ingest statistics to node stats API
The ingest stats include the following statistics:
* `ingest.total.count` - The total number of documents ingested during the lifetime of this node
* `ingest.total.time_in_millis` - The total time spent on ingest preprocessing of documents during the lifetime of this node
* `ingest.total.current` - The total number of documents currently being ingested
* `ingest.total.failed` - The total number of ingest preprocessing operations that failed during the lifetime of this node

These stats are also returned on a per-pipeline basis.
2016-03-10 13:21:43 +01:00
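A small sketch of pulling these totals out of the node stats API, assuming a local cluster and that the response nests the counters under `ingest.total` as the names above suggest.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Fetch the ingest section of the node stats API and print the totals
# listed in the commit message (count, time_in_millis, current, failed).
stats = requests.get(f"{ES}/_nodes/stats/ingest").json()
for node_id, node in stats["nodes"].items():
    total = node["ingest"]["total"]
    print(node_id, total["count"], total["time_in_millis"],
          total["current"], total["failed"])
```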
Nik Everett 6d0efae713 Teach list tasks api to wait for tasks to finish
wait_for_completion defaults to false. If set to true then the API will
wait for all the tasks that it finds to stop running before returning. You
can use the timeout parameter to prevent it from waiting forever. If you
don't set a timeout parameter it'll default to 30 seconds.

Also adds a log message to rest tests if any tasks overrun the test. This
is just a log (instead of failing the test) because lots of tasks are run
by the cluster on its own and they shouldn't cause the test to fail. Things
like fetching disk usage from the other nodes, for example.

Switches the request to getter/setter style methods as we're going that
way in the Elasticsearch code base. Reindex is all getter/setter style.

Closes #16906
2016-03-08 11:53:57 -05:00
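A hedged sketch of waiting on the list tasks API, assuming a local cluster; the parameter names follow the commit message above.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Wait for matching tasks to finish, but give up after 10 seconds instead of
# relying on the 30-second default described in the commit message.
resp = requests.get(
    f"{ES}/_tasks",
    params={"wait_for_completion": "true", "timeout": "10s"},
)
print(resp.json())
```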
Jun Ohtani 071d578953 Analysis: Allow string explain param in JSON
Move some test methods from AnalyzeActionIT to RestAnalyzeActionTest
Allow a string explain param if it can be parsed
Fix wrong param name in rest-api-spec

Closes #16925
2016-03-08 16:19:02 +09:00
Martijn van Groningen 82d01e4315 Added ingest info to node info API, which contains a list of available processors.
Internally the put pipeline API uses this information from the node info API to validate that all processors specified in a pipeline exist on all nodes in the cluster.
2016-03-07 14:44:50 +01:00
javanna 9c4a5bbe7e adapt cluster stats api to node.client setting removal
The cluster stats API now returns counts for each node role. The `master_data`, `master_only`, `data_only` and `client` fields have been removed from the response in favour of `master`, `data`, `ingest` and `coordinating_only`. The same node can have multiple roles, hence contribute to multiple role counts. Every node is implicitly a coordinating node, so whenever a node has no explicit roles, it will be counted as coordinating only.
2016-03-05 10:55:19 +01:00
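A rough sketch of reading the new role counts from the cluster stats API, assuming a local cluster; the exact response path is inferred from the description above.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# The nodes.count section now reports per-role counts; a node with several
# roles contributes to several counters.
counts = requests.get(f"{ES}/_cluster/stats").json()["nodes"]["count"]
for role in ("master", "data", "ingest", "coordinating_only"):
    print(role, counts.get(role))
```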
javanna f786e9866c adapt _cat/nodes to node.client removal
_cat/nodes used to return `c` for client node or `d` for data node as part of the node.role column. This commit changes it to return `m` for master eligible, `d` for data and/or `i` for ingest. A node with no explicit roles will be a coordinating only node and marked with `-`. A node can obviously have multiple roles. The master column has been adapted to return only whether a node is the current master (`*`) or not (`-`).
2016-03-05 10:55:19 +01:00
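A quick sketch of requesting just the relevant cat nodes columns, assuming a local cluster; the column semantics follow the description above.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# node.role shows any combination of m/d/i (or - for coordinating only);
# master shows * for the elected master and - otherwise.
resp = requests.get(
    f"{ES}/_cat/nodes",
    params={"v": "true", "h": "name,node.role,master"},
)
print(resp.text)
```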
Nik Everett 4d6cb34417 [reindex] Add ingest support 2016-03-04 10:05:13 -05:00
Clinton Gormley 30669f63e8 Document required settings when running the REST test suite 2016-03-04 13:50:40 +01:00
Simon Willnauer 5008694ba1 Remove support for legacy checksums
Elasticsearch 5.0 doesn't support indices with legacy checksums anymore.
The last time we wrote legacy checksums was in 1.3.0, which was already based
on Lucene 4.9, which means that all files have CRC32 checksums.
All indices that Elasticsearch can read today must be written with
Lucene version >= 4.8 anyway, so we can drop this layer of backwards
compatibility entirely.

Since we are close to upgrading to Lucene 6.0 we should get rid of this
in a more contained change than the Lucene upgrade.
2016-03-03 22:58:18 +01:00
Adrien Grand fc0cc4a6bb Fix field_stats tests to use text/keyword instead of string. 2016-03-03 16:24:02 +01:00
Clinton Gormley 6b27de3f8c Fixed REST test to not rely on dynamic mapping 2016-03-03 14:38:10 +01:00
Clinton Gormley ce7fccb287 Fixed bad YAML in REST tests 2016-03-03 14:38:06 +01:00
Martijn van Groningen 75387001df Added `ingest_took` to bulk response to indicate how much time was spent on ingest preprocessing.
The `ingest_took` is separate from `took`, which keeps track of how much time is spent on indexing/deleting/updating.
The `ingest_took` is only visible in the REST response if ingest is enabled for at least one bulk item.
2016-03-01 18:24:26 +01:00
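A hedged sketch of a bulk request with a pipeline attached, assuming a local cluster and an existing pipeline named my-pipeline (a placeholder); the header and body conventions shown follow later Elasticsearch versions and may differ at this commit.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster; "my-pipeline" must already exist

# Bulk body is newline-delimited JSON; with a pipeline attached, the response
# should carry ingest_took alongside took.
body = (
    '{"index":{"_index":"logs","_type":"log","_id":"1"}}\n'
    '{"message":"hello"}\n'
)
resp = requests.post(
    f"{ES}/_bulk",
    params={"pipeline": "my-pipeline"},   # pipeline name is a placeholder
    data=body,
    headers={"Content-Type": "application/x-ndjson"},
)
result = resp.json()
print(result.get("took"), result.get("ingest_took"))
```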
Nik Everett c7c8bb357a Merge pull request #16861 from nik9000/reindex_is_ready
Reindex required some parsing changes for search requests to support
defaults that differ from the regular search API.
2016-03-01 10:02:48 -05:00
Spencer 3f80feb899 [REST_API_SPEC] remove invalid use of catch: param
`catch: param` is designed to catch errors generated by client-side validation logic when users don't supply valid parameters to an API request. This test though is testing the server-side validation of pipeline aggregations, and so a "param" catch is invalid. Instead we will just test for a parse_exception error type using a regex.
2016-02-29 09:27:36 -07:00
Nik Everett c38119bae9 Merge branch 'master' into feature/reindex 2016-02-26 16:59:54 -05:00
Igor Motov d6af669776 Combine node name and task id into single string task id
This commit changes the URL for task operations from `/_tasks/{nodeId}/{taskId}` to `/_tasks/{taskId}`, where `{taskId}` has the form nodeid:id
2016-02-24 12:44:12 -08:00
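A small sketch of addressing a task with the combined id described above, assuming a local cluster; the id shown is a placeholder, not a real task.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Task ids now combine node id and task id into a single "nodeid:id" string.
task_id = "nodeId123:456"  # placeholder; take a real id from GET /_tasks
resp = requests.get(f"{ES}/_tasks/{task_id}")
print(resp.json())
```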
Simon Willnauer 354aae2fec Merge pull request #16770 from s1monw/http_on_cat
Expose http address in cat/nodes and cat/nodeattrs APIs

We expose a lot of information like IP address and port but never
expose the http address/ip:port in the CAT API. It's nice to have it
there too, since otherwise JSON parsing is required to get this information.
2016-02-22 14:20:33 -08:00
David Pilato a0a6eff0d0 Fix test for [cat/recovery] Make recovery time a TimeValue()
Related to #16743
2016-02-22 13:37:11 -08:00
Simon Willnauer 3c15200f6f Expose http address in cat/nodes and cat/nodeattrs APIs
We expose a lot of information like IP address and port but never
expose the http address/ip:port in the CAT API. It's nice to have it
there too, since otherwise JSON parsing is required to get this information.
2016-02-22 13:22:54 -08:00
Lee Hinman 99052c3fef Limit the accepted length of the _id
Elasticsearch should reject ids that are too long, to ensure a document
always remains retrievable for clients that impose a maximum URI length.

Closes #16034
2016-02-22 12:34:18 -07:00
Spencer 31847c1e9d [REST API] use a block literal for request bodies 2016-02-20 12:55:23 -08:00
Spencer a859595dcd [REST API] use a block literal for request bodies 2016-02-20 12:53:39 -08:00
Adrien Grand 4f8895eae3 Add a text field.
This new field is intended to replace analyzed string fields.
2016-02-15 10:43:44 +01:00
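A minimal sketch of mapping a field with the new text type, assuming a local cluster; the index and mapping type names are placeholders.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Map a field as "text" instead of an analyzed "string" field.
requests.put(f"{ES}/articles", json={
    "mappings": {
        "article": {                      # mapping type name is a placeholder
            "properties": {
                "body": {"type": "text"}
            }
        }
    }
})
```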
Jason Tedor 3bbd1c129e Remove host from cat nodes API
As the host and ip fields are always equal by design, the host field in
the cat nodes API is redundant and should be removed.

Closes #16656
2016-02-14 09:21:32 -05:00
Nik Everett 821a20f582 Merge branch 'master' into feature/reindex 2016-02-11 17:41:05 -05:00
Nik Everett 18808b7576 Move reindex from a plugin to a module 2016-02-11 17:39:49 -05:00
Adrien Grand bc47c577d2 Add a new `keyword` field.
The `keyword` field is intended to replace `not_analyzed` string fields. It is
indexed and has doc values by default, and doesn't support enabling term
vectors.

Although it doesn't support setting an analyzer for now, there are plans for
it to support basic normalization in the future, such as case folding.
2016-02-11 18:19:53 +01:00
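A minimal sketch of mapping a field with the new keyword type, assuming a local cluster; the index and mapping type names are placeholders.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Map a field as "keyword" instead of a not_analyzed "string" field; it is
# indexed and gets doc values by default, per the commit message.
requests.put(f"{ES}/products", json={
    "mappings": {
        "product": {                      # mapping type name is a placeholder
            "properties": {
                "sku": {"type": "keyword"}
            }
        }
    }
})
```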
Igor Motov 99a7d8e41f Add task cancellation mechanism
Only tasks that extend CancellableTask can be cancelled using this mechanism. If a cancellable task has children it can elect to cancel all child tasks as well. In this case a special ban parent request is sent to all nodes. This request does two things: 1) it prevents any tasks with the banned parent task from being started, and 2) it cancels all currently running tasks that have the banned task as a parent. The ban is lifted as soon as the coordinating node notifies all other nodes that the cancelled task has finished executing. If the coordinating node leaves the cluster before it has a chance to lift its bans, all bans set by this coordinating node are automatically removed.

As an option a task can elect to automatically cancel all child tasks if their parent task was running on a node that just left the cluster. This option makes sense for cancellable heavy tasks that have no side-effects and only return results to the coordinating node. With the coordinating node gone, it doesn't make sense to run such tasks any longer since their results will be most likely discarded.
2016-02-09 22:30:57 -05:00
Yannick Welsch 0d11443aba Fix filters and null parameters in _aliases command
Closes #16549
Closes #16547
2016-02-09 21:43:42 +01:00
Andrej Kazakov 7f2b369dfd Use Accept header field in cat API
The cat API previously used the Content-Type header field for
determining the media type of the response. This is in opposition to the
HTTP spec which specifies the Accept header field for this purpose. This
commit replaces the use of the Content-Type header field with the Accept
header field in the cat API.

Closes #14421
2016-02-05 06:28:39 -05:00
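A short sketch of asking a cat endpoint for JSON via the Accept header, assuming a local cluster.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Media type selection for cat responses now honours the Accept header
# rather than Content-Type.
resp = requests.get(f"{ES}/_cat/nodes", headers={"Accept": "application/json"})
print(resp.json())
```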
Martijn van Groningen 7a6adfd93a ingest: Added foreach processor.
This processor is useful when all elements of a JSON array need to be processed in the same way.
It avoids having to define a processor for each element in the array.
Also, it is very likely that the number of elements inside a JSON array is unknown.
2016-02-04 23:44:01 +01:00
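A hedged sketch of a pipeline using the foreach processor, assuming a local cluster; the exact configuration keys (processor vs. processors, the `_ingest._value` context) at this commit are assumptions based on later documentation.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

# Apply the same processor to every element of an array field; configuration
# keys below may differ at this commit.
requests.put(f"{ES}/_ingest/pipeline/uppercase-tags", json={
    "description": "uppercase every element of the tags array",
    "processors": [
        {
            "foreach": {
                "field": "tags",
                "processor": {
                    "uppercase": {"field": "_ingest._value"}
                }
            }
        }
    ]
})
```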
Simon Willnauer 450ee70038 Remove DFS support from TermVector API
Retrieving distributed DF for TermVectors is, besides its esoteric justification,
a very slow process that can cause serious load on the cluster. We also don't have nearly
enough testing for this stuff and, given the complexity, we should remove it rather than carry it
around.
2016-02-04 16:20:24 +01:00
Yannick Welsch 4937531a17 Remove obsolete version in ShardRouting
Closes #16243
2016-02-04 15:50:25 +01:00
Tal Levy 9e7e2ab10b remove DeDotProcessor from Ingest 2016-02-02 14:16:01 -08:00
Tal Levy 3191fc7347 Merge pull request #16355 from talevy/fix_ingest_exception
revert PipelineFactoryError handling with throwing ElasticsearchParseException in ingest pipeline creation
2016-02-02 14:11:24 -08:00
Tal Levy 0a1580eefa revert PipelineFactoryError handling with throwing ElasticsearchParseException in ingest pipeline creation 2016-02-02 14:08:22 -08:00
Greg Marzouka e7fc98a33f Remove detect_noop from REST spec
Unless this is meant to be supported as a query string parameter instead, right now it only works when specified in the body.
2016-02-02 15:32:14 -05:00
Tal Levy fca442f4d1 Introduce Pipeline Factory Error Responses in Node Ingest
When an exception is thrown during pipeline creation within
REST calls (in put pipeline and simulate), we now return a structured
error response to the user with details about which processor's
configuration is the cause of the issue, which configuration property
is misconfigured, etc.
2016-01-29 13:37:27 -08:00
Jim Ferenczi 1343d6cbd1 Remove search_after from the query string param of the rest api spec.
Handle null values in search_after.
Ensure that the cluster is green after each index creation in the integ tests.
2016-01-27 19:21:01 +01:00
javanna 8006e5cd15 [TEST] re-enable and merge cluster settings REST tests
We used to have a disabled test around cluster put settings as it left cluster settings behind without a way to remove them. That has been fixed in the cluster put settings API, so the test can be re-enabled.
2016-01-27 17:37:42 +01:00
Jim Ferenczi aea7660e37 Add search_after parameter in the Search API.
The search_after parameter provides a way to efficiently paginate from one page to the next. This parameter accepts an array of sort values, which are then used by the searcher to sort the top hits starting from the first document that is greater than the sort values.
This parameter must be used in conjunction with the sort parameter; it must contain exactly the same number of values as the number of fields to sort on.

NOTE: A field with one unique value per document should be used as the last element of the sort specification. Otherwise the sort order for documents that have the same sort values would be undefined. The recommended way is to use the field `_uuid`, which is certain to contain one unique value for each document.

Fixes #8192
2016-01-27 09:42:58 +01:00
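A rough sketch of paginating with search_after, assuming a local cluster and a placeholder index named logs with a timestamp field; the sort values passed are illustrative only.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster; index "logs" is a placeholder

# Fetch the next page by passing the sort values of the last hit from the
# previous page; search_after must line up with the sort fields one-to-one.
resp = requests.post(f"{ES}/logs/_search", json={
    "size": 10,
    "sort": [
        {"timestamp": "asc"},
        {"_uuid": "asc"}        # tie-breaker field named in the commit message
    ],
    "search_after": [1456745400000, "log#1023"]  # values from the previous page's last hit
})
print(resp.json())
```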
Tal Levy ff0e8272cb [ingest] update test to verify that documents are deep-copied between verbose results 2016-01-26 14:12:42 -08:00