Commit Graph

1435 Commits

Author SHA1 Message Date
Jason Tedor e1ef3d5cc2 Add debug logging for refresh REST tests
We are chasing a test failure in the "refresh=wait_for waits until
changes are visible in search" test yet the logs currently give us no
indication what is happening. This commit adds debug logging for this
test, and cleans up this logging in a teardown section. We can remove
this additional logging after we chase the test failure down.
2017-08-01 18:54:52 +09:00
Zachary Tong caef6cc128 [TEST] Move version skip to setup in Indices.GetMapping#70_legacy_multi_type (#25816)
Since the setup attempts to create an index with two types, and the setup runs before any test,
this will fail on versions 6.0+ before it has a chance to check the skip in each individual
test.  Moving to the setup resolves this issue.
2017-07-21 11:53:48 -04:00
Simon Willnauer 0e3ad522a2 Rewrite search requests on the coordinating nodes (#25814)
This change rewrites search requests on the coordinating node before
we send requests to the individual shards. This will reduce the rewrite load
and object creation for each rewrite on the executing nodes and will fetch
resources only once instead of N times once per shard for queries like `terms`
query with index lookups. (among percolator and geo-shape)

Relates to #25791
2017-07-21 09:38:38 +02:00
Jack Conradson 9f7463e796 remove lang url parameter from stored script requests (#25779)
Also has updates to ScriptMetaData for allowing the old namespace format to be loaded all the way back through 5.0; however, it will throw an exception if two scripts share the same id but different languages.
2017-07-20 08:51:08 -07:00
Luca Cavanna 5c5d723b86 Improve error message when aliases are not supported (#25728)
With #23997 and #25268 we have changed put alias, delete alias, update aliases and delete index to not accept aliases. Instead concrete indices should be provided as their index parameter.

This commit improves the error message in case aliases are provided, from an IndexNotFoundException (404 status code) with "no such index" message, to an IllegalArgumentException (400 status code) with "The provided expression [alias] matches an alias, specify the corresponding concrete indices instead." message.

Note that there is no specific error message for the case where wildcard expressions match one or more aliases. In fact, aliases are simply ignored when expanding wildcards for such APIs. An error is thrown only when the expression ends up matching no indices at all, and allow_no_indices is set to false. In that case the error is still the generic "404 - no such index".
2017-07-18 15:40:17 +02:00
Luca Cavanna 0d8b753325 IndexClosedException to return 400 rather than 403 (#25752)
403 can be confused with security. If an API doesn't support working against closed indices and closed indices are referred to in a request, that is a bad request, hence 400 is more appropriate.
2017-07-18 10:26:32 +02:00
Christoph Büscher a6e3d356ed Change parsing of numeric `to` and `from` parameters in `date_range` aggregation (#25376)
Currently the `to` and `from` parameter in the `date_range` aggregation is not
parsed with the correct date field format from the mappings or the aggregation
if the argument is numeric, but always treated as a long value specifying
`epoch_millis`. This leads to problems e.g. when the format is `epoch_second`,
but the `to` and `from` are currently treated as millis.

With this change, we interpret these parameters according to the `format` of the target field.
If the `format` in the mappings is not compatible with numeric input values,
a compatible `format` (e.g. `epoch_millis`, `epoch_second`) must be specified in
the `date_range` aggregation itself, otherwise an error is thrown.

#Closes #17920
2017-07-18 09:45:28 +02:00
Jason Tedor e9aa60dc9d Skip shrink ignores template mapping in BWC tests
This commit reverts some changes to the shrink API ignore template
mapping REST test in favor of simply skipping the test for BWC
purposes. The complexity here is due to deprecations and lacking the
infrastructure to gracefully handle a situation like this.
2017-07-17 20:32:18 +09:00
Colin Goodheart-Smithe 7a401cd1d2
[TEST] skips shrink source mapping rest test
This change skips the rest test in `rest-api-spec/test/indices.shrink/20_source_mapping.yml` as it currently fails because if we don’t expect the deprecation warning the normal rest tests fail because they get a warning they don’t expect but if we do expect the deprecation warning the mixed cluster tests fail because they don’t get a warning which they expected.
2017-07-17 12:24:07 +01:00
Jason Tedor b1f8b75ac3 Fix warnings in shrink ignore templates test
This commit fixes an issue with the REST test that the shrink API
ignores templates. The problem is that we have to use a BWC version of
the API (for the BWC tests) but this raises deprecation warnings. This
commit adds an expectation for these deprecation warnings.
2017-07-17 18:25:37 +09:00
Simon Willnauer 2da79f2b5e [TEST] Use 5.x compatible API in shrink tests 2017-07-17 09:45:49 +02:00
Jason Tedor 5b25b5d80a Fix comment on shrink indices test
This commit fixes a comment on a shrink indices test; the comment is
wrong because the fix in question was applied starting 5.6.0.
2017-07-17 16:28:09 +09:00
Jason Tedor fd98f7abc2 Adjust skip version for shrink index test
This commit adjusts the skip version for a shrink index test that
ensures that a shrunken index ignores templates; the version can be
adjusted after the fix was backported targeting 5.6.0 and later.

Relates #25380
2017-07-17 12:56:12 +09:00
Simon Willnauer ccda0441e1 Bump BWC versions after #25658 backport to 5.6 2017-07-15 11:34:16 +02:00
Ryan Ernst 072402463b Scripting: Remove search template actions (#25717)
The dedicated search template put/get/delete actions are deprecated in
5.6. This commit removes them from 6.0.
2017-07-14 23:12:05 -07:00
Luca Cavanna 7930b8a720 Fix indices options parsing from REST in delete index API (#25709)
When parsing indices options from REST, we parse the optional parameters that are supported at REST (ignore_unavailable, allow_no_indices and expand_wildcards) and we provide the API default values for all the other (internal) options so that they are set to the new indices options while parsing. The `ignoreAliases` option was forgotten though, which means that whenever you pass in any index option at REST to the delete index API, you get to delete aliases like it was supported before (as ignoreAliases gets set to false like in all the other APIs).

Added unit tests for IndicesOptions parsing from REST parameters, and yaml tests for the delete index API.
2017-07-14 10:39:44 +02:00
Colin Goodheart-Smithe 11477a608f Removes FieldStats API (#25628)
* Removes FieldStats API

* iter

* iter
2017-07-13 11:56:46 +01:00
Sergey Galkin e2bfb35f4a Shrunk indices should ignore templates
A shrunk index should ignore anything from templates and instead take
its mappings, aliases, and settings from the original index, plus any
new settings and aliases passed in with the shrink request. This commit
causes this to be the case.

Relates #25380
2017-07-12 18:27:38 -04:00
Simon Willnauer e81804cfa4 Add a shard filter search phase to pre-filter shards based on query rewriting (#25658)
Today if we search across a large amount of shards we hit every shard. Yet, it's quite
common to search across an index pattern for time based indices but filtering will exclude
all results outside a certain time range ie. `now-3d`. While the search can potentially hit
hundreds of shards the majority of the shards might yield 0 results since there is not document
that is within this date range. Kibana for instance does this regularly but used `_field_stats`
to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results.

This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards
and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.
2017-07-12 22:19:20 +02:00
Adrien Grand f9fbce84b6 Optimize the order of bytes in uuids for better compression. (#24615)
Flake ids organize bytes in such a way that ids are ordered. However, we do not
need that property and could reorganize bytes in an order that would better suit
Lucene's terms dict instead.

Some synthetic tests suggest that this change decreases the disk footprint of
the `_id` field by about 50% in many cases (see `UUIDTests.testCompression`).
For instance, when simulating the indexing of 10M docs at a rate of 10k docs
per second, the current uid generator used 20.2 bytes per document on average,
while this new generator which only puts bytes in a different order uses 9.6
bytes per document on average.

We had already explored this idea in #18209 but the attempt to share long common
prefixes had had a bad impact on indexing speed. This time I have been more
careful about putting discriminant bytes early in the `_id` in a way that
preserves indexing speed on par with today, while still allowing for better
compression.
2017-07-11 17:28:23 +02:00
Simon Willnauer 98c91a3bd0 Limit the number of concurrent shard requests per search request (#25632)
This is a protection mechanism to prevent a single search request from
hitting a large number of shards in the cluster concurrently. If a search is
executed against all indices in the cluster this can easily overload the cluster
causing rejections etc. which is not necessarily desirable. Instead this PR adds
a per request limit of `max_concurrent_shard_requests` that throttles the number of
concurrent initial phase requests to `256` by default. This limit can be increased per request
and protects single search requests from overloading the cluster. Subsequent PRs can introduces
addiontional improvemetns ie. limiting this on a `_msearch` level, making defaults a factor of
the number of nodes or sort shards iters such that we gain the best concurrency across nodes.
2017-07-11 16:23:10 +02:00
Boaz Leskes a6db0ea908 Run Translog retention yaml tests with no replicas
Initializing replicas change the translog retention logic and confuses the test.

Switch to the solution suggested in https://github.com/elastic/elasticsearch/issues/25623, if implemented
2017-07-10 11:06:57 +02:00
olcbean 2ba9fd2aec Remove deprecated created and found from index, delete and bulk (#25516)
The created and found fields in index and delete responses became obsolete after the introduction of the result field in index, update and delete responses (#19566).

After deprecating the created and found fields in 5.x (#19633), now they are removed.

Fixes #19630
2017-07-07 13:58:46 -04:00
Jim Ferenczi 31614c3ddb Remove deprecated fielddata_fields from search request (#25566)
... and inner_hits
2017-07-06 13:02:28 +02:00
Colin Goodheart-Smithe 41abccf6c5 Adds rewrite phase to aggregations (#25495)
* Adds rewrite phase to aggregations

This change adds aggregations to the rewrite performed by the `SearchSourceBuilder`. This means that `AggregationBuilder`s are able to implement a `rewrite()` method where they can return a new `AggregationBuilder` which is functionally the same but in a more primitive form. This is exactly analogous to the rewrite done by the `QueryBuilder`s.

The first aggregation to implement the rewrite are the filter and filters aggregations so they can rewrite the filters they contain.

Closes #17676

* Removes rewrite from PipelineAggregationBuilder

Rewrite is based on shard level information. Since pipeline aggregation are run in the reduce phase it doesn’t make sense to rewrite them on the shards. In fact eventually we shouldn’t be transporting them to the shards at all and should be retaining them on the coordinating node for execution in the reduce phase

* Addresses review comments

* addresses more review comments

* Fixed imports
2017-07-04 16:47:48 +01:00
Jun Ohtani 6894ef6057 [Analysis] Support normalizer in request param (#24767)
* [Analysis] Support normalizer in request param

Support normalizer param
Support custom normalizer with char_filter/filter param

Closes #23347
2017-07-04 19:16:56 +09:00
Jason Tedor 6ae4497c13 Adjust BWC version on bad allocation request test
This commit adjusts the BWC version on the bad cluster allocation
explain request test as changing the API to respond with a bad request
status instead of an internal server error status was backported to 5.x
to be included in 5.6.0.

Relates #25503
2017-06-30 18:05:58 -04:00
Jason Tedor c70c440050 Adjust status on bad allocation explain requests
When a user requests a cluster allocation explain in a situation where
it does not make sense (for example, there are no unassigned shards), we
should consider this a bad request instead of a server error. Yet, today
by throwing an illegal state exception, these are treated as server
errors. This commit adjusts these so that they throw illegal argument
exceptions and are treated as bad requests.

Relates #25503
2017-06-30 17:50:20 -04:00
Glen Smith 1dd28808d5 Fix typo in name of test
This commit fixes a typo in the name of a REST test.

Relates #25451
2017-06-30 12:51:37 -04:00
Ali Beyad b18bfd6062 Output all empty snapshot info fields if in verbose mode (#25455)
In #24477, a less verbose option was added to retrieve snapshot info via
GET /_snapshot/{repo}/{snapshots}.  The point of adding this less
verbose option was so that if the repository is a cloud based one, and
there are many snapshots for which the snapshot info needed to be
retrieved, then each snapshot would require reading a separate snapshot
metadata file to pull out the necessary information.  This can be costly
(performance and cost) on cloud based repositories, so a less verbose
option was added that only retrieves very basic information about each
snapshot that is all available in the index-N blob - requiring only one
read!

In order to display this less verbose snapshot info appropriately, logic
was added to not display those fields which could not be populated.
However, this broke integrators (e.g. ECE) that required these fields to
be present, even if empty.  This commit is to return these fields in the
response, even if empty, if the verbose option is set.
2017-06-28 17:37:56 -05:00
Andreas Gebhardt a156ccd80e Expand `/_cat/nodes` to return information about hard drive (#21775)
Expand `/_cat/nodes` with already present information about available disk space `diskAvail` (alias: `d`, `disk`) by:

    * `diskTotal` (alias `dt`): total disk space
    * `diskUsed` (alias `du`): used disk space (`diskTotal - diskAvail`)
    * `diskUsedPercent` (alias `dup`): used disk space percentage

Note: The available disk space is the number of bytes available to the node's Java virtual machine. The size might be smaller than the real one. That means the used disk space (percentage) is larger.

Closes #21679
2017-06-28 18:20:20 +02:00
Simon Willnauer 4e4a104f4a Remove remaining `index.mapper.single_type` setting usage from tests (#25388)
This change removes the remaining explicitly specified `index.mapper.single_type`
settings from tests in order to allow the removal of the setting.
This is the already approved part of #25375 broken out to simplfiy reviews on
2017-06-25 12:25:41 +02:00
Boaz Leskes d963882053 Enable a long translog retention policy by default (#25294)
#25147  added the translog deletion policy but didn't enable it by default. This PR enables a default retention of 512MB (same maximum size of the current translog) and an age of 12 hours (i.e., after 12 hours all translog files will be deleted). This increases to chance to have an ops based recovery, even if the primary flushed or the replica was offline for a few hours.

In order to see which parts of the translog are committed into lucene the translog stats are extended to include information about uncommitted operations.

Views now include all translog ops and guarantee, as before, that those will not go away. Snapshotting a view allows to filter out generations that are not relevant based on a specific sequence number.

Relates to #10708
2017-06-22 17:08:14 +02:00
Spencer c5b79cd460 [rest-api-spec/indices.refresh] Remove old params
Fixes #25234
2017-06-21 13:44:27 -07:00
Lee Hinman 50bac63210 [TEST] Add skip for 5.x BWC tests for custom filter in analyze API
Resolves #25316
2017-06-20 09:25:03 -06:00
Jun Ohtani 62d1969595 Parse synonyms with the same analysis chain (#8049)
* [Analysis] Parse synonyms with the same analysis chain

Synonym Token Filter / Synonym Graph Filter tokenize synonyms with whatever tokenizer and token filters appear before it in the chain.

Close #7199
2017-06-20 21:50:33 +09:00
Jim Ferenczi 68deda6d03 FastVectorHighlighter should not cache the field query globally (#25197)
This commit removes the global caching of the field query and replaces it with
a caching per field. Each field can use a different `highlight_query` and the rewriting of
some queries (prefix, automaton, ...) depends on the targeted field so the query used for highlighting
must be unique per field.
There might be a small performance penalty when highlighting multiple fields since the query needs to be rewritten
once per highlighted field with this change.

Fixes #25171
2017-06-15 00:33:01 +02:00
Boaz Leskes 43f4ae5a7b Indices.rollover/10_basic should refresh to make the doc visible in lucene stats 2017-06-13 23:37:15 +02:00
Boaz Leskes d3c97615c1 Adapt skip version in rest-api-spec/test/indices.rollover/20_max_doc_condition.yml
The relevant change was backported.
2017-06-13 14:46:15 +02:00
Sergey Galkin 1c95cbc4e8 Rollover max docs should only count primaries (#24977)
max_doc condition for index rollover should use document count only from primary shards 

Fixes #24217
2017-06-13 14:30:46 +02:00
Spencer 88591fecac [docs] include two cluster doc pages missing from index (#25180)
* [docs] include two cluster doc pages missing from index

* [rest-api-spec] update link to remote-info docs
2017-06-12 12:33:56 -07:00
Jason Tedor 725f6b6983 Change BWC versions on get mapping 404s
This commit changes the BWC versions on the get mapping 404s now that
this API returning 404s when a type is missing is supported since 5.5.0.

Relates #23192
2017-06-11 16:59:12 -04:00
Jason Tedor dcf57f296e Fix get mappings HEAD requests
Get mappings HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for get
mappings HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.

Relates #23192
2017-06-11 14:58:56 -04:00
Jim Ferenczi 5cdbebec94 Test: remove faling test that relies on merge order 2017-06-10 11:55:41 +02:00
Jason Tedor 8a45c3105f Change BWC versions on create index response
This commit changes the BWC versions on the create index response now
that the index name in the response is supported since 5.6.0.

Relates #25139
2017-06-09 13:52:08 -04:00
Sergey Novikov 7c8657df0e Return the index name on a create index response
This commit modifies the create index response so that it includes the
index name.

Relates #25139
2017-06-09 13:47:47 -04:00
Tal Levy a771912a22 Add Ingest-Processor specific Rest Endpoints & Add Grok endpoint (#25059)
This PR enables Ingest plugins to leverage processor-scoped REST
endpoints. First of which being the Grok endpoint that retrieves
Grok Patterns for users to retrieve all the built-in patterns.
Example usage: Kibana Grok Autocomplete!
2017-06-08 15:24:35 -07:00
Lee Hinman 119f8ed9f0 Correctly enable _all for older 5.x indices
When we disabled `_all` by default for indices created in 6.0, we missed adding
a layer that would handle the situation where `_all` was not enabled in 5.x and
then the cluster was updated to 6.0, this means that when the cluster was
updated the `_all` field would be disabled for 5.x indices and field values
would not be added to the `_all` field.

This adds a compatibility layer for 5.x indices where we treat the default
enabled value for the `_all` field to be `true` if unset on 5.x indices.

Resolves #25068
2017-06-08 14:37:44 -06:00
Lee Hinman 050b7cd0f9 Include empty mappings in GET /{index}/_mappings requests (#25118)
Previously this would output:

```
GET /test-1/_mappings

{ }
```

And after this change:

```
GET /test-1/_mappings

{
  "test-1": {
    "mappings": {}
  }
}
```

To bring parity back to the REST output after #24723.

Relates to #25090
2017-06-08 10:57:04 -06:00
Lee Hinman 5b2ab96364 Return index name and empty map for /{index}/_alias with no aliases
Previously in #24723 we changed the `_alias` API to not go through the
`RestGetIndicesAction` endpoint, instead creating a `RestGetAliasesAction` that
did the same thing.

This changes the formatting so that it matches the old formatting of the
endpoint, before:

```
GET /test-1/_alias

{ }
```

And after this change:

```
GET /test-1/_alias

{
  "test-1": {
    "aliases": {}
  }
}
```

This is related to #25090
2017-06-08 10:03:03 -06:00
Jim Ferenczi 36a5cf8f35 Automatically early terminate search query based on index sorting (#24864)
This commit refactors the query phase in order to be able
to automatically detect queries that can be early terminated.
If the index sort matches the query sort, the top docs collection is early terminated
on each segment and the computing of the total number of hits that match the query is delegated to a simple TotalHitCountCollector.
This change also adds a new parameter to the search request called `track_total_hits`.
It indicates if the total number of hits that match the query should be tracked.
If false, queries sorted by the index sort will not try to compute this information and 
and will limit the collection to the first N documents per segment.
Aggregations are not impacted and will continue to see every document
even when the index sort matches the query sort and `track_total_hits` is false.

Relates #6720
2017-06-08 12:10:46 +02:00
Simon Willnauer d57641a747 Skip rest tests that use mutiple types in pure 6.x clusters (#24965)
This change skips rest tests that use mutlitple types if the cluster
is a pure 6.x cluster. This allows all indics to be created with a version
less than 6.0 and that means we can safely use the `mapping.single_type` setting.

Relates to #24961
2017-06-07 15:00:17 +02:00
Jason Tedor 5a0b159cb7 Modify skips for get missing alises tests
Previous work modified the status code on the get aliases API when an
alias is missing so that these requests 404 now. This change was also
backported to 5.5 so we can adjust the skips to skip everything before
5.5.0.
2017-06-06 15:35:30 -04:00
Jason Tedor e03c4938c5 GET aliases should 404 if aliases are missing
Previously the HEAD and GET aliases endpoints were misaigned in
behavior. The HEAD verb would 404 if any aliases are missing while the
GET verb would not if any aliases existed. When HEAD was aligned with
GET, this broke the previous usage of HEAD to serve as an existence
check for aliases. It is the behavior of GET that is problematic here
though, if any alias is missing the request should 404. This commit
addresses this by modifying the behavior of GET to behave in this
way. This fixes the behavior for HEAD to also 404 when aliases are
missing.

Relates #25043
2017-06-06 14:37:29 -04:00
Jim Ferenczi 4a8759ef4c Collapse inner hits rest test should not skip 5.x
Relates https://github.com/elastic/elasticsearch/pull/24517
2017-06-06 09:33:56 +02:00
jaymode e98d5676b3
Test: update missing body tests to run against versions >= 5.5.0
This updates the missing body tests to run against versions >= 5.5.0 after backporting the change
to the 5.x branch.

See #23497
2017-06-05 14:26:07 -06:00
Alex Benusovich 5463294ec4 Fixed NPEs caused by requests without content. (#23497)
REST handlers that require a body will throw an an ElasticsearchParseException "request body required".
REST handlers that require a body OR source param will throw an ElasticsearchParseException "request body or source param required".
Replaced asserts in BulkRequest parsing code with a more descriptive IllegalArgumentException if the line contains an empty object.
Updated bulk REST test to verify an empty action line is rejected properly.
Updated BulkRequestTests with randomized testing for an empty action line.
Used try-with-resouces for XContentParser in AbstractBulkByQueryRestHandler.
2017-06-05 09:08:14 -06:00
Lee Hinman 134b0d594e [TEST] Skip wildcard expansion test due to breaking change
Relates to #24723
2017-06-02 20:48:52 -06:00
Lee Hinman a32d1b91fa Remove comma-separated feature parsing for GetIndicesAction
This removes the parsing of things like `GET /idx/_aliases,_mappings`, instead,
a user must choose between retriving all index metadata with `GET /idx`, or only
a specific form such as `GET /idx/_settings`.

Relates to (and is a prerequisite of) #24437
2017-06-02 14:43:38 -06:00
Colin Goodheart-Smithe 779fb9a1c0 Adds nodes usage API to monitor usages of actions (#24169)
* Adds nodes usage API to monitor usages of actions

The nodes usage API has 2 main endpoints

/_nodes/usage and /_nodes/{nodeIds}/usage return the usage statistics
for all nodes and the specified node(s) respectively.

At the moment only one type of usage statistics is available, the REST
actions usage. This records the number of times each REST action class is
called and when the nodes usage api is called will return a map of rest
action class name to long representing the number of times each of the action
classes has been called.

Still to do:

* [x] Create usage service to store usage statistics
* [x] Record usage in REST layer
* [x] Add Transport Actions
* [x] Add REST Actions
* [x] Tests
* [x] Documentation

* Rafactors UsageService so counts are done by the handlers

* Fixing up docs tests

* Adds a name to all rest actions

* Addresses review comments
2017-06-02 08:46:38 +01:00
Ryan Ernst 8d88b94372 Scripting: Add optional context parameter to put stored script requests (#25014)
This commit adds an optional `context` url parameter to the put stored
script request. When a context is specified, the script is compiled
against that context before storing, as a validation the script will
work when used in that context.
2017-06-01 17:53:48 -07:00
Jim Ferenczi 47cf7825dd Move BWC version to 5.5 after backport
Relates to #24517
2017-05-26 14:57:07 +02:00
Matt Weber 601a61a91c Support Multiple Collapse Inner Hits
Support multiple named inner hits on a field collapsing
request.
2017-05-26 13:23:57 +02:00
markharwood a64937db7a Test fix - rest test missing version skip for new 6.0 significant_text agg 2017-05-24 17:05:02 +01:00
markharwood b7197f5e21 SignificantText aggregation - like significant_terms, but for text (#24432)
* SignificantText aggregation - like significant_terms but doesn’t require fielddata=true, recommended used with `sampler` agg to limit expense of tokenizing docs and takes optional `filter_duplicate_text`:true setting to avoid stats skew from repeated sections of text in search results.

Closes #23674
2017-05-24 13:46:43 +01:00
Ryan Ernst 463fe2f4d4 Scripting: Remove file scripts (#24627)
This commit removes file scripts, which were deprecated in 5.5.

closes #21798
2017-05-17 14:42:25 -07:00
Jim Ferenczi 67c41d2e77 Fix ExpandSearchPhase when response contains no hits (#24688)
This change skips the expand search phase entirely when there is no search hits in the response.
2017-05-17 14:15:40 +02:00
Ryan Ernst 2a65bed243 Tests: Change rest test extension from .yaml to .yml (#24659)
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
2017-05-16 17:24:35 -07:00
Jim Ferenczi 279a18a527 Add parent-join module (#24638)
* Add parent-join module

This change adds a new module named `parent-join`.
The goal of this module is to provide a replacement for the `_parent` field but as a first step this change only moves the `has_child`, `has_parent` queries and the `children` aggregation to this module.
These queries and aggregations are no longer in core but they are deployed by default as a module.

Relates #20257
2017-05-12 15:58:06 +02:00
Jim Ferenczi 7b7e15023a Add rest test for sliced scroll (#24630) 2017-05-12 00:07:24 +02:00
qwerty4030 e7d352b489 Compound order for histogram aggregations. (#22343)
This commit adds support for histogram and date_histogram agg compound order by refactoring and reusing terms agg order code. The major change is that the Terms.Order and Histogram.Order classes have been replaced/refactored into a new class BucketOrder. This is a breaking change for the Java Transport API. For backward compatibility with previous ES versions the (date)histogram compound order will use the first order. Also the _term and _time aggregation order keys have been deprecated; replaced by _key.

Relates to #20003: now that all these aggregations use the same order code, it should be easier to move validation to parse time (as a follow up PR).

Relates to #14771: histogram and date_histogram aggregation order will now be validated at reduce time.

Closes #23613: if a single BucketOrder that is not a tie-breaker is added with the Java Transport API, it will be converted into a CompoundOrder with a tie-breaker.
2017-05-11 18:06:26 +01:00
Ali Beyad 743217a430 Enhances get snapshots API to allow retrieving repository index only (#24477)
Currently, the get snapshots API (e.g. /_snapshot/{repositoryName}/_all)
provides information about snapshots in the repository, including the
snapshot state, number of shards snapshotted, failures, etc.  In order
to provide information about each snapshot in the repository, the call
must read the snapshot metadata blob (`snap-{snapshot_uuid}.dat`) for
every snapshot.  In cloud-based repositories, this can be expensive,
both from a cost and performance perspective.  Sometimes, all the user
wants is to retrieve all the names/uuids of each snapshot, and the
indices that went into each snapshot, without any of the other status
information about the snapshot.  This minimal information can be
retrieved from the repository index blob (`index-N`) without needing to
read each snapshot metadata blob.

This commit enhances the get snapshots API with an optional `verbose`
parameter.  If `verbose` is set to false on the request, then the get
snapshots API will only retrieve the minimal information about each
snapshot (the name, uuid, and indices in the snapshot), and only read
this information from the repository index blob, thereby giving users
the option to retrieve the snapshots in a repository in a more
cost-effective and efficient manner.

Closes #24288
2017-05-10 15:48:40 -04:00
Isabel Drost-Fromm bd559d96d4
This adds max_concurrent_searches to multi-search-template endpoint.
Closes #20912
2017-05-10 11:23:24 +02:00
Adrien Grand a72eaa8e0f Identify documents by their `_id`. (#24460)
Now that indices have a single type by default, we can move to the next step
and identify documents using their `_id` rather than the `_uid`.

One notable change in this commit is that I made deletions implicitly create
types. This helps with the live version map in the case that documents are
deleted before the first type is introduced. Otherwise there would be no way
to differenciate `DELETE index/foo/1` followed by `PUT index/foo/1` from
`DELETE index/bar/1` followed by `PUT index/foo/1`, even though those are
different if versioning is involved.
2017-05-09 16:33:52 +02:00
Jim Ferenczi 4df636b5ef Fix single shard scroll within a cluster with nodes in version >= 5.3 and <= 5.3 (#24512)
If a node in version >= 5.3 acts as a coordinating node during a scroll request that targets a single shard, the scroll may return the same documents over and over iff the targeted shard is hosted by a node with a version <= 5.3.
The nodes in this version will advance the scroll only if the search_type has been set to `query_and_fetch` though this search type has been removed in 5.3.
This change handles this situation by adding the removed search_type in the request that targets a node in version <= 5.3.
2017-05-09 09:14:17 +02:00
Simon Willnauer 5bfb98ade4 [TEST] Reenable disabled tests for _field_caps and _search_shards (#24505) 2017-05-05 16:02:26 +02:00
Simon Willnauer 8055b14f2e Temporarily disable tests 2017-05-05 12:08:00 +02:00
Simon Willnauer 03267e03da Fix NPE if field caps request has a field that exists not in all indices (#24504)
If a field caps request contains a field name that doesn't exist in all indices
the response will be partial and we hide an NPE. The NPE is now fixed but we still
have the problem that we don't pass on errors on the shard level to the user. This will
be fixed in a followup.
2017-05-05 11:56:03 +02:00
Simon Willnauer 6b67e0bf2f Include all aliases including non-filtering in `_search_shards` response (#24489)
`_search_shards`API today only returns aliases names if there is an alias
filter associated with one of them. Now it can be useful to see which aliases
have been expanded for an index given the index expressions. This change also includes non-filtering aliases even without a filtering alias being present.
2017-05-05 09:34:12 +02:00
Simon Willnauer 07f106d39c [TEST] Rollback temporarily disabled field_caps test (#24483) 2017-05-04 14:14:22 +02:00
Simon Willnauer 14e57bf9f8 Add cross cluster support to `_field_caps` (#24463)
To support kibana this commit adds an internal optimization
to support the cross cluster syntax for indices on the `_field_caps`
API.

Closes #24334
2017-05-04 11:44:54 +02:00
Jim Ferenczi 6fcd24d264 Check index sorting with no replica since we cannot ensure that the replica index is ready when forceMerge is called. Closes #24416 2017-05-02 20:25:09 +02:00
Luca Cavanna 91fbb0ba28 Move IndicesAliasesRequest#concreteAliases to TransportIndicesAliasesAction (#24400)
This method has to do with how the transport action may or may not resolve wildcards expressions to aliases names. It is only needed in TransportIndicesAliasesAction and for this reason it should be a private method in it rather than part of a request class which is also part of the Java API and later in the high level REST client.
2017-05-01 19:59:06 +02:00
javanna 7863407b46 [TEST] fix _cat/allocation index size check
The check expected the size of the index to always be returned in bytes, but that can possibly be kb, mb, gb and tb depending on the actual size.
2017-05-01 14:38:14 +02:00
Guillaume Le Floch 382a617d34 Handle multiple aliases in _cat/aliases api (#23698)
The alias parameter was documented as a list in our rest-spec, yet only the first value out of a list was getting read and processed. This commit adds support for multiple aliases to _cat/aliases

Closes #23661
2017-04-28 15:21:44 +02:00
Adrien Grand 1be2800120 Only allow one type on 7.0 indices (#24317)
This adds the `index.mapping.single_type` setting, which enforces that indices
have at most one type when it is true. The default value is true for 6.0+ indices
and false for old indices.

Relates #15613
2017-04-27 08:43:20 +02:00
Guillaume Le Floch 739cb35d1b Allow passing single scrollID in clear scroll API body (#24242)
* Allow single scrollId in string format

Closes #24233
2017-04-25 13:43:21 +02:00
Nik Everett caf376c8af Start building analysis-common module (#23614)
Start moving built in analysis components into the new analysis-common
module. The goal of this project is:
1. Remove core's dependency on lucene-analyzers-common.jar which should
shrink the dependencies for transport client and high level rest client.
2. Prove that analysis plugins can do all the "built in" things by moving all
"built in" behavior to a plugin.
3. Force tests not to depend on any oddball analyzer behavior. If tests
need anything more than the standard analyzer they can use the mock
analyzer provided by Lucene's test infrastructure.
2017-04-19 18:51:34 -04:00
Jim Ferenczi f05af0a382 Enable index-time sorting (#24055)
This change adds an index setting to define how the documents should be sorted inside each Segment.
It allows any numeric, date, boolean or keyword field inside a mapping to be used to sort the index on disk.
It is not allowed to use a `nested` fields inside an index that defines an index sorting since `nested` fields relies on the original sort of the index.
This change does not add early termination capabilities in the search layer. This will be added in a follow up.

Relates #6720
2017-04-19 14:36:11 +02:00
Adrien Grand 4632661bc7 Upgrade to a Lucene 7 snapshot (#24089)
We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now function of the number of docs that have a value rather than the total number of docs in the index.

Some notes about the change:
 - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors
 - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed
 - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge
2017-04-18 15:17:21 +02:00
Jason Tedor 972bdc09ee Reject empty IDs
When indexing a document via the bulk API where IDs can be explicitly
specified, we currently accept an empty ID. This is problematic because
such a document can not be obtained via the get API. Instead, we should
rejected these requets as accepting them could be a dangerous form of
leniency. Additionally, we already have a way of specifying
auto-generated IDs and that is to not explicitly specify an ID so we do
not need a second way. This commit rejects the individual requests where
ID is specified but empty.

Relates #24118
2017-04-15 10:36:03 -04:00
Lee Hinman 5cace8e48a Remove shadow replicas
Resolves #22024
2017-04-11 11:26:26 -06:00
Simon Willnauer 42e0b4f5e9 [TEST] Enable new REST test for 5.4 and BWC against 5.4.x 2017-04-11 13:30:45 +02:00
Simon Willnauer f22e0dc30b Add cross-cluster search remote cluster info API (#23969)
This commit adds an API to discover information like seed nodes,
http addresses and connection status of a configured remote cluster.

Closes #23925
2017-04-11 09:24:40 +02:00
Jim Ferenczi af49c46b76 Fix BWC tests for field_stats now that the deprecation has been back ported to 5.4 2017-04-10 12:40:37 +02:00
Jim Ferenczi 9b3c85dd88 Deprecate _field_stats endpoint (#23914)
_field_stats has evolved quite a lot to become a multi purpose API capable of retrieving the field capabilities and the min/max value for a field.
In the mean time a more focused API called `_field_caps` has been added, this enpoint is a good replacement for _field_stats since he can
retrieve the field capabilities by just looking at the field mapping (no lookup in the index structures).
Also the recent improvement made to range queries makes the _field_stats API obsolete since this queries are now rewritten per shard based on the min/max found for the field.
This means that a range query that does not match any document in a shard can return quickly and can be cached efficiently.
For these reasons this change deprecates _field_stats. The deprecation should happen in 5.4 but we won't remove this API in 6.x yet which is why
 this PR is made directly to 6.0.
 The rest tests have also been adapted to not throw an error while this change is backported to 5.4.
2017-04-10 10:10:16 +02:00
jaymode 53e3ddf2f0
Test: remove test that will never run on master
This test was added in #23950 for backporting and review, but it is always skipped on master so
this commit deletes it.
2017-04-06 15:50:08 -04:00
Jay Modi 495bf21b46 Preserve response headers when creating an index (#23950)
This commit preserves the response headers when creating an index and updating settings for an
index.

Closes #23947
2017-04-06 20:38:09 +01:00
Clinton Gormley 01b807f98e Adapted search_shards rest test to work with Perl
Because of the way Perl treats numbers, the boost is represented
as 1 instead of 1.0, which caused this test to fail.
2017-04-02 12:52:34 +02:00
Clinton Gormley 5b3c662145 To examine an exception in rest tests, the exception should be caught, not ignored 2017-04-02 12:52:30 +02:00