1544 Commits

Author SHA1 Message Date
Simon Willnauer 47a506d2db Bump compat version for local dependent test to 6.2.0 2018-01-04 16:49:31 +01:00
Simon Willnauer b68f7ed8c3
Pass `java.locale.providers=COMPAT` to Java 9 onwards (#28080)
Java 9 added some enhancements to the internationalization support that
impact our date parsing support. To ensure flawless BWC and consistent
behavior going forward Java 9 runtimes require the system property
`java.locale.providers=COMPAT` to be set.

Closes #10984
2018-01-04 16:43:51 +01:00
Mayya Sharipova dcde895f49
Introduce limit to the number of terms in Terms Query (#27968)
- Introduce index level settings to control the maximum number of terms
    that can be used in a Terms Query
- Throw an error if a request exceeds this max number

Closes #18829
2017-12-28 17:36:29 -05:00
Jim Ferenczi 0b2c8c835e
Fix composite aggregation when after term is missing in the shard (#27936)
This change fixes a bug when a keyword term in the `after` key is not present in the shard.
In this case the global ord of the document values is compared with the insertion point of the
`after` keyword and values that are equal to the insertion point should be considered "after" the top value.
2017-12-26 09:58:49 +01:00
olcbean 7f2f59eb85 delete `operation_threading` from the rest specs (#27940) 2017-12-21 13:09:11 -08:00
Mayya Sharipova cbd271e497
Limit the analyzed text for highlighting (#27934)
* Limit the analyzed text for highlighting

- Introduce index level settings to control the max number of characters
to be analyzed for highlighting
- Throw an error if analysis is required on a larger text

Closes #27517
2017-12-21 10:19:58 -05:00
Jim Ferenczi c753b82ca8 Adapt scroll rest test after backport. relates #27842 2017-12-21 09:31:56 +01:00
Colin Goodheart-Smithe 4cbbe3ed93
Fixes DocStats to not report index size < -1 (#27863)
Prior to this change, when DocStats were added together (for example when adding the index size of all primary shards for an index), we naively added the `totalSizeInBytes` values together. This worked most of the time, but not when the index size on one or more shards was reported as `-1` (no value).

This change improves the logic by considering if the current value or the value to be added is `-1`:
* If the current and new value are both `-1` the value remains at `-1`
* If the current value is `-1` and the new value is not `-1`, current value is changed to be equal to the new value
* If the current value is not `-1` and the new value is `-1` the new value is ignored and the current value is not changed
* If both the current and new values are not `-1` the current value is changed to be equal to the sum of the current and new values.
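
The four cases above can be sketched as a small function. This is an illustrative Python rendering of the rules as stated in this message, not the actual DocStats code; the names `NO_VALUE` and `add_total_size` are hypothetical.

```python
NO_VALUE = -1  # sentinel used in this sketch for "size not reported"


def add_total_size(current: int, new: int) -> int:
    """Combine two size readings where -1 means 'no value'."""
    if new == NO_VALUE:
        # Covers "both are -1" (stays -1) and "new is -1" (current unchanged).
        return current
    if current == NO_VALUE:
        # Only the new reading carries a value, so adopt it.
        return new
    # Both readings carry a value: plain sum.
    return current + new
```

Folding a list of per-shard sizes with this function yields `-1` only when every shard reported `-1`.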

The change also re-enables the failing rollover YAML test that was failing due to this bug.
2017-12-20 14:45:09 +00:00
Stuart Cam e458c6b762
timestamp and ttl in index requests (#27888)
timestamp and ttl are not accepted anymore as parameters of index/update requests.
2017-12-20 10:43:21 +11:00
Christoph Büscher 05aa1a6033 [Tests] Remove redundant rest test added in #27900
The removed rest test doesn't really test the issue
reported in #27841 and adds nothing on top of the unit test.
2017-12-19 20:04:37 +01:00
Christoph Büscher fb2fd4e8ee
Fix preserving FiltersAggregationBuilder#keyed field on rewrite (#27900)
Currently FiltersAggregationBuilder#doRewrite creates a new FiltersAggregationBuilder which doesn't correctly copy the original "keyed" field if a non-keyed filter gets rewritten.
This can cause rendering bugs of the output aggregations like the one reported in #27841.

Closes #27841
2017-12-19 19:56:12 +01:00
kel 7a27a2770b Reject scroll query if size is 0 (#22552) (#27842) 2017-12-18 10:38:41 +01:00
Jim Ferenczi 55b71a871b Adapt rest test after backport. Relates #27833 2017-12-18 10:36:44 +01:00
Jason Tedor 75c0cd0672
Move range field mapper back to core
This commit moves the range field mapper back to core so that we can
remove the compile-time dependency of percolator on mapper-extras which
complicates dependency management for the percolator client JAR, and
modules should not be intertwined like this anyway.

Relates #27854
2017-12-17 14:27:10 -05:00
kel f5e0932c8d Add version support for inner hits in field collapsing (#27822) (#27833)
Add version support for inner hits in field collapsing
2017-12-15 18:00:40 +01:00
Christoph Büscher 52cb6c8ef2 Merge branch 'master' into rankeval 2017-12-07 14:22:46 +01:00
Jim Ferenczi caea6b70fa
Add a new cluster setting to limit the total number of buckets returned by a request (#27581)
This commit adds a new dynamic cluster setting named `search.max_buckets` that can be used to limit the number of buckets created per shard or by the reduce phase. Each multi bucket aggregator can consume buckets during the final build of the aggregation at the shard level or during the reduce phase (final or not) in the coordinating node. When an aggregator consumes a bucket, a global count for the request is incremented and if this number is greater than the limit an exception is thrown (TooManyBuckets exception).
This change adds the ability for multi-bucket aggregators to "consume" buckets against the global limit, which defaults to 10,000. The consumer is opt-in, so each multi-bucket aggregator must explicitly call it when a bucket is added to the response.
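
The counting scheme described here can be sketched roughly as follows. This is a minimal Python illustration of an opt-in consumer with a request-wide counter, not the code added by this commit; the class and method names (`BucketConsumer`, `accept`, `TooManyBucketsError`) are hypothetical.

```python
class TooManyBucketsError(Exception):
    """Raised when a request creates more buckets than the limit allows."""


class BucketConsumer:
    """Request-wide bucket counter that aggregators call explicitly."""

    def __init__(self, limit: int = 10_000):
        self.limit = limit  # stands in for the `search.max_buckets` value
        self.count = 0

    def accept(self, buckets: int) -> None:
        # Each aggregator reports the buckets it adds; the total is shared
        # across the whole request.
        self.count += buckets
        if self.count > self.limit:
            raise TooManyBucketsError(
                f"{self.count} buckets exceed the limit of {self.limit}"
            )
```

Because the consumer is opt-in, an aggregator that never calls `accept` does not count towards the limit in this sketch.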

Closes #27452 #26012
2017-12-06 09:15:28 +01:00
Christoph Büscher bbec33d35c Merge branch 'master' into rankeval 2017-12-04 12:57:19 +01:00
Mayya Sharipova c6b73239ae
Limit the number of tokens produced by _analyze (#27529)
Add an index level setting `index.analyze.max_token_count` to control
the number of generated tokens in the _analyze endpoint.
Defaults to 10000.

Throw an error if the number of generated tokens exceeds this limit.

Closes #27038
2017-11-30 11:54:39 -05:00
Tanguy Leroux 41f73e0acf Fix version for include_global_state in Snapshot Status API
It also adds a Rest test.

Related #26853
2017-11-30 11:33:01 +01:00
Christoph Büscher 35688f6441 Merge branch 'master' into rankeval 2017-11-29 15:24:06 +01:00
Martijn van Groningen cb1204774b
Include the _index, _type and _id in nested search hits in the top_hits and inner_hits response.
Also include _type and _id for parent/child hits inside inner hits.

In the case of top_hits aggregation the nested search hits are
directly returned and are not grouped by a root or parent document, so
it is important to include the _id and _index attributes in order to know
which documents these nested search hits belong to.

Closes #27053
2017-11-28 14:05:29 +01:00
Nhat Nguyen 8d6bfe53bb
Remove workaround in translog rest test (#27530)
Relates #25623 and a6db0ea908
2017-11-27 09:41:30 -05:00
Christoph Büscher 5661b1c3df Merge branch 'master' into rankeval 2017-11-24 16:25:05 +01:00
Nhat Nguyen 06d35f4f01 Backport wait_for_no_initializing_shards to cluster health API
Relates #27489
2017-11-24 09:56:16 -05:00
Nhat Nguyen 46b508d6c9
Add wait_for_no_initializing_shards to cluster health API (#27489)
This adds a new option to the cluster health request that allows waiting
until there are no initializing shards.

Closes #25623
2017-11-23 15:09:58 -05:00
Simon Willnauer fadbe0de08
Automatically prepare indices for splitting (#27451)
Today we require users to prepare their indices for split operations.
Yet, we can do this automatically when an index is created which would
make the split feature a much more appealing option since it doesn't have
any 3rd party prerequisites anymore.

This change automatically sets the number of routing shards such that
an index is guaranteed to be able to split once into twice as many shards.
The number of routing shards is scaled towards the default shard limit per index
such that indices with a smaller amount of shards can be split more often than
larger ones. For instance an index with 1 or 2 shards can be split 10x
(until it approaches 1024 shards) while an index created with 128 shards can only
be split 3x by a factor of 2. Please note this is just a default value and users
can still prepare their indices with `index.number_of_routing_shards` for custom
splitting.

NOTE: this change has an impact on the document distribution since we are changing
the hash space. Documents are still uniformly distributed across all shards but since
we are artificially changing the number of buckets in the consistent hashing space
documents might be hashed into different shards compared to previous versions.

This is a 7.0 only change.
2017-11-23 09:48:54 +01:00
Mayya Sharipova 57e4d10007
Limit the number of nested documents (#27405)
Add an index level setting `index.mapping.nested_objects.limit` to control
the number of nested json objects that can be in a single document
across all fields. Defaults to 10000.

Throw an error if the number of created nested documents exceeds this
limit during the parsing of a document.

Closes #26962
2017-11-22 10:16:28 -05:00
Jim Ferenczi 90d2ead14a Adapt rest test BWC version after backport
Relates #26800
2017-11-21 15:45:02 +01:00
Christoph Büscher d979ccace9 Merge branch 'master' into rankeval 2017-11-21 14:11:02 +01:00
Jim Ferenczi 6319424e4a
Move composite aggregation to core (#27474)
This change removes the module named aggs-composite and adds the `composite` aggs
as a core aggregation. This allows other plugins to use this new aggregation
and simplifies the integration in the HL rest client.
2017-11-21 13:31:01 +01:00
Simon Willnauer 8aba7c8bbe Fix test BWC version after backport
Relates to #27468
2017-11-21 12:31:04 +01:00
Simon Willnauer ea35abca28
Protect shard splitting from illegal target shards (#27468)
While we have an assertion that checks if the number of routing shards is a multiple
of the number of shards we need a real hard exception that checks this way earlier.
This change adds a check and test that is executed before we create the index.

Relates to #26931
2017-11-21 12:09:45 +01:00
Luca Cavanna 29450de7b5
Cross Cluster Search: make remote clusters optional (#27182)
Today Cross Cluster Search requires at least one node in each remote cluster to be up when the cross cluster search is run. Otherwise the whole search request fails even though some of the data (local and/or remote) is available. This happens when performing the _search/shards calls to find out which remote shards the query has to be executed on. This scenario is different from shard failures that may happen later on when the query is actually executed, e.g. when remote shards are missing, which do not fail the whole request but rather yield partial results, as indicated by the _shards section in the response.

This commit introduces a per-cluster boolean setting called search.remote.$cluster_alias.skip_unavailable, set to false by default, which allows certain clusters to be skipped if they are down when a cross cluster search request tries to reach them. By default all clusters are mandatory.

Scroll requests support this setting too when they are first initiated (first search request with the scroll parameter), but subsequent scroll rounds (_search/scroll endpoint) will fail if some of the remote clusters have gone down in the meantime.

The search API response now contains a new _clusters section, similar to the _shards section, that is returned whenever one or more clusters were disconnected and got skipped:

"_clusters" : {
    "total" : 3,
    "successful" : 2,
    "skipped" : 1
}
This section is not part of the response if no clusters have been skipped.

The per cluster skip_unavailable setting value has also been added to the output of the remote/info API.
2017-11-21 11:41:47 +01:00
Zachary Tong 196dbf3357
Add YAML REST tests for filters bucket agg (#27128)
Related to #26220
2017-11-20 16:44:30 -05:00
Simon Willnauer 28e5cf933f Bump test version after backport
Relates to #27455
2017-11-20 16:54:59 +01:00
Simon Willnauer 720e96e288
Ensure nested documents have consistent version and seq_ids (#27455)
Today we index dummy values for seq_ids and version on nested documents.
This is on the one hand trappy since users can request these values via
inner hits and on the other hand not necessarily good for compression since
the dummy value will likely not compress well when seqIDs are lowish.

This change ensures that we share the same field values for all documents in a
nested block. This won't have any overhead, in fact it might be more efficient since
we even reduce the work needed slightly.
2017-11-20 16:50:08 +01:00
Mayya Sharipova 858b2c7cb8
Standardize underscore requirements in parameters (#27414)
Standardize underscore requirements in parameters across different types of
requests:
_index, _type, _source, _id keep their underscores
params like version and retry_on_conflict will be without underscores
Throw an error if older versions of parameters are used

BulkRequest, MultiGetRequest, TermVectorsRequest, MoreLikeThisQuery
were changed

Closes #26886
2017-11-17 15:31:52 -05:00
Yannick Welsch 3b963dcfe5 Stop skipping REST test after backport of #27056 2017-11-16 16:08:10 +01:00
kel 6b817489f3 Fix default value of ignore_unavailable for snapshot REST API (#27056)
The default value for ignore_unavailable did not match what was documented when using the REST APIs for snapshot creation and restore. This commit sets the default value of ignore_unavailable to false, the way it is documented and ensures it's the same when using either REST API or transport client.

Closes #25359
2017-11-16 16:03:09 +01:00
Clinton Gormley 1caa5c8e32 Rest test fixes (#27354)
* REST: Rename ingest.processor.grok to ingest.processor_grok
* REST: Rename remote.info to cluster.remote_info
* REST: Fixed bad YAML comments
* REST: Force dummy scripts to be strings, not numbers
* REST: Fix bad YAML in search/110_field_collapsing.yml
* REST: Adjust percentile tests to work with Perl number handling
2017-11-14 11:14:14 +01:00
Jim Ferenczi 29331f1127
Fail queries with scroll that explicitly set request_cache (#27342)
Queries that create a scroll context cannot use the cache.
They modify the search context during their execution so using the cache
can lead to duplicate results for the next scroll query.

This change fails the entire request if the request_cache option is explicitly set
on a query that creates a scroll context (`scroll=1m`) and makes sure internally that we never
use the cache for these queries when the option is not explicitly used.
For 6.x a deprecation log will be printed instead of failing the entire request and the request_cache hint
will be ignored (forced to false).
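
The behaviour described above can be sketched as a small validation helper. This is an illustrative Python reading of the message, not the code from this commit; `resolve_request_cache` and its `fail_on_conflict` flag (standing in for the master versus 6.x behaviour) are hypothetical.

```python
import warnings
from typing import Optional


def resolve_request_cache(
    scroll: Optional[str],
    request_cache: Optional[bool],
    fail_on_conflict: bool = True,
) -> Optional[bool]:
    """Return the effective request_cache value for a search request."""
    if scroll is None:
        # No scroll context: the caller's choice (or lack of one) stands.
        return request_cache
    if request_cache is not None:
        if fail_on_conflict:
            # Behaviour described for master: reject the whole request.
            raise ValueError("request_cache cannot be set on a scroll query")
        # Behaviour described for 6.x: log a deprecation and ignore the hint.
        warnings.warn("request_cache is ignored for scroll queries",
                      DeprecationWarning)
    # Scroll queries never use the cache.
    return False
```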
2017-11-10 16:02:06 +01:00
Boaz Leskes ace446f335 Update shrink's bwc version to 6.1.0 and enabled bwc tests 2017-11-07 15:35:46 +01:00
olcbean 7f593a26a3 Setting url parts as required to reflect the code base (#27263) 2017-11-06 09:58:27 -07:00
Nick Lang 09294a9b9a keys in aggs percentiles need to be in quotes. (#26905)
Languages that are more strongly typed will fail when comparing these results
2017-11-06 17:45:04 +01:00
Russ Cam a0bdedb143 Align routing param type with search.json (#26958)
Relates https://github.com/elastic/elasticsearch-net/issues/2869
2017-11-06 17:34:22 +01:00
Simon Willnauer bd7efa908a Add ability to split shards (#26931)
This change adds a new `_split` API that allows splitting an index into a new
index with a power of two more shards than the source index.  This API works
alongside the `_shrink` API but doesn't require any shard relocation before
indices can be split.

The split operation is conceptually an inverse `_shrink` operation since we
initialize the index with a _synthetic_ number of routing shards that are used
for the consistent hashing at index time. Compared to indices created with
earlier versions this might produce slightly different shard distributions but
has no impact on the per-index backwards compatibility.  For now, the user is
required to prepare an index to be splittable by setting the
`index.number_of_routing_shards` at index creation time.  The setting allows the
user to prepare the index to be splittable in factors of
`index.number_of_routing_shards`, i.e. if the index is created with
`index.number_of_routing_shards: 16` and `index.number_of_shards: 2` it can be
split into `4, 8, 16` shards. This is an intermediate step until we can make
this the default. This also allows us to safely backport this change to 6.x.
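
The `16` / `2` example can be worked through with a short sketch. This is an illustrative Python helper based only on the description in this message (targets are power-of-two multiples of the source shard count, bounded by the routing shards); the function name `split_targets` is hypothetical and the real validation may differ.

```python
from typing import List


def split_targets(number_of_shards: int, number_of_routing_shards: int) -> List[int]:
    """List the shard counts an index could be split into under this reading."""
    if number_of_routing_shards % number_of_shards != 0:
        # Assumption: routing shards must be a multiple of the shard count.
        raise ValueError("routing shards must be a multiple of the shard count")
    targets = []
    target = number_of_shards * 2
    # Keep doubling while the target still evenly divides the routing shards.
    while (target <= number_of_routing_shards
           and number_of_routing_shards % target == 0):
        targets.append(target)
        target *= 2
    return targets
```

With `number_of_shards=2` and `number_of_routing_shards=16` the helper returns the targets 4, 8 and 16, matching the example in the paragraph above.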

The `_split` operation is implemented internally as a DeleteByQuery on the
lucene level that is executed while the primary shards execute their initial
recovery. Subsequent merges that are triggered due to this operation will not be
executed immediately. All merges will be deferred until the shards are started
and will then be throttled accordingly.

This change is intended for the 6.1 feature release but will not support pre-6.1
indices to be split unless these indices have been shrunk before. In that case
these indices can be split backwards into their original number of shards.
2017-11-06 11:37:55 +01:00
olcbean e440e23ad1 Fix inconsistencies in the rest api specs for `tasks` (#27163)
modify parameters names to reflect the changes done in the code base
2017-11-06 10:11:25 +01:00
Nhat fd3fac9565 Backport the size-based index rollover to v6.1.0
Relates #27004
2017-11-04 20:14:59 -04:00
Nhat c7ce5a07f2
Add size-based condition to the index rollover API (#27160)
This is to add a max_size condition to the index rollover API. We use
a totalSizeInBytes from DocsStats to evaluate this condition.

Closes #27004
2017-11-04 19:51:48 -04:00