Commit Graph

2139 Commits

Author SHA1 Message Date
Martijn van Groningen f0dd9b4ace
Add data stream timestamp validation via metadata field mapper (#59002)
Backport of #58582 to 7.x branch.

This commit adds a new metadata field mapper that validates,
that a document has exactly a single timestamp value in the data stream timestamp field and
that the timestamp field mapping only has `type`, `meta` or `format` attributes configured.
Other attributes can affect the guarantee that an index with this meta field mapper has a
useable timestamp field.

The MetadataCreateIndexService inserts a data stream timestamp field mapper whenever
a new backing index of a data stream is created.

Relates to #53100
2020-07-06 11:32:33 +02:00
Dan Hermann 7c43cbca82
[7.x] Ignore matching data streams if include_data_streams is false (#59028) 2020-07-03 14:51:32 -05:00
Dan Hermann c1781bc7e7
[7.x] Add include_data_streams flag for authorization (#59008) 2020-07-03 12:58:39 -05:00
Dan Hermann 5e7746d3bd
[7.x] Mirror privileges over data streams to their backing indices (#58991) 2020-07-03 06:33:38 -05:00
Lee Hinman d3d03fc1c6
[7.x] Add default composable templates for new indexing strategy (#57629) (#58757)
Backports the following commits to 7.x:

    Add default composable templates for new indexing strategy (#57629)
2020-07-01 09:32:32 -06:00
David Turner 822b7421ce Forbid read-only-allow-delete block in blocks API (#58727)
The read-only-allow-delete block is not really under the user's control
since Elasticsearch adds/removes it automatically. This commit removes
support for it from the new API for adding blocks to indices that was
introduced in #58094.
2020-07-01 13:18:26 +01:00
Martijn van Groningen a0df96befb
Add data stream support to put mapping and update index settings APIs. (#58758)
Backport of #58231 to 7.x branch.

Change update index setting and put mapping api
to execute on all backing indices if data stream is targeted.

Relates #53100
2020-07-01 13:32:21 +02:00
David Turner 3a234d2669
Account for remaining recovery in disk allocator (#58800)
Today the disk-based shard allocator accounts for incoming shards by
subtracting the estimated size of the incoming shard from the free space on the
node. This is an overly conservative estimate if the incoming shard has almost
finished its recovery since in that case it is already consuming most of the
disk space it needs.

This change adds to the shard stats a measure of how much larger each store is
expected to grow, computed from the ongoing recovery, and uses this to account
for the disk usage of incoming shards more accurately.

Backport of #58029 to 7.x

* Picky picky

* Missing type
2020-07-01 10:12:44 +01:00
Lee Hinman efa0686a6c Fix template name in mapping composition yml test (#58788)
The warning was copied from elsewhere and just needed to use the correct template and index name.
2020-06-30 17:04:15 -06:00
Dan Hermann 1c2a726731
Data stream support for search shards API (#58486) (#58765) 2020-06-30 17:59:51 -05:00
Dan Hermann cae49b0fd7
[7.x] Add data stream support to open index API (#58767) 2020-06-30 14:30:32 -05:00
Dan Hermann a84ff81743
Data stream support for get field mappings API (#58488) (#58766) 2020-06-30 13:45:04 -05:00
Julie Tibshirani ab65a57d70
Merge mappings for composable index templates (#58709)
This PR implements recursive mapping merging for composable index templates.

When creating an index, we perform the following:
* Add each component template mapping in order, merging each one in after the
last.
* Merge in the index template mappings (if present).
* Merge in the mappings on the index request itself (if present).

Some principles:
* All 'structural' changes are disallowed (but everything else is fine). An
object mapper can never be changed between `type: object` and `type: nested`. A
field mapper can never be changed to an object mapper, and vice versa.
* Generally, each section is merged recursively. This includes `object`
mappings, as well as root options like `dynamic_templates` and `meta`. Once we
reach 'leaf components' like field definitions, they always overwrite an
existing one instead of being merged.

Relates to #53101.
2020-06-30 08:01:37 -07:00
Yannick Welsch b885cbff1a
Add index block api (#58716)
Adds an API for putting an index block in place, which also ensures for write blocks that, once successfully returning to
the user, all shards of the index are properly accounting for the block, for example that all in-flight writes to an index have
been completed after adding the write block.

This API allows coordinating more complex workflows, where it is crucial that an index is no longer receiving writes after
the API completes, useful for example when marking an index as read-only during an upgrade in order to reindex its
documents.
2020-06-30 14:06:52 +02:00
Nik Everett 03e6d1b535
Add Variable Width Histogram Aggregation (backport of #42035) (#58440)
Implements a new histogram aggregation called `variable_width_histogram` which
dynamically determines bucket intervals based on document groupings. These
groups are determined by running a one-pass clustering algorithm on each shard
and then reducing each shard's clusters using an agglomerative
clustering algorithm.

This PR addresses #9572.

The shard-level clustering is done in one pass to minimize memory overhead. The
algorithm was lightly inspired by
[this paper](https://ieeexplore.ieee.org/abstract/document/1198387). It fetches
a small number of documents to sample the data and determine initial clusters.
Subsequent documents are then placed into one of these clusters, or a new one
if they are an outlier. This algorithm is described in more details in the
aggregation's docs.

At reduce time, a
[hierarchical agglomerative clustering](https://en.wikipedia.org/wiki/Hierarchical_clustering)
algorithm inspired by [this paper](https://arxiv.org/abs/1802.00304)
continually merges the closest buckets from all shards (based on their
centroids) until the target number of buckets is reached.

The final values produced by this aggregation are approximate. Each bucket's
min value is used as its key in the histogram. Furthermore, buckets are merged
based on their centroids and not their bounds. So it is possible that adjacent
buckets will overlap after reduction. Because each bucket's key is its min,
this overlap is not shown in the final histogram. However, when such overlap
occurs, we set the key of the bucket with the larger centroid to the midpoint
between its minimum and the smaller bucket’s maximum:
`min[large] = (min[large] + max[small]) / 2`. This heuristic is expected to
increases the accuracy of the clustering.

Nodes are unable to share centroids during the shard-level clustering phase. In
the future, resolving https://github.com/elastic/elasticsearch/issues/50863
would let us solve this issue.

It doesn’t make sense for this aggregation to support the `min_doc_count`
parameter, since clusters are determined dynamically. The `order` parameter is
not supported here to keep this large PR from becoming too complex.

Co-authored-by: James Dorfman <jamesdorfman@users.noreply.github.com>
2020-06-25 11:40:47 -04:00
Rory Hunter ebe1d9cdbe Update rest-api-spec keyword list
Follow-up to 35aecf4c9aa. Somehow I missed the fact that there's an ILM
API named `retry`, which is a keyword in Ruby. I've removed it from the
keywords list.
2020-06-25 09:55:13 +01:00
Rory Hunter e413de4203 Validate that REST API names do not contain keywords (#58452)
If an API name (or components of a name) overlaps with a reserved word in
the programming language for an ES client, then it's possible that the code
that is generated from the API will not compile. This PR adds validation to
check for such overlaps.
2020-06-25 09:48:54 +01:00
Martijn van Groningen f4fad9c65a
Re-enable data streams yaml tests in bwc mode (#58500)
Backport of #58403 to 7.x branch.
2020-06-24 16:59:51 +02:00
Martijn van Groningen 7dda9934f9
Keep track of timestamp_field mapping as part of a data stream (#58400)
Backporting #58096 to 7.x branch.
Relates to #53100

* use mapping source direcly instead of using mapper service to extract the relevant mapping details
* moved assertion to TimestampField class and added helper method for tests
* Improved logic that inserts timestamp field mapping into an mapping.
If the timestamp field path consisted out of object fields and
if the final mapping did not contain the parent field then an error
occurred, because the prior logic assumed that the object field existed.
2020-06-22 17:46:38 +02:00
Jim Ferenczi 82db0b575c
Allow index filtering in field capabilities API (#57276) (#58299)
This change allows to use an `index_filter` in the
field capabilities API. Indices are filtered from
the response if the provided query rewrites to `match_none`
on every shard:

````
GET metrics-*
{
  "index_filter": {
    "bool": {
      "must": [
        "range": {
          "@timestamp": {
            "gt": "2019"
          }
        }
      }
  }
}
````

The filtering is done on a best-effort basis, it uses the can match phase
to rewrite queries to `match_none` instead of fully executing the request.
The first shard that can match the filter is used to create the field
capabilities response for the entire index.

Closes #56195
2020-06-18 10:23:26 +02:00
Rory Hunter e065d6cc91 Rename dangling index APIs (#58266)
The dangling_indices.import API name could cause issues in the client
libs because import is a reserved word in many languages. Rename the
API to avoid this, and rename the other APIs for consistency.

Related to #48366.
2020-06-18 08:58:32 +01:00
Nik Everett ab2c6d9696
Save memory when auto_date_histogram is not on top (backport of #57304) (#58190)
This builds an `auto_date_histogram` aggregator that natively aggregates
from many buckets and uses it when the `auto_date_histogram` used to use
`asMultiBucketAggregator` which should save a significant amount of
memory in those cases. In particular, this happens when
`auto_date_histogram` is a sub-aggregator of a multi-bucketing aggregator
like `terms` or `histogram` or `filters`. For the most part we preserve
the original implementation when `auto_date_histogram` only collects from
a single bucket.

It isn't possible to "just port the aggregator" without taking a pretty
significant performance hit because we used to rewrite all of the
buckets every time we switched to a coarser and coarser rounding
configuration. Without some major surgery to how to delay sub-aggs
we'd end up rewriting the delay list zillions of time if there are many
buckets.

The multi-bucket version of the aggregator has a "budget" of "wasted"
buckets and only rewrites all of the buckets when we exceed that budget.
Now that we don't rebucket every time we increase the rounding we can no
longer get an accurate count of the number of buckets! So instead the
aggregator uses an estimate of the number of buckets to trigger switching
to a coarser rounding. This estimate is likely to be *terrible* when
buckets are far apart compared to the rounding. So it also uses the
difference between the first and last bucket to trigger switching to a
coarser rounding. Which covers for the shortcomings of the bucket
estimation technique pretty well. It also causes the aggregator to emit
fewer buckets in cases where they'd be reduced together on the
coordinating node. This is wonderful! But probably fairly rare.

All of that does buy us some speed improvements when the aggregator is
a child of multi-bucket aggregator:
Without metrics or time zone: 25% faster
With metrics: 15% faster
With time zone: 22% faster

Relates to #56487
2020-06-17 08:48:41 -04:00
Rory Hunter 03369e0980
Implement dangling indices API (#58176)
Backport of #50920. Part of #48366. Implement an API for listing,
importing and deleting dangling indices.

Co-authored-by: David Turner <david.turner@elastic.co>
2020-06-16 21:50:38 +01:00
Dan Hermann 911d46370e
Prohibit clone, shrink, and split on a data stream's write index 2020-06-16 10:53:20 -05:00
Dan Hermann 17f3318732
[7.x] Resolve index API (#58037) 2020-06-12 15:41:32 -05:00
Nik Everett 5056f2792d
Skip max_buckets test when it is flaky (#58038)
Before #57042 the max_buckets test would consistently pass because the
request would consistently fail. In particular, the request would fail on
the data node. After #57042 it only fails on the coordinating node. When
the max_buckets test is run in a mixed version cluster it consistently
fails on *either* the data node or the coordinating node. Except when
the coordinating node is missing #43095. In that case if the one data
node has #57042 and one does not, *and* the one that doesn't gets the
request first, fails it as expected, and then the coordinating node
retries the request on the node with #57042. When that happens the
request fails mysteriously with "partial shard failures" as the error
message but not partial failures reported. This is *exactly* the bug
fixed in #43095.

This updates the test to be skipped in mixed version clusters without
 #43095 because they *sometimes* fail the test spuriously. The request
fails in those cases, just like we expect, but with a mysterious error
message.

Closes #57657
2020-06-12 15:06:56 -04:00
Mayya Sharipova 8bd0147ba7
Correct how meta-field is defined for pre 7.8 hits (#57951)
We keep a static list of meta-fields: META_FIELDS_BEFORE_7_8
as it was before.
This is done to ensure the backwards compatability with pre 7.8 nodes.

Closes #57831
2020-06-12 09:39:53 -04:00
Martijn van Groningen 01d8bb8cfa
Enforce valid field mapping exists for timestamp_field in templates. (#58036)
Backport of #57741 to 7.x branch.

Relates to #53100
2020-06-12 15:24:42 +02:00
Martijn van Groningen f4199f2ee0
Prohibit append-only writes targeting backing indices directly. (#58025)
Backport of #57788 to 7.x branch.

Append-only writes can only target the corresponding data stream.

Relates to #53100
2020-06-12 13:17:55 +02:00
Russ Cam f51f9b19c7 Mark Component and Index template APIs as experimental (#57910)
This commit marks the Component Template and
Index Template APIs as experimental.

(cherry picked from commit a85f2bede8eb632e3837ac7630f8dfdf46da6b52)
2020-06-10 14:07:09 +10:00
Dan Hermann b501b282f8
Change default backing index naming scheme 2020-06-09 09:31:34 -05:00
Lee Hinman 6e8cf0973f
[7.x] Disallow merging existing mapping field definitions in templates (#57701) (#57822)
Backports the following commits to 7.x:

    Disallow merging existing mapping field definitions in templates (#57701)
2020-06-08 12:56:09 -06:00
Benjamin Trent 16fcb64c99
Test mute (#57832) 2020-06-08 14:03:16 -04:00
Nik Everett ee0ce8ffaf
Fix a bug with missing fields in sig_terms (#57757)
When you run a `significant_terms` aggregation on a field and it *is*
mapped but there aren't any values for it then the count of the
documents that match the query on that shard still have to be added to
the overall doc count. I broke that in #57361. This fixes that.

Closes #57402
2020-06-08 10:07:14 -04:00
Dan Hermann 3fe93e24a6
[7.x] Prohibit closing the write index for a data stream (#57740) 2020-06-05 11:14:43 -05:00
Nik Everett 94b3eed6be Re-mute test
Tracked in #57402
2020-06-05 10:52:24 -04:00
Nik Everett de27253d87 Drop skip on test after backporting fix
Fixed in 98c379c507.

Closes #57402
2020-06-04 16:04:18 -04:00
Nik Everett 98c379c507
Merge remaining sig_terms into terms (#57397) (#57687)
Merges the remaining implementation of `significant_terms` into `terms`
so that we can more easilly make them work properly without
`asMultiBucketAggregator` which *should* save memory and speed them up.

Relates #56487
2020-06-04 14:32:32 -04:00
Nik Everett 97c06816a4
Fix an optimization in terms agg (backport #57438) (#57547)
When the `terms` agg runs against strings and uses global ordinals it
has an optimization when it collects segments that only ever have a
single value for the particular string. This is *very* common. But I
broke it in #57241. This fixes that optimization and adds `debug`
information that you can use to see how often we collect segments of
each type. And adds a test to make sure that I don't break the
optimization again.

We also had a specialiation for when there isn't a filter on the terms
to aggregate. I had removed that specialization in #57241 which resulted
in some slow down as well. This adds it back but in a more clear way.
And, hopefully, a way that is marginally faster when there *is* a
filter.

Closes #57407
2020-06-02 14:57:45 -04:00
David Kyle 4d54bb3917
Correct expected warning in indices.create yml tests (#57409)
v2 index is now composable, v1 is now legacy
2020-06-01 19:22:30 +01:00
David Kyle 82f27ef128
Mute msearch yml test (#57406)
For #57402
2020-06-01 13:02:46 +01:00
Nik Everett 4263c25b2f
Save memory when histogram agg is not on top (backport of #57277) (#57377)
This saves some memory when the `histogram` aggregation is not a top
level aggregation by dropping `asMultiBucketAggregator` in favor of
natively implementing multi-bucket storage in the aggregator. For the
most part this just uses the `LongKeyedBucketOrds` that we built the
first time we did this.
2020-05-29 15:07:37 -04:00
Martijn van Groningen d8928b3f48
fixed allowed warnings in yaml test 2020-05-29 13:37:49 +02:00
Martijn van Groningen 04ef39da77
Change cluster info actions to be able to resolve data streams. (#57343)
Backport of #56878 to 7.x branch.

With this change the following APIs will be able to resolve data streams:
get index, get mappings and ilm explain APIs.

Relates to #53100
2020-05-29 12:17:53 +02:00
Russ Cam 2a9073d4c1 Deprecate local param in get_mapping.json (#57265)
Relates: elastic/elasticsearch#55014

This commit deprecates the local param in get_mapping.json.
This parameter is a no-op and field mappings are always retrieved locally.

(cherry picked from commit 0b041cccd894f01d723fb2979f70c1cf279700a6)
2020-05-29 12:25:41 +10:00
Nik Everett b9fe10866e
Make global ords terms simpler to understand (backport of #57241) (#57311)
When the `terms` enum operates on non-numeric data it can collect it via
global ordinals. It actually has two separate collection strategies for,
one "dense" and one "remapping". Each of *those* strategies has two
"iteration" strategies that it uses to build buckets, depending on
whether or not we need buckets with `0` docs in them. Previously this
was done with several `null` checks and never really explained. This
change replaces those checks with two `CollectionStrategy` classes which
have good stuff like documentation.
2020-05-28 16:52:35 -04:00
Martijn van Groningen 225ccd1cfa
Ensure template exists when creating data stream (#57275)
Backporting #56888 to 7.x branch.

Limit the creation of data streams only for namespaces that have a composable template with a data stream definition.

This way we ensure that mappings/settings have been specified and will be used at data stream creation and data stream rollover.

Also remove `timestamp_field` parameter from create data stream request and
let the create data stream api resolve the timestamp field
from the data stream definition snippet inside a composable template.

Relates to #53100
2020-05-28 15:08:25 +02:00
Dan Hermann 2738998ebb
Limit _cat/indices test to versions with fix (#57244) (#57256) 2020-05-27 16:57:24 -05:00
Lee Hinman c0f732b9f6
[7.x] Rename template V2 classes to ComposableTemplate (#57183) (#57232)
Backports the following commits to 7.x:

    Rename template V2 classes to ComposableTemplate (#57183)
2020-05-27 11:01:59 -06:00
Nik Everett 4d5be7c817
Save memory on numeric sig terms when not top (backport of #56789) (#57221)
This saves memory when running numeric significant terms which are not
at the top level by merging its collection into numeric terms and relying
on the optimization that we made in #55873.
2020-05-27 12:03:28 -04:00
Jake Landis 920677af6f
7.x only REST specification fixes (#56736)
Fixes for the REST specification specific to 7.x

* remove ignore "cat.thread_pool.json" and add the "" as valid option. #55984 deprecated this field since it these params here have no effect on this specific API
* remove ignore "indices.put_mapping.json" by adding the required / in the path to pass validation.
2020-05-26 12:33:57 -05:00
Dan Hermann c5f61fe24c
Handle exceptions when building _cat/indices response 2020-05-25 09:59:24 -05:00
James Rodewig b3426dd558
[DOCS] Add delete snapshot repo API docs (#57043)
Changes:

* Adds API reference docs for the delete snapshot repo API.

* Corrects an error in the delete snapshot repo API spec. Comma-separated
repository names are not supported.

* Relocates the existing delete snapshot repo API example docs.
2020-05-21 14:47:07 -04:00
Nik Everett 8b9c4eb3e0
Save memory when date_histogram is not on top (#56921) (#56960)
When `date_histogram` is a sub-aggregator it used to allocate a bunch of
objects for every one of it's parent's buckets. This uses the data
structures that we built in #55873 rework the `date_histogram`
aggregator instead of all of the allocation.

Part of #56487
2020-05-19 17:36:55 -04:00
Ioannis Kakavas 0eb81870de
Adjust version mute for reload secure settings (#56938) (#56951)
We can safely run the reload_secure_settings tests
after 7.7.0 , the relevant changes have long been
backported there
2020-05-20 00:24:29 +03:00
Ioannis Kakavas 38e55cd348
Adjust reload keystore test to pass in FIPS (#56889) (#56940)
In KeystoreWrapper class we determine if the error to decrypt a
given keystore is caused by a wrong password based on the exception
that the SunJCE implementation of AES is
throwing(AEADBadTagException). Other implementations from other
Security Providers fail with a different exception and as such we
cannot differentiate between a corrupted file and a wrong password
in a foolproof way.
As in other tests such as in
KeyStoreWrapperTests#testDecryptKeyStoreWithWrongPassword
we handle this by matching both possible exception messages.
2020-05-19 18:11:43 +03:00
James Rodewig ecf6d8f974
[DOCS] Fix component template API link in JSON specs (#56884) (#56945)
Co-authored-by: Tomas Della Vedova <delvedor@users.noreply.github.com>
2020-05-19 11:00:15 -04:00
Lee Hinman e208925465
[7.x] Add template simulation API for simulating template composition (#56842) (#56924) 2020-05-19 08:12:21 -06:00
Dan Hermann 66871c5342
[7.x] Rename endpoint from plural "_data_streams" to singular "_data_stream" (#56825) 2020-05-15 10:27:53 -05:00
Ryan Ernst 9fb80d3827
Move publishing configuration to a separate plugin (#56727)
This is another part of the breakup of the massive BuildPlugin. This PR
moves the code for configuring publications to a separate plugin. Most
of the time these publications are jar files, but this also supports the
zip publication we have for integ tests.
2020-05-14 20:23:07 -07:00
Lee Hinman a73d7d9e2b
[7.x] Don't allow invalid template combinations (#56397) (#56795)
Backports the following commits to 7.x:

- Don't allow invalid template combinations (#56397)
2020-05-14 16:20:53 -06:00
Nik Everett 126619ae3c
Add list of defered aggregations to the profiler (backport of #56208) (#56682)
This adds a few things to the `breakdown` of the profiler:
* `histogram` aggregations now contain `total_buckets` which is the
  count of buckets that they collected. This could be useful when
  debugging a histogram inside of another bucketing agg that is fairly
  selective.
* All bucketing aggs that can delay their sub-aggregations will now add
  a list of delayed sub-aggregations. This is useful because we
  sometimes have fairly involved logic around which sub-aggregations get
  delayed and this will save you from having to guess.
* Aggregtations wrapped in the `MultiBucketAggregatorWrapper` can't
  accurately add anything to the breakdown. Instead they the wrapper
  adds a marker entry `"multi_bucket_aggregator_wrapper": true` so we
  can be quickly pick out such aggregations when debugging.

It also fixes a bug where `_count` breakdown entries were contributing
to the overall `time_in_nanos`. They didn't add a large amount of time
so it is unlikely that this caused a big problem, but I was there.

To support the arbitrary breakdown data this reworks the profiler so
that the `breakdown` can contain any data that is supported by
`StreamOutput#writeGenericValue(Object)` and
`XContentBuilder#value(Object)`.
2020-05-13 16:33:22 -04:00
Martijn van Groningen d3dace903b
Fix allowed warning in data stream rest test. (#56630) (#56634) 2020-05-13 09:44:19 +02:00
Jake Landis 9c76ee47c4
[7.x] json spec: allow null for documentation url (#55749) (#56625)
This commit allows the JSON schema's documentation.url property to have a null value.
This can useful for cases where a feature is under development, and does not have
documentation published yet.

This commit also adds a documentation.url for two ml resources.
2020-05-12 14:49:02 -05:00
Martijn van Groningen 0c61bc63e4
Backport: auto create data streams using index templates v2 (#56596)
Backport: #55377

This commit adds the ability to auto create data streams using index templates v2.
Index templates (v2) now have a data_steam field that includes a timestamp field,
if provided and index name matches with that template then a data stream
(plus first backing index) is auto created.

Relates to #53100
2020-05-12 17:01:15 +02:00
James Rodewig 8c457c884a
[DOCS] Add clean up snapshot repository API docs (#56519) 2020-05-12 09:54:49 -04:00
Lee Hinman 1337b35572
Remove prefer_v2_templates query string parameter (#56545)
This commit removes the `prefer_v2_templates` flag and setting. This was a brief setting that
allowed specifying whether V1 or V2 template should be used when an index is created. It has been
removed in favor of V2 templates always having priority.

Relates to #53101
Resolves #56528

This is not a breaking change because this flag was never in a released version.
2020-05-11 14:56:42 -06:00
Nik Everett b5e385fa56
Fix auto_date_histogram interval (#56252) (#56341)
`auto_date_histogram` was returning the incorrect `interval` because
of a combination of two things:
1. When pipeline aggregations rewrote `auto_date_histogram` we reset the
   interval to 1. Oops. Fixed that.
2. *Every* bucket aggregation was rewriting its buckets as though there
   was a pipeline aggregation even if there aren't any. This is a bit
   silly so we skip that too.

Closes #56116
2020-05-07 10:27:40 -04:00
Dan Hermann 6674f14fb3
[7.x] Get index includes parent data stream for backing indices (#56238) 2020-05-05 15:43:42 -05:00
Andrei Dan f569405fde
Enable simulate API tests in 7.8 (#55946)
As #55686 was backported the simulate index template api is no available
in 7.8.
2020-05-05 11:28:00 +01:00
David Roberts 31e32aa420
[TEST] Allow more warnings about multiple template matches (#56085)
Adds some extra allowed warnings about multiple index templates
matching on index creation of the same type that were added
in #56038.
2020-05-03 21:07:51 +01:00
Jake Landis 1e65ead01f
[7.x] deprecrate size from cat.thread_pool in json spec (#55984) (#56050) 2020-04-30 13:10:30 -05:00
Andrei Dan c5b04311e0
Conditionally run tests asserting overlapping templates (#56028) (#56040)
Only run the tests verifyin the overlapping index templates when there is
no `global` index template (ie. when the default shards are not changed)

(cherry picked from commit e256becad7650018ed6687d6f4ddba5e255f6b29)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-04-30 17:02:08 +01:00
Andrei Dan 68985bc1ca
Add HLRC support for simulate index template api (#55936) (#56029)
(cherry picked from commit 475790c34e0bab95d352132d6be63c4f5b219fb1)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-04-30 15:40:48 +01:00
Andrei Dan 83828af7ef
Update template v2 api rest spec (#55948) (#56008)
This removed the specification of `order` as it is not a parameter of the
v2 put template api (the priority is the equivalent of `order` and is
defined in the body) and add a bit of description for the `cause` parameter
(which is currently used as a cluster update task tracking)

(cherry picked from commit e3e9782b2059e28bc4a08be2232c1e5baecad3d6)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-04-30 14:23:59 +01:00
Dan Hermann 9bf254fe36
REST test for rolling data streams 2020-04-29 17:34:52 -05:00
Dan Hermann bf89e485fc
[7.x] Delete index API properly handles backing indices for data streams (#55971) 2020-04-29 16:32:59 -05:00
Andrei Dan 6b886b0b7a
[7.x] Add simulate template composition API _index_template/_simulate_index/{name} (#55686) (#55922)
This adds a new api to simulate matching the given index name against the
 index templates in the system.

The syntax for the new API takes the following form:

POST _index_template/_simulate_index/{index_name}
{
  "index_patterns": ["logs-*"],
  "priority": 15,
  "template": {
	"settings": {
		"number_of_shards": 3
	}
       ...
   }
}

Where the body is optional, but we support the entire body used by the
PUT _index_template/{name} api. When the body is specified we'll simulate
matching the given index against a system that'd have the given index
template together with the index templates that exist in the system.

The response, in both cases, will return the matching template's resolved
settings, mappings and aliases, together with a special field that'll print any
overlapping templates and their corresponding index patterns.

(cherry picked from commit 1a5845edce1f445c58e094e9a3b6792e21e543b0)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-04-29 14:57:44 +01:00
David Turner 5ca511622f
Add API specs for voting config exclusions (#55919)
Closes #48131
Backport of #55760

Co-authored-by: zacharymorn <zacharymorn@gmail.com>
2020-04-29 14:00:36 +01:00
Lee Hinman 1c73fcfc86
Mark ITv2 APIs as experimental (#55874)
This commit marks the V2 index and component template APIs experimental, with intent to mark them as
"stable" in 7.9.0.

Relates to #53101
2020-04-28 11:27:34 -06:00
Lee Hinman 777caf0725
[7.x] Add support for V2 index templates to /_cat/templates (#55829) (#55866)
Backports the following commits to 7.x:
 - Add support for V2 index templates to /_cat/templates (#55829)
2020-04-28 10:14:19 -06:00
Zachary Tong 715c90bf7d Aggs must specify a `field` or `script` (or both) (#52226)
This adds a validation to VSParserHelper to ensure that a field or
script or both are specified by the user.  This is technically
required today already, but throws an exception much deeper
in the agg framework and has a very unintuitive error for the user
(as well as eating more resources instead of failing early)
2020-04-23 19:23:41 -04:00
Jake Landis 25ea6a74f0
[7.x] Validate REST specs against schema (#55117) (#55563)
A JSON schema was recently introduced for the REST API specification. #54252
This PR introduces a 3rd party validation tool to ensure that the
REST specification conforms to the schema.

The task is applied to the 3 projects that contain REST API specifications.
The plugin wires this task into the precommit commit task, and should be
considered as part of the public API for the build tools for any plugin
developer to contribute their plugin's specification.

An ignore parameter has been introduced for the task to allow specific
file to be ignored from the validation. The ignored files in this PR
will soon get issues logged and a link so they can be fixed.

Closes #54314
2020-04-22 14:14:03 -05:00
Fernando Briano 71672ea33d
Add skip arbitrary_key to nodes.reload_secure_settings YAML test (#55540) 2020-04-22 09:48:33 +01:00
Lee Hinman 9eddd2bcc9
[7.x] Add prefer_v2_templates flag and index setting (#55411) (#55476)
This commit adds a new querystring parameter on the following APIs:
- Index
- Update
- Bulk
- Create Index
- Rollover

These APIs now support a `?prefer_v2_templates=true|false` flag. This flag changes the preference
creation to use either V2 index templates or V1 templates. This flag defaults to `false` and will be
changed to `true` for 8.0+ in subsequent work.

Additionally, setting this flag internally sets the `index.prefer_v2_templates` index-level setting.
This setting is used so that actions that automatically create a new index (things like rollover
initiated by ILM) will inherit the preference from the original index. This setting is dynamic so
that a transition from v1 to v2 templates can occur for long-running indices grouped by an alias
performing periodic rollover.

This also adds support for sending this parameter to the High Level Rest Client.

Relates to #53101
2020-04-20 12:05:42 -06:00
Dan Hermann dc703d75f5
Add explicit generation attribute to data streams 2020-04-20 07:40:33 -05:00
Martijn van Groningen 417d5f2009
Make data streams in APIs resolvable. (#55337)
Backport from: #54726

The INCLUDE_DATA_STREAMS indices option controls whether data streams can be resolved in an api for both concrete names and wildcard expressions. If data streams cannot be resolved then a 400 error is returned indicating that data streams cannot be used.

In this pr, the INCLUDE_DATA_STREAMS indices option is enabled in the following APIs: search, msearch, refresh, index (op_type create only) and bulk (index requests with op type create only). In a subsequent later change, we will determine which other APIs need to be able to resolve data streams and enable the INCLUDE_DATA_STREAMS indices option for these APIs.

Whether an api resolve all backing indices of a data stream or the latest index of a data stream (write index) depends on the IndexNameExpressionResolver.Context.isResolveToWriteIndex().
If isResolveToWriteIndex() returns true then data streams resolve to the latest index (for example: index api) and otherwise a data stream resolves to all backing indices of a data stream (for example: search api).

Relates to #53100
2020-04-17 08:33:37 +02:00
Mark Tozzi 22c55180c1
[7.x] Backport ValuesSourceRegistry and related work (#54922)
* Add ValuesSource Registry and associated logic (#54281)

* Remove ValuesSourceType argument to ValuesSourceAggregationBuilder (#48638)

* ValuesSourceRegistry Prototype (#48758)

* Remove generics from ValuesSource related classes (#49606)

* fix percentile aggregation tests (#50712)

* Basic thread safety for ValuesSourceRegistry (#50340)

* Remove target value type from ValuesSourceAggregationBuilder (#49943)

* Cleanup default values source type (#50992)

* CoreValuesSourceType no longer implements Writable (#51276)

* Remove genereics & hard coded ValuesSource references from Matrix Stats (#51131)

* Put values source types on fields (#51503)

* Remove VST Any (#51539)

* Rewire terms agg to use new VS registry (#51182)

Also adds some basic AggTestCases for untested code
paths (and boilerplate for future tests once the IT are
converted over)

* Wire Cardinality aggregation to work with the ValuesSourceRegistry (#51337)

* Wire Percentiles aggregator into new VS framework (#51639)

This required a bit of a refactor to percentiles itself.  Before,
the Builder would switch on the chosen algo to generate an
algo-specific factory.  This doesn't work (or at least, would be
difficult) in the new VS framework.

This refactor consolidates both factories together and introduces
a PercentilesConfig object to act as a standardized way to pass
algo-specific parameters through the factory.  This object
is then used when deciding which kind of aggregator to create

Note: CoreValuesSourceType.HISTOGRAM still lives in core, and will
be moved in a subsequent PR.

* Remove generics and target value type from MultiVSAB (#51647)

* fix checkstyle after merge (#52008)

* Plumb ValuesSourceRegistry through to QuerySearchContext (#51710)

* Convert RareTerms to new VS registry (#52166)

* Wire up Value Count (#52225)

* Wire up Max & Min aggregations (#52219)

* ValuesSource refactoring: Wire up Sum aggregation (#52571)

* ValuesSource refactoring: Wire up SigTerms aggregation (#52590)

* Soft immutability for VSConfig (#52729)

* Unmute testSupportedFieldTypes, fix Percentiles/Ranks/Terms tests (#52734)

Also fixes Percentiles which was incorrectly specified to only accept
numeric, but in fact also accepts Boolean and Date (because those are
numeric on master - thanks `testSupportedFieldTypes` for catching it!)

* VS refactoring: Wire up stats aggregation (#52891)

* ValuesSource refactoring: Wire up string_stats aggregation (#52875)

* VS refactoring: Wire up median (MAD) aggregation (#52945)

* fix valuesourcetype issue with constant_keyword field (#53041)x-pack/plugin/rollup/src/main/java/org/elasticsearch/xpack/rollup/job/RollupIndexer.java

this commit implements `getValuesSourceType` for
the ConstantKeyword field type.

master was merged into feature/extensible-values-source
introducing a new field type that was not implementing
`getValuesSourceType`.

* ValuesSource refactoring: Wire up Avg aggregation (#52752)

* Wire PercentileRanks aggregator into new VS framework  (#51693)

* Add a VSConfig resolver for aggregations not using the registry (#53038)

* Vs refactor wire up ranges and date ranges (#52918)

* Wire up geo_bounds aggregation to ValuesSourceRegistry (#53034)

This commit updates the geo_bounds aggregation to depend
on registering itself in the ValuesSourceRegistry

relates #42949.

* VS refactoring: convert Boxplot to new registry (#53132)

* Wire-up geotile_grid and geohash_grid to ValuesSourceRegistry (#53037)

This commit updates the geo*_grid aggregations to depend
on registering itself in the ValuesSourceRegistry

relates to the values-source refactoring meta issue #42949.

* Wire-up geo_centroid agg to ValuesSourceRegistry (#53040)

This commit updates the geo_centroid aggregation to depend
on registering itself in the ValuesSourceRegistry.

relates to the values-source refactoring meta issue #42949.

* Fix type tests for Missing aggregation (#53501)

* ValuesSource Refactor: move histo VSType into XPack module (#53298)

- Introduces a new API (`getBareAggregatorRegistrar()`) which allows plugins to register aggregations against existing agg definitions defined in Core.
- This moves the histogram VSType over to XPack where it belongs. `getHistogramValues()` still remains as a Core concept
- Moves the histo-specific bits over to xpack (e.g. the actual aggregator logic). This requires extra boilerplate since we need to create a new "Analytics" Percentile/Rank aggregators to deal with the histo field. Doubly-so since percentiles/ranks are extra boiler-plate'y... should be much lighter for other aggs

* Wire up DateHistogram to the ValuesSourceRegistry (#53484)

* Vs refactor parser cleanup (#53198)

Co-authored-by: Zachary Tong <polyfractal@elastic.co>
Co-authored-by: Zachary Tong <zach@elastic.co>
Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com>
Co-authored-by: Tal Levy <JubBoy333@gmail.com>

* First batch of easy fixes

* Remove List.of from ValuesSourceRegistry

Note that we intend to have a follow up PR dealing with the mutability
of the registry, so I didn't even try to address that here.

* More compiler fixes

* More compiler fixes

* More compiler fixes

* Precommit is happy and so am I

* Add new Core VSTs to tests

* Disabled supported type test on SigTerms until we can backport it's fix

* fix checkstyle

* Fix test failure from semantic merge issue

* Fix some metaData->metadata replacements that got lost

* Fix list of supported types for MinAggregator

* Fix list of supported types for Avg

* remove unused import

Co-authored-by: Zachary Tong <polyfractal@elastic.co>
Co-authored-by: Zachary Tong <zach@elastic.co>
Co-authored-by: Christos Soulios <1561376+csoulios@users.noreply.github.com>
Co-authored-by: Tal Levy <JubBoy333@gmail.com>
2020-04-16 16:54:46 -04:00
Christoph Büscher 4d849f0948
Fix creating filtered alias using now in a date_nanos range query failed (#54785) (#55329)
Modify the value of nowInMillis in queryShardContext to current timestamp, because the
value will be used lately when validating the filtered alias which uses now in a date_nanos
range query.
2020-04-16 19:47:53 +02:00
Tomas Della Vedova 9872deace7
Yaml test: Fixed bad indentation (#55170) (#55206) 2020-04-15 11:13:52 +02:00
Mark Vieira ce85063653
[7.x] Re-add origin url information to publish POM files (#55173) 2020-04-14 13:24:15 -07:00
Nhat Nguyen 96bb1164f0 Support hierarchical task cancellation (#54757)
With this change, when a task is canceled, the task manager will cancel
not only its direct child tasks but all also its descendant tasks.

Closes #50990
2020-04-13 12:35:21 -04:00
Ioannis Kakavas 7a8a66d9ae
[7.x] Fix ReloadSecureSettings API to consume password (#54771) (#55059)
The secure_settings_password was never taken into consideration in
the ReloadSecureSettings API. This commit fixes that and adds
necessary REST layer testing. Doing so, it also:

- Allows TestClusters to have a password protected keystore
so that it can be set for tests.
- Adds a parameter to the run task so that elastisearch can
be run with a password protected keystore from source.
2020-04-13 09:50:55 +03:00
Yang Wang 862799956c
Deprecate local parameter for get field mapping request (#55014) (#55099)
The usage of local parameter for GetFieldMappingRequest has been removed from the underlying transport action since v2.0.

This PR deprecates the parameter from rest layer. It will be removed in next major version.
2020-04-12 13:48:47 +10:00
Przemko Robakowski afa3467957
[7.x] HLRC support for Index Templates V2 (#54838) (#54932)
* HLRC support for Index Templates V2 (#54838)

* HLRC support for Index Templates V2

This change adds High Level Rest Client support for Index Templates V2.

Relates to #53101

* fixed compilation error

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-04-09 07:43:13 +02:00
Lee Hinman 1f17df13c1 Bump minimum version for component template CRUD test (#54992)
These tests do CRUD for component templates, however, for 7.7 some changes weren't backported in the
`_doc` wrapping/unwrapping done for the APIs, this can cause test failures.

This bumps the minimum version for these tests to 7.8, which is okay because component templates are
hidden behind a flag and have no compatibility guarantees for 7.7.

Relates to #53101
2020-04-08 16:39:46 -06:00
Dan Hermann c7f9a27d2d
Delete backing indices with data stream (#54693) (#54976) 2020-04-08 15:18:12 -05:00
Lee Hinman c2c0707174
[7.x] Add allowed warnings to index template composition tests… (#54961)
We occasionally add a global template for our YAML tests, and this can cause warnings for these
template tests. This commit adds these warnings so they don't cause test failures.

Resolves #54822

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-04-08 10:17:32 -06:00
Tal Levy 254d1e3543
[7.x] Create new `geo` module and migrate geo_shape registration (#53562) (#54924)
This commit introduces a new `geo` module that is intended
to be contain all the geo-spatial-specific features in server.

As a first step, the responsibility of registering the geo_shape
field mapper is moved to this module.

Co-authored-by: Nicholas Knize <nknize@gmail.com>
2020-04-07 16:30:58 -07:00
Nik Everett 915092dc28
More pipeline aggregation cleanup (backport of #54298) (#54890)
This replaces the last bit of validation that pipeline aggregations
performed on the data nodes with explicit checks in a few
`PipelineAggregationBuilders`. We were *already* catching these
validation errors for pipeline aggregations that require that their
parent be squentially ordered. This just adds validation for pipelines
that require *any* parent like `bucket_selector` and `bucket_sort`.
2020-04-07 10:40:34 -04:00
Mayya Sharipova 0013dd4528
Add checks for field collapse test failure (#54831)
There were some failures on 7.x of field collapse tests,
where total hits count was less then expected.
This adds an additional test to check total hits count
before field collapse queries to understand if the problem
is with field collapsing or with simply that writes have
not been finished yet

Relates to #52416
2020-04-06 17:14:17 -04:00
Przemko Robakowski 7b1bb9952a
[7.x] HLRC support for Component Templates APIs (#54635) (#54828)
* HLRC support for Component Templates APIs (#54635)
2020-04-06 20:24:23 +02:00
Nhat Nguyen 2fdbed7797 Broadcast cancellation to only nodes have outstanding child tasks (#54312)
Today when canceling a task we broadcast ban/unban requests to all nodes
in the cluster. This strategy does not scale well for hierarchical
cancellation. With this change, we will track outstanding child requests
and broadcast the cancellation to only nodes that have outstanding child
tasks. This change also prevents a parent task from sending child
requests once it got canceled.

Relates #50990
Supersedes #51157

Co-authored-by: Igor Motov <igor@motovs.org>
Co-authored-by: Yannick Welsch <yannick@welsch.lu>
2020-04-06 11:11:29 -04:00
Lee Hinman 814c248819
[7.x] Use V2 index templates during index creation (#54669) (#54750)
* Use V2 index templates during index creation

This commit changes our index creation code to use (and favor!) V2 index templates during index
creation. The creation precedence goes like so, in order of precedence:

- Existing source `IndexMetadata` - for example, when recovering from a peer or a shrink/split/clone
  where index templates should not be applied
- A matching V2 index template, if one is found
  - When a V2 template is found, all component templates (in the `composed_of` field) are applied
    in the order that they appear, with the index template having the 2nd highest precedence (the
    create index request always has the top priority when it comes to index settings)
- All matching V1 templates (the old style)

This also adds index template validation when `PUT`-ing a new v2 index template (because this was
required) and ensures that all index and component templates specify *no* top-level mapping type (it
is automatically added when the template is added to the cluster state).

This does not yet implement fine-grained component template merging of mappings, where we favor
merging only a single field's configuration, that will be done in subsequent work.

This also keeps the existing hidden index behavior present for v1 templates, where a hidden index
will match v2 index templates unless they are global (`*`) templates.

Relates to #53101
2020-04-03 14:46:15 -06:00
Dan Hermann 18fef3de2a
Get data stream accepts single search parameter 2020-04-03 10:36:26 -05:00
Dan Hermann 39c4ec6821
[7.x] Create first backing index when creating data stream 2020-04-02 17:19:35 -05:00
Dan Hermann 547ab849b4
Test to enforce response to invalid data stream names (#54663) 2020-04-02 15:04:03 -05:00
Zachary Tong 20d67720aa
Refactor Percentiles/Ranks aggregation builders and factories (#51887) (#54537)
- Consolidates HDR/TDigest factories into a single factory
- Consolidates most HDR/TDigest builder into an abstract builder
- Deprecates method(), compression(), numSigFig() in favor of a new
unified PercentileConfig object
- Disallows setting algo options that don't apply to current algo

The unified config method carries both the method and algo-specific
setting. This provides a mechanism to reject settings that apply
to the wrong algorithm.  For BWC the old methods are retained
but marked as deprecated, and can be removed in future versions.

Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>

Co-authored-by: Mark Tozzi <mark.tozzi@gmail.com>
2020-04-02 10:39:41 -04:00
Russ Cam 2978024375 Update rest API specs (#54252)
This commit updates the rest API specs to validate against a
JSON schema for the specifications. Most updates are to add
a description, whilst others fix typos and unify conventions
e.g. deprecations, descriptions, urls starting with /. The schema
conforms to draft-07 JSON schema.

(cherry picked from commit da37e01d32f9764c3937736ef0c7d3ab40af9a77)
2020-04-02 10:53:32 +10:00
Andy Bristol 62a52465fc
aggregator and yaml tests for missing agg (#53214)
Tests for unmapped fields, the missing parameter, scripting, and correct
ValuesSource types in MissingAggregatorTests. Basic yaml tests for the
missing agg

For #42949
2020-04-01 15:23:08 -07:00
Lee Hinman 0f312c8b38
Add allowed warnings to index template v2 YAML tests (#54535)
There is a setting in `ESClientYamlSuiteTestCase` under `usually()` that can install a `global`
template changing the number of shards for all indices. This can cause warnings when installing v2
templates (see #54367). This adds these as optional warnings so they don't cause failures regardless
of whether the global template is installed or not.

These warnings can be removed when our internal template usage has been moved to index templates v2

Relates to #53101
2020-03-31 15:34:19 -06:00
Nik Everett 56047f74be
Fix auto_date_histogram serialization bug (#54447)
This fixes a serialization bug in `auto_date_histogram` that comes up in
a cluster mixed between pre-7.3.0 and post-7.3.0.

Includes #54429 to keep 7.x looking like master for simpler backports.

Closes #54382
2020-03-30 13:49:38 -04:00
Nik Everett e58ad9fed3
Clean up how pipeline aggs check for multi-bucket (backport of #54161) (#54379)
Pipeline aggregations like `stats_bucket`, `sum_bucket`, and
`percentiles_bucket` only operate on buckets that have multiple buckets.
This adds support for those aggregations to `geo_distance`, `ip_range`,
`auto_date_histogram`, and `rare_terms`.

This all happened because we used a marker interface to mark compatible
aggs, `MultiBucketAggregationBuilder` and it was fairly easy to forget
to implement the interface.

This replaces the marker interface with an abstract method in
`AggregationBuilder`, `bucketCardinality` which makes you return `NONE`,
`ONE`, or `MANY`. The `bucket` aggregations can check for `MANY`. At
this point `ONE` and `NONE` amount to about the same thing, but I
suspect that'll be a useful distinction when validating bucket sorts.

Closes #53215
2020-03-30 10:44:55 -04:00
Jason Tedor 686f4f0158
Adjust BWC version on node roles being sorted
Node roles are sorted now as of 7.8.0. This commit adjusts the BWC
version for tests.
2020-03-28 15:30:42 -04:00
Jason Tedor 37b59a357f
Ensure that the output of node roles are sorted (#54376)
This commit ensures that node roles are sorted by node role name, which
makes the output easier to consume, and also makes it easier to rely on
the behavior of the output in assertions.
2020-03-28 12:51:21 -04:00
Lee Hinman f2cc2b1127
[7.x] Add REST APIs for IndexTemplateV2Metadata CRUD (#54039) (#54347)
* Add REST APIs for IndexTemplateV2Metadata CRUD (#54039)

* Add REST APIs for IndexTemplateV2Metadata CRUD

This commit adds the get/put/delete APIs for interacting with the now v2 versions of index
templates.

These APIs are behind the existing `es.itv2_feature_flag_registered` system property feature flag.

Relates to #53101

* Add exceptions for HLRC tests

* Add skips for 7.x versions

* Use index_template instead of template_v2 in action names

* Add test for MetaDataIndexTemplateService.addIndexTemplateV2

* Move removal to static method and add test

* Add unit tests for request classes (implement hashCode & equals)

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

* Fix compilation

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-27 10:47:22 -06:00
Martijn Laarman 4c36b5daee Document known features on rest-api-spec tests (#52916)
* Document known features on rest-api-spec tests

Features dictate wheter a `rest-api-spec` test runner can execute a
test. This PR documents all the know features in the java implementation
of the runner.

* Apply suggestions from code review

Co-Authored-By: Luca Cavanna <javanna@users.noreply.github.com>

Co-authored-by: Luca Cavanna <javanna@users.noreply.github.com>
(cherry picked from commit fbe173723d0ad7ebb920cda855bce3fb758b04a6)
2020-03-25 16:29:53 +01:00
Jason Tedor 381d7586e4
Introduce formal role for remote cluster client (#54138)
This commit introduce a formal role for identifying nodes that are
capable of making connections to remote clusters.

Relates #53924
2020-03-24 21:59:43 -04:00
Dan Hermann cb73de2eb7
Unmute data stream YML tests (#54100) 2020-03-24 19:11:13 -05:00
Przemko Robakowski 5594d57727
/_cat/shards support path stats (#53461) (#54119)
* _cat/shards support path stats

* fix some style case

* fix some style case

* fix rest-api-spec cat.shards error

* fix rest-api-spec cat.shards bwc error

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: weizijun <weizijun1989@gmail.com>
Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-24 20:46:05 +01:00
Dan Hermann 30105a5ab5
[7.x] Cluster state and CRUD operations for data streams (#54073) 2020-03-24 07:58:52 -05:00
Jim Ferenczi 9e3f7f4575
Add heuristics to compute pre_filter_shard_size when unspecified (#53873) (#54007)
This commit changes the pre_filter_shard_size default from 128 to unspecified.
This allows to apply heuristics based on the request and the target indices when deciding
whether the can match phase should run or not. When unspecified, this pr runs the can match phase
automatically if one of these conditions is met:
  * The request targets more than 128 shards.
  * The request contains read-only indices.
  * The primary sort of the query targets an indexed field.
Users can opt-out from this behavior by setting the `pre_filter_shard_size` to a static value.

Closes #39835
2020-03-24 02:05:15 +01:00
Martijn van Groningen aef7b89219
Backport: initial data stream commit (#53959)
This commits adds a data stream feature flag, initial definition of a data stream and
the stubs for the data stream create, delete and get APIs. Also simple serialization
tests are added and a rest test to thest the data stream API stubs.

This is a large amount of code and mainly mechanical, but this commit should be
straightforward to review, because there isn't any real logic.

The data stream transport and rest action are behind the data stream feature flag and
are only intialized if the feature flag is enabled. The feature flag is enabled if
elasticsearch is build as snapshot or a release build and the
'es.datastreams_feature_flag_registered' is enabled.

The integ-test-zip sets the feature flag if building a release build, otherwise
rest tests would fail.

Relates to #53100
2020-03-23 12:58:09 +01:00
James Rodewig 1d1e177a62 [DOCS] Note doc links should be live in REST API JSON specs (#53871)
Downstream Elasticsearch clients, such as the Elaticsearch-JS client,
use the documentation links in our REST API JSON specifications to
create their docs.

Using a broken link or linking to yet-to-be-created doc pages can
break the docs build for these clients.

This PR adds a related note to the README for the REST API JSON Specs.
2020-03-23 07:42:03 -04:00
Alan Woodward a3f21f24ea
Emit deprecation warning when TermsLookup contains a type (#53731)
TermsLookup in master no longer accepts a type parameter. We should emit
a deprecate warning in 7.x when a terms lookup requests includes type to prepare
users for its removal.

Relates to #41059
2020-03-20 15:11:31 +00:00
Jake Landis cce60215d8
[7.x] Add Watcher to available rest resources (#53620) (#53764)
Prior to this commit Watcher explicitly copied test between two
projects with a copy task. This commit removes the explicit copy in favor
of adding the Watcher tests to the available restResources that may be
copied between projects.

This is how inter-project dependencies should be modeled. However, only
Watcher is included here since it is (currently) the only project with
inter-project test dependencies.
2020-03-19 12:29:36 -05:00
Lee Hinman 40181eb200
[7.x] Fix feature flag setting for ComponentTemplate APIs (#53… (#53800)
* Fix feature flag setting for ComponentTemplate APIs (#53758)

The feature flag was set for *most* of the builds, but there are a couple where it was missing.

Resolves #53708

* Add skip for older versions of ES
2020-03-19 09:35:07 -06:00
James Rodewig 0e2e06bd7e [DOCS] Remove incorrect parms from put index template API docs (#53750)
Removes the `flat_settings` and `timeout` query parameters from the JSON
spec and asciidoc docs for the put index template API.

These parameters are not supported by the API.
2020-03-18 14:36:27 -04:00
Jim Ferenczi 8e17322b3a
Shortcut query phase using the results of other shards (#51852) (#53659)
This commit, built on top of #51708, allows to modify shard search requests based on informations collected on other shards. It is intended to speed up sorted queries on time-based indices. For queries that are only interested in the top documents.

This change will rewrite the shard queries to match none if the bottom sort value computed in prior shards is better than all values in the shard.
For queries that mix top documents and aggregations this change will reset the size of the top documents to 0 instead of rewriting to match none.
This means that we don't need to keep a search context open for this shard since we know in advance that it doesn't contain any competitive hit.
2020-03-18 17:20:35 +01:00
Lee Hinman 9c0e846db3
[7.x] Add REST API for ComponentTemplate CRUD (#53558) (#53681)
* Add REST API for ComponentTemplate CRUD

This adds the Put/Get/DeleteComponentTemplate APIs that allow inserting, retrieving, and removing
ComponentTemplateMetadata into the cluster state metadata.

These APIs are currently only available behind a feature flag system property -
`es.itv2_feature_flag_registered`.

Relates to #53101

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-03-17 13:23:28 -06:00
Hendrik Muhs a0314ad015 [Transform] add transform discovery node role (#53616)
Enhancement of #52712: Add a discovery node role using the letter t for transform.

Fixes #53156
2020-03-17 11:39:20 +01:00
Nik Everett 9845dbb7d6
Fix sorting agg buckets by doc_count (backport of #53617) (#53627)
I broke sorting aggregations by `doc_count` in #51271 by mixing up true
and false. This flips that comparison and adds a few tests to double
check that we don't so this again.
2020-03-16 17:35:43 -04:00
Gordon Brown 031932b32f
Allow _cat indices & aliases to use indices options (#53248)
This commit adjusts the _cat/indices and _cat/aliases APIs to allow
specifying indices options, so that these APIs can handle hidden
indices/aliases in the same way as other APIs.

Also adds the hidden option to the expand_wildcards parameter
in the YAML spec for every API that accepts it.
2020-03-16 11:25:05 -06:00
Mayya Sharipova a906f8a0e4
Highlighters skip ignored keyword values (#53408) (#53604)
Keyword field values with length more than ignore_above are not
indexed. But highlighters still were retrieving these values
from _source and were trying to highlight them. This sometimes lead to
errors if a field length exceeded  max_analyzed_offset. But also this
is an overall wrong behaviour to attempt to highlight something that was
ignored during indexing.

This PR checks if a keyword value was ignored because of its length,
and if yes, skips highlighting it.

Backport: #53408
Closes #43800
2020-03-16 11:06:25 -04:00
Marios Trivyzas f8fe1d3344
Fix YAML test for search.allow_expensive_queries (#53541) (#53548)
Remove excessive testing and keep only the checks for when the queries
are disallowed. Fix also the check for the initial value of the setting
to be conmbatible with Go client tests.

(cherry picked from commit 314145294ea926e069c6f8629dfc622a7f31a0fb)
2020-03-13 16:07:38 +01:00
Nik Everett 9ada508347
Fix date_nanos in composite aggs (backport of #53315) (#53347)
It looks like `date_nanos` fields weren't likely to work properly in
composite aggs because composites iterate field values using points and
we weren't converting the points into milliseconds. Because the doc
values were coming back in milliseconds we ended up geting very confused
and just never collecting sub-aggregations.

This fixes that by adding a method to `DateFieldMapper.Resolution` to
`parsePointAsMillis` which is similarly in name and function to
`NumberFieldMapper.NumberType`'s `parsePoint` except that it normalizes
to milliseconds which is what aggs need at the moment.

Closes #53168
2020-03-11 13:00:07 -04:00
Nik Everett 89c0e1f566
Fix composite agg sort bug (backport of #53296) (#53337)
When an composite aggregation is run against an index with a sort that
*starts* with the "source" fields from the composite but has additional
fields it'd blow up in while trying to decide if it could use the sort.
This changes it to decide that it *can* use the sort.

Closes #52480
2020-03-10 11:32:46 -04:00
Nik Everett f32e4583d1
Add `allowed_warnings` to yaml tests (backport of #53139) (#53173)
When we test backwards compatibility we often end up in a situation
where we *sometimes* get a warning, and sometimes don't. Like, we won't
get the warning if we're testing against an older version, but we will
in a newer one. Or we won't get the warning if the request randomly
lands on a node with an old version of the code. But we wouldn't if it
randomed into a node with newer code.

This adds `allowed_warnings` to our yaml test runner for those cases:
warnings declared this way are "allowed" but not "required".

Blocks #52959

Co-authored-by: Benjamin Trent <ben.w.trent@gmail.com>
2020-03-05 17:11:54 -05:00
David Turner 52fa465300
Cache completion stats between refreshes (#52872)
Computing the stats for completion fields may involve a significant amount of
work since it walks every field of every segment looking for completion fields.
Innocuous-looking APIs like `GET _stats` or `GET _cluster/stats` do this for
every shard in the cluster. This repeated work is unnecessary since these stats
do not change between refreshes; in many indices they remain constant for a
long time.

This commit introduces a cache for these stats which is invalidated on a
refresh, allowing most stats calls to bypass the work needed to compute them on
most shards.

Closes #51915
Backport of #51991
2020-02-27 10:01:24 +00:00
Nhat Nguyen db6b9c21c7 Use local checkpoint to calculate min translog gen for recovery (#51905)
Today we use the translog_generation of the safe commit as the minimum
required translog generation for recovery. This approach has a
limitation, where we won't be able to clean up translog unless we flush.
Reopening an already recovered engine will create a new empty translog,
and we leave it there until we force flush.

This commit removes the translog_generation commit tag and uses the
local checkpoint of the safe commit to calculate the minimum required
translog generation for recovery instead.

Closes #49970
2020-02-26 17:08:18 -05:00
Jake Landis 8d311297ca
[7.x] Smarter copying of the rest specs and tests (#52114) (#52798)
* Smarter copying of the rest specs and tests (#52114)

This PR addresses the unnecessary copying of the rest specs and allows
for better semantics for which specs and tests are copied. By default
the rest specs will get copied if the project applies
`elasticsearch.standalone-rest-test` or `esplugin` and the project
has rest tests or you configure the custom extension `restResources`.

This PR also removes the need for dozens of places where the x-pack
specs were copied by supporting copying of the x-pack rest specs too.

The plugin/task introduced here can also copy the rest tests to the
local project through a similar configuration.

The new plugin/task allows a user to minimize the surface area of
which rest specs are copied. Per project can be configured to include
only a subset of the specs (or tests). Configuring a project to only
copy the specs when actually needed should help with build cache hit
rates since we can better define what is actually in use.
However, project level optimizations for build cache hit rates are
not included with this PR.

Also, with this PR you can no longer use the includePackaged flag on
integTest task.

The following items are included in this PR:
* new plugin: `elasticsearch.rest-resources`
* new tasks: CopyRestApiTask and CopyRestTestsTask - performs the copy
* new extension 'restResources'
```
restResources {
  restApi {
    includeCore 'foo' , 'bar' //will include the core specs that start with foo and bar
    includeXpack 'baz' //will include x-pack specs that start with baz
  }
  restTests {
    includeCore 'foo', 'bar' //will include the core tests that start with foo and bar
    includeXpack 'baz' //will include the x-pack tests that start with baz
  }
}

```
2020-02-26 08:13:41 -06:00
Nhat Nguyen f0bc8abcd0 Fix translog stats on closed indices yaml test (#52800)
We need to wait for no initializing shards before closing; otherwise, 
we might fail to close some recovering replicas.

Closes #52701
2020-02-26 08:14:35 -05:00
Russ Cam 087ceb899b
Reinstate params in rest api specs (#52544)
This commit reinstates the following params in the rest specs:

1. "analyzer" in delete_by_query
2. "ccs_minimize_roundtrips" in msearch_template
3. "ccs_minimize_roundtrips" in search_template

All appear to be valid options that seem to have been inadvertantly removed
between 7.3 and 7.4.

Fixes elastic/elasticsearch#47768
2020-02-20 14:39:15 +11:00
Marios Trivyzas ea6f0e39bc
[Tests] Update skip version for YAML tests (#52310)
Update skip versions upper boundary to match the release
or intented release version of the feature/fix.
2020-02-13 15:36:31 +01:00
Nik Everett ac535f59a5
Enable BWC test after backport (#52300)
Now that we've backported #52016 we can run its tests when we're
performance backwards compatibility testing.
2020-02-13 07:56:29 -05:00
Nik Everett 7efce22f19
Fix a DST error in date_histogram (backport #52016) (#52237)
When `date_histogram` attempts to optimize itself it for a particular
time zone it checks to see if the entire shard is within the same
"transition". Most time zone transition once every size months or
thereabouts so the optimization can usually kicks in.

*But* it crashes when you attempt feed it a time zone who's last DST
transition was before epoch. The reason for this is a little twisted:
before this patch it'd find the next and previous transitions in
milliseconds since epoch. Then it'd cast them to `Long`s and pass them
into the `DateFieldType` to check if the shard's contents were within
the range. The trouble is they are then converted to `String`s which are
*then* parsed back to `Instant`s which are then convertd to `long`s. And
the parser doesn't like most negative numbers. And everything before
epoch is negative.

This change removes the
`long` -> `Long` -> `String` -> `Instant` -> `long` chain in favor of
passing the `long` -> `Instant` -> `long` which avoids the fairly complex
parsing code and handles a bunch of interesting edge cases around
epoch. And other edge cases around `date_nanos`.

Closes #50265
2020-02-12 17:57:04 -05:00
Nik Everett 8c930a9960
Update skip after backport (#52288)
Now that #51868 is fully backported we can run its tests in the
backwards compatibility tests.
2020-02-12 17:01:27 -05:00
Marios Trivyzas dac720d7a1
Add a cluster setting to disallow expensive queries (#51385) (#52279)
Add a new cluster setting `search.allow_expensive_queries` which by
default is `true`. If set to `false`, certain queries that have
usually slow performance cannot be executed and an error message
is returned.

- Queries that need to do linear scans to identify matches:
  - Script queries
- Queries that have a high up-front cost:
  - Fuzzy queries
  - Regexp queries
  - Prefix queries (without index_prefixes enabled
  - Wildcard queries
  - Range queries on text and keyword fields
- Joining queries
  - HasParent queries
  - HasChild queries
  - ParentId queries
  - Nested queries
- Queries on deprecated 6.x geo shapes (using PrefixTree implementation)
- Queries that may have a high per-document cost:
  - Script score queries
  - Percolate queries

Closes: #29050
(cherry picked from commit a8b39ed842c7770bd9275958c9f747502fd9a3ea)
2020-02-12 22:56:14 +01:00
Nik Everett 0c1889389a
Update skip for backported fix (#52241)
Now that #51172 is fully backported we can fix the `skip` clause in the
bwc tests for it.
2020-02-12 13:55:47 -05:00
Tim Vernum 79f67e79cf Mute MixedCluster 180_locale_dependent_mapping (#52116)
Muting this test as it has frequent failures.

See: #49719
2020-02-10 12:37:32 +11:00