Commit Graph

51581 Commits

Author SHA1 Message Date
Tim Brooks 760ab726c2
Share netty event loops between transports (#56553)
Currently Elasticsearch creates independent event loop groups for each
transport (http and internal) transport type. This is unnecessary and
can lead to contention when different threads access shared resources
(ex: allocators). This commit moves to a model where, by default, the
event loops are shared between the transports. The previous behavior can
be attained by specifically setting the http worker count.
2020-05-11 15:43:43 -06:00
Marc e0e7b89499 [DOCS] Add document update API link to concurrency control docs (#56481)
Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-05-11 17:37:32 -04:00
Benjamin Trent 1d6b2f074e
[Transform] adds geotile_grid support in group_by (#56514) (#56549)
This adds support for grouping by geo points. This uses the agg [geotile_grid](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geotilegrid-aggregation.html).

I am opting to store the tile results of group_by as a `geo_shape` so that users can query the results. Additionally, the shapes could be visualized and filtered in the kibana maps app.

relates to https://github.com/elastic/elasticsearch/issues/56121
2020-05-11 17:02:40 -04:00
Lee Hinman 1337b35572
Remove prefer_v2_templates query string parameter (#56545)
This commit removes the `prefer_v2_templates` flag and setting. This was a brief setting that
allowed specifying whether V1 or V2 template should be used when an index is created. It has been
removed in favor of V2 templates always having priority.

Relates to #53101
Resolves #56528

This is not a breaking change because this flag was never in a released version.
2020-05-11 14:56:42 -06:00
Brandon Morelli 659edb92ff
docs: [7.x][apm] link to master in n.x branches (#56539) 2020-05-11 13:42:37 -07:00
Nik Everett c85a363b60 Fix nested agg test
I accidentally allowed the test framework to double-wrap a reader that
we rely on being only singly wrapped. Lame.

closes #56529
2020-05-11 16:15:25 -04:00
Lee Hinman 91c5ace569 [7.x] Disable BWC tests for prefer_v2_template removal (#56543)
This disables BWC tests for 7.x so the prefer_v2_template flag can be removed in 7.8, 7.x, and master.
2020-05-11 14:06:03 -06:00
Lee Hinman 11179b90de Re-enable BWC tests on 7.x
This can only be merged once #56541, #56545 and #56546 have been merged
2020-05-11 14:05:47 -06:00
Lee Hinman 7c5e6c29f5
[7.x] Disable BWC tests for prefer_v2_template removal (#56543)
This disables BWC tests for 7.x so the prefer_v2_template flag can be removed in 7.8, 7.x, and master.
2020-05-11 13:49:43 -06:00
renshuki b90c7e3797 [DOCS] Add more examples to Painless statements (#48397)
Add more examples to the Painless statements documentation (including while and do...while) as requested in #47485 (review)
2020-05-11 10:21:45 -07:00
zhenxianyimeng 8e96e5c936
Use CollectionUtils.isEmpty where appropriate (#55910)
This commit uses the isEmpty utility method for arrays in place of null and greater than zero checks.
2020-05-11 09:55:57 -07:00
Martijn van Groningen 32471abc0e
Check whether data stream feature flag is enabled before deleting all data streams, (#56517) (#56520)
this will fix the release build.
2020-05-11 18:34:50 +02:00
James Rodewig 2be6d7b8b6
[DOCS] Relocate request body param docs to search API docs (#56436)
Moves documentation for the following request body parameters to the
search API reference docs:

* `explain`
* `query`
* `seq_no_primary_term`
* `version`

Removes documentation for these parameters from the Request body search
page[0].

[0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-request-body.html
2020-05-11 11:29:38 -04:00
Armin Braun 3ab6eba6bc
Fix RollupJobTaskTests Leaking Threads on Slowness (#56438) (#56518)
We are ensuring order in the two tests changed by waiting on latches.
The problem is, that 3s is a pretty short wait and on CI can randomly be exceeded
by pure chance. If that happened we wouldn't have visibility on it since we didn't
assert that the waits actually worked.
=> Fixed by asserting that the waits work and upping the timeout to our standard 10s
Also, moved to a per-test threadpool to make it simpler to identify which test failed,
should an unexpected task run on a closed client's pool afterall.
2020-05-11 17:24:10 +02:00
Lisa Cawley 7ce0a25fbc [DOCS] Add UUID troubleshooting tip for stack monitoring (#55744) 2020-05-11 07:49:34 -07:00
Lisa Cawley 1474606b18 [DOCS] Clarify model snapshot retention properties (#56477) 2020-05-11 07:43:10 -07:00
James Rodewig ba67ab3b64
[DOCS] Add reference docs for `search.max_buckets` setting (#56449) (#56511)
Adds reference-style setting documentation for the `search.max_buckets`
setting.

This setting was previously only documented on the [bucket
aggregations][0] page.

[0]: https://www.elastic.co/guide/en/elasticsearch/reference/master/search-aggregations-bucket.html
2020-05-11 09:45:09 -04:00
Jim Ferenczi 02ab9112a9 Fix spurious failures in AsyncSearchIntegTestCase (#56026)
Async search integration tests are subject to random failures when:
  * The test index has more than one replica.
  * The request cache is used.
  * Some shards are empty.
  * The maintenance service starts a garbage collection when node is closing.

They are also slow because the test index is created/populated on each
test method.

This change refactors these integration tests in order to:
  * Create the index once for the entire test suite.
  * Fix the usage of the request cache and replicas.
  * Ensures that all shards have at least one document.
  * Increase the delay of the maintenance service garbage collection.

Closes #55895
Closes #55988
2020-05-11 15:03:03 +02:00
Martijn van Groningen 9ae09570d8
Allow a number of broadcast transport actions to resolve data streams (#55726) (#56502)
Change TransportBroadcastByNodeAction and TransportBroadcastReplicationAction
to be able to resolve data streams by default. Implementations can change this ability.

This change allows to following APIs to resolve data streams: flush,
refresh (already supported data streams), force merge, clear indices cache,
indices stats (already supported data streams), segments, upgrade stats, 
upgrade, validate query, searchable snapshots stats, clear searchable snapshots cache and
reload analyzers APIs.

Relates to #53100
2020-05-11 12:48:35 +02:00
Rene Groeschke c29bc87040
Move bwcVersions extension property to BuildParams (back port) (#56381)
* Move bwcVersions extension property to BuildParams (#56206)
* Fix :qa Task Using Broken BwC Versions Resolution (#56332)

Co-authored-by: Armin Braun <me@obrown.io>
2020-05-11 09:39:13 +02:00
István Zoltán Szabó ebe1e4c4c4 [DOCS] Expands GET DFA stats API docs with new phases (#56407)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-05-11 09:26:15 +02:00
Hendrik Muhs c0985615aa [DOC] document transform settings and docs_per_second (#56178)
add documentation for throttling, added in #56007
2020-05-11 09:23:49 +02:00
Hendrik Muhs ad54c51467
[7.x] add a basic get index rolling upgrade test (#56322) (#56411)
add a very basic rolling upgrade test for get index, post mortem action of #56274
2020-05-11 07:51:38 +02:00
Nik Everett 2823300bdf
Speed up rounding in auto_date_histogram (#56384) (#56486)
This wires `auto_date_histogram` into the rounding optimization that I
built in #55559. This is should significantly speed up any
`auto_date_histogram`s with `time_zone`s on them.
2020-05-09 11:37:30 -04:00
Nik Everett 2f38aeb5e2
Save memory when numeric terms agg is not top (#55873) (#56454)
Right now all implementations of the `terms` agg allocate a new
`Aggregator` per bucket. This uses a bunch of memory. Exactly how much
isn't clear but each `Aggregator` ends up making its own objects to read
doc values which have non-trivial buffers. And it forces all of it
sub-aggregations to do the same. We allocate a new `Aggregator` per
bucket for two reasons:

1. We didn't have an appropriate data structure to track the
   sub-ordinals of each parent bucket.
2. You can only make a single call to `runDeferredCollections(long...)`
   per `Aggregator` which was the only way to delay collection of
   sub-aggregations.

This change switches the method that builds aggregation results from
building them one at a time to building all of the results for the
entire aggregator at the same time.

It also adds a fairly simplistic data structure to track the sub-ordinals
for `long`-keyed buckets.

It uses both of those to power numeric `terms` aggregations and removes
the per-bucket allocation of their `Aggregator`. This fairly
substantially reduces memory consumption of numeric `terms` aggregations
that are not the "top level", especially when those aggregations contain
many sub-aggregations. It also is a pretty big speed up, especially when
the aggregation is under a non-selective aggregation like
the `date_histogram`.

I picked numeric `terms` aggregations because those have the simplest
implementation. At least, I could kind of fit it in my head. And I
haven't fully understood the "bytes"-based terms aggregations, but I
imagine I'll be able to make similar optimizations to them in follow up
changes.
2020-05-08 20:38:53 -04:00
Mark Vieira 0fb9bc5379
Always use archive base name as the pom artifact id (#56447) (#56467) 2020-05-08 16:11:19 -07:00
debadair 6ae7327061
[DOCS] Align with ILM changes. (#55953) (#56455)
* [DOCS] Align with ILM changes.

* Apply suggestions from code review

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
Co-authored-by: Lee Hinman <dakrone@users.noreply.github.com>

* Incorporated review comments.
2020-05-08 14:22:27 -07:00
Armin Braun 0a254cf223
Serialize Monitoring Bulk Request Compressed (#56410) (#56442)
Even with changes from #48854 we're still seeing significant (as in tens and hundreds of MB)
buffer usage for bulk exports in some cases which destabilizes master nodes.
Since we need to know the serialized length of the bulk body we can't do the serialization
in a streaming manner. (also it's not easily doable with the HTTP client API we're using anyway).
=> let's at least serialize on heap in compressed form and decompress as we're streaming to the
HTTP connection. For small requests this adds negligible overhead but for large requests this reduces
the size of the payload field by about an order of magnitude (empirically determined) which is a massive reduction in size when considering O(100MB) bulk requests.
2020-05-08 23:16:07 +02:00
Jake Landis 95e5e9e598
[7.x] [DOCS] Update default value of index.name.time_format (#56453) (#56456)
Corrects the default value of index.name.time_format
2020-05-08 16:09:42 -05:00
Théophile Helleboid - chtitux 23e419a7aa SLM PUT: add precision on date math support in indices parameter (#55203)
It was not clear for me that `indices` parameter supports date math expression.

I think it may be worth to add the precision in the documentation.
2020-05-08 15:06:09 -06:00
Dimitris Athanasiou 44ffa388ac
[7.x][ML] Use non-zero timeout when force stopping DF analytics (#56423) (#56428)
We have been using a zero timeout in the case that DF analytics
is stopped. This may cause a timeout when we cancel, for example,
the reindex task.

This commit fixes this by using the default timeout instead.

Backport of #56423
2020-05-08 21:12:11 +03:00
Nik Everett bd4b9dd10e
Speed up time interval arounding around dst (backport #56371) (#56396)
When an index spans a daylight savings time transition we can't use our
optimization that rewrites the requested time zone to a fixed time zone
and instead we used to fall back to a java.util.time based rounding
implementation. In #55559 we optimized "time unit" rounding. This
optimizes "time interval" rounding.

The java.util.time based implementation is about 1650% slower than the
rounding implementation for a fixed time zone. This replaces it with a
similar optimization that is only about 30% slower than the fixed time
zone. The java.util.time implementation allocates a ton of short lived
objects but the optimized implementation doesn't. So it *might* end up
being faster than the microbenchmarks imply.
2020-05-08 13:39:27 -04:00
Nicole Albee 5b708f846c
[DOCS] Fix broken link in the ilm-tutorial. (#56310) (#56311) (#56446) 2020-05-08 12:22:39 -05:00
Armin Braun b18d242300
Fix Simulate Template Endpoint Temporary Index Handling (#56406) (#56432)
Use proper facility for creating temporary index service for the simulation
that does not add itself to the `IndicesService` unnecessarily (breaking an assertion about the
internal consistency of the cluster state and the `IndicesService`).

Closes #56298
2020-05-08 18:05:24 +02:00
David Roberts 9a3924a641
[ML] Adjust list of platforms that have ML native code (#56426)
Native code is now available for linux-aarch64.

Note that it is _not_ currently supported!
2020-05-08 16:22:45 +01:00
Martijn van Groningen 83739b5806
Backport: allow cluster health api to resolve data streams (#56425)
Backport of: #56413

Allow cluster health api to resolve data streams and
automatically remove data streams after each test in
test cases extending from `ESIntegTestCase`

Relates to #53100
2020-05-08 17:16:25 +02:00
Dimitris Athanasiou c117ae7a6e
[7.x][ML] Force stopping stopped DF analytics should succeed (#56421) (#56424)
Force stopping a DF analytics job whose config exists and that
is stopped should succeed. This was broken by #56360.

Closes #56414

Backport of #56421
2020-05-08 18:04:24 +03:00
Tanguy Leroux 8e9b69bfd7
Use snapshot information to build searchable snapshot store MetadataSnapshot (#56289) (#56403)
While investigating possible optimizations to speed up searchable
snapshots shard restores, we noticed that Elasticsearch builds the
list of shard files on local disk in order to compare it with the list of
files contained in the snapshot to restore. This list of files is
materialized with a MetadataSnapshot object whose construction
involves to read the footer checksum of every files of the shard
using Store.checksumFromLuceneFile() method.

Further investigation shows that a MetadataSnapshot object is
also created for other types of operations like building the list of
files to recover in a peer recovery (and primary shard relocation)
or in order to assign a shard to a node. These operations use the
Store.getMetadata(IndexCommit) method to build the list of files
and checksums.

In the case of searchable snapshots building the MetadataSnapshot
object can potentially trigger cache misses, which in turn can
cause the download and the writing in cache of the last range of
the file in order to check the 16 bytes footer. This in turn can
cause more evictions.

Since searchable snapshots already contains the footer information
of every file in BlobStoreIndexShardSnapshot it can directly read the
checksum from it and avoid to use the cache at all to create a
MetadataSnapshot for the operations mentioned above.

This commit adds a shortcut to the
SearchableSnapshotDirectory.openInput() method - similarly to what
already exists for segment infos - so that it creates a specific
IndexInput for checksum reading operation.
2020-05-08 14:16:19 +02:00
Dimitris Athanasiou 60b1c67409
[7.x][ML] Allow stopping DF analytics whose config is missing (#56360) (#56408)
It is possible that the config document for a data frame
analytics job is deleted from the config index. If that is
the case the user is unable to stop a running job because
we attempt to retrieve the config and that will throw.

This commit changes that. When the request is forced,
we do not expand the requested ids based on the existing
configs but from the list of running tasks instead.

Backport of #56360
2020-05-08 13:54:44 +03:00
Hendrik Muhs e2e4c3179c Revert "add a basic get index rolling upgrade test (#56322)"
This reverts commit 0b40886db3.
2020-05-08 12:41:15 +02:00
Hendrik Muhs 406d58f28c Revert "dont use mappings for the test, due to different output in 6.9 (type removal)"
This reverts commit b2c70a95da.
2020-05-08 12:41:11 +02:00
Hendrik Muhs b2c70a95da dont use mappings for the test, due to different output in 6.9 (type removal) 2020-05-08 11:53:47 +02:00
Hendrik Muhs 0b40886db3 add a basic get index rolling upgrade test (#56322)
add a very basic rolling upgrade test for get index, post mortem action of #56274
2020-05-08 10:53:01 +02:00
Hendrik Muhs cc35d37788 [Transform] unmute transform upgrade tests (#56296)
the transform upgrade tests broke due to #56238, but got fixed with #56274

fixes #56269
fixes #56250
2020-05-08 10:48:58 +02:00
Armin Braun 085ff8c404
Add More Trace Logging to BlobStoreRepository (#56336) (#56401)
Adding more trace logging that would be helpful in understanding
the precise order of blob-level operations if needed.
2020-05-08 08:31:32 +02:00
István Zoltán Szabó ceb0b0dba3 [DOCS] Updated screenshots in transform ecommerce example. (#56359) 2020-05-08 07:46:15 +02:00
Ryan Ernst 582145a493
Upgrade forbidden apis to 3.0 (#56368)
This commit upgrades forbidden apis to the latest version, which also
means we now get task configuration avoidance.
2020-05-07 19:05:07 -07:00
Tal Levy 13944b1bf9 Fix max-int limit for number of points reduced in geo_centroid (#56370)
A bug in InternalGeoCentroid#reduce existed that summed up
the aggregation's long-valued counts into a local integer variable.
Since it is definitely possible to reduce more than Integer.MAX points,
this change simply updates that variable to be a long-valued number.

Closes #55992.
2020-05-07 14:30:29 -07:00
Tal Levy 6e0178fb68
Move CumulativeSumPipelineAgg to use ConstructingObjectParser parsing (#55990) (#56380)
As part of #52776, this refactors the aggregation to use
the context parser to parse its parameters.
2020-05-07 12:34:54 -07:00
Tim Brooks b84d1e2577
Improve logging around SniffConnectionStrategy (#56378)
Currently, the logging around the SniffConnectionStrategy is limited.
The log messages are inconsistent and sometimes wrong. This commit
cleans up these log message to describe when connections are happening
and what failed if a step fails.

Additionally, this commit enables TRACE logging for a problematic test
(testEnsureWeReconnect).
2020-05-07 13:11:56 -06:00