Commit Graph

52644 Commits

Author SHA1 Message Date
Costin Leau 5580eb61ed EQL: Improve sequence limiting (#59439)
Improve the way limit (in particular offset) is being applied to handle
the case where the matches are less than the offset and absolute limit.

Combine Matcher and SequenceStateMachine into one class since the two
have evolved beyond their original name and structure.

(cherry picked from commit 63d3c62cdfc33dea03f21d5565b9c8ea104003eb)
2020-07-14 13:19:09 +03:00
Armin Braun 0e3d87ab54
Add Assertions on CS Application in Snapshot Logic (#58681) (#59511)
Relates to #58680. Bugs like that should not only show up in logs
but ideally also get caught in tests. We expect to never see exceptions
in these two spots.
2020-07-14 12:16:42 +02:00
Armin Braun 81e96954d0
Improve Efficiency of SnapshotsService CS Apply (#56874) (#59508)
This change removes the redundant submitting of two separate cluster state updates
for the node configuration changes and routing changes that affect snapshots.
Since we submitted the task to deal with node configuration changes every time on master
fail-over we could also move the BwC cleanup loop that removes `INIT` state snapshots as well
as snapshots that have all their shards completed into this cluster state update task.

Aside from improving efficiency overall this change has the fortunate side effect of moving
all snapshot finalization to the CS update thread. This is helpful for concurrent snapshots
since it makes it very natural and straight forward to order snapshot finalizations by exploiting
that they are all initiated on the same thread.
2020-07-14 11:49:09 +02:00
Hendrik Muhs c8290167a0
[7.x][Transform] separate pivot and extract function interface (#59505)
separate pivot from the indexer and introduce an abstraction layer, pivot becomes a function.
Foundation to add more functions to transform.

piggy backed fixes:
 - when running geo tile group_by it could fail due to query clause limit (unreleased)
 - new style page size using settings was not validating limit of 10k (7.8)
2020-07-14 11:27:16 +02:00
Martijn van Groningen 5f24be1bc1
Also set system property when running test task. (#59499)
Closes #59488
2020-07-14 10:34:52 +02:00
Rene Groeschke d5c11479da
Remove remaining deprecated api usages (#59231) (#59498)
- Fix duplicate path deprecation by removing duplicate test resources
- fix deprecated non annotated input property in LazyPropertyList
- fix deprecated usage of AbstractArchiveTask.version
- Resolve correct test resources
2020-07-14 10:25:00 +02:00
David Roberts 529aa345df [ML] Account for per-partition categorization in model memory estimate (#59458)
Now that we have per-partition categorization, the estimate for
the model memory limit required for a particular analysis config
needs to take into account whether categorization is operating
for the job as a whole or per-partition.
2020-07-14 09:16:28 +01:00
Yang Wang 4350add12c
Allow null name when deserialising API key document (#59485) (#59496)
API keys can be created without names using grant API key action. This is considered as a bug (#59484). Since the feature has already been released, we need to accomodate existing keys that are created with null names. This PR relaxes the parser logic so that a null name is accepted.
2020-07-14 16:08:32 +10:00
debadair 7d20d32a8c
Update node.asciidoc (#59201) (#59479)
TIP block was missing due to the lack of line break prior to the "TIP"

Co-authored-by: Leaf-Lin <39002973+Leaf-Lin@users.noreply.github.com>
2020-07-13 16:51:14 -07:00
Tim Brooks 623df95a32
Adding indexing pressure stats to node stats API (#59467)
We have recently added internal metrics to monitor the amount of
indexing occurring on a node. These metrics introduce back pressure to
indexing when memory utilization is too high. This commit exposes these
stats through the node stats API.
2020-07-13 17:23:42 -06:00
Mark Vieira dc7d4c615c
Ensure fixture runtime dependencies are built before starting containers (#59474) 2020-07-13 15:58:01 -07:00
Nik Everett 81cba796e6
Add microbenchmark for LongKeyedBucketOrds (#58608) (#59459)
I've always been confused by the strange behavior that I saw when
working on #57304. Specifically, I saw switching from a bimorphic
invocation to a monomorphic invocation to give us a 7%-15% performance
bump. This felt *bonkers* to me. And, it also made me wonder whether
it'd be worth looking into doing it everywhere.

It turns out that, no, it isn't needed everywhere. This benchmark shows
that a bimorphic invocation like:
```
LongKeyedBucketOrds ords = new LongKeyedBucketOrds.ForSingle();
ords.add(0, 0); <------ this line
```

is 19% slower than a monomorphic invocation like:
```
LongKeyedBucketOrds.ForSingle ords = new LongKeyedBucketOrds.ForSingle();
ords.add(0, 0); <------ this line
```

But *only* when the reference is mutable. In the example above, if
`ords` is never changed then both perform the same. But if the `ords`
reference is assigned twice then we start to see the difference:
```
immutable bimorphic    avgt   10   6.468 ± 0.045  ns/op
immutable monomorphic  avgt   10   6.756 ± 0.026  ns/op
mutable   bimorphic    avgt   10   9.741 ± 0.073  ns/op
mutable   monomorphic  avgt   10   8.190 ± 0.016  ns/op
```

So the conclusion from all this is that we've done the right thing:
`auto_date_histogram` is the only aggregation in which `ords` isn't final
and it is the only aggregation that forces monomorphic invocations. All
other aggregations use an immutable bimorphic invocation. Which is fine.

Relates to #56487
2020-07-13 17:22:46 -04:00
James Rodewig db89764539
[DOCS] Add data streams to rollup APIs (#59423) (#59465) 2020-07-13 16:57:40 -04:00
Lee Hinman 81bdb20b8a Fix license header for DataStreamRestIT 2020-07-13 14:41:29 -06:00
Tim Brooks 68d56fa7db
Implement rejections in `WriteMemoryLimits` (#59451)
This commit adds rejections when the indexing memory limits are
exceeded for primary or coordinating operations. The amount of bytes
allow for indexing is controlled by a new setting
`indexing_limits.memory.limit`.
2020-07-13 14:34:50 -06:00
James Rodewig a1cf955dbd
[DOCS] Clarify that passwords are not preserved for `kibana_system` user (#59449) (#59460) 2020-07-13 16:34:11 -04:00
Mark Tozzi eb0b28dd1d
Move getPointReaderOrNull into AggregatorBase (#58769) (#59455) 2020-07-13 16:31:33 -04:00
Lee Hinman bf1a60130d
[7.x] Add telemetery for data streams (#59433) (#59454)
This commit adds data stream info to the `/_xpack` and `/_xpack/usage` APIs. Currently the usage is
pretty minimal, returning only the number of data streams and the number of indices currently
abstracted by a data stream:

```
  ...
  "data_streams" : {
    "available" : true,
    "enabled" : true,
    "data_streams" : 3,
    "indices_count" : 17
  }
  ...
```
2020-07-13 14:30:11 -06:00
Adam Locke aa260636e5
Indicating that the size parameter defaults to 10. (#59438) (#59461) 2020-07-13 16:27:20 -04:00
Armin Braun 64c5f70a2d
Remove Needless Context Switches on Loading RepositoryData (#56935) (#59452)
We don't need to switch to the generic or snapshot pool for loading
cached repository data (i.e. most of the time in normal operation).

This makes `executeConsistentStateUpdate` less heavy if it has to retry
and lowers the chance of having to retry in the first place.
Also, this change allowed simplifying a few other spots in the codebase
where we would fork off to another pool just to load repository data.
2020-07-13 21:38:29 +02:00
Jake Landis 665b7b7bd8
Convert modules to use yamlRestTest (#59089) (#59446)
This commit moves the modules REST tests to the
newly introduced yamlRestTest source set. A few
tests have also been re-named to include the correct
IT suffix. Without changing the names, the testing
conventions task would fail since now that the YAML
tests are no longer present pacify the convention.
These tests have moved to the internalClusterTest
source set.

related: #56841
2020-07-13 13:53:05 -05:00
Armin Braun bde92fc5fc
Remove Needless Context Switch From Snapshot Finalization (#56871) (#59443)
No need to do any switch to the `SNAPSHOT` pool here, the blob store
repo handles all its writes async on the `SNAPSHOT` pool so we're just
needlessly context-switching to enqueue those tasks there.
Also cleaned up the source only repository (the only override to `finalizeSnapshot`)
to make it clear that no IO is happening there and we don't need to run it on the
`SNAPSHOT` pool either.
2020-07-13 20:11:07 +02:00
Armin Braun 31be3a3645
More Efficient Snapshot State Handling (#56669) (#59430)
Follow up to #56365. Instead of redundantly checking snapshots for completion
over and over, just track the completed snapshots in the CS updates that complete
them instead of looping over the smae snapshot entries over and over.
Also, in the batched snapshot shard status updates, only check for completion
of a snapshot entry if it isn't already finalizing.
2020-07-13 18:58:04 +02:00
James Rodewig d293e1ae36
[DOCS] Add data streams to reload search analyzers API (#59422) (#59437) 2020-07-13 12:50:47 -04:00
James Rodewig 0a7664e190
[DOCS] Add data streams to validate query API (#59420) (#59436) 2020-07-13 12:50:34 -04:00
homersimpsons f95658d1f8 [DOCS] MatchQuery: `transpositions` to `fuzzy_transpositions` (#59371) 2020-07-13 12:37:30 -04:00
Christos Soulios 3868bcc7b8
[7.x] Histogram integration on Histogram field type (#59431)
Backports #58930 to 7.x
Implements histogram aggregation over histogram fields as requested in #53285.
2020-07-13 19:36:33 +03:00
Dimitris Athanasiou a7895ff458
[7.x][ML] Remove unused member var from ExtractedFieldsDetector (#59395) (#59406)
Removes member variable `index` from `ExtractedFieldsDetector`
as it is not used.

Backport of #59395

Co-authored-by: Elastic Machine <elasticmachine@users.noreply.github.com>
2020-07-13 19:10:43 +03:00
Igor Motov 1acb4aeba9
EQL: Prepare for release (#59331) (#59426)
Enables eql setting in release builds.

Relates #51613
2020-07-13 11:54:32 -04:00
Henning Andersen adf6083dd0
Enhance real memory circuit breaker with G1 GC (#58674) (#59394)
Using G1 GC, Elasticsearch can rarely trigger that heap usage goes above
the real memory circuit breaker limit and stays there for an extended
period. This situation will persist until the next young GC. The circuit
breaking itself hinders that from occurring in a timely manner since it
breaks all request before real work is done.

This commit gently nudges G1 to do a young GC and then double checks
that heap usage is still above the real memory circuit breaker limit
before throwing the circuit breaker exception.

Related to #57202
2020-07-13 17:41:09 +02:00
Martijn van Groningen b1b7bf3912
Make data streams a basic licensed feature. (#59392)
Backport of #59293 to 7.x branch.

* Create new data-stream xpack module.
* Move TimestampFieldMapper to the new module,
  this results in storing a composable index template
  with data stream definition only to work with default
  distribution. This way data streams can only be used
  with default distribution, since a data stream can
  currently only be created if a matching composable index
  template exists with a data stream definition.
* Renamed `_timestamp` meta field mapper
   to `_data_stream_timestamp` meta field mapper.
* Add logic to put composable index template api
  to fail if `_data_stream_timestamp` meta field mapper
  isn't registered. So that a more understandable
  error is returned when attempting to store a template
  with data stream definition via the oss distribution.

In a follow up the data stream transport and
rest actions can be moved to the xpack data-stream module.
2020-07-13 17:26:46 +02:00
Yang Wang cc9166a5ea Mute failed 120_api_key_auth test till #59425 is addressed. 2020-07-14 01:10:36 +10:00
Yang Wang edf27cd765 Adjust BWC versions for API key auth test.
API key realm name is not available in authentication metadata prior to
v7.5. The issue is tracked at #59425
2020-07-14 00:38:42 +10:00
David Roberts b5e8250a4e
[ML] Drive categorization warning notifications from annotations (#59393)
With the introduction of per-partition categorization the old
logic for creating a job notification for categorization status
"warn" does not work.  However, the C++ code is already writing
annotations for categorization status "warn" that take into
account whether per-partition categorization is being used and
which partition(s) the warnings relate to.  Therefore, this
change alters the Java results processor to create notifications
based on the annotations the C++ writes.  (It is arguable that
we don't need both annotations and notifications, but they show
up in different ways in the UI: only annotations are visible in
results and only notifications set the warning symbol in the
jobs list.  This means it's best to have both.)

Backport of #59377
2020-07-13 15:28:57 +01:00
Dan Hermann c228532ebd
Update docs for delete data stream API to show that multiple names are supported 2020-07-13 09:11:25 -05:00
James Rodewig 27a87c9d0c
[DOCS] Update snapshot/restore and SLM docs for data streams (#58513) (#59403)
Updates the existing snapshot/restore and SLM docs to make them
aware of data streams.
2020-07-13 09:26:51 -04:00
Alan Woodward bd01fd107c Revert "Migrate CompletionFieldMapper to parametrized format (#59291)"
This reverts commit 19ba6c39d2.
2020-07-13 14:16:09 +01:00
David Kyle 054d5236d4 Mute RegressionIT failure (#59414)
For #59413
2020-07-13 14:12:19 +01:00
James Rodewig 2629a95e14
[DOCS] EQL: Document `until` keyword support (#59320) (#59408) 2020-07-13 09:05:47 -04:00
James Rodewig 85101fa487
[DOCS] Add data streams to searchable snapshot API docs (#59325) (#59409) 2020-07-13 09:05:27 -04:00
James Rodewig a357ec59f2
[DOCS] Add data streams to index APIs (#59329) (#59410) 2020-07-13 09:05:03 -04:00
James Rodewig 35a78b88ab
[DOCS] Add data streams to ILM explain API (#59343) (#59411) 2020-07-13 09:04:42 -04:00
James Rodewig 896d0ffd9b
[DOCS] EQL: Prepare docs for release (#59259) (#59407)
Changes:

* Swaps the `dev` admonitions for `experimental` admonitions
* Removes `ifdef` statements preventing the docs from appearing in
  released branches
2020-07-13 09:04:15 -04:00
James Rodewig 9d5c091f7a
[DOCS] Add data streams to EQL search docs (#58611) (#59404) 2020-07-13 09:03:55 -04:00
James Rodewig 39bcc4a1a7
[DOCS] Add ingest pipeline ex to data stream docs (#58343) (#59402) 2020-07-13 09:03:36 -04:00
Yang Wang a84469742c
Improve role cache efficiency for API key roles (#58156) (#59397)
This PR ensure that same roles are cached only once even when they are from different API keys.
API key role descriptors and limited role descriptors are now saved in Authentication#metadata
as raw bytes instead of deserialised Map<String, Object>.
Hashes of these bytes are used as keys for API key roles. Only when the required role is not found
in the cache, they will be deserialised to build the RoleDescriptors. The deserialisation is directly
from raw bytes to RoleDescriptors without going through the current detour of
"bytes -> Map -> bytes -> RoleDescriptors".
2020-07-13 22:58:11 +10:00
Armin Braun 4e574a7136
Remove Dead Code from Closed Index Snapshot Logic (#56764) (#59398)
The code path for closed indices is dead code here ever since #39644
because `shards(currentState, indexIds, ...)` does not set
`MISSING` on a closed index's shard that is assigned any longer. Before that change it would always set `MISSING` for a closed index's shard even it was assigned.
=> simplified the code accordingly.
2020-07-13 14:49:16 +02:00
David Turner 3fb9dccc22 Fix FSHealthServiceTests on Windows (#59387)
In #52680 we introduced a new health check mechanism. This commit fixes
up some related test failures on Windows caused by erroneously assuming
that all paths begin with `/`.

Closes #59380
2020-07-13 12:43:45 +01:00
Alan Woodward 19ba6c39d2 Migrate CompletionFieldMapper to parametrized format (#59291)
This adds some optional extra configuration to Parameter:

* custom serialization (to handle analyzers)
* deprecated parameter names
* parameter validation
2020-07-13 12:43:15 +01:00
Armin Braun 08b54feaaf
Remove Snapshot INIT Step (#55918) (#59374)
With #55773 the snapshot INIT state step has become obsolete. We can set up the snapshot directly in one single step to simplify the state machine.

This is a big help for building concurrent snapshots because it allows us to establish a deterministic order of operations between snapshot create and delete operations since all of their entries now contain a repository generation. With this change simple queuing up of snapshot operations can and will be added in a follow-up.
2020-07-13 13:41:09 +02:00