Commit Graph

54036 Commits

Author SHA1 Message Date
Martijn van Groningen 0baefc8ddc
Always validate that only a create op is allowed in bulk api for data streams (#62820)
Backport #62766 to 7.x branch.

The bulk API caches the resolved concrete indices when resolving the user-provided
index name into the actual index name. The validation that prevents write ops other
than create from being executed in a data stream was only performed if the result
wasn't cached. For cached resolutions, the validation never occurred.

The validation would be skipped for all bulk items for a data stream after a create
operation for that same data stream. This commit ensures that the validation is always
performed for all bulk items (whether the concrete index resolution has been cached or
not cached).
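
The shape of the fix can be sketched as below. This is a minimal illustration, not the actual Elasticsearch code: `BulkTargetResolver`, `OpType`, `lookup`, and the `.ds-` backing index name are all hypothetical stand-ins. The point is that the op-type check runs on every call, before the cache is consulted, so a cache hit cannot bypass it.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; none of these names are real Elasticsearch classes.
final class BulkTargetResolver {
    enum OpType { CREATE, INDEX, UPDATE, DELETE }

    private final Map<String, String> cache = new HashMap<>();
    private final Set<String> dataStreams;

    BulkTargetResolver(Set<String> dataStreams) {
        this.dataStreams = dataStreams;
    }

    String resolve(String name, OpType op) {
        // Validate on every call, before touching the cache,
        // so a cached resolution can never skip the check.
        if (dataStreams.contains(name) && op != OpType.CREATE) {
            throw new IllegalArgumentException(
                "only create operations are allowed in data stream [" + name + "]");
        }
        return cache.computeIfAbsent(name, this::lookup);
    }

    // Stand-in for the real (expensive) concrete index resolution.
    private String lookup(String name) {
        return dataStreams.contains(name) ? ".ds-" + name + "-000001" : name;
    }
}
```

With this ordering, a `CREATE` followed by an `INDEX` against the same data stream is rejected even though the second resolution is served from the cache.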

Closes #62762
2020-09-23 16:27:54 +02:00
Nik Everett f8bc5a3e6b
Grok: Handle utf-8 natively (backport of #62794) (#62826)
This adds a method to `Grok` that matches against sections offset from
utf-8 byte arrays:
```
Map<String, Object> captures(byte[] utf8Bytes, int offset, int length)
```

This'll be useful for the grok-flavored runtime fields because they
want to match against utf-8 encoded strings stored in a big array. And
joni already supports this.
2020-09-23 09:33:03 -04:00
Dimitris Athanasiou 7de5201291
[7.x][ML] Handle data frame analytics state spreading over multiple docs (#62564) (#62824)
When state persistence was first implemented for data frame analytics
we had the assumption that state would always fit in a single document.
However this is not the case any more.

This commit adds handling of state that spreads over multiple documents.

Backport of #62564
2020-09-23 16:16:34 +03:00
James Rodewig e3d5915566 [DOCS] Fix JSON spec link for PIT API (#61783) 2020-09-23 14:29:06 +02:00
Armin Braun a754fd8020
Fix CoordinatorTests.testLogsMessagesIfPublicationDelayed (#62815) (#62822)
We need to account for an additional `DEFAULT_DELAY_VARIABILITY` timeout for
the lag detector task to be executed after it's scheduled.

Closes #62383
2020-09-23 14:23:28 +02:00
Dimitris Athanasiou 69e72656fa
[7.x][ML] Reset reindexing progress when DFA job resumes with incomplete reindexing (#62772) (#62816)
This fixes reindexing progress in the scenario when a DFA job that had not finished
reindexing is resumed (either because the user called stop and start or because the
job was reassigned in the middle of reindexing). Before the fix, reindexing progress
stayed at the value it had reached before until it surpassed that value.

When we resume a data frame analytics job we want to preserve reindexing progress
and reset all other phases, except when reindexing was not completed.
In that case we delete the destination index and start reindexing
from scratch, so we need to reset reindexing progress too.

Backport of #62772
2020-09-23 14:09:04 +03:00
Christoph Büscher 054a950ceb Align version field plugin naming (#62757)
To better align the plugin naming with other mapper plugins under x-pack (e.g.
mapper-flattened) this PR changes the plugin name and the containing directory
to "mapper-version"
2020-09-23 11:50:15 +02:00
Christoph Büscher 29074e7055
Add case insensitive prefix and wildcard to 'version' field (#62754) (#62782)
This change adds support for the recently introduced case insensitivity flag for
wildcard and prefix queries. Since version field values are encoded differently we
need to adapt our own AutomatonQuery variation to add both cases if case insensitivity
is turned on.
2020-09-23 11:48:34 +02:00
Ignacio Vera 81645ec2cc
nextSetBit should check if the underlying array contains the current word (#62805) (#62812)
This is a recent addition and it is missing a check as the underlying array can be smaller than the numBits capacity.
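
A minimal sketch of the kind of bounds check described, assuming a bit set backed by a `long[]` that may cover fewer bits than its declared capacity. The class `SparseWordsBitSet` and the `-1` "not found" return value are assumptions for illustration, not the class touched by this commit.

```java
// Hypothetical sketch; not the actual class changed by the commit.
final class SparseWordsBitSet {
    private final long[] words; // may hold fewer than numBits bits
    private final int numBits;  // declared capacity

    SparseWordsBitSet(long[] words, int numBits) {
        this.words = words;
        this.numBits = numBits;
    }

    /** Returns the first set bit at or after {@code from}, or -1 if there is none. */
    int nextSetBit(int from) {
        if (from < 0 || from >= numBits) {
            return -1;
        }
        int wordIndex = from >> 6;
        // The missing check: the backing array may not contain this word.
        if (wordIndex >= words.length) {
            return -1;
        }
        // Java masks a long shift distance to its low 6 bits, so this drops bits below 'from'.
        long word = words[wordIndex] >>> from;
        if (word != 0) {
            return from + Long.numberOfTrailingZeros(word);
        }
        while (++wordIndex < words.length) {
            word = words[wordIndex];
            if (word != 0) {
                int bit = (wordIndex << 6) + Long.numberOfTrailingZeros(word);
                return bit < numBits ? bit : -1;
            }
        }
        return -1;
    }
}
```

Without the `wordIndex >= words.length` guard, a call with `from` inside the capacity but past the end of the array would index out of bounds.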
2020-09-23 11:17:26 +02:00
Luca Cavanna 862fab06d3
Share same existsQuery impl throughout mappers (#57607)
Most of our field types have the same implementation for their `existsQuery` method which relies on doc_values if present, otherwise it queries norms if available or uses a term query against the _field_names meta field. This standard implementation is repeated in many different mappers.

There are field types that only query doc_values, because they always have them, and field types that always query _field_names, because they never have norms nor doc_values. We could apply the same standard logic to all of these field types as `MappedFieldType` has the knowledge about what data structures are available.

This commit introduces a standard implementation that does the right thing depending on the data structure that is available. With that only field types that require a different behaviour need to override the existsQuery method.

At the same time, this no longer forces subclasses to override `existsQuery`, which could be forgotten when needed. To address this we introduced a new test method in `MapperTestCase` that verifies the `existsQuery` being generated and its consistency with the available data structures.
2020-09-23 11:00:53 +02:00
Przemko Robakowski 005e0bffaf
[7.x] Make for each processor resistant to field modification (#62791) (#62807)
* Make for each processor resistant to field modification  (#62791)

This change provides a consistent view of the field that the foreach processor is iterating over. This prevents it from going into an infinite loop and putting great pressure on the cluster.
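
One common way to get a consistent view is to iterate over a copy of the field's values, so that changes made to the field during iteration are not observed by the loop. The sketch below assumes that approach; `SnapshotForEach` is a hypothetical helper, and the commit message does not say how the actual processor achieves it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch of iterating over a snapshot of a mutable list.
final class SnapshotForEach {
    /** Applies {@code processor} to each element present when the call started; returns how many were visited. */
    static <T> int forEach(List<T> field, Consumer<T> processor) {
        // Copy first: elements added to 'field' by the processor are never visited.
        List<T> snapshot = new ArrayList<>(field);
        for (T value : snapshot) {
            processor.accept(value);
        }
        return snapshot.size();
    }
}
```

A processor that appends to the same list it is iterating over then terminates after visiting only the original elements.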

Closes #62790

* fix compilation
2020-09-23 10:46:00 +02:00
David Kyle bc34ecc581
[ML] Mute annotations index upgrade mapping test (#62814)
For #61908
2020-09-23 09:37:04 +01:00
Luca Cavanna 5ca86d541c
Move stored flag from TextSearchInfo to MappedFieldType (#62717) (#62770) 2020-09-23 09:40:34 +02:00
Albert Zaharovits b4ec821067
Fix doc-update interceptor for indices with DLS and FLS (#61516)
This fixes the protection against updates (and bulk updates) for indices with DLS
and/or FLS, when the request uses date math expressions.
2020-09-23 08:55:22 +03:00
Nhat Nguyen 663b85b98f Make keep alive optional in PointInTimeBuilder (#62720)
Remove the keepAlive parameter from the constructor of PointInTimeBuilder
as it's optional.
2020-09-22 18:52:54 -04:00
Nik Everett 7ffea4621d
Extract capture config from grok patterns up front (backport of #62706) (#62785)
This extracts the configuration for extracting values from a groked
string when building the grok expression to do two things:
1. Create a method exposing that configuration on `Grok` itself which
   will be used by `grok` flavored runtime fields.
2. Marginally speed up extracting grok values by skipping a little
   string manipulation.
2020-09-22 17:44:42 -04:00
Nik Everett fa13585fae
Fix Eclipse build (#62733) (#62786)
Eclipse was confused for two reasons:
1. `:x-pack:plugin` depended on itself.
2. `ql`, `sql`, and `eql` couldn't see some methods.

I fixed problem 1 by only adding the "depends on itself" configuration
outside of eclipse. I fixed problem 2 by making a `test` sub-project in
`ql` that contains test utilities and depending on those where possible.
2020-09-22 17:44:25 -04:00
Jay Modi cb1dc5260f
Dedicated threadpool for system index writes (#62792)
This commit adds a dedicated threadpool for system index write
operations. The dedicated resources for system index writes serve as
a means to ensure that user activity does not block important system
operations from occurring such as the management of users and roles.

Backport of #61655
2020-09-22 15:31:38 -06:00
Rory Hunter 54d97ecc60
Check glibc version (#62728)
Java 15 requires at least glibc 2.14, but we support older Linux OSs that ship with older versions. Rather than continue to ship Java 14, which is now EOL and therefore unsupported, ES will detect this situation and print a helpful message, instead of the cryptic error that would otherwise be printed. Users on older OSs will have to set JAVA_HOME instead of using the bundled JVM.
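
The version comparison itself can be sketched as follows. This only illustrates comparing a `major.minor` string numerically against 2.14; `GlibcVersionCheck` is a hypothetical helper, and how Elasticsearch actually obtains and checks the glibc version is not stated here.

```java
// Hypothetical sketch; the real check may live in a startup script instead.
final class GlibcVersionCheck {
    /** Returns true if {@code version} (e.g. "2.17") is at least {@code major.minor}. */
    static boolean isAtLeast(String version, int major, int minor) {
        String[] parts = version.trim().split("\\.");
        int actualMajor = Integer.parseInt(parts[0]);
        int actualMinor = parts.length > 1 ? Integer.parseInt(parts[1]) : 0;
        // Compare numerically: as strings, "2.9" would wrongly sort after "2.14".
        return actualMajor > major || (actualMajor == major && actualMinor >= minor);
    }
}
```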

This doesn't affect v8.0.0 because these older Linux OSs will not be supported, and all the supported ones have glibc 2.14.
2020-09-22 13:48:27 -07:00
Benjamin Trent 77bfb32635
[7.x] [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694) (#62784)
* [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694)

* [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls
 global parameters, outside of the global index, are ignored for internal callers in certain cases.
If the internal caller is adding requests via the following methods:
```
- BulkRequest#add(IndexRequest)
- BulkRequest#add(UpdateRequest)
- BulkRequest#add(DocWriteRequest)
- BulkRequest#add(DocWriteRequest[])
```
It is better to specifically set the desired parameters on the requests before they are added
to the bulk request object.

This commit addresses this issue for the ML plugin

* unmuting test
2020-09-22 15:07:08 -04:00
Marios Trivyzas 1e72144847
EQL: Remove support for `=` for comparisons (#62756) (#62775)
Since `=` is rarely used and is undocumented we remove its support for
equality comparisons keeping `==` as the only option. `=` is now only
used for assignments like in `maxspan=10m`.

Closes: #62650
(cherry picked from commit ad5ae4d887b5c2feca2d0e874d7bdf738e3fd54e)
2020-09-22 20:56:04 +02:00
James Rodewig 2366c1443b [DOCS] EQL: Note = is not an equality operator 2020-09-22 13:54:38 -04:00
Nik Everett 39a617773d
Rename grok's built-in patterns (backport of #62735) (#62765)
This reworks the code around grok's built-in patterns to name things
more like the rest of the code. It's not a big deal, but I'm just more
used to having `public static final` constants in SHOUTING_SNAKE_CASE.
2020-09-22 13:06:43 -04:00
Lisa Cawley c995e73c6d [DOCS] Add realm limitations for monitoring clusters (#62714) 2020-09-22 09:37:00 -07:00
James Rodewig 7b2010de81 [DOCS] Fix EQL search API example 2020-09-22 12:09:38 -04:00
Adam Locke 56fbfabeda
[DOCS] Add remote node as a node role (#62730) (#62776)
* Adding remote node as a node role.

* Incorporating reviewer feedback.
2020-09-22 12:02:22 -04:00
Lisa Cawley 7e97f17845 [DOCS] Add SLM security privileges (#62737) 2020-09-22 08:44:18 -07:00
Rory Hunter 3f856d1c81 Prioritise recovery of system index shards (#62640)
Closes #61660. When ordering shards for recovery, ensure system index shards are
ordered first so that their recovery will be started first.
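
The ordering rule can be sketched with a comparator that puts system index shards ahead of everything else. `ShardRecoveryOrder`, its `Shard` holder, and the secondary sort on a numeric priority are assumptions for illustration, not the real `PriorityComparator`.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; not the actual PriorityComparator.
final class ShardRecoveryOrder {
    static final class Shard {
        final String index;
        final boolean system;
        final int priority;

        Shard(String index, boolean system, int priority) {
            this.index = index;
            this.system = system;
            this.priority = priority;
        }
    }

    // Boolean.FALSE sorts before Boolean.TRUE, so negating the flag puts system shards first.
    // Ties are broken by priority, highest first (an assumed secondary rule).
    static final Comparator<Shard> SYSTEM_FIRST =
        Comparator.comparing((Shard s) -> !s.system)
            .thenComparing(Comparator.comparingInt((Shard s) -> s.priority).reversed());

    /** Returns the index names in the order their shards would be recovered. */
    static List<String> order(List<Shard> shards) {
        List<Shard> sorted = new ArrayList<>(shards);
        sorted.sort(SYSTEM_FIRST);
        List<String> names = new ArrayList<>();
        for (Shard s : sorted) {
            names.add(s.index);
        }
        return names;
    }
}
```

A system index shard therefore recovers before a user index shard even when the user index has a higher priority.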

Note that I rewrote PriorityComparatorTests to use IndexMetadata instead of its
local IndexMeta POJO.
2020-09-22 15:48:27 +01:00
James Rodewig c0e611e0a7
[DOCS] Fix typo: NamedID -> NameID (#62721) (#62767)
Co-authored-by: Greg Back <1045796+gtback@users.noreply.github.com>
2020-09-22 10:30:35 -04:00
markharwood a0df0fb074
Search - add case insensitive flag for "term" family of queries #61596 (#62661)
Backport of fe9145f

Closes #61546
2020-09-22 13:56:51 +01:00
Armin Braun 0d5250c99b
Add Trace Logging to File Restore (#62755) (#62761)
Requested by the performance team and generally potentially useful
to log each file at `TRACE` like we do for snapshot create.
2020-09-22 14:44:40 +02:00
Andrei Dan 0be89bcd7f
Mute RegressionIT.testTwoJobsWithSameRandomizeSeedUseSameTrainingSet (#62763) 2020-09-22 13:43:15 +01:00
David Kyle 31fbc6800f
[7.x] [ML] Add upgrade mappings assertions to full cluster restart tests (#62293) (#62305)
Refactors the index mapping checks in the rolling upgrade tests
and use that shared code in the full cluster restart tests.
2020-09-22 13:09:51 +01:00
Rory Hunter df2b5dd4d1 Use the Elastic Docker registry for the UBI base image (#62606)
Swap the base Docker image for UBI builds to point at Elastic's registry.
2020-09-22 12:09:07 +01:00
Amogh Mishra bc6bea5924 Remove node from cluster when node locks broken (#61400)
In #52680 we introduced a mechanism that will allow nodes to remove
themselves from the cluster if they locally determine themselves to be
unhealthy. The only check today is that their data paths are all
empirically writeable. This commit extends this check to consider a
failure of `NodeEnvironment#assertEnvIsLocked()` to be an indication of
unhealthiness.

Closes #58373
2020-09-22 10:08:41 +01:00
Armin Braun aa0dc56412
Ensure MockRepository is Unblocked on Node Close (#62711) (#62748)
`RepositoriesService#doClose` was never called which led to
mock repositories not unblocking until the `ThreadPool` interrupts
all threads. Thus stopping a node that is blocked on a mock repository operation wastes `10s`
in each test that does it (which is quite a few as it turns out).
2020-09-22 11:00:18 +02:00
Armin Braun 4bdbc39e9f
Fix testQueuedSnapshotOperationsAndBrokenRepoOnMasterFailOverMultiple (#62713) (#62747)
There are possible retries here that work out if both the snapshot and the delete
operation are retried when master shuts down and hits the unlikely case of the retried delete
executing before the retried snapshot, making both operations pass.

Closes #62686
2020-09-22 10:42:11 +02:00
Luca Cavanna 9ae29713fd
Dense vector field type minor fixes (#62631)
The dense vector field is not aggregatable although it produces fielddata through its BinaryDocValuesField. It should pass up hasDocValues set to true to its parent class in its constructor, and return isAggregatable false. Same for the sparse vector field (only in 7.x).

This may not have consequences today, but it will be important once we try to share the same exists query implementation throughout all of the mappers with #57607.
2020-09-22 10:40:51 +02:00
Christoph Büscher 593511e5c9
VersionFieldIT should register transportClientPlugins (#62734) 2020-09-22 10:10:44 +02:00
Ignacio Vera 265387f348
override needsScore() on ValueCountAggregator (#62683) (#62745) 2020-09-22 08:47:16 +02:00
Yang Wang 28503f04f7
Fix privilege requirement for CCS with Point In Time reader (#62261) (#62696)
When target indices are remote only, CCS does not require the user to have privileges on the local cluster. This PR ensures the Point-In-Time reader follows the same pattern.

Relates: #61827
2020-09-22 12:51:51 +10:00
Tim Brooks fae2f5f8e1
Log alloc description after netty processors set (#62741)
Currently we log the NettyAllocator description when the netty plugin is
created. Unfortunately, this hits certain static fields in Netty which
triggers the setting of the number of CPU processors. This conflicts
with our Elasticsearch behavior to override this based on a setting.

This commit resolves the issue by logging after the processors have been
set.
2020-09-21 19:52:51 -06:00
Yang Wang 897d2e8a02
Fix ccs permission for search with a scroll id (#62053) (#62695)
CCS with remote indices only does not require any privileges on the local cluster.
This PR ensures that search with scroll follows the permission model.
2020-09-22 11:49:40 +10:00
James Rodewig 21d5236173 [DOCS] EQL: Style fixes 2020-09-21 19:44:21 -04:00
James Rodewig 00bfc2d684
[7.x] [DOCS] EQL: Improve regsvr32 misuse explanation (#62722) (#62738)
* [DOCS] EQL: Improve regsvr32 misuse explanation (#62722)

Expands the introduction to better explain what regsvr32 misuse is and
how it works at a high level.

* [DOCS] EQL: Style fixes
2020-09-21 19:02:10 -04:00
Tim Brooks 9bf0d9105a
Change netty pool chunk size to respect G1 region (#62410)
Currently the netty pool chunk size defaults to 16MB. The number does
not play well with the G1GC which causes this to consume entire regions.
Additionally, we normally allocate arrays of size 64KB or less. This
means that Elasticsearch could handle a smaller pool chunk size to play
nicer with the G1GC.
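
The arithmetic behind this can be sketched as below. It assumes Netty's pooled allocator derives its chunk size as `pageSize << maxOrder` (8192 and 11 giving the 16MB mentioned above) and that G1 treats an allocation of at least half a region as humongous. `ChunkSizing` is a hypothetical helper, and the region sizes and the smaller `maxOrder` in the usage are example values, not the values chosen by this commit.

```java
// Hypothetical sketch of the sizing arithmetic; not Elasticsearch or Netty code.
final class ChunkSizing {
    /** Chunk size in bytes for a given page size and max order (pageSize << maxOrder). */
    static long chunkSize(int pageSize, int maxOrder) {
        return (long) pageSize << maxOrder;
    }

    /** G1 treats an allocation of at least half a region as humongous. */
    static boolean isHumongous(long allocationBytes, long regionBytes) {
        return allocationBytes >= regionBytes / 2;
    }
}
```

For example, `ChunkSizing.chunkSize(8192, 11)` is 16MiB, which is humongous for an 8MiB region, while `ChunkSizing.chunkSize(8192, 5)` is 256KiB, which is not humongous for a 1MiB region.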
2020-09-21 16:45:09 -06:00
Jim Ferenczi 1fc78d430b Fix terms aggregation ordering after the final reduce (#62732)
This commit ensures that the final order of the terms aggregations
is registered correctly after the final reduce.
This bug was introduced in #62028 which is not released yet so this PR is marked
as a non-issue.
This issue was discovered when running a terms aggregation under an auto-date
histogram. In such a case, the auto-date histogram may run multiple final reduces
to merge buckets together. This change makes sure that running multiple final reduces
doesn't create duplicates but it doesn't fix the fact that the final reduce may prune
the list of terms prematurely. This other bug is tracked separately in #62731.
2020-09-22 00:03:04 +02:00
Nhat Nguyen f9f4d87437 Remove invalid assertion in SearchService (#62675)
This assertion does not always hold because there can be a race between
`putReaderContext` and `afterIndexRemoved` when an index is deleted.

Closes #62624
2020-09-21 16:29:00 -04:00
Andrei Dan 79d0c4ed18
ILM: allow check-migration step to continue if tier setting unset (#62636) (#62724)
This allows the `check-migration` step to move past the allocation check
if the tier routing settings are manually unset.

This helps a user unblock ILM in case a tier is removed (i.e. if the warm tier
is decommissioned, this will allow users to resume the ILM policies stuck in
`check-migration` waiting for the warm nodes to become available and the managed
index to allocate; the index can then allocate on the other available tiers).

(cherry picked from commit d7a1eaa7f51d0972d10c0df1d3cd77d6b755dd41)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-09-21 20:40:01 +01:00
Ryan Ernst ee835ee74a
Copy gradle/system jdk to local dir for packaging tests (#61436)
The distro tests rely on two jdks, pulled in by the jdk download plugin.
The move to artifact transforms results in the path to the extracted
jdks existing under the gradle cache dir, which is outside the vagrant
mount of the elasticsearch project. This commit creates a local copy
within the `qa:os` project that the packaging tests use.

closes #61138
2020-09-21 12:19:09 -07:00