Commit Graph

53981 Commits

Author SHA1 Message Date
Dimitris Athanasiou 69e72656fa
[7.x][ML] Reset reindexing progress when DFA job resumes with incomplete reindexing (#62772) (#62816)
This fixes reindexing progress in the scenario when a DFA job that had not finished
reindexing is resumed (either because the user called stop and start or because the
job was reassigned in the middle of reindexing). Before the fix reindexing progress
stays to the value it had reached before until it surpasses that value.

When we resume a data frame analytics job we want to preserve reindexing progress
and reset all other phases. Except for when reindexing was not completed.
In that case we are deleting the destination index and starting reindexing
from scratch. Thus we need to reset reindexing progress too.

Backport of #62772
2020-09-23 14:09:04 +03:00
Christoph Büscher 054a950ceb Align version field plugin naming (#62757)
To better align the plugin naming with other mapper plugins under x-pack (e.g.
mapper-flattened) this PR changes the plugin name and the containing directory
to "mapper-version"
2020-09-23 11:50:15 +02:00
Christoph Büscher 29074e7055
Add case insensitive prefix and wildcard to 'version' field (#62754) (#62782)
This change adds support for the recently introduced case insensitivity flag for
wildcard and prefix queries. Since version field values are encoded differently we
need to adapt our own AutomatonQuery variation to add both cases if case insensitivity
is turned on.
2020-09-23 11:48:34 +02:00
Ignacio Vera 81645ec2cc
nextSetBit should check if the underlaying array contains the current word (#62805) (#62812)
This is a recent addition and it is missing a check as the underlaying array can be smaller that the numBits capacity.
2020-09-23 11:17:26 +02:00
Luca Cavanna 862fab06d3
Share same existsQuery impl throughout mappers (#57607)
Most of our field types have the same implementation for their `existsQuery` method which relies on doc_values if present, otherwise it queries norms if available or uses a term query against the _field_names meta field. This standard implementation is repeated in many different mappers.

There are field types that only query doc_values, because they always have them, and field types that always query _field_names, because they never have norms nor doc_values. We could apply the same standard logic to all of these field types as `MappedFieldType` has the knowledge about what data structures are available.

This commit introduces a standard implementation that does the right thing depending on the data structure that is available. With that only field types that require a different behaviour need to override the existsQuery method.

At the same time, this no longer forces subclasses to override `existsQuery`, which could be forgotten when needed. To address this we introduced a new test method in `MapperTestCase` that verifies the `existsQuery` being generated and its consistency with the available data structures.
2020-09-23 11:00:53 +02:00
Przemko Robakowski 005e0bffaf
[7.x] Make for each processor resistant to field modification (#62791) (#62807)
* Make for each processor resistant to field modification  (#62791)

This change provides consistent view of field that foreach processor is iterating over. That prevents it to go into infinite loop and put great pressure on the cluster.

Closes #62790

* fix compilation
2020-09-23 10:46:00 +02:00
David Kyle bc34ecc581
[ML] Mute annotations index upgrade mapping test (#62814)
For #61908
2020-09-23 09:37:04 +01:00
Luca Cavanna 5ca86d541c
Move stored flag from TextSearchInfo to MappedFieldType (#62717) (#62770) 2020-09-23 09:40:34 +02:00
Albert Zaharovits b4ec821067
Fix doc-update interceptor for indices with DLS and FLS (#61516)
This fixes the protection against updates (and bulk updates) for indices with DLS
and/or FLS, when the request uses date math expressions.
2020-09-23 08:55:22 +03:00
Nhat Nguyen 663b85b98f Make keep alive optional in PointInTimeBuilder (#62720)
Remove the keepAlive parameter from the constructor of PointInTimeBuilder
as it's optional.
2020-09-22 18:52:54 -04:00
Nik Everett 7ffea4621d
Extract capture config from grok patterns up front (backport of #62706) (#62785)
This extracts the configuration for extracting values from a groked
string when building the grok expression to do two things:
1. Create a method exposing that configuration on `Grok` itself which
   will be used grok `grok` flavored runtime fields.
2. Marginally speed up extracting grok values by skipping a little
   string manipulation.
2020-09-22 17:44:42 -04:00
Nik Everett fa13585fae
Fix Eclipse build (#62733) (#62786)
Eclipse was confused for two reasons:
1. `:x-pack:plugin` depended on itself.
2. `ql`, `sql`, and `eql` couldn't see some methods.

I fixed problem 1 by only adding the "depends on itself" configuration
outside of eclipse. I fixed problem 2 by making a `test` sub-project in
`ql` that contains test utilities and depending on those where possible.
2020-09-22 17:44:25 -04:00
Jay Modi cb1dc5260f
Dedicated threadpool for system index writes (#62792)
This commit adds a dedicated threadpool for system index write
operations. The dedicated resources for system index writes serves as
a means to ensure that user activity does not block important system
operations from occurring such as the management of users and roles.

Backport of #61655
2020-09-22 15:31:38 -06:00
Rory Hunter 54d97ecc60
Check glibc version (#62728)
Java 15 requires at last glibc 2.14, but we support older Linux OSs that ship with older versions. Rather than continue to ship Java 14, which is now EOL and therefore unsupported, ES will detect this situation and print a helpful message, instead of the cryptic error that would otherwise be printed. Users on older OSs will have to set JAVA_HOME instead of using the bundled JVM.

This doesn't affect v8.0.0 because these older Linux OSs will not be supported, and all the supported ones have glibc 2.14.
2020-09-22 13:48:27 -07:00
Benjamin Trent 77bfb32635
[7.x] [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694) (#62784)
* [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls (#62694)

* [ML] changing to not use global bulk indexing parameters in conjunction with add(object) calls
 global parameters, outside of the global index, are ignored for internal callers in certain cases.
If the interal caller is adding requests via the following methods:
```
- BulkRequest#add(IndexRequest)
- BulkRequest#add(UpdateRequest)
- BulkRequest#add(DocWriteRequest)
- BulkRequest#add(DocWriteRequest[])
```
It is better to specifically set the desired parameters on the requests before they are added
to the bulk request object.

This commit addresses this issue for the ML plugin

* unmuting test
2020-09-22 15:07:08 -04:00
Marios Trivyzas 1e72144847
EQL: Remove support for `=` for comparisons (#62756) (#62775)
Since `=` is rarely used and is undocumented we its support for
equality comparisons keeping `==` as the only option. `=` is now only
used for assignments like in `maxspan=10m`.

Closes: #62650
(cherry picked from commit ad5ae4d887b5c2feca2d0e874d7bdf738e3fd54e)
2020-09-22 20:56:04 +02:00
James Rodewig 2366c1443b [DOCS] EQL: Note = is not an equality operator 2020-09-22 13:54:38 -04:00
Nik Everett 39a617773d
Raname grok's built-in patterns (backport of #62735) (#62765)
This reworks the code around grok's built-in patterns to name things
more like the rest of the code. Its not a big deal, but I'm just more
used to having `public static final` constants in SHOUTING_SNAKE_CASE.
2020-09-22 13:06:43 -04:00
Lisa Cawley c995e73c6d [DOCS] Add realm limitations for monitoring clusters (#62714) 2020-09-22 09:37:00 -07:00
James Rodewig 7b2010de81 [DOCS] Fix EQL search API example 2020-09-22 12:09:38 -04:00
Adam Locke 56fbfabeda
[DOCS] Add remote node as a node role (#62730) (#62776)
* Adding remote node as a node role.

* Incorporating reviewer feedback.
2020-09-22 12:02:22 -04:00
Lisa Cawley 7e97f17845 [DOCS] Add SLM security privileges (#62737) 2020-09-22 08:44:18 -07:00
Rory Hunter 3f856d1c81 Prioritise recovery of system index shards (#62640)
Closes #61660. When ordering shard for recovery, ensure system index shards are
ordered first so that their recovery will be started first.

Note that I rewrote PriorityComparatorTests to use IndexMetadata instead of its
local IndexMeta POJO.
2020-09-22 15:48:27 +01:00
James Rodewig c0e611e0a7
[DOCS] Fix typo: NamedID -> NameID (#62721) (#62767)
Co-authored-by: Greg Back <1045796+gtback@users.noreply.github.com>
2020-09-22 10:30:35 -04:00
markharwood a0df0fb074
Search - add case insensitive flag for "term" family of queries #61596 (#62661)
Backport of fe9145f

Closes #61546
2020-09-22 13:56:51 +01:00
Armin Braun 0d5250c99b
Add Trace Logging to File Restore (#62755) (#62761)
Requested by the performance team and generally potentially useful
to log each file at `TRACE` like we do for snapshot create.
2020-09-22 14:44:40 +02:00
Andrei Dan 0be89bcd7f
Mute RegressionIT.testTwoJobsWithSameRandomizeSeedUseSameTrainingSet (#62763) 2020-09-22 13:43:15 +01:00
David Kyle 31fbc6800f
[7.x] [ML] Add upgrade mappings assertions to full cluster restart tests (#62293) (#62305)
Refactors the index mapping checks in the rolling upgrade tests
and use that shared code in the full cluster restart tests.
2020-09-22 13:09:51 +01:00
Rory Hunter df2b5dd4d1 Use the Elastic Docker registry for the UBI base image (#62606)
Swap the base Docker image for UBI builds to point at Elastic's registry.
2020-09-22 12:09:07 +01:00
Amogh Mishra bc6bea5924 Remove node from cluster when node locks broken (#61400)
In #52680 we introduced a mechanism that will allow nodes to remove
themselves from the cluster if they locally determine themselves to be
unhealthy. The only check today is that their data paths are all
empirically writeable. This commit extends this check to consider a
failure of `NodeEnvironment#assertEnvIsLocked()` to be an indication of
unhealthiness.

Closes #58373
2020-09-22 10:08:41 +01:00
Armin Braun aa0dc56412
Ensure MockRepository is Unblocked on Node Close (#62711) (#62748)
`RepositoriesService#doClose` was never called which lead to
mock repositories not unblocking until the `ThreadPool` interrupts
all threads. Thus stopping a node that is blocked on a mock repository operation wastes `10s`
in each test that does it (which is quite a few as it turns out).
2020-09-22 11:00:18 +02:00
Armin Braun 4bdbc39e9f
Fix testQueuedSnapshotOperationsAndBrokenRepoOnMasterFailOverMultiple (#62713) (#62747)
There's possible retries here that work out if both the snapshot and the delete
operation are retried when master shuts down and hits the unlikely case of the retried delete
executing before the retried snapshot, making both operations pass.

Closes #62686
2020-09-22 10:42:11 +02:00
Luca Cavanna 9ae29713fd
Dense vector field type minor fixes (#62631)
The dense vector field is not aggregatable although it produces fielddata through its BinaryDocValuesField. It should pass up hasDocValues set to true to its parent class in its constructor, and return isAggregatable false. Same for the sparse vector field (only in 7.x).

This may not have consequences today, but it will be important once we try to share the same exists query implementation throughout all of the mappers with #57607.
2020-09-22 10:40:51 +02:00
Christoph Büscher 593511e5c9
VersionFieldIT should register transportClientPlugins (#62734) 2020-09-22 10:10:44 +02:00
Ignacio Vera 265387f348
override needsScore() on ValueCountAggregator (#62683) (#62745) 2020-09-22 08:47:16 +02:00
Yang Wang 28503f04f7
Fix privilege requirement for CCS with Point In Time reader (#62261) (#62696)
When target indices are remote only, CCS does not require user to have privileges on the local cluster. This PR ensure Point-In-Time reader follows the same pattern.

Relates: #61827
2020-09-22 12:51:51 +10:00
Tim Brooks fae2f5f8e1
Log alloc description after netty processors set (#62741)
Currently we log the NettyAllocator description when the netty plugin is
created. Unfortunately, this hits certain static fields in Netty which
triggers the settings of the number of CPU processors. This conflicts
with out Elasticsearch behavior to override this based on a setting.

This commit resolves the issue by logging after the processors have been
set.
2020-09-21 19:52:51 -06:00
Yang Wang 897d2e8a02
Fix ccs permission for search with a scroll id (#62053) (#62695)
CCS with remote indices only does not require any privileges on the local cluster.
This PR ensures that search with scroll follow the permission model.
2020-09-22 11:49:40 +10:00
James Rodewig 21d5236173 [DOCS] EQL: Style fixes 2020-09-21 19:44:21 -04:00
James Rodewig 00bfc2d684
[7.x] [DOCS] EQL: Improve regsvr32 misuse explanation (#62722) (#62738)
* [DOCS] EQL: Improve regsvr32 misuse explanation (#62722)

Expands the introduction to better explain what regsvr32 misuse is and
how it works at a high level.

* [DOCS] EQL: Style fixes
2020-09-21 19:02:10 -04:00
Tim Brooks 9bf0d9105a
Change netty pool chunk size to respect G1 region (#62410)
Currently the netty pool chunk size defaults to 16MB. The number does
not play well with the G1GC which causes this to consume entire regions.
Additionally, we normally allocated arrays of size 64KB or less. This
means that Elasticsearch could handle a smaller pool chunk size to play
nicer with the G1GC.
2020-09-21 16:45:09 -06:00
Jim Ferenczi 1fc78d430b Fix terms aggregation ordering after the final reduce (#62732)
This commit ensures that the final order of the terms aggregations
is registered correctly after the final reduce.
This bug was introduced in #62028 which is not released yet so this PR is marked
as a non-issue.
This issue was discovered when running a terms aggregation under an auto-date
histogram. In such a case, the auto-date histogram may run multiple final reduce
to merge buckets together. This change makes sure that running multiple final reduces
doesn't create duplicates but it doesn't fix the fact that the final reduce may prune
the list of terms prematurely. This other bug is tracked separately in #62731.
2020-09-22 00:03:04 +02:00
Nhat Nguyen f9f4d87437 Remove invalid assertion in SearchService (#62675)
This assertion does not always hold because there can be a race between
`putReaderContext` and `afterIndexRemoved` when an index is deleted.

Closes #62624
2020-09-21 16:29:00 -04:00
Andrei Dan 79d0c4ed18
ILM: allow check-migration step to continue if tier setting unset (#62636) (#62724)
This allows the `check-migration` step to move past the allocation check
if the tier routing settings are manually unset.

This helps a user unblock ILM in case a tier is removed (ie. if the warm tier
is decommissioned this will allow users to resume the ILM policies stuck in
`check-migration` waiting for the warm nodes to become available and the managed
index to allocate. this allows the index to allocate on the other available tiers)

(cherry picked from commit d7a1eaa7f51d0972d10c0df1d3cd77d6b755dd41)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-09-21 20:40:01 +01:00
Ryan Ernst ee835ee74a
Copy gradle/system jdk to local dir for packaging tests (#61436)
The distro tests rely on two jdks, pulled in by the jdk download plugin.
The move the artifact transforms result in the path to the extracted
jdks existing under the gradle cache dir, which is outside the vagrant
mount of the elasticsearch project. This commit creates a local copy
within the `qa:os` project that the packaging tests use.

closes #61138
2020-09-21 12:19:09 -07:00
Seth Michael Larson ae24dfc4f0
[7.x] Migrate Python documentation to elasticsearch-py
Backport of PR #62710
2020-09-21 14:12:20 -05:00
Marios Trivyzas 1f612cccbb
SQL: Implement FORMAT function (#55454) (#62701)
Implement FORMAT according to the SQL Server spec: https://docs.microsoft.com/en-us/sql/t-sql/functions/format-transact-sql?view=sql-server-ver15#ExampleD by translating to the java.time patterns used in DATETIME_FORMAT.

Closes: #54965

Co-authored-by: Marios Trivyzas <matriv@users.noreply.github.com>
Co-authored-by: Bogdan Pintea <bogdan.pintea@elastic.co>
Co-authored-by: Andrei Stefan <astefan@users.noreply.github.com>
(cherry picked from commit da511f4e033db6e8a6aa2a54b23e906b5e026845)
2020-09-21 19:22:04 +02:00
Ignacio Vera cadd5dc53f
Fix bug when initializing HyperLogLogPlusPlusSparse (#62602) (#62702)
This is a follow up of #62480 where we are oversizing one array when initialising. In addition it prevents a possible CircuitBreaker leak during initialisation.
2020-09-21 17:30:40 +02:00
Armin Braun 13e28b85ff
Speed up RepositoryData Serialization (#62684) (#62703)
Make serializing `RepositoryData` a little faster and split up/document the code for it a little
as well given how massive this method has gotten at this point.
2020-09-21 17:29:56 +02:00
Dan Hermann a06339ffae
Fix NPE when deleting multiple backing indices on a data stream (#62274) (#62708) 2020-09-21 10:26:47 -05:00