Commit Graph

6062 Commits

Author SHA1 Message Date
Julie Tibshirani 997c73ec17
Correct how field retrieval handles multifields and copy_to. (#61391)
Before when a value was copied to a field through a parent field or `copy_to`,
we parsed it using the `FieldMapper` from the source field. Instead we should
parse it using the target `FieldMapper`. This ensures that we apply the
appropriate mapping type and options to the copied value.

To implement the fix cleanly, this PR refactors the value parsing strategy. Now
instead of looking up values directly, field mappers produce a helper object
`ValueFetcher`. The value fetchers are responsible for almost all aspects of
fetching, including looking up the right paths in the _source.

The PR is fairly big but each commit can be reviewed individually.

Fixes #61033.
2020-08-20 15:53:35 -07:00
Alan Woodward a3a0c63ccf
Convert NumberFieldMapper to parametrized form (#61092) (#61376)
In addition, this commit converts ScaledFloatFieldMapper as it was relying
on a number of static values taken from NumberFieldMapper that had changed
or been removed.
2020-08-20 16:43:26 +01:00
Nik Everett 9789e6d154
Migrate some field mapper tests to ESTestCase (#61301) (#61346)
This switches a few tests for field mappers from `ESSingleNodeTestCase`
to `ESTestCase` because, in general, we prefer to avoid
`ESSingleNodeTestCase` when we can because it is slow and "big". "Big"
here means that it pulls in an entire node, making it difficult to
reason about what you are testing.
2020-08-19 15:43:49 -04:00
Francisco Fernández Castaño 89a7f32100
Fix SearchableSnapshotDirectoryTests#testRecoveryStateIsKeptOpenAfterPreWarmFailure (#61343)
The test didn't take into account the case where 0 documents are
indexed into the shard, meaning that files aren't loaded during
the pre-warm phase. The test injects FileSystem failures, if
the snapshot doesn't contain any files, pre-warm doesn't read
any files and the recovery completes normally.

Closes #61295
Backport of #61317
2020-08-19 19:28:47 +02:00
Andrei Stefan a214d7902a
EQL: make endsWith function use a wildcard ES query wherever possible (#61160) (#61320)
(cherry picked from commit 55fdb7e2c74d4fae86ec40686091ecba831caeaf)
2020-08-19 14:17:55 +03:00
Andrei Stefan a6c0670a14
EQL: make stringContains function use a wildcard ES query (#61189) (#61313)
(cherry picked from commit 039a7d1c68f6f1ed0e7e6cfb86be6b04eec8051c)
2020-08-19 12:40:48 +03:00
Martijn van Groningen d4a8172f8e
Disable ilm history in data streams rest qa module. (#61312)
Backport of #61291 to 7.x branch.

Closes #61273
2020-08-19 10:34:26 +02:00
Andrei Stefan 93abbb9057
Add data streams wildcard pattern yml test (#61269) (#61280)
(cherry picked from commit e13a365eeb6d8c6a7c9a91f94f0e8e78e3fe4773)
2020-08-18 19:38:07 +03:00
Andrei Stefan 5de0f19cc3
EQL: Return sequence join keys in the original type (#61268) (#61282)
(cherry picked from commit d54957d61faa0d502387656e3cace594017b6ea0)
2020-08-18 19:37:15 +03:00
Martijn van Groningen cbf60f6c5e
Add tests that simulate new indexing strategy upgrade procedure. (#61263)
Backport of #61082 to 7.x branch.

Closes #58251
2020-08-18 17:02:29 +02:00
Andrei Stefan ad627c7eab
Introduce ordering in the constant_keyword test for better predictibility. (#61248) (#61252)
(cherry picked from commit 69193f9de8178dbaa1d8467f1686b100dd2b161c)
2020-08-18 12:17:15 +03:00
Mark Tozzi db1df6cc30
[7.x] Remove a bunch of type boilerplate from Aggs (#60852) (#61031) 2020-08-17 12:13:05 -04:00
James Rodewig 60876a0e32
[DOCS] Replace Wikipedia links with attribute (#61171) (#61209) 2020-08-17 11:27:04 -04:00
Andrei Stefan db8788e5a2
QL: wildcard field type support (#58062) (#61205)
(cherry picked from commit c874e6cdd3e051ce599b50c18642de038b84105f)
2020-08-17 18:24:32 +03:00
Andrei Stefan 90e116738e
QL: add filtering query dsl support to IndexResolver (#60514) (#61200)
(cherry picked from commit 7b3635d796be26af9f87d19963a8ed4ab4bbf13f)
2020-08-17 17:59:58 +03:00
Nik Everett 1b7bbafd81
Add method to make random DateFormatter pattern (backport of #60613) (#61213)
Adds a method to make a random date `DateFormatter` pattern. We expect
this'll be useful for runtime fields to compate their formatting with
the standard date field.
2020-08-17 10:57:52 -04:00
David Kyle ba89af544f
[7.x] Respect ML upgrade mode in TrainedModelStatsService (#61143) (#61187)
When in upgrade mode the ml stats service should not write to the stats index.
2020-08-17 11:09:25 +01:00
Benjamin Trent 43fc6c34bc
Muting analytics integration tests for change new native output model_metadata (#61158)
relates to elastic/ml-cpp#1456
2020-08-14 11:45:35 -04:00
Benjamin Trent 8f302282f4
[ML] adds new feature_processors field for data frame analytics (#60528) (#61148)
feature_processors allow users to create custom features from
individual document fields.

These `feature_processors` are the same object as the trained model's pre_processors.

They are passed to the native process and the native process then appends them to the
pre_processor array in the inference model.

closes https://github.com/elastic/elasticsearch/issues/59327
2020-08-14 10:32:20 -04:00
David Roberts d1b60269f4
[ML] Ensure annotations index mappings are up to date (#61142)
When the ML annotations index was first added, only the
ML UI wrote to it, so the code to create it was designed
with this in mind.  Now the ML backend also creates
annotations, and those mappings can change between
versions.

In this change:

1. The code that runs on the master node to create the
   annotations index if it doesn't exist but another ML
   index does also now ensures the mappings are up-to-date.
   This is good enough for the ML UI's use of the
   annotations index, because the upgrade order rules say
   that the whole Elasticsearch cluster must be upgraded
   prior to Kibana, so the master node should be on the
   newer version before Kibana tries to write an
   annotation with the new fields.
2. We now also check whether the annotations index exists
   with the correct mappings before starting an autodetect
   process on a node.  This is necessary because ML nodes
   can be upgraded before the master node, so could write
   an annotation with the new fields before the master node
   knows about the new fields.

Backport of #61107
2020-08-14 13:51:04 +01:00
Benjamin Trent 7c3bfb9437
[ML] updating feature_importance results mapping (#61104) (#61144)
This updates the feature_importance mapping change from elastic/ml-cpp#1387
2020-08-14 08:43:10 -04:00
Nhat Nguyen 328c86a4ec Increase timeout in PrimaryFollowerAllocationIT
A slow CI can take more than 10 seconds to relocate shards on the follower.
2020-08-13 14:41:32 -04:00
Dan Hermann c17839c255
Add warning handler to resolve test failure (#60427) (#61102) 2020-08-13 10:37:08 -05:00
Benjamin Trent a497263c47
[ML] ensure config index is updated before clearing finished_time (#61064) (#61085)
When a user upgrades between versions, they may stop their ML jobs.

Then when the upgrade is complete, they will want to open the jobs again.

But, when opening a job, we attempt to clear out the jobs finished_time. If the job configuration has adjusted between the versions (i.e. added a new field), it will dynamically update the .ml-config index.

We should instead manually change the mapping to be the updated version.
2020-08-13 08:12:10 -04:00
David Turner dd7410d8c2 Disable rebalancing in searchable snapshots tests (#61068)
Fixes a test failure in which we allocated some shards and then
relocated them elsewhere, invalidating an assertion about the recovery
statistics which assumed that the shards stayed where they were
originally allocated.

Closes #61067.
2020-08-13 09:08:27 +01:00
Lee Hinman e3df64a429
[7.x] Add data tiers (hot, warm, cold, frozen) as custom node roles (#60994) (#61045)
This commit adds the `data_hot`, `data_warm`, `data_cold`, and `data_frozen` node roles to the
x-pack plugin. These roles are intended to be the base for the formalization of data tiers in
Elasticsearch.

These roles all act as data nodes (meaning shards can be allocated to them). Nodes with the existing
`data` role acts as though they have all of the roles configured (it is a hot, warm, cold, and
frozen node).

This also includes a custom `AllocationDecider` that allows the user to configure the following
settings on a cluster level:
- `cluster.routing.allocation.require._tier`
- `cluster.routing.allocation.include._tier`
- `cluster.routing.allocation.exclude._tier`

And in index settings:
- `index.routing.allocation.require._tier`
- `index.routing.allocation.include._tier`
- `index.routing.allocation.exclude._tier`

Relates to #60848
2020-08-12 11:06:23 -06:00
Andrei Dan 32173a82c8
ILM: add frozen phase (#60983) (#61035)
This adds a frozen phase to ILM that will allow the execution of the
set_priority, unfollow, allocate, freeze and searchable_snapshot actions.

The frozen phase will be executed after the cold and before the delete phase.

(cherry picked from commit 6d0148001c3481290ed7e60dab588e0191346864)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-08-12 16:36:27 +01:00
Yannick Welsch 6644f2283d Do not access snapshot repo on dedicated voting-only master node (#61016)
Today a snapshot repository verification ensures that all master-eligible and data nodes have write access to the
snapshot repository (and can see each other's data) since taking a snapshot requires data nodes and the currently
elected master to write to the repository. However, a dedicated voting-only master-eligible node is not a data node and
will never be the elected master so we should not require it to have write access to the repository.

Closes #59649
2020-08-12 16:56:45 +02:00
Benjamin Trent 4275a715c9
[ML] adjusting inference processor to support foreach usage (#60915) (#61022)
`foreach` processors store information within the `_ingest` metadata object.

This commit adds the contents of the `_ingest` metadata (if it is not empty).

And will append new inference results if the result field already exists.

This allows a `foreach` to execute and multiple inference results being written to the same result field.

closes https://github.com/elastic/elasticsearch/issues/60867
2020-08-12 08:34:18 -04:00
markharwood 66098e0bf4
Search fix: query_string regex/wildcard searches not working on wildcard fields (#60959) (#61010)
The Query string parser was not delegating the construction of wildcard/regex queries to the underlying field type.
The wildcard field has special data structures and queries that operate on them so cannot rely on the basic regex/wildcard queries that were being used for other fields.

Closes #60957
2020-08-12 10:44:52 +01:00
Armin Braun 32423a486d
Simplify and Speed up some Compression Usage (#60953) (#61008)
Use thread-local buffers and deflater and inflater instances to speed up
compressing and decompressing from in-memory bytes.
Not manually invoking `end()` on these should be safe since their off-heap memory
will eventually be reclaimed by the finalizer thread which should not be an issue for thread-locals
that are not instantiated at a high frequency.
This significantly reduces the amount of byte copying and object creation relative to the previous approach
which had to create a fresh temporary buffer (that was then resized multiple times during operations), copied
bytes out of that buffer to a freshly allocated `byte[]`, used 4k stream buffers needlessly when working with
bytes that are already in arrays (`writeTo` handles efficient writing to the compression logic now) etc.

Relates #57284 which should be helped by this change to some degree.
Also, I expect this change to speed up mapping/template updates a little as those make heavy use of these
code paths.
2020-08-12 11:06:23 +02:00
Andrei Dan 35423a75af
Tests: don't fail if ILM executed the action already (#60916) (#60982)
(cherry picked from commit 8c970ad20f4f55a9c0d6a256aa643ea037281e75)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-08-12 09:04:04 +01:00
Dimitris Athanasiou 2e18c0f2ac
[7.x][ML] Audit force stopping data frame analytics (#60973) (#61004)
Audits a message when a data frame analytics job is force stopped.

Backport of #60973
2020-08-12 07:45:26 +03:00
Yang Wang c7b0290256
Mute kerberos tests for jdk 8u[262,271) (#60995)
The Kerberos bug (JDK-8246193) is introduced in JDK 8u262 and fixed in 8u271.
This PR mute for any possible releases between these two versions.
2020-08-12 11:50:48 +10:00
Nhat Nguyen ceaa28e97b Increase timeout in testFollowIndexWithConcurrentMappingChanges (#60534)
The test failed because the leader was taking a lot of CPUs to process
many mapping updates. This commit reduces the mapping updates, increases
timeout, and adds more debug info.

Closes #59832
2020-08-11 17:03:22 -04:00
Nhat Nguyen bf7eecf1dc Fix synchronization in ShardFollowNodeTask (#60490)
The leader mapping, settings, and aliases versions in a shard follow-task 
are updated without proper synchronization and can go backward.
2020-08-11 14:52:52 -04:00
James Rodewig 929f1cc9f9
[DOCS] Remove search request body page (#60972) (#60977) 2020-08-11 13:04:07 -04:00
Francisco Fernández Castaño d544528c7b
Increase information on assertRecoveryStats assertion (#60960)
Backport of #60952
2020-08-11 15:30:59 +02:00
Dimitris Athanasiou 6062672148
[7.x][ML] Monitor reindex response in DF analytics (#60911) (#60958)
Examines the reindex response in order to report potential
problems that occurred during the reindexing phase of
data frame analytics jobs.

Backport of #60911
2020-08-11 16:17:37 +03:00
Mark Tozzi ab8518fb5b
[7.x] Extensibility for Composite Agg #59648 (#60842) 2020-08-11 09:14:33 -04:00
Dan Hermann 839c6cdfc0
Un-mute data stream REST test (#60120) (#60939) 2020-08-11 08:10:04 -05:00
David Kyle 18a65c5b9a
DFA Get Stats can return multiple responses if more than one error occurs (#60950)
If the search for get stats with multiple job Ids fails the listener is called for each failure. 
This change waits for all responses then returns the first error if there was one.
2020-08-11 10:28:05 +01:00
Alan Woodward 54279212cf
Make MetadataFieldMapper extend ParametrizedFieldMapper (#59847) (#60924)
This commit cuts over all metadata field mappers to parametrized format.
2020-08-11 09:02:28 +01:00
Benjamin Trent 66b3e89482
[ML] enable logging for test failures (#60902) (#60910) 2020-08-10 12:36:30 -04:00
Francisco Fernández Castaño 2a4fd8329b
Avoid a race condition while waiting for pre warm to finish on SearchableSnapshotDirectoryTests (#60906)
Backport of #60885. Closes #60813
2020-08-10 17:29:16 +02:00
Jim Ferenczi f30f1f04e2
Replace AggregatorTestCase#search with AggregatorTestCase#searchAndReduce (#60816)
This commit removes the ability to test the top level result of an aggregator
before it runs the final reduce. All aggregator tests that use AggregatorTestCase#search
are rewritten with AggregatorTestCase#searchAndReduce in order to ensure that we test
the final output (the one sent to the end user) rather than an intermediary result
that could be different.
This change also removes spurious commits triggered on top of a random index writer.
These commits slow down the tests and are redundant with the commits that the
random index writer performs.
2020-08-10 17:23:00 +02:00
David Roberts dd02e9f31a [TEST] Mute SearchableSnapshotActionIT testSearchableSnapshotForceMergesIndexToOneSegment (#60904)
Due to https://github.com/elastic/elasticsearch/issues/60901
2020-08-10 15:25:39 +01:00
Henning Andersen a155315ceb
Autoscaling decider and decision service (#59005) (#60884)
Split the autoscaling decider into a service and configuration
in order to enable having additional context information available
in the service. Added AutoscalingDeciderContext holding generic
information all deciders are expected to need. Implemented GET
_autoscaling/decision
2020-08-10 15:28:52 +02:00
Andrei Dan 235e5ed3ea
[7.x] ILM: add force-merge step to searchable snapshots action (#60819) (#60882)
This adds a force-merge step to the searchable snapshot action, enabled by default,
but parameterizable using the `force_merge-index" optional boolean.

eg.
```
PUT _ilm/policy/my_policy
{
  "policy": {
    "phases": {
      "cold": {
        "actions": {
          "searchable_snapshot" : {
            "snapshot_repository" : "backing_repo",
            "force_merge_index": true
          }
        }
      }
    }
  }
}
```

(cherry picked from commit d0a17b2d35f1b083b574246bdbf3e1929471a4a9)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-08-10 13:45:11 +01:00
Martijn van Groningen 64bb082f9b
Improve error message for non append-only writes that target data stream (#60874)
Backport of #60809 to 7.x branch.

Closes #60581
2020-08-10 13:18:59 +02:00