Commit Graph

5813 Commits

Author SHA1 Message Date
Przemko Robakowski 81d8cfd57f
Avoid accidental cleanups in indices cleaner tests (#67338) (#67370)
When tests in AbstractIndicesCleanerTestCase run at 1AM they can clash with scheduled clean up in CleanerService which leads to rare, non-reproducible failures.
This change fixes it by setting retention period in CleanerService to max possible time, so nothing is ever deleted by it.

Closes #64386
2021-01-12 18:49:25 +01:00
Lee Hinman b2bfa9ed4f
[7.10] Fix ignoring existed cluster settings in DataTierAllocationDecider (#67137) (4295d489) (#67180)
Relates to #67133.
Seem to #65037.

The main changes of this PR are:

    Modify the construction method of DataTierAllocationDecider, add a param settings like FilterAllocationDecider.
    Create DataTierAllocationDecider in the main method of DataTierMigrationRoutedStep and SetSingleNodeAllocateStep, and the DataTierAllocationDecider is constructed using the cluster settings in the cluster metadata, so the cluster level _tier filters can be seen when executing the steps.
    Add some tests for the change.

Co-authored-by: bellengao <gbl_long@163.com>
2021-01-07 11:27:22 -07:00
Marios Trivyzas a2a389afdb
EQL: [Tests] Fix issues for SequenceSpecTests (#67104) (#67143)
- Use `argumentFormatting` starting with 1 as 0 is problematic for Java 16
- Actually enable the tests since the listener that checks the results
  was not called and nothing was asserted.
- Disable one test that fails which is tracked with: #67102

Fixes: #66778
(cherry picked from commit a4199af58b8c5ab4757a45248672e0233978b208)
2021-01-07 11:56:57 +01:00
Lee Hinman e9b798bdb1
[7.10] Make FilterAllocationDecider totally ignore tier-based allocation settings (#67019) (#67034)
Previously we treated attribute filtering for _tier-prefixed attributes a pass-through, meaning
that they were essentially always treated as matching in DiscoveryNodeFilters.match, however, for
exclude settings, this meant that the node was considered to match the node if a _tier* filter was
specified.

This commit prunes these attributes from the DiscoveryNodeFilters when considering the filters for
FilterAllocationDecider so that they are only considered in DataTierAllocationDecider.

Resolves #66679
2021-01-05 12:42:34 -07:00
Ioannis Kakavas 2fd049cda7
Mute JDBC client tests with SSL in FIPS mode (#66566) (#66747)
JDBC client can only be configured for SSL with keystores,
but we can't use JKS/PKCS12 keystores in FIPS 140-2 mode.

Resolves: #66095
2020-12-22 17:16:33 +02:00
Andrei Stefan 12ffe4a4b3
QL: handle IP type fields extraction with ignore_malformed property (#66622) (#66686)
Return null for any field that is present in the _ignored section of the
response, not only numerics and IPs.

(cherry picked from commit 106719f6af1eb2f06396161143a5f0185c5d1b54)
2020-12-21 16:07:03 +02:00
Andrei Dan 2620725297
Fix MANAGE_IDX_TEMPLATE privilege to allow `component_template/*` (#66514) (#66581)
(cherry picked from commit bcc28e0ab8e6883e14b23f93f428dee03b377a1d)
Signed-off-by: Andrei Dan <andrei.dan@elastic.co>
2020-12-18 11:17:16 +00:00
Albert Zaharovits 480561dbc3
Store and use only internal security headers (#66365)
For async searches (EQL included) the client's request headers were
erroneously stored in the .tasks index. This might expose the requesting
client's HTTP Authorization header. This PR fixes that by employing the
usual approach to store only the security-internal headers, which carry
the authentication result, instead of the original Authorization header,
which is commonly utilized to redo authentication for scheduled tasks.
2020-12-17 23:40:55 +02:00
Costin Leau 4cb3ee5b4e EQL: Fix early trimming of in-flight data (#66493)
Rework trimToLast to take into account an ordinal for last trimming so
instead of keeping the last entry in a stage, it keeps the last entry
before the given ordinal.
This takes care of the case where a dense stage that requires several
passes does not discard valid data from a previous sparse stage that go
beyond the current stage point.

(cherry picked from commit 4f55749072b39f89822bdd52c67998f7bed890a9)
(cherry picked from commit 6b61dfead88a144c6e85e384d47a24f0c1480c6b)
(cherry picked from commit cece81b5dee88b18e3e7ea189fc342ef53ea19f2)
2020-12-17 18:00:33 +02:00
Benjamin Trent a370104535
[ML] change to only calculate model size on initial load to prevent slow cache promotions (#66451) (#66462)
When a value is promoted in the LRU cache, its weight is removed and added.

The LocalModel object was recalculating the model size for ever weight check, which caused a polynomial runtime.

This commit changes the model size to only be calculated in the LocalModel ctor.
2020-12-16 14:01:21 -05:00
Bogdan Pintea 176587ebc4
QL: Verify filter's condition type (backport of #66268) (#66408)
* SQL: Verify filter's condition type (#66268)

* Verify filter's condition type

This adds a check in the verifier to check if filter's condition is of a
boolean type and fail the request otherwise.

(cherry picked from commit 3aec1a3d99a3f4650ec8be014a97106320f0874a)
2020-12-15 23:24:11 +01:00
Jim Ferenczi 330de82d59 Fix composite aggregation on unsigned long (#65715)
This commit ensures that the after key is parsed with the doc value formatter.
This is needed for unsigned longs that uses shifted longs internally.

Closes #65685
2020-12-14 16:59:07 +01:00
Marios Trivyzas 416ea4fcdc
EQL: [Tests] New eql correctness data snapshot (#66238)
With the upcoming validation for type compatibility of the sequence
keys, several tests are failing because some fields that contain IP
data were previously mapped as keyword. Fixed the mapping and created a
new snaphost of the correctness data in the gcs bucket.

Relates to: #66183

(cherry picked from commit 7f638f661c5a5c57a4ea7d3d3e2ccf5c81ae92d1)
2020-12-14 10:34:14 +01:00
Nhat Nguyen 84d4e5bcb1
CCR should check historyUUID in every read request (#66220)
Today, CCR only checks the historyUUID of the leader shard when it has
operations to replicate. If the follower shard is already in-sync with
the leader shard, then CCR won't detect if the historyUUID of the leader
shard has been changed. While this is not an issue, it can annoy users
in the following situation:

The follower index is in-sync with the leader index

Users restore the leader index from snapshots

CCR won't detect the issue and report ok in its stats API

CCR suddenly stops working when users start indexing to the leader index

This commit makes sure that we always check historyUUID in every
read-request so we can detect and report the issue as soon as possible.

Backport of #65841
2020-12-12 12:25:13 -05:00
Lee Hinman 8cbb9612d0
[7.10] Create AllocationDeciders in the main method of the ILM step (#65037) (8ac30f9a) (#66070)
Backports the following commits to 7.x:

    Create AllocationDeciders in the main method of the ILM step (#65037) (8ac30f9)
2020-12-08 16:56:25 -07:00
Tanguy Leroux 16fae5d66d
Also reroute after shard snapshot size fetch failure (#66008)
In #61906 we added the possibility for the master node to fetch
the size of a shard snapshot before allocating the shard to a
data node with enough disk space to host it. When merging
this change we agreed that any failure during size fetching
should not prevent the shard to be allocated.

Sadly it does not work as expected: the service only triggers
reroutes when fetching the size succeed but never when it
 fails. It means that a shard might stay unassigned until
another cluster state update triggers a new allocation
(as in #64372). More sadly, the test I wrote was wrong as
it explicitly triggered a reroute.

This commit changes the InternalSnapshotsInfoService
so that it also triggers a reroute when fetching the snapshot
shard size failed, ensuring that the allocation can move
forward by using an UNAVAILABLE_EXPECTED_SHARD_SIZE
shard size. This unknown shard size is kept around in the
snapshot info service until no corresponding unassigned
shards need the information.

Backport of #65436
2020-12-08 12:10:37 +01:00
Przemysław Witek d562caf9b2
Fix compile errors in QuerierTests (#65935) 2020-12-07 13:27:36 +01:00
Bogdan Pintea 2ec53ea7c4 Abort sorting in case of local agg sort queue overflow (#65687)
In case the local agg sorter queue gets full and no limit has been provided,
the local sorter will now erroneously call the failure callback for every
single row in the original rowset that's left over the local queue limit
(instead for just the first one).  The failure response is dispatched in any
case, so this is relatively harmless.  The sorter continues iterating on the
original response fetching subsequent pages. In case of correct Elasticsearch
behaviour, this is also harmless, it'll just trigger a number of internal
exceptions. However, in case of a pagination defect in Elasticsearch (like
GH#65685, where the same search_after is returned), this will result in an
effective spin loop, potentially rendering eventually the node unresponsive.

This PR simply breaks both the inner loop iterating over the current unsorted
rowset, as well as the outer one, iterating over the left pages.

It also fixes an outdated documentation limitation.

(cherry picked from commit 638402c387faf79bba38fcc95f371a73146efc0b)
2020-12-07 11:32:41 +01:00
Jim Ferenczi 1c34507e66 Create async search index if necessary on updates and deletes (#64606)
This change ensures that we create the async search index with the right mappings and settings when updating or deleting a document. Users can delete the async search index at any time so we have to re-create it internally if necessary before applying any new operation.
2020-12-02 09:04:28 +01:00
Armin Braun 16642f1c74
Handle RejectedExecutionException in ShardFollowTasksExecutor (#65648) (#65653)
Follow-up to #65415. We can't have this exception bubble up in an exception
handler any longer due to the new assertion so we must handle it here.
2020-12-01 06:51:05 +01:00
Ioannis Kakavas f6921af885 Revert "Gracefully handle exceptions from Security Providers (#65464) (#65554)"
This reverts commit 12ba9e3e16. This
commit was mechanically backported to 7.10 while it shouldn't have
been.
2020-11-26 17:11:34 +02:00
Ioannis Kakavas 12ba9e3e16
Gracefully handle exceptions from Security Providers (#65464) (#65554)
In certain situations, such as when configured in FIPS 140 mode,
the Java security provider in use might throw a subclass of
java.lang.Error. We currently do not catch these and as a result
the JVM exits, shutting down elasticsearch.

This commit attempts to address this by catching subclasses of Error
that might be thrown for instance when a PBKDF2 implementation
is used from a Security Provider in FIPS 140 mode, with the password
input being less than 14 bytes (112 bits).

- In our PBKDF2 family of hashers, we catch the Error and
throw an ElasticsearchException while creating or verifying the
hash. We throw on verification instead of simply returning false
on purpose so that the message bubbles up and the cause becomes
obvious (otherwise it would be indistinguishable from a wrong
password).
- In KeyStoreWrapper, we catch the Error in order to wrap and re-throw 
a GeneralSecurityException with a helpful message. This can happen when 
using any of the keystore CLI commands, when the node starts or when we 
attempt to reload secure settings.
- In the `elasticsearch-users` tool, we catch the ElasticsearchException that
the Hasher class re-throws and throw an appropriate UserException.

Tests are missing because it's not trivial to set CI in fips approved mode
right now, and thus any tests would need to be muted. There is a parallel
effort in #64024 to enable that and tests will be added in a followup.
2020-11-26 17:04:34 +02:00
Ioannis Kakavas b4b4483e24
Do not interpret SecurityException in KeystoreAwareCommand (#65366) (#65486)
KeyStoreAwareCommand attempted to deduce whether an error occurred
because of a wrong password by checking the cause of the
SecurityException that KeyStoreWrapper.decrypt() throws. Checking
for AEADBadTagException was wrong becase that exception could be
(and usually is) wrapped in an IOException. Furthermore, since we
are doing the check already in KeyStoreWrapper, we can just return
the message of the SecurityException to the user directly, as we do
in other places.
2020-11-26 13:12:18 +02:00
Marios Trivyzas 54e7e4c9de
EQL: [Tests] Adjust README for preserving test data (#65460)
Adjusted the README file to mention both the option to preserve the test
data when simple reproducing/executing the tests, but also when starting
the server node manually and issuing the query(ies) against it.

Follows: #65400
(cherry picked from commit e3a1910d28d8b0ed20997754c74fa4d4d52cda15)
2020-11-25 14:30:25 +01:00
Martijn van Groningen 387af748a5
Add support for data stream APIs in transport client. (#65484)
Backporting #65433 to the 7.10 branch.
2020-11-25 10:23:02 +01:00
Martijn van Groningen 4801f7f619
Include the entire response in error message in case of reporting generation error. (#64979)
The toString of HttpResponse includes not just the status, but also all the other details.
2020-11-25 09:24:08 +01:00
Mark Vieira f8f5d27f6b Add option to preserve data in test clusters (#65400)
(cherry picked from commit 1ce323e1368cf5231181f1efaba1c4e425066e37)
2020-11-24 11:56:56 -08:00
Andras Palinkas 7f7e938a25
{S,E}QL: Fix optimization of `NotEquals` in conjunctions (#65331) (#65449)
* Fix the `CombineBinaryComparisons` optimizer rule, so that semantic
equality taken into account during the optimization of `NotEquals`

Examples that previously removed the `NotEquals` expressions (leading
to incorrect results):

```
double >= 10 AND integer != 9
-->  double >= 10

keyword != '2021' AND datetime >= '2020-01-01T00:00:00'
--> datetime >= '2020-01-01T00:00:00'
```

With the fix, expressions like the above will not be touched.
`NotEquals` will only be eliminated from the `AND` expression if the
left side of the `NotEquals` `semanticEquals()` to the left side
of the other expressions within the conjunction (comparisons against
the same field/expression).

* Unit tests and integration tests

Close #65322
(cherry-picked from 8b2b7fa)
2020-11-24 13:20:32 -05:00
Jay Modi 419bda5c15
Fix watcher search template test after #65332 (#65382)
In #65332, the serialization of the WatcherSearchTemplateRequest class
changed to use IndicesOptions built in XContent facilities. This had
the side effect of fixing the handling of `all` for `expand_wildcards`
to include hidden indices. However, the tests in WatcherUtilsTests were
missed. This change updates those tests.

Backport of #65379
2020-11-24 09:04:42 -07:00
Mark Vieira cda1f884ee Mute WatcherUtilsTests.testDeserializeSearchRequest 2020-11-23 16:02:08 -08:00
Jay Modi 1a13a0b10f
Watcher understands hidden expand wildcard value (#65372)
Watcher has a search template that stores indices options to be used as
part of a search during watch execution, but this was not updated to be
aware of hidden indices and the `hidden` expand_wildcards option. This
change makes use of the `IndicesOptions#toXContent` method in Watcher,
which already handles the new value. Additionally, the XContent parsing
is moved to the IndicesOptions class so that we will be less likely to
miss updating this in the future.

Closes #65148
Backport of #65332
2020-11-23 09:17:49 -07:00
Andrei Stefan 866a6afcdf
Extend the interval date comparison (#65348) (#65358)
(cherry picked from commit acfb463892fdaf3f0deb679122b5e402c7b56418)
2020-11-23 15:24:36 +02:00
Armin Braun 7fbdcb5e00
Fix SearchableSnapshotsIntegTests.testCreateAndRestoreSearchableSnapshot (#65343) (#65351)
The recovery stats assertions in this test ran without any waiting for
the recoveries to actually finish. The fact that they ran after the concurrent
searches checks generally meant that they would pass (because of searches warming caches
+ general relative slowness of searches) but there is no hard guarantees this will work
reliably as the pre-fetch threads which will update the recovery state might still be slow
to do so randomly, causing the assertions to trip.

closes #65302
2020-11-23 12:30:18 +01:00
Armin Braun b0cea04f95
Fix Broken Error Handling in CacheFile#acquire (#65342) (#65347)
If we fail to create the `FileChannelReference` (e.g. because the directory it should be created in
was deleted in a test) we have to remove the listener from the `listeners` set to not trip internal
consistency assertions.

Relates #65302 (does not fix it though, but reduces noise from failures by removing secondary
tripped assertions after the test fails)
2020-11-23 08:57:24 +01:00
Armin Braun 67b6317488
Mute JdbcCsvSpecIT#testCurrentDateFilter (#65341)
Muting for https://github.com/elastic/elasticsearch/issues/65336
2020-11-22 22:37:21 +01:00
Nik Everett 56605e4d9a Fixup reduceRandom tests (#65263)
In aa1ea96b8698aa12bed1c4e8d704882a2a639791 I made all
`testReduceRandom` tests for aggs mimick production more precisely.
More precisely, they pick the correct "lead" result when performing
partial reduction. This is great, but, sadly, some tests assumed that we
always reduced against the "first" aggregator. This fixes those tests.

Closes #65163
2020-11-20 13:10:34 -05:00
Jay Modi 893e1a5282
Fix date math hidden index resolution (#65278)
This commit updates the IndexAbstractionResolver so that hidden indices
are properly resolved when date math is in use and when we are checking
if the index is visible.

Closes #65157
Backport of #65236
2020-11-19 12:40:14 -07:00
Nhat Nguyen 3989243a52 Stop renew retention leases when follow task fails (#65168)
If a shard follow-task hits a non-retryable error and stops, then we 
should also stop the retention-leases renewal process associated with
that follow-task.
2020-11-18 15:53:55 -05:00
Jim Ferenczi 9f3e3e2162 Fix "resource not found" exception on existing EQL async search (#65167)
This change fixes the initialization of the async results service
for the EQL get async action. The boolean that differentiates EQL
from normal _async_search request is set incorrectly, which results
in errors (404) when extending the keep alive of a running EQL search.

Fixes #65108
2020-11-18 09:10:31 +01:00
Costin Leau f089547b20 EQL: Fix aggressive/incorrect until policy in sequences (#65156)
The current until implementation in sequences is too optimistic, leading
to an aggressive match that discards correct data leading to invalid
results.
This commit addresses this issue and also unifies the until usage inside
TumblingWindow.
Further more it packs together the UntilGroup with SequenceGroup to
minimize memory usage and improve clean-up.

(cherry picked from commit de2724e92c732c66436939dbbedef93c9981b435)
(cherry picked from commit a60757756aae5f5abb31176fee972a7cdeac3649)
2020-11-18 09:34:33 +02:00
Dimitris Athanasiou 197de8fe66
[7.10][ML] Increase timeout waiting for DFA jobs to finish in integ tests (#65126) (#65131)
It appears that occasionally 30 seconds are not enough for CI workers
to complete DFA jobs. In order to eliminate such failures we increase
the time we wait for DFA jobs to complete in integration tests to
60 seconds.

Fixes #64926

Backport of #65126
2020-11-17 16:46:17 +02:00
Costin Leau 74fde15833 EQL: Allow null tiebreakers inside ordinals/sequences (#65033)
Align Ordinal comparator to consider nulls last (higher) in tiebreakers.
Add unit tests to Ordinal comparisons and criterion extraction.

Fix #64706

(cherry picked from commit 93dc883abd6b8855ff1618a574412b7f773b8ff5)
(cherry picked from commit 936e5f1a2cc29c1d5662cb8aa90c629af563a987)
2020-11-16 16:52:55 +02:00
Przemysław Witek de668ab84b
[7.10] [ML] Extract dependent variable's mapping correctly in case of a multi-field (#63813) (#64287) 2020-11-16 10:34:58 +01:00
Costin Leau 9551cb3420 EQL: small improvements to the testing base class
Extract request settings into dedicated methods for easier adjustments

(cherry picked from commit 4f93591cc561c7f8ff7c2f070dd1180f209810b7)
(cherry picked from commit ff7e8427345c304f5a37612c870b48555484b692)
2020-11-14 16:40:48 +02:00
Costin Leau f7cc570c4f EQL: Re-enable correctness tests (#65041)
Enable previously disabled tests - only two type of queries remain
disabled: one that does pattern matching and another one for
case-insensitivity.

Fix #63742

(cherry picked from commit 20210cc43b34438c40b8b5aebf0aa2b8161c4104)
(cherry picked from commit 95d08f2c8d0aac52cc1ed470fa489c239ee25159)
2020-11-14 16:09:11 +02:00
Costin Leau 76e73fec79
EQL: Add option for returning results from the tail of the stream (#64869) (#65040)
Introduce option for specifying whether the results are returned from
the tail (end) of the stream or the head (beginning).
Improve sequencing algorithm by significantly eliminating the number
of in-flight sequences for spare datasets.
Refactor the sequence class by eliminating some of the redundant code.
Change matching behavior for tail sequences.
Return results based on their first entry ordinal instead of
insertion order (which was ordered on the last match ordinal).
Randomize results position inside test suite.

Close #58646

(cherry picked from commit e85d9d1bbee13ad408e789fd62efb30bc8d223f2)
(cherry picked from commit 452c674a10cdc16dced3cde7babf5d5a9d64a6d9)
2020-11-14 13:44:17 +02:00
Alan Woodward 0e2a9b4ac7 Fix sparse vector test 2020-11-12 20:10:07 +00:00
Benjamin Trent b888f36388
[ML] fix custom feature processor extraction bugs around boolean fields and custom one_hot feature output order (#64937) (#65009)
This commit fixes two problems:

- When extracting a doc value, we allow boolean scalars to be used as input
- The output order of processed feature names is deterministic. Previous custom one hot fields used to be non-deterministic and thus could cause weird bugs.
2020-11-12 11:15:57 -05:00
Tanguy Leroux e40d7e02ea
Makes testCcrRepositoryFetchesSnapshotShardSizeFromIndexShardStoreStats more robust (#64976) (#64989)
Today this test fails because the sizes of the snapshot 
shards are only kept in a very short period of time in 
the InternalSnapshotsInfoService and are not 
guaranteed to exist once the shards are correctly 
assigned.

closes #64167
2020-11-12 15:38:38 +01:00
Dimitris Athanasiou b5efaf6e3b
[7.10][ML] Protect against stack overflow while loading DFA data (#64947) (#64956)
If we encounter an exception during extracting data in a data
frame analytics job, we retry once. However, we were not catching
exceptions thrown from processing the search response. This may
result in an infinite loop that causes a stack overflow.

This commit fixes this problem.

Backport of #64947
2020-11-12 11:08:40 +02:00