Commit Graph

68 Commits

Author SHA1 Message Date
Costin Leau f089547b20 EQL: Fix aggressive/incorrect until policy in sequences (#65156)
The current until implementation in sequences is too optimistic, leading
to an aggressive match that discards correct data leading to invalid
results.
This commit addresses this issue and also unifies the until usage inside
TumblingWindow.
Further more it packs together the UntilGroup with SequenceGroup to
minimize memory usage and improve clean-up.

(cherry picked from commit de2724e92c732c66436939dbbedef93c9981b435)
(cherry picked from commit a60757756aae5f5abb31176fee972a7cdeac3649)
2020-11-18 09:34:33 +02:00
Costin Leau 9551cb3420 EQL: small improvements to the testing base class
Extract request settings into dedicated methods for easier adjustments

(cherry picked from commit 4f93591cc561c7f8ff7c2f070dd1180f209810b7)
(cherry picked from commit ff7e8427345c304f5a37612c870b48555484b692)
2020-11-14 16:40:48 +02:00
Costin Leau 76e73fec79
EQL: Add option for returning results from the tail of the stream (#64869) (#65040)
Introduce option for specifying whether the results are returned from
the tail (end) of the stream or the head (beginning).
Improve sequencing algorithm by significantly eliminating the number
of in-flight sequences for spare datasets.
Refactor the sequence class by eliminating some of the redundant code.
Change matching behavior for tail sequences.
Return results based on their first entry ordinal instead of
insertion order (which was ordered on the last match ordinal).
Randomize results position inside test suite.

Close #58646

(cherry picked from commit e85d9d1bbee13ad408e789fd62efb30bc8d223f2)
(cherry picked from commit 452c674a10cdc16dced3cde7babf5d5a9d64a6d9)
2020-11-14 13:44:17 +02:00
Andrei Stefan a6d8319231
* Wrap a verification_exception in case there is no valid index available (#64267)
Wrap a verification_exception in case there is no valid index available in an index_not_found_exception providing also the original index pattern that may be lost in the chain of filters involving the Security one.

(cherry picked from commit 9c9da2f2f9a4ad12704f7d3a273f067e96cd2054)
2020-10-29 10:14:50 +02:00
Costin Leau 2363c4be4b EQL: Polish testing infra (#64205)
Add tie-breaker inside request creation
Add configurable timeout

(cherry picked from commit ff281d7b6fd7b4cd2f08bac49aa0b354b6812940)
(cherry picked from commit 34bd76fc2987b1ad0b6275ac4358e362a0ba7fb0)
2020-10-27 17:23:43 +02:00
Ryland Herrick 7e8769a666
EQL: make allow_no_indices true by default (#63573) (#63645)
* Allow all indices options variants
Irrespective of allow_no_indices value, throw VerificationException when
there is no index validated

Co-authored-by: Andrei Stefan <astefan@users.noreply.github.com>
2020-10-14 03:41:04 +03:00
Costin Leau 2ab5f226c4 EQL: Avoid filtering on tiebreakers (#63415)
Do not filter by tiebreaker while searching sequence matches as
it's not monotonic and thus can filter out valid data.
Add handling for data 'near' the boundary that has the same timestamp
but different tie-breaker and thus can be just outside the window.

Fix #62781
Relates #63215

(cherry picked from commit 36f834600d4d9ded0fb7b1440274b2e597733770)
(cherry picked from commit 72a2ce825f3bfd13f87423ba7f3c739ea64c57f6)
2020-10-08 13:50:41 +03:00
Costin Leau d027e24b31 EQL: Remove match functions (#63275)
Since match (for matching regex) is not currently in use remove it for
now.

Close #63263

(cherry picked from commit 6abd531cf457f3c5686f59709647bed3276e3c6b)
2020-10-05 23:30:41 +03:00
Costin Leau 6856306dcf EQL: Remove wildcard functionality from : (#63276)
Restrict : operator to only case insensitive matching on strings

Close #63262

(cherry picked from commit bc02e77150cdd85594dfac4f03d8aeb85aaddbb3)
2020-10-05 23:30:41 +03:00
Andrei Stefan 76bba601ab
Remove case_sensitive request option (#63218) (#63244)
Make EQL case sensitive by default and adapt some of the string functions
Remove the case sensitive option from Between string function
Add case_insensitive option to term and wildcard queries usage

(cherry picked from commit 7550e0664c8c2f1f13519036c759b1e76345551f)
2020-10-05 22:04:42 +03:00
Costin Leau 1047d67199 Revert "EQL: Avoid filtering on tiebreakers (#63215)"
This reverts commit efd2243886.
2020-10-05 15:55:59 +03:00
Costin Leau efd2243886 EQL: Avoid filtering on tiebreakers (#63215)
Do not filter by tiebreaker while searching sequence matches as
it's not monotonic and thus can filter out valid data.

Fix #62781

(cherry picked from commit 4d62198df70f3b70f8b6e7730e888057652c18a8)
2020-10-05 14:21:30 +03:00
Marios Trivyzas 7d74fb8577
EQL: Replace ?"..." with """...""" for unescaped strings (#62539) (#63174)
Use triple double quotes enclosing a string literal to interpret it
as unescaped, in order to use `?` for marking query params and avoid
user confusion. `?` also usually implies regex expressions.

Any character inside the `"""` beginning-closing markings is considered
raw and the only thing that is not permitted is the `"""` sequence itself.
If a user wants to use that, needs to resort to the normal `"` string literal
and use proper escaping.

Relates to #61659

(cherry picked from commit d87c2ca2eacab5552bca1e520d33cf71da40bcfd)
2020-10-02 14:58:50 +02:00
Costin Leau 614f4c13a5 EQL: Introduce case-sensitive equality (#63121)
Introduce : operator for doing case insensitive string comparisons.
Recognizes "*" for wildcard matches in string literals.
Restricted only to string types.

Relates #62941

(cherry picked from commit 201e577e65f26a9b958a6197fe6c7268da39de29)
2020-10-02 00:23:08 +03:00
Costin Leau c2992ea287 EQL: Fix NPE from incorrect use of ids search (#63032)
This fixes a bug introduced when moving from mget to ids query. While
mget returns all the ids given, id query is a search query and thus by
default returns only 10 documents.
The fix correctly sets the expected size so all the information is
returned inside the response.

Fix #63030

(cherry picked from commit 09ba85548a0142a1fe8376efea9cc4e7764a207c)
2020-10-01 13:49:58 +03:00
Costin Leau 3bee28056f EQL: Fix bug in sequences with any pattern (#63007)
Fix query creation inside sequences with any queries due to lacking a
clause to combine, which lead to an invalid request being created.

Fix #62967

(cherry picked from commit ff59d8823919a6e70928816e5c3687308ebde33f)
2020-09-29 18:19:25 +03:00
Costin Leau ef7a6ce4b2 EQL: Refactor testing infrastructure (#62928)
Extract reusable methods inside QL TestUtils
Rename abstract base classes for clarity
Clean-up EQL DataLoader

(cherry picked from commit 48db3f285aa8976ead5a9f5d071a9c1046d7bd31)
2020-09-28 14:22:56 +03:00
Nik Everett fa13585fae
Fix Eclipse build (#62733) (#62786)
Eclipse was confused for two reasons:
1. `:x-pack:plugin` depended on itself.
2. `ql`, `sql`, and `eql` couldn't see some methods.

I fixed problem 1 by only adding the "depends on itself" configuration
outside of eclipse. I fixed problem 2 by making a `test` sub-project in
`ql` that contains test utilities and depending on those where possible.
2020-09-22 17:44:25 -04:00
Marios Trivyzas 1e72144847
EQL: Remove support for `=` for comparisons (#62756) (#62775)
Since `=` is rarely used and is undocumented we its support for
equality comparisons keeping `==` as the only option. `=` is now only
used for assignments like in `maxspan=10m`.

Closes: #62650
(cherry picked from commit ad5ae4d887b5c2feca2d0e874d7bdf738e3fd54e)
2020-09-22 20:56:04 +02:00
Costin Leau 81f2f84177 EQL: Allow requests with size 0 (#62537)
The purpose for this change is to allow validation of queries without
having to actually execute them. The optimizer already picks up this
case.

Fix #62494

(cherry picked from commit 675889559b2f96a0c1faa6fc84fd537148ba2cce)
2020-09-18 11:24:39 +03:00
Marios Trivyzas abce04888f
EQL: Forbid usage of ['] for string literals (#62458) (#62496)
The usage of single quotes to wrap a string literal is forbidden
and an error encouraging the user to user double quotes is returned.

Tests are properly adjusted.

Relates to #61659

(cherry picked from commit 8be400b77370bf4cf68c89f492c2d235f3cce43c)
2020-09-17 11:29:09 +02:00
Costin Leau 03d2395183 EQL: Use Point In Time inside sequences (#62276)
Use the newly introduced PIT API to have a consistent view of the data
while doing sequence matching, which involves multiple calls, aka
repeatable reads and thus avoid race conditions or any in-flight updates
on the data.

(cherry picked from commit daa72fc3c71fd36afb55278021ff6bbc591ef148)
2020-09-15 15:40:03 +03:00
Andrei Stefan cce6da7d52
EQL: add the wildcard field type to the IT tests (#62166) (#62200)
* Add wildcard field type as an option for randomized testing of IT queries

(cherry picked from commit 87b14c409c180c4d53c3c61a30bd69f1b81a2823)
2020-09-10 12:36:36 +03:00
Andrei Stefan 7d5791b6bd
EQL: create the search request with a list of indices (#62005) (#62076)
* The query client uses an array of indices instead of the comma separated
version of the indices names

(cherry picked from commit 8ec4a768f4892a4a2faed25836cb333a9deb2ace)
2020-09-08 10:26:59 +03:00
Costin Leau 99ee87e332 EQL: Revert filter pipe (#61907)
The current implementation of the filter pipe is incomplete hence why
it got reverted. Note this is not a complete revert as some of the
improvements of said commit (such as the PostAnalyzer) are useful in
general.

Relates #61805

(cherry picked from commit 7a7eb66f7d39586c3a3bc00dce49e6c47a23b46a)
2020-09-03 22:31:08 +03:00
Costin Leau e6dc8054a5 EQL: Introduce filter pipe (#61805)
Allow filtering through a pipe, across events and sequences.
Filter pipes are pushed down to base queries.
For now filtering after limit (head/tail) is forbidden as the
semantics are still up for debate.

Fix #59763

(cherry picked from commit 80569a388b76cecb5f55037fe989c8b6f140761b)
2020-09-02 15:48:51 +03:00
Costin Leau bff3c7470e
EQL: Replace SearchHit in response with Event (#61428) (#61522)
The building block of the eql response is currently the SearchHit. This
is a problem since it is tied to an actual search, and thus has scoring,
highlighting, shard information and a lot of other things that are not
relevant for EQL.
This becomes a problem when doing sequence queries since the response is
not generated from one search query and thus there are no SearchHits to
speak of.
Emulating one is not just conceptually incorrect but also problematic
since most of the data is missed or made-up.

As such this PR introduces a simple class, Event, that maps nicely to
the terminology while hiding the ES internals (the use of SearchHit or
GetResult/GetResponse depending on the API used).

Fix #59764
Fix #59779

Co-authored-by: Igor Motov <igor@motovs.org>
(cherry picked from commit 997376fbe6ef2894038968842f5e0635731ede65)
2020-08-25 17:32:42 +03:00
Costin Leau 9cc80621c3 EQL: Fix matching of tail/desc queries (#59827)
When dealing with tail queries, data is returned descending for the base
criterion yet the rest of the queries are ascending. This caused a
problem during insertion since while in a page, the data is ASC, between
pages the blocks of data is DESC.
This caused incorrectly sorting inside a SequenceGroup which led to
incorrect results.

Further more in case of limit, since the data in a page is ASC, early
return is not possible neither is desc matching. Thus the page needs to
be consumed first before finding the final results.
A future improvement could be to keep only the top N results dropping
the rest during insertion time.

(cherry picked from commit 77c88da054a1ce662a264f72cde5986d4ce37e3a)
2020-07-19 00:49:16 +03:00
Andrei Stefan d513e1090f
Do not create the index, if it's already there (#59745) (#59747)
(cherry picked from commit d097447d257efdf0a36b1157e1f177aed86ecca1)
2020-07-17 11:38:30 +03:00
Costin Leau 6b75525efb EQL: Improve testing spec (#59615)
Case sensitivity is incorporated as a test dimension - instead of
running the same test twice, two different tests are created.
Clean-up the test invocation by removing unused parameters.

Fix #59294

(cherry picked from commit 72c8a3582d8e8a4a663d82814a17a1a3d2757292)
2020-07-15 18:07:24 +03:00
Andrei Stefan cf752992d6
Add telemetry metrics (#59526) 2020-07-14 16:25:24 +03:00
Costin Leau d9c1e531db EQL: Introduce until functionality (#59292)
Sequences now support until conditional, which prevents a match from
occurring if the until matches a document while doing look-ups.
Thus a sequence must complete before the until condition matches - if
any document within the sequence occurs at, or after, the until hit, the
sequence is discarded.

(cherry picked from commit 1ba1b9f0661aee655aa48cf9475ac61aaee2bfda)
2020-07-09 17:12:01 +03:00
Andrei Stefan d187b531ed
EQL: Give a name to all toml tests and enforce the naming of new tests (#59283) (#59295)
(cherry picked from commit c8ffe3c9237d3cdd90331795b8e37517155b7e91)
2020-07-09 16:20:29 +03:00
Andrei Stefan c0e0bca84c
Remove search_after and implicit_join_key_field (#59232) (#59280)
(cherry picked from commit 6ede6c59eff321b9fedad30e19508b9e4f788b54)
2020-07-09 12:34:01 +03:00
Costin Leau 3e32d060bf EQL: Fix bug in skipping window (#59196)
Corrected condition that caused a sequence window to be skipped when a query
returns no results by checking not just the current stage but also following
ones as they can match with in-flight sequences.
Improve logging
Fix NPE when emptying a SequenceGroup
Increase randomization in testing
Make maxspan inclusive (up to and equal to value vs just up to)

(cherry picked from commit ad32c488688cb350c2934dfca03af86045e997b0)
2020-07-08 14:36:39 +03:00
Costin Leau f9c15d0fec EQL: Introduce sequencing fetch size (#59063)
The current internal sequence algorithm relies on fetching multiple results and then paginating through the dataset. Depending on the dataset and memory, setting a larger page size can yield better performance at the expense of memory.
This PR makes this behavior explicit by decoupling the fetch size from size, the maximum number of results desired.
As such, use in testing a minimum fetch size which exposed a number of bugs:

Jumping across data across queries causing valid data to be seen as a gap.
Incorrectly resuming searching across pages (again causing data to be discarded).
which have been addressed.

(cherry picked from commit 2f389a7724790d7b0bda67264d6eafcfa8b2116e)
2020-07-06 19:14:26 +03:00
Costin Leau fe775a315f EQL: Obey size request parameter (#59014)
While at it, change the default size to 10 (to align it with the search
API defaults).

(cherry picked from commit 45795939b277e736a9e4f2f008d1c3f406239075)
2020-07-06 19:14:25 +03:00
Rene Groeschke d952b101e6
Replace compile configuration usage with api (7.x backport) (#58721)
* Replace compile configuration usage with api (#58451)

- Use java-library instead of plugin to allow api configuration usage
- Remove explicit references to runtime configurations in dependency declarations
- Make test runtime classpath input for testing convention
  - required as java library will by default not have build jar file
  - jar file is now explicit input of the task and gradle will ensure its properly build

* Fix compile usages in 7.x branch
2020-06-30 15:57:41 +02:00
Andrei Stefan 7b80ea7218
Fix release tests (#58713) (#58725)
(cherry picked from commit 7816c100612168bf46595c4813fe374bca2e7259)
2020-06-30 13:42:32 +03:00
Costin Leau 3a546f1f51 EQL: Introduce support for sequence maxspan (#58635)
EQL sequences can specify now a maximum time allowed for their span
(computed between the first and the last matching event).

(cherry picked from commit 747c3592244192a2e25a092f62aec91a899afc83)
2020-06-29 21:31:00 +03:00
Andrei Stefan 3cb8f54f28
EQL: case sensitivity aware integration testing (#58624) (#58672)
* EQL: case sensitivity aware integration testing (#58624)

* Add DataLoader
* Rewrite case sensitivity settings:
NULL -> run both case sensitive and insensitive tests
TRUE -> run case sensitive test only
FALSE -> run case insensitive test only
* Rename test_queries_supported
* Add more toml tests from the Python client

Co-authored-by: Ross Wolf <31489089+rw-access@users.noreply.github.com>
(cherry picked from commit 34d383421599f060a5c083b40df35f135de49e39)
2020-06-29 18:40:07 +03:00
Costin Leau 3c81b91474 EQL: Add Head/Tail pipe support (#58536)
Introduce pipe support, in particular head and tail
(which can also be chained).

(cherry picked from commit 4521ca3367147d4d6531cf0ab975d8d705f400ea)
(cherry picked from commit d6731d659d012c96b19879d13cfc9e1eaf4745a4)
2020-06-27 09:49:14 +03:00
Costin Leau ff0ea62cb8 EQL: Fix casing for tiebreaker field (#57943)
Use tiebreaker instead of tieBreaker

(cherry picked from commit 3c774948a5d5e10fac267cb9a54f5d0559a00c1d)
2020-06-11 00:10:19 +03:00
Costin Leau 439205d1ea EQL: Introduce tie breaker support (#57787)
Allow a field inside the data to be used as a tie breaker for events
that have the same timestamp.
The field is optional by default.
If used, the tie-breaker always requires a non-null value since it is
used inside `search_after` which requires a non-null value.

Fix #56824

(cherry picked from commit e5719ecb474b32730d93afdbb6834a32b0b2df8b)
2020-06-09 22:50:19 +03:00
Bogdan Pintea 74b2c8a770 Change error message for comp against fields (#57126)
Change the error message wording for comparisons against fields in
filtering (s/variables/fields).

(cherry picked from commit d9a1cb50940d0a98fd75b9c0123ca6e1d862f65d)
2020-05-26 17:57:51 +02:00
Costin Leau 6f4af43405 EQL: Skip execution for filters with empty results (#56718)
Optimize away events queries and joins/sequence that cannot match any
results without having to query the backend.

(cherry picked from commit 69c8ef8cfefd8fc6dcb6d1a566bfcd537068e3e4)
2020-05-14 22:38:23 +03:00
Aleksandr Maus 87a10806ab
EQL: Fix cidrMatch function fails to match when used in scripts (#56246) (#56735)
EQL: Fix cidrMatch function fails to match when used in scripts (#56246)

Addresses https://github.com/elastic/elasticsearch/issues/55709
2020-05-13 22:41:24 -04:00
Ross Wolf 61e2cf89b5
EQL: Add number function (#55084)
* EQL: Add number function
* EQL: Fix the locale used for number for deterministic functionality
* EQL: Add more ToNumber tests
* EQL: Add more number ToNumberProcessor unit tests
* EQL: Remove unnecessary overrides, fix processor methods
* EQL: Remove additional unnecessary overrides
* EQL: Lint fixes for ToNumber
* EQL: ToNumber renames from PR feedback
* EQL: Remove NumberFormat locale handling
* EQL: Removed NumberFormat from ToNumber
* EQL: Add number function tests
* EQL: ToNumberProcessorTests formatting
* EQL: Remove newline in ToNumberProcessorTests
* EQL: Add number(..., null) test
* EQL: Create expression.function.scalar.math package
* EQL: Remove painless whitespace for ToNumber.asScript
* EQL: Add Long support
2020-05-13 14:09:06 -06:00
Costin Leau 9f1ecd52eb EQL: Introduce support for sequences (#56300)
Initial support for EQL sequences
The current algorithm is focused on correctness and does not contain
any optimization which is left for the future.

The current implementation uses a state machine approach which moves
ascending and runs each query one after the other working on computing
sequences as the data comes in.
For each result, the key and its timestamp are being extracted which are
then used for matching/building a sequence.

(cherry picked from commit 4f3e18c894a1841d333022361ad9d1fdf1477dc3)
2020-05-13 15:42:31 +03:00
Marios Trivyzas cbbbd499bf
SQL/EQL: Add support for scalars within LIKE/RLIKE (#56495) (#56674)
- Add support for scalar functions on the field of SQL's LIKE/RLIKE
- Add support for scalar functions on the field of EQL's match/matchLite

Closes: #55058
(cherry picked from commit 51c14e2dbb7fb29004a23369c449d425b3ac8fe2)
2020-05-13 13:40:24 +02:00