OpenSearch

mirror of https://github.com/honeymoose/OpenSearch.git synced 2025-02-19 03:15:15 +00:00

Author	SHA1	Message	Date
Costin Leau	71b92f8699	QL: Optimize Like/Rlike all (#62682 ) Replace common Like and RLike queries that match all characters with IsNotNull (exists) queries Fix #62585 (cherry picked from commit 4c23fad0468a9edd7325b06c6a96f7af37625dbf)	2020-09-24 13:44:53 +03:00
Nhat Nguyen	663b85b98f	Make keep alive optional in PointInTimeBuilder (#62720 ) Remove the keepAlive parameter from the constructor of PointInTimeBuilder as it's optional.	2020-09-22 18:52:54 -04:00
Marios Trivyzas	1e72144847	EQL: Remove support for `=` for comparisons (#62756 ) (#62775 ) Since `=` is rarely used and is undocumented we its support for equality comparisons keeping `==` as the only option. `=` is now only used for assignments like in `maxspan=10m`. Closes: #62650 (cherry picked from commit ad5ae4d887b5c2feca2d0e874d7bdf738e3fd54e)	2020-09-22 20:56:04 +02:00
Marios Trivyzas	b072de4ce0	EQL: Disallow chained comparisons (#62567 ) (#62601 ) Expressions like `1 = 2 = 3 = 4` or `1 < 2 = 3 >= 4` were treated with leftmost priority: ((1 = 2) = 3) = 4 which can lead to confusing results. Since such expressions don't make so much change for EQL filters we disallow them in the parser to prevent unexpected results from their bad usage. Major DBs like PostgreSQL and Oracle also disallow them in their SQL syntax. (counter example would be MySQL which interprets them as we did before with leftmost priority). Fixes: #61654 (cherry picked from commit 8f94981bb093f104228d267b532e0a3d5b7f6a38)	2020-09-18 10:48:14 +02:00
Costin Leau	81f2f84177	EQL: Allow requests with size 0 (#62537 ) The purpose for this change is to allow validation of queries without having to actually execute them. The optimizer already picks up this case. Fix #62494 (cherry picked from commit 675889559b2f96a0c1faa6fc84fd537148ba2cce)	2020-09-18 11:24:39 +03:00
William Brafford	5a0dca2491	Deprecate xpack.eql.enabled setting and make it a no-op (#61375 ) (#62491 ) * Deprecate xpack.eql.enabled and make it a no-op * Remove uses of xpack.eql.enabled	2020-09-17 14:17:27 -04:00
Marios Trivyzas	abce04888f	EQL: Forbid usage of ['] for string literals (#62458 ) (#62496 ) The usage of single quotes to wrap a string literal is forbidden and an error encouraging the user to user double quotes is returned. Tests are properly adjusted. Relates to #61659 (cherry picked from commit 8be400b77370bf4cf68c89f492c2d235f3cce43c)	2020-09-17 11:29:09 +02:00
Costin Leau	ceaf96061c	EQL: Fetch sequence documents using Point-In-Time (#62469 ) To preserve the PIT semantics, the retrieval of results has moved from using multi-get to using an idsQuery. (cherry picked from commit 1c2362fcf2be62ce568b3772924abce7331ef23c)	2020-09-17 00:12:19 +03:00
Costin Leau	03d2395183	EQL: Use Point In Time inside sequences (#62276 ) Use the newly introduced PIT API to have a consistent view of the data while doing sequence matching, which involves multiple calls, aka repeatable reads and thus avoid race conditions or any in-flight updates on the data. (cherry picked from commit daa72fc3c71fd36afb55278021ff6bbc591ef148)	2020-09-15 15:40:03 +03:00
Nhat Nguyen	3d69b5c41e	Introduce point in time APIs in x-pack basic (#61062 ) This commit introduces a new API that manages point-in-times in x-pack basic. Elasticsearch pit (point in time) is a lightweight view into the state of the data as it existed when initiated. A search request by default executes against the most recent point in time. In some cases, it is preferred to perform multiple search requests using the same point in time. For example, if refreshes happen between search_after requests, then the results of those requests might not be consistent as changes happening between searches are only visible to the more recent point in time. A point in time must be opened before being used in search requests. The `keep_alive` parameter tells Elasticsearch how long it should keep a point in time around. ``` POST /my_index/_pit?keep_alive=1m ``` The response from the above request includes a `id`, which should be passed to the `id` of the `pit` parameter of search requests. ``` POST /_search { "query": { "match" : { "title" : "elasticsearch" } }, "pit": { "id": "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWICBXV1aWQyAAAFdXVpZDEAAQltYXRjaF9hbGw_gAAAAA==", "keep_alive": "1m" } } ``` Point-in-times are automatically closed when the `keep_alive` is elapsed. However, keeping point-in-times has a cost; hence, point-in-times should be closed as soon as they are no longer used in search requests. ``` DELETE /_pit { "id" : "46ToAwMDaWR4BXV1aWQxAgZub2RlXzEAAAAAAAAAAAEBYQNpZHkFdXVpZDIrBm5vZGVfMwAAAAAAAAAAKgFjA2lkeQV1dWlkMioGbm9kZV8yAAAAAAAAAAAMAWIBBXV1aWQyAAA=" } ``` #### Notable works in this change: - Move the search state to the coordinating node: #52741 - Allow searches with a specific reader context: #53989 - Add the ability to acquire readers in IndexShard: #54966 Relates #46523 Relates #26472 Co-authored-by: Jim Ferenczi <jimczi@apache.org>	2020-09-10 19:25:47 -04:00
Costin Leau	0f9532689f	EQL: Propagate key constraints through the query (#62073 ) Since join keys are common across all queries in a Join/Sequence, any constraint applied on one query needs to be obeyed but all the other queries. This PR enhances the optimizer to propagate such constraints across all queries so they get pushed down to the actual generated ES queries. Fix #58937 (cherry picked from commit 4afa5debc199c132c07015bfae17952c40a21e5d)	2020-09-08 18:40:47 +03:00
Andrei Stefan	7d5791b6bd	EQL: create the search request with a list of indices (#62005 ) (#62076 ) * The query client uses an array of indices instead of the comma separated version of the indices names (cherry picked from commit 8ec4a768f4892a4a2faed25836cb333a9deb2ace)	2020-09-08 10:26:59 +03:00
Costin Leau	99ee87e332	EQL: Revert filter pipe (#61907 ) The current implementation of the filter pipe is incomplete hence why it got reverted. Note this is not a complete revert as some of the improvements of said commit (such as the PostAnalyzer) are useful in general. Relates #61805 (cherry picked from commit 7a7eb66f7d39586c3a3bc00dce49e6c47a23b46a)	2020-09-03 22:31:08 +03:00
Martijn van Groningen	3d9c12e2d3	Fix data stream wildcard resolution bug in eql search api.(#61910 ) Backport of #61904 to 7.x branch. The eql search api redirects to the search api. For this reason the eql search api could work with concrete data stream names. However if security is enabled and a data stream name snippet with a wildcard was used then it could not resolve this expressions. This is because the EqlSearchRequest class didn't overwrite the `includeDataStreams()` method. This pr fixes this, so that the security layer can properly expand data stream name wildcard expressions for the eql search api. This commit also moves the eql data stream test to xpack rest tests, so that the test runs with security enabled. This is required to reproduce the bug. Closes #60828	2020-09-03 16:03:57 +02:00
Costin Leau	e6dc8054a5	EQL: Introduce filter pipe (#61805 ) Allow filtering through a pipe, across events and sequences. Filter pipes are pushed down to base queries. For now filtering after limit (head/tail) is forbidden as the semantics are still up for debate. Fix #59763 (cherry picked from commit 80569a388b76cecb5f55037fe989c8b6f140761b)	2020-09-02 15:48:51 +03:00
Costin Leau	bff3c7470e	EQL: Replace SearchHit in response with Event (#61428 ) (#61522 ) The building block of the eql response is currently the SearchHit. This is a problem since it is tied to an actual search, and thus has scoring, highlighting, shard information and a lot of other things that are not relevant for EQL. This becomes a problem when doing sequence queries since the response is not generated from one search query and thus there are no SearchHits to speak of. Emulating one is not just conceptually incorrect but also problematic since most of the data is missed or made-up. As such this PR introduces a simple class, Event, that maps nicely to the terminology while hiding the ES internals (the use of SearchHit or GetResult/GetResponse depending on the API used). Fix #59764 Fix #59779 Co-authored-by: Igor Motov <igor@motovs.org> (cherry picked from commit 997376fbe6ef2894038968842f5e0635731ede65)	2020-08-25 17:32:42 +03:00
Andrei Stefan	a214d7902a	EQL: make endsWith function use a wildcard ES query wherever possible (#61160 ) (#61320 ) (cherry picked from commit 55fdb7e2c74d4fae86ec40686091ecba831caeaf)	2020-08-19 14:17:55 +03:00
Andrei Stefan	a6c0670a14	EQL: make stringContains function use a wildcard ES query (#61189 ) (#61313 ) (cherry picked from commit 039a7d1c68f6f1ed0e7e6cfb86be6b04eec8051c)	2020-08-19 12:40:48 +03:00
Andrei Stefan	5de0f19cc3	EQL: Return sequence join keys in the original type (#61268 ) (#61282 ) (cherry picked from commit d54957d61faa0d502387656e3cace594017b6ea0)	2020-08-18 19:37:15 +03:00
Andrei Stefan	90e116738e	QL: add filtering query dsl support to IndexResolver (#60514 ) (#61200 ) (cherry picked from commit 7b3635d796be26af9f87d19963a8ed4ab4bbf13f)	2020-08-17 17:59:58 +03:00
Costin Leau	9cc80621c3	EQL: Fix matching of tail/desc queries (#59827 ) When dealing with tail queries, data is returned descending for the base criterion yet the rest of the queries are ascending. This caused a problem during insertion since while in a page, the data is ASC, between pages the blocks of data is DESC. This caused incorrectly sorting inside a SequenceGroup which led to incorrect results. Further more in case of limit, since the data in a page is ASC, early return is not possible neither is desc matching. Thus the page needs to be consumed first before finding the final results. A future improvement could be to keep only the top N results dropping the rest during insertion time. (cherry picked from commit 77c88da054a1ce662a264f72cde5986d4ce37e3a)	2020-07-19 00:49:16 +03:00
Costin Leau	5f2285a8b3	EQL: Fix bug in returning results (#59673 ) Using serialization/deserialization when dealing with non-trivial documents causes the process to get stuck not to mention it is expensive. Use a much more simple approach at the expense of losing information (we're just interested in the source after all). (cherry picked from commit e1659822db7ce1390ba9bbfb21768e24a0907dff)	2020-07-16 01:01:13 +03:00
Costin Leau	679619c798	EQL: Improve retrieval of results (#59552 ) Instead of retrieving an entire SearchHit, get just a reference and postpone the document retrieval when assembling the final results. Remove sort information from results to make them consistent. Move TumblingWindow under the sequence package. Co-authored-by: James Rodewig <james.rodewig@elastic.co> (cherry picked from commit bccfbcd81f2f1d3552e95e4a9ee2618fb3059bd9)	2020-07-14 23:53:57 +03:00
Andrei Stefan	cf752992d6	Add telemetry metrics (#59526 )	2020-07-14 16:25:24 +03:00
Costin Leau	5580eb61ed	EQL: Improve sequence limiting (#59439 ) Improve the way limit (in particular offset) is being applied to handle the case where the matches are less than the offset and absolute limit. Combine Matcher and SequenceStateMachine into one class since the two have evolved beyond their original name and structure. (cherry picked from commit 63d3c62cdfc33dea03f21d5565b9c8ea104003eb)	2020-07-14 13:19:09 +03:00
Igor Motov	1acb4aeba9	EQL: Prepare for release (#59331 ) (#59426 ) Enables eql setting in release builds. Relates #51613	2020-07-13 11:54:32 -04:00
Costin Leau	d9c1e531db	EQL: Introduce until functionality (#59292 ) Sequences now support until conditional, which prevents a match from occurring if the until matches a document while doing look-ups. Thus a sequence must complete before the until condition matches - if any document within the sequence occurs at, or after, the until hit, the sequence is discarded. (cherry picked from commit 1ba1b9f0661aee655aa48cf9475ac61aaee2bfda)	2020-07-09 17:12:01 +03:00
Andrei Stefan	c0e0bca84c	Remove search_after and implicit_join_key_field (#59232 ) (#59280 ) (cherry picked from commit 6ede6c59eff321b9fedad30e19508b9e4f788b54)	2020-07-09 12:34:01 +03:00
Costin Leau	3e32d060bf	EQL: Fix bug in skipping window (#59196 ) Corrected condition that caused a sequence window to be skipped when a query returns no results by checking not just the current stage but also following ones as they can match with in-flight sequences. Improve logging Fix NPE when emptying a SequenceGroup Increase randomization in testing Make maxspan inclusive (up to and equal to value vs just up to) (cherry picked from commit ad32c488688cb350c2934dfca03af86045e997b0)	2020-07-08 14:36:39 +03:00
Costin Leau	f9c15d0fec	EQL: Introduce sequencing fetch size (#59063 ) The current internal sequence algorithm relies on fetching multiple results and then paginating through the dataset. Depending on the dataset and memory, setting a larger page size can yield better performance at the expense of memory. This PR makes this behavior explicit by decoupling the fetch size from size, the maximum number of results desired. As such, use in testing a minimum fetch size which exposed a number of bugs: Jumping across data across queries causing valid data to be seen as a gap. Incorrectly resuming searching across pages (again causing data to be discarded). which have been addressed. (cherry picked from commit 2f389a7724790d7b0bda67264d6eafcfa8b2116e)	2020-07-06 19:14:26 +03:00
Costin Leau	fe775a315f	EQL: Obey size request parameter (#59014 ) While at it, change the default size to 10 (to align it with the search API defaults). (cherry picked from commit 45795939b277e736a9e4f2f008d1c3f406239075)	2020-07-06 19:14:25 +03:00
Costin Leau	965f77fa44	EQL: Introduce sequence internal paging (#58859 ) Refactor sequence matching classes in order to decouple querying from results consumption (and matching). Rename some classes to better convey their intent. Introduce internal pagination of sequence algorithm, that is getting the data in slices and, if needed, moving forward in order to find more matches until either the dataset is consumer or the number of results desired is found. (cherry picked from commit bcf2c1141302f3f98c85e82d2c501aa02c8540e9)	2020-07-02 13:44:21 +03:00
Andrei Stefan	b904a60275	EQL: Add case handling to stringContains (#58762 ) (#58813 ) Co-authored-by: Ross Wolf <31489089+rw-access@users.noreply.github.com> (cherry picked from commit 1a58776d3aa563beb364b067a1db46497122306f)	2020-07-01 13:51:45 +03:00
Andrei Stefan	470bcee5bf	EQL: Integrate TOML tests for function folding (#58748 ) (#58812 ) Co-authored-by: Ross Wolf <31489089+rw-access@users.noreply.github.com> (cherry picked from commit e9b1fa58cf8d510a4b4afb14f66b0d5f9c603ebb)	2020-07-01 13:50:54 +03:00
Costin Leau	3a546f1f51	EQL: Introduce support for sequence maxspan (#58635 ) EQL sequences can specify now a maximum time allowed for their span (computed between the first and the last matching event). (cherry picked from commit 747c3592244192a2e25a092f62aec91a899afc83)	2020-06-29 21:31:00 +03:00
Igor Motov	773f3574a9	Removes debug logging from RestEqlCancellationIT (#58676 ) The test didn't fail since the fix in #58493. So, it's time to remove debug logging and close the issue. Closes #58270	2020-06-29 13:15:01 -04:00
Costin Leau	3c81b91474	EQL: Add Head/Tail pipe support (#58536 ) Introduce pipe support, in particular head and tail (which can also be chained). (cherry picked from commit 4521ca3367147d4d6531cf0ab975d8d705f400ea) (cherry picked from commit d6731d659d012c96b19879d13cfc9e1eaf4745a4)	2020-06-27 09:49:14 +03:00
Igor Motov	20af856abd	[7.x] EQL: Adds an ability to execute an asynchronous EQL search (#58192 ) Adds async support to EQL searches Closes #49638 Co-authored-by: James Rodewig james.rodewig@elastic.co	2020-06-25 14:11:57 -04:00
Andrei Stefan	69f73d948b	EQL: code cleanup and further tests (#58458 ) (#58497 ) Add FunctionPipe tests to all functions. Cleanup functions code. (cherry picked from commit 0f83d5799841fe99d8aeaf46e50dd11aa6bf8a57)	2020-06-24 17:38:56 +03:00
Costin Leau	ff0ea62cb8	EQL: Fix casing for tiebreaker field (#57943 ) Use tiebreaker instead of tieBreaker (cherry picked from commit 3c774948a5d5e10fac267cb9a54f5d0559a00c1d)	2020-06-11 00:10:19 +03:00
Aleksandr Maus	ec60335496	EQL: implement case sensitivity for indexOf and endsWith string functions (#57707 ) (#57908 ) * EQL: implement case sensitivity for indexOf and endsWith string functions	2020-06-10 08:55:49 -04:00
Costin Leau	439205d1ea	EQL: Introduce tie breaker support (#57787 ) Allow a field inside the data to be used as a tie breaker for events that have the same timestamp. The field is optional by default. If used, the tie-breaker always requires a non-null value since it is used inside `search_after` which requires a non-null value. Fix #56824 (cherry picked from commit e5719ecb474b32730d93afdbb6834a32b0b2df8b)	2020-06-09 22:50:19 +03:00
Bogdan Pintea	74b2c8a770	Change error message for comp against fields (#57126 ) Change the error message wording for comparisons against fields in filtering (s/variables/fields). (cherry picked from commit d9a1cb50940d0a98fd75b9c0123ca6e1d862f65d)	2020-05-26 17:57:51 +02:00
Costin Leau	6f4af43405	EQL: Skip execution for filters with empty results (#56718 ) Optimize away events queries and joins/sequence that cannot match any results without having to query the backend. (cherry picked from commit 69c8ef8cfefd8fc6dcb6d1a566bfcd537068e3e4)	2020-05-14 22:38:23 +03:00
Andrei Stefan	ddf4e47e86	EQL: fix QueryFolderOkTests (#56714 ) (#56728 ) (cherry picked from commit 8b21ccd0eac3b3d0fbd090152b3dff6ae5217b52)	2020-05-14 10:58:25 +03:00
Aleksandr Maus	87a10806ab	EQL: Fix cidrMatch function fails to match when used in scripts (#56246 ) (#56735 ) EQL: Fix cidrMatch function fails to match when used in scripts (#56246) Addresses https://github.com/elastic/elasticsearch/issues/55709	2020-05-13 22:41:24 -04:00
Ross Wolf	61e2cf89b5	EQL: Add number function (#55084 ) * EQL: Add number function * EQL: Fix the locale used for number for deterministic functionality * EQL: Add more ToNumber tests * EQL: Add more number ToNumberProcessor unit tests * EQL: Remove unnecessary overrides, fix processor methods * EQL: Remove additional unnecessary overrides * EQL: Lint fixes for ToNumber * EQL: ToNumber renames from PR feedback * EQL: Remove NumberFormat locale handling * EQL: Removed NumberFormat from ToNumber * EQL: Add number function tests * EQL: ToNumberProcessorTests formatting * EQL: Remove newline in ToNumberProcessorTests * EQL: Add number(..., null) test * EQL: Create expression.function.scalar.math package * EQL: Remove painless whitespace for ToNumber.asScript * EQL: Add Long support	2020-05-13 14:09:06 -06:00
Costin Leau	9f1ecd52eb	EQL: Introduce support for sequences (#56300 ) Initial support for EQL sequences The current algorithm is focused on correctness and does not contain any optimization which is left for the future. The current implementation uses a state machine approach which moves ascending and runs each query one after the other working on computing sequences as the data comes in. For each result, the key and its timestamp are being extracted which are then used for matching/building a sequence. (cherry picked from commit 4f3e18c894a1841d333022361ad9d1fdf1477dc3)	2020-05-13 15:42:31 +03:00
Marios Trivyzas	cbbbd499bf	SQL/EQL: Add support for scalars within LIKE/RLIKE (#56495 ) (#56674 ) - Add support for scalar functions on the field of SQL's LIKE/RLIKE - Add support for scalar functions on the field of EQL's match/matchLite Closes: #55058 (cherry picked from commit 51c14e2dbb7fb29004a23369c449d425b3ac8fe2)	2020-05-13 13:40:24 +02:00
Andrei Stefan	f0074e93a0	QL: case sensitive support in EQL (#56404 ) (#56597 ) * QL: case sensitive support in EQL (#56404) * adds a generic startsWith function to QL * modifies the existent EQL startsWith function to be case sensitive aware * improves the existent EQL startsWith function to use a prefix query when the function is used in a case sensitive context. Same improvement is used in SQL's newly added STARTS_WITH function. * adds case sensitivity to EQL configuration through a case_sensitive parameter in the eql request, as established in #54411. The case_sensitive parameter can be specified when running queries (default is case insensitive) (cherry picked from commit ee5a09ea840167566e34c28c8225dc38bc6a7ae8)	2020-05-12 16:56:18 +03:00

1 2 3

117 Commits