OpenSearch

Commit Graph

Author	SHA1	Message	Date
Martijn van Groningen	c358ecb5fb	Don't preserve indices between enrich qa tests. This was added because it was suspected to cause the monitoring enrich verification to fail, but that is not the case. See #48258	2019-10-31 14:23:56 +01:00
Martijn van Groningen	05324b7f03	Muted verifying monitoring integration in enrich integration test. Relates to #48258	2019-10-24 08:39:53 +02:00
Martijn van Groningen	bbe50eca72	Fail with a better error if there are no ingest nodes (#48272) when executing enrich execute policy api.	2019-10-22 07:42:04 +02:00
Martijn van Groningen	0ec0ab64c9	Fix executing enrich policies stats (#48132 ) The enrich stats api picked the wrong task to be displayed in the executing stats section. In case `wait_for_completion` was set to `false` then no task was being displayed and if that param was set to `true` then the wrong task was being displayed (transport action task instead of enrich policy executor task). Testing executing policies in enrich stats api is tricky. I have verified locally that this commit fixes the bug.	2019-10-22 07:41:56 +02:00
Martijn van Groningen	c09b62d5bf	Backport: also validate source index at put enrich policy time (#48311 ) Backport of: #48254 This changes tests to create a valid source index prior to creating the enrich policy.	2019-10-22 07:38:16 +02:00
James Baiera	0d12ef8958	Add Enrich Origin (#48098 ) (#48312 ) This PR adds an origin for the Enrich feature, and modifies the background maintenance task to use the origin when executing client operations. Without this fix, the maintenance task fails to execute when security is enabled.	2019-10-21 16:40:49 -04:00
Martijn van Groningen	844825a13f	Validate policy type when storing an enrich policy (#48126 )	2019-10-18 16:26:48 +02:00
Martijn van Groningen	a5fe69c344	Include enrich into the info api as feature (#48157 ) This commit also fixes a bug, the enrich enabled setting was not included in the list of settings. Backport of #48109	2019-10-17 09:51:32 +02:00
Martijn van Groningen	77164e9017	adjusted minimal supported version	2019-10-15 07:45:00 +02:00
Martijn van Groningen	51c33f3edf	remove eclipse conditional	2019-10-15 07:18:32 +02:00
Martijn van Groningen	c4b1a3045a	Fixed test, take into account that Map can be the result if max_matches is 1.	2019-10-15 07:03:01 +02:00
James Baiera	18d7e32b7d	Add wait for completion for Enrich policy execution (#47886 ) This PR adds the ability to run the enrich policy execution task in the background, returning a task id instead of waiting for the completed operation.	2019-10-14 16:05:28 -04:00
Martijn van Groningen	7fc9198d46	Change how `max_matches` affects `target_field` option. (#47982 ) Prior to this change the `target_field` would always be a json array field in the document being ingested. This to take into account that multiple enrich documents could be inserted into the `target_field`. However the default `max_matches` is `1`. Meaning that by default only a single enrich document would be added to `target_field` json array field. This commit changes this; if `max_matches` is set to `1` then the single document would be added as a json object to the `target_field` and if it is configured to a higher value then the enrich documents will be added as a json array (even if a single enrich document happens to be enriched).	2019-10-14 21:09:48 +02:00
James Baiera	73263c654a	Add basic task support for executing enrich policies (#47523 ) Changes the execution logic to create a new task using the execute request, and attaches the new task to the policy runner to be updated. Also, a new response is now returned from the execute api, which contains either the task id of the execution, or the completed status of the run. The fields are mutually exclusive to make it easier to discern what type of response it is.	2019-10-11 13:32:06 -04:00
Martijn van Groningen	aace42d38d	Add HLRC support for enrich stats API (#47306 ) This PR also includes HLRC docs for the enrich stats api. Relates to #32789	2019-10-10 09:08:29 +02:00
Martijn van Groningen	19393fc5a7	match processor should handle values other than string properly (#47419) Currently if the document being ingested contains a field value other than a string then the processor fails with an error. This commit changes the match processor to handle number values and array values correctly. If a json array is detected then the `terms` query is used instead of the `term` query.	2019-10-10 08:49:17 +02:00
Martijn van Groningen	f8ebb75fcf	Reuse OperationRouting#searchShards(...) to select local enrich shard (#47359) The current shard selecting logic selects a random shard copy instead of selecting the local shard copy and, if the local copy is not available, then selecting a random shard copy. The latter is the desired behaviour for enrich. By reusing `OperationRouting#searchShards(...)` we get the desired behaviour and reuse the same logic that the search api is using.	2019-10-09 17:31:43 +02:00
Martijn van Groningen	be0e17770c	required change after merging in 7 dot x branch	2019-10-09 09:16:23 +02:00
James Baiera	b9fb354618	Add retry to force merge operation in EnrichPolicyRunner (#47178 ) Adds a check when running an Enrich policy to make sure that an Enrich index is force merged down to one segment, and if it was not fully merged, attempts the merge again, up to a configurable number of times.	2019-10-08 11:23:02 -04:00
Martijn van Groningen	8b7100eb1f	Don't remove indices to avoid monitoring from intermittently failing to index monitoring docs.	2019-10-08 17:10:42 +02:00
Tal Levy	a17f394e27	Geo-Match Enrich Processor (#47243 ) (#47701 ) this commit introduces a geo-match enrich processor that looks up a specific `geo_point` field in the enrich-index for all entries that have a geo_shape match field that meets some specific relation criteria with the input field. For example, the enrich index may contain documents with zipcodes and their respective geo_shape. Ingesting documents with a geo_point field can be enriched with which zipcode they associate according to which shape they are contained within. this commit also refactors some of the MatchProcessor by moving a lot of the shared code to AbstractEnrichProcessor. Closes #42639.	2019-10-07 15:03:46 -07:00
James Baiera	a66c0dcd95	Add pipeline to ensure unique Enrich index documents (#46348 ) Adds a pipeline that removes ids and routing from documents before indexing them into enrich indices. Enrich documents may come from multiple indices, and thus have id collisions on them. This pipeline ensures that documents with colliding id fields do not clobber one another during the reindex operation while executing an enrich policy.	2019-10-04 12:20:52 -04:00
Michael Basnight	0e1b77568a	Add enable checks to missing enrich plugin methods (#47187 ) Some of the server side objects that do not need to be created unless enrich is enabled were still being created. This commit fixes that.	2019-10-01 12:04:46 -05:00
Martijn van Groningen	fe937ea4b8	Add config namespace in get policy api response (#47162 ) Currently the policy config is placed directly in the json object of the toplevel `policies` array field. For example: ``` { "policies": [ { "match": { "name" : "my-policy", "indices" : ["users"], "match_field" : "email", "enrich_fields" : [ "first_name", "last_name", "city", "zip", "state" ] } } ] } ``` This change adds a `config` field in each policy json object: ``` { "policies": [ { "config": { "match": { "name" : "my-policy", "indices" : ["users"], "match_field" : "email", "enrich_fields" : [ "first_name", "last_name", "city", "zip", "state" ] } } } ] } ``` This allows us in the future to add other information about policies in the get policy api response. The UI will consume this API to build an overview of all policies. The UI may in the future include additional information about a policy and the plan is to include that in the get policy api, so that this information can be gathered in a single api call. An example of the information that is likely to be added is: * Last policy execution time * The status of a policy (executing, executed, unexecuted) * Information about the last failure if exists	2019-09-30 14:37:23 +02:00
Martijn van Groningen	bb3e9cb908	fixed checkstyle violation	2019-09-30 08:42:51 +02:00
Martijn van Groningen	1c3d5b77b5	give monitoring more time	2019-09-30 08:04:29 +02:00
Martijn van Groningen	8a4eefdd83	Expose enrich stats api to monitoring. (#46708) This change also slightly modifies the stats response, so that it can be more easily consumed by monitoring and other users. (coordinator stats are now in a list instead of a map and have an additional field for the node id) Relates to #32789	2019-09-26 11:04:33 +02:00
James Baiera	9967aff714	Add notice to Enrich index mapping metadata (#45996 )	2019-09-24 12:55:11 -04:00
James Baiera	a349b22273	Add the cluster version to enrich policies (#45021 ) Adds the Elasticsearch version as a field on the EnrichPolicy object	2019-09-23 18:44:45 -04:00
Martijn van Groningen	33bbc4798b	fixed compile errors after merging	2019-09-23 09:46:14 +02:00
Michael Basnight	f1c7ed647b	Allow comma separated ids in get enrich policy API (#46351 ) This commit changes the GET REST api so it will accept an optional comma separated list of enrich policy ids. This change also modifies the behavior of the GET API in that it will not error if it is passed a bad enrich id anymore, but will instead just return an empty list.	2019-09-20 10:06:58 -05:00
Martijn van Groningen	a4b0f66919	Add enrich stats api (#46462) The enrich api returns enrich coordinator stats and information about currently executing enrich policies. The coordinator stats include per ingest node: * The current number of search requests in the queue. * The total number of outstanding remote requests that have been executed since node startup. Each remote request is likely to include multiple search requests. This depends on how many search requests are in the queue at the time when the remote request is performed. * The number of current outstanding remote requests. * The total number of search requests that `enrich` processors have executed since node startup. The current execution policies stats include: * The name of the policy that is executing * A full blown task info object for the task that is executing the policy. Relates to #32789	2019-09-11 13:40:24 +02:00
Martijn van Groningen	c79a8e448d	Convert enrich qa modules to use testclusters.	2019-09-11 11:40:18 +02:00
Martijn van Groningen	8a48ef2a06	fixed typo	2019-09-11 09:52:25 +02:00
Martijn van Groningen	ef33a99e6e	Disable default features that are not needed for enrich indices. (#46525 ) Relates to #32789	2019-09-11 09:20:38 +02:00
Michael Basnight	9304f5c889	Ensure enrich executes on master node only (#46448 ) The previous transport action was a read action, which under the right set of circumstances can execute on a coordinating node. This commit ensures that cannot happen.	2019-09-10 09:59:36 -05:00
Martijn van Groningen	ded98e50b7	Change exact match processor to match processor. (#46041) Besides a rename, this change allows the processor to attach multiple enrich docs to the document being ingested. Also in order to control the maximum number of enrich docs to be included in the document being ingested, the `max_matches` setting is added to the enrich processor. Relates #32789	2019-09-04 18:05:12 +02:00
Martijn van Groningen	6bec63fdfa	removed redundant cast	2019-09-04 11:18:31 +02:00
Michael Basnight	51a703da29	Add enrich transport client support (#46002 ) This commit adds an enrich client, as well as a smoke test to validate the client works.	2019-08-29 09:10:07 -05:00
Michael Basnight	a82d24b3ce	Remove enrich indices on delete policy (#45870 ) When a policy is deleted, the enrich indices that are backing the policy alias should also be deleted. This commit does that work and cleans up the transport action a bit so that the lock release is easier to see, as well as to ensure that any action carried out, regardless of exception, unlocks the policy.	2019-08-23 15:26:43 -05:00
Martijn van Groningen	a38e6850a5	fixed errors after cherry-picking 2 commits	2019-08-23 13:51:00 +02:00
Martijn van Groningen	6067065ed6	Decouple enrich processor factory from enrich policy (#45826) This commit changes the enrich processor factory to read the required configuration from the current enrich index (from meta mapping field) in order to create the processor. Before this change the required config was read from the enrich policy in the cluster state. Enrich policies are going to be stored in an index (instead of the cluster state). In a processor factory there isn't a way to load something from an index, so with this change we read the required config / info from the enrich index (which is derived from the enrich policy), which then allows us to move enrich policies to an index. With this change it is required to execute a policy before creating a pipeline. Otherwise there is no enrich index and then there is no way to validate that a policy exists or retrieve its type and match field. Relates to #32789	2019-08-23 13:46:39 +02:00
Martijn van Groningen	cb42e19a32	Change how type is stored in an enrich policy. (#45789 ) A policy type controls how the enrich index is created and the query executed against the match field. Currently there is a single policy type (`exact_match`). In the near future more policy types will be added and different policy may have different configuration options. For this reason type should be a json object instead of a string field: ``` { "exact_match": { ... } } ``` instead of: ``` { "type": "exact_match", ... } ``` This will make streaming parsing of enrich policies easier as in the new format, the parsing code can know ahead what configuration fields to expect. In the latter format that is not possible if the type field appears not as the first field. Relates to #32789	2019-08-23 13:43:38 +02:00
Martijn van Groningen	33972423e9	Enrich processor configuration changes (#45466) Enrich processor configuration changes: * Renamed `enrich_key` option to `field` option. * Replaced `set_from` and `targets` options with `target_field`. The `target_field` option behaves differently from how `set_from` and `targets` worked. The `target_field` is the field that will contain the looked up document. Relates to #32789	2019-08-22 09:49:22 +02:00
Martijn van Groningen	5864f30771	ensure that the items in the bulk response are the same as is in the bulk request	2019-08-21 10:07:02 +02:00
Martijn van Groningen	ac7173c0d4	Renamed CoordinatorProxyAction to EnrichCoordinatorProxyAction and (#45663 ) fail if query shard context needs current time (certain queries / scripts use this, but in the enrich context this is not used).	2019-08-20 18:51:47 +02:00
Michael Basnight	e3373d349b	Consolidate enrich list all and get by name APIs (#45705) The get and list APIs are a single API in this commit. Whether requesting one named policy or all policies, a list of policies is returned. The list API code has all been removed and the GET api is what remains, which contains much of the list response code.	2019-08-20 10:29:59 -05:00
Michael Basnight	db57d2206a	Prevent delete policy for active executing policy (#45472 ) This commit adds a lock to the delete policy, in the same way that the locking is done for policy execution. It also creates a test to exercise the delete transport action, and modifies an existing test to provide a common set of functions for saving and deleting policies.	2019-08-15 10:08:11 -05:00
Michael Basnight	03f45dad57	Fix policy removal bug in delete policy (#45573 ) The delete policy had a subtle bug in that it would still delete the policy if pipelines were accessing it, after giving the client back an error. This commit fixes that and ensures it does not happen by adding verification in the test.	2019-08-15 13:20:59 +02:00
Michael Basnight	fd57d3cb29	Fix test broken by policy rename	2019-08-14 13:57:47 -05:00

99 Commits