Commit Graph

93 Commits

Author SHA1 Message Date
Jay Modi 3edadfefd0 RestHandlers declare handled routes (#52123)
This commit changes how RestHandlers are registered with the
RestController so that a RestHandler no longer needs to register itself
with the RestController. Instead the RestHandler interface has new
methods which when called provide information about the routes
(method and path combinations) that are handled by the handler
including any deprecated and/or replaced combinations.

This change also makes the publication of RestHandlers safe since they
no longer publish a reference to themselves within their constructors.

Closes #51622

Co-authored-by: Jason Tedor <jason@tedor.me>

Backport of #51950
2020-02-09 22:48:32 -07:00
Rory Hunter d8bd736f8a
Formatting: keep simple if / else on the same line (#51544)
Backport of #51526.

Previous the formatter was breaking simple if/else statements (i.e.
without braces) onto separate lines, which could be fragile because the
formatter cannot also introduce braces. Instead, keep such expressions
on the same line.
2020-01-29 10:42:04 +00:00
Gordon Brown 89c2834b24
Deprecate creation of dot-prefixed index names except for hidden and system indices (#49959)
This commit deprecates the creation of dot-prefixed index names (e.g.
.watches) unless they are either 1) a hidden index, or 2) registered by
a plugin that extends SystemIndexPlugin. This is the first step
towards more thorough protections for system indices.

This commit also modifies several plugins which use dot-prefixed indices
to register indices they own as system indices, and adds a plugin to
register .tasks as a system index.
2020-01-28 10:01:16 -07:00
Rory Hunter 2f069d8f3f
Tweak formatter config for long generic lines (#51027)
Backport of #50909. The current formatting config allows some long
generic declarations to break the 140 character limit. Tweak the config
to wrap such lines.
2020-01-15 13:17:37 +00:00
Martijn van Groningen e76c3d4d32
Tidy up enrich processors: (#50957)
* Fix generics usages.
* Sealed match processor class.
2020-01-15 08:51:22 +01:00
Martijn van Groningen 2079f1cbeb
Backport: Fix ingest simulate response document order if processor executes async (#50269)
Backport #50244 to 7.x branch.

If a processor executes asynchronously and the ingest simulate api simulates with
multiple documents then the order of the documents in the response may not match
the order of the documents in the request.

Alexander Reelsen discovered this issue with the enrich processor with the following reproduction:

```
PUT cities/_doc/munich
{"zip":"80331","city":"Munich"}

PUT cities/_doc/berlin
{"zip":"10965","city":"Berlin"}

PUT /_enrich/policy/zip-policy
{
  "match": {
    "indices": "cities",
    "match_field": "zip",
    "enrich_fields": [ "city" ]
  }
}

POST /_enrich/policy/zip-policy/_execute

GET _cat/indices/.enrich-*

POST /_ingest/pipeline/_simulate
{
  "pipeline": {
  "processors" : [
    {
      "enrich" : {
        "policy_name": "zip-policy",
        "field" : "zip",
        "target_field": "city",
        "max_matches": "1"
      }
    }
  ]
  },
  "docs": [
    { "_id": "first", "_source" : { "zip" : "80331" } } ,
    { "_id": "second", "_source" : { "zip" : "50667" } }
  ]
}
```

* fixed test compile error
2019-12-17 12:27:07 +01:00
Martijn van Groningen 09c4269097
Add templating support to enrich processor (#49093)
Adds support for templating to `field` and `target_field` options.
2019-11-27 08:53:11 +01:00
James Baiera 6bb6adb8d3
Reuse collected cluster state in EnrichPolicyRunner (#48488) (#49100)
The cluster state is obtained twice in the EnrichPolicyRunner when updating 
the final alias. There is a possibility for the state to be slightly different 
between those two calls. This PR just has the function get the cluster state 
once and reuse it for the life of the function call.
2019-11-14 14:14:39 -05:00
Martijn van Groningen 18d5d73305
Enable spotless for enrich gradle project in 7 dot x branch. (#48976)
Backport of #48908

The enrich project doesn't have much history as all the other gradle projects,
so it makes sense to enable spotless for this gradle project.
2019-11-12 13:22:34 +01:00
Martijn van Groningen bbe50eca72
Fail with a better error when if there are no ingest nodes (#48272)
when executing enrich execute policy api.
2019-10-22 07:42:04 +02:00
Martijn van Groningen 0ec0ab64c9
Fix executing enrich policies stats (#48132)
The enrich stats api picked the wrong task to be displayed
in the executing stats section.

In case `wait_for_completion` was set to `false` then no task
was being displayed and if that param was set to `true` then
the wrong task was being displayed (transport action task instead
of enrich policy executor task).

Testing executing policies in enrich stats api is tricky.
I have verified locally that this commit fixes the bug.
2019-10-22 07:41:56 +02:00
Martijn van Groningen c09b62d5bf
Backport: also validate source index at put enrich policy time (#48311)
Backport of: #48254

This changes tests to create a valid
source index prior to creating the enrich policy.
2019-10-22 07:38:16 +02:00
James Baiera 0d12ef8958
Add Enrich Origin (#48098) (#48312)
This PR adds an origin for the Enrich feature, and modifies the background 
maintenance task to use the origin when executing client operations. 
Without this fix, the maintenance task fails to execute when security is 
enabled.
2019-10-21 16:40:49 -04:00
Martijn van Groningen 844825a13f
Validate policy type when storing an enrich policy (#48126) 2019-10-18 16:26:48 +02:00
Martijn van Groningen a5fe69c344
Include enrich into the info api as feature (#48157)
This commit also fixes a bug, the enrich enabled setting
was not included in the list of settings.

Backport of #48109
2019-10-17 09:51:32 +02:00
Martijn van Groningen 77164e9017
adjusted minimal supported version 2019-10-15 07:45:00 +02:00
Martijn van Groningen c4b1a3045a
Fixed test, take into account that Map can be the result if max_matches is 1. 2019-10-15 07:03:01 +02:00
James Baiera 18d7e32b7d Add wait for completion for Enrich policy execution (#47886)
This PR adds the ability to run the enrich policy execution task in the background,
returning a task id instead of waiting for the completed operation.
2019-10-14 16:05:28 -04:00
Martijn van Groningen 7fc9198d46
Change how `max_matches` affects `target_field` option. (#47982)
Prior to this change the `target_field` would always be a json array
field in the document being ingested. This to take into account that
multiple enrich documents could be inserted into the `target_field`.

However the default `max_matches` is `1`. Meaning that by default
only a single enrich document would be added to `target_field` json
array field.

This commit changes this; if `max_matches` is set to `1` then the single
document would be added as a json object to the `target_field` and
if it is configured to a higher value then the enrich documents will be
added as a json array (even if a single enrich document happens to be
enriched).
2019-10-14 21:09:48 +02:00
James Baiera 73263c654a Add basic task support for executing enrich policies (#47523)
Changes the execution logic to create a new task using the execute request,
and attaches the new task to the policy runner to be updated. Also, a new
response is now returned from the execute api, which contains either the task
id of the execution, or the completed status of the run. The fields are mutually
exclusive to make it easier to discern what type of response it is.
2019-10-11 13:32:06 -04:00
Martijn van Groningen aace42d38d
Add HLRC support for enrich stats API (#47306)
This PR also includes HLRC docs for the enrich stats api.

Relates to #32789
2019-10-10 09:08:29 +02:00
Martijn van Groningen 19393fc5a7
match processor should handler values other than string properly (#47419)
Currently if the document being ingested contains another field value
than a string then the processor fails with an error.

This commit changes the match processor to handle number values
and array values correctly.

If a json array is detected then the `terms` query is used instead
of the `term` query.
2019-10-10 08:49:17 +02:00
Martijn van Groningen f8ebb75fcf
Reuse OperationRouting#searchShards(...) to select local enrich shard (#47359)
The currently logic shard selecting logic selects a random shard copy
instead of selecting the local shard copy and if local copy is not
available then selecting a random shard copy. The latter is desired
behaviour for enrich.

By reusing `OperationRouting#searchShards(...)` we get the desired
behaviour and reuse the same logic that the search api is using.
2019-10-09 17:31:43 +02:00
James Baiera b9fb354618 Add retry to force merge operation in EnrichPolicyRunner (#47178)
Adds a check when running an Enrich policy to make sure that an Enrich index
is force merged down to one segment, and if it was not fully merged, attempts
the merge again, up to a configurable number of times.
2019-10-08 11:23:02 -04:00
Tal Levy a17f394e27
Geo-Match Enrich Processor (#47243) (#47701)
this commit introduces a geo-match enrich processor that looks up a specific
`geo_point` field in the enrich-index for all entries that have a geo_shape match field
that meets some specific relation criteria with the input field.

For example, the enrich index may contain documents with zipcodes and their respective
geo_shape. Ingesting documents with a geo_point field can be enriched with which zipcode
they associate according to which shape they are contained within.

this commit also refactors some of the MatchProcessor by moving a lot of the shared code to
AbstractEnrichProcessor.

Closes #42639.
2019-10-07 15:03:46 -07:00
James Baiera a66c0dcd95 Add pipeline to ensure unique Enrich index documents (#46348)
Adds a pipeline that removes ids and routing from documents before indexing
them into enrich indices. Enrich documents may come from multiple indices,
and thus have id collisions on them. This pipeline ensures that documents
with colliding id fields do not clobber one another during the reindex operation
while executing an enrich policy.
2019-10-04 12:20:52 -04:00
Michael Basnight 0e1b77568a Add enable checks to missing enrich plugin methods (#47187)
Some of the server side objects that do not need to be created unless
enrich is enabled were still being created. This commit fixes that.
2019-10-01 12:04:46 -05:00
Martijn van Groningen fe937ea4b8
Add config namespace in get policy api response (#47162)
Currently the policy config is placed directly in the json object
of the toplevel `policies` array field. For example:

```
{
    "policies": [
        {
            "match": {
                "name" : "my-policy",
                "indices" : ["users"],
                "match_field" : "email",
                "enrich_fields" : [
                    "first_name",
                    "last_name",
                    "city",
                    "zip",
                    "state"
                ]
            }
        }
    ]
}
```

This change adds a `config` field in each policy json object:

```
{
    "policies": [
        {
            "config": {
                "match": {
                    "name" : "my-policy",
                    "indices" : ["users"],
                    "match_field" : "email",
                    "enrich_fields" : [
                        "first_name",
                        "last_name",
                        "city",
                        "zip",
                        "state"
                    ]
                }
            }
        }
    ]
}
```

This allows us in the future to add other information about policies
in the get policy api response.

The UI will consume this API to build an overview of all policies.
The UI may in the future include additional information about a policy
and the plan is to include that in the get policy api, so that this
information can be gathered in a single api call.

An example of the information that is likely to be added is:
* Last policy execution time
* The status of a policy (executing, executed, unexecuted)
* Information about the last failure if exists
2019-09-30 14:37:23 +02:00
Martijn van Groningen 8a4eefdd83
Expose enrich stats api to monitoring. (#46708)
This change also slightly modifies the stats response,
so that is can easier consumer by monitoring and other
users. (coordinators stats are now in a list instead of
a map and has an additional field for the node id)

Relates to #32789
2019-09-26 11:04:33 +02:00
James Baiera 9967aff714 Add notice to Enrich index mapping metadata (#45996) 2019-09-24 12:55:11 -04:00
James Baiera a349b22273 Add the cluster version to enrich policies (#45021)
Adds the Elasticsearch version as a field on the EnrichPolicy object
2019-09-23 18:44:45 -04:00
Martijn van Groningen 33bbc4798b
fixed compile errors after merging 2019-09-23 09:46:14 +02:00
Michael Basnight f1c7ed647b Allow comma separated ids in get enrich policy API (#46351)
This commit changes the GET REST api so it will accept an optional comma
separated list of enrich policy ids. This change also modifies the
behavior of the GET API in that it will not error if it is passed a bad
enrich id anymore, but will instead just return an empty list.
2019-09-20 10:06:58 -05:00
Martijn van Groningen a4b0f66919
Add enrich stats api (#46462)
The enrich api returns enrich coordinator stats and
information about currently executing enrich policies.

The coordinator stats include per ingest node:
* The current number of search requests in the queue.
* The total number of outstanding remote requests that
  have been executed since node startup. Each remote
  request is likely to include multiple search requests.
  This depends on how much search requests are in the
  queue at the time when the remote request is performed.
* The number of current outstanding remote requests.
* The total number of search requests that `enrich`
  processors have executed since node startup.

The current execution policies stats include:
* The name of policy that is executing
* A full blow task info object that is executing the policy.

Relates to #32789
2019-09-11 13:40:24 +02:00
Martijn van Groningen 8a48ef2a06
fixed typo 2019-09-11 09:52:25 +02:00
Martijn van Groningen ef33a99e6e
Disable default features that are not needed for enrich indices. (#46525)
Relates to #32789
2019-09-11 09:20:38 +02:00
Michael Basnight 9304f5c889 Ensure enrich executes on master node only (#46448)
The previous transport action was a read action, which under the right
set of circumstances can execute on a coordinating node. This commit
ensures that cannot happen.
2019-09-10 09:59:36 -05:00
Martijn van Groningen ded98e50b7
Change exact match processor to match processor. (#46041)
Besides a rename, this changes allows to processor to attach multiple
enrich docs to the document being ingested.

Also in order to control the maximum number of enrich docs to be
included in the document being ingested, the `max_matches` setting
is added to the enrich processor.

Relates #32789
2019-09-04 18:05:12 +02:00
Martijn van Groningen 6bec63fdfa
removed redundant cast 2019-09-04 11:18:31 +02:00
Michael Basnight 51a703da29
Add enrich transport client support (#46002)
This commit adds an enrich client, as well as a smoke test to validate
the client works.
2019-08-29 09:10:07 -05:00
Michael Basnight a82d24b3ce Remove enrich indices on delete policy (#45870)
When a policy is deleted, the enrich indices that are backing the policy
alias should also be deleted. This commit does that work and cleans up
the transport action a bit so that the lock release is easier to see, as
well as to ensure that any action carried out, regardless of exception,
unlocks the policy.
2019-08-23 15:26:43 -05:00
Martijn van Groningen a38e6850a5
fixed errors after cherry-picking 2 commits 2019-08-23 13:51:00 +02:00
Martijn van Groningen 6067065ed6
Decouple enrich processor factory from enrich policy (#45826)
This commit changes the enrich processor factory to read the required
configuration from the current enrich index (from meta mapping field)
in order to create the processor.

Before this change the required config was read from the enrich policy
in the cluster state. Enrich policies are going to be stored in an
index (instead of the cluster state). In a processor factory there isn't
a way to load something from an index, so with this change we read
the required config / info from the enrich index (which is derived
from the enrich policy), which then allows us to move enrich policies
to an index.

With this change it is required to execute a policy before creating a
pipeline. Otherwise there is no enrich index and then there is no way
to validate that a policy exist or retrieve its type and match field.

Relates to #32789
2019-08-23 13:46:39 +02:00
Martijn van Groningen 33972423e9
Enrich processor configuration changes (#45466)
Enrich processor configuration changes:
* Renamed `enrich_key` option to `field` option.
* Replaced `set_from` and `targets` options with `target_field`.

The `target_field` option behaves different to how `set_from` and
`targets` worked. The `target_field` is the field that will contain
the looked up document.

Relates to #32789
2019-08-22 09:49:22 +02:00
Martijn van Groningen 5864f30771
ensure that the items in the bulk response are the same as is in the bulk request 2019-08-21 10:07:02 +02:00
Martijn van Groningen ac7173c0d4
Renamed CoordinatorProxyAction to EnrichCoordinatorProxyAction and (#45663)
fail if query shard context needs current time (certain queries / scripts
use this, but in the enrich context this is not used).
2019-08-20 18:51:47 +02:00
Michael Basnight e3373d349b Consolidate enrich list all and get by name APIs (#45705)
The get and list APIs are a single API in this commit. Whether
requesting one named policy or all policies, a list of policies is
returened. The list API code has all been removed and the GET api is
what remains, which contains much of the list response code.
2019-08-20 10:29:59 -05:00
Michael Basnight db57d2206a Prevent delete policy for active executing policy (#45472)
This commit adds a lock to the delete policy, in the same way that the
locking is done for policy execution. It also creates a test to exercise
the delete transport action, and modifies an existing test to provide a
common set of functions for saving and deleting policies.
2019-08-15 10:08:11 -05:00
Michael Basnight 03f45dad57
Fix policy removal bug in delete policy (#45573)
The delete policy had a subtle bug in that it would still delete the
policy if pipelines were accessing it, after giving the client back an
error. This commit fixes that and ensures it does not happen by adding
verification in the test.
2019-08-15 13:20:59 +02:00
Michael Basnight 52a094b177 Fail delete policy if pipeline exists (#44438)
If a pipeline that refrences the policy exists, we should not allow the
policy to be deleted. The user will need to remove the processor from
the pipeline before deleting the policy. This commit adds a check to
ensure that the policy cannot be deleted if it is referenced by any
pipeline in the system.
2019-08-14 13:51:10 -05:00