Add initial version of enrich processor docs. (#45084)

Relates to #32789
Martijn van Groningen 2019-08-12 20:36:10 +02:00
parent 4ac25b23f6
commit 04626de6ae
GPG Key ID: AB236F4FCF2AF12A
2 changed files with 467 additions and 0 deletions

@@ -784,6 +784,372 @@ metadata field to provide the error message.
--------------------------------------------------
// NOTCONSOLE
[role="xpack"]
[testenv="basic"]
[[ingest-enriching-data]]
== Enriching data with ingest node
The <<enrich-processor,enrich processor>> enriches documents, prior to indexing, with data from
an enrich index that is managed by an enrich policy.
The data that feeds the enrich index is maintained by the user in regular indices.
An enrich policy is a configuration that describes how an enrich index is created from
the data in those user-maintained indices. When an enrich policy is executed,
a new enrich index is created for that policy, which the `enrich` processor can then use.
An enrich policy also controls what kind of enrichment the `enrich` processor is able to do.
[[enrich-policy-definition]]
=== Enrich Policy Definition
The <<enrich-processor,enrich processor>> requires more than just the configuration in a pipeline.
The main piece to configure is the enrich policy:
[[enrich-policy-options]]
.Enrich policy options
[options="header"]
|======
| Name | Required | Default | Description
| `type` | yes | - | The policy type.
| `indices` | yes | - | The indices to fetch the data from.
| `query` | no | `match_all` query | The query to be used to select which documents are included.
| `enrich_key` | yes | - | The field that the enrich processor will query against.
| `enrich_values` | yes | - | The fields to include in the enrich index.
|======
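
As a sketch of the optional `query` setting, the following policy would only include users from one state in the enrich index. The policy name `la-users-policy` and the `match` query on `state` are made up for illustration, and the `users` index is the one created in <<enrich-processor-getting-started,getting started>>:

[source,js]
--------------------------------------------------
# Hypothetical policy name and query, for illustration only
PUT /_enrich/policy/la-users-policy
{
  "type": "exact_match",
  "indices": "users",
  "query": {
    "match": {
      "state": "LA"
    }
  },
  "enrich_key": "email",
  "enrich_values": ["first_name", "last_name", "city", "state"]
}
--------------------------------------------------
// NOTCONSOLE
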
[[enrich-policy-types]]
==== Policy types
An enrich processor is associated with a policy via the `policy_name` option.
The type of the policy determines what kind of enrichment an `enrich` processor is able to do.
The following policy types are currently supported:
* `exact_match` - Looks up exactly one document and uses its content to enrich the document being ingested.
[[enrich-processor-getting-started]]
=== Getting started
Create a regular index that contains the data you would like to enrich your incoming documents with:
[source,js]
--------------------------------------------------
PUT /users/_doc/1?refresh
{
  "email": "mardy.brown@email.me",
  "first_name": "Mardy",
  "last_name": "Brown",
  "address": "6649 N Blue Gum St",
  "city": "New Orleans",
  "county": "Orleans",
  "state": "LA",
  "zip": 70116,
  "phone1": "504-621-8927",
  "phone2": "504-845-1427",
  "web": "mardy-brown.me"
}
--------------------------------------------------
// CONSOLE
Create an enrich policy:
[source,js]
--------------------------------------------------
PUT /_enrich/policy/users-policy
{
  "type": "exact_match",
  "indices": "users",
  "enrich_key": "email",
  "enrich_values": ["first_name", "last_name", "address", "city", "zip", "state"]
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
Which returns:
[source,js]
--------------------------------------------------
{
  "acknowledged": true
}
--------------------------------------------------
// TESTRESPONSE
Execute that enrich policy:
[source,js]
--------------------------------------------------
POST /_enrich/policy/users-policy/_execute
--------------------------------------------------
// CONSOLE
// TEST[continued]
Which returns:
[source,js]
--------------------------------------------------
{
  "acknowledged": true
}
--------------------------------------------------
// TESTRESPONSE
Create the pipeline and enrich a document:
[source,js]
--------------------------------------------------
PUT _ingest/pipeline/user_lookup
{
  "description" : "Enriching user details to messages",
  "processors" : [
    {
      "enrich" : {
        "policy_name": "users-policy",
        "enrich_key" : "email",
        "targets": ["address", "city", "zip", "state"]
      }
    }
  ]
}

PUT my_index/_doc/my_id?pipeline=user_lookup
{
  "email": "mardy.brown@email.me"
}

GET my_index/_doc/my_id
--------------------------------------------------
// CONSOLE
// TEST[continued]
Which returns:
[source,js]
--------------------------------------------------
{
  "found": true,
  "_index": "my_index",
  "_type": "_doc",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 55,
  "_primary_term": 1,
  "_source": {
    "zip": 70116,
    "address": "6649 N Blue Gum St",
    "city": "New Orleans",
    "state": "LA",
    "email": "mardy.brown@email.me"
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term": 1/"_primary_term" : $body._primary_term/]
//////////////////////////
[source,js]
--------------------------------------------------
DELETE /_enrich/policy/users-policy
--------------------------------------------------
// CONSOLE
// TEST[continued]
//////////////////////////
[[enrich-policy-apis]]
=== Enrich Policy APIs
Several APIs are available to manage and execute enrich policies:
* <<put-policy-api,Put policy API>>
* <<get-policy-api,Get policy API>>
* <<delete-policy-api,Delete policy API>>
* <<execute-policy-api,Execute policy API>>
* <<list-policies-api,List policies API>>

If security is enabled, the user managing enrich policies needs to have
the `enrich_user` built-in role. The user also needs read privileges
for the indices the enrich policy refers to.
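
A minimal sketch of such a setup, assuming the security role and user APIs are available on the cluster. The role name `users_reader`, the user name `enrich_admin`, and the password are placeholders invented for this example:

[source,js]
--------------------------------------------------
# Hypothetical role granting read access to the source index
PUT /_security/role/users_reader
{
  "indices": [
    {
      "names": ["users"],
      "privileges": ["read"]
    }
  ]
}

# Hypothetical user combining the built-in role with the role above
PUT /_security/user/enrich_admin
{
  "password": "change-this-password",
  "roles": ["enrich_user", "users_reader"]
}
--------------------------------------------------
// NOTCONSOLE
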
[[put-policy-api]]
==== Put Policy API
The put policy API stores a policy under the user-specified id in the URL, with
the enrich policy definition as the request body.
Request:
[source,js]
--------------------------------------------------
PUT /_enrich/policy/my-policy
{
  "type": "exact_match",
  "indices": "users",
  "enrich_key": "email",
  "enrich_values": ["first_name", "last_name", "address", "city", "zip", "state"]
}
--------------------------------------------------
// CONSOLE
Response:
[source,js]
--------------------------------------------------
{
  "acknowledged": true
}
--------------------------------------------------
// TESTRESPONSE
[[get-policy-api]]
==== Get Policy API
The get policy API retrieves a policy by id.
Request:
[source,js]
--------------------------------------------------
GET /_enrich/policy/my-policy
--------------------------------------------------
// CONSOLE
// TEST[continued]
Response:
[source,js]
--------------------------------------------------
{
  "type": "exact_match",
  "indices": ["users"],
  "enrich_key": "email",
  "enrich_values": ["first_name", "last_name", "address", "city", "zip", "state"]
}
--------------------------------------------------
// TESTRESPONSE
[[list-policies-api]]
==== List Policies API
The list policies API returns all policies.
Request:
[source,js]
--------------------------------------------------
GET /_enrich/policy
--------------------------------------------------
// CONSOLE
// TEST[continued]
Response:
[source,js]
--------------------------------------------------
{
  "policies": [
    {
      "name" : "my-policy",
      "type" : "exact_match",
      "indices" : ["users"],
      "enrich_key" : "email",
      "enrich_values" : [
        "first_name",
        "last_name",
        "address",
        "city",
        "zip",
        "state"
      ]
    }
  ]
}
--------------------------------------------------
// TESTRESPONSE
[[execute-policy-api]]
==== Execute Policy API
The execute policy API executes the policy with the provided id.
It may take some time before this API returns a response.
Executing a policy involves creating a new enrich index, indexing the documents from
the indices specified in the policy into the enrich index, and some other operations.
Note that this API needs to be executed again in order to incorporate changes made
to the indices the policy points to after the policy was last executed.
This API creates an index matching the `.enrich-*` pattern. The purpose of this index
is to be used by the enrich processor only; it should not be used by anything else.
Old `.enrich-*` indices are removed by an internal cleanup mechanism.
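
For instance, a document added to the source index only becomes available to the enrich processor once the policy has been executed again. A sketch, assuming the `users` index and the `my-policy` policy used in this section; the second user document is invented for illustration:

[source,js]
--------------------------------------------------
# Hypothetical additional user, not part of the original data set
PUT /users/_doc/2?refresh
{
  "email": "jo.smith@email.me",
  "first_name": "Jo",
  "last_name": "Smith",
  "city": "Baton Rouge",
  "state": "LA"
}

# Execute the policy again so that a new enrich index includes the new user
POST /_enrich/policy/my-policy/_execute
--------------------------------------------------
// NOTCONSOLE
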
//////////////////////////
[source,js]
--------------------------------------------------
PUT /users/_doc/1?refresh
{
  "email": "mardy.brown@email.me",
  "first_name": "Mardy",
  "last_name": "Brown",
  "address": "6649 N Blue Gum St",
  "city": "New Orleans",
  "county": "Orleans",
  "state": "LA",
  "zip": 70116,
  "phone1": "504-621-8927",
  "phone2": "504-845-1427",
  "web": "mardy-brown.me"
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
//////////////////////////
Request:
[source,js]
--------------------------------------------------
POST /_enrich/policy/my-policy/_execute
--------------------------------------------------
// CONSOLE
// TEST[continued]
Response:
[source,js]
--------------------------------------------------
{
  "acknowledged": true
}
--------------------------------------------------
// TESTRESPONSE
[[delete-policy-api]]
==== Delete Policy API
The delete policy API removes a policy by id.
Request:
[source,js]
--------------------------------------------------
DELETE /_enrich/policy/my-policy
--------------------------------------------------
// CONSOLE
// TEST[continued]
Response:
[source,js]
--------------------------------------------------
{
  "acknowledged": true
}
--------------------------------------------------
// TESTRESPONSE
[[ingest-processors]]
== Processors
@@ -859,6 +1225,7 @@ include::processors/date-index-name.asciidoc[]
include::processors/dissect.asciidoc[]
include::processors/dot-expand.asciidoc[]
include::processors/drop.asciidoc[]
include::processors/enrich.asciidoc[]
include::processors/fail.asciidoc[]
include::processors/foreach.asciidoc[]
include::processors/geoip.asciidoc[]

@@ -0,0 +1,100 @@
[role="xpack"]
[testenv="basic"]
[[enrich-processor]]
=== Enrich Processor
The `enrich` processor can enrich documents with data from another index.
See the <<ingest-enriching-data,enrich data>> section for more information on how to set this up, and
check out the <<enrich-processor-getting-started,getting started>> section to get familiar with enrich policies and related APIs.
[[enrich-options]]
.Enrich Options
[options="header"]
|======
| Name | Required | Default | Description
| `policy_name` | yes | - | The name of the enrich policy to use.
| `enrich_key` | no | The policy's `enrich_key` | The field whose value is used for the enrich lookup.
| `ignore_missing` | no | `false` | If `true` and `enrich_key` does not exist, the processor quietly exits without modifying the document.
| `override` | no | `true` | Whether the processor overwrites fields that already hold a non-null value. When set to `false`, such fields are not touched.
| `targets` | no 1) | - | The fields from the lookup document that should be added to the document being indexed.
| `set_from` | no 1) | - | Same as `targets`, but allows fields from the lookup document to be added under a different name to the document being indexed.
include::common-options.asciidoc[]
|======
1) Either `targets` or `set_from` must be specified.
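
As a sketch of how these options combine, the following pipeline skips documents that have no `email` field and leaves existing non-null `city` and `state` values untouched. The pipeline name `user_lookup_lenient` is made up for this example, and `users-policy` is assumed to exist and to have been executed:

[source,js]
--------------------------------------------------
# Hypothetical pipeline name, for illustration only
PUT _ingest/pipeline/user_lookup_lenient
{
  "description" : "Enrich messages without overwriting existing fields",
  "processors" : [
    {
      "enrich" : {
        "policy_name": "users-policy",
        "enrich_key" : "email",
        "ignore_missing": true,
        "override": false,
        "targets": ["city", "state"]
      }
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE
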
[[enrich-processor-set-from]]
==== Enrich `set_from` option
Use this option when a field from the lookup document should be placed under
a different field name in the document being ingested.
The `set_from` option accepts an array of objects, each with two fields:
* `source` - The name of the field in the lookup document.
* `target` - The name of the field in the document being ingested that should hold the source field's value.
For example:
//////////////////////////
[source,js]
--------------------------------------------------
PUT /_enrich/policy/users-policy
{
  "type": "exact_match",
  "indices": "users",
  "enrich_key": "email",
  "enrich_values": ["first_name", "last_name", "address", "city", "zip", "state"]
}
--------------------------------------------------
// CONSOLE
// TEST
//////////////////////////
[source,js]
--------------------------------------------------
PUT _ingest/pipeline/user_lookup
{
  "description" : "Enriching user details to messages",
  "processors" : [
    {
      "enrich" : {
        "policy_name": "users-policy",
        "enrich_key" : "email",
        "set_from": [
          {
            "source": "address",
            "target": "address-line-1"
          },
          {
            "source": "city",
            "target": "residence"
          },
          {
            "source": "zip",
            "target": "zipcode"
          },
          {
            "source": "state",
            "target": "us_state"
          }
        ]
      }
    }
  ]
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
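
As a sketch of the effect, assuming `users-policy` has been executed and the `users` index holds the document from <<enrich-processor-getting-started,getting started>>, a document could be indexed through this pipeline as follows. The index name `my_index` and the id `my_id` are placeholders:

[source,js]
--------------------------------------------------
# Placeholder index name and document id
PUT my_index/_doc/my_id?pipeline=user_lookup
{
  "email": "mardy.brown@email.me"
}
--------------------------------------------------
// NOTCONSOLE

Going by the option description above, the stored document would be expected to hold the looked-up values under `address-line-1`, `residence`, `zipcode`, and `us_state` rather than under their original field names.
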
//////////////////////////
[source,js]
--------------------------------------------------
DELETE /_enrich/policy/users-policy
--------------------------------------------------
// CONSOLE
// TEST[continued]
//////////////////////////