[DOCS] Update "Enrich your data" tutorials (#46417)

* Move enrich docs to separate file

* Rewrite enrich processor tutorial
James Rodewig 2019-09-09 08:44:56 -04:00 committed by Martijn van Groningen
parent d74d995382
commit a27d075db4
5 changed files with 305 additions and 209 deletions

View File

@ -73,8 +73,7 @@ include::put-enrich-policy.asciidoc[tag=enrich-policy-api-prereqs]
Use the execute enrich policy API
to create the enrich index for an existing enrich policy.
// tag::execute-enrich-policy-desc[]
// tag::execute-enrich-policy-def[]
The *enrich index* contains documents from the policy's source indices.
Enrich indices always begin with `.enrich-*`,
are read-only,
@ -85,20 +84,20 @@ and are <<indices-forcemerge,force merged>>.
Enrich indices should be used by the <<enrich-processor,enrich processor>> only.
Avoid using enrich indices for other purposes.
====
// end::execute-enrich-policy-def[]
// tag::update-enrich-index[]
Once created, you cannot update
or index documents to an enrich index.
Instead, update your source indices
and execute the enrich policy again.
This creates a new enrich index from your updated source indices
and deletes the previous enrich index.
// end::update-enrich-index[]
Because this API request performs several operations,
it may take a while to return a response.
// end::execute-enrich-policy-desc[]
[[sample-api-path-params]]
==== {api-path-parms-title}

View File

@ -63,7 +63,7 @@ If you use {es} {security-features}, you must have:
Use the put enrich policy API
to create a new enrich policy.
// tag::enrich-policy-def
// tag::enrich-policy-def[]
An *enrich policy* is a set of rules the enrich processor uses
to append the appropriate data to incoming documents.
An enrich policy contains:
@ -71,15 +71,15 @@ An enrich policy contains:
* The *policy type*,
which determines how the processor enriches incoming documents
* A list of source indices
* The *match field*, a field used to match incoming documents
* *Enrich fields*, fields appended to incoming documents
* The *match field* used to match incoming documents
* *Enrich fields* appended to incoming documents
from matching documents
// end::enrich-policy-def
// end::enrich-policy-def[]
===== Update an enrich policy
// tag::update-enrich-policy
// tag::update-enrich-policy[]
You cannot update an existing enrich policy.
Instead, you can:
@ -91,7 +91,7 @@ Instead, you can:
. Use the <<delete-enrich-policy-api, delete enrich policy API>>
to delete the previous enrich policy.
// end::update-enrich-policy
// end::update-enrich-policy[]
[[put-enrich-policy-api-path-params]]

View File

@ -0,0 +1,293 @@
[role="xpack"]
[testenv="basic"]
[[ingest-enriching-data]]
== Enrich your data
You can use the <<enrich-processor,enrich processor>>
to append data from existing indices
to incoming documents during ingest.
For example, you can use the enrich processor to:
* Identify web services or vendors based on known IP addresses
* Add product information to retail orders based on product IDs
* Supplement contact information based on an email address
[float]
[[enrich-setup]]
=== Set up an enrich processor
To set up an enrich processor and learn how it works,
follow these steps:
. Check the <<enrich-prereqs, prerequisites>>.
. <<create-enrich-source-index>>.
. <<create-enrich-policy>>.
. <<execute-enrich-policy>>.
. <<add-enrich-processor>>.
. <<ingest-enrich-docs>>.
Once you have an enrich processor set up,
you can <<update-enrich-data,update your enrich data>>
and <<update-enrich-policies, update your enrich policies>>
using the <<enrich-apis,enrich APIs>>.
[IMPORTANT]
====
The enrich processor performs several operations
and may impact the speed of your <<pipeline,ingest pipeline>>.
We strongly recommend testing and benchmarking your enrich processors
before deploying them in production.
We do not recommend using the enrich processor to append real-time data.
The enrich processor works best with reference data
that doesn't change frequently.
====
[float]
[[enrich-prereqs]]
==== Prerequisites
include::{docdir}/ingest/apis/enrich/put-enrich-policy.asciidoc[tag=enrich-policy-api-prereqs]
[float]
[[create-enrich-source-index]]
==== Create a source index
To begin,
create one or more source indices.
A *source index* contains data you want to append to incoming documents.
You can index and manage documents in a source index
like a regular index.
The following <<docs-index_,index API>> request creates the `users` source index
containing user data.
This request also indexes a new document to the `users` source index.
[source,js]
----
PUT /users/_doc/1?refresh
{
  "email": "mardy.brown@asciidocsmith.com",
  "first_name": "Mardy",
  "last_name": "Brown",
  "city": "New Orleans",
  "county": "Orleans",
  "state": "LA",
  "zip": 70116,
  "web": "mardy.asciidocsmith.com"
}
----
// CONSOLE
You can also set up {beats-ref}/getting-started.html[{beats}],
such as {filebeat-ref}/filebeat-getting-started.html[{filebeat}],
to automatically send and index documents
to your source indices.
See {beats-ref}/getting-started.html[Getting started with {beats}].
[float]
[[create-enrich-policy]]
==== Create an enrich policy
Use the <<put-enrich-policy-api, put enrich policy>> API
to create an enrich policy.
include::{docdir}/ingest/apis/enrich/put-enrich-policy.asciidoc[tag=enrich-policy-def]
[source,js]
----
PUT /_enrich/policy/users-policy
{
  "match": {
    "indices": "users",
    "match_field": "email",
    "enrich_fields": ["first_name", "last_name", "city", "zip", "state"]
  }
}
----
// CONSOLE
// TEST[continued]
[float]
[[execute-enrich-policy]]
==== Execute an enrich policy
Use the <<execute-enrich-policy-api, execute enrich policy>> API
to create an enrich index for the policy.
include::apis/enrich/execute-enrich-policy.asciidoc[tag=execute-enrich-policy-def]
The following request executes the `users-policy` enrich policy.
Because this API request performs several operations,
it may take a while to return a response.
[source,js]
----
POST /_enrich/policy/users-policy/_execute
----
// CONSOLE
// TEST[continued]
[float]
[[add-enrich-processor]]
==== Add the enrich processor to an ingest pipeline
Use the <<put-pipeline-api,put pipeline>> API
to create an ingest pipeline.
Include an <<enrich-processor,enrich processor>>
that uses your enrich policy.
When defining an enrich processor,
you must include the following:
* The *field* used to match incoming documents
to documents in the enrich index.
+
This field should be included in incoming documents.
To match, this field must contain the exact
value of the match field of a document in the enrich index.
* The *target field* added to incoming documents.
This field contains all appended enrich data.
The following request adds a new pipeline, `user_lookup`.
This pipeline includes an enrich processor
that uses the `users-policy` enrich policy.
[source,js]
----
PUT /_ingest/pipeline/user_lookup
{
  "description" : "Enriching user details to messages",
  "processors" : [
    {
      "enrich" : {
        "policy_name": "users-policy",
        "field" : "email",
        "target_field": "user"
      }
    }
  ]
}
----
// CONSOLE
// TEST[continued]
You can also add other <<ingest-processors,processors>>
to your ingest pipeline.
You can use these processors to change or drop incoming documents
based on your criteria.
See <<ingest-processors>> for a list of built-in processors.
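As a minimal sketch of combining processors,
the following request creates a hypothetical `user_lookup_strict` pipeline.
It assumes the `users-policy` enrich policy from the previous steps
and assumes the enrich processor leaves the `user` target field unset
when no matching document is found.
The `drop` processor then discards any unmatched incoming documents.
[source,js]
----
# Hypothetical pipeline name, used for illustration only
PUT /_ingest/pipeline/user_lookup_strict
{
  "description": "Enrich user details and drop documents without a match",
  "processors": [
    {
      "enrich": {
        "policy_name": "users-policy",
        "field": "email",
        "target_field": "user"
      }
    },
    {
      "drop": {
        "if": "ctx.user == null"
      }
    }
  ]
}
----
// NOTCONSOLE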
[float]
[[ingest-enrich-docs]]
==== Ingest and enrich documents
Index incoming documents using your ingest pipeline.
Because the enrich policy type is `match`,
the enrich processor matches incoming documents
to documents in the enrich index
based on match field values.
The processor then appends the enrich field data
from any matching document in the enrich index
to the target field of the incoming document.
The enrich processor appends all data to the target field as an array.
If the incoming document matches more than one document in the enrich index,
the processor appends data from those documents to the array.
If the incoming document matches no documents in the enrich index,
the processor appends no data.
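To preview this behavior without indexing a document,
you can try the simulate pipeline API.
The following sketch assumes the `user_lookup` pipeline exists
and the `users-policy` enrich policy has been executed.
The second email address is a made-up value
that is assumed to match no document in the enrich index.
[source,js]
----
# The second email address is a hypothetical value with no expected match
POST /_ingest/pipeline/user_lookup/_simulate
{
  "docs": [
    { "_source": { "email": "mardy.brown@asciidocsmith.com" } },
    { "_source": { "email": "no.match@example.com" } }
  ]
}
----
// NOTCONSOLE
If the processor behaves as described above,
only the first simulated document should contain a `user` array.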
The following <<docs-index_,Index API>> request uses the ingest pipeline
to index a document
containing the `email` field,
the `match_field` specified in the `users-policy` enrich policy.
[source,js]
----
PUT /my_index/_doc/my_id?pipeline=user_lookup
{
  "email": "mardy.brown@asciidocsmith.com"
}
----
// CONSOLE
// TEST[continued]
To verify the enrich processor matched
and appended the appropriate field data,
use the <<docs-get,get>> API to view the indexed document.
[source,js]
----
GET /my_index/_doc/my_id
----
// CONSOLE
// TEST[continued]
The API returns the following response:
[source,js]
----
{
  "found": true,
  "_index": "my_index",
  "_type": "_doc",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 55,
  "_primary_term": 1,
  "_source": {
    "user": [
      {
        "email": "mardy.brown@asciidocsmith.com",
        "first_name": "Mardy",
        "last_name": "Brown",
        "zip": 70116,
        "city": "New Orleans",
        "state": "LA"
      }
    ],
    "email": "mardy.brown@asciidocsmith.com"
  }
}
----
// TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term":1/"_primary_term" : $body._primary_term/]
[float]
[[update-enrich-data]]
=== Update your enrich index
include::{docdir}/ingest/apis/enrich/execute-enrich-policy.asciidoc[tag=update-enrich-index]
If needed, you can <<docs-reindex,reindex>>
or <<docs-update-by-query,update>> any already ingested documents
using your ingest pipeline.
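As a sketch, the following <<docs-update-by-query,update by query>> request
passes every document in `my_index` through the `user_lookup` pipeline again.
It assumes the index and pipeline names used earlier in this tutorial.
How the processor treats a target field that already exists is not covered here,
so consider testing the request on a copy of your data first.
[source,js]
----
# Assumes the my_index index and user_lookup pipeline from the earlier steps
POST /my_index/_update_by_query?pipeline=user_lookup
----
// NOTCONSOLE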
[float]
[[update-enrich-policies]]
=== Update an enrich policy
include::apis/enrich/put-enrich-policy.asciidoc[tag=update-enrich-policy]
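One way to follow this approach is sketched below.
The `users-policy-v2` name and the added `county` enrich field are hypothetical,
and the order of the requests is an assumption:
create and execute the new policy,
point the `user_lookup` pipeline at it,
and then delete the previous policy.
[source,js]
----
# 1. Create the replacement policy (hypothetical name, adds the county field)
PUT /_enrich/policy/users-policy-v2
{
  "match": {
    "indices": "users",
    "match_field": "email",
    "enrich_fields": ["first_name", "last_name", "city", "county", "zip", "state"]
  }
}

# 2. Execute the replacement policy to build its enrich index
POST /_enrich/policy/users-policy-v2/_execute

# 3. Point the existing pipeline at the replacement policy
PUT /_ingest/pipeline/user_lookup
{
  "description": "Enriching user details to messages",
  "processors": [
    {
      "enrich": {
        "policy_name": "users-policy-v2",
        "field": "email",
        "target_field": "user"
      }
    }
  ]
}

# 4. Delete the previous policy once nothing references it
DELETE /_enrich/policy/users-policy
----
// NOTCONSOLE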
////
[source,js]
--------------------------------------------------
DELETE /_ingest/pipeline/user_lookup
DELETE /_enrich/policy/users-policy
--------------------------------------------------
// CONSOLE
// TEST[continued]
////

View File

@ -752,204 +752,8 @@ metadata field to provide the error message.
--------------------------------------------------
// NOTCONSOLE
[role="xpack"]
[testenv="basic"]
[[ingest-enriching-data]]
== Enrich your data using the ingest node
The <<enrich-processor,enrich processor>> allows documents to be enriched with data from
an enrich index that is managed by an enrich policy prior to indexing.
The data that is used by the enrich index is managed by the user in regular indices.
An enrich policy is configuration that indicates how an enrich index is created from
the data in the user's maintained indices. When an enrich policy is executed
a new enrich index is created for that policy, which the enrich process can then use.
An enrich policy also controls what kind of enrichment the `enrich` processor is able to do.
[[enrich-policy-definition]]
=== Enrich Policy Definition
The <<enrich-processor,enrich processor>> requires more than just the configuration in a pipeline.
The main piece to configure is the enrich policy:
[[enrich-policy-options]]
.Enrich policy options
[options="header"]
|======
| Name | Required | Default | Description
| `type` | yes | - | The policy type.
| `indices` | yes | - | The indices to fetch the data from.
| `query` | no | `match_all` query | The query to be used to select which documents are included.
| `match_field` | yes | - | The field that will be used to match against an input document.
| `enrich_fields` | yes | - | The fields that will be available to enrich the input document.
|======
[[enrich-policy-types]]
==== Policy types
An enrich processor is associated with a policy via the `policy_name` option.
The policy type of the policy determines what kind of enrichment an `enrich` processor is able to do.
The following policy types are currently supported:
* `match` - Can lookup documents by running a term query and use the retrieved content to enrich the document being ingested.
[[enrich-processor-getting-started]]
=== Getting started
Create a regular index that contains data you like to enrich your incoming documents with:
[source,js]
--------------------------------------------------
PUT /users/_doc/1?refresh
{
  "email": "mardy.brown@email.me",
  "first_name": "Mardy",
  "last_name": "Brown",
  "address": "6649 N Blue Gum St",
  "city": "New Orleans",
  "county": "Orleans",
  "state": "LA",
  "zip": 70116,
  "phone1": "504-621-8927",
  "phone2": "504-845-1427",
  "web": "mardy-brown.me"
}
--------------------------------------------------
// CONSOLE
Create an enrich policy:
[source,js]
--------------------------------------------------
PUT /_enrich/policy/users-policy
{
  "match": {
    "indices": "users",
    "match_field": "email",
    "enrich_fields": ["first_name", "last_name", "address", "city", "zip", "state"]
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
Which returns:
[source,js]
--------------------------------------------------
{
"acknowledged": true
}
--------------------------------------------------
// TESTRESPONSE
[[execute-enrich-policy]]
Execute that enrich policy:
[source,js]
--------------------------------------------------
POST /_enrich/policy/users-policy/_execute
--------------------------------------------------
// CONSOLE
// TEST[continued]
Which returns:
[source,js]
--------------------------------------------------
{
"acknowledged": true
}
--------------------------------------------------
// TESTRESPONSE
Create the pipeline and enrich a document:
[source,js]
--------------------------------------------------
PUT _ingest/pipeline/user_lookup
{
  "description" : "Enriching user details to messages",
  "processors" : [
    {
      "enrich" : {
        "policy_name": "users-policy",
        "field" : "email",
        "target_field": "user"
      }
    }
  ]
}
PUT my_index/_doc/my_id?pipeline=user_lookup
{
  "email": "mardy.brown@email.me"
}
GET my_index/_doc/my_id
--------------------------------------------------
// CONSOLE
// TEST[continued]
Which returns:
[source,js]
--------------------------------------------------
{
  "found": true,
  "_index": "my_index",
  "_type": "_doc",
  "_id": "my_id",
  "_version": 1,
  "_seq_no": 55,
  "_primary_term": 1,
  "_source": {
    "user": [
      {
        "email": "mardy.brown@email.me",
        "first_name": "Mardy",
        "last_name": "Brown",
        "zip": 70116,
        "address": "6649 N Blue Gum St",
        "city": "New Orleans",
        "state": "LA"
      }
    ],
    "email": "mardy.brown@email.me"
  }
}
--------------------------------------------------
// TESTRESPONSE[s/"_seq_no": \d+/"_seq_no" : $body._seq_no/ s/"_primary_term":1/"_primary_term" : $body._primary_term/]
//////////////////////////
[source,js]
--------------------------------------------------
DELETE /_ingest/pipeline/user_lookup
DELETE /_enrich/policy/users-policy
--------------------------------------------------
// CONSOLE
// TEST[continued]
//////////////////////////
[[enrich-policy-apis]]
=== Enrich Policy APIs
Also there are several APIs in order to manage and execute enrich policies:
* <<put-enrich-policy-api,Put policy api>>.
* <<get-enrich-policy-api,Get enrich policy api>>.
* <<delete-enrich-policy-api,Delete policy api>>.
* <<execute-enrich-policy-api,Execute policy api>>.
If security is enabled then the user managing enrich policies will need to have
the `enrich_user` builtin role. Also the user will need to have read privileges
for the indices the enrich policy is referring to.
include::enrich.asciidoc[]
[[ingest-processors]]

View File

@ -5,7 +5,7 @@
The `enrich` processor can enrich documents with data from another index.
See the <<ingest-enriching-data,enrich data>> section for more information about how to set this up and
check out the <<enrich-processor-getting-started,getting started>> to get familiar with enrich policies and related APIs.
check out the <<ingest-enriching-data,tutorial>> to get familiar with enrich policies and related APIs.
[[enrich-options]]
.Enrich Options