[[ingest]]
= Ingest node

[partintro]
--
Use an ingest node to pre-process documents before the actual document indexing happens.
The ingest node intercepts bulk and index requests, applies transformations, and then
passes the documents back to the index or bulk APIs.

All nodes enable ingest by default, so any node can handle ingest tasks. You can also create
dedicated ingest nodes. To disable ingest for a node, configure the following setting in the
`elasticsearch.yml` file:

[source,yaml]
--------------------------------------------------
node.ingest: false
--------------------------------------------------

To pre-process documents before indexing, <<pipeline,define a pipeline>> that specifies a series of
<<ingest-processors,processors>>. Each processor transforms the document in some specific way. For example, a
pipeline might have one processor that removes a field from the document, followed by
another processor that renames a field. The <<cluster-state,cluster state>> then stores
the configured pipelines.
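
As a sketch of the remove-then-rename case described above, the following pipeline drops one field and then renames another. The pipeline ID `remove_then_rename` and the field names `debug_info`, `hostname`, and `host_name` are hypothetical and serve only as illustration:

[source,js]
--------------------------------------------------
PUT _ingest/pipeline/remove_then_rename
{
  "description" : "remove one field, then rename another",
  "processors" : [
    {
      "remove" : {
        "field": "debug_info"
      }
    },
    {
      "rename" : {
        "field": "hostname",
        "target_field": "host_name"
      }
    }
  ]
}
--------------------------------------------------
// CONSOLE

Processors run in the order in which they are listed, so the `remove` step is applied before the `rename` step.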

To use a pipeline, specify the `pipeline` parameter on an index or bulk request. This
way, the ingest node knows which pipeline to use.

For example, first create a pipeline:

[source,js]
--------------------------------------------------
PUT _ingest/pipeline/my_pipeline_id
{
  "description" : "describe pipeline",
  "processors" : [
    {
      "set" : {
        "field": "foo",
        "value": "new"
      }
    }
  ]
}
--------------------------------------------------
// CONSOLE
// TEST

Then index a document using the defined pipeline:

[source,js]
--------------------------------------------------
PUT my-index/_doc/my-id?pipeline=my_pipeline_id
{
  "foo": "bar"
}
--------------------------------------------------
// CONSOLE
// TEST[continued]

Response:

[source,js]
--------------------------------------------------
{
  "_index" : "my-index",
  "_type" : "_doc",
  "_id" : "my-id",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
--------------------------------------------------
// TESTRESPONSE[s/"successful" : 2/"successful" : 1/]
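
The `pipeline` parameter can be passed on a bulk request in the same way. The following is a minimal sketch, assuming that the `my_pipeline_id` pipeline from the earlier example exists and that the cluster accepts typeless bulk actions. The document IDs and field values are made up for illustration:

[source,js]
--------------------------------------------------
POST my-index/_bulk?pipeline=my_pipeline_id
{ "index" : { "_id" : "my-id-2" } }
{ "foo" : "bar" }
{ "index" : { "_id" : "my-id-3" } }
{ "foo" : "baz" }
--------------------------------------------------
// CONSOLE
// TEST[continued]

With this pipeline, the `set` processor should overwrite `foo` with the value `new` in each document before it is indexed.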

An index may also declare a <<dynamic-index-settings,default pipeline>> that is used when a
request does not specify the `pipeline` parameter.
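
A minimal sketch of declaring a default pipeline, assuming the `index.default_pipeline` index setting is available in your version and reusing the `my-index` index and `my_pipeline_id` pipeline from the example above:

[source,js]
--------------------------------------------------
PUT my-index/_settings
{
  "index.default_pipeline" : "my_pipeline_id"
}
--------------------------------------------------
// CONSOLE
// TEST[continued]

With this setting in place, index and bulk requests to `my-index` that omit the `pipeline` parameter are expected to run through `my_pipeline_id`.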

See <<ingest-apis,Ingest APIs>> for more information about creating, updating, and deleting pipelines.

--

include::ingest/ingest-node.asciidoc[]