mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-02-06 04:58:50 +00:00
This commit enhances the required pipeline functionality by changing it so that default/request pipelines can also be executed, but the required pipeline is always executed last. This gives users the flexibility to execute their own indexing pipelines, but also ensure that any required pipelines are also executed. Since such pipelines are executed last, we change the name of required pipelines to final pipelines.
90 lines
2.5 KiB
Plaintext
90 lines
2.5 KiB
Plaintext
[[ingest]]
|
||
= Ingest node
|
||
|
||
[partintro]
|
||
--
|
||
Use an ingest node to pre-process documents before the actual document indexing happens.
|
||
The ingest node intercepts bulk and index requests, it applies transformations, and it then
|
||
passes the documents back to the index or bulk APIs.
|
||
|
||
All nodes enable ingest by default, so any node can handle ingest tasks. You can also create
|
||
dedicated ingest nodes. To disable ingest for a node, configure the following setting in the
|
||
elasticsearch.yml file:
|
||
|
||
[source,yaml]
|
||
--------------------------------------------------
|
||
node.ingest: false
|
||
--------------------------------------------------
|
||
|
||
To pre-process documents before indexing, <<pipeline,define a pipeline>> that specifies a series of
|
||
<<ingest-processors,processors>>. Each processor transforms the document in some specific way. For example, a
|
||
pipeline might have one processor that removes a field from the document, followed by
|
||
another processor that renames a field. The <<cluster-state,cluster state>> then stores
|
||
the configured pipelines.
|
||
|
||
To use a pipeline, simply specify the `pipeline` parameter on an index or bulk request. This
|
||
way, the ingest node knows which pipeline to use.
|
||
|
||
For example:
|
||
Create a pipeline
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
PUT _ingest/pipeline/my_pipeline_id
|
||
{
|
||
"description" : "describe pipeline",
|
||
"processors" : [
|
||
{
|
||
"set" : {
|
||
"field": "foo",
|
||
"value": "new"
|
||
}
|
||
}
|
||
]
|
||
}
|
||
--------------------------------------------------
|
||
|
||
Index with defined pipeline
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
PUT my-index/_doc/my-id?pipeline=my_pipeline_id
|
||
{
|
||
"foo": "bar"
|
||
}
|
||
--------------------------------------------------
|
||
// TEST[continued]
|
||
|
||
Response:
|
||
|
||
[source,console-result]
|
||
--------------------------------------------------
|
||
{
|
||
"_index" : "my-index",
|
||
"_type" : "_doc",
|
||
"_id" : "my-id",
|
||
"_version" : 1,
|
||
"result" : "created",
|
||
"_shards" : {
|
||
"total" : 2,
|
||
"successful" : 2,
|
||
"failed" : 0
|
||
},
|
||
"_seq_no" : 0,
|
||
"_primary_term" : 1
|
||
}
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/"successful" : 2/"successful" : 1/]
|
||
|
||
An index may also declare a <<dynamic-index-settings,default pipeline>> that will be used in the
|
||
absence of the `pipeline` parameter.
|
||
|
||
Finally, an index may also declare a <<dynamic-index-settings,final pipeline>>
|
||
that will be executed after any request or default pipeline (if any).
|
||
|
||
See <<ingest-apis,Ingest APIs>> for more information about creating, adding, and deleting pipelines.
|
||
|
||
--
|
||
|
||
include::ingest/ingest-node.asciidoc[]
|