OpenSearch/docs/reference/ingest.asciidoc

[[ingest]]
= Ingest node

[partintro]
--
Use an ingest node to pre-process documents before they are indexed.
The ingest node intercepts bulk and index requests, applies transformations, and then
passes the documents back to the index or bulk APIs.

All nodes enable ingest by default, so any node can handle ingest tasks. You can also create
dedicated ingest nodes. To disable ingest for a node, configure the following setting in the
`elasticsearch.yml` file:

[source,yaml]
--------------------------------------------------
node.ingest: false
--------------------------------------------------
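
As a rough sketch of a dedicated ingest node, the following configuration keeps ingest
enabled and turns off the other roles. The `node.master` and `node.data` settings are
assumptions here, chosen to match the legacy boolean style of `node.ingest`; check the
node settings reference for your version before relying on them:

[source,yaml]
--------------------------------------------------
# Assumed legacy role settings: this node only handles ingest tasks
node.master: false
node.data: false
node.ingest: true
--------------------------------------------------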

To pre-process documents before indexing, <<pipeline,define a pipeline>> that specifies a series of
<<ingest-processors,processors>>. Each processor transforms the document in some specific way. For example, a
pipeline might have one processor that removes a field from the document, followed by
another processor that renames a field. The <<cluster-state,cluster state>> then stores
the configured pipelines.
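
A minimal sketch of the remove-then-rename pipeline described above might look like the
following. The pipeline ID `my_cleanup_pipeline` and the field names `debug_info`,
`hostname`, and `host` are hypothetical and chosen only for illustration:

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/my_cleanup_pipeline
{
  "description" : "hypothetical example: remove one field, then rename another",
  "processors" : [
    {
      "remove" : {
        "field": "debug_info"
      }
    },
    {
      "rename" : {
        "field": "hostname",
        "target_field": "host"
      }
    }
  ]
}
--------------------------------------------------

Processors run in the order in which they are listed, so the `remove` processor is
applied before the `rename` processor.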

To use a pipeline, specify the `pipeline` parameter on an index or bulk request. This
tells the ingest node which pipeline to use.

For example, first create a pipeline:

[source,console]
--------------------------------------------------
PUT _ingest/pipeline/my_pipeline_id
{
  "description" : "describe pipeline",
  "processors" : [
    {
      "set" : {
        "field": "foo",
        "value": "new"
      }
    }
  ]
}
--------------------------------------------------

Then index a document with the defined pipeline:

[source,console]
--------------------------------------------------
PUT my-index/_doc/my-id?pipeline=my_pipeline_id
{
  "foo": "bar"
}
--------------------------------------------------
// TEST[continued]

Response:

[source,console-result]
--------------------------------------------------
{
  "_index" : "my-index",
  "_type" : "_doc",
  "_id" : "my-id",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
--------------------------------------------------
// TESTRESPONSE[s/"successful" : 2/"successful" : 1/]
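
The `pipeline` parameter can be passed on a bulk request in the same way. The following
is a sketch that reuses `my_pipeline_id` and `my-index` from the example above; the
document IDs `my-bulk-id-1` and `my-bulk-id-2` are hypothetical:

[source,console]
--------------------------------------------------
POST _bulk?pipeline=my_pipeline_id
{ "index" : { "_index" : "my-index", "_id" : "my-bulk-id-1" } }
{ "foo" : "bar" }
{ "index" : { "_index" : "my-index", "_id" : "my-bulk-id-2" } }
{ "foo" : "baz" }
--------------------------------------------------
// TEST[continued]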

An index may also declare a <<dynamic-index-settings,default pipeline>> that is used when
a request does not specify the `pipeline` parameter.
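
As a sketch, a default pipeline could be declared on the example index by updating its
settings. The setting name `index.default_pipeline` is not spelled out on this page and
is taken here from the dynamic index settings; the index and pipeline names reuse the
example above:

[source,console]
--------------------------------------------------
PUT my-index/_settings
{
  "index.default_pipeline": "my_pipeline_id"
}
--------------------------------------------------
// TEST[continued]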

Finally, an index may declare a <<dynamic-index-settings,final pipeline>>
that is executed after the request or default pipeline, if any.
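
A final pipeline could be declared in the same way. This sketch assumes the setting name
`index.final_pipeline` from the dynamic index settings and reuses `my_pipeline_id` purely
for illustration; in practice the final pipeline would usually be a separate pipeline:

[source,console]
--------------------------------------------------
PUT my-index/_settings
{
  "index.final_pipeline": "my_pipeline_id"
}
--------------------------------------------------
// TEST[continued]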

See <<ingest-apis,Ingest APIs>> for more information about creating, adding, and deleting pipelines.

--

include::ingest/ingest-node.asciidoc[]