---
layout: default
title: CSV
parent: Ingest processors
nav_order: 40
redirect_from:
- /api-reference/ingest-apis/processors/csv/
---
# CSV processor
The `csv` processor parses comma-separated values (CSV) data from a field and stores each value in a separate field in the document. By default, the processor skips empty fields.
## Syntax
The following is the syntax for the `csv` processor:
```json
{
  "csv": {
    "field": "field_name",
    "target_fields": ["field1", "field2", ...]
  }
}
```
{% include copy-curl.html %}
## Configuration parameters
The following table lists the required and optional parameters for the `csv` processor.
Parameter | Required/Optional | Description |
|-----------|-----------|-----------|
`field` | Required | The name of the field containing the CSV data to be parsed. Supports template snippets. |
`target_fields` | Required | The names of the fields in which to store the parsed values. |
`description` | Optional | A brief description of the processor. |
`empty_value` | Optional | The value used to fill empty fields. If not specified, empty fields are skipped. |
`if` | Optional | A condition for running the processor. |
`ignore_failure` | Optional | Specifies whether the processor continues execution even if it encounters errors. If set to `true`, failures are ignored. Default is `false`. |
`ignore_missing` | Optional | Specifies whether the processor should ignore documents that do not contain the specified field. If set to `true`, the processor does not modify the document if the field does not exist or is `null`. Default is `false`. |
`on_failure` | Optional | A list of processors to run if the processor fails. |
`quote` | Optional | The character used to quote fields in the CSV data. Default is `"`. |
`separator` | Optional | The delimiter used to separate the fields in the CSV data. Default is `,`. |
`tag` | Optional | An identifier tag for the processor. Useful for debugging in order to distinguish between processors of the same type. |
`trim` | Optional | If set to `true`, the processor trims white space from the beginning and end of the text. Default is `false`. |
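
As a hedged illustration of the optional parameters, the following request simulates an inline pipeline that sets `separator` and `empty_value`. The sample value `25;;10` and the fill value `0` are assumptions made for this sketch. Assuming `empty_value` supplies the value for empty fields, the empty second value should be stored in `memory_usage` as `0`; without `empty_value`, `memory_usage` would be omitted from the document:

```json
POST _ingest/pipeline/_simulate
{
  "pipeline": {
    "description": "Sketch: semicolon-separated values with a fill value for empty fields",
    "processors": [
      {
        "csv": {
          "field": "resource_usage",
          "target_fields": ["cpu_usage", "memory_usage", "disk_usage"],
          "separator": ";",
          "empty_value": "0"
        }
      }
    ]
  },
  "docs": [
    {
      "_source": {
        "resource_usage": "25;;10"
      }
    }
  ]
}
```
{% include copy-curl.html %}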
## Using the processor
Follow these steps to use the processor in a pipeline.
**Step 1: Create a pipeline**
The following query creates a pipeline named `csv-processor` that splits `resource_usage` into three new fields named `cpu_usage`, `memory_usage`, and `disk_usage`:
```json
PUT _ingest/pipeline/csv-processor
{
  "description": "Split resource usage into individual fields",
  "processors": [
    {
      "csv": {
        "field": "resource_usage",
        "target_fields": ["cpu_usage", "memory_usage", "disk_usage"],
        "separator": ","
      }
    }
  ]
}
```
{% include copy-curl.html %}
**Step 2 (Optional): Test the pipeline**
It is recommended that you test your pipeline before you ingest documents.
{: .tip}
To test the pipeline, run the following query:
```json
POST _ingest/pipeline/csv-processor/_simulate
{
  "docs": [
    {
      "_index": "testindex1",
      "_id": "1",
      "_source": {
        "resource_usage": "25,4096,10"
      }
    }
  ]
}
```
{% include copy-curl.html %}
**Response**
The following example response confirms that the pipeline is working as expected:
```json
{
  "docs": [
    {
      "doc": {
        "_index": "testindex1",
        "_id": "1",
        "_source": {
          "memory_usage": "4096",
          "disk_usage": "10",
          "resource_usage": "25,4096,10",
          "cpu_usage": "25"
        },
        "_ingest": {
          "timestamp": "2023-08-22T16:40:45.024796379Z"
        }
      }
    }
  ]
}
```
**Step 3: Ingest a document**
The following query ingests a document into an index named `testindex1`:
```json
PUT testindex1/_doc/1?pipeline=csv-processor
{
  "resource_usage": "25,4096,10"
}
```
{% include copy-curl.html %}
**Step 4 (Optional): Retrieve the document**
To retrieve the document, run the following query:
```json
GET testindex1/_doc/1
```
{% include copy-curl.html %}
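
If the pipeline ran as it did in the simulation, the response should resemble the following sketch. The `_source` fields mirror the simulated result. The metadata values (`_version`, `_seq_no`, and `_primary_term`) are assumptions for a newly created index and will likely differ in your cluster:

```json
{
  "_index": "testindex1",
  "_id": "1",
  "_version": 1,
  "_seq_no": 0,
  "_primary_term": 1,
  "found": true,
  "_source": {
    "resource_usage": "25,4096,10",
    "cpu_usage": "25",
    "memory_usage": "4096",
    "disk_usage": "10"
  }
}
```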