OpenSearch/docs/reference/ingest/processors/html_strip.asciidoc
Alexander Reelsen 8e33a5292a Add HTML strip processor (#41888)
This processor uses the Lucene HTMLStripCharFilter class to remove HTML
entities from a field. This adds to the char filter, so that it is
possible to store the stripped version as well.

Note that the character filter replaces tags with a newline, so the
produced output will look slightly different from the incoming HTML
with regard to newlines.
2019-05-09 13:01:07 +02:00


[[htmlstrip-processor]]
=== HTML Strip Processor
Removes HTML tags from a field.

NOTE: Each HTML tag is replaced with a `\n` character.

[[htmlstrip-options]]
.HTML Strip Options
[options="header"]
|======
| Name | Required | Default | Description
| `field` | yes | - | The string-valued field to remove HTML tags from
| `target_field` | no | `field` | The field to assign the stripped value to. By default, `field` is updated in place
| `ignore_missing` | no | `false` | If `true` and `field` does not exist, the processor quietly exits without modifying the document
include::common-options.asciidoc[]
|======
[source,js]
--------------------------------------------------
{
  "html_strip": {
    "field": "foo"
  }
}
--------------------------------------------------
// NOTCONSOLE
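
The optional settings from the table above can be combined in a pipeline definition. The following is a minimal sketch, not a tested configuration: the field names `body_html` and `body_text` are hypothetical and chosen only for illustration. It keeps the original markup in `body_html`, writes the stripped text to `body_text`, and skips documents that have no `body_html` field.

[source,js]
--------------------------------------------------
{
  "description": "Illustrative pipeline; field names are assumptions",
  "processors": [
    {
      "html_strip": {
        "field": "body_html",
        "target_field": "body_text",
        "ignore_missing": true
      }
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE

Writing to a separate `target_field` leaves the source field unchanged, which can be useful when both the original HTML and the stripped text need to be stored.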