mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-02-07 13:38:49 +00:00
This processor uses the Lucene HTMLStripCharFilter class to remove HTML from a field. It complements the char filter by making it possible to store the stripped version as well. Note that the character filter replaces tags with a newline, so the stripped text will differ slightly from the incoming HTML with regard to newlines.
[[htmlstrip-processor]]
=== HTML Strip Processor
Removes HTML tags from a field.

NOTE: Each HTML tag is replaced with a `\n` character.

[[htmlstrip-options]]
.HTML Strip Options
[options="header"]
|======
| Name             | Required  | Default  | Description
| `field`          | yes       | -        | The string-valued field to remove HTML tags from
| `target_field`   | no        | `field`  | The field to assign the stripped value to. By default, `field` is updated in place
| `ignore_missing` | no        | `false`  | If `true` and `field` does not exist, the processor quietly exits without modifying the document
include::common-options.asciidoc[]
|======

[source,js]
--------------------------------------------------
{
  "html_strip": {
    "field": "foo"
  }
}
--------------------------------------------------
// NOTCONSOLE
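
The options in the table above can be combined. The following is a minimal sketch of a pipeline body that keeps the original HTML and writes the stripped text to a separate field. The field names `body_html` and `body_text` and the description text are hypothetical and chosen only for illustration. Because `ignore_missing` is `true`, documents without `body_html` pass through unchanged.

[source,js]
--------------------------------------------------
{
  "description": "Hypothetical pipeline: store a stripped copy of an HTML field",
  "processors": [
    {
      "html_strip": {
        "field": "body_html",
        "target_field": "body_text",
        "ignore_missing": true
      }
    }
  ]
}
--------------------------------------------------
// NOTCONSOLE

Writing to a separate `target_field` leaves the source markup available for display, while the stripped copy can be indexed as plain text.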