[[ignore-malformed]]
=== `ignore_malformed`

Sometimes you don't have much control over the data that you receive. One
user may send a `login` field that is a <<date,`date`>>, and another sends a
`login` field that is an email address.

Trying to index the wrong data type into a field throws an exception by
default, and rejects the whole document. The `ignore_malformed` parameter, if
set to `true`, allows the exception to be ignored. The malformed field is not
indexed, but other fields in the document are processed normally.

For example:

[source,console]
--------------------------------------------------
PUT my-index-000001
{
  "mappings": {
    "properties": {
      "number_one": {
        "type": "integer",
        "ignore_malformed": true
      },
      "number_two": {
        "type": "integer"
      }
    }
  }
}

PUT my-index-000001/_doc/1
{
  "text":       "Some text value",
  "number_one": "foo" <1>
}

PUT my-index-000001/_doc/2
{
  "text":       "Some text value",
  "number_two": "foo" <2>
}
--------------------------------------------------
// TEST[catch:bad_request]

<1> This document will have the `text` field indexed, but not the `number_one` field.
<2> This document will be rejected because `number_two` does not allow malformed values.

The `ignore_malformed` setting is currently supported by the following <<mapping-types,mapping types>>:

<<number>>::     `long`, `integer`, `short`, `byte`, `double`, `float`, `half_float`, `scaled_float`
<<date>>::       `date`
<<date_nanos>>:: `date_nanos`
<<geo-point>>::  `geo_point` for lat/lon points
<<geo-shape>>::  `geo_shape` for complex shapes like polygons
<<ip>>::         `ip` for IPv4 and IPv6 addresses

TIP: The `ignore_malformed` setting value can be updated on
existing fields using the <<indices-put-mapping,PUT mapping API>>.
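
A minimal sketch of such an update, assuming the `my-index-000001` index and the
`number_two` field from the first example on this page. The request repeats the
field's existing `type`, because the field definition is sent as a whole:

[source,console]
--------------------------------------------------
PUT my-index-000001/_mapping
{
  "properties": {
    "number_two": {
      "type": "integer",
      "ignore_malformed": true
    }
  }
}
--------------------------------------------------

Under that assumption, documents indexed after the update with a malformed
`number_two` value should be accepted, with only that field left unindexed.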

[[ignore-malformed-setting]]
==== Index-level default

The `index.mapping.ignore_malformed` setting can be set on the index level to
ignore malformed content globally across all allowed mapping types.
Mapping types that don't support the setting will ignore it if set on the index level.

[source,console]
--------------------------------------------------
PUT my-index-000001
{
  "settings": {
    "index.mapping.ignore_malformed": true <1>
  },
  "mappings": {
    "properties": {
      "number_one": { <1>
        "type": "byte"
      },
      "number_two": {
        "type": "integer",
        "ignore_malformed": false <2>
      }
    }
  }
}
--------------------------------------------------

<1> The `number_one` field inherits the index-level setting.
<2> The `number_two` field overrides the index-level setting to turn off `ignore_malformed`.
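
As an illustration, and assuming the index was created exactly as in the
request above, the two fields would be expected to behave as follows. The
document IDs and values are arbitrary:

[source,console]
--------------------------------------------------
PUT my-index-000001/_doc/1
{
  "number_one": "foo" <1>
}

PUT my-index-000001/_doc/2
{
  "number_two": "foo" <2>
}
--------------------------------------------------

<1> Accepted: `number_one` inherits `ignore_malformed: true` from the index, so the document is indexed without this field.
<2> Rejected: `number_two` sets `ignore_malformed` to `false`, so the malformed value fails the whole document.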

==== Dealing with malformed fields

Malformed fields are silently ignored at indexing time when `ignore_malformed`
is turned on. Whenever possible, keep the number of documents that have a
malformed field small; otherwise, queries on this field become meaningless.
Elasticsearch makes it easy to check how many documents have malformed fields
by using `exists`, `term` or `terms` queries on the special
<<mapping-ignored-field,`_ignored`>> field.
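
A short sketch of both kinds of check, assuming the `my-index-000001` index and
the `number_one` field from the examples on this page. The first request counts
every document in which at least one field was ignored. The second returns the
documents in which `number_one` specifically was ignored:

[source,console]
--------------------------------------------------
GET my-index-000001/_count
{
  "query": {
    "exists": {
      "field": "_ignored"
    }
  }
}

GET my-index-000001/_search
{
  "query": {
    "term": {
      "_ignored": "number_one"
    }
  }
}
--------------------------------------------------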

[[json-object-limits]]
==== Limits for JSON Objects

You can't use `ignore_malformed` with the following data types:

* <<nested, Nested data type>>
* <<object, Object data type>>
* <<range, Range data types>>

You also can't use `ignore_malformed` to ignore JSON objects submitted to fields
of the wrong data type. A JSON object is any data surrounded by curly brackets
`"{}"` and includes data mapped to the nested, object, and range data types.

If you submit a JSON object to an unsupported field, {es} will return an error
and reject the entire document regardless of the `ignore_malformed` setting.
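
For instance, assuming the mapping from the first example on this page, where
`number_one` is an `integer` field with `ignore_malformed` set to `true`, a
request like the following sketch would be expected to fail as a whole rather
than index the `text` field alone:

[source,console]
--------------------------------------------------
PUT my-index-000001/_doc/3
{
  "text": "Some text value",
  "number_one": { <1>
    "value": 1
  }
}
--------------------------------------------------

<1> A JSON object submitted to an `integer` field. The `value` key is a hypothetical name used only for illustration.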