mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-03-30 11:58:36 +00:00
* Changes for #52239.
* Incorporating review feedback from Julie T. Also single-sourcing nested options in the Mapping page and referencing them in the Nested page.
* Moving tip after the introduction and clarifying limits.
* Update docs/reference/mapping.asciidoc
  Co-authored-by: James Rodewig <james.rodewig@elastic.co>
* Update docs/reference/mapping/types/nested.asciidoc
  Co-authored-by: James Rodewig <james.rodewig@elastic.co>
This commit is contained in:
parent
771ddbf083
commit
cbd35e9a2b
@ -1,9 +1,8 @@
|
|||||||
[[kv-processor]]
|
[[kv-processor]]
|
||||||
=== KV Processor
|
=== KV Processor
|
||||||
This processor helps automatically parse messages (or specific event fields) which are of the foo=bar variety.
|
This processor helps automatically parse messages (or specific event fields) which are of the `foo=bar` variety.
|
||||||
|
|
||||||
For example, if you have a log message which contains `ip=1.2.3.4 error=REFUSED`, you can parse those automatically by configuring:
|
|
||||||
|
|
||||||
|
For example, if you have a log message which contains `ip=1.2.3.4 error=REFUSED`, you can parse those fields automatically by configuring:
|
||||||
|
|
||||||
[source,js]
|
[source,js]
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
@ -17,8 +16,10 @@ For example, if you have a log message which contains `ip=1.2.3.4 error=REFUSED`
|
|||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
// NOTCONSOLE
|
// NOTCONSOLE
|
||||||
|
|
||||||
|
TIP: Using the KV Processor can result in field names that you cannot control. Consider using the <<flattened>> datatype instead, which maps an entire object as a single field and allows for simple searches over its contents.
|
||||||
|
|
||||||
[[kv-options]]
|
[[kv-options]]
|
||||||
.Kv Options
|
.KV Options
|
||||||
[options="header"]
|
[options="header"]
|
||||||
|======
|
|======
|
||||||
| Name | Required | Default | Description
|
| Name | Required | Default | Description
|
||||||
|
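As a rough illustration of the `foo=bar` parsing described in the KV Processor hunk above, the following Python sketch splits a message into key-value pairs. It is not the processor's implementation: `parse_kv` is a hypothetical helper, and its `field_split` and `value_split` parameters are names chosen for this sketch to stand for the pair delimiter and the key/value delimiter.

```python
def parse_kv(message, field_split=" ", value_split="="):
    """Split a message into a dict of key-value pairs (illustrative only)."""
    result = {}
    for pair in message.split(field_split):
        if value_split not in pair:
            continue  # skip tokens that are not key=value pairs
        # Split on the first delimiter only, so values may contain it.
        key, value = pair.split(value_split, 1)
        result[key] = value
    return result


print(parse_kv("ip=1.2.3.4 error=REFUSED"))
# → {'ip': '1.2.3.4', 'error': 'REFUSED'}
```

Every key in the message becomes a field name, which is why the TIP added in this hunk warns that the resulting field names are outside your control.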
@ -17,7 +17,7 @@ A mapping definition has:
|
|||||||
|
|
||||||
<<mapping-fields,Meta-fields>>::
|
<<mapping-fields,Meta-fields>>::
|
||||||
|
|
||||||
Meta-fields are used to customize how a document's metadata associated is
|
Meta-fields are used to customize how a document's associated metadata is
|
||||||
treated. Examples of meta-fields include the document's
|
treated. Examples of meta-fields include the document's
|
||||||
<<mapping-index-field,`_index`>>, <<mapping-id-field,`_id`>>, and
|
<<mapping-index-field,`_index`>>, <<mapping-id-field,`_id`>>, and
|
||||||
<<mapping-source-field,`_source`>> fields.
|
<<mapping-source-field,`_source`>> fields.
|
||||||
@ -58,17 +58,16 @@ via the <<multi-fields>> parameter.
|
|||||||
[float]
|
[float]
|
||||||
=== Settings to prevent mappings explosion
|
=== Settings to prevent mappings explosion
|
||||||
|
|
||||||
Defining too many fields in an index is a condition that can lead to a
|
Defining too many fields in an index can lead to a
|
||||||
mapping explosion, which can cause out of memory errors and difficult
|
mapping explosion, which can cause out of memory errors and difficult
|
||||||
situations to recover from. This problem may be more common than expected.
|
situations to recover from.
|
||||||
As an example, consider a situation in which every new document inserted
|
|
||||||
introduces new fields. This is quite common with dynamic mappings.
|
Consider a situation where every new document inserted
|
||||||
Every time a document contains new fields, those will end up in the index's
|
introduces new fields, such as with <<dynamic-mapping,dynamic mapping>>.
|
||||||
mappings. This isn't worrying for a small amount of data, but it can become a
|
Each new field is added to the index mapping, which can become a
|
||||||
problem as the mapping grows.
|
problem as the mapping grows.
|
||||||
The following settings allow you to limit the number of field mappings that
|
|
||||||
can be created manually or dynamically, in order to prevent bad documents from
|
Use the following settings to limit the number of field mappings (created manually or dynamically) and prevent documents from causing a mapping explosion:
|
||||||
causing a mapping explosion:
|
|
||||||
|
|
||||||
`index.mapping.total_fields.limit`::
|
`index.mapping.total_fields.limit`::
|
||||||
The maximum number of fields in an index. Field and object mappings, as well as
|
The maximum number of fields in an index. Field and object mappings, as well as
|
||||||
@ -84,26 +83,37 @@ If you increase this setting, we recommend you also increase the
|
|||||||
<<search-settings,`indices.query.bool.max_clause_count`>> setting, which
|
<<search-settings,`indices.query.bool.max_clause_count`>> setting, which
|
||||||
limits the maximum number of <<query-dsl-bool-query,boolean clauses>> in a query.
|
limits the maximum number of <<query-dsl-bool-query,boolean clauses>> in a query.
|
||||||
====
|
====
|
||||||
|
+
|
||||||
|
[TIP]
|
||||||
|
====
|
||||||
|
If your field mappings contain a large, arbitrary set of keys, consider using the <<flattened,flattened>> datatype.
|
||||||
|
====
|
||||||
|
|
||||||
`index.mapping.depth.limit`::
|
`index.mapping.depth.limit`::
|
||||||
The maximum depth for a field, which is measured as the number of inner
|
The maximum depth for a field, which is measured as the number of inner
|
||||||
objects. For instance, if all fields are defined at the root object level,
|
objects. For instance, if all fields are defined at the root object level,
|
||||||
then the depth is `1`. If there is one object mapping, then the depth is
|
then the depth is `1`. If there is one object mapping, then the depth is
|
||||||
`2`, etc. The default is `20`.
|
`2`, etc. Default is `20`.
|
||||||
|
|
||||||
|
// tag::nested-fields-limit[]
|
||||||
`index.mapping.nested_fields.limit`::
|
`index.mapping.nested_fields.limit`::
|
||||||
The maximum number of distinct `nested` mappings in an index, defaults to `50`.
|
The maximum number of distinct `nested` mappings in an index. The `nested` type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting
|
||||||
|
limits the number of unique `nested` types per index. Default is `50`.
|
||||||
|
// end::nested-fields-limit[]
|
||||||
|
|
||||||
|
// tag::nested-objects-limit[]
|
||||||
`index.mapping.nested_objects.limit`::
|
`index.mapping.nested_objects.limit`::
|
||||||
The maximum number of `nested` JSON objects within a single document across
|
The maximum number of nested JSON objects that a single document can contain across all
|
||||||
all nested types, defaults to 10000.
|
`nested` types. This limit helps to prevent out of memory errors when a document contains too many nested
|
||||||
|
objects. Default is `10000`.
|
||||||
|
// end::nested-objects-limit[]
|
||||||
|
|
||||||
`index.mapping.field_name_length.limit`::
|
`index.mapping.field_name_length.limit`::
|
||||||
Setting for the maximum length of a field name. The default value is
|
Setting for the maximum length of a field name. This setting isn't really something that addresses
|
||||||
Long.MAX_VALUE (no limit). This setting isn't really something that addresses
|
|
||||||
mappings explosion but might still be useful if you want to limit the field length.
|
mappings explosion but might still be useful if you want to limit the field length.
|
||||||
It usually shouldn't be necessary to set this setting. The default is okay
|
It usually shouldn't be necessary to set this setting. The default is okay
|
||||||
unless a user starts to add a huge number of fields with really long names.
|
unless a user starts to add a huge number of fields with really long names. Default is
|
||||||
|
`Long.MAX_VALUE` (no limit).
|
||||||
|
|
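The depth rule described for `index.mapping.depth.limit` in the hunk above (root-level fields give a depth of `1`, one object mapping gives `2`, and so on) can be sketched in Python. This is a minimal sketch that follows the wording of the docs, not Elasticsearch's own check; `mapping_depth` is a hypothetical helper that assumes the mapping is given as a plain `properties` dictionary.

```python
def mapping_depth(properties, level=1):
    """Return the depth of a mapping's `properties` dict.

    Fields defined at the root level have depth 1; each level of
    inner object mapping adds 1.
    """
    depth = level
    for field in properties.values():
        inner = field.get("properties")
        if inner:
            depth = max(depth, mapping_depth(inner, level + 1))
    return depth


flat = {"title": {"type": "text"}}
one_object = {"user": {"properties": {"first": {"type": "keyword"}}}}

print(mapping_depth(flat))        # → 1
print(mapping_depth(one_object))  # → 2
```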
||||||
[float]
|
[float]
|
||||||
== Dynamic mapping
|
== Dynamic mapping
|
||||||
|
@ -5,14 +5,17 @@
|
|||||||
++++
|
++++
|
||||||
|
|
||||||
The `nested` type is a specialised version of the <<object,`object`>> datatype
|
The `nested` type is a specialised version of the <<object,`object`>> datatype
|
||||||
that allows arrays of objects to be indexed in a way that they can be queried
|
that allows arrays of objects to be indexed in a way that they can be queried
|
||||||
independently of each other.
|
independently of each other.
|
||||||
|
|
||||||
|
TIP: When ingesting key-value pairs with a large, arbitrary set of keys, you might consider modeling each key-value pair as its own nested document with `key` and `value` fields. Instead, consider using the <<flattened,flattened>> datatype, which maps an entire object as a single field and allows for simple searches over its contents.
|
||||||
|
Nested documents and queries are typically expensive, so using the `flattened` datatype for this use case is a better option.
|
||||||
|
|
||||||
|
[[nested-arrays-flattening-objects]]
|
||||||
==== How arrays of objects are flattened
|
==== How arrays of objects are flattened
|
||||||
|
|
||||||
Arrays of inner <<object,`object` fields>> do not work the way you may expect.
|
Elasticsearch has no concept of inner objects. Therefore, it flattens object
|
||||||
Lucene has no concept of inner objects, so Elasticsearch flattens object
|
hierarchies into a simple list of field names and values. For instance, consider the
|
||||||
hierarchies into a simple list of field names and values. For instance, the
|
|
||||||
following document:
|
following document:
|
||||||
|
|
||||||
[source,console]
|
[source,console]
|
||||||
@ -35,7 +38,7 @@ PUT my_index/_doc/1
|
|||||||
|
|
||||||
<1> The `user` field is dynamically added as a field of type `object`.
|
<1> The `user` field is dynamically added as a field of type `object`.
|
||||||
|
|
||||||
would be transformed internally into a document that looks more like this:
|
The previous document would be transformed internally into a document that looks more like this:
|
||||||
|
|
||||||
[source,js]
|
[source,js]
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
@ -71,10 +74,12 @@ GET my_index/_search
|
|||||||
==== Using `nested` fields for arrays of objects
|
==== Using `nested` fields for arrays of objects
|
||||||
|
|
||||||
If you need to index arrays of objects and to maintain the independence of
|
If you need to index arrays of objects and to maintain the independence of
|
||||||
each object in the array, you should use the `nested` datatype instead of the
|
each object in the array, use the `nested` datatype instead of the
|
||||||
<<object,`object`>> datatype. Internally, nested objects index each object in
|
<<object,`object`>> datatype.
|
||||||
|
|
||||||
|
Internally, nested objects index each object in
|
||||||
the array as a separate hidden document, meaning that each nested object can be
|
the array as a separate hidden document, meaning that each nested object can be
|
||||||
queried independently of the others, with the <<query-dsl-nested-query,`nested` query>>:
|
queried independently of the others with the <<query-dsl-nested-query,`nested` query>>:
|
||||||
|
|
||||||
[source,console]
|
[source,console]
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
@ -152,6 +157,8 @@ GET my_index/_search
|
|||||||
<4> `inner_hits` allow us to highlight the matching nested documents.
|
<4> `inner_hits` allow us to highlight the matching nested documents.
|
||||||
|
|
||||||
|
|
||||||
|
[[nested-accessing-documents]]
|
||||||
|
==== Interacting with `nested` documents
|
||||||
Nested documents can be:
|
Nested documents can be:
|
||||||
|
|
||||||
* queried with the <<query-dsl-nested-query,`nested`>> query.
|
* queried with the <<query-dsl-nested-query,`nested`>> query.
|
||||||
@ -209,29 +216,20 @@ document as standard (flat) fields. Defaults to `false`.
|
|||||||
[float]
|
[float]
|
||||||
=== Limits on `nested` mappings and objects
|
=== Limits on `nested` mappings and objects
|
||||||
|
|
||||||
As described earlier, each nested object is indexed as a separate document under the hood.
|
As described earlier, each nested object is indexed as a separate Lucene document.
|
||||||
Continuing with the example above, if we indexed a single document containing 100 `user` objects,
|
Continuing with the previous example, if we indexed a single document containing 100 `user` objects,
|
||||||
then 101 Lucene documents would be created -- one for the parent document, and one for each
|
then 101 Lucene documents would be created: one for the parent document, and one for each
|
||||||
nested object. Because of the expense associated with `nested` mappings, Elasticsearch puts
|
nested object. Because of the expense associated with `nested` mappings, Elasticsearch puts
|
||||||
settings in place to guard against performance problems:
|
settings in place to guard against performance problems:
|
||||||
|
|
||||||
`index.mapping.nested_fields.limit`::
|
include::{docdir}/mapping.asciidoc[tag=nested-fields-limit]
|
||||||
|
|
||||||
The `nested` type should only be used in special cases, when arrays of objects need to be
|
In the previous example, the `user` mapping would count as only 1 towards this limit.
|
||||||
queried independently of each other. To safeguard against poorly designed mappings, this setting
|
|
||||||
limits the number of unique `nested` types per index. In our example, the `user` mapping would
|
|
||||||
count as only 1 towards this limit. Defaults to 50.
|
|
||||||
|
|
||||||
`index.mapping.nested_objects.limit`::
|
|
||||||
|
|
||||||
This setting limits the number of nested objects that a single document may contain across all
|
|
||||||
`nested` types, in order to prevent out of memory errors when a document contains too many nested
|
|
||||||
objects. To illustrate how the setting works, say we added another `nested` type called `comments`
|
|
||||||
to our example mapping above. Then for each document, the combined number of `user` and `comment`
|
|
||||||
objects it contains must be below the limit. Defaults to 10000.
|
|
||||||
|
|
||||||
Additional background on these settings, including information on their default values, can be found
|
|
||||||
in <<mapping-limit-settings>>.
|
|
||||||
|
|
||||||
|
include::{docdir}/mapping.asciidoc[tag=nested-objects-limit]
|
||||||
|
|
||||||
|
To illustrate how this setting works, consider adding another `nested` type called `comments`
|
||||||
|
to the previous example mapping. For each document, the combined number of `user` and `comment`
|
||||||
|
objects it contains must be below the limit.
|
||||||
|
|
||||||
|
See <<mapping-limit-settings>> regarding additional settings for preventing mappings explosion.
|
||||||
|
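The arithmetic in the "Limits on `nested` mappings and objects" hunk above can be sketched in Python: one Lucene document for the parent plus one per nested object, with the combined count of nested objects checked against `index.mapping.nested_objects.limit`. Both functions are hypothetical helpers written for this sketch. They only look at top-level arrays named in `nested_fields`, and they treat the limit as an inclusive maximum, which is an assumption based on the setting being described as a "maximum number".

```python
def count_nested_objects(document, nested_fields):
    """Count objects across the given top-level `nested` fields."""
    return sum(len(document.get(field, [])) for field in nested_fields)


def lucene_doc_count(document, nested_fields):
    """One Lucene document for the parent, plus one per nested object."""
    return 1 + count_nested_objects(document, nested_fields)


def within_nested_objects_limit(document, nested_fields, limit=10000):
    """Check the combined nested object count against the limit."""
    return count_nested_objects(document, nested_fields) <= limit


doc = {
    "group": "fans",
    "user": [{"first": "John", "last": "Smith"}] * 100,
    "comments": [{"text": "hello"}] * 5,
}

print(lucene_doc_count(doc, ["user"]))  # → 101
print(within_nested_objects_limit(doc, ["user", "comments"], limit=100))  # → False
```

The second call shows why the limit applies across all `nested` types: 100 `user` objects alone would fit under a limit of 100, but the 5 `comments` objects push the combined count to 105.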