mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-03-27 10:28:28 +00:00
* Changes for #52239.
* Incorporating review feedback from Julie T. Also single-sourcing nested options in the Mapping page and referencing them in the Nested page.
* Moving tip after the introduction and clarifying limits.
* Update docs/reference/mapping.asciidoc
* Update docs/reference/mapping/types/nested.asciidoc

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
parent 771ddbf083
commit cbd35e9a2b
@ -1,9 +1,8 @@
[[kv-processor]]
=== KV Processor
This processor helps automatically parse messages (or specific event fields) which are of the foo=bar variety.

For example, if you have a log message which contains `ip=1.2.3.4 error=REFUSED`, you can parse those automatically by configuring:
This processor helps automatically parse messages (or specific event fields) which are of the `foo=bar` variety.

For example, if you have a log message which contains `ip=1.2.3.4 error=REFUSED`, you can parse those fields automatically by configuring:

[source,js]
--------------------------------------------------
@ -17,8 +16,10 @@ For example, if you have a log message which contains `ip=1.2.3.4 error=REFUSED`
--------------------------------------------------
// NOTCONSOLE

TIP: Using the KV Processor can result in field names that you cannot control. Consider using the <<flattened>> datatype instead, which maps an entire object as a single field and allows for simple searches over its contents.
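
As a rough illustration of the `foo=bar` parsing described above, the following Python sketch splits a message into key-value pairs. It is not the processor's implementation: `parse_kv` is a hypothetical helper, and its `field_split` and `value_split` parameters are assumptions made for this sketch.

```python
def parse_kv(message, field_split=" ", value_split="="):
    """Split a message of key=value pairs into a dict (illustrative only)."""
    fields = {}
    for pair in message.split(field_split):
        if value_split not in pair:
            continue  # skip tokens that are not key=value pairs
        # Split on the first separator only, so values may contain it.
        key, value = pair.split(value_split, 1)
        fields[key] = value
    return fields


parsed = parse_kv("ip=1.2.3.4 error=REFUSED")
print(parsed)  # {'ip': '1.2.3.4', 'error': 'REFUSED'}
```

Every key in the result comes from the message itself, which is why the resulting field names cannot be controlled in advance.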

[[kv-options]]
.Kv Options
.KV Options
[options="header"]
|======
| Name | Required | Default | Description

@ -17,7 +17,7 @@ A mapping definition has:

<<mapping-fields,Meta-fields>>::

Meta-fields are used to customize how a document's metadata associated is
Meta-fields are used to customize how a document's associated metadata is
treated. Examples of meta-fields include the document's
<<mapping-index-field,`_index`>>, <<mapping-id-field,`_id`>>, and
<<mapping-source-field,`_source`>> fields.
@ -58,17 +58,16 @@ via the <<multi-fields>> parameter.
[float]
=== Settings to prevent mappings explosion

Defining too many fields in an index is a condition that can lead to a
Defining too many fields in an index can lead to a
mapping explosion, which can cause out of memory errors and difficult
situations to recover from. This problem may be more common than expected.
As an example, consider a situation in which every new document inserted
introduces new fields. This is quite common with dynamic mappings.
Every time a document contains new fields, those will end up in the index's
mappings. This isn't worrying for a small amount of data, but it can become a
situations to recover from.

Consider a situation where every new document inserted
introduces new fields, such as with <<dynamic-mapping,dynamic mapping>>.
Each new field is added to the index mapping, which can become a
problem as the mapping grows.
The following settings allow you to limit the number of field mappings that
can be created manually or dynamically, in order to prevent bad documents from
causing a mapping explosion:

Use the following settings to limit the number of field mappings (created manually or dynamically) and prevent documents from causing a mapping explosion:

`index.mapping.total_fields.limit`::
The maximum number of fields in an index. Field and object mappings, as well as
@ -84,26 +83,37 @@ If you increase this setting, we recommend you also increase the
<<search-settings,`indices.query.bool.max_clause_count`>> setting, which
limits the maximum number of <<query-dsl-bool-query,boolean clauses>> in a query.
====
+
[TIP]
====
If your field mappings contain a large, arbitrary set of keys, consider using the <<flattened,flattened>> datatype.
====

`index.mapping.depth.limit`::
The maximum depth for a field, which is measured as the number of inner
objects. For instance, if all fields are defined at the root object level,
then the depth is `1`. If there is one object mapping, then the depth is
`2`, etc. The default is `20`.
`2`, etc. Default is `20`.

// tag::nested-fields-limit[]
`index.mapping.nested_fields.limit`::
The maximum number of distinct `nested` mappings in an index, defaults to `50`.
The maximum number of distinct `nested` mappings in an index. The `nested` type should only be used in special cases, when arrays of objects need to be queried independently of each other. To safeguard against poorly designed mappings, this setting
limits the number of unique `nested` types per index. Default is `50`.
// end::nested-fields-limit[]

// tag::nested-objects-limit[]
`index.mapping.nested_objects.limit`::
The maximum number of `nested` JSON objects within a single document across
all nested types, defaults to 10000.
The maximum number of nested JSON objects that a single document can contain across all
`nested` types. This limit helps to prevent out of memory errors when a document contains too many nested
objects. Default is `10000`.
// end::nested-objects-limit[]

`index.mapping.field_name_length.limit`::
Setting for the maximum length of a field name. The default value is
Long.MAX_VALUE (no limit). This setting isn't really something that addresses
Setting for the maximum length of a field name. This setting isn't really something that addresses
mappings explosion but might still be useful if you want to limit the field length.
It usually shouldn't be necessary to set this setting. The default is okay
unless a user starts to add a huge number of fields with really long names.
unless a user starts to add a huge number of fields with really long names. Default is
`Long.MAX_VALUE` (no limit).
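
The depth rule described for `index.mapping.depth.limit` can be sketched in a few lines of Python. This is only an illustration of the counting rule stated above (root-level fields give depth `1`, one object mapping gives depth `2`), not how Elasticsearch computes it. `mapping_depth` and the sample mappings are hypothetical, and treating the limit as an inclusive maximum is an assumption.

```python
def mapping_depth(mapping):
    """Return the depth of a mapping: 1 for root-level fields, +1 per object level."""
    properties = mapping.get("properties", {})
    child_depths = [
        mapping_depth(field)
        for field in properties.values()
        if "properties" in field  # only object mappings add a level
    ]
    return 1 + max(child_depths, default=0)


# Hypothetical sample mappings, for illustration only.
flat = {"properties": {"title": {"type": "text"}}}
one_object = {
    "properties": {
        "title": {"type": "text"},
        "author": {"properties": {"name": {"type": "keyword"}}},
    }
}

DEPTH_LIMIT = 20  # default stated above; inclusive comparison is an assumption
for name, mapping in [("flat", flat), ("one_object", one_object)]:
    depth = mapping_depth(mapping)
    print(name, depth, "ok" if depth <= DEPTH_LIMIT else "too deep")
```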

[float]
== Dynamic mapping

@ -5,14 +5,17 @@
++++

The `nested` type is a specialised version of the <<object,`object`>> datatype
that allows arrays of objects to be indexed in a way that they can be queried
that allows arrays of objects to be indexed in a way that they can be queried
independently of each other.

TIP: When ingesting key-value pairs with a large, arbitrary set of keys, you might consider modeling each key-value pair as its own nested document with `key` and `value` fields. Instead, consider using the <<flattened,flattened>> datatype, which maps an entire object as a single field and allows for simple searches over its contents.
Nested documents and queries are typically expensive, so using the `flattened` datatype for this use case is a better option.

[[nested-arrays-flattening-objects]]
==== How arrays of objects are flattened

Arrays of inner <<object,`object` fields>> do not work the way you may expect.
Lucene has no concept of inner objects, so Elasticsearch flattens object
hierarchies into a simple list of field names and values. For instance, the
Elasticsearch has no concept of inner objects. Therefore, it flattens object
hierarchies into a simple list of field names and values. For instance, consider the
following document:

[source,console]
@ -35,7 +38,7 @@ PUT my_index/_doc/1

<1> The `user` field is dynamically added as a field of type `object`.

would be transformed internally into a document that looks more like this:
The previous document would be transformed internally into a document that looks more like this:

[source,js]
--------------------------------------------------
@ -71,10 +74,12 @@ GET my_index/_search
==== Using `nested` fields for arrays of objects

If you need to index arrays of objects and to maintain the independence of
each object in the array, you should use the `nested` datatype instead of the
<<object,`object`>> datatype. Internally, nested objects index each object in
each object in the array, use the `nested` datatype instead of the
<<object,`object`>> datatype.

Internally, nested objects index each object in
the array as a separate hidden document, meaning that each nested object can be
queried independently of the others, with the <<query-dsl-nested-query,`nested` query>>:
queried independently of the others with the <<query-dsl-nested-query,`nested` query>>:

[source,console]
--------------------------------------------------
@ -152,6 +157,8 @@ GET my_index/_search
<4> `inner_hits` allow us to highlight the matching nested documents.
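
To make the contrast between flattened `object` arrays and `nested` objects concrete, here is a small Python sketch. It only models the behaviour described in this section and is not how Elasticsearch or Lucene store documents. The helper functions and the sample `user` data (the `first` and `last` field names and their values) are illustrative assumptions.

```python
def flatten(doc, prefix=""):
    """Flatten object hierarchies into {dotted.field.name: [values]}."""
    flat = {}
    for key, value in doc.items():
        path = f"{prefix}{key}"
        items = value if isinstance(value, list) else [value]
        for item in items:
            if isinstance(item, dict):
                # Values from every object in the array end up in one shared list.
                for sub_path, sub_values in flatten(item, path + ".").items():
                    flat.setdefault(sub_path, []).extend(sub_values)
            else:
                flat.setdefault(path, []).append(item)
    return flat


def matches_flattened(doc, criteria):
    """Each criterion is checked against the flattened value lists independently."""
    flat = flatten(doc)
    return all(value in flat.get(field, []) for field, value in criteria.items())


def matches_nested(doc, path, criteria):
    """All criteria must hold within the same object of the array at `path`."""
    return any(
        all(obj.get(field) == value for field, value in criteria.items())
        for obj in doc.get(path, [])
    )


# Hypothetical sample document.
doc = {
    "group": "fans",
    "user": [
        {"first": "John", "last": "Smith"},
        {"first": "Alice", "last": "White"},
    ],
}

print(flatten(doc))
print(matches_flattened(doc, {"user.first": "Alice", "user.last": "Smith"}))  # True
print(matches_nested(doc, "user", {"first": "Alice", "last": "Smith"}))       # False
```

The flattened form matches "Alice Smith" even though no single user has that name, because the association between `first` and `last` is lost. Keeping each object separate, as the `nested` type does, avoids the false match.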


[[nested-accessing-documents]]
==== Interacting with `nested` documents
Nested documents can be:

* queried with the <<query-dsl-nested-query,`nested`>> query.
@ -209,29 +216,20 @@ document as standard (flat) fields. Defaults to `false`.
[float]
=== Limits on `nested` mappings and objects

As described earlier, each nested object is indexed as a separate document under the hood.
Continuing with the example above, if we indexed a single document containing 100 `user` objects,
then 101 Lucene documents would be created -- one for the parent document, and one for each
As described earlier, each nested object is indexed as a separate Lucene document.
Continuing with the previous example, if we indexed a single document containing 100 `user` objects,
then 101 Lucene documents would be created: one for the parent document, and one for each
nested object. Because of the expense associated with `nested` mappings, Elasticsearch puts
settings in place to guard against performance problems:

`index.mapping.nested_fields.limit`::
include::{docdir}/mapping.asciidoc[tag=nested-fields-limit]

The `nested` type should only be used in special cases, when arrays of objects need to be
queried independently of each other. To safeguard against poorly designed mappings, this setting
limits the number of unique `nested` types per index. In our example, the `user` mapping would
count as only 1 towards this limit. Defaults to 50.

`index.mapping.nested_objects.limit`::

This setting limits the number of nested objects that a single document may contain across all
`nested` types, in order to prevent out of memory errors when a document contains too many nested
objects. To illustrate how the setting works, say we added another `nested` type called `comments`
to our example mapping above. Then for each document, the combined number of `user` and `comment`
objects it contains must be below the limit. Defaults to 10000.

Additional background on these settings, including information on their default values, can be found
in <<mapping-limit-settings>>.
In the previous example, the `user` mapping would count as only 1 towards this limit.

include::{docdir}/mapping.asciidoc[tag=nested-objects-limit]

To illustrate how this setting works, consider adding another `nested` type called `comments`
to the previous example mapping. For each document, the combined number of `user` and `comment`
objects it contains must be below the limit.

See <<mapping-limit-settings>> regarding additional settings for preventing mappings explosion.
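
The arithmetic behind these limits can be sketched in Python. This is a simplified model, not Elasticsearch code: it only counts objects in top-level `nested` arrays, the helper names and sample field contents are hypothetical, and treating the limit as an inclusive maximum is an assumption.

```python
NESTED_OBJECTS_LIMIT = 10000  # default for index.mapping.nested_objects.limit


def count_nested_objects(doc, nested_paths):
    """Count objects under every top-level `nested` field of a document."""
    return sum(len(doc.get(path, [])) for path in nested_paths)


def lucene_doc_count(doc, nested_paths):
    """One Lucene document for the parent plus one per nested object."""
    return 1 + count_nested_objects(doc, nested_paths)


def within_limit(doc, nested_paths, limit=NESTED_OBJECTS_LIMIT):
    """Assumption: the limit is an inclusive maximum."""
    return count_nested_objects(doc, nested_paths) <= limit


# Hypothetical document with 100 `user` objects and 50 `comments` objects.
doc = {
    "user": [{"first": f"user{i}"} for i in range(100)],
    "comments": [{"text": f"comment{i}"} for i in range(50)],
}

print(lucene_doc_count({"user": doc["user"]}, ["user"]))  # 101
print(count_nested_objects(doc, ["user", "comments"]))    # 150
print(within_limit(doc, ["user", "comments"]))            # True
```

Both `nested` types count towards the same per-document total, which is why the combined number of `user` and `comments` objects is what gets compared with the limit.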