OpenSearch/docs/reference/mapping/fields/parent-field.asciidoc

170 lines
4.6 KiB
Plaintext
Raw Normal View History

[[mapping-parent-field]]
=== `_parent` field
A parent-child relationship can be established between documents in the same
index by making one mapping type the parent of another:
[source,js]
--------------------------------------------------
PUT my_index
{
"settings": {
"mapping.single_type": false
},
"mappings": {
"my_parent": {},
"my_child": {
"_parent": {
"type": "my_parent" <1>
}
}
}
}
PUT my_index/my_parent/1 <2>
{
"text": "This is a parent document"
}
PUT my_index/my_child/2?parent=1 <3>
{
"text": "This is a child document"
}
PUT my_index/my_child/3?parent=1&refresh=true <3>
{
"text": "This is another child document"
}
GET my_index/my_parent/_search
{
"query": {
"has_child": { <4>
"type": "my_child",
"query": {
"match": {
"text": "child document"
}
}
}
}
}
--------------------------------------------------
// CONSOLE
<1> The `my_parent` type is parent to the `my_child` type.
<2> Index a parent document.
<3> Index two child documents, specifying the parent document's ID.
<4> Find all parent documents that have children which match the query.
See the <<query-dsl-has-child-query,`has_child`>> and
<<query-dsl-has-parent-query,`has_parent`>> queries,
the <<search-aggregations-bucket-children-aggregation,`children`>> aggregation,
and <<parent-child-inner-hits,inner hits>> for more information.
The value of the `_parent` field is accessible in aggregations
and scripts, and may be queried with the
<<query-dsl-parent-id-query, `parent_id` query>>:
[source,js]
--------------------------
GET my_index/_search
{
"query": {
"parent_id": { <1>
"type": "my_child",
"id": "1"
}
},
"aggs": {
"parents": {
"terms": {
"field": "_parent", <2>
"size": 10
}
}
},
"script_fields": {
"parent": {
2016-06-27 09:55:16 -04:00
"script": {
"inline": "doc['_parent']" <3>
}
}
}
}
--------------------------
// CONSOLE
// TEST[continued]
<1> Querying the id of the `_parent` field (also see the <<query-dsl-has-parent-query,`has_parent` query>> and the <<query-dsl-has-child-query,`has_child` query>>)
<2> Aggregating on the `_parent` field (also see the <<search-aggregations-bucket-children-aggregation,`children`>> aggregation)
2016-06-27 09:55:16 -04:00
<3> Accessing the `_parent` field in scripts
==== Parent-child restrictions
* The parent and child types must be different -- parent-child relationships
cannot be established between documents of the same type.
* The `_parent.type` setting can only point to a type that doesn't exist yet.
2016-08-29 07:51:39 -04:00
This means that a type cannot become a parent type after it has been
created.
* Parent and child documents must be indexed on the same shard. The `parent`
ID is used as the <<mapping-routing-field,routing>> value for the child,
to ensure that the child is indexed on the same shard as the parent.
This means that the same `parent` value needs to be provided when
<<docs-get,getting>>, <<docs-delete,deleting>>, or <<docs-update,updating>>
a child document.
==== Global ordinals
Parent-child uses <<global-ordinals,global ordinals>> to speed up joins.
Global ordinals need to be rebuilt after any change to a shard. The more
parent id values are stored in a shard, the longer it takes to rebuild the
global ordinals for the `_parent` field.
Global ordinals, by default, are built eagerly: if the index has changed,
global ordinals for the `_parent` field will be rebuilt as part of the refresh.
This can add significant time the refresh. However most of the times this is the
right trade-off, otherwise global ordinals are rebuilt when the first parent-child
query or aggregation is used. This can introduce a significant latency spike for
your users and usually this is worse as multiple global ordinals for the `_parent`
field may be attempt rebuilt within a single refresh interval when many writes
are occurring.
When the parent/child is used infrequently and writes occur frequently it may
make sense to disable eager loading:
[source,js]
--------------------------------------------------
PUT my_index
{
"settings": {
"mapping.single_type": false
},
"mappings": {
"my_parent": {},
"my_child": {
"_parent": {
"type": "my_parent",
"eager_global_ordinals": false
}
}
}
}
--------------------------------------------------
// CONSOLE
The amount of heap used by global ordinals can be checked as follows:
[source,sh]
--------------------------------------------------
# Per-index
GET _stats/fielddata?human&fields=_parent
# Per-node per-index
GET _nodes/stats/indices/fielddata?human&fields=_parent
--------------------------------------------------
// CONSOLE