OpenSearch/docs/reference/mapping/fields/parent-field.asciidoc

166 lines
4.5 KiB
Plaintext

[[mapping-parent-field]]
=== `_parent` field
added[2.0.0,The parent-child implementation has been completely rewritten. It is advisable to reindex any 1.x indices which use parent-child to take advantage of the new optimizations]
A parent-child relationship can be established between documents in the same
index by making one mapping type the parent of another:
[source,js]
--------------------------------------------------
PUT my_index
{
"mappings": {
"my_parent": {},
"my_child": {
"_parent": {
"type": "my_parent" <1>
}
}
}
}
PUT my_index/my_parent/1 <2>
{
"text": "This is a parent document"
}
PUT my_index/my_child/2?parent=1 <3>
{
"text": "This is a child document"
}
PUT my_index/my_child/3?parent=1 <3>
{
"text": "This is another child document"
}
GET my_index/my_parent/_search
{
"query": {
"has_child": { <4>
"type": "my_child",
"query": {
"match": {
"text": "child document"
}
}
}
}
}
--------------------------------------------------
// AUTOSENSE
<1> The `my_parent` type is parent to the `my_child` type.
<2> Index a parent document.
<3> Index two child documents, specifying the parent document's ID.
<4> Find all parent documents that have children which match the query.
See the <<query-dsl-has-child-query,`has_child`>> and
<<query-dsl-has-parent-query,`has_parent`>> queries,
the <<search-aggregations-bucket-children-aggregation,`children`>> aggregation,
and <<parent-child-inner-hits,inner hits>> for more information.
The value of the `_parent` field is accessible in queries, aggregations, scripts,
and when sorting:
[source,js]
--------------------------
GET my_index/_search
{
"query": {
"terms": {
"_parent": [ "1" ] <1>
}
},
"aggs": {
"parents": {
"terms": {
"field": "_parent", <2>
"size": 10
}
}
},
"sort": [
{
"_parent": { <3>
"order": "desc"
}
}
],
"script_fields": {
"parent": {
"script": "doc['_parent']" <4>
}
}
}
--------------------------
// AUTOSENSE
<1> Querying on the `_parent` field (also see the <<query-dsl-has-parent-query,`has_parent` query>> and the <<query-dsl-has-child-query,`has_child` query>>)
<2> Aggregating on the `_parent` field (also see the <<search-aggregations-bucket-children-aggregation,`children`>> aggregation)
<3> Sorting on the `_parent` field
<4> Accessing the `_parent` field in scripts (inline scripts must be <<enable-dynamic-scripting,enabled>> for this example to work)
==== Parent-child restrictions
* The parent and child types must be different -- parent-child relationships
cannot be established between documents of the same type.
* The `_parent.type` setting can only point to a type that doesn't exist yet.
This means that a type cannot become a parent type after it is has been
created.
* Parent and child documents must be indexed on the same shard. The `parent`
ID is used as the <<mapping-routing-field,routing>> value for the child,
to ensure that the child is indexed on the same shard as the parent.
This means that the same `parent` value needs to be provided when
<<docs-get,getting>>, <<docs-delete,deleting>>, or <<docs-update,updating>>
a child document.
==== Global ordinals
Parent-child uses <<global-ordinals,global ordinals>> to speed up joins.
Global ordinals need to be rebuilt after any change to a shard. The more
parent id values are stored in a shard, the longer it takes to rebuild the
global ordinals for the `_parent` field.
Global ordinals, by default, are built lazily: the first parent-child query or
aggregation after a refresh will trigger building of global ordinals. This can
introduce a significant latency spike for your users. You can use
<<fielddata-loading,eager_global_ordinals>> to shift the cost of building global
ordinals from query time to refresh time, by mapping the `_parent` field as follows:
[source,js]
--------------------------------------------------
PUT my_index
{
"mappings": {
"my_parent": {},
"my_child": {
"_parent": {
"type": "my_parent",
"fielddata": {
"loading": "eager_global_ordinals"
}
}
}
}
}
--------------------------------------------------
// AUTOSENSE
The amount of heap used by global ordinals can be checked as follows:
[source,sh]
--------------------------------------------------
# Per-index
GET _stats/fielddata?human&fields=_parent
# Per-node per-index
GET _nodes/stats/indices/fielddata?human&fields=_parent
--------------------------------------------------
// AUTOSENSE