mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-05-19 23:24:57 +00:00
This commit adds back "id" as the key within a script to specify a stored script (which with file scripts now gone is no longer ambiguous). It also adds "source" as a replacement for "code". This is in an attempt to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.
170 lines
4.6 KiB
Plaintext
[[mapping-parent-field]]
=== `_parent` field

A parent-child relationship can be established between documents in the same
index by making one mapping type the parent of another:

[source,js]
--------------------------------------------------
PUT my_index
{
  "settings": {
    "mapping.single_type": false
  },
  "mappings": {
    "my_parent": {},
    "my_child": {
      "_parent": {
        "type": "my_parent" <1>
      }
    }
  }
}

PUT my_index/my_parent/1 <2>
{
  "text": "This is a parent document"
}

PUT my_index/my_child/2?parent=1 <3>
{
  "text": "This is a child document"
}

PUT my_index/my_child/3?parent=1&refresh=true <3>
{
  "text": "This is another child document"
}

GET my_index/my_parent/_search
{
  "query": {
    "has_child": { <4>
      "type": "my_child",
      "query": {
        "match": {
          "text": "child document"
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
<1> The `my_parent` type is parent to the `my_child` type.
<2> Index a parent document.
<3> Index two child documents, specifying the parent document's ID.
<4> Find all parent documents that have children which match the query.

See the <<query-dsl-has-child-query,`has_child`>> and
<<query-dsl-has-parent-query,`has_parent`>> queries,
the <<search-aggregations-bucket-children-aggregation,`children`>> aggregation,
and <<parent-child-inner-hits,inner hits>> for more information.

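The lookup can also run in the other direction. As a minimal sketch, assuming
the index and documents created above and the `parent_type` key accepted by the
`has_parent` query, the following request returns the child documents whose
parent matches the inner query:

[source,js]
--------------------------------------------------
GET my_index/my_child/_search
{
  "query": {
    "has_parent": { <1>
      "parent_type": "my_parent",
      "query": {
        "match": {
          "text": "parent document"
        }
      }
    }
  }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]

<1> Find all child documents whose parent matches the query.
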
The value of the `_parent` field is accessible in aggregations
and scripts, and may be queried with the
<<query-dsl-parent-id-query, `parent_id` query>>:

[source,js]
--------------------------
GET my_index/_search
{
  "query": {
    "parent_id": { <1>
      "type": "my_child",
      "id": "1"
    }
  },
  "aggs": {
    "parents": {
      "terms": {
        "field": "_parent", <2>
        "size": 10
      }
    }
  },
  "script_fields": {
    "parent": {
      "script": {
        "source": "doc['_parent']" <3>
      }
    }
  }
}
--------------------------
// CONSOLE
// TEST[continued]

<1> Querying the id of the `_parent` field (also see the <<query-dsl-has-parent-query,`has_parent` query>> and the <<query-dsl-has-child-query,`has_child` query>>)
<2> Aggregating on the `_parent` field (also see the <<search-aggregations-bucket-children-aggregation,`children`>> aggregation)
<3> Accessing the `_parent` field in scripts

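For comparison with the `terms` aggregation on `_parent`, the `children`
aggregation can be sketched as follows. This assumes the same `my_index`
documents as above; the aggregation name `child_docs` is illustrative only:

[source,js]
--------------------------
GET my_index/my_parent/_search
{
  "size": 0,
  "aggs": {
    "child_docs": {
      "children": { <1>
        "type": "my_child"
      }
    }
  }
}
--------------------------
// CONSOLE
// TEST[continued]

<1> Switch from the matching parent documents to their `my_child` documents, so that the bucket counts children rather than parents.
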
==== Parent-child restrictions

* The parent and child types must be different -- parent-child relationships
  cannot be established between documents of the same type.

* The `_parent.type` setting can only point to a type that doesn't exist yet.
  This means that a type cannot become a parent type after it has been
  created.

* Parent and child documents must be indexed on the same shard. The `parent`
  ID is used as the <<mapping-routing-field,routing>> value for the child,
  to ensure that the child is indexed on the same shard as the parent.
  This means that the same `parent` value needs to be provided when
  <<docs-get,getting>>, <<docs-delete,deleting>>, or <<docs-update,updating>>
  a child document.

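The last restriction can be sketched with the child documents indexed earlier.
This is an illustration only, assuming the `my_index` setup from the first
example; the replacement text in the update is made up:

[source,js]
--------------------------------------------------
GET my_index/my_child/2?parent=1 <1>

POST my_index/my_child/2/_update?parent=1 <1>
{
  "doc": {
    "text": "This is an updated child document"
  }
}

DELETE my_index/my_child/3?parent=1 <1>
--------------------------------------------------
// CONSOLE
// TEST[continued]

<1> The same `parent` value that was used at index time is passed again, so that the request is routed to the shard that holds the child.
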
==== Global ordinals

Parent-child uses <<eager-global-ordinals,global ordinals>> to speed up joins.
Global ordinals need to be rebuilt after any change to a shard. The more
parent id values are stored in a shard, the longer it takes to rebuild the
global ordinals for the `_parent` field.

Global ordinals, by default, are built eagerly: if the index has changed,
global ordinals for the `_parent` field will be rebuilt as part of the refresh.
This can add significant time to the refresh. However, most of the time this is
the right trade-off: otherwise, global ordinals are rebuilt when the first
parent-child query or aggregation is used. This can introduce a significant
latency spike for your users, and it is usually worse because several attempts
to rebuild the global ordinals for the `_parent` field may be made within a
single refresh interval when many writes are occurring.

When parent-child queries are used infrequently and writes occur frequently, it
may make sense to disable eager loading:

[source,js]
--------------------------------------------------
PUT my_index
{
  "settings": {
    "mapping.single_type": false
  },
  "mappings": {
    "my_parent": {},
    "my_child": {
      "_parent": {
        "type": "my_parent",
        "eager_global_ordinals": false
      }
    }
  }
}
--------------------------------------------------
// CONSOLE

The amount of heap used by global ordinals can be checked as follows:

[source,sh]
--------------------------------------------------
# Per-index
GET _stats/fielddata?human&fields=_parent

# Per-node per-index
GET _nodes/stats/indices/fielddata?human&fields=_parent
--------------------------------------------------
// CONSOLE