Add documentation for the new parent-join field (#25227)
* Add documentation for the new parent-join field

This commit adds the docs for the new parent-join field. It explains how to define, index and query this new field.

Relates #20257
parent ff9edb627e
commit ccb3c9aae7
@ -38,6 +38,8 @@ string:: <<text,`text`>> and <<keyword,`keyword`>>
|
|||
|
||||
<<percolator>>:: Accepts queries from the query-dsl
|
||||
|
||||
<<parent-join>>:: Defines parent/child relation for documents within the same index
|
||||
|
||||
[float]
|
||||
=== Multi-fields
|
||||
|
||||
|
@ -84,7 +86,7 @@ include::types/token-count.asciidoc[]
|
|||
|
||||
include::types/percolator.asciidoc[]
|
||||
|
||||
|
||||
include::types/parent-join.asciidoc[]
|
||||
|
||||
|
||||
|
||||
|
|
|
@ -0,0 +1,417 @@
|
|||
[[mapping-parent-join-field]]
|
||||
=== `join` datatype
|
||||
|
||||
The `join` datatype is a special field that creates
|
||||
a parent/child relation within documents of the same index.
|
||||
The `relations` section defines a set of possible relations within the documents,
|
||||
each relation being a parent name and a child name.
|
||||
A parent/child relation can be defined as follows:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT my_index
|
||||
{
|
||||
"mappings": {
|
||||
"doc": {
|
||||
"properties": {
|
||||
"my_join_field": { <1>
|
||||
"type": "join",
|
||||
"relations": {
|
||||
"my_parent": "my_child" <2>
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
<1> The name for the field
|
||||
<2> Defines a single relation where `my_parent` is parent of `my_child`.
|
||||
|
||||
To index a document with a join, the name of the relation and the optional parent
|
||||
of the document must be provided in the `_source`.
|
||||
For instance, the following creates two parent documents in the `my_parent` context:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT my_index/doc/1?refresh
|
||||
{
|
||||
"text": "This is a parent document",
|
||||
"my_join_field": "my_parent" <1>
|
||||
}
|
||||
|
||||
PUT my_index/doc/2?refresh
|
||||
{
|
||||
"text": "This is a another parent document",
|
||||
"my_join_field": "my_parent"
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
<1> This document is a `my_parent` document.
|
||||
|
||||
When indexing a child, the name of the relation as well as the parent id of the document
|
||||
must be added in the `_source`.
|
||||
It is required to index the lineage of a parent in the same shard so you must
|
||||
always route child documents using the id of their top-level parent.
|
||||
For instance, the following indexes two child documents pointing to the same parent
with a `routing` value equal to the `id` of the parent:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT my_index/doc/3?routing=1&refresh <1>
|
||||
{
|
||||
"text": "This is a child document",
|
||||
"my_join_field": {
|
||||
"name": "my_child", <2>
|
||||
"parent": "1" <3>
|
||||
}
|
||||
}
|
||||
|
||||
PUT my_index/doc/4?routing=1&refresh <1>
|
||||
{
|
||||
"text": "This is a another child document",
|
||||
"my_join_field": {
|
||||
"name": "my_child",
|
||||
"parent": "1"
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
<1> This child document must be on the same shard as its parent
|
||||
<2> `my_child` is the name of the join for this document
|
||||
<3> The parent id of this child document
|
||||
|
||||
==== Parent-join restrictions
|
||||
|
||||
* Only one `join` field is allowed per index mapping.
|
||||
* An element can have multiple children but only one parent.
|
||||
* Parent and child documents must be indexed on the same shard.
|
||||
This means that the same `routing` value needs to be provided when
|
||||
<<docs-get,getting>>, <<docs-delete,deleting>>, or <<docs-update,updating>>
|
||||
a child document.
|
||||
* It is possible to add a new relation to an existing `join` field.
|
||||
* It is also possible to add a child to an existing element
|
||||
but only if the element is already a parent.
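
As a minimal sketch of the routing restriction, assuming the child document `3`
indexed above with `routing=1`, the same `routing` value is passed when getting,
updating, or deleting it. The updated `text` value is illustrative only, and the
snippet is not wired into the docs tests:

[source,js]
--------------------------------------------------
GET my_index/doc/3?routing=1

POST my_index/doc/3/_update?routing=1
{
  "doc": {
    "text": "This is an updated child document"
  }
}

DELETE my_index/doc/3?routing=1
--------------------------------------------------
// NOTCONSOLE

Without the `routing` parameter, a request is routed by the document's own id and
may reach a shard that does not hold the child document.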
|
||||
|
||||
==== Searching with parent-join
|
||||
|
||||
The parent-join creates one field to index the name of the relation
|
||||
within the document (`my_parent`, `my_child`, ...).
|
||||
It also creates one field per parent/child relation. The name of this field is
|
||||
the name of the `join` field followed by `#` and the name of the parent in the relation.
|
||||
So for instance for the `my_parent` => [`my_child`, `another_child`] relation,
|
||||
the `join` field creates an additional field named `my_join_field#my_parent`.
|
||||
This field contains the parent `_id` that the document links to
|
||||
if the document is a child (`my_child` or `another_child`) and the `_id` of
|
||||
the document if it's a parent (`my_parent`).
|
||||
When searching an index that contains a `join` field, these two fields are always
|
||||
returned in the search response:
|
||||
|
||||
[source,js]
|
||||
--------------------------
|
||||
GET my_index/_search
|
||||
{
|
||||
"query": {
|
||||
"match_all": {}
|
||||
},
|
||||
"sort": ["_id"]
|
||||
}
|
||||
--------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
{
|
||||
...,
|
||||
"hits": {
|
||||
"total": 4,
|
||||
"max_score": null,
|
||||
"hits": [
|
||||
{
|
||||
"_index": "my_index",
|
||||
"_type": "doc",
|
||||
"_id": "1",
|
||||
"_score": null,
|
||||
"_source": {
|
||||
"text": "This is a parent document",
|
||||
"my_join_field": "my_parent"
|
||||
},
|
||||
"fields": {
|
||||
"my_join_field": [
|
||||
"my_parent" <1>
|
||||
]
|
||||
},
|
||||
"sort": [
|
||||
"1"
|
||||
]
|
||||
},
|
||||
{
|
||||
"_index": "my_index",
|
||||
"_type": "doc",
|
||||
"_id": "2",
|
||||
"_score": null,
|
||||
"_source": {
|
||||
"text": "This is a another parent document",
|
||||
"my_join_field": "my_parent"
|
||||
},
|
||||
"fields": {
|
||||
"my_join_field": [
|
||||
"my_parent" <2>
|
||||
]
|
||||
},
|
||||
"sort": [
|
||||
"2"
|
||||
]
|
||||
},
|
||||
{
|
||||
"_index": "my_index",
|
||||
"_type": "doc",
|
||||
"_id": "3",
|
||||
"_score": null,
|
||||
"_routing": "1",
|
||||
"_source": {
|
||||
"text": "This is a child document",
|
||||
"my_join_field": {
|
||||
"name": "my_child", <3>
|
||||
"parent": "1" <4>
|
||||
}
|
||||
},
|
||||
"fields": {
|
||||
"my_join_field": [
|
||||
"my_child"
|
||||
],
|
||||
"my_join_field#my_parent": [
|
||||
"1"
|
||||
]
|
||||
},
|
||||
"sort": [
|
||||
"3"
|
||||
]
|
||||
},
|
||||
{
|
||||
"_index": "my_index",
|
||||
"_type": "doc",
|
||||
"_id": "4",
|
||||
"_score": null,
|
||||
"_routing": "1",
|
||||
"_source": {
|
||||
"text": "This is a another child document",
|
||||
"my_join_field": {
|
||||
"name": "my_child",
|
||||
"parent": "1"
|
||||
}
|
||||
},
|
||||
"fields": {
|
||||
"my_join_field": [
|
||||
"my_child"
|
||||
],
|
||||
"my_join_field#my_parent": [
|
||||
"1"
|
||||
]
|
||||
},
|
||||
"sort": [
|
||||
"4"
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// TESTRESPONSE[s/\.\.\./"timed_out": false, "took": $body.took, "_shards": $body._shards/]
|
||||
|
||||
<1> This document belongs to the `my_parent` join
|
||||
<2> This document belongs to the `my_parent` join
|
||||
<3> This document belongs to the `my_child` join
|
||||
<4> The linked parent id for the child document
|
||||
|
||||
==== Parent-join queries and aggregations
|
||||
|
||||
See the <<query-dsl-has-child-query,`has_child`>> and
|
||||
<<query-dsl-has-parent-query,`has_parent`>> queries,
|
||||
the <<search-aggregations-bucket-children-aggregation,`children`>> aggregation,
|
||||
and <<parent-child-inner-hits,inner hits>> for more information.
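
The following is a sketch rather than a tested snippet. It assumes the `my_index`
documents indexed above and the `type` and `parent_type` parameters of the
`has_child` and `has_parent` queries. The first request returns parent documents
that have at least one matching child, the second returns child documents whose
parent matches:

[source,js]
--------------------------------------------------
GET my_index/_search
{
  "query": {
    "has_child": {
      "type": "my_child",
      "query": {
        "match": { "text": "child" }
      }
    }
  }
}

GET my_index/_search
{
  "query": {
    "has_parent": {
      "parent_type": "my_parent",
      "query": {
        "match": { "text": "parent" }
      }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE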
|
||||
|
||||
The value of the `join` field is accessible in aggregations
|
||||
and scripts, and may be queried with the
|
||||
<<query-dsl-parent-id-query, `parent_id` query>>:
|
||||
|
||||
[source,js]
|
||||
--------------------------
|
||||
GET my_index/_search
|
||||
{
|
||||
"query": {
|
||||
"parent_id": { <1>
|
||||
"type": "my_child",
|
||||
"id": "1"
|
||||
}
|
||||
},
|
||||
"aggs": {
|
||||
"parents": {
|
||||
"terms": {
|
||||
"field": "my_join_field#my_parent", <2>
|
||||
"size": 10
|
||||
}
|
||||
}
|
||||
},
|
||||
"script_fields": {
|
||||
"parent": {
|
||||
"script": {
|
||||
"source": "doc['my_join_field#my_parent']" <3>
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
<1> Querying the `parent id` field (also see the <<query-dsl-has-parent-query,`has_parent` query>> and the <<query-dsl-has-child-query,`has_child` query>>)
|
||||
<2> Aggregating on the `parent id` field (also see the <<search-aggregations-bucket-children-aggregation,`children`>> aggregation)
|
||||
<3> Accessing the `parent id` field in scripts
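
For comparison, a `children` aggregation could look roughly like the sketch
below. It assumes the `my_index` mapping above, and the aggregation name
`to_children` is an arbitrary label chosen for illustration. The snippet is not
wired into the docs tests:

[source,js]
--------------------------------------------------
GET my_index/_search
{
  "size": 0,
  "aggs": {
    "to_children": {
      "children": {
        "type": "my_child"
      }
    }
  }
}
--------------------------------------------------
// NOTCONSOLE

The `type` parameter names the child relation, so the bucket counts the
`my_child` documents whose parents are in the current search context.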
|
||||
|
||||
|
||||
==== Global ordinals
|
||||
|
||||
The `join` field uses <<eager-global-ordinals,global ordinals>> to speed up joins.
|
||||
Global ordinals need to be rebuilt after any change to a shard. The more
|
||||
parent id values are stored in a shard, the longer it takes to rebuild the
|
||||
global ordinals for the `join` field.
|
||||
|
||||
Global ordinals, by default, are built eagerly: if the index has changed,
|
||||
global ordinals for the `join` field will be rebuilt as part of the refresh.
|
||||
This can add significant time to the refresh. However, most of the time this is the
right trade-off: otherwise global ordinals are rebuilt when the first parent-join
query or aggregation is used. This can introduce a significant latency spike for
your users, and it is usually worse because global ordinals for the `join`
field may be rebuilt several times within a single refresh interval when many writes
are occurring.
|
||||
|
||||
When the `join` field is used infrequently and writes occur frequently it may
|
||||
make sense to disable eager loading:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT my_index
|
||||
{
|
||||
"mappings": {
|
||||
"doc": {
|
||||
"properties": {
|
||||
"my_join_field": {
|
||||
"type": "join",
|
||||
"relations": {
|
||||
"my_parent": "my_child"
|
||||
},
|
||||
"eager_global_ordinals": false
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
The amount of heap used by global ordinals can be checked per parent relation
|
||||
as follows:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
# Per-index
|
||||
GET _stats/fielddata?human&fields=my_join_field#my_parent
|
||||
|
||||
# Per-node per-index
|
||||
GET _nodes/stats/indices/fielddata?human&fields=my_join_field#my_parent
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
==== Multiple levels of parent join
|
||||
|
||||
It is also possible to define multiple children for a single parent:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT my_index
|
||||
{
|
||||
"mappings": {
|
||||
"doc": {
|
||||
"properties": {
|
||||
"my_join_field": {
|
||||
"type": "join",
|
||||
"relations": {
|
||||
"my_parent": ["my_child", "another_child"] <1>
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
<1> `my_parent` is parent of `my_child` and `another_child`.
|
||||
|
||||
... and multiple levels of parent/child:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT my_index
|
||||
{
|
||||
"mappings": {
|
||||
"doc": {
|
||||
"properties": {
|
||||
"my_join_field": {
|
||||
"type": "join",
|
||||
"relations": {
|
||||
"my_parent": ["my_child", "another_child"], <1>
|
||||
"another_child": "grand_child" <2>
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
|
||||
<1> `my_parent` is parent of `my_child` and `another_child`
|
||||
<2> `another_child` is parent of `grand_child`
|
||||
|
||||
The mapping above represents the following tree:
|
||||
|
||||
         my_parent
         /    \
        /      \
  my_child  another_child
                |
                |
           grand_child
|
||||
|
||||
Indexing a grandchild document requires a `routing` value equal
to the id of the grandparent (the top-level parent of the lineage):
|
||||
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT my_index/doc/3?routing=1&refresh <1>
|
||||
{
|
||||
"text": "This is a grand child document",
|
||||
"my_join_field": {
|
||||
"name": "grand_child",
|
||||
"parent": "2" <2>
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// CONSOLE
|
||||
// TEST[continued]
|
||||
|
||||
<1> This child document must be on the same shard as its grandparent and parent
|
||||
<2> The parent id of this document (must point to an `another_child` document)
|
||||
|
||||
|
|
@ -1,7 +1,7 @@
|
|||
[[search-request-inner-hits]]
|
||||
=== Inner hits
|
||||
|
||||
The <<mapping-parent-field, parent/child>> and <<nested, nested>> features allow the return of documents that
|
||||
The <<mapping-parent-join-field, parent-join>> and <<nested, nested>> features allow the return of documents that
|
||||
have matches in a different scope. In the parent/child case, parent documents are returned based on matches in child
|
||||
documents or child documents are returned based on matches in parent documents. In the nested case, documents are returned
|
||||
based on matches in nested inner objects.
|
||||
|
@ -103,11 +103,11 @@ PUT test/doc/1?refresh
|
|||
"comments": [
|
||||
{
|
||||
"author": "kimchy",
|
||||
"text": "comment text"
|
||||
"number": 1
|
||||
},
|
||||
{
|
||||
"author": "nik9000",
|
||||
"text": "words words words"
|
||||
"number": 2
|
||||
}
|
||||
]
|
||||
}
|
||||
|
@ -118,7 +118,7 @@ POST test/_search
|
|||
"nested": {
|
||||
"path": "comments",
|
||||
"query": {
|
||||
"match": {"comments.text" : "words"}
|
||||
"match": {"comments.number" : 2}
|
||||
},
|
||||
"inner_hits": {} <1>
|
||||
}
|
||||
|
@ -137,29 +137,29 @@ An example of a response snippet that could be generated from the above search r
|
|||
...,
|
||||
"hits": {
|
||||
"total": 1,
|
||||
"max_score": 1.0444683,
|
||||
"max_score": 1.0,
|
||||
"hits": [
|
||||
{
|
||||
"_index": "test",
|
||||
"_type": "doc",
|
||||
"_id": "1",
|
||||
"_score": 1.0444683,
|
||||
"_score": 1.0,
|
||||
"_source": ...,
|
||||
"inner_hits": {
|
||||
"comments": { <1>
|
||||
"hits": {
|
||||
"total": 1,
|
||||
"max_score": 1.0444683,
|
||||
"max_score": 1.0,
|
||||
"hits": [
|
||||
{
|
||||
"_nested": {
|
||||
"field": "comments",
|
||||
"offset": 1
|
||||
},
|
||||
"_score": 1.0444683,
|
||||
"_score": 1.0,
|
||||
"_source": {
|
||||
"author": "nik9000",
|
||||
"text": "words words words"
|
||||
"number": 2
|
||||
}
|
||||
}
|
||||
]
|
||||
|
@ -425,33 +425,39 @@ This indirect referencing is only supported for nested inner hits.
|
|||
[[parent-child-inner-hits]]
|
||||
==== Parent/child inner hits
|
||||
|
||||
The parent/child `inner_hits` can be used to include parent or child
|
||||
The parent/child `inner_hits` can be used to include parent or child:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
PUT test
|
||||
{
|
||||
"settings": {
|
||||
"mapping.single_type": false
|
||||
},
|
||||
"mappings": {
|
||||
"my_parent": {},
|
||||
"my_child": {
|
||||
"_parent": {
|
||||
"type": "my_parent"
|
||||
"doc": {
|
||||
"properties": {
|
||||
"my_join_field": {
|
||||
"type": "join",
|
||||
"relations": {
|
||||
"my_parent": "my_child"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
PUT test/my_parent/1?refresh
|
||||
PUT test/doc/1?refresh
|
||||
{
|
||||
"test": "test"
|
||||
"number": 1,
|
||||
"my_join_field": "my_parent"
|
||||
}
|
||||
|
||||
PUT test/my_child/1?parent=1&refresh
|
||||
PUT test/doc/2?routing=1&refresh
|
||||
{
|
||||
"test": "test"
|
||||
"number": 1,
|
||||
"my_join_field": {
|
||||
"name": "my_child",
|
||||
"parent": "1"
|
||||
}
|
||||
}
|
||||
|
||||
POST test/_search
|
||||
|
@ -461,7 +467,7 @@ POST test/_search
|
|||
"type": "my_child",
|
||||
"query": {
|
||||
"match": {
|
||||
"test": "test"
|
||||
"number": 1
|
||||
}
|
||||
},
|
||||
"inner_hits": {} <1>
|
||||
|
@ -478,40 +484,59 @@ An example of a response snippet that could be generated from the above search r
|
|||
[source,js]
|
||||
--------------------------------------------------
|
||||
{
|
||||
...,
|
||||
"hits": {
|
||||
"total": 1,
|
||||
"max_score": 1.0,
|
||||
"hits": [
|
||||
{
|
||||
"_index": "test",
|
||||
"_type": "my_parent",
|
||||
"_id": "1",
|
||||
"_score": 1.0,
|
||||
"_source": ...,
|
||||
"inner_hits": {
|
||||
"my_child": {
|
||||
"hits": {
|
||||
"total": 1,
|
||||
"max_score": 0.18232156,
|
||||
"hits": [
|
||||
{
|
||||
"_type": "my_child",
|
||||
"_id": "1",
|
||||
"_score": 0.18232156,
|
||||
"_routing": "1",
|
||||
"_parent": "1",
|
||||
"_source": {
|
||||
"test": "test"
|
||||
}
|
||||
...,
|
||||
"hits": {
|
||||
"total": 1,
|
||||
"max_score": 1.0,
|
||||
"hits": [
|
||||
{
|
||||
"_index": "test",
|
||||
"_type": "doc",
|
||||
"_id": "1",
|
||||
"_score": 1.0,
|
||||
"_source": {
|
||||
"number": 1,
|
||||
"my_join_field": "my_parent"
|
||||
},
|
||||
"fields": {
|
||||
"my_join_field": [
|
||||
"my_parent"
|
||||
]
|
||||
},
|
||||
"inner_hits": {
|
||||
"my_child": {
|
||||
"hits": {
|
||||
"total": 1,
|
||||
"max_score": 1.0,
|
||||
"hits": [
|
||||
{
|
||||
"_type": "doc",
|
||||
"_id": "2",
|
||||
"_score": 1.0,
|
||||
"_routing": "1",
|
||||
"_source": {
|
||||
"number": 1,
|
||||
"my_join_field": {
|
||||
"name": "my_child",
|
||||
"parent": "1"
|
||||
}
|
||||
},
|
||||
"fields": {
|
||||
"my_join_field": [
|
||||
"my_child"
|
||||
],
|
||||
"my_join_field#my_parent": [
|
||||
"1"
|
||||
]
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
// TESTRESPONSE[s/"_source": \.\.\./"_source": $body.hits.hits.0._source/]
|
||||
|
|