OpenSearch/docs/reference/query-dsl/queries/top-children-query.asciidoc

85 lines
2.8 KiB
Plaintext

[[query-dsl-top-children-query]]
=== Top Children Query
The `top_children` query runs the child query with an estimated hits
size, and out of the hit docs, aggregates it into parent docs. If there
aren't enough parent docs matching the requested from/size search
request, then it is run again with a wider (more hits) search.
The `top_children` also provide scoring capabilities, with the ability
to specify `max`, `sum` or `avg` as the score type.
One downside of using the `top_children` is that if there are more child
docs matching the required hits when executing the child query, then the
`total_hits` result of the search response will be incorrect.
How many hits are asked for in the first child query run is controlled
using the `factor` parameter (defaults to `5`). For example, when asking
for 10 parent docs (with `from` set to 0), then the child query will
execute with 50 hits expected. If not enough parents are found (in our
example 10), and there are still more child docs to query, then the
child search hits are expanded by multiplying by the
`incremental_factor` (defaults to `2`).
The required parameters are the `query` and `type` (the child type to
execute the query on). Here is an example with all different parameters,
including the default values:
[source,js]
--------------------------------------------------
{
"top_children" : {
"type": "blog_tag",
"query" : {
"term" : {
"tag" : "something"
}
},
"score" : "max",
"factor" : 5,
"incremental_factor" : 2
}
}
--------------------------------------------------
[float]
==== Scope
A `_scope` can be defined on the query allowing to run aggregations on the
same scope name that will work against the child documents. For example:
[source,js]
--------------------------------------------------
{
"top_children" : {
"_scope" : "my_scope",
"type": "blog_tag",
"query" : {
"term" : {
"tag" : "something"
}
}
}
}
--------------------------------------------------
[float]
==== Memory Considerations
In order to support parent-child joins, all of the (string) parent IDs
must be resident in memory (in the <<index-modules-fielddata,field data cache>>.
Additionaly, every child document is mapped to its parent using a long
value (approximately). It is advisable to keep the string parent ID short
in order to reduce memory usage.
You can check how much memory is being used by the ID cache using the
<<indices-stats,indices stats>> or <<cluster-nodes-stats,nodes stats>>
APIS, eg:
[source,js]
--------------------------------------------------
curl -XGET "http://localhost:9200/_stats/id_cache?pretty&human"
--------------------------------------------------