2013-08-28 19:24:34 -04:00
|
|
|
[[query-dsl-top-children-query]]
|
|
|
|
=== Top Children Query
|
|
|
|
|
|
|
|
The `top_children` query runs the child query with an estimated hits
|
|
|
|
size, and out of the hit docs, aggregates it into parent docs. If there
|
|
|
|
aren't enough parent docs matching the requested from/size search
|
|
|
|
request, then it is run again with a wider (more hits) search.
|
|
|
|
|
|
|
|
The `top_children` also provide scoring capabilities, with the ability
|
|
|
|
to specify `max`, `sum` or `avg` as the score type.
|
|
|
|
|
|
|
|
One downside of using the `top_children` is that if there are more child
|
|
|
|
docs matching the required hits when executing the child query, then the
|
|
|
|
`total_hits` result of the search response will be incorrect.
|
|
|
|
|
|
|
|
How many hits are asked for in the first child query run is controlled
|
|
|
|
using the `factor` parameter (defaults to `5`). For example, when asking
|
|
|
|
for 10 parent docs (with `from` set to 0), then the child query will
|
|
|
|
execute with 50 hits expected. If not enough parents are found (in our
|
|
|
|
example 10), and there are still more child docs to query, then the
|
|
|
|
child search hits are expanded by multiplying by the
|
|
|
|
`incremental_factor` (defaults to `2`).
|
|
|
|
|
|
|
|
The required parameters are the `query` and `type` (the child type to
|
|
|
|
execute the query on). Here is an example with all different parameters,
|
|
|
|
including the default values:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
{
|
|
|
|
"top_children" : {
|
|
|
|
"type": "blog_tag",
|
|
|
|
"query" : {
|
|
|
|
"term" : {
|
|
|
|
"tag" : "something"
|
|
|
|
}
|
|
|
|
},
|
|
|
|
"score" : "max",
|
|
|
|
"factor" : 5,
|
|
|
|
"incremental_factor" : 2
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
[float]
|
|
|
|
==== Scope
|
|
|
|
|
|
|
|
A `_scope` can be defined on the query allowing to run facets on the
|
|
|
|
same scope name that will work against the child documents. For example:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
{
|
|
|
|
"top_children" : {
|
|
|
|
"_scope" : "my_scope",
|
|
|
|
"type": "blog_tag",
|
|
|
|
"query" : {
|
|
|
|
"term" : {
|
|
|
|
"tag" : "something"
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
}
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
[float]
|
|
|
|
==== Memory Considerations
|
|
|
|
|
2014-06-21 10:32:29 -04:00
|
|
|
In order to support parent-child joins, all of the (string) parent IDs
|
|
|
|
must be resident in memory (in the <<index-modules-fielddata,field data cache>>.
|
|
|
|
Additionaly, every child document is mapped to its parent using a long
|
|
|
|
value (approximately). It is advisable to keep the string parent ID short
|
|
|
|
in order to reduce memory usage.
|
|
|
|
|
|
|
|
You can check how much memory is being used by the ID cache using the
|
|
|
|
<<indices-stats,indices stats>> or <<cluster-nodes-stats,nodes stats>>
|
|
|
|
APIS, eg:
|
|
|
|
|
|
|
|
[source,js]
|
|
|
|
--------------------------------------------------
|
|
|
|
curl -XGET "http://localhost:9200/_stats/id_cache?pretty&human"
|
|
|
|
--------------------------------------------------
|
|
|
|
|
|
|
|
|