[[search-request-collapse]]
=== Field Collapsing

Collapses search results based on field values.
Collapsing selects only the top sorted document per collapse key.
For instance, the query below retrieves the best tweet for each user and sorts them by number of likes.

[source,js]
--------------------------------------------------
GET /twitter/tweet/_search
{
    "query": {
        "match": {
            "message": "elasticsearch"
        }
    },
    "collapse" : {
        "field" : "user" <1>
    },
    "sort": ["likes"], <2>
    "from": 10 <3>
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
<1> collapse the result set using the "user" field
<2> sort the top docs by number of likes
<3> define the offset of the first collapsed result

WARNING: The total number of hits in the response indicates the number of matching documents without collapsing.
The total number of distinct groups is unknown.

The field used for collapsing must be a single-valued <<keyword, `keyword`>> or <<number, `numeric`>> field with <<doc-values, `doc_values`>> activated.
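
A minimal mapping sketch that satisfies this requirement, assuming the `twitter` index is created from scratch with a `tweet` type and that each document holds exactly one `user` value. `doc_values` are enabled by default for `keyword` fields; the setting is spelled out here only to make the requirement visible.

[source,js]
--------------------------------------------------
PUT /twitter
{
    "mappings": {
        "tweet": {
            "properties": {
                "user": {
                    "type": "keyword", <1>
                    "doc_values": true <2>
                }
            }
        }
    }
}
--------------------------------------------------
// CONSOLE
<1> a `keyword` field, which is not analyzed and can serve as a collapse key
<2> explicit here for illustration only; this is already the default for `keyword` fields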

NOTE: The collapsing is applied to the top hits only and does not affect aggregations.
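
Because aggregations still run over every matching document, a `cardinality` aggregation on the collapse field is one way to approximate the otherwise unknown number of distinct groups. The sketch below assumes the same `twitter` index and `user` field as the request above; `distinct_users` is an arbitrary aggregation name chosen for illustration, and the value it returns is an approximate count rather than an exact one.

[source,js]
--------------------------------------------------
GET /twitter/tweet/_search
{
    "query": {
        "match": {
            "message": "elasticsearch"
        }
    },
    "collapse" : {
        "field" : "user"
    },
    "aggs": {
        "distinct_users": { <1>
            "cardinality": {
                "field": "user"
            }
        }
    }
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
<1> approximate number of distinct `user` values among all matching documents, computed independently of collapsing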

==== Expand collapse results

It is also possible to expand each collapsed top hit with the `inner_hits` option.

[source,js]
--------------------------------------------------
GET /twitter/tweet/_search
{
    "query": {
        "match": {
            "message": "elasticsearch"
        }
    },
    "collapse" : {
        "field" : "user", <1>
        "inner_hits": {
            "name": "last_tweets", <2>
            "size": 5, <3>
            "sort": [{ "date": "asc" }] <4>
        },
        "max_concurrent_group_searches": 4 <5>
    },
    "sort": ["likes"]
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
<1> collapse the result set using the "user" field
<2> the name used for the inner hit section in the response
<3> the number of `inner_hits` to retrieve per collapse key
<4> how to sort the documents inside each group
<5> the number of concurrent requests allowed to retrieve the `inner_hits` per group

See <<search-request-inner-hits, inner hits>> for the complete list of supported options and the format of the response.

It is also possible to request multiple `inner_hits` for each collapsed hit. This can be useful when you want to get
multiple representations of the collapsed hits.

[source,js]
--------------------------------------------------
GET /twitter/tweet/_search
{
    "query": {
        "match": {
            "message": "elasticsearch"
        }
    },
    "collapse" : {
        "field" : "user", <1>
        "inner_hits": [
            {
                "name": "most_liked", <2>
                "size": 3,
                "sort": [{ "likes": "desc" }]
            },
            {
                "name": "most_recent", <3>
                "size": 3,
                "sort": [{ "date": "desc" }]
            }
        ]
    },
    "sort": ["likes"]
}
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
<1> collapse the result set using the "user" field
<2> return the three most liked tweets for the user
<3> return the three most recent tweets for the user

The group is expanded by sending an additional query for each `inner_hit` request for each collapsed hit
returned in the response. For example, a response with 10 collapsed hits and two `inner_hit` requests
triggers 20 additional queries. This can significantly slow things down if you have too many groups
and/or `inner_hit` requests.

The `max_concurrent_group_searches` request parameter can be used to control
the maximum number of concurrent searches allowed in this phase.
The default is based on the number of data nodes and the default search thread pool size.

WARNING: `collapse` cannot be used in conjunction with <<search-request-scroll, scroll>>,
<<search-request-rescore, rescore>> or <<search-request-search-after, search after>>.