OpenSearch/docs/reference/search/request/inner-hits.asciidoc

246 lines
8.6 KiB
Plaintext

[[search-request-inner-hits]]
=== Inner hits
The <<mapping-parent-field, parent/child>> and <<mapping-nested-type, nested>> features allow to return documents that
have matches in a different scope. In the parent/child case parent document are returned based on matches in child
documents or child document are returned based on matches in parent documents. In the nested case documents are returned
based on matches in nested inner objects.
In both cases the actual matches in the different scopes that caused a document to be returned is hidden. In many cases
it is very useful to know which inner nested objects in the case of nested or children or parent documents in the case
of parent/child caused certain information to be returned. The inner hits feature can be used for this. This feature
returns per search hit in the search response additional nested hits that caused a search hit to match in a different scope.
The following snippet explains the basic structure of inner hits:
[source,js]
--------------------------------------------------
"inner_hits" : {
"<inner_hits_name>" : {
"<path|type>" : {
"<path-to-nested-object-field|child-or-parent-type>" : {
<inner_hits_body>
[,"inner_hits" : { [<sub_inner_hits>]+ } ]?
}
}
}
[,"<inner_hits_name_2>" : { ... } ]*
}
--------------------------------------------------
Inside the `inner_hits` definition, first the name if the inner hit is defined then whether the inner_hit
is a nested by defining `path` or a parent/child based definition by defining `type`. The next object layer contains
the name of the nested object field if the inner_hits is nested or the parent or child type if the inner_hit definition
is parent/child based.
Multiple inner hit definitions can be defined in a single request. In the `<inner_hits_body>` any option for features
that `inner_hits` support can be defined. Optionally another `inner_hits` definition can be defined in the `<inner_hits_body>`.
If `inner_hits` is defined each search will contain a `inner_hits` json object with the following structure:
[source,js]
--------------------------------------------------
"hits": [
{
"_index": ...,
"_type": ...,
"_id": ...,
"inner_hits": {
"<inner_hits_name>": {
"hits": {
"total": ...,
"hits": [
{
"_type": ...,
"_id": ...,
...
},
...
]
}
}
},
...
},
...
]
--------------------------------------------------
==== Options
Inner hits support the following options:
[horizontal]
`path`:: Defines the nested scope where hits will be collected from.
`type`:: Defines the parent or child type score where hits will be collected from.
`query`:: Defines the query that will run in the defined nested, parent or child scope to collect and score hits. By default all document in the scope will be matched.
`from`:: The offset from where the first hit to fetch for each `inner_hits` in the returned regular search hits.
`size`:: The maximum number of hits to return per `inner_hits`. By default the top three matching hits are returned.
`sort`:: How the inner hits should be sorted per `inner_hits`. By default the hits are sorted by the score.
Either `path` or `type` must be defined. The `path` or `type` defines the scope from where hits are fetched and
used as inner hits.
Inner hits also supports the following per document features:
* <<search-request-highlighting,Highlighting>>
* <<search-request-explain,Explain>>
* <<search-request-source-filtering,Source filtering>>
* <<search-request-script-fields,Script fields>>
* <<search-request-fielddata-fields,Fielddata fields>>
* <<search-request-version,Include versions>>
[[nested-inner-hits]]
==== Nested inner hits
The nested `inner_hits` can be used to include nested inner objects as inner hits to a search hit.
The example below assumes that there is a nested object field defined with the name `comments`:
[source,js]
--------------------------------------------------
{
"query" : {
"nested" : {
"path" : "comments",
"query" : {
"match" : {"comments.message" : "[actual query]"}
}
}
},
"inner_hits" : {
"comment" : {
"path" : { <1>
"comments" : { <2>
"query" : {
"match" : {"comments.message" : "[actual query]"}
}
}
}
}
}
}
--------------------------------------------------
<1> The inner hit definition is nested and requires the `path` option.
<2> The path option refers to the nested object field `comments`
In the above the query is repeated in both the query and the `comment` inner hit definition. At the moment there is
no query referencing support, so in order to make sure that only inner nested objects are returned that contributed to
the matching of the regular hits, the inner query in the `nested` query needs to also be defined on the inner hits definition.
An example of a response snippet that could be generated from the above search request:
[source,js]
--------------------------------------------------
...
"hits": {
...
"hits": [
{
"_index": "my-index",
"_type": "question",
"_id": "1",
"_source": ...,
"inner_hits": {
"comment": {
"hits": {
"total": ...,
"hits": [
{
"_type": "question",
"_id": "1",
"_nested": {
"field": "comments",
"offset": 2
},
"_source": ...
},
...
]
}
}
}
},
...
--------------------------------------------------
The `_nested` metadata is crucial in the above example, because it defines from what inner nested object this inner hit
came from. The `field` defines the object array field the nested hit is from and the `offset` relative to its location
in the `_source`. Due to sorting and scoring the actual location of the hit objects in the `inner_hits` is usually
different than the location a nested inner object was defined.
By default the `_source` is returned also for the hit objects in `inner_hits`, but this can be changed. Either via
`_source` filtering feature part of the source can be returned or be disabled. If stored fields are defined on the
nested level these can also be returned via the `fields` feature.
An important default is that the `_source` returned in hits inside `inner_hits` is relative to the `_nested` metadata.
So in the above example only the comment part is returned per nested hit and not the entire source of the top level
document that contained the the comment.
[[parent-child-inner-hits]]
==== Parent/child inner hits
The parent/child `inner_hits` can be used to include parent or child
The examples below assumes that there is a `_parent` field mapping in the `comment` type:
[source,js]
--------------------------------------------------
{
"query" : {
"has_child" : {
"type" : "comment",
"query" : {
"match" : {"message" : "[actual query]"}
}
}
},
"inner_hits" : {
"comment" : {
"type" : { <1>
"comment" : { <2>
"query" : {
"match" : {"message" : "[actual query]"}
}
}
}
}
}
}
--------------------------------------------------
<1> This is a parent/child inner hit definition and requires the `type` option.
<2> Refers to the document type `comment`
An example of a response snippet that could be generated from the above search request:
[source,js]
--------------------------------------------------
...
"hits": {
...
"hits": [
{
"_index": "my-index",
"_type": "question",
"_id": "1",
"_source": ...,
"inner_hits": {
"comment": {
"hits": {
"total": ...,
"hits": [
{
"_type": "comment",
"_id": "5",
"_source": ...
},
...
]
}
}
}
},
...
--------------------------------------------------