Merge pull request #18519 from MaineC/docs/add_console_to_search

Add console to docs for inner hits, explain, and friends
Isabel Drost-Fromm 2016-08-01 12:10:09 +02:00 committed by GitHub
commit 5104437e4f
5 changed files with 341 additions and 204 deletions

@ -11,39 +11,126 @@ type respectively.
[float]
=== Usage
Imagine having indexed the following document:
[source,js]
----------------------------------------
PUT /twitter/tweet/1?refresh
{
"user": "kimchy",
"message": "search"
}
---------------------------------------
// CONSOLE
// TESTSETUP
Full query example:
[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/twitter/tweet/1/_explain' -d '{
"query" : {
--------------------------------------
GET /twitter/tweet/1/_explain
{
"query" : {
"term" : { "message" : "search" }
}
}'
--------------------------------------------------
}
}
--------------------------------------
// CONSOLE
This will yield the following result:
[source,js]
--------------------------------------------------
{
"matches" : true,
"explanation" : {
"value" : 0.15342641,
"description" : "fieldWeight(message:search in 0), product of:",
"details" : [ {
"value" : 1.0,
"description" : "tf(termFreq(message:search)=1)"
}, {
"value" : 0.30685282,
"description" : "idf(docFreq=1, maxDocs=1)"
}, {
"value" : 0.5,
"description" : "fieldNorm(field=message, doc=0)"
} ]
}
}
"_index": "twitter",
"_type": "tweet",
"_id": "1",
"matched": true,
"explanation": {
"value": 0.2876821,
"description": "sum of:",
"details": [
{
"value": 0.2876821,
"description": "weight(message:search in 0) [PerFieldSimilarity], result of:",
"details": [
{
"value": 0.2876821,
"description": "score(doc=0,freq=1.0 = termFreq=1.0\n), product of:",
"details": [
{
"value": 0.2876821,
"description": "idf(docFreq=1, docCount=1)",
"details": [ ]
},
{
"value": 1.0,
"description": "tfNorm, computed from:",
"details" : [
{
"value" : 1.0,
"description" : "termFreq=1.0",
"details" : [ ]
},
{
"value" : 1.2,
"description" : "parameter k1",
"details" : [ ]
},
{
"value" : 0.75,
"description" : "parameter b",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "avgFieldLength",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "fieldLength",
"details" : [ ]
}
]
}
]
}
]
},
{
"value" : 0.0,
"description" : "match on required clause, product of:",
"details" : [
{
"value" : 0.0,
"description" : "# clause",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "_type:tweet, product of:",
"details" : [
{
"value" : 1.0,
"description" : "boost",
"details" : [ ]
},
{
"value" : 1.0,
"description" : "queryNorm",
"details" : [ ]
}
]
}
]
}
]
}
}
--------------------------------------------------
// TESTRESPONSE
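The nested `details` above show the score as the product of an `idf` term and a `tfNorm` term. As a rough sketch, the numbers can be reproduced in Python, assuming the usual BM25 formulas (the response only lists the inputs, so the formulas themselves are an assumption here, and the function names are made up for illustration):

```python
import math


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    """Inverse document frequency in its usual BM25 form (assumed)."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_tf_norm(freq: float, k1: float, b: float,
                 field_length: float, avg_field_length: float) -> float:
    """Term frequency normalisation in its usual BM25 form (assumed)."""
    length_ratio = field_length / avg_field_length
    return (freq * (k1 + 1)) / (freq + k1 * (1 - b + b * length_ratio))


# Inputs copied from the explanation above.
idf = bm25_idf(doc_freq=1, doc_count=1)
tf_norm = bm25_tf_norm(freq=1.0, k1=1.2, b=0.75,
                       field_length=1.0, avg_field_length=1.0)
score = idf * tf_norm
print(round(score, 7))
```

With a single one-term document, `tfNorm` comes out as 1.0, so the score equals the `idf` value.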
There is also a simpler way of specifying the query via the `q`
parameter. The specified `q` parameter value is then parsed as if the
@ -52,8 +139,9 @@ explain api:
[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/twitter/tweet/1/_explain?q=message:search'
GET /twitter/tweet/1/_explain?q=message:search
--------------------------------------------------
// CONSOLE
This will yield the same result as the previous request.
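For clients that assemble the request themselves, the `q` variant is just a query string on the same endpoint. A minimal sketch using only the Python standard library; the helper name `explain_url` and the default host are assumptions for illustration:

```python
from urllib.parse import quote, urlencode


def explain_url(index: str, doc_type: str, doc_id: str, q: str,
                host: str = "http://localhost:9200") -> str:
    """Build the explain URL for one document, passing the query via `q`."""
    # Escape each path segment separately so a '/' in an id cannot
    # change the shape of the path.
    path = "/".join(quote(part, safe="") for part in (index, doc_type, doc_id))
    return f"{host}/{path}/_explain?{urlencode({'q': q})}"


url = explain_url("twitter", "tweet", "1", "message:search")
print(url)
```

`urlencode` percent-encodes the colon in `message:search`; the server decodes it back, so the request is equivalent to the one shown above.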

@ -9,6 +9,28 @@ available in the Lucene index. This can be useful to explore a dataset which
you don't know much about. For example, this allows creating a histogram
aggregation with meaningful intervals based on the min/max range of values.
For the following examples, let's assume the following indexed data:
[source,js]
-------------------------------------------------
PUT /github/user/1?refresh
{
"user": "kimchy",
"project": "elasticsearch",
"rating": "great project"
}
PUT /twitter/tweet/1?refresh
{
"user": "kimchy",
"message": "you know, for search",
"rating": 10
}
------------------------------------------------
// CONSOLE
// TESTSETUP
The field stats API by default executes on all indices, but can execute on
specific indices too.
@ -16,15 +38,17 @@ All indices:
[source,js]
--------------------------------------------------
curl -XGET "http://localhost:9200/_field_stats?fields=rating"
GET /_field_stats?fields=rating
--------------------------------------------------
// CONSOLE
Specific indices:
[source,js]
--------------------------------------------------
curl -XGET "http://localhost:9200/index1,index2/_field_stats?fields=rating"
GET /twitter,github/_field_stats?fields=rating
--------------------------------------------------
// CONSOLE
Supported request options:
@ -38,10 +62,12 @@ Alternatively the `fields` option can also be defined in the request body:
[source,js]
--------------------------------------------------
curl -XPOST "http://localhost:9200/_field_stats?level=indices" -d '{
POST /_field_stats?level=indices
{
"fields" : ["rating"]
}'
}
--------------------------------------------------
// CONSOLE
This is equivalent to the previous request.
@ -114,8 +140,9 @@ Request:
[source,js]
--------------------------------------------------
curl -XGET "http://localhost:9200/_field_stats?fields=rating,answer_count,creation_date,display_name"
GET /_field_stats?fields=rating,user,project,message
--------------------------------------------------
// CONSOLE
Response:
@ -123,101 +150,65 @@ Response:
--------------------------------------------------
{
"_shards": {
"total": 1,
"successful": 1,
"total": 10,
"successful": 10,
"failed": 0
},
"indices": {
"_all": { <1>
"_all": { <1>
"fields": {
"creation_date": {
"max_doc": 1326564,
"doc_count": 564633,
"density": 42,
"sum_doc_freq": 2258532,
"sum_total_term_freq": -1,
"min_value": "2008-08-01T16:37:51.513Z",
"max_value": "2013-06-02T03:23:11.593Z",
"is_searchable": "true",
"is_aggregatable": "true"
"project": {
"max_doc": 1,
"doc_count": 1,
"density": 100,
"sum_doc_freq": 1,
"sum_total_term_freq": 1,
"type": "string",
"searchable": true,
"aggregatable": false,
"min_value": "elasticsearch",
"max_value": "elasticsearch"
},
"display_name": {
"max_doc": 1326564,
"doc_count": 126741,
"density": 9,
"sum_doc_freq": 166535,
"sum_total_term_freq": 166616,
"min_value": "0",
"max_value": "정혜선",
"is_searchable": "true",
"is_aggregatable": "false"
"message": {
"max_doc": 1,
"doc_count": 1,
"density": 100,
"sum_doc_freq": 4,
"sum_total_term_freq": 4,
"type": "string",
"searchable": true,
"aggregatable": false,
"min_value": "for",
"max_value": "you"
},
"answer_count": {
"max_doc": 1326564,
"doc_count": 139885,
"density": 10,
"sum_doc_freq": 559540,
"sum_total_term_freq": -1,
"min_value": 0,
"max_value": 160,
"is_searchable": "true",
"is_aggregatable": "true"
},
"rating": {
"max_doc": 1326564,
"doc_count": 437892,
"density": 33,
"sum_doc_freq": 1751568,
"sum_total_term_freq": -1,
"min_value": -14,
"max_value": 1277,
"is_searchable": "true",
"is_aggregatable": "true"
}
}
}
}
}
--------------------------------------------------
<1> The `_all` key indicates that it contains the field stats of all indices in the cluster.
NOTE: When using the cluster level field statistics it is possible to have conflicts if the same field is used in
different indices with incompatible types. For instance a field of type `long` is not compatible with a field of
type `float` or `string`. A section named `conflicts` is added to the response if one or more conflicts are raised.
It contains all the fields with conflicts and the reason of the incompatibility.
[source,js]
--------------------------------------------------
{
"_shards": {
"total": 1,
"successful": 1,
"failed": 0
},
"indices": {
"_all": {
"fields": {
"creation_date": {
"max_doc": 1326564,
"doc_count": 564633,
"density": 42,
"sum_doc_freq": 2258532,
"sum_total_term_freq": -1,
"min_value": "2008-08-01T16:37:51.513Z",
"max_value": "2013-06-02T03:23:11.593Z",
"is_searchable": "true",
"is_aggregatable": "true"
"user": {
"max_doc": 2,
"doc_count": 2,
"density": 100,
"sum_doc_freq": 2,
"sum_total_term_freq": 2,
"type": "string",
"searchable": true,
"aggregatable": false,
"min_value": "kimchy",
"max_value": "kimchy"
}
}
}
},
"conflicts": {
"field_name_in_conflict1": "reason1",
"field_name_in_conflict2": "reason2"
"rating": "Field [rating] of type [integer] conflicts with existing field of type [string] in other index." <2>
}
}
--------------------------------------------------
// TESTRESPONSE
<1> The `_all` key indicates that it contains the field stats of all indices in the cluster.
<2> When using the cluster level field statistics it is possible to have conflicts if the same field is used in
different indices with incompatible types. For instance a field of type `long` is not compatible with a field of
type `float` or `string`. A section named `conflicts` is added to the response if one or more conflicts are raised.
It contains all the fields with conflicts and the reason of the incompatibility.
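A client reading a cluster level response probably wants to treat the two sections separately. The following sketch assumes the response has already been parsed into a Python dict; the helper name `split_field_stats` is hypothetical and the sample is trimmed from the response above:

```python
def split_field_stats(response: dict) -> tuple:
    """Return (usable field stats, conflicts) from a cluster level response."""
    fields = response.get("indices", {}).get("_all", {}).get("fields", {})
    # The `conflicts` section is only present when conflicts were raised.
    conflicts = response.get("conflicts", {})
    return fields, conflicts


sample = {
    "indices": {
        "_all": {
            "fields": {
                "user": {"type": "string", "min_value": "kimchy", "max_value": "kimchy"}
            }
        }
    },
    "conflicts": {
        "rating": "Field [rating] of type [integer] conflicts with "
                  "existing field of type [string] in other index."
    },
}

fields, conflicts = split_field_stats(sample)
for name, reason in conflicts.items():
    print(f"skipping {name}: {reason}")
```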
[float]
==== Indices level field statistics example
@ -226,8 +217,9 @@ Request:
[source,js]
--------------------------------------------------
curl -XGET "http://localhost:9200/_field_stats?fields=rating,answer_count,creation_date,display_name&level=indices"
GET /_field_stats?fields=rating,user,project,message&level=indices
--------------------------------------------------
// CONSOLE
Response:
@ -235,63 +227,97 @@ Response:
--------------------------------------------------
{
"_shards": {
"total": 1,
"successful": 1,
"total": 10,
"successful": 10,
"failed": 0
},
"indices": {
"stack": { <1>
"github": {
"fields": {
"creation_date": {
"max_doc": 1326564,
"doc_count": 564633,
"density": 42,
"sum_doc_freq": 2258532,
"sum_total_term_freq": -1,
"min_value": "2008-08-01T16:37:51.513Z",
"max_value": "2013-06-02T03:23:11.593Z",
"is_searchable": "true",
"is_aggregatable": "true"
},
"display_name": {
"max_doc": 1326564,
"doc_count": 126741,
"density": 9,
"sum_doc_freq": 166535,
"sum_total_term_freq": 166616,
"min_value": "0",
"max_value": "정혜선",
"is_searchable": "true",
"is_aggregatable": "false"
},
"answer_count": {
"max_doc": 1326564,
"doc_count": 139885,
"density": 10,
"sum_doc_freq": 559540,
"sum_total_term_freq": -1,
"min_value": 0,
"max_value": 160,
"is_searchable": "true",
"is_aggregatable": "true"
},
"rating": {
"max_doc": 1326564,
"doc_count": 437892,
"density": 33,
"sum_doc_freq": 1751568,
"sum_total_term_freq": -1,
"min_value": -14,
"max_value": 1277,
"is_searchable": "true",
"is_aggregatable": "true"
"max_doc": 1,
"doc_count": 1,
"density": 100,
"sum_doc_freq": 2,
"sum_total_term_freq": 2,
"type": "string",
"searchable": true,
"aggregatable": false,
"min_value": "great",
"max_value": "project"
},
"project": {
"max_doc": 1,
"doc_count": 1,
"density": 100,
"sum_doc_freq": 1,
"sum_total_term_freq": 1,
"type": "string",
"searchable": true,
"aggregatable": false,
"min_value": "elasticsearch",
"max_value": "elasticsearch"
},
"user": {
"max_doc": 1,
"doc_count": 1,
"density": 100,
"sum_doc_freq": 1,
"sum_total_term_freq": 1,
"type": "string",
"searchable": true,
"aggregatable": false,
"min_value": "kimchy",
"max_value": "kimchy"
}
}
},
"twitter": {
"fields": {
"rating": {
"max_doc": 1,
"doc_count": 1,
"density": 100,
"sum_doc_freq": -1,
"sum_total_term_freq": 1,
"type": "integer",
"searchable": true,
"aggregatable": true,
"min_value": 10,
"min_value_as_string": "10",
"max_value": 10,
"max_value_as_string": "10"
},
"message": {
"max_doc": 1,
"doc_count": 1,
"density": 100,
"sum_doc_freq": 4,
"sum_total_term_freq": 4,
"type": "string",
"searchable": true,
"aggregatable": false,
"min_value": "for",
"max_value": "you"
},
"user": {
"max_doc": 1,
"doc_count": 1,
"density": 100,
"sum_doc_freq": 1,
"sum_total_term_freq": 1,
"type": "string",
"searchable": true,
"aggregatable": false,
"min_value": "kimchy",
"max_value": "kimchy"
}
}
}
}
}
--------------------------------------------------
// TESTRESPONSE
<1> The `stack` key means it contains all field stats for the `stack` index.
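One use of the indices level output is to derive a numeric range for a field, for example to pick histogram bounds. The sketch below is only an illustration: it assumes a parsed response dict, the helper name `numeric_range` is made up, and indices where the field is not numeric are simply skipped:

```python
from numbers import Number


def _is_number(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, Number) and not isinstance(value, bool)


def numeric_range(response: dict, field: str):
    """Return (min, max) of `field` across indices, or None if never numeric."""
    lows, highs = [], []
    for index_stats in response.get("indices", {}).values():
        stats = index_stats.get("fields", {}).get(field)
        if stats is None:
            continue
        low, high = stats.get("min_value"), stats.get("max_value")
        if _is_number(low) and _is_number(high):
            lows.append(low)
            highs.append(high)
    if not lows:
        return None
    return min(lows), max(highs)


# Trimmed from the response above: `rating` is a string in one index
# and an integer in the other.
sample = {
    "indices": {
        "github": {"fields": {"rating": {"min_value": "great", "max_value": "project"}}},
        "twitter": {"fields": {"rating": {"min_value": 10, "max_value": 10}}},
    }
}
rating_range = numeric_range(sample, "rating")
```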
[float]
@ -307,8 +333,9 @@ holding questions created in the year 2014:
[source,js]
--------------------------------------------------
curl -XPOST "http://localhost:9200/_field_stats?level=indices" -d '{
"fields" : ["answer_count"] <1>
POST /_field_stats?level=indices
{
"fields" : ["rating"], <1>
"index_constraints" : { <2>
"creation_date" : { <3>
"max_value" : { <4>
@ -319,8 +346,9 @@ curl -XPOST "http://localhost:9200/_field_stats?level=indices" -d '{
}
}
}
}'
}
--------------------------------------------------
// CONSOLE
<1> The fields to compute and return field stats for.
<2> The set of index constraints. Note that index constraints can be defined for fields that aren't defined in the `fields` option.
@ -341,8 +369,9 @@ If missing, the format configured in the field's mapping is used.
[source,js]
--------------------------------------------------
curl -XPOST "http://localhost:9200/_field_stats?level=indices" -d '{
"fields" : ["answer_count"]
POST /_field_stats?level=indices
{
"fields" : ["rating"],
"index_constraints" : {
"creation_date" : {
"max_value" : {
@ -355,7 +384,7 @@ curl -XPOST "http://localhost:9200/_field_stats?level=indices" -d '{
}
}
}
}'
}
--------------------------------------------------
// CONSOLE
<1> Custom date format

@ -7,41 +7,49 @@ example:
[source,js]
--------------------------------------------------
$ curl -XGET 'http://localhost:9200/twitter/tweet/_search' -d '{
GET /twitter/tweet/_search
{
"query" : {
"term" : { "user" : "kimchy" }
}
}
'
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
And here is a sample response:
[source,js]
--------------------------------------------------
{
"_shards":{
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits":{
"total" : 1,
"hits" : [
{
"_index" : "twitter",
"_type" : "tweet",
"_id" : "1",
"_source" : {
"user" : "kimchy",
"postDate" : "2009-11-15T14:12:12",
"message" : "trying out Elasticsearch"
}
"took": 42,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0.2876821,
"hits": [
{
"_index": "twitter",
"_type": "tweet",
"_id": "0",
"_score": 0.2876821,
"_source": {
"user": "kimchy",
"message" : "trying out Elasticsearch",
"date": "2009-11-15T14:12:12",
"likes": 0
}
]
}
}
]
}
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 42/"took": "$body.took"/]
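On the client side, the documents themselves sit under `hits.hits[]._source`. A small sketch, assuming the response was parsed into a Python dict (the helper name `hit_sources` is hypothetical and the sample is trimmed from the response above):

```python
def hit_sources(response: dict) -> list:
    """Collect the `_source` of every returned hit, in response order."""
    return [hit["_source"] for hit in response.get("hits", {}).get("hits", [])]


sample = {
    "took": 42,
    "timed_out": False,
    "hits": {
        "total": 1,
        "max_score": 0.2876821,
        "hits": [
            {
                "_index": "twitter",
                "_type": "tweet",
                "_id": "0",
                "_score": 0.2876821,
                "_source": {"user": "kimchy", "message": "trying out Elasticsearch"},
            }
        ],
    },
}

sources = hit_sources(sample)
```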
[float]
=== Parameters
@ -105,8 +113,10 @@ matching document was found (per shard).
[source,js]
--------------------------------------------------
$ curl -XGET 'http://localhost:9200/_search?q=tag:wow&size=0&terminate_after=1'
GET /_search?q=user:kimchy&size=0&terminate_after=1
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
The response will not contain any hits as the `size` was set to `0`. The
`hits.total` will be either equal to `0`, indicating that there were no
@ -118,22 +128,22 @@ be set to `true` in the response.
[source,js]
--------------------------------------------------
{
"took": 3,
"took": 42,
"timed_out": false,
"terminated_early": true,
"_shards": {
"total": 1,
"successful": 1,
"total": 5,
"successful": 5,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 0,
"max_score": 0.0,
"hits": []
}
}
--------------------------------------------------
// TESTRESPONSE[s/"took": 42/"took": "$body.took"/]
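Because `size=0` suppresses the hits, a caller using this pattern as an existence check only needs `hits.total` and `terminated_early`. A minimal sketch over a parsed response; both helper names are assumptions for illustration:

```python
def any_match(response: dict) -> bool:
    """True if at least one document matched the query."""
    return response["hits"]["total"] > 0


def stopped_early(response: dict) -> bool:
    """True if collection stopped because `terminate_after` was reached."""
    return response.get("terminated_early", False)


sample = {
    "took": 42,
    "timed_out": False,
    "terminated_early": True,
    "hits": {"total": 1, "max_score": 0.0, "hits": []},
}

found = any_match(sample)
early = stopped_early(sample)
```

When `terminated_early` is set, `hits.total` is a lower bound rather than an exact count, so it should not be reported as the number of matches.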
include::request/query.asciidoc[]

@ -2,8 +2,8 @@
=== Inner hits
The <<mapping-parent-field, parent/child>> and <<nested, nested>> features allow the return of documents that
have matches in a different scope. In the parent/child case, parent document are returned based on matches in child
documents or child document are returned based on matches in parent documents. In the nested case, documents are returned
have matches in a different scope. In the parent/child case, parent documents are returned based on matches in child
documents or child documents are returned based on matches in parent documents. In the nested case, documents are returned
based on matches in nested inner objects.
In both cases, the actual matches in the different scopes that caused a document to be returned are hidden. In many cases,
@ -84,18 +84,20 @@ The example below assumes that there is a nested object field defined with the n
[source,js]
--------------------------------------------------
GET /_search
{
"query" : {
"nested" : {
"path" : "comments",
"query" : {
"match" : {"comments.message" : "[actual query]"}
"match" : {"comments.message" : "some message"}
},
"inner_hits" : {} <1>
}
}
}
--------------------------------------------------
// CONSOLE
<1> The inner hit definition in the nested query. No other options need to be defined.
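When the request body is generated by a program, the same structure can be assembled as plain data. A sketch in Python, where the helper name `nested_with_inner_hits` and its parameters are hypothetical:

```python
import json


def nested_with_inner_hits(path: str, field: str, text: str,
                           inner_hits: dict = None) -> dict:
    """Build a nested query body that also requests inner hits."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"match": {field: text}},
                # An empty object is enough to enable inner hits.
                "inner_hits": inner_hits if inner_hits is not None else {},
            }
        }
    }


body = nested_with_inner_hits("comments", "comments.message", "some message")
print(json.dumps(body, indent=2))
```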
@ -157,16 +159,20 @@ with the root hits then the following path can be defined:
[source,js]
--------------------------------------------------
GET /_search
{
"query" : {
"nested" : {
"path" : "comments.votes",
"query" : { ... },
"query" : {
"match": { "name": "kimchy" }
},
"inner_hits" : {}
}
}
}
--------------------------------------------------
// CONSOLE
This indirect referencing is only supported for nested inner hits.
@ -179,18 +185,20 @@ The examples below assumes that there is a `_parent` field mapping in the `comme
[source,js]
--------------------------------------------------
GET /_search
{
"query" : {
"has_child" : {
"type" : "comment",
"query" : {
"match" : {"message" : "[actual query]"}
"match" : {"message" : "some message"}
},
"inner_hits" : {} <1>
}
}
}
--------------------------------------------------
// CONSOLE
<1> The inner hit definition like in the nested example.
@ -224,4 +232,4 @@ An example of a response snippet that could be generated from the above search r
}
},
...
--------------------------------------------------
--------------------------------------------------

@ -38,7 +38,7 @@ should keep the ``search context'' alive (see <<scroll-search-context>>), eg `?s
[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/twitter/tweet/_search?scroll=1m' -d '
GET /twitter/tweet/_search?scroll=1m
{
"query": {
"match" : {
@ -46,8 +46,9 @@ curl -XGET 'localhost:9200/twitter/tweet/_search?scroll=1m' -d '
}
}
}
'
--------------------------------------------------
// CONSOLE
// TEST[setup:twitter]
The result from the above request includes a `_scroll_id`, which should
be passed to the `scroll` API in order to retrieve the next batch of
@ -55,12 +56,11 @@ results.
[source,js]
--------------------------------------------------
curl -XGET <1> 'localhost:9200/_search/scroll' <2> -d'
GET <1> /_search/scroll <2>
{
"scroll" : "1m", <3>
"scroll_id" : "c2Nhbjs2OzM0NDg1ODpzRlBLc0FXNlNyNm5JWUc1" <4>
}
'
--------------------------------------------------
<1> `GET` or `POST` can be used.
@ -94,14 +94,14 @@ order, this is the most efficient option:
[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/_search?scroll=1m' -d '
GET /_search?scroll=1m
{
"sort": [
"_doc"
]
}
'
--------------------------------------------------
// CONSOLE
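The scroll flow as a whole (an initial search with `?scroll=1m`, then repeated calls to `/_search/scroll` with the latest `_scroll_id`) can be sketched as a generator. This is an illustration only: `fetch_next` is a hypothetical callable that sends the given body to the scroll endpoint and returns the parsed response, and stopping on an empty `hits` array is an assumption about how the end of the results is signalled.

```python
from typing import Callable, Iterator


def scroll_hits(first_page: dict,
                fetch_next: Callable[[dict], dict],
                keep_alive: str = "1m") -> Iterator[dict]:
    """Yield every hit, following `_scroll_id` from page to page."""
    page = first_page
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Always send the most recent scroll id, not the first one.
        page = fetch_next({"scroll": keep_alive, "scroll_id": page["_scroll_id"]})


# Stand-in transport with canned pages so the sketch runs without a cluster.
first = {"_scroll_id": "id-1", "hits": {"hits": [{"_id": "0"}, {"_id": "1"}]}}
later_pages = iter([
    {"_scroll_id": "id-2", "hits": {"hits": [{"_id": "2"}]}},
    {"_scroll_id": "id-3", "hits": {"hits": []}},
])
sent = []


def fake_fetch(body: dict) -> dict:
    sent.append(body)
    return next(later_pages)


ids = [hit["_id"] for hit in scroll_hits(first, fake_fetch)]
```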
[[scroll-search-context]]
==== Keeping the search context alive
@ -130,8 +130,9 @@ You can check how many search contexts are open with the
[source,js]
---------------------------------------
curl -XGET localhost:9200/_nodes/stats/indices/search?pretty
GET /_nodes/stats/indices/search?pretty
---------------------------------------
// CONSOLE
==== Clear scroll API
@ -163,8 +164,9 @@ All search contexts can be cleared with the `_all` parameter:
[source,js]
---------------------------------------
curl -XDELETE localhost:9200/_search/scroll/_all
DELETE /_search/scroll/_all
---------------------------------------
// CONSOLE
The `scroll_id` can also be passed as a query string parameter or in the request body.
Multiple scroll IDs can be passed as comma separated values: