350 lines
9.1 KiB
Plaintext
350 lines
9.1 KiB
Plaintext
[[search-request-sort]]
|
|
=== Sort
|
|
|
|
Allows to add one or more sort on specific fields. Each sort can be
|
|
reversed as well. The sort is defined on a per field level, with special
|
|
field name for `_score` to sort by score.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"sort" : [
|
|
{ "post_date" : {"order" : "asc"}},
|
|
"user",
|
|
{ "name" : "desc" },
|
|
{ "age" : "desc" },
|
|
"_score"
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
==== Sort Values
|
|
|
|
The sort values for each document returned are also returned as part of
|
|
the response.
|
|
|
|
==== Sort mode option
|
|
|
|
Elasticsearch supports sorting by array or multi-valued fields. The `mode` option
|
|
controls what array value is picked for sorting the document it belongs
|
|
to. The `mode` option can have the following values:
|
|
|
|
[horizontal]
|
|
`min`:: Pick the lowest value.
|
|
`max`:: Pick the highest value.
|
|
`sum`:: Use the sum of all values as sort value. Only applicable for
|
|
number based array fields.
|
|
`avg`:: Use the average of all values as sort value. Only applicable
|
|
for number based array fields.
|
|
|
|
===== Sort mode example usage
|
|
|
|
In the example below the field price has multiple prices per document.
|
|
In this case the result hits will be sort by price ascending based on
|
|
the average price per document.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
curl -XPOST 'localhost:9200/_search' -d '{
|
|
"query" : {
|
|
...
|
|
},
|
|
"sort" : [
|
|
{"price" : {"order" : "asc", "mode" : "avg"}}
|
|
]
|
|
}'
|
|
--------------------------------------------------
|
|
|
|
==== Sorting within nested objects.
|
|
|
|
Elasticsearch also supports sorting by
|
|
fields that are inside one or more nested objects. The sorting by nested
|
|
field support has the following parameters on top of the already
|
|
existing sort options:
|
|
|
|
`nested_path`::
|
|
Defines the on what nested object to sort. The actual
|
|
sort field must be a direct field inside this nested object. The default
|
|
is to use the most immediate inherited nested object from the sort
|
|
field.
|
|
|
|
`nested_filter`::
|
|
A filter the inner objects inside the nested path
|
|
should match with in order for its field values to be taken into account
|
|
by sorting. Common case is to repeat the query / filter inside the
|
|
nested filter or query. By default no `nested_filter` is active.
|
|
|
|
===== Nested sorting example
|
|
|
|
In the below example `offer` is a field of type `nested`. Because
|
|
`offer` is the closest inherited nested field, it is picked as
|
|
`nested_path`. Only the inner objects that have color blue will
|
|
participate in sorting.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
curl -XPOST 'localhost:9200/_search' -d '{
|
|
"query" : {
|
|
...
|
|
},
|
|
"sort" : [
|
|
{
|
|
"offer.price" : {
|
|
"mode" : "avg",
|
|
"order" : "asc",
|
|
"nested_filter" : {
|
|
"term" : { "offer.color" : "blue" }
|
|
}
|
|
}
|
|
}
|
|
]
|
|
}'
|
|
--------------------------------------------------
|
|
|
|
Nested sorting is also supported when sorting by
|
|
scripts and sorting by geo distance.
|
|
|
|
==== Missing Values
|
|
|
|
The `missing` parameter specifies how docs which are missing
|
|
the field should be treated: The `missing` value can be
|
|
set to `_last`, `_first`, or a custom value (that
|
|
will be used for missing docs as the sort value). For example:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"sort" : [
|
|
{ "price" : {"missing" : "_last"} },
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
NOTE: If a nested inner object doesn't match with
|
|
the `nested_filter` then a missing value is used.
|
|
|
|
==== Ignoring Unmapped Fields
|
|
|
|
coming[1.4.0] Before 1.4.0 there was the `ignore_unmapped` boolean
|
|
parameter, which was not enough information to decide on the sort
|
|
values to emit, and didn't work for cross-index search. It is still
|
|
supported but users are encouraged to migrate to the new
|
|
`unmapped_type` instead.
|
|
|
|
By default, the search request will fail if there is no mapping
|
|
associated with a field. The `unmapped_type` option allows to ignore
|
|
fields that have no mapping and not sort by them. The value of this
|
|
parameter is used to determine what sort values to emit. Here is an
|
|
example of how it can be used:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"sort" : [
|
|
{ "price" : {"unmapped_type" : "long"} },
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
If any of the indices that are queried doesn't have a mapping for `price`
|
|
then Elasticsearch will handle it as if there was a mapping of type
|
|
`long`, with all documents in this index having no value for this field.
|
|
|
|
==== Geo Distance Sorting
|
|
|
|
Allow to sort by `_geo_distance`. Here is an example:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"sort" : [
|
|
{
|
|
"_geo_distance" : {
|
|
"pin.location" : [-70, 40],
|
|
"order" : "asc",
|
|
"unit" : "km"
|
|
}
|
|
}
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
Note: the geo distance sorting supports `sort_mode` options: `min`,
|
|
`max` and `avg`.
|
|
|
|
The following formats are supported in providing the coordinates:
|
|
|
|
===== Lat Lon as Properties
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"sort" : [
|
|
{
|
|
"_geo_distance" : {
|
|
"pin.location" : {
|
|
"lat" : 40,
|
|
"lon" : -70
|
|
},
|
|
"order" : "asc",
|
|
"unit" : "km"
|
|
}
|
|
}
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
===== Lat Lon as String
|
|
|
|
Format in `lat,lon`.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"sort" : [
|
|
{
|
|
"_geo_distance" : {
|
|
"pin.location" : "-70,40",
|
|
"order" : "asc",
|
|
"unit" : "km"
|
|
}
|
|
}
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
===== Geohash
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"sort" : [
|
|
{
|
|
"_geo_distance" : {
|
|
"pin.location" : "drm3btev3e86",
|
|
"order" : "asc",
|
|
"unit" : "km"
|
|
}
|
|
}
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
===== Lat Lon as Array
|
|
|
|
Format in `[lon, lat]`, note, the order of lon/lat here in order to
|
|
conform with http://geojson.org/[GeoJSON].
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"sort" : [
|
|
{
|
|
"_geo_distance" : {
|
|
"pin.location" : [-70, 40],
|
|
"order" : "asc",
|
|
"unit" : "km"
|
|
}
|
|
}
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
|
|
==== Multiple reference points
|
|
|
|
added[1.4.0]
|
|
|
|
Multiple geo points can be passed as an array containing any `geo_point` format, for example
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
"pin.location" : [[-70, 40], [-71, 42]]
|
|
"pin.location" : [{"lat": -70, "lon": 40}, {"lat": -71, "lon": 42}]
|
|
--------------------------------------------------
|
|
|
|
and so forth.
|
|
|
|
The final distance for a document will then be `min`/`max` distance of all points contained in the document to all points given in the sort request.
|
|
|
|
|
|
|
|
==== Script Based Sorting
|
|
|
|
Allow to sort based on custom scripts, here is an example:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"query" : {
|
|
....
|
|
},
|
|
"sort" : {
|
|
"_script" : {
|
|
"script" : "doc['field_name'].value * factor",
|
|
"type" : "number",
|
|
"params" : {
|
|
"factor" : 1.1
|
|
},
|
|
"order" : "asc"
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
Note, it is recommended, for single custom based script based sorting,
|
|
to use `function_score` query instead as sorting based on score is faster.
|
|
|
|
==== Track Scores
|
|
|
|
When sorting on a field, scores are not computed. By setting
|
|
`track_scores` to true, scores will still be computed and tracked.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
{
|
|
"track_scores": true,
|
|
"sort" : [
|
|
{ "post_date" : {"reverse" : true} },
|
|
{ "name" : "desc" },
|
|
{ "age" : "desc" }
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
|
|
==== Memory Considerations
|
|
|
|
When sorting, the relevant sorted field values are loaded into memory.
|
|
This means that per shard, there should be enough memory to contain
|
|
them. For string based types, the field sorted on should not be analyzed
|
|
/ tokenized. For numeric types, if possible, it is recommended to
|
|
explicitly set the type to six_hun types (like `short`, `integer` and
|
|
`float`).
|