452 lines
11 KiB
Plaintext
452 lines
11 KiB
Plaintext
[[search-request-sort]]
|
|
=== Sort
|
|
|
|
Allows to add one or more sort on specific fields. Each sort can be
|
|
reversed as well. The sort is defined on a per field level, with special
|
|
field name for `_score` to sort by score, and `_doc` to sort by index order.
|
|
|
|
Assuming the following index mapping:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
PUT /my_index
|
|
{
|
|
"mappings": {
|
|
"my_type": {
|
|
"properties": {
|
|
"post_date": { "type": "date" },
|
|
"user": {
|
|
"type": "keyword"
|
|
},
|
|
"name": {
|
|
"type": "keyword"
|
|
},
|
|
"age": { "type": "integer" }
|
|
}
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /my_index/my_type/_search
|
|
{
|
|
"sort" : [
|
|
{ "post_date" : {"order" : "asc"}},
|
|
"user",
|
|
{ "name" : "desc" },
|
|
{ "age" : "desc" },
|
|
"_score"
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
// TEST[continued]
|
|
|
|
NOTE: `_doc` has no real use-case besides being the most efficient sort order.
|
|
So if you don't care about the order in which documents are returned, then you
|
|
should sort by `_doc`. This especially helps when <<search-request-scroll,scrolling>>.
|
|
|
|
==== Sort Values
|
|
|
|
The sort values for each document returned are also returned as part of
|
|
the response.
|
|
|
|
==== Sort Order
|
|
|
|
The `order` option can have the following values:
|
|
|
|
[horizontal]
|
|
`asc`:: Sort in ascending order
|
|
`desc`:: Sort in descending order
|
|
|
|
The order defaults to `desc` when sorting on the `_score`, and defaults
|
|
to `asc` when sorting on anything else.
|
|
|
|
==== Sort mode option
|
|
|
|
Elasticsearch supports sorting by array or multi-valued fields. The `mode` option
|
|
controls what array value is picked for sorting the document it belongs
|
|
to. The `mode` option can have the following values:
|
|
|
|
[horizontal]
|
|
`min`:: Pick the lowest value.
|
|
`max`:: Pick the highest value.
|
|
`sum`:: Use the sum of all values as sort value. Only applicable for
|
|
number based array fields.
|
|
`avg`:: Use the average of all values as sort value. Only applicable
|
|
for number based array fields.
|
|
`median`:: Use the median of all values as sort value. Only applicable
|
|
for number based array fields.
|
|
|
|
===== Sort mode example usage
|
|
|
|
In the example below the field price has multiple prices per document.
|
|
In this case the result hits will be sorted by price ascending based on
|
|
the average price per document.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
PUT /my_index/my_type/1?refresh
|
|
{
|
|
"product": "chocolate",
|
|
"price": [20, 4]
|
|
}
|
|
|
|
POST /_search
|
|
{
|
|
"query" : {
|
|
"term" : { "product" : "chocolate" }
|
|
},
|
|
"sort" : [
|
|
{"price" : {"order" : "asc", "mode" : "avg"}}
|
|
]
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
[[nested-sorting]]
|
|
==== Sorting within nested objects.
|
|
|
|
Elasticsearch also supports sorting by
|
|
fields that are inside one or more nested objects. The sorting by nested
|
|
field support has the following parameters on top of the already
|
|
existing sort options:
|
|
|
|
`nested_path`::
|
|
Defines on which nested object to sort. The actual
|
|
sort field must be a direct field inside this nested object.
|
|
When sorting by nested field, this field is mandatory.
|
|
|
|
`nested_filter`::
|
|
A filter that the inner objects inside the nested path
|
|
should match with in order for its field values to be taken into account
|
|
by sorting. Common case is to repeat the query / filter inside the
|
|
nested filter or query. By default no `nested_filter` is active.
|
|
|
|
===== Nested sorting example
|
|
|
|
In the below example `offer` is a field of type `nested`.
|
|
The `nested_path` needs to be specified; otherwise, elasticsearch doesn't know on what nested level sort values need to be captured.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
POST /_search
|
|
{
|
|
"query" : {
|
|
"term" : { "product" : "chocolate" }
|
|
},
|
|
"sort" : [
|
|
{
|
|
"offer.price" : {
|
|
"mode" : "avg",
|
|
"order" : "asc",
|
|
"nested_path" : "offer",
|
|
"nested_filter" : {
|
|
"term" : { "offer.color" : "blue" }
|
|
}
|
|
}
|
|
}
|
|
]
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
Nested sorting is also supported when sorting by
|
|
scripts and sorting by geo distance.
|
|
|
|
==== Missing Values
|
|
|
|
The `missing` parameter specifies how docs which are missing
|
|
the field should be treated: The `missing` value can be
|
|
set to `_last`, `_first`, or a custom value (that
|
|
will be used for missing docs as the sort value).
|
|
The default is `_last`.
|
|
|
|
For example:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_search
|
|
{
|
|
"sort" : [
|
|
{ "price" : {"missing" : "_last"} }
|
|
],
|
|
"query" : {
|
|
"term" : { "product" : "chocolate" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
NOTE: If a nested inner object doesn't match with
|
|
the `nested_filter` then a missing value is used.
|
|
|
|
==== Ignoring Unmapped Fields
|
|
|
|
By default, the search request will fail if there is no mapping
|
|
associated with a field. The `unmapped_type` option allows to ignore
|
|
fields that have no mapping and not sort by them. The value of this
|
|
parameter is used to determine what sort values to emit. Here is an
|
|
example of how it can be used:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_search
|
|
{
|
|
"sort" : [
|
|
{ "price" : {"unmapped_type" : "long"} }
|
|
],
|
|
"query" : {
|
|
"term" : { "product" : "chocolate" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
If any of the indices that are queried doesn't have a mapping for `price`
|
|
then Elasticsearch will handle it as if there was a mapping of type
|
|
`long`, with all documents in this index having no value for this field.
|
|
|
|
[[geo-sorting]]
|
|
==== Geo Distance Sorting
|
|
|
|
Allow to sort by `_geo_distance`. Here is an example, assuming `pin.location` is a field of type `geo_point`:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_search
|
|
{
|
|
"sort" : [
|
|
{
|
|
"_geo_distance" : {
|
|
"pin.location" : [-70, 40],
|
|
"order" : "asc",
|
|
"unit" : "km",
|
|
"mode" : "min",
|
|
"distance_type" : "sloppy_arc"
|
|
}
|
|
}
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
|
|
|
|
`distance_type`::
|
|
|
|
How to compute the distance. Can either be `sloppy_arc` (default), `arc` (slightly more precise but significantly slower) or `plane` (faster, but inaccurate on long distances and close to the poles).
|
|
|
|
`mode`::
|
|
|
|
What to do in case a field has several geo points. By default, the shortest
|
|
distance is taken into account when sorting in ascending order and the
|
|
longest distance when sorting in descending order. Supported values are
|
|
`min`, `max`, `median` and `avg`.
|
|
|
|
`unit`::
|
|
|
|
The unit to use when computing sort values. The default is `m` (meters).
|
|
|
|
NOTE: geo distance sorting does not support configurable missing values: the
|
|
distance will always be considered equal to +Infinity+ when a document does not
|
|
have values for the field that is used for distance computation.
|
|
|
|
The following formats are supported in providing the coordinates:
|
|
|
|
===== Lat Lon as Properties
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_search
|
|
{
|
|
"sort" : [
|
|
{
|
|
"_geo_distance" : {
|
|
"pin.location" : {
|
|
"lat" : 40,
|
|
"lon" : -70
|
|
},
|
|
"order" : "asc",
|
|
"unit" : "km"
|
|
}
|
|
}
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
===== Lat Lon as String
|
|
|
|
Format in `lat,lon`.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_search
|
|
{
|
|
"sort" : [
|
|
{
|
|
"_geo_distance" : {
|
|
"pin.location" : "40,-70",
|
|
"order" : "asc",
|
|
"unit" : "km"
|
|
}
|
|
}
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
===== Geohash
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_search
|
|
{
|
|
"sort" : [
|
|
{
|
|
"_geo_distance" : {
|
|
"pin.location" : "drm3btev3e86",
|
|
"order" : "asc",
|
|
"unit" : "km"
|
|
}
|
|
}
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
===== Lat Lon as Array
|
|
|
|
Format in `[lon, lat]`, note, the order of lon/lat here in order to
|
|
conform with http://geojson.org/[GeoJSON].
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_search
|
|
{
|
|
"sort" : [
|
|
{
|
|
"_geo_distance" : {
|
|
"pin.location" : [-70, 40],
|
|
"order" : "asc",
|
|
"unit" : "km"
|
|
}
|
|
}
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
|
|
==== Multiple reference points
|
|
|
|
Multiple geo points can be passed as an array containing any `geo_point` format, for example
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_search
|
|
{
|
|
"sort" : [
|
|
{
|
|
"_geo_distance" : {
|
|
"pin.location" : [[-70, 40], [-71, 42]],
|
|
"order" : "asc",
|
|
"unit" : "km"
|
|
}
|
|
}
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
and so forth.
|
|
|
|
The final distance for a document will then be `min`/`max`/`avg` (defined via `mode`) distance of all points contained in the document to all points given in the sort request.
|
|
|
|
|
|
|
|
==== Script Based Sorting
|
|
|
|
Allow to sort based on custom scripts, here is an example:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_search
|
|
{
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
},
|
|
"sort" : {
|
|
"_script" : {
|
|
"type" : "number",
|
|
"script" : {
|
|
"lang": "painless",
|
|
"inline": "doc['field_name'].value * params.factor",
|
|
"params" : {
|
|
"factor" : 1.1
|
|
}
|
|
},
|
|
"order" : "asc"
|
|
}
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
|
|
==== Track Scores
|
|
|
|
When sorting on a field, scores are not computed. By setting
|
|
`track_scores` to true, scores will still be computed and tracked.
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_search
|
|
{
|
|
"track_scores": true,
|
|
"sort" : [
|
|
{ "post_date" : {"order" : "desc"} },
|
|
{ "name" : "desc" },
|
|
{ "age" : "desc" }
|
|
],
|
|
"query" : {
|
|
"term" : { "user" : "kimchy" }
|
|
}
|
|
}
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
|
|
==== Memory Considerations
|
|
|
|
When sorting, the relevant sorted field values are loaded into memory.
|
|
This means that per shard, there should be enough memory to contain
|
|
them. For string based types, the field sorted on should not be analyzed
|
|
/ tokenized. For numeric types, if possible, it is recommended to
|
|
explicitly set the type to narrower types (like `short`, `integer` and
|
|
`float`).
|