[[query-dsl-distance-feature-query]]
=== Distance feature query
++++
<titleabbrev>Distance feature</titleabbrev>
++++

Boosts the <<relevance-scores,relevance score>> of documents closer to a
provided `origin` date or point. For example, you can use this query to give
more weight to documents closer to a certain date or location.

You can use the `distance_feature` query to find the nearest neighbors to a
location. You can also use the query in a <<query-dsl-bool-query,`bool`>>
search's `should` clause to add boosted relevance scores to the `bool` query's
scores.

[[distance-feature-query-ex-request]]
==== Example request

[[distance-feature-index-setup]]
===== Index setup
To use the `distance_feature` query, your index must include a <<date, `date`>>,
<<date_nanos, `date_nanos`>> or <<geo-point,`geo_point`>> field.

To see how you can set up an index for the `distance_feature` query, try the
following example.

. Create an `items` index with the following field mapping:
+
--
* `name`, a <<keyword,`keyword`>> field
* `production_date`, a <<date, `date`>> field
* `location`, a <<geo-point,`geo_point`>> field

[source,js]
----
PUT /items
{
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "production_date": {
        "type": "date"
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}
----
// CONSOLE
// TESTSETUP
--

. Index several documents to this index.
+
--
[source,js]
----
PUT /items/_doc/1?refresh
{
  "name" : "chocolate",
  "production_date": "2018-02-01",
  "location": [-71.34, 41.12]
}

PUT /items/_doc/2?refresh
{
  "name" : "chocolate",
  "production_date": "2018-01-01",
  "location": [-71.3, 41.15]
}


PUT /items/_doc/3?refresh
{
  "name" : "chocolate",
  "production_date": "2017-12-01",
  "location": [-71.3, 41.12]
}
----
// CONSOLE
--

[[distance-feature-query-ex-query]]
===== Example queries

[[distance-feature-query-date-ex]]
====== Boost documents based on date
The following `bool` search returns documents with a `name` value of
`chocolate`. The search also uses the `distance_feature` query to increase the
relevance score of documents with a `production_date` value closer to `now`.

[source,js]
----
GET /items/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "chocolate"
        }
      },
      "should": {
        "distance_feature": {
          "field": "production_date",
          "pivot": "7d",
          "origin": "now"
        }
      }
    }
  }
}
----
// CONSOLE
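
As a rough illustration of how the `should` clause in this search ranks the
three sample documents, the following Python sketch applies the formula from
<<distance-feature-calculation>> to their `production_date` values. It is not
Elasticsearch code. The fixed `ORIGIN` date stands in for `now` and is an
assumption made so that the result is reproducible, and all function and
variable names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: a fixed stand-in for `now`, chosen so the result is reproducible.
ORIGIN = datetime(2018, 2, 8)
PIVOT = timedelta(days=7)  # corresponds to "pivot": "7d" in the request

# production_date values of the three sample documents.
PRODUCTION_DATES = {
    "1": datetime(2018, 2, 1),
    "2": datetime(2018, 1, 1),
    "3": datetime(2017, 12, 1),
}


def date_feature_score(doc_date, origin, pivot, boost=1.0):
    """Return boost * pivot / (pivot + distance) for a date value."""
    distance = abs(origin - doc_date)  # absolute time difference
    # Dividing one timedelta by another yields a plain float.
    return boost * (pivot / (pivot + distance))


date_scores = {
    doc_id: date_feature_score(date, ORIGIN, PIVOT)
    for doc_id, date in PRODUCTION_DATES.items()
}
# Document IDs ordered from highest to lowest distance_feature score.
date_ranking = sorted(date_scores, key=date_scores.get, reverse=True)
```

Under these assumptions, document `1` lies exactly one pivot (7 days) from the
origin, so its `distance_feature` score is half of the `boost`. In the `bool`
search, this value is added to the score produced by the `must` clause.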

[[distance-feature-query-distance-ex]]
====== Boost documents based on location
The following `bool` search returns documents with a `name` value of
`chocolate`. The search also uses the `distance_feature` query to increase the
relevance score of documents with a `location` value closer to `[-71.3, 41.15]`.

[source,js]
----
GET /items/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "chocolate"
        }
      },
      "should": {
        "distance_feature": {
          "field": "location",
          "pivot": "1000m",
          "origin": [-71.3, 41.15]
        }
      }
    }
  }
}
----
// CONSOLE
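
To get a feel for what this search does with the sample documents, the
following Python sketch computes a great-circle distance for each `location`
value and applies the formula from <<distance-feature-calculation>>. It is an
approximation under stated assumptions, not the Elasticsearch implementation:
the haversine formula and the mean Earth radius used here are assumptions, and
all names are hypothetical. Array values are read as `[lon, lat]`, the order
used for <<geo-point,`geo_point`>> arrays.

```python
import math

# Assumption: mean Earth radius in meters, used only for this illustration.
EARTH_RADIUS_M = 6371008.8


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two lon/lat points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def geo_feature_score(doc_point, origin, pivot_m, boost=1.0):
    """Return boost * pivot / (pivot + distance) for [lon, lat] points."""
    distance = haversine_m(doc_point[0], doc_point[1], origin[0], origin[1])
    return boost * pivot_m / (pivot_m + distance)


# location values of the three sample documents, as (lon, lat).
LOCATIONS = {
    "1": (-71.34, 41.12),
    "2": (-71.3, 41.15),
    "3": (-71.3, 41.12),
}
ORIGIN = (-71.3, 41.15)
PIVOT_M = 1000.0  # corresponds to "pivot": "1000m" in the request

geo_scores = {
    doc_id: geo_feature_score(point, ORIGIN, PIVOT_M)
    for doc_id, point in LOCATIONS.items()
}
# Document IDs ordered from highest to lowest distance_feature score.
geo_ranking = sorted(geo_scores, key=geo_scores.get, reverse=True)
```

Document `2` sits at the origin, so its distance is zero and it receives the
full `boost`. Documents `3` and `1` are each more than three kilometers away,
well beyond the 1000 m pivot, so their scores fall below half of the `boost`.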

[[distance-feature-top-level-params]]
==== Top-level parameters for `distance_feature`
`field`::
(Required, string) Name of the field used to calculate distances. This field
must meet the following criteria:

* Be a <<date, `date`>>, <<date_nanos, `date_nanos`>> or
<<geo-point,`geo_point`>> field
* Have an <<mapping-index,`index`>> mapping parameter value of `true`, which is
the default
* Have an <<doc-values,`doc_values`>> mapping parameter value of `true`, which
is the default

`origin`::
+
--
(Required, string) Date or point of origin used to calculate distances.

If the `field` value is a <<date, `date`>> or <<date_nanos, `date_nanos`>>
field, the `origin` value must be a <<date-format-pattern,date>>.
<<date-math,Date Math>>, such as `now-1h`, is supported.

If the `field` value is a <<geo-point,`geo_point`>> field, the `origin` value
must be a geopoint.
--

`pivot`::
+
--
(Required, <<time-units,time unit>> or <<distance-units,distance unit>>)
Distance from the `origin` at which relevance scores receive half of the `boost`
value.

If the `field` value is a <<date, `date`>> or <<date_nanos, `date_nanos`>>
field, the `pivot` value must be a <<time-units,time unit>>, such as `1h` or
`10d`.

If the `field` value is a <<geo-point,`geo_point`>> field, the `pivot` value
must be a <<distance-units,distance unit>>, such as `1km` or `12m`.
--

`boost`::
+
--
(Optional, float) Floating point number used to multiply the
<<relevance-scores,relevance score>> of matching documents. This value
cannot be negative. Defaults to `1.0`.
--

[[distance-feature-notes]]
==== Notes

[[distance-feature-calculation]]
===== How the `distance_feature` query calculates relevance scores
The `distance_feature` query dynamically calculates the distance between the
`origin` value and a document's field values. It then uses this distance as a
feature to boost the <<relevance-scores,relevance score>> of closer
documents.

The `distance_feature` query calculates a document's
<<relevance-scores,relevance score>> as follows:

```
relevance score = boost * pivot / (pivot + distance)
```

The `distance` is the absolute difference between the `origin` value and a
document's field value.
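
As a minimal sketch of this formula, the following Python function shows how
the score falls off as the distance grows. The function is a hypothetical
helper, not part of Elasticsearch. It assumes that `distance` and `pivot` are
plain numbers expressed in the same unit, such as meters or milliseconds, and
its input checks are assumptions added to keep the arithmetic well defined.

```python
def distance_feature_score(distance, pivot, boost=1.0):
    """Return boost * pivot / (pivot + distance)."""
    # Assumption: reject inputs that would make the formula meaningless.
    if pivot <= 0:
        raise ValueError("pivot must be positive")
    if distance < 0:
        raise ValueError("distance is an absolute difference and cannot be negative")
    if boost < 0:
        raise ValueError("boost cannot be negative")
    return boost * pivot / (pivot + distance)


PIVOT_METERS = 1000.0

# Distance of zero: the document receives the full boost.
at_origin = distance_feature_score(0.0, PIVOT_METERS)
# Distance equal to the pivot: the document receives half of the boost.
at_pivot = distance_feature_score(1000.0, PIVOT_METERS)
# Distance of three pivots with boost 2.0: a quarter of the boost.
far_away = distance_feature_score(3000.0, PIVOT_METERS, boost=2.0)
```

The score never reaches zero. It only approaches zero as the distance becomes
much larger than the `pivot`, so distant documents still receive a small boost.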

[[distance-feature-skip-hits]]
===== Skip non-competitive hits
Unlike the <<query-dsl-function-score-query,`function_score`>> query or other
ways to change <<relevance-scores,relevance scores>>, the
`distance_feature` query efficiently skips non-competitive hits when the
<<search-uri-request,`track_total_hits`>> parameter is **not** `true`.