[[search-aggregations-bucket-geohashgrid-aggregation]]
=== GeoHash grid Aggregation

A multi-bucket aggregation that works on `geo_point` fields and groups points into buckets that represent cells in a grid.
The resulting grid can be sparse and only contains cells that have matching data. Each cell is labeled using a http://en.wikipedia.org/wiki/Geohash[geohash] of user-definable precision.

* High precision geohashes have a long string length and represent cells that cover only a small area.
* Low precision geohashes have a short string length and represent cells that each cover a large area.

Geohashes used in this aggregation can have a precision between 1 and 12.

WARNING: The highest-precision geohash of length 12 produces cells that cover less than a square metre of land, so high-precision requests can be very costly in terms of RAM and result sizes.
Please see the example below on how to first filter the aggregation to a smaller geographic area before requesting high levels of detail.

The specified field must be of type `geo_point` (which can only be set explicitly in the mappings). It can also hold an array of `geo_point` values, in which case all points are taken into account during aggregation.

==== Simple low-precision request

[source,js]
--------------------------------------------------
PUT /museums
{
    "mappings": {
        "_doc": {
            "properties": {
                "location": {
                    "type": "geo_point"
                }
            }
        }
    }
}

POST /museums/_doc/_bulk?refresh
{"index":{"_id":1}}
{"location": "52.374081,4.912350", "name": "NEMO Science Museum"}
{"index":{"_id":2}}
{"location": "52.369219,4.901618", "name": "Museum Het Rembrandthuis"}
{"index":{"_id":3}}
{"location": "52.371667,4.914722", "name": "Nederlands Scheepvaartmuseum"}
{"index":{"_id":4}}
{"location": "51.222900,4.405200", "name": "Letterenhuis"}
{"index":{"_id":5}}
{"location": "48.861111,2.336389", "name": "Musée du Louvre"}
{"index":{"_id":6}}
{"location": "48.860000,2.327000", "name": "Musée d'Orsay"}

POST /museums/_search?size=0
{
    "aggregations" : {
        "large-grid" : {
            "geohash_grid" : {
                "field" : "location",
                "precision" : 3
            }
        }
    }
}
--------------------------------------------------
// CONSOLE

Response:

[source,js]
--------------------------------------------------
{
    ...
    "aggregations": {
        "large-grid": {
            "buckets": [
                {
                    "key": "u17",
                    "doc_count": 3
                },
                {
                    "key": "u09",
                    "doc_count": 2
                },
                {
                    "key": "u15",
                    "doc_count": 1
                }
            ]
        }
    }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
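
The bucket keys in this response are the three-character geohashes of the indexed points. As an illustration only, the sketch below implements the standard geohash encoding (bits alternate between longitude and latitude, five bits per base-32 character). The function name `encodeGeohash` is an assumption made for this example and is not part of Elasticsearch; the aggregation computes these keys internally.

```javascript
// Sketch of the standard geohash encoding; not Elasticsearch's own code.
const BASE32 = '0123456789bcdefghjkmnpqrstuvwxyz';

function encodeGeohash(lat, lon, precision) {
  const latRange = [-90, 90];
  const lonRange = [-180, 180];
  let hash = '';
  let bits = 0;      // number of bits collected for the current character
  let value = 0;     // numeric value of the current character
  let useLon = true; // bits alternate, starting with longitude

  while (hash.length < precision) {
    const range = useLon ? lonRange : latRange;
    const coord = useLon ? lon : lat;
    const mid = (range[0] + range[1]) / 2;
    if (coord >= mid) {
      value = value * 2 + 1; // upper half of the range: bit 1
      range[0] = mid;
    } else {
      value = value * 2;     // lower half of the range: bit 0
      range[1] = mid;
    }
    useLon = !useLon;
    bits += 1;
    if (bits === 5) {        // five bits make one base-32 character
      hash += BASE32[value];
      bits = 0;
      value = 0;
    }
  }
  return hash;
}

console.log(encodeGeohash(52.374081, 4.912350, 3)); // NEMO Science Museum -> u17
console.log(encodeGeohash(51.222900, 4.405200, 3)); // Letterenhuis -> u15
console.log(encodeGeohash(48.861111, 2.336389, 3)); // Musée du Louvre -> u09
```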
==== High-precision requests

When requesting detailed buckets (typically for displaying a "zoomed in" map), a filter like <<query-dsl-geo-bounding-box-query,geo_bounding_box>> should be applied to narrow the subject area; otherwise, potentially millions of buckets will be created and returned.

[source,js]
--------------------------------------------------
POST /museums/_search?size=0
{
    "aggregations" : {
        "zoomed-in" : {
            "filter" : {
                "geo_bounding_box" : {
                    "location" : {
                        "top_left" : "52.4, 4.9",
                        "bottom_right" : "52.3, 5.0"
                    }
                }
            },
            "aggregations":{
                "zoom1":{
                    "geohash_grid" : {
                        "field": "location",
                        "precision": 8
                    }
                }
            }
        }
    }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]

The geohashes returned by the `geohash_grid` aggregation can also be used for zooming in. To zoom into the
first geohash `u17` returned in the previous example, specify it as both the `top_left` and the `bottom_right` corner:

[source,js]
--------------------------------------------------
POST /museums/_search?size=0
{
    "aggregations" : {
        "zoomed-in" : {
            "filter" : {
                "geo_bounding_box" : {
                    "location" : {
                        "top_left" : "u17",
                        "bottom_right" : "u17"
                    }
                }
            },
            "aggregations":{
                "zoom1":{
                    "geohash_grid" : {
                        "field": "location",
                        "precision": 8
                    }
                }
            }
        }
    }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]

[source,js]
--------------------------------------------------
{
    ...
    "aggregations" : {
        "zoomed-in" : {
            "doc_count" : 3,
            "zoom1" : {
                "buckets" : [
                    {
                        "key" : "u173zy3j",
                        "doc_count" : 1
                    },
                    {
                        "key" : "u173zvfz",
                        "doc_count" : 1
                    },
                    {
                        "key" : "u173zt90",
                        "doc_count" : 1
                    }
                ]
            }
        }
    }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]

For "zooming in" on systems that don't support geohashes, the bucket keys should be translated into bounding boxes using
one of the available geohash libraries. For example, in JavaScript the https://github.com/sunng87/node-geohash[node-geohash] library
can be used:

[source,js]
--------------------------------------------------
var geohash = require('ngeohash');

// bbox will contain [ 52.03125, 4.21875, 53.4375, 5.625 ]
// [ minlat, minlon, maxlat, maxlon]
var bbox = geohash.decode_bbox('u17');
--------------------------------------------------
// NOTCONSOLE
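
If adding a library is not an option, the same translation can be sketched in a few lines. The example below is a dependency-free approximation of a bounding-box decoder, plus a hypothetical helper, `bucketToBoundingBox`, that turns a bucket key into a `geo_bounding_box` filter body. Both function names are assumptions made for this illustration and are not part of node-geohash or Elasticsearch.

```javascript
// Sketch of standard geohash decoding; not taken from any particular library.
const BASE32 = '0123456789bcdefghjkmnpqrstuvwxyz';

// Decode a geohash into [minlat, minlon, maxlat, maxlon].
function decodeBbox(hash) {
  const lat = [-90, 90];
  const lon = [-180, 180];
  let useLon = true; // bits alternate, starting with longitude
  for (const ch of hash) {
    const value = BASE32.indexOf(ch);
    if (value < 0) {
      throw new Error('invalid geohash character: ' + ch);
    }
    // Each character carries five bits, most significant first.
    for (let bit = 4; bit >= 0; bit--) {
      const range = useLon ? lon : lat;
      const mid = (range[0] + range[1]) / 2;
      if ((value >> bit) & 1) {
        range[0] = mid; // bit 1: keep the upper half
      } else {
        range[1] = mid; // bit 0: keep the lower half
      }
      useLon = !useLon;
    }
  }
  return [lat[0], lon[0], lat[1], lon[1]];
}

// Hypothetical helper: build a geo_bounding_box filter for one bucket key.
function bucketToBoundingBox(field, key) {
  const [minLat, minLon, maxLat, maxLon] = decodeBbox(key);
  return {
    geo_bounding_box: {
      [field]: {
        top_left: { lat: maxLat, lon: minLon },
        bottom_right: { lat: minLat, lon: maxLon }
      }
    }
  };
}

console.log(decodeBbox('u17')); // [ 52.03125, 4.21875, 53.4375, 5.625 ]
console.log(JSON.stringify(bucketToBoundingBox('location', 'u17'), null, 2));
```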
==== Cell dimensions at the equator
The table below shows the metric dimensions for cells covered by various string lengths of geohash.
Cell dimensions vary with latitude and so the table is for the worst-case scenario at the equator.

[horizontal]
*GeoHash length*:: *Area width x height*
1:: 5,009.4km x 4,992.6km
2:: 1,252.3km x 624.1km
3:: 156.5km x 156km
4:: 39.1km x 19.5km
5:: 4.9km x 4.9km
6:: 1.2km x 609.4m
7:: 152.9m x 152.4m
8:: 38.2m x 19m
9:: 4.8m x 4.8m
10:: 1.2m x 59.5cm
11:: 14.9cm x 14.9cm
12:: 3.7cm x 1.9cm

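
The figures in this table follow from how a geohash splits its bits: each character adds five bits, which alternate between longitude and latitude starting with longitude. The sketch below derives the cell size in degrees for a given precision and an approximate width in kilometres at the equator. The function names and the equatorial circumference constant (40,075.017 km) are assumptions made for this illustration; the table's height column appears to use a slightly different conversion, so only the width is converted here.

```javascript
// Assumed equatorial circumference of the Earth in km (illustrative constant).
const EQUATOR_KM = 40075.017;

// Cell size in degrees for a geohash of the given string length.
function cellSizeDegrees(precision) {
  if (!Number.isInteger(precision) || precision < 1 || precision > 12) {
    throw new RangeError('precision must be an integer in [1,12]');
  }
  const totalBits = 5 * precision;
  const lonBits = Math.ceil(totalBits / 2);  // longitude gets the first bit
  const latBits = Math.floor(totalBits / 2);
  return {
    lonDeg: 360 / Math.pow(2, lonBits),
    latDeg: 180 / Math.pow(2, latBits)
  };
}

// Approximate east-west cell width at the equator, in km.
function cellWidthKmAtEquator(precision) {
  return (cellSizeDegrees(precision).lonDeg / 360) * EQUATOR_KM;
}

for (let p = 1; p <= 12; p++) {
  const { lonDeg, latDeg } = cellSizeDegrees(p);
  console.log(p, lonDeg, latDeg, cellWidthKmAtEquator(p));
}
```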

==== Options

[horizontal]
field:: Mandatory. The name of the field indexed with GeoPoints.

precision:: Optional. The string length of the geohashes used to define
cells/buckets in the results. Defaults to 5.
The precision can be defined in terms of the integer
precision levels mentioned above. Values outside of [1,12] will
be rejected.
Alternatively, the precision level can be approximated from a
distance measure like "1km", "10m". The precision level is
calculated such that cells will not exceed the specified
size (diagonal) of the required precision. When this would lead
to precision levels higher than the supported 12 levels
(e.g. for distances <5.6cm), the value is rejected.

size:: Optional. The maximum number of geohash buckets to return
(defaults to 10,000). When results are trimmed, buckets are
prioritised based on the volumes of documents they contain.

shard_size:: Optional. To allow for more accurate counting of the top cells
returned in the final result, the aggregation defaults to
returning `max(10,(size x number-of-shards))` buckets from each
shard. If this heuristic is undesirable, the number considered
from each shard can be overridden using this parameter.
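
To make the `shard_size` default concrete, the sketch below evaluates the `max(10,(size x number-of-shards))` formula quoted above and builds a request body that sets `size` and `shard_size` explicitly. The helper names, the aggregation name `grid`, and the figure of five shards are assumptions made for this illustration.

```javascript
// Default number of buckets requested from each shard, per the formula above.
function defaultShardSize(size, numberOfShards) {
  return Math.max(10, size * numberOfShards);
}

// Hypothetical helper: build a search body with explicit size and shard_size.
function geohashGridRequest(field, precision, size, shardSize) {
  return {
    aggregations: {
      grid: {
        geohash_grid: {
          field: field,
          precision: precision,
          size: size,
          shard_size: shardSize
        }
      }
    }
  };
}

const size = 100;
const shards = 5; // assumed shard count for this example
const body = geohashGridRequest('location', 5, size, defaultShardSize(size, shards));
console.log(JSON.stringify(body, null, 2));
```

Passing a smaller `shard_size` than the default trades some counting accuracy for less data transferred from each shard.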