[[search-aggregations-bucket-geohashgrid-aggregation]]
=== GeoHash grid Aggregation
A multi-bucket aggregation that works on `geo_point` fields and groups points into buckets that represent cells in a grid.
The resulting grid can be sparse and only contains cells that have matching data. Each cell is labeled using a http://en.wikipedia.org/wiki/Geohash[geohash] which is of user-definable precision.
* High precision geohashes have a long string length and represent cells that cover only a small area.
* Low precision geohashes have a short string length and represent cells that each cover a large area.
Geohashes used in this aggregation can have a precision between 1 and 12.
WARNING: The highest-precision geohash of length 12 produces cells that cover less than a square metre of land, so high-precision requests can be very costly in terms of RAM and result sizes.
Please see the example below on how to first filter the aggregation to a smaller geographic area before requesting high levels of detail.
The specified field must be of type `geo_point` (which can only be set explicitly in the mappings). It can also hold an array of `geo_point` values, in which case all points are taken into account during aggregation.
==== Simple low-precision request
[source,js]
--------------------------------------------------
PUT /museums
{
    "mappings": {
        "doc": {
            "properties": {
                "location": {
                    "type": "geo_point"
                }
            }
        }
    }
}

POST /museums/doc/_bulk?refresh
{"index":{"_id":1}}
{"location": "52.374081,4.912350", "name": "NEMO Science Museum"}
{"index":{"_id":2}}
{"location": "52.369219,4.901618", "name": "Museum Het Rembrandthuis"}
{"index":{"_id":3}}
{"location": "52.371667,4.914722", "name": "Nederlands Scheepvaartmuseum"}
{"index":{"_id":4}}
{"location": "51.222900,4.405200", "name": "Letterenhuis"}
{"index":{"_id":5}}
{"location": "48.861111,2.336389", "name": "Musée du Louvre"}
{"index":{"_id":6}}
{"location": "48.860000,2.327000", "name": "Musée d'Orsay"}

POST /museums/_search?size=0
{
    "aggregations" : {
        "large-grid" : {
            "geohash_grid" : {
                "field" : "location",
                "precision" : 3
            }
        }
    }
}
--------------------------------------------------
// CONSOLE
Response:
[source,js]
--------------------------------------------------
{
    ...
    "aggregations": {
        "large-grid": {
            "buckets": [
                {
                    "key": "u17",
                    "doc_count": 3
                },
                {
                    "key": "u09",
                    "doc_count": 2
                },
                {
                    "key": "u15",
                    "doc_count": 1
                }
            ]
        }
    }
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
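
The bucket keys in the response above can be reproduced by hand. The following is a minimal sketch of the standard geohash encoding, offered for illustration only; it is not Elasticsearch's implementation, and `geohash_encode` is a hypothetical helper name. It encodes the six museum locations at precision 3 and counts the points per cell.

```python
from collections import Counter

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard geohash alphabet


def geohash_encode(lat, lon, precision):
    """Encode a lat/lon pair as a geohash string of `precision` characters."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate between the axes, starting with longitude
    while len(chars) < precision:
        rng, value = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if value >= mid:
            bits = (bits << 1) | 1
            rng[0] = mid  # keep the upper half of the interval
        else:
            bits <<= 1
            rng[1] = mid  # keep the lower half of the interval
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)


# (lat, lon) pairs copied from the bulk request above
museums = [
    (52.374081, 4.912350),
    (52.369219, 4.901618),
    (52.371667, 4.914722),
    (51.222900, 4.405200),
    (48.861111, 2.336389),
    (48.860000, 2.327000),
]

buckets = Counter(geohash_encode(lat, lon, 3) for lat, lon in museums)
print(buckets.most_common())  # [('u17', 3), ('u09', 2), ('u15', 1)]
```

Each additional character subdivides a cell into 32 smaller cells, which is why longer geohashes describe smaller areas.
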
==== High-precision requests
When requesting detailed buckets (typically for displaying a "zoomed in" map), a filter like <<query-dsl-geo-bounding-box-query,geo_bounding_box>> should be applied to narrow the subject area; otherwise, potentially millions of buckets will be created and returned.
[source,js]
--------------------------------------------------
POST /museums/_search?size=0
{
    "aggregations" : {
        "zoomed-in" : {
            "filter" : {
                "geo_bounding_box" : {
                    "location" : {
                        "top_left" : "52.4, 4.9",
                        "bottom_right" : "52.3, 5.0"
                    }
                }
            },
            "aggregations":{
                "zoom1":{
                    "geohash_grid" : {
                        "field": "location",
                        "precision": 8
                    }
                }
            }
        }
    }
}
--------------------------------------------------
// CONSOLE
// TEST[continued]
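
To get a feel for why the filter matters, the rough sketch below compares the number of cells in the whole grid with an upper bound on the cells that can intersect the bounding box used above. It assumes the usual geohash layout, in which each character adds 5 bits that alternate between longitude and latitude starting with longitude; the function names are hypothetical and the result is only an estimate of the possible bucket count, not what Elasticsearch will actually return.

```python
import math


def grid_bits(precision):
    """Return (longitude bits, latitude bits) for a geohash of this length."""
    total = 5 * precision
    return (total + 1) // 2, total // 2


def world_cells(precision):
    """Total number of cells in the grid at this precision."""
    return 32 ** precision


def max_cells_in_box(top, left, bottom, right, precision):
    """Upper bound on the number of cells that a bounding box can intersect."""
    lon_bits, lat_bits = grid_bits(precision)
    cell_width = 360.0 / 2 ** lon_bits   # degrees of longitude per cell
    cell_height = 180.0 / 2 ** lat_bits  # degrees of latitude per cell
    # +1 because a box that is not aligned to the grid can straddle one extra cell
    cols = math.ceil((right - left) / cell_width) + 1
    rows = math.ceil((top - bottom) / cell_height) + 1
    return cols * rows


print(world_cells(8))                             # 1099511627776
print(max_cells_in_box(52.4, 4.9, 52.3, 5.0, 8))  # bound for the filtered area
```

Only cells that contain matching documents become buckets, so real responses are usually far smaller than these bounds; the point is that the unfiltered bound grows by a factor of 32 with every extra character of precision.
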
==== Cell dimensions at the equator
The table below shows the metric dimensions of the cells covered by geohashes of various string lengths.
Cell dimensions vary with latitude, so the table shows the worst case, at the equator, where cells are widest.
[horizontal]
*GeoHash length*:: *Area width x height*
1:: 5,009.4km x 4,992.6km
2:: 1,252.3km x 624.1km
3:: 156.5km x 156km
4:: 39.1km x 19.5km
5:: 4.9km x 4.9km
6:: 1.2km x 609.4m
7:: 152.9m x 152.4m
8:: 38.2m x 19m
9:: 4.8m x 4.8m
10:: 1.2m x 59.5cm
11:: 14.9cm x 14.9cm
12:: 3.7cm x 1.9cm
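
The figures in the table can be approximated from the geohash bit layout. The sketch below assumes that each character contributes 5 bits alternating between longitude and latitude (longitude first), and it uses assumed values for the Earth's equatorial and meridional circumference; the heights it produces differ slightly from the table, which presumably relies on a different Earth model. `cell_size_km` is a hypothetical helper name.

```python
EQUATOR_KM = 40075.017   # assumed equatorial circumference of the Earth
MERIDIAN_KM = 40007.863  # assumed circumference through the poles


def cell_size_km(precision):
    """Approximate (width, height) in km of a geohash cell at the equator."""
    total_bits = 5 * precision
    lon_bits = (total_bits + 1) // 2  # longitude gets the extra bit when odd
    lat_bits = total_bits // 2
    width = EQUATOR_KM / 2 ** lon_bits           # 360 degrees of longitude
    height = (MERIDIAN_KM / 2) / 2 ** lat_bits   # 180 degrees of latitude
    return width, height


for precision in range(1, 13):
    width, height = cell_size_km(precision)
    print(f"{precision:2d}: {width:.6g} km x {height:.6g} km")
```

Odd lengths split the bits unevenly, which is why cells alternate between roughly square (odd lengths) and roughly twice as wide as they are tall (even lengths).
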
==== Options
[horizontal]
field:: Mandatory. The name of the field indexed with GeoPoints.
precision:: Optional. The string length of the geohashes used to define
cells/buckets in the results. Defaults to 5.
size:: Optional. The maximum number of geohash buckets to return
(defaults to 10,000). When results are trimmed, buckets are
prioritised based on the volumes of documents they contain.
shard_size:: Optional. To allow for more accurate counting of the top cells
returned in the final result, the aggregation defaults to
returning `max(10,(size x number-of-shards))` buckets from each
shard. If this heuristic is undesirable, the number considered
from each shard can be overridden using this parameter.
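
As a small worked example of the `shard_size` default described above, the sketch below simply evaluates the stated formula; `default_shard_size` is a hypothetical helper, not part of any Elasticsearch client.

```python
def default_shard_size(size, number_of_shards):
    """Evaluate the documented default: max(10, size x number-of-shards)."""
    return max(10, size * number_of_shards)


print(default_shard_size(10000, 5))  # 50000
print(default_shard_size(1, 3))      # 10
```

With the default `size` of 10,000 on a five-shard index, each shard would therefore consider up to 50,000 buckets, so setting `shard_size` explicitly may be worthwhile when memory is tight.
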