OpenSearch/docs/reference/aggregations/bucket/geotilegrid-aggregation.asc...

253 lines
7.5 KiB
Plaintext
Raw Normal View History

[[search-aggregations-bucket-geotilegrid-aggregation]]
=== GeoTile Grid Aggregation
A multi-bucket aggregation that works on `geo_point` fields and groups points into
buckets that represent cells in a grid. The resulting grid can be sparse and only
contains cells that have matching data. Each cell corresponds to a
{wikipedia}/Tiled_web_map[map tile] as used by many online map
sites. Each cell is labeled using a "{zoom}/{x}/{y}" format, where zoom is equal
to the user-specified precision.
* High precision keys have a larger range for x and y, and represent tiles that
cover only a small area.
* Low precision keys have a smaller range for x and y, and represent tiles that
each cover a large area.
See https://wiki.openstreetmap.org/wiki/Zoom_levels[Zoom level documentation]
on how precision (zoom) correlates to size on the ground. Precision for this
aggregation can be between 0 and 29, inclusive.
WARNING: The highest-precision geotile of length 29 produces cells that cover
less than a 10cm by 10cm of land and so high-precision requests can be very
costly in terms of RAM and result sizes. Please see the example below on how
to first filter the aggregation to a smaller geographic area before requesting
high-levels of detail.
The specified field must be of type `geo_point` (which can only be set
explicitly in the mappings) and it can also hold an array of `geo_point`
fields, in which case all points will be taken into account during aggregation.
==== Simple low-precision request
[source,console]
--------------------------------------------------
PUT /museums
{
"mappings": {
"properties": {
"location": {
"type": "geo_point"
}
}
}
}
POST /museums/_bulk?refresh
{"index":{"_id":1}}
{"location": "52.374081,4.912350", "name": "NEMO Science Museum"}
{"index":{"_id":2}}
{"location": "52.369219,4.901618", "name": "Museum Het Rembrandthuis"}
{"index":{"_id":3}}
{"location": "52.371667,4.914722", "name": "Nederlands Scheepvaartmuseum"}
{"index":{"_id":4}}
{"location": "51.222900,4.405200", "name": "Letterenhuis"}
{"index":{"_id":5}}
{"location": "48.861111,2.336389", "name": "Musée du Louvre"}
{"index":{"_id":6}}
{"location": "48.860000,2.327000", "name": "Musée d'Orsay"}
POST /museums/_search?size=0
{
"aggregations": {
"large-grid": {
"geotile_grid": {
"field": "location",
"precision": 8
}
}
}
}
--------------------------------------------------
Response:
[source,console-result]
--------------------------------------------------
{
...
"aggregations": {
"large-grid": {
"buckets": [
{
"key": "8/131/84",
"doc_count": 3
},
{
"key": "8/129/88",
"doc_count": 2
},
{
"key": "8/131/85",
"doc_count": 1
}
]
}
}
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
==== High-precision requests
When requesting detailed buckets (typically for displaying a "zoomed in" map)
a filter like <<query-dsl-geo-bounding-box-query,geo_bounding_box>> should be
applied to narrow the subject area otherwise potentially millions of buckets
will be created and returned.
[source,console]
--------------------------------------------------
POST /museums/_search?size=0
{
"aggregations": {
"zoomed-in": {
"filter": {
"geo_bounding_box": {
"location": {
"top_left": "52.4, 4.9",
"bottom_right": "52.3, 5.0"
}
}
},
"aggregations": {
"zoom1": {
"geotile_grid": {
"field": "location",
"precision": 22
}
}
}
}
}
}
--------------------------------------------------
// TEST[continued]
[source,console-result]
--------------------------------------------------
{
...
"aggregations": {
"zoomed-in": {
"doc_count": 3,
"zoom1": {
"buckets": [
{
"key": "22/2154412/1378379",
"doc_count": 1
},
{
"key": "22/2154385/1378332",
"doc_count": 1
},
{
"key": "22/2154259/1378425",
"doc_count": 1
}
]
}
}
}
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
==== Requests with additional bounding box filtering
The `geotile_grid` aggregation supports an optional `bounds` parameter
that restricts the points considered to those that fall within the
bounds provided. The `bounds` parameter accepts the bounding box in
all the same <<query-dsl-geo-bounding-box-query-accepted-formats,accepted formats>> of the
bounds specified in the Geo Bounding Box Query. This bounding box can be used with or
without an additional `geo_bounding_box` query filtering the points prior to aggregating.
It is an independent bounding box that can intersect with, be equal to, or be disjoint
to any additional `geo_bounding_box` queries defined in the context of the aggregation.
[source,console,id=geotilegrid-aggregation-with-bounds]
--------------------------------------------------
POST /museums/_search?size=0
{
"aggregations": {
"tiles-in-bounds": {
"geotile_grid": {
"field": "location",
"precision": 22,
"bounds": {
"top_left": "52.4, 4.9",
"bottom_right": "52.3, 5.0"
}
}
}
}
}
--------------------------------------------------
// TEST[continued]
[source,console-result]
--------------------------------------------------
{
...
"aggregations": {
"tiles-in-bounds": {
"buckets": [
{
"key": "22/2154412/1378379",
"doc_count": 1
},
{
"key": "22/2154385/1378332",
"doc_count": 1
},
{
"key": "22/2154259/1378425",
"doc_count": 1
}
]
}
}
}
--------------------------------------------------
// TESTRESPONSE[s/\.\.\./"took": $body.took,"_shards": $body._shards,"hits":$body.hits,"timed_out":false,/]
[discrete]
[role="xpack"]
==== Aggregating `geo_shape` fields
Aggregating on <<geo-shape>> fields works just as it does for points, except that a single
shape can be counted for in multiple tiles. A shape will contribute to the count of matching values
if any part of its shape intersects with that tile. Below is an image that demonstrates this:
image:images/spatial/geoshape_grid.png[]
==== Options
[horizontal]
field:: Mandatory. The name of the field indexed with GeoPoints.
precision:: Optional. The integer zoom of the key used to define
cells/buckets in the results. Defaults to 7.
Values outside of [0,29] will be rejected.
bounds: Optional. The bounding box to filter the points in the bucket.
size:: Optional. The maximum number of geohash buckets to return
(defaults to 10,000). When results are trimmed, buckets are
prioritised based on the volumes of documents they contain.
shard_size:: Optional. To allow for more accurate counting of the top cells
returned in the final result the aggregation defaults to
returning `max(10,(size x number-of-shards))` buckets from each
shard. If this heuristic is undesirable, the number considered
from each shard can be over-ridden using this parameter.