[[index-modules-shard-query-cache]]
== Shard query cache

coming[1.4.0]

When a search request is run against an index or against many indices, each
involved shard executes the search locally and returns its local results to
the _coordinating node_, which combines these shard-level results into a
``global'' result set.

The shard-level query cache module caches the local results on each shard.
This allows frequently used (and potentially heavy) search requests to return
results almost instantly. The query cache is a very good fit for the logging
use case, where only the most recent index is being actively updated --
results from older indices will be served directly from the cache.


[IMPORTANT]
==================================

For now, the query cache will only cache the results of search requests
where <<count,`?search_type=count`>>, so it will not cache `hits`,
but it will cache `hits.total`, <<search-aggregations,aggregations>>, and
<<search-suggesters,suggestions>>.

Queries that use `now` (see <<date-math>>) cannot be cached.
==================================


[float]
=== Cache invalidation

The cache is smart -- it keeps the same _near real-time_ promise as uncached
search.

Cached results are invalidated automatically whenever the shard refreshes, but
only if the data in the shard has actually changed. In other words, you will
always get the same results from the cache as you would for an uncached search
request.

The longer the refresh interval, the longer cached entries will remain
valid. If the cache is full, the least recently used cache keys will be
evicted.

The cache can be expired manually with the <<indices-clearcache,`clear-cache` API>>:

[source,json]
------------------------
curl -XPOST 'localhost:9200/kimchy,elasticsearch/_cache/clear?query_cache=true'
------------------------

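
The invalidation and eviction behaviour described in this section can be
pictured with a small toy model. This is only a sketch for intuition, not
Elasticsearch's implementation: the class `ToyShardQueryCache`, its
`max_entries` limit, and the `on_refresh` hook are hypothetical names, and the
real cache is bounded by memory size rather than by entry count.

```python
from collections import OrderedDict


class ToyShardQueryCache:
    """Toy model of a per-shard result cache (hypothetical, for illustration)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self.entries = OrderedDict()  # request body -> cached shard result
        self.evictions = 0

    def get(self, request_body, compute):
        """Return the cached result, or run `compute` and cache its result."""
        if request_body in self.entries:
            self.entries.move_to_end(request_body)  # mark as recently used
            return self.entries[request_body]
        result = compute()
        self.entries[request_body] = result
        if len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used
            self.evictions += 1
        return result

    def on_refresh(self, data_changed):
        """Drop cached results only when the refresh exposed changed data."""
        if data_changed:
            self.entries.clear()


calls = []


def run_search():
    # Stand-in for executing the search on the shard.
    calls.append(1)
    return {"hits": {"total": len(calls)}}


cache = ToyShardQueryCache(max_entries=2)
body = '{"aggs":{}}'

first = cache.get(body, run_search)    # miss: the search is executed
second = cache.get(body, run_search)   # hit: served from the cache
cache.on_refresh(data_changed=False)   # refresh without changes keeps entries
third = cache.get(body, run_search)    # still a hit
cache.on_refresh(data_changed=True)    # changed data invalidates entries
fourth = cache.get(body, run_search)   # miss: the search is executed again
```

In this model the search runs only twice: once for the first request and once
after the refresh that saw changed data.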

[float]
=== Enabling caching by default

The cache is not enabled by default, but can be enabled when creating a new
index as follows:

[source,json]
-----------------------------
curl -XPUT localhost:9200/my_index -d'
{
    "settings": {
        "index.cache.query.enable": true
    }
}
'
-----------------------------

It can also be enabled or disabled dynamically on an existing index with the
<<indices-update-settings,`update-settings`>> API:

[source,json]
-----------------------------
curl -XPUT localhost:9200/my_index/_settings -d'
{ "index.cache.query.enable": true }
'
-----------------------------


[float]
=== Enabling caching per request

The `query_cache` query-string parameter can be used to enable or disable
caching on a *per-query* basis. If set, it overrides the index-level setting:

[source,json]
-----------------------------
curl 'localhost:9200/my_index/_search?search_type=count&query_cache=true' -d'
{
    "aggs": {
        "popular_colors": {
            "terms": {
                "field": "colors"
            }
        }
    }
}
'
-----------------------------

IMPORTANT: If your query uses a script whose result is not deterministic (e.g.
it uses a random function or references the current time) you should set the
`query_cache` flag to `false` to disable caching for that request.

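
How the index-level setting, the per-request `query_cache` parameter, and the
restrictions stated on this page combine can be summarised as a small decision
function. This is a sketch of the documented rules only; the function name
`should_use_query_cache` and its arguments are hypothetical and do not
correspond to any Elasticsearch API.

```python
def should_use_query_cache(index_enabled, query_cache_param=None,
                           search_type="count", uses_now=False):
    """Decide whether a request is eligible for the shard query cache.

    index_enabled      -- value of `index.cache.query.enable` on the index
    query_cache_param  -- per-request `query_cache` flag, or None if not set
    search_type        -- the request's `search_type`
    uses_now           -- True if the query uses `now` date math
    """
    # Only `search_type=count` requests are cached, and never ones using `now`.
    if search_type != "count" or uses_now:
        return False
    # An explicit per-request flag overrides the index-level setting.
    if query_cache_param is not None:
        return query_cache_param
    return index_enabled


# Index default is off, but the request opts in.
opt_in = should_use_query_cache(index_enabled=False, query_cache_param=True)

# Index default is on, but the request opts out (e.g. a non-deterministic script).
opt_out = should_use_query_cache(index_enabled=True, query_cache_param=False)
```

Note that a non-deterministic script is not detected by these rules, which is
why such requests need the explicit `query_cache=false` opt-out.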

[float]
=== Cache key

The whole JSON body is used as the cache key. This means that if the JSON
changes -- for instance if keys are output in a different order -- then the
cache key changes too, and the previously cached result will not be found.

TIP: Most JSON libraries support a _canonical_ mode which ensures that JSON
keys are always emitted in the same order. This canonical mode can be used in
the application to ensure that a request is always serialized in the same way.

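
As an illustration of the tip above, Python's standard `json` module can emit
keys in sorted order via `sort_keys=True`. The helper name `canonical_body` is
made up for this sketch, and the `query` clause is added only to give the
request a second top-level key; fixing `separators` as well keeps whitespace
stable, on the assumption that every byte of the body contributes to the
cache key.

```python
import json


def canonical_body(request):
    """Serialize a request dict with sorted keys and fixed separators."""
    return json.dumps(request, sort_keys=True, separators=(",", ":"))


# The same request built with its top-level keys in a different order.
request_a = {
    "aggs": {"popular_colors": {"terms": {"field": "colors"}}},
    "query": {"match_all": {}},
}
request_b = {
    "query": {"match_all": {}},
    "aggs": {"popular_colors": {"terms": {"field": "colors"}}},
}

# Default serialization follows insertion order, so the bodies differ.
plain_bodies_differ = json.dumps(request_a) != json.dumps(request_b)

# Canonical serialization yields one identical body for both requests.
canonical_bodies_match = canonical_body(request_a) == canonical_body(request_b)
```

Sending the canonical string as the request body means logically identical
requests share a single cache entry.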

[float]
=== Cache settings

The cache is managed at the node level, and has a default maximum size of `1%`
of the heap. This can be changed in the `config/elasticsearch.yml` file with:

[source,yaml]
--------------------------------
indices.cache.query.size: 2%
--------------------------------

Also, you can use the +indices.cache.query.expire+ setting to specify a TTL
for cached results, but there should be no reason to do so. Remember that
stale results are automatically invalidated when the index is refreshed. This
setting is provided for completeness' sake only.

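
To get a feel for what a percentage of the heap means in bytes, the arithmetic
can be sketched as below. The helper `query_cache_bytes` and the 4 GiB heap
are assumptions chosen for illustration; the sketch handles only the
percentage form of the setting shown above.

```python
def query_cache_bytes(heap_bytes, size_setting="1%"):
    """Convert a percentage setting such as "1%" into a byte budget."""
    percent = float(size_setting.rstrip("%"))
    return int(heap_bytes * percent / 100)


heap = 4 * 1024 ** 3                            # hypothetical 4 GiB node heap
default_budget = query_cache_bytes(heap)        # default of 1%
raised_budget = query_cache_bytes(heap, "2%")   # after setting the size to 2%
```

With a 4 GiB heap the default budget is about 41 MiB per node, and raising
the setting to `2%` doubles it to about 82 MiB.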

[float]
=== Monitoring cache usage

The size of the cache (in bytes) and the number of evictions can be viewed
by index, with the <<indices-stats,`indices-stats`>> API:

[source,json]
------------------------
curl -XGET 'localhost:9200/_stats/query_cache?pretty&human'
------------------------

or by node with the <<cluster-nodes-stats,`nodes-stats`>> API:

[source,json]
------------------------
curl -XGET 'localhost:9200/_nodes/stats/indices/query_cache?pretty&human'
------------------------