Documented the query cache module

Related to #7161 and #7167
2014-08-06 11:54:51 +02:00 · 2014-08-06 11:54:51 +02:00 · e7f1aa4f4f
parent 6f09eb1b06
commit e7f1aa4f4f
6 changed files with 225 additions and 43 deletions
--- a/docs/reference/index-modules.asciidoc
+++ b/docs/reference/index-modules.asciidoc
@ -72,6 +72,8 @@ include::index-modules/translog.asciidoc[]
 include::index-modules/cache.asciidoc[]
 include::index-modules/query-cache.asciidoc[]
 include::index-modules/fielddata.asciidoc[]
 include::index-modules/codec.asciidoc[]
--- a/docs/reference/index-modules/query-cache.asciidoc
+++ b/docs/reference/index-modules/query-cache.asciidoc
@ -0,0 +1,145 @@
 [[index-modules-shard-query-cache]]
 == Shard query cache
 coming[1.4.0]
 When a search request is run against an index or against many indices, each
 involved shard executes the search locally and returns its local results to
 the _coordinating node_, which combines these shard-level results into a
 ``global'' result set.
 The shard-level query cache module caches the local results on each shard.
 This allows frequently used (and potentially heavy) search requests to return
 results almost instantly. The query cache is a very good fit for the logging
 use case, where only the most recent index is being actively updated --
 results from older indices will be served directly from the cache.
 [IMPORTANT]
 ==================================
 For now, the query cache will only cache the results of search requests
 where <<count,`?search_type=count`>>, so it will not cache `hits`,
 but it will cache `hits.total`,  <<search-aggregations,aggregations>>, and
 <<search-suggesters,suggestions>>.
 Queries that use `now` (see <<date-math>>) cannot be cached.
 ==================================
 [float]
 === Cache invalidation
 The cache is smart -- it keeps the same _near real-time_ promise as uncached
 search.
 Cached results are invalidated automatically whenever the shard refreshes, but
 only if the data in the shard has actually changed.  In other words, you will
 always get the same results from the cache as you would for an uncached search
 request.
 The longer the refresh interval, the longer cached entries remain
 valid. If the cache is full, the least recently used cache keys will be
 evicted.
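The eviction behaviour described above can be illustrated with a toy least-recently-used cache. This is only a sketch of the general LRU technique, not Elasticsearch's implementation: the class name `LRUCache`, its methods, and the entry-count limit are hypothetical (the real cache is bounded by memory size, not by number of entries).

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache (hypothetical; illustrates the eviction policy only)."""

    def __init__(self, max_entries):
        # Assumption: bound by entry count for simplicity.
        self.max_entries = max_entries
        self._entries = OrderedDict()
        self.evictions = 0

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key not in self._entries:
            return None
        # A hit makes this key the most recently used one.
        self._entries.move_to_end(key)
        return self._entries[key]

    def put(self, key, value):
        """Store a value, evicting least recently used keys if full."""
        self._entries[key] = value
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            # The first item is the least recently used one.
            self._entries.popitem(last=False)
            self.evictions += 1

    def clear(self):
        """Drop all entries, comparable to a manual cache clear."""
        self._entries.clear()
```

With a limit of two entries, storing `a` and `b`, reading `a`, and then storing `c` evicts `b`, because `b` is the key that was used least recently.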
 The cache can be expired manually with the <<indices-clearcache,`clear-cache` API>>:
 [source,json]
 ------------------------
 curl -XPOST 'localhost:9200/kimchy,elasticsearch/_cache/clear?query_cache=true'
 ------------------------
 [float]
 === Enabling caching by default
 The cache is not enabled by default, but can be enabled when creating a new
 index as follows:
 [source,json]
 -----------------------------
 curl -XPUT localhost:9200/my_index -d'
 {
  "settings": {
    "index.cache.query.enable": true
  }
 }
 '
 -----------------------------
 It can also be enabled or disabled dynamically on an existing index with the
 <<indices-update-settings,`update-settings`>> API:
 [source,json]
 -----------------------------
 curl -XPUT localhost:9200/my_index/_settings -d'
 { "index.cache.query.enable": true }
 '
 -----------------------------
 [float]
 === Enabling caching per request
 The `query_cache` query-string parameter can be used to enable or disable
 caching on a *per-query* basis.  If set, it overrides the index-level setting:
 [source,json]
 -----------------------------
 curl 'localhost:9200/my_index/_search?search_type=count&query_cache=true' -d'
 {
  "aggs": {
    "popular_colors": {
      "terms": {
        "field": "colors"
      }
    }
  }
 }
 '
 -----------------------------
 IMPORTANT: If your query uses a script whose result is not deterministic (e.g.
 it uses a random function or references the current time) you should set the
 `query_cache` flag to `false` to disable caching for that request.
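When requests are built from application code rather than typed into a shell, the two query-string parameters can be assembled with a URL-encoding helper. The following is a minimal sketch using only the Python standard library; the function name `build_search_url` and its arguments are hypothetical, and it only constructs the URL without sending anything.

```python
from urllib.parse import urlencode


def build_search_url(host, index, query_cache=None):
    """Build a count-search URL (hypothetical helper).

    query_cache=None leaves the parameter out, so the index-level
    setting applies; True or False overrides it for this request.
    """
    params = {"search_type": "count"}
    if query_cache is not None:
        params["query_cache"] = "true" if query_cache else "false"
    return "http://{}/{}/_search?{}".format(host, index, urlencode(params))


# Explicitly disable caching, e.g. for a non-deterministic script.
print(build_search_url("localhost:9200", "my_index", query_cache=False))
```

Leaving `query_cache` out of the URL keeps the index-level setting in force, which matches the override behaviour described above.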
 [float]
 === Cache key
 The whole JSON body is used as the cache key.  This means that if the JSON
 changes -- for instance if keys are output in a different order -- then the
 request produces a different cache key and the cached result is not reused.
 TIP: Most JSON libraries support a _canonical_ mode which ensures that JSON
 keys are always emitted in the same order. This canonical mode can be used in
 the application to ensure that a request is always serialized in the same way.
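As one way to get a stable serialization, Python's standard `json` module can sort keys on output. This is a sketch under the assumption that the application builds request bodies as dictionaries; the helper name `canonical_body` is hypothetical.

```python
import json


def canonical_body(body):
    """Serialize a request body with sorted keys and fixed separators,
    so logically equal bodies always yield identical JSON text."""
    return json.dumps(body, sort_keys=True, separators=(",", ":"))


# The same aggregation, with keys inserted in different orders.
first = {"aggs": {"popular_colors": {"terms": {"field": "colors", "size": 5}}}}
second = {"aggs": {"popular_colors": {"terms": {"size": 5, "field": "colors"}}}}

# Both produce the same string, and therefore the same cache key.
print(canonical_body(first) == canonical_body(second))
```

Sorting is applied recursively to nested objects, so key order anywhere in the body no longer affects the serialized request.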
 [float]
 === Cache settings
 The cache is managed at the node level, and has a default maximum size of `1%`
 of the heap.  This can be changed in the `config/elasticsearch.yml` file with:
 [source,yaml]
 --------------------------------
 indices.cache.query.size: 2%
 --------------------------------
 Also, you can use the +indices.cache.query.expire+ setting to specify a TTL
 for cached results, but there should be no reason to do so.  Remember that
 stale results are automatically invalidated when the index is refreshed. This
 setting is provided for completeness' sake only.
 [float]
 === Monitoring cache usage
 The size of the cache (in bytes) and the number of evictions can be viewed
 by index, with the <<indices-stats,`indices-stats`>> API:
 [source,json]
 ------------------------
 curl -XGET 'localhost:9200/_stats/query_cache?pretty&human'
 ------------------------
 or by node with the <<cluster-nodes-stats,`nodes-stats`>> API:
 [source,json]
 ------------------------
 curl -XGET 'localhost:9200/_nodes/stats/indices/query_cache?pretty&human'
 ------------------------
--- a/docs/reference/indices/clearcache.asciidoc
+++ b/docs/reference/indices/clearcache.asciidoc
@ -9,9 +9,9 @@ associated with one ore more indices.
 $ curl -XPOST 'http://localhost:9200/twitter/_cache/clear'
 --------------------------------------------------
-The API, by default, will clear all caches. Specific caches can be
+The API, by default, will clear all caches. Specific caches can be cleaned
-cleaned explicitly by setting `filter`, `field_data` or `id_cache` to
+explicitly by setting `filter`, `field_data`, `query_cache` coming[1.4.0],
-`true`.
+or `id_cache` to `true`.
 All caches relating to a specific field(s) can also be cleared by
 specifying `fields` parameter with a comma delimited list of the
--- a/docs/reference/indices/stats.asciidoc
+++ b/docs/reference/indices/stats.asciidoc
@ -39,20 +39,32 @@ specified as well in the URI. Those stats can be any of:
                groups). The `groups` parameter accepts a comma separated list of group names.
                Use `_all` to return statistics for all groups.
-`warmer`:: 		Warmer statistics.
+`completion`::  Completion suggest statistics.
 `merge`:: 		Merge statistics.
 `fielddata`::   Fielddata statistics.
 `flush`::       Flush statistics.
-`completion`:: 		Completion suggest statistics.
+`merge`::       Merge statistics.
 `query_cache`:: <<index-modules-shard-query-cache,Shard query cache>> statistics. coming[1.4.0]
 `refresh`::     Refresh statistics.
 `suggest`::     Suggest statistics.
 `warmer`::      Warmer statistics.
-Some statistics allow per field granularity which accepts a list comma-separated list of included fields. By default all fields are included:
+Some statistics allow per-field granularity, which accepts a
 comma-separated list of included fields. By default all fields are included:
 [horizontal]
-`fields`::	List of fields to be included in the statistics. This is used as the default list unless a more specific field list is provided (see below).
+`fields`::
-`completion_fields`::	List of fields to be included in the Completion Suggest statistics
+
-`fielddata_fields`:: 	List of fields to be included in the Fielddata statistics
+    List of fields to be included in the statistics. This is used as the
    default list unless a more specific field list is provided (see below).
 `completion_fields`::
    List of fields to be included in the Completion Suggest statistics.
 `fielddata_fields`::
    List of fields to be included in the Fielddata statistics.
 Here are some samples:
--- a/docs/reference/search/aggregations.asciidoc
+++ b/docs/reference/search/aggregations.asciidoc
@ -125,6 +125,18 @@ aggregated for the buckets created by their "parent" bucket aggregation.
 There are different bucket aggregators, each with a different "bucketing" strategy. Some define a single bucket, some
 define fixed number of multiple buckets, and others dynamically create the buckets during the aggregation process.
 [float]
 === Caching heavy aggregations
 coming[1.4.0]
 Frequently used aggregations (e.g. for display on the home page of a website)
 can be cached for faster responses. These cached results are the same results
 that would be returned by an uncached aggregation -- you will never get stale
 results.
 See <<index-modules-shard-query-cache>> for more details.
 include::aggregations/metrics.asciidoc[]
--- a/docs/reference/search/request-body.asciidoc
+++ b/docs/reference/search/request-body.asciidoc
@ -46,39 +46,50 @@ And here is a sample response:
 [float]
 === Parameters
-[cols="<,<",options="header",]
+[horizontal]
-|=======================================================================
+`timeout`::
 |Name |Description
 |`timeout` |A search timeout, bounding the search request to be executed
 within the specified time value and bail with the hits accumulated up to
 that point when expired. Defaults to no timeout. See <<time-units>>.
-|`from` |The starting from index of the hits to return. Defaults to `0`.
+    A search timeout, bounding the search request to be executed within the
    specified time value and bail with the hits accumulated up to that point
    when expired. Defaults to no timeout. See <<time-units>>.
-|`size` |The number of hits to return. Defaults to `10`.
+`from`::
-|`search_type` |The type of the search operation to perform. Can be
+    The starting from index of the hits to return. Defaults to `0`.
 `dfs_query_then_fetch`, `dfs_query_and_fetch`, `query_then_fetch`,
 `query_and_fetch`. Defaults to `query_then_fetch`. See
 <<search-request-search-type,_Search Type_>> for
 more details on the different types of search that can be performed.
-|coming[1.4.0] `terminate_after` |The maximum number of documents to collect for
+`size`::
 each shard, upon reaching which the query execution will terminate early.
 If set, the response will have a boolean field `terminated_early` to
 indicate whether the query execution has actually terminated_early.
 Defaults to no terminate_after.
 |=======================================================================
-Out of the above, the `search_type` is the one that can not be passed
+    The number of hits to return. Defaults to `10`.
 within the search request body, and in order to set it, it must be
 passed as a request REST parameter.
-The rest of the search request should be passed within the body itself.
+`search_type`::
 The body content can also be passed as a REST parameter named `source`.
-Both HTTP GET and HTTP POST can be used to execute search with body.
+    The type of the search operation to perform. Can be
-Since not all clients support GET with body, POST is allowed as well.
+    `dfs_query_then_fetch`, `dfs_query_and_fetch`, `query_then_fetch`,
    `query_and_fetch`. Defaults to `query_then_fetch`. See
    <<search-request-search-type,_Search Type_>> for more.
 `query_cache`::
    coming[1.4.0] Set to `true` or `false` to enable or disable the caching
    of search results for requests where `?search_type=count`, i.e.
    aggregations and suggestions.  See <<index-modules-shard-query-cache>>.
 `terminate_after`::
    coming[1.4.0] The maximum number of documents to collect for each shard,
    upon reaching which the query execution will terminate early. If set, the
    response will have a boolean field `terminated_early` to indicate whether
    the query execution has actually terminated_early. Defaults to no
    terminate_after.
 Out of the above, the `search_type` and the `query_cache` must be passed as
 query-string parameters. The rest of the search request should be passed
 within the body itself. The body content can also be passed as a REST
 parameter named `source`.
 Both HTTP GET and HTTP POST can be used to execute search with body. Since not
 all clients support GET with body, POST is allowed as well.
 include::request/query.asciidoc[]