Docs: Removed all the added/deprecated tags from 1.x

Clinton Gormley 2014-09-26 21:04:42 +02:00
parent e85e07941d
commit cb00d4a542
89 changed files with 124 additions and 585 deletions

View File

@ -14,8 +14,7 @@ type:
|`pattern` |The regular expression pattern, defaults to `\W+`.
|`flags` |The regular expression flags.
|`stopwords` |A list of stopwords to initialize the stop filter with.
Defaults to an 'empty' stopword list added[1.0.0.RC1, Previously
defaulted to the English stopwords list]. Check
Defaults to an 'empty' stopword list. Check
<<analysis-stop-analyzer,Stop Analyzer>> for more details.
|===================================================================
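For illustration, a `pattern` analyzer that splits on commas and uses the
predefined English stopword list might be declared like this (the index and
analyzer names are hypothetical):

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/my_index' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_csv_analyzer": {
          "type":      "pattern",
          "pattern":   ",",
          "stopwords": "_english_"
        }
      }
    }
  }
}'
--------------------------------------------------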

View File

@ -18,8 +18,7 @@ type:
|=======================================================================
|Setting |Description
|`stopwords` |A list of stopwords to initialize the stop filter with.
Defaults to an 'empty' stopword list added[1.0.0.Beta1, Previously
defaulted to the English stopwords list]. Check
Defaults to an 'empty' stopword list. Check
<<analysis-stop-analyzer,Stop Analyzer>> for more details.
|`max_token_length` |The maximum token length. If a token is seen that
exceeds this length then it is discarded. Defaults to `255`.
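As a sketch, a custom `standard` analyzer with a small stopword list and a
shorter token length limit might look like this (index and analyzer names are
hypothetical):

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/my_index' -d '{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_standard": {
          "type":             "standard",
          "stopwords":        ["a", "an", "the"],
          "max_token_length": 5
        }
      }
    }
  }
}'
--------------------------------------------------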

View File

@ -1,7 +1,5 @@
[[analysis-apostrophe-tokenfilter]]
=== Apostrophe Token Filter
added[1.3.0]
The `apostrophe` token filter strips all characters after an apostrophe,
including the apostrophe itself.

View File

@ -20,7 +20,6 @@ equivalents, if one exists. Example:
}
--------------------------------------------------
added[1.1.0]
Accepts `preserve_original` setting which defaults to false but if true
will keep the original token as well as emit the folded token. For
example:
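A minimal sketch of this setting in a custom analyzer (the index, analyzer and
filter names are illustrative):

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/my_index' -d '{
  "settings": {
    "analysis": {
      "filter": {
        "my_ascii_folding": {
          "type":              "asciifolding",
          "preserve_original": true
        }
      },
      "analyzer": {
        "folding_analyzer": {
          "tokenizer": "standard",
          "filter":    ["my_ascii_folding"]
        }
      }
    }
  }
}'
--------------------------------------------------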

View File

@ -1,8 +1,6 @@
[[analysis-classic-tokenfilter]]
=== Classic Token Filter
added[1.3.0]
The `classic` token filter does optional post-processing of
terms that are generated by the <<analysis-classic-tokenizer,`classic` tokenizer>>.

View File

@ -1,8 +1,6 @@
[[analysis-keep-types-tokenfilter]]
=== Keep Types Token Filter
coming[1.4.0]
A token filter of type `keep_types` that only keeps tokens with a token type
contained in a predefined set.
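As a hedged sketch, assuming the set of token types to keep is passed as a
`types` array, a filter that keeps only the numeric tokens produced by the
`standard` tokenizer might be configured like this (all names are
illustrative):

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/my_index' -d '{
  "settings": {
    "analysis": {
      "filter": {
        "numbers_only": {
          "type":  "keep_types",
          "types": ["<NUM>"]
        }
      },
      "analyzer": {
        "number_analyzer": {
          "tokenizer": "standard",
          "filter":    ["numbers_only"]
        }
      }
    }
  }
}'
--------------------------------------------------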

View File

@ -4,7 +4,7 @@
A token filter of type `lowercase` that normalizes token text to lower
case.
Lowercase token filter supports Greek, Irish added[1.3.0], and Turkish lowercase token
Lowercase token filter supports Greek, Irish, and Turkish lowercase token
filters through the `language` parameter. Below is a usage example in a
custom analyzer
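For instance, a Greek-aware variant could be declared along these lines (the
index, analyzer and filter names are illustrative):

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/my_index' -d '{
  "settings": {
    "analysis": {
      "filter": {
        "greek_lowercase": {
          "type":     "lowercase",
          "language": "greek"
        }
      },
      "analyzer": {
        "greek_analyzer": {
          "tokenizer": "standard",
          "filter":    ["greek_lowercase"]
        }
      }
    }
  }
}'
--------------------------------------------------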

View File

@ -11,19 +11,19 @@ http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/
German::
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilter.html[`german_normalization`] added[1.3.0]
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/de/GermanNormalizationFilter.html[`german_normalization`]
Hindi::
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/hi/HindiNormalizer.html[`hindi_normalization`] added[1.3.0]
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/hi/HindiNormalizer.html[`hindi_normalization`]
Indic::
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/in/IndicNormalizer.html[`indic_normalization`] added[1.3.0]
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/in/IndicNormalizer.html[`indic_normalization`]
Kurdish (Sorani)::
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/ckb/SoraniNormalizer.html[`sorani_normalization`] added[1.3.0]
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/ckb/SoraniNormalizer.html[`sorani_normalization`]
Persian::
@ -31,6 +31,6 @@ http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/
Scandinavian::
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/ScandinavianNormalizationFilter.html[`scandinavian_normalization`] added[1.3.0],
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/ScandinavianFoldingFilter.html[`scandinavian_folding`] added[1.3.0]
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/ScandinavianNormalizationFilter.html[`scandinavian_normalization`],
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/miscellaneous/ScandinavianFoldingFilter.html[`scandinavian_folding`]

View File

@ -65,15 +65,15 @@ http://snowball.tartarus.org/algorithms/danish/stemmer.html[*`danish`*]
Dutch::
http://snowball.tartarus.org/algorithms/dutch/stemmer.html[*`dutch`*],
http://snowball.tartarus.org/algorithms/kraaij_pohlmann/stemmer.html[`dutch_kp`] added[1.3.0,Renamed from `kp`]
http://snowball.tartarus.org/algorithms/kraaij_pohlmann/stemmer.html[`dutch_kp`]
English::
http://snowball.tartarus.org/algorithms/porter/stemmer.html[*`english`*] added[1.3.0,Returns the <<analysis-porterstem-tokenfilter,`porter_stem`>> instead of the <<analysis-snowball-tokenfilter,`english` Snowball token filter>>],
http://ciir.cs.umass.edu/pubfiles/ir-35.pdf[`light_english`] added[1.3.0,Returns the <<analysis-kstem-tokenfilter,`kstem` token filter>>],
http://snowball.tartarus.org/algorithms/porter/stemmer.html[*`english`*],
http://ciir.cs.umass.edu/pubfiles/ir-35.pdf[`light_english`],
http://www.researchgate.net/publication/220433848_How_effective_is_suffixing[`minimal_english`],
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/en/EnglishPossessiveFilter.html[`possessive_english`],
http://snowball.tartarus.org/algorithms/english/stemmer.html[`porter2`] added[1.3.0,Returns the <<analysis-snowball-tokenfilter,`english` Snowball token filter>> instead of the <<analysis-snowball-tokenfilter,`porter` Snowball token filter>>],
http://snowball.tartarus.org/algorithms/english/stemmer.html[`porter2`],
http://snowball.tartarus.org/algorithms/lovins/stemmer.html[`lovins`]
Finnish::
@ -89,8 +89,8 @@ http://dl.acm.org/citation.cfm?id=318984[`minimal_french`]
Galician::
http://bvg.udc.es/recursos_lingua/stemming.jsp[*`galician`*] added[1.3.0],
http://bvg.udc.es/recursos_lingua/stemming.jsp[`minimal_galician`] (Plural step only) added[1.3.0]
http://bvg.udc.es/recursos_lingua/stemming.jsp[*`galician`*],
http://bvg.udc.es/recursos_lingua/stemming.jsp[`minimal_galician`] (Plural step only)
German::
@ -127,7 +127,7 @@ http://www.ercim.eu/publication/ws-proceedings/CLEF2/savoy.pdf[*`light_italian`*
Kurdish (Sorani)::
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/ckb/SoraniStemmer.html[*`sorani`*] added[1.3.0]
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/ckb/SoraniStemmer.html[*`sorani`*]
Latvian::
@ -136,20 +136,20 @@ http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/
Norwegian (Bokmål)::
http://snowball.tartarus.org/algorithms/norwegian/stemmer.html[*`norwegian`*],
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/no/NorwegianLightStemmer.html[*`light_norwegian`*] added[1.3.0],
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/no/NorwegianLightStemmer.html[*`light_norwegian`*],
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/no/NorwegianMinimalStemmer.html[`minimal_norwegian`]
Norwegian (Nynorsk)::
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/no/NorwegianLightStemmer.html[*`light_nynorsk`*] added[1.3.0],
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/no/NorwegianMinimalStemmer.html[`minimal_nynorsk`] added[1.3.0]
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/no/NorwegianLightStemmer.html[*`light_nynorsk`*],
http://lucene.apache.org/core/4_9_0/analyzers-common/org/apache/lucene/analysis/no/NorwegianMinimalStemmer.html[`minimal_nynorsk`]
Portuguese::
http://snowball.tartarus.org/algorithms/portuguese/stemmer.html[`portuguese`],
http://dl.acm.org/citation.cfm?id=1141523&dl=ACM&coll=DL&CFID=179095584&CFTOKEN=80067181[*`light_portuguese`*],
http://www.inf.ufrgs.br/\~buriol/papers/Orengo_CLEF07.pdf[`minimal_portuguese`],
http://www.inf.ufrgs.br/\~viviane/rslp/index.htm[`portuguese_rslp`] added[1.3.0]
http://www.inf.ufrgs.br/\~viviane/rslp/index.htm[`portuguese_rslp`]
Romanian::

View File

@ -1,7 +1,5 @@
[[analysis-uppercase-tokenfilter]]
=== Uppercase Token Filter
added[1.2.0]
A token filter of type `uppercase` that normalizes token text to upper
case.

View File

@ -1,8 +1,6 @@
[[analysis-classic-tokenizer]]
=== Classic Tokenizer
added[1.3.0]
A tokenizer of type `classic` providing grammar based tokenizer that is
a good tokenizer for English language documents. This tokenizer has
heuristics for special treatment of acronyms, company names, email addresses,

View File

@ -1,8 +1,6 @@
[[analysis-thai-tokenizer]]
=== Thai Tokenizer
added[1.3.0]
A tokenizer of type `thai` that segments Thai text into words. This tokenizer
uses the built-in Thai segmentation algorithm included with Java to divide
up Thai text. Text in other languages in general will be treated the same

View File

@ -51,7 +51,7 @@ specified to expand to all indices.
+
If `none` is specified then wildcard expansion will be disabled and if `all`
is specified, wildcard expressions will expand to all indices (this is equivalent
to specifying `open,closed`). coming[1.4.0]
to specifying `open,closed`).
The default settings for the above parameters depend on the api being used.
@ -82,7 +82,7 @@ The human readable values can be turned off by adding `?human=false`
to the query string. This makes sense when the stats results are
being consumed by a monitoring tool, rather than intended for human
consumption. The default for the `human` flag is
`false`. added[1.00.Beta,Previously defaulted to `true`]
`false`.
[float]
=== Flat Settings
@ -246,7 +246,7 @@ document indexed.
[float]
=== JSONP
By default JSONP responses are disabled. coming[1.3,Previously JSONP was enabled by default]
By default JSONP responses are disabled.
When enabled, all REST APIs accept a `callback` parameter
resulting in a http://en.wikipedia.org/wiki/JSONP[JSONP] result. You can enable

View File

@ -1,8 +1,6 @@
[[cat-fielddata]]
== cat fielddata
added[1.2.0]
`fielddata` shows information about currently loaded fielddata on a per-node
basis.
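For example, against a local node (the `v` flag simply adds column headers):

[source,sh]
--------------------------------------------------
curl 'localhost:9200/_cat/fielddata?v'
--------------------------------------------------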

View File

@ -103,8 +103,6 @@ due to forced awareness or allocation filtering.
[float]
===== Disable allocation
added[1.0.0.RC1]
All the disable allocation settings have been deprecated in favour of the
`cluster.routing.allocation.enable` setting.
@ -156,7 +154,7 @@ All the disable allocation settings have been deprecated in favour for
`discovery.zen.minimum_master_nodes`::
See <<modules-discovery-zen>>
`discovery.zen.publish_timeout` added[1.1.0, The setting existed before but wasn't dynamic]::
`discovery.zen.publish_timeout`::
See <<modules-discovery-zen>>
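For example, `cluster.routing.allocation.enable` and
`discovery.zen.publish_timeout` can both be changed at runtime through the
cluster update settings API (the values below are illustrative):

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "transient": {
    "cluster.routing.allocation.enable": "none",
    "discovery.zen.publish_timeout":     "10s"
  }
}'
--------------------------------------------------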
[float]

View File

@ -20,7 +20,7 @@ $ curl -XDELETE 'http://localhost:9200/twitter/tweet/_query' -d '{
--------------------------------------------------
NOTE: The query being sent in the body must be nested in a `query` key, same as
the <<search-search,search api>> works added[1.0.0.RC1,The query was previously the top-level object].
the <<search-search,search api>> works.
Both above examples end up doing the same thing, which is delete all
tweets from the twitter index for a certain user. The result of the

View File

@ -73,8 +73,6 @@ to fetch the first document matching the id across all types.
[[get-source-filtering]]
=== Source filtering
added[1.0.0.Beta1]
By default, the get operation returns the contents of the `_source` field unless
you have used the `fields` parameter or if the `_source` field is disabled.
You can turn off `_source` retrieval by using the `_source` parameter:
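For example (the index, type and id here are illustrative):

[source,sh]
--------------------------------------------------
curl -XGET 'localhost:9200/twitter/tweet/1?_source=false'
--------------------------------------------------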
@ -127,8 +125,6 @@ will fail.
[float]
[[generated-fields]]
=== Generated fields
coming[1.4.0]
If no refresh occurred between indexing and refresh, GET will access the transaction log to fetch the document. However, some fields are generated only when indexing.
If you try to access a field that is only generated when indexing, you will get an exception (default). You can choose to ignore fields that are generated if the transaction log is accessed by setting `ignore_errors_on_generated_fields=true`.

View File

@ -113,8 +113,6 @@ GET /test/_mget/
[[mget-source-filtering]]
=== Source filtering
added[1.0.0.Beta1]
By default, the `_source` field will be returned for every document (if stored).
Similar to the <<get-source-filtering,get>> API, you can retrieve only parts of
the `_source` (or not at all) by using the `_source` parameter. You can also use
@ -183,8 +181,6 @@ curl 'localhost:9200/_mget' -d '{
[float]
=== Generated fields
coming[1.4.0]
See <<generated-fields>> for fields that are generated only when indexing.
[float]

View File

@ -3,7 +3,7 @@
Multi termvectors API allows to get multiple termvectors at once. The
documents from which to retrieve the term vectors are specified by an index,
type and id. But the documents could also be artificially provided coming[1.4.0].
type and id. But the documents could also be artificially provided.
The response includes a `docs`
array with all the fetched termvectors, each element having the structure
provided by the <<docs-termvectors,termvectors>>
@ -92,7 +92,7 @@ curl 'localhost:9200/testidx/test/_mtermvectors' -d '{
}'
--------------------------------------------------
Additionally coming[1.4.0], just like for the <<docs-termvectors,termvectors>>
Additionally, just like for the <<docs-termvectors,termvectors>>
API, term vectors could be generated for user provided documents. The syntax
is similar to the <<search-percolate,percolator>> API. The mapping used is
determined by `_index` and `_type`.

View File

@ -1,11 +1,9 @@
[[docs-termvectors]]
== Term Vectors
added[1.0.0.Beta1]
Returns information and statistics on terms in the fields of a particular
document. The document could be stored in the index or artificially provided
by the user coming[1.4.0]. Note that for documents stored in the index, this
by the user. Note that for documents stored in the index, this
is a near realtime API as the term vectors are not available until the next
refresh.
@ -24,7 +22,7 @@ curl -XGET 'http://localhost:9200/twitter/tweet/1/_termvector?fields=text,...'
or by adding the requested fields in the request body (see
example below). Fields can also be specified with wildcards
in similar way to the <<query-dsl-multi-match-query,multi match query>> coming[1.4.0].
in a similar way to the <<query-dsl-multi-match-query,multi match query>>.
[float]
=== Return values
@ -45,8 +43,6 @@ If the requested information wasn't stored in the index, it will be
computed on the fly if possible. Additionally, term vectors could be computed
for documents not even existing in the index, but instead provided by the user.
coming[1.4.0,The ability to computed term vectors on the fly as well as support for artificial documents is only available from 1.4.0 onwards (see below example 2 and 3 respectively)]
[WARNING]
======
Start and end offsets assume UTF-16 encoding is being used. If you want to use
@ -232,7 +228,7 @@ Response:
--------------------------------------------------
[float]
=== Example 2 coming[1.4.0]
=== Example 2
Term vectors which are not explicitly stored in the index are automatically
computed on the fly. The following request returns all information and statistics for the
@ -251,7 +247,7 @@ curl -XGET 'http://localhost:9200/twitter/tweet/1/_termvector?pretty=true' -d '{
--------------------------------------------------
[float]
=== Example 3 coming[1.4.0]
=== Example 3
Additionally, term vectors can also be generated for artificial documents,
that is for documents not present in the index. The syntax is similar to the

View File

@ -145,8 +145,6 @@ curl -XPOST 'localhost:9200/test/type1/1/_update' -d '{
}
}'
--------------------------------------------------
coming[1.4.0]
If the document does not exist you may want your update script to
run anyway in order to initialize the document contents using
business logic unknown to the client. In this case pass the

View File

@ -100,7 +100,7 @@ settings API.
[[disk]]
=== Disk-based Shard Allocation
added[1.3.0] disk based shard allocation is enabled from version 1.3.0 onward
Disk-based shard allocation is enabled from version 1.3.0 onward.
Elasticsearch can be configured to prevent shard
allocation on nodes depending on disk usage for the node. This

View File

@ -28,8 +28,6 @@ example, can be set to `5m` for a 5 minute expiry.
[[circuit-breaker]]
=== Circuit Breaker
coming[1.4.0,Prior to 1.4.0 there was only a single circuit breaker for fielddata]
Elasticsearch contains multiple circuit breakers used to prevent operations from
causing an OutOfMemoryError. Each breaker specifies a limit for how much memory
it can use. Additionally, there is a parent-level breaker that specifies the
@ -59,18 +57,10 @@ parameters:
A constant that all field data estimations are multiplied with to determine a
final estimation. Defaults to 1.03
`indices.fielddata.breaker.limit`::
deprecated[1.4.0,Replaced by `indices.breaker.fielddata.limit`]
`indices.fielddata.breaker.overhead`::
deprecated[1.4.0,Replaced by `indices.breaker.fielddata.overhead`]
[float]
[[request-circuit-breaker]]
==== Request circuit breaker
coming[1.4.0]
The request circuit breaker allows Elasticsearch to prevent per-request data
structures (for example, memory used for calculating aggregations during a
request) from exceeding a certain amount of memory.
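For illustration, assuming the fielddata breaker limit mentioned above is
dynamically updatable, it could be adjusted through the cluster update
settings API (the value is illustrative):

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/_cluster/settings' -d '{
  "persistent": {
    "indices.breaker.fielddata.limit": "60%"
  }
}'
--------------------------------------------------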
@ -162,8 +152,6 @@ field data format.
[float]
==== Global ordinals
added[1.2.0]
Global ordinals is a data-structure on top of field data, that maintains an
incremental numbering for all the terms in field data in a lexicographic order.
Each term has a unique number and the number of term 'A' is lower than the number

View File

@ -1,8 +1,6 @@
[[index-modules-shard-query-cache]]
== Shard query cache
coming[1.4.0]
When a search request is run against an index or against many indices, each
involved shard executes the search locally and returns its local results to
the _coordinating node_, which combines these shard-level results into a

View File

@ -113,7 +113,7 @@ See <<vm-max-map-count>>
[[default_fs]]
[float]
==== Hybrid MMap / NIO FS added[1.3.0]
==== Hybrid MMap / NIO FS
The `default` type stores the shard index on the file system depending on
the file type by mapping a file into memory (mmap) or using Java NIO. Currently

View File

@ -74,8 +74,6 @@ the same index. The filter can be defined using Query DSL and is applied
to all Search, Count, Delete By Query and More Like This operations with
this alias.
coming[1.4.0,Fields referred to in alias filters must exist in the mappings of the index/indices pointed to by the alias]
To create a filtered alias, first we need to ensure that the fields already
exist in the mapping:
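A minimal sketch of the two steps (the index, type, field and alias names are
illustrative):

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/test1' -d '{
  "mappings": {
    "type1": {
      "properties": {
        "user": { "type": "string", "index": "not_analyzed" }
      }
    }
  }
}'

curl -XPOST 'localhost:9200/_aliases' -d '{
  "actions": [
    {
      "add": {
        "index":  "test1",
        "alias":  "alias1",
        "filter": { "term": { "user": "kimchy" } }
      }
    }
  ]
}'
--------------------------------------------------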
@ -242,8 +240,6 @@ curl -XPUT 'localhost:9200/users/_alias/user_12' -d '{
[[alias-index-creation]]
=== Aliases during index creation
added[1.1.0]
Aliases can also be specified during <<create-index-aliases,index creation>>:
[source,js]
@ -314,8 +310,6 @@ Possible options:
The rest endpoint is: `/{index}/_alias/{alias}`.
coming[1.4.0,The API will always include an `aliases` section, even if there aren't any aliases. Previous versions would not return the `aliases` section]
[float]
==== Examples:

View File

@ -10,7 +10,7 @@ $ curl -XPOST 'http://localhost:9200/twitter/_cache/clear'
--------------------------------------------------
The API, by default, will clear all caches. Specific caches can be cleaned
explicitly by setting `filter`, `fielddata`, `query_cache` coming[1.4.0],
explicitly by setting `filter`, `fielddata`, `query_cache`,
or `id_cache` to `true`.
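For example, to clear only the fielddata and filter caches of the `twitter`
index used in the example above:

[source,sh]
--------------------------------------------------
curl -XPOST 'localhost:9200/twitter/_cache/clear?fielddata=true&filter=true'
--------------------------------------------------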
All caches relating to a specific field(s) can also be cleared by

View File

@ -111,8 +111,6 @@ curl -XPUT localhost:9200/test -d '{
[[create-index-aliases]]
=== Aliases
added[1.1.0]
The create index API allows also to provide a set of <<indices-aliases,aliases>>:
[source,js]
@ -133,8 +131,6 @@ curl -XPUT localhost:9200/test -d '{
[float]
=== Creation Date
coming[1.4.0]
When an index is created, a timestamp is stored in the index metadata for the creation date. By
default this it is automatically generated but it can also be specified using the
`creation_date` parameter on the create index API:

View File

@ -23,7 +23,7 @@ The flush API accepts the following request parameters:
`wait_if_ongoing`:: If set to `true` the flush operation will block until the
flush can be executed if another flush operation is already executing.
The default is `false` and will cause an exception to be thrown on
the shard level if another flush operation is already running. coming[1.4.0]
the shard level if another flush operation is already running.
`full`:: If set to `true` a new index writer is created and settings that have
been changed related to the index writer will be refreshed. Note: if a full flush

View File

@ -29,8 +29,6 @@ curl -XGET 'http://localhost:9200/_all/_mapping/tweet,book'
If you want to get mappings of all indices and types then the following
two examples are equivalent:
coming[1.4.0,The API will always include a `mappings` section, even if there aren't any mappings. Previous versions would not return the `mappings` section]
[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/_all/_mapping'

View File

@ -38,7 +38,7 @@ to `true`. Note, a merge can potentially be a very heavy operation, so
it might make sense to run it set to `false`.
`force`:: Force a merge operation, even if there is a single segment in the
shard with no deletions. added[1.1.0]
shard with no deletions.
[float]
[[optimize-multi-index]]

View File

@ -43,7 +43,7 @@ specified as well in the URI. Those stats can be any of:
`fielddata`:: Fielddata statistics.
`flush`:: Flush statistics.
`merge`:: Merge statistics.
`query_cache`:: <<index-modules-shard-query-cache,Shard query cache>> statistics. coming[1.4.0]
`query_cache`:: <<index-modules-shard-query-cache,Shard query cache>> statistics.
`refresh`:: Refresh statistics.
`suggest`:: Suggest statistics.
`warmer`:: Warmer statistics.
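As a sketch, assuming the metric names above can be supplied as a
comma-separated path segment of the `_stats` endpoint:

[source,sh]
--------------------------------------------------
curl 'localhost:9200/_stats/merge,refresh,flush'
--------------------------------------------------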

View File

@ -27,8 +27,6 @@ Defines a template named template_1, with a template pattern of `te*`.
The settings and mappings will be applied to any index name that matches
the `te*` template.
added[1.1.0]
It is also possible to include aliases in an index template as follows:
[source,js]

View File

@ -111,8 +111,6 @@ settings API:
`index.routing.allocation.disable_replica_allocation`::
Disable replica allocation. Defaults to `false`. Deprecated in favour of `index.routing.allocation.enable`.
added[1.0.0.RC1]
`index.routing.allocation.enable`::
Enables shard allocation for a specific index. It can be set to:
* `all` (default) - Allows shard allocation for all shards.

View File

@ -66,8 +66,6 @@ curl -XPUT localhost:9200/_template/template_1 -d '
}'
--------------------------------------------------
coming[1.4.0]
On the same level as `types` and `source`, the `query_cache` flag is supported
to enable query caching for the warmed search request. If not specified, it will
use the index level configuration of query caching.
@ -142,8 +140,6 @@ where
Instead of `_warmer` you can also use the plural `_warmers`.
coming[1.4.0]
The `query_cache` parameter can be used to enable query caching for
the search request. If not specified, it will use the index level configuration
of query caching.
@ -182,8 +178,6 @@ Getting a warmer for specific index (or alias, or several indices) based
on its name. The provided name can be a simple wildcard expression or
omitted to get all warmers.
coming[1.4.0,The API will always include a `warmers` section, even if there aren't any warmers. Previous versions would not return the `warmers` section]
Some examples:
[source,js]

View File

@ -67,8 +67,6 @@ root and inner object types:
[float]
=== Unmapped fields in queries
coming[1.4.0]
Queries and filters can refer to fields which don't exist in a mapping, except
when registering a new <<search-percolate,percolator query>> or when creating
a <<filtered,filtered alias>>. In these two cases, any fields referred to in

View File

@ -17,8 +17,6 @@ include::fields/all-field.asciidoc[]
include::fields/analyzer-field.asciidoc[]
include::fields/boost-field.asciidoc[]
include::fields/parent-field.asciidoc[]
include::fields/field-names-field.asciidoc[]

View File

@ -1,72 +0,0 @@
[[mapping-boost-field]]
=== `_boost`
deprecated[1.0.0.RC1,See <<function-score-instead-of-boost>>]
Boosting is the process of enhancing the relevancy of a document or
field. Field level mapping allows to define an explicit boost level on a
specific field. The boost field mapping (applied on the
<<mapping-root-object-type,root object>>) allows
to define a boost field mapping where *its content will control the
boost level of the document*. For example, consider the following
mapping:
[source,js]
--------------------------------------------------
{
"tweet" : {
"_boost" : {"name" : "my_boost", "null_value" : 1.0}
}
}
--------------------------------------------------
The above mapping defines a mapping for a field named `my_boost`. If the
`my_boost` field exists within the JSON document indexed, its value will
control the boost level of the document indexed. For example, the
following JSON document will be indexed with a boost value of `2.2`:
[source,js]
--------------------------------------------------
{
"my_boost" : 2.2,
"message" : "This is a tweet!"
}
--------------------------------------------------
[[function-score-instead-of-boost]]
==== Function score instead of boost
Support for document boosting via the `_boost` field has been removed
from Lucene and is deprecated in Elasticsearch as of v1.0.0.RC1. The
implementation in Lucene resulted in unpredictable result when
used with multiple fields or multi-value fields.
Instead, the <<query-dsl-function-score-query>> can be used to achieve
the desired functionality by boosting each document by the value in
any field of the document:
[source,js]
--------------------------------------------------
{
"query": {
"function_score": {
"query": { <1>
"match": {
"title": "your main query"
}
},
"functions": [{
"field_value_factor": { <2>
"field": "my_boost_field"
}
}],
"score_mode": "multiply"
}
}
}
--------------------------------------------------
<1> The original query, now wrapped in a `function_score` query.
<2> This function returns the value in `my_boost_field`, which is then
multiplied by the query `_score` for each document.
Note, that `field_value_factor` is a 1.2.x feature.

View File

@ -1,8 +1,6 @@
[[mapping-field-names-field]]
=== `_field_names`
added[1.3.0]
The `_field_names` field indexes the field names of a document, which can later
be used to search for documents based on the fields that they contain typically
using the `exists` and `missing` filters.
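For example, a filtered search for documents that actually contain a `user`
field might look like this (the index name is illustrative):

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/my_index/_search' -d '{
  "query": {
    "filtered": {
      "filter": {
        "exists": { "field": "user" }
      }
    }
  }
}'
--------------------------------------------------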

View File

@ -1,7 +1,5 @@
[[mapping-transform]]
== Transform
added[1.3.0]
The document can be transformed before it is indexed by registering a
script in the `transform` element of the mapping. The result of the
transform is indexed but the original source is stored in the `_source`

View File

@ -181,7 +181,6 @@ you don't need scoring on a specific field, it is highly recommended to disable
norms on it. In particular, this is the case for fields that are used solely
for filtering or aggregations.
added[1.2.0]
In case you would like to disable norms after the fact, it is possible to do so
by using the <<indices-put-mapping,PUT mapping API>>. Please however note that
norms won't be removed instantly, but as your index will receive new insertions
@ -556,8 +555,6 @@ The following Similarities are configured out-of-box:
[float]
===== Copy to field
added[1.0.0.RC2]
Adding `copy_to` parameter to any field mapping will cause all values of this field to be copied to fields specified in
the parameter. In the following example all values from fields `title` and `abstract` will be copied to the field
`meta_data`.
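A minimal mapping sketch for the `title`/`abstract`/`meta_data` case described
above (the index and type names are illustrative):

[source,js]
--------------------------------------------------
curl -XPUT 'localhost:9200/my_index' -d '{
  "mappings": {
    "book": {
      "properties": {
        "title":     { "type": "string", "copy_to": "meta_data" },
        "abstract":  { "type": "string", "copy_to": "meta_data" },
        "meta_data": { "type": "string" }
      }
    }
  }
}'
--------------------------------------------------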
@ -590,8 +587,6 @@ Multiple fields are also supported:
[float]
===== Multi fields
added[1.0.0.RC1]
The `fields` option allows to map several core type fields into a single
json source field. This can be useful if a single field needs to be
used in different ways. For example a single field is to be used for both

View File

@ -38,23 +38,13 @@ The following settings may be used:
`cluster.routing.allocation.enable`::
Controls shard allocation for all indices, by allowing specific
kinds of shard to be allocated.
added[1.0.0.RC1,Replaces `cluster.routing.allocation.disable*`]
Can be set to:
* `all` (default) - Allows shard allocation for all kinds of shards.
* `primaries` - Allows shard allocation only for primary shards.
* `new_primaries` - Allows shard allocation only for primary shards for new indices.
* `none` - No shard allocations of any kind are allowed for all indices.
`cluster.routing.allocation.disable_new_allocation`::
deprecated[1.0.0.RC1,Replaced by `cluster.routing.allocation.enable`]
`cluster.routing.allocation.disable_allocation`::
deprecated[1.0.0.RC1,Replaced by `cluster.routing.allocation.enable`]
`cluster.routing.allocation.disable_replica_allocation`::
deprecated[1.0.0.RC1,Replaced by `cluster.routing.allocation.enable`]
`cluster.routing.allocation.same_shard.host`::
Allows to perform a check to prevent allocation of multiple instances
of the same shard on a single host, based on host name and host address.

View File

@ -75,8 +75,6 @@ configure the election to handle cases of slow or congested networks
(higher values assure less chance of failure). Once a node joins, it
will send a join request to the master (`discovery.zen.join_timeout`)
with a timeout defaulting at 20 times the ping timeout.
added[1.3.0,Previously defaulted to 10 times the ping timeout].
Nodes can be excluded from becoming a master by setting `node.master` to
`false`. Note, once a node is a client node (`node.client` set to
`true`), it will not be allowed to become a master (`node.master` is
@ -161,5 +159,4 @@ updates its own cluster state and replies to the master node, which waits for
all nodes to respond, up to a timeout, before going ahead processing the next
updates in the queue. The `discovery.zen.publish_timeout` is set by default
to 30 seconds and can be changed dynamically through the
<<cluster-update-settings,cluster update settings api>> added[1.1.0, The
setting existed before but wasn't dynamic].
<<cluster-update-settings,cluster update settings api>>.

View File

@ -43,8 +43,7 @@ once all `gateway.recover_after...nodes` conditions are met.
The `gateway.expected_nodes` allows to set how many data and master
eligible nodes are expected to be in the cluster, and once met, the
`gateway.recover_after_time` is ignored and recovery starts.
Setting `gateway.expected_nodes` also defaults `gateway.recovery_after_time` to `5m` added[1.3.0, before `expected_nodes`
required `recovery_after_time` to be set]. The `gateway.expected_data_nodes` and `gateway.expected_master_nodes`
Setting `gateway.expected_nodes` also defaults `gateway.recovery_after_time` to `5m`. The `gateway.expected_data_nodes` and `gateway.expected_master_nodes`
settings are also supported. For example setting:
[source,js]

View File

@ -72,10 +72,10 @@ share the following allowed settings:
|=======================================================================
|Setting |Description
|`network.tcp.no_delay` |Enable or disable tcp no delay setting.
Defaults to `true`. coming[1.4,Can be set to `default` to not be set at all.]
Defaults to `true`.
|`network.tcp.keep_alive` |Enable or disable tcp keep alive. Defaults
to `true`. coming[1.4,Can be set to `default` to not be set at all].
to `true`.
|`network.tcp.reuse_address` |Should an address be reused or not.
Defaults to `true` on non-windows machines.

View File

@ -165,8 +165,6 @@ bin/plugin --install mobz/elasticsearch-head
[float]
==== Lucene version dependent plugins
added[1.2.0]
For some plugins, such as analysis plugins, a specific major Lucene version is
required to run. In that case, the plugin provides in its `es-plugin.properties`
file the Lucene version for which the plugin was built for.

View File

@ -6,10 +6,6 @@ expressions. For example, scripts can be used to return "script fields"
as part of a search request, or can be used to evaluate a custom score
for a query and so on.
deprecated[1.3.0,Mvel has been deprecated and will be removed in 1.4.0]
added[1.3.0,Groovy scripting support]
The scripting module uses by default http://groovy.codehaus.org/[groovy]
(previously http://mvel.codehaus.org/[mvel] in 1.3.x and earlier) as the
scripting language with some extensions. Groovy is used since it is extremely
@ -23,8 +19,6 @@ All places where a `script` parameter can be used, a `lang` parameter
script. The `lang` options are `groovy`, `js`, `mvel`, `python`,
`expression` and `native`.
added[1.2.0, Dynamic scripting is disabled for non-sandboxed languages by default since version 1.2.0]
To increase security, Elasticsearch does not allow you to specify scripts for
non-sandboxed languages with a request. Instead, scripts must be placed in the
`scripts` directory inside the configuration directory (the directory where

View File

@ -189,7 +189,7 @@ should be restored as well as prevent global cluster state from being restored b
<<search-multi-index-type,multi index syntax>>. The `rename_pattern` and `rename_replacement` options can be also used to
rename index on restore using regular expression that supports referencing the original text as explained
http://docs.oracle.com/javase/6/docs/api/java/util/regex/Matcher.html#appendReplacement(java.lang.StringBuffer,%20java.lang.String)[here].
Set `include_aliases` to `false` to prevent aliases from being restored together with associated indices added[1.3.0].
Set `include_aliases` to `false` to prevent aliases from being restored together with associated indices.
[source,js]
-----------------------------------
@ -211,8 +211,6 @@ persistent settings are added to the existing persistent settings.
[float]
=== Partial restore
added[1.3.0]
By default, the entire restore operation will fail if one or more indices participating in the operation don't have
snapshots of all shards available. It can occur if some shards failed to snapshot for example. It is still possible to
restore such indices by setting `partial` to `true`. Please note, that only successfully snapshotted shards will be
@ -222,8 +220,6 @@ restored in this case and all missing shards will be recreated empty.
[float]
=== Snapshot status
added[1.1.0]
A list of currently running snapshots with their detailed status information can be obtained using the following command:
[source,shell]

View File

@ -56,8 +56,6 @@ tribe:
metadata: true
--------------------------------
added[1.2.0]
The tribe node can also configure blocks on indices explicitly:
[source,yaml]
@ -67,8 +65,6 @@ tribe:
indices.write: hk*,ldn*
--------------------------------
added[1.2.0]
When there is a conflict and multiple clusters hold the same index, by default
the tribe node will pick one of them. This can be configured using the `tribe.on_conflict`
setting. It defaults to `any`, but can be set to `drop` (drop indices that have

View File

@ -64,8 +64,6 @@ next to the given cell.
[float]
==== Caching
added[1.3.0]
The result of the filter is not cached by default. The
`_cache` parameter can be set to `true` to turn caching on.
By default the filter uses the resulting geohash cells as a cache key.

View File

@ -45,8 +45,6 @@ The `has_child` filter also accepts a filter instead of a query:
[float]
==== Min/Max Children
added[1.3.0]
The `has_child` filter allows you to specify that a minimum and/or maximum
number of children are required to match for the parent doc to be considered
a match:
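A hedged sketch, assuming the limits are passed as `min_children` and
`max_children` (the child type, field and values are illustrative):

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/my_index/_search' -d '{
  "query": {
    "filtered": {
      "filter": {
        "has_child": {
          "type":         "comment",
          "min_children": 2,
          "max_children": 10,
          "filter":       { "term": { "approved": true } }
        }
      }
    }
  }
}'
--------------------------------------------------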

View File

@ -30,8 +30,6 @@ The `range` filter accepts the following parameters:
`lte`:: Less-than or equal to
`lt`:: Less-than
coming[1.4.0]
When applied on `date` fields the `range` filter accepts also a `time_zone` parameter.
The `time_zone` parameter will be applied to your input lower and upper bounds and will
move them to UTC time based date:
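A sketch of such a filter on a hypothetical `born` date field:

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/my_index/_search' -d '{
  "query": {
    "filtered": {
      "filter": {
        "range": {
          "born": {
            "gte":       "2012-01-01",
            "lte":       "2012-12-31",
            "time_zone": "+01:00"
          }
        }
      }
    }
  }
}'
--------------------------------------------------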

View File

@ -6,11 +6,6 @@ retrieved by a query. This can be useful if, for example, a score
function is computationally expensive and it is sufficient to compute
the score on a filtered set of documents.
`function_score` provides the same functionality that
`custom_boost_factor`, `custom_score` and
`custom_filters_score` provided
but with additional capabilities such as distance and recency scoring (see description below).
==== Using function score
To use `function_score`, the user has to define a query and one or
@ -73,7 +68,7 @@ First, each document is scored by the defined functions. The parameter
`max`:: maximum score is used
`min`:: minimum score is used
Because scores can be on different scales (for example, between 0 and 1 for decay functions but arbitrary for `field_value_factor`) and also because sometimes a different impact of functions on the score is desirable, the score of each function can be adjusted with a user defined `weight` (coming[1.4.0]). The `weight` can be defined per function in the `functions` array (example above) and is multiplied with the score computed by the respective function.
Because scores can be on different scales (for example, between 0 and 1 for decay functions but arbitrary for `field_value_factor`) and also because sometimes a different impact of functions on the score is desirable, the score of each function can be adjusted with a user defined `weight`. The `weight` can be defined per function in the `functions` array (example above) and is multiplied with the score computed by the respective function.
If weight is given without any other function declaration, `weight` acts as a function that simply returns the `weight`.
The new score can be restricted to not exceed a certain limit by setting
@ -135,8 +130,6 @@ you wish to inhibit this, set `"boost_mode": "replace"`
===== Weight
coming[1.4.0]
The `weight` score allows you to multiply the score by the provided
`weight`. This can sometimes be desired since boost value set on
specific queries gets normalized, while for this score function it does
@ -147,13 +140,6 @@ not.
"weight" : number
--------------------------------------------------
===== Boost factor
deprecated[1.4.0]
Same as `weight`. Use `weight` instead.
===== Random
The `random_score` generates scores using a hash of the `_uid` field,
@ -172,8 +158,6 @@ be a memory intensive operation since the values are unique.
===== Field Value factor
added[1.2.0]
The `field_value_factor` function allows you to use a field from a document to
influence the score. It's similar to using the `script_score` function, however,
it avoids the overhead of scripting. If used on a multi-valued field, only the
@ -489,105 +473,3 @@ are supported.
If the numeric field is missing in the document, the function will
return 1.
==== Relation to `custom_boost`, `custom_score` and `custom_filters_score`
The `custom_boost_factor` query
[source,js]
--------------------------------------------------
"custom_boost_factor": {
"boost_factor": 5.2,
"query": {...}
}
--------------------------------------------------
becomes
[source,js]
--------------------------------------------------
"function_score": {
"weight": 5.2,
"query": {...}
}
--------------------------------------------------
The `custom_score` query
[source,js]
--------------------------------------------------
"custom_score": {
"params": {
"param1": 2,
"param2": 3.1
},
"query": {...},
"script": "_score * doc['my_numeric_field'].value / pow(param1, param2)"
}
--------------------------------------------------
becomes
[source,js]
--------------------------------------------------
"function_score": {
"boost_mode": "replace",
"query": {...},
"script_score": {
"params": {
"param1": 2,
"param2": 3.1
},
"script": "_score * doc['my_numeric_field'].value / pow(param1, param2)"
}
}
--------------------------------------------------
and the `custom_filters_score`
[source,js]
--------------------------------------------------
"custom_filters_score": {
"filters": [
{
"boost": "3",
"filter": {...}
},
{
"filter": {...},
"script": "_score * doc['my_numeric_field'].value / pow(param1, param2)"
}
],
"params": {
"param1": 2,
"param2": 3.1
},
"query": {...},
"score_mode": "first"
}
--------------------------------------------------
becomes:
[source,js]
--------------------------------------------------
"function_score": {
"functions": [
{
"weight": "3",
"filter": {...}
},
{
"filter": {...},
"script_score": {
"params": {
"param1": 2,
"param2": 3.1
},
"script": "_score * doc['my_numeric_field'].value / pow(param1, param2)"
}
}
],
"query": {...},
"score_mode": "first"
}
--------------------------------------------------

View File

@ -56,8 +56,6 @@ inside the `has_child` query:
[float]
==== Min/Max Children
added[1.3.0]
The `has_child` query allows you to specify that a minimum and/or maximum
number of children are required to match for the parent doc to be considered
a match:

View File

@ -45,21 +45,10 @@ Individual fields can be boosted with the caret (`^`) notation:
--------------------------------------------------
<1> The `subject` field is three times as important as the `message` field.
[float]
=== `use_dis_max`
deprecated[1.1.0,Use `type:best_fields` or `type:most_fields` instead. See <<multi-match-types>>]
By default, the `multi_match` query generates a `match` clause per field, then wraps them
in a `dis_max` query. By setting `use_dis_max` to `false`, they will be wrapped in a
`bool` query instead.
[[multi-match-types]]
[float]
=== Types of `multi_match` query:
added[1.1.0]
The way the `multi_match` query is executed internally depends on the `type`
parameter, which can be set to:

View File

@ -72,7 +72,7 @@ both>>.
|`lenient` |If set to `true` will cause format based failures (like
providing text to a numeric field) to be ignored.
|`locale` | added[1.1.0] Locale that should be used for string conversions.
|`locale` | Locale that should be used for string conversions.
Defaults to `ROOT`.
|=======================================================================

View File

@ -29,8 +29,6 @@ The `range` query accepts the following parameters:
`lt`:: Less-than
`boost`:: Sets the boost value of the query, defaults to `1.0`
coming[1.4.0]
When applied on `date` fields the `range` filter accepts also a `time_zone` parameter.
The `time_zone` parameter will be applied to your input lower and upper bounds and will
move them to UTC time based date:

View File

@ -44,10 +44,10 @@ enable. Defaults to `ALL`.
be automatically lower-cased or not (since they are not analyzed). Defaults to
true.
|`locale` | added[1.1.0] Locale that should be used for string conversions.
|`locale` | Locale that should be used for string conversions.
Defaults to `ROOT`.
|`lenient` | added[1.1.0] If set to `true` will cause format based failures
|`lenient` | If set to `true` will cause format based failures
(like providing text to a numeric field) to be ignored.
|=======================================================================

View File

@ -1,8 +1,6 @@
[[query-dsl-template-query]]
=== Template Query
added[1.1.0]
A query that accepts a query template and a map of key/value pairs to fill in
template parameters.
@ -95,8 +93,6 @@ which is then turned into:
}
------------------------------------------
added[1.3.0]
You can register a template by storing it in the elasticsearch index `.scripts` or by using the REST API. (See <<search-template>> for more details)
In order to execute the stored template, reference it by name in the `query`
parameter:

View File

@ -118,8 +118,6 @@ define fixed number of multiple buckets, and others dynamically create the bucke
[float]
=== Caching heavy aggregations
coming[1.4.0]
Frequently used aggregations (e.g. for display on the home page of a website)
can be cached for faster responses. These cached results are the same results
that would be returned by an uncached aggregation -- you will never get stale

View File

@ -1,8 +1,6 @@
[[search-aggregations-bucket-children-aggregation]]
=== Children Aggregation
coming[1.4.0]
A special single bucket aggregation that enables aggregating from buckets on parent document types to buckets on child documents.
This aggregation relies on the <<mapping-parent-field,_parent field>> in the mapping. This aggregation has a single option:
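A minimal sketch, assuming the single option is the child `type` to aggregate
on (the index and type names are illustrative):

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/my_index/_search' -d '{
  "aggs": {
    "to-answers": {
      "children": {
        "type": "answer"
      }
    }
  }
}'
--------------------------------------------------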

View File

@ -1,8 +1,6 @@
[[search-aggregations-bucket-filters-aggregation]]
=== Filters Aggregation
coming[1.4.0]
Defines a multi bucket aggregation where each bucket is associated with a
filter. Each bucket will collect all documents that match its associated
filter.
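A minimal sketch with two named filter buckets (the index, field and values
are illustrative):

[source,js]
--------------------------------------------------
curl -XGET 'localhost:9200/logs/_search' -d '{
  "aggs": {
    "messages": {
      "filters": {
        "filters": {
          "errors":   { "term": { "body": "error" } },
          "warnings": { "term": { "body": "warning" } }
        }
      }
    }
  }
}'
--------------------------------------------------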

View File

@ -117,7 +117,7 @@ precision:: Optional. The string length of the geohashes used to define
size:: Optional. The maximum number of geohash buckets to return
(defaults to 10,000). When results are trimmed, buckets are
prioritised based on the volumes of documents they contain.
added[1.1.0] A value of `0` will return all buckets that
A value of `0` will return all buckets that
contain a hit, use with caution as this could use a lot of CPU
and network bandwith if there are many buckets.
@ -126,6 +126,6 @@ shard_size:: Optional. To allow for more accurate counting of the top cells
returning `max(10,(size x number-of-shards))` buckets from each
shard. If this heuristic is undesirable, the number considered
from each shard can be over-ridden using this parameter.
added[1.1.0] A value of `0` makes the shard size unlimited.
A value of `0` makes the shard size unlimited.

View File

@ -1,8 +1,6 @@
[[search-aggregations-bucket-reverse-nested-aggregation]]
=== Reverse nested Aggregation
added[1.2.0]
A special single bucket aggregation that enables aggregating on parent docs from nested documents. Effectively this
aggregation can break out of the nested block structure and link to other nested structures or the root document,
which allows nesting other aggregations that aren't part of the nested object in a nested aggregation.

View File

@ -10,8 +10,6 @@ This feature is marked as experimental, and may be subject to change in the
future. If you use this feature, please let us know your experience with it!
=====
added[1.1.0]
.Example use cases:
* Suggesting "H5N1" when users search for "bird flu" in text
@ -231,8 +229,6 @@ are presented unstemmed, highlighted, with the right case, in the right order an
============
==== Custom background sets
added[1.2.0]
Ordinarily, the foreground set of documents is "diffed" against a background set of all the documents in your index.
However, sometimes it may prove useful to use a narrower background set as the basis for comparisons.
@ -284,8 +280,6 @@ However, the `size` and `shard size` settings covered in the next section provid
The scores are derived from the doc frequencies in _foreground_ and _background_ sets. The _absolute_ change in popularity (foregroundPercent - backgroundPercent) would favor common terms whereas the _relative_ change in popularity (foregroundPercent/ backgroundPercent) would favor rare terms. Rare vs common is essentially a precision vs recall balance and so the absolute and relative changes are multiplied to provide a sweet spot between precision and recall.
===== mutual information
added[1.3.0]
Mutual information as described in "Information Retrieval", Manning et al., Chapter 13.5.1 can be used as significance score by adding the parameter
[source,js]
@ -308,8 +302,6 @@ Per default, the assumption is that the documents in the bucket are also contain
===== Chi square
coming[1.4.0]
Chi square as described in "Information Retrieval", Manning et al., Chapter 13.5.2 can be used as significance score by adding the parameter
[source,js]
@ -323,8 +315,6 @@ Chi square behaves like mutual information and can be configured with the same p
===== google normalized distance
coming[1.4.0]
Google normalized distance as described in "The Google Similarity Distance", Cilibrasi and Vitanyi, 2007 (http://arxiv.org/pdf/cs/0412098v3.pdf) can be used as significance score by adding the parameter
[source,js]
@ -355,7 +345,7 @@ If the number of unique terms is greater than `size`, the returned list can be s
(it could be that the term counts are slightly off and it could even be that a term that should have been in the top
size buckets was not returned).
added[1.2.0] If set to `0`, the `size` will be set to `Integer.MAX_VALUE`.
If set to `0`, the `size` will be set to `Integer.MAX_VALUE`.
To ensure better accuracy a multiple of the final `size` is used as the number of terms to request from each shard
using a heuristic based on the number of shards. To take manual control of this setting the `shard_size` parameter
@ -368,7 +358,7 @@ a consolidated review by the reducing node before the final selection. Obviously
will cause extra network traffic and RAM usage so this is quality/cost trade off that needs to be balanced. If `shard_size` is set to -1 (the default) then `shard_size` will be automatically estimated based on the number of shards and the `size` parameter.
added[1.2.0] If set to `0`, the `shard_size` will be set to `Integer.MAX_VALUE`.
If set to `0`, the `shard_size` will be set to `Integer.MAX_VALUE`.
NOTE: `shard_size` cannot be smaller than `size` (as it doesn't make much sense). When it is, elasticsearch will
@ -399,7 +389,7 @@ The above aggregation would only return tags which have been found in 10 hits or
Terms that score highly will be collected on a shard level and merged with the terms collected from other shards in a second step. However, the shard does not have the information about the global term frequencies available. The decision if a term is added to a candidate list depends only on the score computed on the shard using local shard frequencies, not the global frequencies of the word. The `min_doc_count` criterion is only applied after merging local terms statistics of all shards. In a way the decision to add the term as a candidate is made without being very _certain_ about if the term will actually reach the required `min_doc_count`. This might cause many (globally) high frequent terms to be missing in the final result if low frequent but high scoring terms populated the candidate lists. To avoid this, the `shard_size` parameter can be increased to allow more candidate terms on the shards. However, this increases memory consumption and network traffic.
added[1.2.0] `shard_min_doc_count` parameter
`shard_min_doc_count` parameter
The parameter `shard_min_doc_count` regulates the _certainty_ a shard has if the term should actually be added to the candidate list or not with respect to the `min_doc_count`. Terms will only be considered if their local shard frequency within the set is higher than the `shard_min_doc_count`. If your dictionary contains many low frequent words and you are not interested in these (for example misspellings), then you can set the `shard_min_doc_count` parameter to filter out candidate terms on a shard level that will with a reasonable certainty not reach the required `min_doc_count` even after merging the local frequencies. `shard_min_doc_count` is set to `1` per default and has no effect unless you explicitly set it.

View File

@ -54,19 +54,19 @@ size buckets was not returned). If set to `0`, the `size` will be set to `Intege
==== Document counts are approximate
As described above, the document counts (and the results of any sub aggregations) in the terms aggregation are not always
accurate. This is because each shard provides its own view of what the ordered list of terms should be and these are
combined to give a final view. Consider the following scenario:
A request is made to obtain the top 5 terms in the field product, ordered by descending document count from an index with
3 shards. In this case each shard is asked to give its top 5 terms.
[source,js]
--------------------------------------------------
{
"aggs" : {
"products" : {
"terms" : {
"terms" : {
"field" : "product",
"size" : 5
}
@ -75,23 +75,23 @@ A request is made to obtain the top 5 terms in the field product, ordered by des
}
--------------------------------------------------
The terms for each of the three shards are shown below with their
respective document counts in brackets:
[width="100%",cols="^2,^2,^2,^2",options="header"]
|=========================================================
| | Shard A | Shard B | Shard C
| 1 | Product A (25) | Product A (30) | Product A (45)
| 2 | Product B (18) | Product B (25) | Product C (44)
| 3 | Product C (6) | Product F (17) | Product Z (36)
| 4 | Product D (3) | Product Z (16) | Product G (30)
| 5 | Product E (2) | Product G (15) | Product E (29)
| 6 | Product F (2) | Product H (14) | Product H (28)
| 7 | Product G (2) | Product I (10) | Product Q (2)
| 8 | Product H (2) | Product Q (6) | Product D (1)
| 9 | Product I (1) | Product J (8) |
| 10 | Product J (1) | Product C (4) |
|=========================================================
@ -102,41 +102,41 @@ The shards will return their top 5 terms so the results from the shards will be:
|=========================================================
| | Shard A | Shard B | Shard C
| 1 | Product A (25) | Product A (30) | Product A (45)
| 2 | Product B (18) | Product B (25) | Product C (44)
| 3 | Product C (6) | Product F (17) | Product Z (36)
| 4 | Product D (3) | Product Z (16) | Product G (30)
| 5 | Product E (2) | Product G (15) | Product E (29)
|=========================================================
Taking the top 5 results from each of the shards (as requested) and combining them to make a final top 5 list produces
the following:
[width="40%",cols="^2,^2"]
|=========================================================
| 1 | Product A (100)
| 2 | Product Z (52)
| 3 | Product C (50)
| 4 | Product G (45)
| 5 | Product B (43)
|=========================================================
Because Product A was returned from all shards we know that its document count value is accurate. Product C was only
returned by shards A and C so its document count is shown as 50 but this is not an accurate count. Product C exists on
shard B, but its count of 4 was not high enough to put Product C into the top 5 list for that shard. Product Z was also
returned only by 2 shards but the third shard does not contain the term. There is no way of knowing, at the point of
combining the results to produce the final list of terms, that there is an error in the document count for Product C and
not for Product Z. Product H has a document count of 44 across all 3 shards but was not included in the final list of
terms because it did not make it into the top five terms on any of the shards.
==== Shard Size
The higher the requested `size` is, the more accurate the results will be, but also, the more expensive it will be to
compute the final results (both due to bigger priority queues that are managed on a shard level and due to bigger data
transfers between the nodes and the client).
The `shard_size` parameter can be used to minimize the extra work that comes with bigger requested `size`. When defined,
it will determine how many terms the coordinating node will request from each shard. Once all the shards responded, the
@ -148,17 +148,15 @@ the client. If set to `0`, the `shard_size` will be set to `Integer.MAX_VALUE`.
NOTE: `shard_size` cannot be smaller than `size` (as it doesn't make much sense). When it is, elasticsearch will
override it and reset it to be equal to `size`.
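As an illustrative sketch only (the `product` field name is an assumption, not taken from the examples above), a request for the top 5 terms that asks each shard for its top 20 candidates could look like this:

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "products" : {
            "terms" : {
                "field" : "product",
                "size" : 5,
                "shard_size" : 20
            }
        }
    }
}
--------------------------------------------------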
It is possible to not limit the number of terms that are returned by setting `size` to `0`. Don't use this
on high-cardinality fields as this will kill both your CPU, since the terms need to be returned sorted, and your network.
==== Calculating Document Count Error
There are two error values which can be shown on the terms aggregation. The first gives a value for the aggregation as
a whole which represents the maximum potential document count for a term which did not make it into the final list of
terms. This is calculated as the sum of the document count from the last term returned from each shard. For the example
given above the value would be 46 (2 + 15 + 29). This means that in the worst case scenario a term which was not returned
could have the 4th highest document count.
[source,js]
@ -185,13 +183,13 @@ could have the 4th highest document count.
}
--------------------------------------------------
The second error value can be enabled by setting the `show_term_doc_count_error` parameter to true. This shows an error value
for each term returned by the aggregation which represents the 'worst case' error in the document count and can be useful when
deciding on a value for the `shard_size` parameter. This is calculated by summing the document counts for the last term returned
by all shards which did not return the term. In the example above the error in the document count for Product C would be 15 as
Shard B was the only shard not to return the term and the document count of the last term it did return was 15. The actual document
count of Product C was 54 so the document count was only actually off by 4 even though the worst case was that it would be off by
15. Product A, however, has an error of 0 for its document count; since every shard returned it, we can be confident that the count
returned is accurate.
[source,js]
@ -220,10 +218,10 @@ returned is accurate.
}
--------------------------------------------------
These errors can only be calculated in this way when the terms are ordered by descending document count. When the aggregation is
ordered by the terms values themselves (either ascending or descending) there is no error in the document count since if a shard
does not return a particular term which appears in the results from another shard, it must not have that term in its index. When the
aggregation is either sorted by a sub aggregation or in order of ascending document count, the error in the document counts cannot be
determined and is given a value of -1 to indicate this.
==== Order
@ -342,8 +340,6 @@ PATH := <AGG_NAME>[<AGG_SEPARATOR><AGG_NAME>]*[<METRIC_SEPARATOR
The above will sort the countries buckets based on the average height among the female population.
Multiple criteria can be used to order the buckets by providing an array of order criteria such as the following:
[source,js]
@ -368,13 +364,13 @@ Multiple criteria can be used to order the buckets by providing an array of orde
}
--------------------------------------------------
The above will sort the countries buckets based on the average height among the female population and then by
their `doc_count` in descending order.
NOTE: In the event that two buckets share the same values for all order criteria the bucket's term value is used as a
tie-breaker in ascending alphabetical order to prevent non-deterministic ordering of buckets.
==== Minimum document count
It is possible to only return terms that match more than a configured number of hits using the `min_doc_count` option:
@ -397,7 +393,7 @@ The above aggregation would only return tags which have been found in 10 hits or
Terms are collected and ordered on a shard level and merged with the terms collected from other shards in a second step. However, the shard does not have the information about the global document count available. The decision whether a term is added to a candidate list depends only on the order computed on the shard using local shard frequencies. The `min_doc_count` criterion is only applied after merging the local term statistics of all shards. In a way the decision to add the term as a candidate is made without being very _certain_ about whether the term will actually reach the required `min_doc_count`. This might cause many (globally) high-frequency terms to be missing from the final result if low-frequency terms populated the candidate lists. To avoid this, the `shard_size` parameter can be increased to allow more candidate terms on the shards. However, this increases memory consumption and network traffic.
`shard_min_doc_count` parameter
The parameter `shard_min_doc_count` regulates the _certainty_ a shard has about whether the term should actually be added to the candidate list with respect to the `min_doc_count`. Terms will only be considered if their local shard frequency within the set is higher than the `shard_min_doc_count`. If your dictionary contains many low-frequency terms and you are not interested in those (for example misspellings), then you can set the `shard_min_doc_count` parameter to filter out candidate terms on a shard level that will, with reasonable certainty, not reach the required `min_doc_count` even after merging the local counts. `shard_min_doc_count` defaults to `0` and has no effect unless you explicitly set it.
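As a rough sketch of how the two thresholds combine (the `tags` field and the threshold values are assumptions):

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "tags" : {
            "terms" : {
                "field" : "tags",
                "min_doc_count" : 10,
                "shard_min_doc_count" : 5
            }
        }
    }
}
--------------------------------------------------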
@ -530,7 +526,7 @@ strings that represent the terms as they are found in the index:
}
}
}
--------------------------------------------------
==== Multi-field terms aggregation
@ -561,12 +557,12 @@ this single field, which will benefit from the global ordinals optimization.
==== Collect mode
Deferring calculation of child aggregations
For fields with many unique terms and a small number of required results it can be more efficient to delay the calculation
of child aggregations until the top parent-level aggs have been pruned. Ordinarily, all branches of the aggregation tree
are expanded in one depth-first pass and only then any pruning occurs. In some rare scenarios this can be very wasteful and can hit memory constraints.
An example problem scenario is querying a movie database for the 10 most popular actors and their 5 most common co-stars:
[source,js]
--------------------------------------------------
@ -590,11 +586,11 @@ An example problem scenario is querying a movie database for the 10 most popular
}
--------------------------------------------------
Even though the number of movies may be comparatively small and we want only 50 result buckets, there is a combinatorial explosion of buckets
during calculation - a single movie will produce n² buckets where n is the number of actors. The sane option would be to first determine
the 10 most popular actors and only then examine the top co-stars for these 10 actors. This alternative strategy is what we call the `breadth_first` collection
mode as opposed to the default `depth_first` mode:
[source,js]
--------------------------------------------------
{
@ -620,24 +616,20 @@ mode as opposed to the default `depth_first` mode:
When using `breadth_first` mode the set of documents that fall into the uppermost buckets is
cached for subsequent replay so there is a memory overhead in doing this which is linear with the number of matching documents.
In most requests the volume of buckets generated is smaller than the number of documents that fall into them so the default `depth_first`
collection mode is normally the best bet but occasionally the `breadth_first` strategy can be significantly more efficient. Currently
elasticsearch will always use the `depth_first` collect_mode unless explicitly instructed to use `breadth_first` as in the above example.
Note that the `order` parameter can still be used to refer to data from a child aggregation when using the `breadth_first` setting - the parent
aggregation understands that this child aggregation will need to be called first before any of the other child aggregations.
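A hedged sketch of combining `collect_mode` with an `order` that refers to a child aggregation (the `actors` and `year` field names are assumptions):

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "actors" : {
            "terms" : {
                "field" : "actors",
                "size" : 10,
                "collect_mode" : "breadth_first",
                "order" : { "latest_movie" : "desc" }
            },
            "aggs" : {
                "latest_movie" : { "max" : { "field" : "year" } }
            }
        }
    }
}
--------------------------------------------------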
WARNING: It is not possible to nest aggregations such as `top_hits` which require access to match score information under an aggregation that uses
the `breadth_first` collection mode. This is because this would require a RAM buffer to hold the float score value for every document and
this would typically be too costly in terms of RAM.
[[search-aggregations-bucket-terms-aggregation-execution-hint]]
==== Execution hint
There are different mechanisms by which terms aggregations can be executed:
- by using field values directly in order to aggregate data per-bucket (`map`)
View File
@ -1,8 +1,6 @@
[[search-aggregations-metrics-cardinality-aggregation]]
=== Cardinality Aggregation
A `single-value` metrics aggregation that calculates an approximate count of
distinct values. Values can be extracted either from specific fields in the
document or generated by a script.
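For example, a minimal sketch (the `author` field name is an assumption; `precision_threshold` is optional):

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "distinct_authors" : {
            "cardinality" : {
                "field" : "author",
                "precision_threshold" : 100
            }
        }
    }
}
--------------------------------------------------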
View File
@ -1,8 +1,6 @@
[[search-aggregations-metrics-geobounds-aggregation]]
=== Geo Bounds Aggregation
A metric aggregation that computes the bounding box containing all geo_point values for a field.
.Experimental!
View File
@ -1,20 +1,11 @@
[[search-aggregations-metrics-percentile-aggregation]]
=== Percentiles Aggregation
A `multi-value` metrics aggregation that calculates one or more percentiles
over numeric values extracted from the aggregated documents. These values
can be extracted either from specific numeric fields in the documents, or
be generated by a provided script.
.Experimental!
[IMPORTANT]
=====
This feature is marked as experimental, and may be subject to change in the
future. If you use this feature, please let us know your experience with it!
=====
Percentiles show the point at which a certain percentage of observed values
occur. For example, the 95th percentile is the value which is greater than 95%
of the observed values.
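For instance, a sketch of a request over an assumed `load_time` field, using the default set of percentiles:

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "load_time_outlier" : {
            "percentiles" : {
                "field" : "load_time"
            }
        }
    }
}
--------------------------------------------------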
@ -71,9 +62,6 @@ percentiles: `[ 1, 5, 25, 50, 75, 95, 99 ]`. The response will look like this:
}
--------------------------------------------------
WARNING: The above response structure applies to `1.2.0` and above. Prior to the `1.2.0` release, the `values` object was
missing and all the percentiles were placed directly under the aggregation name object.
As you can see, the aggregation will return a calculated value for each percentile
in the default range. If we assume response times are in milliseconds, it is
immediately obvious that the webpage normally loads in 15-30ms, but occasionally
View File
@ -1,8 +1,6 @@
[[search-aggregations-metrics-percentile-rank-aggregation]]
=== Percentile Ranks Aggregation
A `multi-value` metrics aggregation that calculates one or more percentile ranks
over numeric values extracted from the aggregated documents. These values
can be extracted either from specific numeric fields in the documents, or
View File
@ -1,8 +1,6 @@
[[search-aggregations-metrics-scripted-metric-aggregation]]
=== Scripted Metric Aggregation
A metric aggregation that executes using scripts to provide a metric output.
.Experimental!
View File
@ -1,8 +1,6 @@
[[search-aggregations-metrics-top-hits-aggregation]]
=== Top hits Aggregation
A `top_hits` metric aggregator keeps track of the most relevant document being aggregated. This aggregator is intended
to be used as a sub aggregator, so that the top matching documents can be aggregated per bucket.
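As a sketch (the `tags` field name is an assumption), a `top_hits` aggregator is typically nested under a bucketing aggregation such as `terms`:

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "top-tags" : {
            "terms" : { "field" : "tags" },
            "aggs" : {
                "top_tag_hits" : {
                    "top_hits" : { "size" : 1 }
                }
            }
        }
    }
}
--------------------------------------------------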
View File
@ -34,8 +34,6 @@ The name of the aggregation (`grades_count` above) also serves as the key by whi
retrieved from the returned response.
==== Script
Counting the values generated by a script:
[source,js]
View File
@ -1,8 +1,6 @@
[[search-benchmark]]
== Benchmark
.Experimental!
[IMPORTANT]
=====
View File
@ -21,7 +21,7 @@ $ curl -XGET 'http://localhost:9200/twitter/tweet/_count' -d '
--------------------------------------------------
NOTE: The query being sent in the body must be nested in a `query` key, the same
way the <<search-search,search api>> works.
Both examples above do the same thing: they count the number of
tweets from the twitter index for a certain user. The result is:
@ -64,7 +64,7 @@ query.
|default_operator |The default operator to be used, can be `AND` or
`OR`. Defaults to `OR`.
|terminate_after |The maximum count for each shard, upon
reaching which the query execution will terminate early.
If set, the response will have a boolean field `terminated_early` to
indicate whether the query execution has actually terminated early.
View File
@ -63,7 +63,7 @@ This will yield the same result as the previous request.
[horizontal]
`_source`::
Set to `true` to retrieve the `_source` of the document explained. You can also
retrieve part of the document by using `_source_include` & `_source_exclude` (see <<get-source-filtering,Get API>> for more details)
`fields`::
View File
@ -1,8 +1,6 @@
[[search-percolate]]
== Percolator
Traditionally you design documents based on your data and store them into an index and then define queries via the search api
in order to retrieve these documents. The percolator works in the opposite direction: first you store queries into an
index and then, via the percolate api, you define documents in order to retrieve these queries.
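As a rough sketch (the index, type and field names are assumptions): a query is registered as a document under the reserved `.percolator` type, and a document is then percolated against it:

[source,js]
--------------------------------------------------
PUT /my-index/.percolator/1
{
    "query" : {
        "match" : { "message" : "bonsai tree" }
    }
}

GET /my-index/message/_percolate
{
    "doc" : {
        "message" : "A new bonsai tree in the office"
    }
}
--------------------------------------------------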
@ -20,7 +18,6 @@ in the percolate api.
Fields referred to in a percolator query must *already* exist in the mapping
associated with the index used for percolation.
There are two ways to make sure that a field mapping exists:
* Add or update a mapping via the <<indices-create-index,create index>> or
View File
@ -70,13 +70,13 @@ And here is a sample response:
`query_cache`::
Set to `true` or `false` to enable or disable the caching
of search results for requests where `?search_type=count`, i.e.
aggregations and suggestions. See <<index-modules-shard-query-cache>>.
`terminate_after`::
The maximum number of documents to collect for each shard,
upon reaching which the query execution will terminate early. If set, the
response will have a boolean field `terminated_early` to indicate whether
the query execution has actually terminated early. Defaults to no
View File
@ -44,60 +44,3 @@ Script fields can also be automatically detected and used as fields, so
things like `_source.obj1.field1` can be used, though not recommended, as
`obj1.field1` will work as well.
[[partial]]
==== Partial
NOTE: Partial fields are deprecated in favour of <<search-request-source-filtering,source filtering>>.
When loading data from `_source`, partial fields can use
wildcards to control what part of the `_source` will be loaded based on
`include` and `exclude` patterns. For example:
[source,js]
--------------------------------------------------
{
"query" : {
"match_all" : {}
},
"partial_fields" : {
"partial1" : {
"include" : "obj1.obj2.*",
}
}
}
--------------------------------------------------
And one that will also exclude `obj1.obj3`:
[source,js]
--------------------------------------------------
{
"query" : {
"match_all" : {}
},
"partial_fields" : {
"partial1" : {
"include" : "obj1.obj2.*",
"exclude" : "obj1.obj3.*"
}
}
}
--------------------------------------------------
Both `include` and `exclude` support multiple patterns:
[source,js]
--------------------------------------------------
{
"query" : {
"match_all" : {}
},
"partial_fields" : {
"partial1" : {
"include" : ["obj1.obj2.*", "obj1.obj4.*"],
"exclude" : "obj1.obj3.*"
}
}
}
--------------------------------------------------
View File
@ -124,8 +124,6 @@ The following is an example that forces the use of the plain highlighter:
==== Force highlighting on source
Forces the highlighting to highlight fields based on the source even if fields are
stored separately. Defaults to `false`.
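A minimal sketch of forcing source-based highlighting for a single field (the `content` field name is an assumption):

[source,js]
--------------------------------------------------
{
    "query" : {
        "match" : { "content" : "elasticsearch" }
    },
    "highlight" : {
        "fields" : {
            "content" : { "force_source" : true }
        }
    }
}
--------------------------------------------------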
View File
@ -82,8 +82,6 @@ for <<query-dsl-function-score-query,`function query`>> rescores.
==== Multiple Rescores
It is also possible to execute multiple rescores in sequence:
[source,js]
--------------------------------------------------
View File
@ -131,12 +131,6 @@ the `nested_filter` then a missing value is used.
==== Ignoring Unmapped Fields
Before 1.4.0 there was the `ignore_unmapped` boolean
parameter, which did not provide enough information to decide on the sort
values to emit, and did not work for cross-index search. It is still
supported but users are encouraged to migrate to the new
`unmapped_type` instead.
By default, the search request will fail if there is no mapping
associated with a field. The `unmapped_type` option allows to ignore
fields that have no mapping and not sort by them. The value of this
@ -285,8 +279,6 @@ conform with http://geojson.org/[GeoJSON].
==== Multiple reference points
Multiple geo points can be passed as an array containing any `geo_point` format, for example
[source,js]
View File
@ -1,8 +1,6 @@
[[search-request-source-filtering]]
=== Source filtering
Allows you to control how the `_source` field is returned with every hit.
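For example, a sketch using include/exclude patterns (the field names and the query are assumptions):

[source,js]
--------------------------------------------------
{
    "_source" : {
        "include" : [ "obj1.*", "obj2.*" ],
        "exclude" : [ "*.description" ]
    },
    "query" : {
        "match_all" : {}
    }
}
--------------------------------------------------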
View File
@ -1,8 +1,6 @@
[[search-template]]
== Search Template
The `/_search/template` endpoint allows you to use the mustache language to pre-render search requests,
filling existing templates with template parameters before they are executed.
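A minimal sketch of such a request (the parameter names are assumptions):

[source,js]
--------------------------------------------------
GET /_search/template
{
    "template" : {
        "query" : { "match" : { "{{my_field}}" : "{{my_value}}" } },
        "size" : "{{my_size}}"
    },
    "params" : {
        "my_field" : "foo",
        "my_value" : "bar",
        "my_size" : 5
    }
}
--------------------------------------------------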
@ -224,8 +222,6 @@ GET /_search/template
<1> Name of the query template in `config/scripts/`, i.e., `storedTemplate.mustache`.
You can also register search templates by storing them in the elasticsearch cluster in a special index named `.scripts`.
There are REST APIs to manage these indexed templates.
View File
@ -1,8 +1,6 @@
[[suggester-context]]
=== Context Suggester
The context suggester is an extension to the suggest API of Elasticsearch. Namely the
suggester system provides a very fast way of searching documents by handling these
entirely in memory. But this special treatment does not allow the handling of
View File
@ -62,7 +62,7 @@ query.
|`explain` |For each hit, includes an explanation of how scoring of the
hits was computed.
|`_source`|Set to `false` to disable retrieval of the `_source` field. You can also retrieve
part of the document by using `_source_include` & `_source_exclude` (see the <<search-request-source-filtering, request body>>
documentation for more details)
@ -82,7 +82,7 @@ scores and return them as part of each hit.
within the specified time value and bail with the hits accumulated up to
that point when expired. Defaults to no timeout.
|`terminate_after` |The maximum number of documents to collect for
each shard, upon reaching which the query execution will terminate early.
If set, the response will have a boolean field `terminated_early` to
indicate whether the query execution has actually terminated early.
View File
@ -43,7 +43,7 @@ curl -XGET 'http://localhost:9200/twitter/tweet/_validate/query' -d '{
--------------------------------------------------
NOTE: The query being sent in the body must be nested in a `query` key, the same
way the <<search-search,search api>> works.
If the query is invalid, `valid` will be `false`. Here the query is
invalid because Elasticsearch knows the post_date field should be a date
View File
@ -52,7 +52,7 @@ $ bin/elasticsearch -Xmx2g -Xms2g -Des.index.store.type=memory --node.name=my-no
Elasticsearch is built using Java, and requires at least
http://www.oracle.com/technetwork/java/javase/downloads/index.html[Java 7] in
order to run. Only Oracle's Java and
the OpenJDK are supported.
We recommend installing the *Java 8 update 20 or later*, or *Java 7 update 55
View File
@ -1,8 +1,6 @@
[[testing-framework]]
== Java Testing Framework
[[testing-intro]]
Testing is a crucial part of your application, and as information retrieval itself is already a complex topic, setting up a testing infrastructure that uses elasticsearch should not add any extra complexity. This is the main reason why we release an additional artifact with each release which allows you to use the same testing infrastructure that we use in the elasticsearch core. The testing framework allows you to set up clusters with multiple nodes in order to check that your code covers everything needed to run in a cluster, and it saves you from writing complex code yourself to start, stop or manage several test nodes in a cluster. In addition there is another very important feature called randomized testing, which you get for free as it is part of the elasticsearch infrastructure.