parent
2affa5004f
commit
d7b0d547d4
|
@@ -5,7 +5,7 @@
|
||||||
[[shard-allocation-filtering]]
|
[[shard-allocation-filtering]]
|
||||||
=== Shard Allocation Filtering
|
=== Shard Allocation Filtering
|
||||||
|
|
||||||
Allow to control allocation if indices on nodes based on include/exclude
|
Allows to control the allocation of indices on nodes based on include/exclude
|
||||||
filters. The filters can be set both on the index level and on the
|
filters. The filters can be set both on the index level and on the
|
||||||
cluster level. Lets start with an example of setting it on the cluster
|
cluster level. Lets start with an example of setting it on the cluster
|
||||||
level:
|
level:
|
||||||
|
|
|
@@ -2,8 +2,8 @@
|
||||||
== Analysis
|
== Analysis
|
||||||
|
|
||||||
The index analysis module acts as a configurable registry of Analyzers
|
The index analysis module acts as a configurable registry of Analyzers
|
||||||
that can be used in order to both break indexed (analyzed) fields when a
|
that can be used in order to break down indexed (analyzed) fields when a
|
||||||
document is indexed and process query strings. It maps to the Lucene
|
document is indexed as well as to process query strings. It maps to the Lucene
|
||||||
`Analyzer`.
|
`Analyzer`.
|
||||||
|
|
||||||
Analyzers are (generally) composed of a single `Tokenizer` and zero or
|
Analyzers are (generally) composed of a single `Tokenizer` and zero or
|
||||||
|
|
|
@@ -2,8 +2,8 @@
|
||||||
== Codec module
|
== Codec module
|
||||||
|
|
||||||
Codecs define how documents are written to disk and read from disk. The
|
Codecs define how documents are written to disk and read from disk. The
|
||||||
postings format is the part of the codec that responsible for reading
|
postings format is the part of the codec that is responsible for reading
|
||||||
and writing the term dictionary, postings lists and positions, payloads
|
and writing the term dictionary, postings lists and positions, as well as the payloads
|
||||||
and offsets stored in the postings list. The doc values format is
|
and offsets stored in the postings list. The doc values format is
|
||||||
responsible for reading column-stride storage for a field and is typically
|
responsible for reading column-stride storage for a field and is typically
|
||||||
used for sorting or faceting. When a field doesn't have doc values enabled,
|
used for sorting or faceting. When a field doesn't have doc values enabled,
|
||||||
|
@@ -25,7 +25,7 @@ Elasticsearch, requiring data to be reindexed.
|
||||||
[[custom-postings]]
|
[[custom-postings]]
|
||||||
=== Configuring a custom postings format
|
=== Configuring a custom postings format
|
||||||
|
|
||||||
Custom postings format can be defined in the index settings in the
|
A custom postings format can be defined in the index settings in the
|
||||||
`codec` part. The `codec` part can be configured when creating an index
|
`codec` part. The `codec` part can be configured when creating an index
|
||||||
or updating index settings. An example on how to define your custom
|
or updating index settings. An example on how to define your custom
|
||||||
postings format:
|
postings format:
|
||||||
|
@@ -48,7 +48,7 @@ curl -XPUT 'http://localhost:9200/twitter/' -d '{
|
||||||
}'
|
}'
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
Then we defining your mapping your can use the `my_format` name in the
|
Then when defining your mapping you can use the `my_format` name in the
|
||||||
`postings_format` option as the example below illustrates:
|
`postings_format` option as the example below illustrates:
|
||||||
|
|
||||||
[source,js]
|
[source,js]
|
||||||
|
|
|
@@ -8,16 +8,31 @@ explicit mappings pre defined. For more information about mapping
|
||||||
definitions, check out the <<mapping,mapping section>>.
|
definitions, check out the <<mapping,mapping section>>.
|
||||||
|
|
||||||
[float]
|
[float]
|
||||||
=== Dynamic / Default Mappings
|
=== Dynamic Mappings
|
||||||
|
|
||||||
Dynamic mappings allow to automatically apply generic mapping definition
|
New types and new fields within types can be added dynamically just
|
||||||
to types that do not have mapping pre defined or applied to new mapping
|
by indexing a document. When Elasticsearch encounters a new type,
|
||||||
definitions (overridden). This is mainly done thanks to the fact that
|
it creates the type using the `_default_` mapping (see below).
|
||||||
the `object` type and namely the root `object` type allow for schema
|
|
||||||
less dynamic addition of unmapped fields.
|
|
||||||
|
|
||||||
The default mapping definition is plain mapping definition that is
|
When it encounters a new field within a type, it autodetects the
|
||||||
embedded within Elasticsearch:
|
datatype that the field contains and adds it to the type mapping
|
||||||
|
automatically.
|
||||||
|
|
||||||
|
See <<mapping-dynamic-mapping>> for details of how to control and
|
||||||
|
configure dynamic mapping.
|
||||||
|
|
||||||
|
[float]
|
||||||
|
=== Default Mapping
|
||||||
|
|
||||||
|
When a new type is created (at <<indices-create-index,index creation>> time,
|
||||||
|
using the <<indices-put-mapping,`put-mapping` API>> or just by indexing a
|
||||||
|
document into it), the type uses the `_default_` mapping as its basis. Any
|
||||||
|
mapping specified in the <<indices-create-index,`create-index`>> or
|
||||||
|
<<indices-put-mapping,`put-mapping`>> request override values set in the
|
||||||
|
`_default_` mapping.
|
||||||
|
|
||||||
|
The default mapping definition is a plain mapping definition that is
|
||||||
|
embedded within ElasticSearch:
|
||||||
|
|
||||||
[source,js]
|
[source,js]
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
@@ -27,13 +42,15 @@ embedded within Elasticsearch:
|
||||||
}
|
}
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
Pretty short, no? Basically, everything is defaulted, especially the
|
Pretty short, isn't it? Basically, everything is `_default_`ed, including the
|
||||||
dynamic nature of the root object mapping. The default mapping
|
dynamic nature of the root object mapping which allows new fields to be added
|
||||||
definition can be overridden in several manners. The simplest manner is
|
automatically.
|
||||||
to simply define a file called `default-mapping.json` and placed it
|
|
||||||
under the `config` directory (which can be configured to exist in a
|
The built-in default mapping definition can be overridden in several ways. A
|
||||||
different location). It can also be explicitly set using the
|
`_default_` mapping can be specified when creating a new index, or the global
|
||||||
`index.mapper.default_mapping_location` setting.
|
`_default_` mapping (for all indices) can be configured by creating a file
|
||||||
|
called `config/default-mapping.json`. (This location can be changed with
|
||||||
|
the `index.mapper.default_mapping_location` setting.)
|
||||||
|
|
||||||
Dynamic creation of mappings for unmapped types can be completely
|
Dynamic creation of mappings for unmapped types can be completely
|
||||||
disabled by setting `index.mapper.dynamic` to `false`.
|
disabled by setting `index.mapper.dynamic` to `false`.
|
||||||
|
|
|
@@ -43,7 +43,7 @@ This policy has the following settings:
|
||||||
|
|
||||||
Segments smaller than this are "rounded up" to this size, i.e. treated as
|
Segments smaller than this are "rounded up" to this size, i.e. treated as
|
||||||
equal (floor) size for merge selection. This is to prevent frequent
|
equal (floor) size for merge selection. This is to prevent frequent
|
||||||
flushing of tiny segments from allowing a long tail in the index. Default
|
flushing of tiny segments, thus preventing a long tail in the index. Default
|
||||||
is `2mb`.
|
is `2mb`.
|
||||||
|
|
||||||
`index.merge.policy.max_merge_at_once`::
|
`index.merge.policy.max_merge_at_once`::
|
||||||
|
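The floor rule described in the hunk above can be sketched in a few lines. This is only an illustration of the documented behaviour, not Elasticsearch or Lucene code: the helper name `floored_size` is hypothetical, sizes are assumed to be in bytes, and `2mb` is assumed to mean 2 * 1024 * 1024 bytes.

```python
# Illustrative model of the "rounded up to the floor" rule; not Elasticsearch code.
FLOOR_SEGMENT_BYTES = 2 * 1024 * 1024  # the documented default of `2mb` (assumed binary megabytes)


def floored_size(segment_bytes, floor_bytes=FLOOR_SEGMENT_BYTES):
    """Return the size a segment is treated as during merge selection (hypothetical helper)."""
    # Segments below the floor are all treated as if they were exactly floor-sized.
    return max(segment_bytes, floor_bytes)


# Two tiny segments and one larger segment, in bytes.
sizes = [10 * 1024, 512 * 1024, 5 * 1024 * 1024]
effective = [floored_size(s) for s in sizes]
```

Both tiny segments end up with the same effective size, so merge selection treats them as equals, while the larger segment keeps its real size.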
@@ -67,7 +67,7 @@ This policy has the following settings:
|
||||||
|
|
||||||
Sets the allowed number of segments per tier. Smaller values mean more
|
Sets the allowed number of segments per tier. Smaller values mean more
|
||||||
merging but fewer segments. Default is `10`. Note, this value needs to be
|
merging but fewer segments. Default is `10`. Note, this value needs to be
|
||||||
>= then the `max_merge_at_once` otherwise you'll force too many merges to
|
>= than the `max_merge_at_once` otherwise you'll force too many merges to
|
||||||
occur.
|
occur.
|
||||||
|
|
||||||
`index.reclaim_deletes_weight`::
|
`index.reclaim_deletes_weight`::
|
||||||
|
@@ -83,8 +83,8 @@ This policy has the following settings:
|
||||||
<<index-modules-settings>>.
|
<<index-modules-settings>>.
|
||||||
|
|
||||||
For normal merging, this policy first computes a "budget" of how many
|
For normal merging, this policy first computes a "budget" of how many
|
||||||
segments are allowed by be in the index. If the index is over-budget,
|
segments are allowed to be in the index. If the index is over-budget,
|
||||||
then the policy sorts segments by decreasing size (pro-rating by percent
|
then the policy sorts segments by decreasing size (proportionally considering percent
|
||||||
deletes), and then finds the least-cost merge. Merge cost is measured by
|
deletes), and then finds the least-cost merge. Merge cost is measured by
|
||||||
a combination of the "skew" of the merge (size of largest seg divided by
|
a combination of the "skew" of the merge (size of largest seg divided by
|
||||||
smallest seg), total merge size and pct deletes reclaimed, so that
|
smallest seg), total merge size and pct deletes reclaimed, so that
|
||||||
|
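The "skew" term mentioned in the hunk above is described as the size of the largest segment divided by the size of the smallest. A minimal sketch of just that ratio follows. The function name `merge_skew` is hypothetical, and the other parts of the merge cost (total merge size, percentage of deletes reclaimed) are deliberately not modelled here.

```python
def merge_skew(segment_sizes):
    """Largest segment size divided by smallest, as described in the text (hypothetical helper)."""
    if not segment_sizes:
        raise ValueError("a candidate merge needs at least one segment")
    if min(segment_sizes) <= 0:
        raise ValueError("segment sizes must be positive")
    return max(segment_sizes) / min(segment_sizes)


balanced = merge_skew([100, 100, 100])    # equally sized segments
lopsided = merge_skew([1000, 10, 10])     # one large segment with two tiny ones
```

A merge of equally sized segments has a skew of `1.0`. Under the description above, a lower skew makes a candidate merge cheaper, all else being equal.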
@@ -99,7 +99,7 @@ budget.
|
||||||
Note, this can mean that for large shards that holds many gigabytes of
|
Note, this can mean that for large shards that holds many gigabytes of
|
||||||
data, the default of `max_merged_segment` (`5gb`) can cause for many
|
data, the default of `max_merged_segment` (`5gb`) can cause for many
|
||||||
segments to be in an index, and causing searches to be slower. Use the
|
segments to be in an index, and causing searches to be slower. Use the
|
||||||
indices segments API to see the segments that an index have, and
|
indices segments API to see the segments that an index has, and
|
||||||
possibly either increase the `max_merged_segment` or issue an optimize
|
possibly either increase the `max_merged_segment` or issue an optimize
|
||||||
call for the index (try and aim to issue it on a low traffic time).
|
call for the index (try and aim to issue it on a low traffic time).
|
||||||
|
|
||||||
|
@@ -192,24 +192,21 @@ supported, with the default being the `ConcurrentMergeScheduler`.
|
||||||
[float]
|
[float]
|
||||||
==== ConcurrentMergeScheduler
|
==== ConcurrentMergeScheduler
|
||||||
|
|
||||||
A merge scheduler that runs merges using a separated thread, until the
|
A merge scheduler that runs merges using a separate thread. When the maximum
|
||||||
maximum number of threads at which when a merge is needed, the thread(s)
|
number of threads is reached, further merges will wait until a merge thread
|
||||||
that are updating the index will pause until one or more merges
|
becomes available.
|
||||||
completes.
|
|
||||||
|
|
||||||
The scheduler supports the following settings:
|
The scheduler supports the following settings:
|
||||||
|
|
||||||
[cols="<,<",options="header",]
|
`index.merge.scheduler.max_thread_count`::
|
||||||
|=======================================================================
|
|
||||||
|Setting |Description
|
The maximum number of threads to perform the merge operation. Defaults to
|
||||||
|index.merge.scheduler.max_thread_count |The maximum number of threads
|
|
||||||
to perform the merge operation. Defaults to
|
|
||||||
`Math.max(1, Math.min(3, Runtime.getRuntime().availableProcessors() / 2))`.
|
`Math.max(1, Math.min(3, Runtime.getRuntime().availableProcessors() / 2))`.
|
||||||
|=======================================================================
|
|
||||||
|
|
||||||
[float]
|
[float]
|
||||||
==== SerialMergeScheduler
|
==== SerialMergeScheduler
|
||||||
|
|
||||||
A merge scheduler that simply does each merge sequentially using the
|
A merge scheduler that simply does each merge sequentially using the
|
||||||
calling thread (blocking the operations that triggered the merge, the
|
calling thread (blocking the operations that triggered the merge or the
|
||||||
index operation).
|
index operation).
|
||||||
|
|
|
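The default for `index.merge.scheduler.max_thread_count` quoted in the hunk above is a Java expression. As a rough illustration of what it evaluates to, here is the same arithmetic in Python. The function name is hypothetical, and `//` stands in for Java's integer division, which behaves the same way for non-negative processor counts.

```python
def default_max_thread_count(available_processors):
    """Mirror Math.max(1, Math.min(3, availableProcessors / 2)) (hypothetical helper)."""
    # Java integer division truncates; floor division matches it for non-negative ints.
    return max(1, min(3, available_processors // 2))


# Resulting default for a few processor counts.
defaults = {n: default_max_thread_count(n) for n in (1, 2, 4, 6, 8, 32)}
```

The result is always between 1 and 3: small machines get a single merge thread, and the value stops growing once six or more processors are available.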
@@ -163,9 +163,9 @@ An alias can also be added with the endpoint
|
||||||
where
|
where
|
||||||
|
|
||||||
[horizontal]
|
[horizontal]
|
||||||
`index`:: The index to alias refers to. Can be any of `blank | * | _all | glob pattern | name1, name2, …`
|
`index`:: The index the alias refers to. Can be any of `blank | * | _all | glob pattern | name1, name2, …`
|
||||||
`name`:: The name of the alias. This is a required option.
|
`name`:: The name of the alias. This is a required option.
|
||||||
`routing`:: An optional routing that can be associated with an alias.
|
`routing`:: An optional routing that can be associated with an alias.
|
||||||
`filter`:: An optional filter that can be associated with an alias.
|
`filter`:: An optional filter that can be associated with an alias.
|
||||||
|
|
||||||
You can also use the plural `_aliases`.
|
You can also use the plural `_aliases`.
|
||||||
|
@@ -190,7 +190,7 @@ curl -XPUT 'localhost:9200/users/_alias/user_12' -d '{
|
||||||
"term" : {
|
"term" : {
|
||||||
"user_id" : 12
|
"user_id" : 12
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}'
|
}'
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
|
@@ -226,8 +226,8 @@ The rest endpoint is: `/{index}/_alias/{name}`
|
||||||
where
|
where
|
||||||
|
|
||||||
[horizontal]
|
[horizontal]
|
||||||
`index`:: `* | _all | glob pattern | name1, name2, …`
|
`index`:: `* | _all | glob pattern | name1, name2, …`
|
||||||
`name`:: `* | _all | glob pattern | name1, name2, …`
|
`name`:: `* | _all | glob pattern | name1, name2, …`
|
||||||
|
|
||||||
Alternatively you can use the plural `_aliases`. Example:
|
Alternatively you can use the plural `_aliases`. Example:
|
||||||
|
|
||||||
|
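The `/{index}/_alias/{name}` endpoint in the hunk above accepts comma-separated names on both sides. A small sketch of assembling such a path is shown below. The helper `alias_path` is hypothetical and not part of any client library, it only joins the names as given, and it assumes they need no URL escaping.

```python
def alias_path(indices, names):
    """Build a `/{index}/_alias/{name}` path from lists of names (hypothetical helper)."""
    if not indices or not names:
        raise ValueError("at least one index and one alias name are required")
    # Multiple indices or alias names are joined with commas, as the endpoint describes.
    return "/{}/_alias/{}".format(",".join(indices), ",".join(names))


single = alias_path(["users"], ["user_12"])
multi = alias_path(["logs_2013", "logs_2014"], ["current", "archive"])
wildcard = alias_path(["_all"], ["*"])
```

The resulting string would be appended to the node address, for example `localhost:9200`, in the same way as the `curl` examples in the documentation being patched.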
@@ -245,15 +245,14 @@ alias name and index name. This api redirects to the master and fetches
|
||||||
the requested index aliases, if available. This api only serialises the
|
the requested index aliases, if available. This api only serialises the
|
||||||
found index aliases.
|
found index aliases.
|
||||||
|
|
||||||
Possible options:
|
Possible options:
|
||||||
[horizontal]
|
[horizontal]
|
||||||
`index`::
|
`index`::
|
||||||
|
|
||||||
The index name to get aliases for. Partially names are
|
The index name to get aliases for. Partially names are
|
||||||
supported via wildcards, also multiple index names can be specified
|
supported via wildcards, also multiple index names can be specified
|
||||||
separated with a comma. Also the alias name for an index can be used.
|
separated with a comma. Also the alias name for an index can be used.
|
||||||
|
|
||||||
`alias`::
|
`alias`::
|
||||||
The name of alias to return in the response. Like the index
|
The name of alias to return in the response. Like the index
|
||||||
option, this option supports wildcards and the option the specify
|
option, this option supports wildcards and the option the specify
|
||||||
multiple alias names separated by a comma.
|
multiple alias names separated by a comma.
|
||||||
|
|
|
@@ -47,7 +47,7 @@ Also, the analyzer can be derived based on a field mapping, for example:
|
||||||
curl -XGET 'localhost:9200/test/_analyze?field=obj1.field1' -d 'this is a test'
|
curl -XGET 'localhost:9200/test/_analyze?field=obj1.field1' -d 'this is a test'
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
Will cause the analysis to happen based on the analyzer configure in the
|
Will cause the analysis to happen based on the analyzer configured in the
|
||||||
mapping for `obj1.field1` (and if not, the default index analyzer).
|
mapping for `obj1.field1` (and if not, the default index analyzer).
|
||||||
|
|
||||||
Also, the text can be provided as part of the request body, and not as a
|
Also, the text can be provided as part of the request body, and not as a
|
||||||
|
|
|
@@ -1,7 +1,7 @@
|
||||||
[[indices-get-mapping]]
|
[[indices-get-mapping]]
|
||||||
== Get Mapping
|
== Get Mapping
|
||||||
|
|
||||||
The get mapping API allows to retrieve mapping definition of index or
|
The get mapping API allows to retrieve mapping definitions for an index or
|
||||||
index/type.
|
index/type.
|
||||||
|
|
||||||
[source,js]
|
[source,js]
|
||||||
|
@@ -15,7 +15,7 @@ curl -XGET 'http://localhost:9200/twitter/tweet/_mapping'
|
||||||
The get mapping API can be used to get more than one index or type
|
The get mapping API can be used to get more than one index or type
|
||||||
mapping with a single call. General usage of the API follows the
|
mapping with a single call. General usage of the API follows the
|
||||||
following syntax: `host:port/{index}/{type}/_mapping` where both
|
following syntax: `host:port/{index}/{type}/_mapping` where both
|
||||||
`{index}` and `{type}` can stand for comma-separated list of names. To
|
`{index}` and `{type}` can accept a comma-separated list of names. To
|
||||||
get mappings for all indices you can use `_all` for `{index}`. The
|
get mappings for all indices you can use `_all` for `{index}`. The
|
||||||
following are some examples:
|
following are some examples:
|
||||||
|
|
||||||
|
|
|
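The `host:port/{index}/{type}/_mapping` syntax from the hunk above can be illustrated with a short URL builder. This is a sketch under stated assumptions: `mapping_url` is a hypothetical helper, the `http://` scheme is taken from the `curl` example in the hunk header, and the type segment is left out when no types are given, which corresponds to retrieving mappings for an index only.

```python
def mapping_url(host, port, indices, types=()):
    """Build a get-mapping URL of the form host:port/{index}/{type}/_mapping (hypothetical helper)."""
    if not indices:
        raise ValueError("at least one index name (or `_all`) is required")
    # Both {index} and {type} accept a comma-separated list of names.
    segments = [",".join(indices)]
    if types:
        segments.append(",".join(types))
    return "http://{}:{}/{}/_mapping".format(host, port, "/".join(segments))


one_type = mapping_url("localhost", 9200, ["twitter"], ["tweet"])
all_indices = mapping_url("localhost", 9200, ["_all"])
several = mapping_url("localhost", 9200, ["twitter", "kimchy"], ["tweet", "book"])
```

The first result matches the `curl -XGET` example quoted in the hunk header. The index and type names in the last call are placeholders chosen for illustration.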
@@ -26,7 +26,7 @@ merge needs to execute, and if so, executes it.
|
||||||
`only_expunge_deletes`:: Should the optimize process only expunge segments
|
`only_expunge_deletes`:: Should the optimize process only expunge segments
|
||||||
with deletes in it. In Lucene, a document is not deleted from a segment,
|
with deletes in it. In Lucene, a document is not deleted from a segment,
|
||||||
just marked as deleted. During a merge process of segments, a new
|
just marked as deleted. During a merge process of segments, a new
|
||||||
segment is created that does not have those deletes. This flag allow to
|
segment is created that does not have those deletes. This flag allows to
|
||||||
only merge segments that have deletes. Defaults to `false`.
|
only merge segments that have deletes. Defaults to `false`.
|
||||||
|
|
||||||
`flush`:: Should a flush be performed after the optimize. Defaults to
|
`flush`:: Should a flush be performed after the optimize. Defaults to
|
||||||
|
|
|
@@ -38,7 +38,7 @@ which means conflicts are *not* ignored.
|
||||||
The definition of conflict is really dependent on the type merged, but
|
The definition of conflict is really dependent on the type merged, but
|
||||||
in general, if a different core type is defined, it is considered as a
|
in general, if a different core type is defined, it is considered as a
|
||||||
conflict. New mapping definitions can be added to object types, and core
|
conflict. New mapping definitions can be added to object types, and core
|
||||||
type mapping can be upgraded by specifying multi fields on a core type.
|
type mappings can be upgraded by specifying multi fields on a core type.
|
||||||
|
|
||||||
[float]
|
[float]
|
||||||
[[put-mapping-multi-index]]
|
[[put-mapping-multi-index]]
|
||||||
|
|
|
@@ -61,7 +61,7 @@ actual index name that the template gets applied to during index creation.
|
||||||
=== Deleting a Template
|
=== Deleting a Template
|
||||||
|
|
||||||
Index templates are identified by a name (in the above case
|
Index templates are identified by a name (in the above case
|
||||||
`template_1`) and can be delete as well:
|
`template_1`) and can be deleted as well:
|
||||||
|
|
||||||
[source,js]
|
[source,js]
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
|
@@ -13,7 +13,7 @@ and get them.
|
||||||
|
|
||||||
Index warmup can be disabled by setting `index.warmer.enabled` to
|
Index warmup can be disabled by setting `index.warmer.enabled` to
|
||||||
`false`. It is supported as a realtime setting using update settings
|
`false`. It is supported as a realtime setting using update settings
|
||||||
API. This can be handy when doing initial bulk indexing, disabling pre
|
API. This can be handy when doing initial bulk indexing: disable pre
|
||||||
registered warmers to make indexing faster and less expensive and then
|
registered warmers to make indexing faster and less expensive and then
|
||||||
enable it.
|
enable it.
|
||||||
|
|
||||||
|
|
|
@@ -1,15 +1,15 @@
|
||||||
[[mapping-dynamic-mapping]]
|
[[mapping-dynamic-mapping]]
|
||||||
== Dynamic Mapping
|
== Dynamic Mapping
|
||||||
|
|
||||||
Default mappings allow to automatically apply generic mapping definition
|
Default mappings allow to automatically apply generic mapping definitions
|
||||||
to types that do not have mapping pre defined. This is mainly done
|
to types that do not have mappings predefined. This is mainly done
|
||||||
thanks to the fact that the
|
thanks to the fact that the
|
||||||
<<mapping-object-type,object mapping>> and
|
<<mapping-object-type,object mapping>> and
|
||||||
namely the <<mapping-root-object-type,root
|
namely the <<mapping-root-object-type,root
|
||||||
object mapping>> allow for schema-less dynamic addition of unmapped
|
object mapping>> allow for schema-less dynamic addition of unmapped
|
||||||
fields.
|
fields.
|
||||||
|
|
||||||
The default mapping definition is plain mapping definition that is
|
The default mapping definition is a plain mapping definition that is
|
||||||
embedded within the distribution:
|
embedded within the distribution:
|
||||||
|
|
||||||
[source,js]
|
[source,js]
|
||||||
|
@@ -20,10 +20,10 @@ embedded within the distribution:
|
||||||
}
|
}
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
Pretty short, no? Basically, everything is defaulted, especially the
|
Pretty short, isn't it? Basically, everything is defaulted, especially the
|
||||||
dynamic nature of the root object mapping. The default mapping
|
dynamic nature of the root object mapping. The default mapping
|
||||||
definition can be overridden in several manners. The simplest manner is
|
definition can be overridden in several manners. The simplest manner is
|
||||||
to simply define a file called `default-mapping.json` and placed it
|
to simply define a file called `default-mapping.json` and to place it
|
||||||
under the `config` directory (which can be configured to exist in a
|
under the `config` directory (which can be configured to exist in a
|
||||||
different location). It can also be explicitly set using the
|
different location). It can also be explicitly set using the
|
||||||
`index.mapper.default_mapping_location` setting.
|
`index.mapper.default_mapping_location` setting.
|
||||||
|
|
|
@@ -7,8 +7,8 @@ especially for search requests, where we want to execute a search query
|
||||||
against the content of a document, without knowing which fields to
|
against the content of a document, without knowing which fields to
|
||||||
search on. This comes at the expense of CPU cycles and index size.
|
search on. This comes at the expense of CPU cycles and index size.
|
||||||
|
|
||||||
The `_all` fields can be completely disabled. Explicit field mapping and
|
The `_all` fields can be completely disabled. Explicit field mappings and
|
||||||
object mapping can be excluded / included in the `_all` field. By
|
object mappings can be excluded / included in the `_all` field. By
|
||||||
default, it is enabled and all fields are included in it for ease of
|
default, it is enabled and all fields are included in it for ease of
|
||||||
use.
|
use.
|
||||||
|
|
||||||
|
@@ -69,7 +69,7 @@ specific `index_analyzer` and `search_analyzer`) to be set.
|
||||||
|
|
||||||
For any field to allow
|
For any field to allow
|
||||||
<<search-request-highlighting,highlighting>> it has
|
<<search-request-highlighting,highlighting>> it has
|
||||||
to be either stored or part of the `_source` field. By default `_all`
|
to be either stored or part of the `_source` field. By default the `_all`
|
||||||
field does not qualify for either, so highlighting for it does not yield
|
field does not qualify for either, so highlighting for it does not yield
|
||||||
any data.
|
any data.
|
||||||
|
|
||||||
|
|
|
@@ -20,7 +20,7 @@ Here is a simple mapping:
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
The above will use the value of the `my_field` to lookup an analyzer
|
The above will use the value of the `my_field` to lookup an analyzer
|
||||||
registered under it. For example, indexing a the following doc:
|
registered under it. For example, indexing the following doc:
|
||||||
|
|
||||||
[source,js]
|
[source,js]
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
@@ -33,7 +33,7 @@ Will cause the `whitespace` analyzer to be used as the index analyzer
|
||||||
for all fields without explicit analyzer setting.
|
for all fields without explicit analyzer setting.
|
||||||
|
|
||||||
The default path value is `_analyzer`, so the analyzer can be driven for
|
The default path value is `_analyzer`, so the analyzer can be driven for
|
||||||
a specific document by setting `_analyzer` field in it. If custom json
|
a specific document by setting the `_analyzer` field in it. If a custom json
|
||||||
field name is needed, an explicit mapping with a different path should
|
field name is needed, an explicit mapping with a different path should
|
||||||
be set.
|
be set.
|
||||||
|
|
||||||
|
|
|
@@ -4,7 +4,7 @@
|
||||||
deprecated[1.0.0.RC1,See <<function-score-instead-of-boost>>]
|
deprecated[1.0.0.RC1,See <<function-score-instead-of-boost>>]
|
||||||
|
|
||||||
Boosting is the process of enhancing the relevancy of a document or
|
Boosting is the process of enhancing the relevancy of a document or
|
||||||
field. Field level mapping allows to define explicit boost level on a
|
field. Field level mapping allows to define an explicit boost level on a
|
||||||
specific field. The boost field mapping (applied on the
|
specific field. The boost field mapping (applied on the
|
||||||
<<mapping-root-object-type,root object>>) allows
|
<<mapping-root-object-type,root object>>) allows
|
||||||
to define a boost field mapping where *its content will control the
|
to define a boost field mapping where *its content will control the
|
||||||
|
@@ -20,7 +20,7 @@ mapping:
|
||||||
}
|
}
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
||||||
The above mapping defines mapping for a field named `my_boost`. If the
|
The above mapping defines a mapping for a field named `my_boost`. If the
|
||||||
`my_boost` field exists within the JSON document indexed, its value will
|
`my_boost` field exists within the JSON document indexed, its value will
|
||||||
control the boost level of the document indexed. For example, the
|
control the boost level of the document indexed. For example, the
|
||||||
following JSON document will be indexed with a boost value of `2.2`:
|
following JSON document will be indexed with a boost value of `2.2`:
|
||||||
|
|
|
@@ -9,7 +9,7 @@ to the date the document was processed by the indexing chain.
|
||||||
[float]
|
[float]
|
||||||
==== enabled
|
==== enabled
|
||||||
|
|
||||||
By default it is disabled, in order to enable it, the following mapping
|
By default it is disabled. In order to enable it, the following mapping
|
||||||
should be defined:
|
should be defined:
|
||||||
|
|
||||||
[source,js]
|
[source,js]
|
||||||
|
|
|
@@ -86,7 +86,7 @@ The following table lists all the attributes that can be used with the
|
||||||
|`index_name` |The name of the field that will be stored in the index.
|
|`index_name` |The name of the field that will be stored in the index.
|
||||||
Defaults to the property/field name.
|
Defaults to the property/field name.
|
||||||
|
|
||||||
|`store` |Set to `true` to store actual field in the index, `false` to not
|
|`store` |Set to `true` to actually store the field in the index, `false` to not
|
||||||
store it. Defaults to `false` (note, the JSON document itself is stored,
|
store it. Defaults to `false` (note, the JSON document itself is stored,
|
||||||
and it can be retrieved from it).
|
and it can be retrieved from it).
|
||||||
|
|
||||||
|
@@ -208,8 +208,8 @@ store it. Defaults to `false` (note, the JSON document itself is stored,
|
||||||
and it can be retrieved from it).
|
and it can be retrieved from it).
|
||||||
|
|
||||||
|`index` |Set to `no` if the value should not be indexed. Setting to
|
|`index` |Set to `no` if the value should not be indexed. Setting to
|
||||||
`no` disables `include_in_all`. If set to `no` the field can be stored
|
`no` disables `include_in_all`. If set to `no` the field should be either stored
|
||||||
in `_source`, have `include_in_all` enabled, or `store` should be set to
|
in `_source`, have `include_in_all` enabled, or `store` be set to
|
||||||
`true` for this to be useful.
|
`true` for this to be useful.
|
||||||
|
|
||||||
|`doc_values` |Set to `true` to store field values in a column-stride fashion.
|
|`doc_values` |Set to `true` to store field values in a column-stride fashion.
|
||||||
|
@@ -317,8 +317,8 @@ store it. Defaults to `false` (note, the JSON document itself is stored,
|
||||||
and it can be retrieved from it).
|
and it can be retrieved from it).
|
||||||
|
|
||||||
|`index` |Set to `no` if the value should not be indexed. Setting to
|
|`index` |Set to `no` if the value should not be indexed. Setting to
|
||||||
`no` disables `include_in_all`. If set to `no` the field can be stored
|
`no` disables `include_in_all`. If set to `no` the field should be either stored
|
||||||
in `_source`, have `include_in_all` enabled, or `store` should be set to
|
in `_source`, have `include_in_all` enabled, or `store` be set to
|
||||||
`true` for this to be useful.
|
`true` for this to be useful.
|
||||||
|
|
||||||
|`doc_values` |Set to `true` to store field values in a column-stride fashion.
|
|`doc_values` |Set to `true` to store field values in a column-stride fashion.
|
||||||
|
@@ -380,8 +380,8 @@ store it. Defaults to `false` (note, the JSON document itself is stored,
|
||||||
and it can be retrieved from it).
|
and it can be retrieved from it).
|
||||||
|
|
||||||
|`index` |Set to `no` if the value should not be indexed. Setting to
|
|`index` |Set to `no` if the value should not be indexed. Setting to
|
||||||
`no` disables `include_in_all`. If set to `no` the field can be stored
|
`no` disables `include_in_all`. If set to `no` the field should be either stored
|
||||||
in `_source`, have `include_in_all` enabled, or `store` should be set to
|
in `_source`, have `include_in_all` enabled, or `store` be set to
|
||||||
`true` for this to be useful.
|
`true` for this to be useful.
|
||||||
|
|
||||||
|`boost` |The boost value. Defaults to `1.0`.
|
|`boost` |The boost value. Defaults to `1.0`.
|
||||||
|
@@ -488,13 +488,13 @@ Elasticsearch has several builtin formats:
|
||||||
contained in a very low number of documents.
|
contained in a very low number of documents.
|
||||||
|
|
||||||
`pulsing`::
|
`pulsing`::
|
||||||
A postings format in-lines the posting lists for very low
|
A postings format that in-lines the posting lists for very low
|
||||||
frequent terms in the term dictionary. This is useful to improve lookup
|
frequent terms in the term dictionary. This is useful to improve lookup
|
||||||
performance for low-frequent terms.
|
performance for low-frequent terms.
|
||||||
|
|
||||||
`bloom_default`::
|
`bloom_default`::
|
||||||
A postings format that uses a bloom filter to
|
A postings format that uses a bloom filter to
|
||||||
improve term lookup performance. This is useful for primarily keys or
|
improve term lookup performance. This is useful for primary keys or
|
||||||
fields that are used as a delete key.
|
fields that are used as a delete key.
|
||||||
|
|
||||||
`bloom_pulsing`::
|
`bloom_pulsing`::
|
||||||
|
@ -579,10 +579,8 @@ custom doc values formats. See
|
||||||
==== Similarity
|
==== Similarity
|
||||||
|
|
||||||
Elasticsearch allows you to configure a similarity (scoring algorithm) per field.
|
Elasticsearch allows you to configure a similarity (scoring algorithm) per field.
|
||||||
Allowing users a simpler extension beyond the usual TF/IDF algorithm. As
|
The `similarity` setting provides a simple way of choosing a similarity algorithm
|
||||||
part of this, new algorithms have been added including BM25. Also as
|
other than the default TF/IDF, such as `BM25`.
|
||||||
part of the changes, it is now possible to define a Similarity per
|
|
||||||
field, giving even greater control over scoring.
|
|
||||||
|
|
||||||
You can configure similarities via the
|
You can configure similarities via the
|
||||||
<<index-modules-similarity,similarity module>>
|
<<index-modules-similarity,similarity module>>
|
||||||
|
|
|
@ -139,21 +139,21 @@ number of terms that will be indexed depends on the `geohash_precision`.
|
||||||
Defaults to `false`. *Note*: This option implicitly enables `geohash`.
|
Defaults to `false`. *Note*: This option implicitly enables `geohash`.
|
||||||
|
|
||||||
|`validate` |Set to `true` to reject geo points with invalid latitude or
|
|`validate` |Set to `true` to reject geo points with invalid latitude or
|
||||||
longitude (default is `false`) *Note*: Validation only works when
|
longitude (default is `false`). *Note*: Validation only works when
|
||||||
normalization has been disabled.
|
normalization has been disabled.
|
||||||
|
|
||||||
|`validate_lat` |Set to `true` to reject geo points with an invalid
|
|`validate_lat` |Set to `true` to reject geo points with an invalid
|
||||||
latitude
|
latitude.
|
||||||
|
|
||||||
|`validate_lon` |Set to `true` to reject geo points with an invalid
|
|`validate_lon` |Set to `true` to reject geo points with an invalid
|
||||||
longitude
|
longitude.
|
||||||
|
|
||||||
|`normalize` |Set to `true` to normalize latitude and longitude (default
|
|`normalize` |Set to `true` to normalize latitude and longitude (default
|
||||||
is `true`)
|
is `true`).
|
||||||
|
|
||||||
|`normalize_lat` |Set to `true` to normalize latitude
|
|`normalize_lat` |Set to `true` to normalize latitude.
|
||||||
|
|
||||||
|`normalize_lon` |Set to `true` to normalize longitude
|
|`normalize_lon` |Set to `true` to normalize longitude.
|
||||||
|=======================================================================
|
|=======================================================================
|
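The `validate`, `validate_lat` and `validate_lon` options in the table above reject points whose coordinates are out of range. A minimal sketch of such a check, assuming the conventional ranges of -90 to 90 for latitude and -180 to 180 for longitude. The function name and its parameters are hypothetical and only mirror the option names; this is not Elasticsearch's own code.

```python
def is_valid_geo_point(lat, lon, validate_lat=True, validate_lon=True):
    """Return True if the point passes every enabled range check.

    Hypothetical helper: the two flags only mirror the mapping option
    names; the ranges are the conventional geographic bounds.
    """
    if validate_lat and not (-90.0 <= lat <= 90.0):
        return False
    if validate_lon and not (-180.0 <= lon <= 180.0):
        return False
    return True
```

Disabling one flag skips the corresponding check, so a point with an out-of-range latitude is accepted when only longitude is validated.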
||||||
|
|
||||||
[float]
|
[float]
|
||||||
|
|
|
@ -88,8 +88,8 @@ configured it may return some false positives or false negatives for
|
||||||
certain queries. To mitigate this, it is important to select an
|
certain queries. To mitigate this, it is important to select an
|
||||||
appropriate value for the tree_levels parameter and to adjust
|
appropriate value for the tree_levels parameter and to adjust
|
||||||
expectations accordingly. For example, a point may be near the border of
|
expectations accordingly. For example, a point may be near the border of
|
||||||
a particular grid cell. And may not match a query that only matches the
|
a particular grid cell and may thus not match a query that only matches the
|
||||||
cell right next to it even though the shape is very close to the point.
|
cell right next to it -- even though the shape is very close to the point.
|
||||||
|
|
||||||
[float]
|
[float]
|
||||||
===== Example
|
===== Example
|
||||||
|
@ -116,8 +116,8 @@ this into a tree_levels setting of 26.
|
||||||
|
|
||||||
Elasticsearch uses the paths in the prefix tree as terms in the index
|
Elasticsearch uses the paths in the prefix tree as terms in the index
|
||||||
and in queries. The higher the levels is (and thus the precision), the
|
and in queries. The higher the levels is (and thus the precision), the
|
||||||
more terms are generated. Both calculating the terms, keeping them in
|
more terms are generated. Of course, calculating the terms, keeping them in
|
||||||
memory, and storing them has a price of course. Especially with higher
|
memory, and storing them on disk all have a price. Especially with higher
|
||||||
tree levels, indices can become extremely large even with a modest
|
tree levels, indices can become extremely large even with a modest
|
||||||
amount of data. Additionally, the size of the features also matters.
|
amount of data. Additionally, the size of the features also matters.
|
||||||
Big, complex polygons can take up a lot of space at higher tree levels.
|
Big, complex polygons can take up a lot of space at higher tree levels.
|
||||||
|
@ -174,7 +174,7 @@ for upper left and lower right points of the shape:
|
||||||
===== http://www.geojson.org/geojson-spec.html#id4[Polygon]
|
===== http://www.geojson.org/geojson-spec.html#id4[Polygon]
|
||||||
|
|
||||||
A polygon is defined by a list of a list of points. The first and last
|
A polygon is defined by a list of a list of points. The first and last
|
||||||
points in each list must be the same (the polygon must be closed).
|
points in each (outer) list must be the same (the polygon must be closed).
|
||||||
|
|
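The closure rule can be checked before a document is sent for indexing. A small sketch, assuming the polygon is given as a list of rings and each ring as a list of `[lon, lat]` pairs; the helper name is hypothetical.

```python
def unclosed_rings(coordinates):
    """Return the indices of rings whose first and last points differ.

    `coordinates` is assumed to be a list of rings, each ring a list of
    [lon, lat] pairs. An empty ring is reported as unclosed.
    """
    return [
        index
        for index, ring in enumerate(coordinates)
        if not ring or ring[0] != ring[-1]
    ]
```

An empty result means every ring starts and ends on the same point.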
||||||
[source,js]
|
[source,js]
|
||||||
--------------------------------------------------
|
--------------------------------------------------
|
||||||
|
|
|
@ -134,7 +134,7 @@ example, if we added age and its value is a number, then it can't be
|
||||||
treated as a string.
|
treated as a string.
|
||||||
|
|
||||||
The `dynamic` parameter can also be set to `strict`, meaning that not
|
The `dynamic` parameter can also be set to `strict`, meaning that not
|
||||||
only new fields will not be introduced into the mapping, parsing
|
only will new fields not be introduced into the mapping, but also that parsing
|
||||||
(indexing) docs with such new fields will fail.
|
(indexing) docs with such new fields will fail.
|
||||||
|
|
||||||
[float]
|
[float]
|
||||||
|
@ -173,6 +173,6 @@ In the above, `name` and its content will not be indexed at all.
|
||||||
==== include_in_all
|
==== include_in_all
|
||||||
|
|
||||||
`include_in_all` can be set on the `object` type level. When set, it
|
`include_in_all` can be set on the `object` type level. When set, it
|
||||||
propagates down to all the inner mapping defined within the `object`
|
propagates down to all the inner mappings defined within the `object`
|
||||||
that do no explicitly set it.
|
that do no explicitly set it.
|
||||||
|
|
||||||
|
|
|
@ -90,7 +90,7 @@ date fields, not for `date` fields that you specify in your mapping.
|
||||||
[float]
|
[float]
|
||||||
==== date_detection
|
==== date_detection
|
||||||
|
|
||||||
Allows to disable automatic date type detection (a new field introduced
|
Allows to disable automatic date type detection (if a new field is introduced
|
||||||
and matches the provided format), for example:
|
and matches the provided format), for example:
|
||||||
|
|
||||||
[source,js]
|
[source,js]
|
||||||
|
|
|
@ -123,7 +123,7 @@ specific attributes.
|
||||||
|
|
||||||
For example, lets say we have an awareness attribute called `zone`, and
|
For example, lets say we have an awareness attribute called `zone`, and
|
||||||
we know we are going to have two zones, `zone1` and `zone2`. Here is how
|
we know we are going to have two zones, `zone1` and `zone2`. Here is how
|
||||||
we can force awareness one a node:
|
we can force awareness on a node:
|
||||||
|
|
||||||
[source,js]
|
[source,js]
|
||||||
-------------------------------------------------------------------
|
-------------------------------------------------------------------
|
||||||
|
@ -153,7 +153,7 @@ The settings can be updated using the <<cluster-update-settings,cluster update s
|
||||||
[[allocation-filtering]]
|
[[allocation-filtering]]
|
||||||
=== Shard Allocation Filtering
|
=== Shard Allocation Filtering
|
||||||
|
|
||||||
Allow to control allocation if indices on nodes based on include/exclude
|
Allow to control allocation of indices on nodes based on include/exclude
|
||||||
filters. The filters can be set both on the index level and on the
|
filters. The filters can be set both on the index level and on the
|
||||||
cluster level. Lets start with an example of setting it on the cluster
|
cluster level. Lets start with an example of setting it on the cluster
|
||||||
level:
|
level:
|
||||||
|
|
|
@ -38,7 +38,7 @@ Accept-Encoding). Defaults to `false`.
|
||||||
Defaults to `6`.
|
Defaults to `6`.
|
||||||
|=======================================================================
|
|=======================================================================
|
||||||
|
|
||||||
It also shares the uses the common
|
It also uses the common
|
||||||
<<modules-network,network settings>>.
|
<<modules-network,network settings>>.
|
||||||
|
|
||||||
[float]
|
[float]
|
||||||
|
|
|
@ -16,8 +16,8 @@ The `indices.memory.index_buffer_size` accepts either a percentage or a
|
||||||
byte size value. It defaults to `10%`, meaning that `10%` of the total
|
byte size value. It defaults to `10%`, meaning that `10%` of the total
|
||||||
memory allocated to a node will be used as the indexing buffer size.
|
memory allocated to a node will be used as the indexing buffer size.
|
||||||
This amount is then divided between all the different shards. Also, if
|
This amount is then divided between all the different shards. Also, if
|
||||||
percentage is used, allow to set `min_index_buffer_size` (defaults to
|
percentage is used, it is possible to set `min_index_buffer_size` (defaults to
|
||||||
`48mb`) and `max_index_buffer_size` which by default is unbounded.
|
`48mb`) and `max_index_buffer_size` (defaults to unbounded).
|
||||||
|
|
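The sizing rule described above can be sketched as simple arithmetic. This is an illustration only: it assumes "memory allocated to a node" means the heap size in bytes and that the total is split evenly between shards, which the text does not spell out. Both function names are hypothetical.

```python
MB = 1024 ** 2


def index_buffer_bytes(heap_bytes, percent=10, min_bytes=48 * MB, max_bytes=None):
    """Total indexing buffer for a node when a percentage is configured.

    Defaults mirror the documented ones: 10%, a 48mb minimum and no
    maximum (max_bytes=None stands for "unbounded").
    """
    size = heap_bytes * percent // 100
    size = max(size, min_bytes)
    if max_bytes is not None:
        size = min(size, max_bytes)
    return size


def per_shard_buffer_bytes(total_bytes, shard_count):
    """Assumption: the total is divided evenly between the shards."""
    return total_bytes // shard_count
```

With these defaults, a node with a small heap still gets the 48mb minimum, while a large heap is only capped if a maximum is set.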
||||||
The `indices.memory.min_shard_index_buffer_size` allows to set a hard
|
The `indices.memory.min_shard_index_buffer_size` allows to set a hard
|
||||||
lower limit for the memory allocated per shard for its own indexing
|
lower limit for the memory allocated per shard for its own indexing
|
||||||
|
@ -27,7 +27,7 @@ buffer. It defaults to `4mb`.
|
||||||
[[indices-ttl]]
|
[[indices-ttl]]
|
||||||
=== TTL interval
|
=== TTL interval
|
||||||
|
|
||||||
You can dynamically set the `indices.ttl.interval` allows to set how
|
You can dynamically set the `indices.ttl.interval`, which allows to set how
|
||||||
often expired documents will be automatically deleted. The default value
|
often expired documents will be automatically deleted. The default value
|
||||||
is 60s.
|
is 60s.
|
||||||
|
|
||||||
|
@ -40,7 +40,7 @@ See also <<mapping-ttl-field>>.
|
||||||
[[recovery]]
|
[[recovery]]
|
||||||
=== Recovery
|
=== Recovery
|
||||||
|
|
||||||
The following settings can be set to manage recovery policy:
|
The following settings can be set to manage the recovery policy:
|
||||||
|
|
||||||
[horizontal]
|
[horizontal]
|
||||||
`indices.recovery.concurrent_streams`::
|
`indices.recovery.concurrent_streams`::
|
||||||
|
@ -65,7 +65,7 @@ The following settings can be set to manage recovery policy:
|
||||||
[[throttling]]
|
[[throttling]]
|
||||||
=== Store level throttling
|
=== Store level throttling
|
||||||
|
|
||||||
The following settings can be set to control store throttling:
|
The following settings can be set to control the store throttling:
|
||||||
|
|
||||||
[horizontal]
|
[horizontal]
|
||||||
`indices.store.throttle.type`::
|
`indices.store.throttle.type`::
|
||||||
|
|
|
@ -58,7 +58,7 @@ The following are the settings the can be configured for memcached:
|
||||||
|`memcached.port` |A bind port range. Defaults to `11211-11311`.
|
|`memcached.port` |A bind port range. Defaults to `11211-11311`.
|
||||||
|===============================================================
|
|===============================================================
|
||||||
|
|
||||||
It also shares the uses the common
|
It also uses the common
|
||||||
<<modules-network,network settings>>.
|
<<modules-network,network settings>>.
|
||||||
|
|
||||||
[float]
|
[float]
|
||||||
|
|
|
@ -81,7 +81,7 @@ to `false`.
|
||||||
[float]
|
[float]
|
||||||
=== Native (Java) Scripts
|
=== Native (Java) Scripts
|
||||||
|
|
||||||
Even though `mvel` is pretty fast, allow to register native Java based
|
Even though `mvel` is pretty fast, this allows to register native Java based
|
||||||
scripts for faster execution.
|
scripts for faster execution.
|
||||||
|
|
||||||
In order to allow for scripts, the `NativeScriptFactory` needs to be
|
In order to allow for scripts, the `NativeScriptFactory` needs to be
|
||||||
|
@ -174,7 +174,7 @@ of this geo point field from the provided geohash.
|
||||||
[float]
|
[float]
|
||||||
=== Stored Fields
|
=== Stored Fields
|
||||||
|
|
||||||
Stored fields can also be accessed when executed a script. Note, they
|
Stored fields can also be accessed when executing a script. Note, they
|
||||||
are much slower to access compared with document fields, but are not
|
are much slower to access compared with document fields, but are not
|
||||||
loaded into memory. They can be simply accessed using
|
loaded into memory. They can be simply accessed using
|
||||||
`_fields['my_field_name'].value` or `_fields['my_field_name'].values`.
|
`_fields['my_field_name'].value` or `_fields['my_field_name'].values`.
|
||||||
|
|
|
@ -11,8 +11,8 @@ The transport mechanism is completely asynchronous in nature, meaning
|
||||||
that there is no blocking thread waiting for a response. The benefit of
|
that there is no blocking thread waiting for a response. The benefit of
|
||||||
using asynchronous communication is first solving the
|
using asynchronous communication is first solving the
|
||||||
http://en.wikipedia.org/wiki/C10k_problem[C10k problem], as well as
|
http://en.wikipedia.org/wiki/C10k_problem[C10k problem], as well as
|
||||||
being the idle solution for scatter (broadcast) / gather operations such
|
being the ideal solution for scatter (broadcast) / gather operations such
|
||||||
as search in Elasticsearch.
|
as search in ElasticSearch.
|
||||||
|
|
||||||
[float]
|
[float]
|
||||||
=== TCP Transport
|
=== TCP Transport
|
||||||
|
@ -38,7 +38,7 @@ time setting format). Defaults to `30s`.
|
||||||
between all nodes. Defaults to `false`.
|
between all nodes. Defaults to `false`.
|
||||||
|=======================================================================
|
|=======================================================================
|
||||||
|
|
||||||
It also shares the uses the common
|
It also uses the common
|
||||||
<<modules-network,network settings>>.
|
<<modules-network,network settings>>.
|
||||||
|
|
||||||
[float]
|
[float]
|
||||||
|
|