Reorganize docs of global ordinals. (#24982)

Currently global ordinals are documented under `fielddata`. This commit moves
them to their own file, since they also work with doc values and fielddata is
on the way out.

Closes #23101
Adrien Grand 2017-06-01 16:47:44 +02:00 committed by GitHub
parent 7526c29a05
commit ebf806d38f
6 changed files with 80 additions and 37 deletions


@ -119,7 +119,7 @@ GET my_index/_search
==== Global ordinals
Parent-child uses <<global-ordinals,global ordinals>> to speed up joins.
Parent-child uses <<eager-global-ordinals,global ordinals>> to speed up joins.
Global ordinals need to be rebuilt after any change to a shard. The more
parent id values are stored in a shard, the longer it takes to rebuild the
global ordinals for the `_parent` field.


@ -16,6 +16,7 @@ The following mapping parameters are common to some or all field datatypes:
* <<dynamic,`dynamic`>>
* <<enabled,`enabled`>>
* <<fielddata,`fielddata`>>
* <<eager-global-ordinals,`eager_global_ordinals`>>
* <<mapping-date-format,`format`>>
* <<ignore-above,`ignore_above`>>
* <<ignore-malformed,`ignore_malformed`>>
@ -48,6 +49,8 @@ include::params/dynamic.asciidoc[]
include::params/enabled.asciidoc[]
include::params/eager-global-ordinals.asciidoc[]
include::params/fielddata.asciidoc[]
include::params/format.asciidoc[]


@ -0,0 +1,74 @@
[[eager-global-ordinals]]
=== `eager_global_ordinals`
Global ordinals are a data structure built on top of doc values that maintains
an incremental numbering for each unique term in lexicographic order. Each
term has a unique number, and if term 'A' sorts before term 'B', the number of
'A' is lower than the number of 'B'. Global ordinals are only supported with
<<keyword,`keyword`>> and <<text,`text`>> fields. In `keyword` fields, they
are available by default, but `text` fields can only use them when `fielddata`,
with all of its associated baggage, is enabled.
Doc values (and fielddata) also have ordinals, which are a unique numbering for
all terms in a particular segment and field. Global ordinals build on top of
this by providing a mapping between the segment ordinals and the global
ordinals, the latter being unique across the entire shard. Given that global
ordinals for a specific field are tied to _all the segments of a shard_, they
need to be entirely rebuilt whenever a new segment becomes visible.
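
As a rough illustration of the mapping described above, the following Python sketch numbers the union of all segment terms and builds one lookup table per segment. It is not how Elasticsearch or Lucene implement this internally (the real structures are compressed); the function name `build_global_ordinals` and the sample segments are hypothetical.

```python
def build_global_ordinals(segments):
    """Map each segment's ordinals to shard-wide global ordinals.

    `segments` is a list of sorted, de-duplicated term lists. The position
    of a term inside its list plays the role of its segment ordinal.
    """
    # Global ordinals number the union of all terms in sorted order.
    global_terms = sorted(set(term for seg in segments for term in seg))
    global_ord = {term: i for i, term in enumerate(global_terms)}
    # One table per segment: index = segment ordinal, value = global ordinal.
    mappings = [[global_ord[term] for term in seg] for seg in segments]
    return global_terms, mappings


# Two hypothetical segments of the same shard and field.
segments = [["apple", "cherry"], ["banana", "cherry", "date"]]
global_terms, mappings = build_global_ordinals(segments)
print(global_terms)  # ['apple', 'banana', 'cherry', 'date']
print(mappings)      # [[0, 2], [1, 2, 3]]
```

Because the numbering depends on the terms of every segment, adding a segment that contains a new term (say `"avocado"`) can shift the global ordinal of existing terms, which is why the whole mapping is rebuilt rather than patched.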
Global ordinals are used for features that use segment ordinals, such as
the <<search-aggregations-bucket-terms-aggregation,`terms` aggregation>>,
to improve the execution time. A terms aggregation relies purely on global
ordinals to perform the aggregation at the shard level, then converts global
ordinals to the real term only for the final reduce phase, which combines
results from different shards.
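
The idea of aggregating on ordinals and resolving terms late can be sketched in Python as follows. This is a simplified model under stated assumptions, not the actual aggregation code: the function `terms_agg_on_shard`, its parameters, and the sample data are all hypothetical, and for brevity the sketch resolves terms at the end of the shard-level pass.

```python
def terms_agg_on_shard(segment_docs, segment_to_global, global_terms, size=10):
    """Count term occurrences per global ordinal, resolving terms only at the end.

    segment_docs[s] is the list of documents in segment s; each document is
    a list of segment ordinals for the aggregated field.
    """
    counts = [0] * len(global_terms)  # one bucket per global ordinal
    for seg_idx, docs in enumerate(segment_docs):
        to_global = segment_to_global[seg_idx]
        for doc_ords in docs:
            for seg_ord in doc_ords:
                # Integer work only: no term bytes are read or compared here.
                counts[to_global[seg_ord]] += 1
    # Pick the top buckets by count (ties broken by ordinal), then look up terms.
    top = sorted(range(len(counts)), key=lambda o: (-counts[o], o))[:size]
    return [(global_terms[o], counts[o]) for o in top if counts[o] > 0]


# Hypothetical shard with two segments.
global_terms = ["apple", "banana", "cherry", "date"]
segment_to_global = [[0, 2], [1, 2, 3]]
segment_docs = [
    [[0], [0, 1], [1]],   # segment 0: apple; apple+cherry; cherry
    [[0], [1], [1, 2]],   # segment 1: banana; cherry; cherry+date
]
buckets = terms_agg_on_shard(segment_docs, segment_to_global, global_terms, size=3)
print(buckets)  # [('cherry', 4), ('apple', 2), ('banana', 1)]
```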
The loading time of global ordinals depends on the number of terms in a field,
but in general it is low, since the source field data has already been loaded.
The memory overhead of global ordinals is small because they are very
efficiently compressed.
By default, global ordinals are loaded at search-time, which is the right
trade-off if you are optimizing for indexing speed. However, if you are more
interested in search speed, consider setting `eager_global_ordinals: true` on
fields that you plan to use in terms aggregations:
[source,js]
------------
PUT my_index/_mapping/my_type
{
  "properties": {
    "tags": {
      "type": "keyword",
      "eager_global_ordinals": true
    }
  }
}
------------
// CONSOLE
// TEST[s/^/PUT my_index\n/]
This will shift the cost from search-time to refresh-time. Elasticsearch will
make sure that global ordinals are built before publishing updates to the
content of the index.
If you ever decide that you do not need to run `terms` aggregations on this
field anymore, then you can disable eager loading of global ordinals at any
time:
[source,js]
------------
PUT my_index/_mapping/my_type
{
  "properties": {
    "tags": {
      "type": "keyword",
      "eager_global_ordinals": false
    }
  }
}
------------
// CONSOLE
// TEST[continued]


@ -105,40 +105,6 @@ same name in the same index. Its value can be updated on existing fields
using the <<indices-put-mapping,PUT mapping API>>.
[[global-ordinals]]
.Global ordinals
*****************************************
Global ordinals is a data-structure on top of fielddata and doc values, that
maintains an incremental numbering for each unique term in a lexicographic
order. Each term has a unique number and the number of term 'A' is lower than
the number of term 'B'. Global ordinals are only supported on <<text,`text`>>
and <<keyword,`keyword`>> fields.
Fielddata and doc values also have ordinals, which is a unique numbering for
all terms in a particular segment and field. Global ordinals just build on top
of this, by providing a mapping between the segment ordinals and the global
ordinals, the latter being unique across the entire shard.
Global ordinals are used for features that use segment ordinals, such as
sorting and the terms aggregation, to improve the execution time. A terms
aggregation relies purely on global ordinals to perform the aggregation at the
shard level, then converts global ordinals to the real term only for the final
reduce phase, which combines results from different shards.
Global ordinals for a specified field are tied to _all the segments of a
shard_, while fielddata and doc values ordinals are tied to a single segment.
For this reason global ordinals need to be entirely rebuilt whenever a new
segment becomes visible.
The loading time of global ordinals depends on the number of terms in a field,
but in general it is low, since the source field data has already been loaded.
The memory overhead of global ordinals is small because they are very
efficiently compressed.
*****************************************
[[field-data-filtering]]
==== `fielddata_frequency_filter`


@ -48,7 +48,7 @@ The following parameters are accepted by `keyword` fields:
can later be used for sorting, aggregations, or scripting? Accepts `true`
(default) or `false`.
<<global-ordinals,`eager_global_ordinals`>>::
<<eager-global-ordinals,`eager_global_ordinals`>>::
Should global ordinals be loaded eagerly on refresh? Accepts `true` or `false`
(default). Enabling this is a good idea on fields that are frequently used for


@ -57,7 +57,7 @@ The following parameters are accepted by `text` fields:
Mapping field-level query time boosting. Accepts a floating point number, defaults
to `1.0`.
<<global-ordinals,`eager_global_ordinals`>>::
<<eager-global-ordinals,`eager_global_ordinals`>>::
Should global ordinals be loaded eagerly on refresh? Accepts `true` or `false`
(default). Enabling this is a good idea on fields that are frequently used for