Reorganize docs of global ordinals. (#24982)
Currently global ordinals are documented under `fielddata`. This commit moves them to their own file since they also work with doc values and fielddata is on the way out. Closes #23101
This commit is contained in:
parent
7526c29a05
commit
ebf806d38f
|
@ -119,7 +119,7 @@ GET my_index/_search
|
|||
|
||||
==== Global ordinals
|
||||
|
||||
Parent-child uses <<global-ordinals,global ordinals>> to speed up joins.
|
||||
Parent-child uses <<eager-global-ordinals,global ordinals>> to speed up joins.
|
||||
Global ordinals need to be rebuilt after any change to a shard. The more
|
||||
parent id values are stored in a shard, the longer it takes to rebuild the
|
||||
global ordinals for the `_parent` field.
|
||||
|
|
|
@ -16,6 +16,7 @@ The following mapping parameters are common to some or all field datatypes:
|
|||
* <<dynamic,`dynamic`>>
|
||||
* <<enabled,`enabled`>>
|
||||
* <<fielddata,`fielddata`>>
|
||||
* <<eager-global-ordinals,`eager_global_ordinals`>>
|
||||
* <<mapping-date-format,`format`>>
|
||||
* <<ignore-above,`ignore_above`>>
|
||||
* <<ignore-malformed,`ignore_malformed`>>
|
||||
|
@ -48,6 +49,8 @@ include::params/dynamic.asciidoc[]
|
|||
|
||||
include::params/enabled.asciidoc[]
|
||||
|
||||
include::params/eager-global-ordinals.asciidoc[]
|
||||
|
||||
include::params/fielddata.asciidoc[]
|
||||
|
||||
include::params/format.asciidoc[]
|
||||
|
|
|
@ -0,0 +1,74 @@
|
|||
[[eager-global-ordinals]]
|
||||
=== `eager_global_ordinals`
|
||||
|
||||
Global ordinals are a data structure on top of doc values that maintains an
|
||||
incremental numbering for each unique term in a lexicographic order. Each
|
||||
term has a unique number and the number of term 'A' is lower than the
|
||||
number of term 'B'. Global ordinals are only supported with
|
||||
<<keyword,`keyword`>> and <<text,`text`>> fields. In `keyword` fields, they
|
||||
are available by default but `text` fields can only use them when `fielddata`,
|
||||
with all of its associated baggage, is enabled.
|
||||
|
||||
Doc values (and fielddata) also have ordinals, which are a unique numbering for
|
||||
all terms in a particular segment and field. Global ordinals just build on top
|
||||
of this, by providing a mapping between the segment ordinals and the global
|
||||
ordinals, the latter being unique across the entire shard. Given that global
|
||||
ordinals for a specific field are tied to _all the segments of a shard_, they
|
||||
need to be entirely rebuilt whenever a new segment becomes visible.
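
The relationship between segment ordinals and global ordinals described above can be sketched in a few lines of Python. This is only an illustrative model, not Elasticsearch's or Lucene's actual implementation: the function name `build_global_ordinals` and the representation of a segment as a sorted list of its unique terms are assumptions made for the example.

```python
from typing import List, Tuple


def build_global_ordinals(
    segments: List[List[str]],
) -> Tuple[List[str], List[List[int]]]:
    """Build a shard-wide term numbering from per-segment term lists.

    Each entry of `segments` is the sorted list of unique terms in one
    segment, so a term's index in that list is its segment ordinal.
    (Hypothetical representation, chosen for illustration only.)
    """
    # Shard-wide dictionary: every unique term, in lexicographic order.
    # A term's index in this list is its global ordinal.
    global_terms = sorted({term for seg in segments for term in seg})
    term_to_global = {term: ord_ for ord_, term in enumerate(global_terms)}

    # Per segment: segment ordinal -> global ordinal.
    seg_to_global = [[term_to_global[t] for t in seg] for seg in segments]
    return global_terms, seg_to_global


# Two segments of the same shard, each with its own local numbering.
segments = [["apple", "cherry"], ["banana", "cherry", "date"]]
global_terms, seg_to_global = build_global_ordinals(segments)
```

In this model, adding a third segment changes `global_terms` and therefore every mapping, which is why the whole structure has to be rebuilt when a new segment becomes visible.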
|
||||
|
||||
Global ordinals are used for features that use segment ordinals, such as
|
||||
the <<search-aggregations-bucket-terms-aggregation,`terms` aggregation>>,
|
||||
to improve the execution time. A terms aggregation relies purely on global
|
||||
ordinals to perform the aggregation at the shard level, then converts global
|
||||
ordinals to the real term only for the final reduce phase, which combines
|
||||
results from different shards.
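
As a rough illustration of the paragraph above, the following Python sketch counts documents per global ordinal at the shard level and only resolves ordinals back to terms in the reduce step. It is a simplified model under stated assumptions, not the actual aggregation code: the names `shard_terms_counts` and `reduce_shards` are hypothetical, and each document is assumed to hold exactly one term, stored as a segment ordinal.

```python
from collections import Counter
from typing import Dict, List, Tuple


def shard_terms_counts(
    num_global_terms: int,
    seg_to_global: List[List[int]],
    seg_docs: List[List[int]],
) -> List[int]:
    """Count documents per global ordinal without touching term strings.

    `seg_docs[i]` holds one segment ordinal per document in segment i
    (hypothetical single-valued field).
    """
    counts = [0] * num_global_terms
    for mapping, docs in zip(seg_to_global, seg_docs):
        for seg_ord in docs:
            counts[mapping[seg_ord]] += 1
    return counts


def reduce_shards(
    shard_results: List[Tuple[List[str], List[int]]],
) -> Dict[str, int]:
    """Merge per-shard counts; ordinals become real terms only here,
    because ordinals from different shards are not comparable."""
    total: Counter = Counter()
    for global_terms, counts in shard_results:
        for ord_, count in enumerate(counts):
            if count:
                total[global_terms[ord_]] += count
    return dict(total)


# Shard A: two segments; shard B: one segment.
terms_a = ["apple", "banana", "cherry", "date"]
counts_a = shard_terms_counts(4, [[0, 2], [1, 2, 3]], [[0, 1, 1], [2, 1]])
terms_b = ["banana", "cherry"]
counts_b = shard_terms_counts(2, [[0, 1]], [[0, 0, 1]])

result = reduce_shards([(terms_a, counts_a), (terms_b, counts_b)])
```

The shard-level loop works purely on integers, which is the point of the design: term strings are looked up once per bucket in the reduce step rather than once per document.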
|
||||
|
||||
The loading time of global ordinals depends on the number of terms in a field,
|
||||
but in general it is low, since the source field data has already been loaded.
|
||||
The memory overhead of global ordinals is small because they are very
|
||||
efficiently compressed.
|
||||
|
||||
By default, global ordinals are loaded at search-time, which is the right
|
||||
trade-off if you are optimizing for indexing speed. However, if you are more
|
||||
interested in search speed, it could be beneficial to set
|
||||
`eager_global_ordinals: true` on fields that you plan to use in terms
|
||||
aggregations:
|
||||
|
||||
[source,js]
------------
PUT my_index/_mapping/my_type
{
  "properties": {
    "tags": {
      "type": "keyword",
      "eager_global_ordinals": true
    }
  }
}
------------
// CONSOLE
// TEST[s/^/PUT my_index\n/]
|
||||
|
||||
This will shift the cost from search-time to refresh-time. Elasticsearch will
|
||||
make sure that global ordinals are built before publishing updates to the
|
||||
content of the index.
|
||||
|
||||
If you ever decide that you do not need to run `terms` aggregations on this
|
||||
field anymore, then you can disable eager loading of global ordinals at any
|
||||
time:
|
||||
|
||||
[source,js]
------------
PUT my_index/_mapping/my_type
{
  "properties": {
    "tags": {
      "type": "keyword",
      "eager_global_ordinals": false
    }
  }
}
------------
// CONSOLE
// TEST[continued]
|
||||
|
|
@ -105,40 +105,6 @@ same name in the same index. Its value can be updated on existing fields
|
|||
using the <<indices-put-mapping,PUT mapping API>>.
|
||||
|
||||
|
||||
[[global-ordinals]]
|
||||
.Global ordinals
|
||||
*****************************************
|
||||
|
||||
Global ordinals is a data-structure on top of fielddata and doc values, that
|
||||
maintains an incremental numbering for each unique term in a lexicographic
|
||||
order. Each term has a unique number and the number of term 'A' is lower than
|
||||
the number of term 'B'. Global ordinals are only supported on <<text,`text`>>
|
||||
and <<keyword,`keyword`>> fields.
|
||||
|
||||
Fielddata and doc values also have ordinals, which is a unique numbering for
|
||||
all terms in a particular segment and field. Global ordinals just build on top
|
||||
of this, by providing a mapping between the segment ordinals and the global
|
||||
ordinals, the latter being unique across the entire shard.
|
||||
|
||||
Global ordinals are used for features that use segment ordinals, such as
|
||||
sorting and the terms aggregation, to improve the execution time. A terms
|
||||
aggregation relies purely on global ordinals to perform the aggregation at the
|
||||
shard level, then converts global ordinals to the real term only for the final
|
||||
reduce phase, which combines results from different shards.
|
||||
|
||||
Global ordinals for a specified field are tied to _all the segments of a
|
||||
shard_, while fielddata and doc values ordinals are tied to a single segment.
|
||||
which is different than for field data for a specific field which is tied to a
|
||||
single segment. For this reason global ordinals need to be entirely rebuilt
|
||||
whenever a once new segment becomes visible.
|
||||
|
||||
The loading time of global ordinals depends on the number of terms in a field,
|
||||
but in general it is low, since it source field data has already been loaded.
|
||||
The memory overhead of global ordinals is a small because it is very
|
||||
efficiently compressed.
|
||||
|
||||
*****************************************
|
||||
|
||||
[[field-data-filtering]]
|
||||
==== `fielddata_frequency_filter`
|
||||
|
||||
|
|
|
@ -48,7 +48,7 @@ The following parameters are accepted by `keyword` fields:
|
|||
can later be used for sorting, aggregations, or scripting? Accepts `true`
|
||||
(default) or `false`.
|
||||
|
||||
<<global-ordinals,`eager_global_ordinals`>>::
|
||||
<<eager-global-ordinals,`eager_global_ordinals`>>::
|
||||
|
||||
Should global ordinals be loaded eagerly on refresh? Accepts `true` or `false`
|
||||
(default). Enabling this is a good idea on fields that are frequently used for
|
||||
|
|
|
@ -57,7 +57,7 @@ The following parameters are accepted by `text` fields:
|
|||
Mapping field-level query time boosting. Accepts a floating point number, defaults
|
||||
to `1.0`.
|
||||
|
||||
<<global-ordinals,`eager_global_ordinals`>>::
|
||||
<<eager-global-ordinals,`eager_global_ordinals`>>::
|
||||
|
||||
Should global ordinals be loaded eagerly on refresh? Accepts `true` or `false`
|
||||
(default). Enabling this is a good idea on fields that are frequently used for
|
||||
|
|