Reorganize docs of global ordinals. (#24982)
Currently global ordinals are documented under `fielddata`. This commit moves them to their own file since they also work with doc values and fielddata is on the way out. Closes #23101
parent
7526c29a05
commit
ebf806d38f
|
@ -119,7 +119,7 @@ GET my_index/_search
|
||||||
|
|
||||||
==== Global ordinals
|
==== Global ordinals
|
||||||
|
|
||||||
Parent-child uses <<global-ordinals,global ordinals>> to speed up joins.
|
Parent-child uses <<eager-global-ordinals,global ordinals>> to speed up joins.
|
||||||
Global ordinals need to be rebuilt after any change to a shard. The more
|
Global ordinals need to be rebuilt after any change to a shard. The more
|
||||||
parent id values are stored in a shard, the longer it takes to rebuild the
|
parent id values are stored in a shard, the longer it takes to rebuild the
|
||||||
global ordinals for the `_parent` field.
|
global ordinals for the `_parent` field.
|
||||||
|
|
|
@ -16,6 +16,7 @@ The following mapping parameters are common to some or all field datatypes:
|
||||||
* <<dynamic,`dynamic`>>
|
* <<dynamic,`dynamic`>>
|
||||||
* <<enabled,`enabled`>>
|
* <<enabled,`enabled`>>
|
||||||
* <<fielddata,`fielddata`>>
|
* <<fielddata,`fielddata`>>
|
||||||
|
* <<eager-global-ordinals,`eager_global_ordinals`>>
|
||||||
* <<mapping-date-format,`format`>>
|
* <<mapping-date-format,`format`>>
|
||||||
* <<ignore-above,`ignore_above`>>
|
* <<ignore-above,`ignore_above`>>
|
||||||
* <<ignore-malformed,`ignore_malformed`>>
|
* <<ignore-malformed,`ignore_malformed`>>
|
||||||
|
@ -48,6 +49,8 @@ include::params/dynamic.asciidoc[]
|
||||||
|
|
||||||
include::params/enabled.asciidoc[]
|
include::params/enabled.asciidoc[]
|
||||||
|
|
||||||
|
include::params/eager-global-ordinals.asciidoc[]
|
||||||
|
|
||||||
include::params/fielddata.asciidoc[]
|
include::params/fielddata.asciidoc[]
|
||||||
|
|
||||||
include::params/format.asciidoc[]
|
include::params/format.asciidoc[]
|
||||||
|
|
|
@ -0,0 +1,74 @@
|
||||||
|
[[eager-global-ordinals]]
|
||||||
|
=== `eager_global_ordinals`
|
||||||
|
|
||||||
|
Global ordinals are a data structure on top of doc values that maintains an
|
||||||
|
incremental numbering for each unique term in a lexicographic order. Each
|
||||||
|
term has a unique number and the number of term 'A' is lower than the
|
||||||
|
number of term 'B'. Global ordinals are only supported with
|
||||||
|
<<keyword,`keyword`>> and <<text,`text`>> fields. In `keyword` fields, they
|
||||||
|
are available by default but `text` fields can only use them when `fielddata`,
|
||||||
|
with all of its associated baggage, is enabled.
|
||||||
|
|
||||||
|
Doc values (and fielddata) also have ordinals, which is a unique numbering for
|
||||||
|
all terms in a particular segment and field. Global ordinals just build on top
|
||||||
|
of this, by providing a mapping between the segment ordinals and the global
|
||||||
|
ordinals, the latter being unique across the entire shard. Given that global
|
||||||
|
ordinals for a specific field are tied to _all the segments of a shard_, they
|
||||||
|
need to be entirely rebuilt whenever a new segment becomes visible.
|
||||||
|
|
||||||
|
Global ordinals are used for features that use segment ordinals, such as
|
||||||
|
the <<search-aggregations-bucket-terms-aggregation,`terms` aggregation>>,
|
||||||
|
to improve the execution time. A terms aggregation relies purely on global
|
||||||
|
ordinals to perform the aggregation at the shard level, then converts global
|
||||||
|
ordinals to the real term only for the final reduce phase, which combines
|
||||||
|
results from different shards.
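As an illustration of the two paragraphs above, here is a minimal Python sketch of how a segment-ordinal to global-ordinal mapping could be built and then used to count terms. This is not Elasticsearch's or Lucene's implementation; the function names `build_global_ordinals` and `terms_counts`, and the list-based data layout, are assumptions made purely for this example.

```python
from collections import Counter


def build_global_ordinals(segments):
    """Build a shard-wide term numbering from per-segment term dictionaries.

    `segments` is a list of sorted lists of unique terms; the index of a term
    inside its list plays the role of its segment ordinal (hypothetical layout).
    """
    # Global ordinals: every unique term in the shard, numbered in sorted order.
    global_terms = sorted({term for seg in segments for term in seg})
    term_to_global = {term: i for i, term in enumerate(global_terms)}
    # One lookup table per segment: segment ordinal -> global ordinal.
    seg_to_global = [[term_to_global[term] for term in seg] for seg in segments]
    return global_terms, seg_to_global


def terms_counts(doc_ords_per_segment, seg_to_global, global_terms):
    """Count documents per term using only integer ordinals.

    Terms are resolved back to strings once, at the very end.
    """
    counts = Counter()
    for seg_idx, doc_ords in enumerate(doc_ords_per_segment):
        mapping = seg_to_global[seg_idx]
        for segment_ord in doc_ords:
            counts[mapping[segment_ord]] += 1
    return {global_terms[g]: n for g, n in counts.items()}


# Two segments with overlapping terms; each document stores one segment ordinal.
segments = [["apple", "cherry"], ["banana", "cherry"]]
global_terms, seg_to_global = build_global_ordinals(segments)
result = terms_counts([[0, 1, 1], [1, 0]], seg_to_global, global_terms)
```

Because the lookup tables depend on the terms of every segment, adding a segment changes the global numbering, which is why the whole structure has to be rebuilt in this sketch as well.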
|
||||||
|
|
||||||
|
The loading time of global ordinals depends on the number of terms in a field,
|
||||||
|
but in general it is low, since the source field data has already been loaded.
|
||||||
|
The memory overhead of global ordinals is small because it is very
|
||||||
|
efficiently compressed.
|
||||||
|
|
||||||
|
By default, global ordinals are loaded at search-time, which is the right
|
||||||
|
trade-off if you are optimizing for indexing speed. However, if you are more
|
||||||
|
interested in search speed, it could be interesting to set
|
||||||
|
`eager_global_ordinals: true` on fields that you plan to use in terms
|
||||||
|
aggregations:
|
||||||
|
|
||||||
|
[source,js]
|
||||||
|
------------
|
||||||
|
PUT my_index/_mapping/my_type
|
||||||
|
{
|
||||||
|
"properties": {
|
||||||
|
"tags": {
|
||||||
|
"type": "keyword",
|
||||||
|
"eager_global_ordinals": true
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
------------
|
||||||
|
// CONSOLE
|
||||||
|
// TEST[s/^/PUT my_index\n/]
|
||||||
|
|
||||||
|
This will shift the cost from search-time to refresh-time. Elasticsearch will
|
||||||
|
make sure that global ordinals are built before publishing updates to the
|
||||||
|
content of the index.
|
||||||
|
|
||||||
|
If you ever decide that you do not need to run `terms` aggregations on this
|
||||||
|
field anymore, then you can disable eager loading of global ordinals at any
|
||||||
|
time:
|
||||||
|
|
||||||
|
[source,js]
|
||||||
|
------------
|
||||||
|
PUT my_index/_mapping/my_type
|
||||||
|
{
|
||||||
|
"properties": {
|
||||||
|
"tags": {
|
||||||
|
"type": "keyword",
|
||||||
|
"eager_global_ordinals": false
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
------------
|
||||||
|
// CONSOLE
|
||||||
|
// TEST[continued]
|
||||||
|
|
|
@ -105,40 +105,6 @@ same name in the same index. Its value can be updated on existing fields
|
||||||
using the <<indices-put-mapping,PUT mapping API>>.
|
using the <<indices-put-mapping,PUT mapping API>>.
|
||||||
|
|
||||||
|
|
||||||
[[global-ordinals]]
|
|
||||||
.Global ordinals
|
|
||||||
*****************************************
|
|
||||||
|
|
||||||
Global ordinals is a data-structure on top of fielddata and doc values, that
|
|
||||||
maintains an incremental numbering for each unique term in a lexicographic
|
|
||||||
order. Each term has a unique number and the number of term 'A' is lower than
|
|
||||||
the number of term 'B'. Global ordinals are only supported on <<text,`text`>>
|
|
||||||
and <<keyword,`keyword`>> fields.
|
|
||||||
|
|
||||||
Fielddata and doc values also have ordinals, which is a unique numbering for
|
|
||||||
all terms in a particular segment and field. Global ordinals just build on top
|
|
||||||
of this, by providing a mapping between the segment ordinals and the global
|
|
||||||
ordinals, the latter being unique across the entire shard.
|
|
||||||
|
|
||||||
Global ordinals are used for features that use segment ordinals, such as
|
|
||||||
sorting and the terms aggregation, to improve the execution time. A terms
|
|
||||||
aggregation relies purely on global ordinals to perform the aggregation at the
|
|
||||||
shard level, then converts global ordinals to the real term only for the final
|
|
||||||
reduce phase, which combines results from different shards.
|
|
||||||
|
|
||||||
Global ordinals for a specified field are tied to _all the segments of a
|
|
||||||
shard_, while fielddata and doc values ordinals are tied to a single segment.
|
|
||||||
which is different than for field data for a specific field which is tied to a
|
|
||||||
single segment. For this reason global ordinals need to be entirely rebuilt
|
|
||||||
whenever a once new segment becomes visible.
|
|
||||||
|
|
||||||
The loading time of global ordinals depends on the number of terms in a field,
|
|
||||||
but in general it is low, since it source field data has already been loaded.
|
|
||||||
The memory overhead of global ordinals is a small because it is very
|
|
||||||
efficiently compressed.
|
|
||||||
|
|
||||||
*****************************************
|
|
||||||
|
|
||||||
[[field-data-filtering]]
|
[[field-data-filtering]]
|
||||||
==== `fielddata_frequency_filter`
|
==== `fielddata_frequency_filter`
|
||||||
|
|
||||||
|
|
|
@ -48,7 +48,7 @@ The following parameters are accepted by `keyword` fields:
|
||||||
can later be used for sorting, aggregations, or scripting? Accepts `true`
|
can later be used for sorting, aggregations, or scripting? Accepts `true`
|
||||||
(default) or `false`.
|
(default) or `false`.
|
||||||
|
|
||||||
<<global-ordinals,`eager_global_ordinals`>>::
|
<<eager-global-ordinals,`eager_global_ordinals`>>::
|
||||||
|
|
||||||
Should global ordinals be loaded eagerly on refresh? Accepts `true` or `false`
|
Should global ordinals be loaded eagerly on refresh? Accepts `true` or `false`
|
||||||
(default). Enabling this is a good idea on fields that are frequently used for
|
(default). Enabling this is a good idea on fields that are frequently used for
|
||||||
|
|
|
@ -57,7 +57,7 @@ The following parameters are accepted by `text` fields:
|
||||||
Mapping field-level query time boosting. Accepts a floating point number, defaults
|
Mapping field-level query time boosting. Accepts a floating point number, defaults
|
||||||
to `1.0`.
|
to `1.0`.
|
||||||
|
|
||||||
<<global-ordinals,`eager_global_ordinals`>>::
|
<<eager-global-ordinals,`eager_global_ordinals`>>::
|
||||||
|
|
||||||
Should global ordinals be loaded eagerly on refresh? Accepts `true` or `false`
|
Should global ordinals be loaded eagerly on refresh? Accepts `true` or `false`
|
||||||
(default). Enabling this is a good idea on fields that are frequently used for
|
(default). Enabling this is a good idea on fields that are frequently used for
|
||||||
|
|