[[eager-global-ordinals]]
=== `eager_global_ordinals`

Global ordinals are a data structure on top of doc values that maintains an
incremental numbering for each unique term in lexicographic order. Each
term has a unique number, and the number of term 'A' is lower than the
number of term 'B'. Global ordinals are only supported with
<<keyword,`keyword`>> and <<text,`text`>> fields. In `keyword` fields, they
are available by default, but `text` fields can only use them when `fielddata`,
with all of its associated baggage, is enabled.

Doc values (and fielddata) also have ordinals, which are a unique numbering for
all terms in a particular segment and field. Global ordinals build on top
of this by providing a mapping between the segment ordinals and the global
ordinals, the latter being unique across the entire shard. Given that global
ordinals for a specific field are tied to _all the segments of a shard_, they
need to be entirely rebuilt whenever a new segment becomes visible.

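The relationship between segment ordinals and global ordinals can be
illustrated with a small Python sketch. It is a toy model, not how
Elasticsearch or Lucene implement it: the function `build_global_ordinals` and
the list-based layout are hypothetical, and each segment is assumed to be a
sorted list of its unique terms, so that a term's position is its segment
ordinal.

```python
# Toy model only: names and data layout are hypothetical, not Elasticsearch internals.

def build_global_ordinals(segments):
    """Map each segment's ordinals to shard-wide global ordinals.

    `segments` holds one sorted list of unique terms per segment, so a term's
    index in its list is its segment ordinal.
    """
    # Global ordinals number every unique term of the shard in sorted order.
    global_terms = sorted({term for seg in segments for term in seg})
    term_to_global = {term: ord_ for ord_, term in enumerate(global_terms)}
    # For each segment: segment ordinal (list index) -> global ordinal.
    segment_to_global = [[term_to_global[t] for t in seg] for seg in segments]
    return global_terms, segment_to_global


segments = [["apple", "cherry"], ["banana", "cherry"]]
global_terms, segment_to_global = build_global_ordinals(segments)
print(global_terms)
print(segment_to_global)

# A new segment becomes visible: the whole mapping is rebuilt from scratch.
segments.append(["apricot"])
rebuilt_terms, rebuilt_mapping = build_global_ordinals(segments)
print(rebuilt_terms)
print(rebuilt_mapping)
```

In this sketch, adding the segment that contains `apricot` shifts the global
ordinals of `banana` and `cherry`, which is why the mappings of the existing
segments cannot simply be reused.
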
Global ordinals are used for features that use segment ordinals, such as
the <<search-aggregations-bucket-terms-aggregation,`terms` aggregation>>,
to improve the execution time. A terms aggregation relies purely on global
ordinals to perform the aggregation at the shard level, then converts global
ordinals to the real term only for the final reduce phase, which combines
results from different shards.

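The two phases described above can be sketched in Python as follows. This is
a simplified illustration under stated assumptions, not the actual
aggregation code: the functions `shard_terms_agg` and `reduce_shards` are
hypothetical, every document is assumed to hold exactly one term, and the
per-segment mappings are given as plain lists.

```python
from collections import Counter

# Toy model only: function names and inputs are hypothetical.

def shard_terms_agg(global_terms, segment_to_global, segment_docs):
    """Count documents per term on one shard using global ordinals.

    `segment_docs[i]` lists, for each document in segment i, the segment
    ordinal of its (single) term.
    """
    counts = [0] * len(global_terms)  # one bucket per global ordinal
    for seg_index, doc_ords in enumerate(segment_docs):
        to_global = segment_to_global[seg_index]
        for seg_ord in doc_ords:
            counts[to_global[seg_ord]] += 1  # no term lookups while counting
    # Ordinals are resolved to real terms only when the shard result is built.
    return Counter({global_terms[g]: c for g, c in enumerate(counts) if c})


def reduce_shards(shard_results):
    """Final reduce phase: merge per-shard term counts."""
    total = Counter()
    for result in shard_results:
        total.update(result)
    return total


shard_a = shard_terms_agg(
    ["apple", "banana", "cherry"], [[0, 2], [1, 2]], [[0, 1, 1], [0, 1]]
)
shard_b = shard_terms_agg(["banana", "date"], [[0, 1]], [[0, 0, 1]])
merged = reduce_shards([shard_a, shard_b])
print(merged.most_common())
```

Because global ordinals are only unique within a shard, the ordinal `0` means
`apple` on the first shard and `banana` on the second in this sketch, so the
merge has to operate on real terms.
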
The loading time of global ordinals depends on the number of terms in a field,
but in general it is low, since the source field data has already been loaded.
The memory overhead of global ordinals is small because they are very
efficiently compressed.

By default, global ordinals are loaded at search-time, which is the right
trade-off if you are optimizing for indexing speed. However, if you are more
interested in search speed, it can be worthwhile to set
`eager_global_ordinals: true` on fields that you plan to use in terms
aggregations:

[source,js]
------------
PUT my_index/_mapping
{
  "properties": {
    "tags": {
      "type": "keyword",
      "eager_global_ordinals": true
    }
  }
}
------------
// CONSOLE
// TEST[s/^/PUT my_index\n/]

This will shift the cost from search-time to refresh-time. Elasticsearch will
make sure that global ordinals are built before publishing updates to the
content of the index.

If you ever decide that you do not need to run `terms` aggregations on this
field anymore, then you can disable eager loading of global ordinals at any
time:

[source,js]
------------
PUT my_index/_mapping
{
  "properties": {
    "tags": {
      "type": "keyword",
      "eager_global_ordinals": false
    }
  }
}
------------
// CONSOLE
// TEST[continued]