mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-02-19 19:35:02 +00:00
Docs: Added explanation of how to do multi-field terms agg
Closes #5100
parent 7c2490b2ad
commit 1bdf79e527
@@ -380,6 +380,7 @@ WARNING: When NOT sorting on `doc_count` descending, high values of `min_doc_cou
back by increasing `shard_size`.
Setting `shard_min_doc_count` too high will cause terms to be filtered out on a shard level. This value should be set much lower than `min_doc_count/#shards`.

[[search-aggregations-bucket-terms-aggregation-script]]
==== Script

Generating the terms using a script:
@@ -476,6 +477,33 @@ http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#UNICODE_CA
http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#UNICODE_CHARACTER_CLASS[`UNICODE_CHARACTER_CLASS`] and
http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#UNIX_LINES[`UNIX_LINES`]

==== Multi-field terms aggregation

The `terms` aggregation does not support collecting terms from multiple fields
in the same document. The reason is that the `terms` agg doesn't collect the
string term values themselves, but rather uses
<<search-aggregations-bucket-terms-aggregation-execution-hint,global ordinals>>
to produce a list of all of the unique values in the field. Global ordinals
provide an important performance boost which would not be possible across
multiple fields.
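
To illustrate why ordinals are tied to a single field, here is a toy Python model. It is a sketch only, not how Lucene or the `terms` agg is actually implemented: the helper names (`build_ordinals`, `terms_agg_by_ordinal`) and the sample documents are hypothetical. Each field gets its own sorted term dictionary, counting happens on integer ordinals, and strings are resolved only when the final buckets are built.

```python
def build_ordinals(values):
    """Map each unique term of ONE field to a dense integer ordinal (sorted order)."""
    terms = sorted(set(values))
    return terms, {term: i for i, term in enumerate(terms)}


def terms_agg_by_ordinal(docs, field):
    """Count documents per term for a single field, using ordinals as array indexes."""
    all_values = [v for doc in docs for v in doc.get(field, [])]
    terms, ordinal_of = build_ordinals(all_values)
    counts = [0] * len(terms)  # one counter slot per ordinal
    for doc in docs:
        for value in set(doc.get(field, [])):  # a document counts once per term
            counts[ordinal_of[value]] += 1
    # Strings are looked up only here, when the final buckets are produced.
    return {terms[i]: count for i, count in enumerate(counts) if count}


# Hypothetical sample documents.
docs = [
    {"first_name": ["ann"], "last_name": ["lee"]},
    {"first_name": ["lee"], "last_name": ["ann"]},
    {"first_name": ["ann"], "last_name": ["kim"]},
]

first_terms, first_ord = build_ordinals([v for d in docs for v in d["first_name"]])
last_terms, last_ord = build_ordinals([v for d in docs for v in d["last_name"]])

# The same ordinal refers to different terms in different fields,
# so counters indexed by ordinal cannot be shared across fields.
print(first_terms[1], last_terms[1])
print(terms_agg_by_ordinal(docs, "first_name"))
```

In this model, ordinal `1` means `lee` in `first_name` but `kim` in `last_name`, which is why a per-field ordinal space cannot be merged without falling back to the string values.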

There are two approaches that you can use to perform a `terms` agg across
multiple fields:

<<search-aggregations-bucket-terms-aggregation-script,Script>>::

Use a script to retrieve terms from multiple fields. This disables the global
ordinals optimization and will be slower than collecting terms from a single
field, but it gives you the flexibility to implement this option at search
time.

<<copy-to,`copy_to` field>>::

If you know ahead of time that you want to collect the terms from two or more
fields, then use `copy_to` in your mapping to create a new dedicated field at
index time which contains the values from all of those fields. You can aggregate on
this single field, which will benefit from the global ordinals optimization.
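
As a rough sketch of the `copy_to` approach, the Python snippet below builds the two request bodies involved: an index mapping in which two source fields copy their values into a combined field, and a search body that runs a `terms` agg on that combined field. The field names (`first_name`, `last_name`, `all_names`), the type name `person` and the agg name `names` are hypothetical. The mapping assumes the 1.x-era syntax this page documents (`string` fields with `"index": "not_analyzed"` under a type name); later versions use `keyword` fields and no type name.

```python
import json

SOURCE_FIELDS = ["first_name", "last_name"]  # hypothetical source fields
COMBINED_FIELD = "all_names"                 # hypothetical copy_to target


def not_analyzed_string(copy_to=None):
    """Build a not-analyzed string field mapping, optionally with copy_to."""
    field = {"type": "string", "index": "not_analyzed"}
    if copy_to is not None:
        field["copy_to"] = copy_to
    return field


# Each source field copies its values into the combined field at index time.
properties = {
    name: not_analyzed_string(copy_to=COMBINED_FIELD) for name in SOURCE_FIELDS
}
properties[COMBINED_FIELD] = not_analyzed_string()

# Body for creating the index ("person" is an assumed type name).
create_index_body = {"mappings": {"person": {"properties": properties}}}

# Body for the search: a plain single-field terms agg on the combined field.
search_body = {"aggs": {"names": {"terms": {"field": COMBINED_FIELD}}}}

print(json.dumps(create_index_body, indent=2))
print(json.dumps(search_body, indent=2))
```

Because the search targets one ordinary field, nothing in the aggregation request itself needs to know that the values came from several source fields.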

==== Collect mode

added[1.3.0] Deferring calculation of child aggregations
@@ -548,7 +576,7 @@ WARNING: It is not possible to nest aggregations such as `top_hits` which requir
the `breadth_first` collection mode. This is because this would require a RAM buffer to hold the float score value for every document and
this would typically be too costly in terms of RAM.


[[search-aggregations-bucket-terms-aggregation-execution-hint]]
==== Execution hint

added[1.2.0] Added the `global_ordinals`, `global_ordinals_hash` and `global_ordinals_low_cardinality` execution modes