Docs: Added explanation of how to do multi-field terms agg

Closes #5100
Clinton Gormley 2014-09-07 11:09:52 +02:00
parent 7c2490b2ad
commit 1bdf79e527

@@ -380,6 +380,7 @@ WARNING: When NOT sorting on `doc_count` descending, high values of `min_doc_cou
back by increasing `shard_size`.
Setting `shard_min_doc_count` too high will cause terms to be filtered out on a shard level. This value should be set much lower than `min_doc_count/#shards`.
[[search-aggregations-bucket-terms-aggregation-script]]
==== Script
Generating the terms using a script:
@@ -476,6 +477,33 @@ http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#UNICODE_CA
http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#UNICODE_CHARACTER_CLASS[`UNICODE_CHARACTER_CLASS`] and
http://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#UNIX_LINES[`UNIX_LINES`]
==== Multi-field terms aggregation
The `terms` aggregation does not support collecting terms from multiple fields
in the same document. The reason is that the `terms` agg doesn't collect the
string term values themselves, but rather uses
<<search-aggregations-bucket-terms-aggregation-execution-hint,global ordinals>>
to produce a list of all of the unique values in the field. Using global
ordinals provides an important performance boost which would not be possible
across multiple fields.
There are two approaches that you can use to perform a `terms` agg across
multiple fields:
<<search-aggregations-bucket-terms-aggregation-script,Script>>::
Use a script to retrieve terms from multiple fields. This disables the global
ordinals optimization and will be slower than collecting terms from a single
field, but it gives you the flexibility to implement this option at search
time.
<<copy-to,`copy_to` field>>::
If you know ahead of time that you want to collect the terms from two or more
fields, then use `copy_to` in your mapping to create a new dedicated field at
index time which contains the values from both fields. You can aggregate on
this single field, which will benefit from the global ordinals optimization.
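
The two approaches above can be sketched as request bodies. The following is a minimal Python sketch, not taken from the commit: the field names `first_name`, `last_name` and `all_names` and the aggregation name `names` are hypothetical, the mapping assumes 1.x-style `string` fields, and the script expression is an assumption whose exact syntax depends on the scripting language configured on the cluster. The helpers only build dictionaries and do not talk to a cluster.

```python
import json

# Hypothetical field names, used only for illustration.
SOURCE_FIELDS = ["first_name", "last_name"]
COMBINED_FIELD = "all_names"


def copy_to_mapping(source_fields, combined_field):
    """Build mapping properties in which every source field is copied
    into one dedicated field at index time (the `copy_to` approach)."""
    properties = {
        name: {
            "type": "string",          # assumption: 1.x-style string field
            "index": "not_analyzed",   # keep whole values as single terms
            "copy_to": combined_field,
        }
        for name in source_fields
    }
    # The dedicated field that the terms aggregation will run on.
    properties[combined_field] = {"type": "string", "index": "not_analyzed"}
    return {"properties": properties}


def terms_on_field(agg_name, field):
    """Search body for a terms aggregation on a single field."""
    return {"size": 0, "aggs": {agg_name: {"terms": {"field": field}}}}


def terms_on_script(agg_name, source_fields):
    """Search body for a terms aggregation whose terms come from a script
    (the script approach). The expression is an assumed, Groovy-style
    concatenation of each field's values; adjust it to your script language."""
    script = " + ".join("doc['%s'].values" % name for name in source_fields)
    return {"size": 0, "aggs": {agg_name: {"terms": {"script": script}}}}


if __name__ == "__main__":
    print(json.dumps(copy_to_mapping(SOURCE_FIELDS, COMBINED_FIELD), indent=2))
    print(json.dumps(terms_on_field("names", COMBINED_FIELD), indent=2))
    print(json.dumps(terms_on_script("names", SOURCE_FIELDS), indent=2))
```

The first two bodies belong together: the mapping is applied before indexing, and the aggregation then targets the single combined field. The script body needs no mapping change, which is what makes it usable at search time.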
==== Collect mode
added[1.3.0] Deferring calculation of child aggregations
@@ -548,7 +576,7 @@ WARNING: It is not possible to nest aggregations such as `top_hits` which requir
the `breadth_first` collection mode. This is because this would require a RAM buffer to hold the float score value for every document and
this would typically be too costly in terms of RAM.
[[search-aggregations-bucket-terms-aggregation-execution-hint]]
==== Execution hint
added[1.2.0] Added the `global_ordinals`, `global_ordinals_hash` and `global_ordinals_low_cardinality` execution modes