OpenSearch/docs/reference
Colin Goodheart-Smithe 0edb096eb4 Adds a new auto-interval date histogram (#28993)
* Adds a new auto-interval date histogram

This change adds a new type of histogram aggregation called `auto_date_histogram` where you can specify the target number of buckets you require and it will find an appropriate interval for the returned buckets. The aggregation works by first collecting documents in buckets at a one-second interval. When it has created more than the target number of buckets, it merges these buckets into minute-interval buckets and continues collecting until it reaches the target number of buckets again. It keeps merging buckets each time it exceeds the target until either collection is finished or the highest interval (currently years) is reached. A similar process happens at reduce time.
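
The collect-and-merge behaviour described above can be sketched roughly as follows. This is an illustrative sketch only, not the aggregator's actual code: the name `auto_histogram`, the `INTERVALS` ladder, and the use of fixed-length intervals that stop at days are all assumptions made for brevity. The real aggregation uses calendar-aware roundings up to years and also fills in empty buckets.

```python
from collections import Counter

# Assumed, simplified interval ladder in seconds. The real ladder continues
# up to years and uses calendar-aware rounding rather than fixed lengths.
INTERVALS = [("second", 1), ("minute", 60), ("hour", 3600), ("day", 86400)]


def auto_histogram(timestamps, target_buckets):
    """Bucket epoch-second timestamps, coarsening the interval on overflow."""
    level = 0
    buckets = Counter()
    for ts in timestamps:
        size = INTERVALS[level][1]
        # Round the timestamp down to the start of its bucket.
        buckets[ts - ts % size] += 1
        # Too many buckets: merge everything into the next coarser interval,
        # repeating until we fit or run out of coarser intervals.
        while len(buckets) > target_buckets and level < len(INTERVALS) - 1:
            level += 1
            size = INTERVALS[level][1]
            merged = Counter()
            for key, count in buckets.items():
                merged[key - key % size] += count
            buckets = merged
    return INTERVALS[level][0], dict(sorted(buckets.items()))
```

For example, `auto_histogram([0, 1, 2, 61, 3700], 3)` starts at one-second buckets, overflows on the fourth timestamp, and finishes with minute buckets.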

This aggregation intentionally does not support `min_doc_count`, `offset` and `extended_bounds` to keep the already complex logic from becoming more complex. The aggregation accepts sub-aggregations but will always operate in `breadth_first` mode, deferring the computation of sub-aggregations until the final buckets from the shard are known. `min_doc_count` is effectively hard-coded to zero, meaning that we will insert empty buckets where necessary.

Closes #9572

* Adds documentation

* Added sub aggregator test

* Fixes failing docs test

* Brings branch up to date with master changes

* trying to get tests to pass again

* Fixes multiBucketConsumer accounting

* Collects more buckets than needed on shards

This gives us more options at reduce time in terms of how we do the
final merge of the buckets to produce the final result

* Revert "Collects more buckets than needed on shards"

This reverts commit 993c782d117892af9a3c86a51921cdee630a3ac5.

* Adds ability to merge within a rounding

* Fixes non-timezone doc test failure

* Fix time zone tests

* iterates on tests

* Adds test case and documentation changes

Added some notes in the documentation about the intervals that can be
returned.

Also added a test case that utilises the merging of consecutive buckets

* Fixes performance bug

The bug meant that getAppropriate rounding took a huge amount of time
if the range of the data was large but also sparsely populated. In
these situations the rounding would be very low so iterating through
the rounding values from the min key to the max key took a long time
(~120 seconds in one test).

The solution is to add a rough estimate first which chooses the
rounding based just on the long values of the min and max keys alone
but selects the rounding one lower than the one it thinks is
appropriate so the accurate method can choose the final rounding taking
into account the fact that intervals are not always fixed length.
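
A rough sketch of this two-step selection, under simplifying assumptions:
the function names, the `INTERVALS` ladder and the fixed-length intervals
are all hypothetical, chosen only to illustrate the idea. The real code
works with calendar-aware roundings whose lengths vary.

```python
# Assumed interval lengths in seconds (second, minute, hour, day).
INTERVALS = [1, 60, 3600, 86400]


def rough_estimate(min_key, max_key, target_buckets):
    """Cheap guess from the two keys alone, deliberately one level too fine."""
    span = max_key - min_key
    for idx, size in enumerate(INTERVALS):
        if span // size <= target_buckets:
            # Step back one level so the accurate pass makes the final call.
            return max(idx - 1, 0)
    return len(INTERVALS) - 1


def count_buckets(min_key, max_key, size):
    """Accurate count: walk every bucket key from min to max."""
    count = 0
    key = min_key - min_key % size
    while key <= max_key:
        count += 1
        key += size
    return count


def choose_rounding(min_key, max_key, target_buckets):
    """Start from the rough estimate, then refine with the accurate count."""
    idx = rough_estimate(min_key, max_key, target_buckets)
    while (idx < len(INTERVALS) - 1
           and count_buckets(min_key, max_key, INTERVALS[idx]) > target_buckets):
        idx += 1
    return idx
```

With a wide, sparse range the accurate walk now starts near the right
level instead of stepping through every second between the two keys.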

The commit also adds more tests

* Changes to only do complex reduction on final reduce

* merge latest with master

* correct tests and add a new test case for 10k buckets

* refactor to perform bucket number check in innerBuild

* correctly derive bucket setting, update tests to increase bucket threshold

* fix checkstyle

* address code review comments

* add documentation for default buckets

* fix typo
2018-07-13 13:08:35 -04:00
aggregations Adds a new auto-interval date histogram (#28993) 2018-07-13 13:08:35 -04:00
analysis Added lenient flag for synonym token filter (#31484) 2018-07-10 17:11:50 -04:00
cat Docs: Use the default distribution to test docs (#31251) 2018-06-18 12:06:42 -04:00
cluster rest-high-level: added get cluster settings (#31706) 2018-07-02 13:25:17 -04:00
commands [DOCS] Moves commands to docs folder (#31114) 2018-06-06 07:49:15 -07:00
docs Docs: Explain _bulk?refresh shard targeting 2018-07-05 16:24:03 -04:00
how-to Docs: remove notes on sparsity. (#30905) 2018-06-05 08:58:52 +02:00
images Docs/windows installer (#27369) 2017-11-15 21:35:54 +11:00
index-modules Docs: Clarify constraints on scripted similarities. (#31076) 2018-06-05 08:51:00 +02:00
indices Docs: Inconsistency between description and example (#31858) 2018-07-06 12:44:20 -04:00
ingest Ingest: Add ignore_missing option to RemoveProc (#31693) 2018-07-09 10:24:34 +02:00
licensing [DOCS] Fix licensing API details (#31667) 2018-06-28 15:38:41 -07:00
mapping Docs: Remove duplicate test setup 2018-06-28 10:59:35 -04:00
migration Correct spelling of AnalysisPlugin#requriesAnalysisSettings (#32025) 2018-07-13 13:13:21 +01:00
modules Circuit-break based on real memory usage 2018-07-13 10:08:28 +02:00
monitoring [DOCS] Move monitoring to docs folder (#31477) 2018-06-22 15:39:34 -07:00
query-dsl Unify headers for full text queries 2018-06-27 10:11:14 +02:00
release-notes Percentile/Ranks should return null instead of NaN when empty (#30460) 2018-06-18 10:01:28 -04:00
rest-api [DOCS] Move migration APIs to docs (#31473) 2018-06-21 08:19:23 -07:00
search Add second level of field collapsing (#31808) 2018-07-13 11:40:03 -04:00
settings [DOCS] Replace CONFIG_DIR with ES_PATH_CONF (#31635) 2018-06-28 08:27:04 -07:00
setup Docs: Change formatting of Cloud options 2018-07-13 15:40:38 +02:00
sql SQL: Remove restriction for single column grouping (#31818) 2018-07-06 20:55:27 +03:00
testing [Docs] Use capital letters in section headings (#31678) 2018-06-29 11:58:39 +02:00
upgrade Improve allocation-disabling instructions (#30248) 2018-05-29 08:34:20 +01:00
aggregations.asciidoc [Docs] Update aggregations.asciidoc (#29265) 2018-03-28 15:01:45 +02:00
analysis.asciidoc [Docs] Add clarification to analysis example (#31826) 2018-07-06 14:36:58 +02:00
api-conventions.asciidoc Default to one shard (#30539) 2018-05-14 12:22:35 -04:00
cat.asciidoc Rename the bulk thread pool to write thread pool (#29593) 2018-04-19 08:18:58 -04:00
cluster.asciidoc rest-high-level: added get cluster settings (#31706) 2018-07-02 13:25:17 -04:00
docs.asciidoc Inclusion of link to Multi Delete (#22619) 2017-01-16 12:58:59 +01:00
getting-started.asciidoc Docs: Restyled cloud link in getting started 2018-07-13 15:48:14 +02:00
glossary.asciidoc Default to one shard (#30539) 2018-05-14 12:22:35 -04:00
gs-index.asciidoc [DOCS] Adding index file for GS "mini book". 2017-07-18 13:44:08 -07:00
how-to.asciidoc Correct grammar in list in how-to docs 2017-01-17 20:57:22 -05:00
index-modules.asciidoc Document woes between auto-expand-replicas and allocation filtering (#30531) 2018-05-14 12:14:37 +02:00
index.asciidoc [DOCS] Move sql to docs (#31474) 2018-06-22 15:40:25 -07:00
index.x.asciidoc [DOCS] Removes redundant index.asciidoc files (#30707) 2018-05-18 11:05:40 -07:00
indices.asciidoc add split index reference in indices.asciidoc 2017-11-06 12:55:41 +01:00
ingest.asciidoc [Docs] Changes to ingest.asciidoc (#28212) 2018-01-16 09:36:19 +01:00
mapping.asciidoc Limit the number of nested documents (#27405) 2017-11-22 10:16:28 -05:00
modules.asciidoc Remove left-over tribe reference 2018-01-30 21:44:21 +01:00
query-dsl.asciidoc Update query-dsl.asciidoc (#27669) 2017-12-11 18:06:08 +01:00
redirects.asciidoc [Docs] Clarify `fuzzy_like_this` redirect (#30183) 2018-05-02 11:45:37 +02:00
release-notes.asciidoc Migrate migration docs from 6.0 to 7.0 (#26227) 2017-08-16 13:12:44 -06:00
search.asciidoc Move search concurrency and parallelism paragraphs 2018-02-26 07:47:57 -08:00
setup.asciidoc [DOCS] Starting Elasticsearch (#31701) 2018-07-03 13:40:37 -07:00
testing.asciidoc [Docs] Unify spelling of Elasticsearch (#27567) 2017-11-29 09:44:25 +01:00
upgrade.asciidoc Revert "[DOCS] Added 6.3 info & updated the upgrade table. (#30940)" 2018-06-11 22:04:36 -04:00