Turns the top example in each of the geo aggregation docs into a working
example that can be opened in CONSOLE. Subsequent examples can all also
be opened in console and will work after you've run the first example.
All examples are tested as part of the build.
This adds the `COPY AS CURL` and `VIEW IN CONSOLE` links to the docs
and causes the snippets to be tested during Elasticsearch's build.
Relates to #18160
GeoDistance query, sort, and scripts make use of a crazy GeoDistance enum for handling 4 different ways of computing geo distance: SLOPPY_ARC, ARC, FACTOR, and PLANE. Only two of these are necessary: ARC, PLANE. This commit removes SLOPPY_ARC, and FACTOR and cleans up the way Geo distance is computed.
Added missing CONSOLE scripts to documentation for sampler and diversified_sampler aggs.
Includes new StackOverflow index setup in build.gradle
Closes#22746
* Formatting tweaks
This adds the `VIEW IN CONSOLE` and `COPY AS CURL` links to the
snippets in the docs for the `date_range` aggregation and tests
those snippets as part of the build.
Relates to #18160
This adds the `VIEW IN SENSE` and `COPY AS CURL` links and has
the build automatically execute the snippets and verify that they
work.
Relates to #18160
Adds the `VIEW IN CONSOLE` and `COPY AS CURL` links to the example
`global` aggregation. Also improves the example by adding a
non-`global` aggregation to compare it to.
Relates to #18160
Similar to the Filters aggregation but only supports "keyed" filter buckets and automatically "ANDs" pairs of filters to produce a form of adjacency matrix.
The intersection of buckets "A" and "B" is named "A&B" (the choice of separator is configurable). Empty intersection buckets are removed from the final results.
Closes#22169
* Promote longs to doubles when a terms agg mixes decimal and non-decimal number
This change makes the terms aggregation work when the buckets coming from different indices are a mix of decimal numbers and non-decimal numbers. In this case non-decimal number (longs) are promoted to decimal (double) which can result in a loss of precision for big numbers.
Fixes#22232
The use of the avg aggregation for sorting the terms aggregation is not encouraged since it has unbounded error. This changes the examples to use the max aggregation which does not suffer the same issues
and be much more stingy about what we consider a console candidate.
* Add `// CONSOLE` to check-running
* Fix version in some snippets
* Mark groovy snippets as groovy
* Fix versions in plugins
* Fix language marker errors
* Fix language parsing in snippets
This adds support for snippets who's language is written like
`[source, txt]` and `["source","js",subs="attributes,callouts"]`.
This also makes language required for snippets which is nice because
then we can be sure we can grep for snippets in a particular language.
Currently both aggregations really share the same implementation. This commit
splits the implementations so that regular histograms can support decimal
intervals/offsets and compute correct buckets for negative decimal values.
However the response API is still the same. So for intance both regular
histograms and date histograms will produce an
`org.elasticsearch.search.aggregations.bucket.histogram.Histogram`
aggregation.
The optimization to compute an identifier of the rounded value and the
rounded value itself has been removed since it was only used by regular
histograms, which now do the rounding themselves instead of relying on the
Rounding abstraction.
Closes#8082Closes#4847
The current heuristic to compute a default shard size is pretty aggressive,
it returns `max(10, number_of_shards * size)` as a value for the shard size.
I think making it less aggressive has the benefit that it would reduce the
likelyness of running into OOME when there are many shards (yearly
aggregations with time-based indices can make numbers of shards in the
thousands) and make the use of breadth-first more likely/efficient.
This commit replaces the heuristic with `size * 1.5 + 10`, which is enough
to have good accuracy on zipfian distributions.