Commit Graph

135 Commits

Author SHA1 Message Date
Adrien Grand e5be85d586 Aggs: Change the default `min_doc_count` to 0 on histograms.
The assumption is that gaps in histogram are generally undesirable, for instance
if you want to build a visualization from it. Additionally, we are building new
aggregations that require that there are no gaps to work correctly (eg.
derivatives).
2015-04-30 15:48:23 +02:00
Colin Goodheart-Smithe 969f53e399 fix typo in Min bucket aggregation docs 2015-04-30 14:41:01 +01:00
Colin Goodheart-Smithe d16bf992a9 Aggregations: min_bucket aggregation
An aggregation to calculate the minimum value in a set of buckets.

Closes #9999
2015-04-30 13:34:21 +01:00
Zachary Tong 351a4d3315 [DOCS] Fix movavg images and naming 2015-04-29 13:33:54 -04:00
Colin Goodheart-Smithe 57a8885964 Merge branch 'master' into feature/aggs_2_0
# Conflicts:
#	src/main/java/org/elasticsearch/index/query/CommonTermsQueryBuilder.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregationModule.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregatorFactories.java
#	src/main/java/org/elasticsearch/search/aggregations/AggregatorParsers.java
#	src/main/java/org/elasticsearch/search/aggregations/InternalMultiBucketAggregation.java
#	src/main/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregator.java
#	src/main/java/org/elasticsearch/search/aggregations/metrics/InternalNumericMetricsAggregation.java
#	src/test/java/org/elasticsearch/search/aggregations/bucket/nested/NestedAggregatorTest.java
2015-04-29 15:49:41 +01:00
Antonio Bonuccelli ab83eb036b Docs: adding missing single quote on PUT index request
Closes #10876
2015-04-29 14:45:25 +02:00
Zachary Tong bf9739d0f0 [DOCS] review comment fixes 2015-04-27 14:40:04 -04:00
Clinton Gormley 37ed61807f Docs: Updated the experimental annotations in the docs as follows:
* Removed the docs for `index.compound_format` and `index.compound_on_flush` - these are expert settings which should probably be removed (see https://github.com/elastic/elasticsearch/issues/10778)
* Removed the docs for `index.index_concurrency` - another expert setting
* Labelled the segments verbose output as experimental
* Marked the `compression`, `precision_threshold` and `rehash` options as experimental in the cardinality and percentile aggs
* Improved the experimental text on `significant_terms`, `execution_hint` in the terms agg, and `terminate_after` param on count and search
* Removed the experimental flag on the `geobounds` agg
* Marked the settings in the `merge` and `store` modules as experimental, rather than the modules themselves

Closes #10782
2015-04-26 18:49:15 +02:00
Clinton Gormley f1a0e2216a Docs: Mentioned script_id and script_file parameters across all aggs
Closes #10760
2015-04-26 17:30:38 +02:00
Zachary Tong e08e45cee8 [DOCS] Add link to movavg page 2015-04-22 18:59:39 -04:00
Zachary Tong a03cefcece [DOCS] Add documentation for moving average 2015-04-22 18:59:39 -04:00
Colin Goodheart-Smithe bd28c9c44e Documentation for the max_bucket reducer 2015-04-21 15:06:20 +01:00
Colin Goodheart-Smithe be647a89d3 Documentation for the derivative reducer 2015-04-21 15:06:20 +01:00
markharwood 63db34f649 New feature - Sampler aggregation used to limit any nested aggregations' processing to a sample of the top-scoring documents.
Optionally, a “diversify” setting can limit the number of collected matches that share a common value such as an "author".

Closes #8108
2015-04-21 10:22:05 +01:00
Adrien Grand f4d5914511 Docs: Warn about the fact that min_doc_count=0 might return terms that only belong to different types. 2015-04-21 00:57:57 +02:00
Clinton Gormley abc7de96ae Docs: Updated version annotations in master 2015-04-09 14:50:11 +02:00
Adrien Grand aecd9ac515 Aggregations: Speed up include/exclude in terms aggregations with regexps.
Today we check every regular expression eagerly against every possible term.
This can be very slow if you have lots of unique terms, and even the bottleneck
if your query is selective.

This commit switches to Lucene regular expressions instead of Java (not exactly
the same syntax yet most existing regular expressions should keep working) and
uses the same logic as RegExpQuery to intersect the regular expression with the
terms dictionary. I wrote a quick benchmark (in the PR) to make sure it made
things faster and the same request that took 750ms on master now takes 74ms with
this change.

Close #7526
2015-04-09 12:12:56 +02:00
marko asplund 5585175173 Docs: fix typos in example JSON data
Closes #10479
2015-04-08 13:40:35 +02:00
olivier bourgain bcb4decca9 [DOCS] add missing comma in percentile_rank aggregation example 2015-03-10 08:21:06 -07:00
olivier bourgain fb7cd2ea9a [DOCS] Adjusted geo_distance aggregation example
unit is not returned in the response, but we have key and an implicit from starting at 0 for the first bucket
2015-03-10 08:20:20 -07:00
olivier bourgain eaeddc6bd4 [DOCS] missing curly brace in ip_range aggregation example 2015-03-10 08:19:57 -07:00
Britta Weber 580728dfd6 significant terms: add scriptable significance heuristic
This commit adds scripting capability to significant_terms.
Custom heuristics can be implemented with a script that provides
parameters subset_freq, superset_freq,subset_size, superset_size.

closes #7850
2015-03-06 17:06:04 +01:00
Clinton Gormley e194fb3a07 Docs: Default distance unit in geo distance agg is metres, not km
Closes #9812
2015-02-28 01:45:29 +01:00
Colin Goodheart-Smithe 2520dc78ec [DOCS] added a note for the default shard_size value 2015-02-25 11:00:55 +00:00
markharwood 29b1902cfb New aggregations feature - “PercentageScore” heuristic for significant_terms aggregation provides simple “per-capita” type measures.
Closes #9720
2015-02-20 13:22:08 +00:00
Christoph Büscher 30fd70f07b Aggregations: Simplify time zone option in `date_histogram`
Removed the existing `pre_zone` and `post_zone` option in `date_histogram` in favor of
the simpler `time_zone` option. Previously, specifying different values for these could
lead to confusing scenarios where ES would return bucket keys that are not UTC.
Now `time_zone` is the only option setting, the calculation of date buckets to take place in the
preferred time zone, but after rounding converting the bucket key values back to UTC.

Closes #9062
Closes #9637
2015-02-16 16:54:06 +01:00
Clinton Gormley 6fadeeca56 Updated doc annotations for 1.4.3 2015-02-11 17:54:53 +01:00
Christoph Büscher d2f852a274 Aggregations: Add 'offset' option to date_histogram, replacing 'pre_offset' and 'post_offset'
Add offset option to 'date_histogram' replacing and simplifying the previous 'pre_offset' and 'post_offset' options.
This change is part of a larger clean up task for `date_histogram` from issue #9062.
2015-02-09 14:03:28 +01:00
Adrien Grand 95f46f1212 Docs: Use the new experimental annotation.
We now have a very useful annotation to mark features or parameters as
experimental. Let's use it! This commit replaces some custom text warnings with
this annotation and adds this annotation to some existing features/parameters:
 - inner_hits (unreleased yet)
 - terminate_after (released in 1.4)
 - per-bucket doc count errors in the terms agg (released in 1.4)

I also tagged with this annotation settings which should either be not needed
(like the ability to evict entries from the filter cache based on time) or that
are too deep into the way that Elasticsearch works like the Directory
implementation or merge settings.

Close #9563
2015-02-05 15:29:45 +01:00
Adrien Grand 3a486066fd Docs: Remove the experimental status of the cardinality and percentiles(-ranks) aggregations
These aggregations are not experimental anymore but some of their parameters
still are:
 - `precision_threshold` and `rehash` on `cardinality`
 - `compression` on percentiles(-ranks)

Close #9560
2015-02-05 15:18:40 +01:00
Christoph Büscher 44193e7ba5 Aggregations: Add 'offset' option to histogram aggregation
Histogram aggregation supports an 'offset' option to move bucket boundaries.
In a histogram with buckets of size X these can be moved from 0, X, 2X, 3X,...
by an offset value of Y to Y, X+Y, 2X+Y, 3X+Y... by using the 'offset' option.
The previous 'pre_offset' and 'post_offset' options are removed in favour of
the simplified 'offset' option.

Closes #9417
Closes #9505
2015-02-02 18:23:01 +01:00
Oliver e412dab63a Docs: Fix sample query
Closes #9472
2015-01-29 15:56:24 +01:00
Zachary Tong a4eb1d5505 Aggregations: Add standard deviation bounds to extended_stats
Extended_stats now displays the upper and lower bounds on standard deviations (e.g. avg +/- std).
Default is to show 2 std above/below, but can be changed using the `sigma` parameter.
Accepts non-negative doubles

Closes #9356
2015-01-28 11:47:20 -05:00
eBuildy 85ef44fd73 Docs: Fix missing comma and boolean true
Closes #9350
2015-01-19 21:31:29 +01:00
Ryan Ernst 39b3613420 Fix date histogram docs grammar. 2014-12-23 10:19:55 -08:00
Clinton Gormley 88e06cba80 Update daterange-aggregation.asciidoc
Clarified the date-math expressions on date range aggregations

Closes #8703
2014-11-28 16:53:33 +01:00
David Pilato 43a1435d3b [Docs] fix consistency between examples 2014-11-27 20:29:34 +01:00
David Pilato 40f0e07db3 [Docs] Fix missing new line 2014-11-27 19:39:12 +01:00
David Pilato da27c2104a [Docs] Fix missing comma in mapping 2014-11-27 11:03:19 +01:00
Boaz Leskes 1e16375d04 Docs: Update execution hint docs for Significant terms agg
copied over the relevant pieces from the terms agg

Closes #8532
2014-11-18 20:54:26 +01:00
Clinton Gormley cff544dcc2 Docs: Removed old coming/added tags 2014-11-10 14:41:24 +01:00
Veres Lajos 4059e4ac86 typo fixes - https://github.com/vlajos/misspell_fixer
Closes #8323
2014-11-08 18:55:57 +01:00
Clinton Gormley 08aa715d2e Update datehistogram-aggregation.asciidoc
Clarified use of fractional time units in the date histo agg.

Closes #7957
2014-11-08 17:49:34 +01:00
Adrien Grand 7ea490dfd1 Aggregations: Return the sum of the doc counts of other buckets.
This commit adds a new field to the response of the terms aggregation called
`sum_other_doc_count` which is equal to the sum of the doc counts of the buckets
that did not make it to the list of top buckets. It is typically useful to have
a sector called eg. `other` when using terms aggregations to build pie charts.

Example query and response:

```json
GET test/_search?search_type=count
{
  "aggs": {
    "colors": {
      "terms": {
        "field": "color",
        "size": 3
      }
    }
  }
}
```

```json
{
   [...],
   "aggregations": {
      "colors": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 4,
         "buckets": [
            {
               "key": "blue",
               "doc_count": 65
            },
            {
               "key": "red",
               "doc_count": 14
            },
            {
               "key": "brown",
               "doc_count": 3
            }
         ]
      }
   }
}
```

Close #8213
2014-10-27 12:11:26 +01:00
Andrew O'Brien 33097d901b Docs: Typo: s/by/be/
Closes #8114
2014-10-16 20:51:58 +02:00
Martijn van Groningen 5763b24686 Core: Make fetch phase nested doc aware
By letting the fetch phase understand the nested docs structure we can serve nested docs as hits.
The `top_hits` aggregation can because of this commit be placed in a `nested` or `reverse_nested` aggregation.

Closes #7164
2014-10-08 22:21:30 +02:00
Colin Goodheart-Smithe 6cf371395a Aggregations: makes script params consistent with other APIs in scripted_metric
This change removes the script_type parameter form the Scripted Metric Aggregation and adds support for _file and _id suffixes to the init_script, map_script, combine_script and reduce_script parameters to make defining the source of the script consistent with the other APIs which use the ScriptService
2014-10-06 09:07:25 +01:00
Clinton Gormley cb00d4a542 Docs: Removed all the added/deprecated tags from 1.x 2014-09-26 21:04:42 +02:00
Colin Goodheart-Smithe 8a70b115f2 Aggregations: More consistent response format for scripted metrics aggregation
Changes the name of the field in the scripted metrics aggregation from 'aggregation' to 'value' to be more in line with the other metrics aggregations like 'avg'
2014-09-17 11:46:26 +01:00
Jordan Snodgrass 6246aac9ab Docs: Indicate that the Children Aggregation is coming in 1.4.0 2014-09-17 09:22:02 +02:00