Commit Graph

9 Commits

Author SHA1 Message Date
Tim Roes 5689dc1182 [Docs] Fix typo in composite aggregation (#28891) 2018-03-04 11:47:24 -08:00
Jim Ferenczi c4e0a84344
Mark the composite aggregation as a beta feature (#28431)
The `composite` aggregation should be marked as beta (rather than experimental) in the documentation.
2018-02-02 09:24:10 +01:00
Jim Ferenczi c26d4ac6c1
Always return the after_key in composite aggregation response (#28358)
This change adds the `after_key` of a composite aggregation directly in the response.
It is redundant when all buckets are not filtered/removed by a pipeline aggregation since in this case the `after_key` is always the last bucket
in the response. Though when using a pipeline aggregation to filter composite buckets, the `after_key` can be lost if the last bucket is filtered.
This commit fixes this situation by always returning the `after_key` in a dedicated section.
2018-01-25 09:15:27 +01:00
Jim Ferenczi b2ce994be7 [Docs] Fix asciidoc style in composite agg docs 2018-01-23 16:41:32 +01:00
Jim Ferenczi 19cfc25873
Adds the ability to specify a format on composite date_histogram source (#28310)
This commit adds the ability to specify a date format on the `date_histogram` composite source.
If the format is defined, the key for the source is returned as a formatted date.

Closes #27923
2018-01-23 15:14:49 +01:00
Shaunak Kashyap da0ed578b2 Fixing typo in param name: values => sources (#28016) 2017-12-28 18:18:30 +01:00
Clinton Gormley d1b1d711df Update composite-aggregation.asciidoc
Fixed asciidoc typo
2017-11-23 15:05:14 +01:00
Jim Ferenczi d1093bd2fa #26800: Fix docs rendering 2017-11-20 08:41:02 +01:00
Jim Ferenczi 623367d793
Add composite aggregator (#26800)
* This change adds a module called `aggs-composite` that defines a new aggregation named `composite`.
The `composite` aggregation is a multi-buckets aggregation that creates composite buckets made of multiple sources.
The sources for each bucket can be defined as:
  * A `terms` source, values are extracted from a field or a script.
  * A `date_histogram` source, values are extracted from a date field and rounded to the provided interval.
This aggregation can be used to retrieve all buckets of a deeply nested aggregation by flattening the nested aggregation in composite buckets.
A composite buckets is composed of one value per source and is built for each document as the combinations of values in the provided sources.
For instance the following aggregation:

````
"test_agg": {
  "terms": {
    "field": "field1"
  },
  "aggs": {
    "nested_test_agg":
      "terms": {
        "field": "field2"
      }
  }
}
````
... which retrieves the top N terms for `field1` and for each top term in `field1` the top N terms for `field2`, can be replaced by a `composite` aggregation in order to retrieve **all** the combinations of `field1`, `field2` in the matching documents:

````
"composite_agg": {
  "composite": {
    "sources": [
      {
	"field1": {
          "terms": {
              "field": "field1"
            }
        }
      },
      {
	"field2": {
          "terms": {
            "field": "field2"
          }
        }
      },
    }
  }
````

The response of the aggregation looks like this:

````
"aggregations": {
  "composite_agg": {
    "buckets": [
      {
        "key": {
          "field1": "alabama",
          "field2": "almanach"
        },
        "doc_count": 100
      },
      {
        "key": {
          "field1": "alabama",
          "field2": "calendar"
        },
        "doc_count": 1
      },
      {
        "key": {
          "field1": "arizona",
          "field2": "calendar"
        },
        "doc_count": 1
      }
    ]
  }
}
````

By default this aggregation returns 10 buckets sorted in ascending order of the composite key.
Pagination can be achieved by providing `after` values, the values of the composite key to aggregate after.
For instance the following aggregation will aggregate all composite keys that sorts after `arizona, calendar`:

````
"composite_agg": {
  "composite": {
    "after": {"field1": "alabama", "field2": "calendar"},
    "size": 100,
    "sources": [
      {
	"field1": {
          "terms": {
            "field": "field1"
          }
        }
      },
      {
	"field2": {
          "terms": {
            "field": "field2"
          }
	}
      }
    }
  }
````

This aggregation is optimized for indices that set an index sorting that match the composite source definition.
For instance the aggregation above could run faster on indices that defines an index sorting like this:

````
"settings": {
  "index.sort.field": ["field1", "field2"]
}
````

In this case the `composite` aggregation can early terminate on each segment.
This aggregation also accepts multi-valued field but disables early termination for these fields even if index sorting matches the sources definition.
This is mandatory because index sorting picks only one value per document to perform the sort.
2017-11-16 15:13:36 +01:00