Aggregations: Added an option to show the upper bound of the error for the terms aggregation.

This is only applicable when the order is set to `_count`. The upper bound of the error in a term's doc count is calculated by summing the doc count of the last returned term on each shard which did not return the term. The implementation computes this by summing the doc count of the last returned term on each shard which DID return the term and then subtracting this value from the sum of the doc counts of the last returned term from ALL shards.

Closes #6696
Colin Goodheart-Smithe 2014-07-07 13:58:42 +01:00
parent 593fffc7a1
commit 655157c83a
23 changed files with 1431 additions and 130 deletions


@@ -43,7 +43,7 @@ Response:

By default, the `terms` aggregation will return the buckets for the top ten terms ordered by the `doc_count`. One can
change this default behaviour by setting the `size` parameter.

==== Size

The `size` parameter can be set to define how many term buckets should be returned out of the overall terms list. By
default, the node coordinating the search process will request each shard to provide its own top `size` term buckets
@@ -52,6 +52,87 @@ This means that if the number of unique terms is greater than `size`, the return
(it could be that the term counts are slightly off and it could even be that a term that should have been in the top
size buckets was not returned). If set to `0`, the `size` will be set to `Integer.MAX_VALUE`.

==== Document counts are approximate

As described above, the document counts (and the results of any sub aggregations) in the terms aggregation are not always
accurate. This is because each shard provides its own view of what the ordered list of terms should be, and these views are
combined to give a final view. Consider the following scenario:

A request is made to obtain the top 5 terms in the field `product`, ordered by descending document count, from an index with
3 shards. In this case each shard is asked to give its top 5 terms.
[source,js]
--------------------------------------------------
{
"aggs" : {
"products" : {
"terms" : {
"field" : "product",
"size" : 5
}
}
}
}
--------------------------------------------------
The terms for each of the three shards are shown below with their
respective document counts in brackets:
[width="100%",cols="^2,^2,^2,^2",options="header"]
|=========================================================
| | Shard A | Shard B | Shard C
| 1 | Product A (25) | Product A (30) | Product A (45)
| 2 | Product B (18) | Product B (25) | Product C (44)
| 3 | Product C (6) | Product F (17) | Product Z (36)
| 4 | Product D (3) | Product Z (16) | Product G (30)
| 5 | Product E (2) | Product G (15) | Product E (29)
| 6 | Product F (2) | Product H (14) | Product H (28)
| 7 | Product G (2) | Product I (10) | Product Q (2)
| 8 | Product H (2) | Product J (8) | Product D (1)
| 9 | Product I (1) | Product Q (6) |
| 10 | Product J (1) | Product C (4) |
|=========================================================
The shards will return their top 5 terms so the results from the shards will be:
[width="100%",cols="^2,^2,^2,^2",options="header"]
|=========================================================
| | Shard A | Shard B | Shard C
| 1 | Product A (25) | Product A (30) | Product A (45)
| 2 | Product B (18) | Product B (25) | Product C (44)
| 3 | Product C (6) | Product F (17) | Product Z (36)
| 4 | Product D (3) | Product Z (16) | Product G (30)
| 5 | Product E (2) | Product G (15) | Product E (29)
|=========================================================
Taking the top 5 results from each of the shards (as requested) and combining them to make a final top 5 list produces
the following:
[width="40%",cols="^2,^2"]
|=========================================================
| 1 | Product A (100)
| 2 | Product Z (52)
| 3 | Product C (50)
| 4 | Product G (45)
| 5 | Product B (43)
|=========================================================
Because Product A was returned from all shards we know that its document count value is accurate. Product C was only
returned by shards A and C so its document count is shown as 50 but this is not an accurate count. Product C exists on
shard B, but its count of 4 was not high enough to put Product C into the top 5 list for that shard. Product Z was also
returned only by 2 shards but the third shard does not contain the term. There is no way of knowing, at the point of
combining the results to produce the final list of terms, that there is an error in the document count for Product C and
not for Product Z. Product H has a document count of 44 across all 3 shards but was not included in the final list of
terms because it did not make it into the top five terms on any of the shards.
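The combining step above can be sketched as follows. This is a hypothetical illustration, not the Elasticsearch implementation: it naively sums the per-shard top-5 lists from the tables above, reproducing the final top-5 list and its approximate counts.

```python
# Naive merge of per-shard top-5 term lists (illustrative sketch only).
from collections import Counter

# Top 5 (term, doc_count) pairs actually returned by each shard.
shard_a = [("Product A", 25), ("Product B", 18), ("Product C", 6), ("Product D", 3), ("Product E", 2)]
shard_b = [("Product A", 30), ("Product B", 25), ("Product F", 17), ("Product Z", 16), ("Product G", 15)]
shard_c = [("Product A", 45), ("Product C", 44), ("Product Z", 36), ("Product G", 30), ("Product E", 29)]

combined = Counter()
for shard in (shard_a, shard_b, shard_c):
    for term, count in shard:
        combined[term] += count  # counts a shard did not return are silently missing

top5 = combined.most_common(5)
print(top5)
# [('Product A', 100), ('Product Z', 52), ('Product C', 50), ('Product G', 45), ('Product B', 43)]
```

Note how Product C ends up at 50 rather than its true 54, because Shard B's count of 4 never reached the coordinating node.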
==== Shard Size
The higher the requested `size` is, the more accurate the results will be, but also, the more expensive it will be to
compute the final results (both due to bigger priority queues that are managed on a shard level and due to bigger data

@@ -70,6 +151,81 @@ NOTE: `shard_size` cannot be smaller than `size` (as it doesn't make much sens
added[1.1.0] It is possible to not limit the number of terms that are returned by setting `size` to `0`. Don't use this
on high-cardinality fields, as this will kill both your CPU, since terms need to be returned sorted, and your network.
==== Calculating Document Count Error
coming[1.4.0]
There are two error values which can be shown on the terms aggregation. The first gives a value for the aggregation as
a whole which represents the maximum potential document count for a term which did not make it into the final list of
terms. This is calculated as the sum of the document count from the last term returned from each shard. For the example
given above the value would be 46 (2 + 15 + 29). This means that in the worst case scenario a term which was not returned
could have the 4th highest document count.
[source,js]
--------------------------------------------------
{
...
"aggregations" : {
"products" : {
"doc_count_error_upper_bound" : 46,
"buckets" : [
{
"key" : "Product A",
"doc_count" : 100
},
{
"key" : "Product Z",
"doc_count" : 52
},
...
]
}
}
}
--------------------------------------------------
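The arithmetic behind this aggregation-wide bound can be sketched as below. This is an illustrative calculation using the example data, not the actual implementation: the bound is the sum of the doc count of the last (smallest) term each shard returned.

```python
# Last returned doc counts per shard in the example above:
# Product E (2) on Shard A, Product G (15) on Shard B, Product E (29) on Shard C.
last_returned_counts = [2, 15, 29]

doc_count_error_upper_bound = sum(last_returned_counts)
print(doc_count_error_upper_bound)  # 46
```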
The second error value can be enabled by setting the `show_term_doc_count_error` parameter to `true`. This shows an error value
for each term returned by the aggregation which represents the 'worst case' error in the document count and can be useful when
deciding on a value for the `shard_size` parameter. This is calculated by summing the document counts for the last term returned
by all shards which did not return the term. In the example above the error in the document count for Product C would be 15, as
Shard B was the only shard not to return the term and the document count of the last term it did return was 15. The actual document
count of Product C was 54, so the document count was only actually off by 4 even though the worst case was that it would be off by
15. Product A, however, has an error of 0 for its document count: since every shard returned it, we can be confident that the count
returned is accurate.
[source,js]
--------------------------------------------------
{
...
"aggregations" : {
"products" : {
"doc_count_error_upper_bound" : 46,
"buckets" : [
{
"key" : "Product A",
"doc_count" : 100,
"doc_count_error_upper_bound" : 0
},
{
"key" : "Product Z",
"doc_count" : 52,
"doc_count_error_upper_bound" : 2
},
...
]
}
}
}
--------------------------------------------------
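The per-term error described above can be sketched as follows. This is a hypothetical illustration with assumed names (`shard_tops`, `term_error`), not the actual Java implementation: a term's error is the sum of the doc counts of the last term returned by every shard that did NOT return the term itself.

```python
# Top 5 (term, doc_count) pairs returned by each shard in the example above.
shard_tops = {
    "shard_a": [("Product A", 25), ("Product B", 18), ("Product C", 6), ("Product D", 3), ("Product E", 2)],
    "shard_b": [("Product A", 30), ("Product B", 25), ("Product F", 17), ("Product Z", 16), ("Product G", 15)],
    "shard_c": [("Product A", 45), ("Product C", 44), ("Product Z", 36), ("Product G", 30), ("Product E", 29)],
}

def term_error(term, shard_tops):
    """Worst-case doc count error for one returned term."""
    error = 0
    for terms in shard_tops.values():
        if term not in {t for t, _ in terms}:
            error += terms[-1][1]  # smallest count this shard returned
    return error

print(term_error("Product C", shard_tops))  # 15: only Shard B omitted it
print(term_error("Product A", shard_tops))  # 0: every shard returned it
```

For Product Z the same calculation gives 2, matching the `doc_count_error_upper_bound` shown in the response above.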
These errors can only be calculated in this way when the terms are ordered by descending document count. When the aggregation is
ordered by the terms values themselves (either ascending or descending) there is no error in the document count since if a shard
does not return a particular term which appears in the results from another shard, it must not have that term in its index. When the
aggregation is either sorted by a sub aggregation or in order of ascending document count, the error in the document counts cannot be
determined and is given a value of -1 to indicate this.
==== Order

The order of the buckets can be customized by setting the `order` parameter. By default, the buckets are ordered by

@@ -47,7 +47,7 @@ public class GlobalOrdinalsSignificantTermsAggregator extends GlobalOrdinalsStri
     IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent,
     SignificantTermsAggregatorFactory termsAggFactory) {
-    super(name, factories, valuesSource, estimatedBucketCount, maxOrd, null, bucketCountThresholds, includeExclude, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST);
+    super(name, factories, valuesSource, estimatedBucketCount, maxOrd, null, bucketCountThresholds, includeExclude, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST, false);
     this.termsAggFactory = termsAggFactory;
 }

@@ -42,7 +42,7 @@ public class SignificantLongTermsAggregator extends LongTermsAggregator {
     long estimatedBucketCount, BucketCountThresholds bucketCountThresholds,
     AggregationContext aggregationContext, Aggregator parent, SignificantTermsAggregatorFactory termsAggFactory) {
-    super(name, factories, valuesSource, format, estimatedBucketCount, null, bucketCountThresholds, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST);
+    super(name, factories, valuesSource, format, estimatedBucketCount, null, bucketCountThresholds, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST, false);
     this.termsAggFactory = termsAggFactory;
 }

@@ -46,7 +46,7 @@ public class SignificantStringTermsAggregator extends StringTermsAggregator {
     IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent,
     SignificantTermsAggregatorFactory termsAggFactory) {
-    super(name, factories, valuesSource, estimatedBucketCount, null, bucketCountThresholds, includeExclude, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST);
+    super(name, factories, valuesSource, estimatedBucketCount, null, bucketCountThresholds, includeExclude, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST, false);
     this.termsAggFactory = termsAggFactory;
 }

@@ -29,10 +29,13 @@ import java.util.Collections;

 abstract class AbstractStringTermsAggregator extends TermsAggregator {

+    protected final boolean showTermDocCountError;

     public AbstractStringTermsAggregator(String name, AggregatorFactories factories,
         long estimatedBucketsCount, AggregationContext context, Aggregator parent,
-        InternalOrder order, BucketCountThresholds bucketCountThresholds, SubAggCollectionMode subAggCollectMode) {
+        InternalOrder order, BucketCountThresholds bucketCountThresholds, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
         super(name, BucketAggregationMode.PER_BUCKET, factories, estimatedBucketsCount, context, parent, bucketCountThresholds, order, subAggCollectMode);
+        this.showTermDocCountError = showTermDocCountError;
     }

     @Override
@@ -42,7 +45,7 @@ abstract class AbstractStringTermsAggregator extends TermsAggregator {
     @Override
     public InternalAggregation buildEmptyAggregation() {
-        return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList());
+        return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList(), showTermDocCountError, 0);
     }
 }

@@ -36,6 +36,7 @@ public abstract class AbstractTermsParametersParser {
     public static final ParseField MIN_DOC_COUNT_FIELD_NAME = new ParseField("min_doc_count");
     public static final ParseField SHARD_MIN_DOC_COUNT_FIELD_NAME = new ParseField("shard_min_doc_count");
     public static final ParseField REQUIRED_SIZE_FIELD_NAME = new ParseField("size");
+    public static final ParseField SHOW_TERM_DOC_COUNT_ERROR = new ParseField("show_term_doc_count_error");

     //These are the results of the parsing.
@@ -64,7 +65,6 @@ public abstract class AbstractTermsParametersParser {
         return collectMode;
     }

     public void parse(String aggregationName, XContentParser parser, SearchContext context, ValuesSourceParser vsParser, IncludeExclude.Parser incExcParser) throws IOException {
         bucketCountThresholds = getDefaultBucketCountThresholds();
         XContentParser.Token token;

@@ -18,6 +18,7 @@
  */
 package org.elasticsearch.search.aggregations.bucket.terms;

+import org.elasticsearch.Version;
 import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.io.stream.StreamInput;
 import org.elasticsearch.common.io.stream.StreamOutput;
@@ -31,7 +32,6 @@ import org.elasticsearch.search.aggregations.support.format.ValueFormatterStream
 import java.io.IOException;
 import java.util.ArrayList;
-import java.util.Collection;
 import java.util.List;

 /**
@@ -58,8 +58,8 @@ public class DoubleTerms extends InternalTerms {
     double term;

-    public Bucket(double term, long docCount, InternalAggregations aggregations) {
-        super(docCount, aggregations);
+    public Bucket(double term, long docCount, InternalAggregations aggregations, boolean showDocCountError, long docCountError) {
+        super(docCount, aggregations, showDocCountError, docCountError);
         this.term = term;
     }
@@ -89,8 +89,8 @@ public class DoubleTerms extends InternalTerms {
     }

     @Override
-    Bucket newBucket(long docCount, InternalAggregations aggs) {
-        return new Bucket(term, docCount, aggs);
+    Bucket newBucket(long docCount, InternalAggregations aggs, long docCountError) {
+        return new Bucket(term, docCount, aggs, showDocCountError, docCountError);
     }
 }
@@ -98,8 +98,8 @@ public class DoubleTerms extends InternalTerms {
     DoubleTerms() {} // for serialization

-    public DoubleTerms(String name, InternalOrder order, @Nullable ValueFormatter formatter, int requiredSize, long minDocCount, Collection<InternalTerms.Bucket> buckets) {
-        super(name, order, requiredSize, minDocCount, buckets);
+    public DoubleTerms(String name, InternalOrder order, @Nullable ValueFormatter formatter, int requiredSize, int shardSize, long minDocCount, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
+        super(name, order, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
         this.formatter = formatter;
     }
@@ -109,21 +109,40 @@ public class DoubleTerms extends InternalTerms {
     }

     @Override
-    protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets) {
-        return new DoubleTerms(name, order, formatter, requiredSize, minDocCount, buckets);
+    protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
+        return new DoubleTerms(name, order, formatter, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
     }

     @Override
     public void readFrom(StreamInput in) throws IOException {
         this.name = in.readString();
+        if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
+            this.docCountError = in.readLong();
+        } else {
+            this.docCountError = -1;
+        }
         this.order = InternalOrder.Streams.readOrder(in);
         this.formatter = ValueFormatterStreams.readOptional(in);
         this.requiredSize = readSize(in);
+        if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
+            this.shardSize = readSize(in);
+            this.showTermDocCountError = in.readBoolean();
+        } else {
+            this.shardSize = requiredSize;
+            this.showTermDocCountError = false;
+        }
         this.minDocCount = in.readVLong();
         int size = in.readVInt();
         List<InternalTerms.Bucket> buckets = new ArrayList<>(size);
         for (int i = 0; i < size; i++) {
-            buckets.add(new Bucket(in.readDouble(), in.readVLong(), InternalAggregations.readAggregations(in)));
+            double term = in.readDouble();
+            long docCount = in.readVLong();
+            long bucketDocCountError = -1;
+            if (in.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
+                bucketDocCountError = in.readLong();
+            }
+            InternalAggregations aggregations = InternalAggregations.readAggregations(in);
+            buckets.add(new Bucket(term, docCount, aggregations, showTermDocCountError, bucketDocCountError));
         }
         this.buckets = buckets;
         this.bucketMap = null;
@@ -132,20 +151,31 @@ public class DoubleTerms extends InternalTerms {
     @Override
     public void writeTo(StreamOutput out) throws IOException {
         out.writeString(name);
+        if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
+            out.writeLong(docCountError);
+        }
         InternalOrder.Streams.writeOrder(order, out);
         ValueFormatterStreams.writeOptional(formatter, out);
         writeSize(requiredSize, out);
+        if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
+            writeSize(shardSize, out);
+            out.writeBoolean(showTermDocCountError);
+        }
         out.writeVLong(minDocCount);
         out.writeVInt(buckets.size());
         for (InternalTerms.Bucket bucket : buckets) {
             out.writeDouble(((Bucket) bucket).term);
             out.writeVLong(bucket.getDocCount());
+            if (out.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
+                out.writeLong(bucket.docCountError);
+            }
             ((InternalAggregations) bucket.getAggregations()).writeTo(out);
         }
     }

     @Override
     public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
+        builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, docCountError);
         builder.startArray(CommonFields.BUCKETS);
         for (InternalTerms.Bucket bucket : buckets) {
             builder.startObject();
@@ -154,6 +184,9 @@ public class DoubleTerms extends InternalTerms {
             builder.field(CommonFields.KEY_AS_STRING, formatter.format(((Bucket) bucket).term));
         }
         builder.field(CommonFields.DOC_COUNT, bucket.getDocCount());
+        if (showTermDocCountError) {
+            builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, bucket.getDocCountError());
+        }
         ((InternalAggregations) bucket.getAggregations()).toXContentInternal(builder, params);
         builder.endObject();
     }

@@ -43,12 +43,14 @@ public class DoubleTermsAggregator extends TermsAggregator {
     private final ValuesSource.Numeric valuesSource;
     private final ValueFormatter formatter;
     private final LongHash bucketOrds;
+    private final boolean showTermDocCountError;
     private SortedNumericDoubleValues values;

     public DoubleTermsAggregator(String name, AggregatorFactories factories, ValuesSource.Numeric valuesSource, @Nullable ValueFormat format, long estimatedBucketCount,
-            InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode) {
+            InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode, boolean showTermDocCountError) {
         super(name, BucketAggregationMode.PER_BUCKET, factories, estimatedBucketCount, aggregationContext, parent, bucketCountThresholds, order, collectionMode);
         this.valuesSource = valuesSource;
+        this.showTermDocCountError = showTermDocCountError;
         this.formatter = format != null ? format.formatter() : null;
         bucketOrds = new LongHash(estimatedBucketCount, aggregationContext.bigArrays());
     }
@@ -111,7 +113,7 @@ public class DoubleTermsAggregator extends TermsAggregator {
         DoubleTerms.Bucket spare = null;
         for (long i = 0; i < bucketOrds.size(); i++) {
             if (spare == null) {
-                spare = new DoubleTerms.Bucket(0, 0, null);
+                spare = new DoubleTerms.Bucket(0, 0, null, showTermDocCountError, 0);
             }
             spare.term = Double.longBitsToDouble(bucketOrds.get(i));
             spare.docCount = bucketDocCount(i);
@@ -131,18 +133,19 @@ public class DoubleTermsAggregator extends TermsAggregator {
         }

         // replay any deferred collections
         runDeferredCollections(survivingBucketOrds);

         // Now build the aggs
         for (int i = 0; i < list.length; i++) {
             list[i].aggregations = bucketAggregations(list[i].bucketOrd);
+            list[i].docCountError = 0;
         }

-        return new DoubleTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list));
+        return new DoubleTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list), showTermDocCountError, 0);
     }

     @Override
     public DoubleTerms buildEmptyAggregation() {
-        return new DoubleTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList());
+        return new DoubleTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList(), showTermDocCountError, 0);
     }

     @Override

@@ -69,8 +69,8 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
     public GlobalOrdinalsStringTermsAggregator(String name, AggregatorFactories factories, ValuesSource.Bytes.WithOrdinals.FieldData valuesSource, long estimatedBucketCount,
             long maxOrd, InternalOrder order, BucketCountThresholds bucketCountThresholds,
-            IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode) {
-        super(name, factories, maxOrd, aggregationContext, parent, order, bucketCountThresholds, collectionMode);
+            IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode, boolean showTermDocCountError) {
+        super(name, factories, maxOrd, aggregationContext, parent, order, bucketCountThresholds, collectionMode, showTermDocCountError);
         this.valuesSource = valuesSource;
         this.includeExclude = includeExclude;
     }
@@ -151,7 +151,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
             size = (int) Math.min(maxBucketOrd(), bucketCountThresholds.getShardSize());
         }
         BucketPriorityQueue ordered = new BucketPriorityQueue(size, order.comparator(this));
-        OrdBucket spare = new OrdBucket(-1, 0, null);
+        OrdBucket spare = new OrdBucket(-1, 0, null, showTermDocCountError, 0);
         for (long globalTermOrd = 0; globalTermOrd < globalOrds.getValueCount(); ++globalTermOrd) {
             if (includeExclude != null && !acceptedGlobalOrdinals.get(globalTermOrd)) {
                 continue;
@@ -167,7 +167,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
             if (bucketCountThresholds.getShardMinDocCount() <= spare.docCount) {
                 spare = (OrdBucket) ordered.insertWithOverflow(spare);
                 if (spare == null) {
-                    spare = new OrdBucket(-1, 0, null);
+                    spare = new OrdBucket(-1, 0, null, showTermDocCountError, 0);
                 }
             }
         }
@@ -180,26 +180,28 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
             survivingBucketOrds[i] = bucket.bucketOrd;
             BytesRef scratch = new BytesRef();
             copy(globalOrds.lookupOrd(bucket.globalOrd), scratch);
-            list[i] = new StringTerms.Bucket(scratch, bucket.docCount, null);
+            list[i] = new StringTerms.Bucket(scratch, bucket.docCount, null, showTermDocCountError, 0);
             list[i].bucketOrd = bucket.bucketOrd;
         }

         //replay any deferred collections
         runDeferredCollections(survivingBucketOrds);

         //Now build the aggs
         for (int i = 0; i < list.length; i++) {
             Bucket bucket = list[i];
             bucket.aggregations = bucket.docCount == 0 ? bucketEmptyAggregations() : bucketAggregations(bucket.bucketOrd);
+            bucket.docCountError = 0;
         }

-        return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list));
+        return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list), showTermDocCountError, 0);
     }

     /** This is used internally only, just for compare using global ordinal instead of term bytes in the PQ */
     static class OrdBucket extends InternalTerms.Bucket {
         long globalOrd;

-        OrdBucket(long globalOrd, long docCount, InternalAggregations aggregations) {
-            super(docCount, aggregations);
+        OrdBucket(long globalOrd, long docCount, InternalAggregations aggregations, boolean showDocCountError, long docCountError) {
+            super(docCount, aggregations, showDocCountError, docCountError);
             this.globalOrd = globalOrd;
         }
@@ -224,7 +226,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
     }

     @Override
-    Bucket newBucket(long docCount, InternalAggregations aggs) {
+    Bucket newBucket(long docCount, InternalAggregations aggs, long docCountError) {
         throw new UnsupportedOperationException();
     }
@@ -248,9 +250,9 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
         public WithHash(String name, AggregatorFactories factories, ValuesSource.Bytes.WithOrdinals.FieldData valuesSource, long estimatedBucketCount,
                 long maxOrd, InternalOrder order, BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude, AggregationContext aggregationContext,
-                Aggregator parent, SubAggCollectionMode collectionMode) {
+                Aggregator parent, SubAggCollectionMode collectionMode, boolean showTermDocCountError) {
             // Set maxOrd to estimatedBucketCount! To be conservative with memory.
-            super(name, factories, valuesSource, estimatedBucketCount, estimatedBucketCount, order, bucketCountThresholds, includeExclude, aggregationContext, parent, collectionMode);
+            super(name, factories, valuesSource, estimatedBucketCount, estimatedBucketCount, order, bucketCountThresholds, includeExclude, aggregationContext, parent, collectionMode, showTermDocCountError);
             bucketOrds = new LongHash(estimatedBucketCount, aggregationContext.bigArrays());
         }
@@ -277,17 +279,17 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
             public void collect(int doc) throws IOException {
                 ords.setDocument(doc);
                 final int numOrds = ords.cardinality();
                 for (int i = 0; i < numOrds; i++) {
                     final long globalOrd = ords.ordAt(i);
                     long bucketOrd = bucketOrds.add(globalOrd);
                     if (bucketOrd < 0) {
                         bucketOrd = -1 - bucketOrd;
                         collectExistingBucket(doc, bucketOrd);
                     } else {
                         collectBucket(doc, bucketOrd);
                     }
                 }
             }
         };
     }
@@ -316,8 +318,8 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
     private RandomAccessOrds segmentOrds;

     public LowCardinality(String name, AggregatorFactories factories, ValuesSource.Bytes.WithOrdinals.FieldData valuesSource, long estimatedBucketCount,
long maxOrd, InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode) { long maxOrd, InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode, boolean showTermDocCountError) {
super(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, null, aggregationContext, parent, collectionMode); super(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, null, aggregationContext, parent, collectionMode, showTermDocCountError);
assert factories == null || factories.count() == 0; assert factories == null || factories.count() == 0;
this.segmentDocCounts = bigArrays.newIntArray(maxOrd + 1, true); this.segmentDocCounts = bigArrays.newIntArray(maxOrd + 1, true);
} }
@ -327,7 +329,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
final SortedDocValues singleValues = DocValues.unwrapSingleton(segmentOrds); final SortedDocValues singleValues = DocValues.unwrapSingleton(segmentOrds);
if (singleValues != null) { if (singleValues != null) {
return new Collector() { return new Collector() {
@Override @Override
public void collect(int doc) throws IOException { public void collect(int doc) throws IOException {
final int ord = singleValues.getOrd(doc); final int ord = singleValues.getOrd(doc);
segmentDocCounts.increment(ord + 1, 1); segmentDocCounts.increment(ord + 1, 1);
@ -338,7 +340,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
public void collect(int doc) throws IOException { public void collect(int doc) throws IOException {
segmentOrds.setDocument(doc); segmentOrds.setDocument(doc);
final int numOrds = segmentOrds.cardinality(); final int numOrds = segmentOrds.cardinality();
for (int i = 0; i < numOrds; i++) { for (int i = 0; i < numOrds; i++) {
final long segmentOrd = segmentOrds.ordAt(i); final long segmentOrd = segmentOrds.ordAt(i);
segmentDocCounts.increment(segmentOrd + 1, 1); segmentDocCounts.increment(segmentOrd + 1, 1);
} }
@ -356,7 +358,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
globalOrds = valuesSource.globalOrdinalsValues(); globalOrds = valuesSource.globalOrdinalsValues();
segmentOrds = valuesSource.ordinalsValues(); segmentOrds = valuesSource.ordinalsValues();
collector = newCollector(segmentOrds); collector = newCollector(segmentOrds);
} }
@Override @Override
protected void doPostCollection() { protected void doPostCollection() {
@ -432,8 +434,8 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
if (accepted.get(ord)) { if (accepted.get(ord)) {
ords[cardinality++] = ord; ords[cardinality++] = ord;
} }
}
} }
}
@Override @Override
public int cardinality() { public int cardinality() {
@@ -21,6 +21,7 @@ package org.elasticsearch.search.aggregations.bucket.terms;
 import com.google.common.collect.ArrayListMultimap;
 import com.google.common.collect.Maps;
 import com.google.common.collect.Multimap;
+import org.elasticsearch.ElasticsearchIllegalStateException;
 import org.elasticsearch.common.io.stream.Streamable;
 import org.elasticsearch.common.util.BigArrays;
 import org.elasticsearch.common.xcontent.ToXContent;
@@ -36,16 +37,23 @@ import java.util.*;
  */
 public abstract class InternalTerms extends InternalAggregation implements Terms, ToXContent, Streamable {
 
+    protected static final String DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME = "doc_count_error_upper_bound";
+
     public static abstract class Bucket extends Terms.Bucket {
 
         long bucketOrd;
 
         protected long docCount;
+        protected long docCountError;
         protected InternalAggregations aggregations;
+        protected boolean showDocCountError;
 
-        protected Bucket(long docCount, InternalAggregations aggregations) {
+        protected Bucket(long docCount, InternalAggregations aggregations, boolean showDocCountError, long docCountError) {
             this.docCount = docCount;
             this.aggregations = aggregations;
+            this.showDocCountError = showDocCountError;
+            this.docCountError = docCountError;
         }
 
         @Override
@@ -53,6 +61,13 @@ public abstract class InternalTerms extends InternalAggregation implements Terms
             return docCount;
         }
 
+        public long getDocCountError() {
+            if (!showDocCountError) {
+                throw new ElasticsearchIllegalStateException("show_terms_doc_count_error is false");
+            }
+            return docCountError;
+        }
+
         @Override
         public Aggregations getAggregations() {
             return aggregations;
@@ -60,34 +75,48 @@ public abstract class InternalTerms extends InternalAggregation implements Terms
 
         abstract Object getKeyAsObject();
 
-        abstract Bucket newBucket(long docCount, InternalAggregations aggs);
+        abstract Bucket newBucket(long docCount, InternalAggregations aggs, long docCountError);
 
         public Bucket reduce(List<? extends Bucket> buckets, BigArrays bigArrays) {
             long docCount = 0;
+            long docCountError = 0;
             List<InternalAggregations> aggregationsList = new ArrayList<>(buckets.size());
             for (Bucket bucket : buckets) {
                 docCount += bucket.docCount;
+                if (docCountError != -1) {
+                    if (bucket.docCountError == -1) {
+                        docCountError = -1;
+                    } else {
+                        docCountError += bucket.docCountError;
+                    }
+                }
                 aggregationsList.add(bucket.aggregations);
             }
             InternalAggregations aggs = InternalAggregations.reduce(aggregationsList, bigArrays);
-            return newBucket(docCount, aggs);
+            return newBucket(docCount, aggs, docCountError);
         }
     }
 
     protected InternalOrder order;
     protected int requiredSize;
+    protected int shardSize;
     protected long minDocCount;
-    protected Collection<Bucket> buckets;
+    protected List<Bucket> buckets;
     protected Map<String, Bucket> bucketMap;
+    protected long docCountError;
+    protected boolean showTermDocCountError;
 
     protected InternalTerms() {} // for serialization
 
-    protected InternalTerms(String name, InternalOrder order, int requiredSize, long minDocCount, Collection<Bucket> buckets) {
+    protected InternalTerms(String name, InternalOrder order, int requiredSize, int shardSize, long minDocCount, List<Bucket> buckets, boolean showTermDocCountError, long docCountError) {
         super(name);
         this.order = order;
         this.requiredSize = requiredSize;
+        this.shardSize = shardSize;
         this.minDocCount = minDocCount;
         this.buckets = buckets;
+        this.showTermDocCountError = showTermDocCountError;
+        this.docCountError = docCountError;
     }
 
     @Override
@@ -107,14 +136,36 @@ public abstract class InternalTerms extends InternalAggregation implements Terms
         return bucketMap.get(term);
     }
 
+    public long getDocCountError() {
+        return docCountError;
+    }
+
     @Override
     public InternalAggregation reduce(ReduceContext reduceContext) {
         List<InternalAggregation> aggregations = reduceContext.aggregations();
         Multimap<Object, InternalTerms.Bucket> buckets = ArrayListMultimap.create();
+        long sumDocCountError = 0;
         for (InternalAggregation aggregation : aggregations) {
             InternalTerms terms = (InternalTerms) aggregation;
+            final long thisAggDocCountError;
+            if (terms.buckets.size() < this.shardSize || this.order == InternalOrder.TERM_ASC || this.order == InternalOrder.TERM_DESC) {
+                thisAggDocCountError = 0;
+            } else if (this.order == InternalOrder.COUNT_DESC) {
+                thisAggDocCountError = terms.buckets.get(terms.buckets.size() - 1).docCount;
+            } else {
+                thisAggDocCountError = -1;
+            }
+            if (sumDocCountError != -1) {
+                if (thisAggDocCountError == -1) {
+                    sumDocCountError = -1;
+                } else {
+                    sumDocCountError += thisAggDocCountError;
+                }
+            }
+            terms.docCountError = thisAggDocCountError;
             for (Bucket bucket : terms.buckets) {
+                bucket.docCountError = thisAggDocCountError;
                 buckets.put(bucket.getKeyAsObject(), bucket);
             }
         }
@@ -124,6 +175,13 @@ public abstract class InternalTerms extends InternalAggregation implements Terms
         for (Collection<Bucket> l : buckets.asMap().values()) {
             List<Bucket> sameTermBuckets = (List<Bucket>) l; // cast is ok according to javadocs
             final Bucket b = sameTermBuckets.get(0).reduce(sameTermBuckets, reduceContext.bigArrays());
+            if (b.docCountError != -1) {
+                if (sumDocCountError == -1) {
+                    b.docCountError = -1;
+                } else {
+                    b.docCountError = sumDocCountError - b.docCountError;
+                }
+            }
             if (b.docCount >= minDocCount) {
                 ordered.insertWithOverflow(b);
             }
@@ -132,9 +190,15 @@ public abstract class InternalTerms extends InternalAggregation implements Terms
         for (int i = ordered.size() - 1; i >= 0; i--) {
             list[i] = (Bucket) ordered.pop();
         }
-        return newAggregation(name, Arrays.asList(list));
+        long docCountError;
+        if (sumDocCountError == -1) {
+            docCountError = -1;
+        } else {
+            docCountError = aggregations.size() == 1 ? 0 : sumDocCountError;
+        }
+        return newAggregation(name, Arrays.asList(list), showTermDocCountError, docCountError);
     }
 
-    protected abstract InternalTerms newAggregation(String name, List<Bucket> buckets);
+    protected abstract InternalTerms newAggregation(String name, List<Bucket> buckets, boolean showTermDocCountError, long docCountError);
 }
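The error bookkeeping in the reduce phase above can be sketched standalone. The names and data shapes below are illustrative only (not the Elasticsearch API): each shard contributes its returned terms in descending doc-count order, and the upper bound for a term is the sum of the last returned doc count on every shard that did *not* return it.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;

// Hypothetical stand-alone sketch of the doc_count_error_upper_bound idea.
// A "shard result" is modeled as a LinkedHashMap of its top terms in
// descending doc-count order.
class DocCountErrorSketch {

    // Upper bound on how far off `term`'s merged doc count may be:
    // every shard that did NOT return the term could hide up to the
    // doc count of the last (smallest) term it DID return.
    static long upperBound(String term, List<LinkedHashMap<String, Long>> shards) {
        long bound = 0;
        for (LinkedHashMap<String, Long> shard : shards) {
            if (shard.containsKey(term)) {
                continue; // shard reported the term, so its contribution is exact
            }
            long lastCount = 0;
            for (long c : shard.values()) {
                lastCount = c; // ends up holding the last returned term's count
            }
            bound += lastCount;
        }
        return bound;
    }

    public static void main(String[] args) {
        LinkedHashMap<String, Long> shard1 = new LinkedHashMap<>();
        shard1.put("a", 10L);
        shard1.put("b", 4L);
        LinkedHashMap<String, Long> shard2 = new LinkedHashMap<>();
        shard2.put("c", 8L);
        shard2.put("a", 5L);
        List<LinkedHashMap<String, Long>> shards = Arrays.asList(shard1, shard2);
        // shard2 did not return "b": it could hold up to 5 more "b" docs
        System.out.println(upperBound("b", shards)); // 5
        // every shard returned "a", so its merged count is exact
        System.out.println(upperBound("a", shards)); // 0
    }
}
```

This mirrors the `sumDocCountError - b.docCountError` step in the reduce: the sum of all shards' last-term counts, minus the shares of the shards that actually returned the bucket.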
@@ -18,6 +18,7 @@
  */
 package org.elasticsearch.search.aggregations.bucket.terms;
 
+import org.elasticsearch.Version;
 import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.io.stream.StreamInput;
 import org.elasticsearch.common.io.stream.StreamOutput;
@@ -31,7 +32,6 @@ import org.elasticsearch.search.aggregations.support.format.ValueFormatterStream
 import java.io.IOException;
 import java.util.ArrayList;
-import java.util.Collection;
 import java.util.List;
 
 /**
@@ -59,8 +59,8 @@ public class LongTerms extends InternalTerms {
         long term;
 
-        public Bucket(long term, long docCount, InternalAggregations aggregations) {
-            super(docCount, aggregations);
+        public Bucket(long term, long docCount, InternalAggregations aggregations, boolean showDocCountError, long docCountError) {
+            super(docCount, aggregations, showDocCountError, docCountError);
             this.term = term;
         }
@@ -90,8 +90,8 @@ public class LongTerms extends InternalTerms {
         }
 
         @Override
-        Bucket newBucket(long docCount, InternalAggregations aggs) {
-            return new Bucket(term, docCount, aggs);
+        Bucket newBucket(long docCount, InternalAggregations aggs, long docCountError) {
+            return new Bucket(term, docCount, aggs, showDocCountError, docCountError);
         }
     }
@@ -99,8 +99,8 @@ public class LongTerms extends InternalTerms {
     LongTerms() {} // for serialization
 
-    public LongTerms(String name, InternalOrder order, @Nullable ValueFormatter formatter, int requiredSize, long minDocCount, Collection<InternalTerms.Bucket> buckets) {
-        super(name, order, requiredSize, minDocCount, buckets);
+    public LongTerms(String name, InternalOrder order, @Nullable ValueFormatter formatter, int requiredSize, int shardSize, long minDocCount, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
+        super(name, order, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
         this.formatter = formatter;
     }
@@ -110,21 +110,40 @@ public class LongTerms extends InternalTerms {
     }
 
     @Override
-    protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets) {
-        return new LongTerms(name, order, formatter, requiredSize, minDocCount, buckets);
+    protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
+        return new LongTerms(name, order, formatter, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
     }
 
     @Override
     public void readFrom(StreamInput in) throws IOException {
         this.name = in.readString();
+        if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
+            this.docCountError = in.readLong();
+        } else {
+            this.docCountError = -1;
+        }
         this.order = InternalOrder.Streams.readOrder(in);
         this.formatter = ValueFormatterStreams.readOptional(in);
         this.requiredSize = readSize(in);
+        if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
+            this.shardSize = readSize(in);
+            this.showTermDocCountError = in.readBoolean();
+        } else {
+            this.shardSize = requiredSize;
+            this.showTermDocCountError = false;
+        }
         this.minDocCount = in.readVLong();
         int size = in.readVInt();
         List<InternalTerms.Bucket> buckets = new ArrayList<>(size);
         for (int i = 0; i < size; i++) {
-            buckets.add(new Bucket(in.readLong(), in.readVLong(), InternalAggregations.readAggregations(in)));
+            long term = in.readLong();
+            long docCount = in.readVLong();
+            long bucketDocCountError = -1;
+            if (in.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
+                bucketDocCountError = in.readLong();
+            }
+            InternalAggregations aggregations = InternalAggregations.readAggregations(in);
+            buckets.add(new Bucket(term, docCount, aggregations, showTermDocCountError, bucketDocCountError));
         }
         this.buckets = buckets;
         this.bucketMap = null;
@@ -133,20 +152,31 @@ public class LongTerms extends InternalTerms {
     @Override
     public void writeTo(StreamOutput out) throws IOException {
         out.writeString(name);
+        if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
+            out.writeLong(docCountError);
+        }
         InternalOrder.Streams.writeOrder(order, out);
         ValueFormatterStreams.writeOptional(formatter, out);
         writeSize(requiredSize, out);
+        if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
+            writeSize(shardSize, out);
+            out.writeBoolean(showTermDocCountError);
+        }
         out.writeVLong(minDocCount);
         out.writeVInt(buckets.size());
         for (InternalTerms.Bucket bucket : buckets) {
             out.writeLong(((Bucket) bucket).term);
             out.writeVLong(bucket.getDocCount());
+            if (out.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
+                out.writeLong(bucket.docCountError);
+            }
             ((InternalAggregations) bucket.getAggregations()).writeTo(out);
         }
     }
 
     @Override
     public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
+        builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, docCountError);
         builder.startArray(CommonFields.BUCKETS);
         for (InternalTerms.Bucket bucket : buckets) {
             builder.startObject();
@@ -155,6 +185,9 @@ public class LongTerms extends InternalTerms {
                 builder.field(CommonFields.KEY_AS_STRING, formatter.format(((Bucket) bucket).term));
             }
             builder.field(CommonFields.DOC_COUNT, bucket.getDocCount());
+            if (showTermDocCountError) {
+                builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, bucket.getDocCountError());
+            }
             ((InternalAggregations) bucket.getAggregations()).toXContentInternal(builder, params);
             builder.endObject();
         }
@@ -44,12 +44,14 @@ public class LongTermsAggregator extends TermsAggregator {
     protected final ValuesSource.Numeric valuesSource;
     protected final @Nullable ValueFormatter formatter;
     protected final LongHash bucketOrds;
+    private boolean showTermDocCountError;
     private SortedNumericDocValues values;
 
     public LongTermsAggregator(String name, AggregatorFactories factories, ValuesSource.Numeric valuesSource, @Nullable ValueFormat format, long estimatedBucketCount,
-            InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode) {
+            InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
         super(name, BucketAggregationMode.PER_BUCKET, factories, estimatedBucketCount, aggregationContext, parent, bucketCountThresholds, order, subAggCollectMode);
         this.valuesSource = valuesSource;
+        this.showTermDocCountError = showTermDocCountError;
         this.formatter = format != null ? format.formatter() : null;
         bucketOrds = new LongHash(estimatedBucketCount, aggregationContext.bigArrays());
     }
@@ -76,13 +78,13 @@ public class LongTermsAggregator extends TermsAggregator {
         for (int i = 0; i < valuesCount; ++i) {
             final long val = values.valueAt(i);
             if (previous != val || i != 0) {
                 long bucketOrdinal = bucketOrds.add(val);
                 if (bucketOrdinal < 0) { // already seen
                     bucketOrdinal = - 1 - bucketOrdinal;
                     collectExistingBucket(doc, bucketOrdinal);
                 } else {
                     collectBucket(doc, bucketOrdinal);
                 }
                 previous = val;
             }
         }
@@ -113,7 +115,7 @@ public class LongTermsAggregator extends TermsAggregator {
         LongTerms.Bucket spare = null;
         for (long i = 0; i < bucketOrds.size(); i++) {
             if (spare == null) {
-                spare = new LongTerms.Bucket(0, 0, null);
+                spare = new LongTerms.Bucket(0, 0, null, showTermDocCountError, 0);
             }
             spare.term = bucketOrds.get(i);
             spare.docCount = bucketDocCount(i);
@@ -123,8 +125,6 @@ public class LongTermsAggregator extends TermsAggregator {
             }
         }
 
         // Get the top buckets
         final InternalTerms.Bucket[] list = new InternalTerms.Bucket[ordered.size()];
         long survivingBucketOrds[] = new long[ordered.size()];
@@ -139,14 +139,16 @@ public class LongTermsAggregator extends TermsAggregator {
         //Now build the aggs
         for (int i = 0; i < list.length; i++) {
             list[i].aggregations = bucketAggregations(list[i].bucketOrd);
+            list[i].docCountError = 0;
         }
 
-        return new LongTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list));
+        return new LongTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list), showTermDocCountError, 0);
     }
 
     @Override
     public InternalAggregation buildEmptyAggregation() {
-        return new LongTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList());
+        return new LongTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList(), showTermDocCountError, 0);
     }
 
     @Override
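The shard-side half of the picture can be sketched in isolation: each shard keeps only its top `shard_size` terms, and its worst-case error is the doc count of the last term it returns (the `thisAggDocCountError` the reduce phase derives per shard). The code below is an illustrative stand-in, not the real aggregator.

```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical sketch: select a shard's top `shardSize` terms by descending
// doc count, and report the shard's worst-case error alongside them.
class ShardTopTerms {

    static Map.Entry<LinkedHashMap<String, Long>, Long> topTerms(Map<String, Long> counts, int shardSize) {
        LinkedHashMap<String, Long> top = counts.entrySet().stream()
                .sorted(Map.Entry.<String, Long>comparingByValue().reversed())
                .limit(shardSize)
                .collect(Collectors.toMap(
                        Map.Entry::getKey, Map.Entry::getValue,
                        (a, b) -> a, LinkedHashMap::new));
        // If the shard returned every term it has, nothing was cut off: error 0.
        // Otherwise any omitted term has at most the last returned doc count.
        long error = 0;
        if (top.size() < counts.size()) {
            for (long c : top.values()) {
                error = c; // ends up as the last (smallest) returned count
            }
        }
        return new AbstractMap.SimpleEntry<>(top, error);
    }

    public static void main(String[] args) {
        Map<String, Long> counts = new HashMap<>();
        counts.put("a", 10L);
        counts.put("b", 7L);
        counts.put("c", 3L);
        Map.Entry<LinkedHashMap<String, Long>, Long> result = topTerms(counts, 2);
        System.out.println(result.getKey());   // top two terms: a and b
        System.out.println(result.getValue()); // error bound: 7
    }
}
```

Note the error is the *last returned* count, not the first omitted one: a term the shard dropped could have had up to that many documents and still missed the cut.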
@@ -19,6 +19,7 @@
 package org.elasticsearch.search.aggregations.bucket.terms;
 
 import org.apache.lucene.util.BytesRef;
+import org.elasticsearch.Version;
 import org.elasticsearch.common.bytes.BytesArray;
 import org.elasticsearch.common.io.stream.StreamInput;
 import org.elasticsearch.common.io.stream.StreamOutput;
@@ -31,7 +32,6 @@ import org.elasticsearch.search.aggregations.InternalAggregations;
 import java.io.IOException;
 import java.util.ArrayList;
-import java.util.Collection;
 import java.util.List;
 
 /**
@@ -59,8 +59,8 @@ public class StringTerms extends InternalTerms {
         BytesRef termBytes;
 
-        public Bucket(BytesRef term, long docCount, InternalAggregations aggregations) {
-            super(docCount, aggregations);
+        public Bucket(BytesRef term, long docCount, InternalAggregations aggregations, boolean showDocCountError, long docCountError) {
+            super(docCount, aggregations, showDocCountError, docCountError);
             this.termBytes = term;
         }
@@ -91,15 +91,15 @@ public class StringTerms extends InternalTerms {
         }
 
         @Override
-        Bucket newBucket(long docCount, InternalAggregations aggs) {
-            return new Bucket(termBytes, docCount, aggs);
+        Bucket newBucket(long docCount, InternalAggregations aggs, long docCountError) {
+            return new Bucket(termBytes, docCount, aggs, showDocCountError, docCountError);
         }
     }
 
     StringTerms() {} // for serialization
 
-    public StringTerms(String name, InternalOrder order, int requiredSize, long minDocCount, Collection<InternalTerms.Bucket> buckets) {
-        super(name, order, requiredSize, minDocCount, buckets);
+    public StringTerms(String name, InternalOrder order, int requiredSize, int shardSize, long minDocCount, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
+        super(name, order, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
     }
 
     @Override
@@ -108,20 +108,39 @@ public class StringTerms extends InternalTerms {
     }
 
     @Override
-    protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets) {
-        return new StringTerms(name, order, requiredSize, minDocCount, buckets);
+    protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
+        return new StringTerms(name, order, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
     }
 
     @Override
     public void readFrom(StreamInput in) throws IOException {
         this.name = in.readString();
+        if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
+            this.docCountError = in.readLong();
+        } else {
+            this.docCountError = -1;
+        }
         this.order = InternalOrder.Streams.readOrder(in);
         this.requiredSize = readSize(in);
+        if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
+            this.shardSize = readSize(in);
+            this.showTermDocCountError = in.readBoolean();
+        } else {
+            this.shardSize = requiredSize;
+            this.showTermDocCountError = false;
+        }
         this.minDocCount = in.readVLong();
         int size = in.readVInt();
         List<InternalTerms.Bucket> buckets = new ArrayList<>(size);
         for (int i = 0; i < size; i++) {
-            buckets.add(new Bucket(in.readBytesRef(), in.readVLong(), InternalAggregations.readAggregations(in)));
+            BytesRef termBytes = in.readBytesRef();
+            long docCount = in.readVLong();
+            long bucketDocCountError = -1;
+            if (in.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
+                bucketDocCountError = in.readLong();
+            }
+            InternalAggregations aggregations = InternalAggregations.readAggregations(in);
+            buckets.add(new Bucket(termBytes, docCount, aggregations, showTermDocCountError, bucketDocCountError));
         }
         this.buckets = buckets;
         this.bucketMap = null;
@@ -130,24 +149,38 @@ public class StringTerms extends InternalTerms {
     @Override
     public void writeTo(StreamOutput out) throws IOException {
         out.writeString(name);
+        if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
+            out.writeLong(docCountError);
+        }
         InternalOrder.Streams.writeOrder(order, out);
         writeSize(requiredSize, out);
+        if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
+            writeSize(shardSize, out);
+            out.writeBoolean(showTermDocCountError);
+        }
         out.writeVLong(minDocCount);
         out.writeVInt(buckets.size());
         for (InternalTerms.Bucket bucket : buckets) {
             out.writeBytesRef(((Bucket) bucket).termBytes);
             out.writeVLong(bucket.getDocCount());
+            if (out.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
+                out.writeLong(bucket.docCountError);
+            }
             ((InternalAggregations) bucket.getAggregations()).writeTo(out);
         }
     }
 
     @Override
     public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
+        builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, docCountError);
         builder.startArray(CommonFields.BUCKETS);
         for (InternalTerms.Bucket bucket : buckets) {
             builder.startObject();
             builder.utf8Field(CommonFields.KEY, ((Bucket) bucket).termBytes);
             builder.field(CommonFields.DOC_COUNT, bucket.getDocCount());
+            if (showTermDocCountError) {
+                builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, bucket.getDocCountError());
+            }
             ((InternalAggregations) bucket.getAggregations()).toXContentInternal(builder, params);
             builder.endObject();
         }
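The `readFrom`/`writeTo` changes above follow a version-gating pattern: fields added in 1.4.0 are only written and read when the remote node is on 1.4.0 or later, so mixed-version clusters keep working during a rolling upgrade. The sketch below illustrates the pattern with plain `java.io` streams; the version constant and stream classes are stand-ins, not the Elasticsearch `StreamInput`/`StreamOutput` API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch of version-gated wire serialization.
class VersionGatedStreamSketch {
    static final int V_1_4_0 = 1_040_099; // stand-in version id

    static byte[] write(int targetVersion, long docCountError, int requiredSize) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            if (targetVersion >= V_1_4_0) {
                out.writeLong(docCountError); // field added in 1.4.0
            }
            out.writeInt(requiredSize);       // field present in every version
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static long readDocCountError(int sourceVersion, byte[] payload) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(payload));
            // Older senders never wrote the field; fall back to -1 ("unknown"),
            // matching the fallback used in the diff above.
            return sourceVersion >= V_1_4_0 ? in.readLong() : -1;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readDocCountError(V_1_4_0, write(V_1_4_0, 7, 10))); // 7
        System.out.println(readDocCountError(0, write(0, 7, 10)));            // -1
    }
}
```

The key invariant is symmetry: writer and reader must apply the same version check, otherwise subsequent fields are read at the wrong offsets.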
@@ -48,9 +48,9 @@ public class StringTermsAggregator extends AbstractStringTermsAggregator {
     public StringTermsAggregator(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
             InternalOrder order, BucketCountThresholds bucketCountThresholds,
-            IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode) {
+            IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode, boolean showTermDocCountError) {
-        super(name, factories, estimatedBucketCount, aggregationContext, parent, order, bucketCountThresholds, collectionMode);
+        super(name, factories, estimatedBucketCount, aggregationContext, parent, order, bucketCountThresholds, collectionMode, showTermDocCountError);
         this.valuesSource = valuesSource;
         this.includeExclude = includeExclude;
         bucketOrds = new BytesRefHash(estimatedBucketCount, aggregationContext.bigArrays());
@@ -123,7 +123,7 @@ public class StringTermsAggregator extends AbstractStringTermsAggregator {
         StringTerms.Bucket spare = null;
         for (int i = 0; i < bucketOrds.size(); i++) {
             if (spare == null) {
-                spare = new StringTerms.Bucket(new BytesRef(), 0, null);
+                spare = new StringTerms.Bucket(new BytesRef(), 0, null, showTermDocCountError, 0);
             }
             bucketOrds.get(i, spare.termBytes);
             spare.docCount = bucketDocCount(i);
@@ -143,20 +143,21 @@ public class StringTermsAggregator extends AbstractStringTermsAggregator {
         }
         // replay any deferred collections
         runDeferredCollections(survivingBucketOrds);
         // Now build the aggs
         for (int i = 0; i < list.length; i++) {
             final StringTerms.Bucket bucket = (StringTerms.Bucket) list[i];
             bucket.termBytes = BytesRef.deepCopyOf(bucket.termBytes);
             bucket.aggregations = bucketAggregations(bucket.bucketOrd);
+            bucket.docCountError = 0;
         }
-        return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list));
+        return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list), showTermDocCountError, 0);
     }
     @Override
     public InternalAggregation buildEmptyAggregation() {
-        return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList());
+        return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList(), showTermDocCountError, 0);
     }
     @Override


@@ -66,12 +66,16 @@ public interface Terms extends MultiBucketsAggregation {
         abstract int compareTerm(Terms.Bucket other);
+        public abstract long getDocCountError();
     }
     Collection<Bucket> getBuckets();
     Bucket getBucketByKey(String term);
+    long getDocCountError();
     /**
      * Determines the order by which the term buckets will be sorted
      */


@@ -41,8 +41,8 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
         @Override
         Aggregator create(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
                 long maxOrd, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude,
-                AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode) {
+                AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
-            return new StringTermsAggregator(name, factories, valuesSource, estimatedBucketCount, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode);
+            return new StringTermsAggregator(name, factories, valuesSource, estimatedBucketCount, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
         }
         @Override
@@ -56,8 +56,8 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
         @Override
         Aggregator create(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
                 long maxOrd, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude,
-                AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode) {
+                AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
-            return new GlobalOrdinalsStringTermsAggregator(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode);
+            return new GlobalOrdinalsStringTermsAggregator(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
         }
         @Override
@@ -71,8 +71,8 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
         @Override
         Aggregator create(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
                 long maxOrd, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude,
-                AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode) {
+                AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
-            return new GlobalOrdinalsStringTermsAggregator.WithHash(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode);
+            return new GlobalOrdinalsStringTermsAggregator.WithHash(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
         }
         @Override
@@ -85,11 +85,11 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
         @Override
         Aggregator create(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
                 long maxOrd, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude,
-                AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode) {
+                AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
             if (includeExclude != null || factories.count() > 0) {
-                return GLOBAL_ORDINALS.create(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode);
+                return GLOBAL_ORDINALS.create(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
             }
-            return new GlobalOrdinalsStringTermsAggregator.LowCardinality(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode);
+            return new GlobalOrdinalsStringTermsAggregator.LowCardinality(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
         }
         @Override
@@ -115,7 +115,7 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
         abstract Aggregator create(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
                 long maxOrd, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds,
-                IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode);
+                IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError);
         abstract boolean needsGlobalOrdinals();
@@ -130,19 +130,21 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
     private final String executionHint;
     private SubAggCollectionMode subAggCollectMode;
     private final TermsAggregator.BucketCountThresholds bucketCountThresholds;
+    private boolean showTermDocCountError;
-    public TermsAggregatorFactory(String name, ValuesSourceConfig config, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude, String executionHint, SubAggCollectionMode executionMode) {
+    public TermsAggregatorFactory(String name, ValuesSourceConfig config, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude, String executionHint, SubAggCollectionMode executionMode, boolean showTermDocCountError) {
         super(name, StringTerms.TYPE.name(), config);
         this.order = order;
         this.includeExclude = includeExclude;
         this.executionHint = executionHint;
         this.bucketCountThresholds = bucketCountThresholds;
         this.subAggCollectMode = executionMode;
+        this.showTermDocCountError = showTermDocCountError;
     }
     @Override
     protected Aggregator createUnmapped(AggregationContext aggregationContext, Aggregator parent) {
-        final InternalAggregation aggregation = new UnmappedTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount());
+        final InternalAggregation aggregation = new UnmappedTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount());
         return new NonCollectingAggregator(name, aggregationContext, parent) {
             @Override
             public InternalAggregation buildEmptyAggregation() {
@@ -226,7 +228,7 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
             assert execution != null;
             valuesSource.setNeedsGlobalOrdinals(execution.needsGlobalOrdinals());
-            return execution.create(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode);
+            return execution.create(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
         }
         if (includeExclude != null) {
@@ -236,9 +238,9 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
         if (valuesSource instanceof ValuesSource.Numeric) {
             if (((ValuesSource.Numeric) valuesSource).isFloatingPoint()) {
-                return new DoubleTermsAggregator(name, factories, (ValuesSource.Numeric) valuesSource, config.format(), estimatedBucketCount, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode);
+                return new DoubleTermsAggregator(name, factories, (ValuesSource.Numeric) valuesSource, config.format(), estimatedBucketCount, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
             }
-            return new LongTermsAggregator(name, factories, (ValuesSource.Numeric) valuesSource, config.format(), estimatedBucketCount, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode);
+            return new LongTermsAggregator(name, factories, (ValuesSource.Numeric) valuesSource, config.format(), estimatedBucketCount, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
         }
         throw new AggregationExecutionException("terms aggregation cannot be applied to field [" + config.fieldContext().field() +


@@ -42,6 +42,7 @@ public class TermsBuilder extends ValuesSourceAggregationBuilder<TermsBuilder> {
     private int excludeFlags;
     private String executionHint;
     private SubAggCollectionMode collectionMode;
+    private Boolean showTermDocCountError;
     public TermsBuilder(String name) {
         super(name, "terms");
@@ -150,11 +151,19 @@ public class TermsBuilder extends ValuesSourceAggregationBuilder<TermsBuilder> {
         return this;
     }
+    public TermsBuilder showTermDocCountError(boolean showTermDocCountError) {
+        this.showTermDocCountError = showTermDocCountError;
+        return this;
+    }
     @Override
     protected XContentBuilder doInternalXContent(XContentBuilder builder, Params params) throws IOException {
         bucketCountThresholds.toXContent(builder);
+        if (showTermDocCountError != null) {
+            builder.field(AbstractTermsParametersParser.SHOW_TERM_DOC_COUNT_ERROR.getPreferredName(), showTermDocCountError);
+        }
         if (executionHint != null) {
             builder.field(AbstractTermsParametersParser.EXECUTION_HINT_FIELD_NAME.getPreferredName(), executionHint);
         }


@@ -39,8 +39,13 @@ public class TermsParametersParser extends AbstractTermsParametersParser {
         return orderAsc;
     }
+    public boolean showTermDocCountError() {
+        return showTermDocCountError;
+    }
     String orderKey = "_count";
     boolean orderAsc = false;
+    private boolean showTermDocCountError = false;
     @Override
     public void parseSpecial(String aggregationName, XContentParser parser, SearchContext context, XContentParser.Token token, String currentFieldName) throws IOException {
@@ -65,6 +70,10 @@ public class TermsParametersParser extends AbstractTermsParametersParser {
             } else {
                 throw new SearchParseException(context, "Unknown key for a " + token + " in [" + aggregationName + "]: [" + currentFieldName + "].");
             }
+        } else if (token == XContentParser.Token.VALUE_BOOLEAN) {
+            if (SHOW_TERM_DOC_COUNT_ERROR.match(currentFieldName)) {
+                showTermDocCountError = parser.booleanValue();
+            }
         } else {
             throw new SearchParseException(context, "Unknown key for a " + token + " in [" + aggregationName + "]: [" + currentFieldName + "].");
         }


@@ -55,8 +55,7 @@ public class TermsParser implements Aggregator.Parser {
                     context.numberOfShards()));
         }
         bucketCountThresholds.ensureValidity();
-        return new TermsAggregatorFactory(aggregationName, vsParser.config(), order, bucketCountThresholds, aggParser.getIncludeExclude(),
-                aggParser.getExecutionHint(), aggParser.getCollectionMode());
+        return new TermsAggregatorFactory(aggregationName, vsParser.config(), order, bucketCountThresholds, aggParser.getIncludeExclude(), aggParser.getExecutionHint(), aggParser.getCollectionMode(), aggParser.showTermDocCountError());
     }
     static InternalOrder resolveOrder(String key, boolean asc) {


@@ -25,7 +25,6 @@ import org.elasticsearch.search.aggregations.AggregationStreams;
 import org.elasticsearch.search.aggregations.InternalAggregation;
 import java.io.IOException;
-import java.util.Collection;
 import java.util.Collections;
 import java.util.List;
 import java.util.Map;
@@ -37,7 +36,7 @@ public class UnmappedTerms extends InternalTerms {
     public static final Type TYPE = new Type("terms", "umterms");
-    private static final Collection<Bucket> BUCKETS = Collections.emptyList();
+    private static final List<Bucket> BUCKETS = Collections.emptyList();
     private static final Map<String, Bucket> BUCKETS_MAP = Collections.emptyMap();
     public static final AggregationStreams.Stream STREAM = new AggregationStreams.Stream() {
@@ -55,8 +54,8 @@ public class UnmappedTerms extends InternalTerms {
     UnmappedTerms() {} // for serialization
-    public UnmappedTerms(String name, InternalOrder order, int requiredSize, long minDocCount) {
+    public UnmappedTerms(String name, InternalOrder order, int requiredSize, int shardSize, long minDocCount) {
-        super(name, order, requiredSize, minDocCount, BUCKETS);
+        super(name, order, requiredSize, shardSize, minDocCount, BUCKETS, false, 0);
     }
     @Override
@@ -67,6 +66,7 @@ public class UnmappedTerms extends InternalTerms {
     @Override
     public void readFrom(StreamInput in) throws IOException {
         this.name = in.readString();
+        this.docCountError = 0;
         this.order = InternalOrder.Streams.readOrder(in);
         this.requiredSize = readSize(in);
         this.minDocCount = in.readVLong();
@@ -93,12 +93,13 @@ public class UnmappedTerms extends InternalTerms {
     }
     @Override
-    protected InternalTerms newAggregation(String name, List<Bucket> buckets) {
+    protected InternalTerms newAggregation(String name, List<Bucket> buckets, boolean showTermDocCountError, long docCountError) {
         throw new UnsupportedOperationException("How did you get there?");
     }
     @Override
     public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
+        builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, docCountError);
         builder.startArray(CommonFields.BUCKETS).endArray();
         return builder;
     }


@@ -251,9 +251,9 @@ public class SignificantTermsSignificanceScoreTests extends ElasticsearchIntegra
         classes.toXContent(responseBuilder, null);
         String result = null;
         if (type.equals("long")) {
-            result = "\"class\"{\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"buckets\":[{\"key\":0,\"key_as_string\":\"0\",\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":5}]}},{\"key\":\"1\",\"doc_count\":3,\"sig_terms\":{\"doc_count\":3,\"buckets\":[{\"key\":1,\"key_as_string\":\"1\",\"doc_count\":3,\"score\":0.75,\"bg_count\":4}]}}]}";
+            result = "\"class\"{\"doc_count_error_upper_bound\":0,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"buckets\":[{\"key\":0,\"key_as_string\":\"0\",\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":5}]}},{\"key\":\"1\",\"doc_count\":3,\"sig_terms\":{\"doc_count\":3,\"buckets\":[{\"key\":1,\"key_as_string\":\"1\",\"doc_count\":3,\"score\":0.75,\"bg_count\":4}]}}]}";
        } else {
-            result = "\"class\"{\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":5}]}},{\"key\":\"1\",\"doc_count\":3,\"sig_terms\":{\"doc_count\":3,\"buckets\":[{\"key\":\"1\",\"doc_count\":3,\"score\":0.75,\"bg_count\":4}]}}]}";
+            result = "\"class\"{\"doc_count_error_upper_bound\":0,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":5}]}},{\"key\":\"1\",\"doc_count\":3,\"sig_terms\":{\"doc_count\":3,\"buckets\":[{\"key\":\"1\",\"doc_count\":3,\"score\":0.75,\"bg_count\":4}]}}]}";
        }
        assertThat(responseBuilder.string(), equalTo(result));


@@ -36,7 +36,9 @@ import org.elasticsearch.search.aggregations.bucket.terms.TermsBuilder;
 import org.elasticsearch.test.ElasticsearchIntegrationTest;
 import org.junit.Test;
-import java.util.*;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Set;
 import static org.elasticsearch.cluster.metadata.IndexMetaData.SETTING_NUMBER_OF_REPLICAS;
 import static org.elasticsearch.cluster.metadata.IndexMetaData.SETTING_NUMBER_OF_SHARDS;
@@ -276,8 +278,8 @@ public class SignificantTermsTests extends ElasticsearchIntegrationTest {
                 .setQuery(new TermQueryBuilder("_all", "terje"))
                 .setFrom(0).setSize(60).setExplain(true)
                 .addAggregation(new SignificantTermsBuilder("mySignificantTerms").field("description")
                         .executionHint(randomExecutionHint())
                         .minDocCount(2))
                 .execute()
                 .actionGet();
         assertSearchResponse(response);


@ -0,0 +1,945 @@
/*
* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.elasticsearch.search.aggregations.bucket;
import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.cluster.metadata.IndexMetaData;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.search.aggregations.Aggregator.SubAggCollectionMode;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.Terms.Bucket;
import org.elasticsearch.search.aggregations.bucket.terms.Terms.Order;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregatorFactory.ExecutionMode;
import org.elasticsearch.test.ElasticsearchIntegrationTest;
import org.junit.Test;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
import static org.elasticsearch.search.aggregations.AggregationBuilders.sum;
import static org.elasticsearch.search.aggregations.AggregationBuilders.terms;
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked;
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertSearchResponse;
import static org.hamcrest.Matchers.*;
import static org.hamcrest.core.IsNull.notNullValue;
@ElasticsearchIntegrationTest.SuiteScopeTest
public class TermsDocCountErrorTests extends ElasticsearchIntegrationTest {
private static final String STRING_FIELD_NAME = "s_value";
private static final String LONG_FIELD_NAME = "l_value";
private static final String DOUBLE_FIELD_NAME = "d_value";
private static final String ROUTING_FIELD_NAME = "route";
public static String randomExecutionHint() {
return randomBoolean() ? null : randomFrom(ExecutionMode.values()).toString();
}
private static int numRoutingValues;
@Override
public void setupSuiteScopeCluster() throws Exception {
createIndex("idx");
List<IndexRequestBuilder> builders = new ArrayList<>();
int numDocs = between(10, 200);
int numUniqueTerms = between(2,numDocs/2);
for (int i = 0; i < numDocs; i++) {
builders.add(client().prepareIndex("idx", "type", ""+i).setSource(jsonBuilder()
.startObject()
.field(STRING_FIELD_NAME, "val" + randomInt(numUniqueTerms))
.field(LONG_FIELD_NAME, randomInt(numUniqueTerms))
.field(DOUBLE_FIELD_NAME, 1.0 * randomInt(numUniqueTerms))
.endObject()));
}
assertAcked(prepareCreate("idx_single_shard").setSettings(ImmutableSettings.builder().put(IndexMetaData.SETTING_NUMBER_OF_SHARDS, 1)));
for (int i = 0; i < numDocs; i++) {
builders.add(client().prepareIndex("idx_single_shard", "type", ""+i).setSource(jsonBuilder()
.startObject()
.field(STRING_FIELD_NAME, "val" + randomInt(numUniqueTerms))
.field(LONG_FIELD_NAME, randomInt(numUniqueTerms))
.field(DOUBLE_FIELD_NAME, 1.0 * randomInt(numUniqueTerms))
.endObject()));
}
numRoutingValues = between(1,40);
assertAcked(prepareCreate("idx_with_routing").addMapping("type", "{ \"type\" : { \"_routing\" : { \"required\" : true, \"path\" : \"" + ROUTING_FIELD_NAME + "\" } } }"));
for (int i = 0; i < numDocs; i++) {
builders.add(client().prepareIndex("idx_single_shard", "type", ""+i).setSource(jsonBuilder()
.startObject()
.field(STRING_FIELD_NAME, "val" + randomInt(numUniqueTerms))
.field(LONG_FIELD_NAME, randomInt(numUniqueTerms))
.field(DOUBLE_FIELD_NAME, 1.0 * randomInt(numUniqueTerms))
.field(ROUTING_FIELD_NAME, String.valueOf(randomInt(numRoutingValues)))
.endObject()));
}
indexRandom(true, builders);
ensureSearchable();
}
private void assertDocCountErrorWithinBounds(int size, SearchResponse accurateResponse, SearchResponse testResponse) {
Terms accurateTerms = accurateResponse.getAggregations().get("terms");
assertThat(accurateTerms, notNullValue());
assertThat(accurateTerms.getName(), equalTo("terms"));
assertThat(accurateTerms.getDocCountError(), equalTo(0l));
Terms testTerms = testResponse.getAggregations().get("terms");
assertThat(testTerms, notNullValue());
assertThat(testTerms.getName(), equalTo("terms"));
assertThat(testTerms.getDocCountError(), greaterThanOrEqualTo(0l));
Collection<Bucket> testBuckets = testTerms.getBuckets();
assertThat(testBuckets.size(), lessThanOrEqualTo(size));
assertThat(accurateTerms.getBuckets().size(), greaterThanOrEqualTo(testBuckets.size()));
for (Terms.Bucket testBucket : testBuckets) {
assertThat(testBucket, notNullValue());
Terms.Bucket accurateBucket = accurateTerms.getBucketByKey(testBucket.getKey());
assertThat(accurateBucket, notNullValue());
assertThat(accurateBucket.getDocCountError(), equalTo(0l));
assertThat(testBucket.getDocCountError(), lessThanOrEqualTo(testTerms.getDocCountError()));
assertThat(testBucket.getDocCount() + testBucket.getDocCountError(), greaterThanOrEqualTo(accurateBucket.getDocCount()));
assertThat(testBucket.getDocCount() - testBucket.getDocCountError(), lessThanOrEqualTo(accurateBucket.getDocCount()));
}
for (Terms.Bucket accurateBucket: accurateTerms.getBuckets()) {
assertThat(accurateBucket, notNullValue());
Terms.Bucket testBucket = testTerms.getBucketByKey(accurateBucket.getKey());
if (testBucket == null) {
assertThat(accurateBucket.getDocCount(), lessThanOrEqualTo(testTerms.getDocCountError()));
}
}
}
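For reference, the bound these assertions exercise can be reproduced in isolation. The sketch below is hypothetical (the class and helper names are not part of this commit); it follows the rule from the commit message: a term's doc count error upper bound is the sum of the last (smallest) returned doc count over ALL shards, minus the last doc counts of the shards that did return the term, i.e. the most the term could have been undercounted by the shards that omitted it.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;

public class DocCountErrorSketch {

    // Each shard's result maps term -> docCount, in descending doc count order,
    // mimicking a shard's top-N response for a terms aggregation ordered by _count.
    static long docCountErrorUpperBound(String term, List<LinkedHashMap<String, Long>> shards) {
        long allShardsLastSum = 0;       // sum of last returned counts over all shards
        long returningShardsLastSum = 0; // same sum, restricted to shards returning the term
        for (LinkedHashMap<String, Long> shard : shards) {
            long last = 0;
            for (long count : shard.values()) {
                last = count; // ends as the smallest count this shard returned
            }
            allShardsLastSum += last;
            if (shard.containsKey(term)) {
                returningShardsLastSum += last;
            }
        }
        return allShardsLastSum - returningShardsLastSum;
    }

    // Convenience builder for an ordered shard result: shard("a", 10, "b", 4).
    static LinkedHashMap<String, Long> shard(Object... kv) {
        LinkedHashMap<String, Long> m = new LinkedHashMap<>();
        for (int i = 0; i < kv.length; i += 2) {
            m.put((String) kv[i], ((Number) kv[i + 1]).longValue());
        }
        return m;
    }

    public static void main(String[] args) {
        // shard1 returned {a:10, b:4}; shard2 returned {a:8, c:3}.
        List<LinkedHashMap<String, Long>> shards = Arrays.asList(
                shard("a", 10, "b", 4), shard("a", 8, "c", 3));
        System.out.println(docCountErrorUpperBound("a", shards)); // 0: every shard returned "a"
        System.out.println(docCountErrorUpperBound("c", shards)); // 4: shard1 omitted "c" and could hide up to 4 docs
    }
}
```

This is exactly the invariant `assertDocCountErrorWithinBounds` checks: the accurate count lies within `docCount ± docCountError` of the reported count.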
private void assertNoDocCountError(int size, SearchResponse accurateResponse, SearchResponse testResponse) {
Terms accurateTerms = accurateResponse.getAggregations().get("terms");
assertThat(accurateTerms, notNullValue());
assertThat(accurateTerms.getName(), equalTo("terms"));
assertThat(accurateTerms.getDocCountError(), equalTo(0l));
Terms testTerms = testResponse.getAggregations().get("terms");
assertThat(testTerms, notNullValue());
assertThat(testTerms.getName(), equalTo("terms"));
assertThat(testTerms.getDocCountError(), equalTo(0l));
Collection<Bucket> testBuckets = testTerms.getBuckets();
assertThat(testBuckets.size(), lessThanOrEqualTo(size));
assertThat(accurateTerms.getBuckets().size(), greaterThanOrEqualTo(testBuckets.size()));
for (Terms.Bucket testBucket : testBuckets) {
assertThat(testBucket, notNullValue());
Terms.Bucket accurateBucket = accurateTerms.getBucketByKey(testBucket.getKey());
assertThat(accurateBucket, notNullValue());
assertThat(accurateBucket.getDocCountError(), equalTo(0L));
assertThat(testBucket.getDocCountError(), equalTo(0L));
}
}
/**
 * Asserts a zero doc count error when only a single (test) response is available,
 * e.g. for routed searches that effectively hit one shard.
 */
private void assertNoDocCountErrorSingleResponse(int size, SearchResponse testResponse) {
Terms testTerms = testResponse.getAggregations().get("terms");
assertThat(testTerms, notNullValue());
assertThat(testTerms.getName(), equalTo("terms"));
assertThat(testTerms.getDocCountError(), equalTo(0L));
Collection<Bucket> testBuckets = testTerms.getBuckets();
assertThat(testBuckets.size(), lessThanOrEqualTo(size));
for (Terms.Bucket testBucket : testBuckets) {
assertThat(testBucket, notNullValue());
assertThat(testBucket.getDocCountError(), equalTo(0L));
}
}
/**
 * Asserts that the doc count error is either unbounded (-1) or zero, as expected
 * for orderings (doc count ascending, sub-aggregation based) where no upper bound
 * on the error can be computed.
 */
private void assertUnboundedDocCountError(int size, SearchResponse accurateResponse, SearchResponse testResponse) {
Terms accurateTerms = accurateResponse.getAggregations().get("terms");
assertThat(accurateTerms, notNullValue());
assertThat(accurateTerms.getName(), equalTo("terms"));
assertThat(accurateTerms.getDocCountError(), equalTo(0L));
Terms testTerms = testResponse.getAggregations().get("terms");
assertThat(testTerms, notNullValue());
assertThat(testTerms.getName(), equalTo("terms"));
assertThat(testTerms.getDocCountError(), anyOf(equalTo(-1L), equalTo(0L)));
Collection<Bucket> testBuckets = testTerms.getBuckets();
assertThat(testBuckets.size(), lessThanOrEqualTo(size));
assertThat(accurateTerms.getBuckets().size(), greaterThanOrEqualTo(testBuckets.size()));
for (Terms.Bucket testBucket : testBuckets) {
assertThat(testBucket, notNullValue());
Terms.Bucket accurateBucket = accurateTerms.getBucketByKey(testBucket.getKey());
assertThat(accurateBucket, notNullValue());
assertThat(accurateBucket.getDocCountError(), equalTo(0L));
assertThat(testBucket.getDocCountError(), anyOf(equalTo(-1L), equalTo(0L)));
}
}
@Test
public void stringValueField() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertDocCountErrorWithinBounds(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_singleShard() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_withRouting() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse testResponse = client().prepareSearch("idx_with_routing").setTypes("type").setRouting(String.valueOf(between(1, numRoutingValues)))
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountErrorSingleResponse(size, testResponse);
}
@Test
public void stringValueField_docCountAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_termSortAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_termSortDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_subAggAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_subAggDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertDocCountErrorWithinBounds(size, accurateResponse, testResponse);
}
@Test
public void longValueField_singleShard() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField_withRouting() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse testResponse = client().prepareSearch("idx_with_routing").setTypes("type").setRouting(String.valueOf(between(1, numRoutingValues)))
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountErrorSingleResponse(size, testResponse);
}
@Test
public void longValueField_docCountAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField_termSortAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField_termSortDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField_subAggAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField_subAggDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(DOUBLE_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(DOUBLE_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertDocCountErrorWithinBounds(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_singleShard() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_withRouting() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse testResponse = client().prepareSearch("idx_with_routing").setTypes("type").setRouting(String.valueOf(between(1, numRoutingValues)))
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountErrorSingleResponse(size, testResponse);
}
@Test
public void doubleValueField_docCountAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_termSortAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_termSortDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_subAggAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_subAggDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
}