Aggregations: Added an option to show the upper bound of the error for the terms aggregation.

This is only applicable when the order is set to _count. The upper bound of the error in the doc count for a term is calculated by summing the doc count of the last term returned by each shard which did not return that term. The implementation calculates this by summing the doc count for the last term on each shard for which the term IS returned and then subtracting this value from the sum of the doc counts for the last term from ALL shards.
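As a rough illustration of the arithmetic described above, here is a minimal sketch (not the actual implementation; the method and parameter names are hypothetical):

    // Hypothetical helper mirroring the description above.
    // lastTermDocCount[i] holds the doc count of the last (smallest) term
    // returned by shard i; returnedTerm[i] is true if shard i returned the
    // term whose error we are bounding.
    static long docCountErrorUpperBound(long[] lastTermDocCount, boolean[] returnedTerm) {
        long sumAllShards = 0;       // last-term doc counts from ALL shards
        long sumReturningShards = 0; // last-term doc counts from shards which DID return the term
        for (int i = 0; i < lastTermDocCount.length; i++) {
            sumAllShards += lastTermDocCount[i];
            if (returnedTerm[i]) {
                sumReturningShards += lastTermDocCount[i];
            }
        }
        // Equivalent to summing the last-term doc count of each shard which
        // did not return the term.
        return sumAllShards - sumReturningShards;
    }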

Closes #6696
This commit is contained in:
Colin Goodheart-Smithe 2014-07-07 13:58:42 +01:00
parent 593fffc7a1
commit 655157c83a
23 changed files with 1431 additions and 130 deletions

View File

@ -43,7 +43,7 @@ Response:
By default, the `terms` aggregation will return the buckets for the top ten terms ordered by the `doc_count`. One can
change this default behaviour by setting the `size` parameter.
==== Size & Shard Size
==== Size
The `size` parameter can be set to define how many term buckets should be returned out of the overall terms list. By
default, the node coordinating the search process will request each shard to provide its own top `size` term buckets
@ -52,6 +52,87 @@ This means that if the number of unique terms is greater than `size`, the return
(it could be that the term counts are slightly off and it could even be that a term that should have been in the top
size buckets was not returned). If set to `0`, the `size` will be set to `Integer.MAX_VALUE`.
==== Document counts are approximate
As described above, the document counts (and the results of any sub aggregations) in the terms aggregation are not always
accurate. This is because each shard provides its own view of what the ordered list of terms should be and these are
combined to give a final view. Consider the following scenario:
A request is made to obtain the top 5 terms in the field `product`, ordered by descending document count from an index with
3 shards. In this case each shard is asked to give its top 5 terms.
[source,js]
--------------------------------------------------
{
    "aggs" : {
        "products" : {
            "terms" : {
                "field" : "product",
                "size" : 5
            }
        }
    }
}
--------------------------------------------------
The terms for each of the three shards are shown below with their
respective document counts in brackets:
[width="100%",cols="^2,^2,^2,^2",options="header"]
|=========================================================
| | Shard A | Shard B | Shard C
| 1 | Product A (25) | Product A (30) | Product A (45)
| 2 | Product B (18) | Product B (25) | Product C (44)
| 3 | Product C (6) | Product F (17) | Product Z (36)
| 4 | Product D (3) | Product Z (16) | Product G (30)
| 5 | Product E (2) | Product G (15) | Product E (29)
| 6 | Product F (2) | Product H (14) | Product H (28)
| 7 | Product G (2) | Product I (10) | Product Q (2)
| 8 | Product H (2) | Product J (8) | Product D (1)
| 9 | Product I (1) | Product Q (6) |
| 10 | Product J (1) | Product C (4) |
|=========================================================
The shards will return their top 5 terms, so the results from the shards will be:
[width="100%",cols="^2,^2,^2,^2",options="header"]
|=========================================================
| | Shard A | Shard B | Shard C
| 1 | Product A (25) | Product A (30) | Product A (45)
| 2 | Product B (18) | Product B (25) | Product C (44)
| 3 | Product C (6) | Product F (17) | Product Z (36)
| 4 | Product D (3) | Product Z (16) | Product G (30)
| 5 | Product E (2) | Product G (15) | Product E (29)
|=========================================================
Taking the top 5 results from each of the shards (as requested) and combining them to make a final top 5 list produces
the following:
[width="40%",cols="^2,^2"]
|=========================================================
| 1 | Product A (100)
| 2 | Product Z (52)
| 3 | Product C (50)
| 4 | Product G (45)
| 5 | Product B (43)
|=========================================================
Because Product A was returned from all shards we know that its document count value is accurate. Product C was only
returned by shards A and C so its document count is shown as 50 but this is not an accurate count. Product C exists on
shard B, but its count of 4 was not high enough to put Product C into the top 5 list for that shard. Product Z was also
returned only by 2 shards but the third shard does not contain the term. There is no way of knowing, at the point of
combining the results to produce the final list of terms, that there is an error in the document count for Product C and
not for Product Z. Product H has a document count of 44 across all 3 shards but was not included in the final list of
terms because it did not make it into the top five terms on any of the shards.
==== Shard Size
The higher the requested `size` is, the more accurate the results will be, but also, the more expensive it will be to
compute the final results (both due to bigger priority queues that are managed on a shard level and due to bigger data
@ -70,6 +151,81 @@ NOTE: `shard_size` cannot be smaller than `size` (as it doesn't make much sens
added[1.1.0] It is possible to not limit the number of terms that are returned by setting `size` to `0`. Don't use this
on high-cardinality fields as this will kill both your CPU, since terms need to be returned sorted, and your network.
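As noted above, `shard_size` controls this accuracy/cost trade-off. A request setting it explicitly might look like the following (a sketch; the values are illustrative):
[source,js]
--------------------------------------------------
{
    "aggs" : {
        "products" : {
            "terms" : {
                "field" : "product",
                "size" : 5,
                "shard_size" : 10
            }
        }
    }
}
--------------------------------------------------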
==== Calculating Document Count Error
coming[1.4.0]
There are two error values which can be shown on the terms aggregation. The first gives a value for the aggregation as
a whole which represents the maximum potential document count for a term which did not make it into the final list of
terms. This is calculated as the sum of the document count from the last term returned from each shard. For the example
given above the value would be 46 (2 + 15 + 29). This means that in the worst case scenario a term which was not returned
could have the 4th highest document count.
[source,js]
--------------------------------------------------
{
    ...
    "aggregations" : {
        "products" : {
            "doc_count_error_upper_bound" : 46,
            "buckets" : [
                {
                    "key" : "Product A",
                    "doc_count" : 100
                },
                {
                    "key" : "Product Z",
                    "doc_count" : 52
                },
                ...
            ]
        }
    }
}
--------------------------------------------------
The second error value can be enabled by setting the `show_term_doc_count_error` parameter to true. This shows an error value
for each term returned by the aggregation which represents the 'worst case' error in the document count and can be useful when
deciding on a value for the `shard_size` parameter. This is calculated by summing the document counts for the last term returned
by all shards which did not return the term. In the example above the error in the document count for Product C would be 15, as
Shard B was the only shard not to return the term and the document count of the last term it did return was 15. The actual document
count of Product C was 54, so the document count was only off by 4 even though the worst case was that it would be off by
15. Product A, however, has an error of 0 for its document count: since every shard returned it, we can be confident that the count
returned is accurate.
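A request enabling this per-term error might look like the following (a sketch reusing the `product` example from above):
[source,js]
--------------------------------------------------
{
    "aggs" : {
        "products" : {
            "terms" : {
                "field" : "product",
                "size" : 5,
                "show_term_doc_count_error" : true
            }
        }
    }
}
--------------------------------------------------
The response then includes a `doc_count_error_upper_bound` field on each bucket in addition to the one for the aggregation as a whole: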
[source,js]
--------------------------------------------------
{
    ...
    "aggregations" : {
        "products" : {
            "doc_count_error_upper_bound" : 46,
            "buckets" : [
                {
                    "key" : "Product A",
                    "doc_count" : 100,
                    "doc_count_error_upper_bound" : 0
                },
                {
                    "key" : "Product Z",
                    "doc_count" : 52,
                    "doc_count_error_upper_bound" : 2
                },
                ...
            ]
        }
    }
}
--------------------------------------------------
These errors can only be calculated in this way when the terms are ordered by descending document count. When the aggregation is
ordered by the term values themselves (either ascending or descending) there is no error in the document count, since if a shard
does not return a particular term which appears in the results from another shard, it must not have that term in its index. When the
aggregation is either sorted by a sub aggregation or in order of ascending document count, the error in the document counts cannot be
determined and is given a value of -1 to indicate this.
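For example, ordering by the term itself falls into the first of these cases, so both the aggregation-level and per-term error values would be 0 (a sketch; the `order` parameter is covered in the next section):
[source,js]
--------------------------------------------------
{
    "aggs" : {
        "products" : {
            "terms" : {
                "field" : "product",
                "size" : 5,
                "order" : { "_term" : "asc" }
            }
        }
    }
}
--------------------------------------------------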
==== Order
The order of the buckets can be customized by setting the `order` parameter. By default, the buckets are ordered by

View File

@ -47,7 +47,7 @@ public class GlobalOrdinalsSignificantTermsAggregator extends GlobalOrdinalsStri
IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent,
SignificantTermsAggregatorFactory termsAggFactory) {
super(name, factories, valuesSource, estimatedBucketCount, maxOrd, null, bucketCountThresholds, includeExclude, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST);
super(name, factories, valuesSource, estimatedBucketCount, maxOrd, null, bucketCountThresholds, includeExclude, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST, false);
this.termsAggFactory = termsAggFactory;
}

View File

@ -42,7 +42,7 @@ public class SignificantLongTermsAggregator extends LongTermsAggregator {
long estimatedBucketCount, BucketCountThresholds bucketCountThresholds,
AggregationContext aggregationContext, Aggregator parent, SignificantTermsAggregatorFactory termsAggFactory) {
super(name, factories, valuesSource, format, estimatedBucketCount, null, bucketCountThresholds, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST);
super(name, factories, valuesSource, format, estimatedBucketCount, null, bucketCountThresholds, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST, false);
this.termsAggFactory = termsAggFactory;
}

View File

@ -46,7 +46,7 @@ public class SignificantStringTermsAggregator extends StringTermsAggregator {
IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent,
SignificantTermsAggregatorFactory termsAggFactory) {
super(name, factories, valuesSource, estimatedBucketCount, null, bucketCountThresholds, includeExclude, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST);
super(name, factories, valuesSource, estimatedBucketCount, null, bucketCountThresholds, includeExclude, aggregationContext, parent, SubAggCollectionMode.DEPTH_FIRST, false);
this.termsAggFactory = termsAggFactory;
}

View File

@ -29,10 +29,13 @@ import java.util.Collections;
abstract class AbstractStringTermsAggregator extends TermsAggregator {
protected final boolean showTermDocCountError;
public AbstractStringTermsAggregator(String name, AggregatorFactories factories,
long estimatedBucketsCount, AggregationContext context, Aggregator parent,
InternalOrder order, BucketCountThresholds bucketCountThresholds, SubAggCollectionMode subAggCollectMode) {
InternalOrder order, BucketCountThresholds bucketCountThresholds, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
super(name, BucketAggregationMode.PER_BUCKET, factories, estimatedBucketsCount, context, parent, bucketCountThresholds, order, subAggCollectMode);
this.showTermDocCountError = showTermDocCountError;
}
@Override
@ -42,7 +45,7 @@ abstract class AbstractStringTermsAggregator extends TermsAggregator {
@Override
public InternalAggregation buildEmptyAggregation() {
return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList());
return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList(), showTermDocCountError, 0);
}
}

View File

@ -36,6 +36,7 @@ public abstract class AbstractTermsParametersParser {
public static final ParseField MIN_DOC_COUNT_FIELD_NAME = new ParseField("min_doc_count");
public static final ParseField SHARD_MIN_DOC_COUNT_FIELD_NAME = new ParseField("shard_min_doc_count");
public static final ParseField REQUIRED_SIZE_FIELD_NAME = new ParseField("size");
public static final ParseField SHOW_TERM_DOC_COUNT_ERROR = new ParseField("show_term_doc_count_error");
//These are the results of the parsing.
@ -64,7 +65,6 @@ public abstract class AbstractTermsParametersParser {
return collectMode;
}
public void parse(String aggregationName, XContentParser parser, SearchContext context, ValuesSourceParser vsParser, IncludeExclude.Parser incExcParser) throws IOException {
bucketCountThresholds = getDefaultBucketCountThresholds();
XContentParser.Token token;

View File

@ -18,6 +18,7 @@
*/
package org.elasticsearch.search.aggregations.bucket.terms;
import org.elasticsearch.Version;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
@ -31,7 +32,6 @@ import org.elasticsearch.search.aggregations.support.format.ValueFormatterStream
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
/**
@ -58,8 +58,8 @@ public class DoubleTerms extends InternalTerms {
double term;
public Bucket(double term, long docCount, InternalAggregations aggregations) {
super(docCount, aggregations);
public Bucket(double term, long docCount, InternalAggregations aggregations, boolean showDocCountError, long docCountError) {
super(docCount, aggregations, showDocCountError, docCountError);
this.term = term;
}
@ -89,8 +89,8 @@ public class DoubleTerms extends InternalTerms {
}
@Override
Bucket newBucket(long docCount, InternalAggregations aggs) {
return new Bucket(term, docCount, aggs);
Bucket newBucket(long docCount, InternalAggregations aggs, long docCountError) {
return new Bucket(term, docCount, aggs, showDocCountError, docCountError);
}
}
@ -98,8 +98,8 @@ public class DoubleTerms extends InternalTerms {
DoubleTerms() {} // for serialization
public DoubleTerms(String name, InternalOrder order, @Nullable ValueFormatter formatter, int requiredSize, long minDocCount, Collection<InternalTerms.Bucket> buckets) {
super(name, order, requiredSize, minDocCount, buckets);
public DoubleTerms(String name, InternalOrder order, @Nullable ValueFormatter formatter, int requiredSize, int shardSize, long minDocCount, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
super(name, order, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
this.formatter = formatter;
}
@ -109,21 +109,40 @@ public class DoubleTerms extends InternalTerms {
}
@Override
protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets) {
return new DoubleTerms(name, order, formatter, requiredSize, minDocCount, buckets);
protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
return new DoubleTerms(name, order, formatter, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
}
@Override
public void readFrom(StreamInput in) throws IOException {
this.name = in.readString();
if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
this.docCountError = in.readLong();
} else {
this.docCountError = -1;
}
this.order = InternalOrder.Streams.readOrder(in);
this.formatter = ValueFormatterStreams.readOptional(in);
this.requiredSize = readSize(in);
if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
this.shardSize = readSize(in);
this.showTermDocCountError = in.readBoolean();
} else {
this.shardSize = requiredSize;
this.showTermDocCountError = false;
}
this.minDocCount = in.readVLong();
int size = in.readVInt();
List<InternalTerms.Bucket> buckets = new ArrayList<>(size);
for (int i = 0; i < size; i++) {
buckets.add(new Bucket(in.readDouble(), in.readVLong(), InternalAggregations.readAggregations(in)));
double term = in.readDouble();
long docCount = in.readVLong();
long bucketDocCountError = -1;
if (in.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
bucketDocCountError = in.readLong();
}
InternalAggregations aggregations = InternalAggregations.readAggregations(in);
buckets.add(new Bucket(term, docCount, aggregations, showTermDocCountError, bucketDocCountError));
}
this.buckets = buckets;
this.bucketMap = null;
@ -132,20 +151,31 @@ public class DoubleTerms extends InternalTerms {
@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeString(name);
if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
out.writeLong(docCountError);
}
InternalOrder.Streams.writeOrder(order, out);
ValueFormatterStreams.writeOptional(formatter, out);
writeSize(requiredSize, out);
if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
writeSize(shardSize, out);
out.writeBoolean(showTermDocCountError);
}
out.writeVLong(minDocCount);
out.writeVInt(buckets.size());
for (InternalTerms.Bucket bucket : buckets) {
out.writeDouble(((Bucket) bucket).term);
out.writeVLong(bucket.getDocCount());
if (out.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
out.writeLong(bucket.docCountError);
}
((InternalAggregations) bucket.getAggregations()).writeTo(out);
}
}
@Override
public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, docCountError);
builder.startArray(CommonFields.BUCKETS);
for (InternalTerms.Bucket bucket : buckets) {
builder.startObject();
@ -154,6 +184,9 @@ public class DoubleTerms extends InternalTerms {
builder.field(CommonFields.KEY_AS_STRING, formatter.format(((Bucket) bucket).term));
}
builder.field(CommonFields.DOC_COUNT, bucket.getDocCount());
if (showTermDocCountError) {
builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, bucket.getDocCountError());
}
((InternalAggregations) bucket.getAggregations()).toXContentInternal(builder, params);
builder.endObject();
}

View File

@ -43,12 +43,14 @@ public class DoubleTermsAggregator extends TermsAggregator {
private final ValuesSource.Numeric valuesSource;
private final ValueFormatter formatter;
private final LongHash bucketOrds;
private final boolean showTermDocCountError;
private SortedNumericDoubleValues values;
public DoubleTermsAggregator(String name, AggregatorFactories factories, ValuesSource.Numeric valuesSource, @Nullable ValueFormat format, long estimatedBucketCount,
InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode) {
InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode, boolean showTermDocCountError) {
super(name, BucketAggregationMode.PER_BUCKET, factories, estimatedBucketCount, aggregationContext, parent, bucketCountThresholds, order, collectionMode);
this.valuesSource = valuesSource;
this.showTermDocCountError = showTermDocCountError;
this.formatter = format != null ? format.formatter() : null;
bucketOrds = new LongHash(estimatedBucketCount, aggregationContext.bigArrays());
}
@ -111,7 +113,7 @@ public class DoubleTermsAggregator extends TermsAggregator {
DoubleTerms.Bucket spare = null;
for (long i = 0; i < bucketOrds.size(); i++) {
if (spare == null) {
spare = new DoubleTerms.Bucket(0, 0, null);
spare = new DoubleTerms.Bucket(0, 0, null, showTermDocCountError, 0);
}
spare.term = Double.longBitsToDouble(bucketOrds.get(i));
spare.docCount = bucketDocCount(i);
@ -130,19 +132,20 @@ public class DoubleTermsAggregator extends TermsAggregator {
list[i] = bucket;
}
// replay any deferred collections
runDeferredCollections(survivingBucketOrds);
// Now build the aggs
for (int i = 0; i < list.length; i++) {
list[i].aggregations = bucketAggregations(list[i].bucketOrd);
list[i].docCountError = 0;
}
return new DoubleTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list));
return new DoubleTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list), showTermDocCountError, 0);
}
@Override
public DoubleTerms buildEmptyAggregation() {
return new DoubleTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList());
return new DoubleTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList(), showTermDocCountError, 0);
}
@Override

View File

@ -69,8 +69,8 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
public GlobalOrdinalsStringTermsAggregator(String name, AggregatorFactories factories, ValuesSource.Bytes.WithOrdinals.FieldData valuesSource, long estimatedBucketCount,
long maxOrd, InternalOrder order, BucketCountThresholds bucketCountThresholds,
IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode) {
super(name, factories, maxOrd, aggregationContext, parent, order, bucketCountThresholds, collectionMode);
IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode, boolean showTermDocCountError) {
super(name, factories, maxOrd, aggregationContext, parent, order, bucketCountThresholds, collectionMode, showTermDocCountError);
this.valuesSource = valuesSource;
this.includeExclude = includeExclude;
}
@ -151,7 +151,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
size = (int) Math.min(maxBucketOrd(), bucketCountThresholds.getShardSize());
}
BucketPriorityQueue ordered = new BucketPriorityQueue(size, order.comparator(this));
OrdBucket spare = new OrdBucket(-1, 0, null);
OrdBucket spare = new OrdBucket(-1, 0, null, showTermDocCountError, 0);
for (long globalTermOrd = 0; globalTermOrd < globalOrds.getValueCount(); ++globalTermOrd) {
if (includeExclude != null && !acceptedGlobalOrdinals.get(globalTermOrd)) {
continue;
@ -167,7 +167,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
if (bucketCountThresholds.getShardMinDocCount() <= spare.docCount) {
spare = (OrdBucket) ordered.insertWithOverflow(spare);
if (spare == null) {
spare = new OrdBucket(-1, 0, null);
spare = new OrdBucket(-1, 0, null, showTermDocCountError, 0);
}
}
}
@ -180,26 +180,28 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
survivingBucketOrds[i] = bucket.bucketOrd;
BytesRef scratch = new BytesRef();
copy(globalOrds.lookupOrd(bucket.globalOrd), scratch);
list[i] = new StringTerms.Bucket(scratch, bucket.docCount, null);
list[i] = new StringTerms.Bucket(scratch, bucket.docCount, null, showTermDocCountError, 0);
list[i].bucketOrd = bucket.bucketOrd;
}
//replay any deferred collections
runDeferredCollections(survivingBucketOrds);
//Now build the aggs
for (int i = 0; i < list.length; i++) {
Bucket bucket = list[i];
bucket.aggregations = bucket.docCount == 0 ? bucketEmptyAggregations() : bucketAggregations(bucket.bucketOrd);
bucket.docCountError = 0;
}
return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list));
return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list), showTermDocCountError, 0);
}
/** This is used internally only, just for compare using global ordinal instead of term bytes in the PQ */
static class OrdBucket extends InternalTerms.Bucket {
long globalOrd;
OrdBucket(long globalOrd, long docCount, InternalAggregations aggregations) {
super(docCount, aggregations);
OrdBucket(long globalOrd, long docCount, InternalAggregations aggregations, boolean showDocCountError, long docCountError) {
super(docCount, aggregations, showDocCountError, docCountError);
this.globalOrd = globalOrd;
}
@ -224,7 +226,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
}
@Override
Bucket newBucket(long docCount, InternalAggregations aggs) {
Bucket newBucket(long docCount, InternalAggregations aggs, long docCountError) {
throw new UnsupportedOperationException();
}
@ -248,9 +250,9 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
public WithHash(String name, AggregatorFactories factories, ValuesSource.Bytes.WithOrdinals.FieldData valuesSource, long estimatedBucketCount,
long maxOrd, InternalOrder order, BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude, AggregationContext aggregationContext,
Aggregator parent, SubAggCollectionMode collectionMode) {
Aggregator parent, SubAggCollectionMode collectionMode, boolean showTermDocCountError) {
// Set maxOrd to estimatedBucketCount! To be conservative with memory.
super(name, factories, valuesSource, estimatedBucketCount, estimatedBucketCount, order, bucketCountThresholds, includeExclude, aggregationContext, parent, collectionMode);
super(name, factories, valuesSource, estimatedBucketCount, estimatedBucketCount, order, bucketCountThresholds, includeExclude, aggregationContext, parent, collectionMode, showTermDocCountError);
bucketOrds = new LongHash(estimatedBucketCount, aggregationContext.bigArrays());
}
@ -277,17 +279,17 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
public void collect(int doc) throws IOException {
ords.setDocument(doc);
final int numOrds = ords.cardinality();
for (int i = 0; i < numOrds; i++) {
final long globalOrd = ords.ordAt(i);
long bucketOrd = bucketOrds.add(globalOrd);
if (bucketOrd < 0) {
bucketOrd = -1 - bucketOrd;
collectExistingBucket(doc, bucketOrd);
} else {
collectBucket(doc, bucketOrd);
}
}
}
};
}
}
@ -316,8 +318,8 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
private RandomAccessOrds segmentOrds;
public LowCardinality(String name, AggregatorFactories factories, ValuesSource.Bytes.WithOrdinals.FieldData valuesSource, long estimatedBucketCount,
long maxOrd, InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode) {
super(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, null, aggregationContext, parent, collectionMode);
long maxOrd, InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode, boolean showTermDocCountError) {
super(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, null, aggregationContext, parent, collectionMode, showTermDocCountError);
assert factories == null || factories.count() == 0;
this.segmentDocCounts = bigArrays.newIntArray(maxOrd + 1, true);
}
@ -327,7 +329,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
final SortedDocValues singleValues = DocValues.unwrapSingleton(segmentOrds);
if (singleValues != null) {
return new Collector() {
@Override
public void collect(int doc) throws IOException {
final int ord = singleValues.getOrd(doc);
segmentDocCounts.increment(ord + 1, 1);
@ -338,7 +340,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
public void collect(int doc) throws IOException {
segmentOrds.setDocument(doc);
final int numOrds = segmentOrds.cardinality();
for (int i = 0; i < numOrds; i++) {
final long segmentOrd = segmentOrds.ordAt(i);
segmentDocCounts.increment(segmentOrd + 1, 1);
}
@ -356,7 +358,7 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
globalOrds = valuesSource.globalOrdinalsValues();
segmentOrds = valuesSource.ordinalsValues();
collector = newCollector(segmentOrds);
}
}
@Override
protected void doPostCollection() {
@ -432,8 +434,8 @@ public class GlobalOrdinalsStringTermsAggregator extends AbstractStringTermsAggr
if (accepted.get(ord)) {
ords[cardinality++] = ord;
}
}
}
}
@Override
public int cardinality() {

View File

@ -21,6 +21,7 @@ package org.elasticsearch.search.aggregations.bucket.terms;
import com.google.common.collect.ArrayListMultimap;
import com.google.common.collect.Maps;
import com.google.common.collect.Multimap;
import org.elasticsearch.ElasticsearchIllegalStateException;
import org.elasticsearch.common.io.stream.Streamable;
import org.elasticsearch.common.util.BigArrays;
import org.elasticsearch.common.xcontent.ToXContent;
@ -36,16 +37,23 @@ import java.util.*;
*/
public abstract class InternalTerms extends InternalAggregation implements Terms, ToXContent, Streamable {
protected static final String DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME = "doc_count_error_upper_bound";
public static abstract class Bucket extends Terms.Bucket {
long bucketOrd;
protected long docCount;
protected long docCountError;
protected InternalAggregations aggregations;
protected boolean showDocCountError;
protected Bucket(long docCount, InternalAggregations aggregations) {
protected Bucket(long docCount, InternalAggregations aggregations, boolean showDocCountError, long docCountError) {
this.docCount = docCount;
this.aggregations = aggregations;
this.showDocCountError = showDocCountError;
this.docCountError = docCountError;
}
@Override
@ -53,6 +61,13 @@ public abstract class InternalTerms extends InternalAggregation implements Terms
return docCount;
}
public long getDocCountError() {
if (!showDocCountError) {
throw new ElasticsearchIllegalStateException("show_terms_doc_count_error is false");
}
return docCountError;
}
@Override
public Aggregations getAggregations() {
return aggregations;
@ -60,34 +75,48 @@ public abstract class InternalTerms extends InternalAggregation implements Terms
abstract Object getKeyAsObject();
abstract Bucket newBucket(long docCount, InternalAggregations aggs);
abstract Bucket newBucket(long docCount, InternalAggregations aggs, long docCountError);
public Bucket reduce(List<? extends Bucket> buckets, BigArrays bigArrays) {
long docCount = 0;
long docCountError = 0;
List<InternalAggregations> aggregationsList = new ArrayList<>(buckets.size());
for (Bucket bucket : buckets) {
docCount += bucket.docCount;
if (docCountError != -1) {
if (bucket.docCountError == -1) {
docCountError = -1;
} else {
docCountError += bucket.docCountError;
}
}
aggregationsList.add(bucket.aggregations);
}
InternalAggregations aggs = InternalAggregations.reduce(aggregationsList, bigArrays);
return newBucket(docCount, aggs);
return newBucket(docCount, aggs, docCountError);
}
}
protected InternalOrder order;
protected int requiredSize;
protected int shardSize;
protected long minDocCount;
protected Collection<Bucket> buckets;
protected List<Bucket> buckets;
protected Map<String, Bucket> bucketMap;
protected long docCountError;
protected boolean showTermDocCountError;
protected InternalTerms() {} // for serialization
protected InternalTerms(String name, InternalOrder order, int requiredSize, long minDocCount, Collection<Bucket> buckets) {
protected InternalTerms(String name, InternalOrder order, int requiredSize, int shardSize, long minDocCount, List<Bucket> buckets, boolean showTermDocCountError, long docCountError) {
super(name);
this.order = order;
this.requiredSize = requiredSize;
this.shardSize = shardSize;
this.minDocCount = minDocCount;
this.buckets = buckets;
this.showTermDocCountError = showTermDocCountError;
this.docCountError = docCountError;
}
@Override
@ -106,15 +135,37 @@ public abstract class InternalTerms extends InternalAggregation implements Terms
}
return bucketMap.get(term);
}
public long getDocCountError() {
return docCountError;
}
@Override
public InternalAggregation reduce(ReduceContext reduceContext) {
List<InternalAggregation> aggregations = reduceContext.aggregations();
Multimap<Object, InternalTerms.Bucket> buckets = ArrayListMultimap.create();
long sumDocCountError = 0;
for (InternalAggregation aggregation : aggregations) {
InternalTerms terms = (InternalTerms) aggregation;
final long thisAggDocCountError;
if (terms.buckets.size() < this.shardSize || this.order == InternalOrder.TERM_ASC || this.order == InternalOrder.TERM_DESC) {
thisAggDocCountError = 0;
} else if (this.order == InternalOrder.COUNT_DESC) {
thisAggDocCountError = terms.buckets.get(terms.buckets.size() - 1).docCount;
} else {
thisAggDocCountError = -1;
}
if (sumDocCountError != -1) {
if (thisAggDocCountError == -1) {
sumDocCountError = -1;
} else {
sumDocCountError += thisAggDocCountError;
}
}
terms.docCountError = thisAggDocCountError;
for (Bucket bucket : terms.buckets) {
bucket.docCountError = thisAggDocCountError;
buckets.put(bucket.getKeyAsObject(), bucket);
}
}
@ -124,6 +175,13 @@ public abstract class InternalTerms extends InternalAggregation implements Terms
for (Collection<Bucket> l : buckets.asMap().values()) {
List<Bucket> sameTermBuckets = (List<Bucket>) l; // cast is ok according to javadocs
final Bucket b = sameTermBuckets.get(0).reduce(sameTermBuckets, reduceContext.bigArrays());
if (b.docCountError != -1) {
if (sumDocCountError == -1) {
b.docCountError = -1;
} else {
b.docCountError = sumDocCountError - b.docCountError;
}
}
if (b.docCount >= minDocCount) {
ordered.insertWithOverflow(b);
}
@ -132,9 +190,15 @@ public abstract class InternalTerms extends InternalAggregation implements Terms
for (int i = ordered.size() - 1; i >= 0; i--) {
list[i] = (Bucket) ordered.pop();
}
return newAggregation(name, Arrays.asList(list));
long docCountError;
if (sumDocCountError == -1) {
docCountError = -1;
} else {
docCountError = aggregations.size() == 1 ? 0 : sumDocCountError;
}
return newAggregation(name, Arrays.asList(list), showTermDocCountError, docCountError);
}
protected abstract InternalTerms newAggregation(String name, List<Bucket> buckets);
protected abstract InternalTerms newAggregation(String name, List<Bucket> buckets, boolean showTermDocCountError, long docCountError);
}

View File

@ -18,6 +18,7 @@
*/
package org.elasticsearch.search.aggregations.bucket.terms;
import org.elasticsearch.Version;
import org.elasticsearch.common.Nullable;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
@ -31,7 +32,6 @@ import org.elasticsearch.search.aggregations.support.format.ValueFormatterStream
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
/**
@ -59,8 +59,8 @@ public class LongTerms extends InternalTerms {
long term;
public Bucket(long term, long docCount, InternalAggregations aggregations) {
super(docCount, aggregations);
public Bucket(long term, long docCount, InternalAggregations aggregations, boolean showDocCountError, long docCountError) {
super(docCount, aggregations, showDocCountError, docCountError);
this.term = term;
}
@ -90,8 +90,8 @@ public class LongTerms extends InternalTerms {
}
@Override
Bucket newBucket(long docCount, InternalAggregations aggs) {
return new Bucket(term, docCount, aggs);
Bucket newBucket(long docCount, InternalAggregations aggs, long docCountError) {
return new Bucket(term, docCount, aggs, showDocCountError, docCountError);
}
}
@ -99,8 +99,8 @@ public class LongTerms extends InternalTerms {
LongTerms() {} // for serialization
public LongTerms(String name, InternalOrder order, @Nullable ValueFormatter formatter, int requiredSize, long minDocCount, Collection<InternalTerms.Bucket> buckets) {
super(name, order, requiredSize, minDocCount, buckets);
public LongTerms(String name, InternalOrder order, @Nullable ValueFormatter formatter, int requiredSize, int shardSize, long minDocCount, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
super(name, order, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
this.formatter = formatter;
}
@ -110,21 +110,40 @@ public class LongTerms extends InternalTerms {
}
@Override
protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets) {
return new LongTerms(name, order, formatter, requiredSize, minDocCount, buckets);
protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
return new LongTerms(name, order, formatter, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
}
@Override
public void readFrom(StreamInput in) throws IOException {
this.name = in.readString();
if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
this.docCountError = in.readLong();
} else {
this.docCountError = -1;
}
this.order = InternalOrder.Streams.readOrder(in);
this.formatter = ValueFormatterStreams.readOptional(in);
this.requiredSize = readSize(in);
if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
this.shardSize = readSize(in);
this.showTermDocCountError = in.readBoolean();
} else {
this.shardSize = requiredSize;
this.showTermDocCountError = false;
}
this.minDocCount = in.readVLong();
int size = in.readVInt();
List<InternalTerms.Bucket> buckets = new ArrayList<>(size);
for (int i = 0; i < size; i++) {
buckets.add(new Bucket(in.readLong(), in.readVLong(), InternalAggregations.readAggregations(in)));
long term = in.readLong();
long docCount = in.readVLong();
long bucketDocCountError = -1;
if (in.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
bucketDocCountError = in.readLong();
}
InternalAggregations aggregations = InternalAggregations.readAggregations(in);
buckets.add(new Bucket(term, docCount, aggregations, showTermDocCountError, bucketDocCountError));
}
this.buckets = buckets;
this.bucketMap = null;
@ -133,20 +152,31 @@ public class LongTerms extends InternalTerms {
@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeString(name);
if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
out.writeLong(docCountError);
}
InternalOrder.Streams.writeOrder(order, out);
ValueFormatterStreams.writeOptional(formatter, out);
writeSize(requiredSize, out);
if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
writeSize(shardSize, out);
out.writeBoolean(showTermDocCountError);
}
out.writeVLong(minDocCount);
out.writeVInt(buckets.size());
for (InternalTerms.Bucket bucket : buckets) {
out.writeLong(((Bucket) bucket).term);
out.writeVLong(bucket.getDocCount());
if (out.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
out.writeLong(bucket.docCountError);
}
((InternalAggregations) bucket.getAggregations()).writeTo(out);
}
}
@Override
public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, docCountError);
builder.startArray(CommonFields.BUCKETS);
for (InternalTerms.Bucket bucket : buckets) {
builder.startObject();
@ -155,6 +185,9 @@ public class LongTerms extends InternalTerms {
builder.field(CommonFields.KEY_AS_STRING, formatter.format(((Bucket) bucket).term));
}
builder.field(CommonFields.DOC_COUNT, bucket.getDocCount());
if (showTermDocCountError) {
builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, bucket.getDocCountError());
}
((InternalAggregations) bucket.getAggregations()).toXContentInternal(builder, params);
builder.endObject();
}

View File

@ -44,12 +44,14 @@ public class LongTermsAggregator extends TermsAggregator {
protected final ValuesSource.Numeric valuesSource;
protected final @Nullable ValueFormatter formatter;
protected final LongHash bucketOrds;
private boolean showTermDocCountError;
private SortedNumericDocValues values;
public LongTermsAggregator(String name, AggregatorFactories factories, ValuesSource.Numeric valuesSource, @Nullable ValueFormat format, long estimatedBucketCount,
InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode) {
InternalOrder order, BucketCountThresholds bucketCountThresholds, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
super(name, BucketAggregationMode.PER_BUCKET, factories, estimatedBucketCount, aggregationContext, parent, bucketCountThresholds, order, subAggCollectMode);
this.valuesSource = valuesSource;
this.showTermDocCountError = showTermDocCountError;
this.formatter = format != null ? format.formatter() : null;
bucketOrds = new LongHash(estimatedBucketCount, aggregationContext.bigArrays());
}
@ -76,13 +78,13 @@ public class LongTermsAggregator extends TermsAggregator {
for (int i = 0; i < valuesCount; ++i) {
final long val = values.valueAt(i);
if (previous != val || i != 0) {
long bucketOrdinal = bucketOrds.add(val);
if (bucketOrdinal < 0) { // already seen
bucketOrdinal = - 1 - bucketOrdinal;
collectExistingBucket(doc, bucketOrdinal);
} else {
collectBucket(doc, bucketOrdinal);
}
previous = val;
}
}
@ -113,7 +115,7 @@ public class LongTermsAggregator extends TermsAggregator {
LongTerms.Bucket spare = null;
for (long i = 0; i < bucketOrds.size(); i++) {
if (spare == null) {
spare = new LongTerms.Bucket(0, 0, null);
spare = new LongTerms.Bucket(0, 0, null, showTermDocCountError, 0);
}
spare.term = bucketOrds.get(i);
spare.docCount = bucketDocCount(i);
@ -122,9 +124,7 @@ public class LongTermsAggregator extends TermsAggregator {
spare = (LongTerms.Bucket) ordered.insertWithOverflow(spare);
}
}
// Get the top buckets
final InternalTerms.Bucket[] list = new InternalTerms.Bucket[ordered.size()];
long survivingBucketOrds[] = new long[ordered.size()];
@ -139,14 +139,16 @@ public class LongTermsAggregator extends TermsAggregator {
//Now build the aggs
for (int i = 0; i < list.length; i++) {
list[i].aggregations = bucketAggregations(list[i].bucketOrd);
list[i].docCountError = 0;
}
return new LongTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list));
return new LongTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list), showTermDocCountError, 0);
}
@Override
public InternalAggregation buildEmptyAggregation() {
return new LongTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList());
return new LongTerms(name, order, formatter, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList(), showTermDocCountError, 0);
}
@Override

View File

@ -19,6 +19,7 @@
package org.elasticsearch.search.aggregations.bucket.terms;
import org.apache.lucene.util.BytesRef;
import org.elasticsearch.Version;
import org.elasticsearch.common.bytes.BytesArray;
import org.elasticsearch.common.io.stream.StreamInput;
import org.elasticsearch.common.io.stream.StreamOutput;
@ -31,7 +32,6 @@ import org.elasticsearch.search.aggregations.InternalAggregations;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
/**
@ -59,8 +59,8 @@ public class StringTerms extends InternalTerms {
BytesRef termBytes;
public Bucket(BytesRef term, long docCount, InternalAggregations aggregations) {
super(docCount, aggregations);
public Bucket(BytesRef term, long docCount, InternalAggregations aggregations, boolean showDocCountError, long docCountError) {
super(docCount, aggregations, showDocCountError, docCountError);
this.termBytes = term;
}
@ -91,15 +91,15 @@ public class StringTerms extends InternalTerms {
}
@Override
Bucket newBucket(long docCount, InternalAggregations aggs) {
return new Bucket(termBytes, docCount, aggs);
Bucket newBucket(long docCount, InternalAggregations aggs, long docCountError) {
return new Bucket(termBytes, docCount, aggs, showDocCountError, docCountError);
}
}
StringTerms() {} // for serialization
public StringTerms(String name, InternalOrder order, int requiredSize, long minDocCount, Collection<InternalTerms.Bucket> buckets) {
super(name, order, requiredSize, minDocCount, buckets);
public StringTerms(String name, InternalOrder order, int requiredSize, int shardSize, long minDocCount, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
super(name, order, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
}
@Override
@ -108,20 +108,39 @@ public class StringTerms extends InternalTerms {
}
@Override
protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets) {
return new StringTerms(name, order, requiredSize, minDocCount, buckets);
protected InternalTerms newAggregation(String name, List<InternalTerms.Bucket> buckets, boolean showTermDocCountError, long docCountError) {
return new StringTerms(name, order, requiredSize, shardSize, minDocCount, buckets, showTermDocCountError, docCountError);
}
@Override
public void readFrom(StreamInput in) throws IOException {
this.name = in.readString();
if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
this.docCountError = in.readLong();
} else {
this.docCountError = -1;
}
this.order = InternalOrder.Streams.readOrder(in);
this.requiredSize = readSize(in);
if (in.getVersion().onOrAfter(Version.V_1_4_0)) {
this.shardSize = readSize(in);
this.showTermDocCountError = in.readBoolean();
} else {
this.shardSize = requiredSize;
this.showTermDocCountError = false;
}
this.minDocCount = in.readVLong();
int size = in.readVInt();
List<InternalTerms.Bucket> buckets = new ArrayList<>(size);
for (int i = 0; i < size; i++) {
buckets.add(new Bucket(in.readBytesRef(), in.readVLong(), InternalAggregations.readAggregations(in)));
BytesRef termBytes = in.readBytesRef();
long docCount = in.readVLong();
long bucketDocCountError = -1;
if (in.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
bucketDocCountError = in.readLong();
}
InternalAggregations aggregations = InternalAggregations.readAggregations(in);
buckets.add(new Bucket(termBytes, docCount, aggregations, showTermDocCountError, bucketDocCountError));
}
this.buckets = buckets;
this.bucketMap = null;
@ -130,24 +149,38 @@ public class StringTerms extends InternalTerms {
@Override
public void writeTo(StreamOutput out) throws IOException {
out.writeString(name);
if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
out.writeLong(docCountError);
}
InternalOrder.Streams.writeOrder(order, out);
writeSize(requiredSize, out);
if (out.getVersion().onOrAfter(Version.V_1_4_0)) {
writeSize(shardSize, out);
out.writeBoolean(showTermDocCountError);
}
out.writeVLong(minDocCount);
out.writeVInt(buckets.size());
for (InternalTerms.Bucket bucket : buckets) {
out.writeBytesRef(((Bucket) bucket).termBytes);
out.writeVLong(bucket.getDocCount());
if (out.getVersion().onOrAfter(Version.V_1_4_0) && showTermDocCountError) {
out.writeLong(bucket.docCountError);
}
((InternalAggregations) bucket.getAggregations()).writeTo(out);
}
}
@Override
public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, docCountError);
builder.startArray(CommonFields.BUCKETS);
for (InternalTerms.Bucket bucket : buckets) {
builder.startObject();
builder.utf8Field(CommonFields.KEY, ((Bucket) bucket).termBytes);
builder.field(CommonFields.DOC_COUNT, bucket.getDocCount());
if (showTermDocCountError) {
builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, bucket.getDocCountError());
}
((InternalAggregations) bucket.getAggregations()).toXContentInternal(builder, params);
builder.endObject();
}

View File

@ -48,9 +48,9 @@ public class StringTermsAggregator extends AbstractStringTermsAggregator {
public StringTermsAggregator(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
InternalOrder order, BucketCountThresholds bucketCountThresholds,
IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode) {
IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode collectionMode, boolean showTermDocCountError) {
super(name, factories, estimatedBucketCount, aggregationContext, parent, order, bucketCountThresholds, collectionMode);
super(name, factories, estimatedBucketCount, aggregationContext, parent, order, bucketCountThresholds, collectionMode, showTermDocCountError);
this.valuesSource = valuesSource;
this.includeExclude = includeExclude;
bucketOrds = new BytesRefHash(estimatedBucketCount, aggregationContext.bigArrays());
@ -123,7 +123,7 @@ public class StringTermsAggregator extends AbstractStringTermsAggregator {
StringTerms.Bucket spare = null;
for (int i = 0; i < bucketOrds.size(); i++) {
if (spare == null) {
spare = new StringTerms.Bucket(new BytesRef(), 0, null);
spare = new StringTerms.Bucket(new BytesRef(), 0, null, showTermDocCountError, 0);
}
bucketOrds.get(i, spare.termBytes);
spare.docCount = bucketDocCount(i);
@ -142,21 +142,22 @@ public class StringTermsAggregator extends AbstractStringTermsAggregator {
list[i] = bucket;
}
// replay any deferred collections
runDeferredCollections(survivingBucketOrds);
// Now build the aggs
for (int i = 0; i < list.length; i++) {
final StringTerms.Bucket bucket = (StringTerms.Bucket)list[i];
bucket.termBytes = BytesRef.deepCopyOf(bucket.termBytes);
bucket.aggregations = bucketAggregations(bucket.bucketOrd);
bucket.docCountError = 0;
}
return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list));
return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Arrays.asList(list), showTermDocCountError, 0);
}
@Override
public InternalAggregation buildEmptyAggregation() {
return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList());
return new StringTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount(), Collections.<InternalTerms.Bucket>emptyList(), showTermDocCountError, 0);
}
@Override

View File

@ -65,12 +65,16 @@ public interface Terms extends MultiBucketsAggregation {
public abstract Number getKeyAsNumber();
abstract int compareTerm(Terms.Bucket other);
public abstract long getDocCountError();
}
Collection<Bucket> getBuckets();
Bucket getBucketByKey(String term);
long getDocCountError();
/**
* Determines the order by which the term buckets will be sorted

View File

@ -41,8 +41,8 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
@Override
Aggregator create(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
long maxOrd, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude,
AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode) {
return new StringTermsAggregator(name, factories, valuesSource, estimatedBucketCount, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode);
AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
return new StringTermsAggregator(name, factories, valuesSource, estimatedBucketCount, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
}
@Override
@ -56,8 +56,8 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
@Override
Aggregator create(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
long maxOrd, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude,
AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode) {
return new GlobalOrdinalsStringTermsAggregator(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode);
AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
return new GlobalOrdinalsStringTermsAggregator(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
}
@Override
@ -71,8 +71,8 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
@Override
Aggregator create(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
long maxOrd, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude,
AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode) {
return new GlobalOrdinalsStringTermsAggregator.WithHash(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode);
AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
return new GlobalOrdinalsStringTermsAggregator.WithHash(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
}
@Override
@ -85,11 +85,11 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
@Override
Aggregator create(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
long maxOrd, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude,
AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode) {
AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError) {
if (includeExclude != null || factories.count() > 0) {
return GLOBAL_ORDINALS.create(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode);
return GLOBAL_ORDINALS.create(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
}
return new GlobalOrdinalsStringTermsAggregator.LowCardinality(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode);
return new GlobalOrdinalsStringTermsAggregator.LowCardinality(name, factories, (ValuesSource.Bytes.WithOrdinals.FieldData) valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
}
@Override
@ -115,7 +115,7 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
abstract Aggregator create(String name, AggregatorFactories factories, ValuesSource valuesSource, long estimatedBucketCount,
long maxOrd, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds,
IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode);
IncludeExclude includeExclude, AggregationContext aggregationContext, Aggregator parent, SubAggCollectionMode subAggCollectMode, boolean showTermDocCountError);
abstract boolean needsGlobalOrdinals();
@ -130,19 +130,21 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
private final String executionHint;
private SubAggCollectionMode subAggCollectMode;
private final TermsAggregator.BucketCountThresholds bucketCountThresholds;
private boolean showTermDocCountError;
public TermsAggregatorFactory(String name, ValuesSourceConfig config, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude, String executionHint, SubAggCollectionMode executionMode) {
public TermsAggregatorFactory(String name, ValuesSourceConfig config, InternalOrder order, TermsAggregator.BucketCountThresholds bucketCountThresholds, IncludeExclude includeExclude, String executionHint, SubAggCollectionMode executionMode, boolean showTermDocCountError) {
super(name, StringTerms.TYPE.name(), config);
this.order = order;
this.includeExclude = includeExclude;
this.executionHint = executionHint;
this.bucketCountThresholds = bucketCountThresholds;
this.subAggCollectMode = executionMode;
this.showTermDocCountError = showTermDocCountError;
}
@Override
protected Aggregator createUnmapped(AggregationContext aggregationContext, Aggregator parent) {
final InternalAggregation aggregation = new UnmappedTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getMinDocCount());
final InternalAggregation aggregation = new UnmappedTerms(name, order, bucketCountThresholds.getRequiredSize(), bucketCountThresholds.getShardSize(), bucketCountThresholds.getMinDocCount());
return new NonCollectingAggregator(name, aggregationContext, parent) {
@Override
public InternalAggregation buildEmptyAggregation() {
@@ -226,7 +228,7 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
assert execution != null;
valuesSource.setNeedsGlobalOrdinals(execution.needsGlobalOrdinals());
return execution.create(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode);
return execution.create(name, factories, valuesSource, estimatedBucketCount, maxOrd, order, bucketCountThresholds, includeExclude, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
}
if (includeExclude != null) {
@@ -236,9 +238,9 @@ public class TermsAggregatorFactory extends ValuesSourceAggregatorFactory {
if (valuesSource instanceof ValuesSource.Numeric) {
if (((ValuesSource.Numeric) valuesSource).isFloatingPoint()) {
return new DoubleTermsAggregator(name, factories, (ValuesSource.Numeric) valuesSource, config.format(), estimatedBucketCount, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode);
return new DoubleTermsAggregator(name, factories, (ValuesSource.Numeric) valuesSource, config.format(), estimatedBucketCount, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
}
return new LongTermsAggregator(name, factories, (ValuesSource.Numeric) valuesSource, config.format(), estimatedBucketCount, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode);
return new LongTermsAggregator(name, factories, (ValuesSource.Numeric) valuesSource, config.format(), estimatedBucketCount, order, bucketCountThresholds, aggregationContext, parent, subAggCollectMode, showTermDocCountError);
}
throw new AggregationExecutionException("terms aggregation cannot be applied to field [" + config.fieldContext().field() +

View File

@@ -42,6 +42,7 @@ public class TermsBuilder extends ValuesSourceAggregationBuilder<TermsBuilder> {
private int excludeFlags;
private String executionHint;
private SubAggCollectionMode collectionMode;
private Boolean showTermDocCountError;
public TermsBuilder(String name) {
super(name, "terms");
@@ -149,12 +150,20 @@ public class TermsBuilder extends ValuesSourceAggregationBuilder<TermsBuilder> {
this.collectionMode = mode;
return this;
}
/**
 * When set to true, each returned term reports an upper bound on the error in its
 * document count.
 */
public TermsBuilder showTermDocCountError(boolean showTermDocCountError) {
this.showTermDocCountError = showTermDocCountError;
return this;
}
@Override
protected XContentBuilder doInternalXContent(XContentBuilder builder, Params params) throws IOException {
bucketCountThresholds.toXContent(builder);
if (showTermDocCountError != null) {
builder.field(AbstractTermsParametersParser.SHOW_TERM_DOC_COUNT_ERROR.getPreferredName(), showTermDocCountError);
}
if (executionHint != null) {
builder.field(AbstractTermsParametersParser.EXECUTION_HINT_FIELD_NAME.getPreferredName(), executionHint);
}
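The builder change above lets Java clients opt in per request. A minimal sketch of such a request, assuming a `Client` instance named `client`; the index and field names are illustrative:

[source,java]
--------------------------------------------------
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.search.aggregations.bucket.terms.TermsBuilder;

SearchResponse response = client.prepareSearch("products")
        .addAggregation(new TermsBuilder("top_products")
                .field("product")
                .size(5)
                .showTermDocCountError(true)) // request the per-term error upper bound
        .execute().actionGet();
--------------------------------------------------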

View File

@@ -38,9 +38,14 @@ public class TermsParametersParser extends AbstractTermsParametersParser {
public boolean isOrderAsc() {
return orderAsc;
}
public boolean showTermDocCountError() {
return showTermDocCountError;
}
String orderKey = "_count";
boolean orderAsc = false;
private boolean showTermDocCountError = false;
@Override
public void parseSpecial(String aggregationName, XContentParser parser, SearchContext context, XContentParser.Token token, String currentFieldName) throws IOException {
@@ -65,6 +70,10 @@ public class TermsParametersParser extends AbstractTermsParametersParser {
} else {
throw new SearchParseException(context, "Unknown key for a " + token + " in [" + aggregationName + "]: [" + currentFieldName + "].");
}
} else if (token == XContentParser.Token.VALUE_BOOLEAN) {
// show_term_doc_count_error is the only boolean parameter handled here;
// unrecognised boolean keys fall through silently rather than being rejected.
if (SHOW_TERM_DOC_COUNT_ERROR.match(currentFieldName)) {
showTermDocCountError = parser.booleanValue();
}
} else {
throw new SearchParseException(context, "Unknown key for a " + token + " in [" + aggregationName + "]: [" + currentFieldName + "].");
}
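On the REST layer, the parser above reads the option as a top-level boolean in the `terms` body. A hedged request sketch, assuming the parse field's preferred name is `show_term_doc_count_error` (index and field names are illustrative):

[source,js]
--------------------------------------------------
{
    "aggs" : {
        "products" : {
            "terms" : {
                "field" : "product",
                "size" : 5,
                "show_term_doc_count_error" : true
            }
        }
    }
}
--------------------------------------------------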

View File

@@ -55,8 +55,7 @@ public class TermsParser implements Aggregator.Parser {
context.numberOfShards()));
}
bucketCountThresholds.ensureValidity();
return new TermsAggregatorFactory(aggregationName, vsParser.config(), order, bucketCountThresholds, aggParser.getIncludeExclude(),
aggParser.getExecutionHint(), aggParser.getCollectionMode());
return new TermsAggregatorFactory(aggregationName, vsParser.config(), order, bucketCountThresholds, aggParser.getIncludeExclude(), aggParser.getExecutionHint(), aggParser.getCollectionMode(), aggParser.showTermDocCountError());
}
static InternalOrder resolveOrder(String key, boolean asc) {

View File

@@ -25,7 +25,6 @@ import org.elasticsearch.search.aggregations.AggregationStreams;
import org.elasticsearch.search.aggregations.InternalAggregation;
import java.io.IOException;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Map;
@@ -37,7 +36,7 @@ public class UnmappedTerms extends InternalTerms {
public static final Type TYPE = new Type("terms", "umterms");
private static final Collection<Bucket> BUCKETS = Collections.emptyList();
private static final List<Bucket> BUCKETS = Collections.emptyList();
private static final Map<String, Bucket> BUCKETS_MAP = Collections.emptyMap();
public static final AggregationStreams.Stream STREAM = new AggregationStreams.Stream() {
@@ -55,8 +54,8 @@ public class UnmappedTerms extends InternalTerms {
UnmappedTerms() {} // for serialization
public UnmappedTerms(String name, InternalOrder order, int requiredSize, long minDocCount) {
super(name, order, requiredSize, minDocCount, BUCKETS);
public UnmappedTerms(String name, InternalOrder order, int requiredSize, int shardSize, long minDocCount) {
super(name, order, requiredSize, shardSize, minDocCount, BUCKETS, false, 0);
}
@Override
@@ -67,6 +66,7 @@ public class UnmappedTerms extends InternalTerms {
@Override
public void readFrom(StreamInput in) throws IOException {
this.name = in.readString();
this.docCountError = 0;
this.order = InternalOrder.Streams.readOrder(in);
this.requiredSize = readSize(in);
this.minDocCount = in.readVLong();
@@ -93,12 +93,13 @@
}
@Override
protected InternalTerms newAggregation(String name, List<Bucket> buckets) {
protected InternalTerms newAggregation(String name, List<Bucket> buckets, boolean showTermDocCountError, long docCountError) {
throw new UnsupportedOperationException("How did you get there?");
}
@Override
public XContentBuilder doXContentBody(XContentBuilder builder, Params params) throws IOException {
builder.field(InternalTerms.DOC_COUNT_ERROR_UPPER_BOUND_FIELD_NAME, docCountError);
builder.startArray(CommonFields.BUCKETS).endArray();
return builder;
}
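The serialization above always reports `0` for an unmapped aggregation, since with no terms there can be no error. For a mapped field the same key carries the computed bound and is written ahead of the buckets; an illustrative response fragment (values are invented):

[source,js]
--------------------------------------------------
{
    "aggregations" : {
        "products" : {
            "doc_count_error_upper_bound" : 46,
            "buckets" : [
                {
                    "key" : "Product A",
                    "doc_count" : 100
                }
            ]
        }
    }
}
--------------------------------------------------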

View File

@@ -251,9 +251,9 @@ public class SignificantTermsSignificanceScoreTests extends ElasticsearchIntegrationTest {
classes.toXContent(responseBuilder, null);
String result = null;
if (type.equals("long")) {
result = "\"class\"{\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"buckets\":[{\"key\":0,\"key_as_string\":\"0\",\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":5}]}},{\"key\":\"1\",\"doc_count\":3,\"sig_terms\":{\"doc_count\":3,\"buckets\":[{\"key\":1,\"key_as_string\":\"1\",\"doc_count\":3,\"score\":0.75,\"bg_count\":4}]}}]}";
result = "\"class\"{\"doc_count_error_upper_bound\":0,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"buckets\":[{\"key\":0,\"key_as_string\":\"0\",\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":5}]}},{\"key\":\"1\",\"doc_count\":3,\"sig_terms\":{\"doc_count\":3,\"buckets\":[{\"key\":1,\"key_as_string\":\"1\",\"doc_count\":3,\"score\":0.75,\"bg_count\":4}]}}]}";
} else {
result = "\"class\"{\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":5}]}},{\"key\":\"1\",\"doc_count\":3,\"sig_terms\":{\"doc_count\":3,\"buckets\":[{\"key\":\"1\",\"doc_count\":3,\"score\":0.75,\"bg_count\":4}]}}]}";
result = "\"class\"{\"doc_count_error_upper_bound\":0,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"sig_terms\":{\"doc_count\":4,\"buckets\":[{\"key\":\"0\",\"doc_count\":4,\"score\":0.39999999999999997,\"bg_count\":5}]}},{\"key\":\"1\",\"doc_count\":3,\"sig_terms\":{\"doc_count\":3,\"buckets\":[{\"key\":\"1\",\"doc_count\":3,\"score\":0.75,\"bg_count\":4}]}}]}";
}
assertThat(responseBuilder.string(), equalTo(result));

View File

@@ -36,7 +36,9 @@ import org.elasticsearch.search.aggregations.bucket.terms.TermsBuilder;
import org.elasticsearch.test.ElasticsearchIntegrationTest;
import org.junit.Test;
import java.util.*;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Set;
import static org.elasticsearch.cluster.metadata.IndexMetaData.SETTING_NUMBER_OF_REPLICAS;
import static org.elasticsearch.cluster.metadata.IndexMetaData.SETTING_NUMBER_OF_SHARDS;
@@ -213,8 +215,8 @@ public class SignificantTermsTests extends ElasticsearchIntegrationTest {
}
}
assertTrue(hasMissingBackgroundTerms);
}
}
@Test
public void filteredAnalysis() throws Exception {
@@ -276,8 +278,8 @@ public class SignificantTermsTests extends ElasticsearchIntegrationTest {
.setQuery(new TermQueryBuilder("_all", "terje"))
.setFrom(0).setSize(60).setExplain(true)
.addAggregation(new SignificantTermsBuilder("mySignificantTerms").field("description")
.executionHint(randomExecutionHint())
.minDocCount(2))
.executionHint(randomExecutionHint())
.minDocCount(2))
.execute()
.actionGet();
assertSearchResponse(response);

View File

@@ -0,0 +1,945 @@
/*
* Licensed to Elasticsearch under one or more contributor
* license agreements. See the NOTICE file distributed with
* this work for additional information regarding copyright
* ownership. Elasticsearch licenses this file to you under
* the Apache License, Version 2.0 (the "License"); you may
* not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing,
* software distributed under the License is distributed on an
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
* KIND, either express or implied. See the License for the
* specific language governing permissions and limitations
* under the License.
*/
package org.elasticsearch.search.aggregations.bucket;
import org.elasticsearch.action.index.IndexRequestBuilder;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.cluster.metadata.IndexMetaData;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.search.aggregations.Aggregator.SubAggCollectionMode;
import org.elasticsearch.search.aggregations.bucket.terms.Terms;
import org.elasticsearch.search.aggregations.bucket.terms.Terms.Bucket;
import org.elasticsearch.search.aggregations.bucket.terms.Terms.Order;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregatorFactory.ExecutionMode;
import org.elasticsearch.test.ElasticsearchIntegrationTest;
import org.junit.Test;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
import static org.elasticsearch.search.aggregations.AggregationBuilders.sum;
import static org.elasticsearch.search.aggregations.AggregationBuilders.terms;
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked;
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertSearchResponse;
import static org.hamcrest.Matchers.*;
import static org.hamcrest.core.IsNull.notNullValue;
@ElasticsearchIntegrationTest.SuiteScopeTest
public class TermsDocCountErrorTests extends ElasticsearchIntegrationTest {
private static final String STRING_FIELD_NAME = "s_value";
private static final String LONG_FIELD_NAME = "l_value";
private static final String DOUBLE_FIELD_NAME = "d_value";
private static final String ROUTING_FIELD_NAME = "route";
public static String randomExecutionHint() {
return randomBoolean() ? null : randomFrom(ExecutionMode.values()).toString();
}
private static int numRoutingValues;
@Override
public void setupSuiteScopeCluster() throws Exception {
createIndex("idx");
List<IndexRequestBuilder> builders = new ArrayList<>();
int numDocs = between(10, 200);
int numUniqueTerms = between(2, numDocs / 2);
for (int i = 0; i < numDocs; i++) {
builders.add(client().prepareIndex("idx", "type", ""+i).setSource(jsonBuilder()
.startObject()
.field(STRING_FIELD_NAME, "val" + randomInt(numUniqueTerms))
.field(LONG_FIELD_NAME, randomInt(numUniqueTerms))
.field(DOUBLE_FIELD_NAME, 1.0 * randomInt(numUniqueTerms))
.endObject()));
}
assertAcked(prepareCreate("idx_single_shard").setSettings(ImmutableSettings.builder().put(IndexMetaData.SETTING_NUMBER_OF_SHARDS, 1)));
for (int i = 0; i < numDocs; i++) {
builders.add(client().prepareIndex("idx_single_shard", "type", ""+i).setSource(jsonBuilder()
.startObject()
.field(STRING_FIELD_NAME, "val" + randomInt(numUniqueTerms))
.field(LONG_FIELD_NAME, randomInt(numUniqueTerms))
.field(DOUBLE_FIELD_NAME, 1.0 * randomInt(numUniqueTerms))
.endObject()));
}
numRoutingValues = between(1, 40);
assertAcked(prepareCreate("idx_with_routing").addMapping("type", "{ \"type\" : { \"_routing\" : { \"required\" : true, \"path\" : \"" + ROUTING_FIELD_NAME + "\" } } }"));
for (int i = 0; i < numDocs; i++) {
builders.add(client().prepareIndex("idx_with_routing", "type", ""+i).setSource(jsonBuilder()
.startObject()
.field(STRING_FIELD_NAME, "val" + randomInt(numUniqueTerms))
.field(LONG_FIELD_NAME, randomInt(numUniqueTerms))
.field(DOUBLE_FIELD_NAME, 1.0 * randomInt(numUniqueTerms))
.field(ROUTING_FIELD_NAME, String.valueOf(randomInt(numRoutingValues)))
.endObject()));
}
indexRandom(true, builders);
ensureSearchable();
}
private void assertDocCountErrorWithinBounds(int size, SearchResponse accurateResponse, SearchResponse testResponse) {
Terms accurateTerms = accurateResponse.getAggregations().get("terms");
assertThat(accurateTerms, notNullValue());
assertThat(accurateTerms.getName(), equalTo("terms"));
assertThat(accurateTerms.getDocCountError(), equalTo(0L));
Terms testTerms = testResponse.getAggregations().get("terms");
assertThat(testTerms, notNullValue());
assertThat(testTerms.getName(), equalTo("terms"));
assertThat(testTerms.getDocCountError(), greaterThanOrEqualTo(0L));
Collection<Bucket> testBuckets = testTerms.getBuckets();
assertThat(testBuckets.size(), lessThanOrEqualTo(size));
assertThat(accurateTerms.getBuckets().size(), greaterThanOrEqualTo(testBuckets.size()));
// Each returned bucket's error must be covered by the aggregation-level error, and the
// accurate count must lie within [docCount - error, docCount + error].
for (Terms.Bucket testBucket : testBuckets) {
assertThat(testBucket, notNullValue());
Terms.Bucket accurateBucket = accurateTerms.getBucketByKey(testBucket.getKey());
assertThat(accurateBucket, notNullValue());
assertThat(accurateBucket.getDocCountError(), equalTo(0L));
assertThat(testBucket.getDocCountError(), lessThanOrEqualTo(testTerms.getDocCountError()));
assertThat(testBucket.getDocCount() + testBucket.getDocCountError(), greaterThanOrEqualTo(accurateBucket.getDocCount()));
assertThat(testBucket.getDocCount() - testBucket.getDocCountError(), lessThanOrEqualTo(accurateBucket.getDocCount()));
}
// Any term absent from the test response must have a true count no larger than the
// aggregation-level error, otherwise it should have been returned.
for (Terms.Bucket accurateBucket : accurateTerms.getBuckets()) {
assertThat(accurateBucket, notNullValue());
Terms.Bucket testBucket = testTerms.getBucketByKey(accurateBucket.getKey());
if (testBucket == null) {
assertThat(accurateBucket.getDocCount(), lessThanOrEqualTo(testTerms.getDocCountError()));
}
}
}
private void assertNoDocCountError(int size, SearchResponse accurateResponse, SearchResponse testResponse) {
Terms accurateTerms = accurateResponse.getAggregations().get("terms");
assertThat(accurateTerms, notNullValue());
assertThat(accurateTerms.getName(), equalTo("terms"));
assertThat(accurateTerms.getDocCountError(), equalTo(0L));
Terms testTerms = testResponse.getAggregations().get("terms");
assertThat(testTerms, notNullValue());
assertThat(testTerms.getName(), equalTo("terms"));
assertThat(testTerms.getDocCountError(), equalTo(0L));
Collection<Bucket> testBuckets = testTerms.getBuckets();
assertThat(testBuckets.size(), lessThanOrEqualTo(size));
assertThat(accurateTerms.getBuckets().size(), greaterThanOrEqualTo(testBuckets.size()));
for (Terms.Bucket testBucket : testBuckets) {
assertThat(testBucket, notNullValue());
Terms.Bucket accurateBucket = accurateTerms.getBucketByKey(testBucket.getKey());
assertThat(accurateBucket, notNullValue());
assertThat(accurateBucket.getDocCountError(), equalTo(0L));
assertThat(testBucket.getDocCountError(), equalTo(0L));
}
}
private void assertNoDocCountErrorSingleResponse(int size, SearchResponse testResponse) {
Terms testTerms = testResponse.getAggregations().get("terms");
assertThat(testTerms, notNullValue());
assertThat(testTerms.getName(), equalTo("terms"));
assertThat(testTerms.getDocCountError(), equalTo(0L));
Collection<Bucket> testBuckets = testTerms.getBuckets();
assertThat(testBuckets.size(), lessThanOrEqualTo(size));
for (Terms.Bucket testBucket : testBuckets) {
assertThat(testBucket, notNullValue());
assertThat(testBucket.getDocCountError(), equalTo(0L));
}
}
private void assertUnboundedDocCountError(int size, SearchResponse accurateResponse, SearchResponse testResponse) {
Terms accurateTerms = accurateResponse.getAggregations().get("terms");
assertThat(accurateTerms, notNullValue());
assertThat(accurateTerms.getName(), equalTo("terms"));
assertThat(accurateTerms.getDocCountError(), equalTo(0L));
Terms testTerms = testResponse.getAggregations().get("terms");
assertThat(testTerms, notNullValue());
assertThat(testTerms.getName(), equalTo("terms"));
assertThat(testTerms.getDocCountError(), anyOf(equalTo(-1L), equalTo(0L)));
Collection<Bucket> testBuckets = testTerms.getBuckets();
assertThat(testBuckets.size(), lessThanOrEqualTo(size));
assertThat(accurateTerms.getBuckets().size(), greaterThanOrEqualTo(testBuckets.size()));
for (Terms.Bucket testBucket : testBuckets) {
assertThat(testBucket, notNullValue());
Terms.Bucket accurateBucket = accurateTerms.getBucketByKey(testBucket.getKey());
assertThat(accurateBucket, notNullValue());
assertThat(accurateBucket.getDocCountError(), equalTo(0L));
assertThat(testBucket.getDocCountError(), anyOf(equalTo(-1L), equalTo(0L)));
}
}
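Taken together, the three helpers encode the expected regimes: a non-negative bound when ordering by descending count across several shards, exactly `0` for single-shard (or routed) searches and term-ordered requests, and `-1` (or `0` when provably exact) when ordering by ascending count or by a sub-aggregation. A hypothetical client-side helper, not part of this commit, showing how those values might be interpreted:

[source,java]
--------------------------------------------------
import org.elasticsearch.search.aggregations.bucket.terms.Terms;

// Hypothetical utility, assuming getDocCountError() returns -1 when the error is unbounded.
public final class DocCountBounds {
    private DocCountBounds() {}

    /** Worst-case true doc count for a bucket, or Long.MAX_VALUE if unbounded. */
    public static long upperBound(Terms.Bucket bucket) {
        long error = bucket.getDocCountError();
        return error < 0 ? Long.MAX_VALUE : bucket.getDocCount() + error;
    }
}
--------------------------------------------------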
@Test
public void stringValueField() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertDocCountErrorWithinBounds(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_singleShard() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_withRouting() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse testResponse = client().prepareSearch("idx_with_routing").setTypes("type").setRouting(String.valueOf(between(1, numRoutingValues)))
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountErrorSingleResponse(size, testResponse);
}
@Test
public void stringValueField_docCountAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_termSortAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_termSortDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_subAggAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void stringValueField_subAggDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(STRING_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertDocCountErrorWithinBounds(size, accurateResponse, testResponse);
}
@Test
public void longValueField_singleShard() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField_withRouting() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse testResponse = client().prepareSearch("idx_with_routing").setTypes("type").setRouting(String.valueOf(between(1, numRoutingValues)))
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountErrorSingleResponse(size, testResponse);
}
@Test
public void longValueField_docCountAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField_termSortAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField_termSortDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField_subAggAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void longValueField_subAggDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(DOUBLE_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(LONG_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(DOUBLE_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertDocCountErrorWithinBounds(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_singleShard() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_withRouting() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse testResponse = client().prepareSearch("idx_with_routing").setTypes("type").setRouting(String.valueOf(between(1, numRoutingValues)))
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountErrorSingleResponse(size, testResponse);
}
@Test
public void doubleValueField_docCountAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.count(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_termSortAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(true))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_termSortDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.term(false))
.collectMode(randomFrom(SubAggCollectionMode.values())))
.execute().actionGet();
assertSearchResponse(testResponse);
assertNoDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_subAggAsc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", true))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
@Test
public void doubleValueField_subAggDesc() throws Exception {
int size = randomIntBetween(1, 20);
int shardSize = randomIntBetween(size, size * 2);
SearchResponse accurateResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(0)
.shardSize(0)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(accurateResponse);
SearchResponse testResponse = client().prepareSearch("idx_single_shard").setTypes("type")
.addAggregation(terms("terms")
.executionHint(randomExecutionHint())
.field(DOUBLE_FIELD_NAME)
.showTermDocCountError(true)
.size(size)
.shardSize(shardSize)
.order(Order.aggregation("sortAgg", false))
.collectMode(randomFrom(SubAggCollectionMode.values()))
.subAggregation(sum("sortAgg").field(LONG_FIELD_NAME)))
.execute().actionGet();
assertSearchResponse(testResponse);
assertUnboundedDocCountError(size, accurateResponse, testResponse);
}
}