Use CoveringQuery to select percolate candidate matches and
extract all clauses from a conjunction query.

When clauses are extracted from a conjunction, the number of clauses is
also stored in an internal doc values field (minimum_should_match_field).
This field is used by the CoveringQuery and allows the percolator to
reduce the number of false positives when selecting candidate matches and,
in certain cases, to be certain that a conjunction candidate match will
match, so that MemoryIndex validation can be skipped. This can greatly
improve performance.
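
To illustrate the mechanism, here is a minimal, hypothetical sketch of
candidate selection with CoveringQuery; the field names and the helper
class are placeholders rather than the mapper's actual wiring:

[source,java]
--------------------------------------------------
import java.util.ArrayList;
import java.util.List;

import org.apache.lucene.index.Term;
import org.apache.lucene.search.CoveringQuery;
import org.apache.lucene.search.LongValuesSource;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

class CandidateQuerySketch {

    // "extracted_terms" and "minimum_should_match_field" are illustrative names.
    static Query buildCandidateQuery(List<String> documentTerms) {
        // One clause per term extracted from the document being percolated.
        List<Query> clauses = new ArrayList<>();
        for (String term : documentTerms) {
            clauses.add(new TermQuery(new Term("extracted_terms", term)));
        }
        // Each indexed percolator query stores how many of its conjunction
        // clauses were extracted; CoveringQuery only matches a percolator
        // query document if at least that many of the clauses above match it.
        LongValuesSource minimumNumberMatch =
            LongValuesSource.fromIntField("minimum_should_match_field");
        return new CoveringQuery(clauses, minimumNumberMatch);
    }
}
--------------------------------------------------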

Before this change only a single clause was extracted from a conjunction
query. The percolator tried to extract the clause that was likely to be
the rarest (based on term length), so that fewer candidate queries would
be selected in the first place. However, with this method there is still
a very high chance that candidate query matches are false positives.
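
For comparison, here is a hypothetical sketch of that old heuristic; it
mirrors the longest-term selection that remains in the pre-6.1 code path
of QueryAnalyzer, but the class and method names are illustrative:

[source,java]
--------------------------------------------------
import java.util.List;

import org.apache.lucene.index.Term;

class LongestTermSketch {

    // Keep only the single term of a conjunction that is longest in bytes,
    // on the assumption that longer terms are rarer and therefore select
    // fewer candidate queries.
    static Term pickLongest(List<Term> conjunctionTerms) {
        Term longest = conjunctionTerms.get(0);
        for (Term term : conjunctionTerms) {
            if (longest.bytes().length < term.bytes().length) {
                longest = term;
            }
        }
        return longest;
    }
}
--------------------------------------------------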

This change also removes the influencing query extraction added via #26081,
as it is no longer needed now that all conjunction clauses are extracted.

https://www.elastic.co/guide/en/elasticsearch/reference/6.x/percolator.html#_influencing_query_extraction

Closes #26307
Martijn van Groningen 2017-11-05 10:39:02 +01:00
parent 06ff92d237
commit b4048b4e7f
12 changed files with 1095 additions and 649 deletions


@ -59,69 +59,6 @@ Fields referred in a percolator query may exist in any type of the index contain
=====================================
[float]
==== Influencing query extraction
As part of indexing the percolator query, the percolator field mapper extracts the query terms and numeric ranges from the provided
query and indexes that alongside the query in separate internal fields. The `percolate` query uses these internal fields
to build a candidate query from the document being percolated, in order to reduce the number of documents that need to be verified.
If a percolator query contains a `bool` query with must or filter clauses, then the percolator field mapper only has to
extract ranges or terms from a single clause. The percolator field mapper prefers longer terms over shorter terms, because
longer terms generally match fewer documents. For the same reason it prefers smaller ranges over bigger ranges.
In general this behaviour works well. However, sometimes there are fields in a bool query that shouldn't be taken into account
when selecting the best must or filter clause, or fields that are known to be more selective than other fields.
For example, a status-like field may not work well, because each status value matches many percolator queries, and
then the candidate query that the `percolate` query generates may not be able to filter out many percolator queries.
The percolator field mapping allows configuring `boost_fields` in order to indicate to the percolator which fields are
important and which are not when selecting the best must or filter clause in a `bool` query:
[source,js]
--------------------------------------------------
PUT another_index
{
"mappings": {
"doc": {
"properties": {
"query": {
"type": "percolator",
"boost_fields": {
"status_field": 0, <1>
"price_field": 2 <2>
}
},
"status_field": {
"type": "keyword"
},
"price_field": {
"type": "long"
},
"field": {
"type": "text"
}
}
}
}
}
--------------------------------------------------
// CONSOLE
<1> A boost of zero hints to the percolator that if there are other clauses in a conjunction query then these should be
preferred over this one.
<2> Any boost higher than 1 overrides the default behaviour when it comes to selecting the best clause. The clause
that has the field with the highest boost will be selected from a conjunction query for extraction.
The percolator field mapper takes the following steps when selecting a clause from a conjunction query:
* If there are clauses with boosted fields, then the clause whose field has the highest boost is selected.
* If there are both range based and term based clauses, then term based clauses are picked over range based clauses.
* From all term based clauses, the clause with the longest term is picked.
* If there are only range based clauses, then the clause with the smallest range is picked over clauses with wider ranges.
[float]
==== Reindexing your percolator queries


@ -183,6 +183,10 @@ final class PercolateQuery extends Query implements Accountable {
return queryStore;
}
Query getCandidateMatchesQuery() {
return candidateMatchesQuery;
}
// Comparing identity here to avoid being cached
// Note that in theory if the same instance gets used multiple times it could still get cached,
// however since we create a new query instance each time we use this query, this shouldn't happen and thus


@ -639,7 +639,7 @@ public class PercolateQueryBuilder extends AbstractQueryBuilder<PercolateQueryBu
String name = this.name != null ? this.name : field;
PercolatorFieldMapper.FieldType pft = (PercolatorFieldMapper.FieldType) fieldType;
PercolateQuery.QueryStore queryStore = createStore(pft.queryBuilderField, percolateShardContext, mapUnmappedFieldsAsString);
return pft.percolateQuery(name, queryStore, documents, docSearcher);
return pft.percolateQuery(name, queryStore, documents, docSearcher, context.indexVersionCreated());
}
public String getField() {


@ -20,6 +20,7 @@ package org.elasticsearch.percolator;
import org.apache.lucene.document.BinaryRange;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.index.DocValuesType;
import org.apache.lucene.index.FieldInfo;
import org.apache.lucene.index.IndexOptions;
@ -30,10 +31,12 @@ import org.apache.lucene.index.PointValues;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.Terms;
import org.apache.lucene.index.TermsEnum;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.CoveringQuery;
import org.apache.lucene.search.DocValuesFieldExistsQuery;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.LongValuesSource;
import org.apache.lucene.search.MatchNoDocsQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermInSetQuery;
@ -44,6 +47,7 @@ import org.elasticsearch.Version;
import org.elasticsearch.action.support.PlainActionFuture;
import org.elasticsearch.common.ParsingException;
import org.elasticsearch.common.bytes.BytesReference;
import org.elasticsearch.common.collect.Tuple;
import org.elasticsearch.common.hash.MurmurHash3;
import org.elasticsearch.common.io.stream.OutputStreamStreamOutput;
import org.elasticsearch.common.logging.DeprecationLogger;
@ -62,6 +66,7 @@ import org.elasticsearch.index.mapper.KeywordFieldMapper;
import org.elasticsearch.index.mapper.MappedFieldType;
import org.elasticsearch.index.mapper.Mapper;
import org.elasticsearch.index.mapper.MapperParsingException;
import org.elasticsearch.index.mapper.NumberFieldMapper;
import org.elasticsearch.index.mapper.ParseContext;
import org.elasticsearch.index.mapper.RangeFieldMapper;
import org.elasticsearch.index.mapper.RangeFieldMapper.RangeType;
@ -87,9 +92,6 @@ import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import static org.elasticsearch.common.xcontent.support.XContentMapValues.isObject;
import static org.elasticsearch.common.xcontent.support.XContentMapValues.nodeFloatValue;
import static org.elasticsearch.common.xcontent.support.XContentMapValues.nodeStringValue;
import static org.elasticsearch.index.query.AbstractQueryBuilder.parseInnerQueryBuilder;
public class PercolatorFieldMapper extends FieldMapper {
@ -113,11 +115,11 @@ public class PercolatorFieldMapper extends FieldMapper {
static final String EXTRACTION_RESULT_FIELD_NAME = "extraction_result";
static final String QUERY_BUILDER_FIELD_NAME = "query_builder_field";
static final String RANGE_FIELD_NAME = "range_field";
static final String MINIMUM_SHOULD_MATCH_FIELD_NAME = "minimum_should_match_field";
static class Builder extends FieldMapper.Builder<Builder, PercolatorFieldMapper> {
private final Supplier<QueryShardContext> queryShardContext;
private final Map<String, Float> boostFields = new HashMap<>();
Builder(String fieldName, Supplier<QueryShardContext> queryShardContext) {
super(fieldName, FIELD_TYPE, FIELD_TYPE);
@ -138,15 +140,13 @@ public class PercolatorFieldMapper extends FieldMapper {
// have to introduce a new field type...
RangeFieldMapper rangeFieldMapper = createExtractedRangeFieldBuilder(RANGE_FIELD_NAME, RangeType.IP, context);
fieldType.rangeField = rangeFieldMapper.fieldType();
NumberFieldMapper minimumShouldMatchFieldMapper = createMinimumShouldMatchField(context);
fieldType.minimumShouldMatchField = minimumShouldMatchFieldMapper.fieldType();
context.path().remove();
setupFieldType(context);
return new PercolatorFieldMapper(name(), fieldType, defaultFieldType, context.indexSettings(),
multiFieldsBuilder.build(this, context), copyTo, queryShardContext, extractedTermsField,
extractionResultField, queryBuilderField, rangeFieldMapper, Collections.unmodifiableMap(boostFields));
}
void addBoostField(String field, float boost) {
this.boostFields.put(field, boost);
extractionResultField, queryBuilderField, rangeFieldMapper, minimumShouldMatchFieldMapper);
}
static KeywordFieldMapper createExtractQueryFieldBuilder(String name, BuilderContext context) {
@ -173,30 +173,23 @@ public class PercolatorFieldMapper extends FieldMapper {
return builder.build(context);
}
static NumberFieldMapper createMinimumShouldMatchField(BuilderContext context) {
NumberFieldMapper.Builder builder =
new NumberFieldMapper.Builder(MINIMUM_SHOULD_MATCH_FIELD_NAME, NumberFieldMapper.NumberType.INTEGER);
builder.index(false);
builder.store(false);
builder.docValues(true);
builder.fieldType().setDocValuesType(DocValuesType.NUMERIC);
return builder.build(context);
}
}
static class TypeParser implements FieldMapper.TypeParser {
@Override
public Builder parse(String name, Map<String, Object> node, ParserContext parserContext) throws MapperParsingException {
Builder builder = new Builder(name, parserContext.queryShardContextSupplier());
for (Iterator<Map.Entry<String, Object>> iterator = node.entrySet().iterator(); iterator.hasNext();) {
Map.Entry<String, Object> entry = iterator.next();
String propName = entry.getKey();
Object propNode = entry.getValue();
if (propName.equals("boost_fields")) {
if (isObject(propNode)) {
for (Map.Entry<?, ?> innerEntry : ((Map<?, ?>) propNode).entrySet()) {
String fieldName = nodeStringValue(innerEntry.getKey(), null);
builder.addBoostField(fieldName, nodeFloatValue(innerEntry.getValue()));
}
} else {
throw new IllegalArgumentException("boost_fields [" + propNode + "] is not an object");
}
iterator.remove();
}
}
return builder;
return new Builder(name, parserContext.queryShardContextSupplier());
}
}
@ -205,6 +198,7 @@ public class PercolatorFieldMapper extends FieldMapper {
MappedFieldType queryTermsField;
MappedFieldType extractionResultField;
MappedFieldType queryBuilderField;
MappedFieldType minimumShouldMatchField;
RangeFieldMapper.RangeFieldType rangeField;
@ -220,6 +214,7 @@ public class PercolatorFieldMapper extends FieldMapper {
extractionResultField = ref.extractionResultField;
queryBuilderField = ref.queryBuilderField;
rangeField = ref.rangeField;
minimumShouldMatchField = ref.minimumShouldMatchField;
}
@Override
@ -247,23 +242,37 @@ public class PercolatorFieldMapper extends FieldMapper {
}
Query percolateQuery(String name, PercolateQuery.QueryStore queryStore, List<BytesReference> documents,
IndexSearcher searcher) throws IOException {
IndexSearcher searcher, Version indexVersion) throws IOException {
IndexReader indexReader = searcher.getIndexReader();
Query candidateMatchesQuery = createCandidateQuery(indexReader);
Tuple<List<Query>, Boolean> t = createCandidateQueryClauses(indexReader);
BooleanQuery.Builder candidateQuery = new BooleanQuery.Builder();
if (t.v2() && indexVersion.onOrAfter(Version.V_6_1_0)) {
LongValuesSource valuesSource = LongValuesSource.fromIntField(minimumShouldMatchField.name());
candidateQuery.add(new CoveringQuery(t.v1(), valuesSource), BooleanClause.Occur.SHOULD);
} else {
for (Query query : t.v1()) {
candidateQuery.add(query, BooleanClause.Occur.SHOULD);
}
}
// include extractionResultField:failed, because docs with this term have no extractedTermsField
// and otherwise we would fail to return these docs. Docs that failed query term extraction
// always need to be verified by MemoryIndex:
candidateQuery.add(new TermQuery(new Term(extractionResultField.name(), EXTRACTION_FAILED)), BooleanClause.Occur.SHOULD);
Query verifiedMatchesQuery;
// We can only skip the MemoryIndex verification when percolating a single document.
// When the document being percolated contains a nested object field then the MemoryIndex contains multiple
// documents. In this case the term query that indicates whether memory index verification can be skipped
// can incorrectly indicate that non nested queries would match, while their nested variants would not.
if (indexReader.maxDoc() == 1) {
if (t.v2() && indexReader.maxDoc() == 1) {
verifiedMatchesQuery = new TermQuery(new Term(extractionResultField.name(), EXTRACTION_COMPLETE));
} else {
verifiedMatchesQuery = new MatchNoDocsQuery("multiple/nested docs, so no verified matches");
verifiedMatchesQuery = new MatchNoDocsQuery("multiple or nested docs or CoveringQuery could not be used");
}
return new PercolateQuery(name, queryStore, documents, candidateMatchesQuery, searcher, verifiedMatchesQuery);
return new PercolateQuery(name, queryStore, documents, candidateQuery.build(), searcher, verifiedMatchesQuery);
}
Query createCandidateQuery(IndexReader indexReader) throws IOException {
Tuple<List<Query>, Boolean> createCandidateQueryClauses(IndexReader indexReader) throws IOException {
List<BytesRef> extractedTerms = new ArrayList<>();
Map<String, List<byte[]>> encodedPointValuesByField = new HashMap<>();
@ -290,14 +299,17 @@ public class PercolatorFieldMapper extends FieldMapper {
}
}
BooleanQuery.Builder builder = new BooleanQuery.Builder();
if (extractedTerms.size() != 0) {
builder.add(new TermInSetQuery(queryTermsField.name(), extractedTerms), Occur.SHOULD);
final boolean canUseMinimumShouldMatchField;
final List<Query> queries = new ArrayList<>();
if (extractedTerms.size() + encodedPointValuesByField.size() <= BooleanQuery.getMaxClauseCount()) {
canUseMinimumShouldMatchField = true;
for (BytesRef extractedTerm : extractedTerms) {
queries.add(new TermQuery(new Term(queryTermsField.name(), extractedTerm)));
}
} else {
canUseMinimumShouldMatchField = false;
queries.add(new TermInSetQuery(queryTermsField.name(), extractedTerms));
}
// include extractionResultField:failed, because docs with this term have no extractedTermsField
// and otherwise we would fail to return these docs. Docs that failed query term extraction
// always need to be verified by MemoryIndex:
builder.add(new TermQuery(new Term(extractionResultField.name(), EXTRACTION_FAILED)), Occur.SHOULD);
for (Map.Entry<String, List<byte[]>> entry : encodedPointValuesByField.entrySet()) {
String rangeFieldName = entry.getKey();
@ -305,9 +317,9 @@ public class PercolatorFieldMapper extends FieldMapper {
byte[] min = encodedPointValues.get(0);
byte[] max = encodedPointValues.get(1);
Query query = BinaryRange.newIntersectsQuery(rangeField.name(), encodeRange(rangeFieldName, min, max));
builder.add(query, Occur.SHOULD);
queries.add(query);
}
return builder.build();
return new Tuple<>(queries, canUseMinimumShouldMatchField);
}
}
@ -317,24 +329,24 @@ public class PercolatorFieldMapper extends FieldMapper {
private KeywordFieldMapper queryTermsField;
private KeywordFieldMapper extractionResultField;
private BinaryFieldMapper queryBuilderField;
private NumberFieldMapper minimumShouldMatchFieldMapper;
private RangeFieldMapper rangeFieldMapper;
private Map<String, Float> boostFields;
PercolatorFieldMapper(String simpleName, MappedFieldType fieldType, MappedFieldType defaultFieldType,
Settings indexSettings, MultiFields multiFields, CopyTo copyTo,
Supplier<QueryShardContext> queryShardContext,
KeywordFieldMapper queryTermsField, KeywordFieldMapper extractionResultField,
BinaryFieldMapper queryBuilderField, RangeFieldMapper rangeFieldMapper,
Map<String, Float> boostFields) {
Settings indexSettings, MultiFields multiFields, CopyTo copyTo,
Supplier<QueryShardContext> queryShardContext,
KeywordFieldMapper queryTermsField, KeywordFieldMapper extractionResultField,
BinaryFieldMapper queryBuilderField, RangeFieldMapper rangeFieldMapper,
NumberFieldMapper minimumShouldMatchFieldMapper) {
super(simpleName, fieldType, defaultFieldType, indexSettings, multiFields, copyTo);
this.queryShardContext = queryShardContext;
this.queryTermsField = queryTermsField;
this.extractionResultField = extractionResultField;
this.queryBuilderField = queryBuilderField;
this.minimumShouldMatchFieldMapper = minimumShouldMatchFieldMapper;
this.mapUnmappedFieldAsText = getMapUnmappedFieldAsText(indexSettings);
this.rangeFieldMapper = rangeFieldMapper;
this.boostFields = boostFields;
}
private static boolean getMapUnmappedFieldAsText(Settings indexSettings) {
@ -361,6 +373,7 @@ public class PercolatorFieldMapper extends FieldMapper {
KeywordFieldMapper extractionResultUpdated = (KeywordFieldMapper) extractionResultField.updateFieldType(fullNameToFieldType);
BinaryFieldMapper queryBuilderUpdated = (BinaryFieldMapper) queryBuilderField.updateFieldType(fullNameToFieldType);
RangeFieldMapper rangeFieldMapperUpdated = (RangeFieldMapper) rangeFieldMapper.updateFieldType(fullNameToFieldType);
NumberFieldMapper msmFieldMapperUpdated = (NumberFieldMapper) minimumShouldMatchFieldMapper.updateFieldType(fullNameToFieldType);
if (updated == this && queryTermsUpdated == queryTermsField && extractionResultUpdated == extractionResultField
&& queryBuilderUpdated == queryBuilderField && rangeFieldMapperUpdated == rangeFieldMapper) {
@ -373,6 +386,7 @@ public class PercolatorFieldMapper extends FieldMapper {
updated.extractionResultField = extractionResultUpdated;
updated.queryBuilderField = queryBuilderUpdated;
updated.rangeFieldMapper = rangeFieldMapperUpdated;
updated.minimumShouldMatchFieldMapper = msmFieldMapperUpdated;
return updated;
}
@ -429,7 +443,8 @@ public class PercolatorFieldMapper extends FieldMapper {
FieldType pft = (FieldType) this.fieldType();
QueryAnalyzer.Result result;
try {
result = QueryAnalyzer.analyze(query, boostFields);
Version indexVersion = context.mapperService().getIndexSettings().getIndexVersionCreated();
result = QueryAnalyzer.analyze(query, indexVersion);
} catch (QueryAnalyzer.UnsupportedQueryException e) {
doc.add(new Field(pft.extractionResultField.name(), EXTRACTION_FAILED, extractionResultField.fieldType()));
return;
@ -457,6 +472,9 @@ public class PercolatorFieldMapper extends FieldMapper {
for (IndexableField field : fields) {
context.doc().add(field);
}
if (context.mapperService().getIndexSettings().getIndexVersionCreated().onOrAfter(Version.V_6_1_0)) {
doc.add(new NumericDocValuesField(minimumShouldMatchFieldMapper.name(), result.minimumShouldMatch));
}
}
static Query parseQuery(QueryShardContext context, boolean mapUnmappedFieldsAsString, XContentParser parser) throws IOException {
@ -491,7 +509,9 @@ public class PercolatorFieldMapper extends FieldMapper {
@Override
public Iterator<Mapper> iterator() {
return Arrays.<Mapper>asList(queryTermsField, extractionResultField, queryBuilderField, rangeFieldMapper).iterator();
return Arrays.<Mapper>asList(
queryTermsField, extractionResultField, queryBuilderField, minimumShouldMatchFieldMapper, rangeFieldMapper
).iterator();
}
@Override
@ -504,28 +524,6 @@ public class PercolatorFieldMapper extends FieldMapper {
return CONTENT_TYPE;
}
@Override
protected void doMerge(Mapper mergeWith, boolean updateAllTypes) {
super.doMerge(mergeWith, updateAllTypes);
PercolatorFieldMapper percolatorMergeWith = (PercolatorFieldMapper) mergeWith;
// Updating the boost_fields can be allowed, because it doesn't break previously indexed percolator queries
// However the updated boost_fields to completely take effect, percolator queries prior to the mapping update need to be reindexed
boostFields = percolatorMergeWith.boostFields;
}
@Override
protected void doXContentBody(XContentBuilder builder, boolean includeDefaults, Params params) throws IOException {
super.doXContentBody(builder, includeDefaults, params);
if (boostFields.isEmpty() == false) {
builder.startObject("boost_fields");
for (Map.Entry<String, Float> entry : boostFields.entrySet()) {
builder.field(entry.getKey(), entry.getValue());
}
builder.endObject();
}
}
boolean isMapUnmappedFieldAsText() {
return mapUnmappedFieldAsText;
}


@ -45,6 +45,7 @@ import org.apache.lucene.search.spans.SpanQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.util.BytesRef;
import org.apache.lucene.util.NumericUtils;
import org.elasticsearch.Version;
import org.elasticsearch.common.logging.LoggerMessageFormat;
import org.elasticsearch.common.lucene.search.function.FunctionScoreQuery;
import org.elasticsearch.index.search.ESToParentBlockJoinQuery;
@ -59,16 +60,15 @@ import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.function.BiFunction;
import java.util.function.Predicate;
import static java.util.stream.Collectors.toSet;
final class QueryAnalyzer {
private static final Map<Class<? extends Query>, BiFunction<Query, Map<String, Float>, Result>> queryProcessors;
private static final Map<Class<? extends Query>, BiFunction<Query, Version, Result>> queryProcessors;
static {
Map<Class<? extends Query>, BiFunction<Query, Map<String, Float>, Result>> map = new HashMap<>();
Map<Class<? extends Query>, BiFunction<Query, Version, Result>> map = new HashMap<>();
map.put(MatchNoDocsQuery.class, matchNoDocsQuery());
map.put(ConstantScoreQuery.class, constantScoreQuery());
map.put(BoostQuery.class, boostQuery());
@ -119,161 +119,196 @@ final class QueryAnalyzer {
* Sometimes the query analyzer can't extract terms or ranges from a sub query; if that happens then
* query analysis is stopped and an UnsupportedQueryException is thrown, so that the caller can mark
* this query in such a way that the PercolateQuery always verifies it with the MemoryIndex.
*
* @param query The query to analyze.
* @param indexVersion The creation version of the index containing the percolator queries.
*/
static Result analyze(Query query, Map<String, Float> boosts) {
static Result analyze(Query query, Version indexVersion) {
Class queryClass = query.getClass();
if (queryClass.isAnonymousClass()) {
// Sometimes queries have anonymous classes in that case we need the direct super class.
// (for example blended term query)
queryClass = queryClass.getSuperclass();
}
BiFunction<Query, Map<String, Float>, Result> queryProcessor = queryProcessors.get(queryClass);
BiFunction<Query, Version, Result> queryProcessor = queryProcessors.get(queryClass);
if (queryProcessor != null) {
return queryProcessor.apply(query, boosts);
return queryProcessor.apply(query, indexVersion);
} else {
throw new UnsupportedQueryException(query);
}
}
private static BiFunction<Query, Map<String, Float>, Result> matchNoDocsQuery() {
return (query, boosts) -> new Result(true, Collections.emptySet());
private static BiFunction<Query, Version, Result> matchNoDocsQuery() {
return (query, version) -> new Result(true, Collections.emptySet(), 1);
}
private static BiFunction<Query, Map<String, Float>, Result> constantScoreQuery() {
return (query, boosts)-> {
private static BiFunction<Query, Version, Result> constantScoreQuery() {
return (query, boosts) -> {
Query wrappedQuery = ((ConstantScoreQuery) query).getQuery();
return analyze(wrappedQuery, boosts);
};
}
private static BiFunction<Query, Map<String, Float>, Result> boostQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> boostQuery() {
return (query, version) -> {
Query wrappedQuery = ((BoostQuery) query).getQuery();
return analyze(wrappedQuery, boosts);
return analyze(wrappedQuery, version);
};
}
private static BiFunction<Query, Map<String, Float>, Result> termQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> termQuery() {
return (query, version) -> {
TermQuery termQuery = (TermQuery) query;
return new Result(true, Collections.singleton(new QueryExtraction(termQuery.getTerm())));
return new Result(true, Collections.singleton(new QueryExtraction(termQuery.getTerm())), 1);
};
}
private static BiFunction<Query, Map<String, Float>, Result> termInSetQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> termInSetQuery() {
return (query, version) -> {
TermInSetQuery termInSetQuery = (TermInSetQuery) query;
Set<QueryExtraction> terms = new HashSet<>();
PrefixCodedTerms.TermIterator iterator = termInSetQuery.getTermData().iterator();
for (BytesRef term = iterator.next(); term != null; term = iterator.next()) {
terms.add(new QueryExtraction(new Term(iterator.field(), term)));
}
return new Result(true, terms);
return new Result(true, terms, 1);
};
}
private static BiFunction<Query, Map<String, Float>, Result> synonymQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> synonymQuery() {
return (query, version) -> {
Set<QueryExtraction> terms = ((SynonymQuery) query).getTerms().stream().map(QueryExtraction::new).collect(toSet());
return new Result(true, terms);
return new Result(true, terms, 1);
};
}
private static BiFunction<Query, Map<String, Float>, Result> commonTermsQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> commonTermsQuery() {
return (query, version) -> {
Set<QueryExtraction> terms = ((CommonTermsQuery) query).getTerms().stream().map(QueryExtraction::new).collect(toSet());
return new Result(false, terms);
return new Result(false, terms, 1);
};
}
private static BiFunction<Query, Map<String, Float>, Result> blendedTermQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> blendedTermQuery() {
return (query, version) -> {
Set<QueryExtraction> terms = ((BlendedTermQuery) query).getTerms().stream().map(QueryExtraction::new).collect(toSet());
return new Result(true, terms);
return new Result(true, terms, 1);
};
}
private static BiFunction<Query, Map<String, Float>, Result> phraseQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> phraseQuery() {
return (query, version) -> {
Term[] terms = ((PhraseQuery) query).getTerms();
if (terms.length == 0) {
return new Result(true, Collections.emptySet());
return new Result(true, Collections.emptySet(), 1);
}
// the longest term is likely to be the rarest,
// so from a performance perspective it makes sense to extract that
Term longestTerm = terms[0];
for (Term term : terms) {
if (longestTerm.bytes().length < term.bytes().length) {
longestTerm = term;
if (version.onOrAfter(Version.V_6_1_0)) {
Set<QueryExtraction> extractions = Arrays.stream(terms).map(QueryExtraction::new).collect(toSet());
return new Result(false, extractions, extractions.size());
} else {
// the longest term is likely to be the rarest,
// so from a performance perspective it makes sense to extract that
Term longestTerm = terms[0];
for (Term term : terms) {
if (longestTerm.bytes().length < term.bytes().length) {
longestTerm = term;
}
}
return new Result(false, Collections.singleton(new QueryExtraction(longestTerm)), 1);
}
return new Result(false, Collections.singleton(new QueryExtraction(longestTerm)));
};
}
private static BiFunction<Query, Map<String, Float>, Result> multiPhraseQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> multiPhraseQuery() {
return (query, version) -> {
Term[][] terms = ((MultiPhraseQuery) query).getTermArrays();
if (terms.length == 0) {
return new Result(true, Collections.emptySet());
return new Result(true, Collections.emptySet(), 1);
}
Set<QueryExtraction> bestTermArr = null;
for (Term[] termArr : terms) {
Set<QueryExtraction> queryExtractions = Arrays.stream(termArr).map(QueryExtraction::new).collect(toSet());
bestTermArr = selectBestExtraction(boosts, bestTermArr, queryExtractions);
if (version.onOrAfter(Version.V_6_1_0)) {
Set<QueryExtraction> extractions = new HashSet<>();
for (Term[] termArr : terms) {
extractions.addAll(Arrays.stream(termArr).map(QueryExtraction::new).collect(toSet()));
}
return new Result(false, extractions, terms.length);
} else {
Set<QueryExtraction> bestTermArr = null;
for (Term[] termArr : terms) {
Set<QueryExtraction> queryExtractions = Arrays.stream(termArr).map(QueryExtraction::new).collect(toSet());
bestTermArr = selectBestExtraction(bestTermArr, queryExtractions);
}
return new Result(false, bestTermArr, 1);
}
return new Result(false, bestTermArr);
};
}
private static BiFunction<Query, Map<String, Float>, Result> spanTermQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> spanTermQuery() {
return (query, version) -> {
Term term = ((SpanTermQuery) query).getTerm();
return new Result(true, Collections.singleton(new QueryExtraction(term)));
return new Result(true, Collections.singleton(new QueryExtraction(term)), 1);
};
}
private static BiFunction<Query, Map<String, Float>, Result> spanNearQuery() {
return (query, boosts) -> {
Set<QueryExtraction> bestClauses = null;
private static BiFunction<Query, Version, Result> spanNearQuery() {
return (query, version) -> {
SpanNearQuery spanNearQuery = (SpanNearQuery) query;
for (SpanQuery clause : spanNearQuery.getClauses()) {
Result temp = analyze(clause, boosts);
bestClauses = selectBestExtraction(boosts, temp.extractions, bestClauses);
if (version.onOrAfter(Version.V_6_1_0)) {
Set<Result> results = Arrays.stream(spanNearQuery.getClauses()).map(clause -> analyze(clause, version)).collect(toSet());
int msm = 0;
Set<QueryExtraction> extractions = new HashSet<>();
Set<String> seenRangeFields = new HashSet<>();
for (Result result : results) {
QueryExtraction[] t = result.extractions.toArray(new QueryExtraction[1]);
if (result.extractions.size() == 1 && t[0].range != null) {
if (seenRangeFields.add(t[0].range.fieldName)) {
msm += 1;
}
} else {
msm += result.minimumShouldMatch;
}
extractions.addAll(result.extractions);
}
return new Result(false, extractions, msm);
} else {
Set<QueryExtraction> bestClauses = null;
for (SpanQuery clause : spanNearQuery.getClauses()) {
Result temp = analyze(clause, version);
bestClauses = selectBestExtraction(temp.extractions, bestClauses);
}
return new Result(false, bestClauses, 1);
}
return new Result(false, bestClauses);
};
}
private static BiFunction<Query, Map<String, Float>, Result> spanOrQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> spanOrQuery() {
return (query, version) -> {
Set<QueryExtraction> terms = new HashSet<>();
SpanOrQuery spanOrQuery = (SpanOrQuery) query;
for (SpanQuery clause : spanOrQuery.getClauses()) {
terms.addAll(analyze(clause, boosts).extractions);
terms.addAll(analyze(clause, version).extractions);
}
return new Result(false, terms);
return new Result(false, terms, 1);
};
}
private static BiFunction<Query, Map<String, Float>, Result> spanNotQuery() {
return (query, boosts) -> {
Result result = analyze(((SpanNotQuery) query).getInclude(), boosts);
return new Result(false, result.extractions);
private static BiFunction<Query, Version, Result> spanNotQuery() {
return (query, version) -> {
Result result = analyze(((SpanNotQuery) query).getInclude(), version);
return new Result(false, result.extractions, result.minimumShouldMatch);
};
}
private static BiFunction<Query, Map<String, Float>, Result> spanFirstQuery() {
return (query, boosts) -> {
Result result = analyze(((SpanFirstQuery) query).getMatch(), boosts);
return new Result(false, result.extractions);
private static BiFunction<Query, Version, Result> spanFirstQuery() {
return (query, version) -> {
Result result = analyze(((SpanFirstQuery) query).getMatch(), version);
return new Result(false, result.extractions, result.minimumShouldMatch);
};
}
private static BiFunction<Query, Map<String, Float>, Result> booleanQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> booleanQuery() {
return (query, version) -> {
BooleanQuery bq = (BooleanQuery) query;
List<BooleanClause> clauses = bq.clauses();
int minimumShouldMatch = bq.getMinimumNumberShouldMatch();
@ -292,34 +327,89 @@ final class QueryAnalyzer {
}
}
if (numRequiredClauses > 0) {
Set<QueryExtraction> bestClause = null;
UnsupportedQueryException uqe = null;
for (BooleanClause clause : clauses) {
if (clause.isRequired() == false) {
// skip must_not clauses, we don't need to remember the things that do *not* match...
// skip should clauses, this bq has must clauses, so we don't need to remember should clauses,
// since they are completely optional.
continue;
if (version.onOrAfter(Version.V_6_1_0)) {
UnsupportedQueryException uqe = null;
List<Result> results = new ArrayList<>(numRequiredClauses);
for (BooleanClause clause : clauses) {
if (clause.isRequired()) {
// skip must_not clauses, we don't need to remember the things that do *not* match...
// skip should clauses, this bq has must clauses, so we don't need to remember should clauses,
// since they are completely optional.
try {
results.add(analyze(clause.getQuery(), version));
} catch (UnsupportedQueryException e) {
uqe = e;
}
}
}
Result temp;
try {
temp = analyze(clause.getQuery(), boosts);
} catch (UnsupportedQueryException e) {
uqe = e;
continue;
}
bestClause = selectBestExtraction(boosts, temp.extractions, bestClause);
}
if (bestClause != null) {
return new Result(false, bestClause);
} else {
if (uqe != null) {
// we're unable to select the best clause and an exception occurred, so we bail
throw uqe;
if (results.isEmpty()) {
if (uqe != null) {
// we're unable to select the best clause and an exception occurred, so we bail
throw uqe;
} else {
// We didn't find a clause and no exception occurred, so this bq only contained MatchNoDocsQueries,
return new Result(true, Collections.emptySet(), 1);
}
} else {
// We didn't find a clause and no exception occurred, so this bq only contained MatchNoDocsQueries,
return new Result(true, Collections.emptySet());
int msm = 0;
boolean requiredShouldClauses = minimumShouldMatch > 0 && numOptionalClauses > 0;
boolean verified = uqe == null && numProhibitedClauses == 0 && requiredShouldClauses == false;
Set<QueryExtraction> extractions = new HashSet<>();
Set<String> seenRangeFields = new HashSet<>();
for (Result result : results) {
QueryExtraction[] t = result.extractions.toArray(new QueryExtraction[1]);
if (result.extractions.size() == 1 && t[0].range != null) {
// In case of range queries each extraction does not simply increment the minimum_should_match
// for that percolator query like a term based extraction does, and that can lead to more false
// positives for percolator queries with range queries than for term based queries.
// This is because of the way number fields are extracted from the document to be percolated:
// per field a single range is extracted, and if a percolator query has two or more range queries
// on the same field, then the minimum should match can be higher than the number of matching
// clauses in the CoveringQuery. Therefore right now the minimum should match is incremented
// once per number field when processing the percolator query at index time.
if (seenRangeFields.add(t[0].range.fieldName)) {
msm += 1;
}
} else {
msm += result.minimumShouldMatch;
}
verified &= result.verified;
extractions.addAll(result.extractions);
}
return new Result(verified, extractions, msm);
}
} else {
Set<QueryExtraction> bestClause = null;
UnsupportedQueryException uqe = null;
for (BooleanClause clause : clauses) {
if (clause.isRequired() == false) {
// skip must_not clauses, we don't need to remember the things that do *not* match...
// skip should clauses, this bq has must clauses, so we don't need to remember should clauses,
// since they are completely optional.
continue;
}
Result temp;
try {
temp = analyze(clause.getQuery(), version);
} catch (UnsupportedQueryException e) {
uqe = e;
continue;
}
bestClause = selectBestExtraction(temp.extractions, bestClause);
}
if (bestClause != null) {
return new Result(false, bestClause, 1);
} else {
if (uqe != null) {
// we're unable to select the best clause and an exception occurred, so we bail
throw uqe;
} else {
// We didn't find a clause and no exception occurred, so this bq only contained MatchNoDocsQueries,
return new Result(true, Collections.emptySet(), 1);
}
}
}
} else {
@ -329,33 +419,33 @@ final class QueryAnalyzer {
disjunctions.add(clause.getQuery());
}
}
return handleDisjunction(disjunctions, minimumShouldMatch, numProhibitedClauses > 0, boosts);
return handleDisjunction(disjunctions, minimumShouldMatch, numProhibitedClauses > 0, version);
}
};
}
private static BiFunction<Query, Map<String, Float>, Result> disjunctionMaxQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> disjunctionMaxQuery() {
return (query, version) -> {
List<Query> disjuncts = ((DisjunctionMaxQuery) query).getDisjuncts();
return handleDisjunction(disjuncts, 1, false, boosts);
return handleDisjunction(disjuncts, 1, false, version);
};
}
private static BiFunction<Query, Map<String, Float>, Result> functionScoreQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> functionScoreQuery() {
return (query, version) -> {
FunctionScoreQuery functionScoreQuery = (FunctionScoreQuery) query;
Result result = analyze(functionScoreQuery.getSubQuery(), boosts);
Result result = analyze(functionScoreQuery.getSubQuery(), version);
// If min_score is specified we can't guarantee upfront that this percolator query matches,
// so in that case we set verified to false.
// (If the percolated document matches the extracted terms, the query may still be rejected by min_score:
// min_score filters out docs, which is different from the functions, which just influence the score.)
boolean verified = functionScoreQuery.getMinScore() == null;
return new Result(verified, result.extractions);
return new Result(verified, result.extractions, result.minimumShouldMatch);
};
}
private static BiFunction<Query, Map<String, Float>, Result> pointRangeQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> pointRangeQuery() {
return (query, version) -> {
PointRangeQuery pointRangeQuery = (PointRangeQuery) query;
if (pointRangeQuery.getNumDims() != 1) {
throw new UnsupportedQueryException(query);
@ -367,14 +457,13 @@ final class QueryAnalyzer {
// Need to check whether upper is not smaller than lower, otherwise NumericUtils.subtract(...) fails IAE
// If upper is really smaller than lower then we deal with like MatchNoDocsQuery. (verified and no extractions)
if (new BytesRef(lowerPoint).compareTo(new BytesRef(upperPoint)) > 0) {
return new Result(true, Collections.emptySet());
return new Result(true, Collections.emptySet(), 1);
}
byte[] interval = new byte[16];
NumericUtils.subtract(16, 0, prepad(upperPoint), prepad(lowerPoint), interval);
return new Result(false, Collections.singleton(new QueryExtraction(
new Range(pointRangeQuery.getField(), lowerPoint, upperPoint, interval))
));
new Range(pointRangeQuery.getField(), lowerPoint, upperPoint, interval))), 1);
};
}
@ -385,82 +474,83 @@ final class QueryAnalyzer {
return result;
}
private static BiFunction<Query, Map<String, Float>, Result> indexOrDocValuesQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> indexOrDocValuesQuery() {
return (query, version) -> {
IndexOrDocValuesQuery indexOrDocValuesQuery = (IndexOrDocValuesQuery) query;
return analyze(indexOrDocValuesQuery.getIndexQuery(), boosts);
return analyze(indexOrDocValuesQuery.getIndexQuery(), version);
};
}
private static BiFunction<Query, Map<String, Float>, Result> toParentBlockJoinQuery() {
return (query, boosts) -> {
private static BiFunction<Query, Version, Result> toParentBlockJoinQuery() {
return (query, version) -> {
ESToParentBlockJoinQuery toParentBlockJoinQuery = (ESToParentBlockJoinQuery) query;
Result result = analyze(toParentBlockJoinQuery.getChildQuery(), boosts);
return new Result(false, result.extractions);
Result result = analyze(toParentBlockJoinQuery.getChildQuery(), version);
return new Result(false, result.extractions, result.minimumShouldMatch);
};
}
private static Result handleDisjunction(List<Query> disjunctions, int minimumShouldMatch, boolean otherClauses,
Map<String, Float> boosts) {
boolean verified = minimumShouldMatch <= 1 && otherClauses == false;
private static Result handleDisjunction(List<Query> disjunctions, int requiredShouldClauses, boolean otherClauses,
Version version) {
// Keep track of the msm for each clause:
int[] msmPerClause = new int[disjunctions.size()];
String[] rangeFieldNames = new String[disjunctions.size()];
boolean verified = otherClauses == false;
Set<QueryExtraction> terms = new HashSet<>();
for (Query disjunct : disjunctions) {
Result subResult = analyze(disjunct, boosts);
if (subResult.verified == false) {
verified = false;
}
for (int i = 0; i < disjunctions.size(); i++) {
Query disjunct = disjunctions.get(i);
Result subResult = analyze(disjunct, version);
verified &= subResult.verified;
terms.addAll(subResult.extractions);
QueryExtraction[] t = subResult.extractions.toArray(new QueryExtraction[1]);
msmPerClause[i] = subResult.minimumShouldMatch;
if (subResult.extractions.size() == 1 && t[0].range != null) {
rangeFieldNames[i] = t[0].range.fieldName;
}
}
return new Result(verified, terms);
int msm = 0;
if (version.onOrAfter(Version.V_6_1_0)) {
Set<String> seenRangeFields = new HashSet<>();
// Figure out what the combined msm is for this disjunction:
// (sum the lowest required clauses, otherwise we're too strict and queries may not match)
Arrays.sort(msmPerClause);
int limit = Math.min(msmPerClause.length, Math.max(1, requiredShouldClauses));
for (int i = 0; i < limit; i++) {
if (rangeFieldNames[i] != null) {
if (seenRangeFields.add(rangeFieldNames[i])) {
msm += 1;
}
} else {
msm += msmPerClause[i];
}
}
} else {
msm = 1;
}
return new Result(verified, terms, msm);
}
static Set<QueryExtraction> selectBestExtraction(Map<String, Float> boostFields, Set<QueryExtraction> extractions1,
Set<QueryExtraction> extractions2) {
static Set<QueryExtraction> selectBestExtraction(Set<QueryExtraction> extractions1, Set<QueryExtraction> extractions2) {
assert extractions1 != null || extractions2 != null;
if (extractions1 == null) {
return extractions2;
} else if (extractions2 == null) {
return extractions1;
} else {
Set<QueryExtraction> filtered1;
Set<QueryExtraction> filtered2;
if (boostFields.isEmpty() == false) {
Predicate<QueryExtraction> predicate = extraction -> {
String fieldName = extraction.term != null ? extraction.term.field() : extraction.range.fieldName;
float boost = boostFields.getOrDefault(fieldName, 1F);
return boost != 0F;
};
filtered1 = extractions1.stream().filter(predicate).collect(toSet());
if (filtered1.isEmpty()) {
return extractions2;
}
filtered2 = extractions2.stream().filter(predicate).collect(toSet());
if (filtered2.isEmpty()) {
return extractions1;
}
float extraction1LowestBoost = lowestBoost(filtered1, boostFields);
float extraction2LowestBoost = lowestBoost(filtered2, boostFields);
if (extraction1LowestBoost > extraction2LowestBoost) {
return extractions1;
} else if (extraction2LowestBoost > extraction1LowestBoost) {
return extractions2;
}
// Step out, because boosts are equal, so pick best extraction on either term or range size.
} else {
filtered1 = extractions1;
filtered2 = extractions2;
}
// Prefer term based extractions over range based extractions:
boolean onlyRangeBasedExtractions = true;
for (QueryExtraction clause : filtered1) {
for (QueryExtraction clause : extractions1) {
if (clause.term != null) {
onlyRangeBasedExtractions = false;
break;
}
}
for (QueryExtraction clause : filtered2) {
for (QueryExtraction clause : extractions2) {
if (clause.term != null) {
onlyRangeBasedExtractions = false;
break;
@ -468,8 +558,8 @@ final class QueryAnalyzer {
}
if (onlyRangeBasedExtractions) {
BytesRef extraction1SmallestRange = smallestRange(filtered1);
BytesRef extraction2SmallestRange = smallestRange(filtered2);
BytesRef extraction1SmallestRange = smallestRange(extractions1);
BytesRef extraction2SmallestRange = smallestRange(extractions2);
if (extraction1SmallestRange == null) {
return extractions2;
} else if (extraction2SmallestRange == null) {
@ -483,8 +573,8 @@ final class QueryAnalyzer {
return extractions2;
}
} else {
int extraction1ShortestTerm = minTermLength(filtered1);
int extraction2ShortestTerm = minTermLength(filtered2);
int extraction1ShortestTerm = minTermLength(extractions1);
int extraction2ShortestTerm = minTermLength(extractions2);
// keep the clause with longest terms, this likely to be rarest.
if (extraction1ShortestTerm >= extraction2ShortestTerm) {
return extractions1;
@ -495,21 +585,11 @@ final class QueryAnalyzer {
}
}
private static float lowestBoost(Set<QueryExtraction> extractions, Map<String, Float> boostFields) {
float lowestBoost = Float.POSITIVE_INFINITY;
for (QueryExtraction extraction : extractions) {
String fieldName = extraction.term != null ? extraction.term.field() : extraction.range.fieldName;
float boost = boostFields.getOrDefault(fieldName, 1F);
lowestBoost = Math.min(lowestBoost, boost);
}
return lowestBoost;
}
private static int minTermLength(Set<QueryExtraction> extractions) {
// In case there are only range extractions, we return Integer.MIN_VALUE,
// so that selectBestExtraction(...) is likely to prefer extractions that contain at least a single term
if (extractions.stream().filter(queryExtraction -> queryExtraction.term != null).count() == 0 &&
extractions.stream().filter(queryExtraction -> queryExtraction.range != null).count() > 0) {
extractions.stream().filter(queryExtraction -> queryExtraction.range != null).count() > 0) {
return Integer.MIN_VALUE;
}
@ -538,10 +618,12 @@ final class QueryAnalyzer {
final Set<QueryExtraction> extractions;
final boolean verified;
final int minimumShouldMatch;
Result(boolean verified, Set<QueryExtraction> extractions) {
Result(boolean verified, Set<QueryExtraction> extractions, int minimumShouldMatch) {
this.extractions = extractions;
this.verified = verified;
this.minimumShouldMatch = minimumShouldMatch;
}
}


@ -55,6 +55,7 @@ import org.apache.lucene.search.MatchNoDocsQuery;
import org.apache.lucene.search.PrefixQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.Scorer;
import org.apache.lucene.search.TermInSetQuery;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.Weight;
@ -64,6 +65,8 @@ import org.apache.lucene.search.spans.SpanNotQuery;
import org.apache.lucene.search.spans.SpanOrQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.RAMDirectory;
import org.elasticsearch.Version;
import org.elasticsearch.common.CheckedFunction;
import org.elasticsearch.common.bytes.BytesArray;
import org.elasticsearch.common.compress.CompressedXContent;
@ -88,6 +91,7 @@ import java.util.function.Function;
import static org.elasticsearch.common.network.InetAddresses.forString;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.instanceOf;
public class CandidateQueryTests extends ESSingleNodeTestCase {
@ -307,9 +311,10 @@ public class CandidateQueryTests extends ESSingleNodeTestCase {
IndexSearcher shardSearcher = newSearcher(directoryReader);
shardSearcher.setQueryCache(null);
Version v = Version.V_6_1_0;
MemoryIndex memoryIndex = MemoryIndex.fromDocument(Collections.singleton(new IntPoint("int_field", 3)), new WhitespaceAnalyzer());
IndexSearcher percolateSearcher = memoryIndex.createSearcher();
Query query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher);
Query query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher, v);
TopDocs topDocs = shardSearcher.search(query, 1);
assertEquals(1L, topDocs.totalHits);
assertEquals(1, topDocs.scoreDocs.length);
@ -317,7 +322,7 @@ public class CandidateQueryTests extends ESSingleNodeTestCase {
memoryIndex = MemoryIndex.fromDocument(Collections.singleton(new LongPoint("long_field", 7L)), new WhitespaceAnalyzer());
percolateSearcher = memoryIndex.createSearcher();
query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher);
query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher, v);
topDocs = shardSearcher.search(query, 1);
assertEquals(1L, topDocs.totalHits);
assertEquals(1, topDocs.scoreDocs.length);
@ -326,7 +331,7 @@ public class CandidateQueryTests extends ESSingleNodeTestCase {
memoryIndex = MemoryIndex.fromDocument(Collections.singleton(new HalfFloatPoint("half_float_field", 12)),
new WhitespaceAnalyzer());
percolateSearcher = memoryIndex.createSearcher();
query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher);
query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher, v);
topDocs = shardSearcher.search(query, 1);
assertEquals(1L, topDocs.totalHits);
assertEquals(1, topDocs.scoreDocs.length);
@ -334,7 +339,7 @@ public class CandidateQueryTests extends ESSingleNodeTestCase {
memoryIndex = MemoryIndex.fromDocument(Collections.singleton(new FloatPoint("float_field", 17)), new WhitespaceAnalyzer());
percolateSearcher = memoryIndex.createSearcher();
query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher);
query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher, v);
topDocs = shardSearcher.search(query, 1);
assertEquals(1, topDocs.totalHits);
assertEquals(1, topDocs.scoreDocs.length);
@ -342,7 +347,7 @@ public class CandidateQueryTests extends ESSingleNodeTestCase {
memoryIndex = MemoryIndex.fromDocument(Collections.singleton(new DoublePoint("double_field", 21)), new WhitespaceAnalyzer());
percolateSearcher = memoryIndex.createSearcher();
query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher);
query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher, v);
topDocs = shardSearcher.search(query, 1);
assertEquals(1, topDocs.totalHits);
assertEquals(1, topDocs.scoreDocs.length);
@ -351,7 +356,7 @@ public class CandidateQueryTests extends ESSingleNodeTestCase {
memoryIndex = MemoryIndex.fromDocument(Collections.singleton(new InetAddressPoint("ip_field",
forString("192.168.0.4"))), new WhitespaceAnalyzer());
percolateSearcher = memoryIndex.createSearcher();
query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher);
query = fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher, v);
topDocs = shardSearcher.search(query, 1);
assertEquals(1, topDocs.totalHits);
assertEquals(1, topDocs.scoreDocs.length);
@ -461,11 +466,99 @@ public class CandidateQueryTests extends ESSingleNodeTestCase {
duelRun(queryStore, memoryIndex, shardSearcher);
}
public void testPercolateSmallAndLargeDocument() throws Exception {
List<ParseContext.Document> docs = new ArrayList<>();
BooleanQuery.Builder builder = new BooleanQuery.Builder();
builder.add(new TermQuery(new Term("field", "value1")), BooleanClause.Occur.MUST);
builder.add(new TermQuery(new Term("field", "value2")), BooleanClause.Occur.MUST);
addQuery(builder.build(), docs);
builder = new BooleanQuery.Builder();
builder.add(new TermQuery(new Term("field", "value2")), BooleanClause.Occur.MUST);
builder.add(new TermQuery(new Term("field", "value3")), BooleanClause.Occur.MUST);
addQuery(builder.build(), docs);
builder = new BooleanQuery.Builder();
builder.add(new TermQuery(new Term("field", "value3")), BooleanClause.Occur.MUST);
builder.add(new TermQuery(new Term("field", "value4")), BooleanClause.Occur.MUST);
addQuery(builder.build(), docs);
indexWriter.addDocuments(docs);
indexWriter.close();
directoryReader = DirectoryReader.open(directory);
IndexSearcher shardSearcher = newSearcher(directoryReader);
shardSearcher.setQueryCache(null);
Version v = Version.CURRENT;
try (RAMDirectory directory = new RAMDirectory()) {
try (IndexWriter iw = new IndexWriter(directory, newIndexWriterConfig())) {
Document document = new Document();
document.add(new StringField("field", "value1", Field.Store.NO));
document.add(new StringField("field", "value2", Field.Store.NO));
iw.addDocument(document);
document = new Document();
document.add(new StringField("field", "value5", Field.Store.NO));
document.add(new StringField("field", "value6", Field.Store.NO));
iw.addDocument(document);
document = new Document();
document.add(new StringField("field", "value3", Field.Store.NO));
document.add(new StringField("field", "value4", Field.Store.NO));
iw.addDocument(document);
}
try (IndexReader ir = DirectoryReader.open(directory)){
IndexSearcher percolateSearcher = new IndexSearcher(ir);
Query query =
fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher, v);
TopDocs topDocs = shardSearcher.search(query, 10);
assertEquals(2L, topDocs.totalHits);
assertEquals(2, topDocs.scoreDocs.length);
assertEquals(0, topDocs.scoreDocs[0].doc);
assertEquals(2, topDocs.scoreDocs[1].doc);
query = new ConstantScoreQuery(query);
topDocs = shardSearcher.search(query, 10);
assertEquals(2L, topDocs.totalHits);
assertEquals(2, topDocs.scoreDocs.length);
assertEquals(0, topDocs.scoreDocs[0].doc);
assertEquals(2, topDocs.scoreDocs[1].doc);
}
}
// This will trigger using the TermInSetQuery instead of individual term query clauses in the CoveringQuery:
try (RAMDirectory directory = new RAMDirectory()) {
try (IndexWriter iw = new IndexWriter(directory, newIndexWriterConfig())) {
Document document = new Document();
for (int i = 0; i < 1025; i++) {
int fieldNumber = 2 + i;
document.add(new StringField("field", "value" + fieldNumber, Field.Store.NO));
}
iw.addDocument(document);
}
try (IndexReader ir = DirectoryReader.open(directory)){
IndexSearcher percolateSearcher = new IndexSearcher(ir);
PercolateQuery query = (PercolateQuery)
fieldType.percolateQuery("_name", queryStore, Collections.singletonList(new BytesArray("{}")), percolateSearcher, v);
BooleanQuery candidateQuery = (BooleanQuery) query.getCandidateMatchesQuery();
assertThat(candidateQuery.clauses().get(0).getQuery(), instanceOf(TermInSetQuery.class));
TopDocs topDocs = shardSearcher.search(query, 10);
assertEquals(2L, topDocs.totalHits);
assertEquals(2, topDocs.scoreDocs.length);
assertEquals(1, topDocs.scoreDocs[0].doc);
assertEquals(2, topDocs.scoreDocs[1].doc);
topDocs = shardSearcher.search(new ConstantScoreQuery(query), 10);
assertEquals(2L, topDocs.totalHits);
assertEquals(2, topDocs.scoreDocs.length);
assertEquals(1, topDocs.scoreDocs[0].doc);
assertEquals(2, topDocs.scoreDocs[1].doc);
}
}
}
private void duelRun(PercolateQuery.QueryStore queryStore, MemoryIndex memoryIndex, IndexSearcher shardSearcher) throws IOException {
boolean requireScore = randomBoolean();
IndexSearcher percolateSearcher = memoryIndex.createSearcher();
Query percolateQuery = fieldType.percolateQuery("_name", queryStore,
Collections.singletonList(new BytesArray("{}")), percolateSearcher);
Collections.singletonList(new BytesArray("{}")), percolateSearcher, Version.CURRENT);
Query query = requireScore ? percolateQuery : new ConstantScoreQuery(percolateQuery);
TopDocs topDocs = shardSearcher.search(query, 10);
@ -499,7 +592,7 @@ public class CandidateQueryTests extends ESSingleNodeTestCase {
IndexSearcher shardSearcher) throws IOException {
IndexSearcher percolateSearcher = memoryIndex.createSearcher();
Query percolateQuery = fieldType.percolateQuery("_name", queryStore,
Collections.singletonList(new BytesArray("{}")), percolateSearcher);
Collections.singletonList(new BytesArray("{}")), percolateSearcher, Version.CURRENT);
return shardSearcher.search(percolateQuery, 10);
}


@ -28,10 +28,8 @@ import org.apache.lucene.document.IntPoint;
import org.apache.lucene.document.LongPoint;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexableField;
import org.apache.lucene.index.PrefixCodedTerms;
import org.apache.lucene.index.Term;
import org.apache.lucene.index.memory.MemoryIndex;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanClause.Occur;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.PhraseQuery;
@ -43,6 +41,7 @@ import org.apache.lucene.search.join.ScoreMode;
import org.apache.lucene.util.BytesRef;
import org.elasticsearch.action.support.PlainActionFuture;
import org.elasticsearch.common.bytes.BytesArray;
import org.elasticsearch.common.collect.Tuple;
import org.elasticsearch.common.compress.CompressedXContent;
import org.elasticsearch.common.hash.MurmurHash3;
import org.elasticsearch.common.io.stream.InputStreamStreamInput;
@ -115,6 +114,7 @@ import static org.elasticsearch.percolator.PercolatorFieldMapper.EXTRACTION_PART
import static org.hamcrest.Matchers.containsString;
import static org.hamcrest.Matchers.equalTo;
import static org.hamcrest.Matchers.instanceOf;
import static org.hamcrest.Matchers.is;
public class PercolatorFieldMapperTests extends ESSingleNodeTestCase {
@ -171,9 +171,9 @@ public class PercolatorFieldMapperTests extends ESSingleNodeTestCase {
addQueryFieldMappings();
BooleanQuery.Builder bq = new BooleanQuery.Builder();
TermQuery termQuery1 = new TermQuery(new Term("field", "term1"));
bq.add(termQuery1, BooleanClause.Occur.SHOULD);
bq.add(termQuery1, Occur.SHOULD);
TermQuery termQuery2 = new TermQuery(new Term("field", "term2"));
bq.add(termQuery2, BooleanClause.Occur.SHOULD);
bq.add(termQuery2, Occur.SHOULD);
DocumentMapper documentMapper = mapperService.documentMapper("doc");
PercolatorFieldMapper fieldMapper = (PercolatorFieldMapper) documentMapper.mappers().getMapper(fieldName);
@ -189,6 +189,31 @@ public class PercolatorFieldMapperTests extends ESSingleNodeTestCase {
assertThat(fields.size(), equalTo(2));
assertThat(fields.get(0).binaryValue().utf8ToString(), equalTo("field\u0000term1"));
assertThat(fields.get(1).binaryValue().utf8ToString(), equalTo("field\u0000term2"));
fields = new ArrayList<>(Arrays.asList(document.getFields(fieldType.minimumShouldMatchField.name())));
assertThat(fields.size(), equalTo(1));
assertThat(fields.get(0).numericValue(), equalTo(1L));
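// (Only one of the two SHOULD clauses needs to match, hence the stored minimum_should_match is 1.)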
// Now test conjunction:
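// Both MUST clauses should be extracted and the number of clauses (2) stored in the minimum_should_match field.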
bq = new BooleanQuery.Builder();
bq.add(termQuery1, Occur.MUST);
bq.add(termQuery2, Occur.MUST);
parseContext = new ParseContext.InternalParseContext(Settings.EMPTY, mapperService.documentMapperParser(),
documentMapper, null, null);
fieldMapper.processQuery(bq.build(), parseContext);
document = parseContext.doc();
assertThat(document.getField(fieldType.extractionResultField.name()).stringValue(), equalTo(EXTRACTION_COMPLETE));
fields = new ArrayList<>(Arrays.asList(document.getFields(fieldType.queryTermsField.name())));
fields.sort(Comparator.comparing(IndexableField::binaryValue));
assertThat(fields.size(), equalTo(2));
assertThat(fields.get(0).binaryValue().utf8ToString(), equalTo("field\u0000term1"));
assertThat(fields.get(1).binaryValue().utf8ToString(), equalTo("field\u0000term2"));
fields = new ArrayList<>(Arrays.asList(document.getFields(fieldType.minimumShouldMatchField.name())));
assertThat(fields.size(), equalTo(1));
assertThat(fields.get(0).numericValue(), equalTo(2L));
}
public void testExtractRanges() throws Exception {
@ -212,9 +237,40 @@ public class PercolatorFieldMapperTests extends ESSingleNodeTestCase {
assertThat(document.getField(fieldType.extractionResultField.name()).stringValue(), equalTo(EXTRACTION_PARTIAL));
List<IndexableField> fields = new ArrayList<>(Arrays.asList(document.getFields(fieldType.rangeField.name())));
fields.sort(Comparator.comparing(IndexableField::binaryValue));
assertThat(fields.size(), equalTo(1));
assertThat(IntPoint.decodeDimension(fields.get(0).binaryValue().bytes, 12), equalTo(15));
assertThat(fields.size(), equalTo(2));
assertThat(IntPoint.decodeDimension(fields.get(0).binaryValue().bytes, 12), equalTo(10));
assertThat(IntPoint.decodeDimension(fields.get(0).binaryValue().bytes, 28), equalTo(20));
assertThat(IntPoint.decodeDimension(fields.get(1).binaryValue().bytes, 12), equalTo(15));
assertThat(IntPoint.decodeDimension(fields.get(1).binaryValue().bytes, 28), equalTo(20));
fields = new ArrayList<>(Arrays.asList(document.getFields(fieldType.minimumShouldMatchField.name())));
assertThat(fields.size(), equalTo(1));
assertThat(fields.get(0).numericValue(), equalTo(1L));
// Range queries on different fields:
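// Ranges on two different fields can both be extracted, so the stored minimum_should_match becomes 2.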
bq = new BooleanQuery.Builder();
bq.add(rangeQuery1, Occur.MUST);
rangeQuery2 = mapperService.documentMapper("doc").mappers().getMapper("number_field2").fieldType()
.rangeQuery(15, 20, true, true, null, null, null, null);
bq.add(rangeQuery2, Occur.MUST);
parseContext = new ParseContext.InternalParseContext(Settings.EMPTY,
mapperService.documentMapperParser(), documentMapper, null, null);
fieldMapper.processQuery(bq.build(), parseContext);
document = parseContext.doc();
assertThat(document.getField(fieldType.extractionResultField.name()).stringValue(), equalTo(EXTRACTION_PARTIAL));
fields = new ArrayList<>(Arrays.asList(document.getFields(fieldType.rangeField.name())));
fields.sort(Comparator.comparing(IndexableField::binaryValue));
assertThat(fields.size(), equalTo(2));
assertThat(IntPoint.decodeDimension(fields.get(0).binaryValue().bytes, 12), equalTo(10));
assertThat(IntPoint.decodeDimension(fields.get(0).binaryValue().bytes, 28), equalTo(20));
assertThat(LongPoint.decodeDimension(fields.get(1).binaryValue().bytes, 8), equalTo(15L));
assertThat(LongPoint.decodeDimension(fields.get(1).binaryValue().bytes, 24), equalTo(20L));
fields = new ArrayList<>(Arrays.asList(document.getFields(fieldType.minimumShouldMatchField.name())));
assertThat(fields.size(), equalTo(1));
assertThat(fields.get(0).numericValue(), equalTo(2L));
}
public void testExtractTermsAndRanges_failed() throws Exception {
@ -243,7 +299,7 @@ public class PercolatorFieldMapperTests extends ESSingleNodeTestCase {
ParseContext.Document document = parseContext.doc();
PercolatorFieldMapper.FieldType fieldType = (PercolatorFieldMapper.FieldType) fieldMapper.fieldType();
assertThat(document.getFields().size(), equalTo(2));
assertThat(document.getFields().size(), equalTo(3));
assertThat(document.getFields().get(0).binaryValue().utf8ToString(), equalTo("field\u0000term"));
assertThat(document.getField(fieldType.extractionResultField.name()).stringValue(), equalTo(EXTRACTION_PARTIAL));
}
@ -260,35 +316,57 @@ public class PercolatorFieldMapperTests extends ESSingleNodeTestCase {
IndexReader indexReader = memoryIndex.createSearcher().getIndexReader();
BooleanQuery candidateQuery = (BooleanQuery) fieldType.createCandidateQuery(indexReader);
assertEquals(3, candidateQuery.clauses().size());
assertEquals(Occur.SHOULD, candidateQuery.clauses().get(0).getOccur());
TermInSetQuery termsQuery = (TermInSetQuery) candidateQuery.clauses().get(0).getQuery();
Tuple<List<Query>, Boolean> t = fieldType.createCandidateQueryClauses(indexReader);
assertTrue(t.v2());
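// t.v2() indicates that each extraction became its own clause (no TermInSetQuery fallback,
// compare testCreateCandidateQuery_largeDocument below).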
List<Query> clauses = t.v1();
clauses.sort(Comparator.comparing(Query::toString));
assertEquals(15, clauses.size());
assertEquals(fieldType.queryTermsField.name() + ":_field3\u0000me", clauses.get(0).toString());
assertEquals(fieldType.queryTermsField.name() + ":_field3\u0000unhide", clauses.get(1).toString());
assertEquals(fieldType.queryTermsField.name() + ":field1\u0000brown", clauses.get(2).toString());
assertEquals(fieldType.queryTermsField.name() + ":field1\u0000dog", clauses.get(3).toString());
assertEquals(fieldType.queryTermsField.name() + ":field1\u0000fox", clauses.get(4).toString());
assertEquals(fieldType.queryTermsField.name() + ":field1\u0000jumps", clauses.get(5).toString());
assertEquals(fieldType.queryTermsField.name() + ":field1\u0000lazy", clauses.get(6).toString());
assertEquals(fieldType.queryTermsField.name() + ":field1\u0000over", clauses.get(7).toString());
assertEquals(fieldType.queryTermsField.name() + ":field1\u0000quick", clauses.get(8).toString());
assertEquals(fieldType.queryTermsField.name() + ":field1\u0000the", clauses.get(9).toString());
assertEquals(fieldType.queryTermsField.name() + ":field2\u0000more", clauses.get(10).toString());
assertEquals(fieldType.queryTermsField.name() + ":field2\u0000some", clauses.get(11).toString());
assertEquals(fieldType.queryTermsField.name() + ":field2\u0000text", clauses.get(12).toString());
assertEquals(fieldType.queryTermsField.name() + ":field4\u0000123", clauses.get(13).toString());
assertThat(clauses.get(14).toString(), containsString(fieldName + ".range_field:<ranges:"));
}
PrefixCodedTerms terms = termsQuery.getTermData();
assertThat(terms.size(), equalTo(14L));
PrefixCodedTerms.TermIterator termIterator = terms.iterator();
assertTermIterator(termIterator, "_field3\u0000me", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "_field3\u0000unhide", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field1\u0000brown", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field1\u0000dog", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field1\u0000fox", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field1\u0000jumps", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field1\u0000lazy", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field1\u0000over", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field1\u0000quick", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field1\u0000the", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field2\u0000more", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field2\u0000some", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field2\u0000text", fieldType.queryTermsField.name());
assertTermIterator(termIterator, "field4\u0000123", fieldType.queryTermsField.name());
assertEquals(Occur.SHOULD, candidateQuery.clauses().get(1).getOccur());
assertEquals(new TermQuery(new Term(fieldType.extractionResultField.name(), EXTRACTION_FAILED)),
candidateQuery.clauses().get(1).getQuery());
public void testCreateCandidateQuery_largeDocument() throws Exception {
addQueryFieldMappings();
assertEquals(Occur.SHOULD, candidateQuery.clauses().get(2).getOccur());
assertThat(candidateQuery.clauses().get(2).getQuery().toString(), containsString(fieldName + ".range_field:<ranges:"));
MemoryIndex memoryIndex = new MemoryIndex(false);
StringBuilder text = new StringBuilder();
for (int i = 0; i < 1023; i++) {
text.append(i).append(' ');
}
memoryIndex.addField("field1", text.toString(), new WhitespaceAnalyzer());
memoryIndex.addField(new LongPoint("field2", 10L), new WhitespaceAnalyzer());
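// 1023 terms plus one range clause add up to exactly 1024 clauses, just within BooleanQuery's default limit.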
IndexReader indexReader = memoryIndex.createSearcher().getIndexReader();
Tuple<List<Query>, Boolean> t = fieldType.createCandidateQueryClauses(indexReader);
assertTrue(t.v2());
List<Query> clauses = t.v1();
assertEquals(1024, clauses.size());
assertThat(clauses.get(1023).toString(), containsString(fieldName + ".range_field:<ranges:"));
// Now push it over the edge, so that it falls back to using a TermInSetQuery
memoryIndex.addField("field2", "value", new WhitespaceAnalyzer());
indexReader = memoryIndex.createSearcher().getIndexReader();
t = fieldType.createCandidateQueryClauses(indexReader);
assertFalse(t.v2());
clauses = t.v1();
assertEquals(2, clauses.size());
TermInSetQuery termInSetQuery = (TermInSetQuery) clauses.get(0);
assertEquals(1024, termInSetQuery.getTermData().size());
assertThat(clauses.get(1).toString(), containsString(fieldName + ".range_field:<ranges:"));
}
public void testCreateCandidateQuery_numberFields() throws Exception {
@ -307,38 +385,17 @@ public class PercolatorFieldMapperTests extends ESSingleNodeTestCase {
IndexReader indexReader = memoryIndex.createSearcher().getIndexReader();
BooleanQuery candidateQuery = (BooleanQuery) fieldType.createCandidateQuery(indexReader);
assertEquals(8, candidateQuery.clauses().size());
assertEquals(Occur.SHOULD, candidateQuery.clauses().get(0).getOccur());
assertEquals(new TermQuery(new Term(fieldType.extractionResultField.name(), EXTRACTION_FAILED)),
candidateQuery.clauses().get(0).getQuery());
assertEquals(Occur.SHOULD, candidateQuery.clauses().get(1).getOccur());
assertThat(candidateQuery.clauses().get(1).getQuery().toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertEquals(Occur.SHOULD, candidateQuery.clauses().get(2).getOccur());
assertThat(candidateQuery.clauses().get(2).getQuery().toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertEquals(Occur.SHOULD, candidateQuery.clauses().get(3).getOccur());
assertThat(candidateQuery.clauses().get(3).getQuery().toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertEquals(Occur.SHOULD, candidateQuery.clauses().get(4).getOccur());
assertThat(candidateQuery.clauses().get(4).getQuery().toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertEquals(Occur.SHOULD, candidateQuery.clauses().get(5).getOccur());
assertThat(candidateQuery.clauses().get(5).getQuery().toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertEquals(Occur.SHOULD, candidateQuery.clauses().get(6).getOccur());
assertThat(candidateQuery.clauses().get(6).getQuery().toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertEquals(Occur.SHOULD, candidateQuery.clauses().get(7).getOccur());
assertThat(candidateQuery.clauses().get(7).getQuery().toString(), containsString(fieldName + ".range_field:<ranges:[["));
}
private void assertTermIterator(PrefixCodedTerms.TermIterator termIterator, String expectedValue, String expectedField) {
assertThat(termIterator.next().utf8ToString(), equalTo(expectedValue));
assertThat(termIterator.field(), equalTo(expectedField));
Tuple<List<Query>, Boolean> t = fieldType.createCandidateQueryClauses(indexReader);
assertThat(t.v2(), is(true));
List<Query> clauses = t.v1();
assertEquals(7, clauses.size());
assertThat(clauses.get(0).toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertThat(clauses.get(1).toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertThat(clauses.get(2).toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertThat(clauses.get(3).toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertThat(clauses.get(4).toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertThat(clauses.get(5).toString(), containsString(fieldName + ".range_field:<ranges:[["));
assertThat(clauses.get(6).toString(), containsString(fieldName + ".range_field:<ranges:[["));
}
public void testPercolatorFieldMapper() throws Exception {
@ -488,7 +545,7 @@ public class PercolatorFieldMapperTests extends ESSingleNodeTestCase {
.field("query_field2", queryBuilder)
.endObject().bytes(),
XContentType.JSON));
assertThat(doc.rootDoc().getFields().size(), equalTo(12)); // also includes all other meta fields
assertThat(doc.rootDoc().getFields().size(), equalTo(14)); // also includes all other meta fields
BytesRef queryBuilderAsBytes = doc.rootDoc().getField("query_field1.query_builder_field").binaryValue();
assertQueryBuilder(queryBuilderAsBytes, queryBuilder);
@ -518,7 +575,7 @@ public class PercolatorFieldMapperTests extends ESSingleNodeTestCase {
.field("query_field", queryBuilder)
.endObject().endObject().bytes(),
XContentType.JSON));
assertThat(doc.rootDoc().getFields().size(), equalTo(9)); // also includes all other meta fields
assertThat(doc.rootDoc().getFields().size(), equalTo(10)); // also includes all other meta fields
BytesRef queryBuilderAsBytes = doc.rootDoc().getField("object_field.query_field.query_builder_field").binaryValue();
assertQueryBuilder(queryBuilderAsBytes, queryBuilder);
@ -529,7 +586,7 @@ public class PercolatorFieldMapperTests extends ESSingleNodeTestCase {
.endArray()
.endObject().bytes(),
XContentType.JSON));
assertThat(doc.rootDoc().getFields().size(), equalTo(9)); // also includes all other meta fields
assertThat(doc.rootDoc().getFields().size(), equalTo(10)); // also includes all other meta fields
queryBuilderAsBytes = doc.rootDoc().getField("object_field.query_field.query_builder_field").binaryValue();
assertQueryBuilder(queryBuilderAsBytes, queryBuilder);
@ -741,90 +798,6 @@ public class PercolatorFieldMapperTests extends ESSingleNodeTestCase {
return Arrays.copyOfRange(source, offset, offset + length);
}
public void testBoostFields() throws Exception {
IndexService indexService = createIndex("another_index");
MapperService mapperService = indexService.mapperService();
String mapper = XContentFactory.jsonBuilder().startObject().startObject("doc")
.startObject("_field_names").field("enabled", false).endObject() // makes testing easier
.startObject("properties")
.startObject("status").field("type", "keyword").endObject()
.startObject("update_field").field("type", "keyword").endObject()
.startObject("price").field("type", "long").endObject()
.startObject("query1").field("type", "percolator")
.startObject("boost_fields").field("status", 0).field("updated_field", 2).endObject()
.endObject()
.startObject("query2").field("type", "percolator").endObject()
.endObject().endObject().endObject().string();
mapperService.merge("doc", new CompressedXContent(mapper), MapperService.MergeReason.MAPPING_UPDATE, false);
DocumentMapper documentMapper = mapperService.documentMapper("doc");
BooleanQuery.Builder bq = new BooleanQuery.Builder();
bq.add(new TermQuery(new Term("status", "updated")), Occur.FILTER);
bq.add(LongPoint.newRangeQuery("price", 5, 10), Occur.FILTER);
// Boost fields will ignore status_field:
PercolatorFieldMapper fieldMapper = (PercolatorFieldMapper) documentMapper.mappers().getMapper("query1");
ParseContext.InternalParseContext parseContext = new ParseContext.InternalParseContext(Settings.EMPTY,
mapperService.documentMapperParser(), documentMapper, null, null);
fieldMapper.processQuery(bq.build(), parseContext);
ParseContext.Document document = parseContext.doc();
PercolatorFieldMapper.FieldType fieldType = (PercolatorFieldMapper.FieldType) fieldMapper.fieldType();
assertThat(document.getField(fieldType.extractionResultField.name()).stringValue(), equalTo(EXTRACTION_PARTIAL));
assertThat(document.getFields(fieldType.queryTermsField.name()).length, equalTo(0));
List<IndexableField> fields = new ArrayList<>(Arrays.asList(document.getFields(fieldType.rangeField.name())));
assertThat(fields.size(), equalTo(1));
assertThat(LongPoint.decodeDimension(subByteArray(fields.get(0).binaryValue().bytes, 8, 8), 0), equalTo(5L));
assertThat(LongPoint.decodeDimension(subByteArray(fields.get(0).binaryValue().bytes, 24, 8), 0), equalTo(10L));
// No boost fields, so default extraction logic:
fieldMapper = (PercolatorFieldMapper) documentMapper.mappers().getMapper("query2");
parseContext = new ParseContext.InternalParseContext(Settings.EMPTY, mapperService.documentMapperParser(),
documentMapper, null, null);
fieldMapper.processQuery(bq.build(), parseContext);
document = parseContext.doc();
fieldType = (PercolatorFieldMapper.FieldType) fieldMapper.fieldType();
assertThat(document.getField(fieldType.extractionResultField.name()).stringValue(), equalTo(EXTRACTION_PARTIAL));
assertThat(document.getFields(fieldType.rangeField.name()).length, equalTo(0));
fields = new ArrayList<>(Arrays.asList(document.getFields(fieldType.queryTermsField.name())));
assertThat(fields.size(), equalTo(1));
assertThat(fields.get(0).binaryValue().utf8ToString(), equalTo("status\0updated"));
// Second clause is extracted, because it is boosted by 2:
bq = new BooleanQuery.Builder();
bq.add(new TermQuery(new Term("status", "updated")), Occur.FILTER);
bq.add(new TermQuery(new Term("updated_field", "done")), Occur.FILTER);
fieldMapper = (PercolatorFieldMapper) documentMapper.mappers().getMapper("query1");
parseContext = new ParseContext.InternalParseContext(Settings.EMPTY, mapperService.documentMapperParser(),
documentMapper, null, null);
fieldMapper.processQuery(bq.build(), parseContext);
document = parseContext.doc();
fieldType = (PercolatorFieldMapper.FieldType) fieldMapper.fieldType();
assertThat(document.getField(fieldType.extractionResultField.name()).stringValue(), equalTo(EXTRACTION_PARTIAL));
assertThat(document.getFields(fieldType.rangeField.name()).length, equalTo(0));
fields = new ArrayList<>(Arrays.asList(document.getFields(fieldType.queryTermsField.name())));
assertThat(fields.size(), equalTo(1));
assertThat(fields.get(0).binaryValue().utf8ToString(), equalTo("updated_field\0done"));
// First clause is extracted, because of the default extraction logic:
bq = new BooleanQuery.Builder();
bq.add(new TermQuery(new Term("status", "updated")), Occur.FILTER);
bq.add(new TermQuery(new Term("updated_field", "done")), Occur.FILTER);
fieldMapper = (PercolatorFieldMapper) documentMapper.mappers().getMapper("query2");
parseContext = new ParseContext.InternalParseContext(Settings.EMPTY, mapperService.documentMapperParser(),
documentMapper, null, null);
fieldMapper.processQuery(bq.build(), parseContext);
document = parseContext.doc();
fieldType = (PercolatorFieldMapper.FieldType) fieldMapper.fieldType();
assertThat(document.getField(fieldType.extractionResultField.name()).stringValue(), equalTo(EXTRACTION_PARTIAL));
assertThat(document.getFields(fieldType.rangeField.name()).length, equalTo(0));
fields = new ArrayList<>(Arrays.asList(document.getFields(fieldType.queryTermsField.name())));
assertThat(fields.size(), equalTo(1));
assertThat(fields.get(0).binaryValue().utf8ToString(), equalTo("status\0updated"));
}
// Just so that we can store scripts in percolator queries, without actually executing them.
public static class FoolMeScriptPlugin extends MockScriptPlugin {

@ -193,6 +193,7 @@ public class PercolatorQuerySearchIT extends ESIntegTestCase {
SearchResponse response = client().prepareSearch()
.setQuery(new PercolateQueryBuilder("query", source, XContentType.JSON))
.get();
logger.info("response={}", response);
assertHitCount(response, 2);
assertThat(response.getHits().getAt(0).getId(), equalTo("3"));
assertThat(response.getHits().getAt(1).getId(), equalTo("1"));
@ -849,34 +850,4 @@ public class PercolatorQuerySearchIT extends ESIntegTestCase {
assertThat(item.getFailureMessage(), containsString("[test/type/6] couldn't be found"));
}
public void testBoostFields() throws Exception {
XContentBuilder mappingSource = XContentFactory.jsonBuilder().startObject().startObject("type")
.startObject("properties")
.startObject("status").field("type", "keyword").endObject()
.startObject("price").field("type", "long").endObject()
.startObject("query").field("type", "percolator")
.startObject("boost_fields").field("status", 0.0F).endObject()
.endObject()
.endObject().endObject().endObject();
assertAcked(client().admin().indices().prepareCreate("test").addMapping("type", mappingSource));
client().prepareIndex("test", "type", "q1")
.setSource(jsonBuilder().startObject().field("query", boolQuery()
.must(matchQuery("status", "sold"))
.must(matchQuery("price", 100))
).endObject())
.get();
refresh();
SearchResponse response = client().prepareSearch()
.setQuery(new PercolateQueryBuilder("query",
XContentFactory.jsonBuilder().startObject()
.field("status", "sold")
.field("price", 100)
.endObject().bytes(), XContentType.JSON))
.get();
assertHitCount(response, 1);
assertThat(response.getHits().getAt(0).getId(), equalTo("q1"));
}
}

@ -52,6 +52,7 @@ import org.apache.lucene.search.spans.SpanNotQuery;
import org.apache.lucene.search.spans.SpanOrQuery;
import org.apache.lucene.search.spans.SpanTermQuery;
import org.apache.lucene.util.BytesRef;
import org.elasticsearch.Version;
import org.elasticsearch.common.lucene.search.function.CombineFunction;
import org.elasticsearch.common.lucene.search.function.FunctionScoreQuery;
import org.elasticsearch.common.lucene.search.function.RandomScoreFunction;
@ -63,12 +64,9 @@ import org.elasticsearch.test.ESTestCase;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;
import java.util.stream.Collectors;
@ -84,8 +82,9 @@ public class QueryAnalyzerTests extends ESTestCase {
public void testExtractQueryMetadata_termQuery() {
TermQuery termQuery = new TermQuery(new Term("_field", "_term"));
Result result = analyze(termQuery, Collections.emptyMap());
Result result = analyze(termQuery, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryExtraction> terms = new ArrayList<>(result.extractions);
assertThat(terms.size(), equalTo(1));
assertThat(terms.get(0).field(), equalTo(termQuery.getTerm().field()));
@ -94,8 +93,9 @@ public class QueryAnalyzerTests extends ESTestCase {
public void testExtractQueryMetadata_termsQuery() {
TermInSetQuery termsQuery = new TermInSetQuery("_field", new BytesRef("_term1"), new BytesRef("_term2"));
Result result = analyze(termsQuery, Collections.emptyMap());
Result result = analyze(termsQuery, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryExtraction> terms = new ArrayList<>(result.extractions);
terms.sort(Comparator.comparing(qt -> qt.term));
assertThat(terms.size(), equalTo(2));
@ -107,23 +107,55 @@ public class QueryAnalyzerTests extends ESTestCase {
public void testExtractQueryMetadata_phraseQuery() {
PhraseQuery phraseQuery = new PhraseQuery("_field", "_term1", "term2");
Result result = analyze(phraseQuery, Collections.emptyMap());
Result result = analyze(phraseQuery, Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(2));
List<QueryExtraction> terms = new ArrayList<>(result.extractions);
assertThat(terms.size(), equalTo(1));
terms.sort(Comparator.comparing(qt -> qt.term));
assertThat(terms.size(), equalTo(2));
assertThat(terms.get(0).field(), equalTo(phraseQuery.getTerms()[0].field()));
assertThat(terms.get(0).bytes(), equalTo(phraseQuery.getTerms()[0].bytes()));
assertThat(terms.get(1).field(), equalTo(phraseQuery.getTerms()[1].field()));
assertThat(terms.get(1).bytes(), equalTo(phraseQuery.getTerms()[1].bytes()));
}
public void testExtractQueryMetadata_multiPhraseQuery() {
MultiPhraseQuery multiPhraseQuery = new MultiPhraseQuery.Builder()
.add(new Term("_field", "_term1"))
.add(new Term[] {new Term("_field", "_term2"), new Term("_field", "_term3")})
.add(new Term[] {new Term("_field", "_term4"), new Term("_field", "_term5")})
.add(new Term[] {new Term("_field", "_term6")})
.build();
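// The phrase has four positions and one term per position must match, so minimumShouldMatch is expected to be 4.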
Result result = analyze(multiPhraseQuery, Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(4));
List<QueryExtraction> terms = new ArrayList<>(result.extractions);
terms.sort(Comparator.comparing(qt -> qt.term));
assertThat(terms.size(), equalTo(6));
assertThat(terms.get(0).field(), equalTo("_field"));
assertThat(terms.get(0).bytes().utf8ToString(), equalTo("_term1"));
assertThat(terms.get(1).field(), equalTo("_field"));
assertThat(terms.get(1).bytes().utf8ToString(), equalTo("_term2"));
assertThat(terms.get(2).field(), equalTo("_field"));
assertThat(terms.get(2).bytes().utf8ToString(), equalTo("_term3"));
assertThat(terms.get(3).field(), equalTo("_field"));
assertThat(terms.get(3).bytes().utf8ToString(), equalTo("_term4"));
assertThat(terms.get(4).field(), equalTo("_field"));
assertThat(terms.get(4).bytes().utf8ToString(), equalTo("_term5"));
assertThat(terms.get(5).field(), equalTo("_field"));
assertThat(terms.get(5).bytes().utf8ToString(), equalTo("_term6"));
}
public void testExtractQueryMetadata_multiPhraseQuery_pre6dot1() {
MultiPhraseQuery multiPhraseQuery = new MultiPhraseQuery.Builder()
.add(new Term("_field", "_long_term"))
.add(new Term[] {new Term("_field", "_long_term"), new Term("_field", "_term")})
.add(new Term[] {new Term("_field", "_long_term"), new Term("_field", "_very_long_term")})
.add(new Term[] {new Term("_field", "_very_long_term")})
.build();
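// Analyzed against a pre 6.1 index version, so only a single extraction is expected and minimumShouldMatch stays 1.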
Result result = analyze(multiPhraseQuery, Collections.emptyMap());
Result result = analyze(multiPhraseQuery, Version.V_6_0_0);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryExtraction> terms = new ArrayList<>(result.extractions);
assertThat(terms.size(), equalTo(1));
assertThat(terms.get(0).field(), equalTo("_field"));
@ -131,6 +163,39 @@ public class QueryAnalyzerTests extends ESTestCase {
}
public void testExtractQueryMetadata_booleanQuery() {
BooleanQuery.Builder builder = new BooleanQuery.Builder();
TermQuery termQuery1 = new TermQuery(new Term("_field", "term0"));
builder.add(termQuery1, BooleanClause.Occur.SHOULD);
PhraseQuery phraseQuery = new PhraseQuery("_field", "term1", "term2");
builder.add(phraseQuery, BooleanClause.Occur.SHOULD);
BooleanQuery.Builder subBuilder = new BooleanQuery.Builder();
TermQuery termQuery2 = new TermQuery(new Term("_field1", "term4"));
subBuilder.add(termQuery2, BooleanClause.Occur.MUST);
TermQuery termQuery3 = new TermQuery(new Term("_field3", "term5"));
subBuilder.add(termQuery3, BooleanClause.Occur.MUST);
builder.add(subBuilder.build(), BooleanClause.Occur.SHOULD);
BooleanQuery booleanQuery = builder.build();
Result result = analyze(booleanQuery, Version.CURRENT);
assertThat("Should clause with phrase query isn't verified, so entire query can't be verified", result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryExtraction> terms = new ArrayList<>(result.extractions);
terms.sort(Comparator.comparing(qt -> qt.term));
assertThat(terms.size(), equalTo(5));
assertThat(terms.get(0).field(), equalTo(termQuery1.getTerm().field()));
assertThat(terms.get(0).bytes(), equalTo(termQuery1.getTerm().bytes()));
assertThat(terms.get(1).field(), equalTo(phraseQuery.getTerms()[0].field()));
assertThat(terms.get(1).bytes(), equalTo(phraseQuery.getTerms()[0].bytes()));
assertThat(terms.get(2).field(), equalTo(phraseQuery.getTerms()[1].field()));
assertThat(terms.get(2).bytes(), equalTo(phraseQuery.getTerms()[1].bytes()));
assertThat(terms.get(3).field(), equalTo(termQuery2.getTerm().field()));
assertThat(terms.get(3).bytes(), equalTo(termQuery2.getTerm().bytes()));
assertThat(terms.get(4).field(), equalTo(termQuery3.getTerm().field()));
assertThat(terms.get(4).bytes(), equalTo(termQuery3.getTerm().bytes()));
}
public void testExtractQueryMetadata_booleanQuery_pre6dot1() {
BooleanQuery.Builder builder = new BooleanQuery.Builder();
TermQuery termQuery1 = new TermQuery(new Term("_field", "_term"));
builder.add(termQuery1, BooleanClause.Occur.SHOULD);
@ -145,8 +210,9 @@ public class QueryAnalyzerTests extends ESTestCase {
builder.add(subBuilder.build(), BooleanClause.Occur.SHOULD);
BooleanQuery booleanQuery = builder.build();
Result result = analyze(booleanQuery, Collections.emptyMap());
Result result = analyze(booleanQuery, Version.V_6_0_0);
assertThat("Should clause with phrase query isn't verified, so entire query can't be verified", result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryExtraction> terms = new ArrayList<>(result.extractions);
terms.sort(Comparator.comparing(qt -> qt.term));
assertThat(terms.size(), equalTo(3));
@ -173,8 +239,9 @@ public class QueryAnalyzerTests extends ESTestCase {
builder.add(subBuilder.build(), BooleanClause.Occur.SHOULD);
BooleanQuery booleanQuery = builder.build();
Result result = analyze(booleanQuery, Collections.emptyMap());
Result result = analyze(booleanQuery, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryAnalyzer.QueryExtraction> terms = new ArrayList<>(result.extractions);
terms.sort(Comparator.comparing(qt -> qt.term));
assertThat(terms.size(), equalTo(4));
@ -196,12 +263,16 @@ public class QueryAnalyzerTests extends ESTestCase {
builder.add(phraseQuery, BooleanClause.Occur.SHOULD);
BooleanQuery booleanQuery = builder.build();
Result result = analyze(booleanQuery, Collections.emptyMap());
Result result = analyze(booleanQuery, Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(2));
List<QueryExtraction> terms = new ArrayList<>(result.extractions);
assertThat(terms.size(), equalTo(1));
assertThat(terms.size(), equalTo(2));
terms.sort(Comparator.comparing(qt -> qt.term));
assertThat(terms.get(0).field(), equalTo(phraseQuery.getTerms()[0].field()));
assertThat(terms.get(0).bytes(), equalTo(phraseQuery.getTerms()[0].bytes()));
assertThat(terms.get(1).field(), equalTo(phraseQuery.getTerms()[1].field()));
assertThat(terms.get(1).bytes(), equalTo(phraseQuery.getTerms()[1].bytes()));
}
public void testExactMatch_booleanQuery() {
@ -210,59 +281,119 @@ public class QueryAnalyzerTests extends ESTestCase {
builder.add(termQuery1, BooleanClause.Occur.SHOULD);
TermQuery termQuery2 = new TermQuery(new Term("_field", "_term2"));
builder.add(termQuery2, BooleanClause.Occur.SHOULD);
Result result = analyze(builder.build(), Collections.emptyMap());
Result result = analyze(builder.build(), Version.CURRENT);
assertThat("All clauses are exact, so candidate matches are verified", result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
builder = new BooleanQuery.Builder();
builder.add(termQuery1, BooleanClause.Occur.SHOULD);
PhraseQuery phraseQuery1 = new PhraseQuery("_field", "_term1", "_term2");
builder.add(phraseQuery1, BooleanClause.Occur.SHOULD);
result = analyze(builder.build(), Collections.emptyMap());
result = analyze(builder.build(), Version.CURRENT);
assertThat("Clause isn't exact, so candidate matches are not verified", result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
builder = new BooleanQuery.Builder();
builder.add(phraseQuery1, BooleanClause.Occur.SHOULD);
PhraseQuery phraseQuery2 = new PhraseQuery("_field", "_term3", "_term4");
builder.add(phraseQuery2, BooleanClause.Occur.SHOULD);
result = analyze(builder.build(), Collections.emptyMap());
result = analyze(builder.build(), Version.CURRENT);
assertThat("No clause is exact, so candidate matches are not verified", result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(2));
builder = new BooleanQuery.Builder();
builder.add(termQuery1, BooleanClause.Occur.MUST_NOT);
builder.add(termQuery2, BooleanClause.Occur.SHOULD);
result = analyze(builder.build(), Collections.emptyMap());
result = analyze(builder.build(), Version.CURRENT);
assertThat("There is a must_not clause, so candidate matches are not verified", result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
builder = new BooleanQuery.Builder();
builder.setMinimumNumberShouldMatch(randomIntBetween(2, 32));
builder.add(termQuery1, BooleanClause.Occur.SHOULD);
builder.add(termQuery2, BooleanClause.Occur.SHOULD);
result = analyze(builder.build(), Collections.emptyMap());
assertThat("Minimum match is >= 1, so candidate matches are not verified", result.verified, is(false));
result = analyze(builder.build(), Version.CURRENT);
assertThat("Minimum match has not impact on whether the result is verified", result.verified, is(true));
assertThat("msm is at least two so result.minimumShouldMatch should 2 too", result.minimumShouldMatch, equalTo(2));
builder = new BooleanQuery.Builder();
builder.add(termQuery1, randomBoolean() ? BooleanClause.Occur.MUST : BooleanClause.Occur.FILTER);
result = analyze(builder.build(), Collections.emptyMap());
assertThat("Single required clause, so candidate matches are verified", result.verified, is(false));
result = analyze(builder.build(), Version.CURRENT);
assertThat("Also required clauses are taken into account whether the result is verified", result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
builder = new BooleanQuery.Builder();
builder.add(termQuery1, randomBoolean() ? BooleanClause.Occur.MUST : BooleanClause.Occur.FILTER);
builder.add(termQuery2, randomBoolean() ? BooleanClause.Occur.MUST : BooleanClause.Occur.FILTER);
result = analyze(builder.build(), Collections.emptyMap());
assertThat("Two or more required clauses, so candidate matches are not verified", result.verified, is(false));
result = analyze(builder.build(), Version.CURRENT);
assertThat("Also required clauses are taken into account whether the result is verified", result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(2));
builder = new BooleanQuery.Builder();
builder.add(termQuery1, randomBoolean() ? BooleanClause.Occur.MUST : BooleanClause.Occur.FILTER);
builder.add(termQuery2, BooleanClause.Occur.MUST_NOT);
result = analyze(builder.build(), Collections.emptyMap());
assertThat("Required and prohibited clauses, so candidate matches are not verified", result.verified, is(false));
result = analyze(builder.build(), Version.CURRENT);
assertThat("Prohibited clause, so candidate matches are not verified", result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
}
public void testBooleanQueryWithMustAndShouldClauses() {
BooleanQuery.Builder builder = new BooleanQuery.Builder();
TermQuery termQuery1 = new TermQuery(new Term("_field", "_term1"));
builder.add(termQuery1, BooleanClause.Occur.SHOULD);
TermQuery termQuery2 = new TermQuery(new Term("_field", "_term2"));
builder.add(termQuery2, BooleanClause.Occur.SHOULD);
TermQuery termQuery3 = new TermQuery(new Term("_field", "_term3"));
builder.add(termQuery3, BooleanClause.Occur.MUST);
Result result = analyze(builder.build(), Version.CURRENT);
assertThat("Must clause is exact, so this is a verified candidate match", result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
assertThat(result.extractions.size(), equalTo(1));
List<QueryExtraction> extractions = new ArrayList<>(result.extractions);
assertThat(extractions.get(0).term, equalTo(new Term("_field", "_term3")));
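// With the default minimumNumberShouldMatch of 0 the optional clauses can be ignored, so only the required clause is extracted.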
builder.setMinimumNumberShouldMatch(1);
result = analyze(builder.build(), Version.CURRENT);
assertThat("Must clause is exact, but m_s_m is 1 so one should clause must match too", result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
assertThat(result.extractions.size(), equalTo(1));
extractions = new ArrayList<>(result.extractions);
assertThat(extractions.get(0).term, equalTo(new Term("_field", "_term3")));
builder = new BooleanQuery.Builder();
BooleanQuery.Builder innerBuilder = new BooleanQuery.Builder();
innerBuilder.setMinimumNumberShouldMatch(2);
innerBuilder.add(termQuery1, BooleanClause.Occur.SHOULD);
innerBuilder.add(termQuery2, BooleanClause.Occur.SHOULD);
builder.add(innerBuilder.build(), BooleanClause.Occur.MUST);
builder.add(termQuery3, BooleanClause.Occur.MUST);
result = analyze(builder.build(), Version.CURRENT);
assertThat("Verified, because m_s_m is specified in an inner clause and not top level clause", result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(3));
assertThat(result.extractions.size(), equalTo(3));
extractions = new ArrayList<>(result.extractions);
extractions.sort(Comparator.comparing(key -> key.term));
assertThat(extractions.get(0).term, equalTo(new Term("_field", "_term1")));
assertThat(extractions.get(1).term, equalTo(new Term("_field", "_term2")));
assertThat(extractions.get(2).term, equalTo(new Term("_field", "_term3")));
builder = new BooleanQuery.Builder();
builder.add(innerBuilder.build(), BooleanClause.Occur.SHOULD);
builder.add(termQuery3, BooleanClause.Occur.MUST);
result = analyze(builder.build(), Version.CURRENT);
assertThat("Verified, because m_s_m is specified in an inner clause and not top level clause", result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
assertThat(result.extractions.size(), equalTo(1));
extractions = new ArrayList<>(result.extractions);
assertThat(extractions.get(0).term, equalTo(new Term("_field", "_term3")));
}
public void testExtractQueryMetadata_constantScoreQuery() {
TermQuery termQuery1 = new TermQuery(new Term("_field", "_term"));
ConstantScoreQuery constantScoreQuery = new ConstantScoreQuery(termQuery1);
Result result = analyze(constantScoreQuery, Collections.emptyMap());
Result result = analyze(constantScoreQuery, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryExtraction> terms = new ArrayList<>(result.extractions);
assertThat(terms.size(), equalTo(1));
assertThat(terms.get(0).field(), equalTo(termQuery1.getTerm().field()));
@ -272,8 +403,9 @@ public class QueryAnalyzerTests extends ESTestCase {
public void testExtractQueryMetadata_boostQuery() {
TermQuery termQuery1 = new TermQuery(new Term("_field", "_term"));
BoostQuery constantScoreQuery = new BoostQuery(termQuery1, 1f);
Result result = analyze(constantScoreQuery, Collections.emptyMap());
Result result = analyze(constantScoreQuery, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryExtraction> terms = new ArrayList<>(result.extractions);
assertThat(terms.size(), equalTo(1));
assertThat(terms.get(0).field(), equalTo(termQuery1.getTerm().field()));
@ -284,11 +416,13 @@ public class QueryAnalyzerTests extends ESTestCase {
CommonTermsQuery commonTermsQuery = new CommonTermsQuery(BooleanClause.Occur.SHOULD, BooleanClause.Occur.SHOULD, 100);
commonTermsQuery.add(new Term("_field", "_term1"));
commonTermsQuery.add(new Term("_field", "_term2"));
Result result = analyze(commonTermsQuery, Collections.emptyMap());
Result result = analyze(commonTermsQuery, Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryExtraction> terms = new ArrayList<>(result.extractions);
terms.sort(Comparator.comparing(qt -> qt.term));
assertThat(terms.size(), equalTo(2));
assertThat(result.minimumShouldMatch, equalTo(1));
assertThat(terms.get(0).field(), equalTo("_field"));
assertThat(terms.get(0).text(), equalTo("_term1"));
assertThat(terms.get(1).field(), equalTo("_field"));
@ -298,8 +432,9 @@ public class QueryAnalyzerTests extends ESTestCase {
public void testExtractQueryMetadata_blendedTermQuery() {
Term[] termsArr = new Term[]{new Term("_field", "_term1"), new Term("_field", "_term2")};
BlendedTermQuery commonTermsQuery = BlendedTermQuery.dismaxBlendedQuery(termsArr, 1.0f);
Result result = analyze(commonTermsQuery, Collections.emptyMap());
Result result = analyze(commonTermsQuery, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryAnalyzer.QueryExtraction> terms = new ArrayList<>(result.extractions);
terms.sort(Comparator.comparing(qt -> qt.term));
assertThat(terms.size(), equalTo(2));
@ -322,8 +457,9 @@ public class QueryAnalyzerTests extends ESTestCase {
// 4) FieldMaskingSpanQuery is a tricky query so we shouldn't optimize this
SpanTermQuery spanTermQuery1 = new SpanTermQuery(new Term("_field", "_short_term"));
Result result = analyze(spanTermQuery1, Collections.emptyMap());
Result result = analyze(spanTermQuery1, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, spanTermQuery1.getTerm());
}
@ -333,8 +469,21 @@ public class QueryAnalyzerTests extends ESTestCase {
SpanNearQuery spanNearQuery = new SpanNearQuery.Builder("_field", true)
.addClause(spanTermQuery1).addClause(spanTermQuery2).build();
Result result = analyze(spanNearQuery, Collections.emptyMap());
Result result = analyze(spanNearQuery, Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(2));
assertTermsEqual(result.extractions, spanTermQuery1.getTerm(), spanTermQuery2.getTerm());
}
public void testExtractQueryMetadata_spanNearQuery_pre6dot1() {
SpanTermQuery spanTermQuery1 = new SpanTermQuery(new Term("_field", "_short_term"));
SpanTermQuery spanTermQuery2 = new SpanTermQuery(new Term("_field", "_very_long_term"));
SpanNearQuery spanNearQuery = new SpanNearQuery.Builder("_field", true)
.addClause(spanTermQuery1).addClause(spanTermQuery2).build();
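// For pre 6.1 indices only the longest term (_very_long_term) is expected to be extracted.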
Result result = analyze(spanNearQuery, Version.V_6_0_0);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, spanTermQuery2.getTerm());
}
@ -342,16 +491,18 @@ public class QueryAnalyzerTests extends ESTestCase {
SpanTermQuery spanTermQuery1 = new SpanTermQuery(new Term("_field", "_short_term"));
SpanTermQuery spanTermQuery2 = new SpanTermQuery(new Term("_field", "_very_long_term"));
SpanOrQuery spanOrQuery = new SpanOrQuery(spanTermQuery1, spanTermQuery2);
Result result = analyze(spanOrQuery, Collections.emptyMap());
Result result = analyze(spanOrQuery, Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, spanTermQuery1.getTerm(), spanTermQuery2.getTerm());
}
public void testExtractQueryMetadata_spanFirstQuery() {
SpanTermQuery spanTermQuery1 = new SpanTermQuery(new Term("_field", "_short_term"));
SpanFirstQuery spanFirstQuery = new SpanFirstQuery(spanTermQuery1, 20);
Result result = analyze(spanFirstQuery, Collections.emptyMap());
Result result = analyze(spanFirstQuery, Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, spanTermQuery1.getTerm());
}
@ -359,47 +510,54 @@ public class QueryAnalyzerTests extends ESTestCase {
SpanTermQuery spanTermQuery1 = new SpanTermQuery(new Term("_field", "_short_term"));
SpanTermQuery spanTermQuery2 = new SpanTermQuery(new Term("_field", "_very_long_term"));
SpanNotQuery spanNotQuery = new SpanNotQuery(spanTermQuery1, spanTermQuery2);
Result result = analyze(spanNotQuery, Collections.emptyMap());
Result result = analyze(spanNotQuery, Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, spanTermQuery1.getTerm());
}
public void testExtractQueryMetadata_matchNoDocsQuery() {
Result result = analyze(new MatchNoDocsQuery("sometimes there is no reason at all"), Collections.emptyMap());
Result result = analyze(new MatchNoDocsQuery("sometimes there is no reason at all"), Version.CURRENT);
assertThat(result.verified, is(true));
assertEquals(0, result.extractions.size());
assertThat(result.minimumShouldMatch, equalTo(1));
BooleanQuery.Builder bq = new BooleanQuery.Builder();
bq.add(new TermQuery(new Term("field", "value")), BooleanClause.Occur.MUST);
bq.add(new MatchNoDocsQuery("sometimes there is no reason at all"), BooleanClause.Occur.MUST);
result = analyze(bq.build(), Collections.emptyMap());
assertThat(result.verified, is(false));
assertEquals(0, result.extractions.size());
result = analyze(bq.build(), Version.CURRENT);
assertThat(result.verified, is(true));
assertEquals(1, result.extractions.size());
assertThat(result.minimumShouldMatch, equalTo(2));
assertTermsEqual(result.extractions, new Term("field", "value"));
bq = new BooleanQuery.Builder();
bq.add(new TermQuery(new Term("field", "value")), BooleanClause.Occur.SHOULD);
bq.add(new MatchNoDocsQuery("sometimes there is no reason at all"), BooleanClause.Occur.SHOULD);
result = analyze(bq.build(), Collections.emptyMap());
result = analyze(bq.build(), Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, new Term("field", "value"));
DisjunctionMaxQuery disjunctionMaxQuery = new DisjunctionMaxQuery(
Arrays.asList(new TermQuery(new Term("field", "value")), new MatchNoDocsQuery("sometimes there is no reason at all")),
1f
);
result = analyze(disjunctionMaxQuery, Collections.emptyMap());
result = analyze(disjunctionMaxQuery, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, new Term("field", "value"));
}
public void testExtractQueryMetadata_matchAllDocsQuery() {
expectThrows(UnsupportedQueryException.class, () -> analyze(new MatchAllDocsQuery(), Collections.emptyMap()));
expectThrows(UnsupportedQueryException.class, () -> analyze(new MatchAllDocsQuery(), Version.CURRENT));
BooleanQuery.Builder builder = new BooleanQuery.Builder();
builder.add(new TermQuery(new Term("field", "value")), BooleanClause.Occur.MUST);
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST);
Result result = analyze(builder.build(), Collections.emptyMap());
Result result = analyze(builder.build(), Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, new Term("field", "value"));
builder = new BooleanQuery.Builder();
@ -407,40 +565,40 @@ public class QueryAnalyzerTests extends ESTestCase {
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST);
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST);
BooleanQuery bq1 = builder.build();
expectThrows(UnsupportedQueryException.class, () -> analyze(bq1, Collections.emptyMap()));
expectThrows(UnsupportedQueryException.class, () -> analyze(bq1, Version.CURRENT));
builder = new BooleanQuery.Builder();
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST_NOT);
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST);
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST);
BooleanQuery bq2 = builder.build();
expectThrows(UnsupportedQueryException.class, () -> analyze(bq2, Collections.emptyMap()));
expectThrows(UnsupportedQueryException.class, () -> analyze(bq2, Version.CURRENT));
builder = new BooleanQuery.Builder();
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD);
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD);
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD);
BooleanQuery bq3 = builder.build();
expectThrows(UnsupportedQueryException.class, () -> analyze(bq3, Collections.emptyMap()));
expectThrows(UnsupportedQueryException.class, () -> analyze(bq3, Version.CURRENT));
builder = new BooleanQuery.Builder();
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.MUST_NOT);
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD);
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD);
BooleanQuery bq4 = builder.build();
expectThrows(UnsupportedQueryException.class, () -> analyze(bq4, Collections.emptyMap()));
expectThrows(UnsupportedQueryException.class, () -> analyze(bq4, Version.CURRENT));
builder = new BooleanQuery.Builder();
builder.add(new TermQuery(new Term("field", "value")), BooleanClause.Occur.SHOULD);
builder.add(new MatchAllDocsQuery(), BooleanClause.Occur.SHOULD);
BooleanQuery bq5 = builder.build();
expectThrows(UnsupportedQueryException.class, () -> analyze(bq5, Collections.emptyMap()));
expectThrows(UnsupportedQueryException.class, () -> analyze(bq5, Version.CURRENT));
}
public void testExtractQueryMetadata_unsupportedQuery() {
TermRangeQuery termRangeQuery = new TermRangeQuery("_field", null, null, true, false);
UnsupportedQueryException e = expectThrows(UnsupportedQueryException.class,
() -> analyze(termRangeQuery, Collections.emptyMap()));
() -> analyze(termRangeQuery, Version.CURRENT));
assertThat(e.getUnsupportedQuery(), sameInstance(termRangeQuery));
TermQuery termQuery1 = new TermQuery(new Term("_field", "_term"));
@ -449,7 +607,7 @@ public class QueryAnalyzerTests extends ESTestCase {
builder.add(termRangeQuery, BooleanClause.Occur.SHOULD);
BooleanQuery bq = builder.build();
e = expectThrows(UnsupportedQueryException.class, () -> analyze(bq, Collections.emptyMap()));
e = expectThrows(UnsupportedQueryException.class, () -> analyze(bq, Version.CURRENT));
assertThat(e.getUnsupportedQuery(), sameInstance(termRangeQuery));
}
@ -462,8 +620,9 @@ public class QueryAnalyzerTests extends ESTestCase {
builder.add(unsupportedQuery, BooleanClause.Occur.MUST);
BooleanQuery bq1 = builder.build();
Result result = analyze(bq1, Collections.emptyMap());
Result result = analyze(bq1, Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, termQuery1.getTerm());
TermQuery termQuery2 = new TermQuery(new Term("_field", "_longer_term"));
@ -472,15 +631,16 @@ public class QueryAnalyzerTests extends ESTestCase {
builder.add(termQuery2, BooleanClause.Occur.MUST);
builder.add(unsupportedQuery, BooleanClause.Occur.MUST);
bq1 = builder.build();
result = analyze(bq1, Collections.emptyMap());
result = analyze(bq1, Version.CURRENT);
assertThat(result.verified, is(false));
assertTermsEqual(result.extractions, termQuery2.getTerm());
assertThat(result.minimumShouldMatch, equalTo(2));
assertTermsEqual(result.extractions, termQuery1.getTerm(), termQuery2.getTerm());
builder = new BooleanQuery.Builder();
builder.add(unsupportedQuery, BooleanClause.Occur.MUST);
builder.add(unsupportedQuery, BooleanClause.Occur.MUST);
BooleanQuery bq2 = builder.build();
UnsupportedQueryException e = expectThrows(UnsupportedQueryException.class, () -> analyze(bq2, Collections.emptyMap()));
UnsupportedQueryException e = expectThrows(UnsupportedQueryException.class, () -> analyze(bq2, Version.CURRENT));
assertThat(e.getUnsupportedQuery(), sameInstance(unsupportedQuery));
}
@ -493,8 +653,9 @@ public class QueryAnalyzerTests extends ESTestCase {
Arrays.asList(termQuery1, termQuery2, termQuery3, termQuery4), 0.1f
);
Result result = analyze(disjunctionMaxQuery, Collections.emptyMap());
Result result = analyze(disjunctionMaxQuery, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryAnalyzer.QueryExtraction> terms = new ArrayList<>(result.extractions);
terms.sort(Comparator.comparing(qt -> qt.term));
assertThat(terms.size(), equalTo(4));
@ -511,8 +672,9 @@ public class QueryAnalyzerTests extends ESTestCase {
Arrays.asList(termQuery1, termQuery2, termQuery3, new PhraseQuery("_field", "_term4")), 0.1f
);
result = analyze(disjunctionMaxQuery, Collections.emptyMap());
result = analyze(disjunctionMaxQuery, Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
terms = new ArrayList<>(result.extractions);
terms.sort(Comparator.comparing(qt -> qt.term));
assertThat(terms.size(), equalTo(4));
@ -528,148 +690,91 @@ public class QueryAnalyzerTests extends ESTestCase {
public void testSynonymQuery() {
SynonymQuery query = new SynonymQuery();
Result result = analyze(query, Collections.emptyMap());
Result result = analyze(query, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
assertThat(result.extractions.isEmpty(), is(true));
query = new SynonymQuery(new Term("_field", "_value1"), new Term("_field", "_value2"));
result = analyze(query, Collections.emptyMap());
result = analyze(query, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, new Term("_field", "_value1"), new Term("_field", "_value2"));
}
public void testFunctionScoreQuery() {
TermQuery termQuery = new TermQuery(new Term("_field", "_value"));
FunctionScoreQuery functionScoreQuery = new FunctionScoreQuery(termQuery, new RandomScoreFunction(0, 0, null));
Result result = analyze(functionScoreQuery, Collections.emptyMap());
Result result = analyze(functionScoreQuery, Version.CURRENT);
assertThat(result.verified, is(true));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, new Term("_field", "_value"));
functionScoreQuery = new FunctionScoreQuery(termQuery, new RandomScoreFunction(0, 0, null),
CombineFunction.MULTIPLY, 1f, 10f);
result = analyze(functionScoreQuery, Collections.emptyMap());
result = analyze(functionScoreQuery, Version.CURRENT);
assertThat(result.verified, is(false));
assertThat(result.minimumShouldMatch, equalTo(1));
assertTermsEqual(result.extractions, new Term("_field", "_value"));
}
public void testSelectBestExtraction() {
Set<QueryExtraction> queryTerms1 = terms(new int[0], "12", "1234", "12345");
Set<QueryAnalyzer.QueryExtraction> queryTerms2 = terms(new int[0], "123", "1234", "12345");
Set<QueryExtraction> result = selectBestExtraction(Collections.emptyMap(), queryTerms1, queryTerms2);
Set<QueryExtraction> result = selectBestExtraction(queryTerms1, queryTerms2);
assertSame(queryTerms2, result);
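// queryTerms2 wins because its shortest term ("123") is longer than queryTerms1's shortest term ("12").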
queryTerms1 = terms(new int[]{1, 2, 3});
queryTerms2 = terms(new int[]{2, 3, 4});
result = selectBestExtraction(Collections.emptyMap(), queryTerms1, queryTerms2);
result = selectBestExtraction(queryTerms1, queryTerms2);
assertSame(queryTerms1, result);
queryTerms1 = terms(new int[]{4, 5, 6});
queryTerms2 = terms(new int[]{1, 2, 3});
result = selectBestExtraction(Collections.emptyMap(), queryTerms1, queryTerms2);
result = selectBestExtraction(queryTerms1, queryTerms2);
assertSame(queryTerms2, result);
queryTerms1 = terms(new int[]{1, 2, 3}, "123", "456");
queryTerms2 = terms(new int[]{2, 3, 4}, "123", "456");
result = selectBestExtraction(Collections.emptyMap(), queryTerms1, queryTerms2);
result = selectBestExtraction(queryTerms1, queryTerms2);
assertSame(queryTerms1, result);
queryTerms1 = terms(new int[]{10});
queryTerms2 = terms(new int[]{1});
result = selectBestExtraction(Collections.emptyMap(), queryTerms1, queryTerms2);
result = selectBestExtraction(queryTerms1, queryTerms2);
assertSame(queryTerms2, result);
queryTerms1 = terms(new int[]{10}, "123");
queryTerms2 = terms(new int[]{1});
result = selectBestExtraction(Collections.emptyMap(), queryTerms1, queryTerms2);
result = selectBestExtraction(queryTerms1, queryTerms2);
assertSame(queryTerms1, result);
queryTerms1 = terms(new int[]{10}, "1", "123");
queryTerms2 = terms(new int[]{1}, "1", "2");
result = selectBestExtraction(Collections.emptyMap(), queryTerms1, queryTerms2);
result = selectBestExtraction(queryTerms1, queryTerms2);
assertSame(queryTerms1, result);
queryTerms1 = terms(new int[]{1, 2, 3}, "123", "456");
queryTerms2 = terms(new int[]{2, 3, 4}, "1", "456");
result = selectBestExtraction(Collections.emptyMap(), queryTerms1, queryTerms2);
result = selectBestExtraction(queryTerms1, queryTerms2);
assertSame("Ignoring ranges, so then prefer queryTerms1, because it has the longest shortest term", queryTerms1, result);
queryTerms1 = terms(new int[]{});
queryTerms2 = terms(new int[]{});
result = selectBestExtraction(Collections.emptyMap(), queryTerms1, queryTerms2);
result = selectBestExtraction(queryTerms1, queryTerms2);
assertSame("In case query extractions are empty", queryTerms2, result);
queryTerms1 = terms(new int[]{1});
queryTerms2 = terms(new int[]{});
result = selectBestExtraction(Collections.emptyMap(), queryTerms1, queryTerms2);
result = selectBestExtraction(queryTerms1, queryTerms2);
assertSame("In case query a single extraction is empty", queryTerms1, result);
queryTerms1 = terms(new int[]{});
queryTerms2 = terms(new int[]{1});
result = selectBestExtraction(Collections.emptyMap(), queryTerms1, queryTerms2);
result = selectBestExtraction(queryTerms1, queryTerms2);
assertSame("In case query a single extraction is empty", queryTerms2, result);
}
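// Hedged summary of the heuristic exercised above, inferred from the assertions rather than
// from the production code: between two extraction sets, selectBestExtraction appears to
// prefer the set whose shortest term is longest, since longer terms generally match fewer
// documents. Ranges are ignored in that comparison, and a non-empty set always beats an
// empty one.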
public void testSelectBestExtraction_boostFields() {
Set<QueryExtraction> queryTerms1 = new HashSet<>(Arrays.asList(
new QueryExtraction(new Term("status_field", "sold")),
new QueryExtraction(new Term("category", "accessory"))
));
Set<QueryAnalyzer.QueryExtraction> queryTerms2 = new HashSet<>(Arrays.asList(
new QueryExtraction(new Term("status_field", "instock")),
new QueryExtraction(new Term("category", "hardware"))
));
Set<QueryExtraction> result = selectBestExtraction(Collections.singletonMap("status_field", 0F), queryTerms1, queryTerms2);
assertSame(queryTerms1, result);
byte[] interval = new byte[Long.BYTES];
LongPoint.encodeDimension(4, interval, 0);
queryTerms1 = new HashSet<>(Arrays.asList(
new QueryExtraction(new Term("status_field", "sold")),
new QueryExtraction(new QueryAnalyzer.Range("price", null, null, interval))
));
interval = new byte[Long.BYTES];
LongPoint.encodeDimension(8, interval, 0);
queryTerms2 = new HashSet<>(Arrays.asList(
new QueryExtraction(new Term("status_field", "instock")),
new QueryExtraction(new QueryAnalyzer.Range("price", null, null, interval))
));
result = selectBestExtraction(Collections.singletonMap("status_field", 0F), queryTerms1, queryTerms2);
assertSame(queryTerms1, result);
Map<String, Float> boostFields = new HashMap<>();
boostFields.put("field1", 2F);
boostFields.put("field2", 0.5F);
boostFields.put("field4", 3F);
boostFields.put("field5", 0.6F);
queryTerms1 = new HashSet<>(Arrays.asList(
new QueryExtraction(new Term("field1", "sold")),
new QueryExtraction(new Term("field2", "accessory")),
new QueryExtraction(new QueryAnalyzer.Range("field3", null, null, new byte[0]))
));
queryTerms2 = new HashSet<>(Arrays.asList(
new QueryExtraction(new Term("field3", "sold")),
new QueryExtraction(new Term("field4", "accessory")),
new QueryExtraction(new QueryAnalyzer.Range("field5", null, null, new byte[0]))
));
result = selectBestExtraction(boostFields, queryTerms1, queryTerms2);
assertSame(queryTerms2, result);
boostFields.put("field2", 6F);
result = selectBestExtraction(boostFields, queryTerms1, queryTerms2);
assertSame(queryTerms1, result);
boostFields.put("field2", 0F);
boostFields.put("field3", 0F);
boostFields.put("field5", 0F);
result = selectBestExtraction(boostFields, queryTerms1, queryTerms2);
assertSame(queryTerms2, result);
boostFields = new HashMap<>();
boostFields.put("field2", 2F);
result = selectBestExtraction(boostFields, queryTerms1, queryTerms2);
assertSame(queryTerms1, result);
}
public void testSelectBestExtraction_random() {
Set<QueryExtraction> terms1 = new HashSet<>();
int shortestTerms1Length = Integer.MAX_VALUE;
@@ -691,7 +796,7 @@ public class QueryAnalyzerTests extends ESTestCase {
sumTermLength -= length;
}
Set<QueryAnalyzer.QueryExtraction> result = selectBestExtraction(Collections.emptyMap(), terms1, terms2);
Set<QueryAnalyzer.QueryExtraction> result = selectBestExtraction(terms1, terms2);
Set<QueryExtraction> expected = shortestTerms1Length >= shortestTerms2Length ? terms1 : terms2;
assertThat(result, sameInstance(expected));
}
@@ -699,8 +804,9 @@ public class QueryAnalyzerTests extends ESTestCase {
public void testPointRangeQuery() {
// int ranges get converted to long ranges:
Query query = IntPoint.newRangeQuery("_field", 10, 20);
Result result = analyze(query, Collections.emptyMap());
Result result = analyze(query, Version.CURRENT);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryAnalyzer.QueryExtraction> ranges = new ArrayList<>(result.extractions);
assertThat(ranges.size(), equalTo(1));
assertNull(ranges.get(0).term);
@@ -709,7 +815,8 @@ public class QueryAnalyzerTests extends ESTestCase {
assertDimension(ranges.get(0).range.upperPoint, bytes -> IntPoint.encodeDimension(20, bytes, 0));
query = LongPoint.newRangeQuery("_field", 10L, 21L);
result = analyze(query, Collections.emptyMap());
result = analyze(query, Version.CURRENT);
assertThat(result.minimumShouldMatch, equalTo(1));
assertFalse(result.verified);
ranges = new ArrayList<>(result.extractions);
assertThat(ranges.size(), equalTo(1));
@@ -720,7 +827,8 @@ public class QueryAnalyzerTests extends ESTestCase {
// Half float ranges get converted to double ranges:
query = HalfFloatPoint.newRangeQuery("_field", 10F, 20F);
result = analyze(query, Collections.emptyMap());
result = analyze(query, Version.CURRENT);
assertThat(result.minimumShouldMatch, equalTo(1));
assertFalse(result.verified);
ranges = new ArrayList<>(result.extractions);
assertThat(ranges.size(), equalTo(1));
@@ -731,7 +839,8 @@ public class QueryAnalyzerTests extends ESTestCase {
// Float ranges get converted to double ranges:
query = FloatPoint.newRangeQuery("_field", 10F, 20F);
result = analyze(query, Collections.emptyMap());
result = analyze(query, Version.CURRENT);
assertThat(result.minimumShouldMatch, equalTo(1));
assertFalse(result.verified);
ranges = new ArrayList<>(result.extractions);
assertThat(ranges.size(), equalTo(1));
@@ -741,7 +850,8 @@ public class QueryAnalyzerTests extends ESTestCase {
assertDimension(ranges.get(0).range.upperPoint, bytes -> FloatPoint.encodeDimension(20F, bytes, 0));
query = DoublePoint.newRangeQuery("_field", 10D, 20D);
result = analyze(query, Collections.emptyMap());
result = analyze(query, Version.CURRENT);
assertThat(result.minimumShouldMatch, equalTo(1));
assertFalse(result.verified);
ranges = new ArrayList<>(result.extractions);
assertThat(ranges.size(), equalTo(1));
@@ -752,7 +862,8 @@ public class QueryAnalyzerTests extends ESTestCase {
query = InetAddressPoint.newRangeQuery("_field", InetAddresses.forString("192.168.1.0"),
InetAddresses.forString("192.168.1.255"));
result = analyze(query, Collections.emptyMap());
result = analyze(query, Version.CURRENT);
assertThat(result.minimumShouldMatch, equalTo(1));
assertFalse(result.verified);
ranges = new ArrayList<>(result.extractions);
assertThat(ranges.size(), equalTo(1));
@@ -765,24 +876,26 @@ public class QueryAnalyzerTests extends ESTestCase {
public void testTooManyPointDimensions() {
// For now no extraction support for geo queries:
Query query1 = LatLonPoint.newBoxQuery("_field", 0, 1, 0, 1);
expectThrows(UnsupportedQueryException.class, () -> analyze(query1, Collections.emptyMap()));
expectThrows(UnsupportedQueryException.class, () -> analyze(query1, Version.CURRENT));
Query query2 = LongPoint.newRangeQuery("_field", new long[]{0, 0, 0}, new long[]{1, 1, 1});
expectThrows(UnsupportedQueryException.class, () -> analyze(query2, Collections.emptyMap()));
expectThrows(UnsupportedQueryException.class, () -> analyze(query2, Version.CURRENT));
}
public void testPointRangeQuery_lowerUpperReversed() {
Query query = IntPoint.newRangeQuery("_field", 20, 10);
Result result = analyze(query, Collections.emptyMap());
Result result = analyze(query, Version.CURRENT);
assertTrue(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
assertThat(result.extractions.size(), equalTo(0));
}
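// With the bounds reversed the range query can match nothing, so an empty extraction set can
// safely be marked verified: there are no candidate matches left for MemoryIndex validation.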
public void testIndexOrDocValuesQuery() {
Query query = new IndexOrDocValuesQuery(IntPoint.newRangeQuery("_field", 10, 20),
SortedNumericDocValuesField.newSlowRangeQuery("_field", 10, 20));
Result result = analyze(query, Collections.emptyMap());
Result result = analyze(query, Version.CURRENT);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
List<QueryAnalyzer.QueryExtraction> ranges = new ArrayList<>(result.extractions);
assertThat(ranges.size(), equalTo(1));
assertNull(ranges.get(0).term);
@@ -795,8 +908,9 @@ public class QueryAnalyzerTests extends ESTestCase {
TermQuery termQuery = new TermQuery(new Term("field", "value"));
QueryBitSetProducer queryBitSetProducer = new QueryBitSetProducer(new TermQuery(new Term("_type", "child")));
ESToParentBlockJoinQuery query = new ESToParentBlockJoinQuery(termQuery, queryBitSetProducer, ScoreMode.None, "child");
Result result = analyze(query, Collections.emptyMap());
Result result = analyze(query, Version.CURRENT);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
assertEquals(1, result.extractions.size());
assertNull(result.extractions.toArray(new QueryExtraction[0])[0].range);
assertEquals(new Term("field", "value"), result.extractions.toArray(new QueryExtraction[0])[0].term);
@@ -806,44 +920,101 @@ public class QueryAnalyzerTests extends ESTestCase {
BooleanQuery.Builder boolQuery = new BooleanQuery.Builder();
boolQuery.add(LongPoint.newRangeQuery("_field1", 10, 20), BooleanClause.Occur.FILTER);
boolQuery.add(LongPoint.newRangeQuery("_field2", 10, 15), BooleanClause.Occur.FILTER);
Result result = analyze(boolQuery.build(), Collections.emptyMap());
Result result = analyze(boolQuery.build(), Version.V_6_0_0);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
assertEquals(1, result.extractions.size());
assertEquals("_field2", new ArrayList<>(result.extractions).get(0).range.fieldName);
boolQuery = new BooleanQuery.Builder();
boolQuery.add(LongPoint.newRangeQuery("_field1", 10, 20), BooleanClause.Occur.FILTER);
boolQuery.add(IntPoint.newRangeQuery("_field2", 10, 15), BooleanClause.Occur.FILTER);
result = analyze(boolQuery.build(), Collections.emptyMap());
result = analyze(boolQuery.build(), Version.V_6_0_0);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
assertEquals(1, result.extractions.size());
assertEquals("_field2", new ArrayList<>(result.extractions).get(0).range.fieldName);
boolQuery = new BooleanQuery.Builder();
boolQuery.add(DoublePoint.newRangeQuery("_field1", 10, 20), BooleanClause.Occur.FILTER);
boolQuery.add(DoublePoint.newRangeQuery("_field2", 10, 15), BooleanClause.Occur.FILTER);
result = analyze(boolQuery.build(), Collections.emptyMap());
result = analyze(boolQuery.build(), Version.V_6_0_0);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
assertEquals(1, result.extractions.size());
assertEquals("_field2", new ArrayList<>(result.extractions).get(0).range.fieldName);
boolQuery = new BooleanQuery.Builder();
boolQuery.add(DoublePoint.newRangeQuery("_field1", 10, 20), BooleanClause.Occur.FILTER);
boolQuery.add(FloatPoint.newRangeQuery("_field2", 10, 15), BooleanClause.Occur.FILTER);
result = analyze(boolQuery.build(), Collections.emptyMap());
result = analyze(boolQuery.build(), Version.V_6_0_0);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
assertEquals(1, result.extractions.size());
assertEquals("_field2", new ArrayList<>(result.extractions).get(0).range.fieldName);
boolQuery = new BooleanQuery.Builder();
boolQuery.add(HalfFloatPoint.newRangeQuery("_field1", 10, 20), BooleanClause.Occur.FILTER);
boolQuery.add(HalfFloatPoint.newRangeQuery("_field2", 10, 15), BooleanClause.Occur.FILTER);
result = analyze(boolQuery.build(), Collections.emptyMap());
result = analyze(boolQuery.build(), Version.V_6_0_0);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
assertEquals(1, result.extractions.size());
assertEquals("_field2", new ArrayList<>(result.extractions).get(0).range.fieldName);
}
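// Hedged note: under Version.V_6_0_0 the pre-CoveringQuery behaviour appears to be preserved,
// extracting only the single best range from a conjunction (the narrower _field2 interval in
// each pair above).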
public void testPointRangeQuerySelectRanges() {
BooleanQuery.Builder boolQuery = new BooleanQuery.Builder();
boolQuery.add(LongPoint.newRangeQuery("_field1", 10, 20), BooleanClause.Occur.SHOULD);
boolQuery.add(LongPoint.newRangeQuery("_field2", 10, 15), BooleanClause.Occur.SHOULD);
Result result = analyze(boolQuery.build(), Version.CURRENT);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
assertEquals(2, result.extractions.size());
assertEquals("_field2", new ArrayList<>(result.extractions).get(0).range.fieldName);
assertEquals("_field1", new ArrayList<>(result.extractions).get(1).range.fieldName);
boolQuery = new BooleanQuery.Builder();
boolQuery.add(LongPoint.newRangeQuery("_field1", 10, 20), BooleanClause.Occur.FILTER);
boolQuery.add(LongPoint.newRangeQuery("_field2", 10, 15), BooleanClause.Occur.FILTER);
result = analyze(boolQuery.build(), Version.CURRENT);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(2));
assertEquals(2, result.extractions.size());
assertEquals("_field2", new ArrayList<>(result.extractions).get(0).range.fieldName);
assertEquals("_field1", new ArrayList<>(result.extractions).get(1).range.fieldName);
boolQuery = new BooleanQuery.Builder();
boolQuery.add(LongPoint.newRangeQuery("_field1", 10, 20), BooleanClause.Occur.FILTER);
boolQuery.add(LongPoint.newRangeQuery("_field1", 10, 15), BooleanClause.Occur.FILTER);
result = analyze(boolQuery.build(), Version.CURRENT);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
assertEquals(2, result.extractions.size());
assertEquals("_field1", new ArrayList<>(result.extractions).get(0).range.fieldName);
assertEquals("_field1", new ArrayList<>(result.extractions).get(1).range.fieldName);
boolQuery = new BooleanQuery.Builder().setMinimumNumberShouldMatch(2);
boolQuery.add(LongPoint.newRangeQuery("_field1", 10, 20), BooleanClause.Occur.SHOULD);
boolQuery.add(LongPoint.newRangeQuery("_field2", 10, 15), BooleanClause.Occur.SHOULD);
result = analyze(boolQuery.build(), Version.CURRENT);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(2));
assertEquals(2, result.extractions.size());
assertEquals("_field2", new ArrayList<>(result.extractions).get(0).range.fieldName);
assertEquals("_field1", new ArrayList<>(result.extractions).get(1).range.fieldName);
boolQuery = new BooleanQuery.Builder().setMinimumNumberShouldMatch(2);
boolQuery.add(LongPoint.newRangeQuery("_field1", 10, 20), BooleanClause.Occur.SHOULD);
boolQuery.add(LongPoint.newRangeQuery("_field1", 10, 15), BooleanClause.Occur.SHOULD);
result = analyze(boolQuery.build(), Version.CURRENT);
assertFalse(result.verified);
assertThat(result.minimumShouldMatch, equalTo(1));
assertEquals(2, result.extractions.size());
assertEquals("_field1", new ArrayList<>(result.extractions).get(0).range.fieldName);
assertEquals("_field1", new ArrayList<>(result.extractions).get(1).range.fieldName);
}
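// Hedged reading of the minimumShouldMatch values asserted above: two FILTER ranges on
// different fields must both match, so msm becomes 2, whereas two ranges on the same field
// stay at msm 1, since the candidate query cannot tell which of the overlapping ranges on
// that field a document actually matched.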
private static void assertDimension(byte[] expected, Consumer<byte[]> consumer) {
byte[] dest = new byte[expected.length];
consumer.accept(dest);

View File

@@ -86,3 +86,49 @@
ingest.get_pipeline:
id: "my_pipeline"
- match: { my_pipeline.description: "_description" }
---
"Use the percolate query in mixed cluster":
- do:
search:
index: queries
body:
query:
percolate:
field: query
type: doc
document:
field1: value
- match: { hits.total: 1 }
- match: { hits.hits.0._id: q1 }
- do:
search:
index: queries
body:
sort: _doc
query:
percolate:
field: query
type: doc
document:
field1: value
field2: value
- match: { hits.total: 2 }
- match: { hits.hits.0._id: q1 }
- match: { hits.hits.1._id: q2 }
- do:
search:
index: queries
body:
sort: _doc
query:
percolate:
field: query
type: doc
document:
field2: value
field3: value
- match: { hits.total: 1 }
- match: { hits.hits.0._id: q3 }

View File

@@ -100,3 +100,111 @@
params:
f1: v5_old
- match: { hits.total: 1 }
---
"Index percolator queries and use the percolate query in old cluster":
- do:
indices.create:
index: queries
body:
settings:
index:
number_of_shards: 1
mappings:
doc:
properties:
query:
type: percolator
field1:
type: keyword
field2:
type: keyword
field3:
type: keyword
- do:
index:
index: queries
type: doc
id: q1
body:
query:
term:
field1: value
- do:
index:
index: queries
type: doc
id: q2
body:
query:
bool:
must:
- term:
field1: value
- term:
field2: value
- do:
index:
index: queries
type: doc
id: q3
body:
query:
bool:
minimum_should_match: 2
should:
- term:
field2: value
- term:
field3: value
- do:
indices.refresh:
index: queries
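# Summary of the query bodies above: q1 needs field1=value, q2 needs both field1 and field2,
# and q3 needs both field2 and field3 (two should clauses with minimum_should_match: 2).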
- do:
search:
index: queries
body:
query:
percolate:
field: query
type: doc
document:
field1: value
- match: { hits.total: 1 }
- match: { hits.hits.0._id: q1 }
- do:
search:
index: queries
body:
sort: _doc
query:
percolate:
field: query
type: doc
document:
field1: value
field2: value
- match: { hits.total: 2 }
- match: { hits.hits.0._id: q1 }
- match: { hits.hits.1._id: q2 }
- do:
search:
index: queries
body:
sort: _doc
query:
percolate:
field: query
type: doc
document:
field2: value
field3: value
- match: { hits.total: 1 }
- match: { hits.hits.0._id: q3 }

View File

@@ -66,3 +66,66 @@
ingest.get_pipeline:
id: "my_pipeline"
- match: { my_pipeline.description: "_description" }
---
"Index percolator query and use the percolate query in upgraded cluster":
- do:
index:
index: queries
type: doc
id: q4
refresh: true
body:
query:
bool:
minimum_should_match: 2
should:
- term:
field1: value
- term:
field2: value
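# q4 requires both field1 and field2 (two should clauses with minimum_should_match: 2), which
# is why the second search below now returns q1, q2 and q4.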
- do:
search:
index: queries
body:
query:
percolate:
field: query
type: doc
document:
field1: value
- match: { hits.total: 1 }
- match: { hits.hits.0._id: q1 }
- do:
search:
index: queries
body:
sort: _doc
query:
percolate:
field: query
type: doc
document:
field1: value
field2: value
- match: { hits.total: 3 }
- match: { hits.hits.0._id: q1 }
- match: { hits.hits.1._id: q2 }
- match: { hits.hits.2._id: q4 }
- do:
search:
index: queries
body:
sort: _doc
query:
percolate:
field: query
type: doc
document:
field2: value
field3: value
- match: { hits.total: 1 }
- match: { hits.hits.0._id: q3 }