Merge pull request #10391 from jpountz/fix/remove_flt
Queries: Remove fuzzy-like-this support. Close #10391
commit bad6656f02
@ -380,3 +380,6 @@ http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/
The cluster state api doesn't return the `routing_nodes` section anymore when
`routing_table` is requested. The newly introduced `routing_nodes` flag can
be used separately to control whether `routing_nodes` should be returned.
=== Query DSL

The `fuzzy_like_this` and `fuzzy_like_this_field` queries have been removed.
@ -22,10 +22,6 @@ include::queries/dis-max-query.asciidoc[]

include::queries/filtered-query.asciidoc[]

include::queries/flt-query.asciidoc[]

include::queries/flt-field-query.asciidoc[]

include::queries/function-score-query.asciidoc[]

include::queries/fuzzy-query.asciidoc[]
@ -1,47 +0,0 @@
[[query-dsl-flt-field-query]]
=== Fuzzy Like This Field Query

The `fuzzy_like_this_field` query is the same as the `fuzzy_like_this`
query, except that it runs against a single field. It provides a nicer
query DSL than the generic `fuzzy_like_this` query, and supports typed
field queries (it automatically wraps typed fields with a type filter to
match only on the specific type).

[source,js]
--------------------------------------------------
{
    "fuzzy_like_this_field" : {
        "name.first" : {
            "like_text" : "text like this one",
            "max_query_terms" : 12
        }
    }
}
--------------------------------------------------

`fuzzy_like_this_field` can be shortened to `flt_field`.

The `fuzzy_like_this_field` top level parameters include:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`like_text` |The text to find similar documents for, *required*.

|`ignore_tf` |Whether term frequency should be ignored. Defaults to `false`.

|`max_query_terms` |The maximum number of query terms that will be
included in any generated query. Defaults to `25`.

|`fuzziness` |The fuzziness of the term variants. Defaults
to `0.5`. See <<fuzziness>>.

|`prefix_length` |Length of required common prefix on variant terms.
Defaults to `0`.

|`boost` |Sets the boost value of the query. Defaults to `1.0`.

|`analyzer` |The analyzer that will be used to analyze the text.
Defaults to the analyzer associated with the field.
|=======================================================================
@ -1,65 +0,0 @@
[[query-dsl-flt-query]]
=== Fuzzy Like This Query

The fuzzy like this query finds documents that are "like" the provided
text by running it against one or more fields.

[source,js]
--------------------------------------------------
{
    "fuzzy_like_this" : {
        "fields" : ["name.first", "name.last"],
        "like_text" : "text like this one",
        "max_query_terms" : 12
    }
}
--------------------------------------------------

`fuzzy_like_this` can be shortened to `flt`.

The `fuzzy_like_this` top level parameters include:

[cols="<,<",options="header",]
|=======================================================================
|Parameter |Description
|`fields` |A list of the fields to run the fuzzy like this query against.
Defaults to the `_all` field.

|`like_text` |The text to find similar documents for, *required*.

|`ignore_tf` |Whether term frequency should be ignored. Defaults to `false`.

|`max_query_terms` |The maximum number of query terms that will be
included in any generated query. Defaults to `25`.

|`fuzziness` |The minimum similarity of the term variants. Defaults
to `0.5`. See <<fuzziness>>.

|`prefix_length` |Length of required common prefix on variant terms.
Defaults to `0`.

|`boost` |Sets the boost value of the query. Defaults to `1.0`.

|`analyzer` |The analyzer that will be used to analyze the text.
Defaults to the analyzer associated with the field.
|=======================================================================

[float]
==== How it Works

Fuzzifies ALL terms provided as strings and then picks the best n
differentiating terms. In effect this mixes the behaviour of FuzzyQuery
and MoreLikeThis but with special consideration of fuzzy scoring
factors. This generally produces good results for queries where users
may provide details in a number of fields and have no knowledge of
boolean query syntax and also want a degree of fuzzy matching and a fast
query.

For each source term the fuzzy variants are held in a BooleanQuery with
no coord factor (because we are not looking for matches on multiple
variants in any one doc). Additionally, a specialized TermQuery is used
for variants and does not use that variant term's IDF because this would
favor rarer terms, such as misspellings. Instead, all variants use the
same IDF ranking (the one for the source query term) and this is
factored into the variant's boost. If the source query term does not
exist in the index the average IDF of the variants is used.
|
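The IDF handling described under "How it Works" can be illustrated without Lucene. The sketch below follows the prose literally and is not the `FuzzyLikeThisQuery` implementation: the class and method names are hypothetical, and the `1 + ln(numDocs / (docFreq + 1))` formula is an assumed classic-style IDF chosen only to produce concrete numbers.

```java
import java.util.List;

// Hypothetical helper for illustration; not part of Lucene or Elasticsearch.
final class SharedIdfSketch {

    // Assumed classic-style IDF: 1 + ln(numDocs / (docFreq + 1)).
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Returns the single IDF value that every fuzzy variant of one source term shares.
    static double sharedIdf(int sourceDocFreq, List<Integer> variantDocFreqs, int numDocs) {
        if (sourceDocFreq > 0) {
            // The source term exists in the index: all variants reuse its IDF,
            // so a rare misspelling is not favored over the term the user typed.
            return idf(sourceDocFreq, numDocs);
        }
        // The source term is absent: fall back to the average IDF of the variants.
        return variantDocFreqs.stream()
                .mapToDouble(df -> idf(df, numDocs))
                .average()
                .orElse(0.0);
    }

    public static void main(String[] args) {
        System.out.println(sharedIdf(9, List.of(1, 2), 100));
        System.out.println(sharedIdf(0, List.of(0, 1), 100));
    }
}
```

Because every variant gets the same IDF, any ranking difference between variants has to come from other factors such as the variant's boost.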
@ -1,148 +0,0 @@
/*
 * Licensed to Elasticsearch under one or more contributor
 * license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright
 * ownership. Elasticsearch licenses this file to you under
 * the Apache License, Version 2.0 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.elasticsearch.index.query;

import org.elasticsearch.ElasticsearchIllegalArgumentException;
import org.elasticsearch.common.unit.Fuzziness;
import org.elasticsearch.common.xcontent.XContentBuilder;

import java.io.IOException;

/**
 *
 */
public class FuzzyLikeThisFieldQueryBuilder extends BaseQueryBuilder implements BoostableQueryBuilder<FuzzyLikeThisFieldQueryBuilder> {

    private final String name;

    private Float boost;

    private String likeText = null;
    private Fuzziness fuzziness;
    private Integer prefixLength;
    private Integer maxQueryTerms;
    private Boolean ignoreTF;
    private String analyzer;
    private Boolean failOnUnsupportedField;
    private String queryName;

    /**
     * A fuzzy more like this query on the provided field.
     *
     * @param name the name of the field
     */
    public FuzzyLikeThisFieldQueryBuilder(String name) {
        this.name = name;
    }

    /**
     * The text to use in order to find documents that are "like" this.
     */
    public FuzzyLikeThisFieldQueryBuilder likeText(String likeText) {
        this.likeText = likeText;
        return this;
    }

    public FuzzyLikeThisFieldQueryBuilder fuzziness(Fuzziness fuzziness) {
        this.fuzziness = fuzziness;
        return this;
    }

    public FuzzyLikeThisFieldQueryBuilder prefixLength(int prefixLength) {
        this.prefixLength = prefixLength;
        return this;
    }

    public FuzzyLikeThisFieldQueryBuilder maxQueryTerms(int maxQueryTerms) {
        this.maxQueryTerms = maxQueryTerms;
        return this;
    }

    public FuzzyLikeThisFieldQueryBuilder ignoreTF(boolean ignoreTF) {
        this.ignoreTF = ignoreTF;
        return this;
    }

    /**
     * The analyzer that will be used to analyze the text. Defaults to the analyzer associated with the field.
     */
    public FuzzyLikeThisFieldQueryBuilder analyzer(String analyzer) {
        this.analyzer = analyzer;
        return this;
    }

    @Override
    public FuzzyLikeThisFieldQueryBuilder boost(float boost) {
        this.boost = boost;
        return this;
    }

    /**
     * Whether to fail or return no result when this query is run against a field which is not supported such as binary/numeric fields.
     */
    public FuzzyLikeThisFieldQueryBuilder failOnUnsupportedField(boolean fail) {
        failOnUnsupportedField = fail;
        return this;
    }

    /**
     * Sets the query name for the filter that can be used when searching for matched_filters per hit.
     */
    public FuzzyLikeThisFieldQueryBuilder queryName(String queryName) {
        this.queryName = queryName;
        return this;
    }

    @Override
    protected void doXContent(XContentBuilder builder, Params params) throws IOException {
        builder.startObject(FuzzyLikeThisFieldQueryParser.NAME);
        builder.startObject(name);
        if (likeText == null) {
            throw new ElasticsearchIllegalArgumentException("fuzzyLikeThis requires 'likeText' to be provided");
        }
        builder.field("like_text", likeText);
        if (maxQueryTerms != null) {
            builder.field("max_query_terms", maxQueryTerms);
        }
        if (fuzziness != null) {
            fuzziness.toXContent(builder, params);
        }
        if (prefixLength != null) {
            builder.field("prefix_length", prefixLength);
        }
        if (ignoreTF != null) {
            builder.field("ignore_tf", ignoreTF);
        }
        if (boost != null) {
            builder.field("boost", boost);
        }
        if (analyzer != null) {
            builder.field("analyzer", analyzer);
        }
        if (failOnUnsupportedField != null) {
            builder.field("fail_on_unsupported_field", failOnUnsupportedField);
        }
        if (queryName != null) {
            builder.field("_name", queryName);
        }
        builder.endObject();
        builder.endObject();
    }
}
|
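The removed builder keeps every option in a nullable field and serializes only the ones that were set, so server-side defaults apply to the rest. A minimal sketch of that pattern follows, with no Elasticsearch dependency: `FltFieldBodySketch` and `toMap` are hypothetical names, nested maps stand in for `XContentBuilder`, only three of the options are modeled, and the required-text check runs before anything is emitted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the removed builder; nested maps replace XContentBuilder.
final class FltFieldBodySketch {
    private final String field;
    private String likeText;        // required
    private Integer maxQueryTerms;  // null means "not set, let the server default apply"
    private Float boost;            // null means "not set"

    FltFieldBodySketch(String field) {
        this.field = field;
    }

    FltFieldBodySketch likeText(String likeText) {
        this.likeText = likeText;
        return this;
    }

    FltFieldBodySketch maxQueryTerms(int maxQueryTerms) {
        this.maxQueryTerms = maxQueryTerms;
        return this;
    }

    FltFieldBodySketch boost(float boost) {
        this.boost = boost;
        return this;
    }

    // Builds { "flt_field": { <field>: { ...options that were set... } } }.
    Map<String, Object> toMap() {
        if (likeText == null) {
            throw new IllegalArgumentException("like_text is required");
        }
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("like_text", likeText);
        if (maxQueryTerms != null) {
            options.put("max_query_terms", maxQueryTerms);
        }
        if (boost != null) {
            options.put("boost", boost);
        }
        Map<String, Object> byField = new LinkedHashMap<>();
        byField.put(field, options);
        Map<String, Object> root = new LinkedHashMap<>();
        root.put("flt_field", byField);
        return root;
    }

    public static void main(String[] args) {
        System.out.println(new FltFieldBodySketch("name.first")
                .likeText("text like this one")
                .maxQueryTerms(12)
                .toMap());
    }
}
```

Using boxed types rather than primitives is what lets the builder tell "explicitly set to the default" apart from "never set".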
@ -1,160 +0,0 @@
/*
 * Licensed to Elasticsearch under one or more contributor
 * license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright
 * ownership. Elasticsearch licenses this file to you under
 * the Apache License, Version 2.0 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.elasticsearch.index.query;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.sandbox.queries.FuzzyLikeThisQuery;
import org.apache.lucene.search.Query;
import org.elasticsearch.ElasticsearchIllegalArgumentException;
import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.Strings;
import org.elasticsearch.common.inject.Inject;
import org.elasticsearch.common.unit.Fuzziness;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.index.analysis.Analysis;
import org.elasticsearch.index.mapper.MapperService;

import java.io.IOException;

/**
 * <pre>
 * {
 *  fuzzy_like_this_field : {
 *      field1 : {
 *          max_query_terms : 12,
 *          boost : 1.1,
 *          like_text : "..."
 *      }
 *  }
 * }
 * </pre>
 */
public class FuzzyLikeThisFieldQueryParser implements QueryParser {

    public static final String NAME = "flt_field";
    private static final Fuzziness DEFAULT_FUZZINESS = Fuzziness.fromSimilarity(0.5f);
    private static final ParseField FUZZINESS = Fuzziness.FIELD.withDeprecation("min_similarity");

    @Inject
    public FuzzyLikeThisFieldQueryParser() {
    }

    @Override
    public String[] names() {
        return new String[]{NAME, "fuzzy_like_this_field", Strings.toCamelCase(NAME), "fuzzyLikeThisField"};
    }

    @Override
    public Query parse(QueryParseContext parseContext) throws IOException, QueryParsingException {
        XContentParser parser = parseContext.parser();

        int maxNumTerms = 25;
        float boost = 1.0f;
        String likeText = null;
        Fuzziness fuzziness = DEFAULT_FUZZINESS;
        int prefixLength = 0;
        boolean ignoreTF = false;
        Analyzer analyzer = null;
        boolean failOnUnsupportedField = true;
        String queryName = null;

        XContentParser.Token token = parser.nextToken();
        if (token != XContentParser.Token.FIELD_NAME) {
            throw new QueryParsingException(parseContext.index(), "[flt_field] query malformed, no field");
        }
        String fieldName = parser.currentName();

        // now, we move after the field name, which starts the object
        token = parser.nextToken();
        if (token != XContentParser.Token.START_OBJECT) {
            throw new QueryParsingException(parseContext.index(), "[flt_field] query malformed, no start_object");
        }


        String currentFieldName = null;
        while ((token = parser.nextToken()) != XContentParser.Token.END_OBJECT) {
            if (token == XContentParser.Token.FIELD_NAME) {
                currentFieldName = parser.currentName();
            } else if (token.isValue()) {
                if ("like_text".equals(currentFieldName) || "likeText".equals(currentFieldName)) {
                    likeText = parser.text();
                } else if ("max_query_terms".equals(currentFieldName) || "maxQueryTerms".equals(currentFieldName)) {
                    maxNumTerms = parser.intValue();
                } else if ("boost".equals(currentFieldName)) {
                    boost = parser.floatValue();
                } else if ("ignore_tf".equals(currentFieldName) || "ignoreTF".equals(currentFieldName)) {
                    ignoreTF = parser.booleanValue();
                } else if (FUZZINESS.match(currentFieldName, parseContext.parseFlags())) {
                    fuzziness = Fuzziness.parse(parser);
                } else if ("prefix_length".equals(currentFieldName) || "prefixLength".equals(currentFieldName)) {
                    prefixLength = parser.intValue();
                } else if ("analyzer".equals(currentFieldName)) {
                    analyzer = parseContext.analysisService().analyzer(parser.text());
                } else if ("fail_on_unsupported_field".equals(currentFieldName) || "failOnUnsupportedField".equals(currentFieldName)) {
                    failOnUnsupportedField = parser.booleanValue();
                } else if ("_name".equals(currentFieldName)) {
                    queryName = parser.text();
                } else {
                    throw new QueryParsingException(parseContext.index(), "[flt_field] query does not support [" + currentFieldName + "]");
                }
            }
        }

        if (likeText == null) {
            throw new QueryParsingException(parseContext.index(), "fuzzy_like_this_field requires 'like_text' to be specified");
        }

        MapperService.SmartNameFieldMappers smartNameFieldMappers = parseContext.smartFieldMappers(fieldName);
        if (smartNameFieldMappers != null) {
            if (smartNameFieldMappers.hasMapper()) {
                fieldName = smartNameFieldMappers.mapper().names().indexName();
                if (analyzer == null) {
                    analyzer = smartNameFieldMappers.mapper().searchAnalyzer();
                }
            }
        }
        if (analyzer == null) {
            analyzer = parseContext.mapperService().searchAnalyzer();
        }
        if (!Analysis.generatesCharacterTokenStream(analyzer, fieldName)) {
            if (failOnUnsupportedField) {
                throw new ElasticsearchIllegalArgumentException("fuzzy_like_this_field doesn't support binary/numeric fields: [" + fieldName + "]");
            } else {
                return null;
            }
        }

        FuzzyLikeThisQuery fuzzyLikeThisQuery = new FuzzyLikeThisQuery(maxNumTerms, analyzer);
        fuzzyLikeThisQuery.addTerms(likeText, fieldName, fuzziness.asSimilarity(), prefixLength);
        fuzzyLikeThisQuery.setBoost(boost);
        fuzzyLikeThisQuery.setIgnoreTF(ignoreTF);

        // move to the next end object, to close the field name
        token = parser.nextToken();
        if (token != XContentParser.Token.END_OBJECT) {
            throw new QueryParsingException(parseContext.index(), "[flt_field] query malformed, no end_object");
        }
        assert token == XContentParser.Token.END_OBJECT;

        if (queryName != null) {
            parseContext.addNamedQuery(queryName, fuzzyLikeThisQuery);
        }
        return fuzzyLikeThisQuery;
    }
}
|
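The parser above starts from defaults, accepts each option under a snake_case and a camelCase name, rejects unknown keys, and requires `like_text`. The sketch below reproduces that control flow over a plain `Map` rather than an `XContentParser` token stream. `FltFieldOptionsSketch` is a hypothetical class, only five of the options are modeled, and `IllegalArgumentException` stands in for `QueryParsingException`.

```java
import java.util.Map;

// Hypothetical, dependency-free illustration of the option handling in the removed parser.
final class FltFieldOptionsSketch {
    // Defaults copied from the local variables of the removed parse() method.
    int maxQueryTerms = 25;
    float boost = 1.0f;
    int prefixLength = 0;
    boolean ignoreTf = false;
    String likeText;

    static FltFieldOptionsSketch parse(Map<String, Object> body) {
        FltFieldOptionsSketch options = new FltFieldOptionsSketch();
        for (Map.Entry<String, Object> entry : body.entrySet()) {
            Object value = entry.getValue();
            switch (entry.getKey()) {
                case "like_text":
                case "likeText":
                    options.likeText = String.valueOf(value);
                    break;
                case "max_query_terms":
                case "maxQueryTerms":
                    options.maxQueryTerms = ((Number) value).intValue();
                    break;
                case "boost":
                    options.boost = ((Number) value).floatValue();
                    break;
                case "ignore_tf":
                case "ignoreTF":
                    options.ignoreTf = (Boolean) value;
                    break;
                case "prefix_length":
                case "prefixLength":
                    options.prefixLength = ((Number) value).intValue();
                    break;
                default:
                    // Unknown keys fail fast instead of being silently ignored.
                    throw new IllegalArgumentException(
                            "[flt_field] query does not support [" + entry.getKey() + "]");
            }
        }
        if (options.likeText == null) {
            throw new IllegalArgumentException(
                    "fuzzy_like_this_field requires 'like_text' to be specified");
        }
        return options;
    }
}
```

Options that are absent from the body simply keep their defaults, which is why the defaults are assigned before the loop rather than after it.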
@ -1,160 +0,0 @@
/*
 * Licensed to Elasticsearch under one or more contributor
 * license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright
 * ownership. Elasticsearch licenses this file to you under
 * the Apache License, Version 2.0 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.elasticsearch.index.query;

import org.elasticsearch.ElasticsearchIllegalArgumentException;
import org.elasticsearch.common.unit.Fuzziness;
import org.elasticsearch.common.xcontent.XContentBuilder;

import java.io.IOException;

/**
 *
 */
public class FuzzyLikeThisQueryBuilder extends BaseQueryBuilder implements BoostableQueryBuilder<FuzzyLikeThisQueryBuilder> {

    private final String[] fields;

    private Float boost;

    private String likeText = null;
    private Fuzziness fuzziness;
    private Integer prefixLength;
    private Integer maxQueryTerms;
    private Boolean ignoreTF;
    private String analyzer;
    private Boolean failOnUnsupportedField;
    private String queryName;

    /**
     * Constructs a new fuzzy like this query which uses the "_all" field.
     */
    public FuzzyLikeThisQueryBuilder() {
        this.fields = null;
    }

    /**
     * Sets the field names that will be used when generating the 'Fuzzy Like This' query.
     *
     * @param fields the field names that will be used when generating the 'Fuzzy Like This' query.
     */
    public FuzzyLikeThisQueryBuilder(String... fields) {
        this.fields = fields;
    }

    /**
     * The text to use in order to find documents that are "like" this.
     */
    public FuzzyLikeThisQueryBuilder likeText(String likeText) {
        this.likeText = likeText;
        return this;
    }

    public FuzzyLikeThisQueryBuilder fuzziness(Fuzziness fuzziness) {
        this.fuzziness = fuzziness;
        return this;
    }

    public FuzzyLikeThisQueryBuilder prefixLength(int prefixLength) {
        this.prefixLength = prefixLength;
        return this;
    }

    public FuzzyLikeThisQueryBuilder maxQueryTerms(int maxQueryTerms) {
        this.maxQueryTerms = maxQueryTerms;
        return this;
    }

    public FuzzyLikeThisQueryBuilder ignoreTF(boolean ignoreTF) {
        this.ignoreTF = ignoreTF;
        return this;
    }

    /**
     * The analyzer that will be used to analyze the text. Defaults to the analyzer associated with the field.
     */
    public FuzzyLikeThisQueryBuilder analyzer(String analyzer) {
        this.analyzer = analyzer;
        return this;
    }

    @Override
    public FuzzyLikeThisQueryBuilder boost(float boost) {
        this.boost = boost;
        return this;
    }

    /**
     * Whether to fail or return no result when this query is run against a field which is not supported such as binary/numeric fields.
     */
    public FuzzyLikeThisQueryBuilder failOnUnsupportedField(boolean fail) {
        failOnUnsupportedField = fail;
        return this;
    }

    /**
     * Sets the query name for the filter that can be used when searching for matched_filters per hit.
     */
    public FuzzyLikeThisQueryBuilder queryName(String queryName) {
        this.queryName = queryName;
        return this;
    }

    @Override
    protected void doXContent(XContentBuilder builder, Params params) throws IOException {
        builder.startObject(FuzzyLikeThisQueryParser.NAME);
        if (fields != null) {
            builder.startArray("fields");
            for (String field : fields) {
                builder.value(field);
            }
            builder.endArray();
        }
        if (likeText == null) {
            throw new ElasticsearchIllegalArgumentException("fuzzyLikeThis requires 'likeText' to be provided");
        }
        builder.field("like_text", likeText);
        if (maxQueryTerms != null) {
            builder.field("max_query_terms", maxQueryTerms);
        }
        if (fuzziness != null) {
            fuzziness.toXContent(builder, params);
        }
        if (prefixLength != null) {
            builder.field("prefix_length", prefixLength);
        }
        if (ignoreTF != null) {
            builder.field("ignore_tf", ignoreTF);
        }
        if (boost != null) {
            builder.field("boost", boost);
        }
        if (analyzer != null) {
            builder.field("analyzer", analyzer);
        }
        if (failOnUnsupportedField != null) {
            builder.field("fail_on_unsupported_field", failOnUnsupportedField);
        }
        if (queryName != null) {
            builder.field("_name", queryName);
        }
        builder.endObject();
    }
}
@ -1,162 +0,0 @@
/*
 * Licensed to Elasticsearch under one or more contributor
 * license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright
 * ownership. Elasticsearch licenses this file to you under
 * the Apache License, Version 2.0 (the "License"); you may
 * not use this file except in compliance with the License.
 * You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing,
 * software distributed under the License is distributed on an
 * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
 * KIND, either express or implied. See the License for the
 * specific language governing permissions and limitations
 * under the License.
 */

package org.elasticsearch.index.query;

import com.google.common.collect.Lists;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.sandbox.queries.FuzzyLikeThisQuery;
import org.apache.lucene.search.Query;
import org.elasticsearch.ElasticsearchIllegalArgumentException;
import org.elasticsearch.common.ParseField;
import org.elasticsearch.common.inject.Inject;
import org.elasticsearch.common.unit.Fuzziness;
import org.elasticsearch.common.xcontent.XContentParser;
import org.elasticsearch.index.analysis.Analysis;

import java.io.IOException;
import java.util.Iterator;
import java.util.List;

/**
 * <pre>
 * {
 *  fuzzy_like_this : {
 *      max_query_terms : 12,
 *      boost : 1.1,
 *      fields : ["field1", "field2"],
 *      like_text : "..."
 *  }
 * }
 * </pre>
 */
public class FuzzyLikeThisQueryParser implements QueryParser {

    public static final String NAME = "flt";
    private static final ParseField FUZZINESS = Fuzziness.FIELD.withDeprecation("min_similarity");

    @Inject
    public FuzzyLikeThisQueryParser() {
    }

    @Override
    public String[] names() {
        return new String[]{NAME, "fuzzy_like_this", "fuzzyLikeThis"};
    }

    @Override
    public Query parse(QueryParseContext parseContext) throws IOException, QueryParsingException {
        XContentParser parser = parseContext.parser();

        int maxNumTerms = 25;
        float boost = 1.0f;
        List<String> fields = null;
        String likeText = null;
        Fuzziness fuzziness = Fuzziness.TWO;
        int prefixLength = 0;
        boolean ignoreTF = false;
        Analyzer analyzer = null;
        boolean failOnUnsupportedField = true;
        String queryName = null;

        XContentParser.Token token;
        String currentFieldName = null;
        while ((token = parser.nextToken()) != XContentParser.Token.END_OBJECT) {
            if (token == XContentParser.Token.FIELD_NAME) {
                currentFieldName = parser.currentName();
            } else if (token.isValue()) {
                if ("like_text".equals(currentFieldName) || "likeText".equals(currentFieldName)) {
                    likeText = parser.text();
                } else if ("max_query_terms".equals(currentFieldName) || "maxQueryTerms".equals(currentFieldName)) {
                    maxNumTerms = parser.intValue();
                } else if ("boost".equals(currentFieldName)) {
                    boost = parser.floatValue();
                } else if ("ignore_tf".equals(currentFieldName) || "ignoreTF".equals(currentFieldName)) {
                    ignoreTF = parser.booleanValue();
                } else if (FUZZINESS.match(currentFieldName, parseContext.parseFlags())) {
                    fuzziness = Fuzziness.parse(parser);
                } else if ("prefix_length".equals(currentFieldName) || "prefixLength".equals(currentFieldName)) {
                    prefixLength = parser.intValue();
                } else if ("analyzer".equals(currentFieldName)) {
                    analyzer = parseContext.analysisService().analyzer(parser.text());
                } else if ("fail_on_unsupported_field".equals(currentFieldName) || "failOnUnsupportedField".equals(currentFieldName)) {
                    failOnUnsupportedField = parser.booleanValue();
                } else if ("_name".equals(currentFieldName)) {
                    queryName = parser.text();
                } else {
                    throw new QueryParsingException(parseContext.index(), "[flt] query does not support [" + currentFieldName + "]");
                }
            } else if (token == XContentParser.Token.START_ARRAY) {
                if ("fields".equals(currentFieldName)) {
                    fields = Lists.newLinkedList();
                    while ((token = parser.nextToken()) != XContentParser.Token.END_ARRAY) {
                        fields.add(parseContext.indexName(parser.text()));
                    }
                } else {
                    throw new QueryParsingException(parseContext.index(), "[flt] query does not support [" + currentFieldName + "]");
                }
            }
        }

        if (likeText == null) {
            throw new QueryParsingException(parseContext.index(), "fuzzy_like_this requires 'like_text' to be specified");
        }

        if (analyzer == null) {
            analyzer = parseContext.mapperService().searchAnalyzer();
        }

        FuzzyLikeThisQuery query = new FuzzyLikeThisQuery(maxNumTerms, analyzer);
        if (fields == null) {
            fields = Lists.newArrayList(parseContext.defaultField());
        } else if (fields.isEmpty()) {
            throw new QueryParsingException(parseContext.index(), "fuzzy_like_this requires 'fields' to be non-empty");
        }
        for (Iterator<String> it = fields.iterator(); it.hasNext(); ) {
            final String fieldName = it.next();
            if (!Analysis.generatesCharacterTokenStream(analyzer, fieldName)) {
                if (failOnUnsupportedField) {
                    throw new ElasticsearchIllegalArgumentException("fuzzy_like_this doesn't support binary/numeric fields: [" + fieldName + "]");
                } else {
                    it.remove();
                }
            }
        }
        if (fields.isEmpty()) {
            return null;
        }
        float minSimilarity = fuzziness.asFloat();
        if (minSimilarity >= 1.0f && minSimilarity != (int)minSimilarity) {
            throw new ElasticsearchIllegalArgumentException("fractional edit distances are not allowed");
        }
        if (minSimilarity < 0.0f) {
            throw new ElasticsearchIllegalArgumentException("minimumSimilarity cannot be less than 0");
        }
        for (String field : fields) {
            query.addTerms(likeText, field, minSimilarity, prefixLength);
        }
        query.setBoost(boost);
        query.setIgnoreTF(ignoreTF);

        if (queryName != null) {
            parseContext.addNamedQuery(queryName, query);
        }
        return query;
    }
}
|
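The two range checks that the removed parser applies to `minSimilarity` can be exercised in isolation. In this sketch `MinSimilaritySketch` and `validate` are hypothetical names and `IllegalArgumentException` stands in for `ElasticsearchIllegalArgumentException`; the conditions and messages are copied from the hunk above. Values below 1 pass as similarities, values of 1 or more must be whole numbers because they are treated as edit distances, and negative values are rejected.

```java
// Hypothetical extraction of the minSimilarity checks from the removed parser.
final class MinSimilaritySketch {

    static float validate(float minSimilarity) {
        // 1.0 and above is read as an edit distance, so it must be a whole number.
        if (minSimilarity >= 1.0f && minSimilarity != (int) minSimilarity) {
            throw new IllegalArgumentException("fractional edit distances are not allowed");
        }
        if (minSimilarity < 0.0f) {
            throw new IllegalArgumentException("minimumSimilarity cannot be less than 0");
        }
        return minSimilarity;
    }

    public static void main(String[] args) {
        System.out.println(validate(0.5f)); // accepted: similarity in [0, 1)
        System.out.println(validate(2.0f)); // accepted: whole edit distance
    }
}
```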
@ -458,31 +458,6 @@ public abstract class QueryBuilders {
        return new MoreLikeThisQueryBuilder();
    }

    /**
     * A fuzzy like this query that finds documents that are "like" the provided {@link FuzzyLikeThisQueryBuilder#likeText(String)}
     * which is checked against the fields the query is constructed with.
     *
     * @param fields The fields to run the query against
     */
    public static FuzzyLikeThisQueryBuilder fuzzyLikeThisQuery(String... fields) {
        return new FuzzyLikeThisQueryBuilder(fields);
    }

    /**
     * A fuzzy like this query that finds documents that are "like" the provided {@link FuzzyLikeThisQueryBuilder#likeText(String)}
     * which is checked against the "_all" field.
     */
    public static FuzzyLikeThisQueryBuilder fuzzyLikeThisQuery() {
        return new FuzzyLikeThisQueryBuilder();
    }

    /**
     * A fuzzy like this query that finds documents that are "like" the provided {@link FuzzyLikeThisFieldQueryBuilder#likeText(String)}.
     */
    public static FuzzyLikeThisFieldQueryBuilder fuzzyLikeThisFieldQuery(String name) {
        return new FuzzyLikeThisFieldQueryBuilder(name);
    }

    /**
     * Constructs a new scoring child query, with the child type and the query to run on the child documents. The
     * results of this query are the parent docs that those child docs matched.
@ -94,8 +94,6 @@ public class IndicesQueriesModule extends AbstractModule {
        qpBinders.addBinding().to(SpanNearQueryParser.class).asEagerSingleton();
        qpBinders.addBinding().to(SpanOrQueryParser.class).asEagerSingleton();
        qpBinders.addBinding().to(MoreLikeThisQueryParser.class).asEagerSingleton();
        qpBinders.addBinding().to(FuzzyLikeThisQueryParser.class).asEagerSingleton();
        qpBinders.addBinding().to(FuzzyLikeThisFieldQueryParser.class).asEagerSingleton();
        qpBinders.addBinding().to(WrapperQueryParser.class).asEagerSingleton();
        qpBinders.addBinding().to(IndicesQueryParser.class).asEagerSingleton();
        qpBinders.addBinding().to(CommonTermsQueryParser.class).asEagerSingleton();
@ -1,88 +0,0 @@
|
|||
/*
|
||||
* Licensed to Elasticsearch under one or more contributor
|
||||
* license agreements. See the NOTICE file distributed with
|
||||
* this work for additional information regarding copyright
|
||||
* ownership. Elasticsearch licenses this file to you under
|
||||
* the Apache License, Version 2.0 (the "License"); you may
|
||||
* not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing,
|
||||
* software distributed under the License is distributed on an
|
||||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
* KIND, either express or implied. See the License for the
|
||||
* specific language governing permissions and limitations
|
||||
* under the License.
|
||||
*/
|
||||
|
||||
package org.elasticsearch.flt;
|
||||
|
||||
import org.elasticsearch.action.search.SearchPhaseExecutionException;
|
||||
import org.elasticsearch.action.search.SearchResponse;
|
||||
import org.elasticsearch.test.ElasticsearchIntegrationTest;
|
||||
import org.junit.Test;
|
||||
|
||||
import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
|
||||
import static org.elasticsearch.index.query.QueryBuilders.fuzzyLikeThisFieldQuery;
|
||||
import static org.elasticsearch.index.query.QueryBuilders.fuzzyLikeThisQuery;
|
||||
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertAcked;
|
||||
import static org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertThrows;
|
||||
import static org.hamcrest.Matchers.equalTo;
|
||||
|
||||
/**
|
||||
*
|
||||
*/
|
||||
public class FuzzyLikeThisActionTests extends ElasticsearchIntegrationTest {
|
||||
|
||||
@Test
|
||||
// See issue https://github.com/elasticsearch/elasticsearch/issues/3252
|
||||
public void testNumericField() throws Exception {
|
||||
assertAcked(prepareCreate("test")
|
||||
.addMapping("type", "int_value", "type=integer"));
|
||||
ensureGreen();
|
||||
client().prepareIndex("test", "type", "1")
|
||||
.setSource(jsonBuilder().startObject().field("string_value", "lucene index").field("int_value", 1).endObject())
|
||||
.execute().actionGet();
|
||||
client().prepareIndex("test", "type", "2")
|
||||
.setSource(jsonBuilder().startObject().field("string_value", "elasticsearch index").field("int_value", 42).endObject())
|
||||
.execute().actionGet();
|
||||
|
||||
refresh();
|
||||
|
||||
// flt query with no field -> OK
|
||||
SearchResponse searchResponse = client().prepareSearch().setQuery(fuzzyLikeThisQuery().likeText("index")).execute().actionGet();
|
||||
assertThat(searchResponse.getFailedShards(), equalTo(0));
|
||||
assertThat(searchResponse.getHits().getTotalHits(), equalTo(2L));
|
||||
|
||||
// flt query with string fields
|
||||
searchResponse = client().prepareSearch().setQuery(fuzzyLikeThisQuery("string_value").likeText("index")).execute().actionGet();
|
||||
assertThat(searchResponse.getFailedShards(), equalTo(0));
|
||||
assertThat(searchResponse.getHits().getTotalHits(), equalTo(2L));
|
||||
|
||||
// flt query with at least a numeric field -> fail by default
|
||||
assertThrows(client().prepareSearch().setQuery(fuzzyLikeThisQuery("string_value", "int_value").likeText("index")), SearchPhaseExecutionException.class);
|
||||
|
||||
// flt query with at least a numeric field -> fail by command
|
||||
assertThrows(client().prepareSearch().setQuery(fuzzyLikeThisQuery("string_value", "int_value").likeText("index").failOnUnsupportedField(true)), SearchPhaseExecutionException.class);
|
||||
|
||||
|
||||
// flt query with at least a numeric field but fail_on_unsupported_field set to false
|
||||
searchResponse = client().prepareSearch().setQuery(fuzzyLikeThisQuery("string_value", "int_value").likeText("index").failOnUnsupportedField(false)).execute().actionGet();
|
||||
assertThat(searchResponse.getFailedShards(), equalTo(0));
|
||||
assertThat(searchResponse.getHits().getTotalHits(), equalTo(2L));
|
||||
|
||||
// flt field query on a numeric field -> failure by default
|
||||
assertThrows(client().prepareSearch().setQuery(fuzzyLikeThisFieldQuery("int_value").likeText("42")), SearchPhaseExecutionException.class);
|
||||
|
||||
// flt field query on a numeric field -> failure by command
|
||||
assertThrows(client().prepareSearch().setQuery(fuzzyLikeThisFieldQuery("int_value").likeText("42").failOnUnsupportedField(true)), SearchPhaseExecutionException.class);
|
||||
|
||||
// flt field query on a numeric field but fail_on_unsupported_field set to false
|
||||
searchResponse = client().prepareSearch().setQuery(fuzzyLikeThisFieldQuery("int_value").likeText("42").failOnUnsupportedField(false)).execute().actionGet();
|
||||
assertThat(searchResponse.getFailedShards(), equalTo(0));
|
||||
assertThat(searchResponse.getHits().getTotalHits(), equalTo(0L));
|
||||
}
|
||||
|
||||
}
|
|
@ -25,7 +25,6 @@ import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
|
|||
import org.apache.lucene.index.*;
|
||||
import org.apache.lucene.index.memory.MemoryIndex;
|
||||
import org.apache.lucene.queries.*;
|
||||
import org.apache.lucene.sandbox.queries.FuzzyLikeThisQuery;
|
||||
import org.apache.lucene.search.*;
|
||||
import org.apache.lucene.search.spans.*;
|
||||
import org.apache.lucene.spatial.prefix.IntersectsPrefixTreeFilter;
|
||||
|
@ -1795,74 +1794,6 @@ public class SimpleIndexQueryParserTests extends ElasticsearchSingleNodeTest {
|
|||
return strings;
|
||||
}
|
||||
|
||||
@Test
|
||||
public void testFuzzyLikeThisBuilder() throws Exception {
|
||||
IndexQueryParserService queryParser = queryParser();
|
||||
Query parsedQuery = queryParser.parse(fuzzyLikeThisQuery("name.first", "name.last").likeText("something").maxQueryTerms(12)).query();
|
||||
assertThat(parsedQuery, instanceOf(FuzzyLikeThisQuery.class));
|
||||
parsedQuery = queryParser.parse(fuzzyLikeThisQuery("name.first", "name.last").likeText("something").maxQueryTerms(12).fuzziness(Fuzziness.build("4"))).query();
|
||||
assertThat(parsedQuery, instanceOf(FuzzyLikeThisQuery.class));
|
||||
|
||||
Query parsedQuery1 = queryParser.parse(fuzzyLikeThisQuery("name.first", "name.last").likeText("something").maxQueryTerms(12).fuzziness(Fuzziness.build("4.0"))).query();
|
||||
assertThat(parsedQuery1, instanceOf(FuzzyLikeThisQuery.class));
|
||||
assertThat(parsedQuery, equalTo(parsedQuery1));
|
||||
|
||||
try {
|
||||
queryParser.parse(fuzzyLikeThisQuery("name.first", "name.last").likeText("something").maxQueryTerms(12).fuzziness(Fuzziness.build("4.1"))).query();
|
||||
fail("exception expected - fractional edit distance");
|
||||
} catch (ElasticsearchException ex) {
|
||||
//
|
||||
}
|
||||
|
||||
try {
|
||||
queryParser.parse(fuzzyLikeThisQuery("name.first", "name.last").likeText("something").maxQueryTerms(12).fuzziness(Fuzziness.build("-" + between(1, 100)))).query();
|
||||
fail("exception expected - negative edit distance");
|
||||
} catch (ElasticsearchException ex) {
|
||||
//
|
||||
}
|
||||
String[] queries = new String[] {
|
||||
"{\"flt\": {\"fields\": [\"comment\"], \"like_text\": \"FFFdfds\",\"fuzziness\": \"4\"}}",
|
||||
"{\"flt\": {\"fields\": [\"comment\"], \"like_text\": \"FFFdfds\",\"fuzziness\": \"4.00000000\"}}",
|
||||
"{\"flt\": {\"fields\": [\"comment\"], \"like_text\": \"FFFdfds\",\"fuzziness\": \"4.\"}}",
|
||||
"{\"flt\": {\"fields\": [\"comment\"], \"like_text\": \"FFFdfds\",\"fuzziness\": 4}}",
|
||||
"{\"flt\": {\"fields\": [\"comment\"], \"like_text\": \"FFFdfds\",\"fuzziness\": 4.0}}"
|
||||
};
|
||||
int iters = scaledRandomIntBetween(5, 100);
|
||||
for (int i = 0; i < iters; i++) {
|
||||
parsedQuery = queryParser.parse(new BytesArray((String) randomFrom(queries))).query();
|
||||
parsedQuery1 = queryParser.parse(new BytesArray((String) randomFrom(queries))).query();
|
||||
assertThat(parsedQuery1, instanceOf(FuzzyLikeThisQuery.class));
|
||||
assertThat(parsedQuery, instanceOf(FuzzyLikeThisQuery.class));
|
||||
assertThat(parsedQuery, equalTo(parsedQuery1));
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
public void testFuzzyLikeThis() throws Exception {
|
||||
IndexQueryParserService queryParser = queryParser();
|
||||
String query = copyToStringFromClasspath("/org/elasticsearch/index/query/fuzzyLikeThis.json");
|
||||
Query parsedQuery = queryParser.parse(query).query();
|
||||
assertThat(parsedQuery, instanceOf(FuzzyLikeThisQuery.class));
|
||||
// FuzzyLikeThisQuery fuzzyLikeThisQuery = (FuzzyLikeThisQuery) parsedQuery;
|
||||
}
|
||||
|
||||
@Test
|
||||
public void testFuzzyLikeFieldThisBuilder() throws Exception {
|
||||
IndexQueryParserService queryParser = queryParser();
|
||||
Query parsedQuery = queryParser.parse(fuzzyLikeThisFieldQuery("name.first").likeText("something").maxQueryTerms(12)).query();
|
||||
assertThat(parsedQuery, instanceOf(FuzzyLikeThisQuery.class));
|
||||
// FuzzyLikeThisQuery fuzzyLikeThisQuery = (FuzzyLikeThisQuery) parsedQuery;
|
||||
}
|
||||
|
||||
@Test
|
||||
public void testFuzzyLikeThisField() throws Exception {
|
||||
IndexQueryParserService queryParser = queryParser();
|
||||
String query = copyToStringFromClasspath("/org/elasticsearch/index/query/fuzzyLikeThisField.json");
|
||||
Query parsedQuery = queryParser.parse(query).query();
|
||||
assertThat(parsedQuery, instanceOf(FuzzyLikeThisQuery.class));
|
||||
// FuzzyLikeThisQuery fuzzyLikeThisQuery = (FuzzyLikeThisQuery) parsedQuery;
|
||||
}
|
||||
|
||||
@Test
|
||||
public void testGeoDistanceFilterNamed() throws IOException {
|
||||
IndexQueryParserService queryParser = queryParser();
|
||||
|
|
|
@ -1,7 +0,0 @@
|
|||
{
|
||||
fuzzy_like_this:{
|
||||
fields:["name.first", "name.last"],
|
||||
like_text:"something",
|
||||
max_query_terms:12
|
||||
}
|
||||
}
|
|
@ -1,8 +0,0 @@
|
|||
{
|
||||
fuzzy_like_this_field:{
|
||||
"name.first":{
|
||||
like_text:"something",
|
||||
max_query_terms:12
|
||||
}
|
||||
}
|
||||
}
|