Rename edit_distance/min_similarity to fuzziness
Many APIs currently use different names for the same logical parameter. Since Lucene moved away from the notion of a `similarity` and now uses a `fuzziness`, we should generalize this and encapsulate the generation, parsing and creation of these settings across all queries. This commit adds a new `Fuzziness` class that handles the renaming and generalization in a backwards-compatible manner. This commit also adds a `ParseField` class to better support deprecated Query DSL parameters. The `ParseField` class allows specifying parameters that have been deprecated, so that they can be more easily tracked and removed in a future version. It also allows running queries in `strict` mode per index, which throws an exception if a query is executed with deprecated keys. Closes #4082
This commit is contained in:
parent
f7db7eb99e
commit
bc5a9ca342
|
@ -122,6 +122,21 @@ fields within a document indexed treated as boolean fields.
|
|||
All REST APIs support providing numbered parameters as `string` on top
|
||||
of supporting the native JSON number types.
|
||||
|
||||
[[time-units]]
|
||||
[float]
|
||||
=== Time units
|
||||
|
||||
Whenever durations need to be specified, eg for a `timeout` parameter, the duration
|
||||
can be specified as a whole number representing time in milliseconds, or as a time value like `2d` for 2 days. The supported units are:
|
||||
|
||||
[horizontal]
|
||||
`y`:: Year
|
||||
`M`:: Month
|
||||
`w`:: Week
`d`:: Day
|
||||
`h`:: Hour
|
||||
`m`:: Minute
|
||||
`s`:: Second
|
||||
|
||||
[[distance-units]]
|
||||
[float]
|
||||
=== Distance Units
|
||||
|
@ -144,6 +159,63 @@ Centimeter:: `cm` or `centimeters`
|
|||
Millimeter:: `mm` or `millimeters`
|
||||
|
||||
|
||||
[[fuzziness]]
|
||||
[float]
|
||||
=== Fuzziness
|
||||
|
||||
Some queries and APIs support parameters to allow inexact _fuzzy_ matching,
|
||||
using the `fuzziness` parameter. The `fuzziness` parameter is context
|
||||
sensitive which means that it depends on the type of the field being queried:
|
||||
|
||||
[float]
|
||||
==== Numeric, date and IPv4 fields
|
||||
|
||||
When querying numeric, date and IPv4 fields, `fuzziness` is interpreted as a
|
||||
`+/-` margin. It behaves like a <<query-dsl-range-query>> where:
|
||||
|
||||
value - fuzziness <= field value <= value + fuzziness
|
||||
|
||||
The `fuzziness` parameter should be set to a numeric value, eg `2` or `2.0`. A
|
||||
`date` field interprets a long as milliseconds, but also accepts a string
|
||||
containing a time value -- `"1h"` -- as explained in <<time-units>>. An `ip`
|
||||
field accepts a long or another IPv4 address (which will be converted into a
|
||||
long).
|
||||
|
||||
[float]
|
||||
==== String fields
|
||||
|
||||
When querying `string` fields, `fuzziness` is interpreted as a
|
||||
http://en.wikipedia.org/wiki/Levenshtein_distance[Levenshtein Edit Distance]
|
||||
-- the number of one character changes that need to be made to one string to
|
||||
make it the same as another string.
|
||||
|
||||
The `fuzziness` parameter can be specified as:
|
||||
|
||||
`0`, `1`, `2`::
|
||||
|
||||
the maximum allowed Levenshtein Edit Distance (or number of edits)
|
||||
|
||||
`AUTO`::
|
||||
+
|
||||
--
|
||||
generates an edit distance based on the length of the term. For lengths:
|
||||
|
||||
`0..2`:: must match exactly
`3..5`:: one edit allowed
`>5`:: two edits allowed
|
||||
|
||||
`AUTO` should generally be the preferred value for `fuzziness`.
|
||||
--
|
||||
|
||||
`0.0..1.0`::
|
||||
|
||||
converted into an edit distance using the formula: `length(term) * (1.0 -
|
||||
fuzziness)`, eg a `fuzziness` of `0.6` with a term of length 10 would result
|
||||
in an edit distance of `4`. Note: in all APIs except for the
|
||||
<<query-dsl-flt-query>>, the maximum allowed edit distance is `2`.
|
||||
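The two string-field rules above can be sketched in Java. This is only an illustration: `FuzzinessSketch` and its method names are hypothetical, the `AUTO` thresholds are taken from `Fuzziness.asDistance(String)` as added in this commit, and truncating the formula result to an integer is an assumption of the sketch rather than documented behaviour.

```java
// Illustrative sketch only: FuzzinessSketch and its methods are hypothetical
// names and are not part of Elasticsearch or Lucene.
final class FuzzinessSketch {

    // Maximum edit distance noted above for most APIs.
    static final int MAX_EDITS = 2;

    // AUTO: derive the edit distance from the term length, using the
    // thresholds found in Fuzziness.asDistance(String) in this commit.
    static int autoEdits(String term) {
        int len = term.codePointCount(0, term.length());
        if (len <= 2) {
            return 0; // very short terms must match exactly
        } else if (len > 5) {
            return 2; // longer terms tolerate two edits
        }
        return 1;     // lengths 3..5 tolerate one edit
    }

    // Similarity in [0.0, 1.0): length(term) * (1.0 - fuzziness), uncapped.
    // Truncation toward zero is an assumption of this sketch.
    static int rawEdits(double similarity, int termLength) {
        return (int) (termLength * (1.0 - similarity));
    }

    // The same value, limited to the maximum edit distance of 2.
    static int cappedEdits(double similarity, int termLength) {
        return Math.min(rawEdits(similarity, termLength), MAX_EDITS);
    }

    public static void main(String[] args) {
        System.out.println(autoEdits("ki"));      // short term: exact match
        System.out.println(autoEdits("kimchy"));  // six characters: two edits
        System.out.println(rawEdits(0.6, 10));    // the worked example above
        System.out.println(cappedEdits(0.6, 10)); // after applying the cap
    }
}
```

With a `fuzziness` of `0.6` and a term of length 10 the uncapped result is `4`, matching the example in the text; the capped variant shows what remains once the limit of `2` applies.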
|
||||
|
||||
|
||||
[float]
|
||||
=== Result Casing
|
||||
|
||||
|
|
|
@ -33,8 +33,8 @@ The `fuzzy_like_this_field` top level parameters include:
|
|||
|`max_query_terms` |The maximum number of query terms that will be
|
||||
included in any generated query. Defaults to `25`.
|
||||
|
||||
|`min_similarity` |The minimum similarity of the term variants. Defaults
|
||||
to `0.5`.
|
||||
|`fuzziness` |The fuzziness of the term variants. Defaults
|
||||
to `0.5`. See <<fuzziness>>.
|
||||
|
||||
|`prefix_length` |Length of required common prefix on variant terms.
|
||||
Defaults to `0`.
|
||||
|
|
|
@ -32,8 +32,8 @@ Defaults to the `_all` field.
|
|||
|`max_query_terms` |The maximum number of query terms that will be
|
||||
included in any generated query. Defaults to `25`.
|
||||
|
||||
|`min_similarity` |The minimum similarity of the term variants. Defaults
|
||||
to `0.5`.
|
||||
|`fuzziness` |The fuzziness of the term variants. Defaults
|
||||
to `0.5`. See <<fuzziness>>.
|
||||
|
||||
|`prefix_length` |Length of required common prefix on variant terms.
|
||||
Defaults to `0`.
|
||||
|
|
|
@ -1,12 +1,15 @@
|
|||
[[query-dsl-fuzzy-query]]
|
||||
=== Fuzzy Query
|
||||
|
||||
A fuzzy query that uses similarity based on Levenshtein (edit
|
||||
distance) algorithm. This maps to Lucene's `FuzzyQuery`.
|
||||
The fuzzy query uses similarity based on Levenshtein edit distance for
|
||||
`string` fields, and a `+/-` margin on numeric and date fields.
|
||||
|
||||
Warning: this query is not very scalable with its default prefix length
|
||||
of 0 - in this case, *every* term will be enumerated and cause an edit
|
||||
score calculation or `max_expansions` is not set.
|
||||
==== String fields
|
||||
|
||||
The `fuzzy` query generates all possible matching terms that are within the
|
||||
maximum edit distance specified in `fuzziness` and then checks the term
|
||||
dictionary to find out which of those generated terms actually exist in the
|
||||
index.
|
||||
|
||||
Here is a simple example:
|
||||
|
||||
|
@ -17,31 +20,57 @@ Here is a simple example:
|
|||
}
|
||||
--------------------------------------------------
|
||||
|
||||
More complex settings can be set (the values here are the default
|
||||
values):
|
||||
Or with more advanced settings:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
{
|
||||
"fuzzy" : {
|
||||
"user" : {
|
||||
"value" : "ki",
|
||||
"boost" : 1.0,
|
||||
"min_similarity" : 0.5,
|
||||
"prefix_length" : 0
|
||||
}
|
||||
{
|
||||
"fuzzy" : {
|
||||
"user" : {
|
||||
"value" : "ki",
|
||||
"boost" : 1.0,
|
||||
"fuzziness" : 2,
|
||||
"prefix_length" : 0,
|
||||
"max_expansions": 100
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
||||
The `max_expansions` parameter (unbounded by default) controls the
|
||||
number of terms the fuzzy query will expand to.
|
||||
[float]
|
||||
===== Parameters
|
||||
|
||||
[horizontal]
|
||||
`fuzziness`::
|
||||
|
||||
The maximum edit distance. Defaults to `AUTO`. See <<fuzziness>>.
|
||||
|
||||
`prefix_length`::
|
||||
|
||||
The number of initial characters which will not be ``fuzzified''. This
|
||||
helps to reduce the number of terms which must be examined. Defaults
|
||||
to `0`.
|
||||
|
||||
`max_expansions`::
|
||||
|
||||
The maximum number of terms that the `fuzzy` query will expand to.
|
||||
Defaults to `0`.
|
||||
|
||||
|
||||
WARNING: this query can be very heavy if `prefix_length` and `max_expansions`
|
||||
are both set to their defaults of `0`. This could cause every term in the
|
||||
index to be examined!
|
||||
|
||||
|
||||
[float]
|
||||
==== Numeric / Date Fuzzy
|
||||
==== Numeric and date fields
|
||||
|
||||
`fuzzy` query on a numeric field will result in a range query "around"
|
||||
the value using the `min_similarity` value. For example:
|
||||
Performs a <<query-dsl-range-query>> ``around'' the value using the
|
||||
`fuzziness` value as a `+/-` range, where:
|
||||
|
||||
value - fuzziness <= field value <= value + fuzziness
|
||||
|
||||
For example:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
|
@ -49,14 +78,14 @@ the value using the `min_similarity` value. For example:
|
|||
"fuzzy" : {
|
||||
"price" : {
|
||||
"value" : 12,
|
||||
"min_similarity" : 2
|
||||
"fuzziness" : 2
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
||||
Will result in a range query between 10 and 14. Same applies to dates,
|
||||
with support for time format for the `min_similarity` field:
|
||||
Will result in a range query between 10 and 14. Date fields support
|
||||
<<time-units,time values>>, eg:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
|
@ -64,16 +93,10 @@ with support for time format for the `min_similarity` field:
|
|||
"fuzzy" : {
|
||||
"created" : {
|
||||
"value" : "2010-02-05T12:05:07",
|
||||
"min_similarity" : "1d"
|
||||
"fuzziness" : "1d"
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
||||
In the mapping, numeric and date types now allow to configure a
|
||||
`fuzzy_factor` mapping value (defaults to 1), which will be used to
|
||||
multiply the fuzzy value by it when used in a `query_string` type query.
|
||||
For example, for dates, a fuzzy factor of "1d" will result in
|
||||
multiplying whatever fuzzy value provided in the min_similarity by it.
|
||||
Note, this is explicitly supported since query_string query only allowed
|
||||
for similarity valued between 0.0 and 1.0.
|
||||
See <<fuzziness>> for more details about accepted values.
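The `+/-` behaviour on numeric and date fields can be sketched as plain bounds arithmetic. The class `FuzzyRangeSketch` and its methods are hypothetical and only mirror the two JSON examples in this section; they do not show how the field mappers build the actual Lucene range query.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Illustrative sketch only: FuzzyRangeSketch is a hypothetical name.
final class FuzzyRangeSketch {

    // Returns {lower, upper} for a numeric value and its +/- margin.
    static long[] numericBounds(long value, long fuzziness) {
        return new long[] { value - fuzziness, value + fuzziness };
    }

    // Returns {lower, upper} for a date value and a time-value margin such as "1d".
    static LocalDateTime[] dateBounds(LocalDateTime value, Duration fuzziness) {
        return new LocalDateTime[] { value.minus(fuzziness), value.plus(fuzziness) };
    }

    public static void main(String[] args) {
        // "price": value 12 with fuzziness 2
        long[] price = numericBounds(12, 2);
        System.out.println(price[0] + " .. " + price[1]);

        // "created": value 2010-02-05T12:05:07 with fuzziness "1d"
        LocalDateTime[] created =
                dateBounds(LocalDateTime.parse("2010-02-05T12:05:07"), Duration.ofDays(1));
        System.out.println(created[0] + " .. " + created[1]);
    }
}
```

For the `price` example this yields the bounds 10 and 14, as stated in the text.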
|
||||
|
|
|
@ -34,9 +34,10 @@ The `analyzer` can be set to control which analyzer will perform the
|
|||
analysis process on the text. It defaults to the field's explicit mapping
|
||||
definition, or the default search analyzer.
|
||||
|
||||
`fuzziness` can be set to a value (depending on the relevant type, for
|
||||
string types it should be a value between `0.0` and `1.0`) to constructs
|
||||
fuzzy queries for each term analyzed. The `prefix_length` and
|
||||
`fuzziness` allows _fuzzy matching_ based on the type of field being queried.
|
||||
See <<fuzziness>> for allowed settings.
|
||||
|
||||
The `prefix_length` and
|
||||
`max_expansions` can be set in this case to control the fuzzy process.
|
||||
If the fuzzy option is set the query will use `constant_score_rewrite`
|
||||
as its <<query-dsl-multi-term-rewrite,rewrite
|
||||
|
@ -80,9 +81,9 @@ change that the `zero_terms_query` option can be used, which accepts
|
|||
.cutoff_frequency
|
||||
The match query supports a `cutoff_frequency` that allows
|
||||
specifying an absolute or relative document frequency where high
|
||||
frequent terms are moved into an optional subquery and are only scored
|
||||
if one of the low frequent (below the cutoff) terms in the case of an
|
||||
`or` operator or all of the low frequent terms in the case of an `and`
|
||||
frequent terms are moved into an optional subquery and are only scored
|
||||
if one of the low frequent (below the cutoff) terms in the case of an
|
||||
`or` operator or all of the low frequent terms in the case of an `and`
|
||||
operator match.
|
||||
|
||||
This query allows handling `stopwords` dynamically at runtime, is domain
|
||||
|
@ -101,8 +102,8 @@ Note: If the `cutoff_frequency` is used and the operator is `and`
|
|||
_stacked tokens_ (tokens that are on the same position like `synonym` filter emits)
|
||||
are not handled gracefully as they are in a pure `and` query. For instance the query
|
||||
`fast fox` is analyzed into 3 terms `[fast, quick, fox]` where `quick` is a synonym
|
||||
for `fast` on the same token positions the query might require `fast` and `quick` to
|
||||
match if the operator is `and`.
|
||||
for `fast` on the same token positions the query might require `fast` and `quick` to
|
||||
match if the operator is `and`.
|
||||
|
||||
Here is an example showing a query composed of stopwords exclusively:
|
||||
|
||||
|
|
|
@ -46,8 +46,8 @@ increments in result queries. Defaults to `true`.
|
|||
|`fuzzy_max_expansions` |Controls the number of terms fuzzy queries will
|
||||
expand to. Defaults to `50`
|
||||
|
||||
|`fuzzy_min_sim` |Set the minimum similarity for fuzzy queries. Defaults
|
||||
to `0.5`
|
||||
|`fuzziness` |Set the fuzziness for fuzzy queries. Defaults
|
||||
to `AUTO`. See <<fuzziness>> for allowed settings.
|
||||
|
||||
|`fuzzy_prefix_length` |Set the prefix length for fuzzy queries. Default
|
||||
is `0`.
|
||||
|
@ -70,7 +70,7 @@ in the resulting boolean query should match. It can be an absolute value
|
|||
both>>.
|
||||
|
||||
|`lenient` |If set to `true` will cause format based failures (like
|
||||
providing text to a numeric field) to be ignored.
|
||||
providing text to a numeric field) to be ignored.
|
||||
|=======================================================================
|
||||
|
||||
When a multi term query is being generated, one can control how it gets
|
||||
|
@ -128,7 +128,7 @@ search on all "city" fields:
|
|||
|
||||
Another option is to provide the wildcard fields search in the query
|
||||
string itself (properly escaping the `*` sign), for example:
|
||||
`city.\*:something`.
|
||||
`city.\*:something`.
|
||||
|
||||
When running the `query_string` query against multiple fields, the
|
||||
following additional parameters are allowed:
|
||||
|
|
|
@ -199,7 +199,7 @@ curl -X POST 'localhost:9200/music/_suggest?pretty' -d '{
|
|||
"completion" : {
|
||||
"field" : "suggest",
|
||||
"fuzzy" : {
|
||||
"edit_distance" : 2
|
||||
"fuzziness" : 2
|
||||
}
|
||||
}
|
||||
}
|
||||
|
@ -210,8 +210,9 @@ The fuzzy query can take specific fuzzy parameters.
|
|||
The following parameters are supported:
|
||||
|
||||
[horizontal]
|
||||
`edit_distance`::
|
||||
Maximum edit distance, defaults to `1`
|
||||
`fuzziness`::
|
||||
The fuzziness factor, defaults to `AUTO`.
|
||||
See <<fuzziness>> for allowed settings.
|
||||
|
||||
`transpositions`::
|
||||
Sets if transpositions should be counted
|
||||
|
|
|
@ -30,6 +30,7 @@ import org.apache.lucene.util.automaton.RegExp;
|
|||
import org.elasticsearch.common.lucene.Lucene;
|
||||
import org.elasticsearch.common.lucene.search.Queries;
|
||||
import org.elasticsearch.common.lucene.search.XFilteredQuery;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.index.mapper.FieldMapper;
|
||||
import org.elasticsearch.index.mapper.MapperService;
|
||||
import org.elasticsearch.index.query.QueryParseContext;
|
||||
|
@ -435,7 +436,7 @@ public class MapperQueryParser extends QueryParser {
|
|||
if (currentMapper != null) {
|
||||
try {
|
||||
//LUCENE 4 UPGRADE I disabled transpositions here by default - maybe this needs to be changed
|
||||
Query fuzzyQuery = currentMapper.fuzzyQuery(termStr, minSimilarity, fuzzyPrefixLength, settings.fuzzyMaxExpansions(), false);
|
||||
Query fuzzyQuery = currentMapper.fuzzyQuery(termStr, Fuzziness.build(minSimilarity), fuzzyPrefixLength, settings.fuzzyMaxExpansions(), false);
|
||||
return wrapSmartNameQuery(fuzzyQuery, fieldMappers, parseContext);
|
||||
} catch (RuntimeException e) {
|
||||
if (settings.lenient()) {
|
||||
|
|
|
@ -0,0 +1,74 @@
|
|||
/*
|
||||
* Licensed to Elasticsearch under one or more contributor
|
||||
* license agreements. See the NOTICE file distributed with
|
||||
* this work for additional information regarding copyright
|
||||
* ownership. Elasticsearch licenses this file to you under
|
||||
* the Apache License, Version 2.0 (the "License"); you may
|
||||
* not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing,
|
||||
* software distributed under the License is distributed on an
|
||||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
* KIND, either express or implied. See the License for the
|
||||
* specific language governing permissions and limitations
|
||||
* under the License.
|
||||
*/
|
||||
package org.elasticsearch.common;
|
||||
|
||||
import org.elasticsearch.ElasticsearchIllegalArgumentException;
|
||||
|
||||
import java.util.EnumSet;
|
||||
import java.util.HashSet;
|
||||
|
||||
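/**
* Holds a field name in both camel-case and underscore form, together with
* any deprecated names that are still accepted for it. When matched with
* {@link Flag#STRICT}, a deprecated name causes an exception to be thrown.
*/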
|
||||
public class ParseField {
|
||||
private final String camelCaseName;
|
||||
private final String underscoreName;
|
||||
private final String[] deprecatedNames;
|
||||
|
||||
public static final EnumSet<Flag> EMPTY_FLAGS = EnumSet.noneOf(Flag.class);
|
||||
|
||||
public static enum Flag {
|
||||
STRICT
|
||||
}
|
||||
|
||||
public ParseField(String value, String... deprecatedNames) {
|
||||
camelCaseName = Strings.toCamelCase(value);
|
||||
underscoreName = Strings.toUnderscoreCase(value);
|
||||
if (deprecatedNames == null || deprecatedNames.length == 0) {
|
||||
this.deprecatedNames = Strings.EMPTY_ARRAY;
|
||||
} else {
|
||||
final HashSet<String> set = new HashSet<String>();
|
||||
for (String depName : deprecatedNames) {
|
||||
set.add(Strings.toCamelCase(depName));
|
||||
set.add(Strings.toUnderscoreCase(depName));
|
||||
}
|
||||
this.deprecatedNames = set.toArray(new String[0]);
|
||||
}
|
||||
}
|
||||
|
||||
public ParseField withDeprecation(String... deprecatedNames) {
|
||||
return new ParseField(this.underscoreName, deprecatedNames);
|
||||
}
|
||||
|
||||
public boolean match(String currentFieldName, EnumSet<Flag> flags) {
|
||||
if (currentFieldName.equals(camelCaseName) || currentFieldName.equals(underscoreName)) {
|
||||
return true;
|
||||
}
|
||||
for (String depName : deprecatedNames) {
|
||||
if (currentFieldName.equals(depName)) {
|
||||
if (flags.contains(Flag.STRICT)) {
|
||||
throw new ElasticsearchIllegalArgumentException("Deprecated field [" + currentFieldName + "] used, expected [" + underscoreName + "] instead");
|
||||
}
|
||||
return true;
|
||||
}
|
||||
}
|
||||
return false;
|
||||
}
|
||||
|
||||
|
||||
}
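The matching behaviour of `ParseField` may be easier to follow in isolation. The sketch below is a simplified, self-contained stand-in and not the class above: `DeprecatedNameSketch` is a hypothetical name, it skips the camel-case conversion, and it throws a plain `IllegalArgumentException` in place of `ElasticsearchIllegalArgumentException`. The deprecated names used are the ones from the commit title.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative stand-in for ParseField; hypothetical and heavily simplified.
final class DeprecatedNameSketch {
    private final String name;
    private final Set<String> deprecatedNames;

    DeprecatedNameSketch(String name, String... deprecatedNames) {
        this.name = name;
        this.deprecatedNames = new HashSet<>(Arrays.asList(deprecatedNames));
    }

    // Returns true for the current name or a deprecated one; in strict
    // mode a deprecated name is rejected with an exception instead.
    boolean match(String fieldName, boolean strict) {
        if (name.equals(fieldName)) {
            return true;
        }
        if (deprecatedNames.contains(fieldName)) {
            if (strict) {
                throw new IllegalArgumentException(
                        "Deprecated field [" + fieldName + "] used, expected [" + name + "] instead");
            }
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        DeprecatedNameSketch fuzziness =
                new DeprecatedNameSketch("fuzziness", "min_similarity", "edit_distance");
        System.out.println(fuzziness.match("fuzziness", true));       // current name
        System.out.println(fuzziness.match("min_similarity", false)); // tolerated when lenient
        System.out.println(fuzziness.match("boost", false));          // unrelated key
    }
}
```

Keeping the strictness as an argument to `match`, as the original does with its `EnumSet<Flag>`, lets a single field definition serve both lenient and strict callers.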
|
|
@ -0,0 +1,256 @@
|
|||
/*
|
||||
* Licensed to Elasticsearch under one or more contributor
|
||||
* license agreements. See the NOTICE file distributed with
|
||||
* this work for additional information regarding copyright
|
||||
* ownership. Elasticsearch licenses this file to you under
|
||||
* the Apache License, Version 2.0 (the "License"); you may
|
||||
* not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing,
|
||||
* software distributed under the License is distributed on an
|
||||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
* KIND, either express or implied. See the License for the
|
||||
* specific language governing permissions and limitations
|
||||
* under the License.
|
||||
*/
|
||||
package org.elasticsearch.common.unit;
|
||||
|
||||
import org.apache.lucene.search.FuzzyQuery;
|
||||
import org.apache.lucene.util.automaton.LevenshteinAutomata;
|
||||
import org.elasticsearch.ElasticsearchIllegalArgumentException;
|
||||
import org.elasticsearch.common.ParseField;
|
||||
import org.elasticsearch.common.Preconditions;
|
||||
import org.elasticsearch.common.xcontent.ToXContent;
|
||||
import org.elasticsearch.common.xcontent.XContentBuilder;
|
||||
import org.elasticsearch.common.xcontent.XContentBuilderString;
|
||||
import org.elasticsearch.common.xcontent.XContentParser;
|
||||
|
||||
import java.io.IOException;
|
||||
|
||||
/**
|
||||
* A unit class that encapsulates all inexact search
|
||||
* parsing and conversion from similarities to edit distances
|
||||
* etc.
|
||||
*/
|
||||
public final class Fuzziness implements ToXContent {
|
||||
|
||||
public static final XContentBuilderString X_FIELD_NAME = new XContentBuilderString("fuzziness");
|
||||
public static final Fuzziness ZERO = new Fuzziness(0);
|
||||
public static final Fuzziness ONE = new Fuzziness(1);
|
||||
public static final Fuzziness TWO = new Fuzziness(2);
|
||||
public static final Fuzziness AUTO = new Fuzziness("AUTO");
|
||||
public static final ParseField FIELD = new ParseField(X_FIELD_NAME.camelCase().getValue());
|
||||
|
||||
private final Object fuzziness;
|
||||
|
||||
private Fuzziness(int fuzziness) {
|
||||
Preconditions.checkArgument(fuzziness >= 0 && fuzziness <= 2, "Valid edit distances are [0, 1, 2] but was [" + fuzziness + "]");
|
||||
this.fuzziness = fuzziness;
|
||||
}
|
||||
|
||||
private Fuzziness(float fuzziness) {
|
||||
Preconditions.checkArgument(fuzziness >= 0.0 && fuzziness < 1.0f, "Valid similarities must be in the interval [0..1) but was [" + fuzziness + "]");
|
||||
this.fuzziness = fuzziness;
|
||||
}
|
||||
|
||||
private Fuzziness(String fuzziness) {
|
||||
this.fuzziness = fuzziness;
|
||||
}
|
||||
|
||||
/**
|
||||
* Creates a {@link Fuzziness} instance from a similarity. The value must be in the range <tt>[0..1)</tt>
|
||||
*/
|
||||
public static Fuzziness fromSimilarity(float similarity) {
|
||||
return new Fuzziness(similarity);
|
||||
}
|
||||
|
||||
/**
|
||||
* Creates a {@link Fuzziness} instance from an edit distance. The value must be one of <tt>[0, 1, 2]</tt>
|
||||
*/
|
||||
public static Fuzziness fromEdits(int edits) {
|
||||
return new Fuzziness(edits);
|
||||
}
|
||||
|
||||
public static Fuzziness build(Object fuzziness) {
|
||||
if (fuzziness instanceof Fuzziness) {
|
||||
return (Fuzziness) fuzziness;
|
||||
}
|
||||
String string = fuzziness.toString();
|
||||
if (AUTO.asString().equalsIgnoreCase(string)) {
|
||||
return AUTO;
|
||||
}
|
||||
return new Fuzziness(string);
|
||||
}
|
||||
|
||||
public static Fuzziness parse(XContentParser parser) throws IOException {
|
||||
XContentParser.Token token = parser.currentToken();
|
||||
switch (token) {
|
||||
case VALUE_STRING:
|
||||
case VALUE_NUMBER:
|
||||
final String fuzziness = parser.text();
|
||||
if (AUTO.asString().equalsIgnoreCase(fuzziness)) {
|
||||
return AUTO;
|
||||
}
|
||||
try {
|
||||
final int minimumSimilarity = Integer.parseInt(fuzziness);
|
||||
switch (minimumSimilarity) {
|
||||
case 0:
|
||||
return ZERO;
|
||||
case 1:
|
||||
return ONE;
|
||||
case 2:
|
||||
return TWO;
|
||||
default:
|
||||
return build(fuzziness);
|
||||
}
|
||||
} catch (NumberFormatException ex) {
|
||||
return build(fuzziness);
|
||||
}
|
||||
|
||||
default:
|
||||
throw new ElasticsearchIllegalArgumentException("Can't parse fuzziness on token: [" + token + "]");
|
||||
}
|
||||
}
|
||||
|
||||
@Override
|
||||
public XContentBuilder toXContent(XContentBuilder builder, Params params) throws IOException {
|
||||
return toXContent(builder, params, true);
|
||||
}
|
||||
|
||||
public XContentBuilder toXContent(XContentBuilder builder, Params params, boolean includeFieldName) throws IOException {
|
||||
if (includeFieldName) {
|
||||
builder.field(X_FIELD_NAME, fuzziness);
|
||||
} else {
|
||||
builder.value(fuzziness);
|
||||
}
|
||||
return builder;
|
||||
}
|
||||
|
||||
public int asDistance() {
|
||||
return asDistance(null);
|
||||
}
|
||||
|
||||
public int asDistance(String text) {
|
||||
if (fuzziness instanceof String) {
|
||||
if (this == AUTO) { //AUTO
|
||||
final int len = termLen(text);
|
||||
if (len <= 2) {
|
||||
return 0;
|
||||
} else if (len > 5) {
|
||||
return 2;
|
||||
} else {
|
||||
return 1;
|
||||
}
|
||||
}
|
||||
}
|
||||
return FuzzyQuery.floatToEdits(asFloat(), termLen(text));
|
||||
}
|
||||
|
||||
public TimeValue asTimeValue() {
|
||||
if (this == AUTO) {
|
||||
return TimeValue.timeValueMillis(1);
|
||||
} else {
|
||||
return TimeValue.parseTimeValue(fuzziness.toString(), null);
|
||||
}
|
||||
}
|
||||
|
||||
public long asLong() {
|
||||
if (this == AUTO) {
|
||||
return 1;
|
||||
}
|
||||
try {
|
||||
return Long.parseLong(fuzziness.toString());
|
||||
} catch (NumberFormatException ex) {
|
||||
return (long) Double.parseDouble(fuzziness.toString());
|
||||
}
|
||||
}
|
||||
|
||||
public int asInt() {
|
||||
if (this == AUTO) {
|
||||
return 1;
|
||||
}
|
||||
try {
|
||||
return Integer.parseInt(fuzziness.toString());
|
||||
} catch (NumberFormatException ex) {
|
||||
return (int) Float.parseFloat(fuzziness.toString());
|
||||
}
|
||||
}
|
||||
|
||||
public short asShort() {
|
||||
if (this == AUTO) {
|
||||
return 1;
|
||||
}
|
||||
try {
|
||||
return Short.parseShort(fuzziness.toString());
|
||||
} catch (NumberFormatException ex) {
|
||||
return (short) Float.parseFloat(fuzziness.toString());
|
||||
}
|
||||
}
|
||||
|
||||
public byte asByte() {
|
||||
if (this == AUTO) {
|
||||
return 1;
|
||||
}
|
||||
try {
|
||||
return Byte.parseByte(fuzziness.toString());
|
||||
} catch (NumberFormatException ex) {
|
||||
return (byte) Float.parseFloat(fuzziness.toString());
|
||||
}
|
||||
}
|
||||
|
||||
public double asDouble() {
|
||||
if (this == AUTO) {
|
||||
return 1d;
|
||||
}
|
||||
return Double.parseDouble(fuzziness.toString());
|
||||
}
|
||||
|
||||
public float asFloat() {
|
||||
if (this == AUTO) {
|
||||
return 1f;
|
||||
}
|
||||
return Float.parseFloat(fuzziness.toString());
|
||||
}
|
||||
|
||||
public float asSimilarity() {
|
||||
return asSimilarity(null);
|
||||
}
|
||||
|
||||
public float asSimilarity(String text) {
|
||||
if (this == AUTO) {
|
||||
final int len = termLen(text);
|
||||
if (len <= 2) {
|
||||
return 0.0f;
|
||||
} else if (len > 5) {
|
||||
return 0.5f;
|
||||
} else {
|
||||
return 0.66f;
|
||||
}
|
||||
// return dist == 0 ? dist : Math.min(0.999f, Math.max(0.0f, 1.0f - ((float) dist/ (float) termLen(text))));
|
||||
}
|
||||
if (fuzziness instanceof Float) { // it's a similarity
|
||||
return ((Float) fuzziness).floatValue();
|
||||
} else if (fuzziness instanceof Integer) { // it's an edit!
|
||||
int dist = Math.min(((Integer) fuzziness).intValue(),
|
||||
LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE);
|
||||
return Math.min(0.999f, Math.max(0.0f, 1.0f - ((float) dist / (float) termLen(text))));
|
||||
} else {
|
||||
final float similarity = Float.parseFloat(fuzziness.toString());
|
||||
if (similarity >= 0.0f && similarity < 1.0f) {
|
||||
return similarity;
|
||||
}
|
||||
}
|
||||
throw new ElasticsearchIllegalArgumentException("Can't get similarity from fuzziness [" + fuzziness + "]");
|
||||
}
|
||||
|
||||
private int termLen(String text) {
|
||||
return text == null ? 5 : text.codePointCount(0, text.length()); // 5 avg term length in english
|
||||
}
|
||||
|
||||
public String asString() {
|
||||
return fuzziness.toString();
|
||||
}
|
||||
}
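The integer branch of `asSimilarity(String)` converts an edit distance back into a similarity. A minimal standalone sketch of that arithmetic follows; `EditsToSimilaritySketch` is a hypothetical name, the constant stands in for `LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE`, and the sketch assumes a term length greater than zero.

```java
// Illustrative sketch only: mirrors the clamping arithmetic of the integer
// branch in Fuzziness.asSimilarity(String); the class name is hypothetical.
final class EditsToSimilaritySketch {

    // Stand-in for LevenshteinAutomata.MAXIMUM_SUPPORTED_DISTANCE.
    static final int MAX_SUPPORTED_DISTANCE = 2;

    // similarity = 1 - edits / termLength, clamped to [0.0, 0.999].
    // Assumes termLength > 0.
    static float similarity(int edits, int termLength) {
        int dist = Math.min(edits, MAX_SUPPORTED_DISTANCE);
        float raw = 1.0f - ((float) dist / (float) termLength);
        return Math.min(0.999f, Math.max(0.0f, raw));
    }

    public static void main(String[] args) {
        System.out.println(similarity(2, 5)); // two edits on a five-character term
        System.out.println(similarity(0, 5)); // zero edits is clamped below 1.0
        System.out.println(similarity(2, 1)); // more edits than characters clamps to 0.0
    }
}
```

The upper clamp keeps the result inside the `[0..1)` interval that the similarity constructor accepts.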
|
|
@ -28,6 +28,7 @@ import org.apache.lucene.search.MultiTermQuery;
|
|||
import org.apache.lucene.search.Query;
|
||||
import org.apache.lucene.util.BytesRef;
|
||||
import org.elasticsearch.common.Nullable;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.index.codec.docvaluesformat.DocValuesFormatProvider;
|
||||
import org.elasticsearch.index.codec.postingsformat.PostingsFormatProvider;
|
||||
import org.elasticsearch.index.fielddata.FieldDataType;
|
||||
|
@ -214,7 +215,7 @@ public interface FieldMapper<T> extends Mapper {
|
|||
|
||||
Filter rangeFilter(Object lowerTerm, Object upperTerm, boolean includeLower, boolean includeUpper, @Nullable QueryParseContext context);
|
||||
|
||||
Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions);
|
||||
Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions);
|
||||
|
||||
Query prefixQuery(Object value, @Nullable MultiTermQuery.RewriteMethod method, @Nullable QueryParseContext context);
|
||||
|
||||
|
|
|
@ -37,6 +37,7 @@ import org.elasticsearch.common.lucene.Lucene;
|
|||
import org.elasticsearch.common.lucene.search.RegexpFilter;
|
||||
import org.elasticsearch.common.settings.ImmutableSettings;
|
||||
import org.elasticsearch.common.settings.Settings;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.common.xcontent.XContentBuilder;
|
||||
import org.elasticsearch.index.analysis.NamedAnalyzer;
|
||||
import org.elasticsearch.index.codec.docvaluesformat.DocValuesFormatProvider;
|
||||
|
@ -466,9 +467,8 @@ public abstract class AbstractFieldMapper<T> implements FieldMapper<T> {
|
|||
}
|
||||
|
||||
@Override
|
||||
public Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions) {
|
||||
int edits = FuzzyQuery.floatToEdits(Float.parseFloat(minSim), value.codePointCount(0, value.length()));
|
||||
return new FuzzyQuery(names.createIndexNameTerm(indexedValueForSearch(value)), edits, prefixLength, maxExpansions, transpositions);
|
||||
public Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions) {
|
||||
return new FuzzyQuery(names.createIndexNameTerm(indexedValueForSearch(value)), fuzziness.asDistance(value), prefixLength, maxExpansions, transpositions);
|
||||
}
|
||||
|
||||
@Override
|
||||
|
|
|
@ -34,6 +34,7 @@ import org.elasticsearch.common.Explicit;
|
|||
import org.elasticsearch.common.Nullable;
|
||||
import org.elasticsearch.common.Strings;
|
||||
import org.elasticsearch.common.settings.Settings;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.common.xcontent.XContentBuilder;
|
||||
import org.elasticsearch.common.xcontent.XContentParser;
|
||||
import org.elasticsearch.index.analysis.NamedAnalyzer;
|
||||
|
@ -181,14 +182,9 @@ public class ByteFieldMapper extends NumberFieldMapper<Byte> {
|
|||
}
|
||||
|
||||
@Override
|
||||
public Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions) {
|
||||
public Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions) {
|
||||
byte iValue = Byte.parseByte(value);
|
||||
byte iSim;
|
||||
try {
|
||||
iSim = Byte.parseByte(minSim);
|
||||
} catch (NumberFormatException e) {
|
||||
iSim = (byte) Float.parseFloat(minSim);
|
||||
}
|
||||
byte iSim = fuzziness.asByte();
|
||||
return NumericRangeQuery.newIntRange(names.indexName(), precisionStep,
|
||||
iValue - iSim,
|
||||
iValue + iSim,
|
||||
|
|
|
@ -36,7 +36,7 @@ import org.elasticsearch.common.joda.DateMathParser;
|
|||
import org.elasticsearch.common.joda.FormatDateTimeFormatter;
|
||||
import org.elasticsearch.common.joda.Joda;
|
||||
import org.elasticsearch.common.settings.Settings;
|
||||
import org.elasticsearch.common.unit.TimeValue;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.common.xcontent.XContentBuilder;
|
||||
import org.elasticsearch.common.xcontent.XContentParser;
|
||||
import org.elasticsearch.index.analysis.NamedAnalyzer;
|
||||
|
@ -291,14 +291,14 @@ public class DateFieldMapper extends NumberFieldMapper<Long> {
|
|||
}
|
||||
|
||||
@Override
|
||||
public Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions) {
|
||||
public Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions) {
|
||||
long iValue = dateMathParser.parse(value, System.currentTimeMillis());
|
||||
long iSim;
|
||||
try {
|
||||
iSim = TimeValue.parseTimeValue(minSim, null).millis();
|
||||
iSim = fuzziness.asTimeValue().millis();
|
||||
} catch (Exception e) {
|
||||
// not a time format
|
||||
iSim = (long) Double.parseDouble(minSim);
|
||||
iSim = fuzziness.asLong();
|
||||
}
|
||||
return NumericRangeQuery.newLongRange(names.indexName(), precisionStep,
|
||||
iValue - iSim,
|
||||
|
|
|
@@ -36,6 +36,7 @@ import org.elasticsearch.common.Explicit;
 import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.Numbers;
 import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.util.ByteUtils;
 import org.elasticsearch.common.util.CollectionUtils;
 import org.elasticsearch.common.xcontent.XContentBuilder;
@@ -171,9 +172,9 @@ public class DoubleFieldMapper extends NumberFieldMapper<Double> {
     }
 
     @Override
-    public Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions) {
+    public Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions) {
         double iValue = Double.parseDouble(value);
-        double iSim = Double.parseDouble(minSim);
+        double iSim = fuzziness.asDouble();
         return NumericRangeQuery.newDoubleRange(names.indexName(), precisionStep,
                 iValue - iSim,
                 iValue + iSim,

@@ -37,6 +37,7 @@ import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.Numbers;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.util.ByteUtils;
 import org.elasticsearch.common.util.CollectionUtils;
 import org.elasticsearch.common.xcontent.XContentBuilder;
@@ -181,9 +182,9 @@ public class FloatFieldMapper extends NumberFieldMapper<Float> {
     }
 
     @Override
-    public Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions) {
+    public Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions) {
         float iValue = Float.parseFloat(value);
-        float iSim = Float.parseFloat(minSim);
+        final float iSim = fuzziness.asFloat();
         return NumericRangeQuery.newFloatRange(names.indexName(), precisionStep,
                 iValue - iSim,
                 iValue + iSim,

@@ -35,6 +35,7 @@ import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.Numbers;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentBuilder;
 import org.elasticsearch.common.xcontent.XContentParser;
 import org.elasticsearch.index.analysis.NumericIntegerAnalyzer;
@@ -176,14 +177,9 @@ public class IntegerFieldMapper extends NumberFieldMapper<Integer> {
     }
 
     @Override
-    public Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions) {
+    public Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions) {
         int iValue = Integer.parseInt(value);
-        int iSim;
-        try {
-            iSim = Integer.parseInt(minSim);
-        } catch (NumberFormatException e) {
-            iSim = (int) Float.parseFloat(minSim);
-        }
+        int iSim = fuzziness.asInt();
         return NumericRangeQuery.newIntRange(names.indexName(), precisionStep,
                 iValue - iSim,
                 iValue + iSim,

@@ -35,6 +35,7 @@ import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.Numbers;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentBuilder;
 import org.elasticsearch.common.xcontent.XContentParser;
 import org.elasticsearch.index.analysis.NumericLongAnalyzer;
@@ -165,14 +166,9 @@ public class LongFieldMapper extends NumberFieldMapper<Long> {
     }
 
     @Override
-    public Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions) {
+    public Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions) {
         long iValue = Long.parseLong(value);
-        long iSim;
-        try {
-            iSim = Long.parseLong(minSim);
-        } catch (NumberFormatException e) {
-            iSim = (long) Double.parseDouble(minSim);
-        }
+        final long iSim = fuzziness.asLong();
         return NumericRangeQuery.newLongRange(names.indexName(), precisionStep,
                 iValue - iSim,
                 iValue + iSim,

@@ -39,6 +39,7 @@ import org.apache.lucene.util.NumericUtils;
 import org.elasticsearch.common.Explicit;
 import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.util.ByteUtils;
 import org.elasticsearch.common.util.CollectionUtils;
 import org.elasticsearch.common.xcontent.XContentBuilder;
@@ -239,7 +240,7 @@ public abstract class NumberFieldMapper<T extends Number> extends AbstractFieldM
     public abstract Filter rangeFilter(Object lowerTerm, Object upperTerm, boolean includeLower, boolean includeUpper, @Nullable QueryParseContext context);
 
     @Override
-    public abstract Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions);
+    public abstract Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions);
 
     /**
      * A range filter based on the field data cache.

@@ -35,6 +35,7 @@ import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.Numbers;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentBuilder;
 import org.elasticsearch.common.xcontent.XContentParser;
 import org.elasticsearch.index.analysis.NamedAnalyzer;
@@ -180,14 +181,9 @@ public class ShortFieldMapper extends NumberFieldMapper<Short> {
     }
 
     @Override
-    public Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions) {
+    public Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions) {
         short iValue = Short.parseShort(value);
-        short iSim;
-        try {
-            iSim = Short.parseShort(minSim);
-        } catch (NumberFormatException e) {
-            iSim = (short) Float.parseFloat(minSim);
-        }
+        short iSim = fuzziness.asShort();
         return NumericRangeQuery.newIntRange(names.indexName(), precisionStep,
                 iValue - iSim,
                 iValue + iSim,

@@ -32,6 +32,7 @@ import org.elasticsearch.common.Numbers;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.settings.ImmutableSettings;
 import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentBuilder;
 import org.elasticsearch.common.xcontent.XContentParser;
 import org.elasticsearch.index.analysis.NumericFloatAnalyzer;
@@ -183,9 +184,9 @@ public class BoostFieldMapper extends NumberFieldMapper<Float> implements Intern
     }
 
     @Override
-    public Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions) {
+    public Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions) {
         float iValue = Float.parseFloat(value);
-        float iSim = Float.parseFloat(minSim);
+        float iSim = fuzziness.asFloat();
         return NumericRangeQuery.newFloatRange(names.indexName(), precisionStep,
                 iValue - iSim,
                 iValue + iSim,

@@ -34,6 +34,7 @@ import org.elasticsearch.common.Nullable;
 import org.elasticsearch.common.Numbers;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.settings.Settings;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentBuilder;
 import org.elasticsearch.common.xcontent.XContentParser;
 import org.elasticsearch.index.analysis.NamedAnalyzer;
@@ -216,17 +217,13 @@ public class IpFieldMapper extends NumberFieldMapper<Long> {
     }
 
     @Override
-    public Query fuzzyQuery(String value, String minSim, int prefixLength, int maxExpansions, boolean transpositions) {
+    public Query fuzzyQuery(String value, Fuzziness fuzziness, int prefixLength, int maxExpansions, boolean transpositions) {
         long iValue = ipToLong(value);
         long iSim;
         try {
-            iSim = ipToLong(minSim);
+            iSim = ipToLong(fuzziness.asString());
         } catch (ElasticsearchIllegalArgumentException e) {
-            try {
-                iSim = Long.parseLong(minSim);
-            } catch (NumberFormatException e1) {
-                iSim = (long) Double.parseDouble(minSim);
-            }
+            iSim = fuzziness.asLong();
         }
         return NumericRangeQuery.newLongRange(names.indexName(), precisionStep,
                 iValue - iSim,

@@ -20,6 +20,7 @@
 package org.elasticsearch.index.query;
 
 import org.elasticsearch.ElasticsearchIllegalArgumentException;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentBuilder;
 
 import java.io.IOException;
@@ -34,7 +35,7 @@ public class FuzzyLikeThisFieldQueryBuilder extends BaseQueryBuilder implements
     private Float boost;
 
     private String likeText = null;
-    private Float minSimilarity;
+    private Fuzziness fuzziness;
     private Integer prefixLength;
     private Integer maxQueryTerms;
     private Boolean ignoreTF;
@@ -59,8 +60,8 @@ public class FuzzyLikeThisFieldQueryBuilder extends BaseQueryBuilder implements
         return this;
     }
 
-    public FuzzyLikeThisFieldQueryBuilder minSimilarity(float minSimilarity) {
-        this.minSimilarity = minSimilarity;
+    public FuzzyLikeThisFieldQueryBuilder fuzziness(Fuzziness fuzziness) {
+        this.fuzziness = fuzziness;
         return this;
     }
 
@@ -119,8 +120,8 @@ public class FuzzyLikeThisFieldQueryBuilder extends BaseQueryBuilder implements
         if (maxQueryTerms != null) {
             builder.field("max_query_terms", maxQueryTerms);
         }
-        if (minSimilarity != null) {
-            builder.field("min_similarity", minSimilarity);
+        if (fuzziness != null) {
+            fuzziness.toXContent(builder, params);
         }
         if (prefixLength != null) {
             builder.field("prefix_length", prefixLength);

@@ -23,8 +23,10 @@ import org.apache.lucene.analysis.Analyzer;
 import org.apache.lucene.sandbox.queries.FuzzyLikeThisQuery;
 import org.apache.lucene.search.Query;
 import org.elasticsearch.ElasticsearchIllegalArgumentException;
+import org.elasticsearch.common.ParseField;
 import org.elasticsearch.common.Strings;
 import org.elasticsearch.common.inject.Inject;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentParser;
 import org.elasticsearch.index.analysis.Analysis;
 import org.elasticsearch.index.mapper.MapperService;
@@ -48,6 +50,8 @@ import static org.elasticsearch.index.query.support.QueryParsers.wrapSmartNameQu
 public class FuzzyLikeThisFieldQueryParser implements QueryParser {
 
     public static final String NAME = "flt_field";
+    private static final Fuzziness DEFAULT_FUZZINESS = Fuzziness.fromSimilarity(0.5f);
+    private static final ParseField FUZZINESS = Fuzziness.FIELD.withDeprecation("min_similarity");
 
     @Inject
     public FuzzyLikeThisFieldQueryParser() {
@@ -65,7 +69,7 @@ public class FuzzyLikeThisFieldQueryParser implements QueryParser {
         int maxNumTerms = 25;
         float boost = 1.0f;
         String likeText = null;
-        float minSimilarity = 0.5f;
+        Fuzziness fuzziness = DEFAULT_FUZZINESS;
         int prefixLength = 0;
         boolean ignoreTF = false;
         Analyzer analyzer = null;
@@ -98,8 +102,8 @@ public class FuzzyLikeThisFieldQueryParser implements QueryParser {
                     boost = parser.floatValue();
                 } else if ("ignore_tf".equals(currentFieldName) || "ignoreTF".equals(currentFieldName)) {
                     ignoreTF = parser.booleanValue();
-                } else if ("min_similarity".equals(currentFieldName) || "minSimilarity".equals(currentFieldName)) {
-                    minSimilarity = parser.floatValue();
+                } else if (FUZZINESS.match(currentFieldName, parseContext.parseFlags())) {
+                    fuzziness = Fuzziness.parse(parser);
                 } else if ("prefix_length".equals(currentFieldName) || "prefixLength".equals(currentFieldName)) {
                     prefixLength = parser.intValue();
                 } else if ("analyzer".equals(currentFieldName)) {
@@ -139,7 +143,7 @@ public class FuzzyLikeThisFieldQueryParser implements QueryParser {
         }
 
         FuzzyLikeThisQuery fuzzyLikeThisQuery = new FuzzyLikeThisQuery(maxNumTerms, analyzer);
-        fuzzyLikeThisQuery.addTerms(likeText, fieldName, minSimilarity, prefixLength);
+        fuzzyLikeThisQuery.addTerms(likeText, fieldName, fuzziness.asSimilarity(), prefixLength);
         fuzzyLikeThisQuery.setBoost(boost);
         fuzzyLikeThisQuery.setIgnoreTF(ignoreTF);
 
@@ -156,4 +160,4 @@ public class FuzzyLikeThisFieldQueryParser implements QueryParser {
         }
         return query;
     }
-}
+}

@@ -20,6 +20,7 @@
 package org.elasticsearch.index.query;
 
 import org.elasticsearch.ElasticsearchIllegalArgumentException;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentBuilder;
 
 import java.io.IOException;
@@ -34,7 +35,7 @@ public class FuzzyLikeThisQueryBuilder extends BaseQueryBuilder implements Boost
     private Float boost;
 
     private String likeText = null;
-    private Float minSimilarity;
+    private Fuzziness fuzziness;
     private Integer prefixLength;
     private Integer maxQueryTerms;
     private Boolean ignoreTF;
@@ -66,8 +67,8 @@ public class FuzzyLikeThisQueryBuilder extends BaseQueryBuilder implements Boost
         return this;
     }
 
-    public FuzzyLikeThisQueryBuilder minSimilarity(float minSimilarity) {
-        this.minSimilarity = minSimilarity;
+    public FuzzyLikeThisQueryBuilder fuzziness(Fuzziness fuzziness) {
+        this.fuzziness = fuzziness;
         return this;
     }
 
@@ -132,8 +133,8 @@ public class FuzzyLikeThisQueryBuilder extends BaseQueryBuilder implements Boost
         if (maxQueryTerms != null) {
             builder.field("max_query_terms", maxQueryTerms);
         }
-        if (minSimilarity != null) {
-            builder.field("min_similarity", minSimilarity);
+        if (fuzziness != null) {
+            fuzziness.toXContent(builder, params);
         }
         if (prefixLength != null) {
             builder.field("prefix_length", prefixLength);

@@ -24,7 +24,9 @@ import org.apache.lucene.analysis.Analyzer;
 import org.apache.lucene.sandbox.queries.FuzzyLikeThisQuery;
 import org.apache.lucene.search.Query;
 import org.elasticsearch.ElasticsearchIllegalArgumentException;
+import org.elasticsearch.common.ParseField;
 import org.elasticsearch.common.inject.Inject;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentParser;
 import org.elasticsearch.index.analysis.Analysis;
 
@@ -47,6 +49,7 @@ import java.util.List;
 public class FuzzyLikeThisQueryParser implements QueryParser {
 
     public static final String NAME = "flt";
+    private static final ParseField FUZZINESS = Fuzziness.FIELD.withDeprecation("min_similarity");
 
     @Inject
     public FuzzyLikeThisQueryParser() {
@@ -65,7 +68,7 @@ public class FuzzyLikeThisQueryParser implements QueryParser {
         float boost = 1.0f;
         List<String> fields = null;
         String likeText = null;
-        float minSimilarity = 0.5f;
+        Fuzziness fuzziness = Fuzziness.TWO;
         int prefixLength = 0;
         boolean ignoreTF = false;
         Analyzer analyzer = null;
@@ -86,8 +89,8 @@ public class FuzzyLikeThisQueryParser implements QueryParser {
                     boost = parser.floatValue();
                 } else if ("ignore_tf".equals(currentFieldName) || "ignoreTF".equals(currentFieldName)) {
                     ignoreTF = parser.booleanValue();
-                } else if ("min_similarity".equals(currentFieldName) || "minSimilarity".equals(currentFieldName)) {
-                    minSimilarity = parser.floatValue();
+                } else if (FUZZINESS.match(currentFieldName, parseContext.parseFlags())) {
+                    fuzziness = Fuzziness.parse(parser);
                 } else if ("prefix_length".equals(currentFieldName) || "prefixLength".equals(currentFieldName)) {
                     prefixLength = parser.intValue();
                 } else if ("analyzer".equals(currentFieldName)) {
@@ -139,7 +142,7 @@ public class FuzzyLikeThisQueryParser implements QueryParser {
             return null;
         }
         for (String field : fields) {
-            query.addTerms(likeText, field, minSimilarity, prefixLength);
+            query.addTerms(likeText, field, fuzziness.asSimilarity(), prefixLength);
         }
         query.setBoost(boost);
         query.setIgnoreTF(ignoreTF);

@@ -19,6 +19,7 @@
 
 package org.elasticsearch.index.query;
 
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentBuilder;
 
 import java.io.IOException;
@@ -36,7 +37,7 @@ public class FuzzyQueryBuilder extends BaseQueryBuilder implements MultiTermQuer
 
     private float boost = -1;
 
-    private String minSimilarity;
+    private Fuzziness fuzziness;
 
     private Integer prefixLength;
 
@@ -67,13 +68,8 @@ public class FuzzyQueryBuilder extends BaseQueryBuilder implements MultiTermQuer
         return this;
     }
 
-    public FuzzyQueryBuilder minSimilarity(float defaultMinSimilarity) {
-        this.minSimilarity = Float.toString(defaultMinSimilarity);
-        return this;
-    }
-
-    public FuzzyQueryBuilder minSimilarity(String defaultMinSimilarity) {
-        this.minSimilarity = defaultMinSimilarity;
+    public FuzzyQueryBuilder fuzziness(Fuzziness fuzziness) {
+        this.fuzziness = fuzziness;
         return this;
     }
 
@@ -103,7 +99,7 @@ public class FuzzyQueryBuilder extends BaseQueryBuilder implements MultiTermQuer
     @Override
     public void doXContent(XContentBuilder builder, Params params) throws IOException {
         builder.startObject(FuzzyQueryParser.NAME);
-        if (boost == -1 && minSimilarity == null && prefixLength == null && queryName != null) {
+        if (boost == -1 && fuzziness == null && prefixLength == null && queryName != null) {
             builder.field(name, value);
         } else {
             builder.startObject(name);
@@ -114,8 +110,8 @@ public class FuzzyQueryBuilder extends BaseQueryBuilder implements MultiTermQuer
             if (transpositions != null) {
                 builder.field("transpositions", transpositions);
             }
-            if (minSimilarity != null) {
-                builder.field("min_similarity", minSimilarity);
+            if (fuzziness != null) {
+                fuzziness.toXContent(builder, params);
             }
             if (prefixLength != null) {
                 builder.field("prefix_length", prefixLength);

@@ -23,7 +23,9 @@ import org.apache.lucene.index.Term;
 import org.apache.lucene.search.FuzzyQuery;
 import org.apache.lucene.search.MultiTermQuery;
 import org.apache.lucene.search.Query;
+import org.elasticsearch.common.ParseField;
 import org.elasticsearch.common.inject.Inject;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentParser;
 import org.elasticsearch.index.mapper.MapperService;
 import org.elasticsearch.index.query.support.QueryParsers;
@@ -38,6 +40,9 @@ import static org.elasticsearch.index.query.support.QueryParsers.wrapSmartNameQu
 public class FuzzyQueryParser implements QueryParser {
 
     public static final String NAME = "fuzzy";
+    private static final Fuzziness DEFAULT_FUZZINESS = Fuzziness.AUTO;
+    private static final ParseField FUZZINESS = Fuzziness.FIELD.withDeprecation("min_similarity");
+
 
     @Inject
     public FuzzyQueryParser() {
@@ -60,8 +65,7 @@ public class FuzzyQueryParser implements QueryParser {
 
         String value = null;
         float boost = 1.0f;
-        //LUCENE 4 UPGRADE we should find a good default here I'd vote for 1.0 -> 1 edit
-        String minSimilarity = "0.5";
+        Fuzziness fuzziness = DEFAULT_FUZZINESS;
         int prefixLength = FuzzyQuery.defaultPrefixLength;
         int maxExpansions = FuzzyQuery.defaultMaxExpansions;
         boolean transpositions = false;
@@ -80,8 +84,8 @@ public class FuzzyQueryParser implements QueryParser {
                             value = parser.text();
                         } else if ("boost".equals(currentFieldName)) {
                             boost = parser.floatValue();
-                        } else if ("min_similarity".equals(currentFieldName) || "minSimilarity".equals(currentFieldName)) {
-                            minSimilarity = parser.text();
+                        } else if (FUZZINESS.match(currentFieldName, parseContext.parseFlags())) {
+                            fuzziness = Fuzziness.parse(parser);
                         } else if ("prefix_length".equals(currentFieldName) || "prefixLength".equals(currentFieldName)) {
                             prefixLength = parser.intValue();
                         } else if ("max_expansions".equals(currentFieldName) || "maxExpansions".equals(currentFieldName)) {
@@ -112,14 +116,11 @@ public class FuzzyQueryParser implements QueryParser {
         MapperService.SmartNameFieldMappers smartNameFieldMappers = parseContext.smartFieldMappers(fieldName);
         if (smartNameFieldMappers != null) {
             if (smartNameFieldMappers.hasMapper()) {
-                query = smartNameFieldMappers.mapper().fuzzyQuery(value, minSimilarity, prefixLength, maxExpansions, transpositions);
+                query = smartNameFieldMappers.mapper().fuzzyQuery(value, fuzziness, prefixLength, maxExpansions, transpositions);
             }
         }
         if (query == null) {
-            //LUCENE 4 UPGRADE we need to document that this should now be an int rather than a float
-            int edits = FuzzyQuery.floatToEdits(Float.parseFloat(minSimilarity),
-                    value.codePointCount(0, value.length()));
-            query = new FuzzyQuery(new Term(fieldName, value), edits, prefixLength, maxExpansions, transpositions);
+            query = new FuzzyQuery(new Term(fieldName, value), fuzziness.asDistance(value), prefixLength, maxExpansions, transpositions);
         }
         if (query instanceof MultiTermQuery) {
             QueryParsers.setRewriteMethod((MultiTermQuery) query, rewriteMethod);

@@ -26,6 +26,7 @@ import org.apache.lucene.util.CloseableThreadLocal;
 import org.elasticsearch.ElasticsearchException;
 import org.elasticsearch.cache.recycler.CacheRecycler;
 import org.elasticsearch.common.Nullable;
+import org.elasticsearch.common.ParseField;
 import org.elasticsearch.common.bytes.BytesReference;
 import org.elasticsearch.common.inject.Inject;
 import org.elasticsearch.common.lucene.search.Queries;
@@ -47,6 +48,7 @@ import org.elasticsearch.indices.query.IndicesQueriesRegistry;
 import org.elasticsearch.script.ScriptService;
 
 import java.io.IOException;
+import java.util.EnumSet;
 import java.util.List;
 import java.util.Map;
 
@@ -93,6 +95,7 @@ public class IndexQueryParserService extends AbstractIndexComponent {
 
     private String defaultField;
     private boolean queryStringLenient;
+    private final boolean strict;
 
     @Inject
     public IndexQueryParserService(Index index, @IndexSettings Settings indexSettings,
@@ -114,6 +117,7 @@ public class IndexQueryParserService extends AbstractIndexComponent {
 
         this.defaultField = indexSettings.get("index.query.default_field", AllFieldMapper.NAME);
         this.queryStringLenient = indexSettings.getAsBoolean("index.query_string.lenient", false);
+        this.strict = indexSettings.getAsBoolean("index.query.parse.strict", false);
 
         List<QueryParser> queryParsers = newArrayList();
         if (namedQueryParsers != null) {
@@ -311,6 +315,9 @@ public class IndexQueryParserService extends AbstractIndexComponent {
 
     private ParsedQuery parse(QueryParseContext parseContext, XContentParser parser) throws IOException, QueryParsingException {
         parseContext.reset(parser);
+        if (strict) {
+            parseContext.parseFlags(EnumSet.of(ParseField.Flag.STRICT));
+        }
         Query query = parseContext.parseInnerQuery();
         if (query == null) {
             query = Queries.newMatchNoDocsQuery();

@@ -19,6 +19,7 @@
 
 package org.elasticsearch.index.query;
 
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentBuilder;
 
 import java.io.IOException;
@@ -69,7 +70,7 @@ public class MatchQueryBuilder extends BaseQueryBuilder implements BoostableQuer
 
     private Integer slop;
 
-    private String fuzziness;
+    private Fuzziness fuzziness;
 
     private Integer prefixLength;
 
@@ -82,11 +83,11 @@ public class MatchQueryBuilder extends BaseQueryBuilder implements BoostableQuer
     private String fuzzyRewrite = null;
 
     private Boolean lenient;
-
+
     private Boolean fuzzyTranspositions = null;
 
     private ZeroTermsQuery zeroTermsQuery;
-
+
     private Float cutoff_Frequency = null;
 
     private String queryName;
@@ -141,10 +142,10 @@ public class MatchQueryBuilder extends BaseQueryBuilder implements BoostableQuer
     }
 
     /**
-     * Sets the minimum similarity used when evaluated to a fuzzy query type. Defaults to "0.5".
+     * Sets the fuzziness used when evaluated to a fuzzy query type. Defaults to "AUTO".
      */
     public MatchQueryBuilder fuzziness(Object fuzziness) {
-        this.fuzziness = fuzziness.toString();
+        this.fuzziness = Fuzziness.build(fuzziness);
         return this;
     }
 
@@ -161,7 +162,7 @@ public class MatchQueryBuilder extends BaseQueryBuilder implements BoostableQuer
         this.maxExpansions = maxExpansions;
         return this;
     }
-
+
     /**
      * Set a cutoff value in [0..1] (or absolute number >=1) representing the
      * maximum threshold of a terms document frequency to be considered a low
@@ -186,7 +187,7 @@ public class MatchQueryBuilder extends BaseQueryBuilder implements BoostableQuer
         this.fuzzyRewrite = fuzzyRewrite;
         return this;
     }
-
+
     public MatchQueryBuilder fuzzyTranspositions(boolean fuzzyTranspositions) {
         //LUCENE 4 UPGRADE add documentation
         this.fuzzyTranspositions = fuzzyTranspositions;
@@ -236,7 +237,7 @@ public class MatchQueryBuilder extends BaseQueryBuilder implements BoostableQuer
             builder.field("slop", slop);
         }
         if (fuzziness != null) {
-            builder.field("fuzziness", fuzziness);
+            fuzziness.toXContent(builder, params);
         }
         if (prefixLength != null) {
             builder.field("prefix_length", prefixLength);
@@ -269,7 +270,7 @@ public class MatchQueryBuilder extends BaseQueryBuilder implements BoostableQuer
         if (queryName != null) {
             builder.field("_name", queryName);
         }
-
+
 
         builder.endObject();
         builder.endObject();

@@ -25,6 +25,7 @@ import org.apache.lucene.search.BooleanQuery;
 import org.apache.lucene.search.Query;
 import org.elasticsearch.common.inject.Inject;
 import org.elasticsearch.common.lucene.search.Queries;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentParser;
 import org.elasticsearch.index.query.support.QueryParsers;
 import org.elasticsearch.index.search.MatchQuery;
@@ -102,8 +103,8 @@ public class MatchQueryParser implements QueryParser {
                         boost = parser.floatValue();
                     } else if ("slop".equals(currentFieldName) || "phrase_slop".equals(currentFieldName) || "phraseSlop".equals(currentFieldName)) {
                         matchQuery.setPhraseSlop(parser.intValue());
-                    } else if ("fuzziness".equals(currentFieldName)) {
-                        matchQuery.setFuzziness(parser.textOrNull());
+                    } else if (Fuzziness.FIELD.match(currentFieldName, parseContext.parseFlags())) {
+                        matchQuery.setFuzziness(Fuzziness.parse(parser));
                     } else if ("prefix_length".equals(currentFieldName) || "prefixLength".equals(currentFieldName)) {
                         matchQuery.setFuzzyPrefixLength(parser.intValue());
                     } else if ("max_expansions".equals(currentFieldName) || "maxExpansions".equals(currentFieldName)) {

@@ -21,6 +21,7 @@ package org.elasticsearch.index.query;
 
 import com.carrotsearch.hppc.ObjectFloatOpenHashMap;
 import com.google.common.collect.Lists;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentBuilder;
 
 import java.io.IOException;
@@ -48,7 +49,7 @@ public class MultiMatchQueryBuilder extends BaseQueryBuilder implements Boostabl
 
     private Integer slop;
 
-    private String fuzziness;
+    private Fuzziness fuzziness;
 
     private Integer prefixLength;
 
@@ -143,10 +144,10 @@ public class MultiMatchQueryBuilder extends BaseQueryBuilder implements Boostabl
     }
 
     /**
-     * Sets the minimum similarity used when evaluated to a fuzzy query type. Defaults to "0.5".
+     * Sets the fuzziness used when evaluated to a fuzzy query type. Defaults to "AUTO".
      */
     public MultiMatchQueryBuilder fuzziness(Object fuzziness) {
-        this.fuzziness = fuzziness.toString();
+        this.fuzziness = Fuzziness.build(fuzziness);
         return this;
     }
 
@@ -252,7 +253,7 @@ public class MultiMatchQueryBuilder extends BaseQueryBuilder implements Boostabl
             builder.field("slop", slop);
         }
         if (fuzziness != null) {
-            builder.field("fuzziness", fuzziness);
+            fuzziness.toXContent(builder, params);
         }
         if (prefixLength != null) {
             builder.field("prefix_length", prefixLength);

@@ -24,6 +24,7 @@ import org.apache.lucene.search.BooleanClause;
 import org.apache.lucene.search.Query;
 import org.elasticsearch.common.inject.Inject;
 import org.elasticsearch.common.regex.Regex;
+import org.elasticsearch.common.unit.Fuzziness;
 import org.elasticsearch.common.xcontent.XContentParser;
 import org.elasticsearch.index.query.support.QueryParsers;
 import org.elasticsearch.index.search.MatchQuery;
@@ -99,8 +100,8 @@ public class MultiMatchQueryParser implements QueryParser {
                     boost = parser.floatValue();
                 } else if ("slop".equals(currentFieldName) || "phrase_slop".equals(currentFieldName) || "phraseSlop".equals(currentFieldName)) {
                     multiMatchQuery.setPhraseSlop(parser.intValue());
-                } else if ("fuzziness".equals(currentFieldName)) {
-                    multiMatchQuery.setFuzziness(parser.textOrNull());
+                } else if (Fuzziness.FIELD.match(currentFieldName, parseContext.parseFlags())) {
+                    multiMatchQuery.setFuzziness(Fuzziness.parse(parser));
                 } else if ("prefix_length".equals(currentFieldName) || "prefixLength".equals(currentFieldName)) {
                     multiMatchQuery.setFuzzyPrefixLength(parser.intValue());
                 } else if ("max_expansions".equals(currentFieldName) || "maxExpansions".equals(currentFieldName)) {

@ -29,6 +29,7 @@ import org.apache.lucene.search.QueryWrapperFilter;
|
|||
import org.apache.lucene.search.similarities.Similarity;
|
||||
import org.elasticsearch.cache.recycler.CacheRecycler;
|
||||
import org.elasticsearch.common.Nullable;
|
||||
import org.elasticsearch.common.ParseField;
|
||||
import org.elasticsearch.common.xcontent.XContentParser;
|
||||
import org.elasticsearch.index.Index;
|
||||
import org.elasticsearch.index.analysis.AnalysisService;
|
||||
|
@ -45,10 +46,7 @@ import org.elasticsearch.search.internal.SearchContext;
|
|||
import org.elasticsearch.search.lookup.SearchLookup;
|
||||
|
||||
import java.io.IOException;
|
||||
import java.util.Arrays;
|
||||
import java.util.Collection;
|
||||
import java.util.Map;
|
||||
import java.util.Set;
|
||||
import java.util.*;
|
||||
|
||||
/**
|
||||
*
|
||||
|
@ -85,12 +83,24 @@ public class QueryParseContext {
|
|||
|
||||
private XContentParser parser;
|
||||
|
||||
private EnumSet<ParseField.Flag> parseFlags = ParseField.EMPTY_FLAGS;
|
||||
|
||||
|
||||
public QueryParseContext(Index index, IndexQueryParserService indexQueryParser) {
|
||||
this.index = index;
|
||||
this.indexQueryParser = indexQueryParser;
|
||||
}
|
||||
|
||||
public void parseFlags(EnumSet<ParseField.Flag> parseFlags) {
|
||||
this.parseFlags = parseFlags == null ? ParseField.EMPTY_FLAGS : parseFlags;
|
||||
}
|
||||
|
||||
public EnumSet<ParseField.Flag> parseFlags() {
|
||||
return parseFlags;
|
||||
}
|
||||
|
||||
public void reset(XContentParser jp) {
|
||||
this.parseFlags = ParseField.EMPTY_FLAGS;
|
||||
this.lookup = null;
|
||||
this.parser = jp;
|
||||
this.namedFilters.clear();
|
||||
|
|
|
@ -20,6 +20,7 @@
|
|||
package org.elasticsearch.index.query;
|
||||
|
||||
import com.carrotsearch.hppc.ObjectFloatOpenHashMap;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.common.xcontent.XContentBuilder;
|
||||
|
||||
import java.io.IOException;
|
||||
|
@ -35,7 +36,6 @@ import static com.google.common.collect.Lists.newArrayList;
|
|||
* (using {@link #field(String)}), will run the parsed query against the provided fields, and combine
|
||||
* them either using DisMax or a plain boolean query (see {@link #useDisMax(boolean)}).
|
||||
* <p/>
|
||||
* (shay.baon)
|
||||
*/
|
||||
public class QueryStringQueryBuilder extends BaseQueryBuilder implements BoostableQueryBuilder<QueryStringQueryBuilder> {
|
||||
|
||||
|
@ -68,7 +68,7 @@ public class QueryStringQueryBuilder extends BaseQueryBuilder implements Boostab
|
|||
|
||||
private float boost = -1;
|
||||
|
||||
private float fuzzyMinSim = -1;
|
||||
private Fuzziness fuzziness;
|
||||
private int fuzzyPrefixLength = -1;
|
||||
private int fuzzyMaxExpansions = -1;
|
||||
private String fuzzyRewrite;
|
||||
|
@ -226,15 +226,15 @@ public class QueryStringQueryBuilder extends BaseQueryBuilder implements Boostab
|
|||
}
|
||||
|
||||
/**
|
||||
* Set the minimum similarity for fuzzy queries. Default is 0.5f.
|
||||
* Set the edit distance for fuzzy queries. Default is "AUTO".
|
||||
*/
|
||||
public QueryStringQueryBuilder fuzzyMinSim(float fuzzyMinSim) {
|
||||
this.fuzzyMinSim = fuzzyMinSim;
|
||||
public QueryStringQueryBuilder fuzziness(Fuzziness fuzziness) {
|
||||
this.fuzziness = fuzziness;
|
||||
return this;
|
||||
}
|
||||
|
||||
/**
|
||||
* Set the minimum similarity for fuzzy queries. Default is 0.5f.
|
||||
* Set the minimum prefix length for fuzzy queries. Default is 1.
|
||||
*/
|
||||
public QueryStringQueryBuilder fuzzyPrefixLength(int fuzzyPrefixLength) {
|
||||
this.fuzzyPrefixLength = fuzzyPrefixLength;
|
||||
|
@ -356,8 +356,8 @@ public class QueryStringQueryBuilder extends BaseQueryBuilder implements Boostab
|
|||
if (enablePositionIncrements != null) {
|
||||
builder.field("enable_position_increments", enablePositionIncrements);
|
||||
}
|
||||
if (fuzzyMinSim != -1) {
|
||||
builder.field("fuzzy_min_sim", fuzzyMinSim);
|
||||
if (fuzziness != null) {
|
||||
fuzziness.toXContent(builder, params);
|
||||
}
|
||||
if (boost != -1) {
|
||||
builder.field("boost", boost);
|
||||
|
|
|
@ -25,11 +25,13 @@ import org.apache.lucene.queryparser.classic.MapperQueryParser;
|
|||
import org.apache.lucene.queryparser.classic.QueryParserSettings;
|
||||
import org.apache.lucene.search.BooleanQuery;
|
||||
import org.apache.lucene.search.Query;
|
||||
import org.elasticsearch.common.ParseField;
|
||||
import org.elasticsearch.common.Strings;
|
||||
import org.elasticsearch.common.inject.Inject;
|
||||
import org.elasticsearch.common.lucene.search.Queries;
|
||||
import org.elasticsearch.common.regex.Regex;
|
||||
import org.elasticsearch.common.settings.Settings;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.common.xcontent.XContentParser;
|
||||
import org.elasticsearch.index.analysis.NamedAnalyzer;
|
||||
import org.elasticsearch.index.query.support.QueryParsers;
|
||||
|
@ -45,6 +47,7 @@ import static org.elasticsearch.common.lucene.search.Queries.optimizeQuery;
|
|||
public class QueryStringQueryParser implements QueryParser {
|
||||
|
||||
public static final String NAME = "query_string";
|
||||
private static final ParseField FUZZINESS = Fuzziness.FIELD.withDeprecation("fuzzy_min_sim");
|
||||
|
||||
private final boolean defaultAnalyzeWildcard;
|
||||
private final boolean defaultAllowLeadingWildcard;
|
||||
|
@ -167,8 +170,8 @@ public class QueryStringQueryParser implements QueryParser {
|
|||
qpSettings.fuzzyRewriteMethod(QueryParsers.parseRewriteMethod(parser.textOrNull()));
|
||||
} else if ("phrase_slop".equals(currentFieldName) || "phraseSlop".equals(currentFieldName)) {
|
||||
qpSettings.phraseSlop(parser.intValue());
|
||||
} else if ("fuzzy_min_sim".equals(currentFieldName) || "fuzzyMinSim".equals(currentFieldName)) {
|
||||
qpSettings.fuzzyMinSim(parser.floatValue());
|
||||
} else if (FUZZINESS.match(currentFieldName, parseContext.parseFlags())) {
|
||||
qpSettings.fuzzyMinSim(Fuzziness.parse(parser).asSimilarity());
|
||||
} else if ("boost".equals(currentFieldName)) {
|
||||
qpSettings.boost(parser.floatValue());
|
||||
} else if ("tie_breaker".equals(currentFieldName) || "tieBreaker".equals(currentFieldName)) {
|
||||
|
|
|
@ -35,6 +35,7 @@ import org.elasticsearch.ElasticsearchIllegalStateException;
|
|||
import org.elasticsearch.common.Nullable;
|
||||
import org.elasticsearch.common.lucene.search.MultiPhrasePrefixQuery;
|
||||
import org.elasticsearch.common.lucene.search.Queries;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.index.mapper.FieldMapper;
|
||||
import org.elasticsearch.index.mapper.MapperService;
|
||||
import org.elasticsearch.index.query.QueryParseContext;
|
||||
|
@ -69,7 +70,7 @@ public class MatchQuery {
|
|||
|
||||
protected int phraseSlop = 0;
|
||||
|
||||
protected String fuzziness = null;
|
||||
protected Fuzziness fuzziness = null;
|
||||
|
||||
protected int fuzzyPrefixLength = FuzzyQuery.defaultPrefixLength;
|
||||
|
||||
|
@ -112,7 +113,7 @@ public class MatchQuery {
|
|||
this.phraseSlop = phraseSlop;
|
||||
}
|
||||
|
||||
public void setFuzziness(String fuzziness) {
|
||||
public void setFuzziness(Fuzziness fuzziness) {
|
||||
this.fuzziness = fuzziness;
|
||||
}
|
||||
|
||||
|
@ -365,10 +366,7 @@ public class MatchQuery {
|
|||
QueryParsers.setRewriteMethod((FuzzyQuery) query, fuzzyRewriteMethod);
|
||||
}
|
||||
}
|
||||
String text = term.text();
|
||||
//LUCENE 4 UPGRADE we need to document that this should now be an int rather than a float
|
||||
int edits = FuzzyQuery.floatToEdits(Float.parseFloat(fuzziness),
|
||||
text.codePointCount(0, text.length()));
|
||||
int edits = fuzziness.asDistance(term.text());
|
||||
FuzzyQuery query = new FuzzyQuery(term, edits, fuzzyPrefixLength, maxExpansions, transpositions);
|
||||
QueryParsers.setRewriteMethod(query, rewriteMethod);
|
||||
return query;
|
||||
|
|
|
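Note on the `MatchQuery` hunk above: the removed lines called `FuzzyQuery.floatToEdits(...)` inline, and the new code delegates to `fuzziness.asDistance(term.text())`. As a rough illustration of the conversion the removed lines performed, the sketch below maps a legacy minimum similarity to an edit distance capped at 2. It mirrors Lucene 4's `FuzzyQuery.floatToEdits` as far as I can tell. The names `SimilarityToEditsSketch` and `toEdits` are hypothetical, and this is not the `Fuzziness` implementation added by the commit.

```java
// Hypothetical helper, not part of the commit: converts a legacy
// "minimum similarity" value into a Levenshtein edit distance.
final class SimilarityToEditsSketch {

    // Assumption: at most 2 edits are supported, as in Lucene 4's Levenshtein automata.
    static final int MAX_EDITS = 2;

    static int toEdits(float minimumSimilarity, String term) {
        // Count code points rather than UTF-16 chars, as the removed MatchQuery lines did.
        int termLen = term.codePointCount(0, term.length());
        if (minimumSimilarity >= 1f) {
            // Values >= 1 are already edit distances; only cap them.
            return (int) Math.min(minimumSimilarity, MAX_EDITS);
        } else if (minimumSimilarity == 0.0f) {
            // 0 means an exact match, not an unlimited number of edits.
            return 0;
        } else {
            // A similarity s on a term of length n allows roughly (1 - s) * n edits.
            return Math.min((int) ((1D - minimumSimilarity) * termLen), MAX_EDITS);
        }
    }
}
```

Under these rules a similarity of 0.5 on a two-character term allows one edit, and 0.8 on the same term allows none, which lines up with the expectations in `testSimilarityToDistance`.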
@ -19,6 +19,8 @@
|
|||
package org.elasticsearch.search.suggest.completion;
|
||||
|
||||
import org.elasticsearch.ElasticsearchIllegalArgumentException;
|
||||
import org.elasticsearch.common.ParseField;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.common.xcontent.XContentParser;
|
||||
import org.elasticsearch.index.mapper.MapperService;
|
||||
import org.elasticsearch.search.suggest.SuggestContextParser;
|
||||
|
@ -34,6 +36,7 @@ import static org.elasticsearch.search.suggest.SuggestUtils.parseSuggestContext;
|
|||
public class CompletionSuggestParser implements SuggestContextParser {
|
||||
|
||||
private CompletionSuggester completionSuggester;
|
||||
private static final ParseField FUZZINESS = Fuzziness.FIELD.withDeprecation("edit_distance");
|
||||
|
||||
public CompletionSuggestParser(CompletionSuggester completionSuggester) {
|
||||
this.completionSuggester = completionSuggester;
|
||||
|
@ -60,8 +63,8 @@ public class CompletionSuggestParser implements SuggestContextParser {
|
|||
if (token == XContentParser.Token.FIELD_NAME) {
|
||||
fuzzyConfigName = parser.currentName();
|
||||
} else if (token.isValue()) {
|
||||
if ("edit_distance".equals(fuzzyConfigName) || "editDistance".equals(fuzzyConfigName)) {
|
||||
suggestion.setFuzzyEditDistance(parser.intValue());
|
||||
if (FUZZINESS.match(fuzzyConfigName, ParseField.EMPTY_FLAGS)) {
|
||||
suggestion.setFuzzyEditDistance(Fuzziness.parse(parser).asDistance());
|
||||
} else if ("transpositions".equals(fuzzyConfigName)) {
|
||||
suggestion.setFuzzyTranspositions(parser.booleanValue());
|
||||
} else if ("min_length".equals(fuzzyConfigName) || "minLength".equals(fuzzyConfigName)) {
|
||||
|
|
|
@ -19,6 +19,7 @@
|
|||
package org.elasticsearch.search.suggest.completion;
|
||||
|
||||
import org.apache.lucene.search.suggest.analyzing.XFuzzySuggester;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.common.xcontent.ToXContent;
|
||||
import org.elasticsearch.common.xcontent.XContentBuilder;
|
||||
import org.elasticsearch.search.suggest.SuggestBuilder;
|
||||
|
@ -34,18 +35,18 @@ public class CompletionSuggestionFuzzyBuilder extends SuggestBuilder.SuggestionB
|
|||
super(name, "completion");
|
||||
}
|
||||
|
||||
private int fuzzyEditDistance = XFuzzySuggester.DEFAULT_MAX_EDITS;
|
||||
private Fuzziness fuzziness = Fuzziness.ONE;
|
||||
private boolean fuzzyTranspositions = XFuzzySuggester.DEFAULT_TRANSPOSITIONS;
|
||||
private int fuzzyMinLength = XFuzzySuggester.DEFAULT_MIN_FUZZY_LENGTH;
|
||||
private int fuzzyPrefixLength = XFuzzySuggester.DEFAULT_NON_FUZZY_PREFIX;
|
||||
private boolean unicodeAware = XFuzzySuggester.DEFAULT_UNICODE_AWARE;
|
||||
|
||||
public int getFuzzyEditDistance() {
|
||||
return fuzzyEditDistance;
|
||||
public Fuzziness getFuzziness() {
|
||||
return fuzziness;
|
||||
}
|
||||
|
||||
public CompletionSuggestionFuzzyBuilder setFuzzyEditDistance(int fuzzyEditDistance) {
|
||||
this.fuzzyEditDistance = fuzzyEditDistance;
|
||||
public CompletionSuggestionFuzzyBuilder setFuzziness(Fuzziness fuzziness) {
|
||||
this.fuzziness = fuzziness;
|
||||
return this;
|
||||
}
|
||||
|
||||
|
@ -89,8 +90,8 @@ public class CompletionSuggestionFuzzyBuilder extends SuggestBuilder.SuggestionB
|
|||
protected XContentBuilder innerToXContent(XContentBuilder builder, ToXContent.Params params) throws IOException {
|
||||
builder.startObject("fuzzy");
|
||||
|
||||
if (fuzzyEditDistance != XFuzzySuggester.DEFAULT_MAX_EDITS) {
|
||||
builder.field("edit_distance", fuzzyEditDistance);
|
||||
if (fuzziness != Fuzziness.ONE) {
|
||||
fuzziness.toXContent(builder, params);
|
||||
}
|
||||
if (fuzzyTranspositions != XFuzzySuggester.DEFAULT_TRANSPOSITIONS) {
|
||||
builder.field("transpositions", fuzzyTranspositions);
|
||||
|
|
|
@ -0,0 +1,74 @@
|
|||
/*
|
||||
* Licensed to Elasticsearch under one or more contributor
|
||||
* license agreements. See the NOTICE file distributed with
|
||||
* this work for additional information regarding copyright
|
||||
* ownership. Elasticsearch licenses this file to you under
|
||||
* the Apache License, Version 2.0 (the "License"); you may
|
||||
* not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing,
|
||||
* software distributed under the License is distributed on an
|
||||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
* KIND, either express or implied. See the License for the
|
||||
* specific language governing permissions and limitations
|
||||
* under the License.
|
||||
*/
|
||||
package org.elasticsearch.common;
|
||||
|
||||
import org.elasticsearch.ElasticsearchIllegalArgumentException;
|
||||
import org.elasticsearch.test.ElasticsearchTestCase;
|
||||
|
||||
import java.util.EnumSet;
|
||||
|
||||
import static org.hamcrest.CoreMatchers.*;
|
||||
|
||||
public class ParseFieldTests extends ElasticsearchTestCase {
|
||||
|
||||
public void testParse() {
|
||||
String[] values = new String[]{"foo_bar", "fooBar"};
|
||||
ParseField field = new ParseField(randomFrom(values));
|
||||
String[] deprecated = new String[]{"barFoo", "bar_foo"};
|
||||
ParseField withDepredcations = field.withDeprecation("Foobar", randomFrom(deprecated));
|
||||
assertThat(field, not(sameInstance(withDepredcations)));
|
||||
assertThat(field.match(randomFrom(values), ParseField.EMPTY_FLAGS), is(true));
|
||||
assertThat(field.match("foo bar", ParseField.EMPTY_FLAGS), is(false));
|
||||
assertThat(field.match(randomFrom(deprecated), ParseField.EMPTY_FLAGS), is(false));
|
||||
assertThat(field.match("barFoo", ParseField.EMPTY_FLAGS), is(false));
|
||||
|
||||
|
||||
assertThat(withDepredcations.match(randomFrom(values), ParseField.EMPTY_FLAGS), is(true));
|
||||
assertThat(withDepredcations.match("foo bar", ParseField.EMPTY_FLAGS), is(false));
|
||||
assertThat(withDepredcations.match(randomFrom(deprecated), ParseField.EMPTY_FLAGS), is(true));
|
||||
assertThat(withDepredcations.match("barFoo", ParseField.EMPTY_FLAGS), is(true));
|
||||
|
||||
// now with strict mode
|
||||
EnumSet<ParseField.Flag> flags = EnumSet.of(ParseField.Flag.STRICT);
|
||||
assertThat(field.match(randomFrom(values), flags), is(true));
|
||||
assertThat(field.match("foo bar", flags), is(false));
|
||||
assertThat(field.match(randomFrom(deprecated), flags), is(false));
|
||||
assertThat(field.match("barFoo", flags), is(false));
|
||||
|
||||
|
||||
assertThat(withDepredcations.match(randomFrom(values), flags), is(true));
|
||||
assertThat(withDepredcations.match("foo bar", flags), is(false));
|
||||
try {
|
||||
withDepredcations.match(randomFrom(deprecated), flags);
|
||||
fail();
|
||||
} catch (ElasticsearchIllegalArgumentException ex) {
|
||||
|
||||
}
|
||||
|
||||
try {
|
||||
withDepredcations.match("barFoo", flags);
|
||||
fail();
|
||||
} catch (ElasticsearchIllegalArgumentException ex) {
|
||||
|
||||
}
|
||||
|
||||
|
||||
}
|
||||
|
||||
}
|
|
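Note on `ParseFieldTests` above: the assertions imply that a field matches both the camelCase and underscore spellings of its name, that deprecated names match in lenient mode, and that they throw when the `STRICT` flag is set. The sketch below is a minimal stand-in inferred only from those assertions. `ParseFieldSketch` and its helpers are hypothetical, the real `ParseField` source is not shown in this part of the diff, and a plain `IllegalArgumentException` is swapped in for `ElasticsearchIllegalArgumentException`.

```java
import java.util.EnumSet;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical stand-in for ParseField, inferred from ParseFieldTests only.
final class ParseFieldSketch {

    enum Flag { STRICT }

    private final String name;
    private final Set<String> names = new LinkedHashSet<>();
    private final Set<String> deprecatedNames = new LinkedHashSet<>();

    ParseFieldSketch(String name, String... deprecated) {
        this.name = name;
        addVariants(names, name);
        for (String d : deprecated) {
            addVariants(deprecatedNames, d);
        }
    }

    // Returns a new instance; the receiver is left untouched.
    ParseFieldSketch withDeprecation(String... deprecated) {
        return new ParseFieldSketch(name, deprecated);
    }

    boolean match(String candidate, EnumSet<Flag> flags) {
        if (names.contains(candidate)) {
            return true;
        }
        if (deprecatedNames.contains(candidate)) {
            if (flags.contains(Flag.STRICT)) {
                // Strict mode rejects deprecated spellings outright.
                throw new IllegalArgumentException(
                        "Deprecated field [" + candidate + "], use one of " + names);
            }
            return true;
        }
        return false;
    }

    // Registers both the camelCase and the underscore spelling of a name.
    private static void addVariants(Set<String> target, String value) {
        target.add(toCamelCase(value));
        target.add(toUnderscoreCase(value));
    }

    static String toCamelCase(String s) {
        StringBuilder sb = new StringBuilder();
        boolean upperNext = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '_') {
                upperNext = true;
            } else {
                sb.append(upperNext ? Character.toUpperCase(c) : c);
                upperNext = false;
            }
        }
        return sb.toString();
    }

    static String toUnderscoreCase(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isUpperCase(c)) {
                if (i > 0) {
                    sb.append('_');
                }
                sb.append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```

Keeping the deprecated names in a separate set is what lets a parser accept `fuzzy_min_sim` or `edit_distance` today while still giving strict mode a single place to reject them.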
@ -0,0 +1,199 @@
|
|||
/*
|
||||
* Licensed to Elasticsearch under one or more contributor
|
||||
* license agreements. See the NOTICE file distributed with
|
||||
* this work for additional information regarding copyright
|
||||
* ownership. Elasticsearch licenses this file to you under
|
||||
* the Apache License, Version 2.0 (the "License"); you may
|
||||
* not use this file except in compliance with the License.
|
||||
* You may obtain a copy of the License at
|
||||
*
|
||||
* http://www.apache.org/licenses/LICENSE-2.0
|
||||
*
|
||||
* Unless required by applicable law or agreed to in writing,
|
||||
* software distributed under the License is distributed on an
|
||||
* "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
|
||||
* KIND, either express or implied. See the License for the
|
||||
* specific language governing permissions and limitations
|
||||
* under the License.
|
||||
*/
|
||||
package org.elasticsearch.common.unit;
|
||||
|
||||
import org.elasticsearch.common.xcontent.XContent;
|
||||
import org.elasticsearch.common.xcontent.XContentParser;
|
||||
import org.elasticsearch.common.xcontent.XContentType;
|
||||
import org.elasticsearch.test.ElasticsearchTestCase;
|
||||
import org.junit.Test;
|
||||
|
||||
import java.io.IOException;
|
||||
|
||||
import static org.elasticsearch.common.xcontent.XContentFactory.jsonBuilder;
|
||||
import static org.hamcrest.CoreMatchers.*;
|
||||
import static org.hamcrest.number.IsCloseTo.closeTo;
|
||||
|
||||
public class FuzzinessTests extends ElasticsearchTestCase {
|
||||
|
||||
@Test
|
||||
public void testNumerics() {
|
||||
String[] options = new String[]{"1.0", "1", "1.000000"};
|
||||
assertThat(Fuzziness.build(randomFrom(options)).asByte(), equalTo((byte) 1));
|
||||
assertThat(Fuzziness.build(randomFrom(options)).asInt(), equalTo(1));
|
||||
assertThat(Fuzziness.build(randomFrom(options)).asFloat(), equalTo(1f));
|
||||
assertThat(Fuzziness.build(randomFrom(options)).asDouble(), equalTo(1d));
|
||||
assertThat(Fuzziness.build(randomFrom(options)).asLong(), equalTo(1L));
|
||||
assertThat(Fuzziness.build(randomFrom(options)).asShort(), equalTo((short) 1));
|
||||
}
|
||||
|
||||
@Test
|
||||
public void testParseFromXContent() throws IOException {
|
||||
final int iters = atLeast(10);
|
||||
for (int i = 0; i < iters; i++) {
|
||||
{
|
||||
XContent xcontent = XContentType.JSON.xContent();
|
||||
float floatValue = randomFloat();
|
||||
String json = jsonBuilder().startObject()
|
||||
.field(Fuzziness.X_FIELD_NAME, floatValue)
|
||||
.endObject().string();
|
||||
XContentParser parser = xcontent.createParser(json);
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.START_OBJECT));
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.FIELD_NAME));
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.VALUE_NUMBER));
|
||||
Fuzziness parse = Fuzziness.parse(parser);
|
||||
assertThat(parse.asFloat(), equalTo(floatValue));
|
||||
assertThat(parse.asDouble(), closeTo((double) floatValue, 0.000001));
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.END_OBJECT));
|
||||
}
|
||||
|
||||
{
|
||||
XContent xcontent = XContentType.JSON.xContent();
|
||||
Integer intValue = frequently() ? randomIntBetween(0, 2) : randomIntBetween(0, 100);
|
||||
Float floatRep = randomFloat();
|
||||
Number value = intValue;
|
||||
if (randomBoolean()) {
|
||||
value = new Float(floatRep += intValue);
|
||||
}
|
||||
String json = jsonBuilder().startObject()
|
||||
.field(Fuzziness.X_FIELD_NAME, randomBoolean() ? value.toString() : value)
|
||||
.endObject().string();
|
||||
XContentParser parser = xcontent.createParser(json);
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.START_OBJECT));
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.FIELD_NAME));
|
||||
assertThat(parser.nextToken(), anyOf(equalTo(XContentParser.Token.VALUE_NUMBER), equalTo(XContentParser.Token.VALUE_STRING)));
|
||||
Fuzziness parse = Fuzziness.parse(parser);
|
||||
assertThat(parse.asInt(), equalTo(intValue));
|
||||
assertThat((int) parse.asShort(), equalTo(intValue));
|
||||
assertThat((int) parse.asByte(), equalTo(intValue));
|
||||
assertThat(parse.asLong(), equalTo((long) intValue));
|
||||
if (value.intValue() >= 1) {
|
||||
assertThat(parse.asDistance(), equalTo(Math.min(2, intValue)));
|
||||
}
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.END_OBJECT));
|
||||
if (intValue.equals(value)) {
|
||||
switch (intValue) {
|
||||
case 1:
|
||||
assertThat(parse, sameInstance(Fuzziness.ONE));
|
||||
break;
|
||||
case 2:
|
||||
assertThat(parse, sameInstance(Fuzziness.TWO));
|
||||
break;
|
||||
case 0:
|
||||
assertThat(parse, sameInstance(Fuzziness.ZERO));
|
||||
break;
|
||||
default:
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
{
|
||||
XContent xcontent = XContentType.JSON.xContent();
|
||||
String json = jsonBuilder().startObject()
|
||||
.field(Fuzziness.X_FIELD_NAME, randomBoolean() ? "AUTO" : "auto")
|
||||
.endObject().string();
|
||||
if (randomBoolean()) {
|
||||
json = Fuzziness.AUTO.toXContent(jsonBuilder().startObject(), null).endObject().string();
|
||||
}
|
||||
XContentParser parser = xcontent.createParser(json);
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.START_OBJECT));
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.FIELD_NAME));
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.VALUE_STRING));
|
||||
Fuzziness parse = Fuzziness.parse(parser);
|
||||
assertThat(parse, sameInstance(Fuzziness.AUTO));
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.END_OBJECT));
|
||||
}
|
||||
|
||||
{
|
||||
String[] values = new String[]{"d", "H", "ms", "s", "S", "w"};
|
||||
String actual = randomIntBetween(1, 3) + randomFrom(values);
|
||||
XContent xcontent = XContentType.JSON.xContent();
|
||||
String json = jsonBuilder().startObject()
|
||||
.field(Fuzziness.X_FIELD_NAME, actual)
|
||||
.endObject().string();
|
||||
XContentParser parser = xcontent.createParser(json);
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.START_OBJECT));
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.FIELD_NAME));
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.VALUE_STRING));
|
||||
Fuzziness parse = Fuzziness.parse(parser);
|
||||
assertThat(parse.asTimeValue(), equalTo(TimeValue.parseTimeValue(actual, null)));
|
||||
assertThat(parser.nextToken(), equalTo(XContentParser.Token.END_OBJECT));
|
||||
}
|
||||
}
|
||||
|
||||
}
|
||||
|
||||
@Test
|
||||
public void testAuto() {
|
||||
final int codePoints = randomIntBetween(0, 10);
|
||||
String string = randomRealisticUnicodeOfCodepointLength(codePoints);
|
||||
if (codePoints <= 2) {
|
||||
assertThat(Fuzziness.AUTO.asDistance(string), equalTo(0));
|
||||
assertThat(Fuzziness.fromSimilarity(Fuzziness.AUTO.asSimilarity(string)).asDistance(string), equalTo(0));
|
||||
} else if (codePoints > 5) {
|
||||
assertThat(Fuzziness.AUTO.asDistance(string), equalTo(2));
|
||||
assertThat(Fuzziness.fromSimilarity(Fuzziness.AUTO.asSimilarity(string)).asDistance(string), equalTo(2));
|
||||
} else {
|
||||
assertThat(Fuzziness.AUTO.asDistance(string), equalTo(1));
|
||||
assertThat(Fuzziness.fromSimilarity(Fuzziness.AUTO.asSimilarity(string)).asDistance(string), equalTo(1));
|
||||
}
|
||||
assertThat(Fuzziness.AUTO.asByte(), equalTo((byte) 1));
|
||||
assertThat(Fuzziness.AUTO.asInt(), equalTo(1));
|
||||
assertThat(Fuzziness.AUTO.asFloat(), equalTo(1f));
|
||||
assertThat(Fuzziness.AUTO.asDouble(), equalTo(1d));
|
||||
assertThat(Fuzziness.AUTO.asLong(), equalTo(1L));
|
||||
assertThat(Fuzziness.AUTO.asShort(), equalTo((short) 1));
|
||||
assertThat(Fuzziness.AUTO.asTimeValue(), equalTo(TimeValue.parseTimeValue("1", TimeValue.timeValueMillis(1))));
|
||||
|
||||
}
|
||||
|
||||
@Test
|
||||
public void testAsDistance() {
|
||||
final int iters = atLeast(10);
|
||||
for (int i = 0; i < iters; i++) {
|
||||
Integer integer = Integer.valueOf(randomIntBetween(0, 10));
|
||||
String value = "" + (randomBoolean() ? integer.intValue() : integer.floatValue());
|
||||
assertThat(Fuzziness.build(value).asDistance(), equalTo(Math.min(2, integer.intValue())));
|
||||
}
|
||||
}
|
||||
|
||||
@Test
|
||||
public void testSimilarityToDistance() {
|
||||
assertThat(Fuzziness.fromSimilarity(0.5f).asDistance("ab"), equalTo(1));
|
||||
assertThat(Fuzziness.fromSimilarity(0.66f).asDistance("abcefg"), equalTo(2));
|
||||
assertThat(Fuzziness.fromSimilarity(0.8f).asDistance("ab"), equalTo(0));
|
||||
assertThat(Fuzziness.fromSimilarity(0.8f).asDistance("abcefg"), equalTo(1));
|
||||
assertThat((double) Fuzziness.ONE.asSimilarity("abcefg"), closeTo(0.8f, 0.05));
|
||||
assertThat((double) Fuzziness.TWO.asSimilarity("abcefg"), closeTo(0.66f, 0.05));
|
||||
assertThat((double) Fuzziness.ONE.asSimilarity("ab"), closeTo(0.5f, 0.05));
|
||||
|
||||
int iters = atLeast(100);
|
||||
for (int i = 0; i < iters; i++) {
|
||||
Fuzziness fuzziness = Fuzziness.fromEdits(between(1, 2));
|
||||
String string = rarely() ? randomRealisticUnicodeOfLengthBetween(2, 4) :
|
||||
randomRealisticUnicodeOfLengthBetween(4, 10);
|
||||
float similarity = fuzziness.asSimilarity(string);
|
||||
if (similarity != 0.0f) {
|
||||
Fuzziness similarityBased = Fuzziness.build(similarity);
|
||||
assertThat((double) similarityBased.asSimilarity(string), closeTo(similarity, 0.05));
|
||||
assertThat(similarityBased.asDistance(string), equalTo(Math.min(2, fuzziness.asDistance(string))));
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
|
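Note on `FuzzinessTests` above: `testAuto` pins down the `AUTO` thresholds (terms of up to 2 code points allow 0 edits, 3 to 5 allow 1, longer terms allow 2), and `testAsDistance` shows explicit values being capped at 2. The sketch below restates those rules in isolation. `AutoFuzzinessSketch`, `autoDistance` and `resolve` are hypothetical names, and the sketch assumes non-negative whole-number inputs and deliberately ignores fractional similarity values such as `0.5`.

```java
// Hypothetical helper, not the commit's Fuzziness class: reproduces the
// thresholds asserted in FuzzinessTests.testAuto and testAsDistance.
final class AutoFuzzinessSketch {

    // AUTO: the allowed edit distance depends on the term length in code points.
    static int autoDistance(String term) {
        int len = term.codePointCount(0, term.length());
        if (len <= 2) {
            return 0;   // very short terms must match exactly
        }
        if (len > 5) {
            return 2;   // long terms tolerate two edits
        }
        return 1;       // 3 to 5 code points tolerate one edit
    }

    // Resolves "AUTO"/"auto" or a whole-number string such as "1" or "1.0".
    // Assumption: fractional similarities (e.g. "0.5") are out of scope here.
    static int resolve(String fuzziness, String term) {
        if ("auto".equalsIgnoreCase(fuzziness)) {
            return autoDistance(term);
        }
        int edits = (int) Float.parseFloat(fuzziness);
        return Math.min(2, edits);
    }
}
```

Counting code points rather than `String.length()` matters for supplementary characters: a term made of two emoji is four UTF-16 chars long but still counts as a two-code-point term.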
@ -43,6 +43,7 @@ import org.elasticsearch.common.lucene.search.function.FunctionScoreQuery;
|
|||
import org.elasticsearch.common.settings.ImmutableSettings;
|
||||
import org.elasticsearch.common.settings.Settings;
|
||||
import org.elasticsearch.common.settings.SettingsModule;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.index.Index;
|
||||
import org.elasticsearch.index.IndexNameModule;
|
||||
import org.elasticsearch.index.analysis.AnalysisModule;
|
||||
|
@ -432,7 +433,7 @@ public class SimpleIndexQueryParserTests extends ElasticsearchTestCase {
|
|||
@Test
|
||||
public void testFuzzyQueryWithFieldsBuilder() throws IOException {
|
||||
IndexQueryParserService queryParser = queryParser();
|
||||
Query parsedQuery = queryParser.parse(fuzzyQuery("name.first", "sh").minSimilarity(0.1f).prefixLength(1).boost(2.0f).buildAsBytes()).query();
|
||||
Query parsedQuery = queryParser.parse(fuzzyQuery("name.first", "sh").fuzziness(Fuzziness.fromSimilarity(0.1f)).prefixLength(1).boost(2.0f).buildAsBytes()).query();
|
||||
assertThat(parsedQuery, instanceOf(FuzzyQuery.class));
|
||||
FuzzyQuery fuzzyQuery = (FuzzyQuery) parsedQuery;
|
||||
assertThat(fuzzyQuery.getTerm(), equalTo(new Term("name.first", "sh")));
|
||||
|
|
|
@ -2,9 +2,9 @@
|
|||
"fuzzy":{
|
||||
"name.first":{
|
||||
"value":"sh",
|
||||
"min_similarity":0.1,
|
||||
"fuzziness":0.1,
|
||||
"prefix_length":1,
|
||||
"boost":2.0
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
|
@ -2,8 +2,8 @@
|
|||
"fuzzy":{
|
||||
"age":{
|
||||
"value":12,
|
||||
"min_similarity":5,
|
||||
"fuzziness":5,
|
||||
"boost":2.0
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
|
|
@ -4,7 +4,7 @@
|
|||
"fuzzy":{
|
||||
"age":{
|
||||
"value":12,
|
||||
"min_similarity":5,
|
||||
"fuzziness":5,
|
||||
"boost":2.0
|
||||
}
|
||||
}
|
||||
|
|
|
@ -34,6 +34,7 @@ import org.elasticsearch.action.suggest.SuggestResponse;
|
|||
import org.elasticsearch.client.Requests;
|
||||
import org.elasticsearch.common.settings.ImmutableSettings;
|
||||
import org.elasticsearch.common.settings.Settings;
|
||||
import org.elasticsearch.common.unit.Fuzziness;
|
||||
import org.elasticsearch.common.xcontent.XContentBuilder;
|
||||
import org.elasticsearch.index.mapper.MapperException;
|
||||
import org.elasticsearch.index.mapper.MapperParsingException;
|
||||
|
@ -502,7 +503,7 @@ public class CompletionSuggestSearchTests extends ElasticsearchIntegrationTest {
|
|||
|
||||
// edit distance 2
|
||||
suggestResponse = client().prepareSuggest(INDEX).addSuggestion(
|
||||
new CompletionSuggestionFuzzyBuilder("foo").field(FIELD).text("Norw").size(10).setFuzzyEditDistance(2)
|
||||
new CompletionSuggestionFuzzyBuilder("foo").field(FIELD).text("Norw").size(10).setFuzziness(Fuzziness.TWO)
|
||||
).execute().actionGet();
|
||||
assertSuggestions(suggestResponse, false, "foo", "Nirvana");
|
||||
}
|
||||
|
@ -520,12 +521,12 @@ public class CompletionSuggestSearchTests extends ElasticsearchIntegrationTest {
|
|||
refresh();
|
||||
|
||||
SuggestResponse suggestResponse = client().prepareSuggest(INDEX).addSuggestion(
|
||||
new CompletionSuggestionFuzzyBuilder("foo").field(FIELD).text("Nriv").size(10).setFuzzyTranspositions(false).setFuzzyEditDistance(1)
|
||||
new CompletionSuggestionFuzzyBuilder("foo").field(FIELD).text("Nriv").size(10).setFuzzyTranspositions(false).setFuzziness(Fuzziness.ONE)
|
||||
).execute().actionGet();
|
||||
assertSuggestions(suggestResponse, false, "foo");
|
||||
|
||||
suggestResponse = client().prepareSuggest(INDEX).addSuggestion(
|
||||
new CompletionSuggestionFuzzyBuilder("foo").field(FIELD).text("Nriv").size(10).setFuzzyTranspositions(true).setFuzzyEditDistance(1)
|
||||
new CompletionSuggestionFuzzyBuilder("foo").field(FIELD).text("Nriv").size(10).setFuzzyTranspositions(true).setFuzziness(Fuzziness.ONE)
|
||||
).execute().actionGet();
|
||||
assertSuggestions(suggestResponse, false, "foo", "Nirvana");
|
||||
}
|
||||
|
@ -601,7 +602,7 @@ public class CompletionSuggestSearchTests extends ElasticsearchIntegrationTest {
|
|||
assertSuggestions(suggestResponse, false, "foo");
|
||||
|
||||
// increasing edit distance instead of unicode awareness works again, as this is only a single character
|
||||
completionSuggestionBuilder.setFuzzyEditDistance(2);
|
||||
completionSuggestionBuilder.setFuzziness(Fuzziness.TWO);
|
||||
suggestResponse = client().prepareSuggest(INDEX).addSuggestion(completionSuggestionBuilder).execute().actionGet();
|
||||
assertSuggestions(suggestResponse, false, "foo", "ööööö");
|
||||
}
|
||||
|
|