OpenSearch/docs/reference/migration/migrate_query_refactoring.a...

171 lines
7.9 KiB
Plaintext
Raw Normal View History

[[breaking-changes query-refactoring]]
== Breaking changes on the query-refactoring branch
This section discusses changes that are breaking to the current rest or java-api
on the query-refactoring feature branch.
=== Plugins
Plugins implementing custom queries need to implement the `fromXContent(QueryParseContext)` method in their
`QueryParser` subclass rather than `parse`. This method will take care of parsing the query from `XContent` format
into an intermediate query representation that can be streamed between the nodes in binary format, effectively the
query object used in the java api. Also, the query parser needs to implement the `getBuilderPrototype` method that
returns a prototype of the `NamedWriteable` query, which allows to deserialize an incoming query by calling
`readFrom(StreamInput)` against it, which will create a new object, see usages of `Writeable`. The `QueryParser`
also needs to declare the generic type of the query that it supports and it's able to parse.
The query object can then transform itself into a lucene query through the new `toQuery(QueryShardContext)` method,
which returns a lucene query to be executed on the data node.
Similarly, plugins implementing custom score functions need to implement the `fromXContent(QueryParseContext)`
method in their `ScoreFunctionParser` subclass rather than `parse`. This method will take care of parsing
the function from `XContent` format into an intermediate function representation that can be streamed between
the nodes in binary format, effectively the function object used in the java api. Also, the query parser needs
to implement the `getBuilderPrototype` method that returns a prototype of the `NamedWriteable` function, which
allows to deserialize an incoming function by calling `readFrom(StreamInput)` against it, which will create a
new object, see usages of `Writeable`. The `ScoreFunctionParser` also needs to declare the generic type of the
function that it supports and it's able to parse. The function object can then transform itself into a lucene
function through the new `toFunction(QueryShardContext)` method, which returns a lucene function to be executed
on the data node.
=== Java-API
==== BoostingQueryBuilder
Removed setters for mandatory positive/negative query. Both arguments now have
to be supplied at construction time already and have to be non-null.
==== SpanContainingQueryBuilder
Removed setters for mandatory big/little inner span queries. Both arguments now have
to be supplied at construction time already and have to be non-null. Updated
static factory methods in QueryBuilders accordingly.
==== SpanOrQueryBuilder
Making sure that query contains at least one clause by making initial clause mandatory
in constructor.
==== SpanNearQueryBuilder
Removed setter for mandatory slop parameter, needs to be set in constructor now. Also
making sure that query contains at least one clause by making initial clause mandatory
in constructor. Updated the static factory methods in QueryBuilders accordingly.
==== SpanNotQueryBuilder
Removed setter for mandatory include/exclude span query clause, needs to be set in constructor now.
Updated the static factory methods in QueryBuilders and tests accordingly.
==== SpanWithinQueryBuilder
Removed setters for mandatory big/little inner span queries. Both arguments now have
to be supplied at construction time already and have to be non-null. Updated
static factory methods in QueryBuilders accordingly.
==== QueryFilterBuilder
Removed the setter `queryName(String queryName)` since this field is not supported
in this type of query. Use `FQueryFilterBuilder.queryName(String queryName)` instead
when in need to wrap a named query as a filter.
==== WrapperQueryBuilder
Removed `wrapperQueryBuilder(byte[] source, int offset, int length)`. Instead simply
use `wrapperQueryBuilder(byte[] source)`. Updated the static factory methods in
QueryBuilders accordingly.
==== QueryStringQueryBuilder
Removed ability to pass in boost value using `field(String field)` method in form e.g. `field^2`.
Use the `field(String, float)` method instead.
Query refactoring: unify boost and query name Following the discussion in #11744, move boost and query _name to base class AbstractQueryBuilder with their getters and setters. Unify their serialization code and equals/hashcode handling in the base class too. This guarantess that every query supports both _name and boost and nothing needs to be done around those in subclasses besides properly parsing the fields in the parsers and printing them out as part of the doXContent method in the builders. More specifically, these are the performed changes: - Introduced printBoostAndQueryName utility method in AbstractQueryBuilder that subclasses can use to print out _name and boost in their doXContent method. - readFrom and writeTo are now final methods that take care of _name and boost serialization. Subclasses have to implement doReadFrom and doWriteTo instead. - toQuery is a final method too that takes care of properly applying _name and boost to the lucene query. Subclasses have to implement doToQuery instead. The query returned will have boost and queryName applied automatically. - Removed BoostableQueryBuilder interface, given that every query is boostable after this change. This won't have any negative effect on filters, as the boost simply gets ignored in that case. - Extended equals and hashcode to handle queryName and boost automatically as well. - Update the query test infra so that queryName and boost are tested automatically, and whenever they are forgotten in parser or doXContent tests fail, so this makes things a lot less error-prone - Introduced DEFAULT_BOOST constant to make sure we don't repeat 1.0f all the time for default boost values. SpanQueryBuilder is again a marker interface only. The convenient toQuery that allowed us to override the return type to SpanQuery cannot be supported anymore due to a clash with the toQuery implementation from AbstractQueryBuilder. We have to go back to castin lucene Query to SpanQuery when dealing with span queries unfortunately. Note that this change touches not only the already refactored queries but also the untouched ones, by making sure that we parse _name and boost whenever we need to and that we print them out as part of QueryBuilder#doXContent. This will result in printing out the default boost all the time rather than skipping it in non refactored queries, something that we would have changed anyway as part of the query refactoring. The following are the queries that support boost now while previously they didn't (parser now parses it and builder prints it out): and, exists, fquery, geo_bounding_box, geo_distance, geo_distance_range, geo_hash_cell, geo_polygon, indices, limit, missing, not, or, script, type. The following are the queries that support _name now while previously they didn't (parser now parses it and builder prints it out): boosting, constant_score, function_score, limit, match_all, type. Range query parser supports now _name at the same level as boost too (_name is still supported on the outer object though for bw comp). There are two exceptions that despite have getters and setters for queryName and boost don't really support boost and queryName: query filter and span multi term query. The reason for this is that they only support a single inner object which is another query that they wrap, no other elements. Relates to #11744 Closes #10776 Closes #11974
2015-06-30 09:41:32 -04:00
==== Operator
Removed the enums called `Operator` from `MatchQueryBuilder`, `QueryStringQueryBuilder`,
`SimpleQueryStringBuilder`, and `CommonTermsQueryBuilder` in favour of using the enum
defined in `org.elasticsearch.index.query.Operator` in an effort to consolidate the
codebase and avoid duplication.
Query refactoring: unify boost and query name Following the discussion in #11744, move boost and query _name to base class AbstractQueryBuilder with their getters and setters. Unify their serialization code and equals/hashcode handling in the base class too. This guarantess that every query supports both _name and boost and nothing needs to be done around those in subclasses besides properly parsing the fields in the parsers and printing them out as part of the doXContent method in the builders. More specifically, these are the performed changes: - Introduced printBoostAndQueryName utility method in AbstractQueryBuilder that subclasses can use to print out _name and boost in their doXContent method. - readFrom and writeTo are now final methods that take care of _name and boost serialization. Subclasses have to implement doReadFrom and doWriteTo instead. - toQuery is a final method too that takes care of properly applying _name and boost to the lucene query. Subclasses have to implement doToQuery instead. The query returned will have boost and queryName applied automatically. - Removed BoostableQueryBuilder interface, given that every query is boostable after this change. This won't have any negative effect on filters, as the boost simply gets ignored in that case. - Extended equals and hashcode to handle queryName and boost automatically as well. - Update the query test infra so that queryName and boost are tested automatically, and whenever they are forgotten in parser or doXContent tests fail, so this makes things a lot less error-prone - Introduced DEFAULT_BOOST constant to make sure we don't repeat 1.0f all the time for default boost values. SpanQueryBuilder is again a marker interface only. The convenient toQuery that allowed us to override the return type to SpanQuery cannot be supported anymore due to a clash with the toQuery implementation from AbstractQueryBuilder. We have to go back to castin lucene Query to SpanQuery when dealing with span queries unfortunately. Note that this change touches not only the already refactored queries but also the untouched ones, by making sure that we parse _name and boost whenever we need to and that we print them out as part of QueryBuilder#doXContent. This will result in printing out the default boost all the time rather than skipping it in non refactored queries, something that we would have changed anyway as part of the query refactoring. The following are the queries that support boost now while previously they didn't (parser now parses it and builder prints it out): and, exists, fquery, geo_bounding_box, geo_distance, geo_distance_range, geo_hash_cell, geo_polygon, indices, limit, missing, not, or, script, type. The following are the queries that support _name now while previously they didn't (parser now parses it and builder prints it out): boosting, constant_score, function_score, limit, match_all, type. Range query parser supports now _name at the same level as boost too (_name is still supported on the outer object though for bw comp). There are two exceptions that despite have getters and setters for queryName and boost don't really support boost and queryName: query filter and span multi term query. The reason for this is that they only support a single inner object which is another query that they wrap, no other elements. Relates to #11744 Closes #10776 Closes #11974
2015-06-30 09:41:32 -04:00
==== queryName and boost support
Support for `queryName` and `boost` has been streamlined to all of the queries. That is
a breaking change till queries get sent over the network as serialized json rather
than in `Streamable` format. In fact whenever additional fields are added to the json
representation of the query, older nodes might throw error when they find unknown fields.
==== InnerHitsBuilder
InnerHitsBuilder now has a dedicated addParentChildInnerHits and addNestedInnerHits methods
to differentiate between inner hits for nested vs. parent / child documents. This change
makes the type / path parameter mandatory.
==== MatchQueryBuilder
Moving MatchQueryBuilder.Type and MatchQueryBuilder.ZeroTermsQuery enum to MatchQuery.Type.
Also reusing new Operator enum.
==== MoreLikeThisQueryBuilder
Removed `MoreLikeThisQueryBuilder.Item#id(String id)`, `Item#doc(BytesReference doc)`,
`Item#doc(XContentBuilder doc)`. Use provided constructors instead.
Removed `MoreLikeThisQueryBuilder#addLike` in favor of texts and/or items beeing provided
at construction time. Using arrays there instead of lists now.
Removed `MoreLikeThisQueryBuilder#addUnlike` in favor to using the `unlike` methods
which take arrays as arguments now rather than the lists used before.
The deprecated `docs(Item... docs)`, `ignoreLike(Item... docs)`,
`ignoreLike(String... likeText)`, `addItem(Item... likeItems)` have been removed.
==== GeoDistanceQueryBuilder
Removing individual setters for lon() and lat() values, both values should be set together
using point(lon, lat).
==== GeoDistanceRangeQueryBuilder
Removing setters for to(Object ...) and from(Object ...) in favour of the only two allowed input
arguments (String, Number). Removing setter for center point (point(), geohash()) because parameter
is mandatory and should already be set in constructor.
Also removing setters for lt(), lte(), gt(), gte() since they can all be replaced by equivallent
calls to to/from() and inludeLower()/includeUpper().
==== GeoPolygonQueryBuilder
Require shell of polygon already to be specified in constructor instead of adding it pointwise.
This enables validation, but makes it necessary to remove the addPoint() methods.
==== MultiMatchQueryBuilder
Moving MultiMatchQueryBuilder.ZeroTermsQuery enum to MatchQuery.ZeroTermsQuery.
Also reusing new Operator enum.
Removed ability to pass in boost value using `field(String field)` method in form e.g. `field^2`.
Use the `field(String, float)` method instead.
==== MissingQueryBuilder
The two individual setters for existence() and nullValue() were removed in favour of
optional constructor settings in order to better capture and validate their interdependent
settings at construction time.
==== TermsQueryBuilder
Remove the setter for `termsLookup()`, making it only possible to either use a TermsLookup object or
individual values at construction time. Also moving individual settings for the TermsLookup (lookupIndex,
lookupType, lookupId, lookupPath) to the separate TermsLookup class, using constructor only and moving
checks for validation there. Removed `TermsLookupQueryBuilder` in favour of `TermsQueryBuilder`.
==== FunctionScoreQueryBuilder
`add` methods have been removed, all filters and functions must be provided as constructor arguments by
creating an array of `FunctionScoreQueryBuilder.FilterFunctionBuilder` objects, containing one element
for each filter/function pair.
`scoreMode` and `boostMode` can only be provided using corresponding enum members instead
of string values: see `FilterFunctionScoreQuery.ScoreMode` and `CombineFunction`.
`CombineFunction.MULT` has been renamed to `MULTIPLY`.