Following the discussion in #11744, move boost and query _name to base class AbstractQueryBuilder with their getters and setters. Unify their serialization code and equals/hashcode handling in the base class too. This guarantess that every query supports both _name and boost and nothing needs to be done around those in subclasses besides properly parsing the fields in the parsers and printing them out as part of the doXContent method in the builders. More specifically, these are the performed changes:
- Introduced printBoostAndQueryName utility method in AbstractQueryBuilder that subclasses can use to print out _name and boost in their doXContent method.
- readFrom and writeTo are now final methods that take care of _name and boost serialization. Subclasses have to implement doReadFrom and doWriteTo instead.
- toQuery is a final method too that takes care of properly applying _name and boost to the lucene query. Subclasses have to implement doToQuery instead. The query returned will have boost and queryName applied automatically.
- Removed BoostableQueryBuilder interface, given that every query is boostable after this change. This won't have any negative effect on filters, as the boost simply gets ignored in that case.
- Extended equals and hashcode to handle queryName and boost automatically as well.
- Update the query test infra so that queryName and boost are tested automatically, and whenever they are forgotten in parser or doXContent tests fail, so this makes things a lot less error-prone
- Introduced DEFAULT_BOOST constant to make sure we don't repeat 1.0f all the time for default boost values.
SpanQueryBuilder is again a marker interface only. The convenient toQuery that allowed us to override the return type to SpanQuery cannot be supported anymore due to a clash with the toQuery implementation from AbstractQueryBuilder. We have to go back to castin lucene Query to SpanQuery when dealing with span queries unfortunately.
Note that this change touches not only the already refactored queries but also the untouched ones, by making sure that we parse _name and boost whenever we need to and that we print them out as part of QueryBuilder#doXContent. This will result in printing out the default boost all the time rather than skipping it in non refactored queries, something that we would have changed anyway as part of the query refactoring.
The following are the queries that support boost now while previously they didn't (parser now parses it and builder prints it out): and, exists, fquery, geo_bounding_box, geo_distance, geo_distance_range, geo_hash_cell, geo_polygon, indices, limit, missing, not, or, script, type.
The following are the queries that support _name now while previously they didn't (parser now parses it and builder prints it out): boosting, constant_score, function_score, limit, match_all, type.
Range query parser supports now _name at the same level as boost too (_name is still supported on the outer object though for bw comp).
There are two exceptions that despite have getters and setters for queryName and boost don't really support boost and queryName: query filter and span multi term query. The reason for this is that they only support a single inner object which is another query that they wrap, no other elements.
Relates to #11744Closes#10776Closes#11974
We had several problems with Java Serializatin in the past. At some point
in the Java 1.7.x series JDKs where not compatible anymore when java
serialization (ObjectStream) was used to exchange objects. In elasticsearch
we used this to serialize exceptions across the wire which caused several problems
with incompatible JDKs. While causing lot of trouble this essentially prevented
users from moving forward and upgrade their JVMs. To prevent these kind of issues
this commit removes the dependency on java serialization entirely and bans the
usage of ObjectOutputStream and ObjectInputStream entirely.
Yet, we can't fully serialize all exception anymore such that this commit
is best effort and adds hand written serialization to all elasticsearch exceptions
as well to a selected set of JDK and Lucene exceptions. (see StreamOutput#writeThrowable /
StreamInput.readThrowable). Stacktraces should be preserved for all exceptions while
several names might be replaced with ElasticsearchException if there is no mapping for
the given exception.
Added validation for all inner queries that any already refactored query may hold. Added also tests around this. At the end of the refactoring validate will be called by SearchRequest#validate during TransportSearchAction execution, which will call validate against the top level query builder that will need to go and validate its data plus all of its inner queries, and so on.
Closes#11889
In order to support older RPM based distributions like CentOS5,
we should have one RPM available, which is not signed.
This commit creates an unsigned RPM first, then moves it over to
target/releases during the build, then builds a signed RPM.
The unsigned one is uploaded via S3, where as the signed one is
used for the repositories.
In addition, you can now build an RPM without having to specify
any gpg credentials due to offloading this into a maven profile
that is only activated when specifying `rpm.sign` property.
Closes#11587
instead of maintaining a thread local cache in the PercolatorQueriesRegistry.
Before PercolatorQueriesRegistry had its own cache, because all the queries had to forcefully opt out of caching. Nowadays in master small segments are never cached by the query cache, so the reason for the dedicated cache is no longer valid.
Today we keep track of how often filters are used at the index level in order
to decide whether they should be cached or not. This is an issue if you have
several shards of the same index on the same node as it will multiply statistics
by the number of shards that you have for this index on the node, which defeats
the purpose of waiting for a filter to be reused before caching them.
If the translog UUID is corrupted we should not convert it
to UTF-8 since it might be invalid. Instead we should compare
the UTF-8 byte representation directly.
This more consistent with the other logging it makes and since it can be used in many operations the output can be more verbose (without adding too much info as to who timed out exactly - which we can fix separately). If need be the caller of the observer can log a higher level message.
Closes#11722
In order to be more consistent with what they do, the query cache has been
renamed to request cache and the filter cache has been renamed to query
cache.
A known issue is that package/logger names do no longer match settings names,
please speak up if you think this is an issue.
Here are the settings for which I kept backward compatibility. Note that they
are a bit different from what was discussed on #11569 but putting `cache` before
the name of what is cached has the benefit of making these settings consistent
with the fielddata cache whose size is configured by
`indices.fielddata.cache.size`:
* index.cache.query.enable -> index.requests.cache.enable
* indices.cache.query.size -> indices.requests.cache.size
* indices.cache.filter.size -> indices.queries.cache.size
Close#11569