181 Commits

Author SHA1 Message Date
Christoph Büscher 2b07f63bd5
Fix NDCG for empty search results (#29267)
Fixes an edge case where DiscountedCumulativeGain can return NaN as the
result of the quality metric calculation. This can happen when the
search result set is empty and normalization is used. We should return 0
in this case. Also adding related unit tests to the other two metrics.
2018-04-03 11:15:44 +02:00
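The edge case described in this commit can be illustrated with a minimal Python sketch. It is not the Java code from `DiscountedCumulativeGain`; the function names and the exponential gain `2^rating - 1` are assumptions chosen for illustration. In Java an unguarded `0.0 / 0.0` evaluates to NaN, while Python would raise `ZeroDivisionError`, so the guard is needed either way.

```python
import math


def dcg(ratings):
    """Discounted cumulative gain over ratings given in rank order.

    Assumes the common exponential gain (2^rating - 1) and a log2 discount.
    """
    return sum(
        (2 ** rating - 1) / math.log2(rank + 1)
        for rank, rating in enumerate(ratings, start=1)
    )


def ndcg(ratings):
    """Normalized DCG; returns 0.0 when the ideal DCG is zero."""
    ideal = dcg(sorted(ratings, reverse=True))
    if ideal == 0:
        # Empty result set (or only zero ratings): nothing to normalize by.
        return 0.0
    return dcg(ratings) / ideal
```

With this guard, `ndcg([])` yields `0.0` instead of an undefined value.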
Christoph Büscher e4b30071bb
RankEvalRequest should implement IndicesRequest (#29188)
Change RankEvalRequest to implement IndicesRequest, so it gets treated
in a similar fashion to regular search requests e.g. by security.
2018-03-22 11:58:55 +01:00
Christoph Büscher 80532229a9
Move indices field from RankEvalSpec to RankEvalRequest (#28341)
Currently we store the indices specified in the request URL together with all
the other ranking evaluation specification in RankEvalSpec. This is not ideal
since e.g. the indices are not rendered to xContent and so cannot be parsed
back. Instead we should keep them in RankEvalRequest.
2018-03-19 16:26:02 +01:00
Lee Hinman 8e8fdc4f0e
Decouple XContentBuilder from BytesReference (#28972)
* Decouple XContentBuilder from BytesReference

This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.

While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).

Relates to #28504
2018-03-14 13:47:57 -06:00
Christoph Büscher 01791277cb
Test that rank_eval request parsing is not lenient (#28516)
Parsing of a ranking evaluation request and its subcomponents should throw parsing
errors on unknown fields. This change adds tests for this and changes the parser 
behaviour in cases where it is needed.
2018-02-08 17:38:45 +01:00
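A non-lenient parser of the kind this commit tests for might look like the following Python sketch. The field set and the function name are hypothetical and only illustrate the idea of rejecting unknown fields instead of silently ignoring them.

```python
# Hypothetical set of fields accepted for one rated document.
ALLOWED_FIELDS = {"_index", "_id", "rating"}


def parse_rated_document(obj):
    """Parse a rated document, failing on unknown or missing fields."""
    unknown = sorted(set(obj) - ALLOWED_FIELDS)
    if unknown:
        raise ValueError(f"unknown field(s): {unknown}")
    missing = sorted(ALLOWED_FIELDS - set(obj))
    if missing:
        raise ValueError(f"missing field(s): {missing}")
    return (obj["_index"], obj["_id"], obj["rating"])
```

A misspelled key such as `"ratng"` then surfaces as a parsing error rather than as a document with a missing rating.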
Lee Hinman eebff4d2b3
Use non deprecated xcontenthelper (#28503)
* Move to non-deprecated XContentHelper.createParser(...)

This moves away from one of the now-deprecated XContentHelper.createParser
methods in favor of specifying the deprecation logger at parser creation time.

Relates to #28449

Note that this doesn't move all the `createParser` calls because some of them
use the already-deprecated method that doesn't specify the XContentType.

* Remove the deprecated (and now non-needed) createParser method
2018-02-05 16:18:18 -07:00
Christoph Büscher 1c296fe7ed Update bwc version for rank_eval rest tests 2018-01-30 21:02:19 +01:00
Christoph Büscher 6731c76900
Add ranking evaluation API to High Level Rest Client (#28357)
This change adds support for the new ranking evaluation API to the High Level Rest Client.
This mostly means adding support for parsing the various response objects back from the
REST representation. It includes one change to the response syntax where previously we didn't
print the type of the metric details section but we now need it to pick the right parser to
parse this section back.

Closes #28198
2018-01-30 17:48:09 +01:00
Christoph Büscher a6bfe67f8b [Test] Lower bwc version for rank-eval rest tests
The API was backported to 6.2 so the version we test against on master can be
lowered to that.
2018-01-22 13:33:42 +01:00
Christoph Büscher 77dcaab34f
Simplify RankEvalResponse output (#28266)
Currently the REST response of the ranking evaluation API wraps everything inside an
enclosing `rank_eval` object. This is redundant since it is clear from the API
call and it doesn't provide any other useful information. This change removes
this.
2018-01-18 09:32:27 +01:00
Christoph Büscher 29b07bb6c4 [Test] Fix scores for dcg in RankEvalRequestIT and RankEvalYamlIT
Allow small deviations when asserting ranking scores, otherwise some tests break
on floating point calculation differences e.g. when running on ARM.
2018-01-03 17:24:10 +01:00
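The idea of allowing small deviations when asserting scores can be sketched in Python as follows. The helper name and the default delta are assumptions, not values taken from the actual tests.

```python
import math


def assert_score_close(actual, expected, delta=1e-5):
    """Assert that two scores agree within an absolute tolerance.

    The default delta is an illustrative assumption.
    """
    if not math.isclose(actual, expected, rel_tol=0.0, abs_tol=delta):
        raise AssertionError(f"expected {expected} +/- {delta}, got {actual}")


# Exact equality fails on floating point rounding, the tolerant check does not.
assert 0.1 + 0.2 != 0.3
assert_score_close(0.1 + 0.2, 0.3)
```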
Christoph Büscher 8925dabcb8 [Test] Fix allowed delta for calculated scores in DiscountedCumulativeGainTests 2018-01-02 16:46:31 +01:00
Tanguy Leroux d2939a9daa [Test] Mute DiscountedCumulativeGainTests on ARM
These tests fail on ARM architectures. This is tracked in
https://github.com/elastic/elasticsearch/issues/28048
2018-01-02 16:16:43 +01:00
Christoph Büscher c541a0c60e Add skip versions for rank_eval yaml tests 2017-12-14 22:18:37 +01:00
Christoph Büscher 33bcfddb54 Use SPI to provide named XContent parsers for ranking evaluation 2017-12-12 18:39:01 +01:00
Christoph Büscher 72d0de4197
Add search window parameter k to MRR and DCG metric (#27595) 2017-12-04 10:54:03 +01:00
Christoph Büscher 7bfb273763
Add k parameter to PrecisionAtK metric (#27569) 2017-11-29 15:19:16 +01:00
Christoph Büscher 1352b7c6ea
Use msearch instead of single search (#27520)
Change TransportRankEvalAction to use one MultiSearchRequest instead of issuing several parallel search requests to simplify the transport action.
2017-11-27 10:15:59 +01:00
Christoph Büscher 94a0631a3e [Tests] Add testToXContent() RankEvalResponseTests 2017-11-21 14:09:50 +01:00
Christoph Büscher 35fabdaf8a Parse EvaluationMetrics as named Objects 2017-11-21 14:09:38 +01:00
Christoph Büscher fdb24cd3e4 Fixing occasional test failure in RankEvalSpecTests 2017-11-21 14:09:13 +01:00
Christoph Büscher 3348d2317f Reworking javadocs, minor changes in some implementation classes 2017-11-21 14:09:04 +01:00
Christoph Büscher e278c1d17d Improving and cleaning up tests
Removing the unnecessary RankEvalTestHelper, making use of the common test infra
in ESTestCase, also hardening a few of the classes by making more fields final.
2017-11-21 14:08:53 +01:00
Christoph Büscher 5c65a59369 Extending rank_eval asciidocs 2017-11-21 14:08:42 +01:00
Christoph Büscher d9e67a2c95 Extending `_rank_eval` documentation 2017-11-21 14:08:28 +01:00
Christoph Büscher 0a6c6ac360 Remove usage of types in rank_eval endpoint 2017-11-21 14:07:41 +01:00
Christoph Büscher c83ec1f133 Fixing test after merging in master 2017-09-15 13:44:40 +02:00
Christoph Büscher cb4fd3bac6 Fix more tests 2017-08-23 13:14:48 +02:00
Christoph Büscher 56360ecfb5 Fix failing tests due to xContent changes 2017-08-23 12:22:07 +02:00
Christoph Büscher bc544e2d1b Adapt branch to changes on master 2017-08-23 12:05:52 +02:00
Christoph Büscher 887ed68cf2 Fixing compilation issues and tests after merging in master 2017-07-14 19:23:35 +02:00
Christoph Büscher 4de4c795b7 Fix issues after merging in master 2017-06-14 12:16:58 +02:00
Christoph Büscher 5a4124d4fb Fixing template rendering after changes in master 2017-05-30 15:30:24 +02:00
Christoph Büscher 10d308578e Fix compilation issues after merge with master 2017-05-18 17:52:58 +02:00
Christoph Büscher d1703decee Adapting to changes in master 2017-04-22 22:06:06 +02:00
Christoph Büscher 6cfbef73a0 Follow renaming of randomAsciiOfLength() to randomAlphaOfLength() 2017-04-04 18:31:00 +02:00
Christoph Büscher 4a75ede208 Reformatting source to fit 100 character line length restriction 2017-03-23 20:20:22 +01:00
Christoph Büscher f5388e5799 Adapting rank_eval integration tests 2017-03-14 12:21:28 -07:00
Christoph Büscher ddae32705c Adapting build.gradle to changes on master 2017-02-27 11:42:46 +01:00
Christoph Büscher 6f6b2933b1 Fixing compile issues after merging in master 2017-02-16 11:02:02 +01:00
Christoph Büscher dde2a09ba5 Updating rank-eval module after major changes on master 2017-02-03 21:17:46 +01:00
Isabel Drost-Fromm 7d849fb861 RankEvaluation: Add mutation based testing to RankEvalSpec (#22258)
RankEvaluation: Add mutation based testing to RankEvalSpec, also fix RatedRequestsTests that were failing intermittently.
2016-12-19 15:15:22 +01:00
Isabel Drost-Fromm 46c30e6bc3 Make maximum number of parallel search requests configurable. (#22192)
Problem: So far all rank eval requests are being executed in parallel. If there
are more than the search thread pool can handle, or if there are other search
requests executed in parallel rank eval can fail.

Solution: Make number of max_concurrent_searches configurable.

Name of configuration parameter is analogous to msearch. Default
max_concurrent_searches set to 10: Rank_eval isn't particularly time critical so
trying to avoid being more clever than probably needed here. Can set this value
through the API to a higher value anytime.

Fixes #21403
2016-12-19 13:05:49 +01:00
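As a rough illustration of bounding parallel searches, the following Python sketch caps concurrency with a thread pool. Only the parameter name `max_concurrent_searches` and its default of 10 come from the commit message; `run_searches` and `search_fn` are hypothetical stand-ins, and the real implementation lives in the Java transport action.

```python
from concurrent.futures import ThreadPoolExecutor


def run_searches(requests, search_fn, max_concurrent_searches=10):
    """Run search_fn for every request with bounded parallelism.

    Results are returned in the same order as the requests.
    """
    if max_concurrent_searches < 1:
        raise ValueError("max_concurrent_searches must be at least 1")
    with ThreadPoolExecutor(max_workers=max_concurrent_searches) as pool:
        return list(pool.map(search_fn, requests))
```

At most `max_concurrent_searches` calls to `search_fn` run at the same time; the remaining requests wait in the pool's queue.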
Isabel Drost-Fromm bdc32be8b7 Support specifying multiple templates (#22139)
Problem: We introduced the ability to shorten the rank eval request by using a
template in #20231. When playing with the API it turned out that there might be
use cases where - e.g. due to various heuristics - folks might want to translate
the original user query into more than just one type of Elasticsearch query.

Solution: Give each template an id that can later be referenced in the
actual requests.

Closes #21257
2016-12-19 12:49:15 +01:00
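Referencing templates by id can be sketched in Python as below. The template ids, the request keys `template_id` and `params`, and the naive `{{name}}` substitution are assumptions for illustration; this is not a real mustache renderer and does no escaping.

```python
import json

# Hypothetical templates, each addressable by its id.
TEMPLATES = {
    "match_title": '{"query": {"match": {"title": "{{query_string}}"}}}',
    "match_body": '{"query": {"match": {"body": "{{query_string}}"}}}',
}


def render(template_source, params):
    """Naively substitute {{name}} placeholders, then parse the JSON."""
    rendered = template_source
    for name, value in params.items():
        rendered = rendered.replace("{{" + name + "}}", str(value))
    return json.loads(rendered)


def resolve(request, templates=TEMPLATES):
    """Build the concrete query for a request that references a template id."""
    template_id = request["template_id"]
    if template_id not in templates:
        raise KeyError(f"unknown template id: {template_id}")
    return render(templates[template_id], request["params"])
```

Two requests can then share the same user query string but pick different templates by id.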
Isabel Drost-Fromm 58342d4c9a Add checks to RankEvalSpec to safeguard against missing parameters. (#22026)
Add checks to RankEvalSpec to safeguard against missing parameters.

Fail early if no metric is supplied, no rated requests are supplied, or the search source builder is missing and no template is supplied either.

Add stricter checks around rank eval request parsing: fail if a rated request contains both a verbatim request and request
template parameters.


Relates to #21260
2016-12-13 11:21:57 +01:00
Isabel Drost-Fromm 6d1a658106 Fix compile issues after merging in master. 2016-12-13 11:16:53 +01:00
Isabel Drost-Fromm 165cec2757 Serialisation and validation checks for rank evaluation request components (#21975)
Adds tests around serialisation/validation checks for rank evaluation request components

* Add null/ empty string checks to RatedDocument constructor
* Add mutation test to RatedDocument serialization tests.
* Reorganise rank-eval RatedDocument tests and add serialisation test.
* Add roundtrip serialisation testing for RatedRequests
* Adds serialisation testing and equals/hashcode testing for RatedRequest.
* Fixes a bug in previous equals implementation of RatedRequest along the way.
* Add roundtrip tests for Precision and ReciprocalRank
* Also fixes a bug with serialising ReciprocalRank.
* Add roundtrip testing for DiscountedCumulativeGain
* Add serialisation test for DocumentKey and fix test init
* Add check that relevant doc threshold is always positive for precision.
* Check that relevant threshold is always positive for precision and reciprocal
rank

Closes #21401
2016-12-07 11:47:47 +01:00
Isabel Drost-Fromm e0b15eafb0 Move rank-eval template compilation down to TransportRankEvalAction (#21855)
Move rank-eval template compilation down to TransportRankEvalAction

Closes #21777 and #21465

So far, search templates for the rank_eval endpoint only worked when sent through
the REST endpoint.

However we also allow templates to be set through a Java API call to
"setTemplate" on that same spec. This doesn't go through template execution so
fails further down the line.

To make this work, moved template execution further down, probably to
TransportRankEvalAction.

* Add template IT test for Java API
* Move template compilation to TransportRankEvalAction
2016-12-06 10:30:25 +01:00
Christoph Büscher 8ba771e913 Rank Eval: Handle exceptions on search requests and add them to response
Currently we fail the whole ranking evaluation request when we receive an
exception for any of the search requests. We should collect those errors and
report them back to the user in the rest response. This change adds collecting
the errors and propagating them back via the RankEvalResponse.

Closes #19889
2016-11-24 13:52:10 +01:00
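Collecting per-request failures instead of failing the whole evaluation can be sketched in Python like this. The function and the `details`/`failures` result keys are illustrative assumptions and do not claim to mirror `RankEvalResponse`.

```python
def evaluate_all(requests, search_fn, metric_fn):
    """Evaluate each request; record errors per request id instead of aborting.

    requests maps a request id to a request object, search_fn runs one
    search, and metric_fn scores the returned hits.
    """
    details = {}
    failures = {}
    for request_id, request in requests.items():
        try:
            hits = search_fn(request)
        except Exception as exc:  # keep going, report the error to the caller
            failures[request_id] = str(exc)
            continue
        details[request_id] = metric_fn(hits)
    return {"details": details, "failures": failures}
```

A single failing search then shows up under its request id while the other queries are still scored.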
Christoph Büscher 9c58578dc6 Renaming RankEvalRequestTests to RankEvalRequestIT 2016-11-17 15:18:08 +01:00
Christoph Büscher 6cc7fb5600 Fix potential NPE in RankEvalSpec roundtrip test 2016-11-15 20:53:51 +01:00
Christoph Büscher 3d4804f9ed Fix test after merging in master branch 2016-11-15 15:55:09 +01:00
Christoph Büscher d6738fa650 Merge branch 'master' into feature/rank-eval
Conflicts:
	core/src/main/java/org/elasticsearch/script/Script.java
2016-11-15 14:56:36 +01:00
Christoph Büscher 57ea1abb55 Fixing compile error in SmokeTestRankEvalWithMustacheYAMLTestSuiteIT 2016-11-14 13:42:14 +01:00
Christoph Büscher 4e5c868709 RankEval: Add optional parameter to ignore unlabeled documents in Precision metric
Our current default behaviour to ignore unrated documents when calculating the
precision seems a bit counter intuitive. Instead we should treat those documents
as "irrelevant" by default and provide an optional parameter to ignore those
documents if that is the behaviour the user wants.
2016-11-09 15:53:16 +01:00
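The behaviour change described here can be sketched in Python. The function, its parameter names, and the use of `None` for an unrated hit are assumptions made for illustration.

```python
def precision(hit_ratings, relevant_threshold=1, ignore_unlabeled=False):
    """Precision over search hits given in rank order.

    hit_ratings holds one rating per hit, or None if the hit is unrated.
    By default unrated hits count as irrelevant; with ignore_unlabeled=True
    they are left out of the calculation entirely.
    """
    relevant = 0
    counted = 0
    for rating in hit_ratings:
        if rating is None:
            if ignore_unlabeled:
                continue
            counted += 1
        else:
            counted += 1
            if rating >= relevant_threshold:
                relevant += 1
    # No counted hits means neither true nor false positives: report 0.0.
    return relevant / counted if counted else 0.0
```

For the hits `[1, None, 0, 1]` the default gives 2 relevant out of 4 counted, while `ignore_unlabeled=True` gives 2 out of 3.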
Christoph Büscher 4718f000df Remove test that uses max_acceptable_rank parameter 2016-11-08 15:51:37 +01:00
Christoph Büscher adb24333ca Remove maxAcceptableRank parameter from ReciprocalRank 2016-11-08 12:35:06 +01:00
Christoph Büscher c8d9d063ca Removing the 'size' parameter from the dcg metric 2016-11-08 12:35:06 +01:00
Christoph Büscher 3855d7f721 Removing the 'size' parameter from precision metric 2016-11-08 12:35:06 +01:00
Christoph Büscher d013a2c8f1 Adapting to change in ESClientYamlSuiteTestCase api 2016-11-08 11:30:36 +01:00
Christoph Büscher 2dad72e68c Rank Eval: Handle precision@ edge case
There's a currently unhandled edge case for the precision@ metric. When none of
the search hits in the result are rated, we have neither true nor false
positives which currently leads to division by zero. We should return a precision
of 0.0 in this case.
2016-11-03 12:59:36 +01:00
Christoph Büscher 25565b9baa RankEval: Check for duplicate keys in rated documents
When multiple ratings for the same document (identified by _index, _type,
_id) are specified in the request we should throw an error. This change adds a
check for this in the RatedRequest setter (and ctor that uses that setter).

Closes #20997
2016-11-01 14:54:05 +01:00
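A duplicate check of this kind could look like the following Python sketch. The dictionary layout of a rated document and the function name are assumptions; only the key fields `_index`, `_type`, and `_id` come from the commit message.

```python
from collections import Counter


def check_no_duplicate_ratings(rated_docs):
    """Raise if the same (_index, _type, _id) key is rated more than once."""
    keys = [(doc["_index"], doc["_type"], doc["_id"]) for doc in rated_docs]
    duplicates = sorted(key for key, count in Counter(keys).items() if count > 1)
    if duplicates:
        raise ValueError(f"duplicate rated documents: {duplicates}")
```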
Christoph Büscher 51102ee91c Fixing compile issue with ScriptType after merge with master 2016-11-01 14:42:24 +01:00
Isabel Drost-Fromm 0b8a2e40cb First step towards supporting templating in rank eval requests. (#20374)
This adds support for templating in rank eval requests.

Relates to #20231

Problem: In its current state the rank-eval request API forces the user to repeat complete queries for each test request. In most use cases the structure of the query to test will be stable with only parameters changing across requests, so this looks like a lot of boilerplate JSON for something that could be expressed in a more concise way.

Uses templating/ ScriptServices to enable users to submit only one test request template and let them only specify template parameters on a per test request basis.
2016-11-01 11:36:22 +01:00
Christoph Büscher dfc6d1f369 Remove unknown docs from EvalQueryQuality
The unknown document section in the response for each query can be rendered
using the rated hits that are now also part of the response by just filtering
the documents without a rating.
2016-10-14 17:14:05 +02:00
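Deriving the unknown documents from the rated hits amounts to a simple filter, sketched here in Python. The shape of a hit entry (a `hit` object plus an optional `rating`) is an assumption made for this example.

```python
def unknown_docs(rated_hits):
    """Return the hits that carry no rating, in their original order."""
    return [entry["hit"] for entry in rated_hits if entry.get("rating") is None]


# Example hit list; the structure is assumed for illustration.
hits = [
    {"hit": {"_index": "library", "_id": "1"}, "rating": 2},
    {"hit": {"_index": "library", "_id": "7"}, "rating": None},
    {"hit": {"_index": "library", "_id": "9"}},
]
unrated = unknown_docs(hits)
```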
Christoph Büscher 9e394b0644 Pull common operations into RankedListQualityMetric interface
Currently each implementation of RankedListQualityMetric does some initial
joining operation that links the input search hits with a rated document rating,
if available. Also all metrics collect unknown docs and now also need to add the
list of rated search hits to the partial query evaluation. This change
centralizes this work in some new helper methods in RankedListQualityMetric.
2016-10-14 17:14:05 +02:00
Christoph Büscher ebe13100df Add `hits` section to response for each ranking evaluation query
This change adds a `hits` section to the response part for each ranking
evaluation query, containing a list of documents (index/type/id) and ratings (if
the document was rated in the request). This section can be used to better
understand the calculation of the ranking quality of this particular query, but
it can also be used to identify the "unknown" (that is unrated) documents that
were part of the search hits, for example because a UI later wants to present
those documents to the user to get a rating for them.

If the user specifies a set of field names using a parameter called
`summary_fields` in the request, those fields are also included as part of the
response in addition to "_index", "_type", "_id".
2016-10-14 17:14:05 +02:00
Christoph Büscher c3380863be Adapting RestRankEvalAction to changes in API on master 2016-10-10 12:57:18 +02:00
Christoph Büscher 29402a28e0 RankEval: Adding details section to response (#20497)
In order to understand how well particular queries in a joint ranking evaluation 
request work we want to break down the overall metric into its components, each
contributed by a particular query. The response structure now has a
`details` section under which we can summarize this information. Each
sub-section is keyed by the query-id and currently only contains the partial
metric and the unknown_docs section for each query.
2016-09-22 12:17:10 +02:00
Christoph Büscher 6c1003b81a Normalize rated document key parameter
To be consistent with the output of the search API, we should use the same field
names for specifying the document ("_index", "_type", "_id") when providing the
rated documents in the `rank_eval` request.
2016-09-21 12:36:50 +02:00
Christoph Büscher 5398425f4f Remove top level spec_id
Currently the top level spec_id serves as a human-readable description of the
ranking evaluation API call. Since there is only one id possible, it can be
dropped to simplify the request.

Closes #20438
2016-09-13 13:10:47 +02:00
Christoph Büscher 7dc1c3349d Remove nested `key` field for rated documents
Every rated document needs an index/type/id parameter, so adding a "key" object
like we currently do only leads to an additional unneeded level of nesting in
the rest request.

Closes #20417
2016-09-13 12:12:39 +02:00
Christoph Büscher 63822c2745 Adapt to changes on master 2016-09-12 12:05:17 +02:00
Isabel Drost-Fromm 450b756152 We don't actually need this annotation anymore. 2016-09-07 13:54:27 +02:00
Isabel Drost-Fromm d34a117422 Use roundtrip helper for rank eval spec tests. 2016-09-07 13:03:18 +02:00
Isabel Drost-Fromm ba9956a468 Remove FromXContentBuilder from RatedDocumentKey 2016-09-07 12:52:30 +02:00
Isabel Drost-Fromm 9d8bb720dc Remove leftover FromXContentBuilder reference 2016-09-07 12:50:04 +02:00
Isabel Drost-Fromm 2af0218bdd Rename test class to match renamed implementation
Used to be QuerySpec, is now RatedRequest; changing name
of test accordingly.
2016-09-07 12:12:06 +02:00
Isabel Drost-Fromm 0d25fc4925 Re-use roundtrip helper for QuerySpecTests 2016-09-07 12:11:07 +02:00
Isabel Drost-Fromm b3a4a89151 Replace magic number in test with random number. 2016-09-07 12:06:39 +02:00
Isabel Drost-Fromm c8a3b3c32f Move xcontent roundtrip method to helper class. 2016-09-07 12:02:00 +02:00
Isabel Drost-Fromm efbef20361 Remove call to getClass from hashCode implementations. 2016-09-07 11:53:39 +02:00
Isabel Drost-Fromm 333b769871 Remove usage of FromXContentBuilder 2016-09-07 11:48:50 +02:00
Isabel Drost-Fromm 6bc646ac84 Make constructor enforce non-optional position argument. 2016-09-07 11:21:50 +02:00
Isabel Drost-Fromm b8652b1224 Test wouldn't compile w/o these annotations. 2016-09-07 11:21:29 +02:00
Isabel Drost-Fromm c4cfb9d7b5 Merge branch 'feature/rank-eval' into feature/rank-eval-roundtrip-testing 2016-09-07 11:05:56 +02:00
Christoph Büscher 0b92d524a7 Add threshold for document ratings for PrecisionAtN and ReciprocalRank
PrecisionAtN and ReciprocalRank are binary evaluation metrics by default that
only distinguish between relevant/irrelevant search results. So far we assumed
that relevant documents are labeled with 1 (irrelevant docs with 0) in the
evaluation request, but this is cumbersome if the ratings are provided on a
larger integer scale and would need to get mapped to a 0/1 value.

This change introduces a threshold parameter on the PrecisionAtN and
ReciprocalRank metrics that can be used to set the rating at or above which a
document is considered "relevant". It defaults to 1, so in case of 0/1 ratings
the threshold doesn't have to be set and only ratings with value 0 are
considered to be irrelevant.
2016-09-05 15:34:47 +02:00
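How a rating threshold turns graded ratings into a binary relevant/irrelevant decision can be sketched in Python with a reciprocal rank calculation. The function and parameter names are assumptions; only the default threshold of 1 comes from the commit message.

```python
def reciprocal_rank(hit_ratings, relevant_rating_threshold=1):
    """Reciprocal rank of the first hit rated at or above the threshold.

    hit_ratings holds one rating per hit in rank order, or None if unrated.
    Returns 0.0 if no hit is considered relevant.
    """
    for rank, rating in enumerate(hit_ratings, start=1):
        if rating is not None and rating >= relevant_rating_threshold:
            return 1.0 / rank
    return 0.0
```

With ratings `[0, 2, 3]` the default threshold finds the first relevant hit at rank 2, whereas a threshold of 3 only accepts the hit at rank 3.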
Christoph Büscher 8daec109fc Adding additional json object level to PrecisionAtN rendering 2016-09-02 18:03:58 +02:00
Isabel Drost-Fromm f18e23eb51 Fix test errors, add roundtrip testing to RankEvalSpec
This adds roundtrip testing to RankEvalSpec, fixes issues
introduced with the previous roundtrip tests, splits
xcontent generation/parsing from actually checking the resulting
objects to deal with e.g. all evaluation metrics needing
some extra treatment.

Renames QuerySpec to RatedRequest, renames newly introduced xcontent
generation helper to conform with naming conventions.

Fixes several lines that were too long, adds missing types where
needed.
2016-08-25 12:09:55 +02:00
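The roundtrip testing pattern mentioned in these commits (serialize, parse back, compare) can be sketched in Python as follows. The `RatedDocument` dataclass and the helper are simplified stand-ins, not the Java classes or the xContent test helper from the branch.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class RatedDocument:
    """Simplified stand-in for a rated document."""
    index: str
    doc_id: str
    rating: int

    def to_json(self):
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))


def assert_roundtrip(original):
    """Serialize, parse back, and check equality and hash consistency."""
    parsed = type(original).from_json(original.to_json())
    assert parsed == original
    assert hash(parsed) == hash(original)
    return parsed
```

Running the helper over randomly generated instances is what exposes mismatches between the generating and the parsing side.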
Isabel Drost-Fromm 87367be4e7 Add roundtrip testing to RatedDocumentKey 2016-08-24 15:32:44 +02:00
Isabel Drost-Fromm 94497871b5 Add roundtrip testing to QuerySpec 2016-08-24 15:17:58 +02:00
Isabel Drost-Fromm 5979802415 Add comment regarding changed xcontent generation 2016-08-24 15:17:42 +02:00
Isabel Drost-Fromm cebb0ba0d8 Add roundtripping to ReciprocalRank 2016-08-24 14:41:58 +02:00
Isabel Drost-Fromm 5c9cc1d453 Add roundtripping to PrecisionAtN 2016-08-24 14:39:12 +02:00
Isabel Drost-Fromm a2a92b9629 Add roundtrip xcontent test to DiscountedCumulativeGainAt
This factors the roundtripping out of RatedDocumentTests. Makes
RankedListQualityMetric and RatedDocument implement FromXContentBuilder
to be able to do the aforementioned refactoring in a generic way. Adds
a roundtrip test to DiscountedCumulativeGainAt.

Open questions:

DiscountedCumulativeGain didn't have a constructor that accepted all possible
parameters as arguments. Added one. I guess we still want to keep the one
that only requires the position argument?

To make roundtripping work I had to change the NAME parameter when generating
XContent for DiscountedCumulativeGainAt - all remaining unit tests seem to be
passing (haven't checked the REST tests yet) - need to figure out why that was
there to begin with.
2016-08-24 14:21:24 +02:00
Christoph Büscher 2f506bfe04 Add toXContent method to classes used in ranking request 2016-08-18 18:14:21 +02:00
Christoph Büscher 2e892185f0 Adapting to introduction of SearchRequestParsers on master 2016-08-17 11:54:07 +02:00
Christoph Büscher 1eedb4c033 Resolve missing imports due to changes in master 2016-08-16 11:48:28 +02:00
Christoph Büscher b7af7c21d1 Adapt to changes in master 2016-08-12 11:03:55 +02:00
Christoph Büscher cac4961ef4 Fix test failure because of broken RankEvalResponse serialization
The introduction of RatedDocumentKey accidentally broke the response
serialization because it cannot be written using writeGenericValue().
2016-08-11 15:42:18 +02:00