Commit Graph

120 Commits

Author SHA1 Message Date
Christoph Büscher 4ae4ac08d5
Add Expected Reciprocal Rank metric (#31891)
This change adds Expected Reciprocal Rank (ERR) as a ranking evaluation metric
as descriped in:

Chapelle, O., Metlzer, D., Zhang, Y., & Grinspan, P. (2009).
Expected reciprocal rank for graded relevance.
Proceeding of the 18th ACM Conference on Information and Knowledge Management.
https://doi.org/10.1145/1645953.1646033

ERR is an extension of the classical reciprocal rank to the graded relevance
case and assumes a cascade browsing model. It quantifies the usefulness of a
document at rank `i` conditioned on the degree of relevance of the items at ranks
less than `i`. ERR seems to be gain traction as an alternative to (n)DCG, so it
seems like a good metric to support. Also ERR seems to be the default optimization
metric used for training in RankLib, a widely used learning to rank library.

Relates to #29653
2018-07-12 15:50:58 +02:00
Christoph Büscher 02346c20a2
Rankeval: Fold template test project into main module (#31203)
This change moves tests in `smoke-test-rank-eval-with-mustache` into the main
ranking evaluation module by declaring that the integration testing cluster
requires the `lang-mustache` plugin. This avoids having to maintain the qa
project for only one basic test suite.
2018-06-15 15:55:39 +02:00
Christoph Büscher a0d6c19e75
Add details section for dcg ranking metric (#31177)
While the other two ranking evaluation metrics (precicion and reciprocal rank)
already provide a more detailed output for how their score is calculated, the
discounted cumulative gain metric (dcg) and its normalized variant are lacking
this until now. Its not really clear which level of detail might be useful for
debugging and understanding the final metric calculation, but this change adds a
`metric_details` section to REST output that contains some information about the
evaluation details.
2018-06-15 11:56:16 +02:00
Christoph Büscher 0c9d4cb417
Fix expectation on parsing exception (#31108)
The structure of the expected exception slightly changed, the change adapts the
assertions accordingly.

Closes #31104
2018-06-06 09:58:16 +02:00
Christoph Büscher 4624ba5e10 [Tests] Muting RatedRequestsTests#testXContentParsingIsNotLenient 2018-06-05 15:29:49 +02:00
Christoph Büscher 3f87c79500
Change ObjectParser exception (#31030)
ObjectParser should throw XContentParseExceptions, not IAE. A dedicated parsing
exception can includes the place where the error occurred.

Closes #30605
2018-06-04 20:20:37 +02:00
Christoph Büscher 0a5d46ef3c
[Test] Prefer ArrayList over Vector (#30965)
Replaces some occurances of Vector class with ArrayList in
tests of the rank-eval module.
2018-05-30 21:11:49 +02:00
Christoph Büscher cc93131318
Forbid expensive query parts in ranking evaluation (#30151)
Currently the ranking evaluation API accepts the full query syntax for
the queries specified in the evaluation set and executes them via multi
search. This potentially runs costly aggregations and suggestions too.
This change adds checks that forbid using aggregations, suggesters, 
highlighters and the explain and profile options in the queries that are 
run as part of the ranking evaluation since they are irrelevent in the 
context of this API.
2018-05-14 17:36:26 +02:00
Christoph Büscher d0f6657d90
Add tests for ranking evaluation with aliases (#29452)
The ranking evaluation requests so far were not tested against aliases
but they should run regardless of the targeted index is a real index or
an alias. This change adds cases for this to the integration and rest
tests.
2018-04-19 17:00:52 +02:00
Christoph Büscher 7c56cc2624 Make ranking evaluation details accessible for client
Allow high level java rest client to access details of the metric
calculation by making them accessible across packages. Also renaming the
inner `Breakdown` classes of the evaluation metrics to `Detail` to
better communicate their use.
2018-04-19 14:39:41 +02:00
Christoph Büscher fa1052017c
[Test] Minor changes to rank_eval tests (#29577)
Removing an enum in favour of local constants to simplify tests and
removing a few deprecated method calls and warnings.
2018-04-19 13:50:18 +02:00
Lee Hinman a93c942927
Move ObjectParser into the x-content lib (#29373)
* Move ObjectParser into the x-content lib

This moves `ObjectParser`, `AbstractObjectParser`, and
`ConstructingObjectParser` into the libs/x-content dependency. This decoupling
allows them to be used for parsing for projects that don't want to depend on the
entire Elasticsearch jar.

Relates to #28504
2018-04-06 09:41:14 -06:00
Christoph Büscher 570f1d9ac7
Add indices options support to _rank_eval (#29386)
Currently the ranking evaluation API doesn't support many of the
standard parameters of the search API. Some of these make sense, like
adding support for the common indices options parameters, which this
change adds.
2018-04-06 16:23:19 +02:00
Christoph Büscher 2b07f63bd5
Fix NDCG for empty search results (#29267)
Fixes and edge case where DiscountedCumulativeGain can return NaN as
result of the quality metric calculation. This can happen when the
search result set is empty and normalization is used. We should return 0
in this case. Also adding related unit tests to the other two metrics.
2018-04-03 11:15:44 +02:00
Christoph Büscher e4b30071bb
RankEvalRequest should implement IndicesRequest (#29188)
Change RankEvalRequest to implement IndicesRequest, so it gets treated
in a similar fashion to regular search requests e.g. by security.
2018-03-22 11:58:55 +01:00
Christoph Büscher 80532229a9
Move indices field from RankEvalSpec to RankEvalRequest (#28341)
Currently we store the indices specified in the request URL together with all
the other ranking evaluation specification in RankEvalSpec. This is not ideal
since e.g. the indices are not rendered to xContent and so cannot be parsed
back. Instead we should keep them in RankEvalRequest.
2018-03-19 16:26:02 +01:00
Lee Hinman 8e8fdc4f0e
Decouple XContentBuilder from BytesReference (#28972)
* Decouple XContentBuilder from BytesReference

This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.

While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).

Relates to #28504
2018-03-14 13:47:57 -06:00
Christoph Büscher 01791277cb
Test that rank_eval request parsing is not lenient (#28516)
Parsing of a ranking evaluation request and its subcomponents should throw parsing
errors on unknown fields. This change adds tests for this and changes the parser 
behaviour in cases where it is needed.
2018-02-08 17:38:45 +01:00
Christoph Büscher 1c296fe7ed Update bwc version for rank_eval rest tests 2018-01-30 21:02:19 +01:00
Christoph Büscher 6731c76900
Add ranking evaluation API to High Level Rest Client (#28357)
This change adds support for the new ranking evaluation API to the High Level Rest Client.
This mostly means adding support for parsing the various response objects back from the
REST representation. It includes one change to the response syntax where previously we didn't
print the type of the metric details section but we now need it to pick the right parser to
parse this section back.

Closes #28198
2018-01-30 17:48:09 +01:00
Christoph Büscher a6bfe67f8b [Test] Lower bwc version for rank-eval rest tests
The API was backported to 6.2 so the version we test against on master can be
lowered to that.
2018-01-22 13:33:42 +01:00
Christoph Büscher 77dcaab34f
Simplify RankEvalResponse output (#28266)
Currenty the rest response of the ranking evaluation API wraps all inside an
enclosing `rank_eval` object. This is redundant since it is clear from the API
call and it doesn't provide any other useful information. This change removes
this.
2018-01-18 09:32:27 +01:00
Christoph Büscher 29b07bb6c4 [Test] Fix scores for dcg in RankEvalRequestIT and RankEvalYamlIT
Allow small deviations when asserting ranking scores, otherwise some tests break
on floating point calculation differences e.g. when running on ARM.
2018-01-03 17:24:10 +01:00
Christoph Büscher 8925dabcb8 [Test] Fix allowed delta for calculated scores in DiscountedCumulativeGainTests 2018-01-02 16:46:31 +01:00
Tanguy Leroux d2939a9daa [Test] Mute DiscountedCumulativeGainTests on ARM
These tests fail on ARM architectures. This is tracked in
https://github.com/elastic/elasticsearch/issues/28048
2018-01-02 16:16:43 +01:00
Christoph Büscher c541a0c60e Add skip versions for rank_eval yaml tests 2017-12-14 22:18:37 +01:00
Christoph Büscher 72d0de4197
Add search window parameter k to MRR and DCG metric (#27595) 2017-12-04 10:54:03 +01:00
Christoph Büscher 7bfb273763
Add k parameter to PrecisionAtK metric (#27569) 2017-11-29 15:19:16 +01:00
Christoph Büscher 94a0631a3e [Tests] Add testToXContent() RankEvalResponseTests 2017-11-21 14:09:50 +01:00
Christoph Büscher 35fabdaf8a Parse EvluationMetrics as named Objects 2017-11-21 14:09:38 +01:00
Christoph Büscher fdb24cd3e4 Fixing occasional test failure in RankEvalSpecTests 2017-11-21 14:09:13 +01:00
Christoph Büscher 3348d2317f Reworking javadocs, minor changes in some implementation classes 2017-11-21 14:09:04 +01:00
Christoph Büscher e278c1d17d Improving and cleaning up tests
Removing the unnecessary RankEvalTestHelper, making use of the common test infra
in ESTestCase, also hardening a few of the classes by making more fields final.
2017-11-21 14:08:53 +01:00
Christoph Büscher 5c65a59369 Extending rank_eval asciidocs 2017-11-21 14:08:42 +01:00
Christoph Büscher d9e67a2c95 Extending `_rank_eval` documentation 2017-11-21 14:08:28 +01:00
Christoph Büscher 0a6c6ac360 Remove usage of types in rank_eval endpoint 2017-11-21 14:07:41 +01:00
Christoph Büscher c83ec1f133 Fixing test after merging in master 2017-09-15 13:44:40 +02:00
Christoph Büscher cb4fd3bac6 Fix more tests 2017-08-23 13:14:48 +02:00
Christoph Büscher 56360ecfb5 Fix failing tests due to xContent changes 2017-08-23 12:22:07 +02:00
Christoph Büscher 887ed68cf2 Fixing compilation issues and tests after merging in master 2017-07-14 19:23:35 +02:00
Christoph Büscher 5a4124d4fb Fixing template rendering after changes in master 2017-05-30 15:30:24 +02:00
Christoph Büscher 10d308578e Fix compilation issues after merge with master 2017-05-18 17:52:58 +02:00
Christoph Büscher d1703decee Adapting to changes in master 2017-04-22 22:06:06 +02:00
Christoph Büscher 6cfbef73a0 Follow renaming of randomAsciiOfLength() to randomAlphaOfLength() 2017-04-04 18:31:00 +02:00
Christoph Büscher 4a75ede208 Reformatting source to fit 100 character line length restriction 2017-03-23 20:20:22 +01:00
Christoph Büscher f5388e5799 Adapting rank_eval integration tests 2017-03-14 12:21:28 -07:00
Christoph Büscher 6f6b2933b1 Fixing compile issues after merging in master 2017-02-16 11:02:02 +01:00
Christoph Büscher dde2a09ba5 Updating rank-eval module after major changes on master 2017-02-03 21:17:46 +01:00
Isabel Drost-Fromm 7d849fb861 RankEvaluation: Add mutation based testing to RankEvalSpec (#22258)
RankEvaluation: Add mutation based testing to RankEvalSpec, also fix RatedRequestsTests that were failing intermittently.
2016-12-19 15:15:22 +01:00
Isabel Drost-Fromm 46c30e6bc3 Make maximum number of parallel search requests configurable. (#22192)
Problem: So far all rank eval requests are being executed in parallel. If there
are more than the search thread pool can handle, or if there are other search
requests executed in parallel rank eval can fail.

Solution: Make number of max_concurrent_searches configurable.

Name of configuration parameter is analogous to msearch. Default
max_concurrent_searches set to 10: Rank_eval isn't particularly time critical so
trying to avoid being more clever than probably needed here. Can set this value
through the API to a higher value anytime.

Fixes #21403
2016-12-19 13:05:49 +01:00