Adrien Grand 37a42219fc
Reduce the overhead of ImpactsDISI. (#12490)
`ImpactsDISI` is nice: you give it an `ImpactsEnum`, typically coming from the
`PostingsFormat`, and it automatically skips hits whose score cannot be
greater than the minimum competitive score. This is the class that yields 10x
or more speedups on top-level `TermQuery`s compared to exhaustive evaluation.

However, when nested under a disjunction or a conjunction, `ImpactsDISI`
typically adds more overhead than it saves through skipping. The reason is that
on a disjunction `a OR b`, the minimum competitive score of `a` is the minimum
score for the disjunction minus the maximum score of `b`. While this sort of
propagation of minimum competitive scores down the query tree sometimes helps,
it hurts more than it helps on average, because `ImpactsDISI` adds significant
overhead and the per-clause minimum scores are usually so low that they
don't actually enable skipping hits. I looked into reducing this overhead, but
a big part of it is the additional virtual call, so the only way to get rid of
it is to not wrap with an `ImpactsDISI` at all.
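
To make the arithmetic concrete, here is a minimal sketch of how a disjunction's minimum competitive score translates into a per-clause minimum. It is not Lucene code: the class and method names are hypothetical, the generalization to more than two clauses (subtracting the sum of the other clauses' maximum scores) and the clamping at zero are assumptions made for illustration.

```java
// Hypothetical illustration, not part of Lucene's API.
final class MinCompetitiveScoreSketch {

  /**
   * Returns the minimum score that clause {@code clause} must reach for a hit
   * to be competitive, given the minimum competitive score of the whole
   * disjunction and the maximum score each clause can produce.
   * Assumption: the other clauses contribute at most the sum of their maxima.
   */
  static float clauseMinScore(float disjunctionMinScore, float[] maxScores, int clause) {
    double othersMax = 0;
    for (int i = 0; i < maxScores.length; i++) {
      if (i != clause) {
        othersMax += maxScores[i];
      }
    }
    // Assumption: a negative bound carries no information, so clamp to 0.
    return (float) Math.max(0, disjunctionMinScore - othersMax);
  }

  public static void main(String[] args) {
    float[] maxScores = {3.0f, 2.5f}; // maximum scores of clauses a and b

    // Disjunction needs 4.0: a must score at least 4.0 - 2.5.
    System.out.println(clauseMinScore(4.0f, maxScores, 0)); // prints 1.5

    // Disjunction needs only 2.0: b alone could reach it, so a's bound is 0
    // and no hit of a can be skipped.
    System.out.println(clauseMinScore(2.0f, maxScores, 0)); // prints 0.0
  }
}
```

The second call shows the common case described above: the per-clause bound collapses to zero, so a wrapper on `a` would pay its overhead without skipping anything.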

This means that scorers need a way to know whether they are producing the
top-level score, or whether they are producing a partial score that then gets
combined into the top-level score. Term queries would then only wrap with
`ImpactsDISI` when they produce the top-level score. Note that this includes
not only top-level term queries, but also conjunctions that have a single
scoring clause (`a #b`) and combinations of a term query and one or more
prohibited clauses (`a -b`).
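
The rule described above can be sketched as a small predicate. This is a simplified illustration under stated assumptions, not the mechanism the commit actually uses: the `ClauseKind` enum and the method name are hypothetical, and the sketch only looks at a flat list of clauses, ignoring nested boolean queries.

```java
import java.util.List;

// Hypothetical illustration, not part of Lucene's API.
final class TopLevelScoreSketch {

  /** Simplified clause kinds: scoring (a, +a), filter (#a), prohibited (-a). */
  enum ClauseKind { SCORING, FILTER, PROHIBITED }

  /**
   * Returns true when exactly one clause contributes to the score, in which
   * case that clause's score is the top-level score and wrapping it with a
   * skipping iterator can pay off.
   */
  static boolean hasSingleScoringClause(List<ClauseKind> clauses) {
    int scoring = 0;
    for (ClauseKind kind : clauses) {
      if (kind == ClauseKind.SCORING) {
        scoring++;
      }
    }
    return scoring == 1;
  }

  public static void main(String[] args) {
    // a
    System.out.println(hasSingleScoringClause(List.of(ClauseKind.SCORING))); // prints true
    // a #b
    System.out.println(
        hasSingleScoringClause(List.of(ClauseKind.SCORING, ClauseKind.FILTER))); // prints true
    // a -b
    System.out.println(
        hasSingleScoringClause(List.of(ClauseKind.SCORING, ClauseKind.PROHIBITED))); // prints true
    // a OR b: each clause only produces a partial score
    System.out.println(
        hasSingleScoringClause(List.of(ClauseKind.SCORING, ClauseKind.SCORING))); // prints false
  }
}
```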
2023-09-12 15:23:28 +02:00

Apache Lucene

Apache Lucene is a high-performance, full-featured text search engine library written in Java.

Online Documentation

This README file only contains basic setup instructions. For more comprehensive documentation, visit:

Building

Basic steps:

  1. Install OpenJDK 17 or 18.
  2. Clone Lucene's git repository (or download the source distribution).
  3. Run the Gradle launcher script (gradlew).

We'll assume that you know how to get and set up the JDK - if you don't, then we suggest starting at https://jdk.java.net/ and learning more about Java, before returning to this README.

See Contributing Guide for details.

Contributing

Bug fixes, improvements and new features are always welcome! Please review the Contributing to Lucene Guide for information on contributing.

Discussion and Support
