Adrien Grand a779a64d7b
Move BooleanScorer to work on top of Scorers rather than BulkScorers. (#13931)
I was looking at some queries where Lucene performs significantly worse than
Tantivy at https://tantivy-search.github.io/bench/, and found that we incur
noticeable overhead from implementing `BooleanScorer` on top of `BulkScorer`
(in practice `DefaultBulkScorer`, since the benchmark only runs term queries
as boolean clauses) rather than on top of `Scorer` directly.
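
To make the extra layer concrete, here is a rough sketch, assuming a disjunction that is evaluated one window of doc IDs at a time. All names (`DocIterator`, `BulkWrapper`, `Collector`, `WINDOW`) are hypothetical stand-ins and not Lucene's actual `Scorer`/`BulkScorer` APIs. `countDirect` pulls doc IDs straight from each clause's iterator, while `countViaBulk` goes through a bulk-style wrapper that issues one callback per matching doc.

```java
import java.util.List;

// Hypothetical, simplified stand-ins: NOT Lucene's real Scorer/BulkScorer APIs.
final class WindowedDisjunctionSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    static final int WINDOW = 64; // illustrative window size: one long of bits

    /** Minimal iterator over sorted, non-negative doc IDs (one per clause). */
    static final class DocIterator {
        private final int[] docs;
        private int idx = 0;
        DocIterator(int[] sortedDocs) { this.docs = sortedDocs; }
        int docID() { return idx < docs.length ? docs[idx] : NO_MORE_DOCS; }
        void next() { idx++; }
    }

    /** Callback used by the bulk-style wrapper: one call per matching doc. */
    interface Collector { void collect(int doc); }

    /** Bulk-style wrapper: pushes every doc in [min, max) to a collector. */
    static final class BulkWrapper {
        private final DocIterator it;
        BulkWrapper(DocIterator it) { this.it = it; }
        void score(Collector collector, int min, int max) {
            while (it.docID() < min) { it.next(); }
            while (it.docID() < max) { collector.collect(it.docID()); it.next(); }
        }
    }

    /** Counts docs matching any clause by reading the iterators directly. */
    static int countDirect(List<int[]> clauses, int maxDoc) {
        DocIterator[] its = clauses.stream().map(DocIterator::new).toArray(DocIterator[]::new);
        int count = 0;
        for (int base = 0; base < maxDoc; base += WINDOW) {
            int end = Math.min(base + WINDOW, maxDoc);
            long bits = 0L;
            for (DocIterator it : its) {
                while (it.docID() < end) {
                    bits |= 1L << (it.docID() - base); // mark the doc in this window
                    it.next();
                }
            }
            count += Long.bitCount(bits); // each doc is counted once per window
        }
        return count;
    }

    /** Same result, but every doc goes through the wrapper and a callback. */
    static int countViaBulk(List<int[]> clauses, int maxDoc) {
        BulkWrapper[] bulks = clauses.stream()
            .map(docs -> new BulkWrapper(new DocIterator(docs)))
            .toArray(BulkWrapper[]::new);
        int count = 0;
        for (int base = 0; base < maxDoc; base += WINDOW) {
            final int windowBase = base;
            final long[] bits = new long[1];
            Collector collector = doc -> bits[0] |= 1L << (doc - windowBase);
            int end = Math.min(base + WINDOW, maxDoc);
            for (BulkWrapper bulk : bulks) {
                bulk.score(collector, base, end);
            }
            count += Long.bitCount(bits[0]);
        }
        return count;
    }

    public static void main(String[] args) {
        List<int[]> clauses = List.of(new int[] {1, 5, 70}, new int[] {5, 6, 200});
        System.out.println(countDirect(clauses, 256));
        System.out.println(countViaBulk(clauses, 256));
    }
}
```

Both methods return the same count. The only difference is the wrapper and per-doc callback in the second one, which is the kind of indirection the change removes for term-query clauses. The sketch says nothing about the actual size of the speedup.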

The `CountOrHighHigh` and `CountOrHighMed` tasks are a bit noisy on my machine,
so I did 3 runs on wikibigall, and all of them had speedups for these two
tasks, often with a very low p-value.

In theory, this change could make things slower when the inner query has a
specialized bulk scorer, such as `MatchAllDocsQuery` or a conjunction. It does
feel right to optimize for term queries though.
2024-10-21 16:55:04 +02:00

Apache Lucene

Apache Lucene is a high-performance, full-featured text search engine library written in Java.

Online Documentation

This README file only contains basic setup instructions. For more comprehensive documentation, visit:

Building

Basic steps:

  1. Install OpenJDK 21.
  2. Clone Lucene's git repository (or download the source distribution).
  3. Run the Gradle launcher script (gradlew).

We'll assume that you know how to get and set up the JDK - if you don't, then we suggest starting at https://jdk.java.net/ and learning more about Java, before returning to this README.

Contributing

Bug fixes, improvements and new features are always welcome! Please review the Contributing to Lucene Guide for information on contributing.

  • Additional Developer Documentation: dev-docs/

Discussion and Support
