Apache Lucene open-source search software
Adrien Grand 3ad73336ae
Make BP work on indexes that have blocks. (#13125)
The current logic for reordering splits a slice of doc IDs into a left side and
a right side, and for each document it computes the expected gain of moving to
the other side. Then it swaps documents from both sides as long as the sum of
the gain of moving the left doc to the right and the right doc to the left is
positive.
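
As a rough illustration of the swap step described above, here is a minimal sketch in Java. It is not the code from Lucene: the class name `SwapSketch`, the method `swapWhileBeneficial`, and the precomputed `gains` array are all hypothetical, and the gains are simply passed in rather than derived from term statistics. Each side is ordered by descending gain, and pairs are swapped while their combined gain stays positive.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch; names and inputs are assumptions, not Lucene's API.
class SwapSketch {

  /**
   * docs[0..mid) is the left side and docs[mid..n) is the right side.
   * gains[i] is the assumed, precomputed gain of moving docs[i] to the other side.
   * Returns the number of swaps performed in this single pass.
   */
  static int swapWhileBeneficial(int[] docs, float[] gains, int mid) {
    int n = docs.length;
    Integer[] left = new Integer[mid];
    Integer[] right = new Integer[n - mid];
    for (int i = 0; i < mid; i++) {
      left[i] = i;
    }
    for (int i = mid; i < n; i++) {
      right[i - mid] = i;
    }

    // Order the positions of each side by descending gain.
    Comparator<Integer> byGainDesc = (a, b) -> Float.compare(gains[b], gains[a]);
    Arrays.sort(left, byGainDesc);
    Arrays.sort(right, byGainDesc);

    int swaps = 0;
    for (int k = 0; k < Math.min(left.length, right.length); k++) {
      int l = left[k];
      int r = right[k];
      // Stop as soon as exchanging the best remaining pair no longer helps.
      if (gains[l] + gains[r] <= 0) {
        break;
      }
      int tmp = docs[l];
      docs[l] = docs[r];
      docs[r] = tmp;
      swaps++;
    }
    return swaps;
  }

  public static void main(String[] args) {
    int[] docs = {10, 11, 12, 13};
    float[] gains = {2f, -1f, 1f, -5f};
    int swaps = swapWhileBeneficial(docs, gains, 2);
    System.out.println(swaps + " swap(s): " + Arrays.toString(docs));
  }
}
```

The sketch swaps documents one for one, which is why it is awkward to adapt when the unit being moved is a block of varying size.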

This works well, but I would like to extend BP reordering to also work with
blocks, and the swapping logic is challenging to modify as two parent documents
may have different numbers of children.

One of the follow-up papers on BP suggested using a different logic, where one
would compute a bias for every document that is negative when the document is
attracted to the left and positive otherwise. Then we only have to partition doc
IDs around the midpoint, e.g. with quickselect.
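
The bias-based approach can be sketched as follows, again in Java and again as an illustration rather than the actual change. The class `BiasPartitionSketch`, its methods, and the precomputed `biases` array are assumptions made for this example. A plain quickselect rearranges the doc IDs so that the `mid` documents with the smallest biases end up on the left.

```java
import java.util.Arrays;

// Hypothetical sketch; names and inputs are assumptions, not Lucene's API.
class BiasPartitionSketch {

  /**
   * Rearranges docs (and the parallel biases array) so that every position
   * before mid holds a bias less than or equal to every position from mid on.
   */
  static void partitionAroundMid(int[] docs, float[] biases, int mid) {
    int lo = 0;
    int hi = docs.length - 1;
    while (lo < hi) {
      int p = partition(docs, biases, lo, hi);
      if (p == mid) {
        return;
      } else if (p < mid) {
        lo = p + 1;
      } else {
        hi = p - 1;
      }
    }
  }

  // Lomuto partition using the middle element as the pivot.
  private static int partition(int[] docs, float[] biases, int lo, int hi) {
    swap(docs, biases, lo + (hi - lo) / 2, hi);
    float pivot = biases[hi];
    int store = lo;
    for (int i = lo; i < hi; i++) {
      if (biases[i] < pivot) {
        swap(docs, biases, i, store);
        store++;
      }
    }
    swap(docs, biases, store, hi);
    return store;
  }

  // Keeps doc IDs and their biases aligned.
  private static void swap(int[] docs, float[] biases, int i, int j) {
    int d = docs[i];
    docs[i] = docs[j];
    docs[j] = d;
    float b = biases[i];
    biases[i] = biases[j];
    biases[j] = b;
  }

  public static void main(String[] args) {
    int[] docs = {0, 1, 2, 3, 4, 5};
    float[] biases = {3f, -2f, 1f, -4f, 5f, -1f};
    partitionAroundMid(docs, biases, 3);
    System.out.println(Arrays.toString(docs));
  }
}
```

Unlike the pairwise swap, this only needs an ordering of biases, so nothing in it assumes that the two sides exchange the same number of documents at each step.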

A benefit of this change is that it will make it easier to generalize BP
reordering to indexes that have blocks, e.g. by using a stable sort on biases.
2024-03-13 10:14:14 +01:00
.github Make run-nightly-smoketester.yml run on java 21+ only 2024-02-29 13:03:42 +01:00
buildSrc Bump minimum required Java version to 21 (#12753) 2024-02-29 12:16:29 +01:00
dev-docs a bit of clarification about GitHub Milestone 2022-08-28 13:52:58 +09:00
dev-tools Bump minimum required Java version to 21 (#12753) 2024-02-29 12:16:29 +01:00
gradle An eye-gouging way to limit suppressAccessChecks to just the three JARs that need them. (#13164) 2024-03-08 08:10:49 +01:00
help Fix typo in help/formatting.txt (#12960) 2023-12-21 19:58:53 +01:00
lucene Make BP work on indexes that have blocks. (#13125) 2024-03-13 10:14:14 +01:00
.asf.yaml .asf.yaml 2022-08-16 20:02:47 +09:00
.dir-locals.el LUCENE-9322: Add Lucene90 codec, including VectorFormat 2020-10-18 07:49:36 -04:00
.git-blame-ignore-revs GITHUB#12655: Add google java format upgrade tidy / regen to blame ignore 2023-10-11 16:15:42 -04:00
.gitattributes LUCENE-10305: Ensure line endings of versions.props is LF 2021-12-11 10:10:44 +09:00
.gitignore LUCENE-9920: Remove binary gradle-wrapper.jar from the repository 2021-04-10 16:08:39 +02:00
.hgignore LUCENE-2792: add FST impl 2010-12-12 15:36:08 +00:00
.lift.toml Disable liftbot, we have our own tools 2022-05-05 22:27:57 +02:00
CONTRIBUTING.md Update contributing guide: autocrlf and build dependencies (#12963) 2023-12-22 09:28:53 +01:00
LICENSE.txt LUCENE-10163 Move LICENSE and NOTICE file to top level (#388) 2021-10-18 01:24:11 +02:00
NOTICE.txt Cleanup NOTICE.txt (#12227) 2023-04-18 15:58:09 -04:00
README.md Bump minimum required Java version to 21 (#12753) 2024-02-29 12:16:29 +01:00
build.gradle Bump minimum required Java version to 21 (#12753) 2024-02-29 12:16:29 +01:00
gradlew Bump minimum required Java version to 21 (#12753) 2024-02-29 12:16:29 +01:00
gradlew.bat Bump minimum required Java version to 21 (#12753) 2024-02-29 12:16:29 +01:00
settings.gradle Build: build scans on ge.apache.org to benefit from deep build insights (#12293) 2023-10-24 12:32:18 -04:00
versions.lock upgrade to OpenNLP 2.3.2 (#12674) 2024-02-09 11:21:41 +00:00
versions.props upgrade to OpenNLP 2.3.2 (#12674) 2024-02-09 11:21:41 +00:00

README.md

Apache Lucene

Apache Lucene is a high-performance, full-featured text search engine library written in Java.

Online Documentation

This README file only contains basic setup instructions. For more comprehensive documentation, visit:

Building

Basic steps:

  1. Install OpenJDK 21.
  2. Clone Lucene's git repository (or download the source distribution).
  3. Run the Gradle launcher script (gradlew).

We'll assume that you know how to get and set up the JDK. If you don't, we suggest starting at https://jdk.java.net/ and learning more about Java before returning to this README.

See the Contributing Guide for details.

Contributing

Bug fixes, improvements and new features are always welcome! Please review the Contributing to Lucene Guide for information on contributing.

Discussion and Support