mirror of https://github.com/apache/lucene.git (synced 2025-02-09 03:25:15 +00:00)
Currently, the disjunction iterator puts all clauses in a heap in order to be able to merge doc IDs in a streaming fashion. This is a good approach for exhaustive evaluation, when only one clause moves to a different doc ID on average and the per-iteration cost is on the order of O(log(N)), where N is the number of clauses. However, if a selective filter is applied, many clauses may have to move to a different doc ID on each iteration. In the worst-case scenario, all clauses move to a different doc ID and the cost of maintaining heap invariants grows to O(N * log(N)), since every clause introduces an O(log(N)) cost. With many clauses, this is much higher than the cost of checking all clauses sequentially: O(N). To protect against this reordering overhead, DisjunctionDISIApproximation now only puts the cheapest clauses in a heap, in a way that tries to achieve up to 1.5 clauses moving to a different doc ID on average. More expensive clauses are checked linearly.
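The split described above can be sketched as follows. This is a simplified, standalone illustration and not the actual DisjunctionDISIApproximation code: the class HybridDisjunction, the array-backed Clause, and the fixed heapSize parameter are all hypothetical. In particular, the commit chooses the split so that roughly 1.5 heap clauses move per iteration, whereas this sketch simply takes the number of heap clauses as an argument.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustration only: a disjunction that keeps cheap clauses in a heap and scans the rest. */
final class HybridDisjunction {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Hypothetical clause backed by a sorted array of doc IDs. */
  static final class Clause {
    final int[] docs;
    int pos = -1;

    Clause(int... docs) {
      this.docs = docs;
    }

    int docID() {
      return pos < 0 ? -1 : pos >= docs.length ? NO_MORE_DOCS : docs[pos];
    }

    /** Cost estimate: the number of docs this clause matches. */
    long cost() {
      return docs.length;
    }

    /** Moves to the first doc ID >= target; callers only invoke this when docID() < target. */
    int advance(int target) {
      do {
        pos++;
      } while (pos < docs.length && docs[pos] < target);
      return docID();
    }
  }

  // Cheap (sparse) clauses rarely need to move, so heap maintenance stays cheap for them.
  private final PriorityQueue<Clause> heap =
      new PriorityQueue<>(Comparator.comparingInt(Clause::docID));
  // Expensive (dense) clauses move often, so they are checked one by one in O(N).
  private final List<Clause> linear = new ArrayList<>();
  private int doc = -1;

  /** heapSize is an assumption of this sketch: how many of the cheapest clauses go in the heap. */
  HybridDisjunction(List<Clause> clauses, int heapSize) {
    List<Clause> sorted = new ArrayList<>(clauses);
    sorted.sort(Comparator.comparingLong(Clause::cost));
    for (int i = 0; i < sorted.size(); i++) {
      if (i < heapSize) {
        heap.add(sorted.get(i));
      } else {
        linear.add(sorted.get(i));
      }
    }
  }

  /** Returns the smallest doc ID >= target that matches at least one clause. */
  int advance(int target) {
    // Heap part: each clause that lags behind target costs one O(log N) re-insertion.
    while (!heap.isEmpty() && heap.peek().docID() < target) {
      Clause top = heap.poll();
      top.advance(target);
      heap.add(top);
    }
    int min = heap.isEmpty() ? NO_MORE_DOCS : heap.peek().docID();
    // Linear part: a plain scan, with no reordering cost.
    for (Clause c : linear) {
      if (c.docID() < target) {
        c.advance(target);
      }
      min = Math.min(min, c.docID());
    }
    doc = min;
    return doc;
  }

  int nextDoc() {
    return doc == NO_MORE_DOCS ? doc : advance(doc + 1);
  }

  public static void main(String[] args) {
    HybridDisjunction d = new HybridDisjunction(
        List.of(new Clause(1, 5, 9), new Clause(2, 5), new Clause(0, 2, 4, 6, 8)), 2);
    // Visits the union of the three clauses: 0, 1, 2, 4, 5, 6, 8, 9.
    for (int id = d.nextDoc(); id != NO_MORE_DOCS; id = d.nextDoc()) {
      System.out.println(id);
    }
  }
}
```

With heapSize set to 2, the two sparse clauses sit in the heap and the dense clause is scanned linearly. The matching doc IDs are the same for any split; only the bookkeeping cost per iteration changes.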
Apache Lucene
Apache Lucene is a high-performance, full-featured text search engine library written in Java.
Online Documentation
This README file only contains basic setup instructions. For more comprehensive documentation, visit:
- Latest Releases: https://lucene.apache.org/core/documentation.html
- Nightly: https://ci-builds.apache.org/job/Lucene/job/Lucene-Artifacts-main/javadoc/
- New contributors should start by reading the Contributing Guide
- Build System Documentation: help/
- Migration Guide: lucene/MIGRATE.md
Building
Basic steps:
- Install OpenJDK 21.
- Clone Lucene's git repository (or download the source distribution).
- Run the Gradle launcher script (gradlew).
We'll assume that you know how to get and set up the JDK - if you don't, then we suggest starting at https://jdk.java.net/ and learning more about Java, before returning to this README.
Contributing
Bug fixes, improvements and new features are always welcome! Please review the Contributing to Lucene Guide for information on contributing.
- Additional Developer Documentation: dev-docs/
Discussion and Support
- Users Mailing List
- Developers Mailing List
- IRC: #lucene and #lucene-dev on freenode.net