Apache Lucene open-source search software
Michael McCandless a1418d9433
Remove 8 bit quantization for HNSW/KNN vector indexing (it is buggy today) (#13767)
4 and 7 bit quantization still work.

It's a bit tricky because 9.11 indices may have 8 bit compressed
vectors which are buggy at search time (and users may not realize it,
or may not be using them at search time).  But the index is still
intact since we keep the original full float precision vectors.  So,
users can force rewrite all their 9.11 written segments (or reindex
those docs), and can change to 4 or 7 bit quantization for newly
indexed documents.  The 9.11 index is still usable.

(I added a couple test cases confirming that one can indeed change
their mind, indexing a given vector field first with 4 bit
quantization, then later (new IndexWriter / Codec) with 7 bit or with
no quantization.)

I added MIGRATE.md explanation.

Separately, I also tightened up the `compress` boolean to throw an
exception unless bits=4.  Previously it silently ignored
`compress=true` for 7 and 8 bit quantization.  I also tried to improve
its javadocs a bit.
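
To illustrate the rules described above (only 4 and 7 bit quantization accepted, and `compress=true` rejected unless bits=4), here is a minimal, self-contained sketch. The class `QuantizationOptions` and its `validate` method are hypothetical names invented for this illustration. They are not part of Lucene's API, and the exception messages are assumptions rather than the text Lucene actually emits.

```java
// Hypothetical helper illustrating the argument checks described in the
// commit message. This is NOT Lucene code; names and messages are assumed.
public final class QuantizationOptions {

    private QuantizationOptions() {}

    /**
     * Rejects unsupported settings instead of silently ignoring them.
     *
     * @param bits     number of bits per quantized vector dimension
     * @param compress whether to pack quantized values more tightly
     * @throws IllegalArgumentException if bits is not 4 or 7, or if
     *         compress is true while bits is not 4
     */
    public static void validate(int bits, boolean compress) {
        // 8 bit quantization was removed, so only 4 and 7 remain valid.
        if (bits != 4 && bits != 7) {
            throw new IllegalArgumentException(
                "bits must be 4 or 7; got bits=" + bits);
        }
        // compress only has an effect for 4 bit quantization, so fail loudly
        // rather than ignoring the flag.
        if (compress && bits != 4) {
            throw new IllegalArgumentException(
                "compress=true only applies when bits=4; got bits=" + bits);
        }
    }

    public static void main(String[] args) {
        validate(4, true);   // accepted
        validate(7, false);  // accepted
        try {
            validate(7, true); // rejected: compress requires bits=4
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast here means a misconfigured field is caught when the format is constructed, rather than producing an index whose settings differ from what the caller asked for.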

Closes #13519.
2024-09-15 15:26:28 -04:00
.github Nightly gh action 'buildAndPushRelease and smokeTestRelease.py' should save release.log on failure #13754 2024-09-11 08:07:08 +02:00
build-tools Upgrade to gradle 8.10 (#13700) 2024-08-30 12:36:56 +02:00
dev-docs a bit of clarification about GitHub Milestone 2022-08-28 13:52:58 +09:00
dev-tools Add simple tool to diff entries in lucene's CHANGES.txt that should be identical (#12860) 2024-07-22 11:37:34 -04:00
gradle jgit/ clean status check should ignore any 'untracked folders' (#13728) 2024-09-06 09:01:15 +02:00
help Gradle build: cleanup of dependency resolution and consolidation of dependency versions (#13484) 2024-06-17 09:49:21 +02:00
lucene Remove 8 bit quantization for HNSW/KNN vector indexing (it is buggy today) (#13767) 2024-09-15 15:26:28 -04:00
.asf.yaml .asf.yaml 2022-08-16 20:02:47 +09:00
.dir-locals.el LUCENE-9322: Add Lucene90 codec, including VectorFormat 2020-10-18 07:49:36 -04:00
.git-blame-ignore-revs GITHUB#12655: Add google java format upgrade tidy / regen to blame ignore 2023-10-11 16:15:42 -04:00
.gitattributes Add versions.toml to .gitattributes and normalize line endings to lf. #13484 2024-06-18 14:25:40 +02:00
.gitignore LUCENE-9920: Remove binary gradle-wrapper.jar from the repository 2021-04-10 16:08:39 +02:00
.hgignore LUCENE-2792: add FST impl 2010-12-12 15:36:08 +00:00
.lift.toml Disable liftbot, we have our own tools 2022-05-05 22:27:57 +02:00
CONTRIBUTING.md equivocate about IntelliJ test runner 2024-06-10 08:40:38 -04:00
LICENSE.txt LUCENE-10163 Move LICENSE and NOTICE file to top level (#388) 2021-10-18 01:24:11 +02:00
NOTICE.txt Cleanup NOTICE.txt (#12227) 2023-04-18 15:58:09 -04:00
README.md Make Gradle dashboard easy to find by adding a badge (#13476) 2024-07-01 09:09:51 +01:00
build.gradle Support JDK 23 in Panama Vectorization Provider (#13678) 2024-08-22 13:51:32 +01:00
gradlew Gradle build: cleanup of dependency resolution and consolidation of dependency versions (#13484) 2024-06-17 09:49:21 +02:00
gradlew.bat Gradle build: cleanup of dependency resolution and consolidation of dependency versions (#13484) 2024-06-17 09:49:21 +02:00
settings.gradle Gradle build: cleanup of dependency resolution and consolidation of dependency versions (#13484) 2024-06-17 09:49:21 +02:00
versions.lock Gradle build: cleanup of dependency resolution and consolidation of dependency versions (#13484) 2024-06-17 09:49:21 +02:00
versions.toml jgit/ clean status check should ignore any 'untracked folders' (#13728) 2024-09-06 09:01:15 +02:00

README.md

Apache Lucene


Apache Lucene is a high-performance, full-featured text search engine library written in Java.


Online Documentation

This README file only contains basic setup instructions. For more comprehensive documentation, visit:

Building

Basic steps:

  1. Install OpenJDK 21.
  2. Clone Lucene's git repository (or download the source distribution).
  3. Run the Gradle launcher script (`gradlew`).

We'll assume that you know how to get and set up the JDK - if you don't, then we suggest starting at https://jdk.java.net/ and learning more about Java, before returning to this README.

Contributing

Bug fixes, improvements and new features are always welcome! Please review the Contributing to Lucene Guide for information on contributing.

  • Additional Developer Documentation: dev-docs/

Discussion and Support