Apache Lucene open-source search software
Benjamin Trent 7da509b708
Prevent humongous allocations when calculating scalar quantiles (#13090)
The initial release of scalar quantization would periodically create a humongous allocation, which puts unnecessary pressure on the GC and on overall heap usage.

This commit fixes that by allocating a float array of only 20*dimensions and averaging the quantiles discovered from each such batch.

Why does this work?

 - Quantiles based on confidence intervals are (generally) unbiased, so averaging them gives statistically good results
 - The selector algorithm scales linearly, so the total cost is about the same
 - We need to process more than `1` vector at a time to keep extreme confidence intervals from interacting strangely with edge cases
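
The idea in the points above can be sketched roughly as follows. This is a minimal illustration, not Lucene's actual implementation: the class name `ChunkedQuantileSketch`, the method `averagedQuantiles`, and the constant `VECTORS_PER_CHUNK` are hypothetical, and `Arrays.sort` stands in for the linear-time selector the commit message mentions. It only shows how a reused buffer of 20*dimensions floats bounds the allocation while per-batch quantiles are averaged.

```java
import java.util.Arrays;

public class ChunkedQuantileSketch {
  // Hypothetical constant, taken from the "20*dimensions" figure in the commit message.
  static final int VECTORS_PER_CHUNK = 20;

  /** Returns {lower, upper} quantiles averaged over fixed-size batches of vectors. */
  static float[] averagedQuantiles(float[][] vectors, int dims, float confidenceInterval) {
    if (vectors.length == 0) {
      throw new IllegalArgumentException("at least one vector is required");
    }
    // One bounded scratch buffer, reused for every batch.
    float[] scratch = new float[VECTORS_PER_CHUNK * dims];
    double lowerSum = 0;
    double upperSum = 0;
    int chunks = 0;
    for (int start = 0; start < vectors.length; start += VECTORS_PER_CHUNK) {
      int end = Math.min(start + VECTORS_PER_CHUNK, vectors.length);
      int len = 0;
      for (int i = start; i < end; i++) {
        System.arraycopy(vectors[i], 0, scratch, len, dims);
        len += dims;
      }
      // Assumption: a full sort replaces the selection algorithm for simplicity.
      Arrays.sort(scratch, 0, len);
      // Number of values trimmed from each tail for this confidence interval.
      int cut = (int) ((1f - confidenceInterval) / 2f * len);
      lowerSum += scratch[cut];
      upperSum += scratch[len - 1 - cut];
      chunks++;
    }
    return new float[] {(float) (lowerSum / chunks), (float) (upperSum / chunks)};
  }

  public static void main(String[] args) {
    // 40 two-dimensional vectors whose components equal the vector index.
    float[][] vectors = new float[40][2];
    for (int i = 0; i < vectors.length; i++) {
      Arrays.fill(vectors[i], i);
    }
    float[] q = averagedQuantiles(vectors, 2, 0.9f);
    System.out.println("lower=" + q[0] + " upper=" + q[1]);
  }
}
```

The scratch buffer never grows with the number of vectors, which is the property that avoids the humongous allocation; whether averaging is accurate enough for a given data set is a separate question that the sketch does not address.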
2024-02-08 15:56:37 -05:00
.github [Minor] Document operation costs for stale workflow (#13000) 2024-01-22 09:40:25 +00:00
buildSrc Fix only use of .toLowerCase() with no Locale (#12856) 2024-01-08 22:04:04 +01:00
dev-docs a bit of clarification about GitHub Milestone 2022-08-28 13:52:58 +09:00
dev-tools Modernize BWC testing with parameterized tests (#13046) 2024-01-31 15:27:56 +01:00
gradle Modify getEnWikiRandomLines to fetch and decompress the zstd resource #13083 2024-02-06 22:08:09 +01:00
help Fix typo in help/formatting.txt (#12960) 2023-12-21 19:58:53 +01:00
lucene Prevent humongous allocations when calculating scalar quantiles (#13090) 2024-02-08 15:56:37 -05:00
.asf.yaml .asf.yaml 2022-08-16 20:02:47 +09:00
.dir-locals.el LUCENE-9322: Add Lucene90 codec, including VectorFormat 2020-10-18 07:49:36 -04:00
.git-blame-ignore-revs GITHUB#12655: Add google java format upgrade tidy / regen to blame ignore 2023-10-11 16:15:42 -04:00
.gitattributes LUCENE-10305: Ensure line endings of versions.props is LF 2021-12-11 10:10:44 +09:00
.gitignore LUCENE-9920: Remove binary gradle-wrapper.jar from the repository 2021-04-10 16:08:39 +02:00
.hgignore LUCENE-2792: add FST impl 2010-12-12 15:36:08 +00:00
.lift.toml Disable liftbot, we have our own tools 2022-05-05 22:27:57 +02:00
CONTRIBUTING.md Update contributing guide: autocrlf and build dependencies (#12963) 2023-12-22 09:28:53 +01:00
LICENSE.txt LUCENE-10163 Move LICENSE and NOTICE file to top level (#388) 2021-10-18 01:24:11 +02:00
NOTICE.txt Cleanup NOTICE.txt (#12227) 2023-04-18 15:58:09 -04:00
README.md Allow building with java 18 now that gradle supports it (#11889) 2022-10-28 23:41:09 -04:00
build.gradle Only enable support for tests.profile if jdk.jfr module is available in Gradle runtime (#12845) 2023-11-25 20:16:09 +01:00
gradlew GITHUB#12655: Upgrade to Gradle 8.4 2023-10-11 16:11:53 -04:00
gradlew.bat GITHUB#12655: Upgrade to Gradle 8.4 2023-10-11 16:11:53 -04:00
settings.gradle Build: build scans on ge.apache.org to benefit from deep build insights (#12293) 2023-10-24 12:32:18 -04:00
versions.lock Rewrite JavaScriptCompiler to use modern JVM features (Java 17) (#12873) 2023-12-05 11:53:57 +01:00
versions.props Rewrite JavaScriptCompiler to use modern JVM features (Java 17) (#12873) 2023-12-05 11:53:57 +01:00

README.md

Apache Lucene


Apache Lucene is a high-performance, full-featured text search engine library written in Java.


Online Documentation

This README file only contains basic setup instructions. For more comprehensive documentation, visit:

Building

Basic steps:

  1. Install OpenJDK 17 or 18.
  2. Clone Lucene's git repository (or download the source distribution).
  3. Run the Gradle launcher script (gradlew).

We'll assume that you know how to get and set up the JDK. If you don't, we suggest starting at https://jdk.java.net/ and learning more about Java before returning to this README.

See Contributing Guide for details.

Contributing

Bug fixes, improvements and new features are always welcome! Please review the Contributing to Lucene Guide for information on contributing.

Discussion and Support