lucene

Commit Graph

Author	SHA1	Message	Date
Adrien Grand	299d7f6721	Speed up prefix sums when decoding doc IDs. (#13658 ) This updates file formats to compute prefix sums by summing up 8 deltas per long at the same time if the number of bits per value is 4 or less, and 4 deltas per long at the same time if the number of bits per value is between 5 and 11 inclusive. Otherwise, we keep summing up 2 deltas per long like today. The `PostingDecodingUtil` was slightly modified because more numbers of bits per value now need to apply different shifts to the input data. E.g. now that we store integers that require 5 bits per value as 16-bit integers under the hood rather than 8-bit integers, we extract the first values by shifting by 16-5=11, 16-2*5=6 and 16-3*5=1 and then decode tail values from the remaining bit per 16-bit integer.	2024-08-16 21:27:32 +02:00
Dawid Weiss	8b25c4dd0b	Fix icu's regeneration script: instead of getVersion we can just pick the version from the catalog.	2024-08-15 22:28:49 +02:00
Dawid Weiss	1cfa697c06	Fix eclipse ide settings generation (#13649 ) * Only run the ide configuration block for eclipse when explicitly invoked. fix property access ordering here and there. * Correct dependsOn task name. * Correct crlf/encoding after versionCatalogFormatDeps finishes. * Change java-library to java-base in the plugin applied within the eclipse task. * use ant.fixcrlf to correct line endings. * Changes entry. * Simplify fixcrlf --------- Co-authored-by: Uwe Schindler <uschindler@apache.org>	2024-08-14 23:50:06 +02:00
Adrien Grand	b4a8810b7a	Inline skip data into postings lists (#13585 ) This updates the postings format in order to inline skip data into postings. This format is generally similar to the current `Lucene99PostingsFormat`, e.g. it shares the same block encoding logic, but it has a few differences: - Skip data is inlined into postings to make the access pattern more sequential. - There are only 2 levels of skip data: on every block (128 docs) and every 32 blocks (4,096 docs). In general, I found that the fact that skip data is inlined may slightly slow down queries that don't need skip data at all (e.g. `CountOrXXX` tasks that never advance or consult impacts) and slightly speed up queries that advance by small intervals. The fact that the greatest level only allows skipping 4,096 docs at once means that we're slower at advancing by large intervals, but data suggests that it doesn't significantly hurt performance.	2024-07-31 17:18:28 +02:00
Dawid Weiss	dc287862dd	Gradle build: cleanup of dependency resolution and consolidation of dependency versions (#13484 )	2024-06-17 09:49:21 +02:00
Dawid Weiss	06f86a5096	Silence odd test runner warnings after gradle upgrade (#13471 )	2024-06-10 11:31:40 +02:00
Chris Hegarty	9a4caa935a	Update Gradle wrapper to 8.8 - supports Java 22 (#13453 ) This commit updates the Gradle wrapper to 8.8, which has support for Java 22.	2024-06-06 08:46:18 +01:00
Chris Hegarty	8d7e4174af	Add a separate option to allow running Panama Vectorization for all tests with suitable C2 defaults (#13351 ) This commit adds a separate option, tests.defaultvectorization, to allow running Panama Vectorization for all tests with suitable C2 defaults. For example: ./gradlew :lucene:core:test -Ptests.defaultvectorization=true --------- Co-authored-by: Uwe Schindler <uschindler@apache.org>	2024-05-09 11:00:51 +01:00
Dawid Weiss	afe982b3ef	Schedule compileJava after the internal task if it affects source files (#13282 )	2024-04-09 07:44:07 +02:00
Robert Muir	a7e916223c	remove unnecessary chmod+x, file is marked executable in snowball git	2024-03-28 16:43:47 -04:00
Robert Muir	3553769463	remove now-unnecessary snowball mojibake hack (#13231 )	2024-03-28 16:40:55 -04:00
Robert Muir	11712a3364	remove unsupported snowball algorithms (#13230 )	2024-03-28 16:37:18 -04:00
Robert Muir	ad8545151d	upgrade snowball to 34f3612e5e8c (round two) (#13227 ) * upgrade snowball to 34f3612e5e8c (round two) * disable forbidden-apis on snowball code (thanks @uschindler)	2024-03-27 17:51:48 -04:00
Robert Muir	d54663ad76	upgrade snowball to 26db1ab9adbf437f37a6facd3ee2aad1da9eba03 (#13209 ) * upgrade snowball to 26db1ab9adbf437f37a6facd3ee2aad1da9eba03 * add back-compat-hack to the factory, too * remove open of irish package now that we don't have our own stopwords file here anymore * CHANGES / MIGRATE	2024-03-27 10:05:57 -04:00
Uwe Schindler	26f5065e15	Add support for Github issue numbers in Markdown converter (e.g., MIGRATE.md file) (#13215 )	2024-03-26 13:45:56 +01:00
Uwe Schindler	a4055dae62	Add support for posix_madvise to Java 21 MMapDirectory (#13196 )	2024-03-25 18:44:33 +01:00
Dawid Weiss	1c77e2315c	An eye-gouging way to limit suppressAccessChecks to just the three JARs that need them. (#13164 )	2024-03-08 08:10:49 +01:00
Dawid Weiss	3ce9ba9fd5	Correct typo #13148	2024-03-01 07:12:57 +01:00
Uwe Schindler	08325ac3e8	Fix successful tests counting not working in Gradle build by adding ReflectPermission back (see #13146)	2024-03-01 01:25:02 +01:00
Uwe Schindler	6910a4358c	Do not place Panama Java 21 class files in MR-JAR section of core.jar file (#13148 )	2024-02-29 23:10:16 +01:00
Uwe Schindler	e446904c61	Remove ByteBufferIndexInput and update all Panama implementations (MMap and Vector) to Java 21 (#13146 )	2024-02-29 19:38:37 +01:00
Uwe Schindler	dfce6ee8d2	Update the Javadoc package list to Java 21	2024-02-29 15:06:47 +01:00
Uwe Schindler	5aaaeaee39	Update link to javadocs for Java 21	2024-02-29 14:37:09 +01:00
Uwe Schindler	8f17f23acf	Bump minimum required Java version to 21 (#12753 ) Co-authored-by: ChrisHegarty <chegar999@gmail.com> Co-authored-by: Dawid Weiss <dawid.weiss@carrotsearch.com> Co-authored-by: Robert Muir <rmuir@apache.org>	2024-02-29 12:16:29 +01:00
Uwe Schindler	e7d2bd48a6	Revert "Merge branch 'java_21' of https://github.com/ChrisHegarty/lucene into main" This reverts commit `a356fc1e23`, reversing changes made to `7b01f2f516`.	2024-02-29 11:58:40 +01:00
Uwe Schindler	0ccb119495	Merge branch 'main' into java_21	2024-02-25 16:39:41 +01:00
Uwe Schindler	47021ae98f	Remove hardcoded "--release" from renderJavadoc task (#13132 )	2024-02-25 16:32:30 +01:00
ChrisHegarty	07f4b5b19f	Merge branch 'main' into java_21	2024-02-19 11:43:46 +00:00
Dawid Weiss	a270acae01	This reverts the addition of spotless:on/off regions and shows just one possible alternative that is formatter fool-proof. (#13098 )	2024-02-13 19:00:11 +01:00
Uwe Schindler	178f5a7a7e	Enable MemorySegment in MMapDirectory for Java 22+ and Vectorization (incubation) for exact Java 22 (#12706 )	2024-02-09 23:02:42 +01:00
Robert Muir	d7a16dc10a	Merge branch 'main' into java_21	2024-02-09 15:22:45 -05:00
Robert Muir	fefde0f721	java 17 -> java 21	2024-02-09 15:16:41 -05:00
Robert Muir	1f9545e830	remove java < 21	2024-02-09 15:15:52 -05:00
Robert Muir	784c331b68	java 17 -> java 21	2024-02-09 15:14:55 -05:00
Dawid Weiss	8c2c276c6c	Modify getEnWikiRandomLines to fetch and decompress the zstd resource #13083	2024-02-06 22:08:09 +01:00
Shubham Chaudhary	4b5917029f	Forbidden Thread.sleep API (#13001 ) Co-authored-by: Shubham Chaudhary <cshbha@amazon.com>	2024-02-05 17:23:52 +01:00
sabi0	78b4f75a2c	Replace .collect(toList()) with .toList() and misc. code cleanups (#12978 )	2023-12-30 17:04:11 +01:00
Dawid Weiss	6bb244a932	An improved check for ignoring the c2-crash test if running on a client compiler. (#12953 )	2023-12-18 12:37:57 +01:00
Jakub Slowinski	3965319441	Attempting to clean up some remaining Solr references (#12939 ) * Attempting to clean up some remaining Solr references * Update gradle/help.gradle Co-authored-by: Dawid Weiss <dawid.weiss@gmail.com> --------- Co-authored-by: Dawid Weiss <dawid.weiss@gmail.com>	2023-12-14 06:02:16 -05:00
Uwe Schindler	16d0b822b3	Prevent the common zero-width code points and detect invalid UTF-8 encoding in our sources and selected resource files (#12937 ) * Simple patch to prevent the common zero-width code points in our source and some types of resource files * Validate correct UTF-8 input and fix buggy CSS file (ISO-8859-x encoded) * add a bit of context * Add CHANGES.txt	2023-12-13 17:27:05 +01:00
Robert Muir	98d2df17d5	enable error-prone's DisableUnicodeInCode check (#12936 ) Closes #12931	2023-12-13 08:19:22 -05:00
Uwe Schindler	10387f136f	Fix encoding problem caused by invisible character with ExtractJdkApis.java	2023-12-12 15:00:01 +01:00
Chris Hegarty	a6f70ad2bb	Reflow computeCommonPrefixLengthAndBuildHistogram to avoid crash (#12905 ) This commit reflows the code in the method body of computeCommonPrefixLengthAndBuildHistogram, so as to avoid a JVM JIT crash. The purpose of this change is to work around the JVM bug. The workaround is somewhat fragile, but it is the best that we can do for now and appears to be working well.	2023-12-11 20:10:03 +00:00
Uwe Schindler	880d0ba1a8	Rewrite JavaScriptCompiler to use modern JVM features (Java 17) (#12873 ) * Rewrite Javascript expression compiler to use hidden classes and MethodHandles for functions * Use dynamic constants for MethodHandles * Remove invokestatic code and handle everything through dynamic constants * Rewrite code to patch stack trace (keep Expressions class unmodified) * Improve generating of constant names * Remove classloader test (no longer needed) * Add benchmark * use better exception in benchmark * Add documentation, migration guide and a utility method to convert legacy function maps * also ignore SecurityException here while checking compatibility (if it happens only an imprecise error message is thrown) * Use Map.copyOf to not clone the map each time we compile an expression * Add another test with same method multiple times * Update ASM to 9.6 and set classfile version to Java 17 * Cleanup classloader permissions, unfortunately "createClassLoader" is still needed for Jacoco for God knows what	2023-12-05 11:53:57 +01:00
Uwe Schindler	17bb73332c	Only enable support for tests.profile if jdk.jfr module is available in Gradle runtime (#12845 )	2023-11-25 20:16:09 +01:00
Uwe Schindler	b45c21f9db	Fix errorprone with alternative runtime (#12808 )	2023-11-14 22:56:55 +01:00
Robert Muir	c28d174cd7	script to run microbenchmarks across different ec2 instance types (#12787 )	2023-11-10 12:31:10 -05:00
Jakub Slowinski	8ae598bae5	Remove patching for doc blocks. (#12741 ) * Change Postings back to using FOR in Lucene99PostingsFormat We are still keeping PFOR for positions only. This is a partial revert of https://github.com/apache/lucene/pull/69 which brings back ForDeltaUtil. * fix merge commit * Add forgotten forDeltaUtil calls to reader * Addressing comments: adding Lucene90RWPostingsFormat + more Also: * Change to Changes.txt * Removal of dead code which was only used in unit tests * Removal of test code from PForUtil * Changes.txt edit in right place now * Apply suggestions from code review: `90 -> 99 refactoring` Co-authored-by: gf2121 <52390227+gf2121@users.noreply.github.com> * Remove decodeTo32 from ForUtil and regenerate --------- Co-authored-by: gf2121 <52390227+gf2121@users.noreply.github.com>	2023-11-06 10:46:03 -05:00
Dawid Weiss	d6836d3d0e	tests.multiplier could be omitted in failed test reproduce line (#12752 ) The default tests.multiplier passed from gradle was 1, but LuceneTestCase tried to compute its default value from TESTS_NIGHTLY. This could lead to subtle errors: nightly mode failures would not report tests.multiplier=1 and when started from the IDE, the tests.multiplier would be set to 2 (leading to different randomness).	2023-11-03 17:05:17 +01:00
Dawid Weiss	8400f89a91	Fix javac task inputs so that they include modular dependencies #12742 (#12745 ) Fix javac task inputs so that they include modular dependencies #12742	2023-11-02 08:49:41 +01:00


578 Commits
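
The shift arithmetic quoted in commit 299d7f6721 (16-5=11, 16-2*5=6, 16-3*5=1, with one bit left over per 16-bit integer) can be reproduced with a small sketch. This is only an illustration of the arithmetic in the commit message, not Lucene's `PostingDecodingUtil` code; the class `ShiftSketch` and the helpers `shifts` and `remainingBits` are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the shift arithmetic from the commit message.
// It does not reproduce Lucene's actual decoding code.
public class ShiftSketch {

    // Shifts that extract whole values from the top of a lane of laneBits bits,
    // when each value takes bitsPerValue bits.
    static List<Integer> shifts(int laneBits, int bitsPerValue) {
        List<Integer> out = new ArrayList<>();
        for (int shift = laneBits - bitsPerValue; shift >= 0; shift -= bitsPerValue) {
            out.add(shift);
        }
        return out;
    }

    // Bits left in each lane after the whole values are extracted; these are
    // the bits from which tail values would be assembled.
    static int remainingBits(int laneBits, int bitsPerValue) {
        return laneBits % bitsPerValue;
    }

    public static void main(String[] args) {
        System.out.println(shifts(16, 5));        // [11, 6, 1]
        System.out.println(remainingBits(16, 5)); // 1
    }
}
```

For 5 bits per value in a 16-bit lane, this yields the three shifts named in the commit message and a single remaining bit.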