lucene

Commit Graph

Author	SHA1	Message	Date
Dawid Weiss	486141f0eb	LUCENE-9660: correct help/tests.txt.	2021-10-26 08:45:58 +02:00
Mayya Sharipova	2ed6e4aa78	LUCENE-10154 NumericLeafComparator to define getPointValues (#364 ) This patch adds getPointValues to NumericLeafComparator, similar to how it has getNumericDocValues. Numeric Sort optimization with points relies on the assumption that points and doc values record the same information, as we substitute iterator over doc_values with one over points. If we override getNumericDocValues it almost certainly means that whatever PointValues NumericComparator is going to look at shouldn't be used to skip non-competitive documents. Returning null for pointValues in this case will force comparator NOT to use sort optimization with points, and continue with a traditional way of iterating over doc values.	2021-10-25 09:38:37 -04:00
Dawid Weiss	81f5b4d642	LUCENE-9660: add tests.neverUpToDate=true option which, by default, makes test tasks always execute. (#410 )	2021-10-25 14:51:11 +02:00
David Smiley	2719cf6630	LUCENE-9431: UnifiedHighlighter WEIGHT_MATCHES is now true by default (#362 ) Co-authored-by: Animesh Pandey <apanimesh061@gmail.com>	2021-10-22 20:40:22 -04:00
Michael McCandless	e3151d6c7d	LUCENE-10093: fix conflicting test assert to match how TieredMergePolicy (TMP) works; improve TMP javadocs (#375 )	2021-10-21 09:23:17 -04:00
Adrien Grand	8b6c90eccd	LUCENE-10165: Fix test failures.	2021-10-21 11:32:10 +02:00
Adrien Grand	9e84b2fd41	LUCENE-10165: Implement Lucene90DocValuesProducer#getMergeInstance. (#374 ) This speeds up merging by returning doc values that perform faster when all doc IDs and values are consumed.	2021-10-21 08:41:47 +02:00
Nhat Nguyen	4c2692e897	Do not run testHighOrdsSortedSetDV with SimpleTextCodec (#403 ) Avoid running testHighOrdsSortedSetDV with SimpleTextCodec as it requires a lot of memory and the bug was with Lucene90 Codec.	2021-10-20 18:22:34 -04:00
Adrien Grand	3a11983de2	LUCENE-10189: Optimize flush of doc-value fields that are effectively single-valued. (#399 )	2021-10-20 19:05:40 +02:00
Adrien Grand	0e1f9fcf31	LUCENE-10193: Cut over more array access to VarHandles. (#402 ) LZ4 is interesting because it used to read data in little-endian order even though Directory APIs were big endian. So most calls to LZ4 in backward-codecs have been changed to change the endianness of the input/output.	2021-10-20 19:04:01 +02:00
Julie Tibshirani	6bb2bbcd6a	LUCENE-10146: Add note that dot product is preferred over cosine (#400 ) While VectorSimilarityFunction#COSINE is helpful when you need to preserve the original vectors, it is significantly slower than DOT_PRODUCT. This commit adds javadocs to COSINE explaining that dot product is the fastest option.	2021-10-20 09:50:25 -07:00
Jan Høydahl	5b8f0a5eb5	LUCENE-10174 Speed up 'pushLocal' by using uncompressed tar (#401 )	2021-10-20 14:41:24 +02:00
Adrien Grand	f13a400b9a	LUCENE-10187: Reduce DirectWriter's padding. (#398 ) It would make us more likely to detect out-of-bounds access in the future.	2021-10-20 10:30:09 +02:00
Tomoko Uchida	54418cef45	LUCENE-9997: write release revision to system temp dir (#394 )	2021-10-20 07:06:30 +09:00
Jan Høydahl	c77e9ddf93	LUCENE-9997 Second pass smoketester fixes for 9.0 (#391 ) * Java17 fixes * Add to error message that the unexpected file is in lucene/ folder * Fix gpg command utf-8 output * Add --no-daemon to all gradle calls, and skip clean Co-authored-by: Dawid Weiss <dawid.weiss@carrotsearch.com> Co-Authored-by: Tomoko Uchida <tomoko.uchida.1111@gmail.com>	2021-10-19 21:24:06 +02:00
Jan Høydahl	f5486d13e6	LUCENE-10174 BuildAndPushRelease additional improvements (#396 )	2021-10-19 19:48:44 +02:00
Stefan Vodita	54c5a2ce28	LUCENE-10182: Order assertion parameters correctly (#397 )	2021-10-19 16:29:46 +02:00
Adrien Grand	1448e4739b	LUCENE-10180: Avoid using lambdas in SegmentMerger. (#385 )	2021-10-19 15:00:20 +02:00
Nhat Nguyen	8b68bf60c9	LUCENE-10159: Fix invalid access in sorted set dv (#389 ) We introduced invalid accesses for sorted set doc values in LUCENE-9613. However, the issue has been unnoticed because the ordinals in doc values tests aren't complex enough to use high packed bits, and the 3 padding bytes make these invalid accesses perfectly fine. To reproduce this issue, we need to use at least 20 bits per value for the ordinals.	2021-10-19 08:00:00 -04:00
Dawid Weiss	6c21862a55	LUCENE-10186: Include manifest and legalese in source and javadoc jars. (#395 )	2021-10-19 10:04:42 +02:00
Dawid Weiss	e290f91bb2	LUCENE-10166: removed module-level README.txt and modified a few links, removed a few obsolete instructions from 20 years ago. (#379 )	2021-10-19 09:45:49 +02:00
Mayya Sharipova	6f67e8287f	Add back-compat indices for 8.10.1	2021-10-18 20:38:34 -04:00
Stefan Vodita	d9e3d99ec9	LUCENE-10182: Be specific about which sizeOf() is called; rename RamUsageTester.sizeOf to ramUsed (#386 ) Co-authored-by: Stefan Vodita <voditas@amazon.com>	2021-10-19 00:13:32 +02:00
Mayya Sharipova	a93dfe93c9	Add bugfix version 8.10.1	2021-10-18 18:02:05 -04:00
Robert Muir	f8d431ae44	LUCENE-10185: pass --release 11 to ECJ linter, fix JDK 17 build (#393 ) * LUCENE-10185: pass --release 11 to ECJ linter, fix JDK 17 build Otherwise, new java releases such as JDK 18, JDK 19, ... may have even more new deprecations, the build shouldn't fail in such cases. Remove -source/-target now that we pass --release Fix casting so ECJ understands it and creates correct call signature (UweSays: "It's ok. I know why it happens, but it's a bug in ECJ. The type safety is checked by the invokeexact") Co-authored-by: Uwe Schindler <uschindler@apache.org>	2021-10-18 16:43:53 -04:00
Dawid Weiss	c4c3c3270e	LUCENE-9997: Collect signed maven artifacts if -Psign is passed. (#392 ) * Collect signed maven artifacts if -Psign is passed. * Configure signing using gpg across all projects.	2021-10-18 20:58:29 +02:00
Mayya Sharipova	41fe301a21	DOAP changes for release 8.10.1	2021-10-18 11:11:15 -04:00
Jan Høydahl	175a49e54a	LUCENE-10163 Move LICENSE and NOTICE file to top level (#388 ) * Add changes entry, under a new "Build" headline	2021-10-18 01:24:11 +02:00
Tomoko Uchida	18c6010e0f	LUCENE-10163: Remove pointer to a file that no longer exists (#390 )	2021-10-17 18:55:33 +09:00
Tomoko Uchida	03e8192674	Specify minimum required python version for dev scripts (#387 )	2021-10-17 13:45:49 +09:00
Jan Høydahl	cdfa11b158	LUCENE-10174 Update buildAndPushRelease.py for new gradle build (#382 ) Co-authored-by: Tomoko Uchida <tomoko.uchida.1111@gmail.com>	2021-10-17 01:17:34 +02:00
Jan Høydahl	f38c401283	LUCENE-10179 No longer check for release status on mirrors (#384 )	2021-10-15 20:25:29 +02:00
Stefan Vodita	560f71b47d	LUCENE-10129: Add RamUsageEstimator.shallowSizeOf() for primitive arrays (#367 ) Co-authored-by: Stefan Vodita <voditas@amazon.com>	2021-10-15 15:45:04 +02:00
Mayya Sharipova	c9e56d27a3	LUCENE-10178 Add toString method for Lucene90HnswVectorsFormat (#383 ) Add toString method to Lucene90HnswVectorsFormat for testing and debugging.	2021-10-15 09:09:27 -04:00
Shintaro Murakami	7d5df2d6fe	Remove redundant null check (#378 ) In commit method, mgr is already used without null check before this null check.	2021-10-15 07:38:46 -04:00
Mike Drob	95759d299e	Fix typo	2021-10-14 13:28:10 -05:00
Chris Hostetter	f64c81c3f8	LUCENE-10173: remove max-worker restriction added by LUCENE-9488 when 'useGpg' is in effect. Also update docs to remove the point of confusion that led to thinking that the restriction was useful	2021-10-14 10:50:16 -07:00
Tommaso Teofili	cfd9f9f98f	LUCENE-10172 - minor java code improvements to Lucene Classification (#381 ) * LUCENE-10172 - minor code improvements * LUCENE-10172 - spotlessApply	2021-10-14 10:04:33 +02:00
Adrien Grand	c36ce300ae	LUCENE-10170: Restore compression speed for LZ4. (#377 ) A slowdown had been introduced in LUCENE-7521.	2021-10-14 08:21:15 +02:00
Jan Høydahl	ae956db41c	LUCENE-9997 Revisit smoketester for 9.0 build (#355 ) * LUCENE-9997 Revisit smoketester for 9.0 build * Remove checkBrokenLinks * Add back checkBrokenLinks * Review feedback. Remove traces of solr-specific testNotice() method Move backCompat test up to other "if isSrc" block * Review feedback. Bring back the 'checkMaven()' method, as it checks lucene maven artifacts. But since we don't have pom template files anymore, no need to compare with templates * Review feedback. Fix script compatibility by comparing against X.Y instead of X.Y.Z * Review feedback. Remove unnecessary if lucene test Convert some ant commands to gradle * Update MANIFEST tests to match the gradle-produced manifest * LUCENE-10107 Read multi-line commit from Manifest Backport from branch_8x * Collapse for project in 'lucene' loops and methods taking 'project' as argument Disable checkJavadocLinks, as this dependency no longer exists in 'scripts' folder * Review feedback - fix more ant stuff, convert to gradle equivalent * Review feedback: Refactor file open * Comment out javadoc generation - was only used to check broken links? * Fix charset of gpg console output to always be utf-8 Fix two more places to use with open() * Accept 'LICENSE' without txt or md suffix in top-level * Disable vector dictionary abuse exception if started with -Dsmoketester * Reformat code * Use -Dsmoketester flag when invoking IndexFiles	2021-10-13 15:24:14 +02:00
Patrick Zhai	6a41bc6310	LUCENE-10103 Make QueryCache respect Accountable queries (#346 )	2021-10-13 09:10:09 -04:00
Dawid Weiss	8bcc3dc430	LUCENE-9488: rewrite distribution assembly, signing and checksum generation (#372 )	2021-10-13 11:50:58 +02:00
Dawid Weiss	dad926ad17	LUCENE-10167: Run tests on PRs (and pushes to the main branch) (#376 )	2021-10-12 15:19:34 +02:00
Alan Woodward	ca073c98fa	LUCENE-10140: Correct minimizing iterator sub-matches (#370 ) Some interval iterators will attempt to minimize themselves by moving sub-iterators forward until they are no longer positioned within the current match. This causes problems when we try and pull Matches for these iterators, as their sub-iterators are now out of position. We have previously tried to deal with this by introducing caching iterators that check to see if they have been moved beyond the end of the current interval, but this fails in cases where an interval can contain multiple copies of a particular iterator. This commit adds the ability for minimizing iterators to signal to their children when a prospective match has been found, so that they can cache their positions and offsets. Co-authored-by: Nikolay Khitrin <khitrin@gmail.com>	2021-10-12 09:33:36 +01:00
Robert Muir	f67dec1739	LUCENE-10164: lucene/replicator should only have jetty as a test dependency (#373 )	2021-10-11 13:53:58 -04:00
Julie Tibshirani	f4861159c3	LUCENE-10146: Add VectorSimilarityFunction.COSINE (#366 ) This PR adds support for using cosine similarity with kNN vector fields. It takes a simple approach and doesn't attempt optimizations like normalizing the query vector in advance, or performing loop unrolling. The thinking is that users who prioritize efficiency can normalize all vectors in advance and use `VectorSimilarityFunction.DOT_PRODUCT`.	2021-10-11 08:49:19 -07:00
jimczi	ed69f6080f	Update CHANGES entry for 8.10.1	2021-10-11 11:13:58 +02:00
Uwe Schindler	c94aca7e5d	LUCENE-10158: Add a new interface Unwrappable to the utils package to ease migration to new MMAPDirectory and its testing (#369 )	2021-10-11 00:25:40 +02:00
Mayya Sharipova	6f232b6f4b	Add CHANGES entry for 8.10.1	2021-10-10 07:43:08 -04:00
Robert Muir	c1fe9efb4b	LUCENE-10160: improve assert to be easier to debug. Instead of a vague: java.lang.AssertionError at..., include some basic information: java.lang.AssertionError: size=16252835,limit=15728640,maxSegmentSizeMb=10.0	2021-10-09 12:33:29 -04:00


35575 Commits
