Michael McCandless
02528c6757
#10878: add some test verbosity on failure ( #11935 )
2022-11-15 14:20:04 -05:00
Adrien Grand
e2e14df5ac
Add changes entry for #11930.
2022-11-15 15:31:47 +01:00
Adrien Grand
729dc2bb82
Introduce IOContext.LOAD ( #11930 )
...
The default codec has a number of small, hot files that used to be fully
loaded in memory before we moved them off-heap. In the general case, these
files are expected to fit entirely into the page cache for things to work
well. This commit gives codecs control over preloading for the following
files:
- Terms index (`tip`)
- Points index (`kdi`)
- Stored fields index (`fdx`)
- Term vectors index (`tvx`)
This only has an effect on `MMapDirectory`.
2022-11-15 13:59:51 +01:00
Robert Muir
1b9d98d6ec
enable error-prone "narrow calculation" check ( #11923 )
...
This check finds bugs such as https://github.com/apache/lucene/pull/11905.
See https://errorprone.info/bugpattern/NarrowCalculation
2022-11-15 06:42:41 -05:00
Lu Xugang
e5426dbbd2
count() in BooleanQuery can quit early ( #11895 )
...
* count() in BooleanQuery can quit early if the query is a pure disjunction
2022-11-15 17:57:51 +08:00
Adrien Grand
69a7cb22e7
More granular control of preloading on MMapDirectory. ( #11929 )
...
This enables configuring preloading on MMapDirectory based on the file name as well as the IOContext that is used to open the file.
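A minimal sketch of a preload decision that takes both the file name and the open context into account. Everything here is a stand-in for illustration: `Context`, `PRELOAD` and the class name are hypothetical and are not the Lucene classes, and the choice of the `tip` extension is only an example.

```java
import java.util.function.BiPredicate;

final class PreloadPolicySketch {
  // Hypothetical stand-in for the context a file is opened with (not Lucene's IOContext).
  enum Context { DEFAULT, LOAD }

  // Hypothetical policy: preload when the caller asks for it via the context,
  // or when the file name ends with an extension we always want in memory.
  static final BiPredicate<String, Context> PRELOAD =
      (fileName, context) -> context == Context.LOAD || fileName.endsWith(".tip");

  public static void main(String[] args) {
    System.out.println(PRELOAD.test("_0.fdt", Context.LOAD));    // requested by context
    System.out.println(PRELOAD.test("_0.tip", Context.DEFAULT)); // matched by file name
    System.out.println(PRELOAD.test("_0.fdt", Context.DEFAULT)); // neither applies
  }
}
```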
2022-11-15 10:52:49 +01:00
Dawid Weiss
d270025b7f
Update github actions to v3. ( #11931 )
2022-11-14 19:50:52 +01:00
Uwe Schindler
0741a354c0
Improve test introduced in #11918 to also check that the reported invalid position is transformed back to the original position by the slicing code ( #11926 )
2022-11-13 15:34:10 +01:00
Uwe Schindler
98b26e0885
fix merge problem in CHANGES.txt
2022-11-11 17:36:08 +01:00
Uwe Schindler
57ac311c70
Port generic exception handling from MemorySegmentIndexInput to ByteBufferIndexInput ( #11918 )
...
Port generic exception handling from MemorySegmentIndexInput to ByteBufferIndexInput. This also adds the invalid position while seeking or reading to the exception message.
2022-11-11 16:47:52 +01:00
Uwe Schindler
2a68f282f4
Synchronize changelog with 9.4 branch so we do not have duplicates
2022-11-11 16:36:02 +01:00
Peter Gromov
6fbc5f73c3
hunspell: introduce FragmentChecker to speed up ModifyingSuggester ( #11909 )
...
add NGramFragmentChecker to quickly check whether insertions/replacements produce strings that are even possible in the language
Co-authored-by: Dawid Weiss <dawid.weiss@gmail.com>
2022-11-11 12:13:47 +01:00
Benjamin Trent
c8d44acf20
Follow up to GITHUB#11916, remove deleted docs check ( #11919 )
2022-11-10 18:40:24 -05:00
Benjamin Trent
3a506ec87a
GITHUB#11911: improve checkindex to be more thorough for vectors ( #11916 )
...
search every N docs to get close to 64 tests
2022-11-10 16:45:47 -05:00
Uwe Schindler
e9ef61ba39
Fix bug with set of strings after the Gradle upgrade: cast explicitly from GString to String
2022-11-10 17:18:30 +01:00
Benjamin Trent
1360baaee9
Fix integer overflow when seeking the vector index for connections ( #11905 )
...
* Fix integer overflow when seeking the vector index for connections
* Adding monster test to cause overflow failure
2022-11-10 08:24:32 -05:00
Peter Gromov
f7417d5961
hunspell: allow for faster dictionary iteration during 'suggest' by using more memory (opt-in) ( #11893 )
2022-11-09 08:20:50 +01:00
Greg Miller
c66a559050
Further optimize DrillSideways scoring ( #11881 )
2022-11-08 10:08:12 -08:00
Benjamin Trent
f9c26ed501
Fix latent casting bug in BKDWriter ( #11907 )
2022-11-08 15:55:07 +01:00
Peter Gromov
682e5c94e8
[hunspell] speed up WordFormGenerator ( #11904 )
2022-11-07 19:41:17 +01:00
Lu Xugang
a8120bcb32
Simplify the logic of matchAll() in IndexSortSortedNumericDocValuesRangeQuery ( #11884 )
2022-11-07 19:09:52 +08:00
Michael Sokolov
48aad5090f
#11896: reduce top k in test to avoid split-graph ( #11899 )
2022-11-04 09:30:46 -04:00
Nhat Nguyen
1a5ad61b9d
Document that bulkScorer method can return null ( #11897 )
...
Like Weight#scorer, we should warn users that Weight#bulkScorer can
return null if the query matches no documents.
2022-11-02 15:12:43 -07:00
Robert Muir
4e207fed62
Tone down TestDocumentsWriterStallControl.testRandom, so it does not take minutes ( #11894 )
...
This test often takes several minutes in normal runs (no NIGHTLY/multiplier/etc.). Tone it down so that it isn't slow; CI builds can still work it harder by passing those parameters.
2022-11-02 12:17:15 -04:00
Tim Stewart
7c130d2f07
Fix typo in CONTRIBUTING.md ( #11879 )
2022-11-01 20:10:05 +00:00
Peter Gromov
419ffd3974
[hunspell] perform slightly fewer checks after 2 suffixes have been removed
2022-10-31 10:09:54 +01:00
Marios Trivyzas
3210a42f09
Fix nanos to millis conversion for tests ( #11856 )
2022-10-29 09:05:17 +02:00
Patrick Zhai
26ec0dd44c
add gradle aggregated coverage console log and html location ( #11882 )
...
* The jacoco coverage task shouldn't trigger all test tasks as dependencies; instead, it should run after those test tasks that you choose to run. Removed the java plugin from the top level.
* Make coverage depend on the default test task.
* Update jacoco log plugin so that it doesn't make hard dependencies on test tasks.
Co-authored-by: Dawid Weiss <dawid.weiss@carrotsearch.com>
2022-10-28 23:33:37 -07:00
Robert Muir
8736c18747
Allow building with java 18 now that gradle supports it ( #11889 )
...
* Allow building with java 18 now that gradle supports it
* update the "generic error" in these scripts
2022-10-28 23:41:09 -04:00
Navneet Verma
e7253f112d
Add interface to relate a LatLonShape with another shape represented as Component2D. ( #11753 )
...
Adds createLatLonShapeDocValues and createXYShapeDocValues factory methods
to LatLonShape and XYShape factory classes, respectively.
Signed-off-by: Nicholas Walter Knize <nknize@apache.org>
2022-10-28 13:52:20 -05:00
Dawid Weiss
5c7edd7f38
Upgrade to gradle 7.5.1 (excluding launch scripts, which we have customized) ( #11886 )
2022-10-28 08:49:36 +02:00
Marc D'Mello
2793256682
GITHUB#11795: Add FilterDirectory to track write amplification factor ( #11796 )
...
* GITHUB#11795: Add FilterDirectory to track write amplification factor
* addressed feedback
* added optional temp output tracking and real time tracking
* addressed more feedback
* more improvements + added CHANGES.txt entry
* format edit to CHANGES.txt
* remove waf factor calculation
Co-authored-by: Marc D'Mello <dmellomd@amazon.com>
2022-10-27 15:07:56 -04:00
Michael Sokolov
b3bc59910f
When evaluating expressions, defer calling advanceExact on operands until doubleValue() is called ( #11878 )
2022-10-26 14:05:39 -04:00
gf2121
05bd83dfe1
Use ByteArrayComparator for PointInSetQuery#MergePointVisitor ( #11876 )
2022-10-26 13:39:32 +08:00
Dawid Weiss
50261de406
Update java version to 17 for Lucene 10 in the release wizard. ( #11872 )
2022-10-25 13:50:21 +02:00
gf2121
b1d1e488f2
Move LUCENE-10376 CHANGES entry to 10.0.0 ( #11871 )
2022-10-24 22:39:21 +08:00
iverase
976a38baa0
Add back-compat indices for 9.4.1
2022-10-24 15:20:44 +02:00
iverase
9ce6268cce
Add bugfix version 9.4.1
2022-10-24 15:13:12 +02:00
iverase
70d0ec322b
DOAP changes for release 9.4.1
2022-10-24 14:20:51 +02:00
gf2121
8cfbc18497
LUCENE-10376: Roll up the loop in vint/vlong in DataInput ( #602 )
2022-10-24 17:39:22 +08:00
Julie Tibshirani
0f525bfb14
Fix Lucene94HnswVectorsFormat validation on large segments ( #11861 )
...
When reading large segments, the vectors format can fail with a validation
error:
java.lang.IllegalStateException: Vector data length 3070061568 not matching
size=999369 * dim=768 * byteSize=4 = -1224905728
The problem is that we use an integer to represent the size, which is too small
to hold it. The bug snuck in during the work to enable int8 values, which
switched a long value to an int.
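The arithmetic in the error message above can be reproduced with a small sketch. The class and method names are hypothetical and this is not the code from the vectors format; it only shows how the same product wraps around in `int` but not in `long`.

```java
final class VectorSizeOverflow {
  // int arithmetic: the product silently wraps around past Integer.MAX_VALUE.
  static int sizeAsInt(int size, int dim, int byteSize) {
    return size * dim * byteSize;
  }

  // long arithmetic: widening the first operand makes the whole product 64-bit.
  static long sizeAsLong(int size, int dim, int byteSize) {
    return (long) size * dim * byteSize;
  }

  public static void main(String[] args) {
    // Values taken from the exception message above.
    System.out.println(sizeAsInt(999369, 768, 4));  // -1224905728
    System.out.println(sizeAsLong(999369, 768, 4)); // 3070061568
  }
}
```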
2022-10-19 13:49:59 -07:00
Patrick Zhai
6cde41c9fd
GITHUB-11838 Change API to allow concurrent query rewrite ( #11840 )
...
Replace Query#rewrite(IndexReader) with Query#rewrite(IndexSearcher)
2022-10-19 09:49:40 -07:00
Peter Gromov
05971b3315
hunspell: speed up GeneratingSuggester by not deserializing non-suggestible roots ( #11859 )
2022-10-19 13:17:43 +02:00
Steven Schlansker
f3d85be476
PrimaryNode: add configurable timeout to waitForAllRemotesToClose ( #11822 )
2022-10-18 17:21:01 -07:00
Adrien Grand
2ed16c7846
Revert "Binary search the entries when all suffixes have the same length in a leaf block. ( #11722 )"
...
This reverts commit 3adec5b1ce.
2022-10-18 14:27:02 +02:00
zhouhui
3adec5b1ce
Binary search the entries when all suffixes have the same length in a leaf block. ( #11722 )
2022-10-18 11:07:52 +02:00
Benjamin Trent
cd5e200f47
Fix failure to load larger data sets in KnnGraphTester ( #11849 )
...
When running the `reindex` task with KnnGraphTester, exceptionally large
datasets can be used. Since mmap is used to read the data, we need to know the
buffer size. This size is limited to Integer.MAX_VALUE, which is inadequate for
larger datasets.
So, this commit adjusts the reading to only read a single vector at a time.
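A rough sketch of reading one vector at a time with a small fixed-size buffer, rather than mapping the whole file. This is not the KnnGraphTester code: the class and method names are hypothetical, and the file layout (raw little-endian floats, `dim` per vector, no header) is an assumption.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.channels.FileChannel;

final class VectorAtATimeReader {
  // Reads vector number `ord` from a file of raw little-endian floats (assumed layout).
  static float[] readVector(FileChannel in, long ord, int dim) throws IOException {
    // The buffer only ever holds a single vector, so its size stays far below Integer.MAX_VALUE.
    ByteBuffer buf = ByteBuffer.allocate(dim * Float.BYTES).order(ByteOrder.LITTLE_ENDIAN);
    // `ord` is a long, so the offset is computed in 64-bit arithmetic.
    long offset = ord * dim * Float.BYTES;
    while (buf.hasRemaining()) {
      int n = in.read(buf, offset + buf.position());
      if (n < 0) {
        throw new IOException("unexpected end of file while reading vector " + ord);
      }
    }
    buf.flip();
    float[] vector = new float[dim];
    buf.asFloatBuffer().get(vector);
    return vector;
  }
}
```

Because only the file offset grows with the dataset, the buffer size no longer depends on the total file length.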
2022-10-17 16:39:58 -07:00
Peter Gromov
2958f2ae9d
hunspell: speed up suggestions by caching speller and compound stemming requests ( #11857 )
2022-10-17 21:25:12 +02:00
Zach Chen
21e3f654fb
LUCENE-10635: Ensure test coverage for WANDScorer by using a test query ( #1039 )
2022-10-15 13:02:02 -07:00
Robert Muir
ece8ea715c
Fix ExitableDirectoryReader sampling constants to be power-of-2 ( #11850 )
...
If it's performance sensitive enough that we should do sampling, then we should avoid integer division too.
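A small sketch of why a power-of-2 constant helps. The class name, the interval value and the method are hypothetical, not the constants from ExitableDirectoryReader: with a power-of-2 interval, the "every Nth call" check can be written as a bit mask instead of a remainder.

```java
final class SamplingSketch {
  // Hypothetical sampling interval; the mask trick below requires a power of two.
  static final int SAMPLE_INTERVAL = 1 << 10; // 1024

  // Same result as (count % SAMPLE_INTERVAL == 0), but written without a division.
  static boolean shouldSample(int count) {
    return (count & (SAMPLE_INTERVAL - 1)) == 0;
  }

  public static void main(String[] args) {
    System.out.println(shouldSample(0));    // true
    System.out.println(shouldSample(1023)); // false
    System.out.println(shouldSample(2048)); // true
  }
}
```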
2022-10-15 12:05:15 -04:00