Mayya Sharipova
87255c117d
Add change line for LUCENE-9848
2022-05-04 14:22:31 -04:00
Mayya Sharipova
dc6a7f9468
LUCENE-9848 Sort HNSW graph neighbors for construction ( #862 )
...
* LUCENE-9848 Sort HNSW graph neighbors for construction
Sort HNSW graph neighbors when applying diversity criterion
During HNSW graph construction, when a node already has more
connections than the maximum allowed (maxConn), we need to prune
its connections using a diversity criterion to limit the number of
connections to maxConn.
Currently when we add reverse connections to already existing nodes,
we don't keep them sorted. Thus later, when we apply the diversity
criterion, we may fail to prune the worst (most distant) non-diverse nodes.
This patch makes sure that neighbour connections are always sorted
from best (closest) to worst (most distant), and that the diversity
criterion is applied to nodes from worst to best.
This patch does the following:
- enhance NeighborArray to always keep neighbour nodes sorted according
to their scores (in desc or asc order). Make NeighborArray aware in
which order the nodes should be sorted.
- make OnHeapHnswGraph aware of the order of similarity function
- make HnswGraphBuilder apply the diversity criterion from worst to
best nodes
- create Lucene90NeighborArray to keep the previous logic of
NeighborArray for Lucene90Codec
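The pruning idea described in this message can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code from the patch: the class SortedNeighborsSketch, its methods, the in-memory double[][] vector store and the squared Euclidean distance are all hypothetical stand-ins, and "non-diverse" is taken here to mean "closer to some better neighbour than to the owning node".

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; this is NOT Lucene's NeighborArray or HnswGraphBuilder. */
final class SortedNeighborsSketch {
  private final double[][] points; // assumed vector store, indexed by node id
  private final int owner;         // node whose neighbour list this is
  private final List<Integer> ids = new ArrayList<>(); // kept sorted, closest first

  SortedNeighborsSketch(double[][] points, int owner) {
    this.points = points;
    this.owner = owner;
  }

  /** Squared Euclidean distance; smaller means closer, i.e. a better neighbour. */
  static double distance(double[] a, double[] b) {
    double sum = 0;
    for (int i = 0; i < a.length; i++) {
      double d = a[i] - b[i];
      sum += d * d;
    }
    return sum;
  }

  private double toOwner(int id) {
    return distance(points[owner], points[id]);
  }

  /** Inserts a neighbour at its sorted position, then prunes if over maxConn. */
  void add(int id, int maxConn) {
    int pos = 0;
    while (pos < ids.size() && toOwner(ids.get(pos)) <= toOwner(id)) {
      pos++;
    }
    ids.add(pos, id);
    if (ids.size() > maxConn) {
      ids.remove(findWorstNonDiverse()); // int argument: removes by index
    }
  }

  /**
   * Scans from worst to best and returns the index of the first neighbour that
   * is closer to some better neighbour than to the owner. If every neighbour
   * is diverse, falls back to the worst (last) one.
   */
  private int findWorstNonDiverse() {
    for (int i = ids.size() - 1; i > 0; i--) {
      double distToOwner = toOwner(ids.get(i));
      for (int j = 0; j < i; j++) {
        if (distance(points[ids.get(i)], points[ids.get(j)]) < distToOwner) {
          return i;
        }
      }
    }
    return ids.size() - 1;
  }

  /** Returns a copy of the neighbour ids, best first. */
  List<Integer> ids() {
    return new ArrayList<>(ids);
  }
}
```

Because the list stays sorted on every insert, the worst-to-best scan always meets the most distant candidates first, which is the property the unsorted reverse connections could not guarantee.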
2022-05-04 14:15:14 -04:00
Gautam Worah
c3d47507e9
LUCENE-10524 Add benchmark suite details to CONTRIBUTING.md ( #853 )
2022-05-03 12:53:20 +09:00
Lu Xugang
fe9d26178d
LUCENE-10552: KnnVectorQuery has incorrect equals/ hashCode ( #859 )
...
* LUCENE-10552: KnnVectorQuery now includes filter in equals/ hashCode
2022-05-02 17:58:47 -04:00
Kevin Risden
7efac761f4
LUCENE-10534: MinFloatFunction / MaxFloatFunction calls exists twice ( #837 )
2022-05-02 13:13:45 -04:00
spike.liu
d9d2cb6f09
LUCENE-10188: Give SortedSetDocValues a docValueCount() ( #663 )
...
Co-authored-by: vlc刘诚 <chengliu@trip.com>
2022-05-02 10:41:12 -04:00
Tomoko Uchida
5f48469837
Allow to link to github PR from changes ( #854 )
2022-05-02 23:06:39 +09:00
Michael McCandless
138d40e657
LUCENE-10551: improve testing of LowercaseAsciiCompression ( #858 )
2022-05-02 08:49:16 -04:00
Kevin Risden
3063109d83
LUCENE-10542: FieldSource exists implementations can avoid value retrieval ( #847 )
2022-04-29 22:43:16 -04:00
Dawid Weiss
05de9085ce
LUCENE-10539: Return a stream of completions from FSTCompletion. ( #844 )
2022-04-29 21:35:35 +02:00
Dawid Weiss
75aadb9589
gradle 7.3.3 quick upgrade ( #856 )
2022-04-29 21:02:19 +02:00
Greg Miller
902a7df0e5
LUCENE-10530: Avoid floating point precision bug in TestTaxonomyFacetAssociations ( #848 )
2022-04-29 08:57:46 -07:00
Ignacio Vera
0dad9ddae8
LUCENE-10508: Use MIN_WIDE_EXTENT for GeoWideDegenerateHorizontalLine ( #855 )
2022-04-29 10:21:08 +02:00
Dawid Weiss
6e6c61eb13
LUCENE-10541: Test-framework: limit the default length of MockTokenizer tokens to 255.
2022-04-29 09:41:42 +02:00
Tomoko Uchida
c28f575b6d
LUCENE-10493: move n-best logic to analysis-common ( #846 )
2022-04-29 10:35:30 +09:00
Chris Hostetter
6afb9bc25a
LUCENE-10292: prevent thread leak (or test timeout) if exception/assertion failure in test iterator
2022-04-28 15:17:53 -07:00
Chris Hostetter
a8d86ea6e8
LUCENE-10292: Suggest: Fix FreeTextSuggester so that getCount() returns results consistent with lookup() during concurrent build()
...
Fix SuggestRebuildTestUtil to reliably surface this kind of failure that was previously sporadic
2022-04-27 18:14:01 -07:00
Gautam Worah
8d9a333fac
LUCENE-10525 Improve WindowsFS emulation to catch invalid file names ( #829 )
...
* Add filename checks for WindowsFS
* don't delegate Path default methods, which makes subclassing easier. Also fix delegation bug (endsWith was calling startsWith).
2022-04-27 09:52:47 -04:00
Ignacio Vera
922d3af8d6
LUCENE-10508: Use MIN_WIDE_EXTENT for all wide rectangles ( #845 )
2022-04-27 11:24:16 +02:00
Ignacio Vera
5d3ab09676
LUCENE-10470: [Tessellator] Fix some failing polygons due to collinear edges ( #756 )
...
Check if polygon has been successfully tessellated before we fail (we are failing some valid
tessellations) and allow filtering edges that fold on top of the previous one
2022-04-27 10:24:22 +02:00
Ignacio Vera
2b20b3f2ca
LUCENE-10508: Fix error for rectangles with an extent close to 180 degrees ( #824 )
...
This commit introduces a GeoWideRectangle.MIN_WIDE_EXTENT that takes into account the angular resolution
in order to build a GeoWideRectangle.
2022-04-27 07:33:49 +02:00
Greg Miller
f11468186a
LUCENE-10529: Fix TestTaxonomyFacetAssociations NPE when randomly indexing no documents for dim
2022-04-26 20:13:28 -07:00
Michael Sokolov
2a618586de
fix path to jar file in demo documentation
2022-04-26 15:48:21 -04:00
xiaoping
ebe2d7b4fd
LUCENE-10499: reduce unnecessary copy data overhead when growing array size ( #786 )
...
Co-authored-by: xiaoping.wjp <xiaoping.wjp@alibaba-inc.com>
2022-04-26 15:35:56 +02:00
Dawid Weiss
2966228fae
LUCENE-10535: upgrade com.palantir.consistent-versions to 2.10.0
2022-04-26 08:31:15 +02:00
Kevin Risden
223a74fcb5
LUCENE-10533: SpellChecker.formGrams is missing bounds check ( #836 )
2022-04-25 15:55:50 -04:00
Dawid Weiss
a53d05b9f9
Upgrade spotless and use runToFixMessage for 'gradlew tidy' hint. ( #834 )
2022-04-25 14:51:14 +02:00
Dawid Weiss
2080caff3f
Fix JVM error branch logic. ( #835 )
2022-04-25 14:33:56 +02:00
Tomoko Uchida
c89f8a7ea1
LUCENE-10493: factor out Viterbi algorithm and share it between kuromoji and nori ( #805 )
2022-04-25 20:09:46 +09:00
Adrien Grand
2a4c21bb58
LUCENE-8836: Speed up TermsEnum#lookupOrd on increasing sequences of ords. ( #827 )
2022-04-25 09:18:21 +02:00
Robert Muir
1089b482fc
LUCENE-10528: use Xvfb in test to avoid messing up user's desktop ( #828 )
...
Co-authored-by: Tomoko Uchida <tomoko.uchida.1111@gmail.com>
2022-04-23 08:00:33 -04:00
gf2121
35ca2d79f7
LUCENE-10315: Speed up DocIdsWriter by ForUtil ( #797 )
2022-04-23 19:32:02 +08:00
Chris Hegarty
3bcc40efe9
LUCENE-10517: Improve performance of SortedSetDV faceting by iterating on class types ( #812 )
2022-04-21 18:39:53 +02:00
Chris Hegarty
08f848a582
Add two facet tests ( #826 )
2022-04-21 18:39:41 +02:00
Robert Muir
c897aac077
fail clearly on too-new JDK ( #819 )
...
Gradle will give a very confusing error, let's make it absolutely clear.
Co-authored-by: Dawid Weiss <dawid.weiss@carrotsearch.com>
2022-04-21 09:22:26 -04:00
Robert Muir
d6461eab0b
improve spotless error to suggest running 'gradlew tidy' ( #817 )
...
The current error isn't helpful as it suggests a per-module command. If
the user has modified multiple modules, they will be running gradle
commands to try to fix each one of them, when it would be easier to just
run 'gradlew tidy' a single time and fix everything.
2022-04-21 08:30:10 -04:00
Robert Muir
844bd88839
LUCENE-10526: add single method to mockfile to wrap a Path ( #822 )
...
Currently "new FilterPath" is called from everywhere, making it impossible for a mockfilesystem to use a custom subclass.
Add FilterFileSystemProvider.wrapPath(path), which subclasses can override. Fix tests to use it instead of juggling URI objects and passing FileSystems around.
2022-04-20 16:40:10 -04:00
Yuting Gan
ec53a72a44
LUCENE-10495: Fix return statement of siblingsLoaded() in TaxonomyFacets ( #778 )
2022-04-20 12:56:43 -07:00
Adrien Grand
2d278a0efe
Clarify that terms dicts are per-field in block-tree's javadocs. ( #823 )
2022-04-20 17:19:51 +02:00
Robert Muir
e390f33258
Fix incorrect docs in README.md: it must be java 17 exactly, java 18 does not work ( #818 )
2022-04-20 11:07:24 -04:00
Adrien Grand
7c173b0e1c
LUCENE-10153: Make errorprone happy.
2022-04-20 16:47:34 +02:00
Ignacio Vera
4c133f435d
LUCENE-10514: Component2D#Within methods should return NOTWITHIN for triangles within the query geometry ( #809 )
...
This commit makes sure we always return NOTWITHIN for fully contained triangles in
Component2D#within* methods
2022-04-20 16:30:29 +02:00
Adrien Grand
15ecf3c27f
LUCENE-10503: Fix JIRA number in CHANGES.
2022-04-19 15:40:53 +02:00
Luca Cavanna
866bb86a1c
LUCENE-10506: change visibility of ProfilerCollector#deriveCollectorName to protected ( #799 )
...
This allows subclasses to extend how the inner collector name is derived.
2022-04-19 15:36:11 +02:00
Adrien Grand
d9e37f3123
LUCENE-10153: Improve accuracy of scaled scores in WANDScorer. ( #794 )
2022-04-19 15:26:24 +02:00
Mike McCandless
fb76d0b104
LUCENE-10482, LUCENE-10521: hrmph, put the @Ignore in the right place
2022-04-19 07:19:15 -04:00
Mike McCandless
c388705855
LUCENE-10482: Ignore this test for now
2022-04-18 17:14:04 -04:00
Tomoko Uchida
872349cef9
Add some basic tasks to help/workflow ( #811 )
2022-04-18 11:34:28 +09:00
Gautam Worah
d322be52f2
LUCENE-10482 Bug Fix: Don't use Instant.now() as prefix for the temp dir name ( #814 )
...
* Don't use Instant.now() as prefix for the temp dir name
* spotless
2022-04-17 21:18:08 -04:00
Gautam Worah
10ebc099c8
LUCENE-10482 Allow users to create their own DirectoryTaxonomyReaders with empty taxoArrays instead of letting the taxoEpoch decide ( #762 )
2022-04-15 10:45:02 -07:00