Mayya Sharipova
cc58c51941
LUCENE-10089 Disable numeric sort optim when needed ( #286 )
...
Add a method to SortField that allows enabling/disabling the numeric
sort optimization with points, which is enabled by default from 9.0.
2021-09-09 10:22:42 -04:00
Mike McCandless
ee0695eda8
LUCENE-10092: fix test bug by forceMerging the index down to one segment
2021-09-08 14:01:10 -04:00
Adrien Grand
7eb35be045
LUCENE-10087: Validate number of dimensions and bytes per dimension for numeric SortFields. ( #283 )
2021-09-07 13:28:39 +02:00
Mayya Sharipova
bc161e6dcc
LUCENE-10040 Correct TestHnswGraph.testSearchWithAcceptOrds ( #277 )
...
If we set numSeed = 10, this test fails sometimes because it may mark
expected results docs (from 0 to 9) as deleted which don't end up
being retrieved, resulting in a low recall
- set numSeed to 10 to ensure 10 results are returned
- add startIndex parameter to createRandomAcceptOrds that allows
documents before startIndex to be NOT deleted
- use startIndex equal to 10 for createRandomAcceptOrds
Relates to #239
2021-09-06 06:56:15 -04:00
Jim Ferenczi
4df8d641ac
LUCENE-10081: KoreanTokenizer should check the max backtrace gap on whitespaces ( #272 )
...
This change ensures that we don't skip consecutive whitespaces without checking the maximum backtrace gap.
2021-09-06 08:46:39 +02:00
Mike McCandless
34f37d0d43
LUCENE-10035: move CHANGES.txt entry from 9.0 to 8.10
2021-09-03 10:21:28 -04:00
Adrien Grand
b3ce44cd0d
LUCENE-9620: Implement AssertingWeight#count.
2021-09-03 14:44:07 +02:00
Adrien Grand
4bb018e904
LUCENE-9620: Fix TestTermQuery failure.
2021-09-03 10:48:01 +02:00
Adrien Grand
de661d6535
LUCENE-9620: Address profiling test failures.
2021-09-03 10:48:01 +02:00
zacharymorn
d4e4fe22b1
Revert "LUCENE-9959: Add non thread local based API for term vector reader usage ( #180 )" ( #280 )
...
This reverts commit 180cfa241b.
2021-09-03 00:31:18 -07:00
Gautam Worah
44e9f5de53
LUCENE-9620 Add Weight#count(LeafReaderContext) ( #242 )
...
Add a default implementation in Weight.java and add sample faster
implementations in MatchAllDocsQuery, MatchNoDocsQuery, TermQuery
Add tests for BooleanQuery and TermQuery
Co-authored-by: Gautam Worah <gauworah@amazon.com>
Co-authored-by: Adrien Grand <jpountz@gmail.com>
2021-09-03 09:09:38 +02:00
Houston Putman
059d06cec7
Fix gpg key download in release wizard. ( #279 )
...
Old URL to check the apache id gpg key is no longer available.
2021-09-02 18:08:57 -04:00
Mayya Sharipova
54179e9372
LUCENE-10063 Correct BaseKnnVectorsFormatTestCase.testRandomWithUpdatesAndGraph ( #278 )
...
- Make sure that k > 0 for knn search
- Make sure that k doesn't exceed the number of live docs
Relates to #262
2021-09-02 16:23:31 -04:00
Adrien Grand
eb2509c846
LUCENE-10035: Fix CHANGES entry.
2021-09-02 18:37:04 +02:00
Robert Muir
b0611a14d0
LUCENE-10083: add CHANGES entry for Telugu analyzer
2021-09-02 12:20:34 -04:00
vinodrenu
544dbbea46
LUCENE-10083: Analyzer and stemmer for Telugu language ( #275 )
...
* initial version of Telugu analyzer
* made entries for factories and added few more terms in stemmer
* added two more terms
* added few more terms
* added long to short vowel conversion
* added test cases
* applied code formatting rules
* fixed unclosed p tag in javadoc
* spotlessApply removed the closing p tag
2021-09-02 12:00:13 -04:00
Gautam Worah
1036c708db
LUCENE-9476: Add getBulkPath API to DirectoryTaxonomyReader for faster ordinal -> FacetLabel lookup ( #179 )
...
Co-authored-by: Gautam Worah <gauworah@amazon.com>
2021-09-02 07:54:31 -04:00
zacharymorn
34232430f2
LUCENE-9662: fix test failure from merging away soft-deletes ( #276 )
2021-09-01 22:18:29 -07:00
Michael Sokolov
ee7a719dd8
LUCENE-10082: add detail to schema inconsistency error messages
2021-09-01 23:11:35 +00:00
Michael Sokolov
e3e54c95c9
LUCENE-10063: test fixes relating to SimpleTextKnnVectorsReader ( #273 )
2021-09-01 08:19:11 -04:00
zacharymorn
424192e170
LUCENE-9662: CheckIndex should be concurrent - parallelizing index check across segments ( #128 )
2021-08-31 19:24:14 -07:00
Michael Sokolov
9c7f0d45ee
LUCENE-10063: implement SimpleTextKnnVectorsReader.search
2021-08-31 13:55:13 -04:00
wuda
6ade29c71a
LUCENE-10035: Simple text codec add multi level skip list data ( #224 )
2021-08-30 15:27:42 +02:00
Dawid Weiss
e470535072
LUCENE-9654: Expressions module grammar antlr code regeneration ( #269 )
2021-08-27 12:47:19 +02:00
Greg Miller
3b3f9600c2
Fix a DrillSideways unit test I broke when adding more tests in LUCENE-10060 ( #268 )
2021-08-26 14:44:52 -07:00
Greg Miller
dbf7e1865f
LUCENE-10060: Ensure DrillSidewaysQuery instances never get cached ( #261 )
2021-08-26 06:06:54 -07:00
Adrien Grand
f1fdd2465c
LUCENE-9917: Smaller block sizes for BEST_SPEED. ( #257 )
...
This reduces the block size for BEST_SPEED in order to trade some compression
ratio in exchange for better retrieval speed.
2021-08-26 15:04:51 +02:00
Dawid Weiss
f6e3b08ae9
LUCENE-10072: Regenerate FST dictionaries after LUCENE-9047. ( #265 )
2021-08-26 11:31:16 +02:00
Dawid Weiss
39a2fc62d4
LUCENE-10066: Build does not work with JDK16 as gradle's runtime ( #259 )
2021-08-26 10:08:37 +02:00
Adrien Grand
2d7590a355
LUCENE-9613, LUCENE-10067: Further specialize ordinals. ( #260 )
2021-08-26 09:44:24 +02:00
David Smiley
8ac2673791
LUCENE-10003: No C style array declaration ( #206 )
...
Most cases of C-style array declarations have been switched. Google Java Format, which we adhere to, disallows C-style array declarations: https://google.github.io/styleguide/javaguide.html#s4.8.3-arrays
Some cases (esp. Snowball) can't be updated.
2021-08-25 17:06:41 -04:00
Michael McCandless
88588e3dea
LUCENE-10052: cutover more tests to newBytesRef, and finally catches a fly (FSTTermsReader.IntersectEnum was illegally ignoring BytesRef.offset, yay!) ( #258 )
2021-08-25 12:18:23 -04:00
Adrien Grand
8917fbe039
LUCENE-9613, LUCENE-10067: Add more specialization for the ordinals case.
2021-08-25 14:34:04 +02:00
Dawid Weiss
45868a52f1
LUCENE-9990: upgrade to gradle 7.2.
2021-08-25 10:04:42 +02:00
Dawid Weiss
0d07104de0
Piggyback spotless upgrade to 5.14.3
2021-08-25 10:03:59 +02:00
Dawid Weiss
a8d4f658de
Upgrade to gradle 7.2
2021-08-25 10:03:59 +02:00
Dawid Weiss
0cbafa4879
Fix gradle error hints.
2021-08-25 10:03:59 +02:00
Dawid Weiss
fdccdee734
Move logging to info level.
2021-08-25 10:03:59 +02:00
Dawid Weiss
26eb84a3b5
Fix immutable properties. Fix ant uri namespace no longer working (seems like gradle regression).
2021-08-25 10:03:59 +02:00
Dawid Weiss
2b0378cd4a
Use JavaInfo instead of toolchains. Internal but works and is free of toolchain's quirks.
2021-08-25 10:03:59 +02:00
Dawid Weiss
68cf86ba35
Experiments with the new apis.
2021-08-25 10:03:59 +02:00
Dawid Weiss
72f373791e
Upgrade palantir's plugin.
2021-08-25 10:03:59 +02:00
Dawid Weiss
3ff4263535
Upgrade gradle to 7.1.1
2021-08-25 10:03:59 +02:00
Dawid Weiss
523cea2c5d
Revert "Adding initial patch by Gautam Worah" (restore pristine main)
...
This reverts commit 067ab4f503aabea59639e692e3ea9ee30750c68e.
2021-08-25 10:03:59 +02:00
Dawid Weiss
bac22d6116
Adding initial patch by Gautam Worah
2021-08-25 10:03:59 +02:00
Mayya Sharipova
fc67d6aa6e
Revert "LUCENE-10054 Make HnswGraph hierarchical ( #250 )"
...
This reverts commit 257d256def.
We've decided to have a separate feature branch for HNSW,
and put all related changes there.
2021-08-24 14:58:59 -04:00
Julie Tibshirani
782c3cca3a
LUCENE-10040: Relax TestKnnVectorQuery#testDeletes assertion ( #251 )
...
TestKnnVectorQuery#testDeletes assumes that if there are n total documents, we
can perform a kNN search with k=n and retrieve all documents. This isn't true
with our implementation -- due to randomization we may select fewer than n entry
points and never visit some vectors.
2021-08-24 11:15:27 -07:00
Adrien Grand
83ba5d859c
LUCENE-7020: Remove TieredMergePolicy#setMaxMergeAtOnceExplicit. ( #230 )
...
TieredMergePolicy no longer bounds the number of segments that can be merged via
a forced merge.
2021-08-24 10:27:00 +02:00
Mayya Sharipova
257d256def
LUCENE-10054 Make HnswGraph hierarchical ( #250 )
...
Currently HNSW has only a single layer.
This is the first part to make it multi-layered.
To keep changes small, this PR only adds
multiple layers in the HnswGraph class.
TODO for following PRs:
- modify graph construction and search algorithm for a hierarchical
graph.
- modify Lucene90HnswVectorsWriter and Lucene90HnswVectorsReader to
write and read multiple layers
2021-08-23 15:54:26 -04:00
Greg Miller
46fa09d265
LUCENE-5309: Optimize facet counting for single-valued SSDV / StringValueFacetCounts ( #255 )
2021-08-23 10:01:23 -07:00