36042 Commits

Author SHA1 Message Date
Luca Cavanna
ea52a84c7e
Replace TopFieldCollector usages in tests with collector manager (#761)
This commit replaces some usages of TopFieldCollector in tests with a corresponding collector manager created through TopFieldCollector#createSharedManager
2022-04-05 10:03:04 +02:00
Adrien Grand
deb6170107 Fix CHANGES formatting. 2022-04-05 09:24:40 +02:00
xiaoping
898ec1659d
LUCENE-10456: Implement Weight#count for MultiRangeQuery (#731) 2022-04-05 09:23:59 +02:00
Adrien Grand
f249046a1d LUCENE-10484: Move CHANGES entry to 9.2. 2022-04-05 08:53:57 +02:00
Luca Cavanna
7ed0f3d7ad
LUCENE-10484: Add support for concurrent facets random sampling (#765)
This commit adds a new createManager static method to RandomSamplingFacetsCollector that allows users to perform random sampling concurrently. The returned collector manager is very similar to the existing FacetsCollectorManager but it exposes a specialized reduced RandomSamplingFacetsCollector.

This relates to [LUCENE-10002](https://issues.apache.org/jira/browse/LUCENE-10002). It allows users to use a collector manager instead of a collector when doing random sampling, in the effort of reducing usages of IndexSearcher#search(Query, Collector).
2022-04-05 08:51:57 +02:00
Luca Cavanna
e7f9f2c50d
LUCENE-10002: Replaces usages of FacetsCollector with FacetsCollectorManager (#764)
In the effort of decreasing usages of IndexSearcher#search(Query, Collector) by using the corresponding method that accepts a collector manager, this commit replaces many usages of FacetsCollector with its corresponding existing collector manager.
2022-04-05 08:51:16 +02:00
Julie Tibshirani
da95fb2ef7 LUCENE-10466: Move changelog entry to 9.2 2022-04-04 12:04:02 -07:00
Andriy Redko
737ce42c1c
LUCENE-10466: Ensure IndexSortSortedNumericDocValuesRangeQuery handles sort types besides LONG
IndexSortSortedNumericDocValuesRangeQuery unconditionally assumes the usage of
the LONG-encoded SortField. Using the numeric range query (in case of sorted
index) with anything but LONG ends up with class cast exception. Now the query
consults the numeric type of the `SortField` and performs appropriate checks.
2022-04-04 12:02:00 -07:00
Luca Cavanna
0a525ce2ab
LUCENE-10498: don't count num hits nor accumulate scores unless necessary (#782)
This commit introduces a no-op implementation of HitsThresholdChecker that does no counting, to be used when early termination is disabled. This is automatically used when creating a TopFieldCollector or a TopScoreDocCollector. In that same scenario MaxScoreAccumulator can be null and scores are no longer accumulated when creating a shared collector manager.

With this, it is safe to replace the custom collector managers in DrillSideways with the ones returned by calling createSharedManager.
2022-04-04 17:56:34 +02:00
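
The no-op checker idea described in LUCENE-10498 above can be illustrated with a minimal, self-contained sketch. This is not Lucene's code: the names `HitsChecker`, `CountingHitsChecker`, `NoOpHitsChecker` and `HitsCheckers.create` are hypothetical, and the sketch assumes that "early termination is disabled" is signalled by a threshold of `Integer.MAX_VALUE`.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical interface: tracks hits and reports when a threshold is passed. */
interface HitsChecker {
  void incrementHitCount();

  boolean isThresholdReached();
}

/** Counts every hit so that collection can stop early once the threshold is exceeded. */
final class CountingHitsChecker implements HitsChecker {
  private final int threshold;
  private final AtomicLong hits = new AtomicLong();

  CountingHitsChecker(int threshold) {
    this.threshold = threshold;
  }

  @Override
  public void incrementHitCount() {
    hits.incrementAndGet();
  }

  @Override
  public boolean isThresholdReached() {
    return hits.get() > threshold;
  }
}

/** Does no counting at all: the threshold can never be reached. */
final class NoOpHitsChecker implements HitsChecker {
  static final NoOpHitsChecker INSTANCE = new NoOpHitsChecker();

  private NoOpHitsChecker() {}

  @Override
  public void incrementHitCount() {
    // Intentionally empty: nothing to count when early termination is disabled.
  }

  @Override
  public boolean isThresholdReached() {
    return false;
  }
}

final class HitsCheckers {
  /** Assumption: Integer.MAX_VALUE means "never terminate early". */
  static HitsChecker create(int totalHitsThreshold) {
    return totalHitsThreshold == Integer.MAX_VALUE
        ? NoOpHitsChecker.INSTANCE
        : new CountingHitsChecker(totalHitsThreshold);
  }

  public static void main(String[] args) {
    HitsChecker checker = create(2);
    for (int i = 0; i < 3; i++) {
      checker.incrementHitCount();
    }
    System.out.println(checker.isThresholdReached());
  }
}
```

Selecting the implementation once, at construction time, keeps the per-hit path free of counter updates whenever the count is never consulted.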
Tomoko Uchida
459d361520
LUCENE-10184: mention of opening a Jira issue (#781) 2022-04-04 20:09:39 +09:00
Tomoko Uchida
41204b8f1b
Update link to contribution guide 2022-04-04 19:12:25 +09:00
Gautam Worah
fb79ee1549
Remove redundant index (#776)
Thanks @gautamworah96!
2022-04-01 14:27:09 +02:00
Baurzhan
69b040fc62
Implement method to bulk add all collection elements to a PriorityQueue (#770)
Implement method to add all collection elements to a PriorityQueue
Co-authored-by: Dawid Weiss <dawid.weiss@carrotsearch.com>
2022-03-30 19:28:01 +02:00
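
As an illustration of why a bulk-add method can be worthwhile, here is a minimal sketch of one common approach: append all elements first, then restore heap order in a single bottom-up pass, which costs O(n) instead of O(n log n) for repeated single inserts. The class `IntMinHeap` is hypothetical and unbounded; it is not Lucene's PriorityQueue and may differ from what the commit actually implements.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

/** Hypothetical minimal binary min-heap of ints, stored in a 0-based array. */
final class IntMinHeap {
  private int[] heap = new int[8];
  private int size = 0;

  /** Adds one element and restores heap order by sifting it up: O(log n). */
  void add(int value) {
    ensureCapacity(size + 1);
    heap[size] = value;
    siftUp(size);
    size++;
  }

  /** Appends every element first, then heapifies once: O(n) overall. */
  void addAll(Collection<Integer> values) {
    ensureCapacity(size + values.size());
    for (int v : values) {
      heap[size++] = v;
    }
    // Sift down every internal node, starting from the last parent.
    for (int i = size / 2 - 1; i >= 0; i--) {
      siftDown(i);
    }
  }

  /** Removes and returns the smallest element. */
  int pop() {
    if (size == 0) {
      throw new IllegalStateException("heap is empty");
    }
    int top = heap[0];
    heap[0] = heap[--size];
    siftDown(0);
    return top;
  }

  int size() {
    return size;
  }

  private void ensureCapacity(int needed) {
    if (needed > heap.length) {
      heap = Arrays.copyOf(heap, Math.max(needed, heap.length * 2));
    }
  }

  private void siftUp(int i) {
    while (i > 0) {
      int parent = (i - 1) / 2;
      if (heap[parent] <= heap[i]) {
        break;
      }
      swap(parent, i);
      i = parent;
    }
  }

  private void siftDown(int i) {
    while (true) {
      int left = 2 * i + 1;
      int right = left + 1;
      int smallest = i;
      if (left < size && heap[left] < heap[smallest]) smallest = left;
      if (right < size && heap[right] < heap[smallest]) smallest = right;
      if (smallest == i) {
        return;
      }
      swap(i, smallest);
      i = smallest;
    }
  }

  private void swap(int a, int b) {
    int tmp = heap[a];
    heap[a] = heap[b];
    heap[b] = tmp;
  }

  public static void main(String[] args) {
    IntMinHeap h = new IntMinHeap();
    h.add(5);
    h.addAll(List.of(9, 1, 7, 3));
    while (h.size() > 0) {
      System.out.println(h.pop()); // elements come out in ascending order
    }
  }
}
```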
Greg Miller
fb83387002
LUCENE-10491: Fix correctness bug in TaxonomyFacetSumValueSource score providing (#775) 2022-03-30 09:36:05 -07:00
Luca Cavanna
c4935d2fa3
Replace usages of search(Query, Collector) in CheckHits (#763)
This commit replaces usages of IndexSearcher#search(Query, Collector) with search(Query, CollectorManager) in CheckHits.
2022-03-30 17:55:05 +02:00
Luca Cavanna
66bbc95586
LUCENE-10002: Add FixedBitSetCollector and corresponding collector manager to test framework (#766)
Some tests collect matching docs in a FixedBitSet. In the effort of moving such tests to using IndexSearcher#search(Query, CollectorManager) as part of LUCENE-10002, this commit adds a new FixedBitSetCollector class that exposes this functionality as well as a createManager method that returns a corresponding CollectorManager.
2022-03-30 16:14:39 +02:00
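
The collector-plus-manager pattern this commit relies on can be sketched without Lucene. In the sketch below every name is a hypothetical stand-in (`BitSetCollector`, `BitSetCollectorManager`, `SliceSearchDemo`), `java.util.BitSet` replaces FixedBitSet, and plain `int[]` arrays stand in for index slices. The point is only the shape: one collector per slice, filled independently, then merged in a final reduce step.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical per-slice collector: records matching doc IDs in a bit set. */
final class BitSetCollector {
  private final BitSet bits;

  BitSetCollector(int maxDoc) {
    this.bits = new BitSet(maxDoc);
  }

  void collect(int docId) {
    bits.set(docId);
  }

  BitSet bits() {
    return bits;
  }
}

/** Hypothetical manager: creates one collector per slice and merges the results. */
final class BitSetCollectorManager {
  private final int maxDoc;

  BitSetCollectorManager(int maxDoc) {
    this.maxDoc = maxDoc;
  }

  BitSetCollector newCollector() {
    return new BitSetCollector(maxDoc);
  }

  BitSet reduce(Collection<BitSetCollector> collectors) {
    BitSet merged = new BitSet(maxDoc);
    for (BitSetCollector collector : collectors) {
      merged.or(collector.bits());
    }
    return merged;
  }
}

final class SliceSearchDemo {
  /** Simulates a concurrent search: each slice is collected independently. */
  static BitSet search(int[][] slices, BitSetCollectorManager manager) {
    List<BitSetCollector> collectors =
        Arrays.stream(slices)
            .parallel()
            .map(
                slice -> {
                  // No collector is shared between slices, so no locking is needed.
                  BitSetCollector collector = manager.newCollector();
                  for (int doc : slice) {
                    collector.collect(doc);
                  }
                  return collector;
                })
            .collect(Collectors.toList());
    return manager.reduce(collectors);
  }

  public static void main(String[] args) {
    int[][] slices = {{0, 3}, {5}, {8, 9}};
    System.out.println(search(slices, new BitSetCollectorManager(10)));
  }
}
```

Because each slice writes only to its own collector, the only shared step is the reduce, which runs after all slices have finished.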
Tomoko Uchida
2a3e5ca07f
LUCENE-10475: Merge o.a.l.a.[ja|ko].util into o.a.l.a.[ja|ko].dict (#772) 2022-03-29 21:09:26 +09:00
Tomoko Uchida
ac6c36d406
LUCENE-10184: add CONTRIBUTING.md; reorganize README. (#771) 2022-03-29 16:52:27 +09:00
Greg Miller
d438a0cde7 Add CHANGES entry for LUCENE-10325 2022-03-28 15:57:18 -07:00
Yuting Gan
7c33f04d37
LUCENE-10325: Add getTopDims functionality to Facets (#747) 2022-03-28 15:54:07 -07:00
Tomoko Uchida
0f93130d7b remove obsolete image/description from luke/README.md 2022-03-28 08:44:29 +09:00
Uwe Schindler
ff263f0aa4
Upgrade to forbiddenapis 3.3 (#768) 2022-03-26 17:09:42 +01:00
Tomoko Uchida
bd22f199de
LUCENE-10393: Unify binary dictionary and dictionary writer in kuromoji and nori (#740) 2022-03-25 18:44:36 +09:00
Mike Drob
b3906e96ea
LUCENE-9651 Update benchmark module docs (#759) 2022-03-23 14:51:28 -05:00
Lu Xugang
5450d72258
LUCENE-10458: BoundedDocSetIdIterator may supply a wrong count in Weight#count(LeafReaderContext) when missingValue is enabled (#736) 2022-03-23 15:54:52 +01:00
Mike Drob
1c6f631678
LUCENE-10481: FacetsCollector will not request scores if it does not use them (#760) 2022-03-23 09:44:02 -05:00
Christine Poerschke
779c332a8c
LUCENE-10477: mention 'call multiple times' in Query.rewrite javadoc (#758) 2022-03-22 15:39:59 +00:00
Adrien Grand
04127ed9fc Add back-compat indices for 9.1.0. 2022-03-22 16:10:10 +01:00
Adrien Grand
3105998ce6 Synchronize CHANGES. 2022-03-22 16:08:59 +01:00
Christine Poerschke
ca252d6621
LUCENE-10464, LUCENE-10477: WeightedSpanTermExtractor.extractWeightedSpanTerms to rewrite sufficiently (#737) 2022-03-22 14:53:41 +00:00
Adrien Grand
28d3adcf69 Add version 9.1.0. 2022-03-22 15:43:27 +01:00
Adrien Grand
0a3bad5985 DOAP changes for release 9.1.0 2022-03-22 15:22:27 +01:00
Alan Woodward
42bf77229e LUCENE-10422: Make errorprone happy 2022-03-22 09:18:27 +00:00
Tomoko Uchida
fa61953afd
LUCENE-10478: mark Test4GBStoredFields as @Monster (#757) 2022-03-22 17:58:05 +09:00
mogui
be99178956
LUCENE-10422: Read-only monitor implementation (#679)
This commit adds a read-only monitor implementation that can
search the QueryIndex of another monitor without supporting adding
new queries.
2022-03-21 16:42:03 +00:00
Adrien Grand
f239c0e03c
LUCENE-10473: Make tests a bit faster when running nightly. (#754) 2022-03-21 10:37:57 +01:00
Julie Tibshirani
a4b30b4cf4 LUCENE-9905: Fix check in TestPerFieldKnnVectorsFormat#testMergeUsesNewFormat
Before the assertion checked if two sets were equal, which resulted in rare
failures. Now we use 'contains' from hamcrest matchers.
2022-03-18 15:38:30 -07:00
Julie Tibshirani
18f9d31608 LUCENE-9614: Fix rare TestKnnVectorQuery failures
Some of our checks relied on doc IDs corresponding to the order in which docs
were passed to IndexWriter. This is fragile and sometimes resulted in failures.
Now we check against an "id" field instead.
2022-03-18 14:52:00 -07:00
Luca Cavanna
bb7568d865
LUCENE-10472: Fix TestMatchAllDocsQuery#testEarlyTermination (#753)
As part of #716 I moved the test to use a collector manager, but I forgot to update one of the assertions.
We can't rely on totalHits being accurate when the search is executed by multiple threads and terminated early.
2022-03-18 18:49:20 +01:00
Adrien Grand
1dcb64b492 LUCENE-10418: Move CHANGES to the correct section. 2022-03-17 16:44:01 +01:00
Adrien Grand
8fb6543280
LUCENE-10418: Optimize Query#rewrite in the non-scoring case. (#672) 2022-03-17 16:41:55 +01:00
Adrien Grand
86bd921fce
LUCENE-10469: Fix score mode propagation in ConstantScoreQuery. (#750) 2022-03-16 13:16:33 +01:00
Peter Gromov
0e3c315b76 LUCENE-10452, LUCENE-10451: mention hunspell changes in CHANGES.txt 2022-03-16 09:18:58 +01:00
Peter Gromov
af97c5ef37
LUCENE-10452: Hunspell: call checkCanceled less frequently to reduce the overhead (#723) 2022-03-16 09:04:08 +01:00
Julie Tibshirani
6b7953b8ce Add 9.2.0 section to release notes 2022-03-15 11:26:20 -07:00
Peter Gromov
92a20c24e9
LUCENE-10451 Hunspell: don't perform potentially expensive spellchecking after timeout (#721)
move all expensive operations closer to the suggestion creation, encapsulate case and output conversion into a new Suggestion class
2022-03-15 18:43:56 +01:00
Tomoko Uchida
b6c1024f55
LUCENE-10463: increment java version to 17 in smoke tester (#748) 2022-03-15 19:54:54 +09:00
Dawid Weiss
25c4310bd5
LUCENE-10461: fix windows launch script for luke so that it works with integration tests AND actual command line. Cmd escaping rules and start command line are absolutely insane. (#743) 2022-03-12 19:39:31 +09:00
Dawid Weiss
9e9c457f80
LUCENE-10459: Update smoke tester for 9.1 (#744)
Add demo dependencies to third party modules. Add an IT that checks whether
demo classes are loadable.

Co-authored-by: Tomoko Uchida <tomoko.uchida.1111@gmail.com>
Co-authored-by: Julie Tibshirani <julietibs@apache.org>
2022-03-11 10:22:17 -08:00
Dawid Weiss
e999056c19 LUCENE-10311: avoid division by zero on small sets. 2022-03-09 11:41:01 +01:00