Commit Graph

35913 Commits

Author SHA1 Message Date
Michael Gibney cd16b0a9dc
add new module-info provides entry for new CharFilter 2022-04-07 15:23:23 -04:00
Michael Gibney 6799adad2c
change ctor access to package-private 2022-04-07 15:22:12 -04:00
Michael Gibney 90efa1ef7f
update test package names 2022-04-07 15:19:31 -04:00
Michael Gibney aa5c355e2d
Merge remote-tracking branch 'apache-lucene/main' into LUCENE-8972 2022-04-07 14:10:36 -04:00
Tomoko Uchida 9aa8ec9d06
LUCENE-10493: Unify TokenInfoFST in kuromoji and nori (#795) 2022-04-07 21:29:44 +09:00
Tomoko Uchida 4d2b08554a
LUCENE-10493: add 'backWordPos' array to JapaneseTokenizer.Position (#793) 2022-04-07 21:29:07 +09:00
zacharymorn 94fe7e314f
LUCENE-10436: Remove deprecated DocValuesFieldExistsQuery, NormsFieldExistsQuery and KnnVectorFieldExistsQuery (#790) 2022-04-07 00:53:29 -07:00
Greg Miller f870edf2fe
LUCENE-10444: Support alternate aggregation functions in association facets (#718) 2022-04-06 14:51:06 -07:00
Julie Tibshirani 9eeef080e5
Add release wizard step around build failures (#789)
This PR adds a preparation step to look at builds@lucene.apache.org and address
recurring failures. This helps make sure we catch and fix known bugs before
spinning the release candidate. It also prevents flaky tests from failing
during the release vote (which adds confusion).
2022-04-06 14:03:52 -07:00
Luca Cavanna 1cf1b301af
LUCENE-10002: replace more usages of search(Query, Collector) in tests (#787)
This commit replaces more usages of search(Query, Collector) with calls to the corresponding search(Query, CollectorManager). This round focuses on tests that implement a custom collector and therefore need a corresponding collector manager.
2022-04-06 11:06:10 +02:00
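
As a rough illustration of the collector manager pattern that the commit above moves tests toward, the sketch below uses hypothetical stand-in types (`SketchCollectorManager`, `CountingCollector`, `CollectorManagerSketch`) rather than Lucene's own classes. It assumes each slice of matching documents gets its own collector and that the manager merges the per-slice results at the end.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for a collector manager: creates one collector per
// slice and merges their results. This is not Lucene's actual interface.
interface SketchCollectorManager<C, T> {
    C newCollector();

    T reduce(Collection<C> collectors);
}

// Hypothetical per-slice collector that only counts the hits it sees.
final class CountingCollector {
    private int count;

    void collect(int docId) {
        count++;
    }

    int count() {
        return count;
    }
}

final class CollectorManagerSketch {
    // Simulates a search: every slice is collected independently (so slices
    // could be handled by different threads), then the manager reduces the
    // per-slice collectors into a single result.
    static int countHits(List<int[]> slices) {
        SketchCollectorManager<CountingCollector, Integer> manager =
            new SketchCollectorManager<CountingCollector, Integer>() {
                @Override
                public CountingCollector newCollector() {
                    return new CountingCollector();
                }

                @Override
                public Integer reduce(Collection<CountingCollector> collectors) {
                    int total = 0;
                    for (CountingCollector collector : collectors) {
                        total += collector.count();
                    }
                    return total;
                }
            };

        List<CountingCollector> collectors = new ArrayList<>();
        for (int[] slice : slices) {
            CountingCollector collector = manager.newCollector();
            for (int docId : slice) {
                collector.collect(docId);
            }
            collectors.add(collector);
        }
        return manager.reduce(collectors);
    }
}
```

Because no collector is shared between slices, the only place where results meet is `reduce`, which is what makes the pattern suitable for concurrent search.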
Luca Cavanna 74e9716aec
LUCENE-10002: move MemoryIndex to search(Query, CollectorManager) (#785) 2022-04-06 11:02:25 +02:00
zacharymorn 91e29405d8
LUCENE-10436: Deprecate DocValuesFieldExistsQuery, NormsFieldExistsQuery and KnnVectorFieldExistsQuery with FieldExistsQuery (#767) 2022-04-05 23:07:20 -07:00
Luca Cavanna 796a19b457
LUCENE-10500: StringValueFacetCounts to not rely on sequential collection (#788)
StringValueFacetCounts should use the segment ordinal instead of the current index when looping through the matching hits, as when search is multi-threaded the order of the matching hits (one per segment) is not deterministic.
2022-04-05 22:42:06 +02:00
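
The idea behind the fix above can be sketched as follows. This is a simplified model, not the actual StringValueFacetCounts code: `SegmentHits` and `SegmentOrdinalSketch` are hypothetical names, and the sketch assumes each batch of hits carries the ordinal of the segment it came from.

```java
import java.util.List;

// Hypothetical holder for the hits collected in one segment.
final class SegmentHits {
    final int segmentOrd; // position of the segment within the index reader
    final int[] docIds;   // matching documents in that segment

    SegmentHits(int segmentOrd, int[] docIds) {
        this.segmentOrd = segmentOrd;
        this.docIds = docIds;
    }
}

final class SegmentOrdinalSketch {
    // Counts hits per segment. With multi-threaded search the batches may
    // arrive in any order, so the loop position says nothing about which
    // segment a batch belongs to; the segment ordinal does.
    static int[] countPerSegment(List<SegmentHits> hits, int numSegments) {
        int[] counts = new int[numSegments];
        for (SegmentHits segmentHits : hits) {
            counts[segmentHits.segmentOrd] += segmentHits.docIds.length;
        }
        return counts;
    }
}
```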
Quentin Pradet 6062ba0b3b
LUCENE-10085: Fix rare failure in TestDocValuesFieldExistsQuery (#784)
In rare cases, this test could delete all documents and cause a failure.
2022-04-05 10:30:01 -07:00
Greg Miller a071180a80 Add CHANGES entry for LUCENE-10467 2022-04-05 09:32:07 -07:00
Yuting Gan 6b82e600a8
LUCENE-10467: Throws IllegalArgumentException for getAllDims and getTopChildren if topN <= 0 (#751) 2022-04-05 09:28:59 -07:00
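
A minimal sketch of the kind of argument check the commit above describes. The helper name `validateTopN` and the message text are assumptions made for illustration; they are not taken from the Lucene sources.

```java
final class TopNValidationSketch {
    // Rejects non-positive values up front so callers fail fast instead of
    // receiving an empty or misleading result. Hypothetical helper.
    static int validateTopN(int topN) {
        if (topN <= 0) {
            throw new IllegalArgumentException("topN must be > 0 (got: " + topN + ")");
        }
        return topN;
    }
}
```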
Tomoko Uchida bb4a0dc19b
LUCENE-10497: Add a base Token class to analysis-common (for kuromoji and nori) (#783) 2022-04-05 20:20:38 +09:00
Luca Cavanna ea52a84c7e
Replace TopFieldCollector usages in tests with collector manager (#761)
This commit replaces some usages of TopFieldCollector in tests with a corresponding collector manager created through TopFieldCollector#createSharedManager
2022-04-05 10:03:04 +02:00
Adrien Grand deb6170107 Fix CHANGES formatting. 2022-04-05 09:24:40 +02:00
xiaoping 898ec1659d
LUCENE-10456: Implement Weight#count for MultiRangeQuery (#731) 2022-04-05 09:23:59 +02:00
Adrien Grand f249046a1d LUCENE-10484: Move CHANGES entry to 9.2. 2022-04-05 08:53:57 +02:00
Luca Cavanna 7ed0f3d7ad
LUCENE-10484: Add support for concurrent facets random sampling (#765)
This commit adds a new createManager static method to RandomSamplingFacetsCollector that allows users to perform random sampling concurrently. The returned collector manager is very similar to the existing FacetsCollectorManager but it exposes a specialized reduced RandomSamplingFacetsCollector.

This relates to [LUCENE-10002](https://issues.apache.org/jira/browse/LUCENE-10002). It allows users to use a collector manager instead of a collector when doing random sampling, in the effort of reducing usages of IndexSearcher#search(Query, Collector).
2022-04-05 08:51:57 +02:00
Luca Cavanna e7f9f2c50d
LUCENE-10002: Replaces usages of FacetsCollector with FacetsCollectorManager (#764)
In an effort to reduce usages of IndexSearcher#search(Query, Collector) in favor of the corresponding method that accepts a collector manager, this commit replaces many usages of FacetsCollector with its existing collector manager.
2022-04-05 08:51:16 +02:00
Julie Tibshirani da95fb2ef7 LUCENE-10466: Move changelog entry to 9.2 2022-04-04 12:04:02 -07:00
Andriy Redko 737ce42c1c
LUCENE-10466: Ensure IndexSortSortedNumericDocValuesRangeQuery handles sort types besides LONG
IndexSortSortedNumericDocValuesRangeQuery unconditionally assumes the usage of
the LONG-encoded SortField. Using the numeric range query (in case of sorted
index) with anything but LONG results in a class cast exception. Now the query
consults the numeric type of the `SortField` and performs the appropriate checks.
2022-04-04 12:02:00 -07:00
Luca Cavanna 0a525ce2ab
LUCENE-10498: don't count num hits nor accumulate scores unless necessary (#782)
This commit introduces a no-op implementation of HitsThresholdChecker that does no counting, to be used when early termination is disabled. This is automatically used when creating a TopFieldCollector or a TopScoreDocCollector. In that same scenario MaxScoreAccumulator can be null and scores are no longer accumulated when creating a shared collector manager.

With this, it is safe to replace the custom collector managers in DrillSideways with the ones returned by calling createSharedManager.
2022-04-04 17:56:34 +02:00
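
The no-op approach described in the commit above might look roughly like this. All type names here (`SketchThresholdChecker`, `CountingThresholdChecker`, `NoOpThresholdChecker`, `ThresholdCheckerSketch`) are hypothetical stand-ins, and the method names are assumptions; the sketch only shows how a counting implementation can be swapped for one that does no work when early termination is disabled.

```java
// Hypothetical stand-in for a hits threshold checker; not Lucene's class.
interface SketchThresholdChecker {
    void incrementHitCount();

    boolean isThresholdReached();
}

// Counts every hit and reports when the count exceeds the threshold.
final class CountingThresholdChecker implements SketchThresholdChecker {
    private final int threshold;
    private int hits;

    CountingThresholdChecker(int threshold) {
        this.threshold = threshold;
    }

    @Override
    public void incrementHitCount() {
        hits++;
    }

    @Override
    public boolean isThresholdReached() {
        return hits > threshold;
    }
}

// Does no counting at all: the threshold is never reported as reached.
final class NoOpThresholdChecker implements SketchThresholdChecker {
    @Override
    public void incrementHitCount() {
        // intentionally empty
    }

    @Override
    public boolean isThresholdReached() {
        return false;
    }
}

final class ThresholdCheckerSketch {
    // Picks the no-op checker when early termination is disabled, so no
    // per-hit bookkeeping is paid for in that case.
    static SketchThresholdChecker create(boolean earlyTerminationEnabled, int threshold) {
        return earlyTerminationEnabled
            ? new CountingThresholdChecker(threshold)
            : new NoOpThresholdChecker();
    }
}
```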
Tomoko Uchida 459d361520
LUCENE-10184: mention of opening a Jira issue (#781) 2022-04-04 20:09:39 +09:00
Tomoko Uchida 41204b8f1b
Update link to contribution guide 2022-04-04 19:12:25 +09:00
Gautam Worah fb79ee1549
Remove redundant index (#776)
Thanks @gautamworah96!
2022-04-01 14:27:09 +02:00
Baurzhan 69b040fc62
Implement method to bulk add all collection elements to a PriorityQueue (#770)
Implement method to add all collection elements to a PriorityQueue
Co-authored-by: Dawid Weiss <dawid.weiss@carrotsearch.com>
2022-03-30 19:28:01 +02:00
Greg Miller fb83387002
LUCENE-10491: Fix correctness bug in TaxonomyFacetSumValueSource score providing (#775) 2022-03-30 09:36:05 -07:00
Luca Cavanna c4935d2fa3
Replace usages of search(Query, Collector) in CheckHits (#763)
This commit replaces usages of IndexSearcher#search(Query, Collector) with search(Query, CollectorManager) in CheckHits.
2022-03-30 17:55:05 +02:00
Luca Cavanna 66bbc95586
LUCENE-10002: Add FixedBitSetCollector and corresponding collector manager to test framework (#766)
Some tests collect matching docs in a FixedBitSet. In the effort of moving such tests to using IndexSearcher#search(Query, CollectorManager) as part of LUCENE-10002, this commit adds a new FixedBitSetCollector class that exposes this functionality as well as a createManager method that returns a corresponding CollectorManager.
2022-03-30 16:14:39 +02:00
Tomoko Uchida 2a3e5ca07f
LUCENE-10475: Merge o.a.l.a.[ja|ko].util into o.a.l.a.[ja|ko].dict (#772) 2022-03-29 21:09:26 +09:00
Tomoko Uchida ac6c36d406
LUCENE-10184: add CONTRIBUTING.md; reorganize README. (#771) 2022-03-29 16:52:27 +09:00
Greg Miller d438a0cde7 Add CHANGES entry for LUCENE-10325 2022-03-28 15:57:18 -07:00
Yuting Gan 7c33f04d37
LUCENE-10325: Add getTopDims functionality to Facets (#747) 2022-03-28 15:54:07 -07:00
Tomoko Uchida 0f93130d7b remove obsolete image/description from luke/README.md 2022-03-28 08:44:29 +09:00
Uwe Schindler ff263f0aa4
Upgrade to forbiddenapis 3.3 (#768) 2022-03-26 17:09:42 +01:00
Tomoko Uchida bd22f199de
LUCENE-10393: Unify binary dictionary and dictionary writer in kuromoji and nori (#740) 2022-03-25 18:44:36 +09:00
Mike Drob b3906e96ea
LUCENE-9651 Update benchmark module docs (#759) 2022-03-23 14:51:28 -05:00
Lu Xugang 5450d72258
LUCENE-10458: BoundedDocSetIdIterator may supply a wrong count in Weight#count(LeafReaderContext) when missingValue is enabled (#736) 2022-03-23 15:54:52 +01:00
Mike Drob 1c6f631678
LUCENE-10481: FacetsCollector will not request scores if it does not use them (#760) 2022-03-23 09:44:02 -05:00
Christine Poerschke 779c332a8c
LUCENE-10477: mention 'call multiple times' in Query.rewrite javadoc (#758) 2022-03-22 15:39:59 +00:00
Adrien Grand 04127ed9fc Add back-compat indices for 9.1.0. 2022-03-22 16:10:10 +01:00
Adrien Grand 3105998ce6 Synchronize CHANGES. 2022-03-22 16:08:59 +01:00
Christine Poerschke ca252d6621
LUCENE-10464, LUCENE-10477: WeightedSpanTermExtractor.extractWeightedSpanTerms to rewrite sufficiently (#737) 2022-03-22 14:53:41 +00:00
Adrien Grand 28d3adcf69 Add version 9.1.0. 2022-03-22 15:43:27 +01:00
Adrien Grand 0a3bad5985 DOAP changes for release 9.1.0 2022-03-22 15:22:27 +01:00
Alan Woodward 42bf77229e LUCENE-10422: Make errorprone happy 2022-03-22 09:18:27 +00:00