lucene

Commit Graph

Author	SHA1	Message	Date
Adrien Grand	803d131fd0	LUCENE-9535: Try to do larger flushes. DWPTPool currently always returns the last DWPT that was added to the pool. By returning the largest DWPT instead, we could try to do larger flushes by finishing DWPTs that are close to being full instead of the last one that was added to the pool, which might be close to being empty. When indexing wikimediumall, this change did not seem to improve the indexing rate significantly, but it didn't slow things down either and the number of flushes went from 224-226 to 216, about 4% less. My expectation is that our nightly benchmarks are a best-case scenario for DWPTPool as the same number of threads is dedicated to indexing over time, but in the case when you have e.g. a single fixed threadpool that is responsible for indexing into several indices, the number of indexing threads that contribute to a given index might greatly vary over time.	2021-06-16 10:26:45 +02:00
kkewwei	b7b834b756	LUCENE-9998: delete useless param fis in StoredFieldsWriter.finish() and TermVectorsWriter.finish() (#183 )	2021-06-15 16:59:42 +02:00
Nhat Nguyen	6f5a413ec6	LUCENE-9935: Clone term vectors reader for merges (#182 ) The newly added assertion in the bulk-merge logic doesn't always hold because we do not create a new instance of Lucene90CompressingTermVectorsReader for merges and that reader can be accessed in tests (as long as it happens on the same thread). This change clones a new term vectors reader for merges.	2021-06-15 07:10:30 -04:00
Nhat Nguyen	50607e0fb9	LUCENE-9935: Enable bulk-merge for term vectors with index sort (#140 ) This change enables bulk-merge for term vectors with index sort. The algorithm used here is similar to the one that is used to merge stored fields. Relates #134	2021-06-14 11:39:38 -04:00
Dawid Weiss	3bedc0871e	LUCENE-9977: rat task corrections (proper up-to-date checks, cleanup and rewrite of the task itself). (#178 )	2021-06-11 09:26:34 +02:00
Nhat Nguyen	69ab1447a7	Revert "LUCENE-9935: Enable bulk-merge for term vectors with index sort (#140 )" This reverts commit `54fb21e862`.	2021-06-10 11:54:11 -04:00
Nhat Nguyen	54fb21e862	LUCENE-9935: Enable bulk-merge for term vectors with index sort (#140 ) This change enables bulk-merge for term vectors with index sort. The algorithm used here is similar to the one that is used to merge stored fields. Relates #134	2021-06-10 11:03:17 -04:00
Jack Conradson	40f66a450a	LUCENE-9965: Add tooling to introspect query execution time (#144 ) This change adds new IndexSearcher and Collector implementations to profile search execution and break down the timings. The breakdown includes the total time spent in each of the following categories along with the number of times visited: create weight, build scorer, next doc, advance, score, match. Co-authored-by: Julie Tibshirani <julietibs@gmail.com>	2021-06-09 13:25:15 -07:00
Adrien Grand	f5e050bd00	LUCENE-9992: Update expectations about vectors with no values.	2021-06-09 18:59:14 +02:00
Michael Sokolov	465cb17d2b	LUCENE-9992: write empty vector fields when merging (#172 )	2021-06-09 07:56:50 -04:00
Dawid Weiss	332405e7ad	LUCENE-9995: JDK17 generates wbr tags which make javadocs checker angry.	2021-06-09 10:45:01 +02:00
zacharymorn	8bcaf87a83	LUCENE-9976: Fix WANDScorer assertion error (#171 ) LUCENE-9976: Fix WANDScorer assertion error as (tailMaxScore >= minCompetitiveScore) && (tailSize < minShouldMatch) are valid now	2021-06-09 00:11:10 -07:00
Julie Tibshirani	d22af75686	Fix random failures in TestPerFieldVectorFormat#testMergeUsesNewFormat	2021-06-08 14:26:52 -07:00
Julie Tibshirani	300589433f	Move some 9.0 changelog items to 8.x These were backported so should appear in the later sections. This commit also fixes some small typos.	2021-06-08 09:11:28 -07:00
Julie Tibshirani	e9339253f5	LUCENE-9905: Make sure to use configured vector format when merging (#176 ) Before when creating a VectorWriter for merging, we would always load the default implementation. So if the format was configured with parameters, they were ignored. This issue was caught by `TestKnnGraph#testMergeProducesSameGraph`.	2021-06-08 08:07:35 -07:00
Christine Poerschke	1ec2a715a2	Fix 8.9.0 < 8.10.0 comparison in smokeTestRelease.py script. (#2509 )	2021-06-08 15:54:57 +01:00
Julie Tibshirani	84499732c1	Mute TestKnnGraph#testMergeProducesSameGraph while we prepare a fix	2021-06-07 16:50:46 -07:00
Julie Tibshirani	05ae738fc9	LUCENE-9905: Move HNSW build parameters to codec (#166 ) Previously, the max connections and beam width parameters could be configured as field type attributes. This PR moves them to be parameters on Lucene90HnswVectorFormat, to avoid exposing details of the vector format implementation in the API.	2021-06-07 12:51:59 -07:00
Alan Woodward	dbb4c265d5	LUCENE-8143: Remove no-op SpanBoostQuery (#155 ) Boosts are ignored on inner span queries, and top-level boosts can be applied by using a normal BoostQuery, so SpanBoostQuery itself is redundant and trappy. This commit removes it entirely.	2021-06-07 15:56:16 +01:00
Greg Miller	428d2d99d7	Fix typo in CHANGES.txt (#169 )	2021-06-05 07:14:01 -07:00
Greg Miller	4404b19142	LUCENE-9991: Address bug in TestStringValueFacetCounts (#168 )	2021-06-04 14:40:07 -07:00
Jan Høydahl	d47b75395c	LUCENE-9985 Upgrade Jetty to 9.4.41 (#165 )	2021-06-04 09:41:35 +02:00
Greg Miller	7a7003c51c	LUCENE-9988: Fix DrillSideways bug discovered in randomized testing (#167 )	2021-06-03 15:03:09 -07:00
Chris Hostetter	efb7b2a5e8	LUCENE-9970: Add TooManyNestedClauses extends TooManyClauses so that IndexSearcher.rewrite can distinguish how maxClauseCount is exceeded. This is an extension of the work done in LUCENE-8811, which added the two types of checks.	2021-06-03 12:46:53 -07:00
Naoto MINAMI	89034ad8cf	LUCENE-9823: Prevent unsafe rewrites for SynonymQuery and CombinedFieldQuery. (#160 ) Before, rewriting could slightly change the scoring when weights were specified. We now rewrite less aggressively to avoid changing the query's behavior.	2021-06-02 17:28:51 -07:00
Julie Tibshirani	eecd1971fa	LUCENE-9905: Allow Lucene90Codec to be configured with a per-field vector format (#164 ) Previously only AssertingCodec could handle a per-field vector format. This PR also strengthens the checks in TestPerFieldVectorFormat.	2021-06-02 08:43:54 -07:00
Greg Miller	8b60641bca	LUCENE-9944: Allow DrillSideways users to pass a CollectorManager without requiring an ExecutorService (and concurrent DrillSideways approach). (#142 )	2021-06-02 06:27:48 -07:00
Greg Miller	3c7a76a148	LUCENE-9962: Allow DrillSideways sub-classes to provide their own "drill down" facet counting implementation (or null). (#143 )	2021-06-01 12:25:34 -07:00
Mike McCandless	c4cf7aa3e1	LUCENE-9981: more efficient getCommonSuffix/Prefix, and more accurate 'effort limit', instead of precise output state limit, during determinize, for throwing TooComplexToDeterminizeException	2021-06-01 13:58:47 -04:00
Gautam Worah	27b009c5d0	LUCENE-9956: Make getBaseQuery, getDrillDownQueries API from DrillDownQuery public (#138 ) Co-authored-by: Gautam Worah <gauworah@amazon.com>	2021-06-01 09:54:18 -07:00
Nhat Nguyen	c46bcf75cc	LUCENE-9980: Do not expose deleted commits (#158 ) If we fail to delete files that belong to a commit point, then we will expose that deleted commit in the next calls of IndexDeletionPolicy#onCommit. I think we should never expose those deleted commit points as some of their files might have been deleted already.	2021-05-31 11:03:48 -04:00
Greg Miller	d76dd6454e	Add CHANGES.txt entry for LUCENE-9971 (#161 )	2021-05-31 06:24:56 -07:00
Alexander Lukyanchikov	65842c5c4d	LUCENE-9971: Inconsistent SSDVFF and Taxonomy facet behavior in case of unseen dimension (#149 )	2021-05-31 05:58:30 -07:00
Greg Miller	d669ddebc5	LUCENE-9946: Support multi-value fields in range facet counting (#127 )	2021-05-30 19:46:11 -07:00
Jan Høydahl	5fdff6eabb	LUCENE-9589 Swedish Minimal Stemmer (#136 )	2021-05-28 14:20:11 +02:00
Dawid Weiss	0a316b2495	LUCENE-9975: don't require signing of 'unsignedJars' publication (maven artifacts published to the user's maven local repository, build folder and apache nexus). (#156 )	2021-05-28 11:51:28 +02:00
Tomoko Uchida	2160d7239d	Revert "LUCENE-9448: clean up unused start scripts for luke." This reverts commit `16104090fb`.	2021-05-27 19:22:29 +09:00
Alan Woodward	1e7d8146ff	LUCENE-9454: Remove version field on Analyzer (#154 ) Version switching on Analyzer behaviour should be implemented in the various component factories, rather than on a mutable setting on Analyzer itself.	2021-05-26 17:34:01 +01:00
Tomoko Uchida	16104090fb	LUCENE-9448: clean up unused start scripts for luke.	2021-05-26 23:32:52 +09:00
Alan Woodward	4464cd87cc	LUCENE-9204: Move SpanQuery and subclasses to the queries module (#152 )	2021-05-26 10:12:14 +01:00
Dawid Weiss	5912e65434	LUCENE-9974: The test-framework module should apply the test ruleset for forbidden APIs. (#153 )	2021-05-26 10:19:55 +02:00
Alan Woodward	93844d3846	LUCENE-9204: Move helper methods from TestMatchesIterator into a base class (#151 ) TestMatchesIterator lives in core/tests and does various sanity checks on the matches returned by various queries, including Span queries. The Span-specific tests cannot stay here once Spans have been moved out of core. This commit pulls various helper methods from this class into a base class in the test framework, so that we can move the Spans tests into their own class and keep coverage once things have been migrated.	2021-05-25 14:16:05 +01:00
Alan Woodward	4b55ae5de4	LUCENE-9204: Remove Spans references from DisiWrapper (#150 ) We have a number of helper classes in o.a.l.search that aid the implementation of two-phase iteration over disjunctions. These have some Spans-specific code, which will stop compiling once Spans are moved into the queries module. This commit removes the Spans references from the main code and duplicates the helper code within the Spans package.	2021-05-25 14:14:47 +01:00
Alan Woodward	5e0e7a5479	LUCENE-9204: Make ConjunctionDISI package-private and add ConjunctionUtils factory class (#148 ) ConjunctionDISI is really an internal implementation of DocIdSetIterator, and would ideally be package-private. However, it is used in a few other places: directly in ConjunctionSpans, and as a utility in the facet and join modules. This commit adds a public helper class ConjunctionUtils that allows easy intersection of iterators for use by other modules. This means that ConjunctionDISI itself can become package-private. It also removes a reference to Spans from core classes, which will make it easier to migrate Spans to the queries module. ConjunctionSpans and ConjunctionIntervalIterator now use the public Utils class, and intervals no longer need their own ConjunctionDISI implementation.	2021-05-25 12:07:20 +01:00
Mike McCandless	654e978190	LUCENE-9967: don't throw NullPointerException while handling a different root-cause exception in ReplicaNode.start	2021-05-24 10:51:26 -04:00
Dawid Weiss	f7fbb9eda5	Add a small clarification about the required Java version for gradle.	2021-05-24 09:59:54 +02:00
Nhat Nguyen	a12260eb95	LUCENE-9827: Update backward codec in Lucene 9.0 (#147 ) We need to update the reading logic of the backward codec in Lucene 9 for LUCENE-9827 and LUCENE-9935 as we have backported them to Lucene 8. Relates apache/lucene-solr#2495 Relates apache/lucene-solr#2494	2021-05-20 08:49:43 -04:00
Houston Putman	f919672647	LUCENE-9936: Add Gpg Signing help info to gradle help command	2021-05-19 10:43:31 -05:00
Greg Miller	693b6d3e34	move changes entry for backport to 8.9 (#145 ) Co-authored-by: Greg Miller <gmiller@amazon.com>	2021-05-19 07:04:23 -04:00
Greg Miller	65820e5170	LUCENE-9953: Make FacetResult#value accurate for LongValueFacetCounts multi-value doc cases (#131 ) Co-authored-by: Greg Miller <gmiller@amazon.com>	2021-05-18 12:37:53 -04:00

35255 Commits
