lucene

Commit Graph

Author	SHA1	Message	Date
Jerry Chin	04ef6de826	GITHUB-12291: Skip blank lines from stopwords list. (#12299)	2023-05-18 16:58:32 +02:00
Michael Sokolov	6b51cce0b8	NeighborQueue.reset() now clears incomplete flag	2023-05-18 10:23:22 -04:00
Greg Miller	3e4ca4042c	Minor cleanup and improvements to DaciukMihovAutomatonBuilder (#12305)	2023-05-18 07:01:19 -07:00
Michael Sokolov	2facb3ae0e	Revert "allocate one NeighborQueue per search for results (#12255)" This reverts commit `9a7efe92c0`.	2023-05-18 13:42:17 +00:00
Petr Portnov | PROgrm_JARvis	0c6e8aec67	Seal `IndexReader` and `IndexReaderContext` (#12296)	2023-05-17 08:47:47 +02:00
tang donghai	f53eb28af0	remove max recursion from Operations.java to AutomatonTestUtil.java (#12298) Co-authored-by: tangdonghai <tangdonghai@meituan.com>	2023-05-16 07:09:28 -04:00
Patrick Zhai	8af305892d	Optimize HNSW diversity calculation (#12235)	2023-05-15 23:20:31 -07:00
tang donghai	0e172b0723	Update Javadoc for topoSortStates method after #12286 (#12292)	2023-05-15 18:06:01 +02:00
tang donghai	5d203f8337	toposort use iterator to avoid stackoverflow (#12286) Co-authored-by: tangdonghai <tangdonghai@meituan.com>	2023-05-15 16:20:15 +02:00
Luca Cavanna	223e28ef16	Simplify SliceExecutor and QueueSizeBasedExecutor (#12285) The only behaviour that QueueSizeBasedExecutor overrides from SliceExecutor is when to execute on the caller thread. There is no need to override the whole invokeAll method for that. Instead, this commit introduces a shouldExecuteOnCallerThread method that can be overridden.	2023-05-11 11:08:48 +02:00
Marcus	963ed7ce88	`ToParentBlockJoinQuery` Explain Support Score Mode (#12245) * `ToParentBlockJoinQuery` Explain Support Score Mode --------- Co-authored-by: Mikhail Khludnev <mkhl@apache.org>	2023-05-10 19:10:37 +03:00
Alan Woodward	dcbf7523a1	DOAP changes for release 9.6.0	2023-05-10 08:36:53 +01:00
Luca Cavanna	b6100d9787	Make TimeExceededException members final (#12271) TimeExceededException has three members that are set within its constructor and never modified. They can be made final.	2023-05-09 11:28:23 +02:00
Luca Cavanna	082c49a9ef	Update javadocs for QueryTimeout (#12272) QueryTimeout was introduced together with ExitableDirectoryReader but is now also optionally set to the IndexSearcher to wrap the bulk scorer with a TimeLimitingBulkScorer. Its javadocs needs updating.	2023-05-09 11:27:47 +02:00
Luca Cavanna	10bad40ed3	Make query timeout members final in ExitableDirectoryReader (#12274) There's a couple of places in the Exitable wrapper classes where queryTimeout is set within the constructor and never modified. This commit makes such members final.	2023-05-09 11:27:06 +02:00
Luca Cavanna	1cd9c1d66a	add missing changelog entry for #12220	2023-05-09 10:57:28 +02:00
Luca Cavanna	67bb384f72	add missing changelog entry for #12260	2023-05-09 10:52:03 +02:00
Luca Cavanna	9579d2de76	Move changes entry for #12270 to 9.7.0 section	2023-05-09 10:28:22 +02:00
Armin Braun	add9aba16d	Don't generate stacktrace in CollectionTerminatedException (#12270) CollectionTerminatedException is always caught and never exposed to users so there's no point in filling in a stack-trace for it.	2023-05-09 10:18:52 +02:00
Jonathan Ellis	9a7efe92c0	allocate one NeighborQueue per search for results (#12255)	2023-05-08 17:22:58 -04:00
Michael Sokolov	a39885fdab	GITHUB-12224: remove KnnGraphTester (moved to luceneutil) (#12238)	2023-05-08 10:12:36 -04:00
Uwe Schindler	397c2e547a	Fix MMapDirectory documentation for Java 20 (#12265)	2023-05-05 12:04:38 +02:00
Luca Cavanna	caeabf3930	Fix SynonymQuery equals implementation (#12260) The term member of TermAndBoost used to be a Term instance and became a BytesRef with #11941, which means its equals impl won't take the field name into account. The SynonymQuery equals impl needs to be updated accordingly to take the field into account as well, otherwise synonym queries with same term and boost across different fields are equal which is a bug.	2023-05-03 11:27:33 +02:00
Jonathan Ellis	3c163745bb	Use HashMap (was TreeMap) for OnHeapHnswGraph neighbors	2023-04-30 17:59:39 -04:00
Patrick Zhai	1fa2be90ea	Tidy the main branch	2023-04-26 21:21:57 -07:00
Alan Woodward	7374c200a1	Add next minor version 9.7.0	2023-04-26 16:44:47 +01:00
Christoph Büscher	f45e096304	Add ordering of files in compound files (#12241) Today there is no specific ordering of how files are written to a compound file. The current order is determined by iterating over the set of file names in SegmentInfo, which is undefined. This commit changes to an order based on file size. Colocating data from files that are smaller (typically metadata files like terms index, field info etc...) but accessed often can help when parts of these files are held in cache.	2023-04-26 14:01:02 +01:00
Luca Cavanna	b0befef912	QueryProfilerWeight to extend FilterWeight (#12242) QueryProfilerWeight should override matches and delegate to the subQueryWeight. Another way to fix this issue is to make it extend ProfileWeight and override only methods that need to have a different behaviour than delegating to the sub weight.	2023-04-26 10:24:57 +02:00
Alessandro Benedetti	4deb0003c4	Word2VecSynonymFilter constructor null check (#12169)	2023-04-24 17:28:12 +02:00
Daniele Antuzi	1f4f2bf509	Introduced the Word2VecSynonymFilter (#12169) Co-authored-by: Alessandro Benedetti <a.benedetti@sease.io>	2023-04-24 13:35:26 +02:00
Peter Gromov	5e0761eab5	remove timeout dependency from TestHunspell.testSuggestionOrderStabilityOnDictionaryEditing	2023-04-23 21:16:56 +02:00
Peter Gromov	025dfec2dd	Hunspell: reduce suggestion set dependency on the hash table order (#12239) * Hunspell: reduce suggestion set dependency on the hash table order When adding words to a dictionary, suggestions for other words shouldn't change unless they're directly related to the added words. But before, GeneratingSuggester selected 100 best first matches from the hash table, whose order can change significantly after adding any unrelated word. That resulted in unexpected suggestion changes on seemingly unrelated dictionary edits.	2023-04-23 16:51:17 +02:00
Stefan Vodita	2e7426961b	Remove statement that SSDV facets aren't hierarchical (#12232)	2023-04-21 18:40:08 -04:00
Peter Gromov	60c9039d9f	mention "GITHUB#12220: Hunspell: disallow hidden title-case entries from compound middle/end" CHANGES.txt	2023-04-21 15:17:33 +02:00
Usman Shaikh	bed07c6b02	Update Javadoc comment to mention gradle instead of ant (#12201)	2023-04-18 22:14:19 -07:00
Houston Putman	08f30f82b4	Cleanup NOTICE.txt (#12227) - Ant is no longer used as the build system for Lucene - JUnit is not packaged in a Lucene release - The Float16Converter was removed before the PR it was used in was merged: https://github.com/apache/lucene-solr/pull/2108	2023-04-18 15:58:09 -04:00
Kartik Ganesh	3813f5ab7c	Change the access modifier for the "expert" readLatestCommit API to public. (#12229) This change also includes a unit test for this functionality. Signed-off-by: Kartik Ganesh <gkart@amazon.com>	2023-04-18 14:38:35 -04:00
Andrey Bozhko	2d0dc6407a	Avoid redundant copies of BytesRef when constructing new Term (#12234)	2023-04-15 22:44:14 -07:00
Vigya Sharma	4e88118a35	Fix typo in CheckJoinIndex (#12231)	2023-04-14 14:06:19 -07:00
Marcus	2d7908e3c9	Explain term automaton queries (#12208)	2023-04-08 16:09:42 -07:00
Patrick Zhai	c31017589b	Remove a test in TestDocumentsWriterDeleteQueue (#12223)	2023-04-04 10:49:14 -07:00
Peter Gromov	56aef7265a	hunspell: disallow hidden title-case entries from compound middle/end (#12220) if we only have custom-case uART and capitalized UART, we shouldn't accept StandUart as a compound (although we keep hidden "Uart" dictionary entries for internal purposes)	2023-04-03 20:06:58 +02:00
Adrien Grand	56e65919b1	Adjust DWPT pool concurrency to the number of cores. (#12216) After upgrading Elasticsearch to a recent Lucene snapshot, we observed a few indexing slowdowns when indexing with low numbers of cores. This appears to be due to the fact that we lost too much of the bias towards larger DWPTs in apache/lucene#12199. This change tries to add back more ordering by adjusting the concurrency of `DWPTPool` to the number of cores that are available on the local node.	2023-03-31 15:07:48 +02:00
Greg Miller	172dfaf867	changes entry for GH#12212	2023-03-29 11:09:22 -07:00
Frederic Thevenet	df1b0baa69	Fixes Searches made via DrillSideways may miss documents that should match the query (#12212)	2023-03-29 11:05:58 -07:00
Uwe Schindler	b84b360f58	Upgrade forbiddenapis to version 3.5 (#12215) Upgrade forbiddenapis to version 3.5. This tones down some verbose warnings printed while checking Java 19 and Java 20 sourcesets for the MR-JAR	2023-03-27 13:30:22 +02:00
Hongyu Yan	a6475cecbf	Fix ordered intervals query over interleaved terms (#12214) Given an input text 'A B A C A B C' and search ORDERED(A, B, C), we should retrieve hits [0,3] and [4,6]; currently [4,6] is skipped. After finding the first interval [0, 3], the subintervals will become A[0,0], B[1,1], C[3,3]; then the algorithm will try to minimize it and the subintervals will become: A:[2,2], B:[5,5], C:[3,3] (after finding 5 > 3 it breaks the minimization) And when finding next interval, it will do advance(B) before checking whether it is after A(the do-while loop), so subintervals will become A[2,2], B[inf, inf], C[3,3] and return NO_MORE_INTERVAL. This commit instead continues advancing subintervals from where the last `nextInterval` call stopped, rather than always advancing all subintervals.	2023-03-27 09:18:33 +01:00
Adrien Grand	0782535017	Fully reuse postings enums when flushing sorted indexes. (#12206) Currently we're only half reusing postings enums when flushing sorted indexes as we still create new wrapper instances every time, which can be costly with fields that have many terms.	2023-03-16 13:51:33 +01:00
Patrick Zhai	d3b6ef3c86	Refactor part of IndexFileDeleter and ReplicaFileDeleter into a common utility class (#12126)	2023-03-15 20:51:49 -07:00
Adrien Grand	f324204019	Reduce contention in DocumentsWriterPerThreadPool. (#12199) Obtaining a DWPT and putting it back into the pool is subject to contention. This change reduces contention by using 8 sub pools that are tried sequentially. When applied on top of #12198, this reduces the time to index geonames with 20 threads from ~19s to ~16-17s.	2023-03-15 13:17:40 +01:00


36665 Commits, All Branches
