lucene

Author	SHA1	Message	Date
Andrzej Bialecki	7989a863fa	LUCENE-8855: Fix some size estimates and relax test assertions to work under different JVMs.	2019-06-28 10:33:27 +02:00
Sven Amann	7c3d6c7214	LUCENE-8890: Improve parallel iteration of two lists of same length. (#446 ) The class `BooleanWeight` takes a `BooleanQuery` (a list of `BooleanClause`s) as input and maintains a list of weights corresponding to the clauses. The clauses and the weights are iterated in parallel in various places throughout the class. At these code locations, it is not obvious that these two lists always have the same length, i.e., that the parallel iteration is safe. Moreover, the parallel iteration is not well supported by the Java language, which is why this operation is implemented differently throughout the code. This patch joins the two lists to enable parallel iteration without managing two separate lists. This makes the code’s intent more obvious and prevents bugs due to the lists getting out of sync by a future change.	2019-06-28 09:50:37 +02:00
Atri Sharma	7cd20384de	LUCENE-8889: Add Tests For Accessors Of Ranges in PointRangeQuery (#748 )	2019-06-27 13:55:15 +02:00
Adrien Grand	23b6a3cd3a	LUCENE-8871: Fix precommit failures.	2019-06-27 12:03:25 +02:00
iverase	754ce1f437	LUCENE-8886: Fix TestMutablePointsReaderUtils tests	2019-06-27 11:35:54 +02:00
Adrien Grand	7032176705	LUCENE-8815: Remove leftover println.	2019-06-27 08:09:26 +02:00
Adrien Grand	82234ef2f4	LUCENE-8855: Remove unused import.	2019-06-27 08:08:51 +02:00
Adrien Grand	b7029b35d5	LUCENE-8815: Use a LogMergePolicy when the order of documents is important.	2019-06-27 08:08:51 +02:00
Michael Sokolov	024e200bb9	LUCENE-8871: promote kuromoji tools to main jar	2019-06-26 22:34:00 -04:00
Andrzej Bialecki	a76c962ee6	LUCENE-8855: Add Accountable to some Query implementations.	2019-06-26 15:26:54 +02:00
Alan Woodward	6751c072ab	LUCENE-8811: Remove deprecated BooleanQuery maxCount methods	2019-06-26 10:55:55 +01:00
Alan Woodward	53f56fb7ad	LUCENE-8811: Move max clause checks to IndexSearcher	2019-06-26 10:55:55 +01:00
jimczi	889f73105f	LUCENE-8859: The completion suggester's postings format now has an option to load its internal FST off-heap.	2019-06-26 11:16:51 +02:00
Ignacio Vera	dac4310129	LUCENE-8868: New storing strategy for BKD tree leaves with low cardinality (#730 ) When a leaf has only a few distinct values, we store the distinct values with the cardinality.	2019-06-26 10:16:12 +02:00
Ignacio Vera	36eaf75b1f	LUCENE-8879: Improve BKDRadixSelector tests This change adds explicit test for the sorting capabilities.	2019-06-26 09:45:44 +02:00
Julie Tibshirani	5bf023cf19	LUCENE-7714: Add a range query in sandbox that takes advantage of index sorting.	2019-06-26 09:17:48 +02:00
jimczi	b85840b97f	LUCENE-8848: Fix IndexWriter leak when TestUnifiedHighlighter#testNotReanalyzed is ignored	2019-06-25 10:36:25 +02:00
David Smiley	85ec39d931	SOLR-13367: Range queries will now highlight in hl.method=unified mode. Lucene MatchesUtils.disjunction method for disjunction over BytesRefIterator terms.	2019-06-25 00:10:08 -04:00
Alan Woodward	c33177e428	LUCENE-8766: Further checks against race in test	2019-06-24 10:04:12 +01:00
Ignacio Vera	d9dbb70d01	LUCENE-8838: Remove support for Steiner points (#703 ) This is currently not used/supported.	2019-06-24 09:41:33 +02:00
Tomoko Uchida	559abd8f28	LUCENE-8778: Update the changelog because this was backported to 8.x branch.	2019-06-22 20:48:41 +09:00
Tomoko Uchida	2d4dea370a	LUCENE-8778: Add SPI name and documentation for the KoreanNumberFilterFactory	2019-06-22 20:23:01 +09:00
Tomoko Uchida	422cf14439	Resolve conflicts in CHANGES.	2019-06-22 16:41:27 +09:00
Tomoko Uchida	8e81f47ca6	LUCENE-8793: Luke enhanced UI for CustomAnalyzer: show detailed analysis steps. Co-authored-by: Jun Ohtani Co-authored-by: Tomoko Uchida	2019-06-22 16:22:26 +09:00
Tomoko Uchida	98c85a0e1a	LUCENE-8778: Define analyzer SPI names as static final fields and document the names in all analysis components. This also changes SPI loader to detect service names via the static NAME fields instead of class names.	2019-06-22 10:46:37 +09:00
David Smiley	54cc70127b	LUCENE-8848 LUCENE-7757 LUCENE-8492: UnifiedHighlighter.hasUnrecognizedQuery The UH now detects that parts of the query are not understood by it. When found, it highlights more safely/reliably. Fixes compatibility with complex and surround query parsers.	2019-06-21 17:05:56 -04:00
Simon Willnauer	b3e759a658	Expose IndexSearcher's executor in order to enable searcher cloning (#732 ) Today, if an executor was added to the IndexSearcher, it's impossible to clone the searcher with its cache, similarity and caching policy since the executor is not exposed. This adds a simple getter to make cloning easier.	2019-06-21 10:28:47 +02:00
Robert Muir	91331d1a89	LUCENE-8866: remove kuromoji/tools dependency on ICU	2019-06-20 21:20:17 -04:00
Michael Sokolov	aa29bea071	Add missing javadocs for new BinaryDictionary.ResourceScheme	2019-06-21 01:01:10 +02:00
Michael Sokolov	4502065f03	LUCENE-8863: enhance Kuromoji DictionaryBuilder tool: added tests; enabled ids up to 8191; support loading custom system dictionary from filesystem or classpath	2019-06-21 00:38:44 +02:00
Alan Woodward	371f50acc2	LUCENE-8766: Fix timing problem in test	2019-06-20 17:32:23 +01:00
Jan Høydahl	87c131baa7	LUCENE-8852 ReleaseWizard tool (#710 )	2019-06-20 14:45:17 +02:00
Simon Willnauer	c6899fc40d	LUCENE-8865: Move to executor in IndexSearcher (#731 ) In order to simplify testing this change moves to use the Executor interface instead of ExecutorService. This change also simplifies customizing execute methods for use-cases that need to add additional logic for forking to new threads. This change also adds a test for the optimization added in LUCENE-8865. This change is fully backwards compatible since ExecutorService implements Executor.	2019-06-20 14:26:40 +02:00
Adrien Grand	7c5247c60c	LUCENE-8847: Fix typo in CHANGES.	2019-06-19 09:51:29 +02:00
Michael Sokolov	2e49f13aa1	LUCENE-8781: add FST array-with-gap addressing to Util.readCeilArc	2019-06-18 21:28:16 +02:00
Adrien Grand	2e468abecc	LUCENE-8853: Don't return a FileSwitchDirectory when asked for a FS directory.	2019-06-18 17:15:21 +02:00
Simon Willnauer	60f3b25d06	LUCENE-8865: Use incoming thread for execution if IndexSearcher has an executor (#725 ) Today we don't utilize the incoming thread for a search when IndexSearcher has an executor. This thread is only idling but can be used to execute a search once all other collectors are dispatched.	2019-06-18 14:56:51 +02:00
Luca Cavanna	4fd09eb3e3	LUCENE-8796: Use exponential search in IntArrayDocIdSetIterator#advance (#667 )	2019-06-18 10:29:51 +02:00
Simon Willnauer	fb6e28d9f1	LUCENE-8853: Try parsing original file extension from tmp file (#716 ) FileSwitchDirectory fails if the tmp files are not in the same directory as the file they are renamed to. This is correct behavior but breaks with tmp files used with index sorting. This change makes a best-effort attempt to find the right extension directory if the file ends with `.tmp`	2019-06-18 08:47:59 +02:00
Cao Manh Dat	0c24aa6c75	SOLR-13541: Upgrade Jetty to 9.4.19.v20190610	2019-06-14 15:46:19 +01:00
Charlie Yan	af2a4fe464	Update package-info.java (#388 ) add a missing parenthesis	2019-06-14 14:54:57 +02:00
Jan Høydahl	d2793688ca	LUCENE-8861: Script to find open PRs that need attention (#719 )	2019-06-14 13:30:04 +02:00
Alan Woodward	b8c299640d	LUCENE-8766: Pass BytesRef offset/length when decoding from input stream	2019-06-13 16:40:03 +01:00
Alan Woodward	b588e0b19e	LUCENE-8766: Add CHANGES entry	2019-06-13 10:18:12 +01:00
Alan Woodward	251dbe7cea	LUCENE-8766: Add monitor subproject	2019-06-13 09:40:57 +01:00
Simon Willnauer	608d9134ad	LUCENE-8835: Irony - our tests don't emulate windows well enough	2019-06-12 17:56:06 +02:00
Simon Willnauer	e6a9bfb8b2	LUCENE-8853: Temporarily disable random FileSwitchDirectory	2019-06-11 21:32:45 +02:00
Simon Willnauer	b6c68ccded	LUCENE-8835: Respect file extension when listing files from FileSwitchDirectory (#700 ) FileSwitchDirectory splits file actions between 2 directories based on file extensions. The extensions are respected on write operations like delete or create but ignored when we list the content of the directories. Until now we only deduplicated the contents on Directory#listAll, which can cause inconsistencies and hard-to-debug errors due to double deletions in IndexWriter if a file is pending delete in one of the directories but still shows up in the directory listing from the other directory. This case can happen if both directories point to the same underlying FS directory, which is a common use-case to split between mmap and NIOFS. This change filters out files from directories depending on their file extension to make sure files that are deleted in one directory are not returned from another if they point to the same FS directory.	2019-06-11 17:27:55 +02:00
Alan Woodward	7a2b965106	LUCENE-8845: Add additional max boolean clause cap on expansion	2019-06-11 12:11:29 +01:00
Alan Woodward	142a20bb0b	LUCENE-8843: Fix precommit	2019-06-11 10:19:37 +01:00
Alan Woodward	50d65889df	LUCENE-8815: Ensure single segments in tests	2019-06-11 10:19:37 +01:00
Adrien Grand	fb0f1776a5	LUCENE-8843: Add CHANGES entry.	2019-06-11 10:22:05 +02:00
Jason Tedor	4fdcb14acf	LUCENE-8843: Only ignore IOException on dirs when invoking force (#706 ) Today in the method IOUtils#fsync we ignore IOExceptions when fsyncing a directory. However, the catch block here is too broad, for example it would be ignoring IOExceptions when we try to open a non-existent file. This commit addresses that by scoping the ignored exceptions only to the invocation of FileChannel#force. This prevents us from suppressing an exception in case we run into an unexpected issue when opening the file. However, fsyncing directories on Windows is not possible. We always suppressed this by allowing that an AccessDeniedException is thrown when attempting to open the directory for reading. Yet, per the above, this suppression also allowed other IOExceptions to be suppressed, and that should be considered a bug (e.g., not only the directory not existing, but any filesystem error and other reasons that we might get an access denied there, like genuine permissions issues). Rather than relying on exceptions for flow control and continuing to suppress there, we simply return early if attempting to fsync a directory on Windows (we should not put this burden on the caller).	2019-06-11 10:19:14 +02:00
Ignacio Vera	88c5817c01	LUCENE-8775: Compute properly the bridge between a polygon and a hole when sharing a vertex.	2019-06-11 07:01:42 +02:00
Koen De Groote	67104dd615	LUCENE-8847: Code Cleanup: Rewrite StringBuilder.append with concatted strings (#707 ) This specific commit affects all points in the codebase where the argument of a StringBuilder.append() call is itself a regular String concatenation. This defeats the purpose of using StringBuilder and also introduces an extra allocation. These changes should avoid that. ant tests have run, succeeded on local machine. Removing test files from the changes. Another suggested rework.	2019-06-10 18:07:43 +02:00
Alan Woodward	e8950f4a52	LUCENE-8845: Allow configurable maxExpansions for prefix/wildcard intervals	2019-06-10 16:14:42 +01:00
Atri Sharma	f84afab008	LUCENE-8362: Introduce DocValues Fields and Range Queries for native Range Field Types This commit introduces a new DocValues field and corresponding range query for binary ranges. These classes are extended into concrete implementations for each of Int, Long, Float and Double range fields.	2019-06-10 15:14:15 +02:00
Colin Goodheart-Smithe	5ef2b3f6b8	LUCENE-8815: Adds a DoubleValues implementation for feature fields (#687 ) This change adds a static method FeatureField#newDoubleValues() which can be used to retrieve the values of a feature for documents directly rather than having to store the values in a numeric field alongside the feature field.	2019-06-10 09:07:24 +02:00
Tim Underwood	97ca9df7ef	LUCENE-8834: Cache the SortedNumericDocValues.docValueCount() value whenever it is used in a loop (#698 )	2019-06-10 08:56:21 +02:00
Namgyu Kim	fe58b6f3a2	LUCENE-8812: disable Java 9 try-with-resources style in TestKoreanNumberFilter Signed-off-by: Namgyu Kim <namgyu@apache.org>	2019-06-10 01:56:34 +09:00
Namgyu Kim	5a75b8a080	LUCENE-8812: Add new KoreanNumberFilter that can change Hangul character to number and process decimal point Signed-off-by: Namgyu Kim <namgyu@apache.org>	2019-06-09 23:00:14 +09:00
Michael Sokolov	e85c6e6429	LUCENE-8844: bump FST version and fix related CHANGES entry	2019-06-08 10:22:02 -04:00
Atri Sharma	965fd194d1	LUCENE-8825: Improve CheckHits's Printing Capabilities Signed-off-by: Adrien Grand <jpountz@gmail.com>	2019-06-07 18:47:41 +02:00
Alan Woodward	67677d995e	LUCENE-8828: Make unorderedNoOverlaps a separate IntervalsSource	2019-06-07 14:45:56 +01:00
Jan Høydahl	8d6fd7298f	LUCENE-8818: Fix smokeTestRelease.py encoding bug	2019-06-06 21:42:24 +02:00
Ignacio Vera	05ea0f2d54	LUCENE-8775: Improve tessellator to better handle cases where a hole shares a vertex with the polygon	2019-06-06 08:58:49 +02:00
Ignacio Vera	c6390f80d1	LUCENE-8831: Fixed LatLonShapeBoundingBoxQuery.hashCode method	2019-06-05 15:57:10 +02:00
Jan Høydahl	73b15d8984	Add back-compat indices for 7.7.2	2019-06-05 11:16:41 +02:00
Jan Høydahl	be18d8eaa2	Add bugfix version 7.7.2	2019-06-05 02:31:09 +02:00
Cao Manh Dat	301ea0e462	SOLR-13434: OpenTracing support for Solr (#685 )	2019-06-04 20:04:11 +01:00
Erick Erickson	7ebeab71f4	SOLR-8346: Upgrade Zookeeper to version 3.5.5	2019-06-03 17:50:35 -07:00
Simon Willnauer	d488156921	Merge branch 'master' into LUCENE-8813	2019-05-31 21:05:41 +02:00
Simon Willnauer	086088e699	more feedback	2019-05-29 20:09:17 +02:00
Andrzej Bialecki	2020eb43de	Add backcompat indexes for 8.1.1.	2019-05-29 18:22:22 +02:00
Simon Willnauer	fceee244dd	apply feedback	2019-05-29 09:59:39 +02:00
Simon Willnauer	165d2d5ff5	LUCENE-8813: Ensure we never apply deletes from a closed DWPTDeleteQueue Today we don't have a strong protection that we add and apply deletes / updates on or from an already flushed delete queue. DWPTDeleteQueue instances are replaced once we do a full flush in order to reopen an NRT reader or commit the IndexWriter. In LUCENE-8813 we tripped an assert that used to protect us from such a situation but it didn't take all corner cases from concurrent flushing into account. This change adds a stronger protection and ensures that we neither apply a closed delete queue nor add any updates or deletes to it. This change also allows to speculatively freeze the global buffer that might return null now if the queue has already been closed. This is now possible since we ensure that we never see modifications to the queue after it's been closed and that happens right after the last DWPT for the ongoing full flush is done flushing.	2019-05-28 16:44:34 +02:00
jimczi	db334c792b	LUCENE-8784: Restore the Korean's part of speech tag for NGRAM. The part of speech tag for unigram has been changed inadvertently in a previous commit (not released). This change restores the original value that is also set on the serialized unknown dictionary.	2019-05-28 12:01:05 +02:00
Simon Willnauer	171d7f131f	LUCENE-8813: Count down latch in finally block. This test hangs until it times out when an assertion is tripped in the indexing thread. Counting down the latch in a finally block will cause the test to fail earlier.	2019-05-28 10:55:18 +02:00
Adrien Grand	c252b92caa	LUCENE-8135: Fix number of clauses randomization.	2019-05-28 09:53:58 +02:00
Namgyu Kim	a556925eb8	LUCENE-8784: The KoreanTokenizer now preserves punctuations if discardPunctuation is set to false (defaults to true). Signed-off-by: Namgyu Kim <kng0828@gmail.com> Signed-off-by: jimczi <jimczi@apache.org>	2019-05-27 15:15:24 +02:00
Colin Goodheart-Smithe	46060d88a2	LUCENE-8803: Provide a FieldComparator to allow sorting by a feature from a FeatureField (#680 ) This change adds a SortField which allows a convenient way to sort search hits using a feature from a FeatureField.	2019-05-24 08:45:57 +02:00
Nhat Nguyen	0435348b29	LUCENE-8809: Ensure release segment states If refresh and rollback happen concurrently, then we can leave segment states unreleased, which leads to leaking the refCount of some SegmentReaders.	2019-05-23 11:25:28 -04:00
Adrien Grand	97046c7054	LUCENE-8757: Fix test bug.	2019-05-22 09:10:52 +02:00
Atri Sharma	87e936f1bb	LUCENE-8757: Improving Default Segments To Thread Mapping Algorithm The current slicing algorithm assigns a thread per segment, which can be detrimental to performance in case the distribution has a large number of small segments. The patch introduces a slicing algorithm which coalesces smaller segments to a single thread, thus reducing the impact of context switching by limiting the number of threads Signed-off-by: Adrien Grand <jpountz@gmail.com>	2019-05-21 20:18:42 +02:00
Namgyu Kim	5a694ea26f	LUCENE-8805: Parameter changes for stringField() in StoredFieldVisitor Signed-off-by: Namgyu Kim <kng0828@gmail.com> Signed-off-by: Adrien Grand <jpountz@gmail.com>	2019-05-21 20:18:42 +02:00
Uwe Schindler	c756b50ae4	LUCENE-8807: Change all download URLs in build files to HTTPS	2019-05-21 17:06:00 +02:00
jimczi	4640a527a4	LUCENE-8770: BlockMaxConjunctionScorer now leverages two-phase iterators in order to avoid executing the second phase when scorers don't intersect	2019-05-21 11:35:44 +02:00
Adrien Grand	ec6ac9756c	LUCENE-8804: Forbid calls to putAttribute on frozen FieldType instances.	2019-05-20 20:23:08 +02:00
Andrzej Bialecki	ed4b789bf4	Add new version number for 8.1.1 release. Move the SOLR-13475 entry to the correct section.	2019-05-20 20:14:21 +02:00
Noble Paul	bd64ed6d2a	SOLR-13437: fork noggit code into Solr (#666 ) * SOLR-13437: fork noggit code into Solr	2019-05-16 11:10:27 +10:00
Ishan Chattopadhyaya	9189472d70	Adding backcompat indexes for 8.1	2019-05-13 16:30:57 +05:30
Atri Sharma	c988b04b18	LUCENE-7840: Avoid Building Scorer Supplier For Redundant SHOULD Clauses For boolean queries, we should eliminate redundant SHOULD clauses during query rewrite and not build the scorer supplier, as opposed to eliminating them during weight construction Signed-off-by: jimczi <jimczi@apache.org>	2019-05-09 09:00:18 +02:00
Simon Willnauer	a759a5d47c	Fix Changes.txt entry	2019-05-08 11:43:49 +02:00
Simon Willnauer	e8d88a5b54	LUCENE-8785: Ensure threadstates are locked before iterating (#664 ) Ensure new threadstates are locked before retrieving the number of active threadstates. This causes assertion errors and potentially broken field attributes in the IndexWriter when IndexWriter#deleteAll is called while actively indexing.	2019-05-08 11:19:21 +02:00
Dawid Weiss	5c9e7d5351	LUCENE-8781: FST lookup performance has been improved in many cases by encoding Arcs using full-sized arrays with gaps. The new encoding is enabled for postings in the default codec and for suggesters. (Mike Sokolov)	2019-05-06 11:19:35 +02:00
Christine Poerschke	6842676952	LUCENE-8756: ant precommit (ant check-forbidden-apis) fix	2019-05-01 19:31:42 +01:00
Ishan Chattopadhyaya	c808b2f5fe	Adding 8.2 version	2019-05-01 15:15:49 +05:30
Thomas Lemmé	424558ff88	LUCENE-8787: DateRangePrefixTree now parses milliseconds when num digits != 3	2019-05-01 00:32:19 -04:00
Uwe Schindler	87c16882ae	LUCENE-8738, LUCENE-8786: Fix ECJ linter to accept Java 11 syntax	2019-04-30 19:40:00 +02:00
Mike McCandless	4a76ad7263	LUCENE-8756: add CHANGES entry	2019-04-30 12:15:49 -04:00

11342 Commits