lucene

Commit Graph

Author	SHA1	Message	Date
Koen De Groote	67104dd615	LUCENE-8847: Code Cleanup: Rewrite StringBuilder.append with concatted strings (#707) This specific commit affects all points in the codebase where the argument of a StringBuilder.append() call is itself a regular String concatenation. This defeats the purpose of using StringBuilder and also introduces an extra allocation. These changes should avoid that. ant tests have run, succeeded on local machine. Removing test files from the changes. Another suggested rework.	2019-06-10 18:07:43 +02:00
Atri Sharma	965fd194d1	LUCENE-8825: Improve CheckHits's Printing Capabilities Signed-off-by: Adrien Grand <jpountz@gmail.com>	2019-06-07 18:47:41 +02:00
Atri Sharma	87e936f1bb	LUCENE-8757: Improving Default Segments To Thread Mapping Algorithm The current slicing algorithm assigns a thread per segment, which can be detrimental to performance in case the distribution has a large number of small segments. The patch introduces a slicing algorithm which coalesces smaller segments to a single thread, thus reducing the impact of context switching by limiting the number of threads Signed-off-by: Adrien Grand <jpountz@gmail.com>	2019-05-21 20:18:42 +02:00
Namgyu Kim	5a694ea26f	LUCENE-8805: Parameter changes for stringField() in StoredFieldVisitor Signed-off-by: Namgyu Kim <kng0828@gmail.com> Signed-off-by: Adrien Grand <jpountz@gmail.com>	2019-05-21 20:18:42 +02:00
Dawid Weiss	5c9e7d5351	LUCENE-8781: FST lookup performance has been improved in many cases by encoding Arcs using full-sized arrays with gaps. The new encoding is enabled for postings in the default codec and for suggesters. (Mike Sokolov)	2019-05-06 11:19:35 +02:00
Nicholas Knize	faa78ad72c	LUCENE-8736: Fix Polygon.contains to appropriately check longitude range, and pass correct line segment vertices in EdgeTree	2019-04-18 13:15:07 -05:00
Uwe Schindler	faaee86efb	LUCENE-8738: Move to Java 11 as minimum Java version (merged branch: jira/LUCENE-8738) Co-authored-by: Adrien Grand <jpountz@apache.org>	2019-04-16 14:00:09 +02:00
Simon Willnauer	a302be381e	LUCENE-8671: Introduce Reader attributes (#640) Reader attributes allow a per IndexReader configuration of codec internals. For instance, this allows per-reader configuration of whether FSTs are loaded into memory or left on disk.	2019-04-15 20:39:36 +02:00
Nicholas Knize	55c241d87f	LUCENE-8736: Fix LatLonShapePolygonQuery and Polygon2D.contains to correctly include points that fall on the boundary	2019-04-11 09:27:36 -05:00
Simon Willnauer	a9503d2e81	LUCENE-8754: Prevent ConcurrentModificationException in SegmentInfo (#637) In order to prevent ConcurrentModificationException this change makes an unmodifiable copy on write for all maps in SegmentInfo. MergePolicies can access these maps without synchronization and cause exceptions if they are modified in the merge thread.	2019-04-10 09:29:22 +02:00
Simon Willnauer	1ec229b604	LUCENE-8671: Expose FST off/on-heap options on Lucene50PostingsFormat (#613) Before we can expose options to configure this postings format on a per-reader basis we need to expose the option to load the terms index FST off or on heap on the postings format. This already allows to change the default in a per-field posting format if an expert user wants to change the defaults. This essentially provides the ability to change defaults globally while still involving some glue code.	2019-04-04 16:59:37 +02:00
Henning Andersen	04afdb6442	LUCENE-8735: Avoid FileAlreadyExistsException on Windows. (#619) FilterDirectory.getPendingDeletions() did not delegate the call, which resulted in a new IndexWriter on same directory not considering pending delete files. This could in turn result in a FileAlreadyExistsException when running on Windows.	2019-03-26 14:56:45 +01:00
Simon Willnauer	14175c46d2	LUCENE-8671: Load FST off-heap if reader is not opened from an index writer (#610) Today we never load FSTs of ID-like fields off-heap since we need very fast access for updates. Yet, a reader that is not loaded from an IndexWriter can also leave the FST on disk. This change adds this information to SegmentReadState to allow the postings format to make this decision without configuration.	2019-03-20 11:28:10 +01:00
Adrien Grand	577bef53dd	LUCENE-8166: Require merge instances to be consumed in the thread that created them.	2019-03-19 10:51:54 +01:00
Simon Willnauer	ad457d188e	Improve RIW exception handling and opt out of concurrent flushing if exception is expected	2019-03-15 11:00:16 +01:00
Adrien Grand	425f207f40	LUCENE-8688: Forced merges merge more than necessary.	2019-03-15 10:27:27 +01:00
Alan Woodward	fbd05167f4	LUCENE-3041: QueryVisitor (#581) This commit adds an introspection API to Query, allowing users to traverse the nested structure of a query and examine its leaves. It replaces the existing `extractTerms` method on Weight, and alters some highlighting code to use the new API	2019-03-14 15:04:33 +00:00
Simon Willnauer	ffb1fc83de	Concurrently flush next buffer during commit in RandomIndexWriter (#607) This is a spin-off from `LUCENE-8700` that is satisfied by IndexWriter#flushNextBuffer. The idea here is to additionally call flushNextBuffer in RandomIndexWriter for better test coverage. This is a test-only change.	2019-03-14 15:43:35 +01:00
Alan Woodward	7ad0ac0191	LUCENE-8714: Don't use NoMergePolicy in norms tests This can cause spurious failures when run in conjunction with HandleLimitFS, as we can end up with lots of very small segments which trips the file handles limit	2019-03-01 14:47:54 +00:00
Simon Willnauer	4a513fa99f	LUCENE-8292: Make TermsEnum fully abstract (#574)	2019-02-15 17:32:55 +01:00
yyuan2	a3a4ecd80b	LUCENE-8662: Change TermsEnum.seekExact(BytesRef) to abstract	2019-02-08 15:10:38 -08:00
iverase	5d1d6448b9	LUCENE-8673: Use radix partitioning when merging dimensional points instead of sorting all dimensions beforehand.	2019-02-07 08:12:13 +01:00
Dawid Weiss	d7dc53ff7c	LUCENE-8474: Remove deprecated RAMDirectory.	2019-01-28 13:49:03 +01:00
Toke Eskildsen	c13645bd4c	LUCENE-8585: Create jump-tables for DocValues at index-time	2019-01-18 22:42:04 +01:00
Dawid Weiss	efef89adc6	LUCENE-8642: RamUsageTester.sizeOf ignores arrays and collections if --illegal-access=deny.	2019-01-18 11:55:53 +01:00
Dawid Weiss	f2352e9456	Revert "LUCENE-8642, LUCENE-8641: correct RamUsageTester.sizeOf's handling of ByteBuffers. Throw exceptions on denied reflection to catch problems early. This affects tests only." This reverts commit `a16f0833ed`.	2019-01-17 13:05:36 +01:00
Dawid Weiss	a16f0833ed	LUCENE-8642, LUCENE-8641: correct RamUsageTester.sizeOf's handling of ByteBuffers. Throw exceptions on denied reflection to catch problems early. This affects tests only.	2019-01-17 12:23:30 +01:00
Dawid Weiss	d4e016afdf	LUCENE-8474: (partial) removal of accesses to RAMFile and RAMDirectory streams. Removal of GrowableByteArrayDataOutput.	2019-01-15 13:42:25 +01:00
Steve Rowe	283b19a8da	LUCENE-8527: Upgrade JFlex to 1.7.0. StandardTokenizer and UAX29URLEmailTokenizer now support Unicode 9.0, and provide UTS#51 v11.0 Emoji tokenization with the '<EMOJI>' token type.	2019-01-08 13:33:49 -05:00
Dawid Weiss	f28c5bec9b	LUCENE-8604: TestRuleLimitSysouts now has an optional "hard limit" of bytes that can be written to stderr and stdout (anything beyond the hard limit is ignored). The default hard limit is 2 GB of logs per test class.	2018-12-18 22:03:44 +01:00
Dawid Weiss	e916f1fb86	LUCENE-8611: Update randomizedtesting to 2.7.2, JUnit to 4.12, add hamcrest-core dependency.	2018-12-15 09:49:36 +01:00
Simon Willnauer	e974311d91	LUCENE-8609: Allow getting consistent docstats from IndexWriter Today we have #numDocs() and #maxDoc() on IndexWriter. This is enough to get all stats for the current index but it's subject to concurrency and might return numbers that are not consistent, i.e. some cases can return maxDoc < numDocs which is undesirable. This change adds a getDocStats() method to index writer to allow fetching consistent numbers for these stats. This change also deprecates IndexWriter#numDocs() and IndexWriter#maxDoc() and replaces all their usages with IndexWriter#getDocStats()	2018-12-14 19:36:25 +01:00
Alan Woodward	f5867a1413	LUCENE-8564: Add GraphTokenFilter	2018-12-04 09:47:42 +00:00
Michael Sokolov	6728f0c4f4	update comment after limiting number of debug tokens	2018-11-27 06:00:29 -05:00
Michael Sokolov	34ed01543a	fixing javadoc; added docs for parameters of new method	2018-11-27 06:00:29 -05:00
Michael Sokolov	54907903e8	LUCENE-8517: do not wrap FixedShingleFilter with conditional in TestRandomChains	2018-11-27 06:00:29 -05:00
Dawid Weiss	bd3ce916bd	LUCENE-8568: TestMockDirectoryWrapper/ RAMInputStream NPE.	2018-11-20 13:37:29 +01:00
Erick Erickson	763e64260f	SOLR-12881: Remove unneeded import statements	2018-11-14 17:48:15 -08:00
Dawid Weiss	4e2481b04b	LUCENE-8560: TestByteBuffersDirectory.testSeekPastEOF() failures with ByteArrayIndexInput. ByteArrayIndexInput removed entirely, without a replacement.	2018-11-10 16:54:28 +01:00
David Smiley	d0cd4245bd	LUCENE-8557: LeafReader.getFieldInfos should always return the same instance MemoryIndex: compute/cache up-front Solr Collapse/Expand with top_fc: compute/cache up-front Json Facets numerics / hash DV: use the cached fieldInfos on SolrIndexSearcher SolrIndexSearcher: move the cached FieldInfos to SlowCompositeReaderWrapper	2018-11-06 14:45:32 -05:00
David Smiley	fd9164801e	LUCENE-7875: Moved MultiFields static methods to MultiTerms, FieldInfos and MultiBits. MultiBits is now public and has getLiveDocs.	2018-10-18 19:49:14 -04:00
Christine Poerschke	1ccd555862	Fix couple of typos.	2018-10-15 15:08:17 -04:00
Nicholas Knize	1118299c33	LUCENE-8496: Selective indexing - modify BKDReader/BKDWriter to allow users to select a fewer number of dimensions to be used for creating the index than the total number of dimensions used for field encoding. i.e., dimensions 0 to N may be used to determine how to split the inner nodes, and dimensions N+1 to D are ignored and stored as data dimensions at the leaves.	2018-10-08 18:51:03 -05:00
David Smiley	fe844c739b	LUCENE-8513: Remove MultiFields.getFields SlowCompositeReaderWrapper now works with MultiTerms directly	2018-10-01 10:39:12 -04:00
Alan Woodward	c696cafc0d	LUCENE-8352: Make TokenStreamComponents final	2018-09-19 10:02:56 +01:00
Adrien Grand	a9acdfdb54	LUCENE-8340: Recency boosting.	2018-09-04 14:03:24 +02:00
Alan Woodward	910a0231f6	LUCENE-6228: Add Scorable class and make LeafCollector.setScorer() take Scorable	2018-09-04 11:01:44 +01:00
Dawid Weiss	54f2565038	LUCENE-8469: Inline calls to the deprecated StringHelper.compare, removed StringHelper.compare from master.	2018-08-30 09:59:51 +02:00
Dawid Weiss	ce504f4f81	LUCENE-8468: add ByteBuffersDirectory to randomized Directory implementations in LuceneTestCase (master branch only).	2018-08-29 10:43:00 +02:00
Adrien Grand	025350ea12	LUCENE-8461: Add Lucene80Codec.	2018-08-23 10:51:45 +02:00
Jim Ferenczi	49e3cca77f	LUCENE-8204: Boolean queries with a mix of required and optional clauses are now faster if the total hit count is not required	2018-08-08 15:49:58 +02:00
Adrien Grand	e56c8722ce	Revert "Make the nightly test smaller so that it does not fail with GC overhead exceeded (OOM). Clean up random number fetching to make it shorter." This reverts commit `3203e99d8f`.	2018-08-01 15:44:57 +02:00
Adrien Grand	86a39fa29f	Revert "Fix AAIOOBE in GeoTestUtil." This reverts commit `c3e813188e`.	2018-08-01 15:44:47 +02:00
Adrien Grand	c3e813188e	Fix AAIOOBE in GeoTestUtil.	2018-08-01 15:17:53 +02:00
Dawid Weiss	3203e99d8f	Make the nightly test smaller so that it does not fail with GC overhead exceeded (OOM). Clean up random number fetching to make it shorter.	2018-08-01 14:05:02 +02:00
Adrien Grand	99dbe93681	LUCENE-8060: IndexSearcher's search and searchAfter methods now only compute total hit counts accurately up to 1,000.	2018-08-01 09:01:21 +02:00
Steve Rowe	a08eadb480	Fix InfixSuggestersTest.testShutdownDuringBuild() failures	2018-07-30 22:49:49 -04:00
Adrien Grand	61e89e3ca0	LUCENE-8431: Top-docs collectors now collect lower bounds of the hit count.	2018-07-30 16:38:05 +02:00
Adrien Grand	9ca053712a	LUCENE-8430: TopDocs.totalHits may now be a lower bound of the hit count.	2018-07-30 16:38:05 +02:00
Dawid Weiss	d25f62634b	LUCENE-8415: test quirk follow up. MockDirectoryWriter uses AccessDeniedException (a subclass of IOException) to signal files still open for writing when read access is requested.	2018-07-25 11:34:31 +02:00
Dawid Weiss	8892c0d9af	LUCENE-8415: Clean up Directory contracts (write-once, no reads-before-write-completed). Minor test improvements and cleanups.	2018-07-24 08:47:50 +02:00
Jason Gerlowski	6ed9607f74	SOLR-12555: Add add'l expectThrows() test helper	2018-07-23 20:37:04 -04:00
Alan Woodward	028c86b1fa	LUCENE-8306: Allow iteration over submatches Also includes LUCENE-8404, adding match iteration to SpanQuery	2018-07-23 10:02:01 +01:00
Alan Woodward	6e3f61f6f9	Revert "LUCENE-8306: Allow iteration over submatches" Incorrect patch committed in error This reverts commit `a8839b7eab`.	2018-07-22 22:36:46 +01:00
Alan Woodward	a8839b7eab	LUCENE-8306: Allow iteration over submatches	2018-07-22 21:42:46 +01:00
Adrien Grand	331ccf3910	LUCENE-8405: Remove TopDocs.maxScore.	2018-07-18 08:38:57 +02:00
Adrien Grand	8093c450c1	LUCENE-8263: Replace TieredMergePolicy's reclaimDeletesWeight with deletesPctAllowed.	2018-07-17 18:31:06 +02:00
Adrien Grand	d730c8b214	LUCENE-8060: Remove usage of TopDocs#totalHits that should really be IndexSearcher#count. Many tests were written before we introduced IndexSearcher#count and used `searcher.search(query, 1).totalHits` to get the number of matches of a query rather than `searcher.count(query)`.	2018-07-17 14:32:02 +02:00
Michael Braun	f0e1864ceb	Merge remote-tracking branch 'source/master' into remove-constructor-wrapper-classes	2018-07-14 13:39:37 -04:00
Nicholas Knize	b5ef13330f	LUCENE-8396: Add Points Based Shape Indexing and Search that decomposes shapes into a triangular mesh and indexes individual triangles as a 6 dimension point	2018-07-14 11:28:37 -05:00
Adrien Grand	b1bb11b79d	LUCENE-8391: More tests for merge policies.	2018-07-10 09:17:34 +02:00
Adrien Grand	41ddac5b44	LUCENE-8385: Fix computation of the allowed segment count in TieredMergePolicy.	2018-07-09 15:21:10 +02:00
Erick Erickson	c303c5f126	LUCENE-8370: Reproducing TestLucene{54,70}DocValuesFormat.testSortedSetVariableLengthBigVsStoredFields() failures	2018-06-28 18:28:37 -07:00
Alan Woodward	ab2fec1642	LUCENE-8237: Correct handling of position increments in sub-tokenstreams	2018-06-18 09:57:38 +01:00
Nhat Nguyen	8a6f1bf5ad	LUCENE-8165: Ban copyOf and copyOfRange. These methods are lenient with out-of-bounds indices. Signed-off-by: Adrien Grand <jpountz@gmail.com>	2018-06-07 10:08:21 +02:00
Michael Braun	78079fc552	Merge remote-tracking branch 'source/master' into remove-constructor-wrapper-classes	2018-06-05 18:48:55 -04:00
Simon Willnauer	59087d148a	[TEST] Ensure MDW.assertNoUnreferencedFilesOnClose is threadsafe	2018-06-04 17:33:18 +02:00
Simon Willnauer	fe83838ec3	LUCENE-8341: Record soft deletes in SegmentCommitInfo This change adds the number of documents that are soft deleted but not hard deleted to the segment commit info. This is the last step towards making soft deletes as powerful as hard deletes since now the number of documents can be read from commit points without opening a full blown reader. This also allows merge policies to make decisions without requiring an NRT reader to get the relevant statistics. This change doesn't enforce any field to be used as soft deletes and the statistic is maintained per segment.	2018-06-04 15:05:12 +02:00
Simon Willnauer	e7a0a12926	LUCENE-8335: Enforce soft-deletes field up-front Soft deletes field must be marked as such once it's introduced and can't be changed after the fact. Co-authored-by: Nhat Nguyen <nhat.nguyen@elastic.co>	2018-06-04 08:28:38 +02:00
Michael Braun	fb6574100e	LUCENE-8345 - add wrapper class constructors to forbiddenapis	2018-06-03 15:40:50 -04:00
Simon Willnauer	3dc4fa199c	Revert "LUCENE-8335: Enforce soft-deletes field up-front." This reverts commit `a2d9276674`.	2018-06-02 13:47:24 +02:00
Simon Willnauer	a2d9276674	LUCENE-8335: Enforce soft-deletes field up-front. Soft deletes field must be marked as such once it's introduced and can't be changed after the fact.	2018-06-02 13:14:53 +02:00
Simon Willnauer	34741a863a	LUCENE-8330: Exclude MockRandomMP from basic tests	2018-05-29 16:58:03 +02:00
Simon Willnauer	c93f628317	LUCENE-8330: Detach IndexWriter from MergePolicy This change introduces a new MergePolicy.MergeContext interface that is easy to mock and cuts over all instances of IW to MergeContext. Since IW now implements MergeContext the cut over is straightforward. This reduces the exposed API available in MP dramatically and allows efficient testing without relying on IW to improve the coverage and testability of our MP implementations.	2018-05-25 07:37:09 +02:00
Simon Willnauer	70cfe46689	LUCENE-8320: Fix NPE in WindowsFS if target file exists but isn't open	2018-05-18 19:38:11 +02:00
Alan Woodward	b1ee23c525	LUCENE-8273: Fix end() and posInc handling	2018-05-18 13:11:39 +01:00
Simon Willnauer	42a79970d5	LUCENE-8320: Fix WindowsFS#rename with hardlinks	2018-05-18 09:33:50 +02:00
Simon Willnauer	3fe612bed2	LUCENE-8318: Ensure pending delete is not brought back on a try delete attempt When renaming a file, `FSDirectory#rename` tries to delete the dest file if it's in the pending deletes list. If that delete fails, it adds the dest to the pending deletes list again. This causes the dest file to be deleted later by `deletePendingFiles`.	2018-05-17 11:02:35 +02:00
Adrien Grand	6d69824a6b	LUCENE-8314: More checks on AssertingScorer.	2018-05-16 17:54:19 +02:00
Adrien Grand	9b9776a714	LUCENE-8313: Simplify SimScorer.	2018-05-16 17:53:56 +02:00
Simon Willnauer	585952797c	LUCENE-8310: Ensure IndexFileDeleter accounts for pending deletes Today we fail creating the IndexWriter when the directory has a pending delete. Yet, this is mainly done to prevent writing still existing files more than once. IndexFileDeleter already accounts for that for existing files which we can now use to also take pending deletes into account which ensures that all file generations per segment always go forward.	2018-05-16 11:17:43 +02:00
Adrien Grand	d764156f91	LUCENE-8303: Make the overflow test a Monster rather than Nightly.	2018-05-11 14:36:42 +02:00
Simon Willnauer	a3c86373e4	LUCENE-8298: Allow DocValues updates to reset a value Today once a document has a value in a certain DV field this value can only be changed but not removed. While resetting / removing a value from a field is certainly a corner case it can be used to undelete a soft-deleted document unless it's merged away. This allows to rollback changes without rolling back to another commitpoint or trashing all uncommitted changes. In certain scenarios it can be used to "repair" history of documents in distributed systems.	2018-05-09 18:57:57 +02:00
Adrien Grand	8dc69428e3	LUCENE-8303: Make LiveDocsFormat only responsible for serialization/deserialization of live docs.	2018-05-09 15:40:14 +02:00
Dawid Weiss	85c00e77ef	LUCENE-8267: removed references to memory codecs.	2018-05-08 10:32:11 +02:00
Adrien Grand	67c13bbe2e	LUCENE-8142: Fix QueryUtils to only call getMaxScore when it is legal to do so.	2018-05-02 17:42:18 +02:00
Adrien Grand	46ecb73976	LUCENE-8142: Fix AssertingImpactsEnum and add missing javadoc.	2018-05-02 17:20:42 +02:00
Adrien Grand	af680af77f	LUCENE-8142: Make postings APIs expose raw impacts rather than scores.	2018-05-02 14:49:32 +02:00
Simon Willnauer	933d8a6995	LUCENE-8275: Fix BaseLockFactoryTestCase to step out on Windows if pending files are found The particular test here is #testStressLocks that has several protections against WindowsFS and special logic in the catch clause that steps out on fatal exceptions with pending deletes. Since we now check this consistently in the IW ctor we need to also skip this entire test if we are on Windows and have pending deletes.	2018-04-26 12:10:10 +02:00
Alan Woodward	e167e91247	LUCENE-8270: Remove MatchesIterator.term()	2018-04-23 16:51:17 +01:00
Simon Willnauer	6f0a884582	LUCENE-8269: Detach downstream classes from IndexWriter IndexWriter today is shared with many classes like BufferedUpdateStream, DocumentsWriter and DocumentsWriterPerThread. Some of them even acquire locks on the writer instance or assert that the current thread doesn't hold a lock. This makes it very difficult to have a manageable threading model. This change separates out the IndexWriter from those classes and makes them all independent of IW. IW now implements a new interface for DocumentsWriter to communicate on failed or successful flushes and tragic events. This allows IW to make its critical methods private and execute all lock critical actions on its private queue that ensures that the IW lock is not held. Follow-up changes will try to detach more code like publishing flushed segments to ensure we never call back into IW in an uncontrolled way.	2018-04-23 17:17:40 +02:00
Simon Willnauer	c70cceaee5	LUCENE-8253: Account for soft-deletes before they are flushed to disk Inside the IndexWriter buffers are only written to disk if it's needed or "worth it" which doesn't guarantee soft deletes to be accounted in time. This is not necessarily a problem since they are eventually collected and segments that have soft-deletes will be merged eventually but for tests and on par behavior compared to hard deletes this behavior is tricky. This change cuts over to accounting in-place just like hard-deletes. This results in accurate delete numbers for soft deletes at any given point in time once the reader is loaded or a pending soft delete occurs. This change also fixes an issue where all updates to a DV field are allowed even if the field is unknown. Now this only works if the field is equal to the soft deletes field. This behavior was never released.	2018-04-16 16:17:06 +02:00
Mike McCandless	7c0387ad3f	LUCENE-8248: MergePolicyWrapper is renamed to FilterMergePolicy and now also overrides getMaxCFSSegmentSizeMB	2018-04-13 15:45:19 -04:00
Alan Woodward	040a9601b1	LUCENE-8229: Add Weight.matches() to iterate over match positions	2018-04-11 09:43:27 +01:00
Alan Woodward	798d351034	LUCENE-8242: Deprecate createNormalizedWeight	2018-04-09 15:07:04 +01:00
Simon Willnauer	ed62b990d8	LUCENE-8237: Add a SoftDeletesDirectoryReaderWrapper This adds support for soft deletes if the reader is opened from a directory. Today we only support soft deletes for NRT readers, this change allows to wrap existing DirectoryReader with a SoftDeletesDirectoryReaderWrapper to also filter out soft deletes in the case of a non-NRT reader.	2018-04-09 11:50:38 +02:00
Simon Willnauer	ecc17f9023	LUCENE-8233: Add support for soft deletes to IndexWriter This change adds support for soft deletes as a fully supported feature by the index writer. Soft deletes are accounted for inside the index writer and therefore also by merge policies. This change also adds a SoftDeletesRetentionMergePolicy that allows users to selectively carry over soft_deleted documents across merges for retention policies. The merge policy selects documents that should be kept around in the merged segment based on a user provided query.	2018-04-04 13:45:14 +02:00
Robert Muir	e595541ef3	LUCENE-8192: always enforce index-time offsets are correct with BaseTokenStreamTestCase	2018-03-26 22:02:34 -04:00
Alan Woodward	fac84c01c8	LUCENE-8202: Add FixedShingleFilter	2018-03-21 13:45:03 +00:00
Simon Willnauer	2e35ef2b3d	LUCENE-8215: Fix several fragile exception handling places in o.a.l.index Several places in the index package don't handle exceptions well or ignore them. This change adds some utility methods and cuts over to make use of try/with blocks to simplify exception handling.	2018-03-20 10:50:12 +01:00
Adrien Grand	3048e5da22	LUCENE-8008: Remove unintended changes.	2018-03-20 09:52:24 +01:00
Robert Muir	97299ed006	LUCENE-8191: if a tokenstream has broken offsets, it's broken. IndexWriter always checks, so a separate whitelist can't work	2018-03-04 11:23:45 -05:00
Erick	ad7e94afb2	SOLR-12028: BadApple and AwaitsFix annotations usage	2018-03-03 21:42:14 -08:00
Uwe Schindler	7dba350c7a	SOLR-12028: Make initialization of constants dynamic (by reading the annotation), also add missing reproduce info	2018-02-28 00:47:00 +01:00
Erick Erickson	1fe45606b9	SOLR-12028: BadApple and AwaitsFix annotations usage	2018-02-26 20:35:12 -08:00
Adrien Grand	317a2e0c3d	LUCENE-8153: Make impacts checks lighter by default. The new `-slow` switch makes checks more complete but also more heavy. This option also cross-checks term vectors.	2018-02-20 17:14:11 +01:00
Adrien Grand	4fb7e3d02c	LUCENE-8135: Implement block-max WAND.	2018-02-15 15:13:58 +01:00
Alan Woodward	342e38217a	LUCENE-8163: BaseDirectoryTestCase produces random filenames that fail on Windows	2018-02-09 09:14:02 +00:00
Adrien Grand	f410df8113	LUCENE-4198: Give codecs the opportunity to index impacts.	2018-01-31 14:54:52 +01:00
Adrien Grand	75d50b4492	LUCENE-8116: Remove unnecessary IOException.	2018-01-11 11:49:36 +01:00
Adrien Grand	838c604b76	LUCENE-8119: Remove SimScorer.maxScore(float maxFreq).	2018-01-09 14:42:16 +01:00
Alan Woodward	d250a1463d	LUCENE-8133: Rename TermContext to TermStates, and load TermState lazily if term stats are not required	2018-01-05 14:17:15 +00:00
Adrien Grand	8fd7ead940	LUCENE-8116: SimScorer now only takes a frequency and a norm as per-document scoring factors.	2018-01-04 15:13:36 +01:00
Alan Woodward	c1030eeb74	LUCENE-8012: Explanation takes Number rather than float	2018-01-02 11:06:59 +00:00
Adrien Grand	b2f248164c	LUCENE-8010: Fix similarities so that they pass tests.	2017-12-29 10:06:00 +01:00
Steve Rowe	3e2f9e62d7	LUCENE-2899: Add OpenNLP Analysis capabilities as a module	2017-12-15 11:24:18 -05:00
Adrien Grand	d5c72eb588	LUCENE-8081: Remove unused import.	2017-12-08 08:45:18 +01:00
Simon Willnauer	ede46fe6e9	LUCENE-8081: Allow IndexWriter to opt out of flushing on indexing threads Index/Update Threads try to help out flushing pending document buffers to disk. This change adds an expert setting to opt out of this behavior unless flushing is falling behind.	2017-12-07 16:22:52 +01:00
Adrien Grand	4fc5a872de	LUCENE-4100: Faster disjunctions when the hit count is not needed.	2017-12-07 10:49:39 +01:00
Adrien Grand	63b63c5734	LUCENE-8015: Fixed DFR similarities' scores to not decrease when tfn increases.	2017-12-06 18:19:57 +01:00
Adrien Grand	a8a63464e7	LUCENE-7996: Queries are now required to produce positive scores.	2017-12-06 14:06:03 +01:00
Simon Willnauer	01d12777c4	LUCENE-8068: Allow IndexWriter to write a single DWPT to disk Adds a `flushNextBuffer` method to IndexWriter that allows the caller to synchronously move the next pending or the biggest non-pending index buffer to disk. This enables flushing selected buffer to disk without hijacking an indexing thread. This is for instance useful if more than one IW (shards) must be maintained in a single JVM / system.	2017-11-30 18:57:27 +01:00
Adrien Grand	d27ddcb409	LUCENE-8008: Reduce leniency in CheckHits.	2017-11-29 18:09:38 +01:00
David Smiley	64d95e6a6d	LUCENE-8049: IndexWriter.getMergingSegments() signature changed to return Set instead of Collection	2017-11-26 23:25:06 -05:00
Alan Woodward	183571c085	LUCENE-6278: Remove Scorer.freq()	2017-11-15 11:14:16 +00:00
Alan Woodward	276e317e94	LUCENE-8042: Add SegmentCachable interface	2017-11-10 12:17:50 +00:00
Alan Woodward	1aa049bb27	LUCENE-8014: Remove deprecated SimScorer methods	2017-11-10 09:43:18 +00:00
Alan Woodward	764abcb31a	Revert "LUCENE-8014: Remove deprecated SimScorer methods" Reverting to fix test failures This reverts commit `946ec9d5b9`.	2017-11-10 09:02:03 +00:00
Alan Woodward	946ec9d5b9	LUCENE-8014: Remove deprecated SimScorer methods	2017-11-09 14:05:34 +00:00
Alan Woodward	a886a001a4	LUCENE-8017: Add Weight.getCacheHelper()	2017-11-03 10:40:14 +00:00
Robert Muir	ca5f9b3457	LUCENE-8007: Make scoring statistics mandatory	2017-11-02 23:02:21 -04:00
Robert Muir	875d45ff14	LUCENE-8030: fix buggy assert	2017-10-31 22:30:33 -04:00
Robert Muir	e0bde57981	LUCENE-8020: don't force sim to score bogus terms (e.g. docfreq=0)	2017-10-30 20:32:12 -04:00
Robert Muir	489ca238c4	LUCENE-8021: Add AssertingSimilarity	2017-10-30 18:38:26 -04:00
Robert Muir	42717d5f4b	LUCENE-7997: More sanity testing of similarities	2017-10-24 22:48:04 -04:00
Mike McCandless	ea36f5040c	LUCENE-7999: upgrade int to long for tracking the counter for the next segment name to prevent overflow	2017-10-24 13:13:41 -04:00
Dawid Weiss	46cd679e91	LUCENE-7983: IndexWriter.IndexReaderWarmer is now a functional interface instead of an abstract class with a single method.	2017-10-04 10:59:16 +02:00
Nicholas Knize	bf71650ad7	LUCENE-7392: Add point based LatLonBoundingBox as new RangeField Type.	2017-09-19 14:45:04 -05:00
yonik	a4374e840d	SOLR-11173: implement Points support in TermsComponent via PointMerger	2017-08-19 18:02:11 -04:00
Adrien Grand	9c83d025e4	LUCENE-7897: IndexOrDocValuesQuery now requires the range cost to be more than 8x greater than the cost of the lead iterator in order to use doc values.	2017-08-10 12:10:44 +02:00


1617 Commits