lucene

mirror of https://github.com/apache/lucene.git synced 2025-03-03 14:59:16 +00:00

Author	SHA1	Message	Date
Simon Willnauer	bc4da80776	Fix visibility on member variables in IndexWriter and friends (#1460) Today it looks like wild wild west inside IndexWriter and some of its associated classes. This change makes sure all non-final members have private visibility, methods that are not used outside of IW today are made private unless they have been public. This change also removes some unused or unnecessary members where possible and deletes some dead code from previous refactoring.	2020-04-27 17:49:20 +02:00
Uwe Schindler	64eed9a1a6	LUCENE-9347: Add support for forbiddenapis 3.0 (#1459)	2020-04-27 11:54:59 +02:00
Alan Woodward	5d5b7e14d4	LUCENE-9314: Use SingletonDocumentBatch in monitor when we only have a single document	2020-04-27 10:41:49 +01:00
Tomoko Uchida	13bbe60333	LUCENE-9344: update file names (MIGRATE.txt, BUILD.txt => MIGRATE.md, BUILD.md)	2020-04-27 10:23:52 +09:00
Tomoko Uchida	f03e6aac59	SOLR-14429: Convert .txt files to properly formatted .md files (#1450)	2020-04-27 08:43:04 +09:00
Simon Willnauer	8059eea160	Consolidate all IW locking inside IndexWriter (#1454) Today we still have one class that runs some tricky logic that should be in the IndexWriter in the first place since it requires locking on the IndexWriter itself. This change inverts the API and now FrozenBufferedUpdates does not get the IndexWriter passed in, instead the IndexWriter owns most of the logic and executes on a FrozenBufferedUpdates object. This prevents locking on IndexWriter outside of the writer itself and paves the way to simplify some concurrency down the road.	2020-04-24 19:07:21 +02:00
Pierre-Luc Perron	013e98347a	LUCENE-9267 Replace getQueryBuildTime time unit from ms to ns	2020-04-24 10:36:30 -05:00
Simon Willnauer	d7e0b906ab	LUCENE-9345: Separate MergeScheduler from IndexWriter (#1451) This change extracts the methods that are used by MergeScheduler into a MergeSource interface. This allows IndexWriter to better ensure locking, hide internal methods, and remove the tight coupling between the two complex classes. This will also improve future testing.	2020-04-24 15:02:55 +02:00
Alan Woodward	5eb117f561	LUCENE-9340: Remove deprecated SimpleBindings#add(SortField) method	2020-04-24 12:22:21 +01:00
Alan Woodward	f6462ee350	LUCENE-9340: Deprecate SimpleBindings#add(SortField) (#1447) This method is trappy; it doesn't work for all SortField types, but doesn't tell you that until runtime. This commit deprecates it, and removes all other callsites in the codebase.	2020-04-24 12:08:16 +01:00
Alan Woodward	ed3caab2d8	LUCENE-9338: Clean up type safety in SimpleBindings (#1444) Replaces SimpleBindings' Map<String, Object> with a map of Function<Bindings, DoubleValuesSource> to improve type safety, and reworks cycle detection and validation to avoid catching StackOverflowException	2020-04-24 10:23:50 +01:00
Simon Willnauer	83018deef7	Ensure we use a sane IWC for tests adding many documents. This test produced tons of files on nightly builds, causing TooManyOpenFilesExceptions, likely due to not using CFS on flush and/or very small maxMergeSize values.	2020-04-24 08:36:06 +02:00
Tomoko Uchida	75b648ce82	LUCENE-9344: Use https url for lucene.apache.org	2020-04-24 14:45:34 +09:00
Tomoko Uchida	c7697b088c	LUCENE-9344: Convert .txt files to properly formatted .md files (#1449)	2020-04-24 14:28:12 +09:00
Tomas Fernandez Lobbe	a11b78e06a	LUCENE-9342: Collector's totalHitsThreshold should not be lower than numHits (#1448) Use the maximum of the two so that the relation is EQUAL_TO when the number of hits in a query is less than the collector's numHits.	2020-04-23 12:04:02 -07:00
Simon Willnauer	4a98918bfa	LUCENE-9339: Only call MergeScheduler when we actually found new merges (#1445) IW#maybeMerge calls the MergeScheduler even if it didn't find any merges. We should instead only do this if there is in fact anything to merge, and save the call into a sync'd method.	2020-04-22 21:26:45 +02:00
Simon Willnauer	2b6ae53cd9	LUCENE-9337: Ensure CMS updates its thread accounting datastructures consistently (#1443) CMS today releases its lock after finishing a merge before it re-acquires it to update the thread accounting datastructures. This causes threading issues where concurrently finishing threads fail to pick up pending merges causing potential thread starvation on forceMerge calls.	2020-04-22 14:30:14 +02:00
Mike McCandless	e0c06ee6a6	LUCENE-9191: make LineFileDocs random seeking more efficient by recording safe skip points in the concatenated gzip'd chunks	2020-04-21 12:09:17 -04:00
Simon Willnauer	56c61e698c	Remove dead code	2020-04-21 13:38:19 +02:00
Ignacio Vera	f914e08b36	LUCENE-9273: Speed up geometry queries by specialising Component2D spatial operations (#1341) Instead of using a generic relate method for all relations, we use specialised methods for each one. In addition, the type of triangle is computed at deserialisation time, therefore we can be more selective when decoding points of a triangle.	2020-04-20 19:24:49 +02:00
Simon Willnauer	9881dc031c	Fix compiler warnings in tests	2020-04-18 14:45:03 +02:00
Simon Willnauer	113043b1ed	LUCENE-9324: Add an ID to SegmentCommitInfo (#1434) We already have IDs in SegmentInfo, as well as on SegmentInfos which are useful to uniquely identify segments and entire commits. Having IDs on SegmentCommitInfo is useful too in order to compare commits for equality and make snapshots incremental on generational files. This change adds a unique ID to SegmentCommitInfo starting from Lucene 8.6. Older segments won't have an ID until the segment receives an update or a delete even if they have been opened and / or committed by Lucene 8.6 or above.	2020-04-18 14:24:57 +02:00
Erick Erickson	3af165b32a	LUCENE-7788: fail precommit on unparameterised log messages and examine for wasted work/objects	2020-04-17 20:40:32 -04:00
Tommaso Teofili	243cf2c99d	LUCENE-9327 - drop useless casts in BaseXYShapeTestCase	2020-04-17 14:57:18 +02:00
iverase	9340e56551	Add back-compat indices for 8.5.1	2020-04-16 09:52:08 +02:00
iverase	b7b85f3e75	Move bugfix entries to version 8.5.1	2020-04-16 09:36:55 +02:00
iverase	8a88ab0e7c	Add bugfix version 8.5.1	2020-04-16 09:27:06 +02:00
Adrien Grand	0aa4ba7ccb	LUCENE-9260: Verify checksums of CFS files. (#1311)	2020-04-15 15:10:59 +02:00
Adrien Grand	aa605b3c70	LUCENE-9307: Remove the ability to set the buffer size dynamically on BufferedIndexInput (#1415)	2020-04-15 15:10:11 +02:00
Simon Willnauer	47bc18478a	Move DWPT private deletes out of FrozenBufferedUpdates (#1431) This change moves the deletes tracked by FrozenBufferedUpdates that are private to the DWPT and never used in a global context out of FrozenBufferedUpdates.	2020-04-14 21:37:19 +02:00
Simon Willnauer	18af6325ed	LUCENE-9304: Fix IW#getMaxCompletedSequenceNumber() (#1427) After recent refactoring on LUCENE-9304 `IW#getMaxCompletedSequenceNumber()` might return values that belong to non-completed operations if a full flush is running, a new delete queue is already in place but not all DWPTs that participate in the full flush have finished their in-flight operations. This caused rare failures in `TestControlledRealTimeReopenThread#testControlledRealTimeReopenThread` where documents are not actually visible given the max completed seqNo. This change streamlines the delete queue advance, adds a dedicated testcase and ensures that a delete queue's sequence ID space is never exhausted.	2020-04-14 19:39:23 +02:00
Julie Tibshirani	3236d38c8b	Avoid using a raw Arc type. (#1429) This fixes some compiler warnings that popped up recently.	2020-04-14 09:23:12 +02:00
Simon Willnauer	f5457b82a1	Suppress Direct postings for TestIndexWriterThreadsToSegments to prevent OOM on Nightly	2020-04-13 13:44:15 +02:00
Dawid Weiss	616ec987a9	Do a bit count on 8 bytes from a long directly instead of reading 8 bytes from the reader. Byte order doesn't matter here. (#1426)	2020-04-13 13:37:25 +02:00
Shalin Shekhar Mangar	13f19f6555	SOLR-9906: SolrjNamedThreadFactory is deprecated in favor of SolrNamedThreadFactory. DefaultSolrThreadFactory is removed from solr-core in favor of SolrNamedThreadFactory in solrj package and all solr-core classes now use SolrNamedThreadFactory	2020-04-13 08:16:35 +05:30
Simon Willnauer	8c1f9815db	LUCENE-9309: ensure stopMerges is set under IW lock	2020-04-11 19:53:21 +02:00
Simon Willnauer	2602269f3e	LUCENE-9304: Refactor DWPTPool to pool DWPT directly (#1397) This change removes the ThreadState indirection from DWPTPool and pools DWPT directly. The tracking information and locking semantics are mostly moved to DWPT directly and the pool semantics have changed slightly such that DWPT need to be checked-out in the pool once they need to be flushed or aborted. This automatically grows and shrinks the number of DWPT in the system when the number of threads grows or shrinks. Access of pooled DWPTs is more straightforward and doesn't require an ordinal. Instead consumers can just iterate over the elements in the pool. This allowed for removal of indirections in DWPTFlushControl like BlockedFlush, the removal of DWPTPool setter and getter in IndexWriterConfig and the addition of stronger assertions in DWPT and DW.	2020-04-11 12:23:46 +02:00
Nhat Nguyen	527e651660	LUCENE-9298: Fix TestBufferedUpdates. This test failed on Elastic CI because we did not add any term in the loop. This commit ensures that we always add at least one docId, term and query in the test.	2020-04-10 15:28:10 -04:00
Simon Willnauer	e376582e25	LUCENE-9309: Wait for #addIndexes merges when aborting merges (#1418) The SegmentMerger usage in IW#addIndexes(CodecReader...) might make changes to the Directory while the IW tries to clean-up files on rollback. This causes issues like FileNotFoundExceptions when IDF tries to remove temp files. This change adds a waiting mechanism to the abortMerges method that, in addition to the running merges, also waits for merges in addIndexes(CodecReader...).	2020-04-10 12:55:02 +02:00
YuBinglei	2935186c5b	LUCENE-9298: Improve RAM accounting in BufferedUpdates when deleted doc IDs and terms are cleared (#1389)	2020-04-10 12:30:47 +02:00
Bruno Roustant	6bba35a709	LUCENE-9286: FST.Arc.BitTable reads FST bytes directly. Arc is lightweight again and FSTEnum traversal is faster.	2020-04-09 10:36:37 +02:00
Juan Camilo Rodriguez Duran	de6233976a	LUCENE-8050: PerFieldDocValuesFormat should not get the DocValuesFormat on a field that has no doc values. Closes #1408	2020-04-07 16:12:05 -04:00
Adrien Grand	529042e786	LUCENE-9271: Complete fix for setBufferSize.	2020-04-07 17:24:41 +02:00
Adrien Grand	3363e1aa48	LUCENE-9271: Fix bad assertion.	2020-04-07 16:21:33 +02:00
Adrien Grand	82692e76e0	LUCENE-9271: Move BufferedIndexInput to the ByteBuffer API. Closes #1338	2020-04-07 13:30:09 +02:00
Ignacio Vera	f018c4c813	LUCENE-9244: In 2D, a point can be shared by four leaves (#1279) Adjust TestLucene60PointsFormat#testEstimatePointCount2Dims so it does not fail when a point is shared by multiple leaves.	2020-04-07 10:41:15 +02:00
Erick Erickson	e1e2085e94	SOLR-14386: Update Jetty to 9.4.27 and dropwizard-metrics version to 4.1.5	2020-04-04 16:14:57 -04:00
Jim Ferenczi	b5c5ebe37c	LUCENE-9300: Fix field infos update on doc values update (#1394) Today a doc values update creates a new field infos file that contains the original field infos updated for the new generation as well as the new fields created by the doc values update. However existing fields are cloned through the global fields (shared in the index writer) instead of the local ones (present in the segment). In practice this is not an issue since field numbers are shared between segments created by the same index writer. But this assumption doesn't hold for segments created by different writers and added through IndexWriter#addIndexes(Directory). In this case, the field number of the same field can differ between segments so any doc values update can corrupt the index by assigning the wrong field number to an existing field in the next generation. When this happens, queries and merges can access wrong fields without throwing any error, leading to a silent corruption in the index. This change ensures that we preserve local field numbers when creating a new field infos generation.	2020-04-03 13:58:05 +02:00
Atri Sharma	d6cef4f39c	Update CHANGES.txt	2020-04-01 20:56:19 +05:30
Atri Sharma	9ed71a6efe	LUCENE-9074: Slice Allocation Control Plane For Concurrent Searches (#1294) This commit introduces a mechanism to control allocation of threads to slices planned for a query. The default implementation uses the size of backlog queue of the executor to determine if a slice should be allocated a new thread.	2020-04-01 20:42:26 +05:30

11853 Commits