Commit Graph

35119 Commits

Author SHA1 Message Date
Gautam Worah efeea0b8ee
LUCENE-9902 Minor fixes to the faceting API (#62) 2021-04-06 14:50:23 -04:00
Robert Muir be94a667f2
LUCENE-9827: avoid wasteful recompression for small segments (#28)
Require that the segment has enough dirty documents to create a clean
chunk before recompressing during merge: there must be at least maxChunkSize.

This prevents wasteful recompression with small flushes (e.g. every
document): we ensure recompression achieves some "permanent" progress.

Expose maxDocsPerChunk as a parameter for Term vectors too, matching the
stored fields format. This allows for easy testing.

Increment numDirtyDocs for partially optimized merges:
If segment N needs recompression, we have to flush any buffered docs
before bulk-copying segment N+1. Don't just increment numDirtyChunks,
also make sure numDirtyDocs is incremented, too.
This doesn't have a performance impact, and is unrelated to tooDirty()
improvements, but it is easier to reason about things with correct
statistics in the index.

Further tuning of how dirtiness is measured: for simplification just use percentage
of dirty chunks.

Co-authored-by: Adrien Grand <jpountz@gmail.com>
2021-04-06 14:18:48 -04:00
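The dirtiness rule described in the LUCENE-9827 message above can be sketched roughly as follows. This is a minimal illustration, not the Lucene implementation: the class `DirtySegmentSketch`, the parameter names, and the 1% threshold used in `main` are assumptions. The message itself only says that a segment needs at least a chunk's worth of dirty documents and that dirtiness is measured as a percentage of dirty chunks.

```java
// Hypothetical sketch of a "too dirty" check; names and threshold are assumptions.
final class DirtySegmentSketch {
  private DirtySegmentSketch() {}

  /**
   * Returns true only when recompression can make "permanent" progress:
   * there are enough dirty docs to fill one clean chunk AND the share of
   * dirty chunks exceeds the given percentage.
   */
  static boolean tooDirty(long numDirtyDocs, long numDirtyChunks, long numChunks,
                          int maxDocsPerChunk, int dirtyChunkPercent) {
    // Condition 1: enough dirty documents to build at least one clean chunk.
    boolean canFormCleanChunk = numDirtyDocs >= maxDocsPerChunk;
    // Condition 2: percentage of dirty chunks, computed without floating point.
    boolean overThreshold = numDirtyChunks * 100 > numChunks * (long) dirtyChunkPercent;
    return canFormCleanChunk && overThreshold;
  }

  public static void main(String[] args) {
    // A segment flushed with a single document: one dirty chunk, one dirty doc.
    System.out.println(tooDirty(1, 1, 1, 128, 1));     // false: cannot fill a clean chunk
    // 200 dirty docs spread over 5 of 100 chunks.
    System.out.println(tooDirty(200, 5, 100, 128, 1)); // true: both conditions hold
  }
}
```

The first call shows the case the commit targets: with tiny flushes every chunk is dirty, but recompressing would not yet produce a single full chunk, so the merge bulk-copies instead.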
Adrien Grand d991fefb49
Add an example to the CacheHelper docs. (#50) 2021-04-06 16:25:15 +02:00
Dawid Weiss 2662a74cab Correct some of the jdk17-offending javadocs. 2021-04-05 20:34:52 +02:00
Dawid Weiss 2773172455 Correct some of the jdk17-offending javadocs. 2021-04-05 20:21:52 +02:00
Dawid Weiss baceb16904 Correct some of the jdk17-offending javadocs. 2021-04-05 20:19:56 +02:00
Dawid Weiss fbf9191abf
LUCENE-9901: UnicodeData.java has no regeneration task (#63) 2021-04-05 20:12:56 +02:00
Ignacio Vera 67a0bd4b6d
LUCENE-9705: Final clean-up and entry in CHANGES.txt (#59) 2021-04-04 11:30:47 +02:00
Dawid Weiss 010e3a1ba9
LUCENE-9900: Regenerate/ run ICU only if inputs changed (#61) 2021-04-02 11:46:43 +02:00
Dawid Weiss e3ae57a3c1
LUCENE-9872: Make the most painful tasks in regenerate fully incremental (#60) 2021-04-02 09:56:47 +02:00
Tomoko Uchida 670bbf8b99
Ignore sdkmanrc file on Git (#58) 2021-04-02 01:04:14 +09:00
Ignacio Vera 8c9b9546cc
LUCENE-9705: Create Lucene90PointsFormat (#52) 2021-04-01 07:04:04 +02:00
Pieter van Boxtel 1d579b9448
LUCENE-9898 Remove no longer used scorePayload method from BM25Similarity (#57) 2021-04-01 09:06:03 +09:00
zacharymorn 79fcd99f4c
LUCENE-9883: Turn on ecj missingEnumCaseDespiteDefault setting (#56) 2021-03-31 15:50:52 +09:00
Dawid Weiss 32e891c60f LUCENE-9871: move dummy outputs aspect into a separate file. 2021-03-30 20:15:55 +02:00
Adrien Grand 10520185a9 LUCENE-9877: Move CHANGES entry under 8.9. 2021-03-30 15:13:00 +02:00
Greg Miller fd79f9737a
LUCENE-9877: Allow up to 7 exceptions in PForUtil (instead of 3) (#48)
Co-authored-by: Greg Miller <gmiller@amazon.com>
2021-03-30 15:11:33 +02:00
Dawid Weiss 39b8e97613 LUCENE-9896: Add 'quiet exec' utility suppressing exec output unless a failure occurs 2021-03-30 14:38:13 +02:00
Dawid Weiss c7455ff561 LUCENE-9871: cleaning up the build system. Upgrade palantir. Remove all ant-related hacks. 2021-03-30 12:41:06 +02:00
Dawid Weiss fd685682be This removes the last of ant-compatibility hacks - cross-project dependency on test classes. Replaced with gradle's test fixture artifact sharing. Cleaned up spatial3d classes a bit too. 2021-03-30 12:35:33 +02:00
Dawid Weiss f83c9462bb Remove legacy ant hacks - add conf to test sourceSet. Correct jvm options hack (don't apply to benchmarks run). 2021-03-30 11:33:27 +02:00
Dawid Weiss 89024a466b Remove exceptional test exclusions for forked non-tests and inner classes. 2021-03-30 11:13:41 +02:00
Dawid Weiss 78bfbe0bad We don't need to exclude inner classes explicitly. 2021-03-30 10:57:15 +02:00
Dawid Weiss 3115797463 LUCENE-9871: clean up some old cruft and shuffle files around. Correct inputs/outputs on check broken links so that it's incremental. 2021-03-30 10:55:19 +02:00
Dawid Weiss 974e4bc5e8 LUCENE-9880: correct task ordering for clean. 2021-03-30 10:08:44 +02:00
Ignacio Vera 00e57f8c8a
LUCENE-9705: Create Lucene90SegmentInfoFormat (#30)
The existing Lucene86SegmentInfoFormat is moved to backwards-codecs.
2021-03-30 10:04:17 +02:00
iverase c11a01ab61 Move LUCENE-9870 under Lucene 8.8.2 2021-03-30 10:00:39 +02:00
Michael McCandless 4d16ff21b2
LUCENE-9888: re-enable CheckIndex verification that indexSort is the same across all segments (#49) 2021-03-29 12:29:40 -04:00
liupanfeng cce982146a LUCENE-9887: fix incorrect parameter use in RadixSelector 2021-03-29 12:16:06 +02:00
Jørgen Nystad 06114459ee
LUCENE-9870: Fix Circle2D intersectsLine t-value (distance) range clamp (#41)
Fixes missing matches when line magnitudeAB < 1
2021-03-29 10:41:54 +02:00
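For readers unfamiliar with the t-value clamp mentioned in LUCENE-9870, here is a textbook circle/segment intersection test in plain Cartesian coordinates. It is only a sketch under stated assumptions: the class `SegmentCircleSketch` and its method are hypothetical, and this is not the code in Circle2D. The point it illustrates is that the projection parameter t is normalized by the squared segment length and clamped to [0, 1], so the result does not depend on whether the segment is shorter than 1.

```java
// Hypothetical sketch: closest-point test between a circle and a line segment.
final class SegmentCircleSketch {
  private SegmentCircleSketch() {}

  /** Returns true if segment A-B touches or crosses the circle (cx, cy, r). */
  static boolean intersects(double cx, double cy, double r,
                            double ax, double ay, double bx, double by) {
    double abx = bx - ax;
    double aby = by - ay;
    double lenSq = abx * abx + aby * aby;
    double t = 0.0; // degenerate segment: treat A as the closest point
    if (lenSq > 0.0) {
      // Projection of the center onto the line, as a fraction of the segment.
      t = ((cx - ax) * abx + (cy - ay) * aby) / lenSq;
      // Clamp to the segment itself; the valid range is [0, 1] for any length.
      t = Math.max(0.0, Math.min(1.0, t));
    }
    double px = ax + t * abx;
    double py = ay + t * aby;
    double dx = cx - px;
    double dy = cy - py;
    return dx * dx + dy * dy <= r * r;
  }

  public static void main(String[] args) {
    // Short segment (length 0.5) lying inside the unit circle.
    System.out.println(intersects(0, 0, 1, -0.25, 0.5, 0.25, 0.5)); // true
    // Short segment far away from the circle.
    System.out.println(intersects(0, 0, 1, 2, 2, 2.5, 2));          // false
  }
}
```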
Mike McCandless d5d6dc0793 LUCENE-9385: add CHANGES.txt entry 2021-03-27 12:40:06 -04:00
zacharymorn 3648a1020a
LUCENE-9385: Add FacetsConfig option to control which drill-down terms are indexed for a FacetLabel (#25) 2021-03-27 12:38:00 -04:00
Robert Muir 3596e05e5c
LUCENE-9878: enable redundantNullCheck in ecjLint (#44)
Detects common cases of unreachable/dead code.

For generated javacc code, the check is disabled via
SuppressWarnings("unused") because javacc generates strange/bad code such as:

  if ("" == null)

For TestStressNRTReplication's startNode() method, the check is also
disabled because analysis folds the "test evilness controls" which are
static final constants. This itself is a WTF, shouldn't we instead
randomize these evil things in our tests rather than hardcoding them to
specific values?
2021-03-27 11:43:47 -04:00
Uwe Schindler 3538709269 Improvement for LUCENE-9881 (#46): Completely disable the Eclipse plugin's eclipseJdt task and replace it with our own, which just copies the filtered config files. This now works correctly with inputs/outputs. 2021-03-27 12:08:12 +01:00
Robert Muir 690e256ec9
LUCENE-9881: synchronize ECJ linter with Eclipse IDE (#46)
Co-authored-by: Uwe Schindler <uschindler@apache.org>
2021-03-27 00:42:29 +01:00
Dawid Weiss f02799c511
Skip errorprone on non-nightlies. (#45) 2021-03-26 21:42:15 +01:00
Mayya Sharipova 48715fe898
LUCENE-9507 Custom order for leaves in IndexReader and IndexWriter (#32)
1. Add an option to supply a custom leaf sorter for IndexWriter.
A DirectoryReader opened from this IndexWriter will have its leaf
readers sorted with the provided leaf sorter. This is useful for
indices on which it is expected to run many queries with particular
sort criteria (e.g. for time-based indices this is usually a
descending sort on timestamp). Providing leafSorter allows
to speed up early termination for this particular type of
sort queries.

2. Add an option to supply a custom sub-readers sorter for
BaseCompositeReader. In this case sub-readers will be sorted
according to the provided leafSorter.

3. Add an option to supply a custom leaf sorter for
StandardDirectoryReader. The leaf readers of this
StandardDirectoryReader will be sorted according to
the provided leaf sorter.
2021-03-26 09:56:02 -04:00
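The leaf-sorting idea in LUCENE-9507 can be illustrated without any Lucene types. In the sketch below, `Leaf`, `LeafSorterSketch`, and `NEWEST_FIRST` are hypothetical stand-ins (a real leaf sorter would compare Lucene leaf readers). It only shows the ordering step: leaves are visited newest first, which is what lets a descending-timestamp query terminate early.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: ordering index leaves with a user-supplied comparator.
final class LeafSorterSketch {
  private LeafSorterSketch() {}

  /** Stand-in for a segment reader; only carries what the sorter needs. */
  static final class Leaf {
    final String name;
    final long maxTimestamp;

    Leaf(String name, long maxTimestamp) {
      this.name = name;
      this.maxTimestamp = maxTimestamp;
    }
  }

  /** Descending sort on the largest timestamp held by each leaf. */
  static final Comparator<Leaf> NEWEST_FIRST =
      Comparator.comparingLong((Leaf leaf) -> leaf.maxTimestamp).reversed();

  /** Returns leaf names in the order a reader would visit them. */
  static List<String> sortedLeafNames(List<Leaf> leaves, Comparator<Leaf> leafSorter) {
    List<Leaf> sorted = new ArrayList<>(leaves); // do not mutate the caller's list
    sorted.sort(leafSorter);
    List<String> names = new ArrayList<>();
    for (Leaf leaf : sorted) {
      names.add(leaf.name);
    }
    return names;
  }

  public static void main(String[] args) {
    List<Leaf> leaves = Arrays.asList(
        new Leaf("old", 100L), new Leaf("new", 300L), new Leaf("mid", 200L));
    System.out.println(sortedLeafNames(leaves, NEWEST_FIRST)); // [new, mid, old]
  }
}
```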
Tomoko Uchida b174ef45c4
Add CHANGES entry for gradle build. (#43) 2021-03-26 09:50:38 +09:00
Tomoko Uchida 8c61c6b561
Point to jdk.java.net instead of the OracleJDK page. (#42) 2021-03-26 08:37:52 +09:00
Tomoko Uchida ea74ffb984
LUCENE-9853: Use CJKWidthCharFilter as the default character width normalizer in JapaneseAnalyzer (#26) 2021-03-26 08:32:42 +09:00
zacharymorn 3ed87c867a
LUCENE-9864: Enforce @Override annotation everywhere (#40)
Requiring the annotation is helpful because if an abstract method is removed, the concrete methods will then show up as compile errors, preventing dead code from being accidentally left behind.

Co-authored-by: Robert Muir <rmuir@apache.org>
2021-03-25 17:50:38 -04:00
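The reasoning in LUCENE-9864 can be seen in a small, self-contained example. The class names below are hypothetical and chosen only for illustration; nothing here is taken from the Lucene code base.

```java
// Hypothetical sketch: why a mandatory @Override catches orphaned methods.
abstract class ShapeSketch {
  abstract double area();
}

final class SquareSketch extends ShapeSketch {
  private final double side;

  SquareSketch(double side) {
    this.side = side;
  }

  // If area() were ever removed from ShapeSketch, this annotation would turn
  // the now-orphaned method into a compile error instead of silent dead code.
  @Override
  double area() {
    return side * side;
  }
}

final class OverrideDemo {
  public static void main(String[] args) {
    ShapeSketch shape = new SquareSketch(2.0);
    System.out.println(shape.area()); // 4.0
  }
}
```

Without the annotation, deleting `area()` from the abstract class would still compile, and `SquareSketch.area()` would remain as an unused method that nobody is prompted to clean up.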
Dawid Weiss a38713907d LUCENE-9866: regenerate kuromoji dict in regenerate 2021-03-25 11:43:37 +01:00
Uwe Schindler 3214e365e3
LUCENE-9856: Static analysis take 3: Remove redundant interfaces (#38)
Co-authored-by: Robert Muir <rmuir@apache.org>
2021-03-24 18:26:12 +01:00
Dawid Weiss c23ea2f537
LUCENE-9865: Reduce unnecessary bla-bla-bla in top-level readme file (#39) 2021-03-24 17:17:53 +01:00
Dawid Weiss 285ca64ae3 LUCENE-9862: cleanup of all regenerate tasks. Leaving interim commits for reference. 2021-03-24 16:21:43 +01:00
Dawid Weiss 108cd85375 Avoid creating a circular dependency between shared subtasks. 2021-03-24 16:01:36 +01:00
Dawid Weiss 4c2de7ef43 Correct soft task ordering between tidy and any other dependency of regenerate. 2021-03-24 15:39:45 +01:00
Dawid Weiss bb5db1e16d Correct snowball download/unzip sequence to be always consistent. 2021-03-24 15:39:45 +01:00
Dawid Weiss 34f589b0aa Correct run order between tidy and regenerate's deps. Make snowball not fail on Windows (just emit an error). 2021-03-24 15:39:45 +01:00
Dawid Weiss 27510d5f2f LUCENE-9862: cleanup of all regenerate tasks; moved common code into shared bit. Added failOnError for ant.patch. Included jflexStandardTokenizerImpl. 2021-03-24 15:39:45 +01:00