35229 Commits

Author SHA1 Message Date
Adrien Grand e510ef11c2 LUCENE-9827: Propagate `numChunks` through bulk merges. 2021-04-08 16:45:52 +02:00
Uwe Schindler 779e00542c Make the character printout code uniform (always print at least 4 hex chars) 2021-04-08 16:38:31 +02:00
Dawid Weiss 4c2384a1f3 LUCENE-9872: load input/output checksums prior to executing the target task, even if regenerate is not called. 2021-04-08 15:00:20 +02:00
Robert Muir 2971f311a2
LUCENE-9911: enable ecjLint unusedExceptionParameter (#70)
Fails the linter if an exception is swallowed (e.g. variable completely
unused).

If this is intentional for some reason, the exception can simply be
annotated with @SuppressWarnings("unused").
2021-04-08 08:19:01 -04:00
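
The annotation mentioned in this commit message can be illustrated with a minimal sketch. The class and method names below are hypothetical and not taken from the Lucene code base; the first method shows the kind of catch block the lint would reject, the second shows the annotated form for an intentional swallow.

```java
final class SwallowedExceptionExample {

  // The lint described above would flag this catch block: 'e' is never used,
  // so the parse failure is silently swallowed.
  static int parseFlagged(String s, int fallback) {
    try {
      return Integer.parseInt(s);
    } catch (NumberFormatException e) {
      return fallback;
    }
  }

  // Swallowing is intentional here, so the exception parameter is annotated.
  static int parseOrDefault(String s, int fallback) {
    try {
      return Integer.parseInt(s);
    } catch (@SuppressWarnings("unused") NumberFormatException e) {
      return fallback;
    }
  }

  public static void main(String[] args) {
    System.out.println(parseOrDefault("42", 0));
    System.out.println(parseOrDefault("not a number", -1));
  }
}
```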
Peter Gromov 7f147fece0 LUCENE-9894: Hunspell: add user-friendly diagnostics for morph data API misuse (#51) 2021-04-07 14:52:36 +02:00
Peter Gromov 8eb582e671 LUCENE-9895: Hunspell: make suggest-with-timeout API public (#54) 2021-04-07 14:52:29 +02:00
Robert Muir df25653cbd
LUCENE-9882: better synchronize eclipse formatter with spotless. (#47)
Import the spotless formatting settings to our eclipse IDE setting, so
that it is a closer match.
2021-04-07 06:20:42 -04:00
Robert Muir 4026753744
LUCENE-9910: maximize javac lint (#68)
This enables quite a few javac warnings from java11+ that weren't
enabled for some reason. None of them fail, so lock them in.

Additionally some newer checks are only recognized for newer JDK
versions, so they are only enabled based on the javac version used. They
will cause no annoyance because they relate to newer language features.
2021-04-07 06:10:29 -04:00
Ignacio Vera 430b3baa80 LUCENE-9907: Remove packedInts dependency on StoredFieldsFormat (#64) 2021-04-07 11:33:49 +02:00
Dawid Weiss 39071dbc54 LUCENE-9904: Port GenerateJflexTLDMacros.java regeneration to gradle and regenerate UAX tokenizer with up-to-date TLDs 2021-04-07 10:56:21 +02:00
Gautam Worah efeea0b8ee LUCENE-9902 Minor fixes to the faceting API (#62) 2021-04-06 14:50:23 -04:00
Robert Muir be94a667f2
LUCENE-9827: avoid wasteful recompression for small segments (#28)
Require that the segment has enough dirty documents to create a clean
chunk before recompressing during merge: there must be at least maxChunkSize.

This prevents wasteful recompression with small flushes (e.g. every
document): we ensure recompression achieves some "permanent" progress.

Expose maxDocsPerChunk as a parameter for Term vectors too, matching the
stored fields format. This allows for easy testing.

Increment numDirtyDocs for partially optimized merges:
If segment N needs recompression, we have to flush any buffered docs
before bulk-copying segment N+1. Don't just increment numDirtyChunks,
also make sure numDirtyDocs is incremented, too.
This doesn't have a performance impact, and is unrelated to tooDirty()
improvements, but it is easier to reason about things with correct
statistics in the index.

Further tuning of how dirtiness is measured: for simplification just use percentage
of dirty chunks.

Co-authored-by: Adrien Grand <jpountz@gmail.com>
2021-04-06 14:18:48 -04:00
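
The two conditions described in this commit message can be sketched as a single predicate. This is a minimal illustration, not the code from the commit: the class name, the parameter names and the percentage threshold are assumptions, and comparing dirty documents against a per-chunk document limit is one interpretation of "at least maxChunkSize".

```java
final class DirtinessSketch {

  // Returns true only when recompressing would make lasting progress:
  // (1) there are enough dirty docs to fill at least one clean chunk, and
  // (2) the share of dirty chunks exceeds the given percentage (assumed threshold).
  static boolean tooDirty(
      long numDirtyDocs,
      long numDirtyChunks,
      long numChunks,
      int maxDocsPerChunk,
      int dirtyChunkPercent) {
    boolean enoughDocsForCleanChunk = numDirtyDocs >= maxDocsPerChunk;
    boolean dirtyShareExceeded = numDirtyChunks * 100 > numChunks * (long) dirtyChunkPercent;
    return enoughDocsForCleanChunk && dirtyShareExceeded;
  }

  public static void main(String[] args) {
    // A segment flushed after a single document: dirty, but too small to be worth recompressing.
    System.out.println(tooDirty(1, 1, 1, 128, 1));
    // Plenty of dirty docs and 5% of chunks dirty: worth recompressing.
    System.out.println(tooDirty(200, 5, 100, 128, 1));
  }
}
```

The first condition is what keeps tiny flushes from triggering recompression on every merge, since a segment with fewer dirty documents than a full chunk cannot produce a clean chunk anyway.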
Adrien Grand d991fefb49 Add an example to the CacheHelper docs. (#50) 2021-04-06 16:25:15 +02:00
Dawid Weiss 2662a74cab Correct some of the jdk17-offending javadocs. 2021-04-05 20:34:52 +02:00
Dawid Weiss 2773172455 Correct some of the jdk17-offending javadocs. 2021-04-05 20:21:52 +02:00
Dawid Weiss baceb16904 Correct some of the jdk17-offending javadocs. 2021-04-05 20:19:56 +02:00
Dawid Weiss fbf9191abf LUCENE-9901: UnicodeData.java has no regeneration task (#63) 2021-04-05 20:12:56 +02:00
Ignacio Vera 67a0bd4b6d LUCENE-9705: Final clean-up and entry in CHANGES.txt (#59) 2021-04-04 11:30:47 +02:00
Dawid Weiss 010e3a1ba9 LUCENE-9900: Regenerate/ run ICU only if inputs changed (#61) 2021-04-02 11:46:43 +02:00
Dawid Weiss e3ae57a3c1 LUCENE-9872: Make the most painful tasks in regenerate fully incremental (#60) 2021-04-02 09:56:47 +02:00
Tomoko Uchida 670bbf8b99 Ignore sdkmanrc file on Git (#58) 2021-04-02 01:04:14 +09:00
Ignacio Vera 8c9b9546cc LUCENE-9705: Create Lucene90PointsFormat (#52) 2021-04-01 07:04:04 +02:00
Pieter van Boxtel 1d579b9448 LUCENE-9898 Remove no longer used scorePayload method from BM25Similarity (#57) 2021-04-01 09:06:03 +09:00
zacharymorn 79fcd99f4c LUCENE-9883: Turn on ecj missingEnumCaseDespiteDefault setting (#56) 2021-03-31 15:50:52 +09:00
Dawid Weiss 32e891c60f LUCENE-9871: move dummy outputs aspect into a separate file. 2021-03-30 20:15:55 +02:00
Adrien Grand 10520185a9 LUCENE-9877: Move CHANGES entry under 8.9. 2021-03-30 15:13:00 +02:00
Greg Miller fd79f9737a
LUCENE-9877: Allow up to 7 exceptions in PForUtil (instead of 3) (#48)
Co-authored-by: Greg Miller <gmiller@amazon.com>
2021-03-30 15:11:33 +02:00
Dawid Weiss 39b8e97613 LUCENE-9896: Add 'quiet exec' utility suppressing exec output unless a failure occurs 2021-03-30 14:38:13 +02:00
Dawid Weiss c7455ff561 LUCENE-9871: cleaning up the build system. Upgrade palantir. Remove all ant-related hacks. 2021-03-30 12:41:06 +02:00
Dawid Weiss fd685682be This removes the last of ant-compatibility hacks - cross-project dependency on test classes. Replaced with gradle's test fixture artifact sharing. Cleaned up spatial3d classes a bit too. 2021-03-30 12:35:33 +02:00
Dawid Weiss f83c9462bb Remove legacy ant hacks - add conf to test sourceSet. Correct jvm options hack (don't apply to benchmarks run). 2021-03-30 11:33:27 +02:00
Dawid Weiss 89024a466b Remove exceptional test exclusions for forked non-tests and inner classes. 2021-03-30 11:13:41 +02:00
Dawid Weiss 78bfbe0bad We don't need to exclude inner classes explicitly. 2021-03-30 10:57:15 +02:00
Dawid Weiss 3115797463 LUCENE-9871: clean up some old cruft and shuffle files around. Correct inputs/outputs on check broken links so that it's incremental. 2021-03-30 10:55:19 +02:00
Dawid Weiss 974e4bc5e8 LUCENE-9880: correct task ordering for clean. 2021-03-30 10:08:44 +02:00
Ignacio Vera 00e57f8c8a
LUCENE-9705: Create Lucene90SegmentInfoFormat (#30)
The existing Lucene86SegmentInfoFormat is moved to backwards-codecs.
2021-03-30 10:04:17 +02:00
iverase c11a01ab61 Move LUCENE-9870 under Lucene 8.8.2 2021-03-30 10:00:39 +02:00
Michael McCandless 4d16ff21b2 LUCENE-9888: re-enable CheckIndex verification that indexSort is the same across all segments (#49) 2021-03-29 12:29:40 -04:00
liupanfeng cce982146a LUCENE-9887: fix error param use in RadixSelector 2021-03-29 12:16:06 +02:00
Jørgen Nystad 06114459ee
LUCENE-9870: Fix Circle2D intersectsLine t-value (distance) range clamp (#41)
Fixes missing matches when line magnitudeAB < 1
2021-03-29 10:41:54 +02:00
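
The role of the t-value clamp can be illustrated with the generic closest-point-on-segment test. This is a sketch of the standard planar technique, not the patch itself: the class and method names are hypothetical, and it assumes plain Cartesian coordinates. Here t is a fraction of the segment, so it is clamped to [0, 1] regardless of how short the segment is.

```java
final class SegmentCircleSketch {

  // True if the segment A-B comes within radius r of the circle center C.
  static boolean intersects(
      double ax, double ay, double bx, double by, double cx, double cy, double r) {
    double dx = bx - ax;
    double dy = by - ay;
    double lengthSquared = dx * dx + dy * dy;
    // Projection of C onto the line through A and B, as a fraction of the segment.
    double t = lengthSquared == 0 ? 0 : ((cx - ax) * dx + (cy - ay) * dy) / lengthSquared;
    // Clamp to the segment itself; the valid range is [0, 1] even when |AB| < 1.
    t = Math.max(0, Math.min(1, t));
    double px = ax + t * dx;
    double py = ay + t * dy;
    double ex = cx - px;
    double ey = cy - py;
    return ex * ex + ey * ey <= r * r;
  }

  public static void main(String[] args) {
    // A segment of length 0.5 passing close to the circle center.
    System.out.println(intersects(0, 0, 0.5, 0, 0.25, 0.1, 0.2));
    // The same segment with a circle far beyond its end point.
    System.out.println(intersects(0, 0, 0.5, 0, 2, 0, 0.5));
  }
}
```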
Mike McCandless d5d6dc0793 LUCENE-9385: add CHANGES.txt entry 2021-03-27 12:40:06 -04:00
zacharymorn 3648a1020a LUCENE-9385: Add FacetsConfig option to control which drill-down terms are indexed for a FacetLabel (#25) 2021-03-27 12:38:00 -04:00
Robert Muir 3596e05e5c
LUCENE-9878: enable redundantNullCheck in ecjLint (#44)
Detects common cases of unreachable/dead code.

For generated javacc code, the check is disabled via
SuppressWarnings("unused") because javacc generates strange/bad code such as:

  if ("" == null)

For TestStressNRTReplication's startNode() method, the check is also
disabled because analysis folds the "test evilness controls" which are
static final constants. This itself is a WTF, shouldn't we instead
randomize these evil things in our tests rather than hardcoding them to
specific values?
2021-03-27 11:43:47 -04:00
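
As a small illustration of the pattern this commit message describes, the sketch below reproduces the quoted `if ("" == null)` check inside a method annotated the way the message says generated code is handled. The class and method names are hypothetical; the branch can never be taken because a string literal is never null.

```java
final class DeadCodeExample {

  // The null check is redundant and the branch is dead code. The annotation
  // mirrors the suppression the commit message describes for generated code.
  @SuppressWarnings("unused")
  static String describe() {
    if ("" == null) {
      return "unreachable";
    }
    return "reachable";
  }

  public static void main(String[] args) {
    System.out.println(describe()); // prints "reachable"
  }
}
```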
Uwe Schindler 3538709269 Improvement for LUCENE-9881 (#46): Completely disable the Eclipse plugin's eclipseJdt task and replace it with our own, which just copies the filtered config files. This now works correctly with inputs/outputs. 2021-03-27 12:08:12 +01:00
Robert Muir 690e256ec9
LUCENE-9881: synchronize ECJ linter with Eclipse IDE (#46)
Co-authored-by: Uwe Schindler <uschindler@apache.org>
2021-03-27 00:42:29 +01:00
Dawid Weiss f02799c511 Skip errorprone on non-nightlies. (#45) 2021-03-26 21:42:15 +01:00
Mayya Sharipova 48715fe898
LUCENE-9507 Custom order for leaves in IndexReader and IndexWriter (#32)
1. Add an option to supply a custom leaf sorter for IndexWriter.
A DirectoryReader opened from this IndexWriter will have its leaf
readers sorted with the provided leaf sorter. This is useful for
indices on which it is expected to run many queries with particular
sort criteria (e.g. for time-based indices this is usually a
descending sort on timestamp). Providing a leafSorter speeds up
early termination for this particular type of sorted query.

2. Add an option to supply a custom sub-readers sorter for
BaseCompositeReader. In this case sub-readers will be sorted
according to the provided leafSorter.

3. Add an option to supply a custom leaf sorter for
StandardDirectoryReader. The leaf readers of this
StandardDirectoryReader will be sorted according to
the provided leaf sorter.
2021-03-26 09:56:02 -04:00
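
The leaf-sorting idea from this commit message can be sketched with plain Java collections. The `Leaf` class, its `maxTimestamp` field and the `visitOrder` helper are hypothetical stand-ins, not Lucene types; the sketch only shows how a comparator determines the order in which segments would be visited for a descending sort on timestamp.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class LeafSorterSketch {

  // Hypothetical stand-in for a leaf reader (one index segment).
  static final class Leaf {
    final String name;
    final long maxTimestamp;

    Leaf(String name, long maxTimestamp) {
      this.name = name;
      this.maxTimestamp = maxTimestamp;
    }
  }

  // Newest segments first: the order a descending sort on timestamp benefits from.
  static final Comparator<Leaf> NEWEST_FIRST =
      Comparator.comparingLong((Leaf leaf) -> leaf.maxTimestamp).reversed();

  // Returns segment names in the order the given leaf sorter would visit them.
  static List<String> visitOrder(List<Leaf> leaves, Comparator<Leaf> leafSorter) {
    List<Leaf> sorted = new ArrayList<>(leaves);
    sorted.sort(leafSorter);
    List<String> names = new ArrayList<>();
    for (Leaf leaf : sorted) {
      names.add(leaf.name);
    }
    return names;
  }

  public static void main(String[] args) {
    List<Leaf> leaves =
        List.of(new Leaf("_0", 100L), new Leaf("_1", 300L), new Leaf("_2", 200L));
    System.out.println(visitOrder(leaves, NEWEST_FIRST)); // prints [_1, _2, _0]
  }
}
```

Visiting the newest segments first means the top hits for a descending timestamp sort tend to be collected early, which is what makes early termination effective.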
Tomoko Uchida b174ef45c4 Add CHANGES entry for gradle build. (#43) 2021-03-26 09:50:38 +09:00
Tomoko Uchida 8c61c6b561 Point jdk.java.net instead of OracleJDK page. (#42) 2021-03-26 08:37:52 +09:00
Tomoko Uchida ea74ffb984 LUCENE-9853: Use CJKWidthCharFilter as the default character width normalizer in JapaneseAnalyzer (#26) 2021-03-26 08:32:42 +09:00