lucene

Commit Graph

Author	SHA1	Message	Date
Mayya Sharipova	48715fe898	LUCENE-9507 Custom order for leaves in IndexReader and IndexWriter (#32 ) 1. Add an option to supply a custom leaf sorter for IndexWriter. A DirectoryReader opened from this IndexWriter will have its leaf readers sorted with the provided leaf sorter. This is useful for indices that are expected to serve many queries with a particular sort criterion (e.g. for time-based indices this is usually a descending sort on timestamp). Providing leafSorter makes it possible to speed up early termination for this type of sorted query. 2. Add an option to supply a custom sub-readers sorter for BaseCompositeReader. In this case sub-readers will be sorted according to the provided leafSorter. 3. Add an option to supply a custom leaf sorter for StandardDirectoryReader. The leaf readers of this StandardDirectoryReader will be sorted according to the provided leaf sorter.	2021-03-26 09:56:02 -04:00
Tomoko Uchida	b174ef45c4	Add CHANGES entry for gradle build. (#43 )	2021-03-26 09:50:38 +09:00
Tomoko Uchida	8c61c6b561	Point jdk.java.net instead of OracleJDK page. (#42 )	2021-03-26 08:37:52 +09:00
Tomoko Uchida	ea74ffb984	LUCENE-9853: Use CJKWidthCharFilter as the default character width normalizer in JapaneseAnalyzer (#26 )	2021-03-26 08:32:42 +09:00
zacharymorn	3ed87c867a	LUCENE-9864: Enforce @Override annotation everywhere (#40 ) Requiring the annotation is helpful because if an abstract method is removed, the concrete methods will then show up as compile errors: preventing dead code from being accidentally left behind. Co-authored-by: Robert Muir <rmuir@apache.org>	2021-03-25 17:50:38 -04:00
Dawid Weiss	a38713907d	LUCENE-9866: regenerate kuromoji dict in regenerate	2021-03-25 11:43:37 +01:00
Uwe Schindler	3214e365e3	LUCENE-9856: Static analysis take 3: Remove redundant interfaces (#38 ) Co-authored-by: Robert Muir <rmuir@apache.org>	2021-03-24 18:26:12 +01:00
Dawid Weiss	c23ea2f537	LUCENE-9865: Reduce unnecessary bla-bla-bla in top-level readme file (#39 )	2021-03-24 17:17:53 +01:00
Dawid Weiss	285ca64ae3	LUCENE-9862: cleanup of all regenerate tasks. Leaving interim commits for reference.	2021-03-24 16:21:43 +01:00
Dawid Weiss	108cd85375	Avoid creating a circular dependency between shared subtasks.	2021-03-24 16:01:36 +01:00
Dawid Weiss	4c2de7ef43	Correct soft task ordering between tidy and any other dependency of regenerate.	2021-03-24 15:39:45 +01:00
Dawid Weiss	bb5db1e16d	Correct snowball download/unzip sequence to be always consistent.	2021-03-24 15:39:45 +01:00
Dawid Weiss	34f589b0aa	Correct run order between tidy and regenerate's deps. Make snowball not fail on Windows (just emit an error).	2021-03-24 15:39:45 +01:00
Dawid Weiss	27510d5f2f	LUCENE-9862: cleanup of all regenerate tasks; moved common code into shared bit. Added failOnError for ant.patch. Included jflexStandardTokenizerImpl.	2021-03-24 15:39:45 +01:00
Robert Muir	945b1cb872	LUCENE-9856: fail precommit on unused local variables, take two (#37 ) Enable ecj unused local variable, private instance and method detection. Allow SuppressWarnings("unused") to disable unused checks (e.g. for generated code or very special tests). Fix gradlew regenerate for python 3.9 SuppressWarnings("unused") for generated javacc and jflex code. Enable a few other easy ecj checks such as Deprecated annotation, hashcode/equals, equals across different types. Co-authored-by: Mike McCandless <mikemccand@apache.org>	2021-03-23 13:59:00 -04:00
Michael McCandless	53fd63dbb2	replace 'static enum' with 'enum' (#36 )	2021-03-23 13:23:39 -04:00
Robert Muir	e6c4956cf6	Revert "LUCENE-9856: fail precommit on unused local variables (#34 )" This reverts commit `20dba278bb`.	2021-03-23 12:46:36 -04:00
Robert Muir	20dba278bb	LUCENE-9856: fail precommit on unused local variables (#34 ) Enable ecj unused local variable, private instance and method detection. Allow SuppressWarnings("unused") to disable unused checks (e.g. for generated code or very special tests). Fix gradlew regenerate for python 3.9 SuppressWarnings("unused") for generated javacc and jflex code. Enable a few other easy ecj checks such as Deprecated annotation, hashcode/equals, equals across different types. Co-authored-by: Mike McCandless <mikemccand@apache.org>	2021-03-23 11:09:24 -04:00
Dawid Weiss	078d0079d1	LUCENE-9861: pull tuned vm options into a separate aspect. (#33 )	2021-03-23 10:39:09 +01:00
András Salamon	2678d68be8	SOLR-14024 Invalid html generated by changes2html.pl (#31 )	2021-03-22 17:35:32 -04:00
Dawid Weiss	246c4beb22	LUCENE-9854: Clean up utilities to download and extract test/ benchmark data sets. (#27 )	2021-03-22 12:22:39 +01:00
Dawid Weiss	a5996dbecd	Follow-up to help/validateLogCalls.txt removal.	2021-03-19 15:14:42 +01:00
Dawid Weiss	c0852d1e9c	Follow-up to help/ant.txt removal.	2021-03-19 15:13:55 +01:00
Dawid Weiss	1679076bde	Nuke more unused/ obsolete refs.	2021-03-19 13:11:37 +01:00
Dawid Weiss	f1299bca9f	Nuke the obsolete ant.txt help.	2021-03-19 13:09:42 +01:00
Dawid Weiss	ee59e4e1ac	Add a link for Eclipse's users.	2021-03-19 13:08:15 +01:00
Dawid Weiss	bf807c2a32	Correct header structure for jdk13+	2021-03-19 08:30:15 +01:00
Christoph Büscher	7ed72972b8	LUCENE-9007: MockSynonymFilter should add TypeAttribute (#23 ) The MockSynonymFilter should add the type TypeAttribute to the synonyms it generates in order to make it a better stand-in for the real filter in tests.	2021-03-18 22:00:09 -04:00
Peter Gromov	28edbf8fc6	LUCENE-9852: Make Hunspell thread-safe (#24 )	2021-03-18 21:57:03 -04:00
Michael Sokolov	5b36af3cd7	LUCENE-9844: document disk layout of Lucene90VectorFormat	2021-03-18 09:39:23 -04:00
Dawid Weiss	53bea54669	LUCENE-9375: cleaning up post-split conditional build logic and solr refs. (#22 )	2021-03-18 11:04:45 +01:00
Dawid Weiss	ca3de30aff	Don't cross-link between modules for interim snapshot builds. (#21 )	2021-03-18 10:18:07 +01:00
András Salamon	0e245171f1	SOLR-15002 Upgrade httpclient to 4.5.13 and httpcore to 4.4.13 (#14 )	2021-03-17 22:25:42 -04:00
Bruno Roustant	d6a554138d	LUCENE-9663: Move to 8.9.0 section in CHANGES.txt.	2021-03-17 15:39:24 +01:00
Uwe Schindler	8e1d5e3dbf	LUCENE-9836: cherry pick changes entry from 8.x	2021-03-16 17:12:34 +01:00
Mike McCandless	8b68bc744f	remove accidental extra K	2021-03-15 17:34:25 -04:00
Dawid Weiss	f8040c0ecf	LUCENE-9650: errorprone plugin doesn't work on jdk16. A different workaround that keeps the dependency.	2021-03-15 10:19:27 +01:00
Peter Gromov	cdff0accaa	Hunspell suggestions: speed up for some non-Latin scripts (#19 )	2021-03-15 05:02:45 -04:00
Peter Gromov	8913a98379	LUCENE-9830: Hunspell: store word length for faster dictionary lookup/enumeration (#3 )	2021-03-15 00:35:25 -04:00
Peter Gromov	42c6f780bf	LUCENE-9831: Hunspell GeneratingSuggester: faster flag & case checks, less allocations (#4 )	2021-03-15 00:32:08 -04:00
Robert Muir	d48193e8cf	LUCENE-9837: try to improve performance of VectorUtil.dotProduct (#17 ) More loop unrolling for VectorUtil.dotProduct to eke out a bit more short-term performance.	2021-03-14 23:16:08 -04:00
Robert Muir	f3a284ad83	LUCENE-9796: Fix SortedDocValues to no longer extend BinaryDocValues SortedDocValues do not have a per-document binary value, they have a per-document numeric `ordValue()`. The ordinal can then be dereferenced to its binary form with `lookupOrd()`, but it was a performance trap to implement a `binaryValue()` on the SortedDocValues api that does this behind-the-scenes on every document. You can replace calls of `binaryValue()` with `lookupOrd(ordValue())` as a "quick fix", but it is better to use the ordinal alone (integer-based datastructures) for per-document access, and only call lookupOrd() a few times at the end (e.g. for the hits you want to display). Otherwise, if you really don't want per-document ordinals, but instead a per-document `byte[]`, use a BinaryDocValues field. This change only addresses the API (slow `binaryValue()` trap), but doesn't yet fix any slow algorithms that were discovered in the process, so it doesn't yield any performance improvements.	2021-03-14 23:07:48 -04:00
Tomoko Uchida	471f38c031	LUCENE-9834: goodbye old friend - the classic luke logo	2021-03-12 23:17:32 +09:00
Dawid Weiss	4f5389bfa8	Flush output on javadoc emitting a failure.	2021-03-12 11:39:40 +01:00
Tomoko Uchida	7478b3fc17	LUCENE-9834: Adjust logo/colors in the Luke About dialog	2021-03-12 11:00:10 +09:00
Dawid Weiss	8bbcc39583	Always include errorprone dependency, even if we're not checking. This ensures consistent use patterns across JVMs.	2021-03-11 22:27:25 +01:00
Peter Gromov	e784721e69	LUCENE-9833: Hunspell: AssertionError in WordStorage.lookupWord (#13 )	2021-03-11 10:10:57 -05:00
Peter Gromov	efa88a1790	LUCENE-9832: Hunspell: SIOOBE in GeneratingSuggester.expandRoot (#12 )	2021-03-11 10:09:11 -05:00
Robert Muir	1b36406ec4	LUCENE-9827: Speed up merging of stored fields and term vectors for small segments Stored Fields and Term Vectors are block-compressed. Decompressing and recompressing all the documents on every merge is too slow, so we try to avoid doing it unless it will actually improve the compression ratio. If we can get away with it, we just bulk-copy existing compressed blocks to the new segment. Previously, small segments would always be considered dirty and recompressed... the special optimized bulk merge wouldn't kick in until segments were relatively large. But as block size and ratio (shared dictionaries etc) have increased, "relatively large" has become a much bigger number. So try to avoid doing wasted work: if there's only 1 dirty chunk (incompletely filled compression block), then don't recompress: it will likely only give us 1 dirty chunk as a result, at the expense of cpu. Require at least 2 dirty chunks to recompress: this way the recompression actually buys us something (reduces 2 to 1). The change also means that bulk merge will now happen often in the unit test suite, increasing coverage.	2021-03-11 09:30:40 -05:00
Mike McCandless	12999d30f2	LUCENE-9791: add CHANGES.txt entry	2021-03-11 08:09:23 -05:00
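
The message of commit 1b36406ec4 (LUCENE-9827) describes a merge decision rule: a segment with at most one dirty chunk is bulk-copied, and recompression is attempted only when there are at least two dirty chunks. The following is a minimal sketch of that rule as stated in the message, not Lucene's actual implementation. The class `DirtyChunkSketch`, the enum `Strategy`, and the method `chooseStrategy` are hypothetical names introduced for illustration, and the real code may apply further conditions that the message does not mention.

```java
/**
 * Hypothetical sketch of the rule described in the LUCENE-9827 commit message.
 * None of these names come from Lucene itself.
 */
final class DirtyChunkSketch {

    /** The two ways a segment's compressed blocks could be carried into a merge. */
    enum Strategy { BULK_COPY, RECOMPRESS }

    /**
     * Picks a strategy from the number of dirty (incompletely filled) chunks.
     * With zero or one dirty chunk, recompressing would likely still leave one
     * dirty chunk, so the CPU cost buys nothing; with two or more, recompressing
     * can reduce the count.
     */
    static Strategy chooseStrategy(int dirtyChunks) {
        if (dirtyChunks < 0) {
            throw new IllegalArgumentException("dirtyChunks must be >= 0");
        }
        return dirtyChunks >= 2 ? Strategy.RECOMPRESS : Strategy.BULK_COPY;
    }

    public static void main(String[] args) {
        // Print the decision for a few illustrative dirty-chunk counts.
        for (int dirty = 0; dirty <= 3; dirty++) {
            System.out.println(dirty + " dirty chunk(s) -> " + chooseStrategy(dirty));
        }
    }
}
```

The threshold is hard-coded to 2 here because that is the value the commit message gives. A real policy would probably also weigh how many documents sit in dirty chunks, but that is an assumption and not something the message states.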

35033 Commits
