209 Commits

Author SHA1 Message Date
Bruno Roustant f394c9418e
Remove the HPPC dependency from all modules and move the HPPC fork to internal. (#13422)
* Remove hppc dependency
* Change fork version to 0.10.0
* Add @lucene.internal
* Move hppc classes to oal.internal.hppc but export it.
* Delete hppc license since it's no longer a dependency.

---------

Co-authored-by: Dawid Weiss <dawid.weiss@carrotsearch.com>
2024-05-27 12:09:25 +02:00
Bruno Roustant 7db9c8c9bd
Replace Map<Integer, Object> by primitive IntObjectHashMap. (#13368) 2024-05-18 14:40:40 +02:00
Dmitry Cherniachenko cbec94c153
Use String.isEmpty() instead of equals("") (#13050) 2024-01-30 00:14:10 +01:00
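The replacement named in this commit can be sketched as follows. This is a minimal illustration, not code from the commit; the class and method names are hypothetical. It assumes the receiver is non-null: the reversed form `"".equals(s)` tolerates `null`, while `s.isEmpty()` does not.

```java
final class EmptyCheckSketch {
  // Old style: compare against the empty string literal.
  static boolean emptyViaEquals(String s) {
    return s.equals("");
  }

  // New style: same result for any non-null String, and states the intent directly.
  static boolean emptyViaIsEmpty(String s) {
    return s.isEmpty();
  }

  // The reversed literal form is null-safe; isEmpty() would throw a NullPointerException on null.
  static boolean emptyNullSafe(String s) {
    return "".equals(s);
  }

  public static void main(String[] args) {
    System.out.println(emptyViaEquals("") == emptyViaIsEmpty(""));
    System.out.println(emptyViaEquals("a") == emptyViaIsEmpty("a"));
    System.out.println(emptyNullSafe(null));
  }
}
```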
Benjamin Trent fb5037f841
Do not use mock merge policy for TestGrouping (#13034) 2024-01-26 07:16:27 -05:00
sabi0 91272f45da
Replace println(String.format(...)) with printf(...) (#12976) 2023-12-28 19:32:06 +01:00
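A small sketch of the rewrite this commit title describes; the class, method names, and format string are made up for illustration. The only detail to watch is the trailing `%n`, which stands in for the line separator that `println` appended.

```java
final class PrintfSketch {
  // Old style: format into a temporary String, then print it with a newline.
  static void oldStyle(java.io.PrintStream out, String name, int count) {
    out.println(String.format("%s: %d docs", name, count));
  }

  // New style: let the stream format directly; %n supplies the line separator.
  static void newStyle(java.io.PrintStream out, String name, int count) {
    out.printf("%s: %d docs%n", name, count);
  }

  public static void main(String[] args) {
    oldStyle(System.out, "segment_1", 42);
    newStyle(System.out, "segment_1", 42);
  }
}
```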
Lu Xugang a71d64a598
Skip docs with Docvalues in NumericLeafComparator (#12405)
* Skip document by docValues

* When the queue is full with only one Comparator, we can tune maxValueAsBytes/minValueAsBytes more tightly. For instance, if the sort is ascending and the bottom value is 5, we will use the range [MIN_VALUE, 4].
---------

Co-authored-by: Adrien Grand <jpountz@gmail.com>
2023-11-09 13:05:28 +08:00
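The range arithmetic in the example above (ascending sort, bottom value 5, range [MIN_VALUE, 4]) can be sketched as below. This is an illustrative helper under stated assumptions, not the NumericLeafComparator code: the class and method names are hypothetical, values are modelled as `long`, and ties with the bottom value are treated as non-competitive because there is only one comparator.

```java
final class CompetitiveRangeSketch {
  /**
   * Returns the inclusive {min, max} range of values that could still enter a full queue,
   * or null when no value can beat the current bottom.
   */
  static long[] competitiveRange(long bottom, boolean descending) {
    if (!descending) {
      // Ascending sort: only values strictly smaller than the bottom are competitive.
      if (bottom == Long.MIN_VALUE) {
        return null;
      }
      return new long[] {Long.MIN_VALUE, bottom - 1};
    }
    // Descending sort: only values strictly larger than the bottom are competitive.
    if (bottom == Long.MAX_VALUE) {
      return null;
    }
    return new long[] {bottom + 1, Long.MAX_VALUE};
  }

  public static void main(String[] args) {
    long[] range = competitiveRange(5, false);
    System.out.println(range[0] == Long.MIN_VALUE && range[1] == 4);
  }
}
```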
Kevin Risden de3b294be4
GITHUB#12655: gradle tidy after google java format update for jdk 21 and regen
* tidy whitespace changes from googleJavaFormat upgrade
* generateForUtil fixed and regened https://bugs.python.org/issue39350
* generateAntlr
* generateClassicTokenizer
* generateWikipediaTokenizer
2023-10-11 16:12:09 -04:00
Adrien Grand f527eb3b12
Remove Scorable#docID. (#12407)
`Scorable#docID()` exposes the document that is being collected, which makes it
impossible to bulk-collect multiple documents at once.

Relates #12358
2023-07-05 10:40:06 +02:00
Adrien Grand 8811f31b9c
Add a post-collection hook to LeafCollector. (#12380)
This adds `LeafCollector#finish` as a per-segment post-collection hook. While
it was already possible to do this sort of thing on top of the collector API
before, a downside is that the last leaf would need to be post-collected in the
current thread instead of using the executor, which is a missed opportunity for
making queries concurrent.
2023-06-30 15:19:35 +02:00
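The idea of a per-segment post-collection hook can be illustrated with a toy model. Everything here is hypothetical: `ToyLeafCollector` is a stand-in, not the Lucene `LeafCollector` interface, and segments are modelled as plain `int[]` arrays of matching doc IDs. The point is only that each segment's post-processing happens inside that segment's own unit of work, so the unit could be handed to an executor as a whole.

```java
final class PostCollectionHookSketch {
  /** Hypothetical stand-in for a per-segment collector. */
  interface ToyLeafCollector {
    void collect(int doc);

    /** Called once, after the last collect() call for this segment. */
    void finish();
  }

  /** Counts matching docs across segments, folding each segment's count in finish(). */
  static int collectAll(int[][] segments) {
    final int[] total = new int[1];
    for (int[] segmentDocs : segments) {
      // One collector per segment; this loop body is the unit that could run as a task.
      final int[] pending = new int[1];
      ToyLeafCollector leaf =
          new ToyLeafCollector() {
            @Override
            public void collect(int doc) {
              pending[0]++;
            }

            @Override
            public void finish() {
              total[0] += pending[0];
            }
          };
      for (int doc : segmentDocs) {
        leaf.collect(doc);
      }
      leaf.finish();
    }
    return total[0];
  }

  public static void main(String[] args) {
    System.out.println(collectAll(new int[][] {{0, 1, 2}, {0}, {}}));
  }
}
```

A concurrent variant would need a thread-safe accumulator in place of the shared `int[]`; that is left out to keep the sketch short.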
Robert Muir 47f8c1baa2
Migrate away from per-segment-per-threadlocals on SegmentReader (#11998)
Add new stored fields and termvectors interfaces: IndexReader.storedFields()
and IndexReader.termVectors(). Deprecate IndexReader.document() and IndexReader.getTermVector().
The new APIs do not rely upon ThreadLocal storage for each index segment, which can greatly
reduce RAM requirements when there are many threads and/or segments.

Co-authored-by: Adrien Grand <jpountz@gmail.com>
2022-12-13 09:10:21 -05:00
Simon Cooper 135f3fab41
Ensure collections are properly sized on creation (#11942)
A few other optimisations along the way
2022-11-24 11:20:04 +01:00
Dawid Weiss 54fba99cb1
Upgrade google java format and apply tidy (#11811) 2022-09-24 15:40:27 +02:00
Greg Miller 5f2a4998a0
LUCENE-10603: Migrate remaining SSDV iteration to use docValueCount in production code (#995) 2022-06-30 14:01:14 -07:00
zacharymorn 94fe7e314f
LUCENE-10436: Remove deprecated DocValuesFieldExistsQuery, NormsFieldExistsQuery and KnnVectorFieldExistsQuery (#790) 2022-04-07 00:53:29 -07:00
Alan Woodward 2e2c4818d1
LUCENE-10377: Replace 'sortPos' with 'enableSkipping' in SortField.getComparator() (#603)
The sort position parameter in SortField.getComparator() is only ever used
to determine whether or not skipping should be enabled on a given comparator,
so the parameter name should reflect that.  This commit also explicitly disables
skipping in a number of cases where it is never used, in particular CheckIndex
and the grouping collectors.
2022-01-17 10:44:57 +00:00
Dawid Weiss ff547e7bbd
LUCENE-10328: Module path for compiling and running tests is wrong (#571) 2022-01-05 20:42:02 +01:00
Dawid Weiss a94fbb79ac LUCENE-10301: make the test-framework a proper module by moving all test
classes to org.apache.lucene.tests.*. Also changes distribution layout
(all modules are now under modules/).
2021-12-21 20:30:45 +01:00
Dawid Weiss d2c98912eb This reverts commit a7b50f723d. 2021-12-19 08:51:13 +01:00
Dawid Weiss a7b50f723d Reverting back to b48cac02. 2021-12-18 23:36:30 +01:00
Dawid Weiss 5b3b75efd8 LUCENE-10308: Make ecj and javadoc run with modular paths 2021-12-16 17:51:01 +01:00
Dawid Weiss 6d83c2e08e LUCENE-10255: add gradle compilation and module descriptor support for the java module system. Adds module descriptors to all Lucene subprojects. 2021-12-10 17:16:19 +01:00
Mayya Sharipova d03662c48b
LUCENE-9334 Consistency of field data structures
Require consistency between data-structures on a per-field basis

A field must be indexed with the same index options and data-structures across
all documents. Thus, for example, it is not allowed to have one document
where a certain field is indexed with doc values and points, and another document 
where the same field is indexed only with points. 
But it is allowed for a document not to have a certain field at all.

As a consequence of this, doc values updates are
only applicable for fields that are indexed with doc values only.
2021-04-14 15:00:41 -04:00
zacharymorn 79fcd99f4c
LUCENE-9883: Turn on ecj missingEnumCaseDespiteDefault setting (#56) 2021-03-31 15:50:52 +09:00
Robert Muir 3596e05e5c
LUCENE-9878: enable redundantNullCheck in ecjLint (#44)
Detects common cases of unreachable/dead code.

For generated javacc code, the check is disabled via
SuppressWarnings("unused") because javacc generates strange/bad code such as:

  if ("" == null)

For TestStressNRTReplication's startNode() method, the check is also
disabled because analysis folds the "test evilness controls" which are
static final constants. This itself is a WTF, shouldn't we instead
randomize these evil things in our tests rather than hardcoding them to
specific values?
2021-03-27 11:43:47 -04:00
Robert Muir 945b1cb872
LUCENE-9856: fail precommit on unused local variables, take two (#37)
* Enable ecj unused local variable, private instance and method detection.
* Allow SuppressWarnings("unused") to disable unused checks (e.g. for generated code or very special tests).
* Fix gradlew regenerate for python 3.9.
* SuppressWarnings("unused") for generated javacc and jflex code.
* Enable a few other easy ecj checks such as Deprecated annotation, hashcode/equals, equals across different types.

Co-authored-by: Mike McCandless <mikemccand@apache.org>
2021-03-23 13:59:00 -04:00
Robert Muir e6c4956cf6
Revert "LUCENE-9856: fail precommit on unused local variables (#34)"
This reverts commit 20dba278bb.
2021-03-23 12:46:36 -04:00
Robert Muir 20dba278bb
LUCENE-9856: fail precommit on unused local variables (#34)
* Enable ecj unused local variable, private instance and method detection.
* Allow SuppressWarnings("unused") to disable unused checks (e.g. for generated code or very special tests).
* Fix gradlew regenerate for python 3.9.
* SuppressWarnings("unused") for generated javacc and jflex code.
* Enable a few other easy ecj checks such as Deprecated annotation, hashcode/equals, equals across different types.

Co-authored-by: Mike McCandless <mikemccand@apache.org>
2021-03-23 11:09:24 -04:00
Robert Muir f3a284ad83
LUCENE-9796: Fix SortedDocValues to no longer extend BinaryDocValues
SortedDocValues do not have a per-document binary value, they have a
per-document numeric `ordValue()`. The ordinal can then be dereferenced
to its binary form with `lookupOrd()`, but it was a performance trap to
implement a `binaryValue()` on the SortedDocValues api that does this
behind-the-scenes on every document.

You can replace calls of `binaryValue()` with `lookupOrd(ordValue())`
as a "quick fix", but it is better to use the ordinal alone
(integer-based datastructures) for per-document access, and only call
lookupOrd() a few times at the end (e.g. for the hits you want to display).
Otherwise, if you really don't want per-document ordinals, but instead a
per-document `byte[]`, use a BinaryDocValues field.

This change only addresses the API (slow `binaryValue()` trap), but
doesn't yet fix any slow algorithms that were discovered in the process,
so it doesn't yield any performance improvements.
2021-03-14 23:07:48 -04:00
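The advice in this message (work with ordinals per document, dereference to bytes only at the end) can be sketched with plain arrays. This is a toy model, not the Lucene API: the sorted `String[]` dictionary and the `lookupOrd` helper below are hypothetical stand-ins for a segment's term dictionary and for `lookupOrd()`, and `docOrds` stands in for the per-document `ordValue()`.

```java
final class OrdinalCountingSketch {
  // Stand-in for lookupOrd(): maps an ordinal to its term in a sorted dictionary.
  static String lookupOrd(String[] dictionary, int ord) {
    return dictionary[ord];
  }

  /** Finds the most frequent term using only integer ordinals in the per-document loop. */
  static String mostFrequentTerm(String[] dictionary, int[] docOrds) {
    int[] counts = new int[dictionary.length];
    for (int ord : docOrds) {
      counts[ord]++; // per-document work touches integers only
    }
    int best = 0;
    for (int ord = 1; ord < counts.length; ord++) {
      if (counts[ord] > counts[best]) {
        best = ord;
      }
    }
    // A single dereference at the end, for the value that is actually reported.
    return lookupOrd(dictionary, best);
  }

  public static void main(String[] args) {
    String[] dictionary = {"apple", "banana", "cherry"};
    int[] docOrds = {1, 0, 1, 2, 1};
    System.out.println(mostFrequentTerm(dictionary, docOrds));
  }
}
```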
Dawid Weiss 2cbf261032 LUCENE-9570: code reformatting [final]. 2021-01-05 13:44:05 +01:00
Marcus b9a93cf695
LUCENE-8626: Standardize Lucene test file naming Part 2 (#2053) 2020-11-17 08:13:13 -05:00
Robert Muir 784ede4eda
LUCENE-9215: replace checkJavaDocs.py with doclet (#1802)
This has the same logic as the previous python, but no longer relies
upon parsing HTML output, instead using java's doclet processor.

The errors are reported like "normal" javadoc errors with source file
name and line number and happen when running "gradlew javadoc"

Although the "rules" are the same as the previous python, the python had
some bugs where the checker didn't quite do exactly what we wanted, so
some fixes were applied throughout.

Co-authored-by: Dawid Weiss <dawid.weiss@carrotsearch.com>
Co-authored-by: Uwe Schindler <uschindler@apache.org>
2020-09-02 08:29:17 -04:00
Erick Erickson 69fa5a00fb LUCENE-9433: Remove Ant support from trunk 2020-08-28 09:31:16 -04:00
Erick Erickson c9c75810c2 Revert "LUCENE-9433: Remove Ant support from trunk"
This reverts commit 37cd17dc
2020-08-21 16:57:58 -04:00
Erick Erickson 37cd17dcf5 LUCENE-9433: Remove Ant support from trunk 2020-08-21 15:19:52 -04:00
Michael Sokolov 26075fc1dc
LUCENE-9394: fix and suppress warnings (#1563)
* LUCENE-9394: fix and suppress warnings in lucene/*
* Change type of ValuesSource context from raw Map to Map<Object, Object>
2020-06-12 07:25:31 -04:00
Uwe Schindler 06df50e759
LUCENE-9321: Port markdown task to Gradle (#1477) 2020-05-17 14:46:26 +02:00
Alan Woodward 7c350d22c7
LUCENE-7889: Allow grouping on Double/LongValuesSource (#1484)
The grouping module currently allows grouping on a SortedDocValues field, or on
a ValueSource. The latter groups only on exact values, and so will not perform well
on numeric-valued fields. This commit adds the ability to group by defined ranges
from a Long or DoubleValuesSource.
2020-05-11 17:34:01 +01:00
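Why exact-value grouping does poorly on numeric fields, and what grouping by defined ranges changes, can be shown with a small bucketing helper. This is an illustrative sketch only: the class, the method, and the fixed-width bucket scheme are assumptions for the example, not the grouping module's API.

```java
final class RangeGroupingSketch {
  /**
   * Returns the inclusive lower bound of the fixed-width bucket containing value,
   * with buckets aligned to min. floorDiv keeps values below min in the right bucket.
   */
  static long bucketStart(long value, long min, long width) {
    return min + Math.floorDiv(value - min, width) * width;
  }

  public static void main(String[] args) {
    // Exact-value grouping would put 31, 37 and 39 into three separate groups;
    // with a width of 10 they all share the bucket that starts at 30.
    System.out.println(bucketStart(31, 0, 10));
    System.out.println(bucketStart(37, 0, 10));
    System.out.println(bucketStart(39, 0, 10));
  }
}
```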
Alan Woodward 0c58687a97
LUCENE-9348: Add a base grouping test for use with different GroupSelector implementations (#1461)
The grouping module tests currently all try and test both grouping by term and
grouping by ValueSource. They are quite difficult to follow, however, and it is not
at all easy to add tests for a new grouping type. This commit adds a new
BaseGroupSelectorTestCase class which can be extended to test particular
GroupSelector implementations, and adds tests for TermGroupSelector and
ValueSourceGroupSelector.  It also adds a separate test for Block grouping,
so that the distinct grouping types are tested separately.
2020-05-04 12:55:00 +01:00
Christine Poerschke 0c1b19a321 LUCENE-8530: fix some 'rawtypes' javac warnings 2020-01-31 16:40:55 +00:00
Robert Muir 975df9ddd3
LUCENE-9182: add apache license headers to all .gradle files and enforce in rat task 2020-01-27 12:05:34 -05:00
Robert Muir c53cc3edaf
LUCENE-9167: test speedup for slowest/pathological tests (round 3) 2020-01-24 08:58:59 -05:00
Robert Muir c754a764d4
LUCENE-9157: test speedup for slowest tests 2020-01-21 19:27:19 -05:00
Dawid Weiss f853d994ec Merge remote-tracking branch 'origin/master' into gradle-master 2019-12-09 16:48:21 +01:00
Christine Poerschke 49631ace9f LUCENE-8996: maxScore was sometimes missing from distributed grouped responses.
(Julien Massenet, Diego Ceccarelli, Munendra S N, Christine Poerschke)
2019-12-09 13:09:44 +00:00
Dawid Weiss d4a9842375 Initial gradle build layer. 2019-12-02 15:34:57 +01:00
Dawid Weiss 063c82ebd6 SOLR-13952: reverting Erick's commit (with permission). 2019-11-25 17:56:20 +01:00
Erick Erickson 4b34d726ab SOLR-13952: Separate out Gradle-specific code from other (mostly test) changes and commit separately 2019-11-24 13:24:40 -05:00
Christine Poerschke 3a3df47840 Revert "LUCENE-8996: maxScore was sometimes missing from distributed grouped responses."
This reverts commit 5289fce4bf.
2019-10-23 18:13:18 +01:00
Christine Poerschke 5289fce4bf LUCENE-8996: maxScore was sometimes missing from distributed grouped responses.
(Julien Massenet, Diego Ceccarelli, Christine Poerschke)
2019-10-23 12:08:42 +01:00
Christine Poerschke f8292f5372 LUCENE-9010: extend TopGroups.merge test coverage 2019-10-22 15:50:48 +01:00