Commit Graph

35959 Commits

Author SHA1 Message Date
Dawid Weiss a7b50f723d Reverting back to b48cac02. 2021-12-18 23:36:30 +01:00
Dawid Weiss 2a44ff532e LUCENE-10308: sort input files for ecj so that module-info.java comes first. 2021-12-18 21:17:56 +01:00
Dawid Weiss d42db56bab LUCENE-10255: initial support for Java Modules. 2021-12-18 20:45:51 +01:00
Dawid Weiss b48cac0206 LUCENE-10285: try to force ordering of internal tasks, in spite of making top-level wrapper dependencies. (#549) 2021-12-17 19:12:09 +01:00
Dawid Weiss 1a429c621e Render javadocs for all projects, even if they're not part of site. 2021-12-17 17:57:38 +01:00
Uwe Schindler f7fd21a0c6 Merge branch 'main' of https://gitbox.apache.org/repos/asf/lucene into jms2 2021-12-17 16:05:17 +01:00
Uwe Schindler 8610176a42 Remove obsolete options 2021-12-17 15:22:22 +01:00
Greg Miller 1e8b94a1bb LUCENE-10321: Tweak MultiRangeQuery interval tree creation logic (#547) 2021-12-17 05:43:38 -08:00
Uwe Schindler db9dff225c after reading code, correct the argument file to comply with ECJ's parser 2021-12-17 12:18:28 +01:00
Uwe Schindler 6941701c6d Escape the options in ECJ's options file 2021-12-17 11:37:27 +01:00
Dawid Weiss ae92e96481 Address Uwe's remarks. 2021-12-16 20:05:05 +01:00
Dawid Weiss c64e5fe84c LUCENE-10313: add missing javadoc. 2021-12-16 18:36:19 +01:00
Dawid Weiss e0745c7b24 LUCENE-10255: re-add utilities for debugging packages and services. These are not included by default to avoid unnecessary compilation overhead. 2021-12-16 17:59:54 +01:00
Dawid Weiss 5b3b75efd8 LUCENE-10308: Make ecj and javadoc run with modular paths 2021-12-16 17:51:01 +01:00
Dawid Weiss 9224bde83c Add assertj. 2021-12-16 16:49:20 +01:00
Dawid Weiss 9917092bf8 LUCENE-10313: merge log4j-less Luke. 2021-12-16 15:47:52 +01:00
Tomoko Uchida e7b4700c5a LUCENE-10313: minor clean-ups and follow-ups (#546) 2021-12-16 15:38:32 +01:00
Jan Høydahl 05cb0fd0c1 Add back-compat indices for 8.11.1 2021-12-16 15:09:28 +01:00
Jan Høydahl 8c48475c4d Add java version mapping for lucene 10 2021-12-16 13:07:03 +01:00
Jan Høydahl 2b07bcef2e Add bugfix version 8.11.1 2021-12-16 12:11:26 +01:00
Tomoko Uchida 8e8a94a2b7 LUCENE-10303: remove unnecessary changes entry 2021-12-16 19:39:20 +09:00
Dawid Weiss 36638dcb1e LUCENE-10313: drop log4j from luke (#544) 2021-12-16 11:18:34 +01:00
Jan Høydahl 3687c71f28 Add 8.11.1 2021-12-16 10:43:44 +01:00
Quentin Pradet 9974f6ac34
LUCENE-10085: Fix flaky testQueryMatchesCount (#538)
Five times every 10 000 tests, we did not index any documents with i
between 0 and 10 (inclusive), which caused the deleted tests to fail.

With this commit, we make sure that we always index at least one
document between 0 and 10.
2021-12-14 10:49:58 +01:00
Ignacio Vera 5207aae527
LUCENE-10310: Fix test error in TestXYDocValuesQueries#testRandomDistanceHuge (#537)
We create random circles using ShapeTestUtils which is safe.
2021-12-13 12:01:20 +01:00
Tomoko Uchida 2f634b0d95 LUCENE-10309: Minimum KnnVector codec support in Luke (#535) 2021-12-12 15:31:18 +09:00
Tomoko Uchida e111182e12 LUCENE-10303: Upgrade log4j to 2.15.0 2021-12-11 10:43:03 +09:00
Tomoko Uchida cb788d8e9e LUCENE-10305: Ensure line endings of versions.props is LF 2021-12-11 10:10:44 +09:00
Dawid Weiss 1bcdc600b3 LUCENE-10304: exclude module-info.java from all sourcesets for Eclipse, otherwise things break (predictably). 2021-12-10 19:56:55 +01:00
Dawid Weiss 003fa44357 LUCENE-10307: correct module descriptor so that exported packages test passes. 2021-12-10 19:15:54 +01:00
Dawid Weiss 51d93635aa LUCENE-10307: add exported packages consistency check. 2021-12-10 17:53:16 +01:00
Dawid Weiss 8511def95b LUCENE-10307: add distribution sanity tests. 2021-12-10 17:25:47 +01:00
Dawid Weiss 458c0486c0 LUCENE-10304: a workaround for intellij's problem with runtime scopes on dependencies. 2021-12-10 17:16:19 +01:00
Dawid Weiss aee191d878 LUCENE-10300: rewrite how resources are read in ukrainian morfologik analyzer (module vs. classpath lookup). 2021-12-10 17:16:19 +01:00
Dawid Weiss 768adb99d6 LUCENE-10300: add morfologik.tests and check if the ukrainian analyzer loads properly. 2021-12-10 17:16:19 +01:00
Dawid Weiss 600d8345f8 LUCENE-10306: set up module configurations to consume full JARs for test projects. 2021-12-10 17:16:19 +01:00
Dawid Weiss 328b3cc55f LUCENE-10255: add support for .tests subprojects which contain module tests. 2021-12-10 17:16:19 +01:00
Dawid Weiss 6d83c2e08e LUCENE-10255: add gradle compilation and module descriptor support for the java module system. Adds module descriptors to all Lucene subprojects. 2021-12-10 17:16:19 +01:00
Dawid Weiss b9c22fdb49 LUCENE-9871: minor cleanups of extra semicolons and solr build remnants. 2021-12-10 10:29:35 +01:00
Dawid Weiss b2b52ca92a LUCENE-10229: change the wording a bit. 2021-12-09 17:34:54 +01:00
Patrick Zhai 53099e01de LUCENE-10229: Unify behaviour of match offsets for interval queries (#521) 2021-12-09 17:19:18 +01:00
Ignacio Vera 40c213d873
Revert "LUCENE-10289: Change DocIdSetBuilder#grow() from taking an int to a long (#520)" (#532)
This reverts commit af1e68b891.
2021-12-09 13:54:40 +01:00
Dawid Weiss 8367f700c7 LUCENE-10294: Avoid compiling javadocs twice in 'gradlew check'. 2021-12-09 09:56:11 +01:00
Robert Muir 7a872c7a5c
LUCENE-10296: Stop minimizing regexp (#528)
In current trunk, we let the caller (e.g. RegExpQuery) try to "reduce" the expression. Neither the parser nor the low-level executors implicitly call exponential-time algorithms anymore.

But now that we have cleaned this up, we can see it is even worse than just calling determinize(). We still call minimize() which is much crazier and much more.

We stopped doing this for all other AutomatonQuery subclasses a long time ago, as we determined that it didn't help performance. Additionally, minimization vs. determinization is even less important than early days where we found trouble: the representation got a lot better. Today when you finishState we do a lot of practical sorting/coalescing on-the-fly. Also we added this fancy UTF32-to-UTF8 automata convertor, that makes the worst-case-space-per-state significantly lower than it was before? So why minimize() ?

Let's just replace minimize() calls with determinize() calls? I've already swapped them out for all of src/test, to get jenkins looking for issues ahead of time.

This change moves hopcroft minimization (MinimizeOperations) to src/test for now. I'd like to explore nuking it from there as a next step, any tests that truly need minimization should be fine with brzozowski's
algorithm.
2021-12-08 21:44:26 -05:00
Julie Tibshirani 5d39bca87a
LUCENE-10040: Add test for vector search with skewed deletions (#527)
This exercises a challenging case where the documents to skip all happen to
be closest to the query vector. In many cases, HNSW appears to be robust to this
case and maintains good recall.
2021-12-08 11:24:12 -08:00
Adrien Grand b9287c8ce0 Fix precommit. 2021-12-08 18:51:35 +01:00
Adrien Grand f190cc3509 Re-enable tests. 2021-12-08 17:52:17 +01:00
Adrien Grand ecc38495ab Add back-compat indices for 9.0.0. 2021-12-08 17:43:06 +01:00
Robert Muir 84e4b85b09
LUCENE-10010: don't determinize/minimize in RegExp (#513)
Previously, RegExp called minimize() at every parsing step. There is little point to making an NFA execution when it is doing this: minimize() implies exponential determinize().
 
Moreover, some minimize() calls are missing, and in fact in rare cases RegExp can already return an NFA today (for certain syntax).

RegExp parsing should do none of this; it may return either a DFA or an NFA. NOTE: many simple regexps happen to be still returned as DFA, just because of the algorithms in use.

Callers can decide whether to determinize or minimize. RegExp parsing should not run in exponential time.

All src/java callsites were modified to call minimize(), to prevent any performance problems. minimize() seems unnecessary, but let's approach removing minimization as a separate PR. src/test was fixed to just use determinize() in preparation for this.

Add new unit test for RegExp parsing

New test tries to test each symbol/node independently, to make it easier to maintain this code.
The new test case now exceeds 90% coverage of the regexp parser.
2021-12-07 21:39:13 -05:00
Robert Muir 5a1fdd8865
remove unnecessary "dependencies" in versions.props (#526)
Looks like stray cats from back when it was shared with solr
2021-12-07 21:22:54 -05:00