Commit Graph

34995 Commits

Author SHA1 Message Date
Peter Gromov c61b458719
LUCENE-9804: Hunspell: fix most similar dictionary entry search by reversing the comparator (#2419) 2021-02-23 06:58:22 -05:00
Peter Gromov 342ea856d3
LUCENE-9803: Hunspell: don't check second stage suffixes if the first stage flag only occurs in prefixes (#2418) 2021-02-23 06:55:45 -05:00
Robert Muir 7d3f3d61ce
Fix tests.profile output to not run many many times (#2417)
The profiler should only be invoked once at the end of the build. During
refactoring the buildFinished() hook became nested under blocks such
as allProjects, which causes it to run too many times.
2021-02-23 06:54:39 -05:00
Peter Gromov 34993c22dd
LUCENE-9801: Hunspell suggestions: speed up expandWord by enumerating only applicable affixes (#2416) 2021-02-22 23:25:21 -05:00
Robert Muir af49df4851
Fix compilation failure on linux due to wrong case of package name
Correct package name in backwards-codecs from Lucene87 -> lucene87

It may cause no issues on case-insensitive filesystems such as those on
Mac OS X or Windows, but it breaks on Linux.
2021-02-22 22:39:14 -05:00
Noble Paul d1a5b9df02 refactor /cluster/aliases V2 API to use annotations 2021-02-23 13:03:38 +11:00
Julie Tibshirani 4d7b2aebfe
LUCENE-9705: Create Lucene90DocValuesFormat and Lucene90NormsFormat (#2392)
For now these are just copies of Lucene80DocValuesFormat and
Lucene80NormsFormat. The existing formats were moved to backwards-codecs.
2021-02-22 11:49:02 -08:00
Peter Gromov 42da2b45e6
LUCENE-9800: Hunspell: put a time limit on suggestion calculation (#2414)
* LUCENE-9800: Hunspell: put a time limit on suggestion calculation

* fix review remarks
2021-02-22 14:06:24 -05:00
Julie Tibshirani bfce5f36da
LUCENE-9616: Add developer docs on how to update a format. (#2395)
This commit adds simple guidelines on how to make a change to a file format:
* Document how the 'copy-on-write' approach works with backwards-codecs
* Clarify that we prefer to copy the format instead of using internal versions
2021-02-22 11:02:37 -08:00
Julie Tibshirani f43fe7642e
LUCENE-9705: Create Lucene90PostingsFormat (#2310)
For now this is just a copy of Lucene84PostingsFormat. The existing
Lucene84PostingsFormat was moved to backwards-codecs, along with its utility
classes.
2021-02-22 10:45:13 -08:00
Peter Gromov f783848e71
LUCENE-9799: Hunspell: don't check second-level affixes when the first level isn't a continuation (#2413)
* LUCENE-9799: Hunspell: don't check second-level affixes when the first level isn't a continuation

* check more words in TestPerformance
2021-02-22 05:35:36 -05:00
Gus Heck e420e6c8f6
SOLR-15160 update cloud.sh (#2393) 2021-02-21 14:36:19 -05:00
Ilan Ginzburg c472be5b86
SOLR-15157: fix wrong assumptions on stats returned by Overseer when cluster state updates are distributed (#2410) 2021-02-21 19:04:53 +01:00
Gus Heck 88ff3cd58d SOLR-14787 CHANGES.txt entry. 2021-02-21 12:05:53 -05:00
Gus Heck 7619165470 Documenting CloneFieldUpdateProcessorFactory once is enough :). 2021-02-21 11:57:24 -05:00
Kevin Watters b298d7fb16
SOLR-14787 - Adding support to use inequalities to the payload check query parser. (#1954) 2021-02-21 11:49:36 -05:00
Robert Muir 107926e486
LUCENE-9795: fix CheckIndex not to validate SortedDocValues as if they were BinaryDocValues
CheckIndex already validates SortedDocValues properly: reads every
document's ordinal and validates derefing all the ordinals back to bytes
from the terms dictionary.

It should not do an additional (very slow) pass where it treats the
field as if it were binary (doc -> ord -> byte[]); this is slow and
doesn't validate any additional index data.

Now that the term dictionary of SortedDocValues may be compressed, it is
especially slow to misuse the docvalues field in this way.
2021-02-21 11:19:41 -05:00
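Note on 107926e486: the cost argument in the message above can be illustrated with a toy model. This is only a sketch under stated assumptions, not CheckIndex code: `ToySortedField`, `ToyChecker`, and the lookup counter are hypothetical names invented for the illustration. A sorted field is modeled as a per-document ordinal array plus a small terms dictionary. The first check validates every document's ordinal and dereferences each ordinal once; the second repeats the dictionary lookup for every document, as a binary-style pass (doc -> ord -> byte[]) would.

```java
/** Toy model of a sorted doc-values field; hypothetical, not a Lucene class. */
final class ToySortedField {
  final byte[][] terms;   // terms dictionary: ord -> bytes
  final int[] docToOrd;   // one ordinal per document
  int lookups;            // counts dictionary dereferences, for illustration only

  ToySortedField(byte[][] terms, int[] docToOrd) {
    this.terms = terms;
    this.docToOrd = docToOrd;
  }

  byte[] lookupOrd(int ord) {
    lookups++;
    return terms[ord];
  }
}

final class ToyChecker {
  /** Validates each document's ordinal, then dereferences every ordinal exactly once. */
  static void checkSorted(ToySortedField field) {
    for (int ord : field.docToOrd) {
      if (ord < 0 || ord >= field.terms.length) {
        throw new IllegalStateException("ordinal out of range: " + ord);
      }
    }
    for (int ord = 0; ord < field.terms.length; ord++) {
      if (field.lookupOrd(ord) == null) {
        throw new IllegalStateException("missing term for ordinal " + ord);
      }
    }
  }

  /** Redundant binary-style pass: one dictionary lookup per document. */
  static void checkAsBinary(ToySortedField field) {
    for (int ord : field.docToOrd) {
      if (field.lookupOrd(ord) == null) {
        throw new IllegalStateException("missing term for ordinal " + ord);
      }
    }
  }
}
```

In this model the first check costs one lookup per distinct term, while the second costs one lookup per document and touches no data the first check had not already read.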
Dawid Weiss d2fb89c22f LUCENE-9793: Add task time aggregation utility (enabled with -Ptask.times=true). 2021-02-20 20:18:16 +01:00
Dawid Weiss 224843a2ba Clean up stale comments a bit. 2021-02-20 20:18:02 +01:00
Robert Muir c51fee9c1a
LUCENE-9480: Make DataInput.skipBytes(long) abstract
skipBytes() is a "relative" version of seek(), but DataInput previously
implemented it via read() calls, because DataInput's API does not
include absolute positioning methods (seek, getFilePointer).

This resulted in inefficiencies: calls to skipBytes() would cause
buffers to be allocated, bytes copied, etc.

Instead, make the subclass implement skipBytes() explicitly. The old
DataInput implementation is marked deprecated and renamed to skipBytesSlowly().

Some subclasses still implement skipBytes() via skipBytesSlowly(), to be
fixed in future improvements.
2021-02-20 12:11:32 -05:00
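Note on c51fee9c1a: the difference between skipping by reading and skipping by moving a position can be sketched as follows. This is a minimal illustration, not Lucene's implementation: `ToyInput` and `ToyByteArrayInput` are hypothetical classes, the scratch buffer size is arbitrary, and unchecked exceptions are used only to keep the example short (a real reader would likely declare IOException).

```java
import java.io.EOFException;
import java.io.UncheckedIOException;

/** Toy stand-in for a sequential reader; NOT Lucene's DataInput. */
abstract class ToyInput {
  abstract byte readByte();

  abstract void readBytes(byte[] dest, int offset, int len);

  /** No default implementation: each subclass decides how to skip. */
  abstract void skipBytes(long numBytes);

  /** Generic fallback: skips by reading into a scratch buffer and discarding it. */
  void skipBytesSlowly(long numBytes) {
    if (numBytes < 0) {
      throw new IllegalArgumentException("numBytes must be >= 0, got " + numBytes);
    }
    byte[] scratch = new byte[1024]; // allocation and copying that a direct skip avoids
    long skipped = 0;
    while (skipped < numBytes) {
      int step = (int) Math.min(scratch.length, numBytes - skipped);
      readBytes(scratch, 0, step);
      skipped += step;
    }
  }
}

/** In-memory reader that can skip by simply advancing its position. */
final class ToyByteArrayInput extends ToyInput {
  private final byte[] data;
  private int pos;

  ToyByteArrayInput(byte[] data) {
    this.data = data;
  }

  int position() {
    return pos;
  }

  @Override
  byte readByte() {
    ensureAvailable(1);
    return data[pos++];
  }

  @Override
  void readBytes(byte[] dest, int offset, int len) {
    ensureAvailable(len);
    System.arraycopy(data, pos, dest, offset, len);
    pos += len;
  }

  @Override
  void skipBytes(long numBytes) {
    if (numBytes < 0) {
      throw new IllegalArgumentException("numBytes must be >= 0, got " + numBytes);
    }
    ensureAvailable(numBytes);
    pos += (int) numBytes; // no buffer, no copy
  }

  private void ensureAvailable(long len) {
    if (len > data.length - pos) {
      throw new UncheckedIOException(new EOFException("read past EOF"));
    }
  }
}
```

Both paths leave the reader at the same position; the explicit override gets there without allocating or copying. A subclass with no cheaper option could still delegate its skipBytes() to skipBytesSlowly(), which matches the interim state the message describes.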
Eric Pugh 2f0d191452
SOLR-15162: Add some parameters to make MODIFYCOLLECTION v1 and v2 more similar. (#2402)
* expose readOnly parameter to v2 of modifycollection.


Co-authored-by: epugh@opensourceconnections.com <>
2021-02-20 10:49:09 -05:00
Jason Gerlowski 582a9f2e14 SOLR-15087: CHANGES.txt entry 2021-02-19 15:54:26 -05:00
Dawid Weiss 515a41dee9
LUCENE-9792: add testRegressions task that downloads and runs hunspell regression tests. (#2407) 2021-02-19 21:13:40 +01:00
Peter Gromov 31a64927a4
LUCENE-9785: Hunspell: don't check case in compound middle and end (#2398) 2021-02-19 20:16:39 +01:00
Peter Gromov 5325d2e6f4
LUCENE-9786: Hunspell suggestions: try moving the last character into the middle (#2399) 2021-02-19 20:15:57 +01:00
Peter Gromov 3ddc3c04a5
LUCENE-9787: Hunspell: speed up suggesting a bit by not creating a huge TreeSet (#2400) 2021-02-19 20:13:19 +01:00
Peter Gromov 58e3b7a854
LUCENE-9790: Hunspell: avoid slow dictionary lookup if the word's hash isn't there (#2405) 2021-02-19 20:10:06 +01:00
Peter Gromov 4b3fb1e065
LUCENE-9776: Hunspell: allow to inflect the last part of COMPOUNDRULE compound (#2397) 2021-02-19 20:03:34 +01:00
Ilan Ginzburg e7c80f6445
SOLR-15157: refactor Collection API to separate from Overseer and message handling abstractions (#2390)
No functional changes. In preparation of distributing the Collection API command execution.
2021-02-19 14:40:23 +01:00
Robert Muir 6deee14382
LUCENE-9774: Fix TestDirectIODirectory to probe for supported filesystem (#2396)
TestDirectIODirectory will currently fail if run on an unsupported
filesystem (e.g. tmpfs). Add an "assume" that probes if the filesystem
supports Direct I/O.

Also tweak javadocs to indicate correct @throws clauses for the
IndexInput and IndexOutput. You'll get an IOException (translated from
EINVAL) if the filesystem doesn't support it, not a UOE.
2021-02-18 20:36:18 -05:00
epugh@opensourceconnections.com f920b9b14e I do not want to backport build tool changes from gradle to ant, so will leave this feature for Solr 9 2021-02-18 17:26:01 -05:00
Eric Pugh f70a518f1b
SOLR-8138: Simple UI for issuing SQL queries (#2381)
* Updated SOLR-8138 files for Solr 9.

This code was mostly written by Michael Suzuki; I just tweaked it to load and updated ui-grid to version 4.10.

* unused file, we use the .min version.

* add an entry for the ui-grid project to license file.

Co-authored-by: epugh@opensourceconnections.com <>
2021-02-18 17:21:21 -05:00
Peter Gromov 5e834b39eb
LUCENE-9769: Hunspell: KEEPCASE should take precedence over affixed forms (#2374)
and disregard KEEPCASE in Stemmer to make it more consistent with "hunspell -s"
2021-02-18 09:30:09 +01:00
Peter Gromov 589eefc32b
LUCENE-9782: Hunspell suggestions: split by space (but not dash) also before last char (#2387) 2021-02-18 09:28:29 +01:00
Peter Gromov f879c6ad84
LUCENE-9783: Hunspell: don't suggest more than 4 ngram corrections by default (#2388) 2021-02-18 09:27:06 +01:00
Peter Gromov f83c9862e8
LUCENE-9784: Hunspell suggestions: use US keyboard in absence of KEY option (#2389) 2021-02-18 09:26:22 +01:00
Houston Putman 4bd4f7063b
LUCENE-9780: Only validate JARs for tasks that are enabled (#2382) 2021-02-17 18:12:27 -05:00
Jason Gerlowski c3f6e12876 Resolve AbstractCloudBackupRestoreTestCase flakiness
The 'testBackupAndRestore' method in this class was asserting that the
collection created by restore had the expected number of cores-per-node,
but the logic to compute that expected cores-per-node value failed to
account for a rarely-triggered branch that adds a 'createNodeSet' param
to the restore.

This commit updates the test logic to compute the expected
cores-per-node value when createNodeSet is passed.
2021-02-17 16:02:50 -05:00
Gus Heck 1484c74ba7 LUCENE-9659 fix unit test. 2021-02-17 15:19:33 -05:00
Kevin Watters 890f570bf5
LUCENE-9659 inequality support in payload check query (#2185)
Changes from SOLR-14787 supporting inequalities in SpanPayloadCheckQuery
2021-02-17 09:48:50 -05:00
noblepaul 3b6ba9e3e8 Add back-compat indices for 8.8.0 2021-02-17 22:46:58 +11:00
Peter Gromov effca165df
LUCENE-9781: Speed up BytesStore reader setPosition (#2386) 2021-02-17 11:28:44 +01:00
Tobias Kaessmann f142bf9c54
SOLR-15038: Add elevateOnlyDocsMatchingQuery and collectElevatedDocsWhenCollapsing parameters to query elevation.
Closes #2134
2021-02-17 10:54:17 +01:00
Peter Gromov 2ae45cc985
LUCENE-9778: Hunspell: speed up input conversion (#2376) 2021-02-17 09:10:40 +01:00
Peter Gromov 2d53c6073b
LUCENE-9779: Hunspell: add an API to interrupt long computations (#2378) 2021-02-17 09:09:44 +01:00
Ignacio Vera cfd0ccefe1
LUCENE-9777: Fix out of date versions on releases 8.7.0 and 8.8.0 (#2377) 2021-02-17 08:29:05 +01:00
Peter Gromov 902cb93db2
LUCENE-9775: Hunspell: make FORCEUCASE work when the first compound word is inherently title-case (#2375) 2021-02-17 07:54:12 +01:00
David Smiley 2555418048
LUCENE-9762: DoubleValuesSource.fromQuery bug (#2365)
Also used by FunctionScoreQuery.boostByQuery. 
Could throw an exception when the query implements TwoPhaseIterator 
and when the score is requested repeatedly.

Co-authored-by: Chris Hostetter <hossman@apache.org>
2021-02-16 22:51:17 -05:00
David Smiley 253b20c3c6
SOLR-15156: [child childFilter='...:...'] no longer escapes (#2367)
The query escaping it did was inconsistent with all other places in Solr where a Lucene query may be provided.
2021-02-16 22:37:34 -05:00
Jason Gerlowski 15bd858d34
SOLR-15087: Allow restoration to existing collections (#2380)
The recent addition of support for a "readonly" mode for collections
opens the door to restoring to already-existing collections.

This commit adds a codepath to allow this.  Any compatible existing
collection may be used for restoration, including the collection that
was the original source of the backup.
2021-02-16 21:59:24 -05:00