35347 Commits

Author SHA1 Message Date
Peter Gromov 4842e0c9ca LUCENE-9825: Hunspell: reverse the "words" trie for faster word lookup/suggestions 2021-03-05 16:06:48 +01:00
Peter Gromov 99a4bbf3a0
LUCENE-9824: Hunspell suggestions: speed up ngram score calculation for each dictionary entry (#2457) 2021-03-05 10:00:02 -05:00
Joel Bernstein 6e67b9f959 SOLR-15193: Fix wording 2021-03-05 09:17:44 -05:00
Joel Bernstein 36386f4832 SOLR-15193: Fix wording 2021-03-05 09:14:48 -05:00
Joel Bernstein eb0c04b752 SOLR-15193: Add maxDocFreq docs 2021-03-05 09:11:59 -05:00
David Smiley ddbd3b88ec
SOLR-15185: Optimize Hash QParser (#1524)
used in parallel() streaming expression.  Hash algorithm is different.
* Simpler
* Don't use Filter (to be removed)
* Do use TwoPhaseIterator, not PostFilter
* Don't pre-compute matching docs (wasteful)
* Support more fields, and more field types
* Faster hash on Strings (avoid Char conversion)
* Stronger hash when using multiple fields
2021-03-04 23:43:16 -05:00
Robert Muir 8e337ab63f
LUCENE-9822: Assert that ForUtil.BLOCK_SIZE can be PFOR-encoded in a single byte
For/PFor code has BLOCK_SIZE=128 as a static final constant, with a lot
of assumptions and optimizations for that case. For example it will
encode 3 exceptions at most and optimizes the exception encoding with a
single byte.

This would not work at all if you changed the constant in the code to
something like 512, but an assertion at an early stage helps make
experimentation less painful, and better "documents" the assumption of how
the exception encoding currently works.
2021-03-04 18:59:03 -05:00
Peter Gromov 231e3afe06
LUCENE-9687: Hunspell suggestions: reduce work in the findSimilarDictionaryEntries loop (#2451)
The loop is called a lot of times, and some allocations and method calls can be spared
2021-03-04 17:56:11 -05:00
Bruno Roustant 19e6560b7f
Restore read-only permission in solr-tests.policy 2021-03-04 16:20:17 +01:00
noblepaul 0e37835e0b DOAP changes for release 8.8.0 2021-03-04 22:55:30 +11:00
Thomas Wöckinger 8d62e2723a SOLR-15191: Fix EnumFieldTest
(9x/8x difference)
2021-03-03 21:41:22 -05:00
Eric Pugh c974b233e4
SOLR-15204: Document bin solr zk and zkcli (#2437)
* Update examples to use bin/solr zk
2021-03-03 14:43:01 -05:00
Christine Poerschke d822a38a48
SOLR-15206: improve CoreContainer constructor javadocs (#2443) 2021-03-02 13:44:24 +00:00
zacharymorn 6ba9fe5be3
LUCENE-9406: Add IndexWriterEventListener to track events in IndexWriter (#2342) 2021-03-02 09:54:08 +01:00
百岁 8b443420b8
SOLR-15100: make ConfigSetService configurable in solr.xml (#2343) 2021-03-01 22:27:37 -05:00
Michael Sokolov 9e8207a558 LUCENE-9819: fix random failures in TestKnnGraph due to insufficient graph connectivity 2021-03-01 17:43:35 -05:00
Joel Bernstein 17c6a7c37f SOLR-15193: Improve formatting 2021-03-01 14:19:17 -05:00
Joel Bernstein 53deb6f735 SOLR-15132: Update CHANGES.txt 2021-03-01 10:41:18 -05:00
Dawid Weiss 7dc43f46fd
LUCENE-9818: print slowest suites, add an option to enable/disable the function from options. (#2439) 2021-03-01 16:02:18 +01:00
Andrzej Bialecki 2c5b86b673 SOLR-15130: Support for per-collection replica placement node sets, a.k.a "node type"
placements.
2021-03-01 15:21:54 +01:00
Dawid Weiss ef3a23b3d0 LUCENE-9793: add task length reporting for github PRs. 2021-03-01 11:19:47 +01:00
Robert Muir dade99cb4d
LUCENE-9816: lazy-init LZ4-HC hashtable in BlockTreeTermsWriter
LZ4-HC hashtable is heavy (128kb int[] + 128kb short[]) and must be
filled with special values on initialization. This is a lot of overhead
for fields that might not use the compression at all.

Don't initialize this for a field until we see hints that the data might
be compressible and need to use the table in order to test it out.
2021-02-28 17:54:30 -05:00
Robert Muir 96eb043131
fix TestKnnGraph test failure if it gets SimpleText
This test reaches into lucene90 internals, fails with classcastexception
if it happens to get simpletext.
2021-02-28 14:43:36 -05:00
Ilan Ginzburg 1fff174690
SOLR-14928: add exponential backoff wait time when Compare And Swap fails in distributed cluster state update due to concurrent update (#2438) 2021-02-28 00:53:42 +01:00
Thomas Wöckinger 988a16fe95 SOLR-15191: Fix JSON faceting on EnumFieldType (#2426)
* Fix JSON Faceting on EnumFieldType if allBuckets, numBuckets or missing is set.
* Enhance hash method of JSON faceting to support EnumFieldType and perhaps some other/custom field types

Co-authored-by: Thomas Wöckinger <two@silbergrau.com>
Co-authored-by: David Smiley <dsmiley@apache.org>
2021-02-27 14:35:29 -05:00
Eric Pugh d4fb023756
SOLR-15194: relax requirements and allow http urls. (#2430)
Relax the need for https urls for JWT IdPs if you pass in the solr.auth.jwt.allowOutboundHttp=true system property.
2021-02-27 09:13:51 -05:00
Robert Muir 6348c284fd
Merge branch 'master' of https://gitbox.apache.org/repos/asf/lucene-solr 2021-02-26 20:27:23 -05:00
Robert Muir 373e1d6c83
LUCENE-9814: fix extremely slow 7.0 backwards tests in master
The 7.0 backwards tests added to master must have come from an older
branch before they were fixed: they've added minutes to my test times.

These tests have already been fixed in master, so that the crazy
corner-case stress tests are only running slowly in Jenkins and we don't
have 15-30s long tests locally.

Re-applying same fixes to 7.0 tests removes minutes from my test times.
2021-02-26 20:24:01 -05:00
Christine Poerschke e88b3e9c20 Fix 'invoke' typo in UUIDUpdateProcessorFactory javadocs. 2021-02-26 17:38:06 +00:00
Peter Gromov 4f6469b173
LUCENE-9812: Hunspell: honor empty stripping affixes when generating suggestions (#2432) 2021-02-26 00:24:57 -05:00
zacharymorn 5bca3d1960
LUCENE-9639: Implements SimpleTextVectorReader#ramBytesUsed (#2433)
* Use single class imports
2021-02-25 21:32:34 -05:00
David Smiley 62971c4f99
SOLR-13034: RTG sometimes didn't materialize LazyField (#2408)
Partial (AKA Atomic) updates could encounter "LazyField" instances in the document
cache and not know how to deal with them when writing the updated doc to the update log.
2021-02-25 16:29:30 -05:00
Joel Bernstein 220db76311 SOLR-15193: Fix wording 2021-02-25 08:39:10 -05:00
Noble Paul 119aec804e removed empty file 2021-02-25 22:47:04 +11:00
Ilan Ginzburg 04c95c71af
SOLR-15146: remove unreachable code (#2431) 2021-02-25 00:08:16 +01:00
Joel Bernstein 1f8b708a54 SOLR-15193: Fix typos 2021-02-24 17:39:02 -05:00
Joel Bernstein a3691bc81d SOLR-15193: Fix typos 2021-02-24 17:33:42 -05:00
zacharymorn 56cb9a304c
LUCENE-9639: Add unit tests for SimpleTextVector format (#2404)
... and fix the implementation so it passes!
2021-02-24 15:37:34 -05:00
Joel Bernstein 9a30406871 SOLR-15193: Fix typos 2021-02-24 15:19:35 -05:00
Joel Bernstein 4fb530c52e SOLR-15193: Add Graph to the Visual Guide to Streaming Expressions and Math Expressions 2021-02-24 15:02:57 -05:00
Peter Gromov a5d8463119
LUCENE-9810: Hunspell: when generating suggestions, skip too deep word FST subtrees (#2427)
* LUCENE-9810: Hunspell: when generating suggestions, skip too deep word FST subtrees

we skip roots longer than misspelled+4 anyway, so there's no need to read their arcs

* check more in TestPerformance.de_suggest
2021-02-24 11:58:32 -05:00
Peter Gromov 3a99e2aa82
LUCENE-9806: Hunspell: speed up affix condition checking (#2423)
* LUCENE-9806: Hunspell: speed up affix condition checking

check only stem beginning/end without strip/condition, not the whole candidate
avoid regexp if possible

* hunspell: simplify AffixCondition, add more tests

* add a license to the test
2021-02-24 11:45:35 -05:00
Peter Gromov e1ff4c1354
LUCENE-9808: Hunspell suggestions: consider space/dash-separated words for each case variation (#2425) 2021-02-24 11:43:37 -05:00
Peter Gromov 9d6fd98810
LUCENE-9811: Hunspell suggestions: speed up ngram calculation by not searching for substrings in impossible places (#2428) 2021-02-24 11:41:50 -05:00
Ignacio Vera f8be421ae1
LUCENE-9705: Create Lucene90TermVectorsFormat (#2334) 2021-02-24 11:15:11 +01:00
Robert Muir 84a35dfaea
LUCENE-9794: Optimize skipBytes implementation in remaining DataInput subclasses
Fix various DataInputs to no longer use skipBytesSlowly, add new tests.
2021-02-24 02:46:24 -05:00
Timothy Potter eba0e25535
SOLR-15181: update schema to not specify the docValuesFormat (#2424) 2021-02-23 17:34:07 -07:00
Timothy Potter cfb9764250 Add backcompat indices for 8.8.1 2021-02-23 12:32:50 -07:00
Timothy Potter 10ece1bf2b Fix :lucene:core:spotlessApply failure due to 8.8.1 version update 2021-02-23 12:21:49 -07:00
Timothy Potter bb607bf3fd Add bugfix version 8.8.1 2021-02-23 09:02:52 -07:00
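
Note on SOLR-14928 (1fff174690): the commit message describes adding an exponential backoff wait when a Compare And Swap fails because of a concurrent update. The log does not show the actual change, so the following is only a minimal generic sketch of that technique, not the Solr implementation. The class name, method names, the use of an AtomicInteger as stand-in state, and the jitter strategy are all assumptions made for illustration.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntUnaryOperator;

// Hypothetical illustration only; none of these names come from Solr.
final class CasBackoffSketch {

    // Upper bound of the wait for a zero-based attempt: baseMs * 2^attempt, capped at maxMs.
    static long backoffCeilingMs(int attempt, long baseMs, long maxMs) {
        long shifted = baseMs << Math.min(attempt, 20); // clamp the shift so it cannot overflow for small bases
        return Math.min(maxMs, shifted);
    }

    // Retries a compare-and-set update, sleeping a random time up to the ceiling after each failure.
    static int updateWithBackoff(AtomicInteger state, IntUnaryOperator update,
                                 int maxAttempts, long baseMs, long maxMs) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int current = state.get();
            int next = update.applyAsInt(current);
            if (state.compareAndSet(current, next)) {
                return next; // nobody changed the state between our read and our write
            }
            long ceiling = backoffCeilingMs(attempt, baseMs, maxMs);
            try {
                // Random jitter (assumed here) keeps competing writers from retrying in lockstep.
                Thread.sleep(ThreadLocalRandom.current().nextLong(ceiling + 1));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while backing off", e);
            }
        }
        throw new IllegalStateException("CAS still failing after " + maxAttempts + " attempts");
    }
}
```

A caller would invoke something like CasBackoffSketch.updateWithBackoff(counter, v -> v + 1, 5, 10, 1000). The cap on the wait keeps the worst-case delay bounded even after many consecutive conflicts.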