Commit Graph

34911 Commits

Author SHA1 Message Date
jimczi 5af12b5f14 LUCENE-9675: Binary doc values fields now expose their configured compression mode in the attributes of the field info. 2021-01-19 10:03:13 +01:00
Patrick Marty 227256d951
LUCENE-9646: Set BM25Similarity discountOverlaps via the constructor 2021-01-19 09:49:57 +01:00
Peter Gromov 9f5bdf43b7
LUCENE-9678: Hunspell: fix off-by-one error to support prefixes of word.length - 1 (#2219) 2021-01-19 09:34:27 +01:00
Peter Gromov 422c89baef
LUCENE-9676: Hunspell: improve stemming of all-caps words (#2217)
Repeat Hunspell's logic:
* when encountering a mixed-case or (inflectable) all-caps dictionary entry, add its title-case analog as a hidden entry
* use that hidden entry for stemming case variants for title- and uppercase words, but don't consider it a valid word itself
* ...unless there's another explicit dictionary entry of that title case
2021-01-19 09:32:23 +01:00
Simon Willnauer c1ae6dc07c
LUCENE-9669: Add an expert API to allow opening indices created < N-1 (#2212)
Today we force indices that were created with N-2 and older versions of Lucene
to fail on open. This check doesn't even check if the codecs are available. In order
to allow users to open older indices and for us to support N-2 versions this change
adds an API on DirectoryReader to specify a minimum index version on a per reader basis.
This doesn't apply to the IndexWriter, which will still fail on opening older indices.
2021-01-19 09:23:49 +01:00
Peter Gromov 426c902bc9
LUCENE-9677: simplify Dictionary.affixData storage (#2218)
Use char[] instead of byte[], get rid of unnecessary byte array readers/writers.
2021-01-19 09:22:33 +01:00
Peter Gromov ab08fdc6f0
LUCENE-9671: Hunspell: shorten Stemmer.applyAffix (#2209)
Call stem() recursively just once with different arguments depending on various conditions. 

NOTE: committing directly, as this is a refactoring, not a functional change (no CHANGES.txt entry).
2021-01-18 22:54:22 +01:00
Noble Paul 8505d4d416
SOLR-15052: Per-replica states for reducing overseer bottlenecks (trunk) (#2177) 2021-01-19 02:59:41 +11:00
Uwe Schindler 4b508aef24 LUCENE-8982: Add a note to MIGRATE.md 2021-01-18 00:50:02 +01:00
zacharymorn a7747b63b4
LUCENE-8982: Make NativeUnixDirectory pure java with FileChannel direct IO flag, and rename to DirectIODirectory (#2052)
2021-01-17 23:57:56 +01:00
Namgyu Kim eb24e95731
LUCENE-9661: Fix deadlock in TermsEnum.EMPTY 2021-01-16 06:49:23 +09:00
Cassandra Targett 30aa0f5ba4 Ref Guide: copy edits for 8.8 release 2021-01-15 14:54:41 -06:00
Cassandra Targett 90aabbdde8 Ref guide: add license to cluster-plugins.adoc; fix section title case throughout 2021-01-15 14:54:41 -06:00
Cassandra Targett cb465044d7 SOLR-14560: ref guide: remove references to XML output when examples are all JSON 2021-01-15 14:54:41 -06:00
Mike McCandless cc1d902ade fix typo in Adrien's name! 2021-01-15 09:31:13 -05:00
Peter Gromov 82f6f161ae
LUCENE-9664: Hunspell support: fix most IntelliJ warnings, cleanup (#2202) 2021-01-15 13:52:34 +01:00
Peter Gromov 90131a605a
LUCENE-9665: Hunspell: support default encoding (#2203, Peter Gromov via Dawid Weiss) 2021-01-15 09:35:25 +01:00
Florin Babes f285f02c89
SOLR-15071: add TestEdisMaxSolrFeature.testEdisMaxSolrFeatureCustomMM() test case (#2201)
* add test case for SOLR-15071

* add temporary @Ignore to be removed when the fix is committed

Co-authored-by: Florin Babes <florin.babes@emag.ro>
Co-authored-by: Christine Poerschke <cpoerschke@apache.org>
2021-01-14 10:44:26 +00:00
Noble Paul 9466af576a
SOLR-14155: Load all other SolrCore plugins from packages (#1666) 2021-01-13 22:28:01 +11:00
Cassandra Targett 7a301c736c Ref Guide: clarify backup location requirements for SolrCloud backups 2021-01-11 14:36:14 -06:00
Eric Pugh 3e2fb59272
SOLR-15010 Try to use jattach for threaddump if jstack is missing (#2192)
* introduce jattach check if jstack is missing.  jattach ships in the Solr docker image instead of jstack.
* get the full path to the jattach command

Co-authored-by: Christine Poerschke <cpoerschke@apache.org>
2021-01-11 14:58:11 -05:00
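The fallback described in SOLR-15010 can be sketched as follows. This is only an illustration in Python, not the actual change to Solr's start script; the function name `thread_dump_command` and the injectable `which` parameter are hypothetical and exist to keep the sketch testable.

```python
import shutil


def thread_dump_command(pid, which=shutil.which):
    """Build the argv for a JVM thread dump (hypothetical helper).

    Prefers jstack; falls back to jattach when jstack is not on the PATH.
    Returns None when neither tool can be found.
    """
    jstack = which("jstack")
    if jstack:
        return [jstack, str(pid)]
    # Use the full path returned by the lookup, as the commit notes.
    jattach = which("jattach")
    if jattach:
        return [jattach, str(pid), "threaddump"]
    return None
```

The caller would pass the returned list to something like `subprocess.run`; the sketch stops short of executing anything.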
Mike Drob a429b969d8
SOLR-14413 fix unit test to use delayed handler (#2189) 2021-01-11 12:15:30 -06:00
Timothy Potter 6711eb7571
SOLR-15036: auto- select / rollup / sort / plist over facet expression when using a collection alias with multiple collections (#2132) 2021-01-11 10:34:28 -07:00
Adrien Grand f0d6fd84bb LUCENE-9346: Add CHANGES entry. 2021-01-11 15:06:08 +01:00
zacharymorn c2493283a5
LUCENE-9346: Support minimumNumberShouldMatch in WANDScorer (#2141)
Co-authored-by: Adrien Grand <jpountz@gmail.com>
2021-01-11 15:03:29 +01:00
Jason Gerlowski 98c51ca34b
SOLR-15070: Remove HashMap usage in SuggestComponent rsp (#2183)
Prior to this commit, SuggestComponent used a HashMap as part of the
response it built on the server side.  This class is serialized/
deserialized differently depending on the SolrJ ResponseParser used:
a LinkedHashMap when javabin was used, and a SimpleOrderedMap when XML
was used.  This discrepancy led to ClassCastExceptions in downstream
SolrJ code.

This commit fixes the issue by changing SuggestComponent to avoid these
types that are serialized differently.  "suggest" response sections now
deserialize as a NamedList in SolrJ, and the SuggesterResponse POJO has
been updated accordingly.
2021-01-11 07:31:26 -05:00
Houston Putman 7e94a56e81 SOLR-14999: Fixing SolrXmlConfig tests for hostPort. 2021-01-09 10:19:47 -05:00
Munendra S N 2c1ec75eaa SOLR-12559: fix error when multi-val fields are dereferenced in JSON aggs
This ensures all dereferenced fields are not parsed into an actual valuesource
but into a placeholder value. This works for one level of dereferencing.
2021-01-09 19:30:43 +05:30
Tomas Fernandez Lobbe 4789112f91
Remove unused test file (#2174) 2021-01-08 16:40:21 -08:00
Houston Putman 4be49cbdf5
SOLR-14999: Option to set the advertised port for Solr. (#2089) 2021-01-08 18:21:41 -05:00
Houston Putman 86934787fe
Adding local gradle settings for github actions. (#2191) 2021-01-08 18:12:19 -05:00
Dawid Weiss 5b734fb94a
Make :localSettings always available, even if it's a noop on subsequent runs. (#2190) 2021-01-08 20:26:35 +01:00
iverase a7391fb73e LUCENE-9641: Fix LatLonShape#testPointIndexAndQuery test bug. 2021-01-08 13:43:29 +01:00
Ignacio Vera 14009f4424
LUCENE-9641: Support for spatial relationships in LatLonPoint (#2155)
Equivalent to LatLonShape, LatLonPoint can be queried now using spatial relationships.
2021-01-08 08:16:58 +01:00
David Smiley 4cb3ad4a1c
* SOLR-14923: Nested docs indexing perf & robustness (#2159)
* When the schema defines _root_, and you want to do atomic/partial updates...
** _root_ needn't be stored or have docValues any more
** _nest_path_ field isn't needed for this any more
** Simplified internal logic
* Allow (and recommend, eventually insist) that the _root_ field be passed for atomic/partial updates to child docs.
** In the absence of _root_, assume the _route_ param is equivalent to ameliorate back-compat scope.  This is a temporary hack; remove in SOLR-15064.
** One of the two is required; you'll get an exception if the assumption is false.  THIS IS A BACK-COMPAT CHANGE
* Ensure that the update log contains the _root_ field if it's defined in the schema; in some cases it wasn't.  It's important for robustness of atomic/partial updates to child docs.  Caveat: the buffer replay scenario is not tested with child docs.
* Limited the cases when a realtime searcher is re-opened.  It was being applied to any update that included child docs but now only some narrow subset: only for atomic/partial updates, and when the update log contains an in-place update for the same nest because it's complicated to resolve those log entries.
* Internal improvements to RealTimeGetComponent to aid clarity & robustness & probably performance...
** Use SolrDocumentFetcher.solrDoc(docID, ReturnFields) instead of more manual loading.  Will do more with this in another PR.
** Clarify when only root doc IDs are expected.
** Use Resolution enum more, add PARTIAL, remove DOC_WITH_CHILDREN; enhance docs.
** When we have ReturnFields, a Set of "onlyTheseFields" becomes redundant.  Add a child doc resolution via a transformer when needed.
** Clarified where copy-field targets are removed
* NestPathField should default to single valued, instead of inheriting the schema default, which for ancient schemas was multi-valued.
* AddUpdateCommand.getLuceneDocument(s) methods are very internal; made package visible and refactored a bit for clarity
* DocumentBuilder: when in-place update, skip id and _root_ here, thus also simplifying further logic
* NestedShardedAtomicUpdateTest no longer extends AbstractFullDistribZkTestBase because it wasn't really leveraging the "control client" checking, and it added too much complexity to debug failures.
2021-01-07 23:23:20 -05:00
Michael Sokolov 8b4b1910c9 LUCENE-9658: add spotless formatting check to github precommit action 2021-01-07 14:42:36 -05:00
Christine Poerschke 60f2417aca
SOLR-15057: avoid unnecessary object retention in FacetRangeProcessor (#2160) 2021-01-07 18:45:46 +00:00
Munendra S N 6ff4a9b395 SOLR-14514: add extra checks for picking 'stream' method in JSON facet
missing, allBuckets, and numBuckets are not supported with the stream method,
so avoid picking the stream method when any of them is enabled, even if
the facet sort is 'index asc'.
2021-01-07 22:01:27 +05:30
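The extra check from SOLR-14514 amounts to a small predicate. The sketch below is a Python illustration only; `can_use_stream` and its parameter names are hypothetical and do not mirror Solr's Java code, which may consider further conditions.

```python
def can_use_stream(sort, missing=False, all_buckets=False, num_buckets=False):
    """Return True when the 'stream' facet method is a safe choice.

    Assumption: 'index asc' sort is the precondition named in the commit,
    and any of missing / allBuckets / numBuckets rules the method out.
    """
    if missing or all_buckets or num_buckets:
        return False
    return sort == "index asc"
```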
Munendra S N d7fd3d8c20 SOLR-12539: handle extra spaces in JSON facet shorthand syntax 2021-01-07 22:01:27 +05:30
Munendra S N 0846da5c22 SOLR-14950: fix regenerating of copyfield with explicit src/dest matching dyn rule
CopyFields are regenerated on replace-field or replace-field-type.
While regenerating, source and destination are checked against explicit fields, but source/dest
could match a dynamic rule too.
For example,
<copyField source="something_s" dest="spellcheck"/>
<dynamicField name="*_s" type="string"/>
Here, something_s is not present in the schema but matches the dynamic rule.

To handle this case, dynamicFieldCache must also be checked while regenerating the
copyFields.
2021-01-07 22:01:27 +05:30
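The lookup order described in SOLR-14950 can be sketched like this. It is a Python illustration under stated assumptions, not Solr's implementation: the helper names are hypothetical, and dynamic rules are assumed to be simple globs with a single leading or trailing `*`.

```python
def matches_dynamic_rule(name, pattern):
    """Match a field name against a glob with one leading or trailing '*'."""
    if pattern.startswith("*"):
        return name.endswith(pattern[1:])
    if pattern.endswith("*"):
        return name.startswith(pattern[:-1])
    return name == pattern


def is_known_field(name, explicit_fields, dynamic_patterns):
    """Check explicit fields first, then fall back to the dynamic rules."""
    if name in explicit_fields:
        return True
    return any(matches_dynamic_rule(name, p) for p in dynamic_patterns)
```

With the example from the commit message, `is_known_field("something_s", {"spellcheck"}, ["*_s"])` holds only because of the dynamic rule, which is the case the fix covers.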
Joel Bernstein 4ab5d31832 SOLR-15040: Update CHANGES.txt 2021-01-07 11:30:18 -05:00
Timothy Potter 8b55fb868d
SOLR-15059: Improve query performance monitoring (#2165) 2021-01-07 09:17:38 -07:00
S N Munendra d4fa1aae21
SOLR-10860: Return proper error code for bad input in case of in-place updates (#2121)
Return a proper error code on an invalid value with an in-place update.
Handle an invalid value for the inc op with in-place updates by using toNativeType to convert the increment value instead of parsing it directly. Also, return an error when the inc operation is specified for a non-numeric field.
2021-01-07 20:44:48 +05:30
Dawid Weiss 96aaf543af LUCENE-9652: follow-up code reformatting (tidy). 2021-01-07 10:59:21 +01:00
Dawid Weiss 0ab9cb8079 LUCENE-9658: temporarily hook up spotlessCheck to precommit. 2021-01-07 10:57:57 +01:00
David Smiley 3147625890
SOLR-15069: [child parentFilter=...] is now optional (#2181) 2021-01-06 17:43:15 -05:00
Michael Sokolov 7b9f875145
LUCENE-9652: DataInput.readLEFloats for use by Lucene90VectorReader (#2175) 2021-01-06 16:16:56 -05:00
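For readers unfamiliar with the term, "LE" here refers to little-endian byte order. The sketch below shows what decoding a run of little-endian 32-bit floats looks like in Python; `read_le_floats` is a hypothetical name and does not reproduce the signature of Lucene's Java method.

```python
import struct


def read_le_floats(data, offset, count):
    """Decode `count` little-endian 32-bit floats starting at `offset`."""
    # "<" selects little-endian byte order, "f" is a 4-byte IEEE 754 float.
    return list(struct.unpack_from("<%df" % count, data, offset))
```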
Chris Hostetter 07071ca8e1 SOLR-15047: Fix collapse parser behavior when collapsing on numeric fields to differentiate '0' group from null group 2021-01-06 10:07:32 -07:00
Timothy Potter 2fcaba1ce2
SOLR-15058: Enforce node_name contains colon and port and find first underscore after colon to parse context (#2178) 2021-01-05 12:00:14 -07:00
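The parsing rule in SOLR-15058 can be sketched as below. This is a Python illustration, not Solr's code: `parse_node_name` is a hypothetical helper, and rejecting a name with no underscore after the colon is an assumption of the sketch. Searching for the underscore only after the colon matters because a host name may itself contain underscores.

```python
def parse_node_name(node_name):
    """Split a node name of the form host:port_context into its parts."""
    colon = node_name.find(":")
    if colon < 0:
        raise ValueError("node_name must contain a colon and port: %r" % node_name)
    # Start the search at the colon so underscores in the host are skipped.
    underscore = node_name.find("_", colon)
    if underscore < 0:
        raise ValueError("node_name has no underscore after the port: %r" % node_name)
    port_text = node_name[colon + 1:underscore]
    if not port_text.isdigit():
        raise ValueError("node_name has no numeric port: %r" % node_name)
    return node_name[:colon], int(port_text), node_name[underscore + 1:]
```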
Chris Hostetter a48e937f59 SOLR-15048: Fixed collapse parser behavior when dealing with docs boosted by QueryElevationComponent that are in the null group to treat them consistently regardless of collapse field type or group head selector 2021-01-05 10:00:56 -07:00