Previous changes to this issue 'fixed' the way the test was creating mock Replica instances,
to ensure all properties were specified -- but these changes tickled a bug in the existing test
scaffolding that caused its "expectations" to be based on a regex check against only the base "url",
even though the test logic itself looked at the entire "core url".
The result was reproducible failures whenever the randomly generated regex matched ".*1.*":
because it only considered the base url, the existing test logic did not expect that regex to
match the url of a Replica with a core name of "core1".
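A condensed illustration of the mismatch (the URLs and variable names here are hypothetical stand-ins, not the actual test code):

```java
import java.util.regex.Pattern;

// Hypothetical stand-ins for the test's mock Replica:
String baseUrl = "http://host:8983/solr";   // contains no '1'
String coreUrl = baseUrl + "/core1";        // the core name adds a '1'
Pattern randomRegex = Pattern.compile(".*1.*");

// The scaffolding derived its expectation from the base url alone ...
boolean expected = randomRegex.matcher(baseUrl).matches();  // false
// ... while the test logic matched against the entire core url.
boolean actual = randomRegex.matcher(coreUrl).matches();    // true
```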
(cherry picked from commit 49e20dbee4)
SOLR-13996: Refactor HttpShardHandler.prepDistributed method into smaller pieces
This commit introduces an interface named ReplicaSource which is marked as experimental. It has two sub-classes: CloudReplicaSource (for SolrCloud) and LegacyReplicaSource (for non-cloud clusters). The prepDistributed method now calls out to these sub-classes depending on whether the cluster is running in cloud mode or not.
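A rough sketch of the shape of the abstraction (the method names are illustrative assumptions, not the actual interface):

```java
import java.util.List;

// Illustrative only; the real ReplicaSource API may differ.
interface ReplicaSource {
  int getSliceCount();                     // how many shards are being queried
  List<String> getReplicasBySlice(int i);  // candidate replica urls for shard i
}
```

CloudReplicaSource would resolve these from cluster state, while LegacyReplicaSource would presumably derive them from the shards parameter.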
(cherry picked from commit c65b97665c)
* No Introduction (to Solr) header. Point at solr-upgrade-notes.adoc instead
* No Getting Started header
* No Versions of Major Components header
* No "Upgrade Notes" for subsequent releases. See solr-upgrade-notes.adoc
Closes #1202
(cherry picked from commit 46c0945614)
If you have repeating intervals in an ordered or unordered interval source, you currently
get somewhat confusing behaviour:
* `ORDERED(a, a, b)` will return an extra interval over just `a b` if it first matches `a a b`, meaning
that you can get incorrect results if used in a `CONTAINING` filter:
`CONTAINING(ORDERED(x, y), ORDERED(a, a, b))` will match on the document `a x a b y` (see the sketch after this list)
* `UNORDERED(a, a)` will match on documents that contain just a single `a`.
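Expressed with Lucene's Intervals helpers, the first case looks like this (a sketch; the package location of these classes has moved between releases):

```java
import org.apache.lucene.queries.intervals.Intervals;
import org.apache.lucene.queries.intervals.IntervalsSource;

IntervalsSource source = Intervals.containing(
    Intervals.ordered(Intervals.term("x"), Intervals.term("y")),
    Intervals.ordered(Intervals.term("a"), Intervals.term("a"), Intervals.term("b")));
// Before this change the inner source could emit a spurious interval over just
// "a b" after matching "a a b", so this query matched the document "a x a b y".
```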
This commit adds a RepeatingIntervalsSource that correctly handles repeats within
ordered and unordered sources. It also changes the way that gaps are calculated within
ordered and unordered sources, by using a new width() method on IntervalIterator. The
default implementation just returns end() - start() + 1, but RepeatingIntervalsSource
instead returns the sum of the widths of its child iterators. This preserves maxgaps filtering
on ordered and unordered sources that contain repeats.
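The default described above amounts to the following (a simplified sketch; the real IntervalIterator is a Lucene abstract class with more members than shown):

```java
// Simplified view of the width() default.
public abstract class IntervalIterator {
  public abstract int start();  // start position of the current interval
  public abstract int end();    // end position of the current interval

  // Number of positions covered by the current interval. A source that handles
  // repeats overrides this to return the sum of its children's widths, which
  // preserves maxgaps filtering over repeats.
  public int width() {
    return end() - start() + 1;
  }
}
```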
In order to correctly handle matches in this scenario, IntervalsSource#matches now always
returns an explicit IntervalsMatchesIterator rather than a plain MatchesIterator; the new
interface adds gaps() and width() methods so that submatches can be combined in the same way
that subiterators are. Extra checks have been added to checkIntervals() to ensure that the same
intervals are returned by both iterator and matches, and a fix to
intervals are returned by both iterator and matches, and a fix to
DisjunctionIntervalIterator#matches() is also included - DisjunctionIntervalIterator minimizes
its intervals, while MatchesUtils.disjunction does not, so there was a discrepancy between
the two methods.
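The extended contract is essentially this (a sketch showing only the two added methods; MatchesIterator is Lucene's existing interface):

```java
import org.apache.lucene.search.MatchesIterator;

// gaps() and width() mirror the corresponding methods on IntervalIterator,
// so that submatches can be combined the same way subiterators are.
interface IntervalsMatchesIterator extends MatchesIterator {
  int gaps();   // number of gap positions inside the current match
  int width();  // width of the current match
}
```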
This replaces the index of stored fields and term vectors with two
`DirectMonotonic` arrays. `DirectMonotonicWriter` needs to know the number
of values to write up-front, so incoming doc IDs and file pointers are buffered
on disk using temporary files that never get fsynced, but have index headers
and footers to make sure any corruption in these files won't propagate to the
index.
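The write side follows this pattern (a sketch assuming the usual `DirectMonotonicWriter` API; output handling is simplified):

```java
import java.io.IOException;
import org.apache.lucene.store.IndexOutput;
import org.apache.lucene.util.packed.DirectMonotonicWriter;

// Once buffering has established the value count, write the file pointers out
// as a DirectMonotonic array: metadata to metaOut, packed data to dataOut.
static void writePointers(IndexOutput metaOut, IndexOutput dataOut, long[] filePointers)
    throws IOException {
  final int blockShift = 10; // blocks of 1k values, as mentioned below
  DirectMonotonicWriter writer =
      DirectMonotonicWriter.getInstance(metaOut, dataOut, filePointers.length, blockShift);
  for (long fp : filePointers) {
    writer.add(fp); // file pointers are non-decreasing
  }
  writer.finish();
}
```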
`DirectMonotonicReader` gets a specialized `binarySearch` implementation that
leverages the metadata in order to avoid going to the IndexInput as often as
possible. In the common case, it only goes to a single
sub `DirectReader` which, combined with the 1k-value block size, helps
bound the number of page faults to 2.
SOLR-14095 introduced an issue for rolling restarts (incompatible Java serialization). This change fixes the compatibility issue while keeping the functionality in SOLR-14095.
The entire precommit task will still fail with an unsupported Java version
(subsequent checks do not support the newer javadocs format).
But this allows the ECJ linter to run, which checks for things such as
unused imports.
This triggers various places in the Streaming Expressions code that use background threads
to confirm that the expected credentials (or lack thereof) are propagated along.
Test currently has comments + workarounds for 2 known client issues:
- SOLR-14226: SolrStream reports AuthN/AuthZ failures (401|403) as IOException w/o details
- SOLR-14222: CloudSolrClient converts (update) 403 error to 500 error
(cherry picked from commit 517438e356)
Fuzzy queries with an edit distance of 1 or 2 must visit all blocks whose prefix
length is 1 or 2. By not compressing those, we can trade very little space (a
couple MBs in the case of the wikibigall index) for better query efficiency.
Adds some build parameters to tune how tests run. There is an example
shown by "gradle helpLocalSettings"
Default C2 off in tests, as it is wasteful locally and slows down
test runs. You can override this by setting tests.jvmargs for gradle,
or args for ant.
Some crazy lucene stress tests may need to be toned down after the
change, as they may have been doing too many iterations by default...
but this is not a new problem.
The issue is that MockDirectoryWrapper's disk full check is horribly
inefficient. On every writeByte/etc, it totally recomputes disk space
across all files. This means it calls listAll() on the underlying
Directory (which sorts all the underlying files), then sums up fileLength()
for each of those files.
This leads to many pathological cases in the disk full tests... but the
number of tests impacted by this is minimal, and the logic is scary.
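Schematically, the per-write check was doing something like this (a simplified sketch, not the actual MockDirectoryWrapper code; `maxUsedSize` is a hypothetical limit standing in for the configured fake disk size):

```java
import java.io.IOException;
import org.apache.lucene.store.Directory;

// O(number of files) work on every single writeByte, plus the sort hidden
// inside listAll().
static void checkDiskFull(Directory delegate, long maxUsedSize) throws IOException {
  long used = 0;
  for (String name : delegate.listAll()) { // sorts all underlying file names
    used += delegate.fileLength(name);     // one length lookup per file
  }
  if (used > maxUsedSize) {
    throw new IOException("fake disk full at " + used + " bytes");
  }
}
```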