SortingCodecReader keeps all docvalues in memory that are loaded from this reader.
Yet, this reader should only be used for merging which happens sequentially. This makes
caching docvalues unnecessary.
Co-authored-by: Jim Ferenczi <jim.ferenczi@elastic.co>
Regression from 8.6
Multipart POST would fail due to a NoClassDefFoundError of Jetty MultiPart. Solr cannot access many Jetty classes, which is not noticeable in our tests.
...StringIndexOutOfBoundsException on bad syntax
* failOnMissingParams: should have been returning null (failing) on bad syntax cases
Co-authored-by: Christine Poerschke <cpoerschke@apache.org>
When set to true, Solr will overwrite an existing configset in ZooKeeper in an UPLOAD.
A new cleanup parameter can also be passed to let Solr know what to do with the files that existed in the old configset, but no longer exist in the new configset (remove or keep)
* Replace Auth plugin with mocks
* Remove unused password param
* Start cluster only once
* Use SolrCloudTestCase
* Use MiniSolrCloudCluster's methods to remove collections and configsets
AbstractSpatialPrefixTreeFieldType (and its children) create index
fields based on a prototype with options frozen in PrefixTreeStrategy,
regardless of options specified in the schema. This works fine most of
the time, but causes problems when QParsers or other query optimization
logic makes decisions based on these options (which are potentially out
of sync with the underlying index data). Most commonly this causes
issues with "exists" (e.g. [* TO *]) queries.
This commit enforces fieldType defaults that line up with the 'hardcoded'
FieldType used by PrefixTreeStrategy. Options on either the fieldType
or the field itself which contradict these defaults will result in
exceptions at schema load/modification time.
* Updated implicit definition with terms=true, distrib=false
* Commented out terms handler with notice, as this is the config used in tests
* Remove spurious mentions cluttering other test configs
* Remove implicit terms=true param
* Remove definitions from shipped configsets
* Improve documentation
* Add CHANGES record
some things being cleaned up here are simplifications of changes that were 1-to-1 ported from ant, others are left over from when we had a 'released' PDF and are no longer needed
The problem of tracking dirtiness via numbers of chunks is that larger
chunks make stored fields readers more likely to be considered dirty, so
I'm trying to work around it by tracking numbers of docs instead.
The increase of the maximum number of chunks per doc done in previous
issues was mostly random. I'd like to provide users with a similar
trade-off with what the old versions of BEST_SPEED and BEST_COMPRESSION
used to do. So since BEST_SPEED used to compress at most 128 docs at
once, I think we should roughly make it 128*10 now since there are 10
sub blocks. I made it 1024 to account for the fact that there is a preset
dict as well that need decompressing. And similarly BEST_COMPRESSION used
to allow 4x more docs than BEST_SPEED, so I made it 4096.
With such larger numbers of docs per chunk, the decoding of metadata
became a bottleneck for stored field access so I made it a bit faster by
doing bulk decoding of the packed longs.
Allow using placement plugins to compute replica placement on the cluster for Collection API calls.
This is the first code drop for the replacement of the Autoscaling feature.
Javadoc of sample plugin org.apache.solr.cluster.placement.plugins.SamplePluginAffinityReplicaPlacement details how to enable this replica placement strategy.
PR's #1684 then #1845
Instead of configuring a dictionary size and a block size, the format
now tries to have 10 sub blocks per bigger block, and adapts the size of
the dictionary and of the sub blocks to this overall block size.
When index sorting is enabled, stored fields and term vectors can't be
written on the fly like in the normal case, so they are written into
temporary files that then get resorted. For these temporary files,
disabling compression speeds up indexing significantly.
On a synthetic test that indexes stored fields and a doc value field
populated with random values that is used for index sorting, this
resulted in a 3x indexing speedup.