The current "checkindex" on startup is very expensive. It is like
running one of the old-school hard drive diagnostic checkers, and is
usually not a good idea.
But we can do a CRC32 verification of files. We don't even need to
open an indexreader to do this, so it's much more lightweight.
This option (as well as the existing true/false) is randomized in
tests to find problems.
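
To illustrate why a checksum pass is cheap, here is a minimal sketch using
only the JDK. It is not the code in this change and not Lucene's own
checksum logic; the class and method names (Crc32Check, checksum, verify)
are hypothetical, and the expected value is assumed to have been recorded
when the file was written. The file is streamed once and its contents are
never parsed.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;

// Hypothetical helper, for illustration only.
final class Crc32Check {

    /** Streams the input once and returns the CRC32 of all bytes read. */
    static long checksum(InputStream in) {
        try (CheckedInputStream checked = new CheckedInputStream(in, new CRC32())) {
            byte[] buffer = new byte[8192];
            while (checked.read(buffer) != -1) {
                // Each read updates the running checksum; nothing is decoded.
            }
            return checked.getChecksum().getValue();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if the file's CRC32 matches the previously recorded value. */
    static boolean verify(Path file, long expected) {
        try (InputStream in = Files.newInputStream(file)) {
            return checksum(in) == expected;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The cost is one sequential read per file, with no need to open a reader
over the index structures.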
Also fix bug where use of the current option would always leak
an indexwriter lock.
Closes #9183
Upgrades Lucene to latest, and adds support for the BEST_COMPRESSION
parameter that Lucene now provides (with backwards compatibility, etc).
This option uses deflate, tuned for highly compressible data.
index.codec::
The default value compresses stored data with LZ4 compression, but
this can be set to best_compression for a higher compression ratio,
at the expense of slower stored fields performance.
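
As a rough illustration of deflate on highly compressible data, the sketch
below uses the JDK's java.util.zip classes. It is not the Lucene stored
fields implementation and says nothing about LZ4; the class and method
names (DeflateSketch, compress, decompress) are hypothetical. It only shows
that a deflate level can be chosen to favor ratio over speed, and that the
data round-trips.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper, for illustration only.
final class DeflateSketch {

    /** Compresses input with deflate at the given level (e.g. Deflater.BEST_COMPRESSION). */
    static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native resources
        }
    }

    /** Restores the original bytes from a complete deflate stream. */
    static byte[] decompress(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or invalid deflate stream");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException(e);
        } finally {
            inflater.end();
        }
    }
}
```

Calling compress(data, Deflater.BEST_SPEED) and
compress(data, Deflater.BEST_COMPRESSION) on the same input gives a feel
for the ratio versus speed trade-off described above.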
IMO it's safest to implement as a named codec here, because ES already
has logic to handle this correctly, and because it's unrealistic to have
a plethora of options to Lucene's default codec... we are practically
limited in Lucene to what we can support with back compat, so I don't
think we should overengineer this and add additional unnecessary plumbing.
See also:
https://issues.apache.org/jira/browse/LUCENE-5914
https://issues.apache.org/jira/browse/LUCENE-6089
https://issues.apache.org/jira/browse/LUCENE-6090
https://issues.apache.org/jira/browse/LUCENE-6100
Closes #8863
This documentation was dangerous because it gave the impression that
substantial performance could be gained just by switching the codec of the index.
However, non-default codecs are dangerous to use since they are not supported
in terms of backward compatibility, and most improvements that they bring have
been folded into the default codec anyway (for example, the default codec
"pulses" postings lists that contain a single document).