Since Lucene version 4.8 each file has a checksum written as it's
footer. We used to calculate the checksums for all files transparently
on the filesystem layer (Directory / Store) which is now not necessary
anymore. This commit makes use of the new checksums in a backwards
compatible way such that files written with the old checksum mechanism
are still compared against the corresponding Alder32 checksum while
newer files are compared against the Lucene build in CRC32 checksum.
Since now every written file is checksummed by default this commit
also verifies the checksum for files during recovery and restore if
applicable.
Closes#5924
This commit also has a fix for #6808 since the added tests in
`CorruptedFileTest.java` exposed the issue.
Closes#6808
Today, the tribe node needs the local node so it adds it when it starts, but other APIs would benefit from adding the local node, also, adding the local node should be done in a cleaner manner, where it belongs, which is right after the discovery service starts in the cluster service
closes#6811
This commit ensures that all primaries are allocated before we
restart the node. If one primary is in post recovery when we
restart it will not be allocated otherwise.
If the node initialisation fails, make sure the
node environment is closed correctly and thus
all locks (on data directories) being properly released.
Closes#6715
`mmapfs` is really good for random access but can have sideeffects if
memory maps are large depending on the operating system etc. A hybrid
solution where only selected files are actually memory mapped but others
mostly consumed sequentially brings the best of both worlds and
minimizes the memory map impact.
This commit mmaps only the `dvd` and `tim` file for fast random access
on docvalues and term dictionaries.
Closes#6636
especially the tests that check for update of mapping, we need to make sure that the cluster is green so mappings won't get override, also, put mapping during index creation when possible
- adds support for multiple logging configurations under the config dir (will pick up any logging.xxx in the config folder tree)
- plugins can now define a top level config directory that will be copied under es config dir and will be renamed after the plugin name (same as the support we have the plugin "bin" dirs)
Closes#6802
There is a special type of request that tries to not allocate another buffer when sending bytes request (used by the public cluster state action). With the new pages bytes reference support, the content can already be a composite channel buffer, take that into account when building the actual composite buffer that will be sent over the network
closes#6756
When launching multiple nodes in MS Windows env, batch command window does not come by default with any title.
This patch add `Elasticsearch VERSION` as a title, where VERSION is the current elasticsearch version.
Same for `plugin.bat` and `service.bat`.
Closes#6336Closes#6752
Today, if we miss on a get on setting, we try its camel case version. The assumption is that in our code, we never use camel case to lookup a setting, but we want to support camel case if the user provided one.
This can be expensive (#toCamelCase) when the get on the setting is done in a tight call, which is evident when running the allocation deciders as part of the reroute logic.
Instead of doing the camel case check on get, prepare an additional map that includes all the settings that are provided as came case, and try and lookup from it if needed.
closes#6765
At the moment one can iterate the MapperService to go through all document mappers. This includes the document mapper of DEFAULT_MAPPING, which may be surprising and lead to unintended results. This commit removes the Iterable implementation and add a docMappers method that asks the caller to make an explicit choice
Closes#6793
We check if the version map needs to be refreshed after we released
the readlock which can cause the the engine being closed before we
read the value from the volatile `indexWriter` field which can cause an
NPE on the indexing thread. This commit also fixes a potential uncaught
exception if the refresh failed due to the engine being already closed.
Relates to #6443Closes#6786
remove internal callas on FieldMappers#Names, we properly reuse FieldMapper, so there is no need to try and call intern in order to reuse the names. This can be heavy with many fields and continuous mapping parsing.
closes#6747
We use awaitBusy in our tests, the problem is that we have to check if it awaited or not, and then try and keep around somehow more info around why the predicate failed and a timeout happened.
The idea of assertBusy is to allow to simply write "regular" test code, and if the test code trips, it will busy wait till a timeout. This allows us to keep around the assertion information and properly throw it for information that is inherently kept in the failure itself.
If you are indexing tiny documents then the previous default (5000)
was too low, causing excessive fsyncs with high indexing rates. With
this change we now only flush by byte size (200 MB) by default for
better indexing performance for tiny documents.
Closes#6783
When refresh_interval is long or disabled, and indexing rate is high,
it's possible for live version map to use non-trivial amounts of RAM.
With this change we now trigger a refresh in such cases to clear the
version map so we don't use unbounded RAM.
Closes#6443
Aggregation name are now able to use any character except '[', ']', and '>". Dot syntax may still be used to reference values (e.g. in the order field) but may only defence the value directly beneath the last aggregation in the list. more complex structures will need to be accessed using the aggname[value] syntax
Closes#6702