We do wait for shards to be closed in IndicesService for 30 second.
Yet, if somebody holds on to a store reference ie. an open scroll request
the 30 seconds time-out and node shutdown takes very long. We should
release all other resources first before we shutdown IndicesService.
Closes#8940
The setting `mapping.date.round_ceil` (and the undocumented setting
`index.mapping.date.parse_upper_inclusive`) affect how date ranges using
`lte` are parsed. In #8556 the semantics of date rounding were
solidified, eliminating the need to have different parsing functions
whether the date is inclusive or exclusive.
This change removes these legacy settings and improves the tests
for the date math parser (now at 100% coverage!). It also removes the
unnecessary function `DateMathParser.parseTimeZone` for which
the existing `DateTimeZone.forID` handles all use cases.
Any user previously using these settings can refer to the changed
semantics and change their query accordingly. This is a breaking change
because even dates without datemath previously used the different
parsing functions depending on context.
closes#8598closes#8889
In cases of heavy contention, it's possible for more than 2 threads
to race to a circuit breaking exception.
Essentially this means that if we have 3 threads all trying to add 3 and
simultaneously cause a circuit breaking exception (due to retry), when
adjusting after circuit breaking we can "rewind" past what this test
expects the child breaker to be at.
This adds leeway into the check, where it's okay to be within
NUM_THREADS from the parentLimit, because each thread should only add 1
to the breaker at a time.
We only have a single gatweway since es 1.3. There is no need to keep all
these abstractsion and nested packages. We can fold most of it into simpler
structures.
IndexEngine was an abstraction where we had index-level engines (instead
of shard-level) that could store meta information about the index. It
was never actually used by Elasticsearch, and only there for plugins.
This removes it, because it is a confusing abstraction and not needed,
no plugins should be implementing their own IndexEngines.
When a node fails (or closes), the master processes the network disconnect event and removes the node from the cluster state. If multiple nodes fail (or shut down) in rapid succession, we process the events and remove the nodes one by one. During this process, the intermediate cluster states may cause the node fault detection to signal the failure of nodes that are not yet removed from the cluster state. While this is fine, it currently causes unneeded reroutes and cluster state publishing, which can be cumbersome in big clusters.
Closes#8804Closes#8933
When we close a node all pending / active search requests need to be
cleared otherwise a node will wait up to 30 sec for shutdown sicne there
could be open scroll requests. This behavior was introduces in 1.5 such that
versions <= 1.4.x are not affected.
Closes#8940
Occasionally a the join thread successfully connected to a just closed node and which causes the subsequent join request to time out. It's default timeout 60s throws the test off when it waits for a cluster to form.
Calling cache.cleanUp() is kind of like calling System.gc(), meaning
that we should never have (non-test) things that rely on this
functionality.
For the field data and filter cache, we already have a periodic process
that runs this .cleanUp(), so there is no need to block index
closing/clearing on it. Instead, we can clean the field data cache in
InternalTestCluster before we check the circuit breaker.
This can help tests that time out because cleaning the cache is taking
too long
Today, shard stats are blocked while phase 3 of recovery (replay xlog)
is running; this change removes the engine readLock from shard stats
so it's not blocked.
Closes#8910
We have lots of boilerplate code that is unnecessarily abstracting
services ie InternalIndexShard and IndexShard or InternalIndexService and
IndexService. It's enough to have concrete classes for these core classes.
Closes#8904
If you use the java-package tool to create java packages, those
paths also should be added to the debian init script.
Also updated the docs, that it is ok to install java8.
Closes#7383
This change adds a 'http.publish_port' setting to the HTTP module to configure
the port which HTTP clients should use when communicating with the node. This
is useful when running on a bridged network interface or when running behind
a proxy or firewall.
Closes#8807Closes#8137
we currently have operationThreaded set to false when indexing a script. This setting
means that if we are executing the operation locally that we don't spawn a new thread for
it althought incoming thread in this case is the network thread. Now sicne we are indexing here
the engine is currently recovering which sometimes locks the engine for finalization blocks on
a network call waiting for the recovery target to come back the internal lock in engine will never be
released since we are waiting with our network thread for it to be released.
Upgrades lucene to latest, and supports the BEST_COMPRESSION parameter
now supported (with backwards compatibility, etc) in Lucene.
This option uses deflate, tuned for highly compressible data.
index.codec::
The default value compresses stored data with LZ4 compression, but
this can be set to best_compression for a higher compression ratio,
at the expense of slower stored fields performance.
IMO its safest to implement as a named codec here, because ES already
has logic to handle this correctly, and because its unrealistic to have
a plethora of options to Lucene's default codec... we are practically
limited in Lucene to what we can support with back compat, so I don't
think we should overengineer this and add additional unnecessary plumbing.
See also:
https://issues.apache.org/jira/browse/LUCENE-5914https://issues.apache.org/jira/browse/LUCENE-6089https://issues.apache.org/jira/browse/LUCENE-6090https://issues.apache.org/jira/browse/LUCENE-6100Closes#8863
This fix adds better error handling for parsing multipoint, linestring, and polygon GeoJSONs. Current logic throws a NPE when parsing a multipoint, linestring, or polygon that does not comply with the GeoJSON specification. That is, if a user provides a single coordinate instead of an array of coordinates, or array of linestrings, the ShapeParser throws a NPE wrapped in a SearchParseException instead of a more useful error message.
Closes#8432
After the refactoring in #8784 some settings didn't get passed to the
actual engine and there exists a race if the settings are updated while
the engine is started such that the actual starting engine doesn't see
the latest settings. This commit fixes the concurrency issue as well as
adds tests to ensure the settings are reflected.
After upgrading shard might start relocating again. If there are no
replicas the cluster state of a node might not be up to data for
a few miliseconds and direct a search request to a node that does not
have the shard anymore. This result in the following test failures:
1> java.lang.AssertionError: Count is 99 but 101 was expected. Total shards: 13 Successful shards: 12 & 0 shard failures:
1> __randomizedtesting.SeedInfo.seed([1932F73B458703CA:6F4FAD3DAC55591C]:0)
1> [...org.junit.*]
1> org.elasticsearch.test.hamcrest.ElasticsearchAssertions.assertHitCount(ElasticsearchAssertions.java:184)
1> org.elasticsearch.bwcompat.BasicBackwardsCompatibilityTest.testIndexRollingUpgrade(BasicBackwardsCompatibilityTest.java:358)
Waiting for relocation finished should fix this.
Today we have several upgrade methods that can read state written
pre 0.90 or even pre 0.19. Version 2.0 should not support these state
formats. Users of these version should upgrade to a 1.x or 0.90.x version
first.
Closes#8850
This commit removes the deprecated constant for the main
version and uses the real lucene version we are running instead.
Behind the scenes the same value was used and is now obsolet.
Modifications to LoggingListener pushed with #8820 caused the original logger levels not to be reset after modifications, as the new state was saved for restore instead of the previous one.
Added unit tests for LoggingListener as well.
Closes#8845
Restrict use of java.io.File to 5 methods (excluded), but otherwise ban.
This is a prerequisite to do any mocking here.
I don't try to do any heavy cleanup on these tests, I am not familiar with them.
So this is mostly a rote straightforward conversion.
Closes#8836
This is a start to exposing memory stats improvements from Lucene 5.0.
This adds the following categories of Lucene index pieces to index stats:
* Terms
* Stored fields
* Term Vectors
* Norms
* Doc values
This commit add the engines reference to the store out of the actual
implementation into the hodler since the holder manages the actual lifcycle.
Engine internal references like per searcher or per recovery are kept inside
the actual implemenation since the have a different lifecycle.
Once the current engine is started you can only close it once. Once closed the engine cannot be started again. This commit adds a stop method which signals the engine to free it's resources but in a way that allows restarting.
This is done by introducing InternalEngineHolder which is a wrapper around InternalEngine. This allows to add the stop() method without adding complexity the engine implementation. InternalEngineHolder also serves an entry point for listeners (incoming and outgoing) to other ES components, which removes the needs add/remove them if the engine is stopped.
Closes#8784
Today we try to fetch a shard Id for a given IndexReader / LeafReader
by walking it's tree until the lucene internal SegmentReader and then
casting the directory into a StoreDirecotory. This class is fully internal
to Elasticsearch and should not be exposed outside of the Store.
This commit makes StoreDirectory a private inner class and adds dedicated
ElasticsearchDirectoryReader / ElasticserachLeafReader exposing a ShardId
getter to obtain information about the shard the index / segment belogs to.
These classes can be used to expose other segment specific information in
the future more easily.
When a scoring script returns not a number, the current message is confusing (IllegalArgumentException[docID must be >= 0 and < maxDoc=3 (got docID=2147483647)]). This commit adds the error message ScriptException[script score function returns a wrong score: NaN].
Closes#2426
The packaged init scripts could return an error, if the file
/proc/sys/vm/max_map_count was not existing and we still called
sysctl.
This is primarly to prevent confusing error messages when elasticsearch
is started under virtualized environments without a proc file system.
Closes#4978
Instead of iterating all shards of all indices to get all relocating
shards for a given node we can just use the RoutingNode#shardsWithState
method and fetch all INITIALIZING / RELOCATING shards and check if they
are relocating. This operation is much faster and uses pre-build
data-structures.
Relates to #6372
This commit changes internal callback to be clear
about when they are called and also provide the
exception that was potentially thrown as a callback argument.
Closes#5945
Previously it was possible for the field data clearing in this test to
take too long, causing the test to time out.
This also switches to using `scaledRandomIntBetween` for the number of
fields.
This change will chown /usr/share/elasticsearch/plugins to the elasticsearch
user (the directory was formerly owned by root). This enables the ES user to
manage plugins.
Also, /usr/share/elasticsearch/plugins is now removed when the elasticsearch
package is un-installed. Previously it was left lying there.
Closes#8732
Signed-off-by: Thilo Fromm <github@thilo-fromm.de>
This change adds a systemd service configuration file, and adds systemd logic
to installation and de-installation scripts. The upcoming Debian 8 "Jessie"
release will use systemd.
fixes#8943
Signed-off-by: Thilo Fromm <github@thilo-fromm.de>
the recovery diff can return file in the `different` category
since it's conservative if it can't tell if the files are the same.
Yet, the cleanup code only needs to ensure both ends of the recovery
are consistent. If we have a very old segments_N file no checksum is present
but for the delete files they might be such that the segments file passes
the consistency check but the .del file doesn't sicne it's in-fact the same
but this check was missing in the last commit.
Original indices are optional in ShardDeleteByQueryRequest only for backwards compatibility, see #7406. We can remove this in master since 2.0 will require a full cluster restart.
Closes#8777
This commit factors out the PID file creation from bootstrap and adds
tests for error conditions etc. We also can't rely on DELETE_ON_CLOSE
since it might not even write the file depending on the OS and JVM implementation.
This impl uses a shutdown hook to best-effort remove the pid file if it was written.
Closes#8771
This commit adds a very lightweight action to the transport
serivce that allows to fetch clustername and the discovery node
from a node. This is used by transport clients to test liveness of
a node without using the nodesinfo API which can be blocking if management
threadpools are busy.
Closes#8763
The bwc layer added with #7105 is not needed in master as a full cluster restart will be required, thus from 2.0 on the only supported action names are compliant to the defined conventions and don't need to be converted to the old format
Closes#8758
Related to #8667:
Some QueryBuilders have been deprecated in 1.x branches. We removed them in 2.0.
Removed
-------
* `textPhrase(...)`
* `textPhrasePrefix(...)`
* `textPhrasePrefixQuery(...)`
* `filtered(...)`
* `inQuery(...)`
* `commonTerms(...)`
* `queryString(...)`
* `simpleQueryString(...)`
Closes#8721.
the extract location might be on a different filesystem where
atomic move won't work. Yet this operation is not critical in terms
of visibility so there is no need to do this.
REST tests are being shuffled before their execution. To guarantee their repeatability given the seed, their order needs to be always the same before the shuffling happens.
Closes#8745
The conversion to the Path API doesn't work if the path points
to a file inside a JAR like a config. These path must be read
while the ZIP filesystem is opened which can't be guaranteed across
the board. This commit reverts back the relevant changes to java.net.URL
and adds a util method to read UTF-8 Encoded files from URLs correctly.
This commit cuts over all of core (not quite all tests) to java.nio.Path
It also adds the file class to the core forbidden APIs to prevent its usage.
This commit also resolves#8254 since we now consistently useing the NIO Path
API. The Changes in this commit allow for more information if IO operations fail
since the NIO API throws exceptions instead of boolean return values. The build-in
methods used in this commit are also more resillient to encodeing errors like
unmappable characters and throw exceptions if those chars are present in a file.
Closes#8254Closes#8666
Today we don't check if the recovery target has all the
files that we expect there after the recovery. This commit
adds aditional safety to ensure all files are present with the
correct checksums on recovery finalization.
Closes#8723