Merging the accumulated work from the feautre/improve_zen branch. Here are the highlights of the changes:
__Testing infra__
- Networking:
- all symmetric partitioning
- dropping packets
- hard disconnects
- Jepsen Tests
- Single node service disruptions:
- Long GC / Halt
- Slow cluster state updates
- Discovery settings
- Easy to setup unicast with partial host list
__Zen Discovery__
- Pinging after master loss (no local elects)
- Fixes the split brain issue: #2488
- Batching join requests
- More resilient joining process (wait on a publish from master)
Closes#7493
Per default the heap dump is written to target/JX/pidXYZ.hprof
In order to keep them when a new test is is started, they
should be written to log folder which is not cleared in a new
test run.
Heap dump location can be set with -Dtests.heapdump.path=/path/to/heapdump
closes#7452
internal cluster communication.
See CorruptedCompressorTests for details on how this bug can be hit.
This change also removes the ability to use the unsafe variant of
ChunkedEncoder, removing support for the compress.lzf.decoder setting.
CliTool is a base class for command-line interface tools (such as the plugin manager and potentially others). It supports the following:
- single or multi command tool
- help printing infrastructure (based on help files)
- consistent mechanism of parsing arguments (based on commons-cli lib)
- separation of argument parsing and command execution (for easier unit testing)
- terminal abstraction (will use System.console() when available)
This commit adds a profile that skips all validation ie.
- nocommit / tabs checking
- forbidden API checks
- license headers
It's not active by default but can easily be activated with
`mvn -Pdev` or in the `~/.m2/settings.xml`
for reference see:
http://maven.apache.org/guides/introduction/introduction-to-profiles.html
We parse the version that is shipped with the Lucene segments in order
to find the version of lucene that wrote a particular segment. Yet, some lucene
version ie:
* 4.3.1 (Elasticsearch 0.90.2)
* 4.5.1 (Elasticsearch 0.90.7)
* 3.6.1 (pre Elasticsearch 0.90.0)
wrote illegal strings containing the minor version which causes IAE exceptions
being thrown from lucenes parsing method.
Closes#7055
These are javascript expressions, which can only access numeric
fielddata, parameters, and _score. They can only be used for searches (not document updates).
closes#6818
In order to have access to all codecs and handlers by netty, they
need to be exposed during shading, otherwise only the classes, which
are used by the built are exposed.
`-Dtests.filter` allows to pass filter expressions to the elasticsearch
tests. This allows to filter test annotaged with TestGroup annotations
like @Slow, @Nightly, @Backwards, @Integration with a boolean expresssion like:
* to run only backwards tests run:
`mvn -Dtests.bwc.version=X.Y.Z -Dtests.filter="@backwards"`
* to run all integration tests but skip slow tests run:
`mvn -Dtests.filter="@integration and not @slow"
* to take defaults into account ie run all test as well as backwards:
`mvn -Dtests.filter="default and @backwards"
This feature is a more powerful alternative to flags like
`-Dtests.nighly=true|false` etc.
Closes#6703
Some IO api can return after writing & reading only a part of the requested data. On these rare occasions, we should call the methods again to read/write the rest of the data. This has cause rare translog corruption while writing huge documents on Windows.
Noteful parts of the commit:
- A new Channels class with utility methods for reading and writing to channels
- Writing or reading to channels is added to the forbidden API list
- Added locking to SimpleFsTranslogFile
- Removed FileChannelInputStream which was not used
Closes#6441 , #6576
Sandboxes the groovy scripting language with multiple configurable
whitelists:
`script.groovy.sandbox.receiver_whitelist`: comma-separated list of string
classes for objects that may have methods invoked.
`script.groovy.sandbox.package_whitelist`: comma-separated list of
packages under which new objects may be constructed.
`script.groovy.sandbox.class_whitelist` comma-separated list of classes
that are allowed to be constructed.
As well as a method blacklist:
`script.groovy.sandbox.method_blacklist`: comma-separated list of
methods that are never allowed to be invoked, regardless of target
object.
The sandbox can be entirely disabled by setting:
`script.groovy.sandbox.enabled: false`
This commit add a basic infrastructure as well as primitive tests
to ensure version backwards compatibility between the current
development trunk and an arbitrary previous version. The compatibility
tests are simple unit tests derived from a base class that starts
and manages nodes from a provided elasticsearch release package.
Use the following commandline executes all backwards compatiblity tests
in isolation:
```
mvn test -Dtests.bwc=true -Dtests.bwc.version=1.2.1 -Dtests.class=org.elasticsearch.bwcompat.*
```
These tests run basic checks like rolling upgrades and
routing/searching/get etc. against the specified version. The version
must be present in the `./backwards` folder as
`./backwards/elasticsearch-x.y.z`
This commit adds checks for nocommit and tabs in the source code.
The task is executed during the validate phase and can be disabled via
`-Dvalidate.skip`
E.g. we use Unsafe in quite a few places and this generates lots of
warnings, which we now suppress using the undocumented
-XDignore.symbol.file command-line option to javac.
Closes#6423
Routing has been inadvertly changed in #5562 resulting in documents going to
different shards in 1.2. This is a terrible bug because an indexing request
would not necessarily go to the same shard anymore, potentially leading to
duplicates.
Close#6391
This commit upgrades to the latest Lucene 4.8.1 release including the
following bugfixes:
* An IndexThrottle now kicks in when merges start falling behind
limiting index threads to 1 until merges caught up. Closes#6066
* RateLimiter now kicks in at the configured rate where previously
the limiter was limiting at ~8MB/sec almost all the time. Closes#6018
* If plugin does not provide `lucene` property, we consider that the plugin is compatible.
* If plugin provides `lucene` property, we try to load related Enum org.apache.lucene.util.Version. If this fails, it means that the node is too "old" comparing to the Lucene version the plugin was built for.
* We compare then two first digits of current node lucene version against two first digits of plugin Lucene version. If not equal, it means that the plugin is too "old" for the current node.
Plugin developers who wants to launch plugin check only have to add a `lucene` property in `es-plugin.properties` file. If you are using maven to build your plugin, you can do it like this:
In `pom.xml`:
```xml
<properties>
<lucene.version>4.6.0</lucene.version>
</properties>
<build>
<resources>
<resource>
<directory>src/main/resources</directory>
<filtering>true</filtering>
</resource>
</resources>
</build>
```
In `es-plugin.properties`, add:
```properties
lucene=${lucene.version}
```
BTW, if you don't already have it, you can add the plugin version as well:
```properties
version=${project.version}
```
You can disable that check using `plugins.check_lucene: false`.
We configure the threadpools according to the number of processors which is
different on every machine. Yet, we had some test failures related to this
and #6174 that only happened reproducibly on a node with 1 available processor.
This commit does:
* sometimes randomize the number of available processors
* if we don't randomize we should set the actual number of available processors
in the settings on the test node
* always print out the num of processors when a test fails to make sure we can
reproduce the thread pool settings with the reproduce info line
Closes#6176
Our improvements to t-digest have been pushed upstream and t-digest also got
some additional nice improvements around memory usage and speedups of quantile
estimation. So it makes sense to use it as a dependency now.
This also allows to remove the test dependency on Apache Mahout.
Close#6142
Updating to this version allows to configure a special JNA directory,
in case the /tmp directory is mounted with the noexec option, as JNA
extracts some data and tries to execute parts of it.
Also updated documentation to clarify mlockall and memory settings as well
as pointing to the new jna.tmpdir system property.
Closes#5493
Needed for REST backwards compatibility tests, since we need to run older tests with the latest runner, which randomizes shards and replicas, but the tests rely on defaults (5,1).
Done in a generic way based on compatibility versions e.g. `-Dtests.compatibility=1.0.0` allows to run tests in a special manner that is compatibile with 1.0.0 version.
Also moved back randomIndexTemplate to ElasticsearchIntegrationTest (from ImmutableCluster) where all the randomized aspects should be.
Closes#5897
The blacklist can be provided through -Dtests.rest.blacklist and supports a comma separated list of globs
e.g. -Dtests.rest.blacklist=get/10_basic/*,index/*/*
Also added some missing docs and made it clearer that the suite/test descriptions effectively contains their (relative) path (api/yaml_file/test section)
Closes#5881
This prevents executing bulks internal autocreate indices logic
and ensures that this internal request never creates an index
automaticall.
This fixes a bug where the TTL purger thread ran after the actual
index it was purging was already closed / deleted and that re-created
that index.
Closes#5766
ElasticsearchRestTests extends now ElasticsearchIntegrationTest and makes use of our ordinary test infrastructure, in particular all randomized aspects now come for free instead of having to maintain a separate (custom) tests runner
We previously parsed only the tests that needed to be run given the version of the cluster the tests are running against. This doesn't happen anymore as it didn't buy much and it would be harder to support as the tests get now parsed before the test cluster gets started. Thus all the tests are now parsed regardless of their skip sections, afterwards the ones that don't need to be run will be skipped through assume directives.
Fixed REST tests that rely on a specific number of shards as this change introduces also random number of shards and replicas (through randomIndexTemplate)
Closes#5654
* Removed XTermsFilter fixed in LUCENE-5502
* Switched back to automaton queries that caused failures due to LUCENE-5532
* Fixed Highlight test that has different results due to LUCENE-5538
All the ordinary test operations happen based on the ImmutableTestCluster base class and are executed via transport client. Will be used especially for the REST tests once migrated to the standard randomized runner.
Added new httpAddresses method to ImmutableTestCluster to be able to retrieve the http addresses to connect to for the REST tests. Both versions will look inside the cluster to figure out which nodes are available for http calls and their addresses.
The external cluster is used as global cluster if the tests.cluster system property is available. The property needs to contain a comma separated list of available elasticsearch nodes that will be used to connect to the cluster (e.g. localhost:9300,localhost:9301).
Only a subset of the integration tests can currently be run successfully against the external cluster, for more precision the ones that don't modify the cluster layout (don't require cluster() functionalities but rely only on immutableCluster()). Also at least two data nodes are required otherwise the ensureGreen calls cannot succeed.
Closes#5630
We have had a couple of bugs because of the use of these methods without paying
attention that it might return a negative value when provided with MIN_VALUE.
There is one common and legitimate usage of this method in order to perform
a modulo operation which would always return a positive number. This use-case
has been extracted to MathUtils.mod.
Close#5562
As documented in systemd's manual pages tmpfiles.d(5) and systemd.unit(5),
a package should install its default configuration in /usr/lib, which can
be overriden by system administrators in /etc.
New locations in the rpm:
/usr/lib/systemd/system/elasticsearch.service
/usr/lib/tmpfiles.d/elasticsearch.conf
When building elasticsearch, we now require to use java 1.7.
Maven will check that before compiling any class. If Java version is incorrect, you will get the following message:
```
[WARNING] Rule 0: org.apache.maven.plugins.enforcer.RequireJavaVersion failed with message:
Detected JDK Version: 1.6.0-65 is not in the allowed range [1.7,).
```
Closes#5428.
We have the default QueryWrapperFilter as well as our custom one while
our wrapper is explicitly marked as no_cache such that it will never
be included in a cache. This was not consistenly used and caused several
problems during tests where p/c related queries were used as filters
and ended up in the cache. This commit adds the QueryWrapperFilter
ctor to the forbidden APIs to enforce the query instance checks.
Note that the standard `atLeast` implementation has now Integer.MAX_VALUE as upper bound, thus it behaves differently from what we expect in our tests, as we never expect the upper bound to be that high.
Added our own `atLeast` to `AbstractRandomizedTest` so that it has the expected behaviour with a reasonable upper bound.
See https://github.com/carrotsearch/randomizedtesting/issues/131
Lucene's RamUsageEstimator.sizeOf(Object) is to expensive.
Query size estimation will be enabled when a cheaper way of query size estimation can be found.
Closes#5372
Relates to #5339
Today, even though our merge policy doesn't return new merge specs on SEGMENT_FLUSH, merge on the scheduler is still called on flush time, and can cause merges to stall indexing during merges. Both for the concurrent merge scheduler (the default) and the serial merge scheduler. This behavior become worse when throttling kicks in (today at 20mb per sec).
In order to solve it (outside of Lucene for now), we wrap the merge scheduler with an EnableMergeScheduler, where, on the thread level, using a thread local, the call to merge can be enabled/disabled.
A Merges helper class is added where all explicit merges operations should go through. If the scheduler is the enabled one, it will enable merges before calling the relevant explicit method call. In order to make sure Merges is the only class that calls the explicit merge calls, the IW variant of them is added to the forbidden APIs list.
closes#5319
Adds support for storing mustache based query templates that can later be filled
with query parameter values at execution time. Templates may be both quoted,
non-quoted and referencing templates stored in config/scripts/*.mustache by file
name.
See docs/reference/query-dsl/queries/template-query.asciidoc for templating
examples.
Implementation detail: mustache itself is being shaded as it depends directly on
guava - so having it marked optional but included in the final distribution
raises chances of version conflicts downstream.
Fixes#4879
This upgrade includes a fix for RAM estimation on IndexReader
that allows to expose the amount of used bytes per segment now
as a setting in Elasticsearch. (LUCENE-5373)
Additionally this bugfix release contained a small fix for highlighting
that was already ported to Elasticsearch when reported (LUCENE-5361)
Closes#4897