23688 Commits

Author SHA1 Message Date
Daniel Mitterdorfer 4598c36027 Fix various concurrency issues in transport (#19675)
Due to various issues (most notably a missing happens-before edge
between socket accept and channel close in MockTcpTransport),
MockTcpTransportTests sometimes did not terminate.

With this commit we fix various concurrency issues that led to
this hanging test.

Failing example build: https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-os-compatibility/os=oraclelinux/835/console
2016-08-04 21:00:59 +02:00
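The missing happens-before edge mentioned in commit 4598c36027 can be illustrated with a minimal sketch. This is not the actual MockTcpTransport code: `OpenChannels` and its methods are hypothetical names, and the sketch only assumes that an accept thread registers channels while another thread closes the transport. Guarding both paths with the same monitor means a close either sees a registered channel or causes the late registration to be rejected.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration; not part of Elasticsearch.
final class OpenChannels {
    private final Set<Closeable> channels = new HashSet<>();
    private boolean closed = false;

    // Called by the accept thread. Returns false, and closes the channel,
    // if the transport has already been closed.
    synchronized boolean register(Closeable channel) {
        if (closed) {
            closeQuietly(channel);
            return false;
        }
        channels.add(channel);
        return true;
    }

    // Called by the closing thread. Both methods lock the same monitor, so
    // every channel registered before this call is visible here.
    synchronized void closeAll() {
        closed = true;
        for (Closeable channel : channels) {
            closeQuietly(channel);
        }
        channels.clear();
    }

    synchronized int openCount() {
        return channels.size();
    }

    private static void closeQuietly(Closeable channel) {
        try {
            channel.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With this shape, an accepted channel cannot slip past a concurrent close unnoticed, which is the kind of race that can leave a test waiting forever.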
Boaz Leskes 7010082112 Add checksumming and versions to the Translog's Checkpoint files (#19797)
This prepares the infrastructure to be able to extend the checkpoint file to store more information.
2016-08-04 20:42:12 +02:00
Luca Cavanna bca9ad86c6 Merge pull request #19808 from javanna/test/parse_alternate_query_strict
[TEST] parse query alternate versions in strict mode
2016-08-04 20:00:22 +02:00
javanna cd9388ce66 [TEST] parse query alternate versions in strict mode
AbstractQueryTestCase parses the main version of the query in strict mode, meaning that it will fail if any deprecated syntax is used. It should do the same for alternate versions (e.g. short versions). This is the way it is because the two alternate versions for ids query are both deprecated. Moved testing for those to a specific test method that isolates the deprecations and actually tests that the two are deprecated.
2016-08-04 19:49:43 +02:00
David Pilato 6b9a084086 Merge branch 'pr/19557-extract-aws-key' 2016-08-04 17:48:44 +02:00
Clinton Gormley 2cceb0a5f4 Updated v5.0.0-alpha5 release notes 2016-08-04 17:17:53 +02:00
Ali Beyad 34bb150863 [TEST] Fixes primary term in TransportReplicationActionTests#testReplicaProxy 2016-08-04 10:18:48 -04:00
Ryan Biesemeyer 9f1525255a Update link to mapper-murmur3 plugin in card docs (#19788) 2016-08-04 15:56:59 +02:00
Ali Beyad 8bbc312fdd Fixes issue with dangling index being deleted instead of re-imported (#19666)
Fixes an issue where a node that receives a cluster state
update with a brand new cluster UUID but without an
initial persistence block could cause indices to be wiped out,
preventing them from being reimported as dangling indices.
This commit only removes the in-memory data structures, and
thus the indices are subsequently reimported as dangling indices.
2016-08-04 08:47:46 -04:00
Yannick Welsch ede78ad231 Use primary terms as authority to fail shards (#19715)
A primary shard currently instructs the master to fail a replica shard that it fails to replicate writes to before acknowledging the writes to the client. To ensure that the primary instructing the master to fail the replica is still the current primary in the cluster state on the master, it submits not only the identity of the replica shard to fail to the master but also its own shard identity. This can be problematic however when the primary is relocating. After primary relocation handoff but before the primary relocation target is activated, the primary relocation target is replicating writes through the authority of the primary relocation source. This means that the primary relocation target should probably send the identity of the primary relocation source as authority. However, this is not good enough either, as primary shard activation and shard failure instructions can arrive out-of-order. This means that the relocation target would have to send both relocation source and target identity as authority. Fortunately, there is another concept in the cluster state that represents this joint authority, namely primary terms. The primary term is only increased on initial assignment or when a replica is promoted. It stays the same however when a primary relocates.

This commit changes ShardStateAction to rely on primary terms for shard authority. It also changes the wire format to only transmit ShardId and allocation id of the shard to fail (instead of the full ShardRouting), so that the same action can be used in a subsequent PR to remove allocation ids from the active allocation set for which there exist no ShardRouting in the cluster anymore. Last but not least, this commit also makes AllocationService less lenient, requiring ShardRouting instances that are passed to its applyStartedShards and applyFailedShards methods to exist in the routing table. ShardStateAction, which is calling these methods, now has the responsibility to resolve the ShardRouting objects that are to be started / failed, and remove duplicates.
2016-08-04 12:00:37 +02:00
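The primary-term authority described in commit ede78ad231 can be sketched roughly as follows. This is a simplified model, not the real ShardStateAction: the class and method names are hypothetical, shard ids are plain strings, and the equality comparison is an assumption about how a stale term would be rejected.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of a master-side check; names are illustrative only.
final class PrimaryTermAuthority {
    // Current primary term per shard id, as the cluster state would record it.
    private final Map<String, Long> currentTerms = new HashMap<>();

    void assignPrimary(String shardId, long term) {
        currentTerms.put(shardId, term);
    }

    // The term is increased when a replica is promoted...
    void promoteReplica(String shardId) {
        currentTerms.merge(shardId, 1L, Long::sum);
    }

    // ...but stays the same when the primary relocates, so relocation source
    // and target share one authority.
    void relocatePrimary(String shardId) {
        // intentionally a no-op for the term
    }

    // A request to fail a shard is honoured only if it carries the current term.
    boolean acceptShardFailure(String shardId, long requestPrimaryTerm) {
        Long current = currentTerms.get(shardId);
        return current != null && current == requestPrimaryTerm;
    }
}
```

Because relocation leaves the term untouched, a request sent by either the relocation source or the relocation target is accepted, while a request from a primary that has since been replaced by a promoted replica is not.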
Boaz Leskes d327dd46b1 Recovery: don't log an error when listing an empty folder 2016-08-04 10:23:36 +02:00
javanna 146f02183d [TEST] remove unused methods and fix some warnings in AbstractQueryTestCase
Also fix line length issues
2016-08-04 10:06:25 +02:00
Jason Tedor 533412e36f Improve cat thread pool API
Today, when listing thread pools via the cat thread pool API, thread
pools are listed in a column-delimited format. This is unfriendly to
command-line tools, and inconsistent with other cat APIs. Instead,
thread pools should be listed in a row-delimited format.

Additionally, the cat thread pool API is limited to a fixed list of
thread pools that excludes certain built-in thread pools as well as all
custom thread pools. These thread pools should be available via the cat
thread pool API.

This commit improves the cat thread pool API by listing all thread pools
(built-in or custom), and by listing them in a row-delimited
format. Finally, for each node, the output thread pools are sorted by
thread pool name.

Relates #19721
2016-08-03 23:02:13 -04:00
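As a rough illustration of the row-delimited listing described in commit 533412e36f, the sketch below prints one thread pool per line and sorts by thread pool name within each node. The `Row` fields, the space-separated layout, and the ordering of nodes are assumptions made for the example; they are not the actual cat API output.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical rendering helper; not the real cat thread pool implementation.
final class ThreadPoolRows {
    // One thread pool on one node (fields chosen for illustration).
    static final class Row {
        final String node;
        final String pool;
        final int active;

        Row(String node, String pool, int active) {
            this.node = node;
            this.pool = pool;
            this.active = active;
        }
    }

    // Renders one line per thread pool, ordered by node and then by pool name.
    static List<String> render(List<Row> rows) {
        List<Row> sorted = new ArrayList<>(rows);
        sorted.sort(Comparator.comparing((Row r) -> r.node)
                .thenComparing((Row r) -> r.pool));
        List<String> lines = new ArrayList<>();
        for (Row r : sorted) {
            lines.add(r.node + " " + r.pool + " " + r.active);
        }
        return lines;
    }
}
```

One line per pool is easy to filter with line-oriented command-line tools, and custom pools need no extra columns.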
Jason Tedor 7d750d2811 Increase Netty 3 REST test suite timeout
This commit increases the Netty 3 REST test suite timeout to thirty
minutes. This is to address these tests running slowly after increasing
the number of nodes in the tests to two. This has surfaced that the
tests are heavily impacted by excessive fsyncs from most tests using the
default number of shards of five.
2016-08-03 21:31:48 -04:00
debadair bcc5c7c07a Docs: Fixed callout error that broke the build. 2016-08-03 17:20:00 -07:00
Nik Everett 3be1e7ec35 CONSOLEify the completion suggester docs (#19758)
* CONSOLEify search/suggesters/completion
* CONSOLEify context suggester docs
2016-08-03 18:40:17 -04:00
Ali Beyad be87d50f32 Fixes CreateIndexIT test that assumes an index create propagated
before calling delete.
2016-08-03 16:24:24 -04:00
Nik Everett e249ad8dfe Fix loggerUsageCheck after clean
The `loggerUsageCheck` can only run on directories that exist. It was
checking whether or not the directories exist before they were built
and then deciding to do no work. This only happens when building in
a clean environment, which CI does but people rarely do locally.
2016-08-03 15:36:11 -04:00
Jason Tedor eb6da69e9f Explicitly tell Netty to not use unsafe
With the security permissions that we grant to Netty, Netty can not
access unsafe (because it relies on having the runtime permission
accessDeclaredMembers and the reflect permission
suppressAccessChecks). Instead, we should just explicitly tell Netty to
not use unsafe. This commit adds a flag to the default jvm.options to
tell Netty to not look for unsafe.

Relates #19786
2016-08-03 15:23:34 -04:00
Ryan Ernst c3a5e4fa48 Merge pull request #19765 from rjernst/metadata_mapper_dup
Mappings: Fix detection of metadata fields in documents
2016-08-03 11:58:24 -07:00
Ryan Ernst ef425f4b7c Merge pull request #19770 from rjernst/script_service_component
Add ScriptService to dependencies available for plugin components
2016-08-03 11:57:58 -07:00
Jason Tedor b12608a6db Merge pull request #19767 from jaymode/netty4
Enable Netty 4 extensions
2016-08-03 14:10:43 -04:00
Luca Cavanna c5a9427293 Merge pull request #19750 from javanna/fix/npe_parse_field_array
Throw ParsingException if a query is wrapped in an array
2016-08-03 18:21:39 +02:00
Mary fa3420c2a5 Update term-level-queries.asciidoc
Typo fix
2016-08-03 10:18:13 -06:00
javanna 4805250ecf Throw ParsingException if a query is wrapped in an array
Our parsing code accepted up until now queries in the following form (note that the body of the `bool` query starts with `[`):

```
{
    "bool" : [
        {
          "must" : []
        }
    ]
}
```

This would lead to a null pointer exception as most parsers assume that the field name ("must" in this example) is the first thing that can be found in a query if its json is valid, hence always non null while parsing. Truth is that the additional array layer doesn't make the json invalid, hence the following code fragment would cause NPE within ParseField, because null gets passed to `parseContext.isDeprecatedSetting`:

```
if (token == XContentParser.Token.FIELD_NAME) {
    currentFieldName = parser.currentName();
} else if (parseContext.isDeprecatedSetting(currentFieldName)) {
    // skip
} else if (token == XContentParser.Token.START_OBJECT) {
```

We could add null checks in each of our parsers in lots of places, but we rely on `currentFieldName` being non null in all of our parsers, and we should consider it a bug when these unexpected situations are not caught explicitly. It would be best to find a way to prevent such queries altogether without changing all of our parsers.

The reason why such a query goes through is that we've been allowing a query to start with either `[` or `{`. The only reason I found is that we accept `match_all : []`. This seems like an undocumented corner case that we could drop support for. Then we can be stricter and accept only `{` as start token of a query. That way the only next token that the parser can encounter if the json is valid (otherwise the json parser would barf earlier) is actually a field_name, hence the assumption that all our parsers make holds.

The downside of this is simply dropping support for `match_all : []`

Relates to #12887
2016-08-03 17:05:14 +02:00
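The stricter start-token rule proposed in commit 4805250ecf can be sketched as a single up-front check. This is a minimal model and not the actual parsing code: the `Token` enum stands in for `XContentParser.Token`, the token list stands in for a real parser, and `IllegalArgumentException` stands in for the `ParsingException` named in the commit title.

```java
import java.util.List;

// Hypothetical stand-in for the real parser types; for illustration only.
final class QueryStartCheck {
    enum Token { START_OBJECT, START_ARRAY, FIELD_NAME, END_OBJECT, END_ARRAY, VALUE }

    // Accepts a query body only if its first token is START_OBJECT. After
    // that, valid JSON guarantees the next token is FIELD_NAME or END_OBJECT,
    // so individual parsers never see a null current field name.
    static void requireObjectStart(List<Token> tokens) {
        if (tokens.isEmpty() || tokens.get(0) != Token.START_OBJECT) {
            throw new IllegalArgumentException("query must start with '{'");
        }
    }
}
```

Doing the check once, before dispatching to a specific query parser, avoids adding null checks to every parser.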
Nik Everett ca8f666c66 Add line number to yaml test failures
Old:
```
   > Throwable #1: java.lang.AssertionError: expected [2xx] status code but api [reindex] returned [400 Bad Request] [{"error":{"root_cause":[{"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25}],"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25,"caused_by":{"type":"illegal_argument_exception","reason":"[dest] unknown field [asdfadf], parser not found"}},"status":400}]
   >    at __randomizedtesting.SeedInfo.seed([9325F8C5C6F227DD:1B71C71F680E4A25]:0)
   >    at org.elasticsearch.test.rest.yaml.section.DoSection.execute(DoSection.java:119)
   >    at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:309)
   >    at java.lang.Thread.run(Thread.java:745)
```

New:
```
   > Throwable #1: java.lang.AssertionError: Failure at [reindex/10_basic:12]: expected [2xx] status code but api [reindex] returned [400 Bad Request] [{"error":{"root_cause":[{"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25}],"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25,"caused_by":{"type":"illegal_argument_exception","reason":"[dest] unknown field [asdfadf], parser not found"}},"status":400}]
   >    at __randomizedtesting.SeedInfo.seed([444DEEAF47322306:CC19D175E9CE4EFE]:0)
   >    at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.executeSection(ESClientYamlSuiteTestCase.java:329)
   >    at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:309)
   >    at java.lang.Thread.run(Thread.java:745)
   > Caused by: java.lang.AssertionError: expected [2xx] status code but api [reindex] returned [400 Bad Request] [{"error":{"root_cause":[{"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25}],"type":"parsing_exception","reason":"[reindex] failed to parse field [dest]","line":1,"col":25,"caused_by":{"type":"illegal_argument_exception","reason":"[dest] unknown field [asdfadf], parser not found"}},"status":400}]
   >    at org.elasticsearch.test.rest.yaml.section.DoSection.execute(DoSection.java:119)
   >    at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.executeSection(ESClientYamlSuiteTestCase.java:325)
   >    ... 37 more
```

Sorry for the longer stack trace, but I wanted to be sure I didn't throw
anything away by accident.
2016-08-03 10:59:57 -04:00
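The "New" output in commit ca8f666c66 suggests that the original `AssertionError` is wrapped in a second one that carries the test location, which also explains the longer stack trace. A minimal sketch of that pattern, assuming a hypothetical `SectionRunner` helper and a plain `Runnable` in place of the real test section type:

```java
// Hypothetical helper; the real logic lives in ESClientYamlSuiteTestCase.
final class SectionRunner {
    // location is a "suite/test:line" string such as "reindex/10_basic:12".
    static void executeSection(String location, Runnable section) {
        try {
            section.run();
        } catch (AssertionError e) {
            // Keep the original error as the cause so nothing is thrown away.
            throw new AssertionError("Failure at [" + location + "]: " + e.getMessage(), e);
        }
    }
}
```

Keeping the original error as the cause is what produces the extra `Caused by:` section in the new output.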
javanna 51bbe2c5c4 [TEST] fix log statement in ESIndexLevelReplicationTestCase 2016-08-03 16:56:19 +02:00
Nik Everett f97b1a94b8 Add 2.3.5 to packaging tests list
This should make the packaging tests happy again.
2016-08-03 10:49:49 -04:00
Ali Beyad 4f70ee521f Migration Guide changes for BlobContainer (#19731)
Adds a notice in the migration guide for removing
two deleteBlobs and one writeBlob method from the
BlobContainer interface.
2016-08-03 10:41:25 -04:00
Clinton Gormley 39081af9d6 Added version 2.3.5 with bwc indices 2016-08-03 15:50:47 +02:00
Britta Weber 5b38282fcb fix bwc index tool for versions before 5.0 (#19626)
* fix bwc index tool for versions before 5.0
2016-08-03 15:37:38 +02:00
Jason Tedor e74d02138f Merge branch 'master' into netty4
* master:
  Fix REST test documentation
  [Test] move methods from bwc test to test package for use in plugins (#19738)
  package-info.java should be in src/main only.
  Split regular histograms from date histograms. #19551
  Tighten up concurrent store metadata listing and engine writes (#19684)
  Plugins: Make NamedWriteableRegistry immutable and add extension point for named writeables
  Add documentation for the 'elasticsearch-translog' tool
  [TEST] Increase time waiting for all shards to move off/on to a node
  Fixes the active shard count check in the case of (#19760)
  Fixes cat tasks operation in detailed mode
  ignore some docker craziness in seccomp environment checks
2016-08-03 09:16:18 -04:00
David Pilato 358ee7c272 Fix REST test documentation
To run the tests we have to use now `*Yaml*IT` instead of `RestIT`:

```sh
gradle :distribution:integ-test-zip:integTest  -Dtests.class=org.elasticsearch.test.rest.*Yaml*IT
```
2016-08-03 13:50:45 +02:00
Robert Muir ef5debc6ce Merge pull request #19754 from rmuir/docker_seccomp
ignore some docker craziness in seccomp environment checks
2016-08-03 05:50:25 -04:00
Britta Weber abcb4c8a97 [Test] move methods from bwc test to test package for use in plugins (#19738)
* [Test] move methods from bwc test to test package for use in other plugins
2016-08-03 11:41:46 +02:00
Adrien Grand 0e64117512 package-info.java should be in src/main only. 2016-08-03 11:11:25 +02:00
Ryan Ernst 18f242b069 Merge pull request #19764 from rjernst/writeable_registry
Make NamedWriteableRegistry immutable and add extension point for named writeables
2016-08-03 01:36:38 -07:00
Ryan Ernst fe823c857b Plugins: Add ScriptService to dependencies available for plugin components 2016-08-03 00:43:04 -07:00
Adrien Grand a0818d3b87 Split regular histograms from date histograms. #19551
Currently both aggregations really share the same implementation. This commit
splits the implementations so that regular histograms can support decimal
intervals/offsets and compute correct buckets for negative decimal values.

However the response API is still the same. So for instance both regular
histograms and date histograms will produce an
`org.elasticsearch.search.aggregations.bucket.histogram.Histogram`
aggregation.

The optimization to compute an identifier of the rounded value and the
rounded value itself has been removed since it was only used by regular
histograms, which now do the rounding themselves instead of relying on the
Rounding abstraction.

Closes #8082
Closes #4847
2016-08-03 08:39:48 +02:00
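To see why negative decimal values need care, as described in commit a0818d3b87, compare a floor-based bucket key with one that truncates toward zero. This is a sketch of the general rounding technique with hypothetical method names; it is not taken from the histogram aggregator itself.

```java
// Hypothetical helper showing floor-based bucketing for decimal intervals.
final class HistogramKey {
    // Key of the bucket containing value: the largest offset + k * interval
    // that is <= value, for integer k.
    static double bucketKey(double value, double interval, double offset) {
        return Math.floor((value - offset) / interval) * interval + offset;
    }

    // Contrast: truncating toward zero puts small negative values in bucket 0.
    static double truncatedKey(double value, double interval) {
        return (long) (value / interval) * interval;
    }
}
```

For example, with an interval of 0.5 the value -0.3 belongs to the bucket starting at -0.5; `bucketKey(-0.3, 0.5, 0.0)` yields -0.5, whereas `truncatedKey(-0.3, 0.5)` yields 0.0.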
Boaz Leskes f6aeb35ce8 Tighten up concurrent store metadata listing and engine writes (#19684)
In several places in our code we need to get a consistent list of files + metadata of the current index. We currently have a couple of ways to do this in the `Store` class, which also do the right thing and try to verify the integrity of the smaller files. Sadly, those methods can run into trouble if anyone writes into the folder while they are busy. Most notably, the index shard's engine may decide to commit halfway through and remove a `segment_N` file before the store got to checksum it (but after it was already listed). This race condition typically doesn't happen as almost all of the places where we list files also happen to be places where the relevant shard doesn't yet have an engine. There is however an exception (of course :)) which is the API to list shard stores, used by the master when it is looking for shard copies to assign to.

I already took one shot at fixing this in #19416, but it turns out not to be enough - see for example https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+multijob-os-compatibility/os=sles/822.

The first inclination to fix this was to add more locking to the different Store methods and acquire the `IndexWriter` lock, thus preventing any engine from accessing it if the shard is offline, and to use the current index commit snapshotting logic already existing in `IndexShard` for when the engine is started. That turned out to be a bad idea as we create more subtleties where, for example, a store listing can prevent a shard from starting up (the writer lock doesn't wait if it can't get access, but fails immediately, which is good). Another example is running on a shared directory where some other engine may actually hold the lock.

Instead I decided to take another approach:
1) Remove all the various methods on store and keep one, which accepts an index commit (which can be null) and also clearly communicates that the *caller* is responsible for concurrent access. This also tightens up the API which is a plus.
2) Add a `snapshotStore` method to IndexShard that takes care of all the concurrency aspects with the engine, which is now possible because it's all in the same place. It's still a bit ugly but at least it's all in one place and we can evaluate how to improve on this later on. I also renamed the `snapshotIndex` method to `acquireIndexCommit` to avoid confusion and I think it communicates better what it does.
2016-08-03 08:34:09 +02:00
Ryan Ernst 7bfe1bd628 Check inner field with metadata field name is ok 2016-08-02 17:03:21 -07:00
Ryan Ernst 4e48154130 Mappings: Fix detection of metadata fields in documents
In 2.0, the ability to specify metadata fields like _routing and _ttl
inside a document was removed. However, the ability to break through
this restriction has lingered, and the check that enforced it is
completely broken.

This change fixes the check, and adds a parsing test.
2016-08-02 16:54:44 -07:00
Ryan Ernst df8dc64e9b Plugins: Make NamedWriteableRegistry immutable and add extension point for named writeables
Currently any code that wants to add NamedWriteables to the
NamedWriteableRegistry can do so via guice injection of the registry,
and registering at construction time. However, this makes the registry
complex: it has both get and register methods synchronized, and there is
likely contention on the read side from multiple threads.  The
registration has mostly already been contained to guice modules at node
construction time.

This change makes the registry immutable, taking all of the
NamedWriteable readers at construction time. It also allows plugins to
add arbitrary named writeables that they may use in their own transport
actions.
2016-08-02 15:56:25 -07:00
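The immutable-registry design from commit df8dc64e9b can be sketched as below. It is a deliberately simplified model: the real NamedWriteableRegistry is not shown in this log, so the `Entry` type, the string-keyed lookup, and the `Function<String, Object>` reader type are all assumptions made for illustration.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical, simplified registry; not the Elasticsearch class.
final class ImmutableReaderRegistry {
    // A named reader, e.g. contributed by core code or by a plugin.
    static final class Entry {
        final String name;
        final Function<String, Object> reader;

        Entry(String name, Function<String, Object> reader) {
            this.name = name;
            this.reader = reader;
        }
    }

    private final Map<String, Function<String, Object>> readers;

    // All readers are supplied up front; there is no register method.
    ImmutableReaderRegistry(List<Entry> entries) {
        Map<String, Function<String, Object>> map = new HashMap<>();
        for (Entry entry : entries) {
            if (map.put(entry.name, entry.reader) != null) {
                throw new IllegalArgumentException("duplicate reader [" + entry.name + "]");
            }
        }
        this.readers = Collections.unmodifiableMap(map);
    }

    // Lookups need no synchronization because the map never changes.
    Function<String, Object> getReader(String name) {
        Function<String, Object> reader = readers.get(name);
        if (reader == null) {
            throw new IllegalArgumentException("unknown named writeable [" + name + "]");
        }
        return reader;
    }
}
```

Since the map is populated once in the constructor and stored in a final field, concurrent readers do not contend on a lock, which is the read-side benefit the commit message describes.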
Lee Hinman 046abdc281 Merge remote-tracking branch 'dakrone/document-translog-tool' 2016-08-02 16:27:02 -06:00
Lee Hinman 0ade5a207d Add documentation for the 'elasticsearch-translog' tool
This adds documentation to the translog page for the CLI truncation
tool.
2016-08-02 16:26:28 -06:00
Lee Hinman a9b2e172fa [TEST] Increase time waiting for all shards to move off/on to a node 2016-08-02 16:18:39 -06:00
Ali Beyad c28eee77df Fixes the active shard count check in the case of (#19760)
ActiveShardCount.ALL by checking for active shards,
not just started shards, as a shard could be active
but in the relocating state (i.e. not in the started
state).
2016-08-02 18:00:39 -04:00
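The distinction behind commit c28eee77df, active versus started, can be shown with a small sketch. The enum and helper names are hypothetical; the only thing taken from the commit message is that a relocating shard counts as active even though it is not in the started state.

```java
import java.util.List;

// Hypothetical model of shard states; for illustration only.
final class ActiveShardCheck {
    enum ShardState { UNASSIGNED, INITIALIZING, STARTED, RELOCATING }

    // A shard is active if it is started or relocating.
    static boolean isActive(ShardState state) {
        return state == ShardState.STARTED || state == ShardState.RELOCATING;
    }

    // Check that every shard copy is active.
    static boolean allActive(List<ShardState> copies) {
        for (ShardState state : copies) {
            if (!isActive(state)) {
                return false;
            }
        }
        return true;
    }

    // Stricter check that wrongly fails while a copy is relocating.
    static boolean allStarted(List<ShardState> copies) {
        for (ShardState state : copies) {
            if (state != ShardState.STARTED) {
                return false;
            }
        }
        return true;
    }
}
```

With one started copy and one relocating copy, `allStarted` returns false while `allActive` returns true, which is the case the commit addresses.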
Jason Tedor 0461e12663 Simplify Netty 4 transport implementations
The Netty 4 transport implementations have an unnecessary dependency on
SocketChannels, and can instead just use plain Channels.
2016-08-02 17:07:30 -04:00
Igor Motov 22e63b4783 Fixes cat tasks operation in detailed mode
Currently the cat tasks operation fails in the detailed mode.

Closes #19755
2016-08-02 15:21:31 -04:00
jaymode 6def10c5d9 make netty4 http request public 2016-08-02 14:46:44 -04:00