Commit Graph

508 Commits

Author SHA1 Message Date
Nhat Nguyen 038fe1151b TEST: Add debug log to FlushIT
We still don't have a strong reason for the failures of
testDoNotRenewSyncedFlushWhenAllSealed and
testSyncedFlushSkipOutOfSyncReplicas.

This commit adds debug logging for these two tests.
2018-05-01 10:15:03 -04:00
Diwas Joshi dd5fcb211d index name added to snapshot restore exception (#29604)
This PR adds index name to snapshot restore exception if index is renamed during restoring.
closes [#27601](https://github.com/elastic/elasticsearch/issues/27601)
2018-05-01 15:16:38 +02:00
Jason Tedor 5de6f4ff7b Adjust copy settings on resize BWC version
This commit adjusts the BWC version for copy settings on resize
operations after the behavior was backported to 6.x.
2018-05-01 08:49:16 -04:00
Jason Tedor 50535423ff
Allow copying source settings on resize operation (#30255)
Today when an index is created from shrinking or splitting an existing
index, the target index inherits almost none of the source index
settings. This is surprising and a hassle for operators managing such
indices. Given this is the default behavior, we can not simply change
it. Instead, we start by introducing the ability to copy settings. This
flag can be set on the REST API or on the transport layer and it has the
behavior that it copies all settings from the source except non-copyable
settings (a property of a setting introduced in this
change). Additionally, settings on the request will always override.

This change is the first step in our adventure:
 - this flag is added here in 7.0.0 and immediately deprecated
 - this flag will be backported to 6.4.0 and remain deprecated
 - then, we will remove the ability to set this flag to false in 7.0.0
 - finally, in 8.0.0 we will remove this flag and the only behavior will
   be for settings to be copied
2018-05-01 08:48:19 -04:00
Nik Everett 99b98fab18
Core: Pick inner most parse exception as root cause (#30270)
Just like `ElasticsearchException`, the inner most
`XContentParseException` tends to contain the root cause of the
exception and show be show to the user in the `root_cause` field.

The effectively undoes most of the changes that #29373 made to the
`root_cause` for parsing exceptions. The `type` field still changes from
`parse_exception` to `x_content_parse_exception`, but this seems like a
fairly safe change.

`ElasticsearchWrapperException` *looks* tempting to implement this but
the behavior isn't quite right. `ElasticsearchWrapperExceptions` are
entirely unwrapped until the cause no longer
`implements ElasticsearchWrapperException` but `XContentParseException`
should be unwrapped until its cause is no longer an
`XContentParseException` but no further. In other words,
`ElasticsearchWrapperException` are unwrapped one step too far.

Closes #30261
2018-05-01 07:44:58 -04:00
Luca Cavanna acdf330a0e
Minor DocWriteResponse changes (#29675)
Remove double if depending on the Result value. It makes little sense to
pass in a boolean flag based on a Result value that we already have,
if that internally is represented again as a `Result` value.

Also changed the `Result` `lowercase` instance member to be computed
based on `name()` instead of `toString()` which is safer and to use
`Locale.ROOT` instead of `Locale.ENGLISH`
2018-05-01 09:35:09 +02:00
Boaz Leskes 4a537ef03c
Bulk operation fail to replicate operations when a mapping update times out (#30244)
Starting with the refactoring in https://github.com/elastic/elasticsearch/pull/22778 (released in 5.3) we may fail to properly replicate operation when a mapping update on master fails. If a bulk
operations needs a mapping update half way, it will send a request to the master before continuing 
to index the operations. If that request times out or isn't acked (i.e., even one node in the cluster 
didn't process it within 30s), we end up throwing the exception and aborting the entire bulk. This is 
a problem because all operations that were processed so far are not replicated any more to the 
replicas.  Although these operations were never "acked" to the user (we threw an error) it cause the 
local checkpoint on the replicas to lag (on 6.x) and the primary and replica to diverge. 

This PR does a couple of things:
1) Most importantly, treat *any* mapping update failure as a document level failure, meaning only 
    the relevant indexing operation will fail.
2) Removes the mapping update callbacks from `IndexShard.applyIndexOperationOnPrimary` and 
    similar methods for simpler execution. We don't use exceptions any more when a mapping 
    update was successful.

I think we need to do more work here (the fact that a single slow node can prevent those mappings 
updates from being acked and thus fail operations is bad), but I want to keep this as small as I can 
(it is already too big).
2018-05-01 08:15:02 +02:00
Chris Earle 725a5af2c6
_cluster/state should always return cluster_uuid (#30143)
Currently, the only way to get the REST response for the `/_cluster/state`
call to return the `cluster_uuid` is to request the `metadata` metrics,
which is one of the most expensive response structures. However, external
monitoring agents will likely want the `cluster_uuid` to correlate the
response with other API responses whether or not they want cluster
metadata.
2018-04-30 10:16:11 -04:00
Jason Tedor 811f5b4efc
Do not ignore request analysis/similarity on resize (#30216)
Today when a resize operation is performed, we copy the analysis,
similarity, and sort settings from the source index. It is possible for
the resize request to include additional index settings including
analysis, similarity, and sort settings. We reject sort settings when
validating the request. However, we silently ignore analysis and
similarity settings on the request that are already set on the source
index. Since it is possible to change the analysis and similarity
settings on an existing index, this should be considered a bug and the
sort of leniency that we abhor. This commit addresses this bug by
allowing the request analysis/similarity settings to override the
existing analysis/similarity settings on the target.
2018-04-30 07:31:36 -04:00
Tanguy Leroux a6624bb742
[Test] Update test in SharedClusterSnapshotRestoreIT (#30200)
The `testDeleteSnapshotWithMissingIndexAndShardMetadata` test uses an
obsolete repository directory structure based on index names instead of
UUIDs. Because it swallows exceptions when deleting test files the test
never failed when the directory structure changed.

This commit fixes the test to use the right directory structure and file
 names and to not swallow exceptions anymore.
2018-04-30 09:48:03 +02:00
Jason Tedor 0a6312a5e6
Collapse REST resize handlers (#30229)
The REST resize handlers for shrink/split operations are effectively the
same code with a minor difference. This commit collapse these handlers
into a single base class.
2018-04-29 08:58:11 -04:00
Jason Tedor bdde2b9824
Rename request variables in shrink/split handlers (#30207)
This is a code-tidying PR, a little side adventure while working on
another change. Previously only shrink request existed but when the
ability to split indices was added, shrink and split were done together
under a single request object: the resize request object. However, the
code inherited the legacy name in the naming of some variables. This
commit cleans this up.
2018-04-28 01:09:44 -04:00
Julie Tibshirani f5978d6d33
In the field capabilities API, remove support for providing fields in the request body. (#30185) 2018-04-27 16:14:11 -07:00
Nhat Nguyen 9c586a2f07
Do not log warn shard not-available exception in replication (#30205)
Since #28049, only fully initialized shards are received write requests.
This enhancement allows us to handle all exceptions. In #28571, we
started strictly handling shard-not-available exceptions and tried to
keep the way we report replication errors to users by only reporting if
the error is not shard-not-available exceptions. However, since then we
unintentionally always log warn for all exception. This change restores
to the previous behavior which logs warn only if an exception is not a
shard-not-available exception.

Relates #28049
Relates #28571
2018-04-27 16:45:42 -04:00
Nik Everett f4ed902698
CCS: Drop http address from remote cluster info (#29568)
They are expensive to fetch and no longer needed by Kibana so they
*shouldn't* be needed by anyone else either.

Closes #29207
2018-04-27 14:19:00 -04:00
Julie Tibshirani d633130e1b
Convert FieldCapabilitiesResponse to a ToXContentObject. (#30182) 2018-04-27 09:47:11 -07:00
Tanguy Leroux 63148dd9ba
Fail snapshot operations early on repository corruption (#30140)
A NullPointerException is thrown when trying to create or delete
a snapshot in a repository that has been written to by an older 
Elasticsearch after writing to it with a newer Elasticsearch version.

This is because the way snapshots are formatted in the repository 
snapshots index file changed in #24477.

This commit changes the parsing of the repository index file so that 
it now detects a corrupted index file and fails early the snapshot 
operation.

closes #29052
2018-04-27 16:29:59 +02:00
Jim Ferenczi c08daf2589
Build global ordinals terms bucket from matching ordinals (#30166)
The global ordinals terms aggregator has an option to remap global ordinals to
dense ordinal that match the request. This mode is automatically picked when the terms
aggregator is a child of another bucket aggregator or when it needs to defer buckets to an
aggregation that is used in the ordering of the terms.
Though when building the final buckets, this aggregator loops over all possible global ordinals
rather than using the hash map that was built to remap the ordinals.
For fields with high cardinality this is highly inefficient and can lead to slow responses even
when the number of terms that match the query is low.
This change fixes this performance issue by using the hash table of matching ordinals to perform
the pruning of the final buckets for the terms and significant_terms aggregation.
I ran a simple benchmark with 1M documents containing 0 to 10 keywords randomly selected among 1M unique terms.
This field is used to perform a multi-level terms aggregation using rally to collect the response times.
The aggregation below is an example of a two-level terms aggregation that was used to perform the benchmark:

```
"aggregations":{
   "1":{
      "terms":{
         "field":"keyword"
      },
      "aggregations":{
         "2":{
            "terms":{
               "field":"keyword"
            }
         }
      }
   }
}
```

| Levels of aggregation | 50th percentile ms (master) | 50th percentile ms (patch) |
| --- | --- | --- |
| 2 | 640.41ms | 577.499ms |
| 3 | 2239.66ms | 600.154ms |
| 4 | 14141.2ms | 703.512ms |

Closes #30117
2018-04-27 15:26:46 +02:00
Alexander Reelsen e1a16a6018
REST: Remove GET support for clear cache indices (#29525)
Clearing the cache indices can be done via GET and POST. As GET should
only support read only operations, this removes the support for using
GET for clearing the indices caches.
2018-04-27 08:41:36 +02:00
Julie Tibshirani 0d8aed8c2b
Fix a bug in FieldCapabilitiesRequest#equals and hashCode. (#30181)
Also update its unit test to AbstractStreamableTestCase for better coverage.
2018-04-26 16:09:27 -07:00
Jim Ferenczi 80e0e64bfe Fix SliceBuilderTests#testRandom failures
Add missing shard context creation in a random test.
2018-04-26 22:18:39 +02:00
Julie Tibshirani d40116d260
Add support for field capabilities to the high-level REST client. (#29664) 2018-04-26 09:50:37 -07:00
Tanguy Leroux e864b93abf
Fix TermsSetQueryBuilder.doEquals() method (#29629)
Closes #29620
2018-04-26 17:47:16 +02:00
Nhat Nguyen 16490d7dfa
TEST: Update settings should go through cluster state (#29682)
Today we update index settings directly via IndexService instead of the
cluster state in IndexServiceTests. However, those changes will be lost
if there is a cluster state update. In general, we should update index
settings via client and limit the direct usage in only special tests.

This commit replaces direct usages by the updateSettings api of client.

Closes #24491
2018-04-26 09:28:14 -04:00
Jim Ferenczi 752ba2fb45 Adjust serialization versions after backport
Relates #29533
2018-04-26 14:06:56 +02:00
Jim Ferenczi 8b8c0c0b4d
Add additional shards routing info in ShardSearchRequest (#29533)
This commit propagates the preference and routing of the original SearchRequest in the ShardSearchRequest.
This information is then use to fix a bug in sliced scrolls when executed with a preference (or a routing).
Instead of computing the slice query from the total number of shards in the index, this commit computes this number from the number of shards per index that participates in the request.

Fixes #27550
2018-04-26 09:58:17 +02:00
Julie Tibshirani 32dfb65144 In the field capabilities API, deprecate support for providing fields in the request body. (#30157)
(cherry picked from commit d8d884b29d4aa7d01070484fee5de8d3db60cb25)
2018-04-25 23:01:53 -07:00
Nhat Nguyen 52c50e353b
Do not add noop from local translog to translog again (#29637)
Today we always add no-ops to translog regardless of its origin, thus a
noop may appear in the translog multiple times. This is not a big deal
as noops are small and rare to appear.

This commit ensures to add a noop to translog only if its origin is not
from local translog. This restriction has been applied for index and
delete.
2018-04-25 21:02:12 -04:00
Jason Tedor 2c3e71f116
Remove the suggest metric from stats APIs (#29635)
This metric previously existed for backwards compatibility reasons
although the suggest stats were folded into search stats. This metric
was deprecated in 6.3.0 and this commit removes them for 7.0.0.
2018-04-24 19:03:48 -04:00
Jason Tedor 25e45a765c
Fix byte size value equals/hash code test (#29643)
This commit fixes two issues with the byte size value equals/hash code
test.

The first problem is due to a test failure when the original instance is
zero bytes and we pick the mutation branch where we preserve the size
but change the unit. The mutation should result in a different byte size
value but changing the unit on zero bytes still leaves us with zero
bytes.

During the course of fixing this test I discovered another problem. When
we need to randomize size, we could randomly select a size that would
lead to an overflow of Long.MAX_VALUE.

This commit fixes both of these issues.
2018-04-24 19:01:27 -04:00
Jason Tedor bdf241347d
Add 6.4.0 version to master (#29684)
This commit adds the 6.4.0 version constant to the master branch.
2018-04-24 18:21:37 -04:00
Jason Tedor 3cadd5c40c Only enable modules to have native controllers
This commit removes the ability for a plugin to have a native controller
as leaves it as only modules can have a native controller.
2018-04-20 15:34:02 -07:00
Jason Tedor d99d0fa669 Add distribution type to startup scripts
This commit adds the distribution type to the startup scripts so that we
can discern from log output and the main response the type of the
distribution (deb/rpm/tar/zip).
2018-04-20 15:34:01 -07:00
Jason Tedor e64e6d8996 Add distribution flavor to startup scripts
This commit adds the distribution flavor (default versus oss) to the
build process which is passed through the startup scripts to
Elasticsearch. This change will be used to customize the message on
attempting to install/remove x-pack based on the distribution flavor.
2018-04-20 15:33:58 -07:00
Ryan Ernst fab5e21e7d Build: Split distributions into oss and default
This commit makes x-pack a module and adds it to the default
distrubtion. It also creates distributions for zip, tar, deb and rpm
which contain only oss code.
2018-04-20 15:33:57 -07:00
Yannick Welsch 6a4c5f3e93
Abort early on finding duplicate snapshot name in internal structures (#29634)
Adds a check in BlobstoreRepository.snapshot(...) that prevents duplicate snapshot names and fails 
the snapshot before writing out the new index file. This ensures that you cannot end up in this
situation where the index file has duplicate names and cannot be read anymore .

Relates to #28906
2018-04-20 17:32:34 +02:00
Jason Tedor 0045111ce2 Deprecate the suggest metrics (#29627)
The suggest stats were folded into the search stats as part of the
indices stats API in 5.0.0. However, the suggest metric remained as a
synonym for the search metric for BWC reasons. This commit deprecates
usage of the suggest metric on the indices stats API.

Similarly, due to the changes to fold the suggest stats into the search
stats, requesting the suggest index metric on the indices metric on the
nodes stats API has produced an empty object as the response since
5.0.0. This commit deprecates this index metric on the indices metric on
the nodes stats API.
2018-04-20 09:47:38 -04:00
Jay Modi dfc7ca7214
Implement Iterator#remove for Cache values iter (#29633)
This commit implements the ability to remove values from a Cache using
the values iterator. This brings the values iterator in line with the
keys iterator and adds support for removing items in the cache that are
not easily found by the key used for the cache.
2018-04-20 07:21:08 -06:00
Nhat Nguyen 42d81a2945 TEST: Unmute testPrimaryRelocationWhileIndexing
Previously we did not put an indexing to a version map if that map does
not require safe access but removed the existing delete tombstone only
if assertion enabled. In #29585, we removed the side-effect caused by
assertion then this test started failing. This failure can be explained
as follows:

- Step 1: Index a doc then delete that doc

- Step 2: The version map can switch to unsafe mode because of
  concurrent refreshes (implicitly called by flushes)

- Step 3: Index a document - the version map won't add this version
  value and won't prune the tombstone (previously it did)

- Step 4: Delete a document - this will return NOT_FOUND instead of
  DELETED because of the stale delete tombstone

This failure is actually fixed by #29619 in which we never leave stale
delete tombstones

Closes #29626
2018-04-19 21:35:21 -04:00
Ryan Ernst 7975280383
Remove remaining tribe node references (#29574)
While tribe node was removed in
https://github.com/elastic/elasticsearch/pull/28443, there remained a
couple lingering references to it in docs and code. This commit removes
those remaining references.
2018-04-19 18:02:01 -07:00
Nhat Nguyen 9cf8b01fc4
Never leave stale delete tombstones in version map (#29619)
Today the VersionMap does not clean up a stale delete tombstone if it
does not require safe access. However, in a very rare situation due to
concurrent refreshes, the safe-access flag may be flipped over then an
engine accidentally consult that stale delete tombstone.

This commit ensures to never leave stale delete tombstones in a version
map by always pruning delete tombstones when putting a new index entry
regardless of the value of the safe-access flag.
2018-04-19 20:49:56 -04:00
Jason Tedor d1670a18e4
Do not serialize common stats flags using ordinal (#29600)
This commit remove serializing of common stats flags via its enum
ordinal and uses an explicit index defined on the enum. This is to
enable us to remove an unused flag (Suggest) without ruining the
ordering and thus breaking serialization.
2018-04-19 20:12:24 -04:00
Jason Tedor a829d920ee
Remove stale comment from JVM stats (#29625)
We removed catched throwable from the code base and left behind was a
comment about catching InternalError in MemoryManagementMXBean. We are
not going to catch InternalError here as we expect that to be
fatal. This commit removes that stale comment.
2018-04-19 19:56:03 -04:00
Nhat Nguyen 293f85cd52 TEST: Mute testPrimaryRelocationWhileIndexing
AwaitsFix #29626
2018-04-19 19:15:30 -04:00
Jason Tedor 5d767e449a
Remove bulk fallback for write thread pool (#29609)
The name of the bulk thread pool was renamed to "write" with "bulk" as a
fallback name. This change was made in 6.x for BWC reasons yet in 7.0.0
we are removing this fallback. This commit removes this fallback for the
write thread pool.
2018-04-19 16:59:58 -04:00
Julie Tibshirani 113d1d3eab Fix an incorrect reference to 'zero_terms_docs' in match_phrase queries. 2018-04-19 13:24:14 -07:00
Julie Tibshirani 48461ac143 Update the version compatibility for zero_terms_query in match_phrase.
The change was just backported to 6.x.
2018-04-19 13:20:44 -07:00
Nhat Nguyen 955709b3f3 Account translog location to ram usage in version map
This commit accounts a translog location's ram usage in version map.
2018-04-19 16:05:33 -04:00
Julie Tibshirani b9e1a00213
Add support to match_phrase query for zero_terms_query. (#29598) 2018-04-19 11:25:27 -07:00
Julie Tibshirani 00d88a5d3e
Fix incorrect references to 'zero_terms_docs' in query parsing error messages. (#29599) 2018-04-19 11:02:49 -07:00