Today we check if the DocIdSet we filter by is `fast` but the check fails
if the DocIdSet if wrapped in an `ApplyAcceptedDocsFilter` which is always
the case if the index has deleted documents. This commit unwraps
the original DocIdSet in the case of deleted documents.
Closes#6247
AllTokenStream, used to index the _all field, adds some overhead, but
it's not necessary when no fields were boosted or when positions are
not indexed the _all field.
Closes#6187Closes#6219
This adds support for sending a list of benchmark names and/or wildcard
patterns when submitting an abort request. Standard wildcards such as:
"*", "*xxx", and "xxx*" are supported. Multiple benchmark names and
wildcard patterns can be submitted together as a comma-separated list
from the REST API or as a string array from the Java API.
Closes#6185
This change adds a new cluster state that waits for the replication of a shard to finish before starting snapshotting process. Because this change adds a new snapshot state, an pre-1.2.0 nodes will not be able to join the 1.2.0 cluster that is currently running snapshot/restore operation.
Closes#5531
This is yet another safety guard to make sure we don't delete data if the local copy is the only one (even if it's not part of the cluster state any more)
Closes#6191
In some places we want to delay the start of a shard recovery because the source node is not ready to receive. At the moment the retry logic ignores the time delay parameter (`retryAfter`) causing a busy waiting like scenario. This is fixed in this commit.
Closes#6226
This commit upgrades to the latest Lucene 4.8.1 release including the
following bugfixes:
* An IndexThrottle now kicks in when merges start falling behind
limiting index threads to 1 until merges caught up. Closes#6066
* RateLimiter now kicks in at the configured rate where previously
the limiter was limiting at ~8MB/sec almost all the time. Closes#6018
Just follow "lowercase" token filter example and register "uppercase" token filter as an exposed token filter. This will not, by itself, test whether ES is correctly handling "uppercase" TF; this is more of a "code as documentation" fix.
This commit add `dummy docs` to `ElasticsearchIntegrationTest#indexRandom`.
It indexes document with an empty body into the indices specified by the docs
and deletes them after all docs have been indexed. This produces gaps in
the segments and enforces usage of accept docs on lower levels to ensure
the features work with delete documents as well.
We currently ask `MatchDocIdSet#matchDoc(int)` before consulting
the accept docs. This can also have a negative performance impact
since `matchDoc(int)` calls might be way more expensive than
acceptDocs calls.
Closes#6234
FilterableTermsEnum allows to filter stats by supplying per segment
bits. Today if all docs are filtered out the term is still reported as
live but shouldn't.
Relates to #6211
because the shard the get request for 'test' index, 'type1' type and id 1 is getting executed on may not be in a started state
and also added more logging.
This is an update for the _cat/recovery API documentation. The examples
have been updated. Removed the bottom paragraph explaining why there
could be values > 100%. This can no longer happen so that had to be
removed.
Closes#6159
The `testCompareMoreLikeThisDSLWithAPI` test compares results from query
and API which might query different shards. Those shares might use
different doc IDs internally to disambiguate. This commit resorts the
results and compares them after stable disambiguation.
These calls were introduced in pr #6149 as a backward compatibility layer for the previous value of `Versions.MATCH_ANY`. This is not needed as the translog never contains these values. On top of that, the calls are not effective as the stream the translog used is effectively not versioned (versioining is done on an item by item basis)
The syntax to specify one or more items is the same as for the Multi GET API.
If only one document is specified, the results returned are the same as when
using the More Like This API.
Relates #4075Closes#5857
The retry mechanism in the transport layer might cause the delete snapshot request to be executed twice if the cluster master is closed while the request is executed. First time delete snapshot request is getting successfully executed on the old master and then it is retried on the newly elected master. When the new master tries to delete the snapshot - the snapshot no longer exists (since it was successfully deleted by the old master) and SnapshotMissingException is returned.
Currently even if some shards of the snapshot are not snapshotted successfully, the snapshot is still marked as "SUCCESS". Users may miss the fact the there are shard failures present in the snapshot and think that snapshot was completed. This change adds a new snapshot state "PARTIAL" that provides a quick indication that the snapshot was only partially successful.
Closes#5792
Until now all version types have officially required the version to be a positive long number. Despite of this has being documented, ES versions <=1.0 did not enforce it when using the `external` version type. As a result people have succesfully indexed documents with 0 as a version. In 1.1. we introduced validation checks on incoming version values and causing indexing request to fail if the version was set to 0. While this is strictly speaking OK, we effectively have a situation where data already indexed does not match the version invariant.
To be lenient and adhere to spirit of our data backward compatibility policy, we have decided to allow 0 as a valid external version type. This is somewhat complicated as 0 is also the internal value of `MATCH_ANY`, which indicates requests should succeed regardles off the current doc version. To keep things simple, this commit changes the internal value of `MATCH_ANY` to `-3` for all version types.
Since we're doing this in a minor release (and because versions are stored in the transaction log), the default `internal` version type still accepts 0 as a `MATCH_ANY` value. This is not a problem for other version types as `MATCH_ANY` doesn't make sense in that context.
Closes#5662
We try to reuse character arrays and UTF8 writers with softreferences.
SoftReferences have negative impact on GC and should be avoided in
general. Yet in this case it can simply replaced with a per-stream
Bytes/CharsRef that is thread local and has the same lifetime as the
stream.