If we close the shard before the engine is started we see a NPE in
the logs which is not problematic since the relevant parts are in a
finally block. Yet, the NPE is unnecessary and can be confusing.
Today, only the NettyTransportChannel implements the getProfileName method
and the other channel implementations do not. The profile name is useful for some
plugins to perform custom actions based on the name. Rather than checking the
type of the channel, it makes sense to always expose the profile name.
For DirectResponseChannels we use a name that cannot be used in the settings
to define another profile with that name. For LocalTransportChannel we use the
same name as the default profile.
Closes#10483
When mapping updates happen concurrently with document parsing, bad things can
happen. For instance, when applying a mapping update we first update the Mapping
object which is used for parsing and then FieldNameAnalyzer which is used by
IndexWriter for analysis. So if you are unlucky, it could happen that a document
was parsed successfully without introducing dynamic updates yet IndexWriter does
not see its analyzer yet.
In order to fix this issue, mapping updates are now protected by a write lock
and document parsing is protected by the read lock associated with this write
lock. This ensures that no documents will be parsed while a mapping update is
being applied, so document parsing will either see none of the update or all of
it.
We recently introduced support for reading and writing list of strings as part of #11056, but that was an oversight, we should be using arrays instead.
Closes#11276
The downside of having createTypeUids accept a list only is that if you do provide a collection nothing breaks at compile time, but you end up calling the same method that accepts an object as second argument. Renamed both methods to avoid clashes to `createUidsForTypesAndId` and `createUidsForTypesAndIds`. The latter accepts now a Collection of Objects rather than just a List.
Closes#11263
We fetch the state version to find the right shard to be started as
the primary. This can return a valid shard state even if the shard is
corrupted and can't even be opened. This commit adds best effort detection
for this scenario and returns an invalid version for the shard if it's corrupted
Closes#11226
Similar to the batching of "shards-started" actions, this commit implements batching of snapshot status updates. This is useful when backing up many indices as the cluster state does not need to be republished as many times.
Closes#10295
This clarifies some of the uses of names, so that the ambiguous
"name" is mostly no longer used (does this include path or not?).
sourcePath is also removed as it was not used. Not all the
uses of .name() have been cleaned up because Mapper still has
this, and ObjectMapper depends on it returning the short name,
but I would like to leave finishing that cleanup for a future issue.
Today, when a primary shard is not allocated we go to all the nodes to find where it is allocated (listing its started state). When we allocate a replica shard, we head to all the nodes and list its store to allocate the replica on a node that holds the closest matching index files to the primary.
Those two operations today execute synchronously within the GatewayAllocator, which means they execute in a sync manner within the cluster update thread. For large clusters, or environments with very slow disk, those operations will stall the cluster update thread, making it seem like its stuck.
Worse, if the FS is really slow, we timeout after 30s the operation (to not stall the cluster update thread completely). This means that we will have another run for the primary shard if we didn't find one, or we won't find the best node to place a shard since it might have timed out (listing stores need to list all files and read the checksum at the end of each file).
On top of that, this sync operation happen one shard at a time, so its effectively compounding the problem in a serial manner the more shards we have and the slower FS is...
This change moves to perform both listing the shard started states and the shard stores to an async manner. During the allocation by the GatewayAllocator, if data needs to be fetched from a node, it is done in an async fashion, with the response triggering a reroute to make sure the results will be taken into account. Also, if there are on going operations happening, the relevant shard data will not be taken into account until all the ongoing listing operations are done executing.
The execution of listing shard states and stores has been moved to their own respective thread pools (scaling, so will go down to 0 when not needed anymore, unbounded queue, since we don't want to timeout, just let it execute based on how fast the local FS is). This is needed sine we are going to blast nodes with a lot of requests and we need to make sure there is no thread explosion.
This change also improves the handling of shard failures coming from a specific node. Today, those nodes were ignored from allocation only for the single reroute round. Now, since fetching is async, we need to keep those failures around at least until a single successful fetch without the node is done, to make sure not to repeat allocating to the failed node all the time.
Note, if before the indication of slow allocation was high pending tasks since the allocator was waiting for responses, not the pending tasks will be much smaller. In order to still indicate that the cluster is in the middle of fetching shard data, 2 attributes were added to the cluster health API, indicating the number of ongoing fetches of both started shards and shard store.
closes#9502closes#11101
There is a feature available in S3 that clients can use to ensure data integrity on upload. Whenever an object is PUT to an S3 bucket, the client is able to get back the `MD5` base64 encoded and check that it's the same `MD5` as the local one.
For reference, please see the [S3 PutObject API](http://docs.aws.amazon.com/AmazonS3/latest/API/RESTObjectPUT.html).
Closes#186.
The default for reuse address was null on Windows and casting to a boolean would
result in a NPE. This makes the default on Windows false and changes the return
type to a boolean and removes the need to check for nulls.
This also fixes a potential race condition when the number of docs
is compared across shards with the same seal ID since the assertion
was taking the number of docs form the live index reader which might
not be equivalent to the committed num docs.