Commit Graph

1352 Commits

Author SHA1 Message Date
Adrien Grand 1b660821a2
Allow `_doc` as a type. (#27816)
Allowing `_doc` as a type will enable users to make the transition to 7.0
smoother since the index APIs will be `PUT index/_doc/id` and `POST index/_doc`.
This also moves most of the documentation to `_doc` as a type name.

Closes #27750
Closes #27751
2017-12-14 17:47:53 +01:00
Daniel Mitterdorfer d26b33dea2 Mute VersionUtilsTest#testGradleVersionsMatchVersionUtils
Relates #27815
2017-12-14 12:33:41 +01:00
Nhat Nguyen 57fc705d5e
Keep commits and translog up to the global checkpoint (#27606)
We need to keep index commits and translog operations up to the current 
global checkpoint to allow us to throw away unsafe operations and
increase the operation-based recovery chance. This is achieved by a new
index deletion policy.

Relates #10708
2017-12-12 19:20:08 -05:00
Tim Brooks d1acb7697b
Remove internal channel tracking in transports (#27711)
This commit attempts to continue unifying the logic between different
transport implementations. As transports call a `TcpTransport` callback
when a new channel is accepted, there is no need to internally track
channels accepted. Instead there is a set of accepted channels in
`TcpTransport`. This set is used for metrics and shutting down channels.
2017-12-08 16:56:53 -07:00
Tim Brooks d82c40d35c
Implement byte array reusage in `NioTransport` (#27696)
This is related to #27563. This commit modifies the
InboundChannelBuffer to support releasable byte pages. These byte
pages are provided by the PageCacheRecycler. The PageCacheRecycler
must be passed to the Transport with this change.
2017-12-08 10:39:30 -07:00
Tim Brooks da5f52a2fc
Add test for writer operation buffer accounting (#27707)
This is a follow up to #27695. This commit adds a test checking that
across multiple writes using multiple buffers, a write operation
properly keeps track of which buffers still need to be written.
2017-12-07 12:48:49 -07:00
Christoph Büscher b83e14858a Correcting some minor typos in comments 2017-12-07 16:39:23 +01:00
Tim Brooks 5b3230cbae
Fix issue where the incorrect buffers are written (#27695)
This is a followup to #27551. That commit introduced a bug where the
incorrect byte buffers would be returned when we attempted a write. This
commit fixes the logic.
2017-12-06 20:57:46 -07:00
Tim Brooks 2aa62daed4
Introduce resizable inbound byte buffer (#27551)
This is related to #27563. In order to interface with java nio, we must
have buffers that are compatible with ByteBuffer. This commit introduces
a basic ByteBufferReference to easily allow transferring bytes off the
wire to usage in the application.

Additionally it introduces an InboundChannelBuffer. This is a buffer
that can internally expand as more space is needed. It is designed to
be integrated with a page recycler so that it can internally reuse pages.
The final piece is moving all of the index work for writing bytes to a
channel into the WriteOperation.
2017-12-06 11:02:25 -07:00
Jim Ferenczi caea6b70fa
Add a new cluster setting to limit the total number of buckets returned by a request (#27581)
This commit adds a new dynamic cluster setting named `search.max_buckets` that can be used to limit the number of buckets created per shard or by the reduce phase. Each multi bucket aggregator can consume buckets during the final build of the aggregation at the shard level or during the reduce phase (final or not) in the coordinating node. When an aggregator consumes a bucket, a global count for the request is incremented and if this number is greater than the limit an exception is thrown (TooManyBuckets exception).
This change adds the ability for multi bucket aggregator to "consume" buckets in the global limit, the default is 10,000. It's an opt-in consumer so each multi-bucket aggregator must explicitly call the consumer when a bucket is added in the response.

Closes #27452 #26012
2017-12-06 09:15:28 +01:00
Luca Cavanna f4fb4d3bf5
Add support for filtering mappings fields (#27603)
Add support for filtering fields returned as part of mappings in get index, get mappings, get field mappings and field capabilities API.

Plugins can plug in their own function, which receives the index as argument, and return a predicate which controls whether each field is included or not in the returned output.
2017-12-05 20:31:29 +01:00
Jason Tedor 42a4ad35da
Add node name to thread pool executor name
This commit adds the node name to the names of thread pool executors so
that the node name is visible in rejected execution exception messages.

Relates #27663
2017-12-05 07:45:40 -05:00
Lee Hinman 1ff5ef9055 [TEST] Check accounting breaker is equal to segment stats rather than 0
If there are existing indices, it may not be 0
2017-12-04 14:15:23 -07:00
Simon Willnauer 84ec472428
Include internal refreshes in refresh stats (#27615)
Today we exclude internal refreshes in the refresh stats. Yet, it's very much
confusing to not take these into account. This change includes internal refreshes
into the stats until we have a dedicated stats for this.
2017-12-04 16:33:47 +01:00
Boaz Leskes f58a3d0b96 testRelocationWithConcurrentIndexing: wait for green (on relevan index) and shard initialization to settle down before starting relocation 2017-12-04 13:18:42 +01:00
Boaz Leskes 1a976ea7a4 Cherry pick tests and seqNo recovery hardning from #27580 2017-12-04 13:15:40 +01:00
James Baiera e16f1271b6
Fix SecurityException when HDFS Repository used against HA Namenodes (#27196)
* Sense HA HDFS settings and remove permission restrictions during regular execution.

This PR adds integration tests for HA-Enabled HDFS deployments, both regular and secured. 
The Mini HDFS fixture has been updated to optionally run in HA-Mode. A new test suite has 
been added for reproducing the effects of a Namenode failing over during regular repository 
usage. Going forward, the HDFS Repository will still be subject to its self imposed permission 
restrictions during normal use, but will no longer restrict them when running against an HA 
enabled HDFS cluster. Instead, the plugin will rely on the provided security policy and not 
further restrict the permissions so that the transparent operation to failover to a different 
Namenode in the client does not raise security exceptions. Additionally, we are now testing the 
secure mode with SASL based wire encryption of data between Elasticsearch and HDFS. This 
includes a missing library (commons codec) in order to support this change.
2017-12-01 14:26:05 -05:00
Lee Hinman 623d3700f0
Add accounting circuit breaker and track segment memory usage (#27116)
* Add accounting circuit breaker and track segment memory usage

This commit adds a new circuit breaker "accounting" that is used for tracking
the memory usage of non-request-tied memory users. It also adds tracking for the
amount of Lucene segment memory used by a shard as a user of the new circuit
breaker.

The Lucene segment memory is updated when the shard refreshes, and removed when
the shard relocates away from a node or is deleted. It should also be noted that
all tracking for segment memory uses `addWithoutBreaking` so as not to fail the
shard if a limit is reached.

The `accounting` breaker has a default limit of 100% and will contribute to the
parent breaker limit.

Resolves #27044
2017-12-01 07:59:45 -07:00
Luca Cavanna 3e8ca38fca
Deprecate the transport client in favour of the high-level REST client (#27085) 2017-12-01 12:24:16 +01:00
Tim Brooks b8557651aa
Add exception handling for write listeners (#27590)
This potential issue was exposed when I saw this PR #27542. Essentially
we currently execute the write listeners all over the place without
consistently catching and handling exceptions. Some of these exceptions
will be logged in different ways (including as low as `debug`).

This commit adds a single location where these listeners are executed.
If the listener throws an execption, the exception is caught and logged
at the `warn` level.
2017-11-29 15:47:12 -07:00
Jason Tedor 11d46ad2aa Merge branch 'master' into ccr
* master:
  Skip shard refreshes if shard is `search idle` (#27500)
  Remove workaround in translog rest test (#27530)
  inner_hits: Return an empty _source for nested inner hit when filtering on a field that doesn't exist.
  percolator: Avoid TooManyClauses exception if number of terms / ranges is exactly equal to 1024
  Dedup translog operations by reading in reverse (#27268)
  Ensure logging is configured for CLI commands
  Ensure `doc_stats` are changing even if refresh is disabled (#27505)
  Fix classes that can exit
  Revert "Adjust CombinedDeletionPolicy for multiple commits (#27456)"
  Transpose expected and actual, and remove duplicate info from message. (#27515)
  [DOCS] Fixed broken link in breaking changes
2017-11-27 12:48:39 -05:00
David Turner 00867e618d
Transpose expected and actual, and remove duplicate info from message. (#27515)
Previously:
```
   > Throwable #1: java.lang.AssertionError: Expected all shards successful but got successful [8] total [9]
   > Expected: <8>
   >      but: was <9>
```

Now:
```
   > Throwable #1: java.lang.AssertionError: Expected all shards successful
   > Expected: <9>
   >      but: was <8>
```
2017-11-24 17:45:34 +00:00
Martijn van Groningen 69bc97d73f
Merge remote-tracking branch 'es/master' into ccr
* es/master: (38 commits)
  Backport wait_for_initialiazing_shards to cluster health API
  Carry over version map size to prevent excessive resizing (#27516)
  Fix scroll query with a sort that is a prefix of the index sort (#27498)
  Delete shard store files before restoring a snapshot (#27476)
  Replace `delimited_payload_filter` by `delimited_payload` (#26625)
  CURRENT should not be a -SNAPSHOT version if build.snapshot is false (#27512)
  Fix merging of _meta field (#27352)
  Remove unused method (#27508)
  unmuted test, this has been fixed by #27397
  Consolidate version numbering semantics (#27397)
  Add wait_for_no_initializing_shards to cluster health API (#27489)
  [TEST] use routing partition size based on the max routing shards of the second split
  Adjust CombinedDeletionPolicy for multiple commits (#27456)
  Update composite-aggregation.asciidoc
  Deprecate `levenstein` in favor of `levenshtein` (#27409)
  Automatically prepare indices for splitting (#27451)
  Validate `op_type` for `_create` (#27483)
  Minor ShapeBuilder cleanup
  muted test
  Decouple nio constructs from the tcp transport (#27484)
  ...
2017-11-24 15:58:52 +01:00
Tanguy Leroux 5dc5580eac
Delete shard store files before restoring a snapshot (#27476)
Pull request #20220 added a change where the store files
that have the same name but are different from the ones in the
snapshot are deleted first before the snapshot is restored.
This logic was based on the `Store.RecoveryDiff.different`
set of files which works by computing a diff between an
existing store and a snapshot.

This works well when the files on the filesystem form valid
shard store, ie there's a `segments` file and store files
are not corrupted. Otherwise, the existing store's snapshot
metadata cannot be read (using Store#snapshotStoreMetadata())
and an exception is thrown
(CorruptIndexException, IndexFormatTooOldException etc) which
is later caught as the begining of the restore process
(see RestoreContext#restore()) and is translated into
an empty store metadata (Store.MetadataSnapshot.EMPTY).

This will make the deletion of different files introduced
in #20220 useless as the set of files will always be empty
even when store files exist on the filesystem. And if some
files are present within the store directory, then restoring
a snapshot with files with same names will fail with a
FileAlreadyExistException.

This is part of the #26865 issue.

There are various cases were some files could exist in the
 store directory before a snapshot is restored. One that
Igor identified is a restore attempt that failed on a node
and only first files were restored, then the shard is allocated
again to the same node and the restore starts again (but fails
 because of existing files). Another one is when some files
of a closed index are corrupted / deleted and the index is
restored.

This commit adds a test that uses the infrastructure provided
by IndexShardTestCase in order to test that restoring a shard
succeed even when files with same names exist on filesystem.

Related to #26865
2017-11-24 13:15:34 +01:00
Martijn van Groningen f1ebf366bf
unmuted test, this has been fixed by #27397
Closes #27497
2017-11-24 08:53:00 +01:00
David Turner 89ba8996c6 Consolidate version numbering semantics (#27397)
Fixes to the build system, particularly around BWC testing, and to make future
version bumps less painful.
2017-11-23 20:21:53 +00:00
Martijn van Groningen ca9c476d88
muted test 2017-11-22 19:18:35 +01:00
Tim Brooks ef34555b29
Decouple nio constructs from the tcp transport (#27484)
This is related to #27260. Currently, basic nio constructs (nio
channels, the channel factories, selector event handlers, etc) implement
logic that is specific to the tcp transport. For example, NioChannel
implements the TcpChannel interface. These nio constructs at some point
will also need to support other protocols (ex: http).

This commit separates the TcpTransport logic from the nio building
blocks.
2017-11-22 11:39:31 -06:00
Jim Ferenczi 6319424e4a
Move composite aggregation to core (#27474)
This change removes the module named aggs-composite and adds the `composite` aggs
as a core aggregation. This allows other plugins to use this new aggregation
and simplifies the integration in the HL rest client.
2017-11-21 13:31:01 +01:00
Jason Tedor f2eaacb9b6 Merge branch 'master' into ccr
* master:
  Fix resync request serialization
  Fix issue where pages aren't released (#27459)
  Add YAML REST tests for filters bucket agg (#27128)
  Remove tcp profile from low level nio channel (#27441)
2017-11-20 20:58:23 -05:00
Tim Brooks f37eb1b403
Remove tcp profile from low level nio channel (#27441)
This is related to #27260. Currently every nio channel has a profile
field. Profile is a concept that only relates to the tcp transport. Http
channels will not have profiles. This commit moves the profile from the
nio channel to the read context. The context is the level that protocol
specific features and logic should live.
2017-11-20 12:20:42 -07:00
Jason Tedor 58591f2d16 Merge branch 'master' into ccr
* master: (31 commits)
  [TEST] Fix `GeoShapeQueryTests#testPointsOnly` failure
  Transition transport apis to use void listeners (#27440)
  AwaitsFix GeoShapeQueryTests#testPointsOnly #27454
  Bump test version after backport
  Ensure nested documents have consistent version and seq_ids (#27455)
  Tests: Add Fedora-27 to packaging tests
  Delete some seemingly unused exceptions (#27439)
  #26800: Fix docs rendering
  Remove config prompting for secrets and text (#27216)
  Move the CLI into its own subproject (#27114)
  Correct usage of "an" to "a" in getting started docs
  Avoid NPE when getting build information
  Removes BWC snapshot status handler used in 6.x (#27443)
  Remove manual tracking of registered channels (#27445)
  Remove parameters on HandshakeResponseHandler (#27444)
  [GEO] fix pointsOnly bug for MULTIPOINT
  Standardize underscore requirements in parameters (#27414)
  peanut butter hamburgers
  Log primary-replica resync failures
  Uses TransportMasterNodeAction to update shard snapshot status (#27165)
  ...
2017-11-20 13:17:20 -05:00
Tim Brooks 0a8f48d592
Transition transport apis to use void listeners (#27440)
Currently we use ActionListener<TcpChannel> for connect, close, and send
message listeners in TcpTransport. However, all of the listeners have to
capture a reference to a channel in the case of the exception api being
called. This commit changes these listeners to be type <Void> as passing
the channel to onResponse is not necessary. Additionally, this change
makes it easier to integrate with low level transports (which use
different implementations of TcpChannel).
2017-11-20 10:47:47 -07:00
Michael Basnight 2949c53174
Remove config prompting for secrets and text (#27216)
This commit removes the ability to use ${prompt.secret} and
${prompt.text} as valid config settings. Secure settings has obsoleted
the need for this, and it cleans up some of the code in Bootstrap.
2017-11-19 22:33:17 -06:00
Michael Basnight cb3e8f4763
Move the CLI into its own subproject (#27114)
Projects the depend on the CLI currently depend on core. This should not
always be the case. The EnvironmentAwareCommand will remain in :core,
but the rest of the CLI components have been moved into their own
subproject of :core, :core:cli.
2017-11-18 21:42:57 -06:00
Tim Brooks ce45e29be7
Remove manual tracking of registered channels (#27445)
This is related to #27260. Currently, every ESSelector keeps track of
all channels that are registered with it. ESSelector is just an
abstraction over a raw java nio selector. The java nio selector already
tracks its own selection keys. This commit removes our tracking and
relies on the java nio selector tracking.
2017-11-17 16:20:09 -07:00
David Turner 08a257327f
Remove newline from log message (#27425)
It leads to harder-to-parse logs that look like this:

```
  1> [2017-11-16T20:46:21,804][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type
  1>  with value application/json
  1> [2017-11-16T20:46:21,812][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type
  1>  with value application/json
  1> [2017-11-16T20:46:21,820][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type
  1>  with value application/json
  1> [2017-11-16T20:46:21,966][INFO ][o.e.t.r.y.ClientYamlTestClient] Adding header Content-Type
  1>  with value application/json
```
2017-11-17 14:12:06 +00:00
Tim Brooks f761a0e0e4
Remove unneeded Throwable handling in nio (#27412)
This is related to #27260. In the nio transport work we do not catch or
handle `Throwable`. There are a few places where we have exception
handlers that accept `Throwable`. This commit removes those cases.
2017-11-16 18:24:06 -07:00
David Turner 9766b858d0
Prepare for bump to 6.0.1 on the master branch (#27391)
An assortment of fixes, particularly to version number calculations, in preparation for the bump to 6.0.1.
2017-11-16 18:38:54 +00:00
Tim Brooks 80ef9bbdb1
Remove parameterization from TcpTransport (#27407)
This commit is a follow up to the work completed in #27132. Essentially
it transitions two more methods (sendMessage and getLocalAddress) from
Transport to TcpChannel. With this change, there is no longer a need for
TcpTransport to be aware of the specific type of channel a transport
returns. So that class is no longer parameterized by channel type.
2017-11-16 11:19:36 -07:00
Tim Brooks 35a5922927
Delete unneeded nio client (#27408)
This is a follow up to #27132. As that PR greatly simplified the
connection logic inside a low level transport implementation, much of
the functionality provided by the NioClient class is no longer
necessary. This commit removes that class.
2017-11-16 09:22:40 -07:00
Jason Tedor c1e22572b3 Merge branch 'master' into ccr
* master:
  Stop skipping REST test after backport of #27056
  Fix default value of ignore_unavailable for snapshot REST API (#27056)
  Add composite aggregator (#26800)
  Fix `ShardSplittingQuery` to respect nested documents. (#27398)
  [Docs] Restore section about multi-level parent/child relation in parent-join (#27392)
  Add TcpChannel to unify Transport implementations (#27132)
  Add note on plugin distributions in plugins folder
  Remove implementations of `TransportChannel` (#27388)
  Update Google SDK to version 1.23 (#27381)
  Fix Gradle 4.3.1 compatibility for logging (#27382)
  [Test] Change Elasticsearch startup timeout to 120s in packaging tests
  Docs/windows installer (#27369)
2017-11-16 11:06:09 -05:00
Jim Ferenczi 623367d793
Add composite aggregator (#26800)
* This change adds a module called `aggs-composite` that defines a new aggregation named `composite`.
The `composite` aggregation is a multi-buckets aggregation that creates composite buckets made of multiple sources.
The sources for each bucket can be defined as:
  * A `terms` source, values are extracted from a field or a script.
  * A `date_histogram` source, values are extracted from a date field and rounded to the provided interval.
This aggregation can be used to retrieve all buckets of a deeply nested aggregation by flattening the nested aggregation in composite buckets.
A composite buckets is composed of one value per source and is built for each document as the combinations of values in the provided sources.
For instance the following aggregation:

````
"test_agg": {
  "terms": {
    "field": "field1"
  },
  "aggs": {
    "nested_test_agg":
      "terms": {
        "field": "field2"
      }
  }
}
````
... which retrieves the top N terms for `field1` and for each top term in `field1` the top N terms for `field2`, can be replaced by a `composite` aggregation in order to retrieve **all** the combinations of `field1`, `field2` in the matching documents:

````
"composite_agg": {
  "composite": {
    "sources": [
      {
	"field1": {
          "terms": {
              "field": "field1"
            }
        }
      },
      {
	"field2": {
          "terms": {
            "field": "field2"
          }
        }
      },
    }
  }
````

The response of the aggregation looks like this:

````
"aggregations": {
  "composite_agg": {
    "buckets": [
      {
        "key": {
          "field1": "alabama",
          "field2": "almanach"
        },
        "doc_count": 100
      },
      {
        "key": {
          "field1": "alabama",
          "field2": "calendar"
        },
        "doc_count": 1
      },
      {
        "key": {
          "field1": "arizona",
          "field2": "calendar"
        },
        "doc_count": 1
      }
    ]
  }
}
````

By default this aggregation returns 10 buckets sorted in ascending order of the composite key.
Pagination can be achieved by providing `after` values, the values of the composite key to aggregate after.
For instance the following aggregation will aggregate all composite keys that sorts after `arizona, calendar`:

````
"composite_agg": {
  "composite": {
    "after": {"field1": "alabama", "field2": "calendar"},
    "size": 100,
    "sources": [
      {
	"field1": {
          "terms": {
            "field": "field1"
          }
        }
      },
      {
	"field2": {
          "terms": {
            "field": "field2"
          }
	}
      }
    }
  }
````

This aggregation is optimized for indices that set an index sorting that match the composite source definition.
For instance the aggregation above could run faster on indices that defines an index sorting like this:

````
"settings": {
  "index.sort.field": ["field1", "field2"]
}
````

In this case the `composite` aggregation can early terminate on each segment.
This aggregation also accepts multi-valued field but disables early termination for these fields even if index sorting matches the sources definition.
This is mandatory because index sorting picks only one value per document to perform the sort.
2017-11-16 15:13:36 +01:00
Tim Brooks ca11085bb6
Add TcpChannel to unify Transport implementations (#27132)
Right now our different transport implementations must duplicate
functionality in order to stay compliant with the requirements of
TcpTransport. They must all implement common logic to open channels,
close channels, keep track of channels for eventual shutdown, etc.

Additionally, there is a weird and complicated relationship between
Transport and TransportService. We eventually want to start merging
some of the functionality between these classes.

This commit starts moving towards a world where TransportService retains
all the application logic and channel state. Transport implementations
in this world will only be tasked with returning a channel when one is
requested, calling transport service when a channel is accepted from
a server, and starting / stopping itself.

Specifically this commit changes how channels are opened and closed. All
Transport implementations now return a channel type that must comply with
the new TcpChannel interface. This interface has the methods necessary
for TcpTransport to completely manage the lifecycle of a channel. This
includes setting the channel up, waiting for connection, adding close
listeners, and eventually closing.
2017-11-15 12:38:39 -07:00
Martijn van Groningen 252371668e
Merge remote-tracking branch 'es/master' into ccr
* es/master:
  Logging: Unify log rotation for index/search slow log (#27298)
  wildcard query on _index (#27334)
  REST spec: Validate that api name matches file name that contains it (#27366)
  Revert "Reduce synchronization on field data cache"
  Create new handlers for every new request in GoogleCloudStorageService (#27339)
  Rest test fixes (#27354)
2017-11-15 11:16:44 +01:00
Luca Cavanna 382da0f227
REST spec: Validate that api name matches file name that contains it (#27366)
This commit validates that each spec json file contains an API that has the same name as the file
2017-11-14 14:53:00 +01:00
Martijn van Groningen ba0b7079f9
Merge remote-tracking branch 'es/master' into ccr
* es/master: (24 commits)
  Reduce synchronization on field data cache
  add json-processor support for non-map json types (#27335)
  Properly format IndexGraveyard deletion date as date (#27362)
  Upgrade AWS SDK Jackson Databind to 2.6.7.1
  Stop responding to ping requests before master abdication (#27329)
  [Test] Fix POI version in packaging tests
  Allow affix settings to specify dependencies (#27161)
  Tests: Improve size regex in documentation test (#26879)
  reword comment
  Remove unnecessary logger creation for doc values field data
  [Geo] Decouple geojson parse logic from ShapeBuilders
  [DOCS] Fixed link to docker content
  Plugins: Add versionless alias to all security policy codebase properties (#26756)
  [Test] #27342 Fix SearchRequests#testValidate
  [DOCS] Move X-Pack-specific Docker content (#27333)
  Fail queries with scroll that explicitely set request_cache (#27342)
  [Test] Fix S3BlobStoreContainerTests.testNumberOfMultiparts()
  Set minimum_master_nodes to all nodes for REST tests (#27344)
  [Tests] Relax allowed delta in extended_stats aggregation (#27171)
  Remove S3 output stream (#27280)
  ...
2017-11-14 09:03:52 +01:00
Simon Willnauer 2299c70371
Allow affix settings to specify dependencies (#27161)
We use affix settings to group settings / values under a certain namespace.
In some cases like login information for instance a setting is only valid if
one or more other settings are present. For instance `x.test.user` is only valid
if there is an `x.test.passwd` present and vice versa. This change allows to specify
such a dependency to prevent settings updates that leave settings in an inconsistent
state.
2017-11-13 12:06:36 +01:00
Jason Tedor 2032881269 Merge branch 'master' into ccr
* master: (22 commits)
  Update Tika version to 1.15
  Aggregations: bucket_sort pipeline aggregation (#27152)
  Introduce templating support to timezone/locale in DateProcessor (#27089)
  Increase logging on qa:mixed-cluster tests
  Update to AWS SDK 1.11.223 (#27278)
  Improve error message for parse failures of completion fields (#27297)
  Ensure external refreshes will also refresh internal searcher to minimize segment creation (#27253)
  Remove optimisations to reuse objects when applying a new `ClusterState` (#27317)
  Decouple `ChannelFactory` from Tcp classes (#27286)
  Fix find remote when building BWC
  Remove colons from task and configuration names
  Add unreleased 5.6.5 version number
  testCreateSplitIndexToN: do not set `routing_partition_size` to >= `number_of_routing_shards`
  Snapshot/Restore: better handle incorrect chunk_size settings in FS repo (#26844)
  Add limits for ngram and shingle settings (#27211) (#27318)
  Correct comment in index shard test
  Roll translog generation on primary promotion
  ObjectParser: Replace IllegalStateException with ParsingException (#27302)
  scripted_metric _agg parameter disappears if params are provided (#27159)
  Update discovery-ec2.asciidoc
  ...
2017-11-09 15:21:44 -05:00
Jason Tedor ddd3ed4f22
Additional engine refactoring
A previous refactoring exposed some protected methods. However, there is
not currently a need to be exposing these methods so publicly so we pull
one of them back to being package-private and we remove the other one.

Relates #27324
2017-11-09 15:17:11 -05:00
Simon Willnauer a34c2f0b8d
Ensure external refreshes will also refresh internal searcher to minimize segment creation (#27253)
We cut over to internal and external IndexReader/IndexSearcher in #26972 which uses
two independent searcher managers. This has the downside that refreshes of the external
reader will never clear the internal version map which in-turn will trigger additional
and potentially unnecessary segment flushes since memory must be freed. Under heavy
indexing load with low refresh intervals this can cause excessive segment creation which
causes high GC activity and significantly increases the required segment merges.

This change adds a dedicated external reference manager that delegates refreshes to the
internal reference manager that then `steals` the refreshed reader from the internal
reference manager for external usage. This ensures that external and internal readers
are consistent on an external refresh. As a sideeffect this also releases old segments
referenced by the internal reference manager which can potentially hold on to already merged
away segments until it is refreshed due to a flush or indexing activity.
2017-11-09 08:40:22 +00:00
Tim Brooks dc86b4c2ed
Decouple `ChannelFactory` from Tcp classes (#27286)
* Decouple `ChannelFactory` from Tcp classes

This is related to #27260. Currently `ChannelFactory` is tightly coupled
to classes related to the elasticsearch Tcp binary protocol. This commit
modifies the factory to be able to construct http or other protocol
channels.
2017-11-08 14:30:00 -07:00
Jason Tedor 3b6d0ce11f Merge branch 'master' into ccr
* master: (25 commits)
  Disable bwc tests in preparation of backporting #26931
  TemplateUpgradeService should only run on the master (#27294)
  Die with dignity while merging
  Fix profiling naming issues (#27133)
  Correctly encode warning headers
  Fixed references to Multi Index Syntax (#27283)
  Add an active Elasticsearch WordPress plugin link (#27279)
  Setting url parts as required to reflect the code base (#27263)
  keys in aggs percentiles need to be in quotes. (#26905)
  Align routing param type with search.json (#26958)
  Update to support bulk updates by query (#27172)
  Remove duplicated SnapshotStatus (#27276)
  add split index reference in indices.asciidoc
  Add ability to split shards (#26931)
  [Docs] Fix minor paragraph indentation error for multiple Indices params (#25535)
  Upgrade to Jackson 2.8.10 (#27230)
  Fix inconsistencies in the rest api specs for `tasks` (#27163)
  Adjust RestHighLevelClient method modifiers (#27238)
  Remove unused parameters in AnalysisRegistry (#27232)
  Add more information on `_failed_to_convert_` exception (#27034)
  ...
2017-11-07 07:16:04 -05:00
Jason Tedor d5451b2037
Die with dignity while merging
If an out of memory error is thrown while merging, today we quietly
rewrap it into a merge exception and the out of memory error is
lost. Instead, we need to rethrow out of memory errors, and in fact any
fatal error here, and let those go uncaught so that the node is torn
down. This commit causes this to be the case.

Relates #27265
2017-11-06 17:55:11 -05:00
Jason Tedor 766d29e7cf
Correctly encode warning headers
The warnings headers have a fairly limited set of valid characters
(cf. quoted-text in RFC 7230). While we have assertions that we adhere
to this set of valid characters ensuring that our warning messages do
not violate the specificaion, we were neglecting the possibility that
arbitrary user input would trickle into these warning headers. Thus,
missing here was tests for these situations and encoding of characters
that appear outside the set of valid characters. This commit addresses
this by encoding any characters in a deprecation message that are not
from the set of valid characters.

Relates #27269
2017-11-06 13:20:30 -05:00
Simon Willnauer bd7efa908a Add ability to split shards (#26931)
This change adds a new `_split` API that allows to split indices into a new
index with a power of two more shards that the source index.  This API works
alongside the `_shrink` API but doesn't require any shard relocation before
indices can be split.

The split operation is conceptually an inverse `_shrink` operation since we
initialize the index with a _syntetic_ number of routing shards that are used
for the consistent hashing at index time. Compared to indices created with
earlier versions this might produce slightly different shard distributions but
has no impact on the per-index backwards compatibility.  For now, the user is
required to prepare an index to be splittable by setting the
`index.number_of_routing_shards` at index creation time.  The setting allows the
user to prepare the index to be splittable in factors of
`index.number_of_routing_shards` ie. if the index is created with
`index.number_of_routing_shards: 16` and `index.number_of_shards: 2` it can be
split into `4, 8, 16` shards. This is an intermediate step until we can make
this the default. This also allows us to safely backport this change to 6.x.

The `_split` operation is implemented internally as a DeleteByQuery on the
lucene level that is executed while the primary shards execute their initial
recovery. Subsequent merges that are triggered due to this operation will not be
executed immediately. All merges will be deferred unti the shards are started
and will then be throttled accordingly.

This change is intended for the 6.1 feature release but will not support pre-6.1
indices to be split unless these indices have been shrunk before. In that case
these indices can be split backwards into their original number of shards.
2017-11-06 11:37:55 +01:00
Tanguy Leroux 43e7a4a349
Upgrade to Jackson 2.8.10 (#27230)
While it's not possible to upgrade the Jackson dependencies 
to their latest versions yet (see #27032 (comment) for more) 
it's still possible to upgrade to the latest 2.8.x version.
2017-11-06 10:20:05 +01:00
Jim Ferenczi 429275a773
Remove ElasticsearchQueryCachingPolicy (#27190)
We have an hidden setting called `index.queries.cache.term_queries` that disables caching of term queries in the query cache.
Though term queries are not cached in the Lucene UsageTrackingQueryCachingPolicy since version 6.5.
This makes the es policy useless but also makes it impossible to re-enable caching for term queries.
This change appeared in Lucene 6.5 so this setting is no-op since version 5.4 of Elasticsearch
The change in this PR removes the setting and the custom policy.
2017-11-06 08:26:24 +01:00
David Roberts 749c3ec716
Remove the single argument Environment constructor (#27235)
Only tests should use the single argument Environment constructor.  To
enforce this the single arg Environment constructor has been replaced with
a test framework factory method.

Production code (beyond initial Bootstrap) should always use the same
Environment object that Node.getEnvironment() returns.  This Environment
is also available via dependency injection.
2017-11-04 13:25:09 +00:00
Martijn van Groningen e61e4b8571
Merge remote-tracking branch 'es/master' into ccr
* es/master:
  Fix snapshot getting stuck in INIT state (#27214)
  Add an example of dynamic field names (#27255)
  #26260 Allow ip_range to accept CIDR notation (#27192)
  #27189 Fixed rounding of bounds in scaled float comparison (#27207)
  Add support for Gradle 4.3 (#27249)
  Fixes QueryStringQueryBuilderTests
  build: Fix setting the incorrect bwc version in mixed cluster qa module
  [Test] Fix QueryStringQueryBuilderTests.testExistsFieldQuery BWC
  Adjust assertions for sequence numbers BWC tests
  Do not create directories if repository is readonly (#26909)
  [Test] Fix InternalStatsTests
  [Test] Fix QueryStringQueryBuilderTests.testExistsFieldQuery
  Uses norms for exists query if enabled (#27237)
  Reinstate recommendation for ≥ 3 master-eligible nodes. (#27204)
2017-11-04 09:29:48 +01:00
kel 0f21262b36 Do not create directories if repository is readonly (#26909)
For FsBlobStore and HdfsBlobStore, if the repository is read only, the blob store should be aware of the readonly setting and do not create directories if they don't exist.

Closes #21495
2017-11-03 13:10:50 +01:00
Jason Tedor 57b24a938a Merge branch 'master' into ccr
* master:
  Lazy initialize checkpoint tracker bit sets
  Remove checkpoint tracker bit sets setting
  Fix stable BWC branch detection logic
  Fix logic detecting unreleased versions
  Enhances exists queries to reduce need for `_field_names` (#26930)
  Added new terms_set query
  Set request body to required to reflect the code base (#27188)
  Update Docker docs for 6.0.0-rc2 (#27166)
  Add version 6.0.0
  Docs: restore now fails if it encounters incompatible settings (#26933)
  Convert index blocks to cluster block exceptions (#27050)
  [DOCS] Link remote info API in Cross Cluster Search docs page
  Fix Laplace scorer to multiply by alpha (and not add) (#27125)
  [DOCS] Clarify migrate guide and search request validation
  Raise IllegalArgumentException if query validation failed (#26811)
  prevent duplicate fields when mixing parent and root nested includes (#27072)
  TopHitsAggregator must propagate calls to `setScorer`. (#27138)
2017-11-01 21:34:55 -04:00
Jason Tedor d6d830ff0b
Fix logic detecting unreleased versions
When partitioning version constants into released and unreleased
versions, today we have a bug in finding the last unreleased
version. Namely, consider the following version constants on the 6.x
branch: ..., 5.6.3, 5.6.4, 6.0.0-alpha1, ..., 6.0.0-rc1, 6.0.0-rc2,
6.0.0, 6.1.0. In this case, our convention dictates that: 5.6.4, 6.0.0,
and 6.1.0 are unreleased. Today we correctly detect that 6.0.0 and 6.1.0
are unreleased, and then we say the previous patch version is unreleased
too. The problem is the logic to remove that previous patch version is
broken, it does not skip alphas/betas/RCs which have been released. This
commit fixes this by skipping backwards over pre-release versions when
finding the previous patch version to remove.

Relates #27206
2017-11-01 13:01:45 -04:00
Colin Goodheart-Smithe 99aca9cdfc
Enhances exists queries to reduce need for `_field_names` (#26930)
* Enhances exists queries to reduce need for `_field_names`

Before this change we wrote the name all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance.

This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`.

Closes #26770

* Addresses review comments

* Addresses more review comments

* implements existsQuery explicitly on every mapper

* Reinstates ability to perform term query on `_field_names`

* Added bwc depending on index created version

* Review Comments

* Skips tests that are not supported in 6.1.0

These values will need to be changed after backporting this PR to 6.x
2017-11-01 10:46:59 +00:00
kel c3e2bdf20c Raise IllegalArgumentException if query validation failed (#26811)
Closes #26799
2017-10-31 12:17:27 +01:00
Adrien Grand 3812d3cb43
TopHitsAggregator must propagate calls to `setScorer`. (#27138)
It is required in order to work correctly with bulk scorer implementations
that change the scorer during the collection process. Otherwise sub collectors
might call `Scorer.score()` on the wrong scorer.

Closes #27131
2017-10-31 09:59:06 +01:00
Jason Tedor 3cc1a88f7f Merge branch 'master' into ccr
* master:
  Refactor internal engine
  [Docs] #26541: add warning regarding the limit on the number of fields that can be queried at once in the multi_match query.
  [Docs] Fix note in bucket_selector
  [Docs] Fix indentation of examples (#27168)
  [Docs] Clarify `span_not` query behavior for non-overlapping matches (#27150)
  [Docs] Remove first person "I" from getting started (#27155)
2017-10-30 13:16:50 -04:00
Jason Tedor a566942219
Refactor internal engine
This commit is a minor refactoring of internal engine to move hooks for
generating sequence numbers into the engine itself. As such, we refactor
tests that relied on this hook to use the new hook, and remove the hook
from the sequence number service itself.

Relates #27082
2017-10-30 13:10:20 -04:00
Martijn van Groningen 2e41f0dd48
Merge remote-tracking branch 'es/master' into ccr 2017-10-30 10:29:58 +01:00
Ryan Ernst 2a8452b513 Reindex: Fix headers in reindex action (#26937)
The headers passed to reindex were skipped except for the last one. This
commit fixes the copying of the headers, as well as adds a base test
case for rest client builders to access the headers within the built
rest client.

relates #22976
2017-10-25 16:37:01 -07:00
olcbean 981b7f4d39 Make yaml test runner stricter by enforcing `required` for paths and parameters (#27035)
Till now the yaml test runner was verifying that the provided path parts and parameters are supported.
With this PR, yaml test runner also checks that all required path parts and parameters are provided.
2017-10-25 19:36:42 +00:00
Luca Cavanna 8caf7d4ff8 Decouple BulkProcessor from ThreadPool (#26727)
Introduce minimal thread scheduler as a base class for `ThreadPool`. Such a class can be used from the `BulkProcessor` to schedule retries and the flush task. This allows to remove the `ThreadPool` dependency from `BulkProcessor`, which requires to provide settings that contain `node.name` and also needed log4j for logging. Instead, it needs now a `Scheduler` that is much lighter and gets automatically created and shut down on close.

Closes #26028
2017-10-25 10:30:23 +02:00
Lee Hinman fcfbdf1f37 Expose adaptive replica selection stats in /_nodes/stats API
This exposes the collected metrics we store for ARS in the nodes stats, as well
as the computed rank of nodes. Each node exposes its perspective about the
cluster.

Here's an example output (with `?human`):

```json
...
"adaptive_selection" : {
  "_k6v1-wERxyUd5ke6s-D0g" : {
    "outgoing_searches" : 0,
    "avg_queue_size" : 0,
    "avg_service_time" : "7.8ms",
    "avg_service_time_ns" : 7896963,
    "avg_response_time" : "9ms",
    "avg_response_time_ns" : 9095598,
    "rank" : "9.1"
  },
  "VJiCUFoiTpySGmO00eWmtQ" : {
    "outgoing_searches" : 0,
    "avg_queue_size" : 0,
    "avg_service_time" : "1.3ms",
    "avg_service_time_ns" : 1330240,
    "avg_response_time" : "4.5ms",
    "avg_response_time_ns" : 4524154,
    "rank" : "4.5"
  },
  "DHNGTdzyT9iiaCpEUsIAKA" : {
    "outgoing_searches" : 0,
    "avg_queue_size" : 0,
    "avg_service_time" : "2.1ms",
    "avg_service_time_ns" : 2113164,
    "avg_response_time" : "6.3ms",
    "avg_response_time_ns" : 6375810,
    "rank" : "6.4"
  }
}
...
```
2017-10-24 08:58:42 -06:00
Jason Tedor 8a3540c366 Merge branch 'master' into ccr
* master:
  Remove unnecessary exception for engine constructor
  Update docs about `script` parameter (#27010)
  Don't refresh on `_flush` `_force_merge` and `_upgrade` (#27000)
  Do not set SO_LINGER on server channels (#26997)
  Fix inconsistencies in the rest api specs for *_script (#26971)
  fix inconsistencies in the rest api specs for cat.snapshots (#26996)
  Add docs on full_id parameter in cat nodes API
  [TEST] Add test that replicates versioned updates with random flushes
  Use internal searcher for all indexing related operations in the engine
  Reformat paragraph in template docs to 80 columns
  Clarify settings and template on create index
  Fix reference to TcpTransport in documentation
  Allow Uid#decodeId to decode from a byte array slice (#26987)
  Fix a typo in the similarity docs (#26970)
  Use separate searchers for "search visibility" vs "move indexing buffer to disk (#26972)
2017-10-16 16:23:13 +02:00
Tim Brooks 277637f42f Do not set SO_LINGER on server channels (#26997)
Right now we are attempting to set SO_LINGER to 0 on server channels
when we are stopping the tcp transport. This is not a supported socket
option and throws an exception. This also prevents the channels from
being closed.

This commit 1. doesn't set SO_LINGER for server channges, 2. checks
that it is a supported option in nio, and 3. changes the log message
to warn for server channel close exceptions.
2017-10-13 13:06:38 -06:00
Jason Tedor 7c1bcfc4d5 Merge branch 'master' into ccr
* master: (35 commits)
  Create weights lazily in filter and filters aggregation (#26983)
  Use a dedicated ThreadGroup in rest sniffer (#26897)
  Fire global checkpoint sync under system context
  Update by Query is modified to accept short `script` parameter. (#26841)
  Cat shards bytes (#26952)
  Add support for parsing inline script (#23824) (#26846)
  Change default value to true for transpositions parameter of fuzzy query (#26901)
  Adding unreleased 5.6.4 version number to Version.java
  Rename TCPTransportTests to TcpTransportTests (#26954)
  Fix NPE for /_cat/indices when no primary shard (#26953)
  [DOCS] Fixed indentation of the definition list.
  Fix formatting in channel close test
  Check for closed connection while opening
  Clarify systemd overrides
  [DOCS] Plugin Installation for Windows (#21671)
  Painless: add tests for cached boxing (#24163)
  Don't detect source's XContentType in DocumentParser.parseDocument() (#26880)
  Fix handling of paths containing parentheses
  Allow only a fixed-size receive predictor (#26165)
  Add Homebrew instructions to getting started
  ...
2017-10-12 11:02:31 -04:00
Jason Tedor 6e403b0581 Enable engine factory to be pluggable
This commit enables the engine factory to be pluggable based on index
settings used when creating the index service for an index.

Relates #26827
2017-10-12 10:48:41 -04:00
Jason Tedor 393e73612e Fix formatting in channel close test
This commit fixes the indentation in the transport test case for a
channel closing while connecting.
2017-10-10 13:39:45 -04:00
Jason Tedor 4c06b8f1d2 Check for closed connection while opening
While opening a connection to a node, a channel can subsequently
close. If this happens, a future callback whose purpose is to close all
other channels and disconnect from the node will fire. However, this
future will not be ready to close all the channels because the
connection will not be exposed to the future callback yet. Since this
callback is run once, we will never try to disconnect from this node
again and we will be left with a closed channel. This commit adds a
check that all channels are open before exposing the channel and throws
a general connection exception. In this case, the usual connection retry
logic will take over.

Relates #26932
2017-10-10 13:34:51 -04:00
Simon Willnauer cdd7c1e6c2 Return List instead of an array from settings (#26903)
Today we return a `String[]` that requires copying values for every
access. Yet, we already store the setting as a list so we can also directly
return the unmodifiable list directly. This makes list / array access in settings
a much cheaper operation especially if lists are large.
2017-10-09 09:52:08 +02:00
Nhat bf4c3642b2 remove _primary and _replica shard preferences (#26791)
The shard preference _primary, _replica and its variants were useful
for the asynchronous replication. However, with the current impl, they
are no longer useful and should be removed.

Closes #26335
2017-10-08 11:03:06 -04:00
Jason Tedor 470e5e7cfc Add additional low-level logging handler ()
* Add additional low-level logging handler

We have the trace handler which is useful for recording sent messages
but there are times where it would be useful to have more low-level
logging about the events occurring on a channel. This commit adds a
logging handler that can be enabled by setting a certain log level
(org.elasticsearch.transport.netty4.ESLoggingHandler) to trace that
provides trace logging on low-level channel events and includes some
information about the request/response read/write events on the channel
as well.

* Remove imports

* License header

* Remove redundant

* Add test

* More assertions
2017-10-05 12:10:58 -04:00
Martijn van Groningen b27e408ed2
Removed void token filter entries and added two tests 2017-10-05 13:25:05 +02:00
Md. Abdulla-Al-Sun a40c474e10
Added Bengali Analyzer to Elasticsearch with respect to the lucene update(PR#238) 2017-10-05 13:25:05 +02:00
Boaz Leskes 2a04118e88 Promote common rest test utility methods to ESRestTestCase
We have duplicates in some classes and I was about to create one more.
2017-10-05 10:08:10 +02:00
Simon Willnauer 00dfdf50cf Represent lists as actual lists inside Settings (#26878)
Today we represent each value of a list setting with it's own dedicated key
that ends with the index of the value in the list. Aside of the obvious
weirdness this has several issues especially if lists are massive since it
causes massive runtime penalties when validating settings. Like a list of 100k
words will literally cause a create index call to timeout and in-turn massive
slowdown on all subsequent validations runs.

With this change we use a simple string list to represent the list. This change
also forbids to add a settings that ends with a .0 which was internally used to
detect a list setting.  Once this has been rolled out for an entire major
version all the internal .0 handling can be removed since all settings will be
converted.

Relates to #26723
2017-10-05 09:27:08 +02:00
Martijn van Groningen dca787ed8a
upgrade to Lucene 7.1.0 snapshot version 2017-10-05 09:06:56 +02:00
Simon Willnauer d1533e2397 Remove Settings#getAsMap() (#26845)
Since `#getAsMap` exposes internal representation we are trying to remove it
step by step. This commit is cleaning up some xcontent writing as well as
usage in tests
2017-10-04 01:21:38 -06:00
Boaz Leskes a18bd9caa2 Increase ESRestTestCase.waitForClusterStateUpdatesToFinish time out to 30s
It is set to 10 sec but sometimes it takes the cluster longer to settle.
2017-10-03 12:24:36 +02:00
Tim Brooks d80ad7f097 Check channel i open before setting SO_LINGER (#26857)
This commit fixes a #26855. Right now we set SO_LINGER to 0 if we are
stopping the transport. This can throw a ChannelClosedException if the
raw channel is already closed. We have a number of scenarios where it is
possible this could be called with a channel that is already closed.
This commit fixes the issue be checking that the channel is not closed
before attempting to set the socket option.
2017-10-02 15:09:52 -06:00
Tim Brooks 9ae7a80ba5 Move raw selector usage into ESSelector (#26825)
Currently we only log generic messages about errors in logs from the
nio event handler. This means that we do not know which channel had
issues connection, reading, writing, etc.

This commit changes the logs to include the local and remote addresses
and profile for a channel.
2017-10-01 17:59:57 -06:00
Simon Willnauer 7b8d036ab5 Replace group map settings with affix setting (#26819)
We use group settings historically instead of using a prefix setting which is more restrictive and type safe. The majority of the usecases needs to access a key, value map based on the _leave node_ of the setting ie. the setting `index.tag.*` might be used to tag an index with `index.tag.test=42` and `index.tag.staging=12` which then would be turned into a `{"test": 42, "staging": 12}` map. The group settings would always use `Settings#getAsMap` which is loosing type information and uses internal representation of the settings. Using prefix settings allows now to access such a method type-safe and natively.
2017-09-30 14:27:21 +02:00
Tim Brooks bf403ae028 Add information about nio channels in logs (#26806)
Currently we only log generic messages about errors in logs from the
nio event handler. This means that we do not know which channel had
issues connection, reading, writing, etc.

This commit changes the logs to include the local and remote addresses
and profile for a channel.
2017-09-28 17:11:26 -06:00
Simon Willnauer 25d6778d31 Add comment to TCP transport impls why we set SO_LINGER on close 2017-09-28 13:07:01 +02:00
Armin Braun af06231d4c #26701 Close TcpTransport on RST in some Spots to Prevent Leaking TIME_WAIT Sockets (#26764)
#26701 Added option to RST instead of FIN to TcpTransport#closeChannels
2017-09-26 19:58:11 +00:00
Simon Willnauer a506ba8602 Remove `Settings,put(Map<String,String>)` (#26785)
`Map<String,String>` is basically erasing the type while other methods on
the `Settings.Builder` are type safe and have corresponding `get` methods.
2017-09-26 12:15:20 +02:00
Simon Willnauer aab4655e63 Unify Settings xcontent reading and writing (#26739)
This change adds a fromXContent method to Settings that allows to read
the xcontent that is produced by toXContent. It also replaces the entire settings
loader infrastructure and removes the structured map representation. Future PRs will
also tackle the `getAsMap` that exposes the internal represenation of settings for
better encapsulation.
2017-09-25 13:23:01 +02:00
Jason Tedor f35d1de502 Introduce global checkpoint background sync
It is the exciting return of the global checkpoint background
sync. Long, long ago, in snapshot version far, far away we had and only
had a global checkpoint background sync. This sync would fire
periodically and send the global checkpoint from the primary shard to
the replicas so that they could update their local knowledge of the
global checkpoint. Later in time, as we sped ahead towards finalizing
the initial version of sequence IDs, we realized that we need the global
checkpoint updates to be inline. This means that on a replication
operation, the primary shard would piggy back the global checkpoint with
the replication operation to the replicas. The replicas would update
their local knowledge of the global checkpoint and reply with their
local checkpoint. However, this could allow the global checkpoint on the
primary to advance again and the replicas would fall behind in their
local knowledge of the global checkpoint. If another replication
operation never fired, then the replicas would be permanently behind. To
account for this, we added one more sync that would fire when the
primary shard fell idle. However, this has problems:
 - the shard idle timer defaults to five minutes, a long time to wait
   for the replicas to learn of the new global checkpoint
 - if a replica missed the sync, there was no follow-up sync to catch
   them up
 - there is an inherent race condition where the primary shard could
   fall idle mid-operation (after having sent the replication request to
   the replicas); in this case, there would never be a background sync
   after the operation completes
 - tying the global checkpoint sync to the idle timer was never natural

To fix this, we add two additional changes for the global checkpoint to
be synced to the replicas. The first is that we add a post-operation
sync that only fires if there are no operations in flight and there is a
lagging replica. This gives us a chance to sync the global checkpoint to
the replicas immediately after an operation so that they are always kept
up to date. The second is that we add back a global checkpoint
background sync that fires on a timer. This timer fires every thirty
seconds, and is not configurable (for simplicity). This background sync
is smarter than what we had previously in the sense that it only sends a
sync if the global checkpoint on at least one replica is lagging that of
the primary. When the timer fires, we can compare the global checkpoint
on the primary to its knowledge of the global checkpoint on the replicas
and only send a sync if there is a shard behind.

Relates #26591
2017-09-21 15:34:13 -04:00
James Baiera c760eec054 Add permission checks before reading from HDFS stream (#26716)
Add checks for special permissions before reading hdfs stream data. Also adds test from 
readonly repository fix. MiniHDFS will now start with an existing repository with a single snapshot 
contained within. Readonly Repository is created in tests and attempts to list the snapshots 
within this repo.
2017-09-21 11:55:07 -04:00
Michael Basnight f385e0cf26 Add bad_request to the rest-api-spec catch params (#26539)
This adds another request to the catch params. It also makes sure that
the generic request param does not allow 400 either.
2017-09-14 14:24:03 -05:00