Commit Graph

295 Commits

Author SHA1 Message Date
Martijn van Groningen 44c7c4b166
[CCR] Add auto follow stats api (#33801)
GET /_ccr/auto_follow/stats

Returns:

```
{
   "number_of_successful_follow_indices": ...
   "number_of_failed_follow_indices": ...
   "number_of_failed_remote_cluster_state_requests": ...
   "recent_auto_follow_errors": [
      ...
   ]
}
```

Relates to #33007
2018-09-20 07:16:20 +02:00
Benjamin Trent 4767a016a5
Adding node_count to ML Usage (#33850) (#33863) 2018-09-19 13:35:09 -07:00
Benjamin Trent 4190a9f1e9
Delete custom index if the only contained job is deleted (#33788)
* Delete custom index if the only contained job is deleted
2018-09-19 07:42:26 -07:00
markharwood c118581617 Test fix - Graph connections could appear in different orders
Graph connections could appear in different orders based on insertion sequence

Closes #33686
2018-09-19 15:16:14 +01:00
Martijn van Groningen d9947c631a
[CCR] Rename idle_shard_retry_delay to poll_timout in auto follow patterns (#33821) 2018-09-19 13:13:20 +02:00
Simon Willnauer 251489d59a
Cut over to unwrap segment reader (#33843)
The fix in #33757 introduces some workaround since FilterCodecReader didn't
support unwrapping. This cuts over to a more elegant fix to access the readers
segment infos.
2018-09-19 10:18:03 +02:00
Martijn van Groningen 013b64a07c
[CCR] Change FollowIndexAction.Request class to be more user friendly (#33810)
Instead of having one constructor that accepts all arguments, all parameters
should be provided via setters. Only leader and follower index are required
arguments. This makes using this class in tests and transport client easier.
2018-09-19 07:18:24 +02:00
Martijn van Groningen 47b86d6e6a
[CCR] Changed AutoFollowCoordinator to keep track of certain statistics (#33684)
The following stats are being kept track of:
1) The total number of times that auto following a leader index succeed.
2) The total number of times that auto following a leader index failed.
3) The total number of times that fetching a remote cluster state failed.
4) The most recent 256 auto follow failures per auto leader index
   (e.g. create_and_follow api call fails) or cluster alias
   (e.g. fetching remote cluster state fails).

Each auto follow run now produces a result that is being used to update
the stats being kept track of in AutoFollowCoordinator.

Relates to #33007
2018-09-18 09:43:50 +02:00
Martijn van Groningen 15f30d689b
[CCR] Do not unnecessarily wrap fetch exception in a ElasticSearch exception and (#33777)
* [CCR] Do not unnecessarily wrap fetch exception in a ElasticSearch exception and
properly map fetch_exception.exception field as object.

The extra caused by level is not necessary here:

```
"fetch_exceptions": [
              {
                "from_seq_no": 1,
                "retries": 106,
                "exception": {
                  "type": "exception",
                  "reason": "[index1] IndexNotFoundException[no such index]",
                  "caused_by": {
                    "type": "index_not_found_exception",
                    "reason": "no such index",
                    "index_uuid": "_na_",
                    "index": "index1"
                  }
                }
              }
            ],
```
2018-09-17 22:33:37 +02:00
Simon Willnauer 48a5b45d28
Ensure fully deleted segments are accounted for correctly (#33757)
We can't rely on the leaf reader ordinal in a wrapped reader since
it might not correspond to the ordinal in the SegmentInfos for it's
SegmentCommitInfo.

Relates to #32844
Closes #33689
Closes #33755
2018-09-17 18:18:58 +02:00
Martijn van Groningen 481f8a9a07
[CCR] Make auto follow patterns work with security (#33501)
Relates to #33007
2018-09-17 07:29:00 +02:00
Jason Tedor 770ad53978
Introduce long polling for changes (#33683)
Rather than scheduling pings to the leader index when we are caught up
to the leader, this commit introduces long polling for changes. We will
fire off a request to the leader which if we are already caught up will
enter a poll on the leader side to listen for global checkpoint
changes. These polls will timeout after a default of one minute, but can
also be specified when creating the following task. We use these time
outs as a way to keep statistics up to date, to not exaggerate time
since last fetches, and to avoid pipes being broken.
2018-09-16 10:35:23 -04:00
Jason Tedor 069605bd91
Do not count shard changes tasks against REST tests (#33738)
When executing CCR REST tests it is going to be expected after global
checkpoint polling goes in that shard changes tasks can still be pending
at the end of the test. One way to deal with this is to set a low
timeout on these polls, but then that means we are not executing our
REST tests with our default production settings and instead would be
using an unrealistic low timeout. Alternatively, since we expect these
tasks to be there, we can not count them against the test. That is what
this commit does.
2018-09-16 07:32:12 -04:00
Martijn van Groningen 82a6ae1dae
[CCR] Move ccr tests in core module back to ccr module (#33711)
When developing ccr it is not ideal if tests are in multiple modules.
Even the classes these tests test are in the core module, it is easier
if these tests are in ccr module in order to avoid running the test task
in core module. This results in running many non ccr tests.

This way when developing ccr we can run locally:
./gradlew x-pack:plugin:core:precommit x-pack:plugin:ccr:check

before pushing to PR branches and be confident that the PR build passes,
without running x-pack:plugin:core:check task.
2018-09-14 17:18:00 +02:00
Ioannis Kakavas d9f5e4fd2e Pin TLS1.2 in SSLConfigurationReloaderTests
Ensure that the SSLConfigurationReloaderTests can run with JDK 11
by pinning the HttpClient to TLS version to TLS1.2. This is necessary
becase even if the MockWebServer is set to user TLS1.2, we don't
set its enabled protocols, so if it receives a TLS1.3 request (which
is the default behavior for HttpClient in JDK11), it will use TLS1.3
and the original issue will manifest again.

Relates  #33127
Resolves #32124
2018-09-14 16:39:20 +03:00
Jason Tedor 2282150f34
Expose retries for CCR fetch failures (#33694)
This commit exposes the number of times that a fetch has been tried to
the CCR stats endpoint, and to CCR monitoring.
2018-09-14 08:52:46 -04:00
Albert Zaharovits c86e2d5211
Structured audit logging (#31931)
Changes the format of log events in the audit logfile.
It also changes the filename suffix from `_access` to `_audit`.
The new entry format is consistent with Elastic Common Schema.
Entries are formatted as JSON with no nested objects and field
names have a dotted syntax. Moreover, log entries themselves
are not spaced by commas and there is exactly one entry per line.
In addition, entry fields are ordered, unlike a typical JSON doc,
such that a human would not strain his eyes over jumbled 
fields from one line to the other; the order is defined in the log4j2
properties file.
The implementation utilizes the log4j2's `StringMapMessage`.
This means that the application builds the log event as a map
and the log4j logic (the appender's layout) handle the format
internally. The layout, such as the set of printed fields and their
order, can be changed at runtime without restarting the node.
2018-09-14 15:25:53 +03:00
David Roberts 568ac10ca6
[ML] Allow overrides for some file structure detection decisions (#33630)
This change modifies the file structure detection functionality
such that some of the decisions can be overridden with user
supplied values.

The fields that can be overridden are:

- charset
- format
- has_header_row
- column_names
- delimiter
- quote
- should_trim_fields
- grok_pattern
- timestamp_field
- timestamp_format

If an override makes finding the file structure impossible then
the endpoint will return an exception.
2018-09-14 09:29:11 +01:00
Ioannis Kakavas 8ae1eeb303
[TESTS] Disable specific locales for RestrictedTrustManagerTest (#33299)
Disable specific Thai and Japanese locales as Certificate expiration
validation fails due to the date parsing of BouncyCastle (that manifests
in a FIPS 140 JVM as this is the only place we use BouncyCastle).
Added the locale switching logic here instead of subclassing
ESTestCase as these are the only tests that fail for these locales and
JVM combination.

Resolves #33081
2018-09-14 09:42:03 +03:00
Nhat Nguyen 189aaceecf AwaitsFix testRestoreMinmal
Tracked at #33689
2018-09-13 22:15:21 -04:00
Jay Modi 3914a980f7
Security: remove wrapping in put user response (#33512)
This change removes the wrapping of the created field in the put user
response. The created field was added as a top level field in #32332,
while also still being wrapped within the `user` object of the
response. Since the value is available in both formats in 6.x, we can
remove the wrapped version for 7.0.
2018-09-13 14:40:36 -06:00
Martijn van Groningen 53ba253aa4
[CCR] Add validation for max_retry_delay (#33648) 2018-09-13 20:52:00 +02:00
Jason Tedor eb715d5290
Add follower index to CCR monitoring and status (#33645)
This commit adds the follower index to CCR shard follow task status, and
to monitoring.
2018-09-12 17:35:06 -04:00
Martijn van Groningen b5d8495789
[CCR] Add auto follow pattern APIs to transport client. (#33629) 2018-09-12 21:50:22 +02:00
Jay Modi 20c6c9c542
Address license state update/read thread safety (#33396)
This change addresses some issues regarding thread safety around
updates and method calls on the XPackLicenseState object. There exists
a possibility that there could be a concurrent update to the
XPackLicenseState when there is a scheduled check to see if the license
is expired and a cluster state update. In order to address this, the
update method now has a synchronized block where member variables are
updated. Each method that reads these variables is now also
synchronized.

Along with the above change, there was a consistency issue around
security calls to the license state. The majority of security checks
make two calls to the license state, which could result in incorrect
behavior due to the checks being made against different license states.
The majority of this behavior was introduced for 6.3 with the inclusion
of x-pack in the default distribution. In order to resolve the majority
of these cases, the `isSecurityEnabled` method is no longer public and
the logic is also included in individual methods about security such as
`isAuthAllowed`. There were a few cases where this did not remove
multiple calls on the license state, so a new method has been added
which creates a copy of the current license state that will not change.
Callers can use this copy of the license state to make decisions based
on a consistent view of the license state.
2018-09-12 13:08:09 -06:00
Martijn van Groningen 901d8035d9
[CCR] Update es monitoring mapping and (#33635)
* [CCR] Update es monitoring mapping and
change qa tests to query based on leader index.


Co-authored-by: Jason Tedor <jason@tedor.me>
2018-09-12 19:36:17 +02:00
Simon Willnauer c783488e97
Add `_source`-only snapshot repository (#32844)
This change adds a `_source` only snapshot repository that allows to wrap
any existing repository as a _backend_ to snapshot only the `_source` part
including live docs markers. Snapshots taken with the `source` repository
won't include any indices,  doc-values or points. The snapshot will be reduced in size and
functionality such that it requires full re-indexing after it's successfully restored.

The restore process will copy the `_source` data locally starts a special shard and engine
to allow `match_all` scrolls and searches. Any other query, or get call will fail with and unsupported operation exception.  The restored index is also marked as read-only.

This feature aims mainly for disaster recovery use-cases where snapshot size is
a concern or where time to restore is less of an issue.

**NOTE**: The snapshot produced by this repository is still a valid lucene index. This change doesn't allow for any longer retention policies which is out of scope for this change.
2018-09-12 17:47:10 +02:00
Jason Tedor 23f12e42c1
Expose CCR stats to monitoring (#33617)
This commit exposes the CCR stats endpoint to monitoring collection.

Co-authored-by: Martijn van Groningen <martijn.v.groningen@gmail.com>
2018-09-12 09:13:07 -04:00
Martijn van Groningen 96c49e5ed0
[CCR] Improve shard follow task's retryable error handling (#33371)
Improve failure handling of retryable errors by retrying remote calls in
a exponential backoff like manner. The delay between a retry would not be
longer than the configured max retry delay. Also retryable errors will be
retried indefinitely.

Relates to #30086
2018-09-12 12:49:51 +02:00
Jason Tedor eca37e6e0a
Expose CCR to the transport client (#33608)
This commit exposes CCR to the transport client.
2018-09-11 16:37:52 -04:00
David Roberts 8e05ce567f
[ML] Rename input_fields to column_names in file structure (#33568)
This change tightens up the meaning of the "input_fields" field
in the file structure finder output.  Previously it was permitted
but not calculated for JSON and XML files.  Following this change
the field is called "column_names" and is only permitted for
delimited files.

Additionally the way the column names are set for headerless
delimited files is refactored to encapsulate the way they're
named to one line of the code rather than having the same
logic in two places.
2018-09-11 08:46:26 +01:00
Chris Roberson 369db8a9d6
Update beats template to include apm-server metrics (#33286) 2018-09-10 08:50:07 -05:00
Martijn van Groningen c4adcee3ea
[CCR] Add create_follow_index privilege (#33559)
This is a new index privilege that the user needs to have in the follow cluster.
This privilege is required in addition to the `manage_ccr` cluster privilege in
order to execute the create and follow api.

Closes #33555
2018-09-10 13:08:20 +02:00
Ioannis Kakavas 77aeeda275
Correctly handle PKCS#11 tokens for system keystore (#33460)
* Correctly handle NONE keyword for system keystore

As defined in the PKCS#11 reference guide
https://docs.oracle.com/javase/8/docs/technotes/guides/security/p11guide.html
PKCS#11 tokens can be used as the JSSE keystore and truststore and
the way to indicate this is to set `javax.net.ssl.keyStore` and
`javax.net.ssl.trustStore` to `NONE` (case sensitive).

This commits ensures that we honor this convention and do not
attempt to load the keystore or truststore if the system property is
set to NONE.

* Handle password protected system truststore

When a PKCS#11 token is used as the system truststore, we need to
pass a password when loading it, even if only for reading
certificate entries. This commit ensures that if
`javax.net.ssl.trustStoreType` is set to `PKCS#11` (as it would
when a PKCS#11 token is in use) the password specified in
`javax.net.ssl.trustStorePassword` is passed when attempting to
load the truststore.

Relates #33459
2018-09-10 11:18:44 +03:00
Jason Tedor 6bb817004b
Add infrastructure to upgrade settings (#33536)
In some cases we want to deprecate a setting, and then automatically
upgrade uses of that setting to a replacement setting. This commit adds
infrastructure for this so that we can upgrade settings when recovering
the cluster state, as well as when such settings are dynamically applied
on cluster update settings requests. This commit only focuses on cluster
settings, index settings can build on this infrastructure in a
follow-up.
2018-09-09 20:49:19 -04:00
Dimitris Athanasiou fcb15b0ce3
[ML] Get job stats request should filter non-ML job tasks (#33516)
When requesting job stats for `_all`, all ES tasks are accepted
resulting to loads of cluster traffic and a memory overhead.
This commit correctly filters out non ML job tasks.

Closes #33515
2018-09-09 22:53:03 +01:00
David Roberts e42cc5cd8c
[ML] Add a file structure determination endpoint (#33471)
This endpoint accepts an arbitrary file in the request body and
attempts to determine the structure.  If successful it also
proposes mappings that could be used when indexing the file's
contents, and calculates simple statistics for each of the fields
that are useful in the data preparation step prior to configuring
machine learning jobs.
2018-09-07 17:41:57 +01:00
Jim Ferenczi 79cd6385fe
Collapse package structure for metrics aggs (#33463)
This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`.
It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package.

Relates #22868
2018-09-07 10:58:06 +02:00
Yogesh Gaikwad ee73bc2f3f
[SECURITY] Set Auth-scheme preference (#33156)
Some browsers (eg. Firefox) behave differently when presented with
multiple auth schemes in 'WWW-Authenticate' header. The expected
behavior is that browser select the most secure auth-scheme before
trying others, but Firefox selects the first presented auth scheme and
tries the next ones sequentially. As the browser interpretation is
something that we do not control, we can at least present the auth
schemes in most to least secure order as the server's preference.

This commit modifies the code to collect and sort the auth schemes
presented by most to least secure. The priority of the auth schemes is
fixed, the lower number denoting more secure auth-scheme.
The current order of schemes based on the ES supported auth-scheme is
[Negotiate, Bearer,Basic] and when we add future support for
other schemes we will need to update the code. If need be we will make
this configuration customizable in future.

Unit test to verify the WWW-Authenticate header values are sorted by
server preference as more secure to least secure auth schemes.
Tested with Firefox, Chrome, Internet Explorer 11.

Closes#32699
2018-09-07 08:46:49 +10:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
David Roberts 0849b98f60
[ML] Rename log structure to file structure (#33421)
Many files supplied to the upcoming ML data preparation
functionality will not be "log" files.  For example,
CSV files are generally not "log" files.  Therefore it
makes sense to rename library that determines the
structure of these files.

Although "file structure" could be considered too broad,
as the library currently only works with a few text
formats, in the future it may be extended to work with
more formats.
2018-09-06 09:13:08 +01:00
Alan Woodward e134f9b5f3
Fix generics in ScriptPlugin#getContexts() (#33426)
Changes the return value from List<ScriptContext> to List<ScriptContext<?>> to remove raw-types warnings.
2018-09-06 09:04:22 +01:00
Martijn van Groningen a721d09c81
[CCR] Added auto follow patterns feature (#33118)
Auto Following Patterns is a cross cluster replication feature that
keeps track whether in the leader cluster indices are being created with
names that match with a specific pattern and if so automatically let
the follower cluster follow these newly created indices.

This change adds an `AutoFollowCoordinator` component that is only active
on the elected master node. Periodically this component checks the
 the cluster state of remote clusters if there new leader indices that
match with configured auto follow patterns that have been defined in
`AutoFollowMetadata` custom metadata.

This change also adds two new APIs to manage auto follow patterns. A put
auto follow pattern api:

```
PUT /_ccr/_autofollow/{{remote_cluster}}
{
   "leader_index_pattern": ["logs-*", ...],
   "follow_index_pattern": "{{leader_index}}-copy",
   "max_concurrent_read_batches": 2
   ... // other optional parameters
}
```

and delete auto follow pattern api:

```
DELETE /_ccr/_autofollow/{{remote_cluster_alias}}
```

The auto follow patterns are directly tied to the remote cluster aliases
configured in the follow cluster.

Relates to #33007


Co-authored-by: Jason Tedor jason@tedor.me
2018-09-06 08:01:58 +02:00
Tim Brooks 88c178dca6
Add sni name to SSLEngine in netty transport (#33144)
This commit is related to #32517. It allows an "server_name"
attribute on a DiscoveryNode to be propagated to the server using
the TLS SNI extentsion. This functionality is only implemented for
the netty security transport.
2018-09-05 16:12:10 -06:00
Jay Modi ea52277a1e
HLRest: add put user API (#32332)
This commit adds a security client to the high level rest client, which
includes an implementation for the put user api. As part of these
changes, a new request and response class have been added that are
specific to the high level rest client. One change here is that the response
was previously wrapped inside a user object. The plan is to remove this
wrapping and this PR adds an unwrapped response outside of the user
object so we can remove the user object later on.

See #29827
2018-09-05 10:56:30 -06:00
Sohaib Iftikhar 761e8c461f HLRC: Add delete by query API (#32782)
Adds the delete-by-query API to the High Level REST Client.
2018-09-04 08:56:26 -04:00
Dimitris Athanasiou 1457b07a06
[ML] The sort field on get records should default to the record_score (#33358)
This is not changing the behaviour as when the sort field was set
to `influencer_score` the secondary sort would be used and that
was using the `record_score` at the highest priority.
2018-09-04 11:38:24 +01:00
Benjamin Trent 767d8e0801
[ML] Delete forecast API (#31134) (#33218)
* Delete forecast API (#31134)
2018-09-03 19:06:18 -05:00
David Kyle ccb2ad25cc
Prevent NPE parsing the stop datafeed request. (#33347)
The issue depends on the request parameters being passed in the request
body rather than as query parameters.
2018-09-03 13:35:04 +01:00
Zachary Tong 90ce3a6224 [Rollup] Fix Caps Comparator to handle calendar/fixed time (#33336)
The comparator used TimeValue parsing, which meant it couldn't handle
calendar time.  This fixes the comparator to handle either (and potentially
mixed).  The mixing shouldn't be an issue since the validation code
upstream will prevent it, but was simplest to allow the comparator
to handle both.
2018-09-03 10:49:19 +02:00