Commit Graph

222 Commits

Author SHA1 Message Date
Tanguy Leroux c43e932a0c Fix CharArraysTests.testConstantTimeEquals() (#47346)
The change #47238 fixed a first issue (#47076) but introduced 
another one that can be reproduced using:

org.elasticsearch.common.CharArraysTests > testConstantTimeEquals FAILED

java.lang.StringIndexOutOfBoundsException: String index out of range: 1
at __randomizedtesting.SeedInfo.seed([DFCA64FE2C786BE3:ED987E883715C63B]:0)
at java.lang.String.substring(String.java:1963)
at org.elasticsearch.common.CharArraysTests.testConstantTimeEquals(CharArraysTests.java:74)

REPRODUCE WITH: ./gradlew ':libs:elasticsearch-core:test' --tests 
"org.elasticsearch.common.CharArraysTests.testConstantTimeEquals" 
-Dtests.seed=DFCA64FE2C786BE3 -Dtests.security.manager=true -Dtests.locale=fr-CA 
-Dtests.timezone=Pacific/Johnston -Dcompiler.java=12 -Druntime.java=8

that happens when the first randomized string has a length of 0.
2019-10-01 12:49:15 +02:00
Ryan Ernst 67f0ffd134 Ensure char array test uses different values (#47238)
The test of constantTimeEquals could get unlucky and randomly produce
the same two strings. This commit tweaks the test to ensure the two
string are unique, and the loop inside constantTimeEquals is actually
executed (which requires the strings be of the same length).

fixes #47076
2019-09-30 14:46:53 -07:00
Tim Brooks f02582de4b
Reduce a bind failure to trace logging (#46891)
Due to recent changes in the nio transport, a failure to bind the server
channel has started to be logged at an error level. This exception leads
to an automatic retry on a different port, so it should only be logged
at a trace level.
2019-09-24 10:32:18 -06:00
Lee Hinman cdc3a260af
Add retention to Snapshot Lifecycle Management (backport of #4… (#46506)
* Add retention to Snapshot Lifecycle Management (#46407)

This commit adds retention to the existing Snapshot Lifecycle Management feature (#38461) as described in #43663. This allows a user to configure SLM to automatically delete older snapshots based on a number of criteria.

An example policy would look like:

```
PUT /_slm/policy/snapshot-every-day
{
  "schedule": "0 30 2 * * ?",
  "name": "<production-snap-{now/d}>",
  "repository": "my-s3-repository",
  "config": {
    "indices": ["foo-*", "important"]
  },
  // Newly configured retention options
  "retention": {
    // Snapshots should be deleted after 14 days
    "expire_after": "14d",
    // Keep a maximum of thirty snapshots
    "max_count": 30,
    // Keep a minimum of the four most recent snapshots
    "min_count": 4
  }
}
```

SLM Retention is run on a scheduled configurable with the `slm.retention_schedule` setting, which supports cron expressions. Deletions are run for a configurable time bounded by the `slm.retention_duration` setting, which defaults to 1 hour.

Included in this work is a new SLM stats API endpoint available through

``` json
GET /_slm/stats
```

That returns statistics about snapshot taken and deleted, as well as successful retention runs, failures, and the time spent deleting snapshots. #45362 has more information as well as an example of the output. These stats are also included when retrieving SLM policies via the API.

* Add base framework for snapshot retention (#43605)

* Add base framework for snapshot retention

This adds a basic `SnapshotRetentionService` and `SnapshotRetentionTask`
to start as the basis for SLM's retention implementation.

Relates to #38461

* Remove extraneous 'public'

* Use a local var instead of reading class var repeatedly

* Add SnapshotRetentionConfiguration for retention configuration (#43777)

* Add SnapshotRetentionConfiguration for retention configuration

This commit adds the `SnapshotRetentionConfiguration` class and its HLRC
counterpart to encapsulate the configuration for SLM retention.
Currently only a single parameter is supported as an example (we still
need to discuss the different options we want to support and their
names) to keep the size of the PR down. It also does not yet include version serialization checks
since the original SLM branch has not yet been merged.

Relates to #43663

* Fix REST tests

* Fix more documentation

* Use Objects.equals to avoid NPE

* Put `randomSnapshotLifecyclePolicy` in only one place

* Occasionally return retention with no configuration

* Implement SnapshotRetentionTask's snapshot filtering and delet… (#44764)

* Implement SnapshotRetentionTask's snapshot filtering and deletion

This commit implements the snapshot filtering and deletion for
`SnapshotRetentionTask`. Currently only the expire-after age is used for
determining whether a snapshot is eligible for deletion.

Relates to #43663

* Fix deletes running on the wrong thread

* Handle missing or null policy in snap metadata differently

* Convert Tuple<String, List<SnapshotInfo>> to Map<String, List<SnapshotInfo>>

* Use the `OriginSettingClient` to work with security, enhance logging

* Prevent NPE in test by mocking Client

* Allow empty/missing SLM retention configuration (#45018)

Semi-related to #44465, this allows the `"retention"` configuration map
to be missing.

Relates to #43663

* Add min_count and max_count as SLM retention predicates (#44926)

This adds the configuration options for `min_count` and `max_count` as
well as the logic for determining whether a snapshot meets this criteria
to SLM's retention feature.

These options are optional and one, two, or all three can be specified
in an SLM policy.

Relates to #43663

* Time-bound deletion of snapshots in retention delete function (#45065)

* Time-bound deletion of snapshots in retention delete function

With a cluster that has a large number of snapshots, it's possible that
snapshot deletion can take a very long time (especially since deletes
currently have to happen in a serial fashion). To prevent snapshot
deletion from taking forever in a cluster and blocking other operations,
this commit adds a setting to allow configuring a maximum time to spend
deletion snapshots during retention. This dynamic setting defaults to 1
hour and is best-effort, meaning that it doesn't hard stop a deletion
at an hour mark, but ensures that once the time has passed, all
subsequent deletions are deferred until the next retention cycle.

Relates to #43663

* Wow snapshots suuuure can take a long time.

* Use a LongSupplier instead of actually sleeping

* Remove TestLogging annotation

* Remove rate limiting

* Add SLM metrics gathering and endpoint (#45362)

* Add SLM metrics gathering and endpoint

This commit adds the infrastructure to gather metrics about the different SLM actions that a cluster
takes. These actions are stored in `SnapshotLifecycleStats` and perpetuated in cluster state. The
stats stored include the number of snapshots taken, failed, deleted, the number of retention runs,
as well as per-policy counts for snapshots taken, failed, and deleted. It also includes the amount
of time spent deleting snapshots from SLM retention.

This commit also adds an endpoint for retrieving all stats (further commits will expose this in the
SLM get-policy API) that looks like:

```
GET /_slm/stats
{
  "retention_runs" : 13,
  "retention_failed" : 0,
  "retention_timed_out" : 0,
  "retention_deletion_time" : "1.4s",
  "retention_deletion_time_millis" : 1404,
  "policy_metrics" : {
    "daily-snapshots2" : {
      "snapshots_taken" : 7,
      "snapshots_failed" : 0,
      "snapshots_deleted" : 6,
      "snapshot_deletion_failures" : 0
    },
    "daily-snapshots" : {
      "snapshots_taken" : 12,
      "snapshots_failed" : 0,
      "snapshots_deleted" : 12,
      "snapshot_deletion_failures" : 6
    }
  },
  "total_snapshots_taken" : 19,
  "total_snapshots_failed" : 0,
  "total_snapshots_deleted" : 18,
  "total_snapshot_deletion_failures" : 6
}
```

This does not yet include HLRC for this, as this commit is quite large on its own. That will be
added in a subsequent commit.

Relates to #43663

* Version qualify serialization

* Initialize counters outside constructor

* Use computeIfAbsent instead of being too verbose

* Move part of XContent generation into subclass

* Fix REST action for master merge

* Unused import

*  Record history of SLM retention actions (#45513)

This commit records the deletion of snapshots by the retention component
of SLM into the SLM history index for the purposes of reviewing operations
taken by SLM and alerting.

* Retry SLM retention after currently running snapshot completes (#45802)

* Retry SLM retention after currently running snapshot completes

This commit adds a ClusterStateObserver to wait until the currently
running snapshot is complete before proceeding with snapshot deletion.
SLM retention waits for the maximum allowed deletion time for the
snapshot to complete, however, the waiting time is not factored into
the limit on actual deletions.

Relates to #43663

* Increase timeout waiting for snapshot completion

* Apply patch

From 2374316f0d.patch

* Rename test variables

* [TEST] Be less strict for stats checking

* Skip SLM retention if ILM is STOPPING or STOPPED (#45869)

This adds a check to ensure we take no action during SLM retention if
ILM is currently stopped or in the process of stopping.

Relates to #43663

* Check all actions preventing snapshot delete during retention (#45992)

* Check all actions preventing snapshot delete during retention run

Previously we only checked to see if a snapshot was currently running,
but it turns out that more things can block snapshot deletion. This
changes the check to be a check for:

- a snapshot currently running
- a deletion already in progress
- a repo cleanup in progress
- a restore currently running

This was found by CI where a third party delete in a test caused SLM
retention deletion to throw an exception.

Relates to #43663

* Add unit test for okayToDeleteSnapshots

* Fix bug where SLM retention task would be scheduled on every node

* Enhance test logging

* Ignore if snapshot is already deleted

* Missing import

* Fix SnapshotRetentionServiceTests

* Expose SLM policy stats in get SLM policy API (#45989)

This also adds support for the SLM stats endpoint to the high level rest client.

Retrieving a policy now looks like:

```json
{
  "daily-snapshots" : {
    "version": 1,
    "modified_date": "2019-04-23T01:30:00.000Z",
    "modified_date_millis": 1556048137314,
    "policy" : {
      "schedule": "0 30 1 * * ?",
      "name": "<daily-snap-{now/d}>",
      "repository": "my_repository",
      "config": {
        "indices": ["data-*", "important"],
        "ignore_unavailable": false,
        "include_global_state": false
      },
      "retention": {}
    },
    "stats": {
      "snapshots_taken": 0,
      "snapshots_failed": 0,
      "snapshots_deleted": 0,
      "snapshot_deletion_failures": 0
    },
    "next_execution": "2019-04-24T01:30:00.000Z",
    "next_execution_millis": 1556048160000
  }
}
```

Relates to #43663

* Rewrite SnapshotLifecycleIT as as ESIntegTestCase (#46356)

* Rewrite SnapshotLifecycleIT as as ESIntegTestCase

This commit splits `SnapshotLifecycleIT` into two different tests.
`SnapshotLifecycleRestIT` which includes the tests that do not require
slow repositories, and `SLMSnapshotBlockingIntegTests` which is now an
integration test using `MockRepository` to simulate a snapshot being in
progress.

Relates to #43663
Resolves #46205

* Add error logging when exceptions are thrown

* Update serialization versions

* Fix type inference

* Use non-Cancellable HLRC return value

* Fix Client mocking in test

* Fix SLMSnapshotBlockingIntegTests for 7.x branch

* Update SnapshotRetentionTask for non-multi-repo snapshot retrieval

* Add serialization guards for SnapshotLifecyclePolicy
2019-09-10 09:08:09 -06:00
William Brafford 2b549e7342
CLI tools: write errors to stderr instead of stdout (#45586)
Most of our CLI tools use the Terminal class, which previously did not provide methods for writing to standard output. When all output goes to standard out, there are two basic problems. First, errors and warnings are "swallowed" in pipelines, making it hard for a user to know when something's gone wrong. Second, errors and warnings are intermingled with legitimate output, making it difficult to pass the results of interactive scripts to other tools.

This commit adds a second set of print commands to Terminal for printing to standard error, with errorPrint corresponding to print and errorPrintln corresponding to println. This leaves it to developers to decide which output should go where. It also adjusts existing commands to send errors and warnings to stderr.

Usage is printed to standard output when it's correctly requested (e.g., bin/elasticsearch-keystore --help) but goes to standard error when a command is invoked incorrectly (e.g. bin/elasticsearch-keystore list-with-a-typo | sort).
2019-08-21 14:46:07 -04:00
Igor Motov 98c850c08b
Geo: Change order of parameter in Geometries to lon, lat 7.x (#45618)
Changes the order of parameters in Geometries from lat, lon to lon, lat
and moves all Geometry classes are moved to the
org.elasticsearch.geomtery package.

Backport of #45332

Closes #45048
2019-08-16 14:42:02 -04:00
Armin Braun 1cd464d675
Isolate Request in Call-Chain for REST Request Handling (#45130) (#45417)
* Follow up to #44949
* Stop using a special code path for multi-line JSON and instead handle its detection like that of other XContent types when creating the request
* Only leave a single path that holds a reference to the full REST request
   * In the next step we can move the copying of request content to happen before the actual request handling and make it conditional on the handler in question to stop copying bulk requests as suggested in #44564
2019-08-10 10:21:01 +02:00
Yannick Welsch 17846212bd Fix tests after backport of #44055 2019-08-06 14:19:20 +02:00
Yannick Welsch a453cd489e Run testExtendedSocketOptions only on JDK11+ (#44055)
This functionality only works on JDK 11 or higher
2019-08-06 13:15:17 +02:00
Yannick Welsch 7aeb2fe73c Add per-socket keepalive options (#44055)
Uses JDK 11's per-socket configuration of TCP keepalive (supported on Linux and Mac), see
https://bugs.openjdk.java.net/browse/JDK-8194298, and exposes these as transport settings.
By default, these options are disabled for now (i.e. fall-back to OS behavior), but we would like
to explore whether we can enable them by default, in particular to force keepalive configurations
that are better tuned for running ES.
2019-08-06 10:45:44 +02:00
Tim Brooks 984ba82251
Move nio channel initialization to event loop (#45155)
Currently in the transport-nio work we connect and bind channels on the
a thread before the channel is registered with a selector. Additionally,
it is at this point that we set all the socket options. This commit
moves these operations onto the event-loop after the channel has been
registered with a selector. It attempts to set the socket options for a
non-server channel at registration time. If that fails, it will attempt
to set the options after the channel is connected. This should fix
#41071.
2019-08-02 17:31:31 -04:00
Tim Brooks fdc6c9853f
Do not write if connect incomplete (#44466)
Currently, we do not handle READ or WRITE events until the channel
connection process is complete. However, the external write queue path
allows a write to be attempted when the conneciton is not complete. This
commit closes the loophole and only queues write operations when the
connection process is not complete.
2019-07-31 14:30:14 -06:00
Christoph Büscher f6922bca2d
Unmute test that seems to be fixed (#44432)
Since #42509 is closed and the fix seems to have been backported to 7.x (#43539)
the test can be enabled again.
2019-07-31 16:33:21 +02:00
Mark Vieira a89860160b
Expose Elasticsearch API nullability information to Kotlin compiler. (#43912) (#44518)
This change allows the Kotlin compiler to type check methods annotated with the
org.elasticsearch.common.Nullable annotation in Elasticsearch Java
APIs as described in: https://kotlinlang.org/docs/reference/java-interop.html#jsr-305-support.

(cherry picked from commit 0d0485ad9cf10e16b75b862b023b42827c375599)
2019-07-25 12:16:38 -07:00
Tanguy Leroux a8905ef142
[7.x] Add CloseIndexResponse to HLRC (#44349) (#44788)
The CloseIndexResponse was improved in #39687; this commit
exposes it in the HLRC.

Backport of #44349 to 7.x.
2019-07-24 15:51:01 +02:00
Igor Motov 9338fc8536 GEO: Switch to using GeoTestUtil to generate random geo shapes (#44635)
Switches to more robust way of generating random test geometries by
reusing lucene's GeoTestUtil. Removes duplicate random geometry
generators by moving them to the test framework.

Closes #37278
2019-07-23 14:30:41 -04:00
Ioannis Kakavas 3714cb63da Allow parsing the value of java.version sysprop (#44017)
We often start testing with early access versions of new Java
versions and this have caused minor issues in our tests
(i.e. #43141) because the version string that the JVM reports
cannot be parsed as it ends with the string -ea.

This commit changes how we parse and compare Java versions to
allow correct parsing and comparison of the output of java.version
system property that might include an additional alphanumeric
part after the version numbers
 (see [JEP 223[(https://openjdk.java.net/jeps/223)). In short it 
handles a version number part, like before, but additionally a 
PRE part that matches ([a-zA-Z0-9]+).

It also changes a number of tests that would attempt to parse
java.specification.version in order to get the full version
of Java. java.specification.version only contains the major
version and is thus inappropriate when trying to compare against
a version that might contain a minor, patch or an early access
part. We know parse java.version that can be consistently
parsed.

Resolves #43141
2019-07-22 20:14:56 +03:00
Yannick Welsch c8b66c549d Ignore failures to set socket options on Mac (#44355)
Brings some temporary relief for test failures until #41071 is addressed.
2019-07-17 18:51:25 +02:00
Tim Brooks 0a352486e8
Isolate nio channel registered from channel active (#44388)
Registering a channel with a selector is a required operation for the
channel to be handled properly. Currently, we mix the registeration with
other setup operations (ip filtering, SSL initiation, etc). However, a
fail to register is fatal. This PR modifies how registeration occurs to
immediately close the channel if it fails.

There are still two clear loopholes for how a user can interact with a
channel even if registration fails. 1. through the exception handler.
2. through the channel accepted callback. These can perhaps be improved
in the future. For now, this PR prevents writes from proceeding if the
channel is not registered.
2019-07-16 17:18:57 -06:00
Armin Braun 5c8275cd2c
Fix Exceptions in EventHandler#postHandling Breaking Select Loop (#44347) (#44396)
* Fix Exceptions in EventHandler#postHandling Breaking Select Loop

* We can run into the `write` path for SSL channels when they are not fully registered (if registration fails and a close message is attempted to be written) and thus into NPEs from missing selection keys
  * This is a quick fix to quiet down tests, a cleaner solution will be incoming for #44343
* Relates #44343
2019-07-16 07:06:26 +02:00
Armin Braun d2407d0ffc
Remove Redundant Setting of OP_WRITE Interest (#43653) (#44255)
* Remove Redundant Setting of OP_WRITE Interest

* We shouldn't have to set OP_WRITE interest before running into a partial write. Since setting OP_WRITE is handled by the `eventHandler.postHandling` logic, I think we can simply remove this operation and simplify/remove tests that were testing the setting of the write interest
2019-07-12 09:08:17 +02:00
Igor Motov 66a9b721f5 Add Map to XContentParser Wrapper (#44036)
In some cases we need to parse some XContent that is already parsed into
a map. This is currently happening in handling source in SQL and ingest
processors as well as parsing null_value values in geo mappings. To avoid
re-serializing and parsing the value again or writing another map-based
parser this commit adds an iterator that iterates over a map as if it was
XContent. This makes reusing existing XContent parser on maps possible.

Relates to #43554
2019-07-11 09:38:31 -04:00
Igor Motov df2e1fb43e Geo: add validator that only checks altitude (#43893)
By default, we don't check ranges while indexing geo_shapes. As a
result, it is possible to index geoshapes that contain contain
coordinates outside of -90 +90 and -180 +180 ranges. Such geoshapes
will currently break SQL and ML retrieval mechanism. This commit removes
these restriction from the validator is used in SQL and ML retrieval.
2019-07-10 16:55:03 -04:00
Igor Motov 3607876a71 Geo: Makes coordinate validator in libs/geo plugable (#43657)
Moves coordinate validation from Geometry constructors into
parser.

Relates #43644
2019-06-27 19:53:41 -04:00
Armin Braun c00e305d79
Optimize Selector Wakeups (#43515) (#43650)
* Use atomic boolean to guard wakeups
* Don't trigger wakeups from the select loops thread itself for registering and closing channels
* Don't needlessly queue writes

Co-authored-by:  Tim Brooks <tim@uncontended.net>
2019-06-26 20:00:42 +02:00
Alexander Reelsen 9493c145d7 Upgrade jcodings dependency to 1.0.44 (#43334) 2019-06-26 10:03:40 +02:00
Yogesh Gaikwad ca43cdf755
Fix for PemTrustConfigTests.testTrustConfigReloadsFileContents failure (#43539) (#43613)
The test `PemTrustConfigTests.testTrustConfigReloadsFileContents` failed
intermittently with `ArrayIndexOutOfBoundsException` while parsing
the randomly generated bytes array representing DER encoded stream.
This seems to be a bug in JDK (once confirmed we can raise the bug
in JDK bugs system).

The problem arises when the `X509Factory#parseX509orPKCS7()` tries to
[create `PKCS7` block](19fb8f93c5/src/java.base/share/classes/sun/security/provider/X509Factory.java (L460)) from der encoded stream. While constructing PKCS7
block it tries to create `ContentInfo` type but fails to do so for the
stream where the length after the DER SEQUENCE is 0.
`DerInputStream#getSequence` [may return empty array of `DerValue`](19fb8f93c5/src/java.base/share/classes/sun/security/util/DerInputStream.java (L409..L412)) but
[the code in `ContentInfo`](19fb8f93c5/src/java.base/share/classes/sun/security/pkcs/ContentInfo.java (L135)) does not check for the empty thereby throwing
`ArrayIndexOutOfBoundsException`.

Closes #42509
2019-06-26 13:32:01 +10:00
Lee Hinman 8927081981
[7.x] Add TimeValue.toHumanReadableString() to allow specifyin… (#43545)
* Enhance TimeValue.toString() to allow specifying fractional values.

This enhances the `TimeValue` class to allow specifying the number of
truncated fractional decimals when calling `toString()`. The default
remains 1, however, more (or less, such as 0) can be specified to change
the output.

This commit also re-organizes some things in `TimeValue` such as putting
all the class variables near the top of the class, and moving the
constructors to the first methods in the class, in order to follow the
structure of our other code.

* Rename `toString(...)` to `toHumanReadableString(...)`
2019-06-25 14:06:23 -06:00
Przemysław Witek 76a750a0a0
Remove unused mapStringsOrdered method (#42513) (#43585) 2019-06-25 20:43:38 +02:00
Przemysław Witek c702cd7415
[7.x] Implement XContentParser.genericMap and XContentParser.genericMapOrdered methods (#42059) (#43575) 2019-06-25 16:04:54 +02:00
Tim Brooks 38516a4dd5
Move nio ip filter rule to be a channel handler (#43507)
Currently nio implements ip filtering at the channel context level. This
is kind of a hack as the application logic should be implemented at the
handler level. This commit moves the ip filtering into a channel
handler. This requires adding an indicator to the channel handler to
show when a channel should be closed.
2019-06-24 10:03:24 -06:00
Armin Braun 5f87caa54c
Assert ServerSocketChannel is not Blocking (#43479) (#43488)
* Assert ServerSocketChannel is not Blocking

* Relates #43387 which appears to run into blocking accept calls
2019-06-21 21:28:58 +02:00
sandmannn cf610b5e81 Added parsing of erroneous field value (#42321) 2019-06-20 15:24:04 -04:00
Igor Motov 9f7d1ff2de Geo: Add coerce support to libs/geo WKT parser (#43273)
Adds support for coercing not closed polygons and ignoring Z value
to libs/geo WKT parser.

Closes #43173
2019-06-18 14:41:01 -04:00
Julie Tibshirani 4b1d8e4433 Allow big integers and decimals to be mapped dynamically. (#42827)
This PR proposes to model big integers as longs (and big decimals as doubles)
in the context of dynamic mappings.

Previously, the dynamic mapping logic did not recognize big integers or
decimals, and would an error of the form "No matching token for number_type
[BIG_INTEGER]" when a dynamic big integer was encountered. It now accepts these
numeric types and interprets them as 'long' and 'double' respectively. This
allows `dynamic_templates` to accept and and remap them as another type such as
`keyword` or `scaled_float`.

Addresses #37846.
2019-06-14 10:05:11 -07:00
Jason Tedor 5f7d5920d7
Fix IOUtils#fsync on Windows fsyncing directories (#43008)
Fsyncing directories on Windows is not possible. We always suppressed
this by allowing that an AccessDeniedException is thrown when attemping
to open the directory for reading. Yet, this suppression also allowed
other IOExceptions to be suppressed, and that was a bug (e.g., the
directory not existing, or a filesystem error and reasons that we might
get an access denied there, like genuine permissions issues). This
leniency was previously removed yet it exposed that we were suppressing
this case on Windows. Rather than relying on exceptions for flow control
and continuing to suppress there, we simply return early if attempting
to fsync a directory on Windows (we will not put this burden on the
caller).
2019-06-07 23:00:26 -04:00
Jason Tedor 45bbd7f7f1
Only ignore IOException when fsyncing on dirs (#42972)
Today in the method IOUtils#fsync we ignore IOExceptions when fsyncing a
directory. However, the catch block here is too broad, for example it
would be ignoring IOExceptions when we try to open a non-existant
file. This commit addresses that by scoping the ignored exceptions only
to the invocation of FileChannel#force.
2019-06-07 08:35:09 -04:00
Tim Brooks 667c613d9e
Remove `nonApplicationWrite` from `SSLDriver` (#42954)
Currently, when the SSLEngine needs to produce handshake or close data,
we must manually call the nonApplicationWrite method. However, this data
is only required when something triggers the need (starting handshake,
reading from the wire, initiating close, etc). As we have a dedicated
outbound buffer, this data can be produced automatically. Additionally,
with this refactoring, we combine handshake and application mode into a
single mode. This is necessary as there are non-application messages that
are sent post handshake in TLS 1.3. Finally, this commit modifies the
SSLDriver tests to test against TLS 1.3.
2019-06-06 17:44:40 -04:00
Albert Zaharovits 72eb9c2d44
Eclipse libs projects setup fix (#42852)
Fallout from #42773 for eclipse users.

(cherry picked from commit 998419c49fe51eb8343664a80f07d8d8d39abc6a)
2019-06-04 13:52:41 -07:00
Mark Vieira e44b8b1e2e
[Backport] Remove dependency substitutions 7.x (#42866)
* Remove unnecessary usage of Gradle dependency substitution rules (#42773)

(cherry picked from commit 12d583dbf6f7d44f00aa365e34fc7e937c3c61f7)
2019-06-04 13:50:23 -07:00
Alan Woodward ab6b86bac9 Add option to ObjectParser to consume unknown fields (#42491)
ObjectParser has two ways of dealing with unknown fields: ignore them entirely,
or throw an error. Sometimes it can be useful instead to gather up these unknown
fields and record them separately, for example as arbitrary entries in a map.

This commit adds the ability to specify an unknown field consumer on an ObjectParser,
called with the field name and parsed value of each unknown field encountered during
parsing. The public API of ObjectParser is largely unchanged, with a single new
constructor method and interface definition.
2019-05-31 11:33:47 +01:00
Mark Vieira c1816354ed
[Backport] Improve build configuration time (#42674) 2019-05-30 10:29:42 -07:00
Igor Motov d2f9ccbe18 Geo: Refactor libs/geo parsers (#42549)
Refactors the WKT and GeoJSON parsers from an utility class into an
instantiatable objects. This is a preliminary step in
preparation for moving out coordinate validators from Geometry
constructors. This should allow us to make validators plugable.
2019-05-29 20:07:27 -04:00
Marios Trivyzas 56677f69cf Mute testTrustConfigReloadsFileContents
Tracked by #42509
2019-05-24 14:03:46 +02:00
David Roberts 14f29de2a8 Avoid HashMap construction on Grok non-match (#42444)
This change moves the construction of the result
HashMap in Grok.captures() into the branch that
actually needs it.

This probably will not make a measurable difference
for ingest pipelines, but it is beneficial to the
ML find_file_structure endpoint, as it tries out
many Grok patterns that will fail to match.
2019-05-23 21:09:33 +01:00
Jay Modi dbbdcea128
Update ciphers for TLSv1.3 and JDK11 if available (#42082)
This commit updates the default ciphers and TLS protocols that are used
when the runtime JDK supports them. New cipher support has been
introduced in JDK 11 and 12 along with performance fixes for AES GCM.
The ciphers are ordered with PFS ciphers being most preferred, then
AEAD ciphers, and finally those with mainstream hardware support. When
available stronger encryption is preferred for a given cipher.

This is a backport of #41385 and #41808. There are known JDK bugs with
TLSv1.3 that have been fixed in various versions. These are:

1. The JDK's bundled HttpsServer will endless loop under JDK11 and JDK
12.0 (Fixed in 12.0.1) based on the way the Apache HttpClient performs
a close (half close).
2. In all versions of JDK 11 and 12, the HttpsServer will endless loop
when certificates are not trusted or another handshake error occurs. An
email has been sent to the openjdk security-dev list and #38646 is open
to track this.
3. In JDK 11.0.2 and prior there is a race condition with session
resumption that leads to handshake errors when multiple concurrent
handshakes are going on between the same client and server. This bug
does not appear when client authentication is in use. This is
JDK-8213202, which was fixed in 11.0.3 and 12.0.
4. In JDK 11.0.2 and prior there is a bug where resumed TLS sessions do
not retain peer certificate information. This is JDK-8212885.

The way these issues are addressed is that the current java version is
checked and used to determine the supported protocols for tests that
provoke these issues.
2019-05-20 09:45:36 -04:00
Tim Brooks 927013426a
Read multiple TLS packets in one read call (#41820)
This is related to #27260. Currently we have a single read buffer that
is no larger than a single TLS packet. This prevents us from reading
multiple TLS packets in a single socket read call. This commit modifies
our TLS work to support reading similar to the plaintext case. The data
will be copied to a (potentially) recycled TLS packet-sized buffer for
interaction with the SSLEngine.
2019-05-06 09:51:32 -06:00
Tim Brooks b4bcbf9f64
Support http read timeouts for transport-nio (#41466)
This is related to #27260. Currently there is a setting
http.read_timeout that allows users to define a read timeout for the
http transport. This commit implements support for this functionality
with the transport-nio plugin. The behavior here is that a repeating
task will be scheduled for the interval defined. If there have been
no requests received since the last run and there are no inflight
requests, the channel will be closed.
2019-05-02 09:48:52 -06:00
Tim Brooks df3ef66294
Remove dedicated SSL network write buffer (#41654)
This is related to #27260. Currently for the SSLDriver we allocate a
dedicated network write buffer and encrypt the data into that buffer one
buffer at a time. This requires constantly switching between encrypting
and flushing. This commit adds a dedicated outbound buffer for SSL
operations that will internally allocate new packet sized buffers as
they are need (for writing encrypted data). This allows us to totally
encrypt an operation before writing it to the network. Eventually it can
be hooked up to buffer recycling.

This commit also backports the following commit:

Handle WRAP ops during SSL read

It is possible that a WRAP operation can occur while decrypting
handshake data in TLS 1.3. The SSLDriver does not currently handle this
well as it does not have access to the outbound buffer during read call.
This commit moves the buffer into the Driver to fix this issue. Data
wrapped during a read call will be queued for writing after the read
call is complete.
2019-04-29 17:59:13 -06:00
Igor Motov 10ab838106
Geo: Add GeoJson parser to libs/geo classes (#41575) (#41657)
Adds GeoJson parser for Geometry classes defined in libs/geo.

Relates #40908 and #29872
2019-04-29 19:43:31 -04:00
Nick Knize 113b24be4b Refactor GeoHashUtils (#40869)
This commit refactors GeoHashUtils class into a new Geohash utility class located in the ES geo library. The intent is to not only better control what geo methods are whitelisted for painless scripting but to clean up the geo utility API in general.
2019-04-26 10:06:36 -05:00
Tim Vernum 13fa72cae3
Fix broken test on FIPS for specific seed (#41230)
Under random seed 4304ED44CB755610 the generated byte pattern causes
BC-FIPS to throw

    java.io.IOException: DER length more than 4 bytes: 101

Rather than simply returning an empty list (as it does for most random
values).

Backport of: #40939
2019-04-26 15:43:48 +10:00
Tim Brooks 1f8ff052a1
Revert "Remove dedicated SSL network write buffer (#41283)"
This reverts commit f65a86c258.
2019-04-25 18:39:25 -06:00
Tim Brooks f65a86c258
Remove dedicated SSL network write buffer (#41283)
This is related to #27260. Currently for the SSLDriver we allocate a
dedicated network write buffer and encrypt the data into that buffer one
buffer at a time. This requires constantly switching between encrypting
and flushing. This commit adds a dedicated outbound buffer for SSL
operations that will internally allocate new packet sized buffers as
they are need (for writing encrypted data). This allows us to totally
encrypt an operation before writing it to the network. Eventually it can
be hooked up to buffer recycling.
2019-04-25 14:30:54 -06:00
Christoph Büscher 52495843cc [Docs] Fix common word repetitions (#39703) 2019-04-25 20:47:47 +02:00
Ryan Ernst 7e3875d781 Upgrade hamcrest to 2.1 (#41464)
hamcrest has some improvements in newer versions, like FileMatchers
that make assertions regarding file exists cleaner. This commit upgrades
to the latest version of hamcrest so we can start using new and improved
matchers.
2019-04-24 23:40:03 -07:00
Mark Vieira 1287c7d91f
[Backport] Replace usages RandomizedTestingTask with built-in Gradle Test (#40978) (#40993)
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)

This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.

(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)

* Fix forking JVM runner

* Don't bump shadow plugin version
2019-04-09 11:52:50 -07:00
Henning Andersen 14ee3d3f95 Unmute and fix testSubParserArray (#40626)
testSubParserArray failed, fixed and improved to not always have an
object as outer-level inside array.

Closes #40617
2019-03-29 17:39:12 +01:00
Henning Andersen 92d07e9377 Geo Point parse error fix (#40447)
When geo point parsing threw a parse exception, it did not consume
remaining tokens from the parser. This in turn meant that
indexing documents with malformed geo points into mappings with
ignore_malformed=true would fail in some cases, since DocumentParser
expects geo_point parsing to end on the END_OBJECT token.

Related to #17617
2019-03-29 17:39:12 +01:00
David Turner 1a3916a8de Optimise rejection of out-of-range `long` values (#40325)
Today if you try and insert a very large number like `1e9999999` into a long
field we first construct this number as a `BigDecimal`, convert this to a
`BigInteger` and then reject it because it is out of range. Unfortunately
making such a large `BigInteger` is rather expensive.

We can avoid this expense by performing a (weaker) range check on the
`BigDecimal` representation of incoming `long`s too.

Relates #26137
Closes #40323
2019-03-28 12:27:34 +00:00
Mayya Sharipova 49a7c6e0e8
Expose proximity boosting (#39385) (#40251)
Expose DistanceFeatureQuery for geo, date and date_nanos types

Closes #33382
2019-03-20 09:24:41 -04:00
Igor Motov 4a42e408c5 GEO: Add support for z values to libs/geo classes (#38921)
Adds support for z-values to all Geometry objects in the
libs/geo library.
2019-03-13 15:36:03 -04:00
Tim Brooks 5612ed97ca
Add log warnings for long running event handling (#39729)
Recently we have had a number of test issues related to blocking
activity occuring on the io thread. This commit adds a log warning for
when handling event takes a >150 milliseconds. This is implemented
for the MockNioTransport which is the transport used in
ESIntegTestCase.
2019-03-08 13:07:24 -07:00
Alpar Torok 813351fe26 Un-mute and fix BuildExamplePluginsIT (#38899)
* Un-mute and fix BuildExamplePluginsIT

There doesn't seem to be anything wrong with the test iteself.
I think the failure were CI performance related, but while it was muted,
some failures managed to sneak in.

Closes #38784

* PR review
2019-03-04 08:50:55 +02:00
Armin Braun da9190be0a
Add Checks for Closed Channel in Selector Loop (#39096) (#39439)
* A few warnings could be observed in test logs about `NoSuchElementException` being thrown in `InboundChannelBuffer#sliceBuffersTo`.
These were the result of calls to this method after the relevant channel and hence the buffer was closed already as a result of a failed IO operation.
  * Fixed by adding the necessary guard statements to break out in these cases. I don't think there is a need here to do any additional error handling since `eventHandler.postHandling(channelContext);` at the end of the `processKey`
call in the main selection loop handles closing channels and invoking callbacks for writes that failed to go through already.
2019-02-27 11:28:30 +01:00
Albert Zaharovits ca630bbe6f Fix DissectParserTests expecting unique keys (#39262)
Fixes a bug in DissectParserTests where the tests expected dissect
keys to be unique but were not.

Closes #39244
2019-02-22 17:16:24 +02:00
Albert Zaharovits 08ad740d48 Mute test (#39248)
Mute test DissectParserTests.testBasicMatchUnicode
2019-02-21 17:36:01 +02:00
Albert Zaharovits 5c30446bd0 Fix libs:ssl-config project setup (#39074)
The build script file for the `:libs:elasticsearch-ssl-config` and
`:libs:ssl-config-tests` projects was incorrectly named `eclipse.build.gradle`
 while the expected name was `eclipse-build.gradle`.
In addition, this also adds a missing snippet in the `build.gradle` conf file,
that fixes the project setup for Eclipse users.
2019-02-19 02:23:11 +02:00
Tim Vernum 8895befe51
Generate mvn pom for ssl-config library (#39026)
This is used by the reindex-client library which is published to maven

Relates: #37287, #37527
Backport of: #39019
2019-02-18 20:07:22 +11:00
austintp 8ebff0512b Updates the grok patterns to be consistent with logstash (#27181) 2019-02-05 12:37:02 -06:00
Jay Modi 2ca22209cd
Enable TLSv1.3 by default for JDKs with support (#38103)
This commit enables the use of TLSv1.3 with security by enabling us to
properly map `TLSv1.3` in the supported protocols setting to the
algorithm for a SSLContext. Additionally, we also enable TLSv1.3 by
default on JDKs that support it.

An issue was uncovered with the MockWebServer when TLSv1.3 is used that
ultimately winds up in an endless loop when the client does not trust
the server's certificate. Due to this, SSLConfigurationReloaderTests
has been pinned to TLSv1.2.

Closes #32276
2019-02-01 08:34:11 -07:00
Alpar Torok d417997aca
Fix eclipse config for ssl-config (#38096) 2019-02-01 10:47:54 +02:00
Henning Andersen 68ed72b923
Handle scheduler exceptions (#38014)
Scheduler.schedule(...) would previously assume that caller handles
exception by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

This is a continuation of #28667, #36137 and also fixes #37708.
2019-01-31 17:51:45 +01:00
Igor Motov 23805fa41a
Geo: Fix Empty Geometry Collection Handling (#37978)
Fixes handling empty geometry collection and re-enables
testParseGeometryCollection test.

Fixes #37894
2019-01-30 09:20:30 -05:00
markharwood 1579ac032b
Added missing eclipse-build.gradle files (#37980)
Eclipse build files were missing so .eclipse project files were not being generated.

Closes #37973
2019-01-29 16:43:24 +00:00
Igor Motov 68149b6058
Geo: replace intermediate geo objects with libs/geo (#37721)
Replaces intermediate geo objects built by ShapeBuilders with
objects from the libs/geo hierarchy. This should allow us to build
all geo functionality around a single hierarchy.

Follow up for #35320
2019-01-25 11:37:27 -05:00
Christoph Büscher b4b4cd6ebd
Clean codebase from empty statements (#37822)
* Remove empty statements

There are a couple of instances of undocumented empty statements all across the
code base. While they are mostly harmless, they make the code hard to read and
are potentially error-prone. Removing most of these instances and marking blocks
that look empty by intention as such.

* Change test, slightly more verbose but less confusing
2019-01-25 14:23:02 +01:00
Tim Vernum 03690d12b2
Remove TLS 1.0 as a default SSL protocol (#37512)
The default value for ssl.supported_protocols no longer includes TLSv1
as this is an old protocol with known security issues.
Administrators can enable TLSv1.0 support by configuring the
appropriate `ssl.supported_protocols` setting, for example:

xpack.security.http.ssl.supported_protocols: ["TLSv1.2","TLSv1.1","TLSv1"]

Relates: #36021
2019-01-25 15:46:39 +11:00
Alpar Torok 37768b7eac
Testing conventions now checks for tests in main (#37321)
* Testing conventions now checks for tests in main

This is the last outstanding feature of the old NamingConventionsTask,
so time to remove it.

* PR review
2019-01-24 17:30:50 +02:00
Tim Brooks 21838d73b5
Extract message serialization from `TcpTransport` (#37034)
This commit introduces a NetworkMessage class. This class has two
subclasses - InboundMessage and OutboundMessage. These messages can
be serialized and deserialized independent of the transport. This allows
more granular testing. Additionally, the serialization mechanism is now
a simple Supplier. This builds the framework to eventually move the
serialization of transport messages to the network thread. This is the
one serialization component that is not currently performed on the
network thread (transport deserialization and http serialization and
deserialization are all on the network thread).
2019-01-21 14:14:18 -07:00
Tim Brooks f516d68fb2
Share `NioGroup` between http and transport impls (#37396)
Currently we create dedicated network threads for both the http and
transport implementations. Since these these threads should never
perform blocking operations, these threads could be shared. This commit
modifies the nio-transport to have 0 http workers be default. If the
default configs are used, this will cause the http transport to be run
on the transport worker threads. The http worker setting will still exist
in case the user would like to configure dedicated workers. Additionally,
this commmit deletes dedicated acceptor threads. We have never had these
for the netty transport and they can be added back if a need is
determined in the future.
2019-01-21 13:50:56 -07:00
Tim Vernum 6d99e790b3
Add SSL Configuration Library (#37287)
This introduces a new ssl-config library that can parse
and validate SSL/TLS settings and files.

It supports the standard configuration settings as used in the
Elastic Stack such as "ssl.verification_mode" and
"ssl.certificate_authorities" as well as all file formats used
in other parts of Elasticsearch security (such as PEM, JKS,
PKCS#12, PKCS#8, et al).
2019-01-16 21:52:17 +11:00
Igor Motov 6f91f06d86
Geo: Adds a set of no dependency geo classes for JDBC driver (#36477)
Adds a set of geo classes to represent geo data in the JDBC driver and 
to be used as an intermediate format to pass geo shapes for indexing 
and query generation in #35320.

Relates to #35767 and #35320
2019-01-15 10:52:46 -05:00
Tim Brooks 9de62f1262
Increase IO direct byte buffers to 256KB (#37283)
Currently we read and write 64KB at a time in the nio libraries. As a
single byte buffer per event loop thread does not consume much memory,
there is little reason to not increase it further. This commit increases
the buffer to 256KB but still limits a single write to 64KB. The write
limit could be increased, but too high of a write limit will lead to
copying more data (if all the data is not flushed and needs to be copied
on the next call). This is something to explore in the future.
2019-01-10 09:17:20 -07:00
Tim Brooks cfa58a51af
Add TLS/SSL channel close timeouts (#37246)
Closing a channel using TLS/SSL requires reading and writing a
CLOSE_NOTIFY message (for pre-1.3 TLS versions). Many implementations do
not actually send the CLOSE_NOTIFY message, which means we are depending
on the TCP close from the other side to ensure channels are closed. In
case there is an issue with this, we need a timeout. This commit adds a
timeout to the channel close process for TLS secured channels.

As part of this change, we need a timer service. We could use the
generic Elasticsearch timeout threadpool. However, it would be nice to
have a local to the nio event loop timer service dedicated to network needs. In
the future this service could support read timeouts, connect timeouts,
request timeouts, etc. This commit adds a basic priority queue backed
service. Since our timeout volume (channel closes) is very low, this
should be fine. However, this can be updated to something more efficient
in the future if needed (timer wheel). Everything being local to the event loop
thread makes the logic simple as no locking or synchronization is necessary.
2019-01-09 11:46:24 -07:00
Alpar Torok 6344e9a3ce
Testing conventions: add support for checking base classes (#36650) 2019-01-08 13:39:03 +02:00
Alpar Torok a7c3d5842a
Split third party audit exclusions by type (#36763) 2019-01-07 17:24:19 +02:00
Alpar Torok e9ef5bdce8
Converting randomized testing to create a separate unitTest task instead of replacing the builtin test task (#36311)
- Create a separate unitTest task instead of Gradle's built in 
- convert all configuration to use the new task 
- the  built in task is now disabled
2018-12-19 08:25:20 +02:00
Tim Brooks e63d52af63
Move page size constants to PageCacheRecycler (#36524)
`PageCacheRecycler` is the class that creates and holds pages of arrays
for various uses. `BigArrays` is just one user of these pages. This
commit moves the constants that define the page sizes for the recycler
to be on the recycler class.
2018-12-12 07:00:50 -07:00
Tim Brooks 373c67dd7a
Add DirectByteBuffer strategy for transport-nio (#36289)
This is related to #27260. In Elasticsearch all of the messages that we
serialize to write to the network are composed of heap bytes. When you
read or write to a nio socket in java, the heap memory you passed down
must be copied to/from direct memory. The JVM internally does some
buffering of the direct memory, however it is essentially unbounded.

This commit introduces a simple mechanism of buffering and copying the
memory in transport-nio. Each network event loop is given a 64kb
DirectByteBuffer. When we go to read we use this buffer and copy the
data after the read. Additionally, when we go to write, we copy the data
to the direct memory before calling write. 64KB is chosen as this is the
default receive buffer size we use for transport-netty4
(NETTY_RECEIVE_PREDICTOR_SIZE).

Since we only have one buffer per thread, we could afford larger.
However, if we the buffer is large and not all of the data is flushed in
a write call, we will do excess copies. This is something we can
explore in the future.
2018-12-06 18:09:07 -07:00
Jim Ferenczi 18866c4c0b
Make hits.total an object in the search response (#35849)
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).

Relates #33028
2018-12-05 19:49:06 +01:00
Tim Brooks b6ed6ef189
Add sni name to SSLEngine in nio transport (#35920)
This commit is related to #32517. It allows an "sni_server_name"
attribute on a DiscoveryNode to be propagated to the server using
the TLS SNI extentsion. Prior to this commit, this functionality
was only support for the netty transport. This commit adds this
functionality to the security nio transport.
2018-11-27 09:06:52 -07:00
John 0baffda390 ingest: grok remove duplicated patterns (#35886)
This commit removes the redundant (and incorrect) JAVACLASS
and JAVAFILE grok patterns. This helps to keep parity with 
Logstash's patterns. 

See also: https://github.com/logstash-plugins/logstash-patterns-core/pull/237
 
closes #35699
2018-11-26 11:13:46 -06:00
Igor Motov 39789d0a10
GEO: More robust handling of ignore_malformed in geoshape parsing (#35603)
Adds an XContent sub parser class that can to wrap another
XContent parser at the beginning of an object and allow skiping
all children in case of the parsing failure. It also uses this
subparser to ignore the rest of the GeoJson shape if the 
parsing fails and we need to ignore the geoshape due to the 
ignore_malformed flag.

Supersedes #34498

Closes #34047
2018-11-21 11:04:01 -10:00
Simon Willnauer 0cc0fd2d15
Add a frozen engine implementation (#34357)
This change adds a `frozen` engine that allows lazily open a directory reader
on a read-only shard. The engine wraps general purpose searchers in a LazyDirectoryReader
that also allows to release and reset the underlying index readers after any and before
secondary search phases.

Relates to #34352
2018-11-07 20:23:35 +01:00
Alan Woodward e2af849f70
Move ObjectPath and XContentUtils to libs/x-content (#34803)
These are generally useful utility classes that do not need to live in the Watcher code
2018-11-02 15:12:09 +00:00
Nik Everett 3cde1356c1
XContent: Check for bad parsers (#34561)
Adds checks for misbehaving parsers. The checks aren't perfect at all but
they are simple and fast enough that we can do them all the time so
they'll catch most badly behaving parsers.

Closes #34351
2018-10-25 17:03:42 -04:00
Jay Modi d824cbe992
Test: ensure char[] doesn't being with prefix (#34816)
The testCharsBeginsWith test has a check that a random prefix of length
2 is not the prefix of a char[]. However, there is no check that the
char[] is not randomly generated with the same two characters as the
prefix. This change ensures that the char[] does not begin with the
prefix.

Closes #34765
2018-10-25 08:58:21 -06:00
Julie Tibshirani 5a4866f67d Mute CharArraysTests#testCharsBeginsWith while we await a fix. 2018-10-23 11:37:54 -07:00
Alpar Torok 0536635c44
Upgrade forbiddenapis to 2.6 (#33809)
* Upgrade forbiddenapis to 2.6

Closes #33759

* Switch forbiddenApis back to official plugin

* Remove CLI based task

* Fix forbiddenApisJava9
2018-10-23 12:06:46 +03:00