Commit Graph

831 Commits

Author SHA1 Message Date
Nik Everett 7fd84a03a0
Drop references to deprecated logger (#50474) (#50681)
This drops all remaining references to `BaseRestHandler.logger` which
has been deprecated for something like a year now. I replaced all of the
references with locally declared loggers which is so much less spooky
action at a distance to me.
2020-01-06 16:34:07 -05:00
Albert Zaharovits 9ae3cd2a78
Add 'monitor_snapshot' cluster privilege (#50489) (#50647)
This adds a new cluster privilege `monitor_snapshot` which is a restricted
version of `create_snapshot`, granting the same privileges to view
snapshot and repository info and status but not granting the actual
privilege to create a snapshot.

Co-authored-by: j-bean <anton.shuvaev91@gmail.com>
2020-01-06 13:15:55 +02:00
Tim Vernum cad0f6bf28
Do not load SSLService in plugin constructor (#50519)
XPackPlugin created an SSLService within the plugin constructor.
This has 2 negative consequences:

1. The service may be constructed based on a partial view of settings.
   Other plugins are free to add setting values via the
   additionalSettings() method, but this (necessarily) happens after
   plugins have been constructed.

2. Any exceptions thrown during the plugin construction are handled
   differently than exceptions thrown during "createComponents".
   Since SSL configuration exceptions are relatively common, it is
   far preferable for them to be thrown and handled as part of the
   createComponents flow.

This commit moves the creation of the SSLService to
XPackPlugin.createComponents, and alters the sequence of some other
steps to accommodate this change.

Backport of: #49667
2019-12-30 14:42:32 +11:00
Jason Tedor 7c5a3bcf6d
Always consume the body in has privileges (#50298)
Our REST infrastructure will reject requests that have a body where the
body of the request is never consumed. This ensures that we reject
requests on endpoints that do not support having a body. This requires
cooperation from the REST handlers though, to actually consume the body,
otherwise the REST infrastructure will proceed with rejecting the
request. This commit addresses an issue in the has privileges API where
we would prematurely try to reject a request for not having a username,
before consuming the body. Since the body was not consumed, the REST
infrastructure would instead reject the request as a bad request.
2019-12-18 08:30:53 -05:00
Armin Braun 761d6e8e4b
Remove BlobContainer Tests against Mocks (#50194) (#50220)
* Remove BlobContainer Tests against Mocks

Removing all these weird mocks as asked for by #30424.
All these tests are now part of real repository ITs and otherwise left unchanged if they had
independent tests that didn't call the `createBlobStore` method previously.
The HDFS tests also get added coverage as a side-effect because they did not have an implementation
of the abstract repository ITs.

Closes #30424
2019-12-16 11:37:09 +01:00
Ioannis Kakavas 46376100b1
Fix testMalformedToken (#50164) (#50170)
This test was fixed as part of #49736 so that it used a
TokenService mock instance that was enabled, so that token
verification fails because the token is invalid and not because
the token service is not enabled.
When the randomly generated token we send decodes to a
version > 7.2, we need to have mocked a GetResponse for the call
that TokenService#getUserTokenFromId will make, otherwise this
hangs and times out.
2019-12-13 13:46:44 +02:00
Ioannis Kakavas 3b613c36f4
Always return 401 for not valid tokens (#49736) (#50042)
Return a 401 in all cases when a request is submitted with an
access token that we can't consume. Before this change, we would
throw a 500 when a request came in with an access token that we
had generated but was then invalidated/expired and deleted from
the tokens index.

Resolves: #38866
Backport of #49736
2019-12-11 09:14:50 +02:00
Armin Braun 996cddd98b
Stop Copying Every Http Request in Message Handler (#44564) (#49809)
* Copying the request is not necessary here. We can simply release it once the response has been generated and save a lot of `Unpooled` allocations that way
* Relates #32228
   * I think the issue that prevented that PR from being merged was solved by #39634, which moved the bulk index marker search to ByteBuf bulk access, so the composite buffer shouldn't require many additional bounds checks (I'd argue the bounds checks we add, we save when copying the composite buffer)
* I couldn't necessarily reproduce much of a speedup from this change, but I could reproduce a very measurable reduction in GC time with e.g. Rally's PMC (4g heap node and bulk requests of size 5k saw a reduction in young GC time by ~10% for me)
2019-12-04 08:41:42 +01:00
Tim Vernum e6f530c167
Improved diagnostics for TLS trust failures (#49669)
- Improves HTTP client hostname verification failure messages
- Adds "DiagnosticTrustManager" which logs certificate information
  when trust cannot be established (hostname failure, CA path failure,
  etc)

These diagnostic messages are designed so that many common TLS
problems can be diagnosed based solely (or primarily) on the
elasticsearch logs.

These diagnostics can be disabled by setting

     xpack.security.ssl.diagnose.trust: false

Backport of: #48911
2019-11-29 15:01:20 +11:00
Ioannis Kakavas ba0c848027
[7.x] Update opensaml dependency (#44972) (#49512)
Add a mirror of the maven repository of the shibboleth project
and upgrade opensaml and related dependencies to the latest
available version

Resolves: #44947
2019-11-29 00:17:16 +02:00
Jim Ferenczi d6445fae4b
Add a cluster setting to disallow loading fielddata on _id field (#49166)
This change adds a dynamic cluster setting named `indices.id_field_data.enabled`.
When set to `false` any attempt to load the fielddata for the `_id` field will fail
with an exception. The default value in this change is set to `false` in order to prevent
fielddata usage on this field for future versions but it will be set to `true` when backporting
to 7x. When the setting is set to true (manually or by default in 7x) the loading will also issue
a deprecation warning since we want to disallow fielddata entirely when https://github.com/elastic/elasticsearch/issues/26472
is implemented.

Closes #43599
2019-11-28 09:35:28 +01:00
Tim Vernum 901c64ebbf
Add Debug/Trace logging for authentication (#49619)
Authentication has grown more complex with the addition of new realm
types and authentication methods. When user authentication does not
behave as expected it can be difficult to determine where and why it
failed.

This commit adds DEBUG and TRACE logging at key points in the
authentication flow so that it is possible to gain additional insight
into the operation of the system.

Backport of: #49575
2019-11-27 16:39:07 +11:00
Tim Vernum e9ad1a7fcd
Fix iterate-from-1 bug in smart realm order (#49614)
The AuthenticationService has a feature to "smart order" the realm
chain so that whichever realm was the last one to successfully
authenticate a given user will be tried first when that user tries to
authenticate again.

There was a bug where the building of this realm order would
incorrectly drop the first realm from the default chain unless that
realm was the "last successful" realm.

In most cases this didn't cause problems because the first realm is
the reserved realm and so it is unusual for a user that authenticated
against a different realm to later need to authenticate against the
reserved realm.

This commit fixes that bug and adds relevant asserts and tests.

Backport of: #49473
2019-11-27 13:46:52 +11:00
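The off-by-one described in the commit above can be illustrated with a minimal Java sketch. This is not the actual `AuthenticationService` code: the class name `SmartRealmOrder`, the method `order`, and the use of plain realm names instead of realm objects are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Elasticsearch.
final class SmartRealmOrder {
    // Returns the realm chain with the last successful realm (if any) moved to the front.
    static List<String> order(List<String> defaultChain, String lastSuccessful) {
        if (lastSuccessful == null || !defaultChain.contains(lastSuccessful)) {
            return new ArrayList<>(defaultChain);
        }
        List<String> ordered = new ArrayList<>(defaultChain.size());
        ordered.add(lastSuccessful);
        // Iterate from index 0. Starting at 1 would silently drop the first
        // realm whenever it is not the last successful one.
        for (int i = 0; i < defaultChain.size(); i++) {
            String realm = defaultChain.get(i);
            if (!realm.equals(lastSuccessful)) {
                ordered.add(realm);
            }
        }
        return ordered;
    }
}
```

With a default chain of `reserved`, `file`, `ldap` and `ldap` as the last successful realm, this sketch yields `ldap`, `reserved`, `file`; a loop starting at index 1 would lose `reserved`.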
Armin Braun 3862400270
Remove Redundant EsBlobStoreTestCase (#49603) (#49605)
All the implementations of `EsBlobStoreTestCase` use the exact same
bootstrap code that is also used by their implementation of
`EsBlobStoreContainerTestCase`.
This means all tests might as well live under `EsBlobStoreContainerTestCase`
saving a lot of code duplication. Also, there was no HDFS implementation for
`EsBlobStoreTestCase` which is now automatically resolved by moving the tests over
since there is a HDFS implementation for the container tests.
2019-11-26 20:57:19 +01:00
Tim Vernum 2e5f2dd1e1
Deprecate misconfigured SSL server config (#49280)
This commit adds a deprecation warning when starting
a node where either of the server contexts
(xpack.security.transport.ssl and xpack.security.http.ssl)
meet either of these conditions:

1. The server lacks a certificate/key pair (i.e. neither
   ssl.keystore.path nor ssl.certificate are configured)
2. The server has some ssl configuration, but ssl.enabled is not
   specified. This new validation does not care whether ssl.enabled is
   true or false (though other validation might), it simply makes it
   an error to configure server SSL without being explicit about
   whether to enable that configuration.

Backport of: #45892
2019-11-22 12:14:55 +11:00
Jay Modi eed4cd25eb
ThreadPool and ThreadContext are not closeable (#43249) (#49273)
This commit changes the ThreadContext to just use a regular ThreadLocal
over the lucene CloseableThreadLocal. The CloseableThreadLocal solves
issues with ThreadLocals that are no longer needed during runtime but
in the case of the ThreadContext, we need it for the runtime of the
node and it is typically not closed until the node closes, so we miss
out on the benefits that this class provides.

Additionally by removing the close logic, we simplify code in other
places that deal with exceptions and tracking to see if it happens when
the node is closing.

Closes #42577
2019-11-19 13:15:16 -07:00
Albert Zaharovits 89b3c32b40
Audit log filter and marker (#49145)
This adds a log marker and a marker filter for the audit log.

Closes #47251
2019-11-15 08:44:09 -05:00
Ioannis Kakavas f5f0e1366a
Handle unexpected/unchecked exceptions correctly (#49080) (#49137)
Ensures that methods that are called from different threads ( i.e.
from the callbacks of org.apache.http.concurrent.FutureCallback )
catch `Exception` instead of only the expected checked exceptions.

This resolves a bug where OpenIdConnectAuthenticator#mergeObjects
would throw an IllegalStateException that was never caught causing
the thread to hang and the listener to never be called. This would
in turn cause Kibana requests to authenticate with OpenID Connect
to timeout and fail without even logging anything relevant.

This also guards against unexpected Exceptions that might be thrown
by invoked library methods while performing the necessary operations
in these callbacks.
2019-11-15 11:54:08 +02:00
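The pattern described in the commit above (catch `Exception` inside a callback so that the listener is always notified) might look roughly like the following Java sketch. The `Listener` and `ThrowingFunction` interfaces and the `SafeCallback` class are hypothetical stand-ins, not the Elasticsearch or Apache HttpClient types.

```java
// Hypothetical helper, not part of Elasticsearch.
final class SafeCallback {
    // Stand-in for an ActionListener-style interface.
    interface Listener<T> {
        void onResponse(T value);
        void onFailure(Exception e);
    }

    // Work that may throw checked or unchecked exceptions.
    interface ThrowingFunction<I, O> {
        O apply(I input) throws Exception;
    }

    // Runs the work and guarantees that exactly one listener method is called
    // for any Exception thrown by the work itself.
    static <I, O> void complete(I input, ThrowingFunction<I, O> work, Listener<O> listener) {
        final O result;
        try {
            result = work.apply(input);
        } catch (Exception e) {
            // Deliberately broad: an unchecked IllegalStateException must
            // reach the listener too, otherwise the caller waits forever.
            listener.onFailure(e);
            return;
        }
        listener.onResponse(result);
    }
}
```

Catching only the expected checked exceptions would let an unchecked exception escape on the callback thread, which is the hang the commit message describes.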
Rory Hunter c46a0e8708
Apply 2-space indent to all gradle scripts (#49071)
Backport of #48849. Update `.editorconfig` to make the Java settings the
default for all files, and then apply a 2-space indent to all `*.gradle`
files. Then reformat all the files.
2019-11-14 11:01:23 +00:00
Ioannis Kakavas 4405042900
Remove unnecessary details logged for OIDC (#48746) (#49031)
This commit removes unnecessary details logged for
OIDC.

Co-Authored-By: Ioannis Kakavas <ikakavas@protonmail.com>
2019-11-13 13:43:56 +02:00
Mark Vieira 6ab4645f4e
[7.x] Introduce type-safe and consistent pattern for handling build globals (#48818)
This commit introduces a consistent and type-safe manner for handling
global build parameters throughout our build logic. Primarily this
replaces the existing usages of extra properties with static accessors.
It also introduces an explicit API for initialization and mutation of
any such parameters, as well as better error handling for uninitialized
or eager access of parameter values.

Closes #42042
2019-11-01 11:33:11 -07:00
Ioannis Kakavas 99aedc844d
Copy http headers to ThreadContext strictly (#45945) (#48675)
Previous behavior while copying HTTP headers to the ThreadContext,
would allow multiple HTTP headers with the same name, handling only
the first occurrence and disregarding the rest of the values. This
can be confusing when dealing with multiple Headers as it is not
obvious which value is read and which ones are silently dropped.

According to RFC-7230, a client must not send multiple header fields
with the same field name in a HTTP message, unless the entire field
value for this header is defined as a comma separated list or this
specific header is a well-known exception.

This commit changes the behavior in order to be more compliant with
the aforementioned RFC by requiring the classes that implement
ActionPlugin to declare if a header can be multi-valued or not when
registering this header to be copied over to the ThreadContext in
ActionPlugin#getRestHeaders.
If the header is allowed to be multivalued, then all such headers
are read from the HTTP request and their values get concatenated in
a comma-separated string.
If the header is not allowed to be multivalued, and the HTTP
request contains multiple such Headers with different values, the
request is rejected with a 400 status.
2019-10-31 23:05:12 +02:00
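The two rules described in the commit above can be sketched in Java as follows. The class `StrictHeaderCopy` and its `resolve` method are hypothetical; the thrown `IllegalArgumentException` merely stands in for rejecting the request with a 400 status, and the real `ActionPlugin#getRestHeaders` wiring is not shown.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Elasticsearch.
final class StrictHeaderCopy {
    // Returns the single value to copy for a header, or null if it is absent.
    static String resolve(String name, List<String> values, boolean multiValueAllowed) {
        if (values.isEmpty()) {
            return null;
        }
        if (multiValueAllowed) {
            // All occurrences are kept, concatenated into a comma-separated string.
            return String.join(",", values);
        }
        Set<String> distinct = new LinkedHashSet<>(values);
        if (distinct.size() > 1) {
            // Stand-in for rejecting the request with a 400 status.
            throw new IllegalArgumentException(
                "multiple values for single-valued header [" + name + "]");
        }
        return values.get(0);
    }
}
```

Note that, as in the commit message, repeated occurrences with the same value are tolerated for a single-valued header; only differing values cause a rejection.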
Yogesh Gaikwad c7342dde29
Fix to release system resource after reading JWKSet file (#48666) (#48677)
When we load a JSON Web Key (JWKSet) from the specified
file using JWKSet.load it internally uses IOUtils.readFileToString
but the opened FileInputStream is never closed after usage.
https://bitbucket.org/connect2id/nimbus-jose-jwt/issues/342

This commit reads the file and parses the JWKSet from the string.

This also fixes an issue wherein if the underlying file changed,
for every change event it would add another file watcher. The
change is to only add the file watcher at the start.

Closes #44942
2019-10-31 10:16:33 +11:00
Yogesh Gaikwad 9ed7352a12
Add Sysprop to Adjust IO Buffer Size (#48267) (#48667)
The 1MB IO-buffer size per transport thread is causing trouble in
some tests, albeit at a low rate. Reducing the number of transport
threads was not enough to fully fix this situation.
Making the buffer size configurable and reducing it by
more than an order of magnitude should fix these tests.

Closes #46803
2019-10-30 14:19:54 +11:00
Ioannis Kakavas a0362153e2
Update oauth2-oidc-sdk and nimbus-jose-jwt (#48537) (#48628)
Update two dependencies for our OpenID Connect realm implementation
to their latest versions
2019-10-29 14:18:59 +02:00
Rory Hunter 30389c6660
Improve SAML tests resiliency to auto-formatting (#48517)
Backport of #48452.

The SAML tests have large XML documents within which various parameters
are replaced. At present, if these tests are auto-formatted, the XML
documents get strung out over many, many lines, and are basically
illegible.

Fix this by using named placeholders for variables, and indent the
multiline XML documents.

The tests in `SamlSpMetadataBuilderTests` deserve a special mention,
because they include a number of certificates in Base64. I extracted
these into variables, for additional legibility.
2019-10-27 16:06:23 +00:00
Tim Brooks f5f1072824
Multiple remote connection strategy support (#48496)
* Extract remote "sniffing" to connection strategy (#47253)

Currently the connection strategy used by the remote cluster service is
implemented as a multi-step sniffing process in the
RemoteClusterConnection. We intend to introduce a new connection strategy
that will operate in a different manner. This commit extracts the
sniffing logic to a dedicated strategy class. Additionally, it implements
dedicated tests for this class.

Additionally, in previous commits we moved away from a world where the
remote cluster connection was mutable. Instead, when setting updates are
made, the connection is torn down and rebuilt. We still had methods and
tests hanging around for the mutable behavior. This commit removes those.

* Introduce simple remote connection strategy (#47480)

This commit introduces a simple remote connection strategy which will
open remote connections to a configurable list of user supplied
addresses. These addresses can be remote Elasticsearch nodes or
intermediate proxies. We will perform normal clustername and version
validation, but otherwise rely on the remote cluster to route requests
to the appropriate remote node.

* Make remote setting updates support diff strategies (#47891)

Currently the entire remote cluster settings infrastructure is designed
around the sniff strategy. As we introduce an additional connection
strategy this infrastructure needs to be modified to support it. This
commit modifies the code so that the strategy implementations will tell
the service if the connection needs to be torn down and rebuilt.

As part of this commit, we will wait 10 seconds for new clusters to
connect when they are added through the "update" settings
infrastructure.
2019-10-25 09:29:41 -06:00
Ioannis Kakavas c6b733f1b4
Add populate_user_metadata in OIDC realm (#48357) (#48438)
Make populate_user_metadata configuration parameter
available in the OpenID Connect authentication realm

Resolves: #48217
2019-10-24 09:51:08 +03:00
Ioannis Kakavas 834f2b4546
Add brackets where necessary in error messages (#48140) (#48386)
This commit attempts to help error readability by adding brackets
where applicable/missing in saml errors.
2019-10-23 17:23:50 +03:00
Ioannis Kakavas 24e43dfa34
[7.x] Refactor FIPS BootstrapChecks to simple checks (#47499) (#48333)
FIPS 140 bootstrap checks should not be bootstrap checks as they
are always enforced. This commit moves the validation logic within
the security plugin.
The FIPS140SecureSettingsBootstrapCheck was not applicable as the
keystore was being loaded on init, before the Bootstrap checks
were checked, so an elasticsearch keystore of version < 3 would
cause the node to fail in a FIPS 140 JVM before the bootstrap check
kicked in, and as such hasn't been migrated.

Resolves: #34772
2019-10-22 12:49:01 +03:00
James Baiera 0d12ef8958
Add Enrich Origin (#48098) (#48312)
This PR adds an origin for the Enrich feature, and modifies the background 
maintenance task to use the origin when executing client operations. 
Without this fix, the maintenance task fails to execute when security is 
enabled.
2019-10-21 16:40:49 -04:00
Albert Zaharovits 69fc715bc3
Fix security origin for TokenService#findActiveTokensFor... (#47418) (#48280)
All internal searches (triggered by APIs) across the .security index
must be performed while "under the security origin". Otherwise,
the search is performed in the context of the caller which most
likely does not have privileges to search .security (hopefully).
This commit fixes this in the case of two methods in the
TokenService and corrects an excessive context switch of this kind
in the ApiKeyService.

In addition, this makes all tests from the client/rest-high-level
module execute as an all mighty administrator,
but not a literal superuser.

Closes #47151
2019-10-21 13:15:05 +03:00
Ioannis Kakavas ce3a06292b
Mute flaky testCreateApiKey test (#47973)
see #47958
2019-10-18 09:52:07 +01:00
Ioannis Kakavas 9ee7b3743e
Add FIPS 140 mode to XPack Usage API (#47278) (#47976)
This change adds support for the FIPS 140 mode feature to be
retrieved via the XPack Usage API.
2019-10-14 10:40:24 +03:00
Yogesh Gaikwad ac209c142c
Remove uniqueness constraint for API key name and make it optional (#47549) (#47959)
Since we cannot guarantee the uniqueness of the API key `name` this commit removes the constraint and makes this field optional.

Closes #46646
2019-10-12 22:22:16 +11:00
Armin Braun 302e09decf
Simplify some Common ActionRunnable Uses (#47799) (#47828)
Especially in the snapshot code there's a lot
of logic chaining `ActionRunnables` in tricky
ways now and the code is getting hard to follow.
This change introduces two convenience methods that
make it clear that a wrapped listener is invoked with
certainty in some trickier spots and shortens the code a bit.
2019-10-09 23:29:50 +02:00
Yogesh Gaikwad b6d1d2e6ec
Add 'create_doc' index privilege (#45806) (#47645)
Use case:
User with `create_doc` index privilege will be allowed to only index new documents
either via Index API or Bulk API.

There are two cases that we need to think about:
- **User indexing a new document without specifying an Id.**
   For this, ES auto-generates an Id, and since ES from version 7.5.0 onwards defaults to `op_type` `create` in this case, we just need to authorize on the `op_type`.
- **User indexing a new document with an Id.**
   This is problematic as we do not know whether a document with Id exists or not.
   If the `op_type` is `create` then we can assume the user is trying to add a document, if it exists it is going to throw an error from the index engine.

Given both of these cases, we can safely authorize based on the `op_type` value. If the value is `create` then the user with `create_doc` privilege is authorized to index new documents.

In the `AuthorizationService` when authorizing a bulk request, we check the implied action.
This code changes that to append the `:op_type/index` or `:op_type/create`
to indicate the implied index action.
2019-10-07 23:58:44 +11:00
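The `op_type` suffix described in the commit above can be sketched in Java. Only the `:op_type/index` and `:op_type/create` suffixes come from the commit message; the base action name `indices:data/write/index`, the class `ImpliedIndexAction`, and both method names are assumptions made for illustration, and the real `AuthorizationService` logic is more involved.

```java
import java.util.Locale;

// Hypothetical helper, not part of Elasticsearch.
final class ImpliedIndexAction {
    enum OpType { INDEX, CREATE }

    // Assumed base action name, used only for illustration.
    static final String BASE_ACTION = "indices:data/write/index";

    // Appends ":op_type/index" or ":op_type/create" to the implied action.
    static String impliedAction(OpType opType) {
        return BASE_ACTION + ":op_type/" + opType.name().toLowerCase(Locale.ROOT);
    }

    // In this sketch, create_doc covers only the "create" variant.
    static boolean allowedByCreateDoc(OpType opType) {
        return impliedAction(opType).endsWith(":op_type/create");
    }
}
```

Encoding the operation type in the action name lets a privilege be expressed as a plain action pattern, so a user holding only `create_doc` matches the `create` variant but not the `index` one.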
Yogesh Gaikwad 7c862fe71f
Add support to retrieve all API keys if user has privilege (#47274) (#47641)
This commit adds support to retrieve all API keys if the authenticated
user is authorized to do so.
This removes the restriction of specifying one of the
parameters (like id, name, username and/or realm name)
when the `owner` is set to `false`.

Closes #46887
2019-10-07 23:58:21 +11:00
Yogesh Gaikwad d371f9d44d
Fix for ApiKeyIntegTests related to Expired API keys remover (#43477) (#47546)
When an API key is invalidated we do two things: first, we try to trigger the `ExpiredApiKeysRemover` task,
and second, we index the invalidation for the API key. The index invalidation may happen
before the `ExpiredApiKeysRemover` task is run and in that case, the invalidated API key
will also get deleted. If the `ExpiredApiKeysRemover` runs before the
API key invalidation is indexed then the API key is not deleted and will be
deleted in a future run.
This behavior was not captured in the tests related to `ExpiredApiKeysRemover`
causing intermittent failures.
This commit fixes those tests by checking if the API key invalidated is reported
back when we get API keys after invalidation and perform the checks based on that.

Closes #41747
2019-10-04 13:17:52 +10:00
Ioannis Kakavas fd6a585009
Fix ADRealmTests in FIPS 140 JVMs (#47437) (#47506)
The changes introduced in #47179 made it so that we could try to
build an SSLContext with verification mode set to None, which is
not allowed in FIPS 140 JVMs. This commit addresses that.
2019-10-03 17:14:26 +03:00
Ioannis Kakavas 4f722f0f53
Fix Active Directory tests (#47358) (#47440)
Fixes multiple Active Directory related tests that run against the
samba fixture. Some were failing since we changed the realm settings
format in 7.0 and a few were slightly broken in other ways.
We can move to cleanup the tests in a follow up but this work fits
better to be done with or after we move the tests from a Samba
based fixture to a real(-ish) Microsoft Active Directory based
fixture.

Resolves: #33425, #35738
2019-10-02 17:18:12 +03:00
Albert Zaharovits 78558a7b2f
Fix AD realm additional metadata (#47179)
Due to a regression bug the metadata Active Directory realm
setting is ignored (it works correctly for the LDAP realm type).
This commit redresses it.

Closes #45848
2019-10-01 17:05:25 +03:00
Ioannis Kakavas 3b06916fcd
Revert "Fix Active Directory tests (#47266)"
This reverts commit 7d9c064218.
2019-10-01 13:32:31 +03:00
Ioannis Kakavas 7d9c064218
Fix Active Directory tests (#47266)
Fixes multiple Active Directory related tests that run against the
samba fixture. Some were failing since we changed the realm settings
format in 7.0 and a few were slightly broken in other ways.
We can move to cleanup the tests in a follow up but this work fits
better to be done with or after we move the tests from a Samba
based fixture to a real(-ish) Microsoft Active Directory based
fixture.

Resolves: #33425, #35738
2019-10-01 10:52:07 +03:00
Ioannis Kakavas 33c5e5b09d
Fix SSLErrorMessageTests in Windows (#47315)
- Build paths with PathUtils#get instead of hard-coding a string with
forward slashes.
- Do not try to match the whole message that includes paths. The
file separator is `\\` in windows but when we throw an Elasticsearch
Exception, the message is formatted with LoggerMessageFormat#format
which replaces `\\` with `\` in Path names. That means that in Windows
the Exception message will contain paths with single backslashes while
the expected string that comes from Path#toString on filename and
env.configFile will contain double backslashes. There is no point in
attempting to match the whole message string for the purpose of this test.

Resolves: #45598
2019-10-01 09:14:36 +03:00
Yogesh Gaikwad 2be351c5d0
Use 'should' clause instead of 'filter' when querying native privileges (#47019) (#47271)
When we added support for wildcard application names, we started to build
the prefix query along with the term query but we used 'filter' clause
instead of 'should', so this would not fetch the correct application
privilege descriptor thereby failing the _has_privilege checks.
This commit changes the clause to 'should' with minimum_should_match
set to 1.
2019-09-30 14:14:52 +10:00
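The difference between the two clause types in the commit above comes down to AND versus OR semantics. The Java sketch below models that logic with plain predicates; it is not the query that Elasticsearch builds, and the class `PrivilegeQuerySketch`, its method names, and the sample application names are all hypothetical.

```java
import java.util.List;

// Hypothetical model of the clause semantics, not part of Elasticsearch.
final class PrivilegeQuerySketch {
    // Models the term query: the application name is one of the exact names.
    static boolean termMatches(String app, List<String> exactNames) {
        return exactNames.contains(app);
    }

    // Models the prefix query built for wildcard application names.
    static boolean prefixMatches(String app, List<String> prefixes) {
        return prefixes.stream().anyMatch(app::startsWith);
    }

    // 'filter': every clause must match, so both queries have to hit.
    static boolean matchesWithFilter(String app, List<String> exactNames, List<String> prefixes) {
        return termMatches(app, exactNames) && prefixMatches(app, prefixes);
    }

    // 'should' with minimum_should_match = 1: at least one clause must match.
    static boolean matchesWithShould(String app, List<String> exactNames, List<String> prefixes) {
        return termMatches(app, exactNames) || prefixMatches(app, prefixes);
    }
}
```

For an application named `kibana-prod`, exact names `[myapp]` and prefixes `[kibana-]`, the filter variant matches nothing while the should variant finds the descriptor.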
Rory Hunter 53a4d2176f
Convert most awaitBusy calls to assertBusy (#45794) (#47112)
Backport of #45794 to 7.x. Convert most `awaitBusy` calls to
`assertBusy`, and use asserts where possible. Follows on from #28548 by
@liketic.

There were a small number of places where it didn't make sense to me to
call `assertBusy`, so I kept the existing calls but renamed the method to
`waitUntil`. This was partly to better reflect its usage, and partly so
that anyone trying to add a new call to awaitBusy wouldn't be able to find
it.

I also didn't change the usage in `TransportStopRollupAction` as the
comments state that the local awaitBusy method is a temporary
copy-and-paste.

Other changes:

  * Rework `waitForDocs` to scale its timeout. Instead of calling
    `assertBusy` in a loop, work out a reasonable overall timeout and await
    just once.
  * Some tests failed after switching to `assertBusy` and had to be fixed.
  * Correct the expected templates in AbstractUpgradeTestCase.  The ES
    Security team confirmed that they don't use templates any more, so
    remove this from the expected templates. Also rewrite how the setup
    code checks for templates, in order to give more information.
  * Remove an expected ML template from XPackRestTestConstants The ML team
    advised that the ML tests shouldn't be waiting for any
    `.ml-notifications*` templates, since such checks should happen in the
    production code instead.
  * Also rework the template checking code in `XPackRestTestHelper` to give
    more helpful failure messages.
  * Fix issue in `DataFrameSurvivesUpgradeIT` when upgrading from < 7.4
2019-09-29 12:21:46 +01:00
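For readers unfamiliar with the idiom in the commit above, an `assertBusy`-style helper retries an assertion until it passes or a timeout expires, then rethrows the last failure. The Java sketch below is a simplified illustration under assumed names (`BusyAssert`, and a back-off capped at 100 ms); it is not the test framework's actual implementation.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not the Elasticsearch test framework code.
final class BusyAssert {
    // Re-runs the assertion with exponential back-off until it passes or
    // maxWait has elapsed, in which case the last AssertionError is rethrown.
    static void assertBusy(Runnable assertion, long maxWait, TimeUnit unit) {
        final long deadline = System.nanoTime() + unit.toNanos(maxWait);
        long sleepMillis = 1;
        while (true) {
            try {
                assertion.run();
                return; // assertion passed
            } catch (AssertionError e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // out of time: surface the real failure
                }
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt();
                throw new AssertionError("interrupted while waiting", ie);
            }
            sleepMillis = Math.min(sleepMillis * 2, 100);
        }
    }
}
```

Compared with a helper that only returns a boolean, rethrowing the last `AssertionError` preserves the failure message, which is one reason to prefer asserts where possible.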
Tanguy Leroux 95e2ca741e
Remove unused private methods and fields (#47154)
This commit removes a bunch of unused private fields and unused
private methods from the code base.

Backport of (#47115)
2019-09-26 12:49:21 +02:00
Yogesh Gaikwad 9a64b7a888
[Backport] Validate `query` field when creating roles (#46275) (#47094)
In the current implementation, the validation of the role query
occurs at runtime when the query is being executed.

This commit adds validation for the role query when creating a role
but not for the template query as we do not have the runtime
information required for evaluating the template query (eg. authenticated user's
information). This is similar to the scripts that we
store but do not evaluate or parse if they are valid queries or not.

For validation, the query is evaluated (if not a template), parsed to build the
QueryBuilder, and checked to verify that the query type is allowed.

Closes #34252
2019-09-26 17:57:36 +10:00
Jim Ferenczi 04972baffa
Merge ShardSearchTransportRequest and ShardSearchLocalRequest (#46996) (#47081)
This change merges the `ShardSearchTransportRequest` and `ShardSearchLocalRequest`
into a single `ShardSearchRequest` that can be used to create a SearchContext.

Relates #46523
2019-09-26 09:20:53 +02:00
Ioannis Kakavas 23bceaadf8
Handle RelayState in preparing a SAMLAuthN Request (#46534) (#47092)
This change allows for the caller of the `saml/prepare` API to pass
a `relay_state` parameter that will then be part of the redirect
URL in the response as the `RelayState` query parameter.

The SAML IdP is required to reflect back the value of that relay
state when sending a SAML Response. The caller of the APIs can
then, when receiving the SAML Response, read and consume the value
as it sees fit.
2019-09-25 13:23:46 +03:00
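How a relay state might end up in the redirect URL, as described in the commit above, can be sketched in Java. The class `RelayStateRedirect`, the method `withRelayState`, and the example URL are hypothetical, and the sketch ignores request signing, where parameter order matters.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Elasticsearch.
final class RelayStateRedirect {
    // Appends the relay state as a URL-encoded "RelayState" query parameter.
    static String withRelayState(String redirectUrl, String relayState) {
        if (relayState == null || relayState.isEmpty()) {
            return redirectUrl; // nothing to reflect back
        }
        String separator = redirectUrl.contains("?") ? "&" : "?";
        return redirectUrl + separator + "RelayState="
            + URLEncoder.encode(relayState, StandardCharsets.UTF_8);
    }
}
```

For example, calling `withRelayState("https://idp.example.org/sso?SAMLRequest=abc", "/app/home")` appends `&RelayState=%2Fapp%2Fhome` to the URL.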
Albert Zaharovits 3a82e0f7f4
Do not rewrite aliases on remove-index from aliases requests (#46989) (#47018)
When we rewrite alias requests, after filtering down to only those that
the user is authorized to see, it can be that there are no aliases
remaining in the request. However, core Elasticsearch interprets this as
_all so the user would see more than they are authorized for. To address
this, we previously rewrote all such requests to have aliases `"*"`,
`"-*"`, which would be interpreted when aliases are resolved as
none. Yet, this is only needed for get aliases requests and we were
applying it to all alias requests, including remove index requests. If
such a request was sent to a coordinating node that is not the master
node, the request would be rewritten to include `"*"` and `"-*"`, and
then the master would authorize the user for these. If the user had
limited permissions, the request would fail, even if they were
authorized on the index that the remove index action was over. This
commit addresses this by rewriting for get aliases and remove
aliases request types but not for the remove index.

Co-authored-by: Albert Zaharovits <albert.zaharovits@elastic.co>
Co-authored-by: Tim Vernum <tim@adjective.org>
2019-09-24 19:07:55 +03:00
Ioannis Kakavas 98e6bb4d01
Workaround JDK-8213202 in SSLClientAuthTests (#46995)
This change works around JDK-8213202, which is a bug related to TLSv1.3
session resumption before JDK 11.0.3 that occurs when there are
multiple concurrent sessions being established. Nodes connecting to
each other will trigger this bug when client authentication is
disabled, which is the case for SSLClientAuthTests.

Backport of #46680
2019-09-24 12:47:56 +03:00
Hendrik Muhs abe889af75
[7.5][Transform] rename classes in transform plugin (#46867)
rename classes and settings in transform plugin, provide BWC for old settings
2019-09-20 10:43:00 +02:00
Yannick Welsch 9638ca20b0
Allow dropping documents with auto-generated ID (#46773)
When using auto-generated IDs + the ingest drop processor (which looks to be used by filebeat
as well) + coordinating nodes that do not have the ingest processor functionality, this can lead
to a NullPointerException.

The issue is that markCurrentItemAsDropped() is creating an UpdateResponse with no id when
the request contains auto-generated IDs. The response serialization is lenient for our
REST/XContent format (i.e. we will send "id" : null) but the internal transport format (used for
communication between nodes) assumes for this field to be non-null, which means that it can't
be serialized between nodes. Bulk requests with ingest functionality are processed on the
coordinating node if the node has the ingest capability, and only otherwise sent to a different
node. This means that, in order to reproduce this, one needs two nodes, with the coordinating
node not having the ingest functionality.

Closes #46678
2019-09-19 16:46:33 +02:00
Armin Braun 6b09c2cdbb
Limit Netty Workers in NativeRealmIntegTestCase (#46816) (#46850)
The fact that this test randomly uses a relatively large number
of nodes and hence Netty worker threads created a problem with
running out of direct memory on CI.
Tests run with 512M heap (and hence 512M direct memory) by default.
On a CI worker with 16 cores, this means Netty will by default set
up 32 transport workers. If we get unlucky and a lot of them
actually do work (and thus instantiate a `CopyBytesSocketChannel`
which costs 1M per thread for the thread-local IO buffer) we
would run out of memory.

This specific failure was only seen with `NativeRealmIntegTests` so I
only added the constraint on the Netty worker count here.
We can add it to other tests (or `SecurityIntegTestCase`) if need be
but for now it doesn't seem necessary so I opted for least impact.

Closes #46803
2019-09-19 13:07:42 +02:00
Luca Cavanna e57756492a Update http-core and http-client dependencies (#46549)
Relates to #45808
Closes #45577
2019-09-12 09:45:29 +02:00
James Rodewig f9bf10f2b6
[DOCS] Change "a SSL" to "an SSL" in the Java docs (#46524) (#46618) 2019-09-11 15:55:57 -04:00
Ioannis Kakavas 35810bd2ae
Enforce realm name uniqueness (#46580)
We depend on file realms being unique in a number of places. Pre
7.0 this was enforced by the fact that the multiple realm types
with different name would mean identical configuration keys and
cause configuration parsing errors. Since we introduced affix
settings for realms this is not the case any more as the realm type
is part of the configuration key.
This change adds a check when building realms which will explicitly
fail if multiple realms are defined with the same name.
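
A minimal sketch of such a uniqueness check, assuming a hypothetical `RealmConfig` holder with a type and a name (this is not the actual Elasticsearch realm-building code, and the error message wording is illustrative):

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

final class RealmNameCheck {
    /** Hypothetical stand-in for a configured realm: a type (e.g. "file") and a name. */
    static final class RealmConfig {
        final String type;
        final String name;

        RealmConfig(String type, String name) {
            this.type = type;
            this.name = name;
        }
    }

    /** Throws if two realms share a name, regardless of their types. */
    static void ensureUniqueNames(List<RealmConfig> realms) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (RealmConfig realm : realms) {
            // Set.add returns false when the name was already present.
            if (!seen.add(realm.name)) {
                duplicates.add(realm.name);
            }
        }
        if (!duplicates.isEmpty()) {
            throw new IllegalArgumentException(
                "Found multiple realms configured with the same name: " + duplicates);
        }
    }
}
```

Because the realm type is part of the affix configuration key, two realms such as a `file` realm and an `ldap` realm can both be named `r1` without a settings parsing error, which is why the check has to be explicit.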

Backport of #46253
2019-09-11 13:13:59 +03:00
Tim Vernum 80064652f8
Fallback to realm authc if ApiKey fails (#46552)
This changes API-Key authentication to always fallback to the realm
chain if the API key is not valid. The previous behaviour was
inconsistent and would terminate on some failures, but continue to the
realm chain for others.

Backport of: #46538
2019-09-11 14:33:17 +10:00
Alpar Torok b40ac6dee7 mute on 7.x for Windows
Tracking #44942
2019-09-10 12:34:16 +03:00
Yogesh Gaikwad d5acb15a71
[Backport] Initialize document subset bit set cache used for DLS (#46211) (#46359)
This commit initializes DocumentSubsetBitsetCache even if DLS
is disabled. Previously it would throw a NullPointerException when querying
usage stats if DLS was explicitly disabled, as there would be no instance of
DocumentSubsetBitsetCache to query. It is okay to initialize an empty
DocumentSubsetBitsetCache, because the license enforcement prevents usage
of the DLS feature, and accessing usage stats will no longer fail.

Closes #45147
2019-09-05 14:34:19 +10:00
Andrey Ershov ece9eb4acd Remove stack trace logging in Security(Transport|Http)ExceptionHandler (#45966)
As per #45852 comment we no longer need to log stack-traces in
SecurityTransportExceptionHandler and SecurityHttpExceptionHandler even
if trace logging is enabled.

(cherry picked from commit c99224a32d26db985053b7b36e2049036e438f97)
2019-09-04 11:50:35 +03:00
Yogesh Gaikwad 7b6246ec67
Add `manage_own_api_key` cluster privilege (#45897) (#46023)
The existing privilege model for API keys, with privileges like
`manage_api_key`, `manage_security` etc., is too permissive and
we would want finer-grained control over the cluster privileges
for API keys. Previously, an API key would also need these
privileges to get its own information.

This commit adds support for `manage_own_api_key` cluster privilege
which only allows api key cluster actions on API keys owned by the
currently authenticated user. Also adds support for retrieval of
the API key self-information when authenticating via API key
without the need for the additional API key privileges.
To support this privilege, we are introducing additional
authentication context along with the request context such that
it can be used to authorize cluster actions based on the current
user authentication.

The API key get and invalidate APIs introduce an `owner` flag
that can be set to true if the API key request (Get or Invalidate)
is for the API keys owned by the currently authenticated user only.
In that case, `realm` and `username` cannot be set as they are
assumed to be the currently authenticated ones.

The changes cover HLRC changes, documentation for the API changes.

Closes #40031
2019-08-28 00:44:23 +10:00
Albert Zaharovits 1ebee5bf9b
PKI realm authentication delegation (#45906)
This commit introduces PKI realm delegation. This feature
supports the PKI authentication feature in Kibana.

In essence, this creates a new API endpoint which Kibana must
call to authenticate clients that use certificates in their TLS
connection to Kibana. The API call passes to Elasticsearch the client's
certificate chain. The response contains an access token to be further
used to authenticate as the client. The client's certificates are validated
by the PKI realms that have been explicitly configured to permit
certificates from the proxy (Kibana). The user calling the delegation
API must have the delegate_pki privilege.

Closes #34396
2019-08-27 14:42:46 +03:00
Ioannis Kakavas 2bee27dd54
Allow Transport Actions to indicate authN realm (#45946)
This commit allows the Transport Actions for the SSO realms to
indicate the realm that should be used to authenticate the
constructed AuthenticationToken. This is useful in the case that
many authentication realms of the same type have been configured
and where the caller of the API (Kibana or a custom web app) already
knows which realm should be used, so there is no need to iterate over
all the realms of the same type.
The realm parameter is added in the relevant REST APIs as optional
so as not to introduce any breaking change.
2019-08-25 19:36:41 +03:00
William Brafford 2b549e7342
CLI tools: write errors to stderr instead of stdout (#45586)
Most of our CLI tools use the Terminal class, which previously did not provide methods for writing to standard error. When all output goes to standard out, there are two basic problems. First, errors and warnings are "swallowed" in pipelines, making it hard for a user to know when something's gone wrong. Second, errors and warnings are intermingled with legitimate output, making it difficult to pass the results of interactive scripts to other tools.

This commit adds a second set of print commands to Terminal for printing to standard error, with errorPrint corresponding to print and errorPrintln corresponding to println. This leaves it to developers to decide which output should go where. It also adjusts existing commands to send errors and warnings to stderr.

Usage is printed to standard output when it's correctly requested (e.g., bin/elasticsearch-keystore --help) but goes to standard error when a command is invoked incorrectly (e.g. bin/elasticsearch-keystore list-with-a-typo | sort).
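
The split between the two sets of print methods can be sketched as follows. `TerminalSketch` is a hypothetical stand-in written for illustration; only the method names `print`, `println`, `errorPrint` and `errorPrintln` come from the message above, and the real Terminal class has more to it (verbosity levels, prompts).

```java
import java.io.PrintStream;

final class TerminalSketch {
    private final PrintStream out;
    private final PrintStream err;

    /** In a real tool these would typically be System.out and System.err. */
    TerminalSketch(PrintStream out, PrintStream err) {
        this.out = out;
        this.err = err;
    }

    /** Legitimate output that downstream tools in a pipeline may consume. */
    void print(String message) {
        out.print(message);
    }

    void println(String message) {
        out.println(message);
    }

    /** Errors and warnings: visible to the user even when stdout is piped. */
    void errorPrint(String message) {
        err.print(message);
    }

    void errorPrintln(String message) {
        err.println(message);
    }
}
```

With this split, a pipeline such as `tool | sort` only sorts what was written through `println`, while anything written through `errorPrintln` still reaches the user's terminal.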
2019-08-21 14:46:07 -04:00
David Roberts d40f3718f2 [ML] Muting 5 SSLErrorMessageTests tests on Windows (#45602)
Due to https://github.com/elastic/elasticsearch/issues/45598
2019-08-15 11:05:00 +01:00
Yogesh Gaikwad 471d940c44
Refactor cluster privileges and cluster permission (#45265) (#45442)
The current implementations make it difficult to add
new privileges (for example, a cluster privilege which is
more than cluster action-based and not exposed to the security
administrator). At a high level, we would like a cluster privilege
to be either:
- a named cluster privilege
  This corresponds to `cluster` field from the role descriptor
- or a configurable cluster privilege
  This corresponds to the `global` field from the role-descriptor and
allows a security administrator to configure them.

Some of the responsibilities, like the merging of action-based cluster privileges,
are now pushed down to the cluster permission level. How the predicate is implemented
(using Automaton) is now enforced by the cluster permission.

`ClusterPermission` helps enforce cluster-level access by
performing checks against a cluster action and, optionally, against a request.
It is a collection of one or more permission checks; if any of the checks
allows access, then the permission allows access to the cluster action.
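
The "any check allows" rule can be sketched as below. This is a simplified illustration, not the real `ClusterPermission`: it assumes plain predicates over an action name and an optional request object, whereas the commit says the actual predicates are built with Automaton. The class name and the `add` method are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiPredicate;

final class ClusterPermissionSketch {
    // Each check sees the action name and the request, which may be null.
    private final List<BiPredicate<String, Object>> checks = new ArrayList<>();

    /** Registers one permission check; returns this for chaining. */
    ClusterPermissionSketch add(BiPredicate<String, Object> check) {
        checks.add(check);
        return this;
    }

    /** Access is granted as soon as any single check allows it. */
    boolean check(String action, Object request) {
        for (BiPredicate<String, Object> permissionCheck : checks) {
            if (permissionCheck.test(action, request)) {
                return true;
            }
        }
        return false;
    }
}
```

A check that only looks at the action name models a named cluster privilege, while a check that also inspects the request models the request-aware case described above.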

Implementations of cluster privilege must be able to provide information
regarding the predicates to the cluster permission so that they can be enforced.
This is enforced by making implementations of cluster privilege aware of
cluster permission builder and provide a way to specify how the permission is
to be built for a given privilege.

This commit renames `ConditionalClusterPrivilege` to `ConfigurableClusterPrivilege`.
`ConfigurableClusterPrivilege` is a renderable cluster privilege exposed
as a `global` field in role descriptor.

Other than this there is a requirement where we would want to know if a cluster
permission is implied by another cluster-permission (`has-privileges`).
This is helpful in addressing queries related to privileges for a user.
This is not just simply checking of cluster permissions since we do not
have access to runtime information (like request object).
This refactoring does not try to address those scenarios.

Relates #44048
2019-08-13 09:06:18 +10:00
Armin Braun a9e1402189
Remove Settings from BaseRestRequest Constructor (#45418) (#45429)
* Resolving the todo, cleaning up the unused `settings` parameter
* Cleaning up some other minor dead code in affected classes
2019-08-12 05:14:45 +02:00
Alpar Torok 634a070430 Restrict which tasks can use testclusters (#45198)
* Restrict which tasks can use testclusters

This PR fixes a problem between the interaction of test-clusters and
build cache.
Before this any task could have used a cluster without tracking it as
input.
With this change a new interface is introduced to track the tasks that
can use clusters and we do consider the cluster as input for all of
them.
2019-08-09 13:38:01 +03:00
Ioannis Kakavas 99ddb8b3d8 Allow empty token endpoint for implicit flow (#45038)
When using the implicit flow in OpenID Connect, the
op.token_endpoint_url should not be mandatory as there is no need
to contact the token endpoint of the OP.
2019-08-08 12:50:53 +03:00
Yannick Welsch 7aeb2fe73c Add per-socket keepalive options (#44055)
Uses JDK 11's per-socket configuration of TCP keepalive (supported on Linux and Mac), see
https://bugs.openjdk.java.net/browse/JDK-8194298, and exposes these as transport settings.
By default, these options are disabled for now (i.e. fall-back to OS behavior), but we would like
to explore whether we can enable them by default, in particular to force keepalive configurations
that are better tuned for running ES.
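
For illustration, the JDK 11 per-socket options live in `jdk.net.ExtendedSocketOptions`. The sketch below is not the Elasticsearch transport code: the class name, method name and parameters are assumptions, and it simply leaves the OS defaults in place when the options are unavailable.

```java
import java.io.IOException;
import java.net.StandardSocketOptions;
import java.nio.channels.SocketChannel;
import jdk.net.ExtendedSocketOptions;

final class KeepAliveSketch {
    /**
     * Tries to apply per-socket keepalive settings. Returns false, leaving
     * OS behavior in place, when the platform does not support the options
     * or when applying them fails.
     */
    static boolean configure(SocketChannel channel, int idleSeconds, int intervalSeconds, int probeCount) {
        if (!channel.supportedOptions().contains(ExtendedSocketOptions.TCP_KEEPIDLE)
            || !channel.supportedOptions().contains(ExtendedSocketOptions.TCP_KEEPINTERVAL)
            || !channel.supportedOptions().contains(ExtendedSocketOptions.TCP_KEEPCOUNT)) {
            return false;
        }
        try {
            // The per-socket values only matter when keepalive itself is enabled.
            channel.setOption(StandardSocketOptions.SO_KEEPALIVE, true);
            channel.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, idleSeconds);
            channel.setOption(ExtendedSocketOptions.TCP_KEEPINTERVAL, intervalSeconds);
            channel.setOption(ExtendedSocketOptions.TCP_KEEPCOUNT, probeCount);
            return true;
        } catch (IOException | UnsupportedOperationException e) {
            // Some options may already have been applied; a real implementation
            // would probably want to log this rather than stay silent.
            return false;
        }
    }
}
```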
2019-08-06 10:45:44 +02:00
Tim Brooks 984ba82251
Move nio channel initialization to event loop (#45155)
Currently in the transport-nio work we connect and bind channels on
a thread before the channel is registered with a selector. Additionally,
it is at this point that we set all the socket options. This commit
moves these operations onto the event-loop after the channel has been
registered with a selector. It attempts to set the socket options for a
non-server channel at registration time. If that fails, it will attempt
to set the options after the channel is connected. This should fix
#41071.
2019-08-02 17:31:31 -04:00
Tim Vernum e21d58541a
Improve errors when TLS files cannot be read (#45122)
This change improves the exception messages that are thrown when the
system cannot read TLS resources such as keystores, truststores,
certificates, keys or certificate-chains (CAs).

This change specifically handles:

- Files that do not exist
- Files that cannot be read due to file-system permissions
- Files that cannot be read due to the ES security-manager

Backport of: #44787
2019-08-02 12:29:43 +10:00
Tim Vernum 590777150f
Explicitly fail if a realm only exists in keystore (#45091)
There are no realms that can be configured exclusively with secure
settings. Every realm that supports secure settings also requires one
or more non-secure settings.
However, sometimes a node will be configured with entries in the
keystore for which there is nothing in elasticsearch.yml - this may be
because the realm was removed from the yml, but not deleted from the
keystore, or it could be because there was a typo in the realm name
which has accidentally orphaned the keystore entry.

In these cases the realm building would fail, but the error would not
always be clear or point to the root cause (orphaned keystore
entries). RealmSettings would act as though the realm existed, but
then fail because an incorrect combination of settings was provided.

This change causes realm building to fail early, with an explicit
message about incorrect keystore entries.

Backport of: #44471
2019-08-02 12:28:59 +10:00
Yogesh Gaikwad ae5c01e2d2 Do not use scroll when finding duplicate API key (#45026)
When we create API key we check if the API key with the name
already exists. It searches with scroll enabled and this causes
the request to fail when creating large number of API keys in
parallel as it hits the number of open scroll limit (default 500).
We do not need the search context to be created so this commit
removes the scroll parameter from the search request for duplicate
API key.
2019-08-02 10:16:48 +10:00
Mark Vieira c13285a382
Remove unnecessary plugin application and project configuration (#45100) 2019-08-01 14:18:24 -07:00
Armin Braun c7d7230524
Stop Recreating Wrapped Handlers in RestController (#44964) (#45040)
* We shouldn't be recreating wrapped REST handlers over and over for every request. We only use this hook in x-pack and the wrapper there does not have any per request state.
  This is inefficient and could lead to some very unexpected memory behavior
   => I made the logic create the wrapper on handler registration and adjusted the x-pack wrapper implementation to correctly forward the circuit breaker and content stream flags
2019-07-31 17:11:34 +02:00
Tim Vernum 3c17d4379d
Expand logging when SAML Audience condition fails (#45027)
A mismatched configuration between the IdP and SP will often result in
SAML authentication attempts failing because the audience condition is
not met (because the IdP and SP disagree about the correct form of the
SP's Entity ID).

Previously the error message in this case did not provide sufficient
information to resolve the issue because the IdP's expected audience
would be truncated if it exceeded 32 characters. Since the error did
not provide both IDs in full, it was not possible to determine the
correct fix (in detail) based on the error alone.

This change expands the message that is included in the thrown
exception, and also adds additional logging of every failed audience
condition, with diagnostics of the match failure.

Backport of: #44334
2019-07-31 19:40:17 +10:00
Tim Vernum f575370e2f
Fix broken short-circuit in getUnlicensedRealms (#44937)
The existing equals check was broken, and would always be false.

The correct behaviour is to return "Collections.emptyList()" whenever
the active(licensed)-realms equals the configured-realms.
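
A sketch of the intended short-circuit, assuming realms are represented by their names (the real method works on realm objects, and the original broken comparison is not shown in this message):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class UnlicensedRealmsSketch {
    /** Returns the configured realms that are not active under the current license. */
    static List<String> unlicensed(List<String> configured, List<String> active) {
        // Short-circuit: when every configured realm is active, nothing is unlicensed.
        if (active.equals(configured)) {
            return Collections.emptyList();
        }
        List<String> result = new ArrayList<>(configured);
        result.removeAll(active);
        return Collections.unmodifiableList(result);
    }
}
```

An equality check that is always false does not change the result here, since the slow path also ends up with an empty list; it only defeats the short-circuit.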

Backport of: #44399
2019-07-30 16:32:04 +10:00
Albert Zaharovits af937b14ae
SecurityIndexManager handle RuntimeEx while reading mapping (#44409)
Fixes exception handling while reading and parsing `.security-*`
mappings and templates.
2019-07-25 16:52:21 +03:00
Ioannis Kakavas 3714cb63da Allow parsing the value of java.version sysprop (#44017)
We often start testing with early access versions of new Java
versions and this has caused minor issues in our tests
(i.e. #43141) because the version string that the JVM reports
cannot be parsed as it ends with the string -ea.

This commit changes how we parse and compare Java versions to
allow correct parsing and comparison of the output of java.version
system property that might include an additional alphanumeric
part after the version numbers
(see [JEP 223](https://openjdk.java.net/jeps/223)). In short, it
handles a version number part, like before, but additionally a
PRE part that matches ([a-zA-Z0-9]+).
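
As a rough sketch of this kind of parsing, assuming only dotted numbers followed by an optional `-PRE` suffix (the class below is hypothetical, does not handle build or optional parts from JEP 223, and does not handle legacy strings such as `1.8.0_201`):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class JavaVersionSketch implements Comparable<JavaVersionSketch> {
    // Dotted version numbers, then an optional pre-release part such as "-ea".
    private static final Pattern FORMAT =
        Pattern.compile("(\\d+(?:\\.\\d+)*)(?:-([a-zA-Z0-9]+))?");

    final List<Integer> numbers;
    final String pre; // null when there is no pre-release part

    private JavaVersionSketch(List<Integer> numbers, String pre) {
        this.numbers = numbers;
        this.pre = pre;
    }

    static JavaVersionSketch parse(String value) {
        Matcher matcher = FORMAT.matcher(value);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("unparseable Java version: " + value);
        }
        List<Integer> numbers = new ArrayList<>();
        for (String part : matcher.group(1).split("\\.")) {
            numbers.add(Integer.parseInt(part));
        }
        return new JavaVersionSketch(numbers, matcher.group(2));
    }

    @Override
    public int compareTo(JavaVersionSketch other) {
        // Compare numeric parts, treating missing trailing parts as zero.
        int length = Math.max(numbers.size(), other.numbers.size());
        for (int i = 0; i < length; i++) {
            int mine = i < numbers.size() ? numbers.get(i) : 0;
            int theirs = i < other.numbers.size() ? other.numbers.get(i) : 0;
            if (mine != theirs) {
                return Integer.compare(mine, theirs);
            }
        }
        // With equal numbers, a pre-release sorts before the final release.
        if (pre == null || other.pre == null) {
            return Boolean.compare(pre == null, other.pre == null);
        }
        return pre.compareTo(other.pre);
    }
}
```

Under these assumptions `14-ea` parses cleanly and compares as lower than `14`, which is the case that the `-ea` suffix used to break.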

It also changes a number of tests that would attempt to parse
java.specification.version in order to get the full version
of Java. java.specification.version only contains the major
version and is thus inappropriate when trying to compare against
a version that might contain a minor, patch or an early access
part. We now parse java.version, which can be consistently
parsed.

Resolves #43141
2019-07-22 20:14:56 +03:00
Ryan Ernst edd26339c5
Convert remaining request classes in xpack core to writeable.reader (#44524) (#44534)
This commit converts all remaining classes extending ActionRequest
in xpack core to have a StreamInput constructor.

relates #34389
2019-07-18 01:11:45 -07:00
Tim Brooks 0a352486e8
Isolate nio channel registered from channel active (#44388)
Registering a channel with a selector is a required operation for the
channel to be handled properly. Currently, we mix the registration with
other setup operations (ip filtering, SSL initiation, etc). However, a
failure to register is fatal. This PR modifies how registration occurs to
immediately close the channel if it fails.

There are still two clear loopholes for how a user can interact with a
channel even if registration fails. 1. through the exception handler.
2. through the channel accepted callback. These can perhaps be improved
in the future. For now, this PR prevents writes from proceeding if the
channel is not registered.
2019-07-16 17:18:57 -06:00
Ryan Ernst c4cf98c538
Convert core security actions to use writeable ActionType (#44359) (#44390)
This commit converts all the StreamableResponseActionType security
classes in xpack core to ActionType, implementing Writeable for their
response classes.

relates #34389
2019-07-16 01:11:13 -07:00
Jason Tedor be98a12cd0
Do not swallow I/O exception getting authentication (#44398)
When getting authentication info from the thread context, it might be
that we encounter an I/O exception. Today we swallow this exception and
return a null authentication info to the caller. Yet, this could be
hiding bugs or errors. This commit adjusts this behavior so that we no
longer swallow the exception.
2019-07-16 16:14:15 +09:00
Ryan Ernst e0b82e92f3
Convert BaseNode(s) Request/Response classes to Writeable (#44301) (#44358)
This commit converts all BaseNodeResponse and BaseNodesResponse
subclasses to implement Writeable.Reader instead of Streamable.

relates #34389
2019-07-15 18:07:52 -07:00
Ryan Ernst 7e06888bae
Convert testclusters to use distro download plugin (#44253) (#44362)
Test clusters currently has its own set of logic for dealing with
finding different versions of Elasticsearch, downloading them, and
extracting them. This commit converts testclusters to use the
DistributionDownloadPlugin.
2019-07-15 17:53:05 -07:00
Ryan Ernst 1dcf53465c Reorder HandledTransportAction ctor args (#44291)
This commit moves the Supplier variant of HandledTransportAction to have
a different ordering than the Writeable.Reader variant. The Supplier
version is used for the legacy Streamable, and currently having the
location of the Writeable.Reader vs Supplier in the same place forces
using casts of Writeable.Reader to select the correct super constructor.
This change in ordering allows easier migration to Writeable.Reader.

relates #34389
2019-07-12 13:45:09 -07:00
Albert Zaharovits e490ecb7d3
Fix X509AuthenticationToken principal (#43932)
Fixes a bug in the PKI authentication. This manifests when there
are multiple PKI realms configured in the chain, with different
principal parse patterns. There are a few configuration scenarios
where one PKI realm might parse the principal from the Subject
DN (according to the `username_pattern` realm setting) but
another one might do the truststore validation (according to
the truststore.* realm settings).

This is caused by the two passes through the realm chain, first to
build the authentication token and secondly to authenticate it, and
that the X509AuthenticationToken sets the principal during
construction.
2019-07-12 11:04:50 +03:00
Yannick Welsch 2ee07f1ff4 Simplify port usage in transport tests (#44157)
Simplifies AbstractSimpleTransportTestCase to use JVM-local ports  and also adds an assertion so
that cases like #44134 can be more easily debugged. The likely reason for that one is that a test,
which was repeated again and again while always spawning a fresh Gradle worker (due to Gradle
daemon) kept increasing Gradle worker IDs, causing an overflow at some point.
2019-07-11 13:35:37 +02:00
Ryan Ernst fb77d8f461 Removed writeTo from TransportResponse and ActionResponse (#44092)
The base classes for transport requests and responses currently
implement Streamable and Writeable. The writeTo method on these base
classes is implemented with an empty implementation. Not only does this
mislead subclasses into thinking they need to call super.writeTo, but it
can also lead to not implementing writeTo when it should have been
implemented, or extending one of these classes when not necessary,
since there is nothing to actually implement.

This commit removes the empty writeTo from these base classes, and fixes
subclasses to not call super and in some cases implement an empty
writeTo themselves.

relates #34389
2019-07-10 12:42:04 -07:00
Ioannis Kakavas 9beb51fc44 Revert "Mute testEnableDisableBehaviour (#42929)"
This reverts commit 6ee578c6eb.
2019-07-08 08:52:21 +03:00
Yannick Welsch 504a43d43a Move ConnectionManager to async APIs (#42636)
This commit converts the ConnectionManager's openConnection and connectToNode methods to
async-style. This will allow us to not block threads anymore when opening connections. This PR also
adapts the cluster coordination subsystem to make use of the new async APIs, allowing us to remove
some hacks in the test infrastructure that had to account for the previous synchronous nature of the
connection APIs.
2019-07-05 20:40:22 +02:00
Jay Modi 1e0f67fb38 Deprecate transport profile security type setting (#43237)
This commit deprecates the `transport.profiles.*.xpack.security.type`
setting. This setting is used to configure a profile that would only
allow client actions. With the upcoming removal of the transport client
the setting should also be deprecated so that it may be removed in
a future version.
2019-07-03 19:31:55 +10:00
Tim Vernum 2a8f30eb9a
Support builtin privileges in get privileges API (#43901)
Adds a new "/_security/privilege/_builtin" endpoint so that builtin
index and cluster privileges can be retrieved via the Rest API

Backport of: #42134
2019-07-03 19:08:28 +10:00
Tim Vernum deacc2038e
Always attach system user to internal actions (#43902)
All valid licenses permit security, and the only license state where
we don't support security is when there is a missing license.
However, for safety we should attach the system (or xpack/security)
user to internally originated actions even if the license is missing
(or, more strictly, doesn't support security).

This allows all nodes to communicate and send internal actions (shard
state, handshake/pings, etc) even if a license is transitioning
between a broken state and a valid state.

Relates: #42215
Backport of: #43468
2019-07-03 19:07:16 +10:00
Tim Vernum 31b19bd022
Use separate BitSet cache in Doc Level Security (#43899)
Document level security was depending on the shared
"BitsetFilterCache" which (by design) never expires its entries.

However, when using DLS queries - particularly templated ones - the
number (and memory usage) of generated bitsets can be significant.

This change introduces a new cache specifically for BitSets used in
DLS queries, that has memory usage constraints and access time expiry.

The whole cache is automatically cleared if the role cache is cleared.
Individual bitsets are cleared when the corresponding lucene index
reader is closed.

The cache defaults to 50MB, and entries expire if unused for 7 days.

Backport of: #43669
2019-07-03 18:04:06 +10:00
Tim Vernum 461aa39daf
Switch WriteActionsTests.testBulk to use hamcrest (#43897)
If an item in the bulk request fails, that could be for a variety of
reasons - it may be that the underlying behaviour of security has
changed, or it may just be a transient failure during testing.

Simply asserting a `true`/`false` value produces failure messages that
are difficult to diagnose and debug. Using hamcrest (`assertThat`) will
make it easier to understand the causes of failures in this test.

Backport of: #43725
2019-07-03 16:29:28 +10:00
Tim Vernum 8d099dad38
Add "manage_api_key" cluster privilege (#43865)
This adds a new cluster privilege for manage_api_key. Users with this
privilege are able to create new API keys (as a child of their own
user identity) and may also get and invalidate any/all API keys
(including those owned by other users).

Backport of: #43728
2019-07-02 21:57:42 +10:00
Ioannis Kakavas c8ed271937 Use URLEncoder#encode(String, String)
as URLEncoder#encode(String, Charset) is only available since Java
10
2019-07-02 14:20:29 +03:00
Ioannis Kakavas 4ea17b76dc Fix credentials encoding for OIDC token request (#43808)
As defined in https://tools.ietf.org/html/rfc6749#section-2.3.1
both client id and client secret need to be encoded with the
application/x-www-form-urlencoded encoding algorithm when used as
credentials for HTTP Basic Authentication in requests to the OP.
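
A minimal sketch of that encoding using only JDK classes; the class and method names are hypothetical, and this is not the code from the OIDC realm. It uses `URLEncoder#encode(String, String)`, the overload that is available before Java 10.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class BasicCredentialsSketch {
    /** Builds an HTTP Basic Authorization header value for client credentials. */
    static String basicAuthHeader(String clientId, String clientSecret) {
        try {
            // Form-urlencode each part first, as RFC 6749 section 2.3.1 requires.
            String id = URLEncoder.encode(clientId, StandardCharsets.UTF_8.name());
            String secret = URLEncoder.encode(clientSecret, StandardCharsets.UTF_8.name());
            String raw = id + ":" + secret;
            return "Basic " + Base64.getEncoder().encodeToString(raw.getBytes(StandardCharsets.UTF_8));
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }
}
```

Encoding before joining matters when the secret itself contains a `:` or other reserved characters, because the receiver splits the decoded value on the first colon.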

Resolves #43709
2019-07-02 13:36:00 +03:00
Albert Zaharovits 4eb89a6912
UserRoleMapper non-null groups and metadata (#43836)
This is an odd backport of #41774

UserRoleMapper.UserData is constructed by each realm and it is used to
"match" role mapping expressions that eventually supply the role names
of the principal.

This PR filters out `null` collection values (lists and maps), for the groups
and metadata, which get to take part in the role mapping, in preparation
for using Java 9 collection APIs. It filters them as soon as possible, during
the construction.
2019-07-02 00:10:15 +03:00
Christoph Büscher fe3f9f0c6b Yet another `the the` cleanup (#43815) 2019-07-01 20:22:19 +02:00
Ryan Ernst 3a2c698ce0
Rename Action to ActionType (#43778)
Action is a class that encapsulates meta information about an action
that allows it to be called remotely, specifically the action name and
response type. With recent refactoring, the action class can now be
constructed as a static constant, instead of needing to create a
subclass. This makes the old pattern of creating a singleton INSTANCE
both misnamed and lacking a common placement.

This commit renames Action to ActionType, thus allowing the old INSTANCE
naming pattern to be TYPE on the transport action itself. ActionType
also conveys that this class is also not the action itself, although
this change does not rename any concrete classes as those will be
removed organically as they are converted to TYPE constants.

relates #34389
2019-06-30 22:00:17 -07:00
Jim Ferenczi 7ca69db83f Refactor IndexSearcherWrapper to disallow the wrapping of IndexSearcher (#43645)
This change removes the ability to wrap an IndexSearcher in plugins. The IndexSearcherWrapper is replaced by an IndexReaderWrapper and allows wrapping the DirectoryReader only. This simplifies the creation of the context IndexSearcher that is used on a per request basis. This change also moves the optimization that was implemented in the security index searcher wrapper to the ContextIndexSearcher that now checks the live docs to determine how the search should be executed. If the underlying live docs is a sparse bit set the searcher will compute the intersection between the query and the live docs instead of checking the live docs on every document that matches the query.
2019-06-28 16:28:02 +02:00
Ryan Ernst 5b4089e57e
Remove nodeId from BaseNodeRequest (#43658)
TransportNodesAction provides a mechanism to easily broadcast a request
to many nodes, and collect the responses into a high level response. Each
node has its own request type, with a base class of BaseNodeRequest.
This base request requires passing the nodeId to which the request will
be sent. However, that nodeId is not used anywhere. It is private to the
base class, yet serialized to each node, where the node could just as
easily find the nodeId of the node it is on locally.

This commit removes passing the nodeId through to the node request
creation, and guards its serialization so that we can remove the base
request class altogether in the future.
2019-06-27 18:45:14 -07:00
Armin Braun c00e305d79
Optimize Selector Wakeups (#43515) (#43650)
* Use atomic boolean to guard wakeups
* Don't trigger wakeups from the select loops thread itself for registering and closing channels
* Don't needlessly queue writes
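
The first bullet can be sketched as below, assuming a single selector thread and a work queue that is drained after each select call. The class is hypothetical and is not the transport-nio implementation; the call counter exists only to make the effect observable.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

final class GuardedWakeupSketch {
    private final Selector selector;
    private final AtomicBoolean wokenUp = new AtomicBoolean(false);
    private final AtomicInteger wakeupCalls = new AtomicInteger(); // illustration only

    GuardedWakeupSketch() {
        try {
            this.selector = Selector.open();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Called by other threads after queueing work for the selector thread. */
    void wakeup() {
        // Only the first caller since the last select pays for Selector.wakeup().
        if (wokenUp.compareAndSet(false, true)) {
            wakeupCalls.incrementAndGet();
            selector.wakeup();
        }
    }

    /** Called by the selector thread; queued work should be drained after this returns. */
    int selectOnce(long timeoutMillis) {
        try {
            int ready = selector.select(timeoutMillis);
            // Reset before draining queued work, so a later wakeup is not lost.
            wokenUp.set(false);
            return ready;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    int wakeupCalls() {
        return wakeupCalls.get();
    }

    void close() {
        try {
            selector.close();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```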

Co-authored-by:  Tim Brooks <tim@uncontended.net>
2019-06-26 20:00:42 +02:00
Tim Brooks 38516a4dd5
Move nio ip filter rule to be a channel handler (#43507)
Currently nio implements ip filtering at the channel context level. This
is kind of a hack as the application logic should be implemented at the
handler level. This commit moves the ip filtering into a channel
handler. This requires adding an indicator to the channel handler to
show when a channel should be closed.
2019-06-24 10:03:24 -06:00
Tim Vernum 059eb55108
Use SecureString for password length validation (#43465)
This replaces the use of char[] in the password length validation
code, with the use of SecureString

Although the use of char[] is not in itself problematic, using a
SecureString encourages callers to think about the lifetime of the
password object and to clear it after use.

Backport of: #42884
2019-06-21 17:11:07 +10:00
Armin Braun 21515b9ff1
Fix IpFilteringIntegrationTests (#43019) (#43434)
* Increase timeout to 5s since we saw 500ms+ GC pauses on CI
* closes #40689
2019-06-20 22:31:59 +02:00
Jason Tedor 1f1a035def
Remove stale test logging annotations (#43403)
This commit removes some very old test logging annotations that appeared
to be added to investigate test failures that are long since closed. If
these are needed, they can be added back on a case-by-case basis with a
comment associating them to a test failure.
2019-06-19 22:58:22 -04:00
Yogesh Gaikwad 2f173402ec
Add kerberos grant_type to get token in exchange for Kerberos ticket (#42847) (#43355)
Kibana wants to create access_token/refresh_token pair using Token
management APIs in exchange for kerberos tickets. `client_credentials`
grant_type requires every user to have `cluster:admin/xpack/security/token/create`
cluster privilege.

This commit introduces `_kerberos` grant_type for generating `access_token`
and `refresh_token` in exchange for a valid base64 encoded kerberos ticket.
In addition, `kibana_user` role now has cluster privilege to create tokens.
This allows Kibana to create access_token/refresh_token pair in exchange for
kerberos tickets.

Note:
The lifetime from the kerberos ticket is not used in ES and so even after it expires
the access_token/refresh_token pair will be valid. Care must be taken to invalidate
such tokens using token management APIs if required.

Closes #41943
2019-06-19 18:26:52 +10:00
Alpar Torok 5a9c48369b TestClusters: Convert the security plugin (#43242)
* TestClusters: Convert the security plugin

This PR moves security tests to use TestClusters.
The TLS test required support in testclusters itself, so the correct
wait condition is configured based on the cluster settings.

* PR review
2019-06-18 11:55:44 +03:00
Jason Tedor 5bc3b7f741
Enable node roles to be pluggable (#43175)
This commit introduces the possibility for a plugin to introduce
additional node roles.
2019-06-13 15:15:48 -04:00
Ryan Ernst 172cd4dbfa Remove description from xpack feature sets (#43065)
The description field of xpack featuresets is optionally part of the
xpack info api, when using the verbose flag. However, this information
is unnecessary, as it is better left for documentation (and the existing
descriptions do not describe anything meaningful). This commit removes the
description field from feature sets.
2019-06-11 09:22:58 -07:00
Ioannis Kakavas 1776d6e055 Refresh remote JWKs on all errors (#42850)
It turns out that key rotation on the OP can manifest as both
a BadJWSException and a BadJOSEException in nimbus-jose-jwt. As
such we cannot depend on matching only BadJWSExceptions to
determine if we should poll the remote JWKs for an update.

This has the side-effect that a remote JWKs source will be polled
exactly one additional time too for errors that have to do with
configuration, or for errors that might be caused by not synched
clocks, forged JWTs, etc. ( These will throw a BadJWTException
which extends BadJOSEException also )
2019-06-11 11:01:54 +03:00
Nhat Nguyen f2e66e22eb Increase waiting time when check retention locks (#42994)
WriteActionsTests#testBulk and WriteActionsTests#testIndex sometimes
fail with a pending retention lock. We might leak retention locks when
switching to async recovery. However, it's more likely that ongoing
recoveries prevent the retention lock from releasing.

This change increases the waiting time when we check for no pending
retention lock and also ensures no ongoing recovery in
WriteActionsTests.

Closes #41054
2019-06-10 17:58:37 -04:00
Tim Vernum 090d42d3e6
Permit API Keys on Basic License (#42973)
Kibana alerting is going to be built using API Keys, and should be
permitted on a basic license.

This commit moves API Keys (but not Tokens) to the Basic license

Relates: elastic/kibana#36836
Backport of: #42787
2019-06-07 14:18:05 +10:00
Tim Brooks 667c613d9e
Remove `nonApplicationWrite` from `SSLDriver` (#42954)
Currently, when the SSLEngine needs to produce handshake or close data,
we must manually call the nonApplicationWrite method. However, this data
is only required when something triggers the need (starting handshake,
reading from the wire, initiating close, etc). As we have a dedicated
outbound buffer, this data can be produced automatically. Additionally,
with this refactoring, we combine handshake and application mode into a
single mode. This is necessary as there are non-application messages that
are sent post handshake in TLS 1.3. Finally, this commit modifies the
SSLDriver tests to test against TLS 1.3.
2019-06-06 17:44:40 -04:00
Ioannis Kakavas 6ee578c6eb Mute testEnableDisableBehaviour (#42929) 2019-06-06 14:01:07 +03:00
Przemyslaw Gomulka cfdb1b771e
Enable console audit logs for docker backport#42671 #42887
Enable audit logs in docker by creating console appenders for audit loggers.
Also rename field `@timestamp` to `timestamp` and add field `type` with value `audit`.

The docker build now contains two log4j configurations, for the oss and default versions. The build now allows overriding the default configuration.

Also changed the format of a timestamp from ISO8601 to include time zone as per this discussion #36833 (comment)

closes #42666
backport#42671
2019-06-05 17:15:37 +02:00
Jason Tedor aad1b3a2a0
Fix version parsing in various tests (#42871)
This commit fixes the version parsing in various tests. The issue here is that
the parsing was relying on java.version. However, java.version can contain
additional characters such as -ea for early access builds. See JEP 223:

Name                            Syntax
------------------------------  --------------
java.version                    $VNUM(\-$PRE)?
java.runtime.version            $VSTR
java.vm.version                 $VSTR
java.specification.version      $VNUM
java.vm.specification.version   $VNUM

Instead, we want java.specification.version.
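
A minimal sketch of the idea in this message, not the code from the commit: parse the feature version from `java.specification.version`, which carries only `$VNUM`, rather than from `java.version`, which may carry a suffix such as `-ea`. The class and method names are hypothetical.

```java
final class JavaVersionSketch {

    /**
     * Returns the feature version for a java.specification.version value,
     * e.g. 8 for "1.8" and 11 for "11".
     */
    static int featureVersion(String specificationVersion) {
        String[] parts = specificationVersion.split("\\.");
        // Older runtimes report "1.8"; newer ones report "9", "11", "12", ...
        if (parts[0].equals("1") && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return Integer.parseInt(parts[0]);
    }

    /** Feature version of the running JVM, read from the specification property. */
    static int currentFeatureVersion() {
        return featureVersion(System.getProperty("java.specification.version"));
    }
}
```

By contrast, passing a `java.version` value such as `13-ea` straight to `Integer.parseInt` throws a `NumberFormatException`, which is the kind of failure the message describes.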
2019-06-04 18:22:20 -04:00
Mark Vieira e44b8b1e2e
[Backport] Remove dependency substitutions 7.x (#42866)
* Remove unnecessary usage of Gradle dependency substitution rules (#42773)

(cherry picked from commit 12d583dbf6f7d44f00aa365e34fc7e937c3c61f7)
2019-06-04 13:50:23 -07:00
Tim Vernum 928f49992f
Don't require TLS for single node clusters (#42830)
This commit removes the TLS cluster join validator.

This validator existed to prevent v6.x nodes (which mandated
TLS) from joining an existing cluster of v5.x nodes (which did
not mandate TLS) unless the 6.x node (and by implication the
5.x nodes) was configured to use TLS.

Since 7.x nodes cannot talk to 5.x nodes, this validator is no longer
needed.

Removing the validator solves a problem where single node clusters
that were bound to local interfaces were incorrectly requiring TLS
when they recovered cluster state and joined their own cluster.

Backport of: #42826
2019-06-04 19:48:37 +10:00
Tim Vernum 8de3a88205
Log the status of security on license change (#42741)
Whether security is enabled/disabled is dependent on the combination
of the node settings and the cluster license.

This commit adds a license state listener that logs when the license
change causes security to switch state (or to be initialised).

This is primarily useful for diagnosing cluster formation issues.

Backport of: #42488
2019-06-04 14:25:43 +10:00
Tim Vernum 9035e61825
Detect when security index is closed (#42740)
If the security index is closed, it should be treated as unavailable
for security purposes.

Prior to 8.0 (or in a mixed cluster) a closed security index has
no routing data, which would cause a NPE in the cluster change
handler, and the index state would not be updated correctly.
This commit fixes that problem

Backport of: #42191
2019-06-04 14:25:20 +10:00
Alan Woodward 2129d06643 Create client-only AnalyzeRequest/AnalyzeResponse classes (#42197)
This commit clones the existing AnalyzeRequest/AnalyzeResponse classes
to the high-level rest client, and adjusts request converters to use these new
classes.

This is a prerequisite to removing the Streamable interface from the internal
server version of these classes.
2019-06-03 09:46:36 +01:00
Mark Vieira c1816354ed
[Backport] Improve build configuration time (#42674) 2019-05-30 10:29:42 -07:00
Jay Modi 711de2f59a
Make hashed token ids url safe (#42651)
This commit changes the way token ids are hashed so that the output is
url safe without requiring encoding. This follows the pattern that we
use for document ids that are autogenerated, see UUIDs and the
associated classes for additional details.
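
One way to get a url safe hash is sketched below. It assumes SHA-256 as the digest and the JDK's URL-safe Base64 alphabet without padding; the actual implementation in the commit may differ, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

final class UrlSafeTokenIdSketch {

    /** Hashes a token id and encodes the digest with only [A-Za-z0-9_-] characters. */
    static String hashTokenId(String tokenId) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(tokenId.getBytes(StandardCharsets.UTF_8));
            // The URL-safe alphabet swaps '+' and '/' for '-' and '_'; dropping the
            // '=' padding means the result needs no percent-encoding in a URL path.
            return Base64.getUrlEncoder().withoutPadding().encodeToString(hash);
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 must be available on every compliant Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```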
2019-05-30 10:44:41 -06:00
Ioannis Kakavas 7cabe8acc9 Fix refresh remote JWKS logic (#42662)
This change ensures that:

- We only attempt to refresh the remote JWKS when there is a
signature related error ( BadJWSException instead of the
generic BadJOSEException )
- We do call OpenIDConnectAuthenticator#getUserClaims upon
successful refresh.
- We test this in OpenIdConnectAuthenticatorTests.

Without this fix, when using the OpenID Connect realm with a remote
JWKSet configured in `op.jwks_path`, the refresh would be triggered
for most configuration errors ( i.e. wrong value for `op.issuer` )
and Kibana wouldn't get a response and would time out since
`getUserClaims` wouldn't be called because
`ReloadableJWKSource#reloadAsync` wouldn't call `onResponse` on the
future.
2019-05-30 18:08:30 +03:00
Ioannis Kakavas 24a794fd6b Fix testTokenExpiry flaky test (#42585)
Test was using ClockMock#rewind passing the amount of nanoseconds
in order to "strip" nanos from the time value. This was intentional
as the expiration time of the UserToken doesn't have nanosecond
precision.
However, ClockMock#rewind doesn't support nanos either, so when it's
called with a TimeValue, it rewinds the clock by the TimeValue's
millis instead. This was causing the clock to go enough millis
before token expiration time and the test was passing. Once every
few hundred times though, the TimeValue by which we attempted to
rewind the clock only had nanos and no millis, so rewind moved the
clock back just a few millis, but still after expiration time.

This change moves the clock explicitly to the same instant as expiration,
using clock.setTime and disregarding nanos.
2019-05-30 07:53:56 +03:00
Armin Braun a96606d962
Safer Wait for Snapshot Success in ClusterPrivilegeTests (#40943) (#42575)
* Safer Wait for Snapshot Success in ClusterPrivilegeTests

* The snapshot state returned by the API might become SUCCESS before it's fully removed from the cluster state.
  * We should fix this race in the transport API but it's not trivial and will be part of the incoming big round of refactoring the repository interaction, this added check fixes the test for now
* closes #38030
2019-05-27 12:08:20 +02:00
Ryan Ernst a49bafc194
Split document and metadata fields in GetResult (#38373) (#42456)
This commit makes creators of GetField split the fields into document fields and metadata fields. It is part of larger refactoring that aims to remove the calls to static methods of MapperService related to metadata fields, as discussed in #24422.
2019-05-23 14:01:07 -07:00
Ioannis Kakavas aab97f1311 Fail early when rp.client_secret is missing in OIDC realm (#42256)
rp.client_secret is a required secure setting. Make sure we fail with
a SettingsException and a clear, actionable message when building
the realm, if the setting is missing.
2019-05-22 13:20:41 +03:00
Ioannis Kakavas ccdc0e6b3e Merge claims from userinfo and ID Token correctly (#42277)
Enhance the handling of merging the claims sets of the
ID Token and the UserInfo response. JsonObject#merge would throw a
runtime exception when attempting to merge two objects with the
same key and different values. This could happen for an OP that
returns different values for the same claim in the ID Token and the
UserInfo response ( Google does that for profile claim ).
If a claim is contained in both sets, we attempt to merge the
values if they are objects or arrays, otherwise the ID Token claim
value takes precedence and overwrites the userinfo response.
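
The merge rule can be sketched with plain `Map` and `List` values standing in for JSON objects and arrays. This is an illustration only: the class name is hypothetical, and treating an array merge as a de-duplicated union is an assumption, since the message does not say how arrays are combined.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ClaimsMergeSketch {

    /** Merges UserInfo claims into ID Token claims; the ID Token wins on scalar conflicts. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> merge(Map<String, Object> idToken, Map<String, Object> userInfo) {
        Map<String, Object> merged = new LinkedHashMap<>(idToken);
        for (Map.Entry<String, Object> entry : userInfo.entrySet()) {
            String claim = entry.getKey();
            Object theirs = entry.getValue();
            Object ours = merged.get(claim);
            if (!merged.containsKey(claim)) {
                // Claim only present in the UserInfo response: take it as-is.
                merged.put(claim, theirs);
            } else if (ours instanceof Map && theirs instanceof Map) {
                // Both are objects: merge recursively.
                merged.put(claim, merge((Map<String, Object>) ours, (Map<String, Object>) theirs));
            } else if (ours instanceof List && theirs instanceof List) {
                // Both are arrays: assumed here to be a de-duplicated union.
                List<Object> union = new ArrayList<>((List<Object>) ours);
                for (Object item : (List<Object>) theirs) {
                    if (!union.contains(item)) {
                        union.add(item);
                    }
                }
                merged.put(claim, union);
            }
            // Otherwise the values conflict and cannot be merged: keep the ID Token value.
        }
        return merged;
    }
}
```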
2019-05-22 13:15:41 +03:00
Ioannis Kakavas 7af30345b4 Revert "mute failing filerealm hash caching tests (#42304)"
This reverts commit 39fbed1577.
2019-05-22 13:15:00 +03:00
Ioannis Kakavas 34dda75cdf Ensure SHA256 is not used in tests (#42289)
SHA256 was recently added to the Hasher class in order to be used
in the TokenService. A few tests were still using values() to get
the available algorithms from the Enum and it could happen that
SHA256 would be picked up by these.
This change adds an extra convenience method
(Hasher#getAvailableAlgoCacheHash) and ensures that only this and
Hasher#getAvailableAlgoStoredHash are used for getting the list of
available password hashing algorithms in our tests.
2019-05-22 09:54:24 +03:00
Tim Vernum c5f191f6af
Add cluster restart for security on basic (#42217)
This performs a simple restart test to move a basic licensed
cluster from no security (the default) to security & transport TLS
enabled.

Backport of: #41933
2019-05-22 14:27:45 +10:00
Tal Levy 39fbed1577 mute failing filerealm hash caching tests (#42304)
some tests are failing after the introduction of #41792.

relates #42267 and #42289.
2019-05-21 10:40:14 -07:00
Tim Vernum 7b3a9c7033
Do not refresh realm cache unless required (#42212)
If there are no realms that depend on the native role mapping store,
then changes should not trigger any cache refresh.
A refresh with an empty realm array will refresh all realms.

This also fixes a spurious log warning that could occur if the
role mapping store was notified that the security index was recovered
before any realms were attached.

Backport of: #42169
2019-05-21 18:14:22 +10:00
Ioannis Kakavas b4a413c4d0
Hash token values for storage (#41792) (#42220)
This commit changes how access tokens and refresh tokens are stored
in the tokens index.

Access token values are now hashed before being stored in the id
field of the `user_token` and before becoming part of the token
document id. Refresh token values are hashed before being stored
in the token field of the `refresh_token`. The tokens are hashed
without a salt value since these are v4 UUID values that have
enough entropy themselves. Both rainbow table attacks and offline
brute force attacks are impractical.

As a side effect of this change and in order to support multiple
concurrent refreshes as introduced in #39631, upon refreshing an
<access token, refresh token> pair, the superseding access token
and refresh tokens values are stored in the superseded token doc,
encrypted with a key that is derived from the superseded refresh
token. As such, subsequent requests to refresh the same token in
the predefined time window will return the same superseding access
token and refresh token values, without hitting the tokens index
(as this only stores hashes of the token values). AES in GCM
mode is used for encrypting the token values and the key
derivation from the superseded refresh token uses a small number
of iterations as it needs to be quick.

For backwards compatibility reasons, the new behavior is only
enabled when all nodes in a cluster are in the required version
so that old nodes can cope with the token values in a mixed
cluster during a rolling upgrade.
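
A rough sketch of the encrypt-with-derived-key step, using only standard JCA calls. The PBKDF2 variant, iteration count, key size, and the use of a caller-supplied salt and IV are all assumptions made for illustration; none of them are taken from the commit, and the class name is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.PBEKeySpec;
import javax.crypto.spec.SecretKeySpec;

final class SupersedingTokenSketch {
    private static final int ITERATIONS = 1024; // assumed "small number of iterations"
    private static final int KEY_BITS = 128;    // assumed AES key size
    private static final int TAG_BITS = 128;    // GCM authentication tag length

    /** Derives an AES key from the superseded refresh token. */
    private static SecretKeySpec deriveKey(String supersededRefreshToken, byte[] salt)
            throws GeneralSecurityException {
        PBEKeySpec spec = new PBEKeySpec(supersededRefreshToken.toCharArray(), salt, ITERATIONS, KEY_BITS);
        byte[] key = SecretKeyFactory.getInstance("PBKDF2WithHmacSHA512").generateSecret(spec).getEncoded();
        return new SecretKeySpec(key, "AES");
    }

    private static byte[] crypt(int mode, byte[] input, String token, byte[] salt, byte[] iv) {
        try {
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(mode, deriveKey(token, salt), new GCMParameterSpec(TAG_BITS, iv));
            return cipher.doFinal(input);
        } catch (GeneralSecurityException e) {
            // Includes authentication-tag failures when the wrong token is supplied.
            throw new IllegalStateException("token encryption/decryption failed", e);
        }
    }

    /** Encrypts the superseding token values so they can be stored in the superseded doc. */
    static byte[] encrypt(String supersedingTokens, String supersededRefreshToken, byte[] salt, byte[] iv) {
        return crypt(Cipher.ENCRYPT_MODE, supersedingTokens.getBytes(StandardCharsets.UTF_8),
                supersededRefreshToken, salt, iv);
    }

    /** Recovers the superseding token values when the same refresh token is presented again. */
    static String decrypt(byte[] ciphertext, String supersededRefreshToken, byte[] salt, byte[] iv) {
        return new String(crypt(Cipher.DECRYPT_MODE, ciphertext, supersededRefreshToken, salt, iv),
                StandardCharsets.UTF_8);
    }
}
```

In this sketch a caller would generate a fresh random salt and a 12-byte IV per document and store both next to the ciphertext; only a client holding the superseded refresh token can then derive the key and read the superseding values.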
2019-05-20 17:55:29 +03:00
Jay Modi dbbdcea128
Update ciphers for TLSv1.3 and JDK11 if available (#42082)
This commit updates the default ciphers and TLS protocols that are used
when the runtime JDK supports them. New cipher support has been
introduced in JDK 11 and 12 along with performance fixes for AES GCM.
The ciphers are ordered with PFS ciphers being most preferred, then
AEAD ciphers, and finally those with mainstream hardware support. When
available stronger encryption is preferred for a given cipher.

This is a backport of #41385 and #41808. There are known JDK bugs with
TLSv1.3 that have been fixed in various versions. These are:

1. The JDK's bundled HttpsServer will loop endlessly under JDK11 and JDK
12.0 (Fixed in 12.0.1) based on the way the Apache HttpClient performs
a close (half close).
2. In all versions of JDK 11 and 12, the HttpsServer will loop endlessly
when certificates are not trusted or another handshake error occurs. An
email has been sent to the openjdk security-dev list and #38646 is open
to track this.
3. In JDK 11.0.2 and prior there is a race condition with session
resumption that leads to handshake errors when multiple concurrent
handshakes are going on between the same client and server. This bug
does not appear when client authentication is in use. This is
JDK-8213202, which was fixed in 11.0.3 and 12.0.
4. In JDK 11.0.2 and prior there is a bug where resumed TLS sessions do
not retain peer certificate information. This is JDK-8212885.

The way these issues are addressed is that the current java version is
checked and used to determine the supported protocols for tests that
provoke these issues.
2019-05-20 09:45:36 -04:00
Ryan Ernst fa1d1d1f57 Deprecate the native realm migration tool (#42142)
The migrate tool was added when the native realm was created, to aid
users in converting from file realms that were per node, into the
cluster managed native realm. While this tool was useful at the time,
users should now be using the native realm directly. This commit
deprecates the tool, to be removed in a followup for 8.0.
2019-05-16 09:52:31 -04:00
Tim Vernum 9191b02213
Enforce transport TLS on Basic with Security (#42150)
If a basic license enables security, then we should also enforce TLS
on the transport interface.

This was already the case for Standard/Gold/Platinum licenses.

For Basic, security defaults to disabled, so some of the process
around checking whether security is actually enabled is more complex
now that we need to account for basic licenses.
2019-05-15 13:59:27 -04:00
David Kyle c0d67919c8 Mute ApiKeyIntegTests
See https://github.com/elastic/elasticsearch/issues/41747
2019-05-09 13:24:52 +01:00
Ioannis Kakavas 58041f3fdb Remove op.name configuration setting (#41445)
This setting was not eventually used in the realm and thus can be
removed
2019-05-07 19:01:55 +03:00
Tim Vernum 3508b6c641
Log warning when unlicensed realms are skipped (#41828)
Because realms are configured at node startup, but license levels can
change dynamically, it is possible to have a running node that has a
particular realm type configured, but that realm is not permitted under
the current license.
In this case the realm is silently ignored during authentication.

This commit adds a warning in the elasticsearch logs if authentication
fails, and there are realms that have been skipped due to licensing.
This message is not intended to imply that the realms could (or would)
have successfully authenticated the user, but it may help reduce
confusion about why authentication failed if the caller was expecting
the authentication to be handled by a particular realm that is in fact
unlicensed.

Backport of: #41778
2019-05-07 09:55:48 +10:00
Ryan Ernst 6fd8924c5a Switch run task to use real distro (#41590)
The run task is supposed to run elasticsearch with the given plugin or
module. However, for modules, this is most realistic if using the full
distribution. This commit changes the run setup to use the default or
oss as appropriate.
2019-05-06 12:34:07 -07:00
Tim Brooks 927013426a
Read multiple TLS packets in one read call (#41820)
This is related to #27260. Currently we have a single read buffer that
is no larger than a single TLS packet. This prevents us from reading
multiple TLS packets in a single socket read call. This commit modifies
our TLS work to support reading similar to the plaintext case. The data
will be copied to a (potentially) recycled TLS packet-sized buffer for
interaction with the SSLEngine.
2019-05-06 09:51:32 -06:00
Jason Tedor d0f071236a
Simplify filtering addresses on interfaces (#41758)
This commit is a refactoring of how we filter addresses on
interfaces. In particular, we refactor all of these methods into a
common private method. We also change the order of logic to first check
if an address matches our filter and then check if the interface is
up. This is to possibly avoid problems we are seeing where devices are
flapping up and down while we are checking for loopback addresses. We do
not expect the loopback device to flap up and down so by reversing the
logic here we avoid that problem on CI machines. Finally, we expand the
error message when this does occur so that we know which device is
flapping.
2019-05-02 16:36:27 -04:00
Tim Brooks b4bcbf9f64
Support http read timeouts for transport-nio (#41466)
This is related to #27260. Currently there is a setting
http.read_timeout that allows users to define a read timeout for the
http transport. This commit implements support for this functionality
with the transport-nio plugin. The behavior here is that a repeating
task will be scheduled for the interval defined. If there have been
no requests received since the last run and there are no inflight
requests, the channel will be closed.
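
The close decision described here can be sketched as a small per-channel tracker. This is an illustration under assumed names, not the transport-nio code: a repeating task would call `shouldCloseOnTick()` once per `http.read_timeout` interval.

```java
final class ReadTimeoutSketch {
    private int requestsSinceLastCheck;
    private int inFlightRequests;

    /** Called when a request arrives on the channel. */
    synchronized void onRequestReceived() {
        requestsSinceLastCheck++;
        inFlightRequests++;
    }

    /** Called when the response for a request has been sent. */
    synchronized void onResponseSent() {
        inFlightRequests--;
    }

    /**
     * Called by the repeating task. Returns true when no request was received
     * since the previous run and nothing is in flight, i.e. the channel is idle
     * and should be closed.
     */
    synchronized boolean shouldCloseOnTick() {
        boolean idle = requestsSinceLastCheck == 0 && inFlightRequests == 0;
        requestsSinceLastCheck = 0; // start a fresh interval
        return idle;
    }
}
```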
2019-05-02 09:48:52 -06:00
Jason Tedor 0870523489
Fix compilation in SecurityMocks
This commit fixes compilation in SecurityMocks from what appears to be
some merge conflicts that were not resolved adequately.
2019-05-01 14:29:33 -04:00
Jason Tedor f500d727cf
Resolve conflicts in AuthenticationServiceTests
This commit resolves some merge conflicts that arose in
AuthenticationServiceTests after a rebase.
2019-05-01 14:20:58 -04:00
Jason Tedor 942a1445f3
Fix reference to 7.1 in security token tests
This version should be referencing 7.2 rather than 7.1, due to some
changes in timing of the token service changes.
2019-05-01 14:00:35 -04:00
Ioannis Kakavas 8426130553
Add negative tests for security features in basic
Assert that API Keys, Tokens, DLS/FLS do not work in basic
2019-05-01 14:00:32 -04:00
Tim Vernum 3589ca8493
Add test for security on basic license.
This is modelled on the qa test for TLS on basic.

It starts a cluster on basic with security & performs a number of
security related checks.
It also performs those same checks on a trial license.
2019-05-01 14:00:29 -04:00
Tim Vernum 0ee16d0115
Security on Basic License
This adds support for using security on a basic license.
It includes:

- AllowedRealmType.NATIVE realms (reserved, native, file)
- Roles / RBAC
- TLS (already supported)

It does not support:

- Audit
- IP filters
- Token Service & API Keys
- Advanced realms (AD, LDAP, SAML, etc)
- Advanced roles (DLS, FLS)
- Pluggable security

As with trial licences, security is disabled by default.

This commit does not include any new automated tests, but existing tests have been updated.
2019-05-01 14:00:25 -04:00
Jason Tedor 7f3ab4524f
Bump 7.x branch to version 7.2.0
This commit adds the 7.2.0 version constant to the 7.x branch, and bumps
BWC logic accordingly.
2019-05-01 13:38:57 -04:00
Albert Zaharovits 990be1f806
Security Tokens moved to a new separate index (#40742)
This commit introduces the `.security-tokens` and `.security-tokens-7`
alias-index pair. Because index snapshotting is at the index level granularity
(ie you cannot snapshot a subset of an index) snapshotting `.security` had
the undesirable effect of storing ephemeral security tokens. The changes
herein address this issue by moving tokens "seamlessly" (without user
intervention) to another index, so that a "Security Backup" (ie snapshot of
`.security`) would not be bloated by ephemeral data.
2019-05-01 14:53:56 +03:00
Jason Tedor 0b46a62f6b
Drop distinction in entries for keystore (#41701)
Today we allow adding entries from a file or from a string, yet we
internally maintain this distinction such that if you try to add a value
from a file for a setting that expects a string or add a value from a
string for a setting that expects a file, you will have a bad time. This
causes a pain for operators such that for each setting they need to know
this difference. Yet, we do not need to maintain this distinction
internally as they are bytes after all. This commit removes that
distinction and includes logic to upgrade legacy keystores.
2019-05-01 07:02:04 -04:00
Tim Brooks df3ef66294
Remove dedicated SSL network write buffer (#41654)
This is related to #27260. Currently for the SSLDriver we allocate a
dedicated network write buffer and encrypt the data into that buffer one
buffer at a time. This requires constantly switching between encrypting
and flushing. This commit adds a dedicated outbound buffer for SSL
operations that will internally allocate new packet sized buffers as
they are needed (for writing encrypted data). This allows us to totally
encrypt an operation before writing it to the network. Eventually it can
be hooked up to buffer recycling.

This commit also backports the following commit:

Handle WRAP ops during SSL read

It is possible that a WRAP operation can occur while decrypting
handshake data in TLS 1.3. The SSLDriver does not currently handle this
well as it does not have access to the outbound buffer during read call.
This commit moves the buffer into the Driver to fix this issue. Data
wrapped during a read call will be queued for writing after the read
call is complete.
2019-04-29 17:59:13 -06:00
David Kyle 1a6ffb2644 Mute ClusterPrivilegeTests.testThatSnapshotAndRestore
Tracked in #38030
2019-04-29 16:45:01 +10:00
Yogesh Gaikwad c0d40ae4ca
Remove deprecated stashWithOrigin calls and use the alternative (#40847) (#41562)
This commit removes the deprecated `stashWithOrigin` and
modifies its usage to use the alternative.
2019-04-28 21:25:42 +10:00
Tim Brooks 1f8ff052a1
Revert "Remove dedicated SSL network write buffer (#41283)"
This reverts commit f65a86c258.
2019-04-25 18:39:25 -06:00
Tim Brooks f65a86c258
Remove dedicated SSL network write buffer (#41283)
This is related to #27260. Currently for the SSLDriver we allocate a
dedicated network write buffer and encrypt the data into that buffer one
buffer at a time. This requires constantly switching between encrypting
and flushing. This commit adds a dedicated outbound buffer for SSL
operations that will internally allocate new packet sized buffers as
they are needed (for writing encrypted data). This allows us to totally
encrypt an operation before writing it to the network. Eventually it can
be hooked up to buffer recycling.
2019-04-25 14:30:54 -06:00
Christoph Büscher 52495843cc [Docs] Fix common word repetitions (#39703) 2019-04-25 20:47:47 +02:00
Tim Brooks 6d7110edf5
SSLDriver can transition to CLOSED in handshake (#41458)
TLS 1.3 changes to the SSLEngine introduced a scenario where an UNWRAP
call during a handshake can consume a close notify alert without
throwing an exception. This means that we continue down a codepath where
we assert that we are still in handshaking mode. Transitioning to closed
from handshaking is a valid scenario. This commit removes this
assertion.
2019-04-25 12:02:17 -06:00
Jim Ferenczi 6184efaff6
Handle unmapped fields in _field_caps API (#34071) (#41426)
Today the `_field_caps` API returns the list of indices where a field
is present only if this field has different types within the requested indices.
However if the request is an index pattern (or an alias, or both...) there
is no way to infer the indices if the response contains only fields that have
the same type in all indices. This commit changes the response to always return
the list of indices in the response. It also adds a way to retrieve unmapped fields
in a specific section per field called `unmapped`. This section is created for each field
that is present in some indices but not all if the parameter `include_unmapped` is set to
true in the request (defaults to false).
2019-04-25 18:13:48 +02:00
Albert Zaharovits fe5789ada1 Fix Has Privilege API check on restricted indices (#41226)
The Has Privileges API allows tapping into the authorization process, to validate
privileges without actually running the operations to be authorized. This commit
fixes a bug, in which the Has Privilege API returned spurious results when checking
for index privileges over restricted indices (currently .security, .security-6,
.security-7). The actual authorization process is not affected by the bug.
2019-04-25 12:03:27 +03:00
Ryan Ernst 7e3875d781 Upgrade hamcrest to 2.1 (#41464)
hamcrest has some improvements in newer versions, like FileMatchers
that make assertions regarding file exists cleaner. This commit upgrades
to the latest version of hamcrest so we can start using new and improved
matchers.
2019-04-24 23:40:03 -07:00
Albert Zaharovits c3e0ae24d3
Fix role mapping DN field wildcards for users with NULL DNs (#41343)
The `DistinguishedNamePredicate`, used for matching users to role mapping
expressions, should handle users with null DNs. But it fails to do so (and this is
a NPE bug), if the role mapping expression contains a lucene regexp or a wildcard.

The fix simplifies `DistinguishedNamePredicate` to not handle null DNs at all, and
instead use the `ExpressionModel#NULL_PREDICATE` for the DN field, just like
any other missing user field.
2019-04-22 10:25:24 +03:00
Yogesh Gaikwad 0d1178fca6
put mapping authorization for alias with write-index and multiple read indices (#40834) (#41287)
When the same alias points to multiple indices we can write to only one index
with `is_write_index` value `true`. The special handling in case of the put
mapping request (to resolve authorized indices) has a check on indices size
for a concrete index. If multiple indices existed then it marked the request
as unauthorized.

The check has been modified to consider write index flag and only when the
requested index matches with the one with write index alias, the alias is considered
for authorization.

Closes #40831
2019-04-17 14:25:33 +10:00
Ioannis Kakavas fe9442b05b
Add an OpenID Connect authentication realm (#40674) (#41178)
This commit adds an OpenID Connect authentication realm to
elasticsearch. Elasticsearch (with the assistance of kibana or
another web component) acts as an OpenID Connect Relying
Party and supports the Authorization Code Grant and Implicit
flows as described in http://ela.st/oidc-spec. It adds support
for consuming and verifying signed ID Tokens, both RP
initiated and 3rd party initiated Single Sign on and RP
initiated single logout.
It also adds an OpenID Connect Provider in the idp-fixture to
be used for the associated integration tests.

This is a backport of #40674
2019-04-15 12:41:16 +03:00
Yogesh Gaikwad 47ba45732d
Find and use non local IPv4 address while testing IP filtering (#40234) (#41141)
For pattern "n:localhost" PatternRule#isLocalhost() matches
any local address, loopback address.
[Note: I think for "localhost" this should not consider IP address
as a match when they are bound to network interfaces. It should just
be loopback address check unless the intent is to match all local addresses.
This class is adopted from Netty3 and I am not sure if this is intended
behavior or maybe I am missing something]

For now I have fixed this assuming the PatternRule#isLocalhost check is
correct by avoiding use of local address to check address denied.

Closes #40194
2019-04-13 04:37:25 +10:00
Martijn van Groningen 1eff8976a8
Deprecate AbstractHlrc* and AbstractHlrcStreamable* base test classes (#41014)
* moved hlrc parsing tests from xpack to hlrc module and removed dependency on hlrc from xpack core

* deprecated old base test class

* added deprecated jdoc tag

* split test between xpack-core part and hlrc part

* added lang-mustache test dependency, this previously came in via
hlrc dependency.

* added hlrc dependency on a qa module

* duplicated ClusterPrivilegeName class in xpack-core, since x-pack
core no longer has a dependency on hlrc.

* replace ClusterPrivilegeName usages with string literals

* moved tests to dedicated hlrc packages in order to remove the Hlrc part from the name and make sure to use imports instead of fully qualified class names where possible

* remove ESTestCase. from method invocation and use method directly,
because these tests indirectly extend from ESTestCase
2019-04-10 16:29:17 +02:00
Albert Zaharovits adf3393a4e
Deprecate permission over aliases (#38059) (#41060)
This PR generates deprecation log entries for each Role Descriptor,
used for building a Role, when the Role Descriptor grants more privileges
for an alias compared to an index that the alias points to. This is done in
preparation for the removal of the ability to define privileges over aliases.
There is one log entry for each "role descriptor name"-"alias name" pair.
On such a notice, the administrator is expected to modify the Role Descriptor
definition so that the name pattern for index names does not cover aliases.

Caveats:
* Role Descriptors that are not used in any authorization process,
either because they are not mapped to any user or the user they are mapped to
is not used by clients, are not checked.
* Role Descriptors are merged when building the effective Role that is used in
the authorization process. Therefore some Role Descriptors can overlap others,
so even if one matches aliases in a deprecated way, and it is reported as such,
it is not at risk from the breaking behavior in the current role mapping configuration
and index-alias configuration. It is still reported because it is a best practice to
change its definition, or remove offending aliases.
2019-04-10 15:02:33 +03:00
Mark Vieira 1287c7d91f
[Backport] Replace usages RandomizedTestingTask with built-in Gradle Test (#40978) (#40993)
* Replace usages RandomizedTestingTask with built-in Gradle Test (#40978)

This commit replaces the existing RandomizedTestingTask and supporting code with Gradle's built-in JUnit support via the Test task type. Additionally, the previous workaround to disable all tasks named "test" and create new unit testing tasks named "unitTest" has been removed such that the "test" task now runs unit tests as per the normal Gradle Java plugin conventions.

(cherry picked from commit 323f312bbc829a63056a79ebe45adced5099f6e6)

* Fix forking JVM runner

* Don't bump shadow plugin version
2019-04-09 11:52:50 -07:00
Jason Tedor 26d8ecfe07
Fix unsafe publication in opt-out query cache (#40957)
This opt-out query cache has an unsafe publication issue, where the
cache is exposed to another thread (namely the cluster state update
thread) before the constructor has finished execution. This exposes the
opt-out query cache to concurrency bugs. This commit addresses this by
ensuring that the opt-out query cache is not registered as a listener
for license state changes until after the constructor has returned.
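
A minimal sketch of the pattern described above, not the actual Elasticsearch code: the constructor only initializes fields, and a static factory registers the listener once construction has finished. `LicenseStateSketch` and `OptOutCacheSketch` are hypothetical names used for illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the license state; not the Elasticsearch class.
final class LicenseStateSketch {
    private final List<Runnable> listeners = new CopyOnWriteArrayList<>();

    void addListener(Runnable listener) {
        listeners.add(listener);
    }

    void fireChanged() {
        listeners.forEach(Runnable::run);
    }
}

// Hypothetical cache that listens for license changes.
final class OptOutCacheSketch {
    private final AtomicInteger notifications = new AtomicInteger();

    // The constructor never lets `this` escape to another thread.
    private OptOutCacheSketch() {
    }

    // Registration happens only after the constructor has returned.
    static OptOutCacheSketch create(LicenseStateSketch licenseState) {
        OptOutCacheSketch cache = new OptOutCacheSketch();
        licenseState.addListener(cache::onLicenseChange);
        return cache;
    }

    private void onLicenseChange() {
        notifications.incrementAndGet();
    }

    int notifications() {
        return notifications.get();
    }

    public static void main(String[] args) {
        LicenseStateSketch state = new LicenseStateSketch();
        OptOutCacheSketch cache = create(state);
        state.fireChanged();
        System.out.println(cache.notifications());
    }
}
```

Registering inside the constructor would hand `this` to the listener list before all fields are guaranteed to be visible to other threads, which is the hazard the commit removes.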
2019-04-08 16:11:20 -04:00
Mark Vieira 2569fb60de Avoid sharing source directories as it breaks intellij (#40877)
* Avoid sharing source directories as it breaks intellij
* Subprojects share main project output classes directory
* Fix jar hell
* Fix sql security with ssl integ tests
* Relax dependency ordering rule so we don't explode on cycles
2019-04-08 17:26:46 +03:00
Tim Vernum 26c63e0115
Add test for HTTP and Transport TLS on basic license (#40932)
This adds a new security/qa test for TLS on a basic license.

It starts a 2 node cluster with a basic license, and TLS enabled
on both HTTP and Transport, and verifies the license type, x-pack
SSL usage and SSL certificates API.

It also upgrades the cluster to a trial license and performs that
same set of checks (to ensure that clusters with basic license
and TLS enabled can be upgraded to a higher feature license)

Backport of: #40714
2019-04-08 13:23:12 +10:00
Jay Modi f34663282c
Update apache httpclient to version 4.5.8 (#40875)
This change updates our version of httpclient to version 4.5.8, which
contains the fix for HTTPCLIENT-1968, which is a bug where the client
started re-writing paths that contained encoded reserved characters
with their unreserved form.
2019-04-05 13:48:10 -06:00
Martijn van Groningen 809a5f13a4
Make -try xlint warning disabled by default. (#40833)
Many gradle projects specifically use the -try exclude flag, because
there are many cases where auto-closeable resource ignore is never
referenced in body of corresponding try statement. Suppressing this
warning specifically in each case that it happens using
`@SuppressWarnings("try")` would be very verbose.

This change removes `-try` from any gradle project and adds it to the
build plugin. Also this change removes exclude flags from gradle projects
that are already specified in the build plugin (for example -deprecation).

Relates to #40366
2019-04-05 08:02:26 +02:00
Tim Vernum 1a30ab22fb
Show SSL usage when security is not disabled (#40761)
It is possible to have SSL enabled but security disabled if security
was dynamically disabled by the license type (e.g. trial license).

e.g. In the following configuration:

    xpack.license.self_generated.type: trial
    # xpack.security not set, default to disabled on trial
    xpack.security.transport.ssl.enabled: true

The security feature will be reported as

    available: true
    enabled: false

And in this case, SSL will be active even though security is not
enabled.

This commit causes the X-Pack feature usage to report the state of the
"ssl" features unless security was explicitly disabled in the
settings.

Backport of: #40672
2019-04-04 14:40:15 +11:00
Tim Vernum 2c770ba3cb
Support mustache templates in role mappings (#40571)
This adds a new `role_templates` field to role mappings that is an
alternative to the existing roles field.

These templates are evaluated at runtime to determine which roles should be
granted to a user.
For example, it is possible to specify:

    "role_templates": [
      { "template":{ "source": "_user_{{username}}" } }
    ]

which would mean that every user is assigned to their own role based on
their username.
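
To illustrate the evaluation step only, here is a rough sketch that uses plain string substitution in place of a real Mustache engine. `RoleTemplateSketch` and `render` are hypothetical names, and the actual role mapping engine is not shown.

```java
// Hypothetical helper; a real implementation would use a Mustache engine.
final class RoleTemplateSketch {

    // Replaces the single {{username}} placeholder with the given username.
    static String render(String templateSource, String username) {
        return templateSource.replace("{{username}}", username);
    }

    public static void main(String[] args) {
        System.out.println(render("_user_{{username}}", "alice"));
    }
}
```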

You may not specify both roles and role_templates in the same role
mapping.

This commit adds support for templates to the role mapping API, the role
mapping engine, the Java high level rest client, and Elasticsearch
documentation.

Due to the lack of caching in our role mapping store, it is currently
inefficient to use a large number of templated role mappings. This will be
addressed in a future change.

Backport of: #39984, #40504
2019-04-02 20:55:10 +11:00
Tim Vernum 7bdd41399d
Support roles with application privileges against wildcard applications (#40675)
This commit introduces 2 changes to application privileges:

- The validation rules now accept a wildcard in the "suffix" of an application name.
  Wildcards were always accepted in the application name, but the "valid filename" check
  for the suffix incorrectly prevented the use of wildcards there.

- A role may now be defined against a wildcard application (e.g. kibana-*) and this will
  be correctly treated as granting the named privileges against all named applications.
  This does not allow wildcard application names in the body of a "has-privileges" check, but the
  "has-privileges" check can test concrete application names against roles with wildcards.

Backport of: #40398
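
As a rough illustration of the second change, the sketch below tests concrete application names against a role's application pattern. It assumes only a single trailing `*` is supported, which is a simplification, and `ApplicationPatternSketch` is a hypothetical name.

```java
// Hypothetical matcher; supports only an exact name or one trailing '*'.
final class ApplicationPatternSketch {

    static boolean matches(String pattern, String concreteApplication) {
        if (pattern.endsWith("*")) {
            String prefix = pattern.substring(0, pattern.length() - 1);
            return concreteApplication.startsWith(prefix);
        }
        return pattern.equals(concreteApplication);
    }

    public static void main(String[] args) {
        System.out.println(matches("kibana-*", "kibana-.kibana"));
        System.out.println(matches("kibana-*", "logstash"));
    }
}
```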
2019-04-02 14:48:39 +11:00
Yannick Welsch 64b31f44af No mapper service and index caches for replicated closed indices (#40423)
Replicated closed indices can't be indexed into or searched, and therefore don't need a shard with
full indexing and search capabilities allocated. We can save on a lot of heap memory for those
indices by not allocating a mapper service and caching infrastructure (which preallocates a constant
amount per instance). Before this change, a 1GB ES instance could host 250 replicated closed
metricbeat indices (each index with one shard). After this change, the same instance can host 7300
replicated closed metricbeat indices (not that this would be a recommended configuration). Most
of the remaining memory is in the cluster state and the IndexSettings object.
2019-03-27 19:04:24 +01:00
Albert Zaharovits 2f80b7304f
Refactor Token Service (#39808)
This refactoring is in the context of the work related to moving security
tokens to a new index. In that regard, the Token Service has to work with
token documents stored in any of the two indices, albeit only as a transient
situation. I reckoned the added complexity as unmanageable,
hence this refactoring.

This is incomplete, as it fails to address the goal of minimizing .security accesses,
but I have stopped because otherwise it would've become a full blown rewrite
(if not already). I will follow-up with more targeted PRs.

Beyond the pure refactoring, some 400 errors became 500 errors. Furthermore,
more stringent validation of various return results has been implemented, notably
that of the token document creation.
2019-03-21 15:55:56 +02:00
Yogesh Gaikwad 5d30df5a60
Fix so non super users can also create API keys (#40028) (#40286)
When creating API keys we check whether an API key with
the same name already exists and fail the request if it does.
The check should have been performed with XPackSecurityUser
instead of the authenticated user. This caused the request to fail
when a non-super user tried to create an API key.
This commit fixes this by executing the search action with SECURITY_ORIGIN
so it can be executed with XPackSecurityUser.
Also fixed the REST test to avoid using a user with the `super_user` role.

Closes #40029
2019-03-21 15:53:25 +11:00
Yannick Welsch 1d8b5fc658 Fail command-line client's auto-URL detection with helpful message (#40151)
The setup-passwords tool gives cryptic messages in cases where custom discovery providers are
used (see #33580). As the URL auto-detection logic should be seen as best effort, this commit
improves the exception message to make it clearer what needs to be done to fix the issue.

Relates #33580
2019-03-19 09:04:14 +01:00
Albert Zaharovits 124de8d938 Un-hardcode SecurityIndexManager to handle generic indices (#40064)
`SecurityIndexManager` is hardcoded to handle only the `.security`-`.security-7` alias-index pair.
This commit removes the hardcoded bits, so that the `SecurityIndexManager` can be reused
for other indices, such as the planned security tokens index (`.security-tokens-7`).
2019-03-17 14:46:16 +02:00
Albert Zaharovits 1b75ee0bd7 AuditTrail correctly handle ReplicatedWriteRequest (#39925)
This fix deduplicates index names in `BulkShardRequests` and only audits
the specific resolved index for every comprising `BulkItemRequest`.
2019-03-17 13:05:26 +02:00
Jason Tedor d02bca1314
Upgrade the bouncycastle dependency to 1.61 (#40017)
This commit upgrades the bouncycastle dependency from 1.59 to 1.61.
2019-03-14 08:54:47 -04:00
Michael Basnight 8c78fc096d More lenient socket binding in LDAP tests (#39864)
The LDAP tests attempt to bind all interfaces,
but if for some reason an interface can't be bound
the tests will stall until the suite times out.

This modifies the tests to be a bit more lenient and allow
some binding to fail so long as at least one succeeds.
This allows the test to continue even in more antagonistic
environments.
2019-03-12 12:00:49 -04:00
Albert Zaharovits 3c7fafd0cc Fix token invalidation when retries exhausted (#39799)
Fixes a bug where the index invalidation listener was not called
when the retry count is exhausted but there are still tokens to be retried.
2019-03-08 20:18:59 +02:00
Tim Brooks 8043fefcf6
Log close_notify during handshake at debug level (#39715)
A TLS handshake requires exchanging multiple messages to initiate a
session. If one side decides to close during the handshake, it is
supposed to send a close_notify alert (similar to closing during
application data exchange). The Java SSLEngine throws an
exception when this happens. We currently log this at the warn level if
trace logging is not enabled. This level is too high for a valid
scenario. Additionally it happens all the time in tests (quickly opening
and closing transports). This commit changes this to be logged at the
debug level if trace is not enabled. Additionally, it extracts the
transport security exception handling to a common class.
2019-03-07 09:52:18 -07:00
Ioannis Kakavas 6c19d872a0 Fix testRefreshingMultipleTimesWithinWindowSucceeds (#39701)
Previously all the threads were writing the received tokens to a
HashSet. In cases with many threads, sometimes (1 every ~25 tests)
calling size() on the HashSet returned 2 even though it seemed to
contain only one String and there was no evidence from logging that
threadSecurityClient.refreshToken() ever returned a different
access or refresh token.

This commit changes the test to use a ConcurrentHashMap instead,
checking that we only received one pair of access token/refresh token
eventually. It also adds a check so that we won't take into consideration
tokens that are returned after 30s, hence not in the concurrent refresh
time window.
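
A small sketch of the idea behind the test change, under the assumption that a concurrent set view backed by `ConcurrentHashMap` is an acceptable stand-in for the map used in the real test. `ConcurrentTokenCollectorSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical collector: many threads record the token they received.
final class ConcurrentTokenCollectorSketch {

    static Set<String> collect(int threadCount, String token) {
        // Thread-safe set view backed by a ConcurrentHashMap.
        Set<String> seen = ConcurrentHashMap.newKeySet();
        List<Thread> threads = new ArrayList<>();
        for (int i = 0; i < threadCount; i++) {
            Thread thread = new Thread(() -> seen.add(token));
            threads.add(thread);
            thread.start();
        }
        for (Thread thread : threads) {
            try {
                thread.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(collect(16, "access-token").size());
    }
}
```

A plain `HashSet` is not safe for concurrent writers, so its reported size can be inconsistent with its contents, which matches the symptom described above.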
2019-03-07 13:13:50 +02:00
Albert Zaharovits fb1005fffc
Fix Token Service retry mechanism (#39639)
Fixes several errors of the token retry logic:

* not checking for backoff.hasNext() before calling backoff.next()
* checking for backoff.hasNext() without calling backoff.next()
* not preserving the context on the retry
* calling scheduleWithFixedDelay instead of schedule
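
The corrected retry flow can be sketched roughly as follows. This is an illustration under assumptions, not the Token Service code: `RetrySketch` is a hypothetical name, the backoff is modelled as an `Iterator<Long>` of delays in milliseconds, and thread context preservation is left out.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

final class RetrySketch {

    // Runs `attempt` until it succeeds or the backoff is exhausted.
    static void retry(ScheduledExecutorService scheduler, Iterator<Long> backoffMillis,
                      BooleanSupplier attempt, Consumer<Boolean> onDone) {
        if (attempt.getAsBoolean()) {
            onDone.accept(true);
            return;
        }
        // Check hasNext() before next(), and always notify when exhausted.
        if (backoffMillis.hasNext() == false) {
            onDone.accept(false);
            return;
        }
        long delay = backoffMillis.next();
        // One-shot schedule; a fixed-delay schedule would keep firing.
        scheduler.schedule(() -> retry(scheduler, backoffMillis, attempt, onDone),
            delay, TimeUnit.MILLISECONDS);
    }

    // Convenience wrapper that blocks until the retry loop has finished.
    static boolean retryBlocking(Iterator<Long> backoffMillis, BooleanSupplier attempt) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CompletableFuture<Boolean> result = new CompletableFuture<>();
        try {
            retry(scheduler, backoffMillis, attempt, result::complete);
            return result.join();
        } finally {
            scheduler.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        boolean ok = retryBlocking(Arrays.asList(5L, 10L).iterator(), () -> ++calls[0] == 2);
        System.out.println(ok + " after " + calls[0] + " attempts");
    }
}
```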
2019-03-06 15:32:23 +02:00
David Turner 77dd711847 Tidy up GroupedActionListener (#39633)
Today the `GroupedActionListener` accepts a `defaults` parameter but all
callers pass an empty list. Also it is permitted to pass an empty group but
this is trappy because the delegated listener is never called in that case.
This commit removes the `defaults` parameter and forbids an empty group.
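
A simplified sketch of a grouped listener that rejects an empty group. It is not the real `GroupedActionListener`: failure handling is omitted, and `GroupedListenerSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Collects groupSize responses, then calls the delegate exactly once.
final class GroupedListenerSketch<T> {
    private final List<T> results = new ArrayList<>();
    private final int groupSize;
    private final Consumer<List<T>> delegate;

    GroupedListenerSketch(int groupSize, Consumer<List<T>> delegate) {
        // With an empty group the delegate would never be called, so forbid it.
        if (groupSize <= 0) {
            throw new IllegalArgumentException("groupSize must be greater than 0 but was " + groupSize);
        }
        this.groupSize = groupSize;
        this.delegate = delegate;
    }

    synchronized void onResponse(T result) {
        results.add(result);
        if (results.size() == groupSize) {
            delegate.accept(new ArrayList<>(results));
        }
    }

    public static void main(String[] args) {
        GroupedListenerSketch<String> listener =
            new GroupedListenerSketch<>(2, all -> System.out.println(all));
        listener.onResponse("a");
        listener.onResponse("b");
    }
}
```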
2019-03-06 09:25:10 +00:00
Yogesh Gaikwad c91dcbd5ee
Types removal security index template (#39705) (#39728)
As we are moving to single type indices,
we need to address this change in security-related indexes.
To address this, we are
- updating index templates to use preferred type name `_doc`
- updating the API calls to use preferred type name `_doc`

Upgrade impact:
In case of an upgrade from 6.x, the security index has type
`doc` and this will keep working as there is a single type and `_doc`
works as an alias to an existing type. The change is handled in the
`SecurityIndexManager` when we load mappings and settings from
the template. Previously, we used to do a `PutIndexTemplateRequest`
with the mapping source JSON with the type name. This has been
modified to remove the type name from the source.
So in the case of an upgrade, the `doc` type is updated
whereas for fresh installs `_doc` is updated. This happens because the
backend handles `_doc` as an alias to the existing type name.

An optional step is to `reindex` security index and update the
type to `_doc`.

Since we do not support the security audit log index,
that template has been deleted.

Relates: #38637
2019-03-06 18:53:59 +11:00
Ioannis Kakavas 7ed9d52824
Support concurrent refresh of refresh tokens (#39647)
This is a backport of #39631

Co-authored-by: Jay Modi <jaymode@users.noreply.github.com>

This change adds support for the concurrent refresh of access
tokens as described in #36872
In short it allows subsequent client requests to refresh the same token that
come within a predefined window of 60 seconds to be handled as duplicates
of the original one and thus receive the same response with the same newly
issued access token and refresh token.
In order to support that, two new fields are added in the token document. One
contains the instant (in epoch milliseconds) when a given refresh token is refreshed
and one that contains a pointer to the token document that stores the new
refresh token and access token that was created by the original refresh.
A side effect of this change, which was also an intended enhancement
for the token service, is that we needed to stop encrypting the string
representation of the UserToken while serializing. (This was necessary because we
correctly used a new IV every time we encrypted a token during serialization, so
subsequent serializations of the exact same UserToken would produce
different access token strings.)
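
The effect of a fresh IV can be seen in the small sketch below. It assumes AES in GCM mode purely for illustration, since the cipher the token service used is not stated here, and `FreshIvSketch` is a hypothetical name.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

final class FreshIvSketch {
    private static final SecureRandom RANDOM = new SecureRandom();

    static SecretKey newKey() {
        try {
            KeyGenerator generator = KeyGenerator.getInstance("AES");
            generator.init(128);
            return generator.generateKey();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Encrypts with a new random IV on every call and prepends the IV.
    static String encrypt(SecretKey key, String plaintext) {
        try {
            byte[] iv = new byte[12];
            RANDOM.nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] ciphertext = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
            byte[] out = new byte[iv.length + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
            return Base64.getEncoder().encodeToString(out);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SecretKey key = newKey();
        // Same key and plaintext, yet the two strings differ.
        System.out.println(encrypt(key, "user-token"));
        System.out.println(encrypt(key, "user-token"));
    }
}
```

Because duplicate refresh requests must receive the same access token string, serializing through such an encryption step on every response would defeat that guarantee.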

This change also handles the serialization/deserialization BWC logic:

- In mixed clusters we keep creating tokens in the old format and
consume only old format tokens
- In upgraded clusters, we start creating tokens in the new format but
still remain able to consume old format tokens (that could have been
created during the rolling upgrade and are still valid)
- When reading/writing TokensInvalidationResult objects, we take into
consideration that pre 7.1.0 these contained an integer field that carried
the attempt count

Resolves #36872
2019-03-05 14:55:59 +02:00
Albert Zaharovits e7dbfda5d3 Fix security index auto-create and state recovery race (#39582)
Previously, the security index could be wrongfully recreated. This might
happen if the index was interpreted as missing, as in the case of a fresh
install, but the index existed and the state did not yet recover.

This fix will return HTTP SERVICE_UNAVAILABLE (503) for requests that
try to write to the security index before the state has been recovered.
2019-03-05 12:47:59 +02:00
Tanguy Leroux 0c6b7cfb77 Revert "Support concurrent refresh of refresh tokens (#39559)"
This reverts commit e2599214e0.
2019-03-01 17:59:45 +01:00
Ioannis Kakavas e2599214e0
Support concurrent refresh of refresh tokens (#39559)
This is a backport of #38382

This change adds support for the concurrent refresh of access
tokens as described in #36872
In short it allows subsequent client requests to refresh the same token that
come within a predefined window of 60 seconds to be handled as duplicates
of the original one and thus receive the same response with the same newly
issued access token and refresh token.
In order to support that, two new fields are added in the token document. One
contains the instant (in epoch milliseconds) when a given refresh token is refreshed
and one that contains a pointer to the token document that stores the new
refresh token and access token that was created by the original refresh.
A side effect of this change, which was also an intended enhancement
for the token service, is that we needed to stop encrypting the string
representation of the UserToken while serializing. (This was necessary because we
correctly used a new IV every time we encrypted a token during serialization, so
subsequent serializations of the exact same UserToken would produce
different access token strings.)

This change also handles the serialization/deserialization BWC logic:

- In mixed clusters we keep creating tokens in the old format and
consume only old format tokens
- In upgraded clusters, we start creating tokens in the new format but
still remain able to consume old format tokens (that could have been
created during the rolling upgrade and are still valid)

Resolves #36872

Co-authored-by: Jay Modi <jaymode@users.noreply.github.com>
2019-03-01 16:00:07 +02:00
Albert Zaharovits 8a19d981db Integ test snapshot and restore for native realm (#39123)
This commit adds a simple integ test that exercises the flow:
* snapshot .security
* delete .security
* restore .security

It then checks that the Native Realm works as expected.

Relates #34454
2019-02-28 14:41:47 +02:00
Tim Brooks f24dae302d
Make security tests transport agnostic (#39411)
Currently there are two security tests that specifically target the
netty security transport. This PR moves the client authentication tests
into `AbstractSimpleSecurityTransportTestCase` so that the nio transport
will also be tested.

Additionally the work to build transport configurations is moved out of
the netty transport and tested independently.
2019-02-26 18:55:19 -07:00
Tim Vernum 30687cbe7f
Switch internal security index to ".security-7" (#39422)
This changes the name of the internal security index to ".security-7",
but supports indices that were upgraded from earlier versions and use
the ".security-6" name.

In all cases, both ".security-6" and ".security-7" are considered to
be restricted index names regardless of which name is actually in use
on the cluster.

Backport of: #39337
2019-02-27 12:49:44 +11:00
Ioannis Kakavas 7f999c43b3
[BACKPORT-7.x] Fix TokenBackwardsCompatibility tests (#39294)
This change is a backport of  #39252

- Fixes TokenBackwardsCompatibilityIT: Existing tests seemed to make
  the assumption that in the oneThirdUpgraded stage the master node
  will be on the old version and in the twoThirdsUpgraded stage, the
  master node will be one of the upgraded ones. However, there is no
  guarantee that the master node in any of the states will or will
  not be one of the upgraded ones.
  This class now tests:
  - That we can generate and consume tokens before we start the
  rolling upgrade.
  - That we can consume tokens generated in the old cluster during
  all the stages of the rolling upgrade.
  - That while on a mixed cluster, when/if the master node is
  upgraded, we can generate, consume and refresh a token
  - That after the rolling upgrade, we can consume a token
  generated in an old cluster and can invalidate it so that it
  can't be used any more.
- Ensures that during the rolling upgrade, the upgraded nodes have
the same configuration as the old nodes. Specifically that the
file realm we use is explicitly named `file1`. This is needed
because while attempting to refresh a token in a mixed cluster
we might create a token hitting an old node and attempt to refresh
it hitting a new node. If the file realm name is not the same, the
refresh will be seen as being made by a "different" client, and
will, thus, fail.
- Renames the Authentication variable we check while refreshing a
token to be clientAuth in order to make the code more readable.

Some of the above were possibly causing the flakiness of #37379
2019-02-26 10:42:36 +02:00
Tim Brooks 44df76251f
Rebuild remote connections on profile changes (#39146)
Currently remote compression and ping schedule settings are dynamic.
However, we do not listen for changes. This commit adds listeners for
changes to those two settings. Additionally, when those settings change
we now close existing connections and open new ones with the settings
applied.

Fixes #37201.
2019-02-21 14:00:39 -07:00
Jay Modi af451459a5
Fix failures in SessionFactoryLoadBalancingTests (#39154)
This change aims to fix failures in the session factory load balancing
tests that mock failure scenarios. For these tests, we randomly shut
down ldap servers and bind a client socket to the port they were
listening on. Unfortunately, we would occasionally encounter failures
in these tests where a socket was already in use and/or the port
we expected to connect to was wrong and in fact was to one of the ldap
instances that should have been shut down.

The failures are caused by the behavior of certain operating systems
when it comes to binding ports and wildcard addresses. It is possible
for a separate application to be bound to a wildcard address and still
allow our code to bind to that port on a specific address. So when we
close the server socket and open the client socket, we are still able
to establish a connection since the other application is already
listening on that port on a wildcard address. Another variant is that
the os will allow a wildcard bind of a server socket when there is
already an application listening on that port for a specific address.

In order to do our best to prevent failures in these scenarios, this
change does the following:

1. Binds a client socket to all addresses in an awaitBusy
2. Adds assumption that we could bind all valid addresses
3. In the case that we still establish a connection to an address that
   we should not be able to, try to bind and expect a failure of not
   being connected

Closes #32190
2019-02-20 11:38:26 -07:00
Albert Zaharovits af8ef1bb98 Do not create the missing index when invoking getRole (#39039)
In most of the places we avoid creating the `.security` index (or updating the mapping)
for read/search operations. This is more of a nit for the case of the getRole call,
that fixes a possible mapping update during a get role, and removes a dead if branch
about creating the `.security` index.
2019-02-20 17:33:10 +02:00
Jason Tedor 09ea3ccd16
Remove retention leases when unfollowing (#39088)
This commit attempts to remove the retention leases on the leader shards
when unfollowing an index. This is best effort, since the leader might
not be available.
2019-02-20 07:06:49 -05:00
Ioannis Kakavas 210f34f8e9 Remove BCryptTests (#39098)
This test was added to verify that we fixed a specific behavior in
Bcrypt and hasn't been running for almost 4 years now.
2019-02-19 18:12:18 +02:00
Ioannis Kakavas 59e9a0f4f4 Disable specific locales for tests in fips mode (#38938)
* Disable specific locales for tests in fips mode

The Bouncy Castle FIPS provider that we use for running our tests
in fips mode has an issue with locale sensitive handling of Dates as
described in https://github.com/bcgit/bc-java/issues/405

This causes certificate validation to fail if any given test that
includes some form of certificate validation happens to run in one
of the locales. This manifested earlier in #33081 which was
handled insufficiently in #33299

This change ensures that the problematic 3 locales

* th-TH
* ja-JP-u-ca-japanese-x-lvariant-JP
* th-TH-u-nu-thai-x-lvariant-TH

will not be used when running our tests in a FIPS 140 JVM. It also
reverts #33299
2019-02-19 08:46:08 +02:00
Hendrik Muhs 4f662bd289
Add data frame feature (#38934) (#39029)
The data frame plugin allows users to create feature indexes by pivoting a source index. In a
nutshell this can be understood as reindex supporting aggregations, or as similar to so-called
entity-centric indexing.

Full history is provided in: feature/data-frame-transforms
2019-02-18 11:07:29 +01:00
Jason Tedor a5ce1e0bec
Integrate retention leases to recovery from remote (#38829)
This commit is the first step in integrating shard history retention
leases with CCR. In this commit we integrate shard history retention
leases with recovery from remote. Before we start transferring files, we
take out a retention lease on the primary. Then during the file copy
phase, we repeatedly renew the retention lease. Finally, when recovery
from remote is complete, we disable the background renewing of the
retention lease.
2019-02-16 15:37:52 -05:00
Yogesh Gaikwad 36c274867e
Fix intermittent failure in ApiKeyIntegTests (#38627) (#38935)
A few tests failed intermittently, most of the
time because invalidated or expired keys that had been
deleted were still reported in search results.
This commit removes the test and enhances
other tests that cover different scenarios.

When ExpiredApiKeysRemover was triggered, the tests
did not await its termination, so the results of a
search operation would sometimes be wrong.

DELETE_INTERVAL setting has been further reduced to
100ms so we can trigger ExpiredApiKeysRemover faster.

Closes #38408
2019-02-15 23:01:35 +11:00
Jay Modi 5d06226507
Fix writing of SecurityFeatureSetUsage to pre-7.1 (#38922)
This change makes the writing of new usage data conditional based on
the version that is being written to. A test has also been added to
ensure serialization works as expected to an older version.
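
A hedged sketch of version-conditional writing, using `DataOutputStream` rather than Elasticsearch's stream classes. The numeric version ids, the field names, and `VersionedUsageWriterSketch` are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class VersionedUsageWriterSketch {
    // Hypothetical numeric ids for the versions of interest.
    static final int V_7_0_0 = 70000;
    static final int V_7_1_0 = 70100;

    static byte[] write(int targetVersion, boolean securityEnabled, boolean tokenServiceEnabled) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeBoolean(securityEnabled);
            // Older nodes do not know the new field, so skip it for them.
            if (targetVersion >= V_7_1_0) {
                out.writeBoolean(tokenServiceEnabled);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(write(V_7_0_0, true, true).length);
        System.out.println(write(V_7_1_0, true, true).length);
    }
}
```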

Relates #38687, #38917
2019-02-14 16:28:52 -07:00
Jay Modi e59b7b696a
Use consistent view of realms for authentication (#38815)
This change updates the authentication service to use a consistent view
of the realms based on the license state at the start of
authentication. Without this, the license can change during
authentication of a request and it will result in a failure if the
realm that extracted the token is no longer in the realm list. This
manifests in some tests as an authentication failure that should never
really happen; one example would be the test framework's transport
client user should always have a successful authentication but in the
LicensingTests this can fail and will show up as a
NoNodeAvailableException.

Additionally, the licensing tests have been updated to ensure that
there is consistency when changing the license. The license is changed
by modifying the internal xpack license state on each node, which has
no protection against being changed by some pending cluster action. The
methods to disable and enable now ensure we have a green cluster and
that the cluster is consistent before returning.

Closes #30301
2019-02-14 07:49:14 -07:00
Yogesh Gaikwad 335cf91bb9
Add enabled status for token and api key service (#38687) (#38882)
Right now there is no way to determine whether the
token service or API key service is enabled or not.
This commit adds support for the enabled status of
token and API key service to the security feature set
usage API `/_xpack/usage`.

Closes #38535
2019-02-14 23:08:52 +11:00
Ioannis Kakavas 8c624e5a20 Enhance parsing of StatusCode in SAML Responses (#38628)
* Enhance parsing of StatusCode in SAML Responses

<Status> elements in a failed response might contain two nested
<StatusCode> elements. We currently only parse the first one in
order to create a message that we attach to the Exception we return
and log. However this is generic and only gives out information
about whether the SAML IDP believes it's an error with the
request or if it couldn't handle the request for other reasons. The
encapsulated StatusCode has a more interesting error message that
potentially gives out the actual error as in Invalid nameid policy,
authentication failure etc.

This change ensures that we print that information also, and removes
Message and Details fields from the message when these are not
part of the Status element (which quite often is the case)
2019-02-11 11:55:26 +02:00
Tim Vernum 273edea712
Mute testExpiredApiKeysDeletedAfter1Week (#38683)
Tracked: #38408
2019-02-11 16:50:10 +11:00
Christoph Büscher 5180b36547 Mute failing ApiKeyIntegTests (#38614)
2019-02-08 13:04:17 +01:00
David Turner 5a3c452480
Align docs etc with new discovery setting names (#38492)
In #38333 and #38350 we moved away from the `discovery.zen` settings namespace
since these settings have an effect even though Zen Discovery itself is being
phased out. This change aligns the documentation and the names of related
classes and methods with the newly-introduced naming conventions.
2019-02-06 11:34:38 +00:00
Yogesh Gaikwad 6ff4a8cfd5
Add API key settings documentation (#38490)
This commit adds missing
API key service settings documentation.
2019-02-06 20:58:22 +11:00
Yogesh Gaikwad 5261673349
Change the min supported version to 6.7.0 for API keys (#38481)
This commit changes the minimum supported version for API keys
to 6.7.0, because the API key change has been backported
to version 6.7.0 (#38399).
2019-02-06 16:03:49 +11:00
Jay Modi e73c9c90ee
Add an authentication cache for API keys (#38469)
This commit adds an authentication cache for API keys that caches the
hash of an API key with a faster hash. This will enable better
performance when API keys are used for bulk or heavy searching.
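
One way to picture such a cache is sketched below. This is an assumption-heavy illustration rather than the actual implementation: SHA-256 stands in for the faster hash, the slow verification is passed in as a hypothetical `Predicate`, and eviction is not handled.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

final class ApiKeyAuthCacheSketch {
    // API key id -> fast hash of a secret that already passed slow verification.
    private final Map<String, byte[]> fastHashes = new ConcurrentHashMap<>();

    boolean authenticate(String keyId, String secret, Predicate<String> slowVerifier) {
        byte[] candidate = sha256(secret);
        byte[] cached = fastHashes.get(keyId);
        if (cached != null && MessageDigest.isEqual(cached, candidate)) {
            return true; // cache hit: the slow hash is skipped
        }
        if (slowVerifier.test(secret)) {
            fastHashes.put(keyId, candidate);
            return true;
        }
        return false;
    }

    private static byte[] sha256(String value) {
        try {
            return MessageDigest.getInstance("SHA-256").digest(value.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ApiKeyAuthCacheSketch cache = new ApiKeyAuthCacheSketch();
        Predicate<String> slow = secret -> secret.equals("s3cret");
        System.out.println(cache.authenticate("key-1", "s3cret", slow));
        System.out.println(cache.authenticate("key-1", "s3cret", slow));
    }
}
```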
2019-02-05 18:16:26 -07:00
Yogesh Gaikwad 57600c5acb
Enable logs for intermittent test failure (#38426)
I have not been able to reproduce the failing
test scenario locally for #38408 and there are other similar
tests which are running fine in the same test class.
I am re-enabling the test with additional logs so
that we can debug further on what's happening.
I will keep the issue open for now and look out for the builds
to see if there are any related failures.
2019-02-06 11:21:54 +11:00
Przemyslaw Gomulka afcdbd2bc0
XPack: core/ccr/Security-cli migration to java-time (#38415)
Part of the work to migrate away from Joda time.
Refactors usages of Joda in x-pack plugins to java-time.
Refers to #27330
2019-02-05 22:09:32 +01:00
Jay Modi 7ca5495d86
Allow custom authorization with an authorization engine (#38358)
For some users, the built in authorization mechanism does not fit their
needs and no feature that we offer would allow them to control the
authorization process to meet their needs. In order to support this,
a concept of an AuthorizationEngine is being introduced, which can be
provided using the security extension mechanism.

An AuthorizationEngine is responsible for making the authorization
decisions about a request. The engine is responsible for knowing how to
authorize and can be backed by whatever mechanism a user wants. The
default mechanism is one backed by roles to provide the authorization
decisions. The AuthorizationEngine will be called by the
AuthorizationService, which handles more of the internal workings that
apply in general to authorization within Elasticsearch.

In order to support external authorization services that would back an
authorization engine, the entire authorization process has become
asynchronous, which also includes all calls to the AuthorizationEngine.

The use of roles also leaked out of the AuthorizationService in our
existing code that is not specifically related to roles so this also
needed to be addressed. RequestInterceptor instances sometimes used a
role to ensure a user was not attempting to escalate their privileges.
Addressing this leakage of roles meant that the RequestInterceptor
execution needed to move within the AuthorizationService and that
AuthorizationEngines needed to support detection of whether a user has
more privileges on a name than another. The second area where roles
leaked to the user is in the handling of a few privilege APIs that
could be used to retrieve the user's privileges or ask if a user has
privileges to perform an action. To remove the leakage of roles from
these actions, the AuthorizationService and AuthorizationEngine gained
methods that enabled an AuthorizationEngine to return the response for
these APIs.

Ultimately this feature is the work included in:
#37785
#37495
#37328
#36245
#38137
#38219

Closes #32435
2019-02-05 13:39:29 -07:00
Boaz Leskes 033ba725af
Remove support for internal versioning for concurrency control (#38254)
Elasticsearch has long [supported](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-index_.html#index-versioning) compare and set (a.k.a optimistic concurrency control) operations using internal document versioning. Sadly that approach is flawed and can sometimes do the wrong thing. Here's the relevant excerpt from the resiliency status page:

> When a primary has been partitioned away from the cluster there is a short period of time until it detects this. During that time it will continue indexing writes locally, thereby updating document versions. When it tries to replicate the operation, however, it will discover that it is partitioned away. It won’t acknowledge the write and will wait until the partition is resolved to negotiate with the master on how to proceed. The master will decide to either fail any replicas which failed to index the operations on the primary or tell the primary that it has to step down because a new primary has been chosen in the meantime. Since the old primary has already written documents, clients may already have read from the old primary before it shuts itself down. The version numbers of these reads may not be unique if the new primary has already accepted writes for the same document 

We recently [introduced](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/optimistic-concurrency-control.html) a new sequence number based approach that doesn't suffer from this dirty reads problem. 

This commit removes support for internal versioning as a concurrency control mechanism in favor of the sequence number approach.
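
As a rough illustration of the sequence number approach, the sketch below models a single document guarded by a `(seq_no, primary_term)` pair. It is an in-memory toy, not Elasticsearch code: `ToyShard`, `Doc` and `VersionConflict` are hypothetical names, and the keyword arguments merely borrow the `if_seq_no` / `if_primary_term` naming. The only idea taken from the description above is that a conditional write succeeds when the pair the caller read is still current.

```python
from dataclasses import dataclass


class VersionConflict(Exception):
    """Raised when the caller's expected (seq_no, primary_term) is stale."""


@dataclass
class Doc:
    source: dict
    seq_no: int
    primary_term: int


class ToyShard:
    """Hypothetical in-memory stand-in for a primary holding one document."""

    def __init__(self, primary_term: int = 1):
        self.primary_term = primary_term
        self.next_seq_no = 0
        self.doc = None

    def index(self, source, if_seq_no=None, if_primary_term=None):
        # Compare step: reject the write if the document changed since it was read.
        if if_seq_no is not None:
            current = self.doc
            if (current is None
                    or current.seq_no != if_seq_no
                    or current.primary_term != if_primary_term):
                raise VersionConflict("document was modified concurrently")
        # Set step: every accepted write is assigned a fresh sequence number.
        self.doc = Doc(dict(source), self.next_seq_no, self.primary_term)
        self.next_seq_no += 1
        return self.doc
```

A second writer that still holds the pair from an earlier read gets a `VersionConflict` instead of silently overwriting the newer document.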

Relates to #1078
2019-02-05 20:53:35 +01:00
David Turner f2dd5dd6eb
Remove DiscoveryPlugin#getDiscoveryTypes (#38414)
With this change we no longer support pluggable discovery implementations. No
known implementations of `DiscoveryPlugin` actually override this method, so in
practice this should have no effect on the wider world. However, we were using
this rather extensively in tests to provide the `test-zen` discovery type. We
no longer need a separate discovery type for tests as we no longer need to
customise its behaviour.

Relates #38410
2019-02-05 17:42:24 +00:00
Jason Tedor 638ba4a59a
Mute failing API key integration test (#38409)
This commit mutes the test
testGetAndInvalidateApiKeysWithExpiredAndInvalidatedApiKey as it failed
during a PR build.
2019-02-05 06:08:03 -05:00
Albert Zaharovits 8e2eb39cef
SecuritySettingsSource license.self_generated: trial (#38233)
Authn is enabled only if `license_type` is not `basic`, but `basic` is
what the `LicenseService` generates implicitly. This commit explicitly sets
license type to `trial`, which allows for authn, in the `SecuritySettingsSource`
which is the settings configuration parameter for `InternalTestCluster`s.

The real problem, which had created test failures like #31028 and #32685, is
that the check `licenseState.isAuthAllowed()` can change sporadically. If it were
to return `true` or `false` during the whole test there would be no problem.
The problem manifests when it turns from `true` to `false` right before `Realms.asList()`.
There are other license checks before this one (request filter, token service, etc)
that would not cause a problem if they would suddenly see the check as `false`.
But switching to `false` before `Realms.asList()` makes it appear that no installed
realms could have handled the authn token which is an authentication error, as can
be seen in the failing tests.

Closes #31028 #32685
2019-02-05 10:49:08 +02:00
David Turner 2d114a02ff
Rename static Zen1 settings (#38333)
Renames the following settings to remove the mention of `zen` in their names:

- `discovery.zen.hosts_provider` -> `discovery.seed_providers`
- `discovery.zen.ping.unicast.concurrent_connects` -> `discovery.seed_resolver.max_concurrent_resolvers`
- `discovery.zen.ping.unicast.hosts.resolve_timeout` -> `discovery.seed_resolver.timeout`
- `discovery.zen.ping.unicast.hosts` -> `discovery.seed_addresses`
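
For anyone carrying the old names in a flat settings map, the renames above could be applied with something like the following sketch. The mapping is copied from the list; `migrate_settings` is a hypothetical helper and not part of this change, and refusing to merge when both the old and the new name are present is an assumption made for the example.

```python
# Old name -> new name, copied from the list above.
RENAMED_SETTINGS = {
    "discovery.zen.hosts_provider": "discovery.seed_providers",
    "discovery.zen.ping.unicast.concurrent_connects":
        "discovery.seed_resolver.max_concurrent_resolvers",
    "discovery.zen.ping.unicast.hosts.resolve_timeout":
        "discovery.seed_resolver.timeout",
    "discovery.zen.ping.unicast.hosts": "discovery.seed_addresses",
}


def migrate_settings(settings: dict) -> dict:
    """Return a copy of flat settings with renamed keys; other keys are untouched."""
    migrated = {}
    for key, value in settings.items():
        new_key = RENAMED_SETTINGS.get(key, key)
        # Assumption: a config that sets both the old and the new name is an error.
        if new_key != key and new_key in settings:
            raise ValueError(f"both {key} and {new_key} are set")
        migrated[new_key] = value
    return migrated
```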
2019-02-05 08:46:52 +00:00
Yogesh Gaikwad fe36861ada
Add support for API keys to access Elasticsearch (#38291)
X-Pack security supports a built-in authentication service,
`token-service`, that allows access tokens to be used to
access Elasticsearch without using Basic authentication.
The tokens are generated by `token-service` based on the
OAuth2 spec. The access token is short-lived (defaults to
20m) and the refresh token has a lifetime of 24 hours,
making them unsuitable for long-lived or recurring tasks where
the system might go offline and thereby fail to refresh the tokens.

This commit introduces a built-in authentication service
`api-key-service` that adds support for long-lived tokens aka API
keys to access Elasticsearch. The `api-key-service` is consulted
after `token-service` in the authentication chain. By default,
if TLS is enabled then `api-key-service` is also enabled.
The service can be disabled using the configuration setting.

The API keys:-
- by default do not have an expiration but expiration can be
  configured where the API keys need to be expired after a
  certain amount of time.
- when generated will keep authentication information of the user that
  generated them.
- can be defined with a role describing the privileges for accessing
  Elasticsearch and will be limited by the role of the user that
  generated them.
- can be invalidated via invalidation API
- information can be retrieved via a get API
- that have been expired or invalidated will be retained for 1 week
  before being deleted. The expired API keys remover task handles this.

Following are the API key management APIs:-
1. Create API Key - `PUT/POST /_security/api_key`
2. Get API key(s) - `GET /_security/api_key`
3. Invalidate API Key(s) - `DELETE /_security/api_key`

The API keys can be used to access Elasticsearch using the `Authorization`
header, where the auth scheme is `ApiKey` and the credentials are the
base64 encoding of the API key ID and the API key joined by a colon.
Example:-
```
curl -H "Authorization: ApiKey YXBpLWtleS1pZDphcGkta2V5" http://localhost:9200/_cluster/health
```
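
The header value can be built programmatically as well. The sketch below follows the encoding described above (ID, colon, key, base64); `api_key_header` is a hypothetical helper name, and the ID and key in the usage note are placeholders.

```python
import base64


def api_key_header(api_key_id: str, api_key: str) -> dict:
    """Build the Authorization header for the ApiKey scheme."""
    # Credentials are "<id>:<key>", base64 encoded.
    credentials = f"{api_key_id}:{api_key}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"ApiKey {token}"}
```

Calling `api_key_header("api-key-id", "api-key")` reproduces the credential used in the curl example.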

Closes #34383
2019-02-05 14:21:57 +11:00
Yogesh Gaikwad 9d3f057894
Limit token expiry to 1 hour maximum (#38244)
Our documentation states that the maximum value for the
token expiration setting is 1 hour, but we do not
enforce it. This commit adds a maximum limit
to the TOKEN_EXPIRATION setting.
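
The kind of bound being enforced can be sketched as follows. This is an illustration only: `validate_token_expiration` is a hypothetical function, the 1 hour maximum comes from the message above, and any lower bound on the setting is left out because it is not stated here.

```python
from datetime import timedelta

MAX_TOKEN_EXPIRATION = timedelta(hours=1)  # maximum stated in the message above


def validate_token_expiration(value: timedelta) -> timedelta:
    """Reject expiration values above the documented maximum."""
    if value > MAX_TOKEN_EXPIRATION:
        raise ValueError(
            f"token expiration {value} exceeds the maximum of {MAX_TOKEN_EXPIRATION}"
        )
    return value
```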
2019-02-05 12:02:36 +11:00
Jason Tedor 625d37a26a
Introduce retention lease background sync (#38262)
This commit introduces a background sync for retention leases. The idea
here is that we do a heavyweight sync when adding a new retention lease,
and then periodically we want to background sync any retention lease
renewals to the replicas. As long as the background sync interval is
significantly lower than the extended lifetime of a retention lease, it
is okay if from time to time a replica misses a sync (it will still have
an older version of the lease that is retaining more data as we assume
that renewals do not decrease the retaining sequence number). There are
two follow-ups that will come after this commit. The first is to address
the fact that we have not adapted the should periodically flush logic to
possibly flush the retention leases. We want to do something like flush
if we have not flushed in the last five minutes and there are renewed
retention leases since the last time that we flushed. An additional
follow-up will remove the syncing of retention leases when a retention
lease expires. Today this sync could be invoked in the background by a
merge operation. Rather, we will move the syncing of retention lease
expiration to be done under the background sync. The background sync
will use the heavyweight sync (write action) if a lease has expired, and
will use the lightweight background sync (replication action) otherwise.
2019-02-04 10:35:29 -05:00
Boaz Leskes e49b593c81
Move TokenService to seqno powered cas (#38311)
Relates #37872 
Relates #10708
2019-02-04 15:25:41 +01:00
Tim Vernum 0164acb0a7
Cleanup construction of interceptors (#38294)
It would be beneficial to apply some of the request interceptors even
when features are disabled. This change reworks the way we build that
list so that the interceptors we always want to use are constructed
outside of the settings check.
2019-02-04 17:27:41 +11:00
Albert Zaharovits 3c1544d259
Fix NPE in Logfile Audit Filter (#38120)
The culprit in #38097 is an `IndicesRequest` that has no indices,
but instead of `request.indices()` returning `null` or `String[0]`
it returned `String[] {null}` . This tripped the audit filter.

I have addressed this in two ways:
1. `request.indices()` returning `String[] {null}` is treated as `null`
    or `String[0]`, i.e. no indices
2. `null` values among the roles and indices lists, which are
    unexpected, will no longer trip up the audit filter; `null` values
    are treated as special values that will not match any policy,
    i.e. their events will always be printed.
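
The two rules can be sketched like this, with Python's `None` standing in for `null`. Both functions are hypothetical and greatly simplified compared to the real filter; they only show the normalization and the "null never matches" behaviour described above.

```python
from typing import List, Optional, Sequence, Set


def normalize_indices(indices: Optional[Sequence[Optional[str]]]) -> List[Optional[str]]:
    """Rule 1: treat None, [] and [None] alike, i.e. the request names no indices."""
    if indices is None or list(indices) == [None]:
        return []
    return list(indices)


def is_ignored(value: Optional[str], ignored_names: Set[str]) -> bool:
    """Rule 2: None never matches a policy, so its event is always printed."""
    return value is not None and value in ignored_names
```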

Closes #38097
2019-02-03 10:34:17 +02:00
Henning Andersen 68ed72b923
Handle scheduler exceptions (#38014)
Scheduler.schedule(...) previously assumed that the caller handles
exceptions by calling get() on the returned ScheduledFuture.
schedule() now returns a ScheduledCancellable that no longer gives
access to the exception. Instead, any exception thrown out of a
scheduled Runnable is logged as a warning.

This is a continuation of #28667, #36137 and also fixes #37708.
2019-01-31 17:51:45 +01:00
Jay Modi 54dbf9469c
Update httpclient for JDK 11 TLS engine (#37994)
The apache commons http client implementations recently released
versions that solve TLS compatibility issues with the new TLS engine
that supports TLSv1.3 with JDK 11. This change updates our code to
use these versions since JDK 11 is a supported JDK and we should
allow the use of TLSv1.3.
2019-01-30 14:24:29 -07:00
David Turner 81c443c9de
Deprecate minimum_master_nodes (#37868)
Today we pass `discovery.zen.minimum_master_nodes` to nodes started up in
tests, but for 7.x nodes this setting is not required as it has no effect.
This commit removes this setting so that nodes are started with more realistic
configurations, and deprecates it.
2019-01-30 20:09:15 +00:00
Albert Zaharovits 53e80e9814
Fix failure in test code ClusterPrivilegeTests
Closes #38030
2019-01-30 16:11:44 +02:00
Tim Vernum 99129d7786
Fix exit code for Security CLI tools (#37956)
The certgen, certutil and saml-metadata tools did not correctly return
their exit code to the calling shell.

These commands now explicitly exit with the code that was returned
from the main(args, terminal) method.
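
The general shape of the fix, as a Python analogue with hypothetical names and a made-up status value: the entry point must pass the value returned by `main` on to the process exit, otherwise the shell sees success regardless of the outcome.

```python
import sys


def main(args) -> int:
    """Hypothetical tool body: returns 0 on success, non-zero on failure."""
    if not args:
        print("usage: tool <command>", file=sys.stderr)
        return 64  # arbitrary non-zero status chosen for this example
    return 0


def run(argv) -> None:
    # Explicitly exit with the code returned from main instead of discarding it.
    sys.exit(main(argv))
```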
2019-01-30 17:51:11 +11:00
Albert Zaharovits 697b2fbe52
Remove implicit index monitor privilege (#37774)
Restricted indices (currently only .security-6 and .security) are special
internal indices that require setting the `allow_restricted_indices` flag
on every index permission that covers them. If this flag is `false`
(default) the permission will not cover these and actions against them
will not be authorized.
However, the monitoring APIs were the only exception to this rule.

This exception is herein forfeited and index monitoring privileges have to be
granted explicitly, using the `allow_restricted_indices` flag on the permission,
as is the case for any other index privilege.
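
A simplified model of the resulting rule, assuming a permission is a dict with `names`, `privileges` and an optional `allow_restricted_indices` flag. `permission_covers` is a hypothetical function; wildcard matching via `fnmatch` and exact privilege-name comparison are stand-ins for the real matching logic.

```python
from fnmatch import fnmatchcase

# Restricted index names as listed in the message above.
RESTRICTED_INDICES = {".security", ".security-6"}


def permission_covers(permission: dict, index: str, privilege: str) -> bool:
    """Return True if the index permission grants the privilege on the index."""
    if privilege not in permission["privileges"]:
        return False
    if not any(fnmatchcase(index, pattern) for pattern in permission["names"]):
        return False
    if index in RESTRICTED_INDICES:
        # No exception for monitoring any more: the flag must be set explicitly.
        return permission.get("allow_restricted_indices", False)
    return True
```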
2019-01-29 21:10:03 +02:00
Albert Zaharovits 66ddd8d2f7
Create snapshot role (#35820)
This commit introduces the `create_snapshot` cluster privilege and
the `snapshot_user` role.
This role is to be used by "cronable" tools that call the snapshot API
periodically without resorting to the `manage` cluster privilege. The
`create_snapshot` cluster privilege is much more limited compared to
the `manage` privilege.

The `snapshot_user` role grants the privileges to view the metadata of
all indices (including restricted ones, i.e. .security). It obviously grants the
create snapshot privilege but the repository has to be created using another
role. In addition, it grants the privileges to (only) GET repositories and
snapshots, but not create and delete them.

The role does not allow creating repositories. This distinction is important
because snapshotting equates to the `read` index privilege if the user has
control of the snapshot destination, but this is not the case in this instance,
because the role does not grant control over repository configuration.
2019-01-27 23:07:32 +02:00
Jason Tedor 5fddb631a2
Introduce retention lease syncing (#37398)
This commit introduces retention lease syncing from the primary to its
replicas when a new retention lease is added. A follow-up commit will
add a background sync of the retention leases as well so that renewed
retention leases are synced to replicas.
2019-01-27 07:49:56 -05:00