This is related to #22116. Certain plugins (discovery-azure-classic,
discovery-ec2, discovery-gce, repository-azure, repository-gcs, and
repository-s3) open socket connections. As SocketPermissions are
transitioned out of core, these plugins will require connect
permission. This pull request wraps operations that require these
permissions in doPrivileged blocks.
This integrates the mocksocket jar with elasticsearch tests. Mocksocket wraps actions requiring SocketPermissions in doPrivilege blocks. This will eventually allow SocketPermissions to be assigned to the mocksocket jar opposed to the entire elasticsearch codebase.
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.
I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.
Since the removal of local discovery of #https://github.com/elastic/elasticsearch/pull/20960 we rely on minimum master nodes to be set in our test cluster. The settings is automatically managed by the cluster (by default) but current management doesn't work with concurrent single node async starting. On the other hand, with `MockZenPing` and the `discovery.initial_state_timeout` set to `0s` node starting and joining is very fast making async starting an unneeded complexity. Test that still need async starting could, in theory, still do so themselves via background threads.
Note that this change also removes the usage of `INITIAL_STATE_TIMEOUT_SETTINGS` as the starting of nodes is done concurrently (but building them is sequential)
Today we eagerly resolve unicast hosts. This means that if DNS changes,
we will never find the host at the new address. Moreover, a single host
failng to resolve causes startup to abort. This commit introduces lazy
resolution of unicast hosts. If a DNS entry changes, there is an
opportunity for the host to be discovered. Note that under the Java
security manager, there is a default positive cache of infinity for
resolved hosts; this means that if a user does want to operate in an
environment where DNS can change, they must adjust
networkaddress.cache.ttl in their security policy. And if a host fails
to resolve, we warn log the hostname but continue pinging other
configured hosts.
When doing DNS resolutions for unicast hostnames, we wait until the DNS
lookups timeout. This appears to be forty-five seconds on modern JVMs,
and it is not configurable. If we do these serially, the cluster can be
blocked during ping for a lengthy period of time. This commit introduces
doing the DNS lookups in parallel, and adds a user-configurable timeout
for these lookups.
Relates #21630
#20960 removed `LocalDiscovery` and we now use `ZenDiscovery` in all our tests. To keep cluster forming fast, we are using a `MockZenPing` implementation which uses static maps to return instant results making master election fast. Currently, we don't set `minimum_master_nodes` causing the occasional split brain when starting multiple nodes concurrently and their pinging is so fast that it misses the fact that one of the node has elected it self master. To solve this, `InternalTestCluster` is modified to behave like a true cluster and manage and set `minimum_master_nodes` correctly with every change to the number of nodes.
Tests that want to manage the settings themselves can opt out using a new `autoMinMasterNodes` parameter to the `ClusterScope` annotation.
Having `min_master_nodes` set means the started node may need to wait for other nodes to be started as well. To combat this, we set `discovery.initial_state_timeout` to `0` and wait for the cluster to form once all node have been started. Also, because a node may wait and ping while other nodes are started, `MockZenPing` is adapted to wait rather than busy-ping.
This changes adds a test discovery (which internally uses the existing
mock zenping by default). Having the mock the test framework selects be a discovery
greatly simplifies discovery setup (no more weird callback to a Node
method).
* Plugins: Convert custom discovery to pull based plugin
This change primarily moves registering custom Discovery implementations
to the pull based DiscoveryPlugin interface. It also keeps the cloud
based discovery plugins re-registering ZenDiscovery under their own name
in order to maintain backwards compatibility. However,
discovery.zen.hosts_provider is changed here to no longer fallback to
discovery.type. Instead, each plugin which previously relied on the
value of discovery.type now sets the hosts_provider to itself if
discovery.type is set to itself, along with a deprecation warning.
At one point in the past when moving out the rest tests from core to
their own subproject, we had multiple test classes which evenly split up
the tests to run. However, we simplified this and went back to a single
test runner to have better reproduceability in tests. This change
removes the remnants of that multiplexing support.
Follow up for #21039.
We can revert the previous change and do that a bit smarter than it was.
Patch tested successfully manually on ec2 with 2 nodes with a configuration like:
```yml
discovery.type: ec2
network.host: ["_local_", "_site_", "_ec2_"]
cloud.aws.region: us-west-2
```
(cherry picked from commit fbbeded)
Backport of #21048 in master branch
This change moves providing UnicastHostsProvider for zen discovery to be
pull based, adding a getter in DiscoveryPlugin. A new setting is added,
discovery.zen.hosts_provider, to separate the discovery type from the
hosts provider for zen when it is selected. Unfortunately existing
plugins added ZenDiscovery with their own name in order to just provide
a hosts provider, so there are already many users setting the hosts
provider through discovery.type. This change also includes backcompat,
falling back to discovery.type when discovery.zen.hosts_provider is not
set.
Here is what is happening without this fix when you try to connect to ec2 APIs:
```
[2016-10-20T12:41:49,925][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
[2016-10-20T12:41:49,926][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey)
[2016-10-20T12:41:49,926][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from com.amazonaws.auth.profile.ProfileCredentialsProvider@1ad14091: access denied ("java.io.FilePermission" "/home/ubuntu/.aws/credentials" "read")
[2016-10-20T12:41:49,927][DEBUG][c.a.i.EC2MetadataClient ] Connecting to EC2 instance metadata service at URL: http://169.254.169.254/latest/meta-data/iam/security-credentials/
[2016-10-20T12:41:49,951][DEBUG][c.a.i.EC2MetadataClient ] Connecting to EC2 instance metadata service at URL: http://169.254.169.254/latest/meta-data/iam/security-credentials/discovery-tests
[2016-10-20T12:41:49,965][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from InstanceProfileCredentialsProvider: Unable to parse Json String.
[2016-10-20T12:41:49,966][INFO ][o.e.d.e.AwsEc2UnicastHostsProvider] [dJfktmE] Exception while retrieving instance list from AWS API: Unable to load AWS credentials from any provider in the chain
[2016-10-20T12:41:49,967][DEBUG][o.e.d.e.AwsEc2UnicastHostsProvider] [dJfktmE] Full exception:
com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain
at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:131) ~[aws-java-sdk-core-1.10.69.jar:?]
at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:11117) ~[aws-java-sdk-ec2-1.10.69.jar:?]
at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5403) ~[aws-java-sdk-ec2-1.10.69.jar:?]
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.fetchDynamicNodes(AwsEc2UnicastHostsProvider.java:116) [discovery-ec2-5.0.0.jar:5.0.0]
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:234) [discovery-ec2-5.0.0.jar:5.0.0]
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:219) [discovery-ec2-5.0.0.jar:5.0.0]
at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:54) [elasticsearch-5.0.0.jar:5.0.0]
at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:102) [discovery-ec2-5.0.0.jar:5.0.0]
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:358) [elasticsearch-5.0.0.jar:5.0.0]
at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$1.doRun(UnicastZenPing.java:272) [elasticsearch-5.0.0.jar:5.0.0]
at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:504) [elasticsearch-5.0.0.jar:5.0.0]
at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.0.0.jar:5.0.0]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
```
For whatever reason, it can not parse what is coming back from http://169.254.169.254/latest/meta-data/iam/security-credentials/discovery-tests.
But, if you wrap the code within an `AccessController.doPrivileged()` call, then it works perfectly.
Closes#21039.
(cherry picked from commit abfdc70)
* Move all zen discovery classes into o.e.discovery.zen
This collapses sub packages of zen into zen. These all had just a couple
classes each, and there is really no reason to have the subpackages.
* fix checkstyle
UpdateHelper, MetaDataIndexUpgradeService, and some recovery
stuff.
Move ClusterSettings to nullable ctor parameter of TransportService
so it isn't forgotten.
This change proposes the removal of all non-tcp transport implementations. The
mock transport can be used by default to run tests instead of local transport that has
roughly the same performance compared to TCP or at least not noticeably slower.
This is a master only change, deprecation notice in 5.x will be committed as a
separate change.
Because of security permissions that we do not grant to the AWS SDK (for
use in discovery-ec2 and repository-s3 plugins), certain calls in the
AWS SDK will lead to security exceptions that are logged at the warning
level. These warnings are noise and we should suppress them. This commit
adds plugin log configurations for discovery-ec2 and repository-s3 to
ship with default Log4j 2 configurations that suppress these log
warnings.
Relates #20313
This commit modifies the call sites that allocate a parameterized
message to use a supplier so that allocations are avoided unless the log
level is fine enough to emit the corresponding log message.
This makes it obvious that these tests are for running the client yaml
suites. Now that there are other ways of running tests using the REST
client against a running cluster we can't go on calling the shared
client yaml tests "REST tests". They are rest tests, but they aren't
**the** rest tests.
This adds a header that looks like `Location: /test/test/1` to the
response for the index/create/update API. The requirement for the header
comes from https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.htmlhttps://tools.ietf.org/html/rfc7231#section-7.1.2 claims that relative
URIs are OK. So we use an absolute path which should resolve to the
appropriate location.
Closes#19079
This makes large changes to our rest test infrastructure, allowing us
to write junit tests that test a running cluster via the rest client.
It does this by splitting ESRestTestCase into two classes:
* ESRestTestCase is the superclass of all tests that use the rest client
to interact with a running cluster.
* ESClientYamlSuiteTestCase is the superclass of all tests that use the
rest client to run the yaml tests. These tests are shared across all
official clients, thus the `ClientYamlSuite` part of the name.
Follow up for #18662
We add some tests to check that settings are correctly applied.
Tests revealed that some checks were missing.
But we ignore `testAWSCredentialsWithSystemProviders` test for now.
Some tests still start http implicitly or miss configuring the transport clients correctly.
This commit fixes all remaining tests and adds a depdenceny to `transport-netty` from
`qa/smoke-test-http` and `modules/reindex` since they need an http server running on the nodes.
This also moves all required permissions for netty into it's module and out of core.
This change adds a createComponents() method to Plugin implementations
which they can use to return already constructed componenents/services.
Eventually this should be just services ("components" don't really do
anything), but for now it allows any object so that preconstructed
instances by plugins can still be bound to guice. Over time we should
add basic services as arguments to this method, but for now I have left
it empty so as to not presume what is a necessary service.
The DiscoveryNodeService exists to register CustomNodeAttributes which
plugins can add. This is not necessary, since plugins can already add
additional attributes, and use the node attributes prefix.
This change removes the DiscoveryNodeService, and converts the only
consumer, the ec2 discovery plugin, to add the ec2 availability zone
in additionalSettings().
The only reason for LifecycleComponent taking a generic type was so that
it could return that type on its start and stop methods. However, this
chaining has no practical necessity. Instead, start and stop can be
void, and a whole bunch of confusing generics disappear.
We pretended to be able to ackt like a different version node for so long it's
time to be honest and remove this ability. It's just confusing and where needed
and tested we should build dedicated extension points.
This change removes some unnecessary dependencies from ClusterService
and cleans up ClusterName creation. ClusterService is now not created
by guice anymore.