Commit Graph

1741 Commits

Author SHA1 Message Date
Nik Everett bb06d8ec4f Allow plugins to build pre-configured token filters (#24223)
This changes the way we register pre-configured token filters so that
plugins can declare them and starts to move all of the pre-configured
token filters out of core. It doesn't finish the job because doing
so would make the change unreviewably large. So this PR includes
a shim that keeps the "old" way of registering pre-configured token
filters around.

The Lowercase token filter is special because there is a "special"
interaction between it and the lowercase tokenizer. I'm not sure
exactly what to do about it so for now I'm leaving it alone with
the intent of figuring out what to do with it in a followup.

This also renames these pre-configured token filters from
"pre-built" to "pre-configured" because that seemed like a more
descriptive name.

This is a part of #23658
2017-05-09 14:50:49 -04:00
Yannick Welsch c8712e9531 Limit AllocationService dependency injection hack (#24479)
Changes the scope of the AllocationService dependency injection hack so that it is at least contained to the AllocationService and does not leak into the Discovery world.
2017-05-05 08:39:18 +02:00
James Baiera f5edd5049a Fixing permission errors for `KERBEROS` security mode for HDFS Repository (#23439)
Added missing permissions required for authenticating with Kerberos to HDFS. Also implemented 
code to support authentication in the form of using a Kerberos keytab file. In order to support 
HDFS authentication, users must install a Kerberos keytab file on each node and transfer it to the 
configuration directory. When a user specifies a Kerberos principal in the repository settings the 
plugin automatically enables security for Hadoop and begins the login process. There will be a 
separate PR and commit for the testing infrastructure to support these changes.
2017-05-04 10:51:31 -04:00
James Baiera d928ae210d Add Vagrant based testing fixture (#24249) 2017-05-04 10:17:55 -04:00
Koen De Groote 0fef5acd01 Cleanup collections construction
This commit cleans up some cases where a list or map was being
constructed, and then an existing collection was copied into the new
collection. The clean is to instead use an appropriate constructor to
directly copy the existing collection in during collection
construction. The advantage of this is that the new collection is sized
appropriately.

Relates #24409
2017-04-30 21:26:51 -04:00
Yannick Welsch 35f78d098a Separate publishing from applying cluster states (#24236)
Separates cluster state publishing from applying cluster states:

- ClusterService is split into two classes MasterService and ClusterApplierService. MasterService has the responsibility to calculate cluster state updates for actions that want to change the cluster state (create index, update shard routing table, etc.). ClusterApplierService has the responsibility to apply cluster states that have been successfully published and invokes the cluster state appliers and listeners.
- ClusterApplierService keeps track of the last applied state, but MasterService is stateless and uses the last cluster state that is provided by the discovery module to calculate the next prospective state. The ClusterService class is still kept around, which now just delegates actions to ClusterApplierService and MasterService.
- The discovery implementation is now responsible for managing the last cluster state that is used by the consensus layer and the master service. It also exposes the initial cluster state which is used by the ClusterApplierService. The discovery implementation is also responsible for adding the right cluster-level blocks to the initial state.
- NoneDiscovery has been renamed to TribeDiscovery as it is exclusively used by TribeService. It adds the tribe blocks to the initial state.
- ZenDiscovery is synchronized on state changes to the last cluster state that is used by the consensus layer and the master service, and does not submit cluster state update tasks anymore to make changes to the disco state (except when becoming master).

Control flow for cluster state updates is now as follows:

- State updates are sent to MasterService
- MasterService gets the latest committed cluster state from the discovery implementation and calculates the next cluster state to publish
- MasterService submits the new prospective cluster state to the discovery implementation for publishing
- Discovery implementation publishes cluster states to all nodes and, once the state is committed, asks the ClusterApplierService to apply the newly committed state.
- ClusterApplierService applies state to local node.
2017-04-28 09:34:31 +02:00
Ryan Ernst 4a5c3c5a4a Test: Write node ports file before starting tribe service (#24351)
The tribe service can take a while to initialize, depending on how many cluster it needs to connect to. This change moves writing the ports file used by tests to before the tribe service is started.
2017-04-27 09:59:54 +02:00
Ryan Ernst 51b33f1fd5 S3 Repository: Deprecate remaining `repositories.s3.*` settings (#24144)
Most of these settings should always be pulled from the repository
settings. A couple were leftover that should be moved to client
settings. The path style access setting should be removed altogether.
This commit adds deprecations for all of these existing settings, as
well as adding new client specific settings for max retries and
throttling.

relates #24143
2017-04-25 23:43:20 -07:00
Nik Everett caf376c8af Start building analysis-common module (#23614)
Start moving built in analysis components into the new analysis-common
module. The goal of this project is:
1. Remove core's dependency on lucene-analyzers-common.jar which should
shrink the dependencies for transport client and high level rest client.
2. Prove that analysis plugins can do all the "built in" things by moving all
"built in" behavior to a plugin.
3. Force tests not to depend on any oddball analyzer behavior. If tests
need anything more than the standard analyzer they can use the mock
analyzer provided by Lucene's test infrastructure.
2017-04-19 18:51:34 -04:00
Ryan Ernst 151a65ed17 Ec2 Discovery: Cleanup deprecated settings (#24150)
This commit removes the deprecated cloud.aws.* settings. It also removes
backcompat for specifying `discovery.type: ec2`, and unused aws signer
code which was removed in a previous PR.
2017-04-19 12:06:10 -07:00
Ryan Ernst 212f24aa27 Tests: Clean up rest test file handling (#21392)
This change simplifies how the rest test runner finds test files and
removes all leniency.  Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.

closes #20240
2017-04-18 15:07:08 -07:00
Adrien Grand 4632661bc7 Upgrade to a Lucene 7 snapshot (#24089)
We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now function of the number of docs that have a value rather than the total number of docs in the index.

Some notes about the change:
 - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors
 - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed
 - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge
2017-04-18 15:17:21 +02:00
Ryan Ernst a8083f3d76 S3 Repository: Remove unused files (#24145)
These were leftover from the removal of the signer type setting in
2017-04-18 01:19:25 -07:00
Ryan Ernst a8017ff020 Tests: Move cluster dependencies from runner to cluster (#24142)
After splitting integ tests into cluster configuration and the test
runner task, we still have dependencies of the test runner added as deps
of the cluster. This commit adds dependencies directly to the cluster,
so that the runner can have other dependencies independent of what is
needed for the cluster.
2017-04-17 16:02:46 -07:00
Ryan Ernst 1629c9fd5c S3 Repository: Cleanup deprecated settings (#24097)
This commit removes all deprecated settings which start with
`cloud.aws`, `repositories.s3` and repository level client settings.
2017-04-17 14:18:49 -07:00
Ryan Ernst 1207103b6d S3 Repository: Eagerly load static settings (#23910)
The S3 repostiory has many levels of settings it looks at to create a
repository, and these settings were read at repository creation time.
This meant secure settings like access and secret keys had to be
available after node construction. This change makes setting loading for
every except repository level settings eager, so that secure settings
can be stashed, and the keystore can once again be closed after
bootstrapping the node is complete.
2017-04-11 15:42:56 -07:00
Colin Goodheart-Smithe 0114f0061c Removes version 2.x constants from Version (#24011)
* Removes version 2.x constants from Version

Closes #21887

* Addresses review comments
2017-04-11 08:31:22 +01:00
Ryan Ernst dd3c1137a4 Repository S3: Simplify client method (#24034)
This commit removes passing the repository metadata object through to
s3 client creation. It is not needed, and in fact in tests was confusing
because you could create the metadata but have it contain different
settings than were passed in as repository settings.
2017-04-10 14:43:34 -07:00
Ryan Ernst 83ba677e7f Discovery EC2: Remove region setting (#23991)
We have both endpoint and region settings. Region was removed from s3 to
simplify configuration. This is the ec2 equivalent.

closes #22758
2017-04-07 22:06:40 -07:00
Ryan Ernst 05e2ea1aef AWS Plugins: Remove signer type setting (#23984)
This commit removes support for s3 signer type in 6.0, and adds a note
to the migration guide.

closes #22599
2017-04-07 16:46:17 -07:00
Ryan Ernst 73b8aad9a3 Settings: Disallow secure setting to exist in normal settings (#23976)
This commit removes the "legacy" feature of secure settings, which setup
a parallel setting that was a fallback in the insecure
elasticsearch.yml. This was previously used to allow the new secure
setting name to be that of the old setting name, but is now not in use
due to other refactorings. It is much cleaner to just have all secure
settings use new setting names. If in the future we want to reuse the
previous setting name, once support for the insecure settings have been
removed, we can then rename the secure setting.  This also adds a test
for the behavior.
2017-04-07 14:18:06 -07:00
Ryan Ernst 6e0b445abb Add registration of new discovery settings
This was forgotten as part of #23961
2017-04-07 14:07:59 -07:00
Ryan Ernst d4c0ef0028 Settings: Migrate ec2 discovery sensitive settings to elasticsearch keystore (#23961)
This change adds secure settings for access/secret keys and proxy
username/password to ec2 discovery.  It adds the new settings with the
prefix `discovery.ec2`, copies other relevant ec2 client settings to the
same prefix, and deprecates all other settings (`cloud.aws.*` and
`cloud.aws.ec2.*`).  Note that this is simpler than the client configs
in repository-s3 because discovery is only initialized once for the
entire node, so there is no reason to complicate the configuration with
the ability to have multiple sets of client settings.

relates #22475
2017-04-07 13:28:15 -07:00
Ryan Ernst 776006bac5 Collapse repository gcs classes into a single java package (#23975)
This is a single reorge of the classes to simplify making them mostly
package protected.
2017-04-07 11:27:26 -07:00
Ali Beyad ac87d40bd5 Removes unused S3BlobStore#shouldRetry() method 2017-04-06 20:58:12 -04:00
Ali Beyad 4f121744bd Removes the retry mechanism from the S3 blob store (#23952)
Currently, both the Amazon S3 client provides a retry mechanism, and the
S3 blob store also attempts retries for failed read/write requests.
Both retry mechanisms are controlled by the
`repositories.s3.max_retries` setting.  However, the S3 blob store retry
mechanism is unnecessary because the Amazon S3 client provided by the
Amazon SDK already handles retries (with exponential backoff) based on
the provided max retry configuration setting (defaults to 3) as long as
the request is retryable.  Hence, this commit removes the unneeded retry
logic in the S3 blob store and the S3OutputStream.

Closes #22845
2017-04-06 19:58:53 -04:00
Ryan Ernst 203f8433c2 Collapse packages in ec2 discovery plugin (#23909)
This commit collapses all the classes inside ec2 discovery to a single
package name.
2017-04-05 23:51:49 -07:00
Ryan Ernst d31d2caf09 Collapse packages in repository-s3 (#23907)
This commit puts all the classes in the repository-s3 plugin into a
single package.  In addition to simplifying the plugin, it will make it
easier to test as things that should be package private will not be
difficult to use inside tests alone.
2017-04-04 15:15:25 -07:00
Jason Tedor 3136ed1490 Rename random ASCII helper methods
This commit renames the random ASCII helper methods in ESTestCase. This
is because this method ultimately uses the random ASCII methods from
randomized runner, but these methods actually only produce random
strings generated from [a-zA-Z].

Relates #23886
2017-04-04 11:04:18 -04:00
Boaz Leskes ad6eea92d6 GceDiscoverTests - remove intitial_state_timeout 2017-04-03 16:50:40 +02:00
David Pilato 17be03e85e Add Backoff policy to azure repository
With this commit, Azure repositories are now using an Exponential Backoff policy before failing the backup.
It uses Azure SDK default values for this policy:

* `30s` delta backoff base with
   * `3s` min
   * `90s` max
* `3` retries max

Users can define the number of retries they wish by setting `cloud.azure.storage.xxx.max_retries` where `xxx` is the azure named account.

Closes #22728.
2017-04-03 10:52:44 +02:00
David Pilato f5d41dfc9d Merge branch 'pr/remove-repositories-azure-settings' 2017-03-31 12:33:12 +02:00
David Pilato e634d89825 Merge branch 'pr/23448-update-azure-storage' 2017-03-30 18:40:16 +02:00
Jim Ferenczi 0e95c90e9f Upgrade to Lucene 6.5.0 (#23750) 2017-03-27 15:57:54 +02:00
AdityaJNair 63757efe9c Remove DocumentMapper#parse(String index, String type, String id, BytesReference source) (#23706)
Removed `parse(String index, String type, String id, BytesReference source)` in DocumentMapper.java and replaced all of its use in Test files with `parse(SourceToParse source)`.

`parse(String index, String type, String id, BytesReference source)` was only used in test files and never in the main code so it was removed. All of the test files that used it was then modified to use `parse(SourceToParse source)` method that existing in DocumentMapper.java
2017-03-23 11:01:09 -04:00
Jason Tedor 2517cb3062 Fix line-length violations in gce/util/Access
This commit addresses all 100-column line-length violations in
gce/util/Access.java and removes this file from the suppressions list.
2017-03-22 21:34:15 -04:00
Ryan Ernst f8453aca57 Packaging: Remove classpath ordering hack (#23596)
After the removal of the joda time hack we used to have, we can cleanup
the codebase handling in security, jarhell and plugins to be more picky
about uniqueness. This was originally in #18959 which was never merged.

closes #18959
2017-03-21 12:12:16 -07:00
Boaz Leskes c0cafa786b UnicastZenPing shouldn't ping the address of the local node (#23567)
Pinging the local node address doesn't really add to discovering other nodes. It just pollutes the logs with unneeded information.
2017-03-14 07:02:42 -07:00
David Pilato 9bd3d7cca8 Update to Azure Storage 5.0.0
Closes #23448.
2017-03-08 21:56:19 -08:00
Ali Beyad 3dff0d0de2 Azure blob store's readBlob() method first checks if the blob exists (#23483)
Previously, the Azure blob store would depend on a 404 StorageException
coming back from Azure if trying to open an input stream to a
non-existent blob. This works for Azure repositories which access a
primary location path. For those configured to access a secondary
location path, the Azure SDK keeps trying for a long while before
returning a 404 StorageException, causing potential delays in the
snapshot APIs. This commit makes an initial check if the blob exists in
Azure and returns immediately with a NoSuchFileException, instead of
trying to open the input stream to the blob.

Closes #23480
2017-03-03 17:01:51 -05:00
Luca Cavanna cc65a94fd4 [TEST] improve yaml test sections parsing (#23407)
Throw error when skip or do sections are malformed, such as they don't start with the proper token (START_OBJECT). That signals bad indentation, which would be ignored otherwise. Thanks (or due to) our pull parsing code, we were still able to properly parse the sections, yet other runners weren't able to.

Closes #21980

* [TEST] fix indentation in matrix_stats yaml tests

* [TEST] fix indentation in painless yaml test

* [TEST] fix indentation in analysis yaml tests

* [TEST] fix indentation in generated docs yaml tests

* [TEST] fix indentation in multi_cluster_search yaml tests
2017-03-02 12:43:20 +01:00
Jason Tedor b9622251fe Correct version on repository-hdfs Guava dependency
This commit sets the version on the repository-hdfs Guava dependency to
version 11.0.2. This change is made to align the version here with the
version that is defined in the POM for Hadoop 2.7.1, the version of
Hadoop that the repository-hdfs plugin is based on. See HADOOP-10101 and
HADOOP-11319 for the ridiculous history of trying to upgrade Guava past
this version in the Hadoop project.

Relates #23420
2017-03-01 16:29:06 -05:00
Jason Tedor ee2f6ccf32 Add convenience method for asserting deprecations
This commit adds a convenience method for simultaneously asserting
settings deprecations and other warnings and fixes some tests where
setting deprecations and general warnings were present.
2017-02-28 18:24:39 -05:00
Jim Ferenczi 5c84640126 Upgrade to lucene-6.5.0-snapshot-d00c5ca (#23385)
Lucene upgrade
2017-02-27 18:39:04 +01:00
Jason Tedor 577e6a5e14 Correct warning header to be compliant
The warning header used by Elasticsearch for delivering deprecation
warnings has a specific format (RFC 7234, section 5.5). The format
specifies that the warning header should be of the form

    warn-code warn-agent warn-text [warn-date]

Here, the warn-code is a three-digit code which communicates various
meanings. The warn-agent is a string used to identify the source of the
warning (either a host:port combination, or some other identifier). The
warn-text is quoted string which conveys the semantic meaning of the
warning. The warn-date is an optional quoted date that can be in a few
different formats.

This commit corrects the warning header within Elasticsearch to follow
this specification. We use the warn-code 299 which means a
"miscellaneous persistent warning." For the warn-agent, we use the
version of Elasticsearch that produced the warning. The warn-text is
unchanged from what we deliver today, but is wrapped in quotes as
specified (this is important as a problem that exists today is that
multiple warnings can not be split by comma to obtain the individual
warnings as the warnings might themselves contain commas). For the
warn-date, we use the RFC 1123 format.

Relates #23275
2017-02-27 12:14:21 -05:00
javanna 2f6a6090b8 [TEST] don't check exact size in mapper-size yaml test
Rather test that the size is present and greather than zero. The actual size depends on the content-type, which is randomized.
2017-02-27 12:27:03 +01:00
Martijn van Groningen 211d50f7b8 [INGEST] Lazy load the geoip databases.
Load the geoip database the first time a pipeline gets created that has a geoip processor.
This saves memory (measured ~150MB for the city db) in cases when the plugin is installed, but not used.
2017-02-24 08:52:27 +01:00
Tim Brooks 0e802961f1 Test that buildCredentials returns correct clazz (#23334)
This is fallout from #23297. That commit wrapped
`InstanceProfileCredentialsProvider` to ensure that the `getCredentials`
and `refresh` methods had privileged access. However, it looks like
there was a test ensuring that `buildCredentials` returned the correct
clazz type. This commit adjusts that test to check that the correct
wrapper is returned.
2017-02-23 17:33:15 -06:00
Ryan Ernst 0b4834f7da Test: Fix hdfs test fixture setup on windows
The test setup for hdfs is a little complicated for windows, needing to
check if the hdfs fixture can be run at all. This was unfortunately not
updated when the integ tests were reorganized into separate runner and
cluster setups.
2017-02-23 11:20:41 -08:00
Christoph Büscher 12b143e871 Tests: fix AwsS3ServiceImplTests 2017-02-23 19:06:35 +01:00