Commit Graph

1639 Commits

Author SHA1 Message Date
Adrien Grand 7ec51d628d Make the default S3 buffer size depend on the available memory. (#21299)
Currently the default S3 buffer size is 100MB, which can be a lot for small
heaps. This pull request updates the default to be 100MB for heaps that are
greater than 2GB and 5% of the heap size otherwise.
2016-11-03 16:07:52 +01:00
Adrien Grand aa6cd93e0f Require arguments for QueryShardContext creation. (#21196)
The `IndexService#newQueryShardContext()` method creates a QueryShardContext on
shard `0`, with a `null` reader and that uses `System.currentTimeMillis()` to
resolve `now`. This may hide bugs, since the shard id is sometimes used for
query parsing (it is used to salt random score generation in `function_score`),
passing a `null` reader disables query rewriting and for some use-cases, it is
simply not ok to rely on the current timestamp (eg. percolation). So this pull
request removes this method and instead requires that all call sites provide
these parameters explicitly.
2016-11-02 09:48:49 +01:00
Christoph Büscher 1f5adaa824 Docs: Adding Ukrainian analyzer 2016-10-31 18:20:39 +01:00
Christoph Büscher a9b0b97703 Expose Lucenes Ukrainian analyzer
Since Lucene 6.2. the UkrainianMorfologikAnalyzer is available through the
lucene-analyzers-morfologik jar. This change exposes it to be used as an
elasticsearch plugin.
2016-10-31 18:20:39 +01:00
Yannick Welsch a23ded6a94 [TEST] Fix NullPointerException in AzureStorageServiceMock
Makes the code safe against concurrent modifications of the underlying hashmap.
2016-10-31 16:21:07 +01:00
Adrien Grand b3cc54cf0d Upgrade to lucene-6.3.0-snapshot-ed102d6 (#21150)
Lucene 6.3 is expected to be released in the next weeks so it'd be good to give
it some integration testing. I had to upgrade randomized-testing too so that
both Lucene and Elasticsearch are on the same version.
2016-10-28 14:47:15 +02:00
Jun Ohtani a66c76eb44 Merge pull request #20704 from johtani/remove_request_params_in_analyze_api
Removing request parameters in _analyze API
2016-10-27 17:43:18 +09:00
Jack Conradson 512a77a633 Refactor ScriptType to be a top-level class. 2016-10-26 10:21:22 -07:00
David Pilato 50bc31a918 Fix s3 repository when used with IAM profiles
Applying same patch we did in #21048 but for `repository-s3` plugin.

Backport of #21058 in master branch
2016-10-21 16:45:11 +02:00
David Pilato e5d9f393f1 Fix ec2 discovery when used with IAM profiles.
Follow up for #21039.

We can revert the previous change and do that a bit smarter than it was.

Patch tested successfully manually on ec2 with 2 nodes with a configuration like:

```yml
discovery.type: ec2
network.host: ["_local_", "_site_", "_ec2_"]
cloud.aws.region: us-west-2
```

(cherry picked from commit fbbeded)

Backport of #21048 in master branch
2016-10-20 20:19:47 +02:00
Ryan Ernst 60353a245a Plugins: Make UnicastHostsProvider extension pull based (#21036)
This change moves providing UnicastHostsProvider for zen discovery to be
pull based, adding a getter in DiscoveryPlugin. A new setting is added,
discovery.zen.hosts_provider, to separate the discovery type from the
hosts provider for zen when it is selected. Unfortunately existing
plugins added ZenDiscovery with their own name in order to just provide
a hosts provider, so there are already many users setting the hosts
provider through discovery.type. This change also includes backcompat,
falling back to discovery.type when discovery.zen.hosts_provider is not
set.
2016-10-20 09:13:59 -07:00
David Pilato efffb946e2 Fix ec2 discovery when used with IAM profiles.
Here is what is happening without this fix when you try to connect to ec2 APIs:

```
[2016-10-20T12:41:49,925][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
[2016-10-20T12:41:49,926][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey)
[2016-10-20T12:41:49,926][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from com.amazonaws.auth.profile.ProfileCredentialsProvider@1ad14091: access denied ("java.io.FilePermission" "/home/ubuntu/.aws/credentials" "read")
[2016-10-20T12:41:49,927][DEBUG][c.a.i.EC2MetadataClient  ] Connecting to EC2 instance metadata service at URL: http://169.254.169.254/latest/meta-data/iam/security-credentials/
[2016-10-20T12:41:49,951][DEBUG][c.a.i.EC2MetadataClient  ] Connecting to EC2 instance metadata service at URL: http://169.254.169.254/latest/meta-data/iam/security-credentials/discovery-tests
[2016-10-20T12:41:49,965][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from InstanceProfileCredentialsProvider: Unable to parse Json String.
[2016-10-20T12:41:49,966][INFO ][o.e.d.e.AwsEc2UnicastHostsProvider] [dJfktmE] Exception while retrieving instance list from AWS API: Unable to load AWS credentials from any provider in the chain
[2016-10-20T12:41:49,967][DEBUG][o.e.d.e.AwsEc2UnicastHostsProvider] [dJfktmE] Full exception:
com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain
	at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:131) ~[aws-java-sdk-core-1.10.69.jar:?]
	at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:11117) ~[aws-java-sdk-ec2-1.10.69.jar:?]
	at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5403) ~[aws-java-sdk-ec2-1.10.69.jar:?]
	at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.fetchDynamicNodes(AwsEc2UnicastHostsProvider.java:116) [discovery-ec2-5.0.0.jar:5.0.0]
	at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:234) [discovery-ec2-5.0.0.jar:5.0.0]
	at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:219) [discovery-ec2-5.0.0.jar:5.0.0]
	at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:54) [elasticsearch-5.0.0.jar:5.0.0]
	at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:102) [discovery-ec2-5.0.0.jar:5.0.0]
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:358) [elasticsearch-5.0.0.jar:5.0.0]
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$1.doRun(UnicastZenPing.java:272) [elasticsearch-5.0.0.jar:5.0.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:504) [elasticsearch-5.0.0.jar:5.0.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.0.0.jar:5.0.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
```

For whatever reason, it can not parse what is coming back from http://169.254.169.254/latest/meta-data/iam/security-credentials/discovery-tests.

But, if you wrap the code within an `AccessController.doPrivileged()` call, then it works perfectly.

Closes #21039.

(cherry picked from commit abfdc70)
2016-10-20 17:19:22 +02:00
Ryan Ernst 53cff0f00f Move all zen discovery classes into o.e.discovery.zen (#21032)
* Move all zen discovery classes into o.e.discovery.zen

This collapses sub packages of zen into zen. These all had just a couple
classes each, and there is really no reason to have the subpackages.

* fix checkstyle
2016-10-20 00:44:48 -07:00
Boaz Leskes c3987156ab Remove local discovery in favor of a simpler `MockZenPings` (#20960)
`LocalDiscovery` is a discovery implementation that uses static in memory maps to keep track of current live nodes. This is used extensively in our tests in order to speed up cluster formation (i.e., shortcut the 3 second ping period used by `ZenDiscovery` by default). This is sad as that mean that most of the test run using a different discovery semantics than what is used in production. Instead of replacing the entire discovery logic, we can use a similar approach to only shortcut the pinging components.
2016-10-18 21:12:15 +02:00
Jason Tedor f23ae90d92 Fix logging configuration for AwsSdkMetrics logger
This commit fixes an issue with the configuration for the AwsSdkMetrics
logger; the issue is that the logging configuration had used underscores
instead of periods for the settings key (the perils of lenient settings
parsing).

Relates #20313
2016-10-14 23:44:39 -04:00
Tanguy Leroux 44ac5d057a Remove empty javadoc (#20871)
This commit removes as many as empty javadocs comments my regexp has found
2016-10-12 10:27:09 +02:00
Ali Beyad bbf6e6d0bd Fixes leading forward slash in S3 repository base_path (#20861)
In 2.x, the S3 repository accepted a `/` (forward slash) to start
the repositories.s3.base_path, and it used a different string splitting
method that removed the forward slash from the base path, so there
were no issues.

In 5.x, we removed this custom string splitting method in favor of
the JDK's string splitting method, which preserved the leading `/`.
The AWS SDK does not like the leading `/` in the key path after the
bucket name, and so it could not find any objects in the S3 repository.

This commit fixes the issue by removing the leading `/` if it exists
and adding a deprecation notice that leading `/` will not be supported
in the future in S3 repository's base_path.
2016-10-11 11:18:52 -04:00
Alexander Reelsen 3c2e51d831 Deps: Update ingest-attachment to latest libraries (#20710)
Also added a test to check for a with a regular PDF,
instead of only an encrypted one with expected exception.
2016-10-10 12:55:05 +02:00
Nik Everett cf4038b668 DeGuice some of IndicesModule
UpdateHelper, MetaDataIndexUpgradeService, and some recovery
stuff.

Move ClusterSettings to nullable ctor parameter of TransportService
so it isn't forgotten.
2016-10-07 11:14:38 -04:00
Simon Willnauer 7452028e50 Simplify TransportAddress (#20798)
since TransportAddress is now final we can simplify it's interface a bit
and remove methods that are only used in tests or are plain delegates.
2016-10-07 15:56:54 +02:00
Simon Willnauer 194a6b1df0 Remove LocalTransport in favor of MockTcpTransport (#20695)
This change proposes the removal of all non-tcp transport implementations. The
mock transport can be used by default to run tests instead of local transport that has
roughly the same performance compared to TCP or at least not noticeably slower.

This is a master only change, deprecation notice in 5.x will be committed as a
separate change.
2016-10-07 11:27:47 +02:00
Jun Ohtani 370f0b885e Removing request parameters in _analyze API
Remove request params in _analyze API without index param
Change rest-api-test using JSON
Change docs using JSON

Closes #20246
2016-10-07 16:23:24 +09:00
David Pilato 591a8d4ec6 Merge branch 'fix/20669-master-azure-log' 2016-10-06 16:00:43 +02:00
Martijn van Groningen 6a5630f901 ingest: Upgrade geoip2 dependency
Closes #20563
2016-10-05 09:31:55 +02:00
Jason Tedor 51d53791fe Remove lenient URL parameter parsing
Today when parsing a request, Elasticsearch silently ignores incorrect
(including parameters with typos) or unused parameters. This is bad as
it leads to requests having unintended behavior (e.g., if a user hits
the _analyze API and misspell the "tokenizer" then Elasticsearch will
just use the standard analyzer, completely against intentions).

This commit removes lenient URL parameter parsing. The strategy is
simple: when a request is handled and a parameter is touched, we mark it
as such. Before the request is actually executed, we check to ensure
that all parameters have been consumed. If there are remaining
parameters yet to be consumed, we fail the request with a list of the
unconsumed parameters. An exception has to be made for parameters that
format the response (as opposed to controlling the request); for this
case, handlers are able to provide a list of parameters that should be
excluded from tripping the unconsumed parameters check because those
parameters will be used in formatting the response.

Additionally, some inconsistencies between the parameters in the code
and in the docs are corrected.

Relates #20722
2016-10-04 12:45:29 -04:00
Tanguy Leroux 857e861d32 [Docs] Log snapshot shard failures in AzureSnapshotRestoreServiceIntegTests
This commit adds logs when a snapshot has failures for some snapshoted shards.
2016-10-03 15:04:37 +02:00
Jason Tedor 25fd9e26c4 Merge branch 'master' into feature/seq_no
* master: (1199 commits)
  [DOCS] Remove non-valid link to mapping migration document
  Revert "Default `include_in_all` for numeric-like types to false"
  test: add a test with ipv6 address
  docs: clearify that both ip4 and ip6 addresses are supported
  Include complex settings in settings requests
  Add production warning for pre-release builds
  Clean up confusing error message on unhandled endpoint
  [TEST] Increase logging level in testDelayShards()
  change health from string to enum (#20661)
  Provide error message when plugin id is missing
  Document that sliced scroll works for reindex
  Make reindex-from-remote ignore unknown fields
  Remove NoopGatewayAllocator in favor of a more realistic mock (#20637)
  Remove Marvel character reference from guide
  Fix documentation for setting Java I/O temp dir
  Update client benchmarks to log4j2
  Changes the API of GatewayAllocator#applyStartedShards and (#20642)
  Removes FailedRerouteAllocation and StartedRerouteAllocation
  IndexRoutingTable.initializeEmpty shouldn't override supplied primary RecoverySource (#20638)
  Smoke tester: Adjust to latest changes (#20611)
  ...
2016-09-29 00:22:31 +02:00
Martijn van Groningen c99890eda5 test: add a test with ipv6 address 2016-09-28 10:04:20 +02:00
David Pilato 14af343d8d Fix logger when you can not create an azure storage client
We were swallowing the original exception when creating a client with bad credentials.
So even in `TRACE` log level, nothing useful were coming out of it.
With this commit, it now prints:

```
[2016-09-27 15:54:13,118][ERROR][cloud.azure.storage      ] [node_s0] can not create azure storage client: Storage Key is not a valid base64 encoded string.
```

Closes #20633.

Backport of #20669 for master branch (6.0)
2016-09-27 16:28:38 +02:00
Simon Willnauer fe1803c957 Remove AnalysisService and reduce it to a simple name to analyzer mapping (#20627)
Today we hold on to all possible tokenizers, tokenfilters etc. when we create
an index service on a node. This was mainly done to allow the `_analyze` API to
directly access all these primitive. We fixed this in #19827 and can now get rid of
the AnalysisService entirely and replace it with a simple map like class. This
ensures we don't create a gazillion long living objects that are entirely useless since
they are never used in most of the indices. Also those objects might consume a considerable
amount of memory since they might load stopwords or synonyms etc.

Closes #19828
2016-09-23 08:53:50 +02:00
Ali Beyad 5031824291 File-based discovery plugin integration tests (#20492)
Adds an integration test for the file-based discovery plugin
to test the plugin operates correctly and uses the hosts
configured in `unicast_hosts.txt` with a real cluster

Closes #20459
2016-09-21 15:48:18 -04:00
Tanguy Leroux 7645abaad9 Remove duplicate methods in ByteSizeValue (#20560)
This commit removes `ByteSizeValue`'s methods that are duplicated (ex: `mbFrac()` and `getMbFrac()`) in order to only keep the `getN` form.
    
It also renames `mb()` -> `getMb()`, `kb()` -> `getKB()` in order to be more coherent with the `ByteSizeUnit` method names.
2016-09-20 14:07:23 +02:00
Ryan Ernst 85b8f29415 Build: Remove old maven deploy support (#20403)
* Build: Remove old maven deploy support

This change removes the old maven deploy that we have in parallel to
maven-publish, and makes maven-publish fully work with publishing to
maven local. Using `gradle publishToMavenLocal` should be used to
publish to .m2.

Note that there is an unfortunate hack that means for
zip artifacts we must first create/publish a dummy pom file, and then
follow that with the real pom file. It would be nice to have the pom
file contains packaging=zip, but maven central then requires sources and
javadocs. But our zips are really just attached artifacts, so we already
set the packaging type to pom for our zip files. This change just works
around a limitation of the underlying maven publishing library which
silently skips attached artifacts when the packaging type is set to pom.

relates #20164
closes #20375

* Remove unnecessary extra spacing
2016-09-19 15:10:41 -07:00
David Pilato dfd1eebdd0 Remove mapper attachments plugin
We now have in 5.0.0 `ingest-attachment` plugin.
We can remove `mapper-attachments` plugin for 6.0.

Closes #18837.
2016-09-19 09:01:16 +02:00
Simon Willnauer f5daa165f1 Remove ability to plug-in TransportService (#20505)
TransportService is such a central part of the core server, replacing
it's implementation is risky and can cause serious issues. This change removes the ability to
plug in TransportService but allows registering a TransportInterceptor that enables
plugins to intercept requests on both the sender and the receiver ends. This is a commonly used
and overwritten functionality but encapsulates the custom code in a contained manner.
2016-09-16 09:47:53 +02:00
Tal Levy 4704efaef4 [ingest-geoip] do not insert null-valued fields in geoip response
update geoip to not include null-valued results from database

Originally, the plugin would still insert all the requested fields, but
assign null to each one. This fixes that by not writing the fields at
all. Makes for a better experience when the null fields conflict with
the typical geo_point field mapping.
2016-09-13 18:12:02 -07:00
Ali Beyad 4431720c3d File-based discovery plugin (#20394)
This commit introduces a new plugin for file-based unicast hosts
discovery. This allows specifying the unicast hosts participating
in discovery through a `unicast_hosts.txt` file located in the
`config/discovery-file` directory. The plugin will use the hosts 
specified in this file as the set of hosts to ping during discovery.

The format of the `unicast_hosts.txt` file is to have one host/port
entry per line. The hosts file is read and parsed every time
discovery makes ping requests, thus a new version of the file that
is published to the config directory will automatically be picked
up.

Closes #20323
2016-09-13 20:52:39 -04:00
Jim Ferenczi 1764ec56b3 Fixed naming inconsistency for fields/stored_fields in the APIs (#20166)
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following list of endpoint has been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain

The documentation and the rest API spec has been updated to cope with the changes for the following APIs:
* delete_by_query
* get
* mget
* explain

The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering):
* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.

Some APIs still have the `fields` parameter for various reasons:
* cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155
2016-09-13 20:54:41 +02:00
Jason Tedor 981e4f5bc5 Configure AWS SDK logging configuration
Because of security permissions that we do not grant to the AWS SDK (for
use in discovery-ec2 and repository-s3 plugins), certain calls in the
AWS SDK will lead to security exceptions that are logged at the warning
level. These warnings are noise and we should suppress them. This commit
adds plugin log configurations for discovery-ec2 and repository-s3 to
ship with default Log4j 2 configurations that suppress these log
warnings.

Relates #20313
2016-09-03 06:41:07 -04:00
Jack Conradson 222a4fa765 Reduce the number of threads and scripts being used in multi-threaded
tests to prevent OOM from deprecation logging.
2016-09-02 11:56:44 -07:00
Jack Conradson 71d8ee5eac Merge branch 'master' into deprecate 2016-09-01 08:51:29 -07:00
Jack Conradson 3b3baa6e6c Made deprecation of Groovy, Javascript, and Python more explicit. 2016-08-31 15:56:31 -07:00
Jason Tedor 0853fc806f Add missing cast to logging message supplier
This commit adds a missing cast to logging message supplier on a single
invocation receiving a parameterized message parameter.
2016-08-30 18:26:45 -04:00
Jason Tedor abf8a1a3f0 Avoid allocating log parameterized messages
This commit modifies the call sites that allocate a parameterized
message to use a supplier so that allocations are avoided unless the log
level is fine enough to emit the corresponding log message.
2016-08-30 18:17:09 -04:00
Jason Tedor 7da0cdec42 Introduce Log4j 2
This commit introduces Log4j 2 to the stack.
2016-08-30 13:31:24 -04:00
Jack Conradson 7930233527 Deprecate Groovy, Python, and Javascript scripts. 2016-08-30 09:06:18 -07:00
Jun Ohtani 450f47d5b5 Validate blank field name
add validation and validate only 5.0+
Add tests before 5.0

Closes #19251
2016-08-26 20:10:33 +09:00
Adrien Grand 3ed0da5a58 GET operations should not extract fields from `_source`. #20158
This makes GET operations more consistent with `_search` operations which expect
`(stored_)fields` to work on stored fields and source filtering to work on the
`_source` field. This is now possible thanks to the fact that GET operations
do not read from the translog anymore (#20102) and also allows to get rid of
`FieldMapper#isGenerated`.

The `_termvectors` API (and thus more_like_this too) was relying on the fact
that GET operations would extract fields from either stored fields or the source
so the logic to do this that used to exist in `ShardGetService` has been moved
to `TermVectorsService`. It would be nice that term vectors do not rely on this,
but this does not seem to be a low hanging fruit.
2016-08-26 10:35:23 +02:00
Sarwar Bhuiyan b0ceecc3eb Refactored to use Settings object 2016-08-25 17:27:22 -04:00
Chris Earle 1cf694b63e Use StringBuilder in favor of StringBuffer
This removes all instances of StringBuffer that are removeable.

Uncontended synchronization in Java is pretty cheap, but it's unnecessary.
2016-08-25 16:20:03 -04:00
Mike McCandless 0ccfe69789 Upgrade to Lucene 6.2.0 2016-08-24 17:26:28 -04:00
Nik Everett 1452ab4b9f Squash the rest of o.e.rest.action
Squashes all the subpackages of `org.elasticsearch.rest.action` down to
the following:
* `o.e.rest.action.admin` - Administrative actions
* `o.e.rest.action.cat` - Actions that make tables for `grep`ing
* `o.e.rest.action.document` - Actions that act on documents
* `o.e.rest.action.ingest` - Actions that act on ingest pipelines
* `o.e.rest.action.search` - Actions that search

I'm tempted to merge `search` into `document` but the `document`
package feels fairly complete as is and `Suggest` isn't actually always
about documents either....

I'm also tempted to merge `ingest` into `admin.cluster` because the
latter contains the actions for dealing with stored scripts.

I've moved the `o.e.rest.action.support` into `o.e.rest.action`.

I've also added `package-info.java`s to all packges in `o.e.rest`. I
figure if the package is too small to deserve a `package-info.java` file
then it is too small to deserve to be a package....

Also fixes checkstyle in all moved classes.
2016-08-15 21:06:32 -04:00
Nik Everett 9f8f2ea54b Remove ESIntegTestCase#pluginList
It was a useful method in 1.7 when javac's type inference wasn't as
good, but now we can just replace it with `Arrays.asList`.
2016-08-11 15:44:02 -04:00
Nik Everett e07e5d66fa Make reindex and lang-javascript compatible
Fixes two issues:
1. lang-javascript doesn't support `executable` with a `null` `vars`
parameters. The parameter is quite nullable.
2. reindex didn't support script engines who's `unwrap` method wasn't
a noop. This didn't come up for lang-groovy or lang-painless because
both of those `unwrap`s were noops. lang-javascript copys all maps that
it `unwrap`s.

This adds fairly low level unit tests for these fixes but dosen't add
an integration test that makes sure that reindex and lang-javascript
play well together. That'd make backporting this difficult and would
add a fairly significant amount of time to the build for a fairly rare
interaction. Hopefully the unit tests will be enough.
2016-08-11 09:54:03 -04:00
David Pilato 42f851cf49 Merge branch 'master' into fix/19924-attachment 2016-08-10 19:05:22 +02:00
David Pilato 905684fe73 Adds content-length as number
If you run Elasticsearch with the ingest-attachment plugin:

```sh
gradle plugins:ingest-attachment:run
```

And then you use it on a document:

```js
 PUT _ingest/pipeline/attachment
 {
   "description" : "Extract attachment information",
   "processors" : [
     {
       "attachment" : {
         "field" : "data"
       }
     }
   ]
 }
 PUT my_index/my_type/my_id?pipeline=attachment
 {
   "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
 }
 GET my_index/my_type/my_id
```

 You were getting this back:

```js
 # PUT _ingest/pipeline/attachment
 {
   "acknowledged": true
 }

 # PUT my_index/my_type/my_id?pipeline=attachment
 {
   "_index": "my_index",
   "_type": "my_type",
   "_id": "my_id",
   "_version": 2,
   "result": "updated",
   "_shards": {
     "total": 2,
     "successful": 1,
     "failed": 0
   },
   "created": false
 }

 # GET my_index/my_type/my_id
 {
   "_index": "my_index",
   "_type": "my_type",
   "_id": "my_id",
   "_version": 2,
   "found": true,
   "_source": {
     "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=",
     "attachment": {
       "content_type": "application/rtf",
       "language": "ro",
       "content": "Lorem ipsum dolor sit amet",
       "content_length": "28"
     }
   }
 }
```

With this commit you are now getting:

```
 # GET my_index/my_type/my_id
 {
   "_index": "my_index",
   "_type": "my_type",
   "_id": "my_id",
   "_version": 2,
   "found": true,
   "_source": {
     "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=",
     "attachment": {
       "content_type": "application/rtf",
       "language": "ro",
       "content": "Lorem ipsum dolor sit amet",
       "content_length": 28
     }
   }
 }
```

Closes #19924
2016-08-10 18:31:16 +02:00
Adrien Grand 0d6ac57acf Collapse o.e.index.mapper packages. #19921
I also reduced the visibility of a couple classes and renamed/consolidated some
test classes for consistency, eg. removing the `Simple` prefix or using the
`<Type>FieldMapperTests` convention for testing field mappers.
2016-08-10 17:51:11 +02:00
Lee Hinman 5849c488b5 Merge remote-tracking branch 'dakrone/compliation-breaker' 2016-08-09 11:57:26 -06:00
Lee Hinman 2be52eff09 Circuit break the number of inline scripts compiled per minute
When compiling many dynamically changing scripts, parameterized
scripts (<https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-using.html#prefer-params>)
should be preferred. This enforces a limit to the number of scripts that
can be compiled within a minute. A new dynamic setting is added -
`script.max_compilations_per_minute`, which defaults to 15.

If more dynamic scripts are sent, a user will get the following
exception:

```json
{
  "error" : {
    "root_cause" : [
      {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "i",
        "node" : "a5V1eXcZRYiIk8lecjZ4Jw",
        "reason" : {
          "type" : "general_script_exception",
          "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
          "caused_by" : {
            "type" : "circuit_breaking_exception",
            "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
            "bytes_wanted" : 0,
            "bytes_limit" : 0
          }
        }
      }
    ],
    "caused_by" : {
      "type" : "general_script_exception",
      "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
      "caused_by" : {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    }
  },
  "status" : 500
}
```

This also fixes a bug in `ScriptService` where requests being executed
concurrently on a single node could cause a script to be compiled
multiple times (many in the case of a powerful node with many shards)
due to no synchronization between checking the cache and compiling the
script. There is now synchronization so that a script being compiled
will only be compiled once regardless of the number of concurrent
searches on a node.

Relates to #19396
2016-08-09 10:26:27 -06:00
Ali Beyad f59ca9083b Snapshot repository cleans up empty index folders (#19751)
This commit cleans up indices in a snapshot repository when all
snapshots containing the index are all deleted. Previously, empty
indices folders would lay around after all snapshots containing
them were deleted.
2016-08-05 09:39:02 -04:00
David Pilato 6b9a084086 Merge branch 'pr/19557-extract-aws-key' 2016-08-04 17:48:44 +02:00
Ali Beyad c4ae23f5d8 Enables implementations of the BlobContainer interface to (#19749)
conform with the requirements of the writeBlob method by
throwing a FileAlreadyExistsException if attempting to write
to a blob that already exists. This change means implementations
of BlobContainer should never overwrite blobs - to overwrite a
blob, it must first be deleted and then can be written again.

Closes #15579
2016-08-02 09:48:21 -04:00
Ali Beyad 456ea56527 Cleans up the BlobContainer interface by removing the (#19727)
writeBlob method takes a BytesReference in favor of just
the writeBlob method that takes an InputStream.

Closes #18528
2016-08-02 09:21:43 -04:00
Ali Beyad 9f88a8194a Merge pull request #19706 from elastic/enhancement/snapshot-blob-handling
More resilient blob handling in snapshot repositories
2016-08-01 12:03:53 -04:00
Ali Beyad 401edeb0d8 AzureBlobContainer's deleteBlob method now throws a NoSuchFileException
instead of a vanilla IOException when the blob doesn't exist, in order
to conform to the BlobContainer's interface contract.
2016-08-01 10:50:02 -04:00
Nik Everett 303c9faca5 Squash o.e.rest.action.admin.cluster
In an effort to reduce the number of tiny packages we have in the
code base this moves all the files that were in subdirectories of
`org.elasticsearch.rest.action.admin.cluster` into
`org.elasticsearch.rest.action.admin.cluster`.

Also fixes line length in these packages.
2016-07-29 20:31:24 -04:00
David Pilato 6b68d1e09b Fix typo in comment 2016-07-29 13:49:11 +02:00
David Pilato f8e0557be5 Extract AWS Key from KeyChain instead of using potential null value
While I was working on #18703, I discovered a bad behavior when people don't provide AWS key/secret as part as their `elasticsearch.yml` but rely on SysProps or env. variables...

In [`InternalAwsS3Service#getClient(...)`](d4366f8493/plugins/repository-s3/src/main/java/org/elasticsearch/cloud/aws/InternalAwsS3Service.java (L76-L141)), we have:

```java
        Tuple<String, String> clientDescriptor = new Tuple<>(endpoint, account);
        AmazonS3Client client = clients.get(clientDescriptor);
```

But if people don't provide credentials, `account` is `null`.

Even if it actually could work, I think that we should use the `AWSCredentialsProvider` we create later on and extract from it the `account` (AWS KEY actually) and then use it as the second value of the tuple.

Closes #19557.
2016-07-28 18:05:51 +02:00
David Pilato 3adccd4560 Merge branch 'pr/19556-use-DefaultAWSCredentialsProviderChain' 2016-07-28 17:38:52 +02:00
David Pilato fb9bad23de Rename GceMetadataServiceImpl to GceMetadataService
See https://github.com/elastic/elasticsearch/pull/15765/files#r65527203
2016-07-27 13:28:53 +02:00
David Pilato e9339a1960 Merge branch 'master' into pr/15724-gce-network-host-master 2016-07-27 11:24:53 +02:00
Nik Everett 9270e8b22b Rename client yaml test infrastructure
This makes it obvious that these tests are for running the client yaml
suites. Now that there are other ways of running tests using the REST
client against a running cluster we can't go on calling the shared
client yaml tests "REST tests". They are rest tests, but they aren't
**the** rest tests.
2016-07-26 13:53:44 -04:00
David Pilato 0d3edee928 Merge branch 'master' into pr/15724-gce-network-host-master 2016-07-26 18:51:01 +02:00
David Pilato fde15ae470 Move custom name resolvers to NetworkService CTOR
Instead of using NetworkModule we can directly inject them in NetworkService CTOR.

See https://github.com/elastic/elasticsearch/pull/15765#issuecomment-235307974
2016-07-26 18:26:30 +02:00
Nik Everett a95d4f4ee7 Add Location header and improve REST testing
This adds a header that looks like `Location: /test/test/1` to the
response for the index/create/update API. The requirement for the header
comes from https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

https://tools.ietf.org/html/rfc7231#section-7.1.2 claims that relative
URIs are OK. So we use an absolute path which should resolve to the
appropriate location.

Closes #19079

This makes large changes to our rest test infrastructure, allowing us
to write junit tests that test a running cluster via the rest client.
It does this by splitting ESRestTestCase into two classes:
* ESRestTestCase is the superclass of all tests that use the rest client
to interact with a running cluster.
* ESClientYamlSuiteTestCase is the superclass of all tests that use the
rest client to run the yaml tests. These tests are shared across all
official clients, thus the `ClientYamlSuite` part of the name.
2016-07-25 17:02:40 -04:00
David Pilato b62ec1d300 Remove TODO about Timeout in Azure
In #15950 #15080 #16084 we added the support of TimeOut for Requests with a default client`setTimeoutIntervalInMs`.
So we can remove this useless todo which was added for only one method.

Closes #18617.
2016-07-25 16:19:15 +02:00
Ali Beyad 299b8a7a52 Removes unnecessary blobExists() check before reading a blob in the
Azure and Google cloud blob containers, as the APIs for both return
a 404 in the case of a missing object, which we already handle through
a NoSuchFileFoundException.
2016-07-23 23:24:56 -04:00
David Pilato 43c15f2b23 Merge branch 'test/check-s3-settings' 2016-07-23 00:38:55 +02:00
David Pilato 7aa4568a9c Fix s3 settings
Follow up for #18662 and #18690.

* For consistency, we rename method parameters and use `key` and `secret` instead of `account` and `key`.
* We add some tests to check that settings are correctly applied.
* Tests revealed that some checks are bad like for #18662.

Add test and fix issue for getting the right S3 endpoint
Test when Repository, Repositories or global settings are defined
But ignore testAWSCredentialsWithSystemProviders test
Add tests for AWS Client Configuration
Fix NPE when no region is set

We used to transform region="" to region=null but it's not needed anymore and would actually cause NPE from now.
2016-07-23 00:37:29 +02:00
David Pilato 0578925423 Fix ec2 settings
Follow up for #18662

We add some tests to check that settings are correctly applied.
Tests revealed that some checks were missing.

But we ignore `testAWSCredentialsWithSystemProviders` test for now.
2016-07-23 00:09:18 +02:00
Alexander Kazakov 0216cef7bd Fix EC2 discovery setting
Closes #18652
2016-07-23 00:09:18 +02:00
Ali Beyad d9ec959dfc Index folder names now use a UUID (not the index UUID but one specific
to snapshot/restore) and the index to UUID mapping is stored in the
repository index file.
2016-07-22 13:59:13 -04:00
Ali Beyad 630218a16f Change the BlobContainer interface to throw a NoSuchFileFoundException
for reads and deletes if the blob does not exist.
2016-07-22 13:49:25 -04:00
gfyoung dfcdadb59f Added HdfsBlobStoreContainer tests
Added BlobContainer tests for HDFS storage
and caught a bug at the same time in which
deleteBlob was not raising an IOException
when the blobName did not exist.
2016-07-22 13:48:45 -04:00
gfyoung 5eb4797955 Added AzureBlobStoreContainer tests
Added BlobContainer tests for Azure storage
and caught a bug at the same time in which
deleteBlob was not raising an IOException
when the blobName did not exist.
2016-07-22 13:48:45 -04:00
gfyoung c2c40d51db Added S3BlobStoreContainer tests 2016-07-22 13:48:45 -04:00
gfyoung d98fd36dad Added deleteBlob IOException test 2016-07-22 13:48:45 -04:00
gfyoung b02a6da8fd Properly raise IOException for Azure, Fs, Hdfs, and S3 2016-07-22 13:48:45 -04:00
gfyoung 0620a3d6c2 Raised IOException on deleteBlob
Closes gh-18530.
2016-07-22 13:48:45 -04:00
javanna db8beeba3b Merge branch 'master' into feature/async_rest_client 2016-07-22 15:51:03 +02:00
David Pilato 5e57febe53 Add DiscoveryPlugin interface
So we have a Pull interface easier to use which reduce the need of Guice.

See 2a9d7f68a1 (commitcomment-18335161)
2016-07-21 11:35:29 +02:00
David Pilato 2a9d7f68a1 Move custom name resolver registration to the NetworkModule
As explained in https://github.com/elastic/elasticsearch/pull/15765#discussion_r65804713
2016-07-21 10:27:38 +02:00
David Pilato 11ec3a4af6 Fix path_style_access after merge with master 2016-07-21 09:45:12 +02:00
David Pilato 98fd5833cc Merge branch 'master' into pr/15724-gce-network-host-master 2016-07-21 09:29:59 +02:00
David Pilato 55887457fa Don't register repository settings in S3 plugin
Follow up for https://github.com/elastic/elasticsearch/pull/17784#discussion_r64575845

Today we are registering repository settings when `S3RepositoryPlugin` starts:

```java
        settingsModule.registerSetting(S3Repository.Repository.KEY_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.SECRET_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.BUCKET_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.ENDPOINT_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.PROTOCOL_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.REGION_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.SERVER_SIDE_ENCRYPTION_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.BUFFER_SIZE_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.MAX_RETRIES_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.CHUNK_SIZE_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.COMPRESS_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.STORAGE_CLASS_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.CANNED_ACL_SETTING);
        settingsModule.registerSetting(S3Repository.Repository.BASE_PATH_SETTING);
```

We don't need to register those settings as they are repository level settings and not node level settings.

Closes #18945.
2016-07-20 11:26:03 +02:00
javanna 118a14fbe3 Build: upgrade httpcore version to 4.4.5
Closes #19127
2016-07-19 15:11:40 +02:00
David Pilato c6c5a1b7c8 Merge branch 'master' into pr/s3-path-style-access 2016-07-19 12:55:25 +02:00
David Pilato cdf2324d20 Register the new settings 2016-07-19 12:53:03 +02:00
Ali Beyad 19d0dbcd17 Removes waiting for yellow cluster health upon index (#19460)
creation in the REST tests, as we no longer need it due
to index creation now waiting for active shard copies
before returning (by default, it waits for the primary of
each shard, which is the same as ensuring yellow health).

Relates #19450
2016-07-15 17:18:34 -04:00
Martijn van Groningen d0069f0fbb Provide access to ThreadContext in ingest plugins
Also introduced a `Processor.Parameters` class that is holder for several services processors rely on,
the  IngestPlugin#getProcessors(...) method has been changed to accept `Processor.Parameters` instead
of each service seperately.
2016-07-15 08:16:15 +02:00