Commit Graph

1505 Commits

Author SHA1 Message Date
Ryan Ernst f1376262fe Merge branch 'master' into ingest_plugin_api 2016-06-29 14:16:16 -07:00
Ryan Ernst 6590e77c1a Plugins: Make plugins closeable
This change allows Plugin implementions to implement Closeable when they
have resources that should be released. As a first example of how this
can be used, I switched over ingest plugins, which just had the geoip
processor. The ingest framework had chains of closeable to support this,
which is now removed.
2016-06-28 16:16:26 -07:00
Ryan Ernst 258c3e86ab Added IngestPlugin api, cutover common and geoip, changed ingest factory
api to take ProcessorsRegistry
2016-06-28 10:52:07 -07:00
Tanguy Leroux c557663b90 Make discovery-azure work again
The discovery-plugin has been broken since 2.x because the code was not compliant with the security manager and because plugins have been refactored.

closes #18637, #15630
2016-06-28 10:05:44 +02:00
David Pilato af989c0780 Support new Asia Pacific (Mumbai) ap-south-1 AWS region
AWS [announced](http://www.allthingsdistributed.com/2016/06/introducing-aws-asia-pacific-mumbai-region.html) a new region: Asia Pacific (Mumbai)	`ap-south-1`.

We need to support it for:

* repository-s3: s3.ap-south-1.amazonaws.com or s3-ap-south-1.amazonaws.com
* discovery-ec2: ec2.ap-south-1.amazonaws.com

For reference: http://docs.aws.amazon.com/general/latest/gr/rande.html

Closes #19110.
2016-06-28 08:50:50 +02:00
Ryan Ernst 33ccc5aead Merge branch 'master' into mapper_plugin_api 2016-06-27 11:19:59 -07:00
Jim Ferenczi eb1e231a63 Revert "Rename `fields` to `stored_fields` and add `docvalue_fields`"
This reverts commit 2f46f53dc8.
2016-06-27 17:20:32 +02:00
Tanguy Leroux 59762fe487 Register group setting for repository-azure accounts
Since the Settings infrastructure has been improved, a group setting must be registered by the repository-azure plugin to allow settings like "cloud.azure.storage.my_account.account" to be coherent with Azure plugin documentation.
2016-06-27 12:04:59 +02:00
Nik Everett 71b95fb63c Switch analysis from push to pull
Instead of plugins calling `registerTokenizer` to extend the analyzer
they now instead have to implement `AnalysisPlugin` and override
`getTokenizer`. This lines up extending plugins in with extending
scripts. This allows `AnalysisModule` to construct the `AnalysisRegistry`
immediately as part of its constructor which makes testing anslysis
much simpler.

This also moves the default analysis configuration into `AnalysisModule`
which is how search is setup.

Like `ScriptModule`, `AnalysisModule` no longer extends `AbstractModule`.
Instead it is only responsible for building `AnslysisRegistry`. We still
bind `AnalysisRegistry` but we only do so in `Node`. This is means it
is available at module construction time so we slowly remove the need to
bind it in guice.
2016-06-26 07:15:42 -04:00
Ryan Ernst 6995bde710 Merge branch 'master' into mapper_plugin_api 2016-06-24 11:15:06 -07:00
David Pilato d73194b9b7 Remove settings filtering for service_account in GCS repository
Related to #18945 and to this 35d3bdab84 (commitcomment-17914150)

In GCS Repository plugin we defined a `service_account` setting which is defined as `Property.Filtered`.
It's not needed as it's only a path to a file.

Closes #18946
2016-06-24 10:58:29 +02:00
Adrien Grand 7ba5bceebe Add a MultiTermAwareComponent marker interface to analysis factories. #19028
This is the same as what Lucene does for its analysis factories, and we hawe
tests that make sure that the elasticsearch factories are in sync with
Lucene's. This is a first step to move forward on #9978 and #18064.
2016-06-23 10:19:24 +02:00
Jim Ferenczi 2f46f53dc8 Rename `fields` to `stored_fields` and add `docvalue_fields`
`stored_fields` parameter will no longer try to retrieve fields from the _source but will only return stored fields.
`fields` will throw an exception if the user uses it.
Add `docvalue_fields` as an adjunct to `fielddata_fields` which is deprecated. `docvalue_fields` will try to load the value from the docvalue and fallback to fielddata cache if docvalues are not enabled on that field.

Closes #18943
2016-06-22 17:38:30 +02:00
Ryan Ernst e817b5daa3 Plugins: Remove guice from Mapper plugins
This changes adds a MapperPlugin interface which allows pull style
retrieval of mappers and metadata mappers added by plugins. For now, I
have kept the MapperRegistry, but this should be removed in the future
as it is just a silly container for 2 maps which could themselves be
passed around.
2016-06-21 22:50:39 -07:00
Martijn van Groningen 82f7bfad98 ingest: merged o.e.ingest.core with o.e.ingest and in ingest-common module added o.e.ingest.common package
and moved all code to that package.
2016-06-21 09:24:00 +02:00
Simon Willnauer 7fea5bd8e7 Remove obsolete Modules that can simply be inlined in node creation 2016-06-20 11:28:14 +02:00
Simon Willnauer 260f38fd76 Remove VersionModule and use Version#current consistently.
We pretended to be able to ackt like a different version node for so long it's
time to be honest and remove this ability. It's just confusing and where needed
and tested we should build dedicated extension points.
2016-06-20 10:55:52 +02:00
Tanguy Leroux 98951b1203 Compile each Groovy script in its own classloader
closes #18572
2016-06-20 08:17:09 +02:00
Areek Zillur 9356a6090f Merge branch 'master' into enhancement/rollover_api 2016-06-17 11:35:57 -04:00
David Pilato bf7a6f5509 Merge branch 'pr/update-aws-sdk'
# Conflicts:
#	plugins/repository-s3/src/main/java/org/elasticsearch/plugin/repository/s3/S3RepositoryPlugin.java
2016-06-17 17:23:12 +02:00
Simon Willnauer bdb6dcea3a Cleanup ClusterService dependencies and detached from Guice (#18941)
This change removes some unnecessary dependencies from ClusterService
and cleans up ClusterName creation. ClusterService is now not created
by guice anymore.
2016-06-17 17:07:19 +02:00
David Pilato 82fee9f7a7 Revert change about registering Repository settings
Will create another issue to change that. Related to this discussion: f4cd3bd348 (r67291936)
2016-06-17 17:02:00 +02:00
Areek Zillur 545ffa7801 Merge branch 'master' into enhancement/rollover_api 2016-06-17 10:33:11 -04:00
Adrien Grand 600cbb6ab0 Upgrade to Lucene 6.1.0. #18926 2016-06-17 09:03:00 +02:00
Areek Zillur 6adffa6b7b Merge branch 'master' into enhancement/rollover_api 2016-06-16 17:27:32 -04:00
Ryan Ernst 8196cf01e3 Merge branch 'master' into plugin_name_api 2016-06-16 13:49:28 -07:00
Simon Willnauer b22c526b34 Cut over settings registration to a pull model (#18890)
Today we have a push model for registering basically anything. All our extension points
are defined on modules which we pass in to plugins. This is harder to maintain and adds
unnecessary dependencies on the modules itself. This change moves towards a pull model
where the plugin offers a getter kind of method to get the extensions. This will also
help in the future if we need to pass dependencies to the extension points which can
easily be defined on the method as arguments if a pull model is used.
2016-06-16 15:52:58 +02:00
Simon Willnauer 18ff051ad5 Simplify ScriptModule and script registration (#18903)
Registering a script engine or native scripts still uses Guice today
and is much more complicated than needed. This change moves to a pull
based model where script plugins have to implement a dedicated interface
`ScriptPlugin` and defines simple getter returning instances rather than
classes.
2016-06-16 09:35:13 +02:00
David Pilato b036d238f5 Fix typo 2016-06-16 05:51:39 +02:00
Ryan Ernst a4503c2aed Plugins: Remove name() and description() from api
In 2.0 we added plugin descriptors which require defining a name and
description for the plugin. However, we still have name() and
description() which must be overriden from the Plugin class. This still
exists for classpath plugins. But classpath plugins are mainly for
tests, and even then, referring to classpath plugins with their class is
a better idea. This change removes name() and description(), replacing
the name for classpath plugins with the full class name.
2016-06-15 17:12:22 -07:00
Ryan Ernst 7f6e0c6c02 fix compile for ingest plugin lambda 2016-06-15 16:57:01 -07:00
Tal Levy a26260fb72 new ScriptProcessor for Ingest (#18193)
add new ScriptProcessor for executing ES Scripts within pipelines
2016-06-15 14:57:18 -07:00
David Pilato 63223928dc Merge branch 'master' into pr/update-aws-sdk 2016-06-15 14:48:41 +02:00
Adrien Grand 44c653f5a8 Upgrade to lucene-6.1.0-snapshot-3a57bea. 2016-06-10 16:18:12 +02:00
Martijn van Groningen 3dd3ed4905 ingest: Upgrade geoip processor's dependencies and database files
The database files have been doubled in size compared to the previous files being used.
For this reason the database files are now gzip compressed, which required using
`GZIPInputStream` when loading database files.
2016-06-08 18:41:48 +02:00
Martijn van Groningen f611f1c99e ingest: Move processors from core to ingest-common module.
Folded grok processor into ingest-common module.

The rest tests have been moved to ingest-common module as well, because these tests don't run in the rest-api-spec module but in the distribution:integ-test-zip module
and adding a test plugin there felt just wrong to me. I think this is ok. I left a tiny ingest rest test behind in that tests with an empty pipeline.

Removed messy tests, these tests were already covered in the rest tests

Added ingest test plugin in test infra so that each module testing integration with ingest doesn't need write its own plugin

Moved reindex ingest tests to qa module

Closes #18490
2016-06-07 17:32:52 +02:00
Jason Tedor da74323141 Register thread pool settings
This commit refactors the handling of thread pool settings so that the
individual settings can be registered rather than registering the top
level group. With this refactoring, individual plugins must now register
their own settings for custom thread pools that they need, but a
dedicated API is provided for this in the thread pool module. This
commit also renames the prefix on the thread pool settings from
"threadpool" to "thread_pool". This enables a hard break on the settings
so that:
 - some of the settings can be given more sensible names (e.g., the max
   number of threads in a scaling thread pool is now named "max" instead
   of "size")
 - change the soft limit on the number of threads in the bulk and
   indexing thread pools to a hard limit
 - the settings names for custom plugins for thread pools can be
   prefixed (e.g., "xpack.watcher.thread_pool.size")
 - remove dynamic thread pool settings

Relates #18674
2016-06-06 22:09:12 -04:00
Areek Zillur d96fe20e3a add named writable registry glue 2016-06-06 16:11:46 -04:00
Jason Tedor 974c753bf6 Fix uncaught checked exception in AzureTestUtils
This commit fixes an uncaught checked IOException now thrown in
AzureTestUtils after 3adaf09675.
2016-06-03 14:17:25 -04:00
Jason Tedor bbd5f26d45 Merge branch 'master' into rjernst-placeholder
* master: (911 commits)
  [TEST] wait for yellow after setup doc tests (#18726)
  Fix recovery throttling to properly handle relocating non-primary shards (#18701)
  Fix merge stats rendering in RestIndicesAction (#18720)
  [TEST] mute RandomAllocationDeciderTests.testRandomDecisions
  Reworked docs for index-shrink API (#18705)
  Improve painless compile-time exceptions
  Adds UUIDs to snapshots
  Add test rethrottle test case for delete-by-query
  Do not start scheduled pings until transport start
  Adressing review comments
  Only filter intial recovery (post API) when shrinking an index (#18661)
  Add tests to check that toQuery() doesn't return null
  Removing handling of null lucene query where we catch this at parse time
  Handle empty query bodies at parse time and remove EmptyQueryBuilder
  Mute failing assertions in IndexWithShadowReplicasIT until fix
  Remove allow running as root
  Add upgrade-not-supported warning to alpha release notes
  remove unrecognized javadoc tag from matrix aggregation module
  set ValuesSourceConfig fields as private
  Adding MultiValuesSource support classes and documentation to matrix stats agg module
  ...
2016-06-03 13:32:03 -04:00
David Pilato ef6e43e18d We don't need many URLs here but just one 2016-06-03 18:27:27 +02:00
David Pilato a1496f8e21 Fix getResource to call it from the current class 2016-06-03 18:26:48 +02:00
David Pilato e7ab0a9233 Remove GceMetadataService interface as not needed 2016-06-03 18:14:00 +02:00
David Pilato b0bdd443bd Don't use String concatenation but actual URI 2016-06-03 18:04:55 +02:00
David Pilato 1eb022de11 Use a unique GCE_HOST setting 2016-06-03 17:33:09 +02:00
David Pilato 711515ac80 Rename GceComputeService to GceInstancesService 2016-06-03 17:26:53 +02:00
David Pilato b9989f88d0 Merge branch 'master' into pr/15724-gce-network-host-master 2016-06-03 17:22:24 +02:00
Ali Beyad b720216395 Adds UUIDs to snapshots
This commit adds a UUID for each snapshot, in addition to the already
existing repository and snapshot name. The addition of UUIDs will enable
more robust handling of the deletion of previous snapshots and lingering
files from partially failed delete operations, on top of being able to
uniquely track each snapshot.

Closes #18228
Relates #18156
2016-06-02 17:01:48 -04:00
Alexander Kazakov f32f35bec4 Register "cloud.node.auto_attributes" setting (#18678) 2016-06-01 13:27:11 +02:00
Adrien Grand d182e171a4 Upgrade to Lucene 6.0.1. 2016-06-01 10:31:10 +02:00
Ali Beyad 0efac76f01 Clarify the semantics of the BlobContainer interface
This commit clarifies the behavior that must be adhered to by any
implementors of the BlobContainer interface.  This is done through
expanded Javadocs.

Closes #18157
Closes #15580
2016-05-31 19:22:55 -04:00
Simon Willnauer 502a775a7c Add primitive to shrink an index into a single shard (#18270)
This adds a low level primitive operations to shrink an existing
index into a new index with a single shard. This primitive expects
all shards of the source index to allocated on a single node. Once the target index is initializing on the shrink node it takes a snapshot of the source index shards and copies all files into the target indices data folder. An [optimization](https://issues.apache.org/jira/browse/LUCENE-7300) coming in Lucene 6.1 will also allow for optional constant time copy if hard-links are supported by the filesystem. All mappings are merged into the new indexes metadata once the snapshots have been taken on the merge node.

To shrink an existing index all shards must be moved to a single node (one instance of each shard) and the index must be read-only:

```BASH
$ curl -XPUT 'http://localhost:9200/logs/_settings' -d '{
    "settings" : {
        "index.routing.allocation.require._name" : "shrink_node_name",
        "index.blocks.write" : true 
    }
}
```
once all shards are started on the shrink node. the new index can be created via:

```BASH
$ curl -XPUT 'http://localhost:9200/logs/_shrink/logs_single_shard' -d '{
    "settings" : {
        "index.codec" : "best_compression",
        "index.number_of_replicas" : 1
    }
}'
```

This API will perform all needed check before the new index is created and selects the shrink node based on the allocation of the source index. This call returns immediately, to monitor shrink progress the recovery API should be used since all copy operations are reflected in the recovery API with byte copy progress etc.

The shrink operation does not modify the source index, if a shrink operation should
be canceled or if the shrink failed, the target index can simply be deleted and
all resources are released.
2016-05-31 10:41:44 +02:00
David Pilato 63622aa6b6 Fix log use_throttle_retries 2016-05-27 12:48:05 +02:00
David Pilato 623a5b7a85 Merge branch 'master' into pr/update-aws-sdk 2016-05-27 10:13:35 +02:00
David Pilato a445654123 Fix after review
* changes `throttle_retries` to `use_throttle_retries`
* removes registering of all individual repository settings when the plugin starts. Not needed
* adds more comment about deprecated method in AWS SDK we need to implement though in a Delegate class within our tests
2016-05-27 10:13:16 +02:00
Boaz Leskes 318a4e3ef6 Introduce dedicated master nodes in testing infrastructure (#18514)
This PR changes the InternalTestCluster to support dedicated master nodes. The creation of dedicated master nodes can be controlled using a new `supportsMasterNodes` parameter to the ClusterScope annotation. If set to true (the default), dedicated master nodes will randomly be used. If set to false,  no master nodes will be created and data nodes will also be allowed to become masters. If active, test runs will either have 1 or 3 masternodes
2016-05-27 08:44:20 +02:00
Jason Tedor 9d39b05845 Remove deprecation suppression
Failing the build on deprecation warnings was removed in
19b3ec88af. This commit removes the
suppressed deprecation warnings so that their use is surfaced in the
build now.

Relates #18582
2016-05-25 17:15:36 -04:00
Tanguy Leroux bdee8c2632 Disable XContent auto closing of object and arrays 2016-05-25 16:46:09 +02:00
David Pilato c4d3bf472b Fix comment and rename blob_container to blobContainer 2016-05-25 11:02:11 +02:00
David Pilato fd602cc037 Merge branch 'master' into azure/fix-delete 2016-05-25 10:53:04 +02:00
Tanguy Leroux 1f011f9dea Remove Delete-By-Query plugin
closes #18469
2016-05-24 13:28:20 +02:00
Adrien Grand 459916f5dd Remove custom Base64 implementation. #18413
This replaces o.e.common.Base64 with java.util.Base64.
2016-05-23 11:32:42 +02:00
Tanguy Leroux e7eb664c78 Change BlobPath.buildAsString() method 2016-05-23 10:50:40 +02:00
Ryan Ernst 37d36f2f4c Merge branch 'master' into java9 2016-05-21 14:19:58 -07:00
Ryan Ernst 1d40c4bbc1 Make java9 work again
This change makes ES compile with java9 again, build 118.
* There are a handful of changes due to failure to determine types during compile.
* The attachment plugins which use tika needed to have tika upgraded in order to pickup fixes there for java 9.
* azure discovery and s3 repository indirectly depend on jaxb, which is no longer in the default modules. They now add a jaxb dependency externally, and make JarHell allow for this package.
2016-05-21 09:41:51 -07:00
David Pilato 1d75ee6fb9 Merge branch 'master' into azure/fix-delete 2016-05-20 16:12:30 +02:00
David Pilato 6772517f4d Cleanup the PR and apply advices from the review
* ESBlobStore tests must move to the test framework if we want to be able to reuse them in the context of plugins.
* To be able to identify more easily what are Integration Tests vs Unit Tests, this commit renames `*AzureTestCase` to `*AzureIntegTestCase`.
* Move some debug level logs to trace level
* Collapse when possible identical catch blocks
* `blobNameFromUri()` does not need anymore to get the container name. We just split the URI after 3 `/` and simply get the remaining part.
* Added a Unit test for that
* As we renamed some existing classes, checkstyle is now complaining about the lines width.
* While we are at it, let's replace all calls to `execute().actionGet()` with `get()`
* Move `readSettingsFromFile()` in a Util class. Note that this class might be useful for other plugins (S3/EC2/Azure-discovery for instance) so may be we should move it to the test framework?
* Replace some part of the code with lambdas
2016-05-20 16:04:39 +02:00
javanna 63c5b31449 update shas for httpclient and httpcore 2016-05-20 14:10:55 +02:00
Tanguy Leroux 8486488627 Disable DeleteByQueryRestIT delete_by_query/10_basic/Basic delete_by_query
Because of a REST test namespace conflict introduced by 18329. Issue tracked in 18469
2016-05-19 18:44:53 +02:00
David Pilato f4cd3bd348 Merge branch 'master' into pr/update-aws-sdk 2016-05-19 16:55:21 +02:00
David Pilato e289de6e96 Move `throttle_retries` under `repositories.s3.` prefix or per repository
I initially wrongly put this setting under `cloud.aws.s3.` prefix which does not make sense. It should be placed at the same place as `max_retries`.

Also applied @tlrx comments. We should set this even if max_retries is not set (when using default values).

Also added some documentation about this setting.
2016-05-19 16:50:37 +02:00
Tanguy Leroux 35d3bdab84 Add Google Cloud Storage repository plugin
Closes #12880
2016-05-19 13:26:23 +02:00
Jason Tedor 6e3b49c522 Fix inequality symbol in test assertion
This commit fixes the inequality symbol used in a test assertion in
RepositoryS3SettingsTests#testInvalidChunkBufferSizeRepositorySettings. The
inequality symbol was previously backwards but fixed in commit
cad0608cdb but fixing the inequality
symbol here was missed in that commit.

Closes #18449
2016-05-18 12:14:37 -04:00
David Pilato 9b247f9828 Fix remove of azure files
Probably when we updated Azure SDK, we introduced a regression.
Actually, we are not able to remove files anymore.

For example, if you register a new azure repository, the snapshot service tries to create a temp file and then remove it.
Removing does not work and you can see in logs:

```
[2016-05-18 11:03:24,914][WARN ][org.elasticsearch.cloud.azure.blobstore] [azure] can not remove [tests-ilmRPJ8URU-sh18yj38O6g/] in container {elasticsearch-snapshots}: The specified blob does not exist.
```

This fix deals with that. It now list all the files in a flatten mode, remove in the full URL the server and the container name.

As an example, when you are removing a blob which full name is `https://dpi24329.blob.core.windows.net/elasticsearch-snapshots/bar/test` you need to actually call Azure SDK with `bar/test` as the path, `elasticsearch-snapshots` is the container.

To run the test, you need to pass some parameters: `-Dtests.thirdparty=true -Dtests.config=/path/to/elasticsearch.yml`

Where `elasticsearch.yml` contains something like:

```
cloud.azure.storage.default.account: account
cloud.azure.storage.default.key: key
```

Related to #16472
Closes #18436.
2016-05-18 17:23:33 +02:00
Jason Tedor db4809d906 Remove last vestigates of /bin/sh shebangs
This commit removes the remaining /bin/sh shebangs in favor of
/bin/bash.

Relates #18448
2016-05-18 11:03:00 -04:00
David Pilato d85dac7a9a Add more logs 2016-05-18 16:43:56 +02:00
David Pilato cfedda5291 Default azure container should be `elasticsearch-snapshots`
This bug has been introduced in 5.0 when we refactored settings
2016-05-18 16:43:28 +02:00
polyfractal 72094feb12 [TEST] Add missing sort processor to tests, continued 2016-05-17 16:39:53 -04:00
Robert Muir 8d4c1befe5 Merge pull request #18364 from rmuir/nukeRunAsFloat
Remove LeafSearchScript.runAsFloat(): Nothing calls it.
2016-05-16 17:08:25 -04:00
Adrien Grand 864ed04059 Lessen leniency of the query dsl. #18276
This change does the following:
 - Queries that are currently unsupported such as prefix queries on numeric
   fields or term queries on geo fields now throw an error rather than returning
   a query that does not match anything.
 - Fuzzy queries on numeric, date and ip fields are now unsupported: they used
   to create range queries, we now expect users to use range queries directly.
   Fuzzy, regexp and prefix queries are now only supported on text/keyword
   fields (including `_all`).
 - The `_uid` and `_id` fields do not support prefix or range queries anymore as
   it would prevent us to store them more efficiently in the future, eg. by
   using a binary encoding.

Note that it is still possible to ignore these errors by using the `lenient`
option of the `match` or `query_string` queries.
2016-05-16 17:37:00 +02:00
Robert Muir 8edf213492 Remove LeafSearchScript.runAsFloat(): Nothing calls it. 2016-05-15 22:59:28 -04:00
Robert Muir 2028691e66 painless: improve exception stacktraces
closes #18319
2016-05-13 15:40:45 -04:00
Lee Hinman 9bcdafedda Allow only a single extension for a scripting engine
Previously multiple extensions could be provided, however, this can lead
to confusion with on-disk scripts (ie, "foo.js" and "foo.javascript")
having different content. Only a single extension is now supported.

The only language currently supporting multiple extensions was the
Javascript engine ("js" and "javascript"). It now only supports the
`.js` extension.

Relates to #10598
2016-05-13 09:54:31 -06:00
Lee Hinman efff3918d8 Remove support for mulitple languages per scripting engine 2016-05-13 09:24:31 -06:00
Lee Hinman a4060f7436 Remove vestiges of script engine sandboxing
This removes all the mentions of the sandbox from the script engine
services and permissions model. This means that the following settings
are no longer supported:

```yaml
script.inline: sandbox
script.stored: sandbox
```

Instead, only a `true` or `false` value can be specified.

Since this would otherwise break the default-allow parameter for
languages like expressions, painless, and mustache, all script engines
have been updated to have individual settings, for instance:

```yaml
script.engine.groovy.inline: true
```

Would enable all inline scripts for groovy. (they can still be
overridden on a per-operation basis).

Expressions, Painless, and Mustache all default to `true` for inline,
file, and stored scripts to preserve the old scripting behavior.

Resolves #17114
2016-05-13 09:24:31 -06:00
John Barker 531dcbf20a Add TAG_SETTING to list of allowed tags for the ec2 discovery plugin.
I am unable to set ec2 discovery tags because this setting was
accidentally omitted from the register settings list in
Ec2DiscoveryPlugin.java. I get this:

java.lang.IllegalArgumentException: unknown setting [discovery.ec2.tag.project]
2016-05-10 16:19:46 -04:00
David Pilato e8ddf5de2f Merge branch 'pr/hide-s3-repositories-credentials' 2016-05-10 20:22:39 +02:00
Adrien Grand f481492af3 Remove FieldMapper.Builder.indexName. #18219
The ability to configure index names that are different from the full name was
removed in 2.0.
2016-05-10 08:17:00 +02:00
Adrien Grand 5d8f684319 Mapping cleanups. #18180
This removes dead/duplicate code and makes the `_index` field not configurable.
(Configuration used to jus be ignored, now we would throw an exception if any
is provided.)
2016-05-10 08:14:18 +02:00
Chris Earle 5be79ed02c Add Failure Details to every NodesResponse
Most of the current implementations of BaseNodesResponse (plural Nodes) ignore FailedNodeExceptions.

- This adds a helper function to do the grouping to TransportNodesAction
- Requires a non-null array of FailedNodeExceptions within the BaseNodesResponse constructor
- Reads/writes the array to output
- Also adds StreamInput and StreamOutput methods for generically reading and writing arrays
2016-05-06 14:59:43 -04:00
Adrien Grand 7d8708716e QueryBuilder does not need generics. #18133
QueryBuilder has generics, but those are never used: all call sites use
`QueryBuilder<?>`. Only `AbstractQueryBuilder` needs generics so that the base
class can contain a default implementation for setters that returns `this`.
2016-05-06 08:38:20 +02:00
Jason Tedor 784c9e5fb9 Introduce node handshake
This commit introduces a handshake when initiating a light
connection. During this handshake, node information, cluster name, and
version are received from the target node of the connection. This
information can be used to immediately validate that the target node is
a member of the same cluster, and used to set the version on the
stream. This will allow us to extend APIs that are used during initial
cluster recovery without a major version change.

Relates #15971
2016-05-04 20:06:47 -04:00
Jason Tedor 2dea449949 Remove Strings#splitStringToArray
This commit removes the method Strings#splitStringToArray and replaces
the call sites with invocations to String#split. There are only two
explanations for the existence of this method. The first is that
String#split is slightly tricky in that it accepts a regular expression
rather than a character to split on. This means that if s is a string,
s.split(".")  does not split on the character '.', but rather splits on
the regular expression '.' which splits on every character (of course,
this is easily fixed by invoking s.split("\\.") instead). The second
possible explanation is that (again) String#split accepts a regular
expression. This means that there could be a performance concern
compared to just splitting on a single character. However, it turns out
that String#split has a fast path for the case of splitting on a single
character and microbenchmarks show that String#split has 1.5x--2x the
throughput of Strings#splitStringToArray. There is a slight behavior
difference between Strings#splitStringToArray and String#split: namely,
the former would return an empty array in cases when the input string
was null or empty but String#split will just NPE at the call site on
null and return a one-element array containing the empty string when the
input string is empty. There was only one place relying on this behavior
and the call site has been modified accordingly.
2016-05-04 08:12:41 -04:00
Martijn van Groningen 7aca1389e2 ingest: Add `date_index_name` processor.
Closes #17814
2016-04-29 17:20:48 +02:00
Tal Levy 07c2fbf83a Validate properties values according to database type (#17940)
Fixes #17683.
2016-04-29 07:58:27 -07:00
David Pilato c16d309c8c Allow `_gce_` network when not using discovery gce
For now we support `_gce_` only if discovery is set to `gce` and all information about GCE is provided (project_id and zone).
But in some cases, people would like to only bind to `_gce_` on a single node (without any elasticsearch cluster).

They could access the machine then from other machines running inside the same project.

This commit adds a new GceMetadataService which is started as soon as the plugin is started so GceNameResolver can use it to resolve `_gce`.

Closes #15724.
2016-04-29 16:56:24 +02:00
Yannick Welsch 37382ecfb2 Add Azure discovery tests mocking Azure management endpoint (#18004) 2016-04-29 15:54:15 +02:00
David Pilato 7cc8a1419b Update after rebase onto master 2016-04-29 15:39:51 +02:00
David Pilato d7eb375d24 Merge branch 'master' into pr/s3-path-style-access
# Conflicts:
#	plugins/repository-s3/src/main/java/org/elasticsearch/cloud/aws/AwsS3Service.java
#	plugins/repository-s3/src/main/java/org/elasticsearch/cloud/aws/InternalAwsS3Service.java
#	plugins/repository-s3/src/main/java/org/elasticsearch/repositories/s3/S3Repository.java
#	plugins/repository-s3/src/test/java/org/elasticsearch/cloud/aws/TestAwsS3Service.java
2016-04-29 15:21:16 +02:00
David Pilato 6c7a44ccd9 Fix test in mapper attachments plugin 2016-04-29 15:02:04 +02:00
David Pilato 2636703afa Merge branch 'master' into pr/attachments-add-test-forced-values 2016-04-29 14:55:42 +02:00
David Pilato faa3c6ef3c Add new UnsupportedException for EC Mock 2016-04-29 14:41:57 +02:00
David Pilato 6ef81c5dcd S3 repositories credentials should be filtered
When working on #18008 I found while reading the code that we don't filter anymore `repositories.s3.access_key` and `repositories.s3.secret_key`.

Also fixed a typo in REST test
2016-04-27 14:11:17 +02:00
Alexander Reelsen f71eb0b888 Version: Set version to 5.0.0-alpha2 2016-04-26 09:30:26 +02:00
Xu Zhang 3e4b470f83 Fix icu IndexScope setting 2016-04-22 15:03:02 -07:00
Ryan Ernst d12a4bb51d Merge pull request #17933 from rjernst/camelcase4
Remove camelCase support
2016-04-22 13:46:43 -07:00
xuzha cd527c5b92 Add support for customizing the rule file in ICU tokenizer
Lucene allows to create a ICUTokenizer with a special config argument
enabling the customization of the rule based iterator by providing
custom rules files.

This commit enable this feature. Users could provide a list of RBBI rule
files to ICU tokenizer.

closes #13146
2016-04-22 12:39:20 -07:00
Ryan Ernst 55388590c1 Remove camelCase support
Now that the current uses of magical camelCase support have been
deprecated, we can remove these in master (sans remaining issues like
BulkRequest). This change removes camel case support from ParseField,
query types, analysis, and settings lookup.

see #8988
2016-04-22 09:18:10 -07:00
Martijn van Groningen c5ad2e2865 Changed indexed scripts to be stored in the cluster state instead of the `.scripts` index.
Also added max script size soft limit for stored scripts.

Closes #16651
2016-04-22 13:42:55 +02:00
Martijn van Groningen dd2184ab25 ingest: Streamline option naming for several processors:
* `rename` processor, renamed `to` to `target_field`
* `date` processor, renamed `match_field` to `field` and renamed `match_formats` to `formats`
* `geoip` processor, renamed `source_field` to `field` and renamed `fields` to `properties`
* `attachment` processor, renamed `source_field` to `field` and renamed `fields` to `properties`

Closes #17835
2016-04-21 13:40:43 +02:00
Jun Ohtani 9eb242a5fe Analyze API : Rename filters/token_filters/char_filter to filter/token_filter/char_filter
Closes #15189
2016-04-21 18:05:11 +09:00
Ryan Ernst 523b071836 Internal: Remove XContentBuilderString
This was previously used by xcontentbuilder to support camelCase.
However, it is no longer used, and can be replaced with just String.
2016-04-18 14:32:18 -07:00
Nik Everett ff9b28d806 Deprecate remaining readXYZ|writeXYZ methods 2016-04-18 16:19:45 -04:00
David Pilato 44080a007f Add cloud.aws.s3.throttle_retries setting
Defaults to `true`.

If anyone is having trouble with this option, you could disable it with `cloud.aws.s3.throttle_retries: false` in `elasticsearch.yml` file.
2016-04-15 14:53:09 +02:00
David Pilato f2ee759ad5 Upgrade AWS SDK to 1.10.69
* Moving from JSON.org to Jackson for request marshallers.
* The Java SDK now supports retry throttling to limit the rate of retries during periods of reduced availability. This throttling behavior can be enabled via ClientConfiguration or via the system property "-Dcom.amazonaws.sdk.enableThrottledRetry".
* Fixed String case conversion issues when running with non English locales.
* AWS SDK for Java introduces a new dynamic endpoint system that can compute endpoints for services in new regions.
* Introducing a new AWS region, ap-northeast-2.
* Added a new metric, HttpSocketReadTime, that records socket read latency. You can enable this metric by adding enableHttpSocketReadMetric to the system property com.amazonaws.sdk.enableDefaultMetrics. For more information, see [Enabling Metrics with the AWS SDK for Java](https://java.awsblog.com/post/Tx3C0RV4NRRBKTG/Enabling-Metrics-with-the-AWS-SDK-for-Java).
* New Client Execution timeout feature to set a limit spent across retries, backoffs, ummarshalling, etc. This new timeout can be specified at the client level or per request.
  Also included in this release is the ability to specify the existing HTTP Request timeout per request rather than just per client.

* Added support for RequesterPays for all operations.
* Ignore the 'Connection' header when generating S3 responses.
* Allow users to generate an AmazonS3URI from a string without using URL encoding.
* Fixed issue that prevented creating buckets when using a client configured for the s3-external-1 endpoint.
* Amazon S3 bucket lifecycle configuration supports two new features: the removal of expired object delete markers and an action to abort incomplete multipart uploads.
* Allow TransferManagerConfiguration to accept integer values for multipart upload threshold.
* Copy the list of ETags before sorting https://github.com/aws/aws-sdk-java/pull/589.
* Option to disable chunked encoding https://github.com/aws/aws-sdk-java/pull/586.
* Adding retry on InternalErrors in CompleteMultipartUpload operation. https://github.com/aws/aws-sdk-java/issues/538
* Deprecated two APIs : AmazonS3#changeObjectStorageClass and AmazonS3#setObjectRedirectLocation.
* Added support for the aws-exec-read canned ACL. Owner gets FULL_CONTROL. Amazon EC2 gets READ access to GET an Amazon Machine Image (AMI) bundle from Amazon S3.

* Added support for referencing security groups in peered Virtual Private Clouds (VPCs). For more information see the service announcement at https://aws.amazon.com/about-aws/whats-new/2016/03/announcing-support-for-security-group-references-in-a-peered-vpc/ .
* Fixed a bug in AWS SDK for Java - Amazon EC2 module that returns NPE for dry run requests.
* Regenerated client with new implementation of code generator.
* This feature enables support for DNS resolution of public hostnames to private IP addresses when queried over ClassicLink. Additionally, you can now access private hosted zones associated with your VPC from a linked EC2-Classic instance. ClassicLink DNS support makes it easier for EC2-Classic instances to communicate with VPC resources using public DNS hostnames.
* You can now use Network Address Translation (NAT) Gateway, a highly available AWS managed service that makes it easy to connect to the Internet from instances within a private subnet in an AWS Virtual Private Cloud (VPC). Previously, you needed to launch a NAT instance to enable NAT for instances in a private subnet. Amazon VPC NAT Gateway is available in the US East (N. Virginia), US West (Oregon), US West (N. California), EU (Ireland), Asia Pacific (Tokyo), Asia Pacific (Singapore), and Asia Pacific (Sydney) regions. To learn more about Amazon VPC NAT, see [New - Managed NAT (Network Address Translation) Gateway for AWS](https://aws.amazon.com/blogs/aws/new-managed-nat-network-address-translation-gateway-for-aws/)
* A default read timeout is now applied when querying data from EC2 metadata service.
2016-04-15 14:52:48 +02:00
Adrien Grand d84c643f58 Use the new points API to index numeric fields. #17746
This makes all numeric fields including `date`, `ip` and `token_count` use
points instead of the inverted index as a lookup structure. This is expected
to perform worse for exact queries, but faster for range queries. It also
requires less storage.

Notes about how the change works:
 - Numeric mappers have been split into a legacy version that is essentially
   the current mapper, and a new version that uses points, eg.
   LegacyDateFieldMapper and DateFieldMapper.
 - Since new and old fields have the same names, the decision about which one
   to use is made based on the index creation version.
 - If you try to force using a legacy field on a new index or a field that uses
   points on an old index, you will get an exception.
 - IP addresses now support IPv6 via Lucene's InetAddressPoint and store them
   in SORTED_SET doc values using the same encoding (fixed length of 16 bytes
   and sortable).
 - The internal MappedFieldType that is stored by the new mappers does not have
   any of the points-related properties set. Instead, it keeps setting the index
   options when parsing the `index` property of mappings and does
   `if (fieldType.indexOptions() != IndexOptions.NONE) { // add point field }`
   when parsing documents.

Known issues that won't fix:
 - You can't use numeric fields in significant terms aggregations anymore since
   this requires document frequencies, which points do not record.
 - Term queries on numeric fields will now return constant scores instead of
   giving better scores to the rare values.

Known issues that we could work around (in follow-up PRs, this one is too large
already):
 - Range queries on `ip` addresses only work if both the lower and upper bounds
   are inclusive (exclusive bounds are not exposed in Lucene). We could either
   decide to implement it, or drop range support entirely and tell users to
   query subnets using the CIDR notation instead.
 - Since IP addresses now use a different representation for doc values,
   aggregations will fail when running a terms aggregation on an ip field on a
   list of indices that contains both pre-5.0 and 5.0 indices.
 - The ip range aggregation does not work on the new ip field. We need to either
   implement range aggs for SORTED_SET doc values or drop support for ip ranges
   and tell users to use filters instead. #17700

Closes #16751
Closes #17007
Closes #11513
2016-04-14 17:56:23 +02:00
Yannick Welsch 80cf9fc761 Add EC2 discovery tests to check permissions of AWS Java SDK (#17677) 2016-04-13 10:01:49 +02:00
Adrien Grand 3bf6f4076c Do not set analyzers on numeric fields.
When it comes to query parsing, either a field is tokenized and it would go
through analysis with its search_analyzer. Or it is not tokenized and the
raw string should be passed to termQuery(). Since numeric fields are not
tokenized and also declare a search analyzer, values would currently go through
analysis twice...
2016-04-12 17:47:29 +02:00
Adrien Grand 013acf9179 Remove MappedFieldType.value. #17557
This commit removes `MappedFieldType.value` and simplifies
`MappedFieldType.valueforSearch`. `valueforSearch` was used to post-process
values that come for stored fields (eg. to convert a long back to a string
representation of a date in the case of a date field) and also values that
are extracted from the source but only in the case of GET calls: it would
not be called when performing source filtering on search requests.

`valueforSearch` is now only called for stored fields, since values that are
extracted from the source should already be formatted as expected.
2016-04-12 09:12:56 +02:00
Adrien Grand 496c7fbd84 Upgrade Lucene 6 Release
* upgrades numerics to new Point format
* updates geo api changes
  * adds GeoPointDistanceRangeQuery as XGeoPointDistanceRangeQuery
  * cuts over to ES GeoHashUtils
2016-04-11 16:50:04 -05:00
Ryan Ernst 31ca8fa411 Merge branch 'master' into placeholder 2016-04-11 13:44:59 -07:00
Yannick Welsch b08d453a0a Fix EC2 Discovery settings (#17651)
Fixes two bugs introduced by the settings refactoring in #16602
2016-04-11 16:17:55 +02:00
Alexander Reelsen da19ddf3e6 Ingest Attachment: Allow to prevent base64 conversions by using raw bytes (#16601)
CBOR is natively supported in Elasticsearch and allows for byte arrays.
This means, that by using CBOR the user can prevent base64 conversions
for the data being sent back and forth.

This PR adds support to extract data from a byte array in addition to
a string. This also required to add a ByteArrayValueSource class.
2016-04-11 14:14:56 +02:00
Adrien Grand 42526ac28e Remove Settings.settingsBuilder.
We have both `Settings.settingsBuilder` and `Settings.builder` that do exactly
the same thing, so we should keep only one. I kept `Settings.builder` since it
has my preference but also it is the one that we use in examples of the Java API.
2016-04-08 18:10:02 +02:00
David Pilato c6b1beb083 Add a test for forced values in mapper-attachments plugin
This PR just adds a new test where we check that we forcing a value in the JSON document actually works as expected:

```json
{
     "file": {
        "_content": "BASE64"
        "_name": "12-240.pdf",
        "_language": "en",
        "_content_type": "pdf"
    }
}
```

Note that we don't support forcing all values. So sending:

```json
{
     "file": {
        "_content": "BASE64"
        "_name": "12-240.pdf",
        "_title": "12-240.pdf",
        "_keywords": "Div42 Src580 LGE Mechtech",
        "_language": "en",
        "_content_type": "pdf"
    }
}
```

Will have absolutely no effect on fields `title` and `keywords`.

Note that when `_language` is set, it only works if `index.mapping.attachment.detect_language` is set to `true`.

Related to https://discuss.elastic.co/t/mapper-attachments/46615/4
2016-04-08 10:07:21 +02:00
Chris Earle d97d5ebb8b Remove hostname from NetworkAddress.format
This removes the inconsistent output of IP addresses. The format was parsing-unfriendly and it makes it hard
to reason about API responses, such as to _nodes.

With this change in place, it will never print the hostname as part of the default format, which has the
added benefit that it can be used consistently for URIs, which was not the case when the hostname might
appear at the front with "hostname/ip:port".
2016-04-07 17:27:59 -04:00
javanna b9f9b2e3ee Merge branch 'master' into enhancement/discovery_node_one_getter 2016-03-30 17:22:40 +02:00
javanna f8b5d1f5b0 Remove DiscoveryNodes#masterNodeId in favour of existing DiscoveryNodes#getMasterNodeId 2016-03-30 15:28:06 +02:00
Adrien Grand 068c788ec8 Disable fielddata on text fields by defaults. #17386
`text` fields will have fielddata disabled by default. Fielddata can still be
enabled on an existing index by setting `fielddata=true` in the mappings.
2016-03-30 14:35:32 +02:00
javanna 8fc9dbbb99 Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-29 14:27:04 +02:00
Clinton Gormley 579d976e90 The source parameter should not be defined in the delete-by-query REST spec 2016-03-29 11:45:20 +02:00
javanna 93ce36a198 separated attributes from node roles in DiscoveryNode
Node roles are now serialized as well, they are not part of the node attributes anymore. DiscoveryNodeService takes care of dividing settings into attributes and roles. DiscoveryNode always requires to pass in attributes and roles separately.
2016-03-25 20:14:27 +01:00
Jason Tedor 7f0134e725 Revert "Merge pull request #16843 from xuzha/s3-encryption"
This reverts commit 37a183d9ed, reversing
changes made to 08903f1ed8.
2016-03-24 17:11:02 -04:00
Xu Zhang 38923b89c2 Update Format, add new settings into the setting test 2016-03-24 12:16:57 -07:00
Ryan Ernst 3adaf09675 Settings: Cleanup placeholder replacement
This change moves placeholder replacement to a pkg private class for
settings. It also adds a null check when calling replacement, as
settings objects can still contain null values, because we only prohibit
nulls on file loading. Finally, this cleans up file and stream loading a
bit to not have unnecessary exception wrapping.
2016-03-24 11:54:05 -07:00
Xu Zhang 7499e3aa4a Update and rebase the init implementation.
Also removes the MD5 checks from our side, AWS S3 SDK java is doing the
check.
2016-03-24 11:21:40 -07:00
Nicolas Trésegnie ea78fd6560 Add client-side encryption
The Java Cryptography Extension (JCE) has to be installed to use this feature.
2016-03-24 11:13:37 -07:00
David Pilato 4b1ae331f0 Update after review 2016-03-23 17:32:51 +01:00
David Pilato e907b7c11e Check that S3 setting `buffer_size` is always lower than `chunk_size`
We can be better at checking `buffer_size` and `chunk_size` for S3 repositories.
For example, we know that:

* `buffer_size` should be more than `5mb`
* `chunk_size` should be no more than `5tb`
* `buffer_size` should be lower than `chunk_size`

Otherwise, setting `buffer_size` is useless.

For the record:

`chunk_size` is a Snapshot setting whatever the implementation is.
`buffer_size` is an S3 implementation setting.

Let say that you are snapshotting a 500mb file. If you set `chunk_size` to `200mb`, then Snapshot service will call S3 repository to snapshot 3 files with the following sizes:

* `200mb`
* `200mb`
* `100mb`

If you set `buffer_size` to `100mb` (AWS maximum size recommendation), the first file of `200mb` will be uploaded on S3 using the multipart feature in 2 chunks and the workflow is basically the following:

* create the multipart request and get back an `id` from AWS S3 platform
* upload part1: `100mb`
* upload part2: `100mb`
* "commit" the full upload using the `id`.

Closes #17244.
2016-03-23 10:39:54 +01:00
Simon Willnauer 1988b8b387 [TEST] Reuse EsTestCase#createAnalysisService in KuromojiAnalysisTests 2016-03-22 13:45:20 +01:00
Jun Ohtani a9a0f262af Analysis Kuromoji: Add nbest option and NumberFilter
Add nbest_cost and nbest_examples parameter to KuromojiTokenizerFactory
Add KuromojiNumberFilterFactory
2016-03-22 20:09:56 +09:00
Ryan Ernst f71f0d6010 Revert "Build: Switch to maven-publish plugin"
This reverts commit a90a2b34fc.
2016-03-18 17:22:25 -07:00
Ryan Ernst 6af4c43c4f Merge pull request #17128 from rjernst/maven_publish
Build: Switch to maven-publish plugin
2016-03-17 11:53:50 -07:00
Simon Willnauer e91a141233 Prevent index level setting from being configured on a node level
Today we allow to set all kinds of index level settings on the node level which
is error prone and difficult to get right in a consistent manner.
For instance if some analyzers are setup in a yaml config file some nodes might
not have these analyzers and then index creation fails.

Nevertheless, this change allows some selected settings to be specified on a node level
for instance:
 * `index.codec` which is used in a hot/cold node architecture and it's value is really per node or per index
 * `index.store.fs.fs_lock` which is also dependent on the filesystem a node uses

All other index level setting must be specified on the index level. For existing clusters the index must be closed
and all settings must be updated via the API on each of the indices.

Closes #16799
2016-03-17 14:42:18 +01:00
Ryan Ernst a90a2b34fc Build: Switch to maven-publish plugin
The build currently uses the old maven support in gradle. This commit
switches to use the newer maven-publish plugin. This will allow future
changes, for example, easily publishing to artifactory.

An additional part of this change makes publishing of build-tools part
of the normal publishing, instead of requiring a separate upload step
from within buildSrc. That also sets us up for a follow up to enable
precomit checks on the buildSrc code itself.
2016-03-15 19:16:37 -07:00
Jason Tedor 618441aea3 Merge pull request #17088 from jasontedor/simplify-bootstrap-settings
Bootstrap does not set system properties
2016-03-15 19:25:16 -04:00
Jason Tedor 66ba044ec5 Use setting in integration test cluster config 2016-03-15 17:45:17 -04:00
Yannick Welsch f5e6db4090 Remove System.out.println and Throwable.printStackTrace from tests 2016-03-15 15:40:37 +01:00
Yannick Welsch d14ae5f8b6 Remove Python and Javascript Benchmark classes 2016-03-15 15:02:50 +01:00
David Pilato 84c862b825 Merge remote-tracking branch 'origin/master' 2016-03-15 09:25:26 +01:00