Commit Graph

2273 Commits

Author SHA1 Message Date
Nhat Nguyen a90b3eadf3
Upgrade to Lucene-7.4.0-snapshot-bcf9f5c36b (#29490)
This snapshot includes LUCENE-8246 which allows to customize the number 
of deletes a merge claims
2018-04-12 08:12:57 -04:00
Nhat Nguyen b24ca650ad Merge branch 'master' into ccr 2018-04-11 11:35:15 -04:00
Adrien Grand 4918924fae
Remove legacy mapping code. (#29224)
Some features have been deprecated since `6.0` like the `_parent` field or the
ability to have multiple types per index. This allows to remove quite some
code, which in-turn will hopefully make it easier to proceed with the removal
of types.
2018-04-11 09:41:37 +02:00
Nhat Nguyen 30c06a6f8c
Upgrade to Lucene-7.4.0-snapshot-2b27dd846a (#29398)
This snapshot version supports soft delete and the merge policy.
2018-04-09 12:25:49 -04:00
Lee Hinman a07ba9e400
Move Streams.copy into elasticsearch-core and make a multi-release jar (#29322)
* Move Streams.copy into elasticsearch-core and make a multi-release jar

This moves the method `Streams.copy(InputStream in, OutputStream out)` into the
`elasticsearch-core` project (inside the `o.e.core.internal.io` package). It
also makes this class into a multi-release class where the Java 9 equivalent
uses `InputStream#transferTo`.

This is a followup from
https://github.com/elastic/elasticsearch/pull/29300#discussion_r178147495
2018-04-06 11:07:20 -06:00
Tanguy Leroux 26fc8ad109
Use fixture to test repository-azure plugin (#29347)
This commit adds a new fixture that emulates an
Azure Storage service in order to improve the
existing integration tests. This is very similar
to what has been made for Google Cloud Storage
in #28788 and for Amazon S3 in #29296, and it
would have helped a lot to catch bugs like #22534.
2018-04-06 11:06:20 +02:00
Tanguy Leroux 7d29087442
[Tests] Use mock storage in repository-gcs unit tests (#29397)
The repository-gcs unit tests rely on the GoogleCloudStorageTestServer
but it would be better if they rely on a mocked Storage client instead.

That would also help to extract the GoogleCloudStorageFixture and the
GoogleCloudStorageTestServer classes in a QA third party project.

Closes #28960
2018-04-06 09:13:07 +02:00
Tanguy Leroux d813a05b9f
Use ESBlobStoreRepositoryIntegTestCase to test the repository-s3 plugin (#29315)
This commit adds the S3BlobStoreRepositoryTests class that extends the
base testing class for S3. It also removes some usage of socket servers 
that emulate socket connections in unit tests. It was added to trigger 
security exceptions, but this won't be needed anymore since #29296 
is merged.
2018-04-05 13:34:02 +02:00
Alan Woodward dccd43af47
Upgrade to lucene 7.3.0 (#29387) 2018-04-05 10:34:44 +01:00
Jason Tedor c95e7539e7
Enhance error for out of bounds byte size settings (#29338)
Today when you input a byte size setting that is out of bounds for the
setting, you get an error message that indicates the maximum value of
the setting. The problem is that because we use ByteSize#toString, we
end up with a representation of the value that does not really tell you
what the bound is. For example, if the bound is 2^31 - 1 bytes, the
output would be 1.9gb which does not really tell you want the limit as
there are many byte size values that we format to the same 1.9gb with
ByteSize#toString. We have a method ByteSize#getStringRep that uses the
input units to the value as the output units for the string
representation, so we end up with no loss if we use this to report the
bound. This commit does this.
2018-04-04 07:22:13 -04:00
Tanguy Leroux 989e465964
Use fixture to test repository-s3 plugin (#29296)
This commit adds a new fixture that emulates a S3 service in order to
improve the existing integration tests. This is very similar to what has
 been made for Google Cloud Storage in #28788, and such tests would 
have helped a lot to catch bugs like #22534.

The AmazonS3Fixture is brittle and only implements the very necessary
stuff for the S3 repository to work, but at least it works and can be
adapted for specific tests needs.
2018-04-03 11:30:43 +02:00
Adrien Grand 3bdfc8f3fb
Upgrade to lucene-7.3.0-snapshot-98a6b3d. (#29298)
Most notable changes include:
 - this release doesn't have the 7.2.1 version constant so I had to create one
 - spatial4j and jts were upgraded
2018-04-03 09:27:14 +02:00
Christoph Büscher 318b0af953 Remove execute mode bit from source files
Some source files seem to have the execute bit (a+x) set, which doesn't
really seem to hurt but is a bit odd. This change removes those, making
the permissions similar to other source files in the repository.
2018-03-26 13:37:55 +02:00
Lee Hinman 8e8fdc4f0e
Decouple XContentBuilder from BytesReference (#28972)
* Decouple XContentBuilder from BytesReference

This commit removes all mentions of `BytesReference` from `XContentBuilder`.
This is needed so that we can completely decouple the XContent code and move it
into its own dependency.

While this change appears large, it is due to two main changes, moving
`.bytes()` and `.string()` out of XContentBuilder itself into static methods
`BytesReference.bytes` and `Strings.toString` respectively. The rest of the
change is code reacting to these changes (the majority of it in tests).

Relates to #28504
2018-03-14 13:47:57 -06:00
David Pilato 87553bba16
Add ingest-attachment support for per document `indexed_chars` limit (#28977)
We today support a global `indexed_chars` processor parameter. But in some cases, users would like to set this limit depending on the document itself.
It used to be supported in mapper-attachments plugin by extracting the limit value from a meta field in the document sent to indexation process.

We add an option which reads this limit value from the document itself
by adding a setting named `indexed_chars_field`.

Which allows running:

```
PUT _ingest/pipeline/attachment
{
  "description" : "Extract attachment information. Used to parse pdf and office files",
  "processors" : [
    {
      "attachment" : {
        "field" : "data",
        "indexed_chars_field" : "size"
      }
    }
  ]
}
```

Then index either:

```
PUT index/doc/1?pipeline=attachment
{
  "data": "BASE64"
}
```

Which will use the default value (or the one defined by `indexed_chars`)

Or

```
PUT index/doc/2?pipeline=attachment
{
  "data": "BASE64",
  "size": 1000
}
```

Closes #28942
2018-03-14 19:07:20 +01:00
Jason Tedor 5904d936fa
Copy Lucene IOUtils (#29012)
As we have factored Elasticsearch into smaller libraries, we have ended
up in a situation that some of the dependencies of Elasticsearch are not
available to code that depends on these smaller libraries but not server
Elasticsearch. This is a good thing, this was one of the goals of
separating Elasticsearch into smaller libraries, to shed some of the
dependencies from other components of the system. However, this now
means that simple utility methods from Lucene that we rely on are no
longer available everywhere. This commit copies IOUtils (with some small
formatting changes for our codebase) into the fold so that other
components of the system can rely on these methods where they no longer
depend on Lucene.
2018-03-13 12:49:33 -04:00
Daniel Mitterdorfer b2557b9c11
Skip GeoIpProcessorFactoryTests on Windows (#29005)
With this commit we skip all GeoIpProcessorFactoryTests on Windows.
These tests use a MappedByteBuffer which will keep its file mappings
until it is garbage-collected. As a consequence, the corresponding
file appears to be still in use, Windows cannot delete it and the test
will fail in teardown.

Closes #29001
2018-03-13 09:10:40 +01:00
Tanguy Leroux 5a65db153e
[Test] GoogleCloudStorageFixture command line is too long on Windows (#28991)
Windows has some strong limitations on command line arguments,
specially when it's too long. In the googlecloudstoragefixture anttask
the classpath argument is very long and the command fails. This commit
removes the classpath as an argument and uses the CLASSPATH
environment variable instead.
2018-03-12 18:02:30 +01:00
Daniel Mitterdorfer 0d78a5890e
Reduce heap-memory usage of ingest-geoip plugin (#28963)
With this commit we reduce heap usage of the ingest-geoip plugin by
memory-mapping the database files. Previously, we have stored these
files gzip-compressed but this has resulted that data are loaded on the
heap.

Closes #28782
2018-03-12 08:07:33 +01:00
Tanguy Leroux d9cc6b9270 Remove temporary file 10_basic.yml~ 2018-03-09 17:44:10 +01:00
Tanguy Leroux 4756790d6e
Use fixture to test the repository-gcs plugin (#28788)
This commit adds a GoogleCloudStorageFixture that uses the
logic of a GoogleCloudStorageTestServer (added in #28576)
to emulate a remote Google Cloud Storage service.

By adding this fixture and a more complete integration test, we 
should be able to catch more bugs when upgrading the client library.

The fixture is started by the googleCloudStorageFixture task
and a custom Service Account file is created and added to the
Elasticsearch keystore for each test.
2018-03-09 13:57:27 +01:00
Tim Brooks 7d434c16f9
Remove NioNotEnabledBootstrapCheck bootstrap check (#28901)
This is related to #27260. This commit removes the bootstrap check that
prevents nio from being enabled.
2018-03-08 11:06:36 -07:00
Tim Brooks d8d1f0d4f0
Give transport-nio plugin socket permissions (#28900)
This is related to #27260. The transport-nio plugin needs socket
permissions to operate as a transport. This commit gives it these
permissions in the policy file.
2018-03-08 09:33:39 -07:00
Tim Brooks 5a8ec9b762
Selectors operate on channel contexts (#28468)
This commit is related to #27260. Currently there is a weird
relationship between channel contexts and nio channels. The selectors
use the context for read and writing. But the selector operates directly
on the nio channel for registering, closing, and connecting.

This commit works on improving this relationship. The selector operates
directly on the context which wraps the low level java.nio.channels. The
NioChannel class is simply an API that is used to interact with the
channel (sending messages from outside the selector event loop,
scheduling a close, adding listeners, etc). The context is only used
internally by the channel to implement these apis and by the selector to
perform these operations.
2018-02-22 09:44:52 -07:00
Tanguy Leroux a6a138905d
Use client settings in repository-gcs (#28575)
Similarly to what has been done for s3 and azure, this commit removes
the repository settings `application_name` and `connect/read_timeout`
in favor of client settings. It introduce a GoogleCloudStorageClientSettings
class (similar to S3ClientSettings) and a bunch of unit tests for that,
it aligns the documentation to be more coherent with the S3 one, it
documents the connect/read timeouts that were not documented at all and
also adds a new client setting that allows to define a custom endpoint.
2018-02-22 15:40:20 +01:00
Tim Brooks de2a0dfa6e
Ensure that azure stream has socket privileges (#28751)
This is related to #28662. It wraps the azure repository inputstream in
an inputstream that ensures `read` calls have socket permissions. This
is because the azure inputstream internally makes service calls.
2018-02-21 11:20:06 -07:00
Tanguy Leroux 9a95be35cf
[Tests] Extract the testing logic for Google Cloud Storage (#28576)
This pull request extracts in a dedicated class the request/response 
logic that "emulates" a Google Cloud Storage service in our 
repository-gcs tests.

The idea behind this is to make the logic more reusable. The class 
MockHttpTransport has been renamed to MockStorage which now 
only takes care of instantiating a Storage client and does the low-level 
request/response plumbing needed by this client.

The "Google Cloud Storage" logic has been extracted from 
MockHttpTransport and put in a new GoogleCloudStorageTestServer 
that is now independent from the google client testing framework.
2018-02-21 13:20:35 +01:00
Tanguy Leroux 9485b43167 [Tests] Fix RetryHttpInitializerWrapperTests.testIOExceptionRetry
This commit gives more time to the IO exception handler to retry the
request.
2018-02-20 14:54:53 +01:00
Tanguy Leroux 207ca1cc38
[Tests] Simplify GceDiscoverTests (#28726)
GceDiscoverTests can be simplified in a similar manner than #27945. It
now uses a mocked GceInstancesService that exposes internal test cluster
nodes as if they were real GCE nodes. It should also make the test more
robust by not using a HTTP server anymore.

closes #24313
2018-02-20 09:38:22 +01:00
Christoph Büscher 231fd3c9be [Docs] Remove misleading comment
The TikaImpl#parse method comment sounds like this method is only used
in the same package for testing, but AttachmentProcessor uses it outside
of testing, so we should remove this comment.
2018-02-09 15:47:38 +01:00
Jason Tedor 641a6c9e62
Guard accessDeclaredMembers for Tika on JDK 10
Tika parsers need accessDeclaredMembers because ZipFile needs
accessDeclaredMembers on JDK 10. This commit guards adding this
permission to parsers so that the permission is only granted on JDK
10. Additionally, we add an assertion that forces us to check if the
permission is still needed in JDK 11.

Relates #28603
2018-02-09 09:08:07 -05:00
Christoph Büscher cc9cb5356a
Add missing runtime permission to TikaImpl (#28602)
Tests on jdk10 were failing because of a change in its ZipFile implementation 
that now needs `accessDeclaredMembers` permissions. This change adds 
the missing permission to the plugins security policy and TikaImpl.

Closes #28568
2018-02-09 14:41:24 +01:00
Lee Hinman eebff4d2b3
Use non deprecated xcontenthelper (#28503)
* Move to non-deprecated XContentHelper.createParser(...)

This moves away from one of the now-deprecated XContentHelper.createParser
methods in favor of specifying the deprecation logger at parser creation time.

Relates to #28449

Note that this doesn't move all the `createParser` calls because some of them
use the already-deprecated method that doesn't specify the XContentType.

* Remove the deprecated (and now non-needed) createParser method
2018-02-05 16:18:18 -07:00
Tanguy Leroux be74f11517
Replace jvm-example by two plugin examples (#28339)
This pull request replaces the jvm-example plugin (from the jvm/site plugins era) by two new plugins: a custom-settings that shows how to register and use custom settings (including secured settings) in a plugin, and rest-handler plugin that shows how to register a rest handler.

The two plugins now reside in the plugins/examples project. They can serve as sample plugins for users, a special attention has been put on documentation. The packaging tests have been adapted to use the custom-settings plugin.
2018-01-26 17:34:24 +01:00
kel c675407a70 Remove redundant argument for buildConfiguration of s3 plugin (#28281) 2018-01-23 22:32:46 -08:00
Adrien Grand 700d9ecc95
Remove the `update_all_types` option. (#28288)
This option is not useful in 7.x since no indices may have more than one type
anymore.
2018-01-22 12:03:07 +01:00
Tim Brooks a6a57a71d3
Implement socket and server ChannelContexts (#28275)
This commit is related to #27260. Currently have a channel context that
implements reading and writing logic for socket channels. Additionally,
we have exception contexts to handle exceptions. And accepting contexts
to handle accepted channels. This PR introduces a ChannelContext that
handles close and exception handling for all channel types.
Additionally, it has implementers that provide specific functionality
for socket channels (read and writing). And specific functionality for
server channels (accepting).
2018-01-18 13:06:40 -07:00
Ryan Ernst cefea1a7c9
Build: Add gradle plugin for configuring meta plugin (#28276)
This commit adds a gradle plugin to ease development of meta plugins.
Applying the plugin will generated the meta plugin properties based on
the es_meta_plugin configuration object, which includes name and
description. The plugins to include within the meta plugin are
configured through the `plugins` list. An integ test task is also
automatically added.
2018-01-17 19:47:37 -08:00
Tim Brooks 4ea9ddb7d3
Unify nio read / write channel contexts (#28160)
This commit is related to #27260. Right now we have separate read and
write contexts for implementing specific protocol logic. However, some
protocols require a closer relationship between read and write
operations than is allowed by our current model. An example is HTTP
which might require a write if some problem with request parsing was
encountered.

Additionally, some protocols require close messages to be sent when a
channel is shutdown. This is also problematic in our current model,
where we assume that channels should simply be queued for close and
forgotten.

This commit transitions to a single ChannelContext which implements
all read, write, and close logic for protocols. It is the job of the
context to tell the selector when to close the channel. A channel can
still be manually queued for close with a selector. This is how server
channels are closed for now. And this route allows timeout mechanisms on
normal channel closes to be implemented.
2018-01-17 09:44:21 -07:00
Jason Tedor aded32f48f
Fix third-party audit tasks on JDK 8
This one is interesting. The third party audit task runs inside the
Gradle JVM. This means that if Gradle is started on JDK 8, the third
party audit tasks will fail as a result of the changes to support
building Elasticsearch with the JDK 9 compiler. This commit reverts the
third party audit changes to support running this task when Gradle is
started with JDK 8.

Relates #28256
2018-01-16 22:59:29 -05:00
Jason Tedor 0a79555a12
Require JDK 9 for compilation (#28071)
This commit modifies the build to require JDK 9 for
compilation. Henceforth, we will compile with a JDK 9 compiler targeting
JDK 8 as the class file format. Optionally, RUNTIME_JAVA_HOME can be set
as the runtime JDK used for running tests. To enable this change, we
separate the meaning of the compiler Java home versus the runtime Java
home. If the runtime Java home is not set (via RUNTIME_JAVA_HOME) then
we fallback to using JAVA_HOME as the runtime Java home. This enables:
 - developers only have to set one Java home (JAVA_HOME)
 - developers can set an optional Java home (RUNTIME_JAVA_HOME) to test
   on the minimum supported runtime
 - we can test compiling with JDK 9 running on JDK 8 and compiling with
   JDK 9 running on JDK 9 in CI
2018-01-16 13:45:13 -05:00
Ryan Ernst 18463e7e9f
Painless: Add whitelist extensions (#28161)
This commit adds a PainlessExtension which may be plugged in via SPI to
add additional classes, methods and members to the painless whitelist on
a per context basis. An example plugin adding and using a whitelist is
also added.
2018-01-15 11:28:31 -08:00
Jim Ferenczi b82017cbfe
Fix daitch_mokotoff phonetic filter to use the dedicated Lucene filter (#28225)
This commit changes the phonetic filter factory to use a DaitchMokotoffSoundexFilter
instead of a PhoneticFilter with a daitch_mokotoff encoder when daitch_mokotoff is selected.
The latter does not hanlde branching when computing the soundex and fails to encode multiple
variations when possible.

Closes #28211
2018-01-15 19:35:54 +01:00
Tim Brooks ee7eac8dc1
`MockTcpTransport` to connect asynchronously (#28203)
The method `initiateChannel` on `TcpTransport` is explicit in that
channels can be connect asynchronously. All production implementations
do connect asynchronously. Only the blocking `MockTcpTransport`
connects in a synchronous manner. This avoids testing some of the
blocking code in `TcpTransport` that waits on connections to complete.
Additionally, it requires a more extensive method signature than
required for other transports.

This commit modifies the `MockTcpTransport` to make these connections
asynchronously on a different thread. Additionally, it simplifies that
`initiateChannel` method signature.
2018-01-15 10:20:30 -07:00
Jim Ferenczi be012b1326
upgrade to lucene 7.2.1 (#28218) 2018-01-15 16:47:46 +01:00
Igor Motov c75ac319a6
Add ability to associate an ID with tasks (#27764)
Adds support for capturing the X-Opaque-Id header from a REST request and storing it's value in the tasks that this request started. It works for all user-initiated tasks (not only search).

Closes #23250

Usage:
```
$ curl -H "X-Opaque-Id: imotov" -H "foo:bar" "localhost:9200/_tasks?pretty&group_by=parents"
{
  "tasks" : {
    "7qrTVbiDQKiZfubUP7DPkg:6998" : {
      "node" : "7qrTVbiDQKiZfubUP7DPkg",
      "id" : 6998,
      "type" : "transport",
      "action" : "cluster:monitor/tasks/lists",
      "start_time_in_millis" : 1513029940042,
      "running_time_in_nanos" : 266794,
      "cancellable" : false,
      "headers" : {
        "X-Opaque-Id" : "imotov"
      },
      "children" : [
        {
          "node" : "V-PuCjPhRp2ryuEsNw6V1g",
          "id" : 6088,
          "type" : "netty",
          "action" : "cluster:monitor/tasks/lists[n]",
          "start_time_in_millis" : 1513029940043,
          "running_time_in_nanos" : 67785,
          "cancellable" : false,
          "parent_task_id" : "7qrTVbiDQKiZfubUP7DPkg:6998",
          "headers" : {
            "X-Opaque-Id" : "imotov"
          }
        },
        {
          "node" : "7qrTVbiDQKiZfubUP7DPkg",
          "id" : 6999,
          "type" : "direct",
          "action" : "cluster:monitor/tasks/lists[n]",
          "start_time_in_millis" : 1513029940043,
          "running_time_in_nanos" : 98754,
          "cancellable" : false,
          "parent_task_id" : "7qrTVbiDQKiZfubUP7DPkg:6998",
          "headers" : {
            "X-Opaque-Id" : "imotov"
          }
        }
      ]
    }
  }
}
```
2018-01-12 15:34:17 -05:00
Jim Ferenczi fcf4114adc
Make sure that we don't detect files as maven coordinate when installing a plugin (#28163)
* This change makes sure that we don't detect a file path containing a ':' as
a maven coordinate (e.g.: `file:C:\path\to\zip`)

* restore test muted on master
2018-01-10 14:59:37 +01:00
Jim Ferenczi ca6b15bf7c [Tests] temporary disable meta plugin rest tests #28163 2018-01-10 14:31:07 +01:00
Jim Ferenczi 36729d1c46
Add the ability to bundle multiple plugins into a meta plugin (#28022)
This commit adds the ability to package multiple plugins in a single zip.
The zip file for a meta plugin must contains the following structure:

|____elasticsearch/
| |____   <plugin1> <-- The plugin files for plugin1 (the content of the elastisearch directory)
| |____   <plugin2>  <-- The plugin files for plugin2
| |____   meta-plugin-descriptor.properties <-- example contents below
The meta plugin properties descriptor is mandatory and must contain the following properties:

description: simple summary of the meta plugin.
name: the meta plugin name
The installation process installs each plugin in a sub-folder inside the meta plugin directory.
The example above would create the following structure in the plugins directory:

|_____ plugins
| |____   <name_of_the_meta_plugin>
| | |____   meta-plugin-descriptor.properties
| | |____   <plugin1>
| | |____   <plugin2>
If the sub plugins contain a config or a bin directory, they are copied in a sub folder inside the meta plugin config/bin directory.

|_____ config
| |____   <name_of_the_meta_plugin>
| | |____   <plugin1>
| | |____   <plugin2>

|_____ bin
| |____   <name_of_the_meta_plugin>
| | |____   <plugin1>
| | |____   <plugin2>
The sub-plugins are loaded at startup like normal plugins with the same restrictions; they have a separate class loader and a sub-plugin
cannot have the same name than another plugin (or a sub-plugin inside another meta plugin).

It is also not possible to remove a sub-plugin inside a meta plugin, only full removal of the meta plugin is allowed.

Closes #27316
2018-01-09 18:28:43 +01:00
Tim Brooks ff3db0b50e
Cleanup TcpChannelFactory and remove classes (#28102)
This commit is related to #27260. It moves the TcpChannelFactory into
NioTransport so that consumers do not have to be passed around.
Additionally it deletes an unused read handler.
2018-01-08 10:18:19 -07:00
Martijn van Groningen b46bb2efae
test: do not use asn fields
Closes #28124
2018-01-07 23:21:01 +01:00
Tim Brooks 38701fb6ee
Create nio-transport plugin for NioTransport (#27949)
This is related to #27260. This commit moves the NioTransport from
:test:framework to a new nio-transport plugin. Additionally, supporting
tcp decoding classes are moved to this plugin. Generic byte reading and
writing contexts are moved to the nio library.

Additionally, this commit adds a basic MockNioTransport to
:test:framework that is a TcpTransport implementation for testing that
is driven by nio.
2018-01-05 09:41:29 -07:00
Martijn van Groningen fdb9b50747
test: replaced try-catch statements with expectThrows(...) 2018-01-05 14:29:53 +01:00
Sian Lerk Lau a4a7150b56
Added ASN support for Ingest GeoIP plugin.
Closes #27849
2018-01-05 14:07:04 +01:00
Ryan Ernst d36ec18029
Plugins: Add plugin extension capabilities (#27881)
This commit adds the infrastructure to plugin building and loading to
allow one plugin to extend another. That is, one plugin may extend
another by the "parent" plugin allowing itself to be extended through
java SPI. When all plugins extending a plugin are finished loading, the
"parent" plugin has a callback (through the ExtensiblePlugin interface)
allowing it to reload SPI.

This commit also adds an example plugin which uses as-yet implemented
extensibility (adding to the painless whitelist).
2018-01-03 11:12:43 -08:00
Tanguy Leroux 098f82f086
[Test] Do not rely on MockZenPing for Azure tests (#27945)
This commit changes some Azure tests so that they do not rely on
MockZenPing and TestZenDiscovery anymore, but instead use a mocked
AzureComputeService that exposes internal test cluster nodes as if
they were real Azure nodes.

Related to #27859

Closes #27917, #11533
2017-12-22 09:58:02 +01:00
Martijn van Groningen 90ee35930a
ingest: upgraded ingest geoip's geoip2's dependencies. 2017-12-21 08:43:02 +01:00
Nhat Nguyen 3c865d6d04 TEST: reduce blob size #testExecuteMultipartUpload
If a large blob size and small buffer size are picked, this test causes out of memory.

https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+intake/1061/
2017-12-20 12:43:05 -05:00
Adrien Grand 77711508b0
Upgrade to Lucene 7.2.0. (#27910) 2017-12-20 14:17:40 +01:00
Boaz Leskes 0ca141880f Disable TestZenDiscovery in cloud providers integrations test
TestZenDiscovery is used to allow discovery based on in memory structures. This isn't a relevant for the cloud providers tests (but isn't a problem at the moment either)
2017-12-20 14:02:55 +01:00
Martijn van Groningen 4585cc8312
ingest: Upgraded the geolite2 databases. 2017-12-20 10:42:46 +01:00
Tal Levy 43ff38c5da
update ingest-attachment to use Tika 1.17 and newer deps (#27824)
- this pr updates tika and its dependencies
- updates the SHAs
- updates the class excludes
2017-12-15 13:47:26 -08:00
Colin Goodheart-Smithe 579d1fea57
Fixes ByteSizeValue to serialise correctly (#27702)
* Fixes ByteSizeValue to serialise correctly

This fix makes a few fixes to ByteSizeValue to make it possible to perform round-trip serialisation:
* Changes wire serialisation to use Zlong methods instead of VLong methods. This is needed because the value `-1` is accepted but previously if `-1` is supplied it cannot be serialised using the wire protocol.
* Limits the supplied size to be no more than Long.MAX_VALUE when converted to bytes. Previously values greater than Long.MAX_VALUE bytes were accepted but would be silently interpreted as Long.MAX_VALUE bytes rather than erroring so the user had no idea the value was not being used the way they had intended. I consider this a bug and so fine to include this bug fix in a minor version but I am open to other points of view.
* Adds a `getStringRep()` method that can be used when serialising the value to JSON. This will print the bytes value if the size is positive, `”0”` if the size is `0` and `”-1”` if the size is `-1`.
* Adds logic to detect fractional values when parsing from a String and emits a deprecation warning in this case.
* Modifies hashCode and equals methods to work with long values rather than doubles so they don’t run into precision problems when dealing with large values. Previous to this change the equals method would not detect small differences in the values (e.g. 1-1000 bytes ranges) if the actual values where very large (e.g. PBs). This was due to the values being in the order of 10^18 but doubles only maintaining a precision of ~10^15.

Closes #27568

* Fix bytes settings default value to not use fractional values

* Fixes test

* Addresses review comments

* Modifies parsing to preserve unit

This should be bwc since in the case that the input is fractional it reverts back to the old method of parsing it to the bytes value.

* Addresses more review comments

* Fixes tests

* Temporarily changes version check to 7.0.0

This will be changed to 6.2 when the fix has been backported
2017-12-14 12:17:17 +00:00
Tanguy Leroux b69923f112
Remove some unused code (#27792)
This commit removes some unused code.
2017-12-13 16:45:55 +01:00
Tanguy Leroux f27cb96a64
Use AmazonS3.doesObjectExist() method in S3BlobContainer (#27723)
This pull request changes the S3BlobContainer.blobExists() method implementation 
to make it use the AmazonS3.doesObjectExist() method instead of 
AmazonS3.getObjectMetadata(). The AmazonS3 implementation takes care of 
catching any thrown AmazonS3Exception and compares its response code with 404, 
returning false (object does not exist) or lets the exception be propagated.
2017-12-12 09:30:36 +01:00
Luca Cavanna f4fb4d3bf5
Add support for filtering mappings fields (#27603)
Add support for filtering fields returned as part of mappings in get index, get mappings, get field mappings and field capabilities API.

Plugins can plug in their own function, which receives the index as argument, and return a predicate which controls whether each field is included or not in the returned output.
2017-12-05 20:31:29 +01:00
Jason Tedor 42a4ad35da
Add node name to thread pool executor name
This commit adds the node name to the names of thread pool executors so
that the node name is visible in rejected execution exception messages.

Relates #27663
2017-12-05 07:45:40 -05:00
Henrik Lindström 7a44596446 Catch InvalidPathException in IcuCollationTokenFilterFactory (#27202)
Using custom rules in the icu_collation filter can fail on Windows. If the rules are interpreted 
as a file location, this leads to an InvalidPathException when trying to read the rules from a file.
2017-12-04 10:29:08 +01:00
Adrien Grand 6323bb0d97
Upgrade to lucene-7.2.0-snapshot-8c94404. (#27619)
This new snapshot mostly brings a change to TopFieldCollector which can now
early terminate collection when trackTotalHits is `false`.

As a follow-up, we should replace our usage of
`EarlyTerminatingSortingCollector` with this new option.
2017-12-04 09:40:08 +01:00
James Baiera e16f1271b6
Fix SecurityException when HDFS Repository used against HA Namenodes (#27196)
* Sense HA HDFS settings and remove permission restrictions during regular execution.

This PR adds integration tests for HA-Enabled HDFS deployments, both regular and secured. 
The Mini HDFS fixture has been updated to optionally run in HA-Mode. A new test suite has 
been added for reproducing the effects of a Namenode failing over during regular repository 
usage. Going forward, the HDFS Repository will still be subject to its self imposed permission 
restrictions during normal use, but will no longer restrict them when running against an HA 
enabled HDFS cluster. Instead, the plugin will rely on the provided security policy and not 
further restrict the permissions so that the transparent operation to failover to a different 
Namenode in the client does not raise security exceptions. Additionally, we are now testing the 
secure mode with SASL based wire encryption of data between Elasticsearch and HDFS. This 
includes a missing library (commons codec) in order to support this change.
2017-12-01 14:26:05 -05:00
Jason Tedor d0cd18169e Remove stale awaits fix on azure master nodes test
This awaits fix has been there forever and no one seems to know what to
do with this test. I say let CI churn on it because it passed for me
three out of three times. If there is something wrong with it, we will
know quickly and can then address with the new information that we have.
2017-11-28 22:43:34 -05:00
Adrien Grand 996990ad1f
Upgrade to lucene-7.2.0-snapshot-8c94404. (#27496)
The main highlight of this new snapshot is that it introduces the opportunity
for queries to opt out of caching. In case a query opts out of caching, not only
will it never be cached, but also no compound query that wraps it will be
cached.
2017-11-28 14:52:42 +01:00
David Turner 7ac361d86e Update @AwaitsFix URL to point at an issue in the current repo 2017-11-28 09:35:46 +00:00
Tanguy Leroux 50a2459adf
Update Google SDK to version 1.23 (#27381)
This commit updates the google-api-client library to version 1.23.0.

Related to #26636
2017-11-15 15:30:27 +01:00
Tanguy Leroux dd51c231ac
Create new handlers for every new request in GoogleCloudStorageService (#27339)
This commit changes the DefaultHttpRequestInitializer in order to make
it create new HttpIOExceptionHandler and HttpUnsuccessfulResponseHandler
for every new HTTP request instead of reusing the same two handlers for
all requests.

Closes #27092
2017-11-14 11:43:32 +01:00
Jason Tedor d375cef73c
Upgrade AWS SDK Jackson Databind to 2.6.7.1
The AWS SDK has a transitive dependency on Jackson Databind. While the
AWS SDK was recently upgraded, the Jackson Databind dependency was not
pulled along with it to the version that the AWS SDK depends on. This
commit upgrades the dependencies for discovery-ec2 and repository-s3
plugins to match versions on the AWS SDK transitive dependencies.

Relates #27361
2017-11-13 12:05:14 -05:00
Simon Willnauer 2299c70371
Allow affix settings to specify dependencies (#27161)
We use affix settings to group settings / values under a certain namespace.
In some cases like login information for instance a setting is only valid if
one or more other settings are present. For instance `x.test.user` is only valid
if there is an `x.test.passwd` present and vice versa. This change allows to specify
such a dependency to prevent settings updates that leave settings in an inconsistent
state.
2017-11-13 12:06:36 +01:00
Tanguy Leroux f6c2ea0f7d [Test] Fix S3BlobStoreContainerTests.testNumberOfMultiparts() 2017-11-10 15:45:20 +01:00
Tanguy Leroux 9c4d6c629a
Remove S3 output stream (#27280)
Now the blob size information is available before writing anything, 
the repository implementation can know upfront what will be the 
more suitable API to upload the blob to S3.

This commit removes the DefaultS3OutputStream and S3OutputStream 
classes and moves the implementation of the upload logic directly in the 
S3BlobContainer.

related #26993
closes #26969
2017-11-10 12:22:33 +01:00
Guillaume Le Floch ac5fd6a7d9 Update Tika version to 1.15
This commit upgrades the Tika dependency to version 1.15.

Relates #25003
2017-11-09 13:16:44 -05:00
Tanguy Leroux 184dda9eb0
Update to AWS SDK 1.11.223 (#27278) 2017-11-09 13:25:51 +01:00
Jason Tedor 58a28dacbd
Remove colons from task and configuration names
Gradle 5.0 will remove support for colons in configuration and task
names. This commit fixes this for our build by removing all current uses
of colons in configuration and task names.

Relates #27305
2017-11-08 15:22:31 -05:00
David Roberts 749c3ec716
Remove the single argument Environment constructor (#27235)
Only tests should use the single argument Environment constructor.  To
enforce this the single arg Environment constructor has been replaced with
a test framework factory method.

Production code (beyond initial Bootstrap) should always use the same
Environment object that Node.getEnvironment() returns.  This Environment
is also available via dependency injection.
2017-11-04 13:25:09 +00:00
kel 0f21262b36 Do not create directories if repository is readonly (#26909)
For FsBlobStore and HdfsBlobStore, if the repository is read only, the blob store should be aware of the readonly setting and do not create directories if they don't exist.

Closes #21495
2017-11-03 13:10:50 +01:00
Colin Goodheart-Smithe c1b8140c83
Upgrade to Lucene 7.1 (#27225) 2017-11-02 13:25:33 +00:00
Colin Goodheart-Smithe 99aca9cdfc
Enhances exists queries to reduce need for `_field_names` (#26930)
* Enhances exists queries to reduce need for `_field_names`

Before this change we wrote the name all the fields in a document to a `_field_names` field and then implemented exists queries as a term query on this field. The problem with this approach is that it bloats the index and also affects indexing performance.

This change adds a new method `existsQuery()` to `MappedFieldType` which is implemented by each sub-class. For most field types if doc values are available a `DocValuesFieldExistsQuery` is used, falling back to using `_field_names` if doc values are disabled. Note that only fields where no doc values are available are written to `_field_names`.

Closes #26770

* Addresses review comments

* Addresses more review comments

* implements existsQuery explicitly on every mapper

* Reinstates ability to perform term query on `_field_names`

* Added bwc depending on index created version

* Review Comments

* Skips tests that are not supported in 6.1.0

These values will need to be changed after backporting this PR to 6.x
2017-11-01 10:46:59 +00:00
Christoph Büscher 9253ea8aec Fix beidermorse phonetic token filter for unspecified `languageset` (#27112)
Currently, when we create a BeiderMorseFilter with an unspecified `languageset`,
the filter will not guess the language, which should be the default behaviour.
This change fixes this and adds a simple test for the cases with and without
provided `languageset` settings.

Closes #26771
2017-10-27 10:07:36 +02:00
Tanguy Leroux 463e7e6fa3 Revert "Upgrade to Jackson 2.9.2 (#27032)"
This reverts commit 0b9acc5ace.
2017-10-20 08:25:41 +02:00
Tanguy Leroux f78b2e5bc9 Fix ingest-attachment yaml REST test 2017-10-19 22:20:50 +02:00
Simon Willnauer cdd7c1e6c2 Return List instead of an array from settings (#26903)
Today we return a `String[]` that requires copying values for every
access. Yet, we already store the setting as a list so we can also directly
return the unmodifiable list directly. This makes list / array access in settings
a much cheaper operation especially if lists are large.
2017-10-09 09:52:08 +02:00
Md. Abdulla-Al-Sun a40c474e10
Added Bengali Analyzer to Elasticsearch with respect to the lucene update(PR#238) 2017-10-05 13:25:05 +02:00
Simon Willnauer 00dfdf50cf Represent lists as actual lists inside Settings (#26878)
Today we represent each value of a list setting with it's own dedicated key
that ends with the index of the value in the list. Aside of the obvious
weirdness this has several issues especially if lists are massive since it
causes massive runtime penalties when validating settings. Like a list of 100k
words will literally cause a create index call to timeout and in-turn massive
slowdown on all subsequent validations runs.

With this change we use a simple string list to represent the list. This change
also forbids to add a settings that ends with a .0 which was internally used to
detect a list setting.  Once this has been rolled out for an entire major
version all the internal .0 handling can be removed since all settings will be
converted.

Relates to #26723
2017-10-05 09:27:08 +02:00
Martijn van Groningen dca787ed8a
upgrade to Lucene 7.1.0 snapshot version 2017-10-05 09:06:56 +02:00
David Pilato 84a3899550 Simplify Azure blobStore method signatures (#26752)
While working on #26751, I found that we are passing the container name on every single method although we don't need it as it is stored within the blobstore object already.

This commit simplifies a bit that part of the code.

It also removes `repositoryName` from AzureBlobStore which was not used anymore.
Also we move some properties in AzureBlobContainer to `private` members.
2017-10-04 20:17:50 +02:00
Simon Willnauer d1533e2397 Remove Settings#getAsMap() (#26845)
Since `#getAsMap` exposes internal representation we are trying to remove it
step by step. This commit is cleaning up some xcontent writing as well as
usage in tests
2017-10-04 01:21:38 -06:00
Simon Willnauer 7b8d036ab5 Replace group map settings with affix setting (#26819)
We use group settings historically instead of using a prefix setting which is more restrictive and type safe. The majority of the usecases needs to access a key, value map based on the _leave node_ of the setting ie. the setting `index.tag.*` might be used to tag an index with `index.tag.test=42` and `index.tag.staging=12` which then would be turned into a `{"test": 42, "staging": 12}` map. The group settings would always use `Settings#getAsMap` which is loosing type information and uses internal representation of the settings. Using prefix settings allows now to access such a method type-safe and natively.
2017-09-30 14:27:21 +02:00
David Pilato 9ba5e168e4 Don't use create static storage service
Even though you annotate the Test class with `@ThirdParty` the static
code is initialized.

In that case it fails with:

```
==> Test Info: seed=529C3C6977F695FC; jvms=3; suites=6
Suite: org.elasticsearch.repositories.azure.AzureSnapshotRestoreTests
ERROR   0.00s J2 | AzureSnapshotRestoreTests (suite) <<< FAILURES!
   > Throwable #1: java.lang.IllegalStateException: to run integration tests, you need to set -Dtests.thirdparty=true and -Dtests.azure.account=azure-account -Dtests.azure.key=azure-key
   >    at org.elasticsearch.cloud.azure.AzureTestUtils.generateMockSecureSettings(AzureTestUtils.java:37)
   >    at org.elasticsearch.repositories.azure.AzureSnapshotRestoreTests.generateMockSettings(AzureSnapshotRestoreTests.java:81)
   >    at org.elasticsearch.repositories.azure.AzureSnapshotRestoreTests.<clinit>(AzureSnapshotRestoreTests.java:84)
   >    at java.lang.Class.forName0(Native Method)
   >    at java.lang.Class.forName(Class.java:348)
Completed [1/6] on J2 in 2.21s, 0 tests, 1 error <<< FAILURES!
```

Closes #26812.

(cherry picked from commit eb6d714 for master branch)
2017-09-28 16:40:16 +01:00
Hendrik Muhs 0358bb5f34 fix of disabling #26812 2017-09-28 16:14:34 +02:00
Hendrik Muhs bf4d3123bd disable AzureSnapshotRestoreTests, see #26812 2017-09-28 15:42:53 +02:00
David Pilato 1ccb497c0d Use Azure upload method instead of our own implementation (#26751)
* Use Azure upload method instead of our own implementation

We are not following the Azure documentation about uploading blobs to Azure storage. https://docs.microsoft.com/en-us/azure/storage/blobs/storage-java-how-to-use-blob-storage#upload-a-blob-into-a-container

Instead we are using our own implementation which might cause some troubles and rarely some blobs can be not immediately commited just after we close the stream. Using the standard implementation provided by Azure team should allow us to benefit from all the magic Azure SDK team already wrote.

And well... Let's just read the doc!

* Adapt integration tests to secure settings

That was a missing part in #23405.

* Simplify all the integration tests and *extends ESBlobStoreRepositoryIntegTestCase tests

    * removes IT `testForbiddenContainerName()` as it is useless. The plugin does not create anymore the container but expects that the user has created it before registering the repository
   * merges 2 IT classes so all IT tests are ran from one single class
   * We don't remove/create anymore the container between each single test but only for the test suite
2017-09-28 13:15:37 +02:00
olcbean 6952f7b560 Validate top-level keys for create index request (#23755) (#23869)
This commit ensures create index requests do not ignore unknown keys passed to the request.

closes #23755
2017-09-26 09:49:20 -07:00
David Pilato 3f71772cd2 Azure snapshots can not be restored anymore (#26778)
While working on #26751 and doing some manual integration testing I found that this #22858 removed an important line of our code:

`AzureRepository` overrides default `initializeSnapshot` method which creates metadata files and do other stuff.

But with PR #22858, I wrote:

```java
    @Override
    public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) {
        if (blobStore.doesContainerExist(blobStore.container()) == false) {
            throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " +
                " creating an azure snapshot repository backed by it.");
        }
    }
```

instead of

```java
    @Override
    public void initializeSnapshot(SnapshotId snapshotId, List<IndexId> indices, MetaData clusterMetadata) {
        if (blobStore.doesContainerExist(blobStore.container()) == false) {
            throw new IllegalArgumentException("The bucket [" + blobStore.container() + "] does not exist. Please create it before " +
                " creating an azure snapshot repository backed by it.");
        }
        super.initializeSnapshot(snapshotId, indices, clusterMetadata);
    }
```

As we never call `super.initializeSnapshot(...)` files are not created and we can't restore what we saved.

Closes #26777.
2017-09-25 14:35:27 -04:00
Simon Willnauer aab4655e63 Unify Settings xcontent reading and writing (#26739)
This change adds a fromXContent method to Settings that allows to read
the xcontent that is produced by toXContent. It also replaces the entire settings
loader infrastructure and removes the structured map representation. Future PRs will
also tackle the `getAsMap` that exposes the internal represenation of settings for
better encapsulation.
2017-09-25 13:23:01 +02:00
Jason Tedor 2e63a13c0a Upgrade to Log4j 2.9.1
This commit upgrades the Log4j dependency, picking up a fix for an issue
with handling stack traces on JDK 9.

Relates #26750
2017-09-22 11:57:06 -04:00
Jason Tedor e0db89bc35 Upgrade to Lucene 7.0.0
This commit upgrades to the GA release of Luence 7!

Relates #26744
2017-09-21 19:19:33 -04:00
James Baiera c760eec054 Add permission checks before reading from HDFS stream (#26716)
Add checks for special permissions before reading hdfs stream data. Also adds test from 
readonly repository fix. MiniHDFS will now start with an existing repository with a single snapshot 
contained within. Readonly Repository is created in tests and attempts to list the snapshots 
within this repo.
2017-09-21 11:55:07 -04:00
kel 601be4f83e Add azure storage endpoint suffix #26432 (#26568)
Allow specifying azure storage endpoint suffix for an azure client.
2017-09-20 22:26:19 -07:00
Ryan Ernst bebff47b5b File Discovery: Remove fallback with zen discovery (#26667)
When adding file based discovery, we added a fallback when the discovery
type was set to zen (the default, so everyone got this warning). This
commit removes the fallback for 6.0. Setting file discovery should now
happen explicitly through the hosts_provider setting.

closes #26661
2017-09-19 16:32:34 -07:00
Jason Tedor bdd9953aa4 Fix discovery-file plugin to use custom config path
The discovery-file plugin was not config path aware, so it always picked
up the default config path (from Elasticsearch home) rather than a
custom config path. This commit fixes the discovery-file plugin to
respect a custom config path.

Relates #26662
2017-09-16 11:00:33 -04:00
Claudio Bley 7184cf8b5b Fix kuromoji default stoptags (#26600)
Initialize the default stop-tags in `KuromojiPartOfSpeechFilterFactory` if the
`stoptags` are not given in the config. Also adding a test which checks that 
part-of-speech tokens are removed when using the kuromoji_part_of_speech 
filter.
2017-09-15 12:25:09 +02:00
Christoph Büscher c7c6443b10 [Docs] "The the" is a great band, but ... (#26644)
Removing several occurrences of this typo in the docs and javadocs, seems to be
a common mistake. Corrections turn up once in a while in PRs, better to correct
some of this in one sweep.
2017-09-14 15:08:20 +02:00
David Pilato b6c6effa2a Move all repository-azure classes under one single package (#26624)
As we did for S3, we can collapse all packages within one single `org.elasticsearch.repositories.azure` package name.

Follow up for https://github.com/elastic/elasticsearch/pull/23518#issuecomment-328903585
2017-09-14 11:56:02 +02:00
David Pilato a34db4e09f Support for accessing Azure repositories through a proxy (#23518)
You can define a proxy using the following settings:

```yml
azure.client.default.proxy.host: proxy.host
azure.client.default.proxy.port: 8888
azure.client.default.proxy.type: http
```

Supported values for `proxy.type` are `direct`, `http` or `socks`. Defaults to `direct` (no proxy).

Closes #23506

BTW I changed a test `testGetSelectedClientBackoffPolicyNbRetries` as it was using an old setting name `cloud.azure.storage.azure.max_retries` instead of `azure.client.azure1.max_retries`.
2017-09-13 11:51:55 +02:00
David Pilato b01b1c2a58 Remove azure deprecated settings (#26099)
Follow up for #23405.

We remove azure deprecated settings in 7.0:

* The legacy azure settings which where starting with `cloud.azure.storage.` prefix have been removed.
This includes `account`, `key`, `default` and `timeout`.
You need to use settings which are starting with `azure.client.` prefix instead.

* Global timeout setting `cloud.azure.storage.timeout` has been removed.
You must set it per azure client instead. Like `azure.client.default.timeout: 10s` for example.
2017-09-12 16:51:44 +02:00
mohit 06150d40a2 update AWS SDK for ECS Task IAM support in discovery-ec2 (#26479)
This commit contains:

* update AWS SDK for ECS Task IAM support
* ignore dependencies not essential to `discovery-ec2`:
  * jmespath seems to be used for `waiters`
  * amazon ion is a protocol not used by EC2 or IAM
2017-09-12 10:34:12 +02:00
etiennecarriere 706067211a Azure repository: Accelerate the listing of files (used in delete snapshot) (#25710)
This commit reworks the azure listing of snapshot files to do a single listing, instead of once per blob.

closes #25424
2017-09-11 16:09:27 -07:00
Adrien Grand 1adee8b5a8 Fix the MapperFieldType.rangeQuery API. (#26552)
RangeQueryBuilder needs to perform too many `instanceof` checks in order to
check for `date` or `range` fields in order to know what it should do with the
shape relation, time zone and date format.

This commit adds those 3 parameters to the `rangeQuery` factory method so that
those instanceof checks are not necessary anymore.
2017-09-11 11:02:05 +02:00
Andy Bristol 33faf5ec70 forbid ICU Collator creation with default locale (#26476) 2017-09-07 14:47:52 -07:00
Jason Tedor f6a489f323 Add Log4j to SLF4J binding for repository-hdfs
This commit adds the Log4j to SLF4J binding JAR to the repository-hdfs
plugin so that SLF4J can detect Log4j at runtime and therefore use the
server Log4j implementation for logging (and the usual Elasticsearch
APIs can be used for setting logging levels).

Relates #26514
2017-09-05 19:38:17 -04:00
Adrien Grand 78681bc9e5 Upgrade to lucene-7.0.0-snapshot-d94a5f0. (#26441) 2017-08-31 09:06:40 +02:00
Andy Bristol e00366ba95 ICU plugin: use root locale by default for collators (#26413)
Calls to Collator.getInstance without arguments returns a
collator that uses the system's default locale, which we don't
want because it makes behavior harder to reproduce. Change it
to always use the root locale instead.

For #25587
2017-08-29 08:58:36 -07:00
Jim Ferenczi 86d97971a4 Remove the _all metadata field (#26356)
* Remove the _all metadata field

This change removes the `_all` metadata field. This field is deprecated in 6
and cannot be activated for indices created in 6 so it can be safely removed in
the next major version (e.g. 7).
2017-08-28 17:43:59 +02:00
Adrien Grand eb782492be Remove support for lenient booleans.
Closes #22298
2017-08-28 09:56:01 +02:00
Nik Everett b3edd11aa0 Allow plugins to plug rescore implementations (#26368)
This allows plugins to plug rescore implementations into
Elasticsearch. While this is a fairly expert thing to do I've
done my best to point folks to the QueryRescorer as one that at
least documents the tradeoffs that it makes. I've attempted to
limit the API surface area by removing `SearchContext` from the
exposed interface, instead exposing just the IndexSearcher and
`QueryShardContext`. I also tried to make some of the class names
more consistent and do some general cleanup while I was there.

I entertained the notion of moving the `QueryRescorer` to module.
After all, it'd be a wonderful test to prove that you can plug
rescore implementation into Elasticsearch if the only built in
rescore implementation is in the module. But I decided against it
because the new module would require a client jar and it'd require
moving some more things around. I think if we really want to do
it, we should do it as a followup.

I did, on the other hand, create an "example" rescore plugin which
should both be a nice example for anyone wanting to plug in their
own rescore implementation and servers as a good integration test
to make sure that you can indeed plug one in.

Closes #26208
2017-08-25 13:46:57 -04:00
Yannick Welsch 3d8feff66e Use Java 9 FilePermission model (#26302)
This commit makes the security code aware of the Java 9 FilePermission changes (see #21534) and allows us to remove the `jdk.io.permissionsUseCanonicalPath` system property.
2017-08-22 11:22:00 +09:30
Matt Weber e89d9400c9 ICUCollationKeywordFieldMapper use SortedSetDocValuesField (#26267)
Switch ICUCollationKeywordFieldMapper from using SortedDocValuesField to SortedSetDocValuesField
so we can support fields with multiple values.
2017-08-21 10:40:56 +02:00
desmorto 292dd8f992 (refactor) some opportunities to use diamond operator (#25585)
* (refactor) some opportunities to use diamond operator

* Update ExceptionRetryIT.java

update typo
2017-08-15 16:36:42 -06:00
David Pilato 80b142d218 Azure repository: Move to named configurations as we do for S3 repository
We should have the same behavior for Azure repositories as we have for S3 (see #22762).

Instead of:

```yml
cloud:
    azure:
        storage:
            my_account1:
                account: your_azure_storage_account1
                key: your_azure_storage_key1
                default: true
            my_account2:
                account: your_azure_storage_account2
                key: your_azure_storage_key2
```

Support something like:

```
azure.client:
            default:
                account: your_azure_storage_account1
                key: your_azure_storage_key1
            my_account2:
                account: your_azure_storage_account2
                key: your_azure_storage_key2
```

Then instead of:

```
PUT _snapshot/my_backup3
{
    "type": "azure",
    "settings": {
        "account": "my_account2"
    }
}
```

Use:

```
PUT _snapshot/my_backup3
{
    "type": "azure",
    "settings": {
        "config": "my_account2"
    }
}
```

If someone uses:

```
PUT _snapshot/my_backup3
{
    "type": "azure"
}
```

It will use the `default` azure repository settings.

And mark as deprecated old settings.

Closes #22763.
2017-08-08 15:14:47 +02:00
Adrien Grand f0c1e30544 Upgrade to lucene-7.0.0-snapshot-a128fcb. (#26090) 2017-08-08 13:03:19 +02:00
Simon Willnauer 82fa531ab4 Remove `_index` fielddata hack if cluster alias is present (#26082)
We introduced a hack in #25885 to respect the cluster alias if available on the `_index` field. This is important if aggregations or other field data related operations are executed. Yet, we added a small hack that duplicated an implementation detail from the `_index` field data builder to make this work. This change adds a necessary but simple API change that allows us to remove the hack and only have a single implementation.
2017-08-08 09:24:24 +02:00
Yannick Welsch 1a01514081 Move tribe to a module (#25778)
This commit moves tribe to a module, stripping core from the tribe functionality.
2017-07-28 11:23:50 +02:00
Tim Brooks 7d2d6bd752 Make calls to CloudBlobContainer#exists privileged (#25937)
This is related to #25931. In CloudBlobContainer#exists it is possible
that a socket connection will be opened. This commit ensures that those
calls have the proper socket privileges.
2017-07-27 22:29:24 -05:00
Tim Brooks 71f58e6f26 Ensure that gcs client creation is privileged (#25938)
This is related to #25932. Currently when we create the
`GoogleCloudStorageService` client we do not wrap that call in a
doPrivileged block. The call might open a connection. This commit
ensures that the creation is wrapped in a doPrivileged block.
2017-07-27 22:28:47 -05:00
Yannick Welsch efd79882a2 Allow build to directly run under JDK 9 (#25859)
With Gradle 4.1 and newer JDK versions, we can finally invoke Gradle directly using a JDK9 JAVA_HOME without requiring a JDK8 to "bootstrap" the build. As the thirdPartyAudit task runs within the JVM that Gradle runs in, it needs to be adapted now to be JDK9 aware.

This commit also changes the `JavaCompile` tasks to only fork if necessary (i.e. when Gradle's JVM and JAVA_HOME's JVM differ).
2017-07-27 16:14:04 +02:00
Simon Willnauer 634ce90dc0 Respect cluster alias in `_index` aggs and queries (#25885)
Today when we aggregate on the `_index` field the cross cluster search
alias is not taken into account. Neither is it respected when we search
on the field. This change adds support for cluster alias when the cluster
alias is present on the `_index` field.

Closes #25606
2017-07-26 09:16:52 +02:00
Adrien Grand 481d5d09b2 Upgrade to lucene-7.0.0-snapshot-00142c9. (#25641)
Lucene 7.0 is feature-frozen now, so there should not be many changes until GA.
2017-07-11 13:58:55 +02:00
Tal Levy 8cf0528001 update ingest-user-agent regexes.yml (#25608)
This new regexes are from:

3153c2f2ae/regexes.yaml
2017-07-10 08:43:11 -07:00
joachimdraeger 1ff2c13472 Avoid SecurityException in repository-S3 on DefaultS3OutputStream.flush() (#25254)
Moved SocketAccess.doPrivileged up the stack to DefaultS3OutputStream in repository-S3 plugin to avoid SecurityException by Streams.copy(). A plugin is only allowed to use its own jars when performing privileged operations. The S3 client might open a new Socket on close(). #25192
2017-07-07 09:26:50 -05:00
Jason Tedor 5f2a0118b8 Fix third party audit for repository-hdfs
This commit fixes the third party audit check for the repository-hdfs
plugin; a class was excluded on JDK 9 that does not need to be.
2017-07-02 16:14:05 -04:00
Simon Willnauer 5a7c8bb04e Cleanup network / transport related settings (#25489)
This commit makes the use of the global network settings explicit instead
of implicit within NetworkService. It cleans up several places where we fall
back to the global settings while we should have used tcp or http ones.

In addition this change also removes unnecessary settings classes
2017-07-02 10:16:50 +02:00
James Baiera 74f4a14d82 Upgrading HDFS Repository Plugin to use HDFS 2.8.1 Client (#25497)
Hadoop 2.7.x libraries fail when running on JDK9 due to the version string changing to a single 
character. On Hadoop 2.8, this is no longer a problem, and it is unclear on whether the fix will be 
backported to the 2.7 branch. This commit upgrades our dependency of Hadoop for the HDFS 
Repository to 2.8.1.
2017-06-30 17:57:56 -04:00
olcbean 3518e313b8 Unify the result interfaces from get and search in Java client (#25361)
As GetField and SearchHitField have the same members, they have been unified into
DocumentField.

Closes #16440
2017-06-29 11:35:28 +02:00
Jason Tedor 5a9fc8aa2a Remove path.conf setting
This commit removes path.conf as a valid setting and replaces it with a
command-line flag for specifying a non-default path for configuration.

Relates #25392
2017-06-26 15:18:29 -04:00
Adrien Grand 44e9c0b947 Upgrade to lucene-7.0.0-snapshot-ad2cb77. (#25349)
Most notable changes:
 - better update concurrency: LUCENE-7868
 - TopDocs.totalHits is now a long: LUCENE-7872
 - QueryBuilder does not remove the boolean query around multi-term synonyms:
   LUCENE-7878
 - removal of Fields: LUCENE-7500

For the `TopDocs.totalHits` change, this PR relies on the fact that the encoding
of vInts and vLongs are compatible: you can write and read with any of them as
long as the value can be represented by a positive int.
2017-06-22 12:35:33 +02:00
joachimdraeger 98b02676d8 Remove redundant and broken MD5 checksum from repository-s3 (#25270)
Remove redundant and not resettable (fails on retries) check-summing. Checksums are calculated and compared by the S3 client already. 

Closes #25269
2017-06-21 15:41:17 -04:00
Nik Everett 21b1db2965 Remove assemble from build task when assemble removed
Removes the `assemble` task from the `build` task when we have
removed `assemble` from the project. We removed `assemble` from
projects that aren't published so our releases will be faster. But
That broke CI because CI builds with `gradle precommit build` and,
it turns out, that `build` includes `check` and `assemble`. With
this change CI will only run `check` for projects without an
`assemble`.
2017-06-16 17:19:14 -04:00
Nik Everett 7b358190d6 Remove assemble task when not used for publishing (#25228)
Removes the `assemble` task from projects that are not published.
This should speed up `gradle assemble` by skipping projects that
don't need to be built. Which is useful because `gradle assemble`
is how we cut releases.
2017-06-16 11:46:34 -04:00
David Causse ff9edb627e [analysis-icu] Allow setting unicodeSetFilter (#20814)
UnicodeSetFilter was only allowed in the icu_folding token filter.
It seems useful to expose this setting in icu_normalizer token filter
and char filter.
2017-06-16 11:08:39 +02:00
Jim Ferenczi 0036f28a6a Upgrade icu4j for the ICU analysis plugin to 59.1 (#25243)
* Upgrade icu4j for the ICU analysis plugin to 59.1

Lucene upgraded to 59.1 so we should use the same.

Closes #21425

* Add breaking change for the icu upgrade
2017-06-15 13:26:48 +02:00
Adrien Grand 0c117145f6 Upgrade to lucene-7.0.0-snapshot-92b1783. (#25222)
This snapshot has faster range queries on range fields (LUCENE-7828), more
accurate norms (LUCENE-7730) and the ability to use fake term frequencies
(LUCENE-7854).
2017-06-15 09:52:07 +02:00
Ryan Ernst caf7792db1 Scripting: Rename SearchScript.needsScores to needs_score (#25235)
This commit renames the needsScores method so as to make it
automatically generatable, based on the name of the `_score` variable
which is available in search scripts. It also adds documentation to
ScriptContext to explain the naming and signature of such methods.
2017-06-14 22:01:19 -07:00
John Murphy c652b586c4 Remove `discovery.type` BWC layer from the EC2/Azure/GCE plugins #25080
Those plugins don't replace the discovery logic but rather only provide a custom unicast host provider for their respective platforms. in 5.1 we introduced the  `discovery.zen.hosts_provider` setting to better reflect it. This PR removes BWC code in those plugins as it is not needed anymore

Fixes #24543
2017-06-14 13:52:48 +02:00
Adis Nezirović 82897e2636 Port support for commercial GeoIP2 databases from Logstash. (#24889)
* Port support for commercial GeoIP2 databases from Logstash.

* Match GeoIP databases according to the database name suffix.

* Rename CITY/COUNTRY_DB_TYPE, since they are suffixes now.
2017-06-13 14:20:01 -07:00
Jason Tedor 8de6f4e608 Fix secure repository-hdfs tests on JDK 9
The secure repository-hdfs tests fail on JDK 9 because some Hadoop code
reaches into sun.security.krb5. This commit adds the necessary flags to
open the java.security.jgss module. Note that these flags are actually
needed at runtime as well when using secure repository-hdfs. For now we
will punt on how best to help users obtain this when running on JDK 9
with this plugin.

Relates #25205
2017-06-13 13:26:48 -04:00
James Baiera 2e29b69f6a Revert "Revert "Sense for VirtualBox and $HOME when deciding to turn on vagrant testing. (#24636)""
This reverts commit b9e2a1f989.
2017-06-12 09:41:35 -04:00
Ryan Ernst a03b6c2fa5 Scripting: Change keys for inline/stored scripts to source/id (#25127)
This commit adds back "id" as the key within a script to specify a
stored script (which with file scripts now gone is no longer ambiguous).
It also adds "source" as a replacement for "code". This is in an attempt
to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.
2017-06-09 08:29:25 -07:00
Jason Tedor 2f5f27fafa Remove unnecessary callback interface
We have a callback interface that is not needed because it is
effectively the same as java.util.function.Consumer. This commit removes
it.

Relates #25089
2017-06-06 20:50:03 -04:00
Lee Hinman b9e2a1f989 Revert "Sense for VirtualBox and $HOME when deciding to turn on vagrant testing. (#24636)"
This reverts commit 4ed0abe72d.
2017-06-02 14:42:52 -06:00
James Baiera 4ed0abe72d Sense for VirtualBox and $HOME when deciding to turn on vagrant testing. (#24636)
We're using Vagrant in more places now than before. This commit includes a plugin that verifies
the Vagrant and Virtualbox installations for projects that depend on them. This shared code
should fix up the errors we've seen from CI builds relating to the new Kerberos fixture.
2017-06-02 16:26:11 -04:00
Colin Goodheart-Smithe 779fb9a1c0 Adds nodes usage API to monitor usages of actions (#24169)
* Adds nodes usage API to monitor usages of actions

The nodes usage API has 2 main endpoints

/_nodes/usage and /_nodes/{nodeIds}/usage return the usage statistics
for all nodes and the specified node(s) respectively.

At the moment only one type of usage statistics is available, the REST
actions usage. This records the number of times each REST action class is
called and when the nodes usage api is called will return a map of rest
action class name to long representing the number of times each of the action
classes has been called.

Still to do:

* [x] Create usage service to store usage statistics
* [x] Record usage in REST layer
* [x] Add Transport Actions
* [x] Add REST Actions
* [x] Tests
* [x] Documentation

* Rafactors UsageService so counts are done by the handlers

* Fixing up docs tests

* Adds a name to all rest actions

* Addresses review comments
2017-06-02 08:46:38 +01:00
Ryan Ernst 7c1211d2ed Scripting: Add StatefulFactoryType as optional intermediate factory in script contexts (#24974)
ScriptContexts currently understand a FactoryType that can produce
instances of the script InstanceType. However, for search scripts, this
does not work as we have the concept of LeafSearchScript that is created
per lucene segment. This commit effectively renames the existing
SearchScript class into SearchScript.LeafFactory, which is a new,
optional, class that can be defined within a ScriptContext.
LeafSearchScript is effectively renamed back into SearchScript. This
change allows the model of stateless factory -> stateful factory ->
script instance to continue, but in a generic way that any script
context may take advantage of.

relates #20426
2017-05-30 16:32:14 -07:00
Ryan Ernst 74e031e842 Scripting: Rename CompiledType to FactoryType in ScriptContext (#24897)
This commit renames the concept of the "compiled type" to a "factory
type", along with all implementations of this class to be named Factory.
This brings it inline with the classes purpose.
2017-05-26 00:02:54 -07:00
Ryan Ernst 8eab1fefa1 Scripting: Make contexts available to ScriptEngine construction (#24896)
This commit adds collection of all contexts to the parameters of
getScriptEngine. This will allow script engines like painless to
precache extra information about the contexts.
2017-05-25 16:55:47 -07:00
Ryan Ernst 8aaea51a0a Scripting: Move context definitions to instance type classes (#24883)
This is a simple refactoring to move the context definitions into the
type that they use. While we have multiple context names for the same
class at the moment, this will eventually become one ScriptContext per
instance type, so the pattern of a static member on the interface called
CONTEXT can be used. This commit also moves the consolidated list of
contexts provided by core ES into ScriptModule.
2017-05-25 12:18:45 -07:00
Ryan Ernst 59c052e76f Build: Fix hadoop integ test error on windows (#24885)
This commit fixes the error message to escape the dollar sign for
referencing a literal `$HADOOP_HOME`, which caused an error while trying
to generate an error.

closes #24878
2017-05-25 12:11:33 -07:00
Ryan Ernst 7d03cff820 Scripting: Make ScriptEngine.compile generic on the script context (#24873)
This commit changes the compile method of ScriptEngine to be generic in
the same way it is on ScriptService. This moves the shim of handling the
two existing context classes into each script engine, so that each
engine can be worked on independently to convert to real handling of
contexts.
2017-05-24 20:06:32 -07:00
Ryan Ernst 1daacd97b0 Scripting: Add instance and compiled classes to script contexts (#24868)
This commit modifies the compile method of ScriptService to be context
aware. The ScriptContext is now a generic class which contains both the
instance type and compiled type for a script. Instance type may be
stateful (for example, pre loading field information for the index a
script will execute on, like in expressions), while the compiled type is
stateless and used to construct instance type instances. This change is
only a first step to cutover ScriptService to the new paradigm. It only
converts callers to the script service, and has a small shim to wrap
compilation from the script engines to support the current two fixed
instance types, SearchScript and ExecutableScript.
2017-05-24 14:29:02 -07:00
Ryan Ernst 0ddd219423 Scripting: Add default implementation of close() for ScriptEngine (#24851)
Since groovy was removed, we no longer have any ScriptEngines with
resources to release. We may want to keep the option open for a script
engine to close resources, but this would not be common. This commit
adds a default implementation to ScriptEngine for `close()` to reduce
the boiler plate that must be added for a ScriptEngine implementation.
2017-05-24 13:19:27 -07:00
Jim Ferenczi 4e70235d55 Upgrade icu4j to latest version (#24821) 2017-05-22 09:34:50 +02:00
Ryan Ernst 2de748859f Scripting: Remove "inline script enabled" on script engines (#24815)
ScriptEngine implementations have an overridable method to indicate they
are safe to use as inline scripts. Since groovy was removed fro 6.0,
there are no longer any implementations which used the default false
value. Furthermore, the value was not actually read anywhere. This
commit removes the method. The ScriptEngineRegistry was also no longer
necessary as it only was used to build a map from language to engine.
2017-05-20 12:01:25 -07:00
Nicholas Knize deb7caf4d3 Upgrade to lucene-7.0.0-snapshot-a0aef2f
This commit upgrades master to a current lucene snapshot with commit id a0aef2f.
2017-05-19 10:20:55 -05:00
Jack Conradson 1196dfb6bb Remove Deprecated Script Settings (#24756)
Removes all fine-grained script settings replaced by scripts.types_allowed and scripts.contexts_allowed.
2017-05-18 13:32:46 -07:00
Ryan Ernst b214b80e6c GCS Repository: Remove specifying credential file on disk (#24727)
This commit removes the ability to specify the google credential json
file on disk, which is deprecated in 5.5.0.
2017-05-18 10:22:29 -07:00
Ryan Ernst 2a65bed243 Tests: Change rest test extension from .yaml to .yml (#24659)
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
2017-05-16 17:24:35 -07:00
Ryan Ernst d74760c306 GCS Repository: Add secure storage of credentials (#24697)
This commit adds gcs credential settings to the elasticsearch keystore.
The setting name follows the same pattern as the s3 client settings,
beginning with `gcs.client.`, followed by the client name, and then the
setting name, in this case, `credentials_file`. Using the legacy service
file setting is also deprecated.
2017-05-16 17:17:37 -07:00
Koen De Groote 878ae8eb3c Size lists in advance when known
When constructing an array list, if we know the size of the list in
advance (because we are adding objects to it derived from another list),
we should size the array list to the appropriate capacity in advance (to
avoid resizing allocations). This commit does this in various places.

Relates #24439
2017-05-12 10:36:13 -04:00
Dimitris Athanasiou b7976bd536 [TEST] Temporarily disable the secure fixture for hdfs tests (#24643)
This keeps failing the build so I am temporarily disabling it
until #24636 gets merged.
2017-05-12 12:58:30 +01:00
Ryan Ernst c1f1f66509 Scripting: Replace advanced and native scripts with ScriptEngine docs (#24603)
This commit documents how to write a `ScriptEngine` in order to use
expert internal apis, such as using Lucene directly to find index term
statistics. These documents prepare the way to remove both native
scripts and IndexLookup.

The example java code is actually compiled and tested under a new gradle
subproject for example plugins. This change does not yet breakup
jvm-example into the new examples dir, which should be done separately.

relates #19359
relates #19966
2017-05-11 12:15:16 -07:00
Ryan Ernst 17d01550c2 S3 Repository: Add back repository level credentials (#24609)
Specifying s3 access and secret keys inside repository settings are not
secure. However, until there is a way to dynamically update secure
settings, this is the only way to dynamically add repositories with
credentials that are not known at node startup time. This commit adds
back `access_key` and `secret_key` s3 repository settings, but protects
it with a required system property `allow_insecure_settings`.
2017-05-11 12:14:23 -07:00
Ryan Ernst 0789a74055 S3 Repository: Remove deprecated settings (#24445)
These settings are deprecated in 5.5. This change removes them for 6.0.
2017-05-10 20:12:17 -07:00
James Baiera 6a113ae499 Introduce Kerberos Test Fixture for Repository HDFS Security Tests (#24493)
This PR introduces a subproject in test/fixtures that contains a Vagrantfile used for standing up a 
KRB5 KDC (Kerberos). The PR also includes helper scripts for provisioning principals, a few 
changes to the HDFS Fixture to allow it to interface with the KDC, as well as a new suite of 
integration tests for the HDFS Repository plugin.

The HDFS Repository plugin senses if the local environment can support the HDFS Fixture 
(Windows is generally a restricted environment). If it can use the regular fixture, it then tests if 
Vagrant is installed with a compatible version to determine if the secure test fixtures should be 
enabled. If the secure tests are enabled, then we create a Kerberos KDC fixture, tasks for adding 
the required principals, and an HDFS fixture configured for security. A new integration test task is 
also configured to use the KDC and secure HDFS fixture and to run a testing suite that uses 
authentication. At the end of the secure integration test the fixtures are torn down.
2017-05-10 17:42:20 -04:00
Matt Weber b24326271e Add ICUCollationFieldMapper (#24126)
Adds a new "icu_collation" field type that exposes lucene's
ICUCollationDocValuesField.  ICUCollationDocValuesField is the replacement
for ICUCollationKeyFilter which has been deprecated since Lucene 5.
2017-05-10 10:35:11 +02:00
Nik Everett bb06d8ec4f Allow plugins to build pre-configured token filters (#24223)
This changes the way we register pre-configured token filters so that
plugins can declare them and starts to move all of the pre-configured
token filters out of core. It doesn't finish the job because doing
so would make the change unreviewably large. So this PR includes
a shim that keeps the "old" way of registering pre-configured token
filters around.

The Lowercase token filter is special because there is a "special"
interaction between it and the lowercase tokenizer. I'm not sure
exactly what to do about it so for now I'm leaving it alone with
the intent of figuring out what to do with it in a followup.

This also renames these pre-configured token filters from
"pre-built" to "pre-configured" because that seemed like a more
descriptive name.

This is a part of #23658
2017-05-09 14:50:49 -04:00
Yannick Welsch c8712e9531 Limit AllocationService dependency injection hack (#24479)
Changes the scope of the AllocationService dependency injection hack so that it is at least contained to the AllocationService and does not leak into the Discovery world.
2017-05-05 08:39:18 +02:00
James Baiera f5edd5049a Fixing permission errors for `KERBEROS` security mode for HDFS Repository (#23439)
Added missing permissions required for authenticating with Kerberos to HDFS. Also implemented 
code to support authentication in the form of using a Kerberos keytab file. In order to support 
HDFS authentication, users must install a Kerberos keytab file on each node and transfer it to the 
configuration directory. When a user specifies a Kerberos principal in the repository settings the 
plugin automatically enables security for Hadoop and begins the login process. There will be a 
separate PR and commit for the testing infrastructure to support these changes.
2017-05-04 10:51:31 -04:00
James Baiera d928ae210d Add Vagrant based testing fixture (#24249) 2017-05-04 10:17:55 -04:00
Koen De Groote 0fef5acd01 Cleanup collections construction
This commit cleans up some cases where a list or map was being
constructed, and then an existing collection was copied into the new
collection. The clean is to instead use an appropriate constructor to
directly copy the existing collection in during collection
construction. The advantage of this is that the new collection is sized
appropriately.

Relates #24409
2017-04-30 21:26:51 -04:00
Yannick Welsch 35f78d098a Separate publishing from applying cluster states (#24236)
Separates cluster state publishing from applying cluster states:

- ClusterService is split into two classes MasterService and ClusterApplierService. MasterService has the responsibility to calculate cluster state updates for actions that want to change the cluster state (create index, update shard routing table, etc.). ClusterApplierService has the responsibility to apply cluster states that have been successfully published and invokes the cluster state appliers and listeners.
- ClusterApplierService keeps track of the last applied state, but MasterService is stateless and uses the last cluster state that is provided by the discovery module to calculate the next prospective state. The ClusterService class is still kept around, which now just delegates actions to ClusterApplierService and MasterService.
- The discovery implementation is now responsible for managing the last cluster state that is used by the consensus layer and the master service. It also exposes the initial cluster state which is used by the ClusterApplierService. The discovery implementation is also responsible for adding the right cluster-level blocks to the initial state.
- NoneDiscovery has been renamed to TribeDiscovery as it is exclusively used by TribeService. It adds the tribe blocks to the initial state.
- ZenDiscovery is synchronized on state changes to the last cluster state that is used by the consensus layer and the master service, and does not submit cluster state update tasks anymore to make changes to the disco state (except when becoming master).

Control flow for cluster state updates is now as follows:

- State updates are sent to MasterService
- MasterService gets the latest committed cluster state from the discovery implementation and calculates the next cluster state to publish
- MasterService submits the new prospective cluster state to the discovery implementation for publishing
- Discovery implementation publishes cluster states to all nodes and, once the state is committed, asks the ClusterApplierService to apply the newly committed state.
- ClusterApplierService applies state to local node.
2017-04-28 09:34:31 +02:00
Ryan Ernst 4a5c3c5a4a Test: Write node ports file before starting tribe service (#24351)
The tribe service can take a while to initialize, depending on how many cluster it needs to connect to. This change moves writing the ports file used by tests to before the tribe service is started.
2017-04-27 09:59:54 +02:00
Ryan Ernst 51b33f1fd5 S3 Repository: Deprecate remaining `repositories.s3.*` settings (#24144)
Most of these settings should always be pulled from the repository
settings. A couple were leftover that should be moved to client
settings. The path style access setting should be removed altogether.
This commit adds deprecations for all of these existing settings, as
well as adding new client specific settings for max retries and
throttling.

relates #24143
2017-04-25 23:43:20 -07:00
Nik Everett caf376c8af Start building analysis-common module (#23614)
Start moving built in analysis components into the new analysis-common
module. The goal of this project is:
1. Remove core's dependency on lucene-analyzers-common.jar which should
shrink the dependencies for transport client and high level rest client.
2. Prove that analysis plugins can do all the "built in" things by moving all
"built in" behavior to a plugin.
3. Force tests not to depend on any oddball analyzer behavior. If tests
need anything more than the standard analyzer they can use the mock
analyzer provided by Lucene's test infrastructure.
2017-04-19 18:51:34 -04:00
Ryan Ernst 151a65ed17 Ec2 Discovery: Cleanup deprecated settings (#24150)
This commit removes the deprecated cloud.aws.* settings. It also removes
backcompat for specifying `discovery.type: ec2`, and unused aws signer
code which was removed in a previous PR.
2017-04-19 12:06:10 -07:00
Ryan Ernst 212f24aa27 Tests: Clean up rest test file handling (#21392)
This change simplifies how the rest test runner finds test files and
removes all leniency.  Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.

closes #20240
2017-04-18 15:07:08 -07:00
Adrien Grand 4632661bc7 Upgrade to a Lucene 7 snapshot (#24089)
We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now function of the number of docs that have a value rather than the total number of docs in the index.

Some notes about the change:
 - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors
 - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed
 - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge
2017-04-18 15:17:21 +02:00
Ryan Ernst a8083f3d76 S3 Repository: Remove unused files (#24145)
These were leftover from the removal of the signer type setting in
2017-04-18 01:19:25 -07:00
Ryan Ernst a8017ff020 Tests: Move cluster dependencies from runner to cluster (#24142)
After splitting integ tests into cluster configuration and the test
runner task, we still have dependencies of the test runner added as deps
of the cluster. This commit adds dependencies directly to the cluster,
so that the runner can have other dependencies independent of what is
needed for the cluster.
2017-04-17 16:02:46 -07:00
Ryan Ernst 1629c9fd5c S3 Repository: Cleanup deprecated settings (#24097)
This commit removes all deprecated settings which start with
`cloud.aws`, `repositories.s3` and repository level client settings.
2017-04-17 14:18:49 -07:00
Ryan Ernst 1207103b6d S3 Repository: Eagerly load static settings (#23910)
The S3 repostiory has many levels of settings it looks at to create a
repository, and these settings were read at repository creation time.
This meant secure settings like access and secret keys had to be
available after node construction. This change makes setting loading for
every except repository level settings eager, so that secure settings
can be stashed, and the keystore can once again be closed after
bootstrapping the node is complete.
2017-04-11 15:42:56 -07:00
Colin Goodheart-Smithe 0114f0061c Removes version 2.x constants from Version (#24011)
* Removes version 2.x constants from Version

Closes #21887

* Addresses review comments
2017-04-11 08:31:22 +01:00
Ryan Ernst dd3c1137a4 Repository S3: Simplify client method (#24034)
This commit removes passing the repository metadata object through to
s3 client creation. It is not needed, and in fact in tests was confusing
because you could create the metadata but have it contain different
settings than were passed in as repository settings.
2017-04-10 14:43:34 -07:00
Ryan Ernst 83ba677e7f Discovery EC2: Remove region setting (#23991)
We have both endpoint and region settings. Region was removed from s3 to
simplify configuration. This is the ec2 equivalent.

closes #22758
2017-04-07 22:06:40 -07:00
Ryan Ernst 05e2ea1aef AWS Plugins: Remove signer type setting (#23984)
This commit removes support for s3 signer type in 6.0, and adds a note
to the migration guide.

closes #22599
2017-04-07 16:46:17 -07:00
Ryan Ernst 73b8aad9a3 Settings: Disallow secure setting to exist in normal settings (#23976)
This commit removes the "legacy" feature of secure settings, which setup
a parallel setting that was a fallback in the insecure
elasticsearch.yml. This was previously used to allow the new secure
setting name to be that of the old setting name, but is now not in use
due to other refactorings. It is much cleaner to just have all secure
settings use new setting names. If in the future we want to reuse the
previous setting name, once support for the insecure settings have been
removed, we can then rename the secure setting.  This also adds a test
for the behavior.
2017-04-07 14:18:06 -07:00
Ryan Ernst 6e0b445abb Add registration of new discovery settings
This was forgotten as part of #23961
2017-04-07 14:07:59 -07:00
Ryan Ernst d4c0ef0028 Settings: Migrate ec2 discovery sensitive settings to elasticsearch keystore (#23961)
This change adds secure settings for access/secret keys and proxy
username/password to ec2 discovery.  It adds the new settings with the
prefix `discovery.ec2`, copies other relevant ec2 client settings to the
same prefix, and deprecates all other settings (`cloud.aws.*` and
`cloud.aws.ec2.*`).  Note that this is simpler than the client configs
in repository-s3 because discovery is only initialized once for the
entire node, so there is no reason to complicate the configuration with
the ability to have multiple sets of client settings.

relates #22475
2017-04-07 13:28:15 -07:00
Ryan Ernst 776006bac5 Collapse repository gcs classes into a single java package (#23975)
This is a single reorge of the classes to simplify making them mostly
package protected.
2017-04-07 11:27:26 -07:00
Ali Beyad ac87d40bd5 Removes unused S3BlobStore#shouldRetry() method 2017-04-06 20:58:12 -04:00
Ali Beyad 4f121744bd Removes the retry mechanism from the S3 blob store (#23952)
Currently, both the Amazon S3 client provides a retry mechanism, and the
S3 blob store also attempts retries for failed read/write requests.
Both retry mechanisms are controlled by the
`repositories.s3.max_retries` setting.  However, the S3 blob store retry
mechanism is unnecessary because the Amazon S3 client provided by the
Amazon SDK already handles retries (with exponential backoff) based on
the provided max retry configuration setting (defaults to 3) as long as
the request is retryable.  Hence, this commit removes the unneeded retry
logic in the S3 blob store and the S3OutputStream.

Closes #22845
2017-04-06 19:58:53 -04:00
Ryan Ernst 203f8433c2 Collapse packages in ec2 discovery plugin (#23909)
This commit collapses all the classes inside ec2 discovery to a single
package name.
2017-04-05 23:51:49 -07:00
Ryan Ernst d31d2caf09 Collapse packages in repository-s3 (#23907)
This commit puts all the classes in the repository-s3 plugin into a
single package.  In addition to simplifying the plugin, it will make it
easier to test as things that should be package private will not be
difficult to use inside tests alone.
2017-04-04 15:15:25 -07:00
Jason Tedor 3136ed1490 Rename random ASCII helper methods
This commit renames the random ASCII helper methods in ESTestCase. This
is because this method ultimately uses the random ASCII methods from
randomized runner, but these methods actually only produce random
strings generated from [a-zA-Z].

Relates #23886
2017-04-04 11:04:18 -04:00
Boaz Leskes ad6eea92d6 GceDiscoverTests - remove intitial_state_timeout 2017-04-03 16:50:40 +02:00
David Pilato 17be03e85e Add Backoff policy to azure repository
With this commit, Azure repositories are now using an Exponential Backoff policy before failing the backup.
It uses Azure SDK default values for this policy:

* `30s` delta backoff base with
   * `3s` min
   * `90s` max
* `3` retries max

Users can define the number of retries they wish by setting `cloud.azure.storage.xxx.max_retries` where `xxx` is the azure named account.

Closes #22728.
2017-04-03 10:52:44 +02:00
David Pilato f5d41dfc9d Merge branch 'pr/remove-repositories-azure-settings' 2017-03-31 12:33:12 +02:00
David Pilato e634d89825 Merge branch 'pr/23448-update-azure-storage' 2017-03-30 18:40:16 +02:00
Jim Ferenczi 0e95c90e9f Upgrade to Lucene 6.5.0 (#23750) 2017-03-27 15:57:54 +02:00
AdityaJNair 63757efe9c Remove DocumentMapper#parse(String index, String type, String id, BytesReference source) (#23706)
Removed `parse(String index, String type, String id, BytesReference source)` in DocumentMapper.java and replaced all of its use in Test files with `parse(SourceToParse source)`.

`parse(String index, String type, String id, BytesReference source)` was only used in test files and never in the main code so it was removed. All of the test files that used it was then modified to use `parse(SourceToParse source)` method that existing in DocumentMapper.java
2017-03-23 11:01:09 -04:00
Jason Tedor 2517cb3062 Fix line-length violations in gce/util/Access
This commit addresses all 100-column line-length violations in
gce/util/Access.java and removes this file from the suppressions list.
2017-03-22 21:34:15 -04:00
Ryan Ernst f8453aca57 Packaging: Remove classpath ordering hack (#23596)
After the removal of the joda time hack we used to have, we can cleanup
the codebase handling in security, jarhell and plugins to be more picky
about uniqueness. This was originally in #18959 which was never merged.

closes #18959
2017-03-21 12:12:16 -07:00
Boaz Leskes c0cafa786b UnicastZenPing shouldn't ping the address of the local node (#23567)
Pinging the local node address doesn't really add to discovering other nodes. It just pollutes the logs with unneeded information.
2017-03-14 07:02:42 -07:00
David Pilato 9bd3d7cca8 Update to Azure Storage 5.0.0
Closes #23448.
2017-03-08 21:56:19 -08:00
Ali Beyad 3dff0d0de2 Azure blob store's readBlob() method first checks if the blob exists (#23483)
Previously, the Azure blob store would depend on a 404 StorageException
coming back from Azure if trying to open an input stream to a
non-existent blob. This works for Azure repositories which access a
primary location path. For those configured to access a secondary
location path, the Azure SDK keeps trying for a long while before
returning a 404 StorageException, causing potential delays in the
snapshot APIs. This commit makes an initial check if the blob exists in
Azure and returns immediately with a NoSuchFileException, instead of
trying to open the input stream to the blob.

Closes #23480
2017-03-03 17:01:51 -05:00
Luca Cavanna cc65a94fd4 [TEST] improve yaml test sections parsing (#23407)
Throw error when skip or do sections are malformed, such as they don't start with the proper token (START_OBJECT). That signals bad indentation, which would be ignored otherwise. Thanks (or due to) our pull parsing code, we were still able to properly parse the sections, yet other runners weren't able to.

Closes #21980

* [TEST] fix indentation in matrix_stats yaml tests

* [TEST] fix indentation in painless yaml test

* [TEST] fix indentation in analysis yaml tests

* [TEST] fix indentation in generated docs yaml tests

* [TEST] fix indentation in multi_cluster_search yaml tests
2017-03-02 12:43:20 +01:00
Jason Tedor b9622251fe Correct version on repository-hdfs Guava dependency
This commit sets the version on the repository-hdfs Guava dependency to
version 11.0.2. This change is made to align the version here with the
version that is defined in the POM for Hadoop 2.7.1, the version of
Hadoop that the repository-hdfs plugin is based on. See HADOOP-10101 and
HADOOP-11319 for the ridiculous history of trying to upgrade Guava past
this version in the Hadoop project.

Relates #23420
2017-03-01 16:29:06 -05:00
Jason Tedor ee2f6ccf32 Add convenience method for asserting deprecations
This commit adds a convenience method for simultaneously asserting
settings deprecations and other warnings and fixes some tests where
setting deprecations and general warnings were present.
2017-02-28 18:24:39 -05:00
Jim Ferenczi 5c84640126 Upgrade to lucene-6.5.0-snapshot-d00c5ca (#23385)
Lucene upgrade
2017-02-27 18:39:04 +01:00
Jason Tedor 577e6a5e14 Correct warning header to be compliant
The warning header used by Elasticsearch for delivering deprecation
warnings has a specific format (RFC 7234, section 5.5). The format
specifies that the warning header should be of the form

    warn-code warn-agent warn-text [warn-date]

Here, the warn-code is a three-digit code which communicates various
meanings. The warn-agent is a string used to identify the source of the
warning (either a host:port combination, or some other identifier). The
warn-text is quoted string which conveys the semantic meaning of the
warning. The warn-date is an optional quoted date that can be in a few
different formats.

This commit corrects the warning header within Elasticsearch to follow
this specification. We use the warn-code 299 which means a
"miscellaneous persistent warning." For the warn-agent, we use the
version of Elasticsearch that produced the warning. The warn-text is
unchanged from what we deliver today, but is wrapped in quotes as
specified (this is important as a problem that exists today is that
multiple warnings can not be split by comma to obtain the individual
warnings as the warnings might themselves contain commas). For the
warn-date, we use the RFC 1123 format.

Relates #23275
2017-02-27 12:14:21 -05:00
javanna 2f6a6090b8 [TEST] don't check exact size in mapper-size yaml test
Rather test that the size is present and greather than zero. The actual size depends on the content-type, which is randomized.
2017-02-27 12:27:03 +01:00
Martijn van Groningen 211d50f7b8 [INGEST] Lazy load the geoip databases.
Load the geoip database the first time a pipeline gets created that has a geoip processor.
This saves memory (measured ~150MB for the city db) in cases when the plugin is installed, but not used.
2017-02-24 08:52:27 +01:00
Tim Brooks 0e802961f1 Test that buildCredentials returns correct clazz (#23334)
This is fallout from #23297. That commit wrapped
`InstanceProfileCredentialsProvider` to ensure that the `getCredentials`
and `refresh` methods had privileged access. However, it looks like
there was a test ensuring that `buildCredentials` returned the correct
clazz type. This commit adjusts that test to check that the correct
wrapper is returned.
2017-02-23 17:33:15 -06:00
Ryan Ernst 0b4834f7da Test: Fix hdfs test fixture setup on windows
The test setup for hdfs is a little complicated for windows, needing to
check if the hdfs fixture can be run at all. This was unfortunately not
updated when the integ tests were reorganized into separate runner and
cluster setups.
2017-02-23 11:20:41 -08:00
Christoph Büscher 12b143e871 Tests: fix AwsS3ServiceImplTests 2017-02-23 19:06:35 +01:00
Tim Brooks a4afc22df6 Wrap getCredentials() in a doPrivileged() block (#23297)
This commit fixes an issue that was missed in #22534.
`AWSCredentialsProvider.getCredentials()` appears to potentially open a
socket connect. This operation needed to be wrapped in `doPrivileged()`.

This should fix issue #23271.
2017-02-23 08:59:42 -06:00
Ryan Ernst 175bda64a0 Build: Rework integ test setup and shutdown to ensure stop runs when desired (#23304)
Gradle's finalizedBy on tasks only ensures one task runs after another,
but not immediately after. This is problematic for our integration tests
since it allows multiple project's integ test clusters to be
simultaneously. While this has not been a problem thus far (gradle 2.13
happened to keep the finalizedBy tasks close enough that no clusters
were running in parallel), with gradle 3.3 the task graph generation has
changed, and numerous clusters may be running simultaneously, causing
memory pressure, and thus generally slower tests, or even failure if the
system has a limited amount of memory (eg in a vagrant host).

This commit reworks how integ tests are configured. It adds an
`integTestCluster` extension to gradle which is equivalent to the current
`integTest.cluster` and moves the rest test runner task to
`integTestRunner`.  The `integTest` task is then just a dummy task,
which depends on the cluster runner task, as well as the cluster stop
task. This means running `integTest` in one project will both run the
rest tests, and shut down the cluster, before running `integTest` in
another project.
2017-02-22 12:43:15 -08:00
David Pilato da907e7a7d Remove global `repositories.azure` settings
Today we have multiple ways to define settings when a user needs to create a repository:

* in `elasticsearch.yml` file using `repositories.azure` prefix
* when creating the repository itself with `PUT _snaphot/repo`

The plan is to:

* Deprecate `repositories.azure` settings in 5.x (done with #22856)
* Remove in 6.x (this PR)

Related to #22800
2017-02-20 12:22:54 +01:00
David Pilato 76675229c7 Merge branch 'fix/22077-ingest-attachment' 2017-02-16 15:49:04 +01:00
Ryan Ernst 6cdf4f3f72 Plugins: Include license and notice files in zip (#23191)
This commit adds the elasticsearch LICENSE.txt to all plugins that
released with elasticsearch, as well as a generated NOTICE.txt specific
to the dependencies of each plugin.
2017-02-15 11:23:12 -08:00
Yannick Welsch 1aefbf57e1 Fix tests that check for deprecation message 2017-02-15 09:35:02 +01:00
Adrien Grand 709cc9ba65 Upgrade to lucene-6.5.0-snapshot-f919485. (#23087) 2017-02-10 15:08:47 +01:00
Simon Willnauer ecb01c15b9 Fold InternalSearchHits and friends into their interfaces (#23042)
We have a bunch of interfaces that have only a single implementation
for 6 years now. These interfaces are pretty useless from a SW development
perspective and only add unnecessary abstractions. They also require
lots of casting in many places where we expect that there is only one
concrete implementation. This change removes the interfaces, makes
all of the classes final and removes the duplicate `foo` `getFoo` accessors
in favor of `getFoo` from these classes.
2017-02-08 14:40:08 +01:00
Tim Brooks fcc568fd8d Add methods requiring connect to forbidden apis (#22964)
This is related to #22116. This commit adds calls that require
SocketPermission connect to forbidden APIs.

The following calls are now forbidden:

- java.net.URL#openStream()
- java.net.URLConnection#connect()
- java.net.URLConnection#getInputStream()
- java.net.Socket#connect(java.net.SocketAddress)
- java.net.Socket#connect(java.net.SocketAddress, int)
- java.nio.channels.SocketChannel#open(java.net.SocketAddress)
- java.nio.channels.SocketChannel#connect(java.net.SocketAddress)
2017-02-07 14:41:50 -06:00
Ryan Ernst 470ad1ae4a Settings: Add secure settings validation on startup (#22894)
Secure settings from the elasticsearch keystore were not yet validated.
This changed improves support in Settings so that secure settings more
seamlessly blend in with normal settings, allowing the existing settings
validation to work. Note that the setting names are still not validated
(yet) when using the elasticsearc-keystore tool.
2017-02-07 09:34:41 -08:00
Tim Brooks 27b7d9bd8d Add FileSystemUtil method to read 'file:/' URLs (#23020)
As part of #22116 we are going to forbid usage of api
java.net.URL#openStream(). However in a number of places across the
we use this method to read files from the local filesystem. This commit
introduces a helper method openFileURLStream(URL url) to read files
from URLs. It does specific validation to only ensure that file:/
urls are read.

Additionlly, this commit removes unneeded method
FileSystemUtil.newBufferedReader(URL, Charset). This method used the
openStream () method which will soon be forbidden. Instead we use the
Files.newBufferedReader(Path, Charset).
2017-02-07 10:24:22 -06:00
Adrien Grand c8496fc4f4 Upgrade to Lucene 6.4.1. (#22978) 2017-02-06 09:28:43 +01:00
Tim Brooks f70188ac58 Remove connect SocketPermissions from core (#22797)
This is related to #22116. Core no longer needs `SocketPermission`
`connect`.

This permission is relegated to these modules/plugins:
- transport-netty4 module
- reindex module
- repository-url module
- discovery-azure-classic plugin
- discovery-ec2 plugin
- discovery-gce plugin
- repository-azure plugin
- repository-gcs plugin
- repository-hdfs plugin
- repository-s3 plugin

And for tests:
- mocksocket jar
- rest client
- httpcore-nio jar
- httpasyncclient jar
2017-02-03 09:39:56 -06:00
David Pilato 6b66e29435 Remove POTM file after merging with master branch 2017-02-03 16:20:15 +01:00
David Pilato 626faeafe7 Merge branch 'master' into fix/22077-ingest-attachment
# Conflicts:
#	plugins/ingest-attachment/src/test/resources/org/elasticsearch/ingest/attachment/test/tika-files.zip
2017-02-03 16:15:44 +01:00
David Pilato 4775f520f4 Use PathUtils instead of Paths 2017-02-03 16:08:51 +01:00
David Pilato 4c3466709a Merge branch 'fix/22958-tika-files-zip' 2017-02-03 16:02:30 +01:00
Jason Tedor 9a0b216c36 Upgrade checkstyle to version 7.5
This commit upgrades the checkstyle configuration from version 5.9 to
version 7.5, the latest version as of today. The main enhancement
obtained via this upgrade is better detection of redundant modifiers.

Relates #22960
2017-02-03 09:46:44 -05:00
David Pilato 7a8680c1a4 Replace tika-files.zip by a tika-files dir
Let's make our life easier when debugging/testing.
Also having a flat dir helps us to compare or "synchronize" more easily with Tika project files.

Closes #22958.
2017-02-03 15:19:00 +01:00
David Pilato 2b15d20f93 Remove support for Visio and POTM files
Actually we never supported Visio files but we are failing hard (kill a node) when that kind of file is provided.
See https://github.com/elastic/elasticsearch/pull/22079#issuecomment-277035357

This commits excludes Visio parsing from Tika so it does not fail anymore but returns empty content instead.

As a side effect, it also removes support for POTM files.

Closes #22077.
2017-02-03 13:03:52 +01:00
Jay Modi 7520a107be Optionally require a valid content type for all rest requests with content (#22691)
This change adds a strict mode for xcontent parsing on the rest layer. The strict mode will be off by default for 5.x and in a separate commit will be enabled by default for 6.0. The strict mode, which can be enabled by setting `http.content_type.required: true` in 5.x, will require that all incoming rest requests have a valid and supported content type header before the request is dispatched. In the non-strict mode, the Content-Type header will be inspected and if it is not present or not valid, we will continue with auto detection of content like we have done previously.

The content type header is parsed to the matching XContentType value with the only exception being for plain text requests. This value is then passed on with the content bytes so that we can reduce the number of places where we need to auto-detect the content type.

As part of this, many transport requests and builders were updated to provide methods that
accepted the XContentType along with the bytes and the methods that would rely on auto-detection have been deprecated.

In the non-strict mode, deprecation warnings are issued whenever a request with body doesn't provide the Content-Type header.

See #19388
2017-02-02 14:07:13 -05:00
David Pilato 858333246d Merge branch 'pr/remove-azure-container-auto-creation'
# Conflicts:
#	docs/reference/migration/migrate_6_0/plugins.asciidoc
2017-01-31 09:05:43 +01:00
Ryan Ernst cf7747c338 S3 Repository: Remove region setting (#22853)
This change removes the ability to set region for s3 repositories.
Endpoint should be used instead if a custom s3 location needs to be
used.

closes #22758
2017-01-30 14:34:59 -08:00
David Pilato 1898dc2554 Remove auto creation of container for azure repository
Follow up of #22857 where we deprecate automatic creation of azure containers.

BTW I found that the `AzureSnapshotRestoreServiceIntegTests` does not bring any value because it runs basically a Snapshot/Restore operation on local files which we already test in core.

So instead of trying to fix it to make it pass with this PR, I simply removed it.
2017-01-30 11:47:08 +01:00
Ryan Ernst fe4043c8ff S3 Repository: Remove bucket auto create (#22846)
closes #22761
2017-01-28 11:13:21 -08:00
Ryan Ernst c921bebc4a S3 Repository: Remove env var and sysprop credentials support (#22842)
These are deprecated in 5.x. This commit removes support for them in 6.0.
2017-01-27 13:43:16 -08:00
Tim Brooks eb4562d7a5 Add doPrivilege blocks for socket connect ops in repository-hdfs (#22793)
This is related to #22116. The repository-hdfs plugin opens socket
connections. As SocketPermission is transitioned out of core, hdfs
will require connect permission. This pull request wraps operations
that require this permission in doPrivileged blocks.
2017-01-27 15:01:44 -06:00
Ryan Ernst aad51d44ab S3 repository: Add named configurations (#22762)
* S3 repository: Add named configurations

This change implements named configurations for s3 repository as
proposed in #22520. The access/secret key secure settings which were
added in #22479 are reverted, and the only secure settings are those
with the new named configs. All other previously used settings for the
connection are deprecated.

closes #22520
2017-01-27 10:42:45 -08:00
David Pilato 2abe948cd7 Remove non needed import 2017-01-26 17:43:59 +01:00
David Pilato 6e7aee0c5a use expectThrows instead of manually testing exception 2017-01-26 17:33:26 +01:00
David Pilato d97750b52c Fix checkstyle and a test 2017-01-26 17:20:27 +01:00
David Pilato 17930930a7 Update after review 2017-01-26 17:10:37 +01:00
David Pilato 3804bfcc60 Read ec2 discovery address from aws instance tags
This PR adds a new option for `host_type`: `tag:TAGNAME` where `TAGNAME` is the tag field you defined for your ec2 instance.

For example if you defined a tag `my-elasticsearch-host` in ec2 and set it to `myhostname1.mydomain.com`, then
setting `host_type: tag:my-elasticsearch-host` will tell Discovery Ec2 plugin to read the host name from the
`my-elasticsearch-host` tag. In this case, it will be resolved to `myhostname1.mydomain.com`.

Closes #22566.
2017-01-26 17:10:37 +01:00
David Pilato 98f799f6d5 Merge branch 'pr/ingest-attachment-mime4j' 2017-01-25 16:52:38 +01:00
David Pilato ee3d73dc3d Add test-outlook.msg and test-outlook2003.msg files 2017-01-25 08:53:44 +01:00
Yannick Welsch 36198e0275 Make build Gradle 2.14 / 3.x compatible (#22669)
This changes build files so that building Elasticsearch works with both Gradle 2.13 as well as higher versions of Gradle (tested 2.14 and 3.3), enabling a smooth transition from Gradle 2.13 to 3.x.
2017-01-24 11:09:57 +01:00
David Pilato 8701f7a3ce Add missing mime4j library
In some cases (apparently with outlook files), mime4j library is needed.
We removed it in the past which can cause elasticsearch to crash when you are using ingest-attachment (and probably mapper-attachments as well in 2.x series) with a file which requires this library.

 Similar problem as the one reported at #22077.
2017-01-24 10:25:02 +01:00
Tim Brooks 7f20b93051 Use generic interfaces for checking socket access (#22753)
This commit replaces specialized functional interfaces in various
plugins with generic options. Instead of creating `StorageRunnable`
interfaces in every plugin we can just use `Runnable` or `CheckedRunnable`.
2017-01-23 16:34:24 -06:00
Tim Brooks a4ac29c005 Add single static instance of SpecialPermission (#22726)
This commit adds a SpecialPermission constant and uses that constant
opposed to introducing new instances everywhere.

Additionally, this commit introduces a single static method to check that
the current code has permission. This avoids all the duplicated access
blocks that exist currently.
2017-01-21 12:03:52 -06:00
Jim Ferenczi 8028578305 Upgrade to Lucene 6.4.0 (#22724)
* Upgrade to Lucene 6.4.0

`ValueSource`s are now converted to `DoubleValueSource`s using the Lucene adapter made for the migration to the new API in 6.4.0.
2017-01-21 04:48:01 +01:00
Jason Tedor 8f6c074691 Revert "Make build Gradle 2.14 / 3.x compatible (#22669)"
This reverts commit 652cb7dbf7.

Relates #22727
2017-01-20 18:16:45 -05:00
Nik Everett 6265ef1c1b Deguice rest handlers (#22575)
There are presently 7 ctor args used in any rest handlers:
* `Settings`: Every handler uses it to initialize a logger and
  some other strange things.
* `RestController`: Every handler registers itself with it.
* `ClusterSettings`: Used by `RestClusterGetSettingsAction` to
  render the default values for cluster settings.
* `IndexScopedSettings`: Used by `RestGetSettingsAction` to get
  the default values for index settings.
* `SettingsFilter`: Used by a few handlers to filter returned
  settings so we don't expose stuff like passwords.
* `IndexNameExpressionResolver`: Used by `_cat/indices` to
  filter the list of indices.
* `Supplier<DiscoveryNodes>`: Used to fill enrich the response
  by handlers that list tasks.

We probably want to reduce these arguments over time but
switching construction away from guice gives us tighter
control over the list of available arguments.

These parameters are passed to plugins using
`ActionPlugin#initRestHandlers` which is expected to build and
return that handlers immediately. This felt simpler than
returning an reference to the ctors given all the different
possible args.

Breaks java plugins by moving rest handlers off of guice.
2017-01-20 11:48:51 -05:00
Ryan Ernst c5b4bba30b S3 repository: Deprecate specifying credentials through env vars, sys props, and remove profile files (#22567)
* S3 repository: Deprecate specifying credentials through env vars and sys props

This is a follow up to #22479, where storing credentials secure way was
added.
2017-01-19 12:36:32 -08:00
Jason Tedor 9781b88a38 Fix deprecation logging for lenient booleans
This commit fixes an issue with deprecation logging for lenient
booleans. The underlying issue is that adding deprecation logging for
lenient booleans added a static deprecation logger to the Settings
class. However, the Settings class is initialized very early and in CLI
tools can be initialized before logging is initialized. This leads to
status logger error messages. Additionally, the deprecation logging for
a lot of the settings does not provide useful context (for example, in
the token filter factories, the deprecation logging only produces the
name of the setting, but gives no context which token filter factory it
comes from). This commit addresses both of these issues by changing the
call sites to push a deprecation logger through to the lenient boolean
parsing.

Relates #22696
2017-01-19 12:30:33 -05:00
Yannick Welsch 652cb7dbf7 Make build Gradle 2.14 / 3.x compatible (#22669)
This changes build files so that building Elasticsearch works with both Gradle 2.13 as well as higher versions of Gradle (tested 2.14 and 3.3), enabling a smooth transition from Gradle 2.13 to 3.x.
2017-01-19 09:56:54 +01:00
Daniel Mitterdorfer aece89d6a1 Make boolean conversion strict (#22200)
This PR removes all leniency in the conversion of Strings to booleans: "true"
is converted to the boolean value `true`, "false" is converted to the boolean
value `false`. Everything else raises an error.
2017-01-19 07:59:18 +01:00
Tim Brooks 2766b08ff4 Add doPrivilege blocks for socket connect operations in plugins (#22534)
This is related to #22116. Certain plugins (discovery-azure-classic, 
discovery-ec2, discovery-gce, repository-azure, repository-gcs, and 
repository-s3) open socket connections. As SocketPermissions are 
transitioned out of core, these plugins will require connect 
permission. This pull request wraps operations that require these 
permissions in doPrivileged blocks.
2017-01-18 10:12:18 -06:00
Michael McCandless eea4db5512 Fix thread safety of Stempel's token filter factory (#22610)
Closes #21911
2017-01-16 10:36:36 -05:00
Ali Beyad bdf836a286 Fixes default chunk size for Azure repositories (#22577)
Before, the default chunk size for Azure repositories was
-1 bytes, which meant that if the chunk_size was not set on
the Azure repository, nor as a node setting, then no data
files would get written as part of the snapshot (because
the BlobStoreRepository's PartSliceStream does not know
how to process negative chunk sizes).

This commit fixes the default chunk size for Azure repositories
to be the same as the maximum chunk size.  This commit also
adds tests for both the Azure and Google Cloud repositories to
ensure only valid chunk sizes can be set.

Closes #22513
2017-01-12 07:59:22 -06:00
Ryan Ernst 8015fbbf25 Make s3 repository sensitive settings use secure settings (#22479)
* Settings: Make s3 repository sensitive settings use secure settings

This change converts repository-s3 to use the new secure settings. In
order to support the multiple ways we allow aws creds to be configured,
it also moves the main methods for the keystore wrapper into a
SecureSettings interface, in order to allow settings prefixing to work.
2017-01-11 11:19:46 -08:00
Nik Everett abb7d7841f Remove SearchRequestParsers (#22538)
It is empty now that we've moved all the parsing into `namedObject`.
2017-01-11 10:28:14 -05:00
Simon Willnauer 081c1ad416 Allow affix settings to delegate to actual settings (#22523)
Affix settings are useful to namespace a certain setting. Yet, affix settings
must be specialized for their concrete type which causes lot of code duplication.
This commit allows to reuse an existing setting with and affix setting as soon as
a concrete key is available.
2017-01-10 15:14:55 +01:00
animageofmine e3546d59c4 Add support for ca-central-1 region to EC2 and S3 plugins
Closes #22458 #22454
2017-01-06 16:27:08 -06:00
Tim B be22a250b6 Replace Socket, ServerSocket, and HttpServer usages in tests with mocksocket versions (#22287)
This integrates the mocksocket jar with elasticsearch tests. Mocksocket wraps actions requiring SocketPermissions in doPrivilege blocks. This will eventually allow SocketPermissions to be assigned to the mocksocket jar opposed to the entire elasticsearch codebase.
2017-01-04 14:38:51 -06:00
Adrien Grand f8998fece5 Upgrade to lucene-6.4.0-snapshot-084f7a0. (#22413) 2017-01-04 19:03:52 +01:00
Daniel Mitterdorfer 1ed64f0551 Eliminate unneccessary declaration of IOException
With this commit we remove the declaration of IOException from
assertWarnings and modify all call sites.

Checked with @javanna
2017-01-03 12:36:28 +01:00
Igor Motov ca90d9ea82 Remove PROTO-based custom cluster state components
Switches custom cluster state components from PROTO-based de-serialization to named objects based de-serialization
2016-12-28 13:32:35 -05:00
Nik Everett f5f2149ff2 Remove much ceremony from parsing client yaml test suites (#22311)
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.

I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.
2016-12-22 11:00:34 -05:00
Jason Tedor 7946396fe6 Introduce translog no-op
As the translog evolves towards a full operations log as part of the
sequence numbers push, there is a need for the translog to be able to
represent operations for which a sequence number was assigned, but the
operation did not mutate the index. Examples of how this can arise are
operations that fail after the sequence number is assigned, and gaps in
this history that arise when an operation is assigned a sequence number
but the operation never completed (e.g., a node crash). It is important
that these operations appear in the history so that they can be
replicated and replayed during recovery as otherwise the history will be
incomplete and local checkpoints will not be able to advance. This
commit introduces a no-op to the translog to set the stage for these
efforts.

Relates #22291
2016-12-21 23:08:16 -05:00
David Pilato 2adb310508 Merge pull request #22308 from nicpalmer/master
Support for eu-west-2 (London) cloud-aws plugin

See:

* http://docs.aws.amazon.com/general/latest/gr/rande.html#ec2_region
* http://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
2016-12-21 16:57:42 +01:00
Nic Palmer 3894ec9bae Fixed eu-west-2 entries for discovery-ec2 and repository-s3 also updated the asciidocs 2016-12-21 15:48:07 +00:00
Boaz Leskes 0e9186e137 Simplify Unicast Zen Ping (#22277)
The `UnicastZenPing` shows it's age and is the result of many small changes. The current state of affairs is confusing and is hard to reason about. This PR cleans it up (while following the same original intentions). Highlights of the changes are:

1) Clear 3 round flow - no interleaving of scheduling.
2) The previous implementation did a best effort attempt to wait for ongoing pings to be sent and completed. The pings were guaranteed to complete because each used the total ping duration as a timeout. This did make it hard to reason about the total ping duration and the flow of the code. All of this is removed now and ping should just complete within the given duration or not be counted (note that it was very handy for testing, but I move the needed sync logic to the test).
3) Because of (2) the pinging scheduling changed a bit, to give a chance for the last round to complete. We now ping at the beginning, 1/3 and 2/3 of the duration.
4) To offset for (3) a bit, incoming ping requests are now added to on going ping collections.
5) UnicastZenPing never establishes full blown connections (but does reuse them if there). Relates to #22120
6) Discovery host providers are only used once per pinging round. Closes #21739
7) Usage of the ability to open a connection without connecting to a node ( #22194 ) and shorter connection timeouts helps with connections piling up. Closes #19370
8) Beefed up testing and sped them up.
9) removed light profile from production code
2016-12-21 15:09:58 +01:00
Nic Palmer 8847c34093 Push for eu-west-2 issue 2016-12-21 13:10:33 +00:00
Tal Levy 5a90d9d7e6 add `ignore_missing` flag to ingest plugins (#22273)
added `ignore_missing` flag to:

- Attachment Processor
- GeoIP Processor
- User-Agent Processor
2016-12-20 10:53:28 -08:00
Nik Everett a04dcfb95b Introduce XContentParser#namedObject (#22003)
Introduces `XContentParser#namedObject which works a little like
`StreamInput#readNamedWriteable`: on startup components register
parsers under names and a superclass. At runtime we look up the
parser and call it to parse the object.

Right now the parsers take a context object they use to help with
the parsing but I hope to be able to eliminate the need for this
context as most what it is used for at this point is to move
around parser registries which should be replaced by this method
eventually. I make no effort to do so in this PR because it is
big enough already. This is meant to the a start down a road that
allows us to remove classes like `QueryParseContext`,
`AggregatorParsers`, `IndicesQueriesRegistry`, and
`ParseFieldRegistry`.

The goal here is to reduce the amount of plumbing required to
allow parsing pluggable things. With this you don't have to pass
registries all over the place. Instead you must pass a super
registry to fewer places and use it to wrap the reader. This is
the same tradeoff that we use for NamedWriteable and it allows
much, much simpler binary serialization. We think we want that
same thing for xcontent serialization.

The only parsing actually converted to this method is parsing
`ScoreFunctions` inside of `FunctionScoreQuery`. I chose this
because it is relatively self contained.
2016-12-20 11:05:24 -05:00
javanna 5dae10db11 [TEST] add warnings check to ESTestCase
We are currenlty checking that no deprecation warnings are emitted in our query tests. That can be moved to ESTestCase (disabled in ESIntegTestCase) as it allows us to easily catch where our tests use deprecated features and assert on the expected warnings.
2016-12-19 19:39:56 +01:00
Daniel Mitterdorfer 655a95a2bb Cache results of geoip lookups (#22231)
With this commit, we introduce a cache to the geoip ingest processor.
The cache is enabled by default and caches the 1000 most recent items.
The cache size is controlled by the setting `ingest.geoip.cache_size`.

Closes #22074
2016-12-19 10:06:12 +01:00
Adrien Grand 96f1739c0d The `_all` default mapper is not completely configured. (#22236)
In some cases, it might happen that the `_all` field gets a field type that is
not totally configured, and in particular lacks analyzers. This is due to the
fact that `AllFieldMapper.TypeParser.getDefault` uses `Defaults.FIELD_TYPE` as
a default field type, which does not have any analyzers configured since it
does not know about the default analyzers.
2016-12-19 09:54:27 +01:00
David Pilato 8b0df47381 readonly on azure repository must be taken into account
While I was fixing a documentation issue (#22007), I looked at the code and discovered that we actually never read what the user entered as a `readonly` parameter when he creates an azure repository.

So if someone sends:

```
PUT _snapshot/my_backup4
{
    "type": "azure",
    "settings": {
        "account": "my_account2",
        "location_mode": "primary_only",
        "readonly": true
    }
}
```

The repository is not actually defined as `readonly`.

It's caused by the fact we are always overwriting `readonly`setting based on `location_mode`.
If a user sets it to `primary_only`, `readonly` is forced to `false`.
If a user sets it to `primary_then_secondary`, `readonly` is forced to `false`.
If a user sets it to `secondary_only`, `readonly` is forced to `false`.

Note that with this change, a user can force a `secondary_only` repository to `readonly: false` which will lead him to an error later on when we check the repository as per definition in Azure, a secondary repository is not writable.
Another option could have been to detect this mismatch and throw an exception in that case. Note sure it is worth writing more code though.

Closes #22053.
2016-12-08 18:54:00 +01:00
David Pilato 8923b36780 Merge pull request #21956 from alexshadow007/aws_read_timeout
Add setting to set read timeout for EC2 discovery and S3 repository plugins
2016-12-07 16:00:48 +01:00
Alexander Kazakov 0a03a62ab6 Using ClientConfiguration.DEFAULT_SOCKET_TIMEOUT as default value for read timeout 2016-12-06 21:13:28 +03:00
Boaz Leskes a7050b2d56 Remove `InternalTestCluster.startNode(s)Async` (#21846)
Since the removal of local discovery of #https://github.com/elastic/elasticsearch/pull/20960 we rely on minimum master nodes to be set in our test cluster. The settings is automatically managed by the cluster (by default) but current management doesn't work with concurrent single node async starting. On the other hand, with `MockZenPing` and the `discovery.initial_state_timeout` set to `0s` node starting and joining is very fast making async starting an unneeded complexity. Test that still need async starting could, in theory, still do so themselves via background threads.

Note that this change also removes the usage of `INITIAL_STATE_TIMEOUT_SETTINGS` as the starting of nodes is done concurrently (but building them is sequential)
2016-12-06 12:06:15 +01:00
Alexander Kazakov 1491e2dec9 Remove default value for read_timeout setting
Fix tests and docs
2016-12-05 21:29:17 +03:00
Alexander Kazakov 23550f277b Add us-east-2 AWS region 2016-12-04 20:02:05 +03:00
Alexander Kazakov 5695eaf19e Add setting to set read timeout for EC2 discovery and S3 repository plugins 2016-12-04 01:58:53 +03:00
Boaz Leskes fe01c0f83b fix TemplateQueryBuilderTests & Murmur3FieldMapperTests 2016-12-01 14:21:57 +01:00
Adrien Grand 6231009a8f Remove 2.x backward compatibility of mappings. (#21670)
For the record, I also had to remove the geo-hash cell and geo-distance range
queries to make the code compile. These queries already throw an exception in
all cases with 5.x indices, so that does not hurt any more.

I also had to rename all 2.x bwc indices from `index-${version}` to
`unsupported-${version}` to make `OldIndexBackwardCompatibilityIT`
happy.
2016-11-30 13:34:46 +01:00
Jim Ferenczi d791ddf704 Upgrade to lucene-6.4.0-snapshot-ec38570 (#21853)
Set lucene version to 6.4.0-snapshot-ec38570 and update all the sha1s/license
Fix invalid combo after upgrade in query_string query. split_on_whitespace=false is disallowed if auto_generate_phrase_queries=true
Adapt the expectations of some tests to the new format of the Lucene explain output
2016-11-29 18:40:31 +01:00
Nicholas Knize af1ab68b64 Add RangeFieldMapper for numeric and date range types
Lucene 6.2 added index and query support for numeric ranges. This commit adds a new RangeFieldMapper for indexing numeric (int, long, float, double) and date ranges and creating appropriate range and term queries. The design is similar to NumericFieldMapper in that it uses a RangeType enumerator for implementing the logic specific to each type. The following range types are supported by this field mapper: int_range, float_range, long_range, double_range, date_range.

Lucene does not provide a DocValue field specific to RangeField types so the RangeFieldMapper implements a CustomRangeDocValuesField for handling doc value support.

When executing a Range query over a Range field, the RangeQueryBuilder has been enhanced to accept a new relation parameter for defining the type of query as one of: WITHIN, CONTAINS, INTERSECTS. This provides support for finding all ranges that are related to a specific range in a desired way. As with other spatial queries, DISJOINT can be achieved as a MUST_NOT of an INTERSECTS query.
2016-11-29 10:10:14 -06:00
Simon Willnauer 9809760eb0 Fix settings diff generation for affix, list and group settings (#21788)
Group, List and Affix settings generate a bogus diff that turns the actual
diff into a string containing a json structure for instance:

```
"action" : {
  "search" : {
    "remote" : {
      "" : "{\"my_remote_cluster\":\"[::1]:60378\"}"
    }
  }
}
```

which make reading the setting impossible. This happens for instance
if a group or affix setting is rendered via `_cluster/settings?include_defaults=true`
This change fixes the issue as well as several minor issues with affix settings that
where not accepted as valid setting today.
2016-11-24 21:53:04 +01:00
Jason Tedor 9dc65037bc Lazy resolve unicast hosts
Today we eagerly resolve unicast hosts. This means that if DNS changes,
we will never find the host at the new address. Moreover, a single host
failng to resolve causes startup to abort. This commit introduces lazy
resolution of unicast hosts. If a DNS entry changes, there is an
opportunity for the host to be discovered. Note that under the Java
security manager, there is a default positive cache of infinity for
resolved hosts; this means that if a user does want to operate in an
environment where DNS can change, they must adjust
networkaddress.cache.ttl in their security policy. And if a host fails
to resolve, we warn log the hostname but continue pinging other
configured hosts.

When doing DNS resolutions for unicast hostnames, we wait until the DNS
lookups timeout. This appears to be forty-five seconds on modern JVMs,
and it is not configurable. If we do these serially, the cluster can be
blocked during ping for a lengthy period of time. This commit introduces
doing the DNS lookups in parallel, and adds a user-configurable timeout
for these lookups.

Relates #21630
2016-11-22 14:17:04 -05:00
Nik Everett c79371fd5b Remove lang-python and lang-javascript (#20734)
They were deprecated in 5.0. We are concentrating on making
Painless awesome rather than supporting every language possible.

Closes #20698
2016-11-21 22:13:25 -05:00
David Pilato bccbc75efe Merge branch 'pr/update-tika-1.14' 2016-11-18 12:33:45 +01:00
Adrien Grand 6581b77198 Remove store throttling. (#21573)
Store throttling has been disabled by default since Lucene added automatic
throttling of merge operations based on the indexing rate.
2016-11-17 09:33:32 +01:00
Ryan Ernst 1732fd2ea6 Remove rogue file from the by-gone days of 2.x. 2016-11-16 16:22:01 -08:00
David Pilato 7517c50698 Update to Tika 1.14
Closes #20390.
2016-11-16 11:29:14 +01:00
Boaz Leskes 2c0338fa87 Merge remote-tracking branch 'upstream/master' into feature/seq_no 2016-11-15 17:09:08 +00:00
Boaz Leskes d6c2b4f7c5 Adapt InternalTestCluster to auto adjust `minimum_master_nodes` (#21458)
#20960 removed `LocalDiscovery` and we now use `ZenDiscovery` in all our tests. To keep cluster forming fast, we are using a `MockZenPing` implementation which uses static maps to return instant results making master election fast. Currently, we don't set `minimum_master_nodes` causing the occasional split brain when starting multiple nodes concurrently and their pinging is so fast that it misses the fact that one of the node has elected it self master. To solve this, `InternalTestCluster` is modified to behave like a true cluster and manage and set `minimum_master_nodes` correctly with every change to the number of nodes.

Tests that want to manage the settings themselves can opt out using a new `autoMinMasterNodes` parameter to the `ClusterScope` annotation. 

Having `min_master_nodes` set means the started node may need to wait for other nodes to be started as well. To combat this, we set `discovery.initial_state_timeout` to `0` and wait for the cluster to form once all node have been started. Also, because a node may wait and ping while other nodes are started, `MockZenPing` is adapted to wait rather than busy-ping.
2016-11-15 13:42:26 +00:00
Boaz Leskes c9f49039d3 Merge remote-tracking branch 'upstream/master' into feature/seq_no 2016-11-15 10:14:47 +00:00
Yannick Welsch 64a7a960d9 Use pre-JDK9 style FilePermissions on JDK9 (#21540)
JDK9 removed pathname canonicalization when constructing FilePermission objects, which breaks some of the FilePermissions added by Elasticsearch. This commit adds the system property jdk.io.permissionsUseCanonicalPath which makes JDK9 behave like JDK8 w.r.t. FilePermission objects (see #21534).
2016-11-15 09:31:32 +01:00
Ryan Ernst c7bd4f3454 Tests: Add TestZenDiscovery and replace uses of MockZenPing with it (#21488)
This changes adds a test discovery (which internally uses the existing
mock zenping by default). Having the mock the test framework selects be a discovery
greatly simplifies discovery setup (no more weird callback to a Node
method).
2016-11-14 21:46:10 -08:00
Yannick Welsch ea65a01789 Use pre-JDK9 style FilePermissions on JDK9
JDK9 removed pathname canonicalization when constructing FilePermission objects, which breaks some of the FilePermissions added by
Elasticsearch. This commit adds the system property jdk.io.permissionsUseCanonicalPath which makes JDK9 behave like JDK8 w.r.t. FilePermissions (see
https://github.com/elastic/elasticsearch/issues/21534).
2016-11-14 14:13:23 +01:00
Adrien Grand 1fd5c47e7f Upgrade to lucene-6.3.0. (#21464) 2016-11-14 09:36:45 +01:00
Jason Tedor 1e7c424479 Merge branch 'master' into feature/seq_no
* master:
  ShardActiveResponseHandler shouldn't hold to an entire cluster state
  Ensures cleanup of temporary index-* generational blobs during snapshotting (#21469)
  Remove (again) test uses of onModule (#21414)
  [TEST] Add assertBusy when checking for pending operation counter after tests
  Revert "Add trace logging when aquiring and releasing operation locks for replication requests"
  Allows multiple patterns to be specified for index templates (#21009)
  [TEST] fixes rebalance single shard check as it isn't guaranteed that a rebalance makes sense and the method only tests if rebalance is allowed
  Document _reindex with random_score
2016-11-11 11:25:27 -05:00
Jason Tedor d3417fb022 Merge branch 'master' into feature/seq_no
* master: (516 commits)
  Avoid angering Log4j in TransportNodesActionTests
  Add trace logging when aquiring and releasing operation locks for replication requests
  Fix handler name on message not fully read
  Remove accidental import.
  Improve log message in TransportNodesAction
  Clean up of Script.
  Update Joda Time to version 2.9.5 (#21468)
  Remove unused ClusterService dependency from SearchPhaseController (#21421)
  Remove max_local_storage_nodes from elasticsearch.yml (#21467)
  Wait for all reindex subtasks before rethrottling
  Correcting a typo-Maan to Man-in README.textile (#21466)
  Fix InternalSearchHit#hasSource to return the proper boolean value (#21441)
  Replace all index date-math examples with the URI encoded form
  Fix typos (#21456)
  Adapt ES_JVM_OPTIONS packaging test to ubuntu-1204
  Add null check in InternalSearchHit#sourceRef to prevent NPE (#21431)
  Add VirtualBox version check (#21370)
  Export ES_JVM_OPTIONS for SysV init
  Skip reindex rethrottle tests with workers
  Make forbidden APIs be quieter about classpath warnings (#21443)
  ...
2016-11-10 23:40:33 -05:00
Ryan Ernst 48bfb142b9 Remove (again) test uses of onModule (#21414)
This change was reverted after it caused random test failures. This was
due to a copy/paste error in the original PR which caused the mock
version of ClusterInfoService to be used whenever the mock *ZenPing* was
used, and the real ClusterInfoService to be used when MockZenPing was
not used.
2016-11-10 16:06:14 -08:00
Jack Conradson aeb97ff412 Clean up of Script.
Closes #21321
2016-11-10 09:59:13 -08:00
javanna 2f32c1173b Revert "Tests: Remove a couple test uses of onModule (#21414)"
This reverts commit b326f0bc51.
2016-11-09 11:32:16 +01:00
Ryan Ernst b326f0bc51 Tests: Remove a couple test uses of onModule (#21414)
There were still a couple test use cases and examples that were using
onModule. This change cleans those cases up.
2016-11-08 13:50:13 -08:00
Ryan Ernst 4f5a934d92 Plugins: Convert custom discovery to pull based plugin (#21398)
* Plugins: Convert custom discovery to pull based plugin

This change primarily moves registering custom Discovery implementations
to the pull based DiscoveryPlugin interface. It also keeps the cloud
based discovery plugins re-registering ZenDiscovery under their own name
in order to maintain backwards compatibility. However,
discovery.zen.hosts_provider is changed here to no longer fallback to
discovery.type. Instead, each plugin which previously relied on the
value of discovery.type now sets the hosts_provider to itself if
discovery.type is set to itself, along with a deprecation warning.
2016-11-08 12:52:10 -08:00
Ryan Ernst 7a2c984bcc Test: Remove multi process support from rest test runner (#21391)
At one point in the past when moving out the rest tests from core to
their own subproject, we had multiple test classes which evenly split up
the tests to run. However, we simplified this and went back to a single
test runner to have better reproduceability in tests. This change
removes the remnants of that multiplexing support.
2016-11-07 15:07:34 -08:00
Adrien Grand 2a70f6e7b1 Upgrade to lucene-6.3.0-snapshot-a66a445. (#21309)
This addresses a bug that was introduced with https://issues.apache.org/jira/browse/LUCENE-7501.
2016-11-04 10:34:04 +01:00
Adrien Grand 7ec51d628d Make the default S3 buffer size depend on the available memory. (#21299)
Currently the default S3 buffer size is 100MB, which can be a lot for small
heaps. This pull request updates the default to be 100MB for heaps that are
greater than 2GB and 5% of the heap size otherwise.
2016-11-03 16:07:52 +01:00
Adrien Grand aa6cd93e0f Require arguments for QueryShardContext creation. (#21196)
The `IndexService#newQueryShardContext()` method creates a QueryShardContext on
shard `0`, with a `null` reader and that uses `System.currentTimeMillis()` to
resolve `now`. This may hide bugs, since the shard id is sometimes used for
query parsing (it is used to salt random score generation in `function_score`),
passing a `null` reader disables query rewriting and for some use-cases, it is
simply not ok to rely on the current timestamp (eg. percolation). So this pull
request removes this method and instead requires that all call sites provide
these parameters explicitly.
2016-11-02 09:48:49 +01:00
Christoph Büscher 1f5adaa824 Docs: Adding Ukrainian analyzer 2016-10-31 18:20:39 +01:00
Christoph Büscher a9b0b97703 Expose Lucenes Ukrainian analyzer
Since Lucene 6.2. the UkrainianMorfologikAnalyzer is available through the
lucene-analyzers-morfologik jar. This change exposes it to be used as an
elasticsearch plugin.
2016-10-31 18:20:39 +01:00
Yannick Welsch a23ded6a94 [TEST] Fix NullPointerException in AzureStorageServiceMock
Makes the code safe against concurrent modifications of the underlying hashmap.
2016-10-31 16:21:07 +01:00
Adrien Grand b3cc54cf0d Upgrade to lucene-6.3.0-snapshot-ed102d6 (#21150)
Lucene 6.3 is expected to be released in the next weeks so it'd be good to give
it some integration testing. I had to upgrade randomized-testing too so that
both Lucene and Elasticsearch are on the same version.
2016-10-28 14:47:15 +02:00
Jun Ohtani a66c76eb44 Merge pull request #20704 from johtani/remove_request_params_in_analyze_api
Removing request parameters in _analyze API
2016-10-27 17:43:18 +09:00
Jack Conradson 512a77a633 Refactor ScriptType to be a top-level class. 2016-10-26 10:21:22 -07:00
David Pilato 50bc31a918 Fix s3 repository when used with IAM profiles
Applying same patch we did in #21048 but for `repository-s3` plugin.

Backport of #21058 in master branch
2016-10-21 16:45:11 +02:00
David Pilato e5d9f393f1 Fix ec2 discovery when used with IAM profiles.
Follow up for #21039.

We can revert the previous change and do that a bit smarter than it was.

Patch tested successfully manually on ec2 with 2 nodes with a configuration like:

```yml
discovery.type: ec2
network.host: ["_local_", "_site_", "_ec2_"]
cloud.aws.region: us-west-2
```

(cherry picked from commit fbbeded)

Backport of #21048 in master branch
2016-10-20 20:19:47 +02:00
Ryan Ernst 60353a245a Plugins: Make UnicastHostsProvider extension pull based (#21036)
This change moves providing UnicastHostsProvider for zen discovery to be
pull based, adding a getter in DiscoveryPlugin. A new setting is added,
discovery.zen.hosts_provider, to separate the discovery type from the
hosts provider for zen when it is selected. Unfortunately existing
plugins added ZenDiscovery with their own name in order to just provide
a hosts provider, so there are already many users setting the hosts
provider through discovery.type. This change also includes backcompat,
falling back to discovery.type when discovery.zen.hosts_provider is not
set.
2016-10-20 09:13:59 -07:00
David Pilato efffb946e2 Fix ec2 discovery when used with IAM profiles.
Here is what is happening without this fix when you try to connect to ec2 APIs:

```
[2016-10-20T12:41:49,925][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from EnvironmentVariableCredentialsProvider: Unable to load AWS credentials from environment variables (AWS_ACCESS_KEY_ID (or AWS_ACCESS_KEY) and AWS_SECRET_KEY (or AWS_SECRET_ACCESS_KEY))
[2016-10-20T12:41:49,926][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from SystemPropertiesCredentialsProvider: Unable to load AWS credentials from Java system properties (aws.accessKeyId and aws.secretKey)
[2016-10-20T12:41:49,926][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from com.amazonaws.auth.profile.ProfileCredentialsProvider@1ad14091: access denied ("java.io.FilePermission" "/home/ubuntu/.aws/credentials" "read")
[2016-10-20T12:41:49,927][DEBUG][c.a.i.EC2MetadataClient  ] Connecting to EC2 instance metadata service at URL: http://169.254.169.254/latest/meta-data/iam/security-credentials/
[2016-10-20T12:41:49,951][DEBUG][c.a.i.EC2MetadataClient  ] Connecting to EC2 instance metadata service at URL: http://169.254.169.254/latest/meta-data/iam/security-credentials/discovery-tests
[2016-10-20T12:41:49,965][DEBUG][c.a.a.AWSCredentialsProviderChain] Unable to load credentials from InstanceProfileCredentialsProvider: Unable to parse Json String.
[2016-10-20T12:41:49,966][INFO ][o.e.d.e.AwsEc2UnicastHostsProvider] [dJfktmE] Exception while retrieving instance list from AWS API: Unable to load AWS credentials from any provider in the chain
[2016-10-20T12:41:49,967][DEBUG][o.e.d.e.AwsEc2UnicastHostsProvider] [dJfktmE] Full exception:
com.amazonaws.AmazonClientException: Unable to load AWS credentials from any provider in the chain
	at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:131) ~[aws-java-sdk-core-1.10.69.jar:?]
	at com.amazonaws.services.ec2.AmazonEC2Client.invoke(AmazonEC2Client.java:11117) ~[aws-java-sdk-ec2-1.10.69.jar:?]
	at com.amazonaws.services.ec2.AmazonEC2Client.describeInstances(AmazonEC2Client.java:5403) ~[aws-java-sdk-ec2-1.10.69.jar:?]
	at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.fetchDynamicNodes(AwsEc2UnicastHostsProvider.java:116) [discovery-ec2-5.0.0.jar:5.0.0]
	at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:234) [discovery-ec2-5.0.0.jar:5.0.0]
	at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider$DiscoNodesCache.refresh(AwsEc2UnicastHostsProvider.java:219) [discovery-ec2-5.0.0.jar:5.0.0]
	at org.elasticsearch.common.util.SingleObjectCache.getOrRefresh(SingleObjectCache.java:54) [elasticsearch-5.0.0.jar:5.0.0]
	at org.elasticsearch.discovery.ec2.AwsEc2UnicastHostsProvider.buildDynamicNodes(AwsEc2UnicastHostsProvider.java:102) [discovery-ec2-5.0.0.jar:5.0.0]
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing.sendPings(UnicastZenPing.java:358) [elasticsearch-5.0.0.jar:5.0.0]
	at org.elasticsearch.discovery.zen.ping.unicast.UnicastZenPing$1.doRun(UnicastZenPing.java:272) [elasticsearch-5.0.0.jar:5.0.0]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:504) [elasticsearch-5.0.0.jar:5.0.0]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-5.0.0.jar:5.0.0]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
	at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
```

For whatever reason, it can not parse what is coming back from http://169.254.169.254/latest/meta-data/iam/security-credentials/discovery-tests.

But, if you wrap the code within an `AccessController.doPrivileged()` call, then it works perfectly.

Closes #21039.

(cherry picked from commit abfdc70)
2016-10-20 17:19:22 +02:00
Ryan Ernst 53cff0f00f Move all zen discovery classes into o.e.discovery.zen (#21032)
* Move all zen discovery classes into o.e.discovery.zen

This collapses sub packages of zen into zen. These all had just a couple
classes each, and there is really no reason to have the subpackages.

* fix checkstyle
2016-10-20 00:44:48 -07:00
Boaz Leskes c3987156ab Remove local discovery in favor of a simpler `MockZenPings` (#20960)
`LocalDiscovery` is a discovery implementation that uses static in memory maps to keep track of current live nodes. This is used extensively in our tests in order to speed up cluster formation (i.e., shortcut the 3 second ping period used by `ZenDiscovery` by default). This is sad as that mean that most of the test run using a different discovery semantics than what is used in production. Instead of replacing the entire discovery logic, we can use a similar approach to only shortcut the pinging components.
2016-10-18 21:12:15 +02:00
Jason Tedor f23ae90d92 Fix logging configuration for AwsSdkMetrics logger
This commit fixes an issue with the configuration for the AwsSdkMetrics
logger; the issue is that the logging configuration had used underscores
instead of periods for the settings key (the perils of lenient settings
parsing).

Relates #20313
2016-10-14 23:44:39 -04:00
Tanguy Leroux 44ac5d057a Remove empty javadoc (#20871)
This commit removes as many as empty javadocs comments my regexp has found
2016-10-12 10:27:09 +02:00
Ali Beyad bbf6e6d0bd Fixes leading forward slash in S3 repository base_path (#20861)
In 2.x, the S3 repository accepted a `/` (forward slash) to start
the repositories.s3.base_path, and it used a different string splitting
method that removed the forward slash from the base path, so there
were no issues.

In 5.x, we removed this custom string splitting method in favor of
the JDK's string splitting method, which preserved the leading `/`.
The AWS SDK does not like the leading `/` in the key path after the
bucket name, and so it could not find any objects in the S3 repository.

This commit fixes the issue by removing the leading `/` if it exists
and adding a deprecation notice that leading `/` will not be supported
in the future in S3 repository's base_path.
2016-10-11 11:18:52 -04:00
Alexander Reelsen 3c2e51d831 Deps: Update ingest-attachment to latest libraries (#20710)
Also added a test to check for a with a regular PDF,
instead of only an encrypted one with expected exception.
2016-10-10 12:55:05 +02:00
Nik Everett cf4038b668 DeGuice some of IndicesModule
UpdateHelper, MetaDataIndexUpgradeService, and some recovery
stuff.

Move ClusterSettings to nullable ctor parameter of TransportService
so it isn't forgotten.
2016-10-07 11:14:38 -04:00
Simon Willnauer 7452028e50 Simplify TransportAddress (#20798)
since TransportAddress is now final we can simplify it's interface a bit
and remove methods that are only used in tests or are plain delegates.
2016-10-07 15:56:54 +02:00
Simon Willnauer 194a6b1df0 Remove LocalTransport in favor of MockTcpTransport (#20695)
This change proposes the removal of all non-tcp transport implementations. The
mock transport can be used by default to run tests instead of local transport that has
roughly the same performance compared to TCP or at least not noticeably slower.

This is a master only change, deprecation notice in 5.x will be committed as a
separate change.
2016-10-07 11:27:47 +02:00
Jun Ohtani 370f0b885e Removing request parameters in _analyze API
Remove request params in _analyze API without index param
Change rest-api-test using JSON
Change docs using JSON

Closes #20246
2016-10-07 16:23:24 +09:00
David Pilato 591a8d4ec6 Merge branch 'fix/20669-master-azure-log' 2016-10-06 16:00:43 +02:00
Martijn van Groningen 6a5630f901 ingest: Upgrade geoip2 dependency
Closes #20563
2016-10-05 09:31:55 +02:00
Jason Tedor 51d53791fe Remove lenient URL parameter parsing
Today when parsing a request, Elasticsearch silently ignores incorrect
(including parameters with typos) or unused parameters. This is bad as
it leads to requests having unintended behavior (e.g., if a user hits
the _analyze API and misspell the "tokenizer" then Elasticsearch will
just use the standard analyzer, completely against intentions).

This commit removes lenient URL parameter parsing. The strategy is
simple: when a request is handled and a parameter is touched, we mark it
as such. Before the request is actually executed, we check to ensure
that all parameters have been consumed. If there are remaining
parameters yet to be consumed, we fail the request with a list of the
unconsumed parameters. An exception has to be made for parameters that
format the response (as opposed to controlling the request); for this
case, handlers are able to provide a list of parameters that should be
excluded from tripping the unconsumed parameters check because those
parameters will be used in formatting the response.

Additionally, some inconsistencies between the parameters in the code
and in the docs are corrected.

Relates #20722
2016-10-04 12:45:29 -04:00
Tanguy Leroux 857e861d32 [Docs] Log snapshot shard failures in AzureSnapshotRestoreServiceIntegTests
This commit adds logs when a snapshot has failures for some snapshoted shards.
2016-10-03 15:04:37 +02:00
Jason Tedor 25fd9e26c4 Merge branch 'master' into feature/seq_no
* master: (1199 commits)
  [DOCS] Remove non-valid link to mapping migration document
  Revert "Default `include_in_all` for numeric-like types to false"
  test: add a test with ipv6 address
  docs: clearify that both ip4 and ip6 addresses are supported
  Include complex settings in settings requests
  Add production warning for pre-release builds
  Clean up confusing error message on unhandled endpoint
  [TEST] Increase logging level in testDelayShards()
  change health from string to enum (#20661)
  Provide error message when plugin id is missing
  Document that sliced scroll works for reindex
  Make reindex-from-remote ignore unknown fields
  Remove NoopGatewayAllocator in favor of a more realistic mock (#20637)
  Remove Marvel character reference from guide
  Fix documentation for setting Java I/O temp dir
  Update client benchmarks to log4j2
  Changes the API of GatewayAllocator#applyStartedShards and (#20642)
  Removes FailedRerouteAllocation and StartedRerouteAllocation
  IndexRoutingTable.initializeEmpty shouldn't override supplied primary RecoverySource (#20638)
  Smoke tester: Adjust to latest changes (#20611)
  ...
2016-09-29 00:22:31 +02:00
Martijn van Groningen c99890eda5 test: add a test with ipv6 address 2016-09-28 10:04:20 +02:00
David Pilato 14af343d8d Fix logger when you can not create an azure storage client
We were swallowing the original exception when creating a client with bad credentials.
So even in `TRACE` log level, nothing useful were coming out of it.
With this commit, it now prints:

```
[2016-09-27 15:54:13,118][ERROR][cloud.azure.storage      ] [node_s0] can not create azure storage client: Storage Key is not a valid base64 encoded string.
```

Closes #20633.

Backport of #20669 for master branch (6.0)
2016-09-27 16:28:38 +02:00
Simon Willnauer fe1803c957 Remove AnalysisService and reduce it to a simple name to analyzer mapping (#20627)
Today we hold on to all possible tokenizers, tokenfilters etc. when we create
an index service on a node. This was mainly done to allow the `_analyze` API to
directly access all these primitive. We fixed this in #19827 and can now get rid of
the AnalysisService entirely and replace it with a simple map like class. This
ensures we don't create a gazillion long living objects that are entirely useless since
they are never used in most of the indices. Also those objects might consume a considerable
amount of memory since they might load stopwords or synonyms etc.

Closes #19828
2016-09-23 08:53:50 +02:00
Ali Beyad 5031824291 File-based discovery plugin integration tests (#20492)
Adds an integration test for the file-based discovery plugin
to test the plugin operates correctly and uses the hosts
configured in `unicast_hosts.txt` with a real cluster

Closes #20459
2016-09-21 15:48:18 -04:00
Tanguy Leroux 7645abaad9 Remove duplicate methods in ByteSizeValue (#20560)
This commit removes `ByteSizeValue`'s methods that are duplicated (ex: `mbFrac()` and `getMbFrac()`) in order to only keep the `getN` form.
    
It also renames `mb()` -> `getMb()`, `kb()` -> `getKB()` in order to be more coherent with the `ByteSizeUnit` method names.
2016-09-20 14:07:23 +02:00
Ryan Ernst 85b8f29415 Build: Remove old maven deploy support (#20403)
* Build: Remove old maven deploy support

This change removes the old maven deploy that we have in parallel to
maven-publish, and makes maven-publish fully work with publishing to
maven local. Using `gradle publishToMavenLocal` should be used to
publish to .m2.

Note that there is an unfortunate hack that means for
zip artifacts we must first create/publish a dummy pom file, and then
follow that with the real pom file. It would be nice to have the pom
file contains packaging=zip, but maven central then requires sources and
javadocs. But our zips are really just attached artifacts, so we already
set the packaging type to pom for our zip files. This change just works
around a limitation of the underlying maven publishing library which
silently skips attached artifacts when the packaging type is set to pom.

relates #20164
closes #20375

* Remove unnecessary extra spacing
2016-09-19 15:10:41 -07:00
David Pilato dfd1eebdd0 Remove mapper attachments plugin
We now have in 5.0.0 `ingest-attachment` plugin.
We can remove `mapper-attachments` plugin for 6.0.

Closes #18837.
2016-09-19 09:01:16 +02:00
Simon Willnauer f5daa165f1 Remove ability to plug-in TransportService (#20505)
TransportService is such a central part of the core server, replacing
it's implementation is risky and can cause serious issues. This change removes the ability to
plug in TransportService but allows registering a TransportInterceptor that enables
plugins to intercept requests on both the sender and the receiver ends. This is a commonly used
and overwritten functionality but encapsulates the custom code in a contained manner.
2016-09-16 09:47:53 +02:00
Tal Levy 4704efaef4 [ingest-geoip] do not insert null-valued fields in geoip response
update geoip to not include null-valued results from database

Originally, the plugin would still insert all the requested fields, but
assign null to each one. This fixes that by not writing the fields at
all. Makes for a better experience when the null fields conflict with
the typical geo_point field mapping.
2016-09-13 18:12:02 -07:00
Ali Beyad 4431720c3d File-based discovery plugin (#20394)
This commit introduces a new plugin for file-based unicast hosts
discovery. This allows specifying the unicast hosts participating
in discovery through a `unicast_hosts.txt` file located in the
`config/discovery-file` directory. The plugin will use the hosts 
specified in this file as the set of hosts to ping during discovery.

The format of the `unicast_hosts.txt` file is to have one host/port
entry per line. The hosts file is read and parsed every time
discovery makes ping requests, thus a new version of the file that
is published to the config directory will automatically be picked
up.

Closes #20323
2016-09-13 20:52:39 -04:00
Jim Ferenczi 1764ec56b3 Fixed naming inconsistency for fields/stored_fields in the APIs (#20166)
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following list of endpoint has been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain

The documentation and the rest API spec has been updated to cope with the changes for the following APIs:
* delete_by_query
* get
* mget
* explain

The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering):
* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.

Some APIs still have the `fields` parameter for various reasons:
* cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155
2016-09-13 20:54:41 +02:00
Jason Tedor 981e4f5bc5 Configure AWS SDK logging configuration
Because of security permissions that we do not grant to the AWS SDK (for
use in discovery-ec2 and repository-s3 plugins), certain calls in the
AWS SDK will lead to security exceptions that are logged at the warning
level. These warnings are noise and we should suppress them. This commit
adds plugin log configurations for discovery-ec2 and repository-s3 to
ship with default Log4j 2 configurations that suppress these log
warnings.

Relates #20313
2016-09-03 06:41:07 -04:00
Jack Conradson 222a4fa765 Reduce the number of threads and scripts being used in multi-threaded
tests to prevent OOM from deprecation logging.
2016-09-02 11:56:44 -07:00
Jack Conradson 71d8ee5eac Merge branch 'master' into deprecate 2016-09-01 08:51:29 -07:00
Jack Conradson 3b3baa6e6c Made deprecation of Groovy, Javascript, and Python more explicit. 2016-08-31 15:56:31 -07:00
Jason Tedor 0853fc806f Add missing cast to logging message supplier
This commit adds a missing cast to logging message supplier on a single
invocation receiving a parameterized message parameter.
2016-08-30 18:26:45 -04:00
Jason Tedor abf8a1a3f0 Avoid allocating log parameterized messages
This commit modifies the call sites that allocate a parameterized
message to use a supplier so that allocations are avoided unless the log
level is fine enough to emit the corresponding log message.
2016-08-30 18:17:09 -04:00
Jason Tedor 7da0cdec42 Introduce Log4j 2
This commit introduces Log4j 2 to the stack.
2016-08-30 13:31:24 -04:00
Jack Conradson 7930233527 Deprecate Groovy, Python, and Javascript scripts. 2016-08-30 09:06:18 -07:00
Jun Ohtani 450f47d5b5 Validate blank field name
add validation and validate only 5.0+
Add tests before 5.0

Closes #19251
2016-08-26 20:10:33 +09:00
Adrien Grand 3ed0da5a58 GET operations should not extract fields from `_source`. #20158
This makes GET operations more consistent with `_search` operations which expect
`(stored_)fields` to work on stored fields and source filtering to work on the
`_source` field. This is now possible thanks to the fact that GET operations
do not read from the translog anymore (#20102) and also allows to get rid of
`FieldMapper#isGenerated`.

The `_termvectors` API (and thus more_like_this too) was relying on the fact
that GET operations would extract fields from either stored fields or the source
so the logic to do this that used to exist in `ShardGetService` has been moved
to `TermVectorsService`. It would be nice that term vectors do not rely on this,
but this does not seem to be a low hanging fruit.
2016-08-26 10:35:23 +02:00
Sarwar Bhuiyan b0ceecc3eb Refactored to use Settings object 2016-08-25 17:27:22 -04:00
Chris Earle 1cf694b63e Use StringBuilder in favor of StringBuffer
This removes all instances of StringBuffer that are removeable.

Uncontended synchronization in Java is pretty cheap, but it's unnecessary.
2016-08-25 16:20:03 -04:00
Mike McCandless 0ccfe69789 Upgrade to Lucene 6.2.0 2016-08-24 17:26:28 -04:00
Nik Everett 1452ab4b9f Squash the rest of o.e.rest.action
Squashes all the subpackages of `org.elasticsearch.rest.action` down to
the following:
* `o.e.rest.action.admin` - Administrative actions
* `o.e.rest.action.cat` - Actions that make tables for `grep`ing
* `o.e.rest.action.document` - Actions that act on documents
* `o.e.rest.action.ingest` - Actions that act on ingest pipelines
* `o.e.rest.action.search` - Actions that search

I'm tempted to merge `search` into `document` but the `document`
package feels fairly complete as is and `Suggest` isn't actually always
about documents either....

I'm also tempted to merge `ingest` into `admin.cluster` because the
latter contains the actions for dealing with stored scripts.

I've moved the `o.e.rest.action.support` into `o.e.rest.action`.

I've also added `package-info.java`s to all packges in `o.e.rest`. I
figure if the package is too small to deserve a `package-info.java` file
then it is too small to deserve to be a package....

Also fixes checkstyle in all moved classes.
2016-08-15 21:06:32 -04:00
Nik Everett 9f8f2ea54b Remove ESIntegTestCase#pluginList
It was a useful method in 1.7 when javac's type inference wasn't as
good, but now we can just replace it with `Arrays.asList`.
2016-08-11 15:44:02 -04:00
Nik Everett e07e5d66fa Make reindex and lang-javascript compatible
Fixes two issues:
1. lang-javascript doesn't support `executable` with a `null` `vars`
parameters. The parameter is quite nullable.
2. reindex didn't support script engines who's `unwrap` method wasn't
a noop. This didn't come up for lang-groovy or lang-painless because
both of those `unwrap`s were noops. lang-javascript copys all maps that
it `unwrap`s.

This adds fairly low level unit tests for these fixes but dosen't add
an integration test that makes sure that reindex and lang-javascript
play well together. That'd make backporting this difficult and would
add a fairly significant amount of time to the build for a fairly rare
interaction. Hopefully the unit tests will be enough.
2016-08-11 09:54:03 -04:00
David Pilato 42f851cf49 Merge branch 'master' into fix/19924-attachment 2016-08-10 19:05:22 +02:00
David Pilato 905684fe73 Adds content-length as number
If you run Elasticsearch with the ingest-attachment plugin:

```sh
gradle plugins:ingest-attachment:run
```

And then you use it on a document:

```js
 PUT _ingest/pipeline/attachment
 {
   "description" : "Extract attachment information",
   "processors" : [
     {
       "attachment" : {
         "field" : "data"
       }
     }
   ]
 }
 PUT my_index/my_type/my_id?pipeline=attachment
 {
   "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
 }
 GET my_index/my_type/my_id
```

 You were getting this back:

```js
 # PUT _ingest/pipeline/attachment
 {
   "acknowledged": true
 }

 # PUT my_index/my_type/my_id?pipeline=attachment
 {
   "_index": "my_index",
   "_type": "my_type",
   "_id": "my_id",
   "_version": 2,
   "result": "updated",
   "_shards": {
     "total": 2,
     "successful": 1,
     "failed": 0
   },
   "created": false
 }

 # GET my_index/my_type/my_id
 {
   "_index": "my_index",
   "_type": "my_type",
   "_id": "my_id",
   "_version": 2,
   "found": true,
   "_source": {
     "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=",
     "attachment": {
       "content_type": "application/rtf",
       "language": "ro",
       "content": "Lorem ipsum dolor sit amet",
       "content_length": "28"
     }
   }
 }
```

With this commit you are now getting:

```
 # GET my_index/my_type/my_id
 {
   "_index": "my_index",
   "_type": "my_type",
   "_id": "my_id",
   "_version": 2,
   "found": true,
   "_source": {
     "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=",
     "attachment": {
       "content_type": "application/rtf",
       "language": "ro",
       "content": "Lorem ipsum dolor sit amet",
       "content_length": 28
     }
   }
 }
```

Closes #19924
2016-08-10 18:31:16 +02:00
Adrien Grand 0d6ac57acf Collapse o.e.index.mapper packages. #19921
I also reduced the visibility of a couple classes and renamed/consolidated some
test classes for consistency, eg. removing the `Simple` prefix or using the
`<Type>FieldMapperTests` convention for testing field mappers.
2016-08-10 17:51:11 +02:00
Lee Hinman 5849c488b5 Merge remote-tracking branch 'dakrone/compliation-breaker' 2016-08-09 11:57:26 -06:00
Lee Hinman 2be52eff09 Circuit break the number of inline scripts compiled per minute
When compiling many dynamically changing scripts, parameterized
scripts (<https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-using.html#prefer-params>)
should be preferred. This enforces a limit to the number of scripts that
can be compiled within a minute. A new dynamic setting is added -
`script.max_compilations_per_minute`, which defaults to 15.

If more dynamic scripts are sent, a user will get the following
exception:

```json
{
  "error" : {
    "root_cause" : [
      {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "i",
        "node" : "a5V1eXcZRYiIk8lecjZ4Jw",
        "reason" : {
          "type" : "general_script_exception",
          "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
          "caused_by" : {
            "type" : "circuit_breaking_exception",
            "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
            "bytes_wanted" : 0,
            "bytes_limit" : 0
          }
        }
      }
    ],
    "caused_by" : {
      "type" : "general_script_exception",
      "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
      "caused_by" : {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    }
  },
  "status" : 500
}
```

This also fixes a bug in `ScriptService` where requests being executed
concurrently on a single node could cause a script to be compiled
multiple times (many in the case of a powerful node with many shards)
due to no synchronization between checking the cache and compiling the
script. There is now synchronization so that a script being compiled
will only be compiled once regardless of the number of concurrent
searches on a node.

Relates to #19396
2016-08-09 10:26:27 -06:00
Ali Beyad f59ca9083b Snapshot repository cleans up empty index folders (#19751)
This commit cleans up indices in a snapshot repository when all
snapshots containing the index are all deleted. Previously, empty
indices folders would lay around after all snapshots containing
them were deleted.
2016-08-05 09:39:02 -04:00
David Pilato 6b9a084086 Merge branch 'pr/19557-extract-aws-key' 2016-08-04 17:48:44 +02:00
Ali Beyad c4ae23f5d8 Enables implementations of the BlobContainer interface to (#19749)
conform with the requirements of the writeBlob method by
throwing a FileAlreadyExistsException if attempting to write
to a blob that already exists. This change means implementations
of BlobContainer should never overwrite blobs - to overwrite a
blob, it must first be deleted and then can be written again.

Closes #15579
2016-08-02 09:48:21 -04:00
Ali Beyad 456ea56527 Cleans up the BlobContainer interface by removing the (#19727)
writeBlob method takes a BytesReference in favor of just
the writeBlob method that takes an InputStream.

Closes #18528
2016-08-02 09:21:43 -04:00
Ali Beyad 9f88a8194a Merge pull request #19706 from elastic/enhancement/snapshot-blob-handling
More resilient blob handling in snapshot repositories
2016-08-01 12:03:53 -04:00
Ali Beyad 401edeb0d8 AzureBlobContainer's deleteBlob method now throws a NoSuchFileException
instead of a vanilla IOException when the blob doesn't exist, in order
to conform to the BlobContainer's interface contract.
2016-08-01 10:50:02 -04:00
Nik Everett 303c9faca5 Squash o.e.rest.action.admin.cluster
In an effort to reduce the number of tiny packages we have in the
code base this moves all the files that were in subdirectories of
`org.elasticsearch.rest.action.admin.cluster` into
`org.elasticsearch.rest.action.admin.cluster`.

Also fixes line length in these packages.
2016-07-29 20:31:24 -04:00