Commit Graph

1542 Commits

Author SHA1 Message Date
Simon Willnauer 194a6b1df0 Remove LocalTransport in favor of MockTcpTransport (#20695)
This change proposes the removal of all non-tcp transport implementations. The
mock transport can be used by default to run tests instead of local transport that has
roughly the same performance compared to TCP or at least not noticeably slower.

This is a master only change, deprecation notice in 5.x will be committed as a
separate change.
2016-10-07 11:27:47 +02:00
David Pilato 591a8d4ec6 Merge branch 'fix/20669-master-azure-log' 2016-10-06 16:00:43 +02:00
Martijn van Groningen 6a5630f901 ingest: Upgrade geoip2 dependency
Closes #20563
2016-10-05 09:31:55 +02:00
Jason Tedor 51d53791fe Remove lenient URL parameter parsing
Today when parsing a request, Elasticsearch silently ignores incorrect
(including parameters with typos) or unused parameters. This is bad as
it leads to requests having unintended behavior (e.g., if a user hits
the _analyze API and misspell the "tokenizer" then Elasticsearch will
just use the standard analyzer, completely against intentions).

This commit removes lenient URL parameter parsing. The strategy is
simple: when a request is handled and a parameter is touched, we mark it
as such. Before the request is actually executed, we check to ensure
that all parameters have been consumed. If there are remaining
parameters yet to be consumed, we fail the request with a list of the
unconsumed parameters. An exception has to be made for parameters that
format the response (as opposed to controlling the request); for this
case, handlers are able to provide a list of parameters that should be
excluded from tripping the unconsumed parameters check because those
parameters will be used in formatting the response.

Additionally, some inconsistencies between the parameters in the code
and in the docs are corrected.

Relates #20722
2016-10-04 12:45:29 -04:00
Tanguy Leroux 857e861d32 [Docs] Log snapshot shard failures in AzureSnapshotRestoreServiceIntegTests
This commit adds logs when a snapshot has failures for some snapshoted shards.
2016-10-03 15:04:37 +02:00
Martijn van Groningen c99890eda5 test: add a test with ipv6 address 2016-09-28 10:04:20 +02:00
David Pilato 14af343d8d Fix logger when you can not create an azure storage client
We were swallowing the original exception when creating a client with bad credentials.
So even in `TRACE` log level, nothing useful were coming out of it.
With this commit, it now prints:

```
[2016-09-27 15:54:13,118][ERROR][cloud.azure.storage      ] [node_s0] can not create azure storage client: Storage Key is not a valid base64 encoded string.
```

Closes #20633.

Backport of #20669 for master branch (6.0)
2016-09-27 16:28:38 +02:00
Simon Willnauer fe1803c957 Remove AnalysisService and reduce it to a simple name to analyzer mapping (#20627)
Today we hold on to all possible tokenizers, tokenfilters etc. when we create
an index service on a node. This was mainly done to allow the `_analyze` API to
directly access all these primitive. We fixed this in #19827 and can now get rid of
the AnalysisService entirely and replace it with a simple map like class. This
ensures we don't create a gazillion long living objects that are entirely useless since
they are never used in most of the indices. Also those objects might consume a considerable
amount of memory since they might load stopwords or synonyms etc.

Closes #19828
2016-09-23 08:53:50 +02:00
Ali Beyad 5031824291 File-based discovery plugin integration tests (#20492)
Adds an integration test for the file-based discovery plugin
to test the plugin operates correctly and uses the hosts
configured in `unicast_hosts.txt` with a real cluster

Closes #20459
2016-09-21 15:48:18 -04:00
Tanguy Leroux 7645abaad9 Remove duplicate methods in ByteSizeValue (#20560)
This commit removes `ByteSizeValue`'s methods that are duplicated (ex: `mbFrac()` and `getMbFrac()`) in order to only keep the `getN` form.
    
It also renames `mb()` -> `getMb()`, `kb()` -> `getKB()` in order to be more coherent with the `ByteSizeUnit` method names.
2016-09-20 14:07:23 +02:00
Ryan Ernst 85b8f29415 Build: Remove old maven deploy support (#20403)
* Build: Remove old maven deploy support

This change removes the old maven deploy that we have in parallel to
maven-publish, and makes maven-publish fully work with publishing to
maven local. Using `gradle publishToMavenLocal` should be used to
publish to .m2.

Note that there is an unfortunate hack that means for
zip artifacts we must first create/publish a dummy pom file, and then
follow that with the real pom file. It would be nice to have the pom
file contains packaging=zip, but maven central then requires sources and
javadocs. But our zips are really just attached artifacts, so we already
set the packaging type to pom for our zip files. This change just works
around a limitation of the underlying maven publishing library which
silently skips attached artifacts when the packaging type is set to pom.

relates #20164
closes #20375

* Remove unnecessary extra spacing
2016-09-19 15:10:41 -07:00
David Pilato dfd1eebdd0 Remove mapper attachments plugin
We now have in 5.0.0 `ingest-attachment` plugin.
We can remove `mapper-attachments` plugin for 6.0.

Closes #18837.
2016-09-19 09:01:16 +02:00
Simon Willnauer f5daa165f1 Remove ability to plug-in TransportService (#20505)
TransportService is such a central part of the core server, replacing
it's implementation is risky and can cause serious issues. This change removes the ability to
plug in TransportService but allows registering a TransportInterceptor that enables
plugins to intercept requests on both the sender and the receiver ends. This is a commonly used
and overwritten functionality but encapsulates the custom code in a contained manner.
2016-09-16 09:47:53 +02:00
Tal Levy 4704efaef4 [ingest-geoip] do not insert null-valued fields in geoip response
update geoip to not include null-valued results from database

Originally, the plugin would still insert all the requested fields, but
assign null to each one. This fixes that by not writing the fields at
all. Makes for a better experience when the null fields conflict with
the typical geo_point field mapping.
2016-09-13 18:12:02 -07:00
Ali Beyad 4431720c3d File-based discovery plugin (#20394)
This commit introduces a new plugin for file-based unicast hosts
discovery. This allows specifying the unicast hosts participating
in discovery through a `unicast_hosts.txt` file located in the
`config/discovery-file` directory. The plugin will use the hosts 
specified in this file as the set of hosts to ping during discovery.

The format of the `unicast_hosts.txt` file is to have one host/port
entry per line. The hosts file is read and parsed every time
discovery makes ping requests, thus a new version of the file that
is published to the config directory will automatically be picked
up.

Closes #20323
2016-09-13 20:52:39 -04:00
Jim Ferenczi 1764ec56b3 Fixed naming inconsistency for fields/stored_fields in the APIs (#20166)
This change replaces the fields parameter with stored_fields when it makes sense.
This is dictated by the renaming we made in #18943 for the search API.

The following list of endpoint has been changed to use `stored_fields` instead of `fields`:
* get
* mget
* explain

The documentation and the rest API spec has been updated to cope with the changes for the following APIs:
* delete_by_query
* get
* mget
* explain

The `fields` parameter has been deprecated for the following APIs (it is replaced by _source filtering):
* update: the fields are extracted from the _source directly.
* bulk: the fields parameter is used but fields are extracted from the source directly so it is allowed to have non-stored fields.

Some APIs still have the `fields` parameter for various reasons:
* cat.fielddata: the fields paramaters relates to the fielddata fields that should be printed.
* indices.clear_cache: used to indicate which fielddata fields should be cleared.
* indices.get_field_mapping: used to filter fields in the mapping.
* indices.stats: get stats on fields (stored or not stored).
* termvectors: fields are retrieved from the stored fields if possible and extracted from the _source otherwise.
* mtermvectors:
* nodes.stats: the fields parameter is used to concatenate completion_fields and fielddata_fields so it's not related to stored_fields at all.

Fixes #20155
2016-09-13 20:54:41 +02:00
Jason Tedor 981e4f5bc5 Configure AWS SDK logging configuration
Because of security permissions that we do not grant to the AWS SDK (for
use in discovery-ec2 and repository-s3 plugins), certain calls in the
AWS SDK will lead to security exceptions that are logged at the warning
level. These warnings are noise and we should suppress them. This commit
adds plugin log configurations for discovery-ec2 and repository-s3 to
ship with default Log4j 2 configurations that suppress these log
warnings.

Relates #20313
2016-09-03 06:41:07 -04:00
Jack Conradson 222a4fa765 Reduce the number of threads and scripts being used in multi-threaded
tests to prevent OOM from deprecation logging.
2016-09-02 11:56:44 -07:00
Jack Conradson 71d8ee5eac Merge branch 'master' into deprecate 2016-09-01 08:51:29 -07:00
Jack Conradson 3b3baa6e6c Made deprecation of Groovy, Javascript, and Python more explicit. 2016-08-31 15:56:31 -07:00
Jason Tedor 0853fc806f Add missing cast to logging message supplier
This commit adds a missing cast to logging message supplier on a single
invocation receiving a parameterized message parameter.
2016-08-30 18:26:45 -04:00
Jason Tedor abf8a1a3f0 Avoid allocating log parameterized messages
This commit modifies the call sites that allocate a parameterized
message to use a supplier so that allocations are avoided unless the log
level is fine enough to emit the corresponding log message.
2016-08-30 18:17:09 -04:00
Jason Tedor 7da0cdec42 Introduce Log4j 2
This commit introduces Log4j 2 to the stack.
2016-08-30 13:31:24 -04:00
Jack Conradson 7930233527 Deprecate Groovy, Python, and Javascript scripts. 2016-08-30 09:06:18 -07:00
Jun Ohtani 450f47d5b5 Validate blank field name
add validation and validate only 5.0+
Add tests before 5.0

Closes #19251
2016-08-26 20:10:33 +09:00
Adrien Grand 3ed0da5a58 GET operations should not extract fields from `_source`. #20158
This makes GET operations more consistent with `_search` operations which expect
`(stored_)fields` to work on stored fields and source filtering to work on the
`_source` field. This is now possible thanks to the fact that GET operations
do not read from the translog anymore (#20102) and also allows to get rid of
`FieldMapper#isGenerated`.

The `_termvectors` API (and thus more_like_this too) was relying on the fact
that GET operations would extract fields from either stored fields or the source
so the logic to do this that used to exist in `ShardGetService` has been moved
to `TermVectorsService`. It would be nice that term vectors do not rely on this,
but this does not seem to be a low hanging fruit.
2016-08-26 10:35:23 +02:00
Sarwar Bhuiyan b0ceecc3eb Refactored to use Settings object 2016-08-25 17:27:22 -04:00
Chris Earle 1cf694b63e Use StringBuilder in favor of StringBuffer
This removes all instances of StringBuffer that are removeable.

Uncontended synchronization in Java is pretty cheap, but it's unnecessary.
2016-08-25 16:20:03 -04:00
Mike McCandless 0ccfe69789 Upgrade to Lucene 6.2.0 2016-08-24 17:26:28 -04:00
Nik Everett 1452ab4b9f Squash the rest of o.e.rest.action
Squashes all the subpackages of `org.elasticsearch.rest.action` down to
the following:
* `o.e.rest.action.admin` - Administrative actions
* `o.e.rest.action.cat` - Actions that make tables for `grep`ing
* `o.e.rest.action.document` - Actions that act on documents
* `o.e.rest.action.ingest` - Actions that act on ingest pipelines
* `o.e.rest.action.search` - Actions that search

I'm tempted to merge `search` into `document` but the `document`
package feels fairly complete as is and `Suggest` isn't actually always
about documents either....

I'm also tempted to merge `ingest` into `admin.cluster` because the
latter contains the actions for dealing with stored scripts.

I've moved the `o.e.rest.action.support` into `o.e.rest.action`.

I've also added `package-info.java`s to all packges in `o.e.rest`. I
figure if the package is too small to deserve a `package-info.java` file
then it is too small to deserve to be a package....

Also fixes checkstyle in all moved classes.
2016-08-15 21:06:32 -04:00
Nik Everett 9f8f2ea54b Remove ESIntegTestCase#pluginList
It was a useful method in 1.7 when javac's type inference wasn't as
good, but now we can just replace it with `Arrays.asList`.
2016-08-11 15:44:02 -04:00
Nik Everett e07e5d66fa Make reindex and lang-javascript compatible
Fixes two issues:
1. lang-javascript doesn't support `executable` with a `null` `vars`
parameters. The parameter is quite nullable.
2. reindex didn't support script engines who's `unwrap` method wasn't
a noop. This didn't come up for lang-groovy or lang-painless because
both of those `unwrap`s were noops. lang-javascript copys all maps that
it `unwrap`s.

This adds fairly low level unit tests for these fixes but dosen't add
an integration test that makes sure that reindex and lang-javascript
play well together. That'd make backporting this difficult and would
add a fairly significant amount of time to the build for a fairly rare
interaction. Hopefully the unit tests will be enough.
2016-08-11 09:54:03 -04:00
David Pilato 42f851cf49 Merge branch 'master' into fix/19924-attachment 2016-08-10 19:05:22 +02:00
David Pilato 905684fe73 Adds content-length as number
If you run Elasticsearch with the ingest-attachment plugin:

```sh
gradle plugins:ingest-attachment:run
```

And then you use it on a document:

```js
 PUT _ingest/pipeline/attachment
 {
   "description" : "Extract attachment information",
   "processors" : [
     {
       "attachment" : {
         "field" : "data"
       }
     }
   ]
 }
 PUT my_index/my_type/my_id?pipeline=attachment
 {
   "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0="
 }
 GET my_index/my_type/my_id
```

 You were getting this back:

```js
 # PUT _ingest/pipeline/attachment
 {
   "acknowledged": true
 }

 # PUT my_index/my_type/my_id?pipeline=attachment
 {
   "_index": "my_index",
   "_type": "my_type",
   "_id": "my_id",
   "_version": 2,
   "result": "updated",
   "_shards": {
     "total": 2,
     "successful": 1,
     "failed": 0
   },
   "created": false
 }

 # GET my_index/my_type/my_id
 {
   "_index": "my_index",
   "_type": "my_type",
   "_id": "my_id",
   "_version": 2,
   "found": true,
   "_source": {
     "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=",
     "attachment": {
       "content_type": "application/rtf",
       "language": "ro",
       "content": "Lorem ipsum dolor sit amet",
       "content_length": "28"
     }
   }
 }
```

With this commit you are now getting:

```
 # GET my_index/my_type/my_id
 {
   "_index": "my_index",
   "_type": "my_type",
   "_id": "my_id",
   "_version": 2,
   "found": true,
   "_source": {
     "data": "e1xydGYxXGFuc2kNCkxvcmVtIGlwc3VtIGRvbG9yIHNpdCBhbWV0DQpccGFyIH0=",
     "attachment": {
       "content_type": "application/rtf",
       "language": "ro",
       "content": "Lorem ipsum dolor sit amet",
       "content_length": 28
     }
   }
 }
```

Closes #19924
2016-08-10 18:31:16 +02:00
Adrien Grand 0d6ac57acf Collapse o.e.index.mapper packages. #19921
I also reduced the visibility of a couple classes and renamed/consolidated some
test classes for consistency, eg. removing the `Simple` prefix or using the
`<Type>FieldMapperTests` convention for testing field mappers.
2016-08-10 17:51:11 +02:00
Lee Hinman 5849c488b5 Merge remote-tracking branch 'dakrone/compliation-breaker' 2016-08-09 11:57:26 -06:00
Lee Hinman 2be52eff09 Circuit break the number of inline scripts compiled per minute
When compiling many dynamically changing scripts, parameterized
scripts (<https://www.elastic.co/guide/en/elasticsearch/reference/master/modules-scripting-using.html#prefer-params>)
should be preferred. This enforces a limit to the number of scripts that
can be compiled within a minute. A new dynamic setting is added -
`script.max_compilations_per_minute`, which defaults to 15.

If more dynamic scripts are sent, a user will get the following
exception:

```json
{
  "error" : {
    "root_cause" : [
      {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    ],
    "type" : "search_phase_execution_exception",
    "reason" : "all shards failed",
    "phase" : "query",
    "grouped" : true,
    "failed_shards" : [
      {
        "shard" : 0,
        "index" : "i",
        "node" : "a5V1eXcZRYiIk8lecjZ4Jw",
        "reason" : {
          "type" : "general_script_exception",
          "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
          "caused_by" : {
            "type" : "circuit_breaking_exception",
            "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
            "bytes_wanted" : 0,
            "bytes_limit" : 0
          }
        }
      }
    ],
    "caused_by" : {
      "type" : "general_script_exception",
      "reason" : "Failed to compile inline script [\"aaaaaaaaaaaaaaaa\"] using lang [painless]",
      "caused_by" : {
        "type" : "circuit_breaking_exception",
        "reason" : "[script] Too many dynamic script compilations within one minute, max: [15/min]; please use on-disk, indexed, or scripts with parameters instead",
        "bytes_wanted" : 0,
        "bytes_limit" : 0
      }
    }
  },
  "status" : 500
}
```

This also fixes a bug in `ScriptService` where requests being executed
concurrently on a single node could cause a script to be compiled
multiple times (many in the case of a powerful node with many shards)
due to no synchronization between checking the cache and compiling the
script. There is now synchronization so that a script being compiled
will only be compiled once regardless of the number of concurrent
searches on a node.

Relates to #19396
2016-08-09 10:26:27 -06:00
Ali Beyad f59ca9083b Snapshot repository cleans up empty index folders (#19751)
This commit cleans up indices in a snapshot repository when all
snapshots containing the index are all deleted. Previously, empty
indices folders would lay around after all snapshots containing
them were deleted.
2016-08-05 09:39:02 -04:00
David Pilato 6b9a084086 Merge branch 'pr/19557-extract-aws-key' 2016-08-04 17:48:44 +02:00
Ali Beyad c4ae23f5d8 Enables implementations of the BlobContainer interface to (#19749)
conform with the requirements of the writeBlob method by
throwing a FileAlreadyExistsException if attempting to write
to a blob that already exists. This change means implementations
of BlobContainer should never overwrite blobs - to overwrite a
blob, it must first be deleted and then can be written again.

Closes #15579
2016-08-02 09:48:21 -04:00
Ali Beyad 456ea56527 Cleans up the BlobContainer interface by removing the (#19727)
writeBlob method takes a BytesReference in favor of just
the writeBlob method that takes an InputStream.

Closes #18528
2016-08-02 09:21:43 -04:00
Ali Beyad 9f88a8194a Merge pull request #19706 from elastic/enhancement/snapshot-blob-handling
More resilient blob handling in snapshot repositories
2016-08-01 12:03:53 -04:00
Ali Beyad 401edeb0d8 AzureBlobContainer's deleteBlob method now throws a NoSuchFileException
instead of a vanilla IOException when the blob doesn't exist, in order
to conform to the BlobContainer's interface contract.
2016-08-01 10:50:02 -04:00
Nik Everett 303c9faca5 Squash o.e.rest.action.admin.cluster
In an effort to reduce the number of tiny packages we have in the
code base this moves all the files that were in subdirectories of
`org.elasticsearch.rest.action.admin.cluster` into
`org.elasticsearch.rest.action.admin.cluster`.

Also fixes line length in these packages.
2016-07-29 20:31:24 -04:00
David Pilato 6b68d1e09b Fix typo in comment 2016-07-29 13:49:11 +02:00
David Pilato f8e0557be5 Extract AWS Key from KeyChain instead of using potential null value
While I was working on #18703, I discovered a bad behavior when people don't provide AWS key/secret as part as their `elasticsearch.yml` but rely on SysProps or env. variables...

In [`InternalAwsS3Service#getClient(...)`](d4366f8493/plugins/repository-s3/src/main/java/org/elasticsearch/cloud/aws/InternalAwsS3Service.java (L76-L141)), we have:

```java
        Tuple<String, String> clientDescriptor = new Tuple<>(endpoint, account);
        AmazonS3Client client = clients.get(clientDescriptor);
```

But if people don't provide credentials, `account` is `null`.

Even if it actually could work, I think that we should use the `AWSCredentialsProvider` we create later on and extract from it the `account` (AWS KEY actually) and then use it as the second value of the tuple.

Closes #19557.
2016-07-28 18:05:51 +02:00
David Pilato 3adccd4560 Merge branch 'pr/19556-use-DefaultAWSCredentialsProviderChain' 2016-07-28 17:38:52 +02:00
David Pilato fb9bad23de Rename GceMetadataServiceImpl to GceMetadataService
See https://github.com/elastic/elasticsearch/pull/15765/files#r65527203
2016-07-27 13:28:53 +02:00
David Pilato e9339a1960 Merge branch 'master' into pr/15724-gce-network-host-master 2016-07-27 11:24:53 +02:00
Nik Everett 9270e8b22b Rename client yaml test infrastructure
This makes it obvious that these tests are for running the client yaml
suites. Now that there are other ways of running tests using the REST
client against a running cluster we can't go on calling the shared
client yaml tests "REST tests". They are rest tests, but they aren't
**the** rest tests.
2016-07-26 13:53:44 -04:00