1694 Commits

Author SHA1 Message Date
Jason Tedor
15d3d74444 Merge branch 'master' into feature/seq_no
* master: (904 commits)
  Removes unused methods in the o/e/common/Strings class.
  Add note regarding thread stack size on Windows
  painless: restore accidentally removed test
  Documented fuzzy_transpositions in match query
  Add not-null precondition check in BulkRequest
  Build: Make run task you full zip distribution
  Build: More pom generation improvements
  Add test for wrong array index
  Take return type from "after" field.
  painless: build descriptor of array and field load/store in code; fix array index to adapt type not DEF
  Build: Add developer info to generated pom files
  painless: improve exception stacktraces
  painless: Rename the dynamic call site factory to DefBootstrap and make the inner class very short (PIC = Polymorphic Inline Cache)
  Remove dead code.
  Avoid race while retiring executors
  Allow only a single extension for a scripting engine
  Adding REST tests to ensure key_as_string behavior stays consistent
  [test] Set logging to 11 on reindex test
  [TEST] increase logger level until we know what is going on
  Don't allow `fuzziness` for `multi_match` types cross_fields, phrase and phrase_prefix
  ...
2016-05-14 20:23:59 -04:00
Robert Muir
2028691e66 painless: improve exception stacktraces
closes #18319
2016-05-13 15:40:45 -04:00
Lee Hinman
9bcdafedda Allow only a single extension for a scripting engine
Previously multiple extensions could be provided, however, this can lead
to confusion with on-disk scripts (ie, "foo.js" and "foo.javascript")
having different content. Only a single extension is now supported.

The only language currently supporting multiple extensions was the
Javascript engine ("js" and "javascript"). It now only supports the
`.js` extension.

Relates to #10598
2016-05-13 09:54:31 -06:00
Lee Hinman
efff3918d8 Remove support for mulitple languages per scripting engine 2016-05-13 09:24:31 -06:00
Lee Hinman
a4060f7436 Remove vestiges of script engine sandboxing
This removes all the mentions of the sandbox from the script engine
services and permissions model. This means that the following settings
are no longer supported:

```yaml
script.inline: sandbox
script.stored: sandbox
```

Instead, only a `true` or `false` value can be specified.

Since this would otherwise break the default-allow parameter for
languages like expressions, painless, and mustache, all script engines
have been updated to have individual settings, for instance:

```yaml
script.engine.groovy.inline: true
```

Would enable all inline scripts for groovy. (they can still be
overridden on a per-operation basis).

Expressions, Painless, and Mustache all default to `true` for inline,
file, and stored scripts to preserve the old scripting behavior.

Resolves #17114
2016-05-13 09:24:31 -06:00
John Barker
531dcbf20a Add TAG_SETTING to list of allowed tags for the ec2 discovery plugin.
I am unable to set ec2 discovery tags because this setting was
accidentally omitted from the register settings list in
Ec2DiscoveryPlugin.java. I get this:

java.lang.IllegalArgumentException: unknown setting [discovery.ec2.tag.project]
2016-05-10 16:19:46 -04:00
David Pilato
e8ddf5de2f Merge branch 'pr/hide-s3-repositories-credentials' 2016-05-10 20:22:39 +02:00
Adrien Grand
f481492af3 Remove FieldMapper.Builder.indexName. #18219
The ability to configure index names that are different from the full name was
removed in 2.0.
2016-05-10 08:17:00 +02:00
Adrien Grand
5d8f684319 Mapping cleanups. #18180
This removes dead/duplicate code and makes the `_index` field not configurable.
(Configuration used to jus be ignored, now we would throw an exception if any
is provided.)
2016-05-10 08:14:18 +02:00
Chris Earle
5be79ed02c Add Failure Details to every NodesResponse
Most of the current implementations of BaseNodesResponse (plural Nodes) ignore FailedNodeExceptions.

- This adds a helper function to do the grouping to TransportNodesAction
- Requires a non-null array of FailedNodeExceptions within the BaseNodesResponse constructor
- Reads/writes the array to output
- Also adds StreamInput and StreamOutput methods for generically reading and writing arrays
2016-05-06 14:59:43 -04:00
Adrien Grand
7d8708716e QueryBuilder does not need generics. #18133
QueryBuilder has generics, but those are never used: all call sites use
`QueryBuilder<?>`. Only `AbstractQueryBuilder` needs generics so that the base
class can contain a default implementation for setters that returns `this`.
2016-05-06 08:38:20 +02:00
Jason Tedor
784c9e5fb9 Introduce node handshake
This commit introduces a handshake when initiating a light
connection. During this handshake, node information, cluster name, and
version are received from the target node of the connection. This
information can be used to immediately validate that the target node is
a member of the same cluster, and used to set the version on the
stream. This will allow us to extend APIs that are used during initial
cluster recovery without a major version change.

Relates #15971
2016-05-04 20:06:47 -04:00
Jason Tedor
2dea449949 Remove Strings#splitStringToArray
This commit removes the method Strings#splitStringToArray and replaces
the call sites with invocations to String#split. There are only two
explanations for the existence of this method. The first is that
String#split is slightly tricky in that it accepts a regular expression
rather than a character to split on. This means that if s is a string,
s.split(".")  does not split on the character '.', but rather splits on
the regular expression '.' which splits on every character (of course,
this is easily fixed by invoking s.split("\\.") instead). The second
possible explanation is that (again) String#split accepts a regular
expression. This means that there could be a performance concern
compared to just splitting on a single character. However, it turns out
that String#split has a fast path for the case of splitting on a single
character and microbenchmarks show that String#split has 1.5x--2x the
throughput of Strings#splitStringToArray. There is a slight behavior
difference between Strings#splitStringToArray and String#split: namely,
the former would return an empty array in cases when the input string
was null or empty but String#split will just NPE at the call site on
null and return a one-element array containing the empty string when the
input string is empty. There was only one place relying on this behavior
and the call site has been modified accordingly.
2016-05-04 08:12:41 -04:00
Martijn van Groningen
7aca1389e2 ingest: Add date_index_name processor.
Closes #17814
2016-04-29 17:20:48 +02:00
Tal Levy
07c2fbf83a Validate properties values according to database type (#17940)
Fixes #17683.
2016-04-29 07:58:27 -07:00
David Pilato
c16d309c8c Allow _gce_ network when not using discovery gce
For now we support `_gce_` only if discovery is set to `gce` and all information about GCE is provided (project_id and zone).
But in some cases, people would like to only bind to `_gce_` on a single node (without any elasticsearch cluster).

They could access the machine then from other machines running inside the same project.

This commit adds a new GceMetadataService which is started as soon as the plugin is started so GceNameResolver can use it to resolve `_gce`.

Closes #15724.
2016-04-29 16:56:24 +02:00
Yannick Welsch
37382ecfb2 Add Azure discovery tests mocking Azure management endpoint (#18004) 2016-04-29 15:54:15 +02:00
David Pilato
7cc8a1419b Update after rebase onto master 2016-04-29 15:39:51 +02:00
David Pilato
d7eb375d24 Merge branch 'master' into pr/s3-path-style-access
# Conflicts:
#	plugins/repository-s3/src/main/java/org/elasticsearch/cloud/aws/AwsS3Service.java
#	plugins/repository-s3/src/main/java/org/elasticsearch/cloud/aws/InternalAwsS3Service.java
#	plugins/repository-s3/src/main/java/org/elasticsearch/repositories/s3/S3Repository.java
#	plugins/repository-s3/src/test/java/org/elasticsearch/cloud/aws/TestAwsS3Service.java
2016-04-29 15:21:16 +02:00
David Pilato
6c7a44ccd9 Fix test in mapper attachments plugin 2016-04-29 15:02:04 +02:00
David Pilato
2636703afa Merge branch 'master' into pr/attachments-add-test-forced-values 2016-04-29 14:55:42 +02:00
David Pilato
faa3c6ef3c Add new UnsupportedException for EC Mock 2016-04-29 14:41:57 +02:00
David Pilato
6ef81c5dcd S3 repositories credentials should be filtered
When working on #18008 I found while reading the code that we don't filter anymore `repositories.s3.access_key` and `repositories.s3.secret_key`.

Also fixed a typo in REST test
2016-04-27 14:11:17 +02:00
Alexander Reelsen
f71eb0b888 Version: Set version to 5.0.0-alpha2 2016-04-26 09:30:26 +02:00
Xu Zhang
3e4b470f83 Fix icu IndexScope setting 2016-04-22 15:03:02 -07:00
Ryan Ernst
d12a4bb51d Merge pull request #17933 from rjernst/camelcase4
Remove camelCase support
2016-04-22 13:46:43 -07:00
xuzha
cd527c5b92 Add support for customizing the rule file in ICU tokenizer
Lucene allows to create a ICUTokenizer with a special config argument
enabling the customization of the rule based iterator by providing
custom rules files.

This commit enable this feature. Users could provide a list of RBBI rule
files to ICU tokenizer.

closes #13146
2016-04-22 12:39:20 -07:00
Ryan Ernst
55388590c1 Remove camelCase support
Now that the current uses of magical camelCase support have been
deprecated, we can remove these in master (sans remaining issues like
BulkRequest). This change removes camel case support from ParseField,
query types, analysis, and settings lookup.

see #8988
2016-04-22 09:18:10 -07:00
Martijn van Groningen
c5ad2e2865 Changed indexed scripts to be stored in the cluster state instead of the .scripts index.
Also added max script size soft limit for stored scripts.

Closes #16651
2016-04-22 13:42:55 +02:00
Martijn van Groningen
dd2184ab25 ingest: Streamline option naming for several processors:
* `rename` processor, renamed `to` to `target_field`
* `date` processor, renamed `match_field` to `field` and renamed `match_formats` to `formats`
* `geoip` processor, renamed `source_field` to `field` and renamed `fields` to `properties`
* `attachment` processor, renamed `source_field` to `field` and renamed `fields` to `properties`

Closes #17835
2016-04-21 13:40:43 +02:00
Jun Ohtani
9eb242a5fe Analyze API : Rename filters/token_filters/char_filter to filter/token_filter/char_filter
Closes #15189
2016-04-21 18:05:11 +09:00
Ryan Ernst
523b071836 Internal: Remove XContentBuilderString
This was previously used by xcontentbuilder to support camelCase.
However, it is no longer used, and can be replaced with just String.
2016-04-18 14:32:18 -07:00
Nik Everett
ff9b28d806 Deprecate remaining readXYZ|writeXYZ methods 2016-04-18 16:19:45 -04:00
David Pilato
44080a007f Add cloud.aws.s3.throttle_retries setting
Defaults to `true`.

If anyone is having trouble with this option, you could disable it with `cloud.aws.s3.throttle_retries: false` in `elasticsearch.yml` file.
2016-04-15 14:53:09 +02:00
David Pilato
f2ee759ad5 Upgrade AWS SDK to 1.10.69
* Moving from JSON.org to Jackson for request marshallers.
* The Java SDK now supports retry throttling to limit the rate of retries during periods of reduced availability. This throttling behavior can be enabled via ClientConfiguration or via the system property "-Dcom.amazonaws.sdk.enableThrottledRetry".
* Fixed String case conversion issues when running with non English locales.
* AWS SDK for Java introduces a new dynamic endpoint system that can compute endpoints for services in new regions.
* Introducing a new AWS region, ap-northeast-2.
* Added a new metric, HttpSocketReadTime, that records socket read latency. You can enable this metric by adding enableHttpSocketReadMetric to the system property com.amazonaws.sdk.enableDefaultMetrics. For more information, see [Enabling Metrics with the AWS SDK for Java](https://java.awsblog.com/post/Tx3C0RV4NRRBKTG/Enabling-Metrics-with-the-AWS-SDK-for-Java).
* New Client Execution timeout feature to set a limit spent across retries, backoffs, ummarshalling, etc. This new timeout can be specified at the client level or per request.
  Also included in this release is the ability to specify the existing HTTP Request timeout per request rather than just per client.

* Added support for RequesterPays for all operations.
* Ignore the 'Connection' header when generating S3 responses.
* Allow users to generate an AmazonS3URI from a string without using URL encoding.
* Fixed issue that prevented creating buckets when using a client configured for the s3-external-1 endpoint.
* Amazon S3 bucket lifecycle configuration supports two new features: the removal of expired object delete markers and an action to abort incomplete multipart uploads.
* Allow TransferManagerConfiguration to accept integer values for multipart upload threshold.
* Copy the list of ETags before sorting https://github.com/aws/aws-sdk-java/pull/589.
* Option to disable chunked encoding https://github.com/aws/aws-sdk-java/pull/586.
* Adding retry on InternalErrors in CompleteMultipartUpload operation. https://github.com/aws/aws-sdk-java/issues/538
* Deprecated two APIs : AmazonS3#changeObjectStorageClass and AmazonS3#setObjectRedirectLocation.
* Added support for the aws-exec-read canned ACL. Owner gets FULL_CONTROL. Amazon EC2 gets READ access to GET an Amazon Machine Image (AMI) bundle from Amazon S3.

* Added support for referencing security groups in peered Virtual Private Clouds (VPCs). For more information see the service announcement at https://aws.amazon.com/about-aws/whats-new/2016/03/announcing-support-for-security-group-references-in-a-peered-vpc/ .
* Fixed a bug in AWS SDK for Java - Amazon EC2 module that returns NPE for dry run requests.
* Regenerated client with new implementation of code generator.
* This feature enables support for DNS resolution of public hostnames to private IP addresses when queried over ClassicLink. Additionally, you can now access private hosted zones associated with your VPC from a linked EC2-Classic instance. ClassicLink DNS support makes it easier for EC2-Classic instances to communicate with VPC resources using public DNS hostnames.
* You can now use Network Address Translation (NAT) Gateway, a highly available AWS managed service that makes it easy to connect to the Internet from instances within a private subnet in an AWS Virtual Private Cloud (VPC). Previously, you needed to launch a NAT instance to enable NAT for instances in a private subnet. Amazon VPC NAT Gateway is available in the US East (N. Virginia), US West (Oregon), US West (N. California), EU (Ireland), Asia Pacific (Tokyo), Asia Pacific (Singapore), and Asia Pacific (Sydney) regions. To learn more about Amazon VPC NAT, see [New - Managed NAT (Network Address Translation) Gateway for AWS](https://aws.amazon.com/blogs/aws/new-managed-nat-network-address-translation-gateway-for-aws/)
* A default read timeout is now applied when querying data from EC2 metadata service.
2016-04-15 14:52:48 +02:00
Adrien Grand
d84c643f58 Use the new points API to index numeric fields. #17746
This makes all numeric fields including `date`, `ip` and `token_count` use
points instead of the inverted index as a lookup structure. This is expected
to perform worse for exact queries, but faster for range queries. It also
requires less storage.

Notes about how the change works:
 - Numeric mappers have been split into a legacy version that is essentially
   the current mapper, and a new version that uses points, eg.
   LegacyDateFieldMapper and DateFieldMapper.
 - Since new and old fields have the same names, the decision about which one
   to use is made based on the index creation version.
 - If you try to force using a legacy field on a new index or a field that uses
   points on an old index, you will get an exception.
 - IP addresses now support IPv6 via Lucene's InetAddressPoint and store them
   in SORTED_SET doc values using the same encoding (fixed length of 16 bytes
   and sortable).
 - The internal MappedFieldType that is stored by the new mappers does not have
   any of the points-related properties set. Instead, it keeps setting the index
   options when parsing the `index` property of mappings and does
   `if (fieldType.indexOptions() != IndexOptions.NONE) { // add point field }`
   when parsing documents.

Known issues that won't fix:
 - You can't use numeric fields in significant terms aggregations anymore since
   this requires document frequencies, which points do not record.
 - Term queries on numeric fields will now return constant scores instead of
   giving better scores to the rare values.

Known issues that we could work around (in follow-up PRs, this one is too large
already):
 - Range queries on `ip` addresses only work if both the lower and upper bounds
   are inclusive (exclusive bounds are not exposed in Lucene). We could either
   decide to implement it, or drop range support entirely and tell users to
   query subnets using the CIDR notation instead.
 - Since IP addresses now use a different representation for doc values,
   aggregations will fail when running a terms aggregation on an ip field on a
   list of indices that contains both pre-5.0 and 5.0 indices.
 - The ip range aggregation does not work on the new ip field. We need to either
   implement range aggs for SORTED_SET doc values or drop support for ip ranges
   and tell users to use filters instead. #17700

Closes #16751
Closes #17007
Closes #11513
2016-04-14 17:56:23 +02:00
Yannick Welsch
80cf9fc761 Add EC2 discovery tests to check permissions of AWS Java SDK (#17677) 2016-04-13 10:01:49 +02:00
Adrien Grand
3bf6f4076c Do not set analyzers on numeric fields.
When it comes to query parsing, either a field is tokenized and it would go
through analysis with its search_analyzer. Or it is not tokenized and the
raw string should be passed to termQuery(). Since numeric fields are not
tokenized and also declare a search analyzer, values would currently go through
analysis twice...
2016-04-12 17:47:29 +02:00
Adrien Grand
013acf9179 Remove MappedFieldType.value. #17557
This commit removes `MappedFieldType.value` and simplifies
`MappedFieldType.valueforSearch`. `valueforSearch` was used to post-process
values that come for stored fields (eg. to convert a long back to a string
representation of a date in the case of a date field) and also values that
are extracted from the source but only in the case of GET calls: it would
not be called when performing source filtering on search requests.

`valueforSearch` is now only called for stored fields, since values that are
extracted from the source should already be formatted as expected.
2016-04-12 09:12:56 +02:00
Adrien Grand
496c7fbd84 Upgrade Lucene 6 Release
* upgrades numerics to new Point format
* updates geo api changes
  * adds GeoPointDistanceRangeQuery as XGeoPointDistanceRangeQuery
  * cuts over to ES GeoHashUtils
2016-04-11 16:50:04 -05:00
Ryan Ernst
31ca8fa411 Merge branch 'master' into placeholder 2016-04-11 13:44:59 -07:00
Yannick Welsch
b08d453a0a Fix EC2 Discovery settings (#17651)
Fixes two bugs introduced by the settings refactoring in #16602
2016-04-11 16:17:55 +02:00
Alexander Reelsen
da19ddf3e6 Ingest Attachment: Allow to prevent base64 conversions by using raw bytes (#16601)
CBOR is natively supported in Elasticsearch and allows for byte arrays.
This means, that by using CBOR the user can prevent base64 conversions
for the data being sent back and forth.

This PR adds support to extract data from a byte array in addition to
a string. This also required to add a ByteArrayValueSource class.
2016-04-11 14:14:56 +02:00
Adrien Grand
42526ac28e Remove Settings.settingsBuilder.
We have both `Settings.settingsBuilder` and `Settings.builder` that do exactly
the same thing, so we should keep only one. I kept `Settings.builder` since it
has my preference but also it is the one that we use in examples of the Java API.
2016-04-08 18:10:02 +02:00
David Pilato
c6b1beb083 Add a test for forced values in mapper-attachments plugin
This PR just adds a new test where we check that we forcing a value in the JSON document actually works as expected:

```json
{
     "file": {
        "_content": "BASE64"
        "_name": "12-240.pdf",
        "_language": "en",
        "_content_type": "pdf"
    }
}
```

Note that we don't support forcing all values. So sending:

```json
{
     "file": {
        "_content": "BASE64"
        "_name": "12-240.pdf",
        "_title": "12-240.pdf",
        "_keywords": "Div42 Src580 LGE Mechtech",
        "_language": "en",
        "_content_type": "pdf"
    }
}
```

Will have absolutely no effect on fields `title` and `keywords`.

Note that when `_language` is set, it only works if `index.mapping.attachment.detect_language` is set to `true`.

Related to https://discuss.elastic.co/t/mapper-attachments/46615/4
2016-04-08 10:07:21 +02:00
Chris Earle
d97d5ebb8b Remove hostname from NetworkAddress.format
This removes the inconsistent output of IP addresses. The format was parsing-unfriendly and it makes it hard
to reason about API responses, such as to _nodes.

With this change in place, it will never print the hostname as part of the default format, which has the
added benefit that it can be used consistently for URIs, which was not the case when the hostname might
appear at the front with "hostname/ip:port".
2016-04-07 17:27:59 -04:00
javanna
b9f9b2e3ee Merge branch 'master' into enhancement/discovery_node_one_getter 2016-03-30 17:22:40 +02:00
javanna
f8b5d1f5b0 Remove DiscoveryNodes#masterNodeId in favour of existing DiscoveryNodes#getMasterNodeId 2016-03-30 15:28:06 +02:00
Adrien Grand
068c788ec8 Disable fielddata on text fields by defaults. #17386
`text` fields will have fielddata disabled by default. Fielddata can still be
enabled on an existing index by setting `fielddata=true` in the mappings.
2016-03-30 14:35:32 +02:00
javanna
8fc9dbbb99 Merge branch 'master' into enhancement/remove_node_client_setting 2016-03-29 14:27:04 +02:00