Commit Graph

4526 Commits

Author SHA1 Message Date
Simon Willnauer 82fa531ab4 Remove `_index` fielddata hack if cluster alias is present (#26082)
We introduced a hack in #25885 to respect the cluster alias if available on the `_index` field. This is important if aggregations or other field data related operations are executed. Yet, we added a small hack that duplicated an implementation detail from the `_index` field data builder to make this work. This change adds a necessary but simple API change that allows us to remove the hack and only have a single implementation.
2017-08-08 09:24:24 +02:00
Adrien Grand f0cba4fce5 Add a scripted similarity. (#25831)
The goal of this similarity is to help users who would like to keep the
functionality of the `tf-idf` similarity that we want to remove, or to allow
for specific usec-cases (disabling idf, disabling tf, disabling length norm,
etc.) to not have to build a custom plugin and familiarize with the low-level
Lucene API.
2017-08-08 08:55:12 +02:00
Tal Levy 872526cad3 add URL-Decode Processor to Ingest (#26045)
closes #25837

Adds a URL Decoder Processor to Ingest

this will decode urls like:

https%3a%2f%2felastic.co%2 to https://elastic.co/
2017-08-07 10:26:11 -07:00
Christoph Büscher 18155ed69a Merge branch 'master' into feature/rank-eval 2017-08-07 16:07:34 +02:00
Luca Cavanna 14ba36977e [TEST] prevent yaml tests from using raw requests (#26044)
Raw requests are supported only by the java yaml test runner and were introduced to test docs snippets. Some yaml tests ended up using them (see #23497) which causes failures for other language clients. This commit migrates those yaml tests to Java tests that send requests through the Java low-level REST client, and also moves the ability to send raw requests to a special client that's only available when testing docs snippets.

Closes #25694
2017-08-07 11:02:16 +02:00
Martijn van Groningen 11ce6b91a4
test: Do not use random index writer as test expects a single segment
check against right version
2017-08-07 09:40:54 +02:00
Colin Goodheart-Smithe bb3d5b7426
[TEST] Fix internalMatrixStatsTests failure 2017-08-02 16:36:34 +01:00
Colin Goodheart-Smithe 87c6e63e73 Adds mutate function to various tests (#25999)
* Adds mutate function to various tests

Relates to #25929

* fix test

* implements mutate function for all single bucket aggs

* review comments

* convert getMutateFunction to mutateIInstance
2017-08-02 11:38:31 +01:00
Martijn van Groningen 53dd8afaea
fix test 2017-08-02 11:25:03 +02:00
Martijn van Groningen a3d1248014
percolator: use correct version. 2017-08-02 10:37:59 +02:00
Adrien Grand 88d456989e Make FieldMapper.copyTo() always non-null. (#25994)
Otherwise it is confusing that both a null copyTo and an empty copyTo should
be treated the same.
2017-08-02 10:07:29 +02:00
Tim Brooks 0f4f49496f Use nio transport in test clusters (#25986)
This commit adds the nio transport as an option in place of the mock tcp
transport for tests. Each test will only use one transport type. The
transport type is decided by a random boolean generated inside of the
`ESTestCase` class.
2017-08-01 16:19:31 -05:00
Ryan Ernst 072281d5aa Update version to 7.0.0-alpha1 (#25876)
This commit updates the version for master to 7.0.0-alpha1. It also adds
the 6.1 version constant, and fixes many tests, as well as marking some
as awaits fix.

Closes #25893
Closes #25870
2017-08-01 15:47:48 -04:00
Adrien Grand 53c829b6bc Painless: allow doubles to be casted to longs. (#25936)
Running `(long) someDoubleValue` currently throws a `ClassCastException` while
eg. `(int) someDoubleValue` is accepted.
2017-08-01 16:22:55 +02:00
Jason Tedor 764f7ef2ef Fix Netty 4 multi-port test
This commit fixes an issue with the Netty 4 multi-port test that a
transport client can connect. The problem here is that in case the
bottom of the random port range was already bound to (for example, by
another JVM) then then transport client could not connect to the data
node. This is because the transport client was in fact using the bottom
of the port range only. Instead, we simply try all the ports that the
data node might be bound to.

Closes #24441
2017-08-01 19:47:20 +09:00
Martijn van Groningen 5f36bdfda0
percolator: Also support IndexOrDocValuesQuery
Otherwise ranges are never extracted properly.
2017-08-01 09:44:42 +02:00
Martijn van Groningen ff3b909a83
Moved HtmlStripCharFilterFactory to analyis.common package like the other factories. 2017-07-31 15:34:54 +02:00
Martijn van Groningen 0b776a1de0
Move more token filters to analysis-common module
The following token filters were moved: delimited_payload_filter, keep, keep_types, classic, apostrophe, decimal_digit, fingerprint, min_hash and scandinavian_folding.

Relates to #23658
2017-07-31 15:15:04 +02:00
Martijn van Groningen 7c3735bdc4
percolator: Store the QueryBuilder's Writable representation instead of its XContent representation.
The Writeble representation is less heavy to parse and that will benefit percolate performance and throughput.

The query builder's binary format has now the same bwc guarentees as the xcontent format.

Added a qa test that verifies that percolator queries written in older versions are still readable by the current version.
2017-07-28 12:24:10 +02:00
Yannick Welsch 1a01514081 Move tribe to a module (#25778)
This commit moves tribe to a module, stripping core from the tribe functionality.
2017-07-28 11:23:50 +02:00
Jim Ferenczi 562c3744ca Merge FunctionScoreQuery and FiltersFunctionScoreQuery (#25889)
This change merges the functionality of the FiltersFunctionScoreQuery in the FunctionScoreQuery.
It also ensures that an exception is thrown when the computed score is equals to Float.NaN or Float.NEGATIVE_INFINITY.
These scores are invalid for TopDocsCollectors that relies on score comparison.

Fixes #15709
Fixes #23628
2017-07-28 09:22:20 +02:00
Martijn van Groningen edad7b4737
Add support for selecting percolator query candidate matches containing range queries.
Extracts ranges from range queries on byte, short, integer, long, half_float, scaled_float, float, double, date and ip fields.
byte, short, integer and date ranges are normalized to Lucene's LongRange.
half_float and float are normalized to Lucene's DoubleRange.

When extracting range queries, the QueryAnalyzer computes the width of the range.  This width is used to determine
what range should be preferred in a conjunction query. The QueryAnalyzer prefers the smaller ranges, because these
ranges tend to match with less documents.

Closes #21040
2017-07-26 21:25:45 +02:00
Simon Willnauer b72c71083c Cleanup IndexFieldData visibility (#25900)
Today we expose `IndexFieldDataService` outside of IndexService to do maintenance
or lookup field data in different ways. Yet, we have a streamlined way to access IndexFieldData
via `QueryShardContext` that should encapsulate all access to it. This also ensures that we control all other functionality like cache clearing etc.

This change also removes the `recycler` option from `ClearIndicesCacheRequest` this option is a no-op and should have been removed long ago.
2017-07-26 20:03:42 +02:00
Simon Willnauer 634ce90dc0 Respect cluster alias in `_index` aggs and queries (#25885)
Today when we aggregate on the `_index` field the cross cluster search
alias is not taken into account. Neither is it respected when we search
on the field. This change adds support for cluster alias when the cluster
alias is present on the `_index` field.

Closes #25606
2017-07-26 09:16:52 +02:00
Michael Basnight e816ef89a2 Shade external dependencies in the rest client jar
This commit removes all external dependencies from the rest client jar
and shades them in an 'org.elasticsearch.client' package within the jar
using shadowJar gradle plugin. All projects that depended on the
existing jar have been converted to using the 'org.elasticsearch.client'
package prefixes to interact with the rest client.

Closes #25208
2017-07-24 12:55:43 -05:00
Jim Ferenczi ab3b5c695a Pre-configured shingle filter should disable graph analysis (#25853)
This change disables the graph analysis on default `shingle` filter.
The pre-configured shingle filter produces shingles of different size.
Graph analysis on such token stream is useless and dangerous as it may create too many paths.

Fixes #25555
2017-07-24 18:42:15 +02:00
Simon Willnauer 0e3ad522a2 Rewrite search requests on the coordinating nodes (#25814)
This change rewrites search requests on the coordinating node before
we send requests to the individual shards. This will reduce the rewrite load
and object creation for each rewrite on the executing nodes and will fetch
resources only once instead of N times once per shard for queries like `terms`
query with index lookups. (among percolator and geo-shape)

Relates to #25791
2017-07-21 09:38:38 +02:00
Jack Conradson 9f7463e796 remove lang url parameter from stored script requests (#25779)
Also has updates to ScriptMetaData for allowing the old namespace format to be loaded all the way back through 5.0; however, it will throw an exception if two scripts share the same id but different languages.
2017-07-20 08:51:08 -07:00
Simon Willnauer 5e629cfba0 Ensure query resources are fetched asynchronously during rewrite (#25791)
The `QueryRewriteContext` used to provide a client object that can
be used to fetch geo-shapes, terms or documents for percolation. Unfortunately
all client calls used to be blocking calls which can have significant impact on the
rewrite phase since it occupies an entire search thread until the resource is
received. In the case that the index the resource is fetched from isn't on the local
node this can have significant impact on query throughput.

Note: this doesn't fix MLT since it fetches stuff in doQuery which is a different beast. Yet, it is a huge step in the right direction
2017-07-20 15:37:50 +02:00
Jay Modi 3e4bc027eb RestClient uses system properties and system default SSLContext (#25757)
This commit calls the `useSystemProperties` method on the HttpAsyncClientBuilder so that the jvm
system properties are used. The primary reason for doing this is to ensure the builder uses the
system default SSLContext rather than the default instance created by the http client library.

Closes #23231
2017-07-20 07:36:56 -06:00
Simon Willnauer 4d78935df7 Introduce a new Rewriteable interface to streamline rewriting (#25788)
Today we have duplicated code that is quite complicated to iterate
over rewriteable (`QueryBuilders` mainly) This change introduces a
`Rewriteable` interface that allow to share code to do the rewriting as
well as encapsulation and composition of queries.
2017-07-19 15:06:49 +02:00
Adrien Grand f1ff7f2454 Require a field when a `seed` is provided to the `random_score` function. (#25594)
We currently use fielddata on the `_id` field which is trappy, especially as we
do it implicitly. This changes the `random_score` function to use doc ids when
no seed is provided and to suggest a field when a seed is provided.

For now the change only emits a deprecation warning when no field is supplied
but this should be replaced by a strict check on 7.0.

Closes #25240
2017-07-19 14:11:15 +02:00
Martijn van Groningen 8003171a0c
Move more token filters to analysis-common module
The following token filters were moved: arabic_normalization, german_normalization, hindi_normalization, indic_normalization, persian_normalization, scandinavian_normalization, serbian_normalization, sorani_normalization, cjk_width and cjk_width

Relates to #23658
2017-07-17 08:29:44 +02:00
Ryan Ernst 072402463b Scripting: Remove search template actions (#25717)
The dedicated search template put/get/delete actions are deprecated in
5.6. This commit removes them from 6.0.
2017-07-14 23:12:05 -07:00
Christoph Büscher 887ed68cf2 Fixing compilation issues and tests after merging in master 2017-07-14 19:23:35 +02:00
Christoph Büscher 6d999f074a Merge branch 'master' into feature/rank-eval 2017-07-14 18:36:08 +02:00
Jim Ferenczi 13da3eb53e Refactor QueryStringQuery for 6.0 (#25646)
This change refactors the query_string query to analyze the query text around logical operators of the query string the same way than a match_query/multi_match_query.
It also adds a type parameter that can be used to change the way multi fields query are built the same way than a multi_match query does.

Now that these queries share the same behavior regarding text analysis, some parameters are obsolete and have been deprecated:

split_on_whitespace: This setting is now ignored with a deprecation notice
if it is used explicitely. With this PR The query_string always splits on logical operator.
It simplifies the understanding of the other parameters that can have different meanings
depending on the value of split_on_whitespace.

auto_generate_phrase_queries: This setting is now ignored with a deprecation notice
if it is used explicitely. This setting only makes sense when the parser splits on whitespace.

use_dismax: This setting is now ignored with a deprecation notice
if it is used explicitely. The tie_breaker parameter is sufficient to handle best_fields/most_fields.

Fixes #25574
2017-07-13 15:32:17 +02:00
Luca Cavanna ec66d655b5 Rename client artifacts (#25693)
It was brought up that our current client artifacts have generic names like 'rest' that may cause conflicts with other artifacts.

This commit renames:

- rest -> elasticsearch-rest-client
- sniffer -> elasticsearch-rest-client-sniffer
- rest-high-level -> elasticsearch-rest-high-level-client

A couple of small changes are also preparing the high level client for its first release.

Closes #20248
2017-07-13 09:44:25 +02:00
Ryan Ernst 70b2897bdf Scripting: Deprecate stored search template apis (#25437)
This commit deprecates the PUT, GET and DELETE search template apis.
Instead, the stored script api should be used.

closes #24596
2017-07-12 16:07:28 -07:00
Simon Willnauer e81804cfa4 Add a shard filter search phase to pre-filter shards based on query rewriting (#25658)
Today if we search across a large amount of shards we hit every shard. Yet, it's quite
common to search across an index pattern for time based indices but filtering will exclude
all results outside a certain time range ie. `now-3d`. While the search can potentially hit
hundreds of shards the majority of the shards might yield 0 results since there is not document
that is within this date range. Kibana for instance does this regularly but used `_field_stats`
to optimize the indexes they need to query. Now with the deprecation of `_field_stats` and it's upcoming removal a single dashboard in kibana can potentially turn into searches hitting hundreds or thousands of shards and that can easily cause search rejections even though the most of the requests are very likely super cheap and only need a query rewriting to early terminate with 0 results.

This change adds a pre-filter phase for searches that can, if the number of shards are higher than a the `pre_filter_shard_size` threshold (defaults to 128 shards), fan out to the shards
and check if the query can potentially match any documents at all. While false positives are possible, a negative response means that no matches are possible. These requests are not subject to rejection and can greatly reduce the number of shards a request needs to hit. The approach here is preferable to the kibana approach with field stats since it correctly handles aliases and uses the correct threadpools to execute these requests. Further it's completely transparent to the user and improves scalability of elasticsearch in general on large clusters.
2017-07-12 22:19:20 +02:00
Jack Conradson d2b4f7ac5a Disallow lang to be used with Stored Scripts (#25610)
Requests that execute a stored script will no longer be allowed to specify the lang of the script. This information is stored in the cluster state making only an id necessary to execute against. Putting a stored script will still require a lang.
2017-07-12 07:55:57 -07:00
Tim Brooks a3ade99fcf Fix BytesReferenceStreamInput#skip with offset (#25634)
There is a bug when a call to `BytesReferenceStreamInput` skip is made
on a `BytesReference` that has an initial offset. The offset for the
current slice is added to the current index and then subtracted from the
length. This introduces the possibility of a negative number of bytes to
skip. This happens inside a loop, which leads to an infinte loop.

This commit correctly subtracts the current slice index from the
slice.length. Additionally, the `BytesArrayTests` are modified to test
instances that include an offset.
2017-07-11 09:54:29 -05:00
Adrien Grand 481d5d09b2 Upgrade to lucene-7.0.0-snapshot-00142c9. (#25641)
Lucene 7.0 is feature-frozen now, so there should not be many changes until GA.
2017-07-11 13:58:55 +02:00
Tim Brooks b22bbf94da Avoid blocking on channel close on network thread (#25521)
Currently when we close a channel in Netty4Utils.closeChannels we
block until the closing is complete. This introduces the possibility
that a network selector thread will block while waiting until a
separate network selector thread closes a channel.

For instance: T1 closes channel 1 (which is assigned to a T1 selector).
Channel 1's close listener executes the closing of the node. That
means that T1 now tries to close channel 2. However, channel 2 is
assigned to a selector that is running on T2. T1 now must wait until T2
closes that channel at some point in the future.

This commit addresses this by adding a boolean to closeChannels
indicating if we should block on close. We only set this boolean to true
if we are closing down the server channels at shutdown. This call is
never made from a network thread. When we call the closeChannels method
with that boolean set to false, we do not block on close.
2017-07-10 10:50:51 -05:00
Jason Tedor c084542731 Bump version to 6.0.0-beta1
This commit does two things:
 - bumps the version from 6.0.0-alpha3 to 6.0.0-beta1
 - renames the 6.0.0-alpha3 version constant to 6.0.0-beta1

Relates #25621
2017-07-09 18:12:50 -04:00
Adrien Grand 40bb1663ee Index ids in binary form. (#25352)
Indexing ids in binary form should help with indexing speed since we would
have to compare fewer bytes upon sorting, should help with memory usage of
the live version map since keys will be shorter, and might help with disk
usage depending on how efficient the terms dictionary is at compressing
terms.

Since we can only expect base64 ids in the auto-generated case, this PR tries
to use an encoding that makes the binary id equal to the base64-decoded id in
the majority of cases (253 out of 256). It also specializes numeric ids, since
this seems to be common when content that is stored in Elasticsearch comes
from another database that uses eg. auto-increment ids.

Another option could be to require base64 ids all the time. It would make things
simpler but I'm not sure users would welcome this requirement.

This PR should bring some benefits, but I expect it to be mostly useful when
coupled with something like #24615.

Closes #18154
2017-07-07 14:22:47 +02:00
Martijn van Groningen 6db708ef75
Move more token filters to analysis-common module
The following token filters were moved: common grams, limit token, pattern capture and pattern raplace.

Relates to #23658
2017-07-07 10:02:52 +02:00
Simon Willnauer 1f67d079b1 Validate `transport.profiles.*` settings (#25508)
Transport profiles unfortunately have never been validated. Yet, it's very
easy to make a mistake when configuring profiles which will most likely stay
undetected since we don't validate the settings but allow almost everything
based on the wildcard in `transport.profiles.*`. This change removes the
settings subset based parsing of profiles but rather uses concrete affix settings
for the profiles which makes it easier to fall back to higher level settings since
the fallback settings are present when the profile setting is parsed. Previously, it was
unclear in the code which setting is used ie. if the profiles settings (with removed
prefixes) or the global node setting. There is no distinction anymore since we don't pull
prefix based settings.
2017-07-07 09:40:59 +02:00
Jason Tedor c96257ca73 Upgrade to Netty 4.1.13.Final
This commit upgrades the Netty dependency from version 4.1.11.Final to
4.1.13.Final.

Relates #25581
2017-07-06 15:37:00 -04:00
Martijn van Groningen d0f9f425bd
parent/child: Removed ParentJoinFieldSubFetchPhase 2017-07-06 13:15:02 +02:00
Martijn van Groningen 407273f81d
parent/child: Support parent id being specified as number in the _source 2017-07-06 11:48:57 +02:00
Jun Ohtani 6894ef6057 [Analysis] Support normalizer in request param (#24767)
* [Analysis] Support normalizer in request param

Support normalizer param
Support custom normalizer with char_filter/filter param

Closes #23347
2017-07-04 19:16:56 +09:00
Colin Goodheart-Smithe 43efcffcc2 Adds check for negative search request size (#25397)
* Adds check for negative search request size

This change adds a check to `SearchSourceBuilder` to throw and exception if the size set on it is set to a negative value.

Closes #22530

* fix error in reindex

* update re-index tests

* Addresses review comment

* Fixed tests

* Added random negative size test

* Fixes test
2017-07-04 10:51:38 +01:00
Christoph Büscher f576c987ce Remove QueryParseContext (#25486)
QueryParseContext is currently only used as a wrapper for an XContentParser, so
this change removes it entirely and changes the appropriate APIs that use it so
far to only accept a parser instead.
2017-07-03 17:30:40 +02:00
Simon Willnauer 5a7c8bb04e Cleanup network / transport related settings (#25489)
This commit makes the use of the global network settings explicit instead
of implicit within NetworkService. It cleans up several places where we fall
back to the global settings while we should have used tcp or http ones.

In addition this change also removes unnecessary settings classes
2017-07-02 10:16:50 +02:00
Simon Willnauer 6f131a63d3 Remove unregistered `transport.netty.*` settings (#25476)
These settings have not be working for a full major version since they
are not registered. Given that they are simply duplicates we can just remove
them.
2017-06-29 20:56:18 +02:00
Christoph Büscher 927111c91d Remove QueryParseContext from parsing QueryBuilders (#25448)
Currently QueryParseContext is only a thin wrapper around an XContentParser that
adds little functionality of its own. I provides helpers for long deprecated
field names which can be removed and two helper methods that can be made static
and moved to other classes. This is a first step in helping to remove
QueryParseContext entirely.
2017-06-29 17:10:20 +02:00
Christoph Büscher 2708bcc6ed Merge branch 'master' into feature/rank-eval 2017-06-29 15:07:45 +02:00
olcbean 3518e313b8 Unify the result interfaces from get and search in Java client (#25361)
As GetField and SearchHitField have the same members, they have been unified into
DocumentField.

Closes #16440
2017-06-29 11:35:28 +02:00
Martijn van Groningen c85ac402b0
test: Make many percolator integration tests real integration tests 2017-06-27 17:44:30 +02:00
Simon Willnauer d338a09812 Remove `mapping.single_type` from parent join test (#25391)
This removes the remaining usage of `mapping.single_type` from the parent join
module and moves it's bwc test to the mixed cluster tests

Relates to #24961
Relates to #20257
2017-06-26 17:33:07 +02:00
Martijn van Groningen a34f5fa812
Move more token filters to analysis-common module
The following token filters were moved: stemmer, stemmer_override, kstem, dictionary_decompounder, hyphenation_decompounder, reverse, elision and truncate.

Relates to #23658
2017-06-26 09:02:16 +02:00
Nik Everett da0b991331 Remove `index.mapping.single_type=false` from reindex tests (#25365)
* Remove the setting from the yml tests and replace with tests using
`join` field. We can't use the setting in yml tests without lots of
backflips but we have `ReindexParentChildTests` for the coverage.
There weren't tests for `join` field with reindex before this. Adding
these tests discovered #25363.
* Remove the setting from `ReindexParentChildTests` and replace with
`index.version.created=V_5_6_0`. This test can be entirely removed
when legacy parent/child support is dropped from core.
* Port the yml tests that set _parent into integ tests so they
can set the index created version. These tests can be removed
when we drop support for _parent in core.
* Port a delete-by-query test for filtering based on type to an
`ESIntegTestCase` so it can use `index.version.created=5.6.0` to
setup documents of multiple types. This whole feature can be dropped
when we no longer support multiple types per index.

Relates to #24961
2017-06-23 17:14:59 -04:00
Simon Willnauer 4ae426a552 Remove remaining `index.mapping.single_type=false` (#25369)
This change cleans up remaining tests  to not use index.mapping.single_type=false
but instead where applicable use a single type or markt the index as created
with a pre 6.x version.

Yet, there is still on leftover in the client tests that needs special attention.
See `org.elasticsearch.client.SearchIT`

Relates to #24961
2017-06-23 10:26:06 +02:00
Tal Levy 1ac7818201 fix sort and string processor tests around targetField (#25358)
Tests were randomly assigning `targetField` to an existing field that was an array,
causing path resolution issues. This PR fixes those tests

Closes #25346 & #25348
2017-06-22 13:14:18 -07:00
Jack Conradson 96b62409a8 Update Painless to Allow Augmentation from Any Class (#25360)
Custom whitelists in Painless will need to allow classes to be augmented beyond the currently hard-coded Augmentation class tied to Painless directly. This change allows any class to specify an augmentation on a Painless struct using an appropriate static method. Changes to loading the whitelist have also been created to allow for this specification of a different class for augmentation.
2017-06-22 12:16:46 -07:00
Martijn van Groningen 343e7571b9
test: single type defaults to true since alpha1 and not alpha3
Closes #25354
2017-06-22 16:31:15 +02:00
Adrien Grand 44e9c0b947 Upgrade to lucene-7.0.0-snapshot-ad2cb77. (#25349)
Most notable changes:
 - better update concurrency: LUCENE-7868
 - TopDocs.totalHits is now a long: LUCENE-7872
 - QueryBuilder does not remove the boolean query around multi-term synonyms:
   LUCENE-7878
 - removal of Fields: LUCENE-7500

For the `TopDocs.totalHits` change, this PR relies on the fact that the encoding
of vInts and vLongs are compatible: you can write and read with any of them as
long as the value can be represented by a positive int.
2017-06-22 12:35:33 +02:00
Martijn van Groningen a977569085
percolator: Deprecate `document_type` parameter.
The `document_type` parameter is no longer required to be specified,
because by default from 6.0 only a single type is allowed. (`index.mapping.single_type` defaults to `true`)
2017-06-22 09:55:06 +02:00
Nik Everett 8d9a08e239 Fix reindex test when log level is debug
When log level is debug we'd dereference null because the test
was being cute and cutting corners.

Relates to #25256
2017-06-20 16:06:58 -04:00
Jun Ohtani 62d1969595 Parse synonyms with the same analysis chain (#8049)
* [Analysis] Parse synonyms with the same analysis chain

Synonym Token Filter / Synonym Graph Filter tokenize synonyms with whatever tokenizer and token filters appear before it in the chain.

Close #7199
2017-06-20 21:50:33 +09:00
Nik Everett 3261586cac Tweak reindex cancel logic and add many debug logs (#25256)
I'm still trying to hunt down rare failures in the cancelation tests
for reindex and friends. Here is the latest:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=ubuntu/876/console

It doesn't show much, other than that one of the tasks didn't kill
itself when asked to cancel.

So I'm going a bit crazy with debug logging so that the next time this
comes up I can trace exactly what happened.

Additionally, this tweaks the logic around how rethrottles were
performed around cancel. Previously we set the `requestsPerSecond`
to `0` when we cancelled the task. That was the "old way" to set them
to inifity which was the intent. This switches that from `0` to
`Float.MAX_VALUE` which is the "new way" to set the `requestsPerSecond`
to infinity. I don't know that this is much better, but it feels better.
2017-06-19 18:46:42 -04:00
Andy Bristol 4c5bd57619 Rename simple pattern tokenizers (#25300)
Changed names to be snake case for consistency

Related to #25159, original issue #23363
2017-06-19 13:48:43 -07:00
Simon Willnauer a8d5a58801 Replace deprecated API usage in Netty4HttpChannel 2017-06-17 14:04:23 +02:00
Christoph Büscher e99ced06cc [Tests] Check that parsing aggregations works in a forward compatible way (#25219)
This change adds tests for the aggregation parsing that try to simulate that we
can parse existing aggregations in a forward compatible way in the future,
ignoring potential newly added fields or substructures to the xContent response.
2017-06-17 13:06:31 +02:00
Simon Willnauer f18b0d293c Move TransportStats accounting into TcpTransport (#25251)
Today TcpTransport is the de-facto base-class for transport implementations.
The need for all the callbacks we have in TransportServiceAdaptor are not necessary
anymore since we can simply have the logic inside the base class itself. This change
moves the stats metrics directly into TcpTransport removing the need for low level
bytes send / received callbacks.
2017-06-16 22:34:11 +02:00
Nik Everett ecc87f613f Move pre-configured "keyword" tokenizer to the analysis-common module (#24863)
Moves the keyword tokenizer to the analysis-common module. The keyword tokenizer is special because it is used by CustomNormalizerProvider so I pulled it out into its own PR. To get the move to work I've reworked the lookup from static to one using the AnalysisRegistry. This seems safe enough.

Part of #23658.
2017-06-16 11:48:15 -04:00
Jack Conradson 50db8cb351 Add needs methods for specific variables to Painless script context factories. (#25267) 2017-06-15 17:00:33 -07:00
Martijn van Groningen 428e70758a
Moved more token filters to analysis-common module.
The following token filters were moved: `edge_ngram`, `ngram`, `uppercase`, `lowercase`, `length`, `flatten_graph` and `unique`.

Relates to #23658
2017-06-15 18:28:31 +02:00
Jim Ferenczi 5e64cd08bc [Test] restore BWC for parent-join now that the new mapping format is in 5.x 2017-06-15 15:15:48 +02:00
Jim Ferenczi 9ca33e2450 Add a section named "relations" in the ParentJoinFieldMapper (#25248)
* Add a section named "relation" in the ParentJoinFieldMapper

This commit puts the parent/child definition in an inner section named "relation".
Mapping for the parent-join will look like this:

```
"join_field": {
  "type": "join"
  "relations":
    "parent": "child"
  }
}
```
2017-06-15 14:56:20 +02:00
Tal Levy 2cd771a230 fix: Sort Processor does not have proper behavior with targetField (#25237)
to specify a `targetField`. This results in some interesting behavior that was missed in the review.
This processor sorts in-place, so there is a side-effect in both the original field and the target field.
Another bug was that the targetField was not being set if the list being sorted was fewer than two elements.

The new behavior works like this: If targetField and fieldName are not the same, we copy the list.
2017-06-15 05:28:54 -07:00
Boaz Leskes 648b4717a4 move assertBusy to use CheckException (#25246)
We use assertBusy in many places where the underlying code throw exceptions. Currently we need to wrap those exceptions in a RuntimeException which is ugly.
2017-06-15 13:24:07 +02:00
Tanguy Leroux 27f1206999 Use SPI in High Level Rest Client to load XContent parsers (#25098)
This commit adds a NamedXContentProvider interface that can 
be implemented by plugins or modules using Java's SPI feature 
in order to provide additional NamedXContent parsers to external
applications like the Java High Level Rest Client.
2017-06-15 12:50:02 +02:00
Adrien Grand 0c117145f6 Upgrade to lucene-7.0.0-snapshot-92b1783. (#25222)
This snapshot has faster range queries on range fields (LUCENE-7828), more
accurate norms (LUCENE-7730) and the ability to use fake term frequencies
(LUCENE-7854).
2017-06-15 09:52:07 +02:00
Ryan Ernst caf7792db1 Scripting: Rename SearchScript.needsScores to needs_score (#25235)
This commit renames the needsScores method so as to make it
automatically generatable, based on the name of the `_score` variable
which is available in search scripts. It also adds documentation to
ScriptContext to explain the naming and signature of such methods.
2017-06-14 22:01:19 -07:00
Jack Conradson a4471f51e4 Support script context stateful factory in Painless. (#25233) 2017-06-14 16:44:41 -07:00
Christoph Büscher 4de4c795b7 Fix issues after merging in master 2017-06-14 12:16:58 +02:00
Christoph Büscher ac3db8c30f Merge branch 'master' into feature/rank-eval 2017-06-14 11:57:05 +02:00
Andy Bristol 48696ab544 expose simple pattern tokenizers (#25159)
Expose the experimental simplepattern and 
simplepatternsplit tokenizers in the common 
analysis plugin. They provide tokenization based 
on regular expressions, using Lucene's 
deterministic regex implementation that is usually 
faster than Java's and has protections against 
creating too-deep stacks during matching.

Both have a not-very-useful default pattern of the 
empty string because all tokenizer factories must 
be able to be instantiated at index creation time. 
They should always be configured by the user 
in practice.
2017-06-13 12:46:59 -07:00
Alexander Kazakov a7dafdaa05 Add target_field parameter to gsub, join, lowercase, sort, split, trim, uppercase (#24133)
Closes #23682 #23228
2017-06-13 09:40:44 -07:00
Simon Willnauer 186c16ea41 Ensure pending transport handlers are invoked for all channel failures (#25150)
Today if a channel gets closed due to a disconnect we notify the response
handler that the connection is closed and the node is disconnected. Unfortunately
this is not a complete solution since it only works for published connections.
Connections that are unpublished ie. for discovery can indefinitely hang since we
never invoke their handers when we get a failure while a user is waiting for
the response. This change adds connection tracking to TcpTransport that ensures
we are notifying the corresponding connection if there is a failure on a channel.
2017-06-13 09:37:05 +02:00
Jason Tedor dcf57f296e Fix get mappings HEAD requests
Get mappings HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for get
mappings HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.

Relates #23192
2017-06-11 14:58:56 -04:00
Jason Tedor 7182577904 Fix handling of exceptions thrown on HEAD requests
Today when an exception is thrown handling a HEAD request, the body is
swallowed before the channel has a chance to see it. Yet, the channel is
where we compute the content length that would be returned as a header
in the response. This is a violation of the HTTP specification. This
commit addresses the issue. To address this issue, we remove the special
handling in bytes rest response for HEAD requests when an exception is
thrown. Instead, we let the upstream channel handle the special case, as
we already do today for the non-exceptional case.

Relates #25172
2017-06-10 23:44:18 -04:00
Ryan Ernst a03b6c2fa5 Scripting: Change keys for inline/stored scripts to source/id (#25127)
This commit adds back "id" as the key within a script to specify a
stored script (which with file scripts now gone is no longer ambiguous).
It also adds "source" as a replacement for "code". This is in an attempt
to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.
2017-06-09 08:29:25 -07:00
Jim Ferenczi 8250aa4267 Remove the postings highlighter and make unified the default highlighter choice (#25028)
This change removes the `postings` highlighter. This highlighter has been removed from Lucene master (7.x) because it behaves
exactly like the `unified` highlighter when index_options is set to `offsets`:
https://issues.apache.org/jira/browse/LUCENE-7815

It also makes the `unified` highlighter the default choice for highlighting a field (if `type` is not provided).
The strategy used internally by this highlighter remain the same as before, it checks `term_vectors` first, then `postings` and ultimately it re-analyzes the text.
Ultimately it rewrites the docs so that the options that the `unified` highlighter cannot handle are clearly marked as such.
There are few features that the `unified` highlighter is not able to handle which is why the other highlighters (`plain` and `fvh`) are still available.
I'll open separate issues for these features and we'll deprecate the `fvh` and `plain` highlighters when full support for these features have been added to the `unified`.
2017-06-09 14:09:57 +02:00
Tal Levy a771912a22 Add Ingest-Processor specific Rest Endpoints & Add Grok endpoint (#25059)
This PR enables Ingest plugins to leverage processor-scoped REST
endpoints. First of which being the Grok endpoint that retrieves
Grok Patterns for users to retrieve all the built-in patterns.
Example usage: Kibana Grok Autocomplete!
2017-06-08 15:24:35 -07:00
Tal Levy 340909582f remove Ingest's Internal Template Service (#25085)
Ingest was using it's own wrapper around TemplateScripts and the ScriptService.
This commit removes that abstraction
2017-06-08 15:24:03 -07:00
Guillaume Le Floch 3f6d80aa66 Allow removing multiple fields in ingest processor (#24750)
* Allow removing multiple fields in ingest processor

* Iteration 2

* Few fixes
2017-06-08 13:17:44 -07:00
Jim Ferenczi 21a57c1494 Always use DisjunctionMaxQuery to build cross fields disjunction (#25115)
This commit modifies query_string, simple_query_string and multi_match queries to always use a DisjunctionMaxQuery when a disjunction over multiple fields is built. The tiebreaker is set to 1 in order to behave like the boolean query in terms of scoring.
The removal of the coord factor in Lucene 7 made this change mandatory to correctly handle minimum_should_match.

Closes #23966
2017-06-08 11:18:17 +02:00
Adrien Grand a8ea2f0df4 Leverage scorerSupplier when applicable. (#25109)
The `scorerSupplier` API allows to give a hint to queries in order to let them
know that they will be consumed in a random-access fashion. We should use this
for aggregations, function_score and matched queries.
2017-06-08 10:19:38 +02:00
Jack Conradson d187fa78fd Generate Painless Factory for Creating Script Instances (#25120) 2017-06-07 16:06:11 -07:00
Jim Ferenczi 3924fd79ef Add BWC rest test for parent-join after the backport to 5.x 2017-06-07 19:29:01 +02:00
Martijn van Groningen db8aa8e94e
Changed inner_hits to work with the new join field type and
at the same time maintaining support for the `_parent` meta field type/

Relates to #20257
2017-06-07 10:52:49 +02:00
Jason Tedor e03c4938c5 GET aliases should 404 if aliases are missing
Previously the HEAD and GET aliases endpoints were misaigned in
behavior. The HEAD verb would 404 if any aliases are missing while the
GET verb would not if any aliases existed. When HEAD was aligned with
GET, this broke the previous usage of HEAD to serve as an existence
check for aliases. It is the behavior of GET that is problematic here
though, if any alias is missing the request should 404. This commit
addresses this by modifying the behavior of GET to behave in this
way. This fixes the behavior for HEAD to also 404 when aliases are
missing.

Relates #25043
2017-06-06 14:37:29 -04:00
Jim Ferenczi 7e60cf3e54 Move parent_id query to the parent-join module (#25072)
This change moves the parent_id query to the parent-join module and handles the case when only the parent-join field can be declared on an index (index with single type on).
If single type is off it uses the legacy parent join field mapper and switch to the new one otherwise (default in 6).

Relates #20257
2017-06-06 19:35:14 +02:00
Tal Levy d6d0c13bd6 fix grok's pattern parsing to validate pattern names in expression (#25063)
Unknown patterns used to silently be ignored. This was a problem because users did not know they were providing an invalid pattern name, and maybe thought the rest of their regexes were invalid.

Fixes #22831.
2017-06-06 08:07:53 -07:00
Tal Levy e51246023a add `exclude_keys` option to KeyValueProcessor (#24876)
and modify data-structure of `include_keys` and `exclude_keys` to be
backed by a HashSet
2017-06-05 14:12:48 -07:00
Alex Benusovich 5463294ec4 Fixed NPEs caused by requests without content. (#23497)
REST handlers that require a body will throw an an ElasticsearchParseException "request body required".
REST handlers that require a body OR source param will throw an ElasticsearchParseException "request body or source param required".
Replaced asserts in BulkRequest parsing code with a more descriptive IllegalArgumentException if the line contains an empty object.
Updated bulk REST test to verify an empty action line is rejected properly.
Updated BulkRequestTests with randomized testing for an empty action line.
Used try-with-resouces for XContentParser in AbstractBulkByQueryRestHandler.
2017-06-05 09:08:14 -06:00
Nik Everett 73307a2144 Plugins can register pre-configured char filters (#25000)
Fixes the plumbing so plugins can register char filters and moves
the `html_strip` char filter into analysis-common.

Relates to #23658
2017-06-05 09:25:15 -04:00
Jack Conradson 8999104b14 Change ScriptContexts to use needs instead of uses$. (#25036) 2017-06-02 14:50:51 -07:00
Martijn van Groningen 2a71a7bffc
Change `has_child`, `has_parent` queries and `childen` aggregation to work with the new join field type and
at the same time maintaining support for the `_parent` meta field type.

Relates to #20257
2017-06-02 23:27:16 +02:00
Ryan Ernst 0d8216d5af Scripting: Convert CompiledTemplate to a ScriptContext (#25032)
This commit creates TemplateScript and associated classes so that
templates no longer need a special ScriptService.compileTemplate method.
The execute() method is equivalent to the old run() method.

relates #20426
2017-06-02 13:41:26 -07:00
Jack Conradson a926ace2e1 Update Painless to Use New Script Contexts (#25015)
*  All public methods starting with get will be added as local variables
to the execute method.
* The execute method on a ScriptContext must be both public and
abstract.  This method will be implemented by the Painless compiler.
* A static list of parameter names for the execute method must be
provided since the names will be eliminated at runtime.
* The uses$ methods will still be implemented as before.
* A single constructor may be provided by the ScriptContext.  This
constructor will be overridden by the Painless compiler to include the
exact same arguments.  This allows instances of a Painless script to
potentially contain state.  If a constructor is not provided it is
assumed the default constructor with no arguments will be used.
2017-06-02 13:36:45 -07:00
Jim Ferenczi 4077600035 Disallow the new parent join field on indices with multiple types
Relates https://github.com/elastic/elasticsearch/pull/24978
2017-06-02 18:28:03 +02:00
Chris Earle 6464add551 Always Accumulate Transport Exceptions (#25017)
This removes the `accumulateExceptions()` method (and its usage) from `TransportNodesAction` and `TransportTasksAction`, forcing both transport actions to always accumulate exceptions.

Without this change, some transport actions, like `TransportNodesStatsAction` would respond in very unexpected ways by returning no response due to some failure, but instead of returning an
error the response would simply be empty: no response and no error.

This results in a very trappy response structure where users can check for an error, then attempt to blindly use the response when no error is returned.
2017-06-02 10:01:42 -04:00
Jim Ferenczi b8605775df Add the ability to set eager_global_ordinals in the new parent-join field (#25019)
Defaults to true
2017-06-02 15:34:22 +02:00
Colin Goodheart-Smithe 779fb9a1c0 Adds nodes usage API to monitor usages of actions (#24169)
* Adds nodes usage API to monitor usages of actions

The nodes usage API has 2 main endpoints

/_nodes/usage and /_nodes/{nodeIds}/usage return the usage statistics
for all nodes and the specified node(s) respectively.

At the moment only one type of usage statistics is available, the REST
actions usage. This records the number of times each REST action class is
called and when the nodes usage api is called will return a map of rest
action class name to long representing the number of times each of the action
classes has been called.

Still to do:

* [x] Create usage service to store usage statistics
* [x] Record usage in REST layer
* [x] Add Transport Actions
* [x] Add REST Actions
* [x] Tests
* [x] Documentation

* Rafactors UsageService so counts are done by the handlers

* Fixing up docs tests

* Adds a name to all rest actions

* Addresses review comments
2017-06-02 08:46:38 +01:00
Jim Ferenczi f4aee1e583 Disallow multiple parent-join fields per mapping (#25002)
This change ensures that there is a single parent-join field defined per mapping.
The verification is done through the addition of a special field mapper (MetaJoinFieldMapper) with a unique name (_parent_join) that is registered to the mapping service
when the first parent-join field is defined. If a new parent-join is added, this field mapper will clash with the new one and the update will fail.
This change also simplifies the parent join fetch sub phase by retrieving the parent-join field without iterating on all fields in the mapping.
2017-06-02 09:21:15 +02:00
Ryan Ernst 8d88b94372 Scripting: Add optional context parameter to put stored script requests (#25014)
This commit adds an optional `context` url parameter to the put stored
script request. When a context is specified, the script is compiled
against that context before storing, as a validation the script will
work when used in that context.
2017-06-01 17:53:48 -07:00
Jim Ferenczi b5d62ae747 Introduce ParentJoinFieldMapper, a field mapper that creates parent/child relation within documents of the same index (#24978)
* Introduce ParentJoinFieldMapper, a field mapper that creates parent/child relation within documents of the same index

This change adds a new field mapper named ParentJoinFieldMapper. This mapper is a replacement for the ParentFieldMapper but instead of using the types in the mapping
it uses an internal field to materialize parent/child relation within a single index.
This change also adds a fetch sub phase that automatically retrieves the join name (parent or child name) and the parent id for child documents in the response hit fields.
The compatibility with `has_parent`, `has_child` queries and `children` agg will be added in a follow up.

Relates #20257
2017-05-31 18:07:21 +02:00
Martijn van Groningen 258be2b135
Moved `keyword_marker`, `trim`, `snowball` and `porter_stemmer` tokenfilter factories from core to common-analysis module.
Relates to #23658
2017-05-31 09:34:08 +02:00
Martijn van Groningen 6945d7b046
test: Stop using the `mapping.single_type` setting in percolator tests.
Closes #24958
2017-05-31 09:11:33 +02:00
Ryan Ernst 7c1211d2ed Scripting: Add StatefulFactoryType as optional intermediate factory in script contexts (#24974)
ScriptContexts currently understand a FactoryType that can produce
instances of the script InstanceType. However, for search scripts, this
does not work as we have the concept of LeafSearchScript that is created
per lucene segment. This commit effectively renames the existing
SearchScript class into SearchScript.LeafFactory, which is a new,
optional, class that can be defined within a ScriptContext.
LeafSearchScript is effectively renamed back into SearchScript. This
change allows the model of stateless factory -> stateful factory ->
script instance to continue, but in a generic way that any script
context may take advantage of.

relates #20426
2017-05-30 16:32:14 -07:00
Jack Conradson 04daac2243 Make Painless Compiler Use an Instance Per Context (#24972)
Allows for easier management of compilation of individual interfaces on a per script context basis.
2017-05-30 14:30:48 -07:00
Tal Levy 2a6e6866bd Fix floating-point error when DateProcessor parses UNIX (#24947)
DateProcessor's DateFormat UNIX format parser resulted in
a floating point rounding error when parsing certain stringed
epoch times. Now Double.parseDouble is used, preserving the
intented input.
2017-05-30 09:42:26 -07:00
Jack Conradson 5bcae914d9 Make PainlessScript an interface (#24966)
Allows more flexibility for the specified script context interface if we want to allow script contexts to specify an abstract class instead.
2017-05-30 09:03:46 -07:00
Christoph Büscher 5a4124d4fb Fixing template rendering after changes in master 2017-05-30 15:30:24 +02:00
Christoph Büscher 3d6fb4eb0b Merge branch 'master' into feature/rank-eval 2017-05-30 14:24:26 +02:00
Tanguy Leroux eea010b408 Add doc_count to ParsedMatrixStats (#24952)
This commit adds support in ParsedMatrixStats for parsing the doc_count
field.

Related to #24776
2017-05-30 10:16:08 +02:00
Tanguy Leroux 28d97df67c Add document count to Matrix Stats aggregation response (#24776)
This commit adds a `doc_count` field to the response body of Matrix
Stats aggregation. It exposes the number of documents involved in
 the computation of statistics, a value that can already be retrieved using
  the method MatrixStats.getDocCount() in the Java API.
2017-05-30 09:39:41 +02:00
Nik Everett 5da8ce8318 Remove the need for _UNRELEASED suffix in versions (#24798)
Removes the need for the `_UNRELEASED` suffix on versions by detecting if a version should be unreleased or not based on the versions around it. This should make it simpler to automate the task of adding a new version label.
2017-05-26 18:36:32 -04:00
Ryan Ernst 74e031e842 Scripting: Rename CompiledType to FactoryType in ScriptContext (#24897)
This commit renames the concept of the "compiled type" to a "factory
type", along with all implementations of this class to be named Factory.
This brings it inline with the classes purpose.
2017-05-26 00:02:54 -07:00
Ryan Ernst 8eab1fefa1 Scripting: Make contexts available to ScriptEngine construction (#24896)
This commit adds collection of all contexts to the parameters of
getScriptEngine. This will allow script engines like painless to
precache extra information about the contexts.
2017-05-25 16:55:47 -07:00
Ryan Ernst 8aaea51a0a Scripting: Move context definitions to instance type classes (#24883)
This is a simple refactoring to move the context definitions into the
type that they use. While we have multiple context names for the same
class at the moment, this will eventually become one ScriptContext per
instance type, so the pattern of a static member on the interface called
CONTEXT can be used. This commit also moves the consolidated list of
contexts provided by core ES into ScriptModule.
2017-05-25 12:18:45 -07:00
Ryan Ernst 7d03cff820 Scripting: Make ScriptEngine.compile generic on the script context (#24873)
This commit changes the compile method of ScriptEngine to be generic in
the same way it is on ScriptService. This moves the shim of handling the
two existing context classes into each script engine, so that each
engine can be worked on independently to convert to real handling of
contexts.
2017-05-24 20:06:32 -07:00
Ryan Ernst 1daacd97b0 Scripting: Add instance and compiled classes to script contexts (#24868)
This commit modifies the compile method of ScriptService to be context
aware. The ScriptContext is now a generic class which contains both the
instance type and compiled type for a script. Instance type may be
stateful (for example, pre loading field information for the index a
script will execute on, like in expressions), while the compiled type is
stateless and used to construct instance type instances. This change is
only a first step to cutover ScriptService to the new paradigm. It only
converts callers to the script service, and has a small shim to wrap
compilation from the script engines to support the current two fixed
instance types, SearchScript and ExecutableScript.
2017-05-24 14:29:02 -07:00
Ryan Ernst 0ddd219423 Scripting: Add default implementation of close() for ScriptEngine (#24851)
Since groovy was removed, we no longer have any ScriptEngines with
resources to release. We may want to keep the option open for a script
engine to close resources, but this would not be common. This commit
adds a default implementation to ScriptEngine for `close()` to reduce
the boiler plate that must be added for a ScriptEngine implementation.
2017-05-24 13:19:27 -07:00
Jim Ferenczi 4707377cea Move InnerHitBuilder queries BWC version to 5.5 after the backport
Relates #24676
2017-05-23 22:41:39 +02:00
Martijn van Groningen 34093735e3
Added unit tests for MatrixStatsAggregator 2017-05-23 16:19:12 +02:00
Luca Cavanna 747fa721e4 Build: add client jar for aggs-matrix-stats (#24827)
This will be useful for the high level client to add support for the matrix stats aggregation, as we will ship with this jar by default like we do for parent-join-client which is aligned with distributing core with the modules already included.

Relates to #24796
2017-05-23 13:33:54 +02:00
Jim Ferenczi 9087803cd9 Add the ability to define custom inner hit sub context builder (#24676)
This commit moves the handling of nested and parent/child inner hits to specialized classes that can be defined outside of ES core.
InnerHitBuilderContext is now used by the parent query (nested or hasChild, ...) to build the sub context from the InnerHitBuilder definition.
BWC is also ensured so that nodes in previous versions can still send/receive inner hits to/from this version.

Relates #20257
2017-05-23 13:06:22 +02:00
Jack Conradson 8887bcc4c6 Fix settings names for script.allowed_types and script.allowed_contexts. (#24831)
Fixes #24830
2017-05-22 15:08:45 -07:00
Ryan Ernst 52d504bb5f Scripting: Simplify ScriptContext (#24818)
As we work towards contexts implying the return type of compilation, we
first need ScriptContext to not be an enum. This commit removes the
Standard enum and Plugin subclass of ScriptContext.
2017-05-22 13:11:15 -07:00
javanna 7a3e38eb8e Merge branch 'master' into feature/client_aggs_parsing 2017-05-22 12:25:14 +02:00
Martijn van Groningen 08eda43899
percolator: Use QueryBuilder.rewriteQuery(...) to rewrite query builder instead of QueryBuilder.rewrite(...)
Relates to #24617
2017-05-22 12:20:26 +02:00
Ryan Ernst 2de748859f Scripting: Remove "inline script enabled" on script engines (#24815)
ScriptEngine implementations have an overridable method to indicate they
are safe to use as inline scripts. Since groovy was removed fro 6.0,
there are no longer any implementations which used the default false
value. Furthermore, the value was not actually read anywhere. This
commit removes the method. The ScriptEngineRegistry was also no longer
necessary as it only was used to build a map from language to engine.
2017-05-20 12:01:25 -07:00
javanna db0490343e Merge branch 'master' into feature/client_aggs_parsing 2017-05-19 18:17:06 +02:00
Nik Everett b9ea579633 Allow plugins to register pre-configured tokenizers (#24751)
Allows plugins to register pre-configured tokenizers. Much
of the decisions are the same as those in #24223, #24572,
and #24223. This only migrates the lowercase tokenizer but
I figure that is a good start because it proves out the features.
2017-05-19 12:07:04 -04:00
Nicholas Knize deb7caf4d3 Upgrade to lucene-7.0.0-snapshot-a0aef2f
This commit upgrades master to a current lucene snapshot with commit id a0aef2f.
2017-05-19 10:20:55 -05:00
Jim Ferenczi d241c4898e Removes parent child fielddata specialization (#24737)
This change removes the field data specialization needed for the parent field and replaces it with
a simple DocValuesIndexFieldData. The underlying global ordinals are retrieved via a new function called
IndexOrdinalsFieldData#getOrdinalMap.
The children aggregation is also modified to use a simple WithOrdinals value source rather than the deleted WithOrdinals.Parent.

Relates #20257
2017-05-19 17:11:23 +02:00
Tanguy Leroux 83aa00b3f6 Merge remote-tracking branch 'origin/master' into feature/client_aggs_parsing 2017-05-19 13:13:00 +02:00
Tanguy Leroux dd731d9e98 Add parsing method for Matrix Stats (#24746)
Related to #23331
2017-05-19 12:22:54 +02:00
Jack Conradson 1196dfb6bb Remove Deprecated Script Settings (#24756)
Removes all fine-grained script settings replaced by scripts.types_allowed and scripts.contexts_allowed.
2017-05-18 13:32:46 -07:00
Christoph Büscher 10d308578e Fix compilation issues after merge with master 2017-05-18 17:52:58 +02:00
Christoph Büscher cd0941810f Merge branch 'master' into feature/rank-eval 2017-05-18 16:47:47 +02:00
Tanguy Leroux eeef2e6c31 Merge remote-tracking branch 'origin/master' into feature/client_aggs_parsing 2017-05-18 09:43:57 +02:00
Ryan Ernst 463fe2f4d4 Scripting: Remove file scripts (#24627)
This commit removes file scripts, which were deprecated in 5.5.

closes #21798
2017-05-17 14:42:25 -07:00
Ryan Ernst f8a48badcf Settings: Remove shared setting property (#24728)
Shared settings were added intially to allow the few common settings
names across aws plugins. However, in 6.0 these settings have been
removed. The last use was in netty, but since 6.0 also has the netty 3
modules removed, there is no longer a need for the shared property. This
commit removes the shared setting property.
2017-05-17 13:14:12 -07:00
javanna ce7326eb88 Merge branch 'master' into feature/client_aggs_parsing 2017-05-17 17:59:00 +02:00
Simon Willnauer 2ccc223ff7 Fix Version based BWC and set correct minCompatVersion (#24732)
Approaching the release of 6.0 we need to sort out the usage of
`Version#minimumCompatibilityVersion` which was still set to 5.0.0.
Now this change moves it to the latest released version of 5.x (5.4 at this point)
to ensure we are compatible with the latest minor of the previous major. This change
also removes all the `_UNRELEASED` from the versions that where released and drops versions
that were never released and are not expected to be released (bugfixes in minors that are not
the latest in the previous major).
2017-05-17 17:27:09 +02:00
Nik Everett 0189a65e6b Fail rest tests on yaml files (#24740)
We've switched to supporting only `yml` files but anyone who didn't
notice will commit a `yaml` file which won't be executed
which is bad because it is easy not to notice. The test to catch this is
simple enough that I think it is worth adding just to warn folks about
their mistake.
2017-05-17 10:24:57 -04:00
Nik Everett 5fc6f17121 Fix new analysis-common yml tests
These tests are broken because I added them with the `yml` extension
and didn't realize that we weren't running tests with that extension
until we merged #24659. I used that extension in anticipation of #24659
but didn't verify that the tests were actually running. Ooops!

Closes #24734
2017-05-17 07:55:04 -04:00
Ryan Ernst 2a65bed243 Tests: Change rest test extension from .yaml to .yml (#24659)
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
2017-05-16 17:24:35 -07:00
Nik Everett 7ef390068a Move remaining pre-configured token filters into analysis-common (#24716)
Moves the remaining preconfigured token figured into the analysis-common module. There were a couple of tests in core that depended on the pre-configured token filters so I had to touch them:

* `GetTermVectorsCheckDocFreqIT` depended on `type_as_payload` but didn't do anything important with it. I dropped the dependency. Then I moved the test to a single node test case because we're trying to cut down on the number of `ESIntegTestCase` subclasses.
* `AbstractTermVectorsTestCase` and its subclasses depended on `type_as_payload`. I dropped their usage of the token filter and added an integration test for the termvectors API that uses `type_as_payload` to the `analysis-common` module.
* `AnalysisModuleTests` expected a few pre-configured token filtes be registered by default. They aren't any more so I dropped this assertion. We assert that the `CommonAnalysisPlugin` registers these pre-built token filters in `CommonAnalysisFactoryTests`
* `SearchQueryIT` and `SuggestSearchIT` had tests that depended on the specific behavior of the token filters so I moved the tests to integration tests in `analysis-common`.
2017-05-16 13:10:24 -04:00
Simon Willnauer 1cae850cf5 Add a cluster block that allows to delete indices that are read-only (#24678)
Today when an index is `read-only` the index is also blocked from
being deleted which sometimes is undesired since in-order to make
changes to a cluster indices must be deleted to free up space. This is
a likely scenario in a hosted environment when disk-space is limited to switch
indices read-only but allow deletions to free up space.
2017-05-16 17:34:37 +02:00
Martijn van Groningen f6e19dcedc
percolator: Fix range queries with date range based on current time.
Range queries with now based date ranges were previously not allowed,
but since #23921 these queries were allowed. This change should really
fix range queries with now based date ranges.
2017-05-16 13:13:11 +02:00
Christoph Büscher 059b23e92e Merge branch 'master' into feature/client_aggs_parsing 2017-05-16 11:54:02 +02:00
Ryan Ernst 6ce597a378 Scripts: Convert template script engines to return String instead of BytesReference (#24447)
Template script engines (mustache, the only one) currently return a
BytesReference that users must know is utf8 encoded. This commit
modifies all callers and mustache to have the template engine return
String. This is much simpler, and does not require decoding in order to
use (for example, in ingest).
2017-05-15 22:37:31 -07:00
Igor Motov 243635222a Move ReindexAction class to core (#24684)
This class is also needed for plugins to use reindex functionality.

Relates to #24578
2017-05-15 14:28:59 -04:00
Christoph Büscher 0b688a8733 Small improvement in InternalAggregationTestCase test setup after changes in master (#24675) 2017-05-15 15:06:01 +02:00
Christoph Büscher 42e8d4b761 Merge branch 'master' into feature/client_aggs_parsing
Conflicts:
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/filter/InternalFilterTests.java
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/global/InternalGlobalTests.java
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/missing/InternalMissingTests.java
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/nested/InternalNestedTests.java
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/nested/InternalReverseNestedTests.java
	core/src/test/java/org/elasticsearch/search/aggregations/bucket/sampler/InternalSamplerTests.java
	modules/parent-join/src/test/java/org/elasticsearch/join/aggregations/InternalChildrenTests.java
	test/framework/src/main/java/org/elasticsearch/search/aggregations/InternalSingleBucketAggregationTestCase.java
2017-05-15 12:25:07 +02:00
Koen De Groote f185b69e04 Replace manual copying an array or collection with static methods calls (#24657) 2017-05-15 09:35:48 +02:00
Jason Tedor 5da940532d Remove Netty logging hack (#24653)
Netty removed a logging guarded we added to prevent a scary logging
message. We added a hack to work around this. They've added the guard
back, so we can remove the hack now.
2017-05-12 16:05:13 -04:00
Jason Tedor 458129a85a Upgrade to Netty 4.1.11.Final
This commit upgrades the Netty dependency from 4.1.10.Final to
4.1.11.Final.

Relates #24652
2017-05-12 15:53:51 -04:00
Koen De Groote 878ae8eb3c Size lists in advance when known
When constructing an array list, if we know the size of the list in
advance (because we are adding objects to it derived from another list),
we should size the array list to the appropriate capacity in advance (to
avoid resizing allocations). This commit does this in various places.

Relates #24439
2017-05-12 10:36:13 -04:00
Jim Ferenczi 279a18a527 Add parent-join module (#24638)
* Add parent-join module

This change adds a new module named `parent-join`.
The goal of this module is to provide a replacement for the `_parent` field but as a first step this change only moves the `has_child`, `has_parent` queries and the `children` aggregation to this module.
These queries and aggregations are no longer in core but they are deployed by default as a module.

Relates #20257
2017-05-12 15:58:06 +02:00
Simon Willnauer be2a6ce80b Notify onConnectionClosed rather than onNodeDisconnect to prune transport handlers (#24639)
Today we prune transport handlers in TransportService when a node is disconnected.
This can cause connections to starve in the TransportService if the connection is
opened as a short living connection ie. without sharing the connection to a node
via registering in the transport itself. This change now moves to pruning based
on the connections cache key to ensure we notify handlers as soon as the connection
is closed for all connections not just for registered connections.

Relates to #24632
Relates to #24575
Relates to #24557
2017-05-12 15:40:40 +02:00
Nik Everett a40c3a99c9 Reindex: don't duplicate _source parameter (#24629)
If the request asks for the `_source` stored field then don't
duplicate it when forcing the `_source` parameter to onto the
request for reindex-from-remote from versions before 1.0.

Closes #24628
2017-05-11 16:30:06 -04:00
Simon Willnauer 1155615536 Move DeleteByQuery and Reindex requests into core (#24578)
This allows other plugins to use a client to call the functionality
that is in the core modules without duplicating the logic.
Plugins can now safely send the request and response classes via the
client even if the requests are executed locally. All relevant classes
are loaded by the core classloader such that plugins can share them.

This is re-adds this commit that was revered in 952feb58e4
2017-05-11 20:22:30 +02:00
Simon Willnauer 952feb58e4 Revert "Move DeleteByQuery and Reindex requests into core (#24578)"
This reverts commit 6ea2ae32b8.
2017-05-11 18:26:40 +02:00
Uwe Schindler f7c50f5f71 Painless: Optimize instance creation in LambdaBootstrap (#24618)
Optimize instance creation in LambdaBootstrap to allow Hotspot's escape analysis, preventing us from creating many instances stressing GC
2017-05-11 09:10:27 -07:00
Simon Willnauer 6ea2ae32b8 Move DeleteByQuery and Reindex requests into core (#24578)
This allows other plugins to use a client to call the functionality
that is in the core modules without duplicating the logic.
Plugins can now safely send the request and response classes via the
client even if the requests are executed locally. All relevant classes
are loaded by the core classloader such that plugins can share them.
2017-05-11 16:20:40 +02:00
Nik Everett 8188569fd1 Add qa module that tests reindex-from-remote against pre-5.0 versions of Elasticsearch (#24561)
Adds tests for reindex-from-remote for the latest 2.4, 1.7, and
0.90 releases. 2.4 and 1.7 are fairly popular versions but 0.90
is a point of pride.

This fixes any issues those tests revealed.

Closes #23828
Closes #24520
2017-05-11 10:06:20 -04:00
Martijn van Groningen 840da4aebf
Removed deprecated template query.
Relates to #19390
2017-05-11 14:56:45 +02:00
Nik Everett e49ebd04fd Get more information when reindex test fails
It looks like it leaks contexts but it isn't clear why so this
adds a little more logging. This is the failure:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.4+multijob-intake/94/console
2017-05-11 08:36:54 -04:00
Nik Everett 65f2717ab7 Make PreConfiguredTokenFilter harder to misuse (#24572)
There are now three public static method to build instances of
PreConfiguredTokenFilter and the ctor is private. I chose static
methods instead of constructors because those allow us to change
out the implementation returned if we so desire.

Relates to #23658
2017-05-10 22:39:43 -04:00
Martijn van Groningen 0ff5933a55
Rewrite multi search template api to delegate to multi search api instead of to search template api.
The max concurrent searches logic is complex and we shouldn't duplicate that in multi search template api,
so we should template each individual template search request and then delegate to multi search api.
2017-05-10 11:47:53 +02:00
Martijn van Groningen 760e5fce77
Rewrite multi search template api to delegate to multi search api instead of to search template api.
The max concurrent searches logic is complex and we shouldn't duplicate that in multi search template api,
so we should template each individual template search request and then delegate to multi search api.
2017-05-10 11:23:24 +02:00
Isabel Drost-Fromm bd559d96d4
This adds max_concurrent_searches to multi-search-template endpoint.
Closes #20912
2017-05-10 11:23:24 +02:00
Martijn van Groningen 51c74ce547
Added unit tests for InternalMatrixStats.
Also moved InternalAggregationTestCase to test-framework module in order to make use of it from other modules than core.

Relates to #22278
2017-05-10 11:06:18 +02:00
Ryan Ernst 9ca7d28552 Scripting: Remove "service" from ScriptEngine interface name (#24574)
This commit renames ScriptEngineService to ScriptEngine.  It is often
confusing because we have the ScriptService, and then
ScriptEngineService implementations, but the latter are not services as
we see in other places in elasticsearch.
2017-05-10 00:47:33 -07:00
Nik Everett bb06d8ec4f Allow plugins to build pre-configured token filters (#24223)
This changes the way we register pre-configured token filters so that
plugins can declare them and starts to move all of the pre-configured
token filters out of core. It doesn't finish the job because doing
so would make the change unreviewably large. So this PR includes
a shim that keeps the "old" way of registering pre-configured token
filters around.

The Lowercase token filter is special because there is a "special"
interaction between it and the lowercase tokenizer. I'm not sure
exactly what to do about it so for now I'm leaving it alone with
the intent of figuring out what to do with it in a followup.

This also renames these pre-configured token filters from
"pre-built" to "pre-configured" because that seemed like a more
descriptive name.

This is a part of #23658
2017-05-09 14:50:49 -04:00
javanna e875f7f72e remove duplicated import in AppendProcessor 2017-05-08 10:36:36 +02:00
Koen De Groote 13c17c75b5 Remove unneeded empty string concatentation
This commit removes concatenation by empty string in places where it
is simply not needed to obtain a string representation.

Relates #24411
2017-05-06 00:28:53 -04:00
Nik Everett 9bc7e210a0 Test: Move flag to painless tests (#24494)
The `-XX:-OmitStackTraceInFastThrow` flag is only required by Painless's
tests so we'll only set it there. This is much simpler.
2017-05-04 13:11:09 -04:00
Jason Tedor 06364cf6f0 You had one job Netty logging guard
In pre-release versions of Elasticsearch 5.0.0, users were subject to
log messages of the form "your platform does not.*reliably.*potential
system instability". This is because we disable Netty from being unsafe,
and Netty throws up this scary info-level message when unsafe is
unavailable, even if it was unavailable because the user requested that
it be unavailabe. Users were rightly confused, and concerned. So, we
contributed a guard to Netty to prevent this log message from showing up
when unsafe was explicitly disabled. This guard shipped with all
versions of Netty that shipped starting with Elasticsearch
5.0.0. Unfortunately, this guard was lost in an unrelated refactoring
and now with the 4.1.10.Final upgrade, users will again see this
message. This commit is a hack around this until we can get a fix
upstream again.

Relates #24469
2017-05-03 18:49:08 -04:00
Adrien Grand 7311aaa2eb Fix PercolatorQuerySearchIT to not create multiple types. 2017-05-03 16:44:14 +02:00
Luca Cavanna 92bfd16c58 Java api: ActionRequestBuilder#execute to return a PlainActionFuture (#24415)
This change makes the request builder code-path same as `Client#execute`. The request builder used to return a `ListenableActionFuture` when calling execute, which allows to associate listeners with the returned future. For async execution though it is recommended to use the `execute` method that accepts an `ActionListener`, like users would do when using `Client#execute`.

Relates to #24412
Relates to #9201
2017-05-03 11:20:53 +02:00
Nik Everett 3b47355e56 Try not to lose stacktraces (#24426)
This adds `-XX:-OmitStackTraceInFastThrow` to the JVM arguments
which *should* prevent the JVM from omitting stack traces on
common exception sites. Even though these sites are common, we'd
still like the exceptions to debug them.

This also adds the flag when running tests and adapts some tests
that had workarounds for the absense of the flag.

Closes #24376
2017-05-02 11:34:12 -04:00
Uwe Schindler 62fa7081b0 Painless: Add tests to check for existence and correct detection of the special Java 9 optimizations: Indified String concat and MethodHandles#ArrayLengthHelper() (#24405) 2017-05-02 08:08:51 -07:00
Jason Tedor 40ff169c54 Set available processors for Netty
Netty uses the number of processors for sizing various resources (e.g.,
thread pools, buffer pools, etc.). However, it uses the runtime number
of available processors which might not match the configured number of
processors as set in Elasticsearch to limit the number of threads (for
example, in Docker containers). A new feature was added to Netty that
enables configuring the number of processors Netty should see for sizing
this various resources. This commit takes advantage of this feature to
set this number of available processors to be equal to the configured
number of processors set in Elasticsearch.

Relates #24420
2017-05-01 19:27:28 -04:00
Uwe Schindler e88d54bf0a Painless: Fix method references to ctor with the new LambdaBootstrap and cleanup code (#24406)
* Fix wrong delegation to constructors when compiling lambdas with method references to ctors. Also remove the get$lambda factory.
* Cleanup code and remove unneeded transformations between binary and internal class names (uses ASM Type class instead)
* Cleanup Exception handling
* Simplification by moving the type adaption to the outside
* Remove STATIC access flag from our Lambda class (not required and also officially not allowed)
* Move the lambda counter to the classloader, so we have a per-script lambda ID
* Change Codesource of generated lambdas to be consistent
2017-05-01 16:15:13 -07:00
Jason Tedor eefcad94b8 Upgrade Netty to 4.1.10.Final
This commit upgrades the Netty dependency from version 4.1.9.Final to
version 4.1.10.Final.

Relates #24414
2017-05-01 10:25:32 -04:00
Adrien Grand 1be2800120 Only allow one type on 7.0 indices (#24317)
This adds the `index.mapping.single_type` setting, which enforces that indices
have at most one type when it is true. The default value is true for 6.0+ indices
and false for old indices.

Relates #15613
2017-04-27 08:43:20 +02:00
Nik Everett bc45d10e82 Remove most usages of 1-arg Script ctor (#24325)
The one argument ctor for `Script` creates a script with the
default language but most usages of are for testing and either
don't care about the language or are for use with
`MockScriptEngine`. This replaces most usages of the one argument
ctor on `Script` with calls to `ESTestCase#mockScript` to make
it clear that the tests don't need the default scripting language.

I've also factored out some copy and pasted script generation
code into a single place. I would have had to change that code
to use `mockScript` anyway, so it was easier to perform the
refactor.

Relates to #16314
2017-04-26 16:04:38 -04:00
Nik Everett 7c3efb829b Move char filters into analysis-common (#24261)
Another step down the road to dropping the
lucene-analyzers-common dependency from core.

Note that this removes some tests that no longer compile from
core. I played around with adding them to the analysis-common
module where they would compile but we already test these in
the tests generated from the example usage in the documentation.

I'm not super happy with the way that `requriesAnalysisSettings`
works with regards to plugins. I think it'd be fairly bug-prone
for plugin authors to use. But I'm making it visible as is for
now and I'll rethink later.

A part of #23658
2017-04-26 13:25:34 -04:00
Martijn van Groningen c17de49a6d
[percolator] Fix memory leak when percolator uses bitset or field data cache.
The percolator doesn't close the IndexReader of the memory index any more.
Prior to 2.x the percolator had its own SearchContext (PercolatorContext) that did this,
but that was removed when the percolator was refactored as part of the 5.0 release.

I think an alternative way to fix this is to let percolator not use the bitset and fielddata caches,
that way we prevent the memory leak.

Closes #24108
2017-04-26 11:08:15 +02:00
Jason Tedor 1b660c5127 Fix incorrect logger invocation
It looks like auto-complete gave us a nasty surprise here with
Logger#equals being invoked instead of Logger#error swallowing the
absolute worst-possible level of a log message. This commit fixes the
invocation.
2017-04-25 16:25:52 -04:00
Ryan Ernst 6ebf08759b Templates: Add compileTemplate method to ScriptService for template consumers (#24280)
This commit adds a compileTemplate method to the ScriptService.
Eventually this will be used to easily cutover all consumers to a new
TemplateService.

relates #16314
2017-04-24 15:45:20 -07:00
Nik Everett 5fbc86e2aa Allow painless to load stored fields (#24290)
We document that painless can load stored fields but it can't
because the classes that make that work aren't whitelisted.
2017-04-24 14:22:39 -04:00
Jack Conradson 30cc33e2e5 Fix Painless Lambdas for Java 9 (#24070)
Replaces LambdaMetaFactory with LambdaBootstrap, a custom solution for lambdas in Painless using a design similar to LambdaMetaFactory, but allows for custom adaptation of types which recent changes to LambdaMetaFactory no longer allowed.
2017-04-24 09:58:02 -07:00
Christoph Büscher d1703decee Adapting to changes in master 2017-04-22 22:06:06 +02:00
Christoph Büscher 5254731039 Merge branch 'master' into feature/rank-eval 2017-04-22 21:47:32 +02:00
Ryan Ernst 473e98981b Scripts: Remove unnecessary executable shortcut (#24264)
ScriptService has two executable methods, one which takes a
CompiledScript, which is similar to search, and one that takes a raw
Script and both compiles and returns an ExecutableScript for it. The
latter is not needed, and the call sites which used one or the other
were mixed. This commit removes the extra executable method in favor of
callers first calling compile, then executable.
2017-04-21 17:53:03 -07:00
Ryan Ernst aadc33d260 Scripts: Remove unwrap method from executable scripts (#24263)
The unwrap method was leftover from support javascript and python. Since
those languages are removed in 6.0, this commit removes the unwrap
feature from scripts.
2017-04-21 17:50:22 -07:00
Nik Everett caf376c8af Start building analysis-common module (#23614)
Start moving built in analysis components into the new analysis-common
module. The goal of this project is:
1. Remove core's dependency on lucene-analyzers-common.jar which should
shrink the dependencies for transport client and high level rest client.
2. Prove that analysis plugins can do all the "built in" things by moving all
"built in" behavior to a plugin.
3. Force tests not to depend on any oddball analyzer behavior. If tests
need anything more than the standard analyzer they can use the mock
analyzer provided by Lucene's test infrastructure.
2017-04-19 18:51:34 -04:00
Nik Everett db0a5e4263 Painless: more testing for script_stack (#24168)
`script_stack` is super useful when debugging Painless scripts
because it skips all the "weird" stuff involved that obfuscates
where the actual error is. It skips Painless's internals and
call site bootstrapping.

It works fine, but it didn't have many tests. This converts a
test that we had for line numbers into a test for the
`script_stack`. The line numbers test was an indirect test
for `script_stack`.
2017-04-18 22:52:59 -04:00
Ryan Ernst 212f24aa27 Tests: Clean up rest test file handling (#21392)
This change simplifies how the rest test runner finds test files and
removes all leniency.  Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.

closes #20240
2017-04-18 15:07:08 -07:00
Nik Everett 0b15fde27a Start on custom whitelists for Painless (#23563)
We'd like to be able to support context-sensitive whitelists in
Painless but we can't now because the whitelist is a static thing.
This begins to de-static the whitelist, in particular removing
the static keyword from most of the methods on `Definition` and
plumbing the static instance into the appropriate spots as though
it weren't static. Once we de-static all the methods we should be
able to fairly simply build context-sensitive whitelists.

The only "fun" bit of this is that I added another layer in the
chain of methods that bootstraps `def` calls. Instead of running
`invokedynamic` directly on `DefBootstrap` we now `invokedynamic`
`$bootstrapDef` on the script itself loads the `Definition` that
the script was compiled against and then calls `DefBootstrap`.

I chose to put `Definition` into `Locals` so I didn't have to
change the signature of all the `analyze` methods. I could have
do it another way, but that seems ok for now.
2017-04-18 10:39:42 -04:00
Adrien Grand 4632661bc7 Upgrade to a Lucene 7 snapshot (#24089)
We want to upgrade to Lucene 7 ahead of time in order to be able to check whether it causes any trouble to Elasticsearch before Lucene 7.0 gets released. From a user perspective, the main benefit of this upgrade is the enhanced support for sparse fields, whose resource consumption is now function of the number of docs that have a value rather than the total number of docs in the index.

Some notes about the change:
 - it includes the deprecation of the `disable_coord` parameter of the `bool` and `common_terms` queries: Lucene has removed support for coord factors
 - it includes the deprecation of the `index.similarity.base` expert setting, since it was only useful to configure coords and query norms, which have both been removed
 - two tests have been marked with `@AwaitsFix` because of #23966, which we intend to address after the merge
2017-04-18 15:17:21 +02:00
Nik Everett 25119a7e78 Harden painless test against "fun" caching (#24077)
The JVM caches `Integer` objects. This is known. A test in Painless
was relying on the JVM not caching the particular integer `1000`.
It turns out that when you provide `-XX:+AggressiveOpts` the JVM
*does* cache `1000`, causing the test to fail when that is
specified.

This replaces `1000` with a randomly selected integer that we test
to make sure *isn't* cached by the JVM. *Hopefully* this test is
good enough. It relies on the caching not changing in between when
we check that the value isn't cached and when we run the painless
code. The cache now is a simple array but there is nothing
preventing it from changing. If it does change in a way that thwarts
this test then the test fail fail again. At least when that happens
the next person can see the comment about how it is important
that the integer isn't cached and can follow that line of inquiry.

Closes #24041
2017-04-17 13:44:05 -04:00
Jason Tedor 972bdc09ee Reject empty IDs
When indexing a document via the bulk API where IDs can be explicitly
specified, we currently accept an empty ID. This is problematic because
such a document can not be obtained via the get API. Instead, we should
rejected these requets as accepting them could be a dangerous form of
leniency. Additionally, we already have a way of specifying
auto-generated IDs and that is to not explicitly specify an ID so we do
not need a second way. This commit rejects the individual requests where
ID is specified but empty.

Relates #24118
2017-04-15 10:36:03 -04:00
Jay Modi 30ab8739a6 Closing a ReleasableBytesStreamOutput closes the underlying BigArray (#23941)
This commit makes closing a ReleasableBytesStreamOutput release the underlying BigArray so
that we can use try-with-resources with these streams and avoid leaking memory by not returning
the BigArray. As part of this change, the ReleasableBytesStreamOutput adds protection to only
release the BigArray once.

In order to make some of the changes cleaner, the ReleasableBytesStream interface has been
removed. The BytesStream interface is changed to a abstract class so that we can use it as a
useable return type for a new method, Streams#flushOnCloseStream. This new method wraps a
given stream and overrides the close method so that the stream is simply flushed and not closed.
This behavior is used in the TcpTransport when compression is used with a
ReleasableBytesStreamOutput as we need to close the compressed stream to ensure all of the data
is written from this stream. Closing the compressed stream will try to close the underlying stream
but we only want to flush so that all of the written bytes are available.

Additionally, an error message method added in the BytesRestResponse did not use a builder
provided by the channel and instead created its own JSON builder. This changes that method to use
the channel builder and in turn the bytes stream output that is managed by the channel.

Note, this commit differs from 6bfecdf921 in that it updates
ReleasableBytesStreamOutput to handle the case of the BigArray decreasing in size, which changes
the reference to the BigArray. When the reference is changed, the releasable needs to be updated
otherwise there could be a leak of bytes and corruption of data in unrelated streams.

This reverts commit afd45c1432, which reverted #23572.
2017-04-14 10:50:31 -04:00
Tim Brooks ffaac5a08a Simplify BulkProcessor handling and retry logic (#24051)
This commit collapses the SyncBulkRequestHandler and
AsyncBulkRequestHandler into a single BulkRequestHandler. The new
handler executes a bulk request and awaits for the completion if the
BulkProcessor was configured with a concurrentRequests setting of 0.
Otherwise the execution happens asynchronously.

As part of this change the Retry class has been refactored.
withSyncBackoff and withAsyncBackoff have been replaced with two
versions of withBackoff. One method takes a listener that will be
called on completion. The other method returns a future that will been
complete on request completion.
2017-04-13 14:48:52 -05:00
Nik Everett e99f90fb46 Add more debugging information to rethrottles
I'm still trying to track down failures like:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+master+dockeralpine-periodic/1180/console

It looks like a task is hanging but I'm not sure why. So this
adds more logging for next time.
2017-04-12 08:37:31 -04:00
Jason Tedor 653619079c Skip two Painless branch tests on Windows
This commit skips the two Painless tests
EqualsTests#testBranchEqualsDefAndPrimitive and
EqualsTests#testBranchNotEqualsDefAndPrimitive on Windows as the tests
are repeatedly failing there.
2017-04-11 06:19:42 -04:00
Colin Goodheart-Smithe 0114f0061c Removes version 2.x constants from Version (#24011)
* Removes version 2.x constants from Version

Closes #21887

* Addresses review comments
2017-04-11 08:31:22 +01:00
Luca Cavanna 2c545c064d Move getProperty method out of MultiBucketsAggregation.Bucket interface (#23988)
The getProperty method is an internal method needed to run pipeline aggregations and retrieve info by path from the aggs tree. It is not needed in the MultiBucketsAggregation.Bucket interface, which is returned to users running aggregations from the transport client. The method is moved to the InternalMultiBucketAggregation class as that's where it belongs.
2017-04-10 13:35:01 +02:00
Nik Everett de6837b7ac Fix throttled reindex_from_remote (#23953)
reindex_from_remote was using `TimeValue#toString` to generate the
scroll timeout which is bad because that generates fractional
time values that are useful for people but bad for Elasticsearch
which doesn't like to parse them. This switches it to using
`TimeValue#getStringRep` which spits out whole time values.

Closes to #23945

Makes #23828 even more desirable
2017-04-07 15:56:52 -04:00
Martijn van Groningen 3d9671a668
[PERCOLATOR] Allowing range queries with now ranges inside percolator queries.
Before now ranges where forbidden, because the percolator query itself could get cached and then the percolator queries with now ranges that should no longer match, incorrectly will continue to match.
By disabling caching when the `percolator` is being used, the percolator can now correctly support range queries with now based ranges.

 I think this is the right tradeoff. The percolator query is likely to not be the same between search requests and disabling range queries with now ranges really disabled people using the percolator for their use cases.

 Also fixed an issue that existed in the percolator fieldmapper, it was unable to find forbidden queries inside `dismax` queries.

 Closes #23859
2017-04-07 08:44:43 +02:00
Tim Brooks 5b1fbe5e6c Decouple BulkProcessor from client implementation (#23373)
This commit modifies the BulkProcessor to be decoupled from the
client implementation. Instead it just takes a
BiConsumer<BulkRequest, ActionListener<BulkResponse>> that executes
the BulkRequest.
2017-04-05 12:12:43 -05:00
Jason Tedor afd45c1432 Revert "Closing a ReleasableBytesStreamOutput closes the underlying BigArray (#23572)"
This reverts commit 6bfecdf921.
2017-04-04 20:33:51 -04:00
Christoph Büscher 6cfbef73a0 Follow renaming of randomAsciiOfLength() to randomAlphaOfLength() 2017-04-04 18:31:00 +02:00
Christoph Büscher 024ed1b6ca Merge branch 'master' into feature/rank-eval 2017-04-04 18:23:41 +02:00
Jay Modi 6bfecdf921 Closing a ReleasableBytesStreamOutput closes the underlying BigArray (#23572)
This commit makes closing a ReleasableBytesStreamOutput release the underlying BigArray so
that we can use try-with-resources with these streams and avoid leaking memory by not returning
the BigArray. As part of this change, the ReleasableBytesStreamOutput adds protection to only release the BigArray once.

In order to make some of the changes cleaner, the ReleasableBytesStream interface has been
removed. The BytesStream interface is changed to a abstract class so that we can use it as a
useable return type for a new method, Streams#flushOnCloseStream. This new method wraps a
given stream and overrides the close method so that the stream is simply flushed and not closed.
This behavior is used in the TcpTransport when compression is used with a
ReleasableBytesStreamOutput as we need to close the compressed stream to ensure all of the data
is written from this stream. Closing the compressed stream will try to close the underlying stream
but we only want to flush so that all of the written bytes are available.

Additionally, an error message method added in the BytesRestResponse did not use a builder
provided by the channel and instead created its own JSON builder. This changes that method to use the channel builder and in turn the bytes stream output that is managed by the channel.
2017-04-04 17:01:30 +01:00
Jason Tedor 3136ed1490 Rename random ASCII helper methods
This commit renames the random ASCII helper methods in ESTestCase. This
is because this method ultimately uses the random ASCII methods from
randomized runner, but these methods actually only produce random
strings generated from [a-zA-Z].

Relates #23886
2017-04-04 11:04:18 -04:00
Nik Everett ebd74f09cf Add extra debugging to reindex cancel tests
Adds more diagnostics when reindex's cancel tests fail. It fails
every once in a while and didn't have useful failure messages:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.3+multijob-unix-compatibility/os=amazon/86/consoleFull
2017-03-31 11:17:06 -04:00
Jim Ferenczi f3a925fdbe Fix reindex with a remote source on a version before 2.0.0 (#23805)
Send the scroll id in the body as plain text when the remote version is before 2.0.0
2017-03-31 09:07:43 +02:00
Tim Brooks 5fa80a6521 Pass exception from sendMessage to listener (#23559)
This commit changes the listener passed to sendMessage from a Runnable
to a ActionListener.

This change also removes IOException from the sendMessage signature.
That signature is misleading as it allows implementers to assume an
exception will be thrown in case of failure. That does not happen due
to Netty's async nature.
2017-03-30 15:08:23 -05:00
Dimitris Athanasiou 34f116eae3 Require explicit query in _delete_by_query API (#23632)
As the query of a search request defaults to match_all,
calling _delete_by_query without an explicit query may
result in deleting all data.

In order to protect users against falling into that
pitfall, this commit adds a check to require the explicit
setting of a query.

Closes #23629
2017-03-28 15:44:57 +01:00
Jim Ferenczi 0e95c90e9f Upgrade to Lucene 6.5.0 (#23750) 2017-03-27 15:57:54 +02:00
Ryan Ernst 4cb8a0100c Build: Rewrite antlr regeneration in gradle (#23733)
This change ports the regeneration of antlr parser/lexer into gradle
(but does still take advantage of ant calls where appropriate).
2017-03-24 09:44:53 -07:00
Ryan Ernst 8c53555b28 Tests: Use local clone build of 5.x with bwc tests (#22946)
The current rest backcompat tests, which run against a mixed cluster of
5.x and 6.0 nodes, depend on snapshot builds of 5.x. However, this has
the potential for inconsistency that results in CI failures, and happens
quite often, whenever some backcompat logic is added to 5.x, but the bwc
test on master fails because the 5.x code has not yet been published as
a snapshot.

This change creates a git clone of the 5.x branch,
builds the zip distribution, and ties that into gradle substitutions for
the 5.x version.
2017-03-23 22:32:13 -07:00
Christoph Büscher 4a75ede208 Reformatting source to fit 100 character line length restriction 2017-03-23 20:20:22 +01:00
Christoph Büscher 96fc3aaf6f Merge branch 'master' into feature/rank-eval 2017-03-23 19:55:47 +01:00
AdityaJNair 63757efe9c Remove DocumentMapper#parse(String index, String type, String id, BytesReference source) (#23706)
Removed `parse(String index, String type, String id, BytesReference source)` in DocumentMapper.java and replaced all of its use in Test files with `parse(SourceToParse source)`.

`parse(String index, String type, String id, BytesReference source)` was only used in test files and never in the main code so it was removed. All of the test files that used it was then modified to use `parse(SourceToParse source)` method that existing in DocumentMapper.java
2017-03-23 11:01:09 -04:00
Nik Everett 257a7d77ed Painless: Fix regex lexer and error messages (#23634)
Without this change, if write a script with multiple regexes
*sometimes* the lexer will decide to look at them like one
big regex and then some trailing garbage. Like this discuss post:
https://discuss.elastic.co/t/error-with-the-split-function-in-painless-script/79021

```
def val = /\\\\/.split(ctx._source.event_data.param17);
if (val[2] =~ /\\./) {
  def val2 = /\\./.split(val[2]);
  ctx._source['user_crash'] = val2[0]
} else {
  ctx._source['user_crash'] = val[2]
}
```

The error message you get from the lexer is `lexer_no_viable_alt_exception`
right after the *second* regex.

With this change each regex is just a single regex like it ought to be.

As a bonus, while looking into this issue I found that the error
reporting for regexes wasn't very nice. If you specify an invalid
pattern then you get an error marker on the start of the pattern
with the JVM's regex error message which attempts to point you to the
location in the regex but is totally unreadable in the JSON response.

This change fixes the location to point to the appropriate spot
inside the pattern and removes the portion of the JVM's error message
that doesn't render well. It is no longer needed now that we point
users to the appropriate spot in the pattern.
2017-03-22 15:56:17 -04:00
Nik Everett bc65be2a65 Reindex: wait for cleanup before responding (#23677)
Changes reindex and friends to wait until the entire request has
been "cleaned up" before responding. "Clean up" in this context
is clearing the scroll and (for reindex-from-remote) shutting
down the client. Failures to clean up are still only logged, not
returned to the user.

Closes #23653
2017-03-21 15:33:39 -04:00
Christoph Büscher f5388e5799 Adapting rank_eval integration tests 2017-03-14 12:21:28 -07:00