Commit Graph

4097 Commits

Author SHA1 Message Date
Martijn van Groningen c85ac402b0
test: Make many percolator integration tests real integration tests 2017-06-27 17:44:30 +02:00
Simon Willnauer d338a09812 Remove `mapping.single_type` from parent join test (#25391)
This removes the remaining usage of `mapping.single_type` from the parent join
module and moves it's bwc test to the mixed cluster tests

Relates to #24961
Relates to #20257
2017-06-26 17:33:07 +02:00
Martijn van Groningen a34f5fa812
Move more token filters to analysis-common module
The following token filters were moved: stemmer, stemmer_override, kstem, dictionary_decompounder, hyphenation_decompounder, reverse, elision and truncate.

Relates to #23658
2017-06-26 09:02:16 +02:00
Nik Everett da0b991331 Remove `index.mapping.single_type=false` from reindex tests (#25365)
* Remove the setting from the yml tests and replace with tests using
`join` field. We can't use the setting in yml tests without lots of
backflips but we have `ReindexParentChildTests` for the coverage.
There weren't tests for `join` field with reindex before this. Adding
these tests discovered #25363.
* Remove the setting from `ReindexParentChildTests` and replace with
`index.version.created=V_5_6_0`. This test can be entirely removed
when legacy parent/child support is dropped from core.
* Port the yml tests that set _parent into integ tests so they
can set the index created version. These tests can be removed
when we drop support for _parent in core.
* Port a delete-by-query test for filtering based on type to an
`ESIntegTestCase` so it can use `index.version.created=5.6.0` to
setup documents of multiple types. This whole feature can be dropped
when we no longer support multiple types per index.

Relates to #24961
2017-06-23 17:14:59 -04:00
Simon Willnauer 4ae426a552 Remove remaining `index.mapping.single_type=false` (#25369)
This change cleans up remaining tests  to not use index.mapping.single_type=false
but instead where applicable use a single type or markt the index as created
with a pre 6.x version.

Yet, there is still on leftover in the client tests that needs special attention.
See `org.elasticsearch.client.SearchIT`

Relates to #24961
2017-06-23 10:26:06 +02:00
Tal Levy 1ac7818201 fix sort and string processor tests around targetField (#25358)
Tests were randomly assigning `targetField` to an existing field that was an array,
causing path resolution issues. This PR fixes those tests

Closes #25346 & #25348
2017-06-22 13:14:18 -07:00
Jack Conradson 96b62409a8 Update Painless to Allow Augmentation from Any Class (#25360)
Custom whitelists in Painless will need to allow classes to be augmented beyond the currently hard-coded Augmentation class tied to Painless directly. This change allows any class to specify an augmentation on a Painless struct using an appropriate static method. Changes to loading the whitelist have also been created to allow for this specification of a different class for augmentation.
2017-06-22 12:16:46 -07:00
Martijn van Groningen 343e7571b9
test: single type defaults to true since alpha1 and not alpha3
Closes #25354
2017-06-22 16:31:15 +02:00
Adrien Grand 44e9c0b947 Upgrade to lucene-7.0.0-snapshot-ad2cb77. (#25349)
Most notable changes:
 - better update concurrency: LUCENE-7868
 - TopDocs.totalHits is now a long: LUCENE-7872
 - QueryBuilder does not remove the boolean query around multi-term synonyms:
   LUCENE-7878
 - removal of Fields: LUCENE-7500

For the `TopDocs.totalHits` change, this PR relies on the fact that the encoding
of vInts and vLongs are compatible: you can write and read with any of them as
long as the value can be represented by a positive int.
2017-06-22 12:35:33 +02:00
Martijn van Groningen a977569085
percolator: Deprecate `document_type` parameter.
The `document_type` parameter is no longer required to be specified,
because by default from 6.0 only a single type is allowed. (`index.mapping.single_type` defaults to `true`)
2017-06-22 09:55:06 +02:00
Nik Everett 8d9a08e239 Fix reindex test when log level is debug
When log level is debug we'd dereference null because the test
was being cute and cutting corners.

Relates to #25256
2017-06-20 16:06:58 -04:00
Jun Ohtani 62d1969595 Parse synonyms with the same analysis chain (#8049)
* [Analysis] Parse synonyms with the same analysis chain

Synonym Token Filter / Synonym Graph Filter tokenize synonyms with whatever tokenizer and token filters appear before it in the chain.

Close #7199
2017-06-20 21:50:33 +09:00
Nik Everett 3261586cac Tweak reindex cancel logic and add many debug logs (#25256)
I'm still trying to hunt down rare failures in the cancelation tests
for reindex and friends. Here is the latest:
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+5.x+multijob-unix-compatibility/os=ubuntu/876/console

It doesn't show much, other than that one of the tasks didn't kill
itself when asked to cancel.

So I'm going a bit crazy with debug logging so that the next time this
comes up I can trace exactly what happened.

Additionally, this tweaks the logic around how rethrottles were
performed around cancel. Previously we set the `requestsPerSecond`
to `0` when we cancelled the task. That was the "old way" to set them
to inifity which was the intent. This switches that from `0` to
`Float.MAX_VALUE` which is the "new way" to set the `requestsPerSecond`
to infinity. I don't know that this is much better, but it feels better.
2017-06-19 18:46:42 -04:00
Andy Bristol 4c5bd57619 Rename simple pattern tokenizers (#25300)
Changed names to be snake case for consistency

Related to #25159, original issue #23363
2017-06-19 13:48:43 -07:00
Simon Willnauer a8d5a58801 Replace deprecated API usage in Netty4HttpChannel 2017-06-17 14:04:23 +02:00
Christoph Büscher e99ced06cc [Tests] Check that parsing aggregations works in a forward compatible way (#25219)
This change adds tests for the aggregation parsing that try to simulate that we
can parse existing aggregations in a forward compatible way in the future,
ignoring potential newly added fields or substructures to the xContent response.
2017-06-17 13:06:31 +02:00
Simon Willnauer f18b0d293c Move TransportStats accounting into TcpTransport (#25251)
Today TcpTransport is the de-facto base-class for transport implementations.
The need for all the callbacks we have in TransportServiceAdaptor are not necessary
anymore since we can simply have the logic inside the base class itself. This change
moves the stats metrics directly into TcpTransport removing the need for low level
bytes send / received callbacks.
2017-06-16 22:34:11 +02:00
Nik Everett ecc87f613f Move pre-configured "keyword" tokenizer to the analysis-common module (#24863)
Moves the keyword tokenizer to the analysis-common module. The keyword tokenizer is special because it is used by CustomNormalizerProvider so I pulled it out into its own PR. To get the move to work I've reworked the lookup from static to one using the AnalysisRegistry. This seems safe enough.

Part of #23658.
2017-06-16 11:48:15 -04:00
Jack Conradson 50db8cb351 Add needs methods for specific variables to Painless script context factories. (#25267) 2017-06-15 17:00:33 -07:00
Martijn van Groningen 428e70758a
Moved more token filters to analysis-common module.
The following token filters were moved: `edge_ngram`, `ngram`, `uppercase`, `lowercase`, `length`, `flatten_graph` and `unique`.

Relates to #23658
2017-06-15 18:28:31 +02:00
Jim Ferenczi 5e64cd08bc [Test] restore BWC for parent-join now that the new mapping format is in 5.x 2017-06-15 15:15:48 +02:00
Jim Ferenczi 9ca33e2450 Add a section named "relations" in the ParentJoinFieldMapper (#25248)
* Add a section named "relation" in the ParentJoinFieldMapper

This commit puts the parent/child definition in an inner section named "relation".
Mapping for the parent-join will look like this:

```
"join_field": {
  "type": "join"
  "relations":
    "parent": "child"
  }
}
```
2017-06-15 14:56:20 +02:00
Tal Levy 2cd771a230 fix: Sort Processor does not have proper behavior with targetField (#25237)
to specify a `targetField`. This results in some interesting behavior that was missed in the review.
This processor sorts in-place, so there is a side-effect in both the original field and the target field.
Another bug was that the targetField was not being set if the list being sorted was fewer than two elements.

The new behavior works like this: If targetField and fieldName are not the same, we copy the list.
2017-06-15 05:28:54 -07:00
Boaz Leskes 648b4717a4 move assertBusy to use CheckException (#25246)
We use assertBusy in many places where the underlying code throw exceptions. Currently we need to wrap those exceptions in a RuntimeException which is ugly.
2017-06-15 13:24:07 +02:00
Tanguy Leroux 27f1206999 Use SPI in High Level Rest Client to load XContent parsers (#25098)
This commit adds a NamedXContentProvider interface that can 
be implemented by plugins or modules using Java's SPI feature 
in order to provide additional NamedXContent parsers to external
applications like the Java High Level Rest Client.
2017-06-15 12:50:02 +02:00
Adrien Grand 0c117145f6 Upgrade to lucene-7.0.0-snapshot-92b1783. (#25222)
This snapshot has faster range queries on range fields (LUCENE-7828), more
accurate norms (LUCENE-7730) and the ability to use fake term frequencies
(LUCENE-7854).
2017-06-15 09:52:07 +02:00
Ryan Ernst caf7792db1 Scripting: Rename SearchScript.needsScores to needs_score (#25235)
This commit renames the needsScores method so as to make it
automatically generatable, based on the name of the `_score` variable
which is available in search scripts. It also adds documentation to
ScriptContext to explain the naming and signature of such methods.
2017-06-14 22:01:19 -07:00
Jack Conradson a4471f51e4 Support script context stateful factory in Painless. (#25233) 2017-06-14 16:44:41 -07:00
Andy Bristol 48696ab544 expose simple pattern tokenizers (#25159)
Expose the experimental simplepattern and 
simplepatternsplit tokenizers in the common 
analysis plugin. They provide tokenization based 
on regular expressions, using Lucene's 
deterministic regex implementation that is usually 
faster than Java's and has protections against 
creating too-deep stacks during matching.

Both have a not-very-useful default pattern of the 
empty string because all tokenizer factories must 
be able to be instantiated at index creation time. 
They should always be configured by the user 
in practice.
2017-06-13 12:46:59 -07:00
Alexander Kazakov a7dafdaa05 Add target_field parameter to gsub, join, lowercase, sort, split, trim, uppercase (#24133)
Closes #23682 #23228
2017-06-13 09:40:44 -07:00
Simon Willnauer 186c16ea41 Ensure pending transport handlers are invoked for all channel failures (#25150)
Today if a channel gets closed due to a disconnect we notify the response
handler that the connection is closed and the node is disconnected. Unfortunately
this is not a complete solution since it only works for published connections.
Connections that are unpublished ie. for discovery can indefinitely hang since we
never invoke their handers when we get a failure while a user is waiting for
the response. This change adds connection tracking to TcpTransport that ensures
we are notifying the corresponding connection if there is a failure on a channel.
2017-06-13 09:37:05 +02:00
Jason Tedor dcf57f296e Fix get mappings HEAD requests
Get mappings HEAD requests incorrectly return a content-length header of
0. This commit addresses this by removing the special handling for get
mappings HEAD requests, and just relying on the general mechanism that
exists for handling HEAD requests in the REST layer.

Relates #23192
2017-06-11 14:58:56 -04:00
Jason Tedor 7182577904 Fix handling of exceptions thrown on HEAD requests
Today when an exception is thrown handling a HEAD request, the body is
swallowed before the channel has a chance to see it. Yet, the channel is
where we compute the content length that would be returned as a header
in the response. This is a violation of the HTTP specification. This
commit addresses the issue. To address this issue, we remove the special
handling in bytes rest response for HEAD requests when an exception is
thrown. Instead, we let the upstream channel handle the special case, as
we already do today for the non-exceptional case.

Relates #25172
2017-06-10 23:44:18 -04:00
Ryan Ernst a03b6c2fa5 Scripting: Change keys for inline/stored scripts to source/id (#25127)
This commit adds back "id" as the key within a script to specify a
stored script (which with file scripts now gone is no longer ambiguous).
It also adds "source" as a replacement for "code". This is in an attempt
to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.
2017-06-09 08:29:25 -07:00
Jim Ferenczi 8250aa4267 Remove the postings highlighter and make unified the default highlighter choice (#25028)
This change removes the `postings` highlighter. This highlighter has been removed from Lucene master (7.x) because it behaves
exactly like the `unified` highlighter when index_options is set to `offsets`:
https://issues.apache.org/jira/browse/LUCENE-7815

It also makes the `unified` highlighter the default choice for highlighting a field (if `type` is not provided).
The strategy used internally by this highlighter remain the same as before, it checks `term_vectors` first, then `postings` and ultimately it re-analyzes the text.
Ultimately it rewrites the docs so that the options that the `unified` highlighter cannot handle are clearly marked as such.
There are few features that the `unified` highlighter is not able to handle which is why the other highlighters (`plain` and `fvh`) are still available.
I'll open separate issues for these features and we'll deprecate the `fvh` and `plain` highlighters when full support for these features have been added to the `unified`.
2017-06-09 14:09:57 +02:00
Tal Levy a771912a22 Add Ingest-Processor specific Rest Endpoints & Add Grok endpoint (#25059)
This PR enables Ingest plugins to leverage processor-scoped REST
endpoints. First of which being the Grok endpoint that retrieves
Grok Patterns for users to retrieve all the built-in patterns.
Example usage: Kibana Grok Autocomplete!
2017-06-08 15:24:35 -07:00
Tal Levy 340909582f remove Ingest's Internal Template Service (#25085)
Ingest was using it's own wrapper around TemplateScripts and the ScriptService.
This commit removes that abstraction
2017-06-08 15:24:03 -07:00
Guillaume Le Floch 3f6d80aa66 Allow removing multiple fields in ingest processor (#24750)
* Allow removing multiple fields in ingest processor

* Iteration 2

* Few fixes
2017-06-08 13:17:44 -07:00
Jim Ferenczi 21a57c1494 Always use DisjunctionMaxQuery to build cross fields disjunction (#25115)
This commit modifies query_string, simple_query_string and multi_match queries to always use a DisjunctionMaxQuery when a disjunction over multiple fields is built. The tiebreaker is set to 1 in order to behave like the boolean query in terms of scoring.
The removal of the coord factor in Lucene 7 made this change mandatory to correctly handle minimum_should_match.

Closes #23966
2017-06-08 11:18:17 +02:00
Adrien Grand a8ea2f0df4 Leverage scorerSupplier when applicable. (#25109)
The `scorerSupplier` API allows to give a hint to queries in order to let them
know that they will be consumed in a random-access fashion. We should use this
for aggregations, function_score and matched queries.
2017-06-08 10:19:38 +02:00
Jack Conradson d187fa78fd Generate Painless Factory for Creating Script Instances (#25120) 2017-06-07 16:06:11 -07:00
Jim Ferenczi 3924fd79ef Add BWC rest test for parent-join after the backport to 5.x 2017-06-07 19:29:01 +02:00
Martijn van Groningen db8aa8e94e
Changed inner_hits to work with the new join field type and
at the same time maintaining support for the `_parent` meta field type/

Relates to #20257
2017-06-07 10:52:49 +02:00
Jason Tedor e03c4938c5 GET aliases should 404 if aliases are missing
Previously the HEAD and GET aliases endpoints were misaigned in
behavior. The HEAD verb would 404 if any aliases are missing while the
GET verb would not if any aliases existed. When HEAD was aligned with
GET, this broke the previous usage of HEAD to serve as an existence
check for aliases. It is the behavior of GET that is problematic here
though, if any alias is missing the request should 404. This commit
addresses this by modifying the behavior of GET to behave in this
way. This fixes the behavior for HEAD to also 404 when aliases are
missing.

Relates #25043
2017-06-06 14:37:29 -04:00
Jim Ferenczi 7e60cf3e54 Move parent_id query to the parent-join module (#25072)
This change moves the parent_id query to the parent-join module and handles the case when only the parent-join field can be declared on an index (index with single type on).
If single type is off it uses the legacy parent join field mapper and switch to the new one otherwise (default in 6).

Relates #20257
2017-06-06 19:35:14 +02:00
Tal Levy d6d0c13bd6 fix grok's pattern parsing to validate pattern names in expression (#25063)
Unknown patterns used to silently be ignored. This was a problem because users did not know they were providing an invalid pattern name, and maybe thought the rest of their regexes were invalid.

Fixes #22831.
2017-06-06 08:07:53 -07:00
Tal Levy e51246023a add `exclude_keys` option to KeyValueProcessor (#24876)
and modify data-structure of `include_keys` and `exclude_keys` to be
backed by a HashSet
2017-06-05 14:12:48 -07:00
Alex Benusovich 5463294ec4 Fixed NPEs caused by requests without content. (#23497)
REST handlers that require a body will throw an an ElasticsearchParseException "request body required".
REST handlers that require a body OR source param will throw an ElasticsearchParseException "request body or source param required".
Replaced asserts in BulkRequest parsing code with a more descriptive IllegalArgumentException if the line contains an empty object.
Updated bulk REST test to verify an empty action line is rejected properly.
Updated BulkRequestTests with randomized testing for an empty action line.
Used try-with-resouces for XContentParser in AbstractBulkByQueryRestHandler.
2017-06-05 09:08:14 -06:00
Nik Everett 73307a2144 Plugins can register pre-configured char filters (#25000)
Fixes the plumbing so plugins can register char filters and moves
the `html_strip` char filter into analysis-common.

Relates to #23658
2017-06-05 09:25:15 -04:00
Jack Conradson 8999104b14 Change ScriptContexts to use needs instead of uses$. (#25036) 2017-06-02 14:50:51 -07:00