Commit Graph

179 Commits

Author SHA1 Message Date
Tal Levy 1ac7818201 fix sort and string processor tests around targetField (#25358)
Tests were randomly assigning `targetField` to an existing field that was an array,
causing path resolution issues. This PR fixes those tests

Closes #25346 & #25348
2017-06-22 13:14:18 -07:00
Tal Levy 2cd771a230 fix: Sort Processor does not have proper behavior with targetField (#25237)
to specify a `targetField`. This results in some interesting behavior that was missed in the review.
This processor sorts in-place, so there is a side-effect in both the original field and the target field.
Another bug was that the targetField was not being set if the list being sorted was fewer than two elements.

The new behavior works like this: If targetField and fieldName are not the same, we copy the list.
2017-06-15 05:28:54 -07:00
Alexander Kazakov a7dafdaa05 Add target_field parameter to gsub, join, lowercase, sort, split, trim, uppercase (#24133)
Closes #23682 #23228
2017-06-13 09:40:44 -07:00
Ryan Ernst a03b6c2fa5 Scripting: Change keys for inline/stored scripts to source/id (#25127)
This commit adds back "id" as the key within a script to specify a
stored script (which with file scripts now gone is no longer ambiguous).
It also adds "source" as a replacement for "code". This is in an attempt
to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.
2017-06-09 08:29:25 -07:00
Tal Levy a771912a22 Add Ingest-Processor specific Rest Endpoints & Add Grok endpoint (#25059)
This PR enables Ingest plugins to leverage processor-scoped REST
endpoints. First of which being the Grok endpoint that retrieves
Grok Patterns for users to retrieve all the built-in patterns.
Example usage: Kibana Grok Autocomplete!
2017-06-08 15:24:35 -07:00
Tal Levy 340909582f remove Ingest's Internal Template Service (#25085)
Ingest was using it's own wrapper around TemplateScripts and the ScriptService.
This commit removes that abstraction
2017-06-08 15:24:03 -07:00
Guillaume Le Floch 3f6d80aa66 Allow removing multiple fields in ingest processor (#24750)
* Allow removing multiple fields in ingest processor

* Iteration 2

* Few fixes
2017-06-08 13:17:44 -07:00
Tal Levy d6d0c13bd6 fix grok's pattern parsing to validate pattern names in expression (#25063)
Unknown patterns used to silently be ignored. This was a problem because users did not know they were providing an invalid pattern name, and maybe thought the rest of their regexes were invalid.

Fixes #22831.
2017-06-06 08:07:53 -07:00
Tal Levy e51246023a add `exclude_keys` option to KeyValueProcessor (#24876)
and modify data-structure of `include_keys` and `exclude_keys` to be
backed by a HashSet
2017-06-05 14:12:48 -07:00
Alex Benusovich 5463294ec4 Fixed NPEs caused by requests without content. (#23497)
REST handlers that require a body will throw an an ElasticsearchParseException "request body required".
REST handlers that require a body OR source param will throw an ElasticsearchParseException "request body or source param required".
Replaced asserts in BulkRequest parsing code with a more descriptive IllegalArgumentException if the line contains an empty object.
Updated bulk REST test to verify an empty action line is rejected properly.
Updated BulkRequestTests with randomized testing for an empty action line.
Used try-with-resouces for XContentParser in AbstractBulkByQueryRestHandler.
2017-06-05 09:08:14 -06:00
Tal Levy 2a6e6866bd Fix floating-point error when DateProcessor parses UNIX (#24947)
DateProcessor's DateFormat UNIX format parser resulted in
a floating point rounding error when parsing certain stringed
epoch times. Now Double.parseDouble is used, preserving the
intented input.
2017-05-30 09:42:26 -07:00
Ryan Ernst 74e031e842 Scripting: Rename CompiledType to FactoryType in ScriptContext (#24897)
This commit renames the concept of the "compiled type" to a "factory
type", along with all implementations of this class to be named Factory.
This brings it inline with the classes purpose.
2017-05-26 00:02:54 -07:00
Ryan Ernst 8aaea51a0a Scripting: Move context definitions to instance type classes (#24883)
This is a simple refactoring to move the context definitions into the
type that they use. While we have multiple context names for the same
class at the moment, this will eventually become one ScriptContext per
instance type, so the pattern of a static member on the interface called
CONTEXT can be used. This commit also moves the consolidated list of
contexts provided by core ES into ScriptModule.
2017-05-25 12:18:45 -07:00
Ryan Ernst 1daacd97b0 Scripting: Add instance and compiled classes to script contexts (#24868)
This commit modifies the compile method of ScriptService to be context
aware. The ScriptContext is now a generic class which contains both the
instance type and compiled type for a script. Instance type may be
stateful (for example, pre loading field information for the index a
script will execute on, like in expressions), while the compiled type is
stateless and used to construct instance type instances. This change is
only a first step to cutover ScriptService to the new paradigm. It only
converts callers to the script service, and has a small shim to wrap
compilation from the script engines to support the current two fixed
instance types, SearchScript and ExecutableScript.
2017-05-24 14:29:02 -07:00
Ryan Ernst 52d504bb5f Scripting: Simplify ScriptContext (#24818)
As we work towards contexts implying the return type of compilation, we
first need ScriptContext to not be an enum. This commit removes the
Standard enum and Plugin subclass of ScriptContext.
2017-05-22 13:11:15 -07:00
Ryan Ernst 463fe2f4d4 Scripting: Remove file scripts (#24627)
This commit removes file scripts, which were deprecated in 5.5.

closes #21798
2017-05-17 14:42:25 -07:00
Ryan Ernst 2a65bed243 Tests: Change rest test extension from .yaml to .yml (#24659)
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
2017-05-16 17:24:35 -07:00
Koen De Groote 878ae8eb3c Size lists in advance when known
When constructing an array list, if we know the size of the list in
advance (because we are adding objects to it derived from another list),
we should size the array list to the appropriate capacity in advance (to
avoid resizing allocations). This commit does this in various places.

Relates #24439
2017-05-12 10:36:13 -04:00
javanna e875f7f72e remove duplicated import in AppendProcessor 2017-05-08 10:36:36 +02:00
Nik Everett bc45d10e82 Remove most usages of 1-arg Script ctor (#24325)
The one argument ctor for `Script` creates a script with the
default language but most usages of are for testing and either
don't care about the language or are for use with
`MockScriptEngine`. This replaces most usages of the one argument
ctor on `Script` with calls to `ESTestCase#mockScript` to make
it clear that the tests don't need the default scripting language.

I've also factored out some copy and pasted script generation
code into a single place. I would have had to change that code
to use `mockScript` anyway, so it was easier to perform the
refactor.

Relates to #16314
2017-04-26 16:04:38 -04:00
Ryan Ernst 473e98981b Scripts: Remove unnecessary executable shortcut (#24264)
ScriptService has two executable methods, one which takes a
CompiledScript, which is similar to search, and one that takes a raw
Script and both compiles and returns an ExecutableScript for it. The
latter is not needed, and the call sites which used one or the other
were mixed. This commit removes the extra executable method in favor of
callers first calling compile, then executable.
2017-04-21 17:53:03 -07:00
Ryan Ernst 212f24aa27 Tests: Clean up rest test file handling (#21392)
This change simplifies how the rest test runner finds test files and
removes all leniency.  Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.

closes #20240
2017-04-18 15:07:08 -07:00
Jason Tedor 3136ed1490 Rename random ASCII helper methods
This commit renames the random ASCII helper methods in ESTestCase. This
is because this method ultimately uses the random ASCII methods from
randomized runner, but these methods actually only produce random
strings generated from [a-zA-Z].

Relates #23886
2017-04-04 11:04:18 -04:00
Jason Tedor 9a0b216c36 Upgrade checkstyle to version 7.5
This commit upgrades the checkstyle configuration from version 5.9 to
version 7.5, the latest version as of today. The main enhancement
obtained via this upgrade is better detection of redundant modifiers.

Relates #22960
2017-02-03 09:46:44 -05:00
Jack Conradson 3d2626c4c6 Change Namespace for Stored Script to Only Use Id (#22206)
Currently, stored scripts use a namespace of (lang, id) to be put, get, deleted, and executed. This is not necessary since the lang is stored with the stored script. A user should only have to specify an id to use a stored script. This change makes that possible while keeping backwards compatibility with the previous namespace of (lang, id). Anywhere the previous namespace is used will log deprecation warnings.

The new behavior is the following:

When a user specifies a stored script, that script will be stored under both the new namespace and old namespace.

Take for example script 'A' with lang 'L0' and data 'D0'. If we add script 'A' to the empty set, the scripts map will be ["A" -- D0, "A#L0" -- D0]. If a script 'A' with lang 'L1' and data 'D1' is then added, the scripts map will be ["A" -- D1, "A#L1" -- D1, "A#L0" -- D0].

When a user deletes a stored script, that script will be deleted from both the new namespace (if it exists) and the old namespace.

Take for example a scripts map with {"A" -- D1, "A#L1" -- D1, "A#L0" -- D0}. If a script is removed specified by an id 'A' and lang null then the scripts map will be {"A#L0" -- D0}. To remove the final script, the deprecated namespace must be used, so an id 'A' and lang 'L0' would need to be specified.

When a user gets/executes a stored script, if the new namespace is used then the script will be retrieved/executed using only 'id', and if the old namespace is used then the script will be retrieved/executed using 'id' and 'lang'
2017-01-31 13:27:02 -08:00
Tal Levy e9a68b3287 fix date-processor to a new default year for every new pipeline execution. (#22601)
Beforehand, the DateProcessor constructs its joda pattern formatter during processor
construction. This led to newly ingested documents being defaulted to
the year that the pipeline was constructed, not that of processing.

Fixes #22547.
2017-01-25 15:09:07 -08:00
Tal Levy e6fb3a5d95 fix index out of bounds error in KV Processor (#22288)
- checks for index-out-of-bounds
- added unit tests for failed `field_split` and `value_split` scenarios

missed this test in #22272.
2016-12-27 10:57:11 -08:00
Nik Everett f5f2149ff2 Remove much ceremony from parsing client yaml test suites (#22311)
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.

I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.
2016-12-22 11:00:34 -05:00
Tal Levy c53b2ee9cd introduce KV Processor in Ingest Node (#22272)
Now you can parse field values of the `key=value` variety and have
`key` be inserted as a field name in an ingest document.

Closes #22222.
2016-12-20 13:26:17 -08:00
Nik Everett a04dcfb95b Introduce XContentParser#namedObject (#22003)
Introduces `XContentParser#namedObject which works a little like
`StreamInput#readNamedWriteable`: on startup components register
parsers under names and a superclass. At runtime we look up the
parser and call it to parse the object.

Right now the parsers take a context object they use to help with
the parsing but I hope to be able to eliminate the need for this
context as most what it is used for at this point is to move
around parser registries which should be replaced by this method
eventually. I make no effort to do so in this PR because it is
big enough already. This is meant to the a start down a road that
allows us to remove classes like `QueryParseContext`,
`AggregatorParsers`, `IndicesQueriesRegistry`, and
`ParseFieldRegistry`.

The goal here is to reduce the amount of plumbing required to
allow parsing pluggable things. With this you don't have to pass
registries all over the place. Instead you must pass a super
registry to fewer places and use it to wrap the reader. This is
the same tradeoff that we use for NamedWriteable and it allows
much, much simpler binary serialization. We think we want that
same thing for xcontent serialization.

The only parsing actually converted to this method is parsing
`ScoreFunctions` inside of `FunctionScoreQuery`. I chose this
because it is relatively self contained.
2016-12-20 11:05:24 -05:00
Grzegorz Gajos f6b6e4e376 Added ability to remove pipelines via wildcards (#22149) (#22191)
This commit is adding an ability to remove pipelines with wildcards.
2016-12-19 10:59:59 -08:00
Tal Levy bb37167946 Enables the ability to inject serialized json fields into root of document. (#22179)
The JSON processor has an optional field called "target_field".
If you don't specify target_field then target_field becomes what you specified as "field".
There isn't anyway to add the fields to the root of a document. By
setting `add_to_root`, now serialized fields will be inserted into the
top-level fields of the ingest document.

Closes #21898.
2016-12-16 10:17:27 -08:00
Tal Levy eaf82a6e7e compile ScriptProcessor inline scripts when creating ingest pipelines (#21858)
Inline scripts defined in Ingest Pipelines are now compiled at creation time to preemptively catch errors on initialization of the pipeline.

Fixes #21842.
2016-12-14 17:26:51 -08:00
Tal Levy f56097b57a Fixes GrokProcessor's ignorance of named-captures with same name. (#22131)
Grok was originally ignoring potential matches to named-capture groups
larger than one. For example, If you had two patterns containing the
same named field, but only the second pattern matched, it would fail to
pick this up.

This PR fixes this by exploring all potential places where a
named-capture was used and chooses the first one that matched.

Fixes #22117.
2016-12-13 13:19:55 -08:00
Adrien Grand 6231009a8f Remove 2.x backward compatibility of mappings. (#21670)
For the record, I also had to remove the geo-hash cell and geo-distance range
queries to make the code compile. These queries already throw an exception in
all cases with 5.x indices, so that does not hurt any more.

I also had to rename all 2.x bwc indices from `index-${version}` to
`unsupported-${version}` to make `OldIndexBackwardCompatibilityIT`
happy.
2016-11-30 13:34:46 +01:00
Tal Levy 6796464f16 add `ignore_missing` option to SplitProcessor (#20982)
Closes #20840.
2016-11-16 15:46:09 +02:00
Tal Levy 04b712bdc5 fix trace_match behavior for when there is only one grok pattern (#21413)
There is an issue in the Grok Processor, where trace_match: true does not inject the _ingest._grok_match_index into the ingest-document when there is just one pattern provided. This is due to an optimization in the regex construction. This commit adds a check for when this is the case, and injects a static index value of "0", since there is only one pattern matched (at the first index into the patterns).

To make this clearer, more documentation was added to the grok-processor docs.

Fixes #21371.
2016-11-16 15:41:54 +02:00
Jack Conradson aeb97ff412 Clean up of Script.
Closes #21321
2016-11-10 09:59:13 -08:00
Ryan Ernst 7a2c984bcc Test: Remove multi process support from rest test runner (#21391)
At one point in the past when moving out the rest tests from core to
their own subproject, we had multiple test classes which evenly split up
the tests to run. However, we simplified this and went back to a single
test runner to have better reproduceability in tests. This change
removes the remnants of that multiplexing support.
2016-11-07 15:07:34 -08:00
Jack Conradson 512a77a633 Refactor ScriptType to be a top-level class. 2016-10-26 10:21:22 -07:00
Tal Levy 38c650f376 make painless the default scripting language for ScriptProcessor (#20981)
- fixes a bug in the docs that mentions `lang` as optional
- now `lang` defaults to "painless"
2016-10-18 16:22:01 -07:00
Martijn van Groningen 55dce523c2 docs: marked `foreach` processor as experimental
Closes #19602
2016-09-30 12:23:42 +02:00
Tal Levy 33b9e2065b no null values in ingest configuration error messages (#20616)
The invalid ingest configuration field name used to show itself,
even when it was null, in error messages. Sometimes this does not make
sense.

e.g.
```[null] Only one of [file], [id], or [inline] may be configure```
vs.
```Only one of [file], [id], or [inline] may be configure```

The above deals with three fields, therefore this no one property
responsible.
2016-09-29 11:34:52 +02:00
Tal Levy 92ab44d35c [fix] JSON Processor was not properly added (#20613) 2016-09-28 23:04:22 +02:00
Tal Levy 9f1f5fdedc introduce the JSON Processor (#20128)
introduce the JSON Processor
2016-09-09 14:34:32 -07:00
Tal Levy dda32545bb add ignore_missing option to relevant processors (#20194) 2016-09-09 12:20:18 -07:00
Martijn van Groningen 6f6d17dc9c ingest: Add `dot_expander` processor that can turn fields with dots in the field name into object fields. 2016-09-05 07:28:38 +02:00
Martijn van Groningen 1925813e09 ingest: Fix rename processor change rename leaf fields into branch fields
Instead of get, set and remove we do get, remove and then set to avoid type conflicts in IngestDocument.
If the set still fails we try to restore the original field in ingest document.

Closes #19892
2016-08-30 07:38:01 +02:00
Martijn van Groningen 48926b4d66 ingest: don't render template twice for append processor 2016-08-26 18:07:32 +02:00
Igor Motov b36fbc4452 Add support for parameters to the script ingest processor
The script processor should support `params` to be consistent with all other script consumers.
2016-08-24 16:49:48 -04:00
Tal Levy 84bf24b1e9 remove ability to set field value in script-processor configuration (#19981) 2016-08-15 10:57:39 -07:00
Martijn van Groningen a91bb29585 ingest: Made the response format of the get pipeline api match with the response format of the index template api
Closes #19585
2016-07-29 17:58:30 +02:00
Martijn van Groningen 7b36a72ccb fixed compiler warnining and removed unused imports 2016-07-29 15:06:49 +02:00
Martijn van Groningen 7e3d5b21bb test: fix generic type 2016-07-27 14:56:17 +02:00
Martijn van Groningen 24d7fa6d54 ingest: Change the `foreach` processor to use the `_ingest._value` ingest metadata attribute to store the current array element being processed.
Closes #19592
2016-07-27 09:35:09 +02:00
Nik Everett 9270e8b22b Rename client yaml test infrastructure
This makes it obvious that these tests are for running the client yaml
suites. Now that there are other ways of running tests using the REST
client against a running cluster we can't go on calling the shared
client yaml tests "REST tests". They are rest tests, but they aren't
**the** rest tests.
2016-07-26 13:53:44 -04:00
Chris Earle 0553ba9151 [Ingest] Add REST _ingest/pipeline to get all pipelines
This adds an extra REST handler for "_ingest/pipeline" so that users do not need to supply "_ingest/pipeline/*" to get all of them.

- Also adds a teardown section to related REST-tests for ingest.
2016-07-26 13:48:15 -04:00
Nik Everett a95d4f4ee7 Add Location header and improve REST testing
This adds a header that looks like `Location: /test/test/1` to the
response for the index/create/update API. The requirement for the header
comes from https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html

https://tools.ietf.org/html/rfc7231#section-7.1.2 claims that relative
URIs are OK. So we use an absolute path which should resolve to the
appropriate location.

Closes #19079

This makes large changes to our rest test infrastructure, allowing us
to write junit tests that test a running cluster via the rest client.
It does this by splitting ESRestTestCase into two classes:
* ESRestTestCase is the superclass of all tests that use the rest client
to interact with a running cluster.
* ESClientYamlSuiteTestCase is the superclass of all tests that use the
rest client to run the yaml tests. These tests are shared across all
official clients, thus the `ClientYamlSuite` part of the name.
2016-07-25 17:02:40 -04:00
Tal Levy 19e7b1c737 fix: no other processors should be executed after on_failure is called in a compound processor (#19545) 2016-07-21 14:27:04 -07:00
Tal Levy f7cd86ef6d rethrow script compilation exceptions into ingest configuration exceptions (#19318)
* rethrow script compilation exceptions into ingest configuration exceptions
* update readProcessor to rethrow any exception as an ElasticsearchException
2016-07-20 10:37:56 -07:00
Martijn van Groningen e0ebf5da1c Template cleanup:
* Removed `Template` class and unified script & template parsing logic. Templates are scripts, so they should be defined as a script. Unless there will be separate template infrastructure, templates should share as much code as possible with scripts.
* Removed ScriptParseException in favour for ElasticsearchParseException
* Moved TemplateQueryBuilder to lang-mustache module because this query is hard coded to work with mustache only
2016-07-18 10:16:01 +02:00
Martijn van Groningen d0069f0fbb Provide access to ThreadContext in ingest plugins
Also introduced a `Processor.Parameters` class that is holder for several services processors rely on,
the  IngestPlugin#getProcessors(...) method has been changed to accept `Processor.Parameters` instead
of each service seperately.
2016-07-15 08:16:15 +02:00
Tal Levy ed768b101f show ignored errors in verbose simulate result (#19404)
Closes #19319.
2016-07-13 13:32:10 -07:00
Tal Levy 8fd01554bc update foreach processor to only support one applied processor. (#19402)
Closes #19345.
2016-07-13 13:13:00 -07:00
Ryan Ernst 2fc41adeb5 Merge branch 'master' into ingest_plugin_api 2016-07-05 20:53:03 -07:00
Tanguy Leroux 0e7faf1005 Enable Checkstyle RedundantModifier 2016-07-04 15:22:12 +02:00
Ryan Ernst e5caadc4f3 Merge branch 'master' into ingest_plugin_api 2016-07-01 12:35:26 -07:00
Ryan Ernst 65c9b0b588 Merge branch 'master' into ingest_plugin_api 2016-07-01 09:26:17 -07:00
Tanguy Leroux 8c40b2b54e Fix order of modifiers 2016-07-01 16:57:14 +02:00
Ryan Ernst e4f265eb3a Ingest: Remove generics from Processor.Factory
The factory for ingest processor is generic, but that is only for the
return type of the create mehtod. However, the actual consumer of the
factories only cares about Processor, so generics are not needed.

This change removes the generic type from the factory. It also removes
AbstractProcessorFactory which only existed in order pull the optional
tag from config. This functionality is moved to the caller of the
factories in ConfigurationUtil, and the create method now takes the tag.
This allows the covariant return of the implementation to work with
tests not needing casts.
2016-06-30 02:33:54 -07:00
Ryan Ernst 08b3b6264e Tests pass, started removing generics from processor factory 2016-06-30 01:49:22 -07:00
Ryan Ernst f4519c44b7 Merge branch 'master' into ingest_plugin_api 2016-06-29 22:38:23 -07:00
Ryan Ernst ecf6101798 Scripts: Remove ClusterState from compile api
Stored scripts are pulled from the cluster state, and the current api
requires passing the ClusterState on each call to compile. However, this
means every user of the ScriptService needs to depend on the
ClusterService. Instead, this change makes the ScriptService a
ClusterStateListener. It also simplifies tests a lot, as they no longer
need to create fake cluster states (except when testing stored scripts).
2016-06-28 13:20:00 -07:00
Ryan Ernst 258c3e86ab Added IngestPlugin api, cutover common and geoip, changed ingest factory
api to take ProcessorsRegistry
2016-06-28 10:52:07 -07:00
Tal Levy 28fd684eef Fix ignore_failure behavior in _simulate?verbose (#18987)
- fix it so that processors with the `ignore_failure` option do not
record their exception in the response
- add more tests to make empty `on_failure`. This now throws an
  exception
2016-06-21 13:29:53 -07:00
Martijn van Groningen 82f7bfad98 ingest: merged o.e.ingest.core with o.e.ingest and in ingest-common module added o.e.ingest.common package
and moved all code to that package.
2016-06-21 09:24:00 +02:00
Ryan Ernst a4503c2aed Plugins: Remove name() and description() from api
In 2.0 we added plugin descriptors which require defining a name and
description for the plugin. However, we still have name() and
description() which must be overriden from the Plugin class. This still
exists for classpath plugins. But classpath plugins are mainly for
tests, and even then, referring to classpath plugins with their class is
a better idea. This change removes name() and description(), replacing
the name for classpath plugins with the full class name.
2016-06-15 17:12:22 -07:00
Tal Levy a26260fb72 new ScriptProcessor for Ingest (#18193)
add new ScriptProcessor for executing ES Scripts within pipelines
2016-06-15 14:57:18 -07:00
Martijn van Groningen f611f1c99e ingest: Move processors from core to ingest-common module.
Folded grok processor into ingest-common module.

The rest tests have been moved to ingest-common module as well, because these tests don't run in the rest-api-spec module but in the distribution:integ-test-zip module
and adding a test plugin there felt just wrong to me. I think this is ok. I left a tiny ingest rest test behind in that tests with an empty pipeline.

Removed messy tests, these tests were already covered in the rest tests

Added ingest test plugin in test infra so that each module testing integration with ingest doesn't need write its own plugin

Moved reindex ingest tests to qa module

Closes #18490
2016-06-07 17:32:52 +02:00