Add support version and version_type in ingest pipelines
Add support for setting document version and version type in set
processor of an ingest pipeline.
The current script service has a script compilation limit for a one
minute window. This is set to a small default value of 15. Instead of
increasing that default value, this commit introduces a new setting
that allows to configure a rate per time unit, so that the script service can deal with bursts better.
The new setting is named `script.max_compilations_rate`,
requires a nonnegative number and a positive time value.
The default is `75/5m`, which is equivalent to the existing 15 per minute.
This commit adds back "id" as the key within a script to specify a
stored script (which with file scripts now gone is no longer ambiguous).
It also adds "source" as a replacement for "code". This is in an attempt
to normalize how scripts are specified across both put stored scripts and script usages, including search template requests. This also deprecates the old inline/stored keys.
This is a simple refactoring to move the context definitions into the
type that they use. While we have multiple context names for the same
class at the moment, this will eventually become one ScriptContext per
instance type, so the pattern of a static member on the interface called
CONTEXT can be used. This commit also moves the consolidated list of
contexts provided by core ES into ScriptModule.
As we work towards contexts implying the return type of compilation, we
first need ScriptContext to not be an enum. This commit removes the
Standard enum and Plugin subclass of ScriptContext.
ScriptEngine implementations have an overridable method to indicate they
are safe to use as inline scripts. Since groovy was removed fro 6.0,
there are no longer any implementations which used the default false
value. Furthermore, the value was not actually read anywhere. This
commit removes the method. The ScriptEngineRegistry was also no longer
necessary as it only was used to build a map from language to engine.
This commit renames all rest test files to use the .yml extension
instead of .yaml. This way the extension used within all of
elasticsearch for yaml is consistent.
This commit renames ScriptEngineService to ScriptEngine. It is often
confusing because we have the ScriptService, and then
ScriptEngineService implementations, but the latter are not services as
we see in other places in elasticsearch.
File scripts have 2 related settings: the path of file scripts, and
whether they can be dynamically reloaded. This commit deprecates those
settings.
relates #21798
Previously, Mustache would call `toString` on the `_ingest.timestamp`
field and return a date format that did not match Elasticsearch's
defaults for date-mapping parsing. The new ZonedDateTime class in Java 8
happens to do format itself in the same way ES is expecting.
This commit adds support for a feature flag that enables the usage of this new date format
that has more native behavior.
Fixes#23168.
This new fix can be found in the form of a cluster setting called
`ingest.new_date_format`. By default, in 5.x, the existing behavior
will remain the same. One will set this property to `true` in order to
take advantage of this update for ingest-pipeline convenience.
This change simplifies how the rest test runner finds test files and
removes all leniency. Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.
closes#20240
Gradle's finalizedBy on tasks only ensures one task runs after another,
but not immediately after. This is problematic for our integration tests
since it allows multiple project's integ test clusters to be
simultaneously. While this has not been a problem thus far (gradle 2.13
happened to keep the finalizedBy tasks close enough that no clusters
were running in parallel), with gradle 3.3 the task graph generation has
changed, and numerous clusters may be running simultaneously, causing
memory pressure, and thus generally slower tests, or even failure if the
system has a limited amount of memory (eg in a vagrant host).
This commit reworks how integ tests are configured. It adds an
`integTestCluster` extension to gradle which is equivalent to the current
`integTest.cluster` and moves the rest test runner task to
`integTestRunner`. The `integTest` task is then just a dummy task,
which depends on the cluster runner task, as well as the cluster stop
task. This means running `integTest` in one project will both run the
rest tests, and shut down the cluster, before running `integTest` in
another project.
Currently, stored scripts use a namespace of (lang, id) to be put, get, deleted, and executed. This is not necessary since the lang is stored with the stored script. A user should only have to specify an id to use a stored script. This change makes that possible while keeping backwards compatibility with the previous namespace of (lang, id). Anywhere the previous namespace is used will log deprecation warnings.
The new behavior is the following:
When a user specifies a stored script, that script will be stored under both the new namespace and old namespace.
Take for example script 'A' with lang 'L0' and data 'D0'. If we add script 'A' to the empty set, the scripts map will be ["A" -- D0, "A#L0" -- D0]. If a script 'A' with lang 'L1' and data 'D1' is then added, the scripts map will be ["A" -- D1, "A#L1" -- D1, "A#L0" -- D0].
When a user deletes a stored script, that script will be deleted from both the new namespace (if it exists) and the old namespace.
Take for example a scripts map with {"A" -- D1, "A#L1" -- D1, "A#L0" -- D0}. If a script is removed specified by an id 'A' and lang null then the scripts map will be {"A#L0" -- D0}. To remove the final script, the deprecated namespace must be used, so an id 'A' and lang 'L0' would need to be specified.
When a user gets/executes a stored script, if the new namespace is used then the script will be retrieved/executed using only 'id', and if the old namespace is used then the script will be retrieved/executed using 'id' and 'lang'
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.
I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.
Inline scripts defined in Ingest Pipelines are now compiled at creation time to preemptively catch errors on initialization of the pipeline.
Fixes#21842.
For the record, I also had to remove the geo-hash cell and geo-distance range
queries to make the code compile. These queries already throw an exception in
all cases with 5.x indices, so that does not hurt any more.
I also had to rename all 2.x bwc indices from `index-${version}` to
`unsupported-${version}` to make `OldIndexBackwardCompatibilityIT`
happy.
At one point in the past when moving out the rest tests from core to
their own subproject, we had multiple test classes which evenly split up
the tests to run. However, we simplified this and went back to a single
test runner to have better reproduceability in tests. This change
removes the remnants of that multiplexing support.
This makes it obvious that these tests are for running the client yaml
suites. Now that there are other ways of running tests using the REST
client against a running cluster we can't go on calling the shared
client yaml tests "REST tests". They are rest tests, but they aren't
**the** rest tests.
This adds a header that looks like `Location: /test/test/1` to the
response for the index/create/update API. The requirement for the header
comes from https://www.w3.org/Protocols/rfc2616/rfc2616-sec10.htmlhttps://tools.ietf.org/html/rfc7231#section-7.1.2 claims that relative
URIs are OK. So we use an absolute path which should resolve to the
appropriate location.
Closes#19079
This makes large changes to our rest test infrastructure, allowing us
to write junit tests that test a running cluster via the rest client.
It does this by splitting ESRestTestCase into two classes:
* ESRestTestCase is the superclass of all tests that use the rest client
to interact with a running cluster.
* ESClientYamlSuiteTestCase is the superclass of all tests that use the
rest client to run the yaml tests. These tests are shared across all
official clients, thus the `ClientYamlSuite` part of the name.
* rethrow script compilation exceptions into ingest configuration exceptions
* update readProcessor to rethrow any exception as an ElasticsearchException
This change removes the multiple ways that plugins can be added to the
integ test cluster. It also removes the use of the default
configuration, and instead adds a zip configuration to all plugins. This
will enable using project substitutions with plugins, which must be done
with the default configuration.
Registering a script engine or native scripts still uses Guice today
and is much more complicated than needed. This change moves to a pull
based model where script plugins have to implement a dedicated interface
`ScriptPlugin` and defines simple getter returning instances rather than
classes.
The database files have been doubled in size compared to the previous files being used.
For this reason the database files are now gzip compressed, which required using
`GZIPInputStream` when loading database files.
Folded grok processor into ingest-common module.
The rest tests have been moved to ingest-common module as well, because these tests don't run in the rest-api-spec module but in the distribution:integ-test-zip module
and adding a test plugin there felt just wrong to me. I think this is ok. I left a tiny ingest rest test behind in that tests with an empty pipeline.
Removed messy tests, these tests were already covered in the rest tests
Added ingest test plugin in test infra so that each module testing integration with ingest doesn't need write its own plugin
Moved reindex ingest tests to qa module
Closes#18490
- now you can specify a list of grok patterns to match your field with
and the first one to successfully match wins.
- only non-null captures will be inserted into your matched document.
Fixes#17903.
This removes the ScriptMode class entirely, which was an enum with two
options (ON and OFF) which essentially boiled down to true and false.
Now the boolean values are used instead.
* `rename` processor, renamed `to` to `target_field`
* `date` processor, renamed `match_field` to `field` and renamed `match_formats` to `formats`
* `geoip` processor, renamed `source_field` to `field` and renamed `fields` to `properties`
* `attachment` processor, renamed `source_field` to `field` and renamed `fields` to `properties`
Closes#17835