When constructing an array list, if we know the size of the list in
advance (because we are adding objects to it derived from another list),
we should size the array list to the appropriate capacity in advance (to
avoid resizing allocations). This commit does this in various places.
Relates #24439
This commit renames ScriptEngineService to ScriptEngine. It is often
confusing because we have the ScriptService, and then
ScriptEngineService implementations, but the latter are not services as
we see in other places in elasticsearch.
This adds `-XX:-OmitStackTraceInFastThrow` to the JVM arguments
which *should* prevent the JVM from omitting stack traces on
common exception sites. Even though these sites are common, we'd
still like the exceptions to debug them.
This also adds the flag when running tests and adapts some tests
that had workarounds for the absense of the flag.
Closes#24376
* Fix wrong delegation to constructors when compiling lambdas with method references to ctors. Also remove the get$lambda factory.
* Cleanup code and remove unneeded transformations between binary and internal class names (uses ASM Type class instead)
* Cleanup Exception handling
* Simplification by moving the type adaption to the outside
* Remove STATIC access flag from our Lambda class (not required and also officially not allowed)
* Move the lambda counter to the classloader, so we have a per-script lambda ID
* Change Codesource of generated lambdas to be consistent
Replaces LambdaMetaFactory with LambdaBootstrap, a custom solution for lambdas in Painless using a design similar to LambdaMetaFactory, but allows for custom adaptation of types which recent changes to LambdaMetaFactory no longer allowed.
`script_stack` is super useful when debugging Painless scripts
because it skips all the "weird" stuff involved that obfuscates
where the actual error is. It skips Painless's internals and
call site bootstrapping.
It works fine, but it didn't have many tests. This converts a
test that we had for line numbers into a test for the
`script_stack`. The line numbers test was an indirect test
for `script_stack`.
This change simplifies how the rest test runner finds test files and
removes all leniency. Previously multiple prefixes and suffixes would
be tried, and tests could exist inside or outside of the classpath,
although outside of the classpath never quite worked. Now only classpath
tests are supported, and only one resource prefix is supported,
`/rest-api-spec/tests`.
closes#20240
We'd like to be able to support context-sensitive whitelists in
Painless but we can't now because the whitelist is a static thing.
This begins to de-static the whitelist, in particular removing
the static keyword from most of the methods on `Definition` and
plumbing the static instance into the appropriate spots as though
it weren't static. Once we de-static all the methods we should be
able to fairly simply build context-sensitive whitelists.
The only "fun" bit of this is that I added another layer in the
chain of methods that bootstraps `def` calls. Instead of running
`invokedynamic` directly on `DefBootstrap` we now `invokedynamic`
`$bootstrapDef` on the script itself loads the `Definition` that
the script was compiled against and then calls `DefBootstrap`.
I chose to put `Definition` into `Locals` so I didn't have to
change the signature of all the `analyze` methods. I could have
do it another way, but that seems ok for now.
The JVM caches `Integer` objects. This is known. A test in Painless
was relying on the JVM not caching the particular integer `1000`.
It turns out that when you provide `-XX:+AggressiveOpts` the JVM
*does* cache `1000`, causing the test to fail when that is
specified.
This replaces `1000` with a randomly selected integer that we test
to make sure *isn't* cached by the JVM. *Hopefully* this test is
good enough. It relies on the caching not changing in between when
we check that the value isn't cached and when we run the painless
code. The cache now is a simple array but there is nothing
preventing it from changing. If it does change in a way that thwarts
this test then the test fail fail again. At least when that happens
the next person can see the comment about how it is important
that the integer isn't cached and can follow that line of inquiry.
Closes#24041
This commit skips the two Painless tests
EqualsTests#testBranchEqualsDefAndPrimitive and
EqualsTests#testBranchNotEqualsDefAndPrimitive on Windows as the tests
are repeatedly failing there.
This commit renames the random ASCII helper methods in ESTestCase. This
is because this method ultimately uses the random ASCII methods from
randomized runner, but these methods actually only produce random
strings generated from [a-zA-Z].
Relates #23886
The current rest backcompat tests, which run against a mixed cluster of
5.x and 6.0 nodes, depend on snapshot builds of 5.x. However, this has
the potential for inconsistency that results in CI failures, and happens
quite often, whenever some backcompat logic is added to 5.x, but the bwc
test on master fails because the 5.x code has not yet been published as
a snapshot.
This change creates a git clone of the 5.x branch,
builds the zip distribution, and ties that into gradle substitutions for
the 5.x version.
Without this change, if write a script with multiple regexes
*sometimes* the lexer will decide to look at them like one
big regex and then some trailing garbage. Like this discuss post:
https://discuss.elastic.co/t/error-with-the-split-function-in-painless-script/79021
```
def val = /\\\\/.split(ctx._source.event_data.param17);
if (val[2] =~ /\\./) {
def val2 = /\\./.split(val[2]);
ctx._source['user_crash'] = val2[0]
} else {
ctx._source['user_crash'] = val[2]
}
```
The error message you get from the lexer is `lexer_no_viable_alt_exception`
right after the *second* regex.
With this change each regex is just a single regex like it ought to be.
As a bonus, while looking into this issue I found that the error
reporting for regexes wasn't very nice. If you specify an invalid
pattern then you get an error marker on the start of the pattern
with the JVM's regex error message which attempts to point you to the
location in the regex but is totally unreadable in the JSON response.
This change fixes the location to point to the appropriate spot
inside the pattern and removes the portion of the JVM's error message
that doesn't render well. It is no longer needed now that we point
users to the appropriate spot in the pattern.
This commit mutes a ton of Painless lambda tests on JDK 9. This commit
did not attempt to discover exactly which tests are failing, but instead
just blanket muted all tests in LambdaTests, FunctionRefTests, and
AugmentationTests.
Relates #23473
Throw error when skip or do sections are malformed, such as they don't start with the proper token (START_OBJECT). That signals bad indentation, which would be ignored otherwise. Thanks (or due to) our pull parsing code, we were still able to properly parse the sections, yet other runners weren't able to.
Closes#21980
* [TEST] fix indentation in matrix_stats yaml tests
* [TEST] fix indentation in painless yaml test
* [TEST] fix indentation in analysis yaml tests
* [TEST] fix indentation in generated docs yaml tests
* [TEST] fix indentation in multi_cluster_search yaml tests
Gradle's finalizedBy on tasks only ensures one task runs after another,
but not immediately after. This is problematic for our integration tests
since it allows multiple project's integ test clusters to be
simultaneously. While this has not been a problem thus far (gradle 2.13
happened to keep the finalizedBy tasks close enough that no clusters
were running in parallel), with gradle 3.3 the task graph generation has
changed, and numerous clusters may be running simultaneously, causing
memory pressure, and thus generally slower tests, or even failure if the
system has a limited amount of memory (eg in a vagrant host).
This commit reworks how integ tests are configured. It adds an
`integTestCluster` extension to gradle which is equivalent to the current
`integTest.cluster` and moves the rest test runner task to
`integTestRunner`. The `integTest` task is then just a dummy task,
which depends on the cluster runner task, as well as the cluster stop
task. This means running `integTest` in one project will both run the
rest tests, and shut down the cluster, before running `integTest` in
another project.
Fixes Painless to properly implement scripts that return primitives
and void. Adds some simple tests that we emit sane opcodes and some
other tests that we implement primitives as expected.
Mostly this is just a fix following up from #22983 but there is one
thing I did really worth talking about, I think. So, before this script
Painless scripts could only ever return Object and they did would always
return null for paths that didn't return any values. Now that they
can return primitives the question is "what should Painless return
from paths that don't return any values?" And I answered that with
"whatever the JLS default value is". So 0/0L/0f/0d/false.
Generalizes three previously hard coded things in painless into
generic concepts:
1. The "main method" is no longer hardcoded to:
```
public abstract Object execute(Map<String, Object> params,
Scorer scorer, LeafDocLookup doc, Object value);
```
Instead Painless's compiler takes an interface and implements it. It looks like:
```
public interface SomeScript {
// Argument names we expose to Painless scripts
String[] ARGUMENTS = new String[] {"a", "b"};
// Method implemented by Painless script. Must be named execute but can have any parameters or return any value.
Object execute(String a, int b);
// Is the "a" argument used by the script?
boolean uses$a();
}
SomeScript script = scriptEngine.compile(SomeScript.class, null, "the_script_here", emptyMap());
Object result = script.execute("a", 1);
```
`PainlessScriptEngine` now compiles all scripts to the new
`GenericElasticsearchScript` interface by default for compatibility
with the rest of Elasticsearch until it is able to use this new
ability.
2. `_score` and `ctx` are no longer hardcoded to be extracted from
`#score` and `params` respectively. Instead Painless's default
implementation of Elasticsearch scripts uses the `uses$_score` and
`uses$ctx` methods to determine if it is used and gives them
dummy values if they are not used.
3. Throwing the `ScriptException` is now handled by the Painless
script itself. That way Painless doesn't have to leak the metadata
that is required to build the fancy stack trace. And all painless scripts
get the fancy stack trace.
Today all search phases are inner classes of AbstractSearchAsyncAction or one of it's
subclasses. This makes unit testing of these classes practically impossible. This commit
Extracts `DfsQueryPhase` and `FetchSearchPhase` or of the code that composes the actual
query execution types and moves most of the fan-out and collect code into an `InitialSearchPhase`
class that can be used to build initial search phases (phases that retry on shards). This will
make modification to these classes simpler and allows to easily compose or add new search phases
down the road if additional roundtrips are required.
Painless can cast anything into the magic type `def` but it
really shouldn't try to cast **nothing** into `def`. That causes
the byte code generation library to freak out a little.
Closes#22908
This commit upgrades the checkstyle configuration from version 5.9 to
version 7.5, the latest version as of today. The main enhancement
obtained via this upgrade is better detection of redundant modifiers.
Relates #22960
We were incorrectly resolving qualified method references at run
time when invoked on `def`. This lead to errors like
`The struct with name [org] has not been defined.` when attempting
```
doc.date.dates.stream().map(
org.joda.time.ReadableDateTime::centuryOfEra
).collect(Collectors.toList())
```
Implemented by wrapping an array of reused `ModuleDateTime`s that
we grow when needed. The `ModuleDateTime`s are reused when we
move to the next document.
Also improves the error message returned when attempting to modify
the `ScriptdocValues`, removes a couple of allocations, and documents
that the date functions are available in Painless.
Relates to #22162
Currently, stored scripts use a namespace of (lang, id) to be put, get, deleted, and executed. This is not necessary since the lang is stored with the stored script. A user should only have to specify an id to use a stored script. This change makes that possible while keeping backwards compatibility with the previous namespace of (lang, id). Anywhere the previous namespace is used will log deprecation warnings.
The new behavior is the following:
When a user specifies a stored script, that script will be stored under both the new namespace and old namespace.
Take for example script 'A' with lang 'L0' and data 'D0'. If we add script 'A' to the empty set, the scripts map will be ["A" -- D0, "A#L0" -- D0]. If a script 'A' with lang 'L1' and data 'D1' is then added, the scripts map will be ["A" -- D1, "A#L1" -- D1, "A#L0" -- D0].
When a user deletes a stored script, that script will be deleted from both the new namespace (if it exists) and the old namespace.
Take for example a scripts map with {"A" -- D1, "A#L1" -- D1, "A#L0" -- D0}. If a script is removed specified by an id 'A' and lang null then the scripts map will be {"A#L0" -- D0}. To remove the final script, the deprecated namespace must be used, so an id 'A' and lang 'L0' would need to be specified.
When a user gets/executes a stored script, if the new namespace is used then the script will be retrieved/executed using only 'id', and if the old namespace is used then the script will be retrieved/executed using 'id' and 'lang'
Adds "Appending B. Painless API Reference", a reference of all classes
and methods available from Painless. Removes links to java packages
because they contain methods that we don't expose and don't contain
methods that we do expose (the ones in Augmentation). Instead this
generates a list of every class and every exposed method using the same
type information available to the
interpreter/compiler/whatever-we-call-it. From there you can jump to
the relevant docs.
Right now you build all the asciidoc files by running
```
gradle generatePainlessApi
```
These files are expected to be committed because we build the docs
without running `gradle`.
Also changes the output of `Debug.explain` so that it is easy to
search for the class in the generated reference documentation.
You can also run it in an IDE safely if you pass the path to the
directory in which to generate the docs as the first parameter. It'll
blow away the entire directory an recreate it from scratch so be careful.
And then you can build the docs by running something like:
```
../docs/build_docs.pl --out ../built_docs/ --doc docs/reference/index.asciidoc --open
```
That is, if you have checked out https://github.com/elastic/docs in
`../docs`. Wait a minute or two and your browser will pop open in with
all of Elasticsearch's reference documentation. If you go to
`http://localhost:8000/painless-api-reference.html` you can see this
list. Or you can get there by following the links to `Modules` and
`Scripting` and `Painless` and then clicking the link in the paragraphs
below titled `Appendix B. Painless API Reference`.
I like having these in asciidoc because we can deep link to them from the
rest of the guide with constructs like
`<<painless-api-reference-Object-hashCode-0>>` and
`<<painless-api-reference->>` and we get link checking. Then the only
brittle link maintenance bit is the link generation for javadoc. Which
sucks. But I think it is important that we link to the methods directly
so they are easy to find.
Relates to #22720
move "es." internal headers to separate metadata set in ElasticsearchException and stop returning them as response headers
Closes#17593
* [TEST] remove ESExceptionTests, move its methods to ElasticsearchExceptionTests or ExceptionSerializationTests
This commit adds a SpecialPermission constant and uses that constant
opposed to introducing new instances everywhere.
Additionally, this commit introduces a single static method to check that
the current code has permission. This avoids all the duplicated access
blocks that exist currently.
We don't want to expose `String#getBytes` which is required for
`Base64.getEncoder.encode` to work because we're worried about
character sets. This adds `encodeBase64` and `decodeBase64`
methods to `String` in Painless that are duals of one another
such that:
`someString == someString.encodeBase64().decodeBase64()`.
Both methods work with the UTF-8 encoding of the string.
Closes#22648
1. Escape sequences we're working. For example `\\` is now correctly
interpreted as `\` instead of `\\`. Same with `\'` being `'` and
`\"` being `"`.
2. `'` delimited strings weren't allowed to contain `"`s but it looked
like they were intended to support it. Now they do.
3. Improves the error message when the script contains an invalid
escape sequence inside a string to include a list of the valid
escape sequences.
Closes#22372
* Remove a checked exception, replacing it with `ParsingException`.
* Remove all Parser classes for the yaml sections, replacing them with static methods.
* Remove `ClientYamlTestFragmentParser`. Isn't used any more.
* Remove `ClientYamlTestSuiteParseContext`, replacing it with some static utility methods.
I did not rewrite the parsers using `ObjectParser` because I don't think it is worth it right now.
If a bug occurs in painless compilation (not from a user, but from the
painless infrastructure), a VerifyError may be thrown when compiling the
broken generated class. This commit wraps VerifyErrors in
ScriptException so that useful information is returned to the user,
which can be passed on to the ES team for analysis.