387 lines
14 KiB
Plaintext
387 lines
14 KiB
Plaintext
[[modules-scripting-painless]]
|
|
=== Painless Scripting Language
|
|
|
|
experimental[The Painless scripting language is new and is still marked as experimental. The syntax or API may be changed in the future in non-backwards compatible ways if required.]
|
|
|
|
_Painless_ is a simple, secure scripting language available in Elasticsearch
|
|
by default. It is designed specifically for use with Elasticsearch and can
|
|
safely be used with `inline` and `stored` scripting, which is enabled by
|
|
default.
|
|
|
|
The Painless syntax is similar to http://groovy-lang.org/index.html[Groovy].
|
|
|
|
You can use Painless anywhere a script can be used in Elasticsearch. It is the
|
|
default if you don't set the `lang` parameter but if you want to be explicit you
|
|
can set the `lang` parameter to `painless`.
|
|
|
|
[[painless-features]]
|
|
[float]
|
|
== Painless Features
|
|
|
|
* Fast performance: https://benchmarks.elastic.co/index.html#search_qps_scripts[several times faster] than the alternatives.
|
|
|
|
* Safety: Fine-grained whitelist with method call/field granularity. See
|
|
<<painless-api-reference>> for a complete list of available classes and methods.
|
|
|
|
* Optional typing: Variables and parameters can use explicit types or the dynamic `def` type.
|
|
|
|
* Syntax: Extends Java's syntax with a subset of Groovy for ease of use. See the <<modules-scripting-painless-syntax, Syntax Overview>>.
|
|
|
|
* Optimizations: Designed specifically for Elasticsearch scripting.
|
|
|
|
[[painless-examples]]
|
|
[float]
|
|
== Painless Examples
|
|
|
|
To illustrate how Painless works, let's load some hockey stats into an Elasticsearch index:
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
PUT hockey/player/_bulk?refresh
|
|
{"index":{"_id":1}}
|
|
{"first":"johnny","last":"gaudreau","goals":[9,27,1],"assists":[17,46,0],"gp":[26,82,1],"born":"1993/08/13"}
|
|
{"index":{"_id":2}}
|
|
{"first":"sean","last":"monohan","goals":[7,54,26],"assists":[11,26,13],"gp":[26,82,82],"born":"1994/10/12"}
|
|
{"index":{"_id":3}}
|
|
{"first":"jiri","last":"hudler","goals":[5,34,36],"assists":[11,62,42],"gp":[24,80,79],"born":"1984/01/04"}
|
|
{"index":{"_id":4}}
|
|
{"first":"micheal","last":"frolik","goals":[4,6,15],"assists":[8,23,15],"gp":[26,82,82],"born":"1988/02/17"}
|
|
{"index":{"_id":5}}
|
|
{"first":"sam","last":"bennett","goals":[5,0,0],"assists":[8,1,0],"gp":[26,1,0],"born":"1996/06/20"}
|
|
{"index":{"_id":6}}
|
|
{"first":"dennis","last":"wideman","goals":[0,26,15],"assists":[11,30,24],"gp":[26,81,82],"born":"1983/03/20"}
|
|
{"index":{"_id":7}}
|
|
{"first":"david","last":"jones","goals":[7,19,5],"assists":[3,17,4],"gp":[26,45,34],"born":"1984/08/10"}
|
|
{"index":{"_id":8}}
|
|
{"first":"tj","last":"brodie","goals":[2,14,7],"assists":[8,42,30],"gp":[26,82,82],"born":"1990/06/07"}
|
|
{"index":{"_id":39}}
|
|
{"first":"mark","last":"giordano","goals":[6,30,15],"assists":[3,30,24],"gp":[26,60,63],"born":"1983/10/03"}
|
|
{"index":{"_id":10}}
|
|
{"first":"mikael","last":"backlund","goals":[3,15,13],"assists":[6,24,18],"gp":[26,82,82],"born":"1989/03/17"}
|
|
{"index":{"_id":11}}
|
|
{"first":"joe","last":"colborne","goals":[3,18,13],"assists":[6,20,24],"gp":[26,67,82],"born":"1990/01/30"}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
// TESTSETUP
|
|
|
|
[float]
|
|
=== Accessing Doc Values from Painless
|
|
|
|
Document values can be accessed from a `Map` named `doc`.
|
|
|
|
For example, the following script calculates a player's total goals. This example uses a strongly typed `int` and a `for` loop.
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
GET hockey/_search
|
|
{
|
|
"query": {
|
|
"function_score": {
|
|
"script_score": {
|
|
"script": {
|
|
"lang": "painless",
|
|
"inline": "int total = 0; for (int i = 0; i < doc['goals'].length; ++i) { total += doc['goals'][i]; } return total;"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
Alternatively, you could do the same thing using a script field instead of a function score:
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
GET hockey/_search
|
|
{
|
|
"query": {
|
|
"match_all": {}
|
|
},
|
|
"script_fields": {
|
|
"total_goals": {
|
|
"script": {
|
|
"lang": "painless",
|
|
"inline": "int total = 0; for (int i = 0; i < doc['goals'].length; ++i) { total += doc['goals'][i]; } return total;"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
The following example uses a Painless script to sort the players by their combined first and last names. The names are accessed using
|
|
`doc['first'].value` and `doc['last'].value`.
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
GET hockey/_search
|
|
{
|
|
"query": {
|
|
"match_all": {}
|
|
},
|
|
"sort": {
|
|
"_script": {
|
|
"type": "string",
|
|
"order": "asc",
|
|
"script": {
|
|
"lang": "painless",
|
|
"inline": "doc['first.keyword'].value + ' ' + doc['last.keyword'].value"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
[float]
|
|
=== Updating Fields with Painless
|
|
|
|
You can also easily update fields. You access the original source for a field as `ctx._source.<field-name>`.
|
|
|
|
First, let's look at the source data for a player by submitting the following request:
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
GET hockey/_search
|
|
{
|
|
"stored_fields": [
|
|
"_id",
|
|
"_source"
|
|
],
|
|
"query": {
|
|
"term": {
|
|
"_id": 1
|
|
}
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
To change player 1's last name to `hockey`, simply set `ctx._source.last` to the new value:
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
POST hockey/player/1/_update
|
|
{
|
|
"script": {
|
|
"lang": "painless",
|
|
"inline": "ctx._source.last = params.last",
|
|
"params": {
|
|
"last": "hockey"
|
|
}
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
You can also add fields to a document. For example, this script adds a new field that contains
|
|
the player's nickname, _hockey_.
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
POST hockey/player/1/_update
|
|
{
|
|
"script": {
|
|
"lang": "painless",
|
|
"inline": "ctx._source.last = params.last; ctx._source.nick = params.nick",
|
|
"params": {
|
|
"last": "gaudreau",
|
|
"nick": "hockey"
|
|
}
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
[float]
|
|
[[modules-scripting-painless-dates]]
|
|
=== Dates
|
|
|
|
Date fields are exposed as
|
|
<<painless-api-reference-org-joda-time-ReadableDateTime, `ReadableDateTime`>>s
|
|
so they support methods like
|
|
<<painless-api-reference-org-joda-time-ReadableDateTime-getYear-0, `getYear`>>,
|
|
and
|
|
<<painless-api-reference-org-joda-time-ReadableDateTime-getDayOfWeek-0, `getDayOfWeek`>>.
|
|
To get milliseconds since epoch call
|
|
<<painless-api-reference-org-joda-time-ReadableInstant-getMillis-0, `getMillis`>>.
|
|
For example, the following returns every hockey player's birth year:
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
GET hockey/_search
|
|
{
|
|
"script_fields": {
|
|
"birth_year": {
|
|
"script": {
|
|
"inline": "doc.born.value.year"
|
|
}
|
|
}
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
[float]
|
|
[[modules-scripting-painless-regex]]
|
|
=== Regular expressions
|
|
|
|
NOTE: Regexes are disabled by default because they circumvent Painless's
|
|
protection against long running and memory hungry scripts. To make matters
|
|
worse even innocuous looking regexes can have staggering performance and stack
|
|
depth behavior. They remain an amazing powerful tool but are too scary to enable
|
|
by default. To enable them yourself set `script.painless.regex.enabled: true` in
|
|
`elasticsearch.yml`. We'd like very much to have a safe alternative
|
|
implementation that can be enabled by default so check this space for later
|
|
developments!
|
|
|
|
Painless's native support for regular expressions has syntax constructs:
|
|
|
|
* `/pattern/`: Pattern literals create patterns. This is the only way to create
|
|
a pattern in painless. The pattern inside the ++/++'s are just
|
|
http://docs.oracle.com/javase/8/docs/api/java/util/regex/Pattern.html[Java regular expressions].
|
|
See <<modules-scripting-painless-regex-flags>> for more.
|
|
* `=~`: The find operator return a `boolean`, `true` if a subsequence of the
|
|
text matches, `false` otherwise.
|
|
* `==~`: The match operator returns a `boolean`, `true` if the text matches,
|
|
`false` if it doesn't.
|
|
|
|
Using the find operator (`=~`) you can update all hockey players with "b" in
|
|
their last name:
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
POST hockey/player/_update_by_query
|
|
{
|
|
"script": {
|
|
"lang": "painless",
|
|
"inline": "if (ctx._source.last =~ /b/) {ctx._source.last += \"matched\"} else {ctx.op = 'noop'}"
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
Using the match operator (`==~`) you can update all the hockey players who's
|
|
names start with a consonant and end with a vowel:
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
POST hockey/player/_update_by_query
|
|
{
|
|
"script": {
|
|
"lang": "painless",
|
|
"inline": "if (ctx._source.last ==~ /[^aeiou].*[aeiou]/) {ctx._source.last += \"matched\"} else {ctx.op = 'noop'}"
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
You can use the `Pattern.matcher` directly to get a `Matcher` instance and
|
|
remove all of the vowels in all of their last names:
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
POST hockey/player/_update_by_query
|
|
{
|
|
"script": {
|
|
"lang": "painless",
|
|
"inline": "ctx._source.last = /[aeiou]/.matcher(ctx._source.last).replaceAll('')"
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
`Matcher.replaceAll` is just a call to Java's `Matcher`'s
|
|
http://docs.oracle.com/javase/8/docs/api/java/util/regex/Matcher.html#replaceAll-java.lang.String-[replaceAll]
|
|
method so it supports `$1` and `\1` for replacements:
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
POST hockey/player/_update_by_query
|
|
{
|
|
"script": {
|
|
"lang": "painless",
|
|
"inline": "ctx._source.last = /n([aeiou])/.matcher(ctx._source.last).replaceAll('$1')"
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
If you need more control over replacements you can call `replaceAll` on a
|
|
`CharSequence` with a `Function<Matcher, String>` that builds the replacement.
|
|
This does not support `$1` or `\1` to access replacements because you already
|
|
have a reference to the matcher and can get them with `m.group(1)`.
|
|
|
|
IMPORTANT: Calling `Matcher.find` inside of the function that builds the
|
|
replacement is rude and will likely break the replacement process.
|
|
|
|
This will make all of the vowels in the hockey player's last names upper case:
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
POST hockey/player/_update_by_query
|
|
{
|
|
"script": {
|
|
"lang": "painless",
|
|
"inline": "ctx._source.last = ctx._source.last.replaceAll(/[aeiou]/, m -> m.group().toUpperCase(Locale.ROOT))"
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
Or you can use the `CharSequence.replaceFirst` to make the first vowel in their
|
|
last names upper case:
|
|
|
|
[source,js]
|
|
----------------------------------------------------------------
|
|
POST hockey/player/_update_by_query
|
|
{
|
|
"script": {
|
|
"lang": "painless",
|
|
"inline": "ctx._source.last = ctx._source.last.replaceFirst(/[aeiou]/, m -> m.group().toUpperCase(Locale.ROOT))"
|
|
}
|
|
}
|
|
----------------------------------------------------------------
|
|
// CONSOLE
|
|
|
|
|
|
Note: all of the `_update_by_query` examples above could really do with a
|
|
`query` to limit the data that they pull back. While you *could* use a
|
|
<<query-dsl-script-query>> it wouldn't be as efficient as using any other query
|
|
because script queries aren't able to use the inverted index to limit the
|
|
documents that they have to check.
|
|
|
|
[float]
|
|
[[modules-scripting-painless-dispatch]]
|
|
=== How painless dispatches functions
|
|
|
|
Painless uses receiver, name, and https://en.wikipedia.org/wiki/Arity[arity] to
|
|
for method dispatch. For example, `s.foo(a, b)` is resolved by first getting
|
|
the class of `s` and then looking up the method `foo` with two parameters. This
|
|
is different from Groovy which uses the
|
|
https://en.wikipedia.org/wiki/Multiple_dispatch[runtime types] of the
|
|
parameters and Java which uses the compile time types of the parameters.
|
|
|
|
The consequence of this that Painless doesn't support overloaded methods like
|
|
Java, leading to some trouble when it whitelists classes from the Java
|
|
standard library. For example, in Java and Groovy, `Matcher` has two methods:
|
|
`group(int)` and `group(String)`. Painless can't whitelist both of them methods
|
|
because they have the same name and the same number of parameters. So instead it
|
|
has <<painless-api-reference-Matcher-group-1, `group(int)`>> and
|
|
<<painless-api-reference-Matcher-namedGroup-1, `namedGroup(String)`>>.
|
|
|
|
We have a few justifications for this different way of dispatching methods:
|
|
|
|
1. It makes operating on `def` types simpler and, presumably, faster. Using
|
|
receiver, name, and arity means when Painless sees a call on a `def` objects it
|
|
can dispatch the appropriate method without having to do expensive comparisons
|
|
of the types of the parameters. The same is true for invocations with `def`
|
|
typed parameters.
|
|
2. It keeps things consistent. It would be genuinely weird for Painless to
|
|
behave like Groovy if any `def` typed parameters were involved and Java
|
|
otherwise. It'd be slow for it to behave like Groovy all the time.
|
|
3. It keeps Painless maintainable. Adding the Java or Groovy like method
|
|
dispatch *feels* like it'd add a ton of complexity which'd make maintenance and
|
|
other improvements much more difficult.
|