Docs: Java Security Manager and scripting
Added docs explaining the impact of the Java Security Manager on scripting languages, how to disable the JSM, and how to customise the classloader whitelist.
Closes https://github.com/elastic/elasticsearch/issues/16094
Closes https://github.com/elastic/elasticsearch/issues/14290
parent 001e1b4714
commit 676078c53d
|
@ -3,7 +3,7 @@
|
|||
|
||||
:version: 3.0.0-beta1
|
||||
:major-version: 3.x
|
||||
:branch: 3.0
|
||||
:branch: master
|
||||
:jdk: 1.8.0_25
|
||||
:defguide: https://www.elastic.co/guide/en/elasticsearch/guide/current
|
||||
:plugins: https://www.elastic.co/guide/en/elasticsearch/plugins/master
|
||||
|
|
|
@ -4,13 +4,20 @@
|
|||
This section discusses the changes that you need to be aware of when migrating
|
||||
your application to Elasticsearch 2.2.
|
||||
|
||||
* <<breaking_22_index_apis>>
|
||||
[float]
|
||||
=== Scripting and security
|
||||
|
||||
[[breaking_22_index_apis]]
|
||||
=== Index APIs
|
||||
The Java Security Manager is being used to lock down the privileges available
|
||||
to the scripting languages and to restrict the classes they are allowed to
|
||||
load to a predefined whitelist. These changes may cause scripts which worked
|
||||
in earlier versions to fail. See <<modules-scripting-security>> for more
|
||||
details.
|
||||
|
||||
==== Field stats API
|
||||
[float]
|
||||
=== Field stats API
|
||||
|
||||
The field stats' response format has been changed for number based and date
|
||||
fields. The `min_value` and `max_value` elements now return values as numbers
|
||||
and the new `min_value_as_string` and `max_value_as_string` return the values
|
||||
as strings.
|
||||
|
||||
The field stats' response format has been changed for number based and date fields. The `min_value` and
|
||||
`max_value` elements now return values as number and the new `min_value_as_string` and `max_value_as_string`
|
||||
return the values as string.
|
||||
|
|
|
@ -67,10 +67,10 @@ The modules in this section are:
|
|||
|
||||
Configure the transport networking layer, used internally by Elasticsearch
|
||||
to communicate between nodes.
|
||||
|
||||
|
||||
<<modules-tribe,Tribe nodes>>::
|
||||
|
||||
A tribe node joins one or more clusters and acts as a federated
|
||||
A tribe node joins one or more clusters and acts as a federated
|
||||
client across them.
|
||||
--
|
||||
|
||||
|
@ -93,8 +93,6 @@ include::modules/plugins.asciidoc[]
|
|||
|
||||
include::modules/scripting.asciidoc[]
|
||||
|
||||
include::modules/advanced-scripting.asciidoc[]
|
||||
|
||||
include::modules/snapshots.asciidoc[]
|
||||
|
||||
include::modules/threadpool.asciidoc[]
|
||||
|
|
|
@ -1,691 +1,6 @@
|
|||
[[modules-scripting]]
|
||||
== Scripting
|
||||
include::scripting/scripting.asciidoc[]
|
||||
|
||||
The scripting module allows you to use scripts to evaluate custom
|
||||
expressions. For example, scripts can be used to return "script fields"
|
||||
as part of a search request, or can be used to evaluate a custom score
|
||||
for a query and so on.
|
||||
include::scripting/advanced-scripting.asciidoc[]
|
||||
|
||||
By default, the scripting module uses http://groovy-lang.org/[groovy]
|
||||
(previously http://mvel.codehaus.org/[mvel] in 1.3.x and earlier) as the
|
||||
scripting language with some extensions. Groovy is used since it is extremely
|
||||
fast and very simple to use.
|
||||
include::scripting/security.asciidoc[]
|
||||
|
||||
.Groovy dynamic scripting off by default from v1.4.3
|
||||
[IMPORTANT]
|
||||
===================================================
|
||||
|
||||
Groovy dynamic scripting is off by default, preventing dynamic Groovy scripts
|
||||
from being accepted as part of a request or retrieved from the special
|
||||
`.scripts` index. You will still be able to use Groovy scripts stored in files
|
||||
in the `config/scripts/` directory on every node.
|
||||
|
||||
To convert an inline script to a file, take this simple script
|
||||
as an example:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
GET /_search
|
||||
{
|
||||
"script_fields": {
|
||||
"my_field": {
|
||||
"inline": "1 + my_var",
|
||||
"params": {
|
||||
"my_var": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
-----------------------------------
|
||||
|
||||
Save the contents of the `inline` field as a file called `config/scripts/my_script.groovy`
|
||||
on every data node in the cluster:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
1 + my_var
|
||||
-----------------------------------
|
||||
|
||||
Now you can access the script by file name (without the extension):
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
GET /_search
|
||||
{
|
||||
"script_fields": {
|
||||
"my_field": {
|
||||
"script": {
|
||||
"file": "my_script",
|
||||
"params": {
|
||||
"my_var": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
-----------------------------------
|
||||
|
||||
===================================================
|
||||
|
||||
|
||||
Additional `lang` plugins are provided to allow you to execute scripts in
|
||||
different languages. Wherever a script can be used, a `lang` parameter
|
||||
can be provided to define the language of the script. The following are the
|
||||
supported scripting languages:
|
||||
|
||||
[cols="<,<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Language |Sandboxed |Required plugin
|
||||
|groovy |no |built-in
|
||||
|expression |yes |built-in
|
||||
|mustache |yes |built-in
|
||||
|javascript |no |{plugins}/lang-javascript.html[elasticsearch-lang-javascript]
|
||||
|python |no |{plugins}/lang-python.html[elasticsearch-lang-python]
|
||||
|=======================================================================
|
||||
|
||||
To increase security, Elasticsearch does not allow you to specify scripts for
|
||||
non-sandboxed languages with a request. Instead, scripts must be placed in the
|
||||
`scripts` directory inside the configuration directory (the directory where
|
||||
elasticsearch.yml is). The default location of this `scripts` directory can be
|
||||
changed by setting `path.scripts` in elasticsearch.yml. Scripts placed into
|
||||
this directory will automatically be picked up and be available to be used.
|
||||
Once a script has been placed in this directory, it can be referenced by name.
|
||||
For example, a script called `calculate-score.groovy` can be referenced in a
|
||||
request like this:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
$ tree config
|
||||
config
|
||||
├── elasticsearch.yml
|
||||
├── logging.yml
|
||||
└── scripts
|
||||
└── calculate-score.groovy
|
||||
--------------------------------------------------
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
$ cat config/scripts/calculate-score.groovy
|
||||
log(_score * 2) + my_modifier
|
||||
--------------------------------------------------
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"lang": "groovy",
|
||||
"file": "calculate-score",
|
||||
"params": {
|
||||
"my_modifier": 8
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
The name of the script is derived from the hierarchy of directories it
|
||||
exists under, and the file name without the lang extension. For example,
|
||||
a script placed under `config/scripts/group1/group2/test.py` will be
|
||||
named `group1_group2_test`.
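
The naming rule above can be sketched in Java. This is only an illustration of the rule as described, not Elasticsearch's own code; the `ScriptNames` class and its `fromPath` method are hypothetical names introduced here.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an Elasticsearch class: derives a script name
// from its location under the scripts directory.
final class ScriptNames {

    static String fromPath(Path scriptsDir, Path scriptFile) {
        // Path of the script relative to the scripts directory,
        // e.g. group1/group2/test.py
        Path relative = scriptsDir.relativize(scriptFile);

        List<String> parts = new ArrayList<>();
        for (Path part : relative) {
            parts.add(part.toString());
        }

        // Drop the lang extension from the file name (the last element).
        int last = parts.size() - 1;
        String fileName = parts.get(last);
        int dot = fileName.lastIndexOf('.');
        if (dot > 0) {
            parts.set(last, fileName.substring(0, dot));
        }

        // Join directory names and the bare file name with underscores.
        return String.join("_", parts);
    }

    public static void main(String[] args) {
        Path scriptsDir = Paths.get("config", "scripts");
        Path script = Paths.get("config", "scripts", "group1", "group2", "test.py");
        System.out.println(fromPath(scriptsDir, script)); // group1_group2_test
    }
}
```

A script placed directly in the scripts directory, such as `calculate-score.groovy`, would simply be named `calculate-score` under this rule.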
|
||||
|
||||
[float]
|
||||
=== Indexed Scripts
|
||||
Elasticsearch allows you to store scripts in an internal index known as
|
||||
`.scripts` and reference them by id. There are REST endpoints to manage
|
||||
indexed scripts as follows:
|
||||
|
||||
Requests to the scripts endpoint look like this:
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
/_scripts/{lang}/{id}
|
||||
-----------------------------------
|
||||
Where the `lang` part is the language the script is in and the `id` part is the id
|
||||
of the script. In the `.scripts` index the type of the document will be set to the `lang`.
|
||||
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XPOST localhost:9200/_scripts/groovy/indexedCalculateScore -d '{
|
||||
"script": "log(_score * 2) + my_modifier"
|
||||
}'
|
||||
-----------------------------------
|
||||
|
||||
This will create a document with id: `indexedCalculateScore` and type: `groovy` in the
|
||||
`.scripts` index. The type of the document is the language used by the script.
|
||||
|
||||
This script can be accessed at query time by using the `id` script parameter and passing
|
||||
the script id:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"id": "indexedCalculateScore",
|
||||
"lang" : "groovy",
|
||||
"params": {
|
||||
"my_modifier": 8
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
The script can be viewed by:
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XGET localhost:9200/_scripts/groovy/indexedCalculateScore
|
||||
-----------------------------------
|
||||
|
||||
This is rendered as:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
'{
|
||||
"script": "log(_score * 2) + my_modifier"
|
||||
}'
|
||||
-----------------------------------
|
||||
|
||||
Indexed scripts can be deleted by:
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XDELETE localhost:9200/_scripts/groovy/indexedCalculateScore
|
||||
-----------------------------------
|
||||
|
||||
|
||||
|
||||
[float]
|
||||
[[enable-dynamic-scripting]]
|
||||
=== Enabling dynamic scripting
|
||||
|
||||
We recommend running Elasticsearch behind an application or proxy, which
|
||||
protects Elasticsearch from the outside world. If users are allowed to run
|
||||
inline scripts (even in a search request) or indexed scripts, then they have
|
||||
the same access to your box as the user that Elasticsearch is running as. For
|
||||
this reason dynamic scripting is allowed only for sandboxed languages by default.
|
||||
|
||||
First, you should not run Elasticsearch as the `root` user, as this would allow
|
||||
a script to access or do *anything* on your server, without limitations. Second,
|
||||
you should not expose Elasticsearch directly to users, but instead have a proxy
|
||||
application in between. If you *do* intend to expose Elasticsearch directly to
|
||||
your users, then you have to decide whether you trust them enough to run scripts
|
||||
on your box or not.
|
||||
|
||||
It is possible to enable scripts based on their source, for
|
||||
every script engine, through the following settings that need to be added to the
|
||||
`config/elasticsearch.yml` file on every node.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.inline: true
|
||||
script.indexed: true
|
||||
|
||||
-----------------------------------
|
||||
|
||||
While this still allows execution of named scripts provided in the config, or
|
||||
_native_ Java scripts registered through plugins, it also allows users to run
|
||||
arbitrary scripts via the API. Instead of sending the name of the file as the
|
||||
script, the body of the script can be sent instead or retrieved from the
|
||||
`.scripts` index if previously stored.
|
||||
|
||||
There are three possible configuration values for any of the fine-grained
|
||||
script settings:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Value |Description
|
||||
| `false` |scripting is turned off completely, in the context of the setting being set.
|
||||
| `true` |scripting is turned on, in the context of the setting being set.
|
||||
| `sandbox` |scripts may be executed only for languages that are sandboxed
|
||||
|=======================================================================
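
As a rough sketch of how the three values in the table combine with a language's sandbox status, consider the following Java snippet. `ScriptModes`, `Mode` and `canExecute` are hypothetical names used for illustration only; they are not part of the Elasticsearch API.

```java
// Hypothetical illustration of the three setting values; not Elasticsearch code.
final class ScriptModes {

    // Mirrors the values `false`, `true` and `sandbox` from the table.
    enum Mode { FALSE, TRUE, SANDBOX }

    // Decides whether a script may run, given the configured mode and
    // whether the script's language is sandboxed.
    static boolean canExecute(Mode mode, boolean languageIsSandboxed) {
        switch (mode) {
            case TRUE:
                return true;                 // scripting is on for every language
            case SANDBOX:
                return languageIsSandboxed;  // only sandboxed languages may run
            default:
                return false;                // scripting is off
        }
    }

    public static void main(String[] args) {
        // `expression` is listed as sandboxed, `groovy` is not.
        System.out.println(canExecute(Mode.SANDBOX, true));  // true
        System.out.println(canExecute(Mode.SANDBOX, false)); // false
    }
}
```

With a value of `sandbox`, for example, an inline `expression` script would be accepted while an inline `groovy` script would be rejected.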
|
||||
|
||||
The default values are the following:
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.inline: sandbox
|
||||
script.indexed: sandbox
|
||||
script.file: true
|
||||
|
||||
-----------------------------------
|
||||
|
||||
NOTE: Global scripting settings affect the `mustache` scripting language.
|
||||
<<search-template,Search templates>> internally use the `mustache` language,
|
||||
and will still be enabled by default as the `mustache` engine is sandboxed,
|
||||
but they will be enabled/disabled according to fine-grained settings
|
||||
specified in `elasticsearch.yml`.
|
||||
|
||||
It is also possible to control which operations can execute scripts. The
|
||||
supported operations are:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Value |Description
|
||||
| `aggs` |Aggregations (wherever they may be used)
|
||||
| `search` |Search api, Percolator api and Suggester api (e.g. filters, script_fields)
|
||||
| `update` |Update api
|
||||
| `plugin` |Any plugin that makes use of scripts under the generic `plugin` category
|
||||
|=======================================================================
|
||||
|
||||
Plugins can also define custom operations that they use scripts for instead
|
||||
of using the generic `plugin` category. Those operations can be referred to
|
||||
in the following form: `${pluginName}_${operation}`.
|
||||
|
||||
The following example disables scripting for `update` and `mapping` operations,
|
||||
regardless of the script source, for any engine. Scripts can still be
|
||||
executed from sandboxed languages as part of `aggregations`, `search`
|
||||
and plugins execution though, as the above defaults still get applied.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.update: false
|
||||
script.mapping: false
|
||||
|
||||
-----------------------------------
|
||||
|
||||
Generic settings get applied in order; operation based ones have precedence
|
||||
over source based ones. Language specific settings are supported too. They
|
||||
need to be prefixed with the `script.engine.<engine>` prefix and have
|
||||
precedence over any other generic settings.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.engine.groovy.file.aggs: true
|
||||
script.engine.groovy.file.mapping: true
|
||||
script.engine.groovy.file.search: true
|
||||
script.engine.groovy.file.update: true
|
||||
script.engine.groovy.file.plugin: true
|
||||
script.engine.groovy.indexed.aggs: true
|
||||
script.engine.groovy.indexed.mapping: false
|
||||
script.engine.groovy.indexed.search: true
|
||||
script.engine.groovy.indexed.update: false
|
||||
script.engine.groovy.indexed.plugin: false
|
||||
script.engine.groovy.inline.aggs: true
|
||||
script.engine.groovy.inline.mapping: false
|
||||
script.engine.groovy.inline.search: false
|
||||
script.engine.groovy.inline.update: false
|
||||
script.engine.groovy.inline.plugin: false
|
||||
|
||||
-----------------------------------
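
The precedence rules described above can be sketched as a simple lookup. This is an assumption-laden illustration, not the actual resolution logic: `ScriptSettingResolver` and `resolve` are hypothetical names, settings are modelled as a flat map of strings, and the fallback values are the defaults listed earlier.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the documented precedence; not Elasticsearch code.
final class ScriptSettingResolver {

    static String resolve(Map<String, String> settings,
                          String lang, String source, String operation) {
        // 1. Language specific settings take precedence over everything else.
        String engineKey = "script.engine." + lang + "." + source + "." + operation;
        if (settings.containsKey(engineKey)) {
            return settings.get(engineKey);
        }
        // 2. Operation based settings come before source based ones.
        String operationKey = "script." + operation;
        if (settings.containsKey(operationKey)) {
            return settings.get(operationKey);
        }
        // 3. Source based settings (inline, indexed, file).
        String sourceKey = "script." + source;
        if (settings.containsKey(sourceKey)) {
            return settings.get(sourceKey);
        }
        // 4. Documented defaults: file scripts are on, the rest are sandbox only.
        return "file".equals(source) ? "true" : "sandbox";
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("script.inline", "true");
        settings.put("script.update", "false");
        settings.put("script.engine.groovy.inline.aggs", "true");

        System.out.println(resolve(settings, "groovy", "inline", "update")); // false
        System.out.println(resolve(settings, "groovy", "inline", "search")); // true
        System.out.println(resolve(settings, "groovy", "indexed", "search")); // sandbox
    }
}
```

The order of the three lookups is the whole point of the sketch: swapping steps 2 and 3 would let `script.inline: true` override `script.update: false`, which contradicts the stated precedence.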
|
||||
|
||||
[float]
|
||||
=== Default Scripting Language
|
||||
|
||||
The default scripting language (assuming no `lang` parameter is provided) is
|
||||
`groovy`. In order to change it, set the `script.default_lang` to the
|
||||
appropriate language.
|
||||
|
||||
[float]
|
||||
=== Automatic Script Reloading
|
||||
|
||||
The `config/scripts` directory is scanned periodically for changes.
|
||||
New and changed scripts are reloaded and deleted scripts are removed
|
||||
from the preloaded scripts cache. The reload frequency can be specified
|
||||
using the `resource.reload.interval` setting, which defaults to `60s`.
|
||||
To disable script reloading completely, set `script.auto_reload_enabled`
|
||||
to `false`.
|
||||
|
||||
[[native-java-scripts]]
|
||||
[float]
|
||||
=== Native (Java) Scripts
|
||||
|
||||
Sometimes `groovy` and `expression` aren't enough. For those times you can
|
||||
implement a native script.
|
||||
|
||||
The best way to implement a native script is to write a plugin and install it.
|
||||
The plugin {plugins}/plugin-authors.html[documentation] has more information on
|
||||
how to write a plugin so that Elasticsearch will properly load it.
|
||||
|
||||
To register the actual script you'll need to implement `NativeScriptFactory`
|
||||
to construct the script. The actual script will extend either
|
||||
`AbstractExecutableScript` or `AbstractSearchScript`. The second one is likely
|
||||
the most useful and has several helpful subclasses you can extend like
|
||||
`AbstractLongSearchScript`, `AbstractDoubleSearchScript`, and
|
||||
`AbstractFloatSearchScript`. Finally, your plugin should register the native
|
||||
script by declaring the `onModule(ScriptModule)` method.
|
||||
|
||||
If you squashed the whole thing into one class it'd look like:
|
||||
|
||||
[source,java]
|
||||
--------------------------------------------------
|
||||
public class MyNativeScriptPlugin extends Plugin {
|
||||
@Override
|
||||
public String name() {
|
||||
return "my-native-script";
|
||||
}
|
||||
@Override
|
||||
public String description() {
|
||||
return "my native script that does something great";
|
||||
}
|
||||
public void onModule(ScriptModule scriptModule) {
|
||||
scriptModule.registerScript("my_script", MyNativeScriptFactory.class);
|
||||
}
|
||||
|
||||
public static class MyNativeScriptFactory implements NativeScriptFactory {
|
||||
@Override
|
||||
public ExecutableScript newScript(@Nullable Map<String, Object> params) {
|
||||
return new MyNativeScript();
|
||||
}
|
||||
@Override
|
||||
public boolean needsScores() {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
public static class MyNativeScript extends AbstractFloatSearchScript {
|
||||
@Override
|
||||
public float runAsFloat() {
|
||||
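// Source values may be any Number subtype; a direct (float) cast only works for Float
float a = ((Number) source().get("a")).floatValue();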
|
||||
float b = ((Number) source().get("b")).floatValue();
|
||||
return a * b;
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
||||
You can execute the script by specifying its `lang` as `native`, and the name
|
||||
of the script as the `id`:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"id": "my_script",
|
||||
"lang" : "native"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
|
||||
[float]
|
||||
=== Lucene Expressions Scripts
|
||||
|
||||
experimental[The Lucene expressions module is undergoing significant development and the exposed functionality is likely to change in the future]
|
||||
|
||||
Lucene's expressions module provides a mechanism to compile a
|
||||
`javascript` expression to bytecode. This allows very fast execution,
|
||||
as if you had written a `native` script. Expression scripts can be
|
||||
used in `script_score`, `script_fields`, sort scripts and numeric aggregation scripts.
|
||||
|
||||
See the link:http://lucene.apache.org/core/4_9_0/expressions/index.html?org/apache/lucene/expressions/js/package-summary.html[expressions module documentation]
|
||||
for details on what operators and functions are available.
|
||||
|
||||
The following variables are available in `expression` scripts:
|
||||
|
||||
* Single valued document fields, e.g. `doc['myfield'].value`
|
||||
* Single valued document fields can also be accessed without `.value` e.g. `doc['myfield']`
|
||||
* Parameters passed into the script, e.g. `mymodifier`
|
||||
* The current document's score, `_score` (only available when used in a `script_score`)
|
||||
|
||||
Variables in `expression` scripts that are of type `date` may use the following member methods:
|
||||
|
||||
* getYear()
|
||||
* getMonth()
|
||||
* getDayOfMonth()
|
||||
* getHourOfDay()
|
||||
* getMinutes()
|
||||
* getSeconds()
|
||||
|
||||
The following example shows the difference in years between the `date` fields date0 and date1:
|
||||
|
||||
`doc['date1'].getYear() - doc['date0'].getYear()`
|
||||
|
||||
There are a few limitations relative to other script languages:
|
||||
|
||||
* Only numeric fields may be accessed
|
||||
* Stored fields are not available
|
||||
* If a field is sparse (only some documents contain a value), documents missing the field will have a value of `0`
|
||||
|
||||
[float]
|
||||
=== Score
|
||||
|
||||
In all scripts that can be used in aggregations, the current
|
||||
document's score is accessible in `_score`.
|
||||
|
||||
[float]
|
||||
=== Computing scores based on terms in scripts
|
||||
|
||||
See <<modules-advanced-scripting, advanced scripting documentation>>
|
||||
|
||||
[float]
|
||||
=== Document Fields
|
||||
|
||||
Most scripting revolves around the use of specific document field data.
|
||||
The `doc['field_name']` can be used to access specific field data within
|
||||
a document (the document in question is usually derived from the context
|
||||
in which the script is used). Document fields are very fast to access since they
|
||||
end up being loaded into memory (all the relevant field values/tokens
|
||||
are loaded to memory). Note, however, that the `doc[...]` notation only
|
||||
allows for simple valued fields (it can't return a JSON object)
|
||||
and makes sense only on non-analyzed or single term based fields.
|
||||
|
||||
The following data can be extracted from a field:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Expression |Description
|
||||
|`doc['field_name'].value` |The native value of the field. For example,
|
||||
if it's a `short` type, it will be `short`.
|
||||
|
||||
|`doc['field_name'].values` |The native array values of the field. For
|
||||
example, if it's a `short` type, it will be `short[]`. Remember, a field can
|
||||
have several values within a single doc. Returns an empty array if the
|
||||
field has no values.
|
||||
|
||||
|`doc['field_name'].empty` |A boolean indicating if the field has no
|
||||
values within the doc.
|
||||
|
||||
|`doc['field_name'].multiValued` |A boolean indicating that the field
|
||||
has several values within the corpus.
|
||||
|
||||
|`doc['field_name'].lat` |The latitude of a geo point type.
|
||||
|
||||
|`doc['field_name'].lon` |The longitude of a geo point type.
|
||||
|
||||
|`doc['field_name'].lats` |The latitudes of a geo point type.
|
||||
|
||||
|`doc['field_name'].lons` |The longitudes of a geo point type.
|
||||
|
||||
|`doc['field_name'].distance(lat, lon)` |The `plane` distance (in meters)
|
||||
of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceWithDefault(lat, lon, default)` |The `plane` distance (in meters)
|
||||
of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].distanceInMiles(lat, lon)` |The `plane` distance (in
|
||||
miles) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceInMilesWithDefault(lat, lon, default)` |The `plane` distance (in
|
||||
miles) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].distanceInKm(lat, lon)` |The `plane` distance (in
|
||||
km) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceInKmWithDefault(lat, lon, default)` |The `plane` distance (in
|
||||
km) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistance(lat, lon)` |The `arc` distance (in
|
||||
meters) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
meters) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistanceInMiles(lat, lon)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceInMilesWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistanceInKm(lat, lon)` |The `arc` distance (in
|
||||
km) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceInKmWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
km) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].factorDistance(lat, lon)` |The distance factor of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].factorDistance(lat, lon, default)` |The distance factor of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].geohashDistance(geohash)` |The `arc` distance (in meters)
|
||||
of this geo point field from the provided geohash.
|
||||
|
||||
|`doc['field_name'].geohashDistanceInKm(geohash)` |The `arc` distance (in km)
|
||||
of this geo point field from the provided geohash.
|
||||
|
||||
|`doc['field_name'].geohashDistanceInMiles(geohash)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided geohash.
|
||||
|=======================================================================
|
||||
|
||||
[float]
|
||||
=== Stored Fields
|
||||
|
||||
Stored fields can also be accessed when executing a script. Note, they
|
||||
are much slower to access compared with document fields, as they are not
|
||||
loaded into memory. They can be simply accessed using
|
||||
`_fields['my_field_name'].value` or `_fields['my_field_name'].values`.
|
||||
|
||||
[float]
|
||||
=== Accessing the score of a document within a script
|
||||
|
||||
When using scripting for calculating the score of a document (for instance, with
|
||||
the `function_score` query), you can access the score using the `_score`
|
||||
variable inside of a Groovy script.
|
||||
|
||||
[float]
|
||||
=== Source Field
|
||||
|
||||
The source field can also be accessed when executing a script. The
|
||||
source field is loaded per doc, parsed, and then provided to the script
|
||||
for evaluation. The `_source` forms the context under which the source
|
||||
field can be accessed, for example `_source.obj2.obj1.field3`.
|
||||
|
||||
Accessing `_source` is much slower compared to using `doc`
|
||||
but the data is not loaded into memory. For a single field access `_fields` may be
|
||||
faster than using `_source` due to the extra overhead of potentially parsing large documents.
|
||||
However, `_source` may be faster if you access multiple fields or if the source has already been
|
||||
loaded for other purposes.
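
To make the dotted `_source` path concrete, the sketch below walks a parsed source document, modelled here as nested Java maps, one key at a time. `NestedSourceReader` and its `get` method are hypothetical names for illustration; the real script context exposes `_source` directly and does not require such a helper.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical helper showing what a path like obj2.obj1.field3 means
// on a parsed source document; not part of Elasticsearch.
final class NestedSourceReader {

    @SuppressWarnings("unchecked")
    static Object get(Map<String, Object> source, String dottedPath) {
        Object current = source;
        for (String key : dottedPath.split("\\.")) {
            if (!(current instanceof Map)) {
                return null; // the path goes deeper than the document does
            }
            current = ((Map<String, Object>) current).get(key);
        }
        return current;
    }

    public static void main(String[] args) {
        // Equivalent to the JSON {"obj2": {"obj1": {"field3": 42}}}
        Map<String, Object> obj1 = Collections.<String, Object>singletonMap("field3", 42);
        Map<String, Object> obj2 = Collections.<String, Object>singletonMap("obj1", obj1);
        Map<String, Object> source = Collections.<String, Object>singletonMap("obj2", obj2);

        System.out.println(get(source, "obj2.obj1.field3")); // 42
        System.out.println(get(source, "obj2.missing.field3")); // null
    }
}
```

Each step needs the whole document to have been parsed first, which is the overhead the paragraph above refers to.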
|
||||
|
||||
|
||||
[float]
|
||||
=== Groovy Built In Functions
|
||||
|
||||
There are several built in functions that can be used within scripts.
|
||||
They include:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Function |Description
|
||||
|`sin(a)` |Returns the trigonometric sine of an angle.
|
||||
|
||||
|`cos(a)` |Returns the trigonometric cosine of an angle.
|
||||
|
||||
|`tan(a)` |Returns the trigonometric tangent of an angle.
|
||||
|
||||
|`asin(a)` |Returns the arc sine of a value.
|
||||
|
||||
|`acos(a)` |Returns the arc cosine of a value.
|
||||
|
||||
|`atan(a)` |Returns the arc tangent of a value.
|
||||
|
||||
|`toRadians(angdeg)` |Converts an angle measured in degrees to an
|
||||
approximately equivalent angle measured in radians.
|
||||
|
||||
|`toDegrees(angrad)` |Converts an angle measured in radians to an
|
||||
approximately equivalent angle measured in degrees.
|
||||
|
||||
|`exp(a)` |Returns Euler's number _e_ raised to the power of value.
|
||||
|
||||
|`log(a)` |Returns the natural logarithm (base _e_) of a value.
|
||||
|
||||
|`log10(a)` |Returns the base 10 logarithm of a value.
|
||||
|
||||
|`sqrt(a)` |Returns the correctly rounded positive square root of a
|
||||
value.
|
||||
|
||||
|`cbrt(a)` |Returns the cube root of a double value.
|
||||
|
||||
|`IEEEremainder(f1, f2)` |Computes the remainder operation on two
|
||||
arguments as prescribed by the IEEE 754 standard.
|
||||
|
||||
|`ceil(a)` |Returns the smallest (closest to negative infinity) value
|
||||
that is greater than or equal to the argument and is equal to a
|
||||
mathematical integer.
|
||||
|
||||
|`floor(a)` |Returns the largest (closest to positive infinity) value
|
||||
that is less than or equal to the argument and is equal to a
|
||||
mathematical integer.
|
||||
|
||||
|`rint(a)` |Returns the value that is closest in value to the argument
|
||||
and is equal to a mathematical integer.
|
||||
|
||||
|`atan2(y, x)` |Returns the angle _theta_ from the conversion of
|
||||
rectangular coordinates (_x_, _y_) to polar coordinates (_r_, _theta_).
|
||||
|
||||
|`pow(a, b)` |Returns the value of the first argument raised to the
|
||||
power of the second argument.
|
||||
|
||||
|`round(a)` |Returns the closest _int_ to the argument.
|
||||
|
||||
|`random()` |Returns a random _double_ value.
|
||||
|
||||
|`abs(a)` |Returns the absolute value of a value.
|
||||
|
||||
|`max(a, b)` |Returns the greater of two values.
|
||||
|
||||
|`min(a, b)` |Returns the smaller of two values.
|
||||
|
||||
|`ulp(d)` |Returns the size of an ulp of the argument.
|
||||
|
||||
|`signum(d)` |Returns the signum function of the argument.
|
||||
|
||||
|`sinh(x)` |Returns the hyperbolic sine of a value.
|
||||
|
||||
|`cosh(x)` |Returns the hyperbolic cosine of a value.
|
||||
|
||||
|`tanh(x)` |Returns the hyperbolic tangent of a value.
|
||||
|
||||
|`hypot(x, y)` |Returns sqrt(_x_^2^ + _y_^2^) without intermediate overflow
|
||||
or underflow.
|
||||
|=======================================================================
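
The names and descriptions in this table match the static methods of `java.lang.Math`. Assuming the script functions behave the same way (an assumption, not something stated above), the following plain Java sketch shows a few of the less obvious cases. `MathFunctionsDemo` is a hypothetical class name.

```java
// Illustrative only: calls java.lang.Math directly, on the assumption that
// the script functions of the same names behave identically.
final class MathFunctionsDemo {

    public static void main(String[] args) {
        // ceil and floor on a negative value
        System.out.println(Math.ceil(-1.5));  // -1.0
        System.out.println(Math.floor(-1.5)); // -2.0

        // rint returns a double and resolves ties to the even integer,
        // while round returns an integral type and rounds 2.5 up
        System.out.println(Math.rint(2.5));   // 2.0
        System.out.println(Math.round(2.5));  // 3

        // IEEEremainder uses the nearest integer quotient, so the
        // result can be negative even for positive arguments
        System.out.println(Math.IEEEremainder(5.0, 3.0)); // -1.0

        // hypot computes sqrt(x^2 + y^2)
        System.out.println(Math.hypot(3.0, 4.0)); // 5.0
    }
}
```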
|
||||
|
|
|
@ -0,0 +1,691 @@
|
|||
[[modules-scripting]]
|
||||
== Scripting
|
||||
|
||||
The scripting module allows you to use scripts to evaluate custom
|
||||
expressions. For example, scripts can be used to return "script fields"
|
||||
as part of a search request, or can be used to evaluate a custom score
|
||||
for a query and so on.
|
||||
|
||||
By default, the scripting module uses http://groovy-lang.org/[groovy]
|
||||
(previously http://mvel.codehaus.org/[mvel] in 1.3.x and earlier) as the
|
||||
scripting language with some extensions. Groovy is used since it is extremely
|
||||
fast and very simple to use.
|
||||
|
||||
.Groovy dynamic scripting off by default from v1.4.3
|
||||
[IMPORTANT]
|
||||
===================================================
|
||||
|
||||
Groovy dynamic scripting is off by default, preventing dynamic Groovy scripts
|
||||
from being accepted as part of a request or retrieved from the special
|
||||
`.scripts` index. You will still be able to use Groovy scripts stored in files
|
||||
in the `config/scripts/` directory on every node.
|
||||
|
||||
To convert an inline script to a file, take this simple script
|
||||
as an example:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
GET /_search
|
||||
{
|
||||
"script_fields": {
|
||||
"my_field": {
|
||||
"inline": "1 + my_var",
|
||||
"params": {
|
||||
"my_var": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
-----------------------------------
|
||||
|
||||
Save the contents of the `inline` field as a file called `config/scripts/my_script.groovy`
|
||||
on every data node in the cluster:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
1 + my_var
|
||||
-----------------------------------
|
||||
|
||||
Now you can access the script by file name (without the extension):
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
GET /_search
|
||||
{
|
||||
"script_fields": {
|
||||
"my_field": {
|
||||
"script": {
|
||||
"file": "my_script",
|
||||
"params": {
|
||||
"my_var": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
-----------------------------------
|
||||
|
||||
===================================================
|
||||
|
||||
|
||||
Additional `lang` plugins are provided to allow you to execute scripts in
|
||||
different languages. Wherever a script can be used, a `lang` parameter
|
||||
can be provided to define the language of the script. The following are the
|
||||
supported scripting languages:
|
||||
|
||||
[cols="<,<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Language |Sandboxed |Required plugin
|
||||
|groovy |no |built-in
|
||||
|expression |yes |built-in
|
||||
|mustache |yes |built-in
|
||||
|javascript |no |{plugins}/lang-javascript.html[elasticsearch-lang-javascript]
|
||||
|python |no |{plugins}/lang-python.html[elasticsearch-lang-python]
|
||||
|=======================================================================
|
||||
|
||||
To increase security, Elasticsearch does not allow you to specify scripts for
|
||||
non-sandboxed languages with a request. Instead, scripts must be placed in the
|
||||
`scripts` directory inside the configuration directory (the directory where
|
||||
elasticsearch.yml is). The default location of this `scripts` directory can be
|
||||
changed by setting `path.scripts` in elasticsearch.yml. Scripts placed into
|
||||
this directory will automatically be picked up and be available to be used.
|
||||
Once a script has been placed in this directory, it can be referenced by name.
|
||||
For example, a script called `calculate-score.groovy` can be referenced in a
|
||||
request like this:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
$ tree config
|
||||
config
|
||||
├── elasticsearch.yml
|
||||
├── logging.yml
|
||||
└── scripts
|
||||
└── calculate-score.groovy
|
||||
--------------------------------------------------
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
$ cat config/scripts/calculate-score.groovy
|
||||
log(_score * 2) + my_modifier
|
||||
--------------------------------------------------
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"lang": "groovy",
|
||||
"file": "calculate-score",
|
||||
"params": {
|
||||
"my_modifier": 8
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
The name of the script is derived from the hierarchy of directories it
|
||||
exists under, and the file name without the lang extension. For example,
|
||||
a script placed under `config/scripts/group1/group2/test.py` will be
|
||||
named `group1_group2_test`.
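
The naming rule above can be sketched in a few lines of Java. This is only an illustration of the rule as described here, not the code Elasticsearch itself uses; the class `ScriptNameSketch` and the method `scriptName` are hypothetical names, and the path is assumed to be relative to the `scripts` directory and to use `/` as the separator.

```java
final class ScriptNameSketch {

    /**
     * Derives a script name from a path relative to the scripts directory:
     * the language extension is dropped and directories are joined with '_'.
     * Hypothetical helper, written only to illustrate the documented rule.
     */
    static String scriptName(String relativePath) {
        int dot = relativePath.lastIndexOf('.');
        int slash = relativePath.lastIndexOf('/');
        // Only strip the extension if the last dot belongs to the file name.
        String withoutExtension = dot > slash ? relativePath.substring(0, dot) : relativePath;
        return withoutExtension.replace('/', '_');
    }

    public static void main(String[] args) {
        System.out.println(scriptName("group1/group2/test.py"));   // group1_group2_test
        System.out.println(scriptName("calculate-score.groovy"));  // calculate-score
    }
}
```

The derived name is the value you would pass as the `file` parameter in a request.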
|
||||
|
||||
[float]
|
||||
=== Indexed Scripts
|
||||
Elasticsearch allows you to store scripts in an internal index known as
|
||||
`.scripts` and reference them by id. There are REST endpoints to manage
|
||||
indexed scripts.
|
||||
|
||||
Requests to the scripts endpoint look like this:
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
/_scripts/{lang}/{id}
|
||||
-----------------------------------
|
||||
Where the `lang` part is the language the script is in and the `id` part is the id
|
||||
of the script. In the `.scripts` index the type of the document will be set to the `lang`.
|
||||
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XPOST localhost:9200/_scripts/groovy/indexedCalculateScore -d '{
|
||||
"script": "log(_score * 2) + my_modifier"
|
||||
}'
|
||||
-----------------------------------
|
||||
|
||||
This will create a document with id: `indexedCalculateScore` and type: `groovy` in the
|
||||
`.scripts` index. The type of the document is the language used by the script.
|
||||
|
||||
This script can be accessed at query time by using the `id` script parameter and passing
|
||||
the script id:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"id": "indexedCalculateScore",
|
||||
"lang" : "groovy",
|
||||
"params": {
|
||||
"my_modifier": 8
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
The script can be viewed by:
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XGET localhost:9200/_scripts/groovy/indexedCalculateScore
|
||||
-----------------------------------
|
||||
|
||||
This is rendered as:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
{
    "script": "log(_score * 2) + my_modifier"
}
|
||||
-----------------------------------
|
||||
|
||||
Indexed scripts can be deleted by:
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XDELETE localhost:9200/_scripts/groovy/indexedCalculateScore
|
||||
-----------------------------------
|
||||
|
||||
|
||||
|
||||
[float]
|
||||
[[enable-dynamic-scripting]]
|
||||
=== Enabling dynamic scripting
|
||||
|
||||
We recommend running Elasticsearch behind an application or proxy, which
|
||||
protects Elasticsearch from the outside world. If users are allowed to run
|
||||
inline scripts (even in a search request) or indexed scripts, then they have
|
||||
the same access to your box as the user that Elasticsearch is running as. For
|
||||
this reason dynamic scripting is allowed only for sandboxed languages by default.
|
||||
|
||||
First, you should not run Elasticsearch as the `root` user, as this would allow
|
||||
a script to access or do *anything* on your server, without limitations. Second,
|
||||
you should not expose Elasticsearch directly to users, but instead have a proxy
|
||||
application in between. If you *do* intend to expose Elasticsearch directly to
|
||||
your users, then you have to decide whether you trust them enough to run scripts
|
||||
on your box or not.
|
||||
|
||||
It is possible to enable scripts based on their source, for
|
||||
every script engine, through the following settings that need to be added to the
|
||||
`config/elasticsearch.yml` file on every node.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.inline: true
|
||||
script.indexed: true
|
||||
|
||||
-----------------------------------
|
||||
|
||||
While this still allows execution of named scripts provided in the config, or
|
||||
_native_ Java scripts registered through plugins, it also allows users to run
|
||||
arbitrary scripts via the API. Instead of sending the name of the file as the
|
||||
script, the body of the script can be sent instead or retrieved from the
|
||||
`.scripts` index if previously stored.
|
||||
|
||||
There are three possible configuration values for any of the fine-grained
|
||||
script settings:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Value |Description
|
||||
| `false` |scripting is turned off completely, in the context of the setting being set.
|
||||
| `true` |scripting is turned on, in the context of the setting being set.
|
||||
| `sandbox` |scripts may be executed only for languages that are sandboxed
|
||||
|=======================================================================
|
||||
|
||||
The default values are the following:
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.inline: sandbox
|
||||
script.indexed: sandbox
|
||||
script.file: true
|
||||
|
||||
-----------------------------------
|
||||
|
||||
NOTE: Global scripting settings affect the `mustache` scripting language.
|
||||
<<search-template,Search templates>> internally use the `mustache` language,
|
||||
and will still be enabled by default as the `mustache` engine is sandboxed,
|
||||
but they will be enabled/disabled according to fine-grained settings
|
||||
specified in `elasticsearch.yml`.
|
||||
|
||||
It is also possible to control which operations can execute scripts. The
|
||||
supported operations are:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Value |Description
|
||||
| `aggs` |Aggregations (wherever they may be used)
|
||||
| `search` |Search API, Percolator API and Suggester API (e.g. filters, script_fields)
|
||||
| `update` |Update API
|
||||
| `plugin` |Any plugin that makes use of scripts under the generic `plugin` category
|
||||
|=======================================================================
|
||||
|
||||
Plugins can also define custom operations that they use scripts for instead
|
||||
of using the generic `plugin` category. Those operations can be referred to
|
||||
in the following form: `${pluginName}_${operation}`.
|
||||
|
||||
The following example disables scripting for `update` and `mapping` operations,
|
||||
regardless of the script source, for any engine. Scripts can still be
|
||||
executed from sandboxed languages as part of `aggregations`, `search`
|
||||
and plugins execution though, as the above defaults still get applied.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.update: false
|
||||
script.mapping: false
|
||||
|
||||
-----------------------------------
|
||||
|
||||
Generic settings get applied in order: operation based ones have precedence
|
||||
over source based ones. Language specific settings are supported too. They
|
||||
need to be prefixed with the `script.engine.<engine>` prefix and have
|
||||
precedence over any other generic settings.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.engine.groovy.file.aggs: true
|
||||
script.engine.groovy.file.mapping: true
|
||||
script.engine.groovy.file.search: true
|
||||
script.engine.groovy.file.update: true
|
||||
script.engine.groovy.file.plugin: true
|
||||
script.engine.groovy.indexed.aggs: true
|
||||
script.engine.groovy.indexed.mapping: false
|
||||
script.engine.groovy.indexed.search: true
|
||||
script.engine.groovy.indexed.update: false
|
||||
script.engine.groovy.indexed.plugin: false
|
||||
script.engine.groovy.inline.aggs: true
|
||||
script.engine.groovy.inline.mapping: false
|
||||
script.engine.groovy.inline.search: false
|
||||
script.engine.groovy.inline.update: false
|
||||
script.engine.groovy.inline.plugin: false
|
||||
|
||||
-----------------------------------
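
The precedence rules described above can be modelled with a small lookup. The following Java sketch is an illustration of those rules as this page states them, not the actual Elasticsearch implementation; the class `ScriptSettingsSketch` and its methods are hypothetical, and settings are assumed to be available as a flat map of strings.

```java
import java.util.HashMap;
import java.util.Map;

final class ScriptSettingsSketch {

    /** Returns "true", "false" or "sandbox" for one engine/source/operation combination. */
    static String resolve(Map<String, String> settings, String engine, String source, String operation) {
        // 1. Language specific settings take precedence over any generic setting.
        String value = settings.get("script.engine." + engine + "." + source + "." + operation);
        if (value == null) {
            // 2. Operation based generic settings come next.
            value = settings.get("script." + operation);
        }
        if (value == null) {
            // 3. Source based generic settings are consulted last.
            value = settings.get("script." + source);
        }
        if (value == null) {
            // 4. Documented defaults: file scripts are on, inline and indexed are sandbox only.
            value = "file".equals(source) ? "true" : "sandbox";
        }
        return value;
    }

    /** Applies the resolved value to a language that is or is not sandboxed. */
    static boolean canExecute(String mode, boolean sandboxedLanguage) {
        if ("true".equals(mode)) {
            return true;
        }
        return "sandbox".equals(mode) && sandboxedLanguage;
    }

    public static void main(String[] args) {
        Map<String, String> settings = new HashMap<>();
        settings.put("script.update", "false");
        settings.put("script.inline", "true");
        settings.put("script.engine.groovy.inline.update", "true");

        System.out.println(resolve(settings, "groovy", "inline", "update"));      // true (engine setting)
        System.out.println(resolve(settings, "javascript", "inline", "update"));  // false (operation setting)
        System.out.println(resolve(settings, "javascript", "inline", "search"));  // true (source setting)
        System.out.println(resolve(settings, "javascript", "indexed", "search")); // sandbox (default)
    }
}
```

Looking up the most specific key first keeps the sketch short; the real settings parser may work differently and still produce the same outcome.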
|
||||
|
||||
[float]
|
||||
=== Default Scripting Language
|
||||
|
||||
The default scripting language (assuming no `lang` parameter is provided) is
|
||||
`groovy`. To change it, set `script.default_lang` to the
|
||||
appropriate language.
|
||||
|
||||
[float]
|
||||
=== Automatic Script Reloading
|
||||
|
||||
The `config/scripts` directory is scanned periodically for changes.
|
||||
New and changed scripts are reloaded and deleted scripts are removed
|
||||
from the preloaded scripts cache. The reload frequency can be specified
|
||||
using the `resource.reload.interval` setting, which defaults to `60s`.
|
||||
To disable script reloading completely set `script.auto_reload_enabled`
|
||||
to `false`.
|
||||
|
||||
[[native-java-scripts]]
|
||||
[float]
|
||||
=== Native (Java) Scripts
|
||||
|
||||
Sometimes `groovy` and `expression` aren't enough. For those times you can
|
||||
implement a native script.
|
||||
|
||||
The best way to implement a native script is to write a plugin and install it.
|
||||
The plugin {plugins}/plugin-authors.html[documentation] has more information on
|
||||
how to write a plugin so that Elasticsearch will properly load it.
|
||||
|
||||
To register the actual script you'll need to implement `NativeScriptFactory`
|
||||
to construct the script. The actual script will extend either
|
||||
`AbstractExecutableScript` or `AbstractSearchScript`. The second one is likely
|
||||
the most useful and has several helpful subclasses you can extend like
|
||||
`AbstractLongSearchScript`, `AbstractDoubleSearchScript`, and
|
||||
`AbstractFloatSearchScript`. Finally, your plugin should register the native
|
||||
script by declaring the `onModule(ScriptModule)` method.
|
||||
|
||||
If you squashed the whole thing into one class it'd look like:
|
||||
|
||||
[source,java]
|
||||
--------------------------------------------------
|
||||
public class MyNativeScriptPlugin extends Plugin {
|
||||
@Override
|
||||
public String name() {
|
||||
return "my-native-script";
|
||||
}
|
||||
@Override
|
||||
public String description() {
|
||||
return "my native script that does something great";
|
||||
}
|
||||
public void onModule(ScriptModule scriptModule) {
|
||||
scriptModule.registerScript("my_script", MyNativeScriptFactory.class);
|
||||
}
|
||||
|
||||
public static class MyNativeScriptFactory implements NativeScriptFactory {
|
||||
@Override
|
||||
public ExecutableScript newScript(@Nullable Map<String, Object> params) {
|
||||
return new MyNativeScript();
|
||||
}
|
||||
@Override
|
||||
public boolean needsScores() {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
public static class MyNativeScript extends AbstractFloatSearchScript {
|
||||
@Override
|
||||
public float runAsFloat() {
|
||||
float a = ((Number) source().get("a")).floatValue(); // source values are not necessarily Float
|
||||
float b = ((Number) source().get("b")).floatValue(); // source values are not necessarily Float
|
||||
return a * b;
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
||||
You can execute the script by specifying its `lang` as `native`, and the name
|
||||
of the script as the `id`:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"id": "my_script",
|
||||
"lang" : "native"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
|
||||
[float]
|
||||
=== Lucene Expressions Scripts
|
||||
|
||||
experimental[The Lucene expressions module is undergoing significant development and the exposed functionality is likely to change in the future]
|
||||
|
||||
Lucene's expressions module provides a mechanism to compile a
|
||||
`javascript` expression to bytecode. This allows very fast execution,
|
||||
as if you had written a `native` script. Expression scripts can be
|
||||
used in `script_score`, `script_fields`, sort scripts and numeric aggregation scripts.
|
||||
|
||||
See the link:http://lucene.apache.org/core/4_9_0/expressions/index.html?org/apache/lucene/expressions/js/package-summary.html[expressions module documentation]
|
||||
for details on what operators and functions are available.
|
||||
|
||||
The following variables are available in `expression` scripts:
|
||||
|
||||
* Single valued document fields, e.g. `doc['myfield'].value`
|
||||
* Single valued document fields can also be accessed without `.value` e.g. `doc['myfield']`
|
||||
* Parameters passed into the script, e.g. `mymodifier`
|
||||
* The current document's score, `_score` (only available when used in a `script_score`)
|
||||
|
||||
Variables in `expression` scripts that are of type `date` may use the following member methods:
|
||||
|
||||
* getYear()
|
||||
* getMonth()
|
||||
* getDayOfMonth()
|
||||
* getHourOfDay()
|
||||
* getMinutes()
|
||||
* getSeconds()
|
||||
|
||||
The following example shows the difference in years between the `date` fields date0 and date1:
|
||||
|
||||
`doc['date1'].getYear() - doc['date0'].getYear()`
|
||||
|
||||
There are a few limitations relative to other script languages:
|
||||
|
||||
* Only numeric fields may be accessed
|
||||
* Stored fields are not available
|
||||
* If a field is sparse (only some documents contain a value), documents missing the field will have a value of `0`
|
||||
|
||||
[float]
|
||||
=== Score
|
||||
|
||||
In all scripts that can be used in aggregations, the current
|
||||
document's score is accessible in `_score`.
|
||||
|
||||
[float]
|
||||
=== Computing scores based on terms in scripts
|
||||
|
||||
See the <<modules-advanced-scripting, advanced scripting documentation>>.
|
||||
|
||||
[float]
|
||||
=== Document Fields
|
||||
|
||||
Most scripting revolves around the use of specific document field data.
|
||||
The `doc['field_name']` can be used to access specific field data within
|
||||
a document (the document in question is usually derived from the context
|
||||
in which the script is used). Document fields are very fast to access since they
|
||||
end up being loaded into memory (all the relevant field values/tokens
|
||||
are loaded to memory). Note, however, that the `doc[...]` notation only
|
||||
allows for simple valued fields (it can't return a JSON object)
|
||||
and makes sense only on non-analyzed or single term based fields.
|
||||
|
||||
The following data can be extracted from a field:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Expression |Description
|
||||
|`doc['field_name'].value` |The native value of the field. For example,
|
||||
if it's a `short` type, it will be a `short`.
|
||||
|
||||
|`doc['field_name'].values` |The native array values of the field. For
|
||||
example, if it's a `short` type, it will be `short[]`. Remember, a field can
|
||||
have several values within a single doc. Returns an empty array if the
|
||||
field has no values.
|
||||
|
||||
|`doc['field_name'].empty` |A boolean indicating if the field has no
|
||||
values within the doc.
|
||||
|
||||
|`doc['field_name'].multiValued` |A boolean indicating that the field
|
||||
has several values within the corpus.
|
||||
|
||||
|`doc['field_name'].lat` |The latitude of a geo point type.
|
||||
|
||||
|`doc['field_name'].lon` |The longitude of a geo point type.
|
||||
|
||||
|`doc['field_name'].lats` |The latitudes of a geo point type.
|
||||
|
||||
|`doc['field_name'].lons` |The longitudes of a geo point type.
|
||||
|
||||
|`doc['field_name'].distance(lat, lon)` |The `plane` distance (in meters)
|
||||
of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceWithDefault(lat, lon, default)` |The `plane` distance (in meters)
|
||||
of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].distanceInMiles(lat, lon)` |The `plane` distance (in
|
||||
miles) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceInMilesWithDefault(lat, lon, default)` |The `plane` distance (in
|
||||
miles) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].distanceInKm(lat, lon)` |The `plane` distance (in
|
||||
km) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceInKmWithDefault(lat, lon, default)` |The `plane` distance (in
|
||||
km) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistance(lat, lon)` |The `arc` distance (in
|
||||
meters) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
meters) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistanceInMiles(lat, lon)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceInMilesWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistanceInKm(lat, lon)` |The `arc` distance (in
|
||||
km) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceInKmWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
km) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].factorDistance(lat, lon)` |The distance factor of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].factorDistance(lat, lon, default)` |The distance factor of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].geohashDistance(geohash)` |The `arc` distance (in meters)
|
||||
of this geo point field from the provided geohash.
|
||||
|
||||
|`doc['field_name'].geohashDistanceInKm(geohash)` |The `arc` distance (in km)
|
||||
of this geo point field from the provided geohash.
|
||||
|
||||
|`doc['field_name'].geohashDistanceInMiles(geohash)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided geohash.
|
||||
|=======================================================================
|
||||
|
||||
[float]
|
||||
=== Stored Fields
|
||||
|
||||
Stored fields can also be accessed when executing a script. Note, they
|
||||
are much slower to access compared with document fields, as they are not
|
||||
loaded into memory. They can be simply accessed using
|
||||
`_fields['my_field_name'].value` or `_fields['my_field_name'].values`.
|
||||
|
||||
[float]
|
||||
=== Accessing the score of a document within a script
|
||||
|
||||
When using scripting for calculating the score of a document (for instance, with
|
||||
the `function_score` query), you can access the score using the `_score`
|
||||
variable inside of a Groovy script.
|
||||
|
||||
[float]
|
||||
=== Source Field
|
||||
|
||||
The source field can also be accessed when executing a script. The
|
||||
source field is loaded per doc, parsed, and then provided to the script
|
||||
for evaluation. The `_source` forms the context under which the source
|
||||
field can be accessed, for example `_source.obj2.obj1.field3`.
|
||||
|
||||
Accessing `_source` is much slower compared to using `doc`
|
||||
but the data is not loaded into memory. For a single field access `_fields` may be
|
||||
faster than using `_source` due to the extra overhead of potentially parsing large documents.
|
||||
However, `_source` may be faster if you access multiple fields or if the source has already been
|
||||
loaded for other purposes.
|
||||
|
||||
|
||||
[float]
|
||||
=== Groovy Built In Functions
|
||||
|
||||
There are several built in functions that can be used within scripts.
|
||||
They include:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Function |Description
|
||||
|`sin(a)` |Returns the trigonometric sine of an angle.
|
||||
|
||||
|`cos(a)` |Returns the trigonometric cosine of an angle.
|
||||
|
||||
|`tan(a)` |Returns the trigonometric tangent of an angle.
|
||||
|
||||
|`asin(a)` |Returns the arc sine of a value.
|
||||
|
||||
|`acos(a)` |Returns the arc cosine of a value.
|
||||
|
||||
|`atan(a)` |Returns the arc tangent of a value.
|
||||
|
||||
|`toRadians(angdeg)` |Converts an angle measured in degrees to an
|
||||
approximately equivalent angle measured in radians.
|
||||
|
||||
|`toDegrees(angrad)` |Converts an angle measured in radians to an
|
||||
approximately equivalent angle measured in degrees.
|
||||
|
||||
|`exp(a)` |Returns Euler's number _e_ raised to the power of a value.
|
||||
|
||||
|`log(a)` |Returns the natural logarithm (base _e_) of a value.
|
||||
|
||||
|`log10(a)` |Returns the base 10 logarithm of a value.
|
||||
|
||||
|`sqrt(a)` |Returns the correctly rounded positive square root of a
|
||||
value.
|
||||
|
||||
|`cbrt(a)` |Returns the cube root of a double value.
|
||||
|
||||
|`IEEEremainder(f1, f2)` |Computes the remainder operation on two
|
||||
arguments as prescribed by the IEEE 754 standard.
|
||||
|
||||
|`ceil(a)` |Returns the smallest (closest to negative infinity) value
|
||||
that is greater than or equal to the argument and is equal to a
|
||||
mathematical integer.
|
||||
|
||||
|`floor(a)` |Returns the largest (closest to positive infinity) value
|
||||
that is less than or equal to the argument and is equal to a
|
||||
mathematical integer.
|
||||
|
||||
|`rint(a)` |Returns the value that is closest in value to the argument
|
||||
and is equal to a mathematical integer.
|
||||
|
||||
|`atan2(y, x)` |Returns the angle _theta_ from the conversion of
|
||||
rectangular coordinates (_x_, _y_) to polar coordinates (_r_, _theta_).
|
||||
|
||||
|`pow(a, b)` |Returns the value of the first argument raised to the
|
||||
power of the second argument.
|
||||
|
||||
|`round(a)` |Returns the closest _int_ to the argument.
|
||||
|
||||
|`random()` |Returns a random _double_ value.
|
||||
|
||||
|`abs(a)` |Returns the absolute value of a value.
|
||||
|
||||
|`max(a, b)` |Returns the greater of two values.
|
||||
|
||||
|`min(a, b)` |Returns the smaller of two values.
|
||||
|
||||
|`ulp(d)` |Returns the size of an ulp of the argument.
|
||||
|
||||
|`signum(d)` |Returns the signum function of the argument.
|
||||
|
||||
|`sinh(x)` |Returns the hyperbolic sine of a value.
|
||||
|
||||
|`cosh(x)` |Returns the hyperbolic cosine of a value.
|
||||
|
||||
|`tanh(x)` |Returns the hyperbolic tangent of a value.
|
||||
|
||||
|`hypot(x, y)` |Returns sqrt(_x_^2^ + _y_^2^) without intermediate overflow
|
||||
or underflow.
|
||||
|=======================================================================
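
The descriptions in this table match those of the static methods on `java.lang.Math`. Assuming the Groovy built-ins behave like those methods (an assumption made for this sketch, not something stated on this page), the plain Java example below shows two details that are easy to miss: `hypot` avoids the overflow of a naive square root, and `round` and `rint` treat ties differently. The class `GroovyMathSketch` and the method `naiveHypot` are hypothetical names.

```java
final class GroovyMathSketch {

    /** Naive hypotenuse: the intermediate squares can overflow to Infinity. */
    static double naiveHypot(double x, double y) {
        return Math.sqrt(x * x + y * y);
    }

    public static void main(String[] args) {
        double x = 3e200;
        double y = 4e200;
        System.out.println(naiveHypot(x, y));  // Infinity, because x * x overflows
        System.out.println(Math.hypot(x, y));  // finite, close to 5e200

        System.out.println(Math.round(2.5));   // 3   (ties round up)
        System.out.println(Math.rint(2.5));    // 2.0 (ties go to the even integer)

        System.out.println(Math.ceil(-1.5));   // -1.0
        System.out.println(Math.floor(-1.5));  // -2.0
    }
}
```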
|
|
@ -0,0 +1,160 @@
|
|||
[[modules-scripting-security]]
|
||||
=== Scripting and the Java Security Manager
|
||||
|
||||
Elasticsearch runs with the https://docs.oracle.com/javase/tutorial/essential/environment/security.html[Java Security Manager]
|
||||
enabled by default. The security policy in Elasticsearch locks down the
|
||||
permissions granted to each class to the bare minimum required to operate.
|
||||
The benefit of doing this is that it severely limits the attack vectors
|
||||
available to a hacker.
|
||||
|
||||
Restricting permissions is particularly important with scripting languages
|
||||
like Groovy and JavaScript which are designed to do anything that can be done
|
||||
in Java itself, including writing to the file system, opening sockets to
|
||||
remote servers, etc.
|
||||
|
||||
[float]
|
||||
=== Script Classloader Whitelist
|
||||
|
||||
Scripting languages are only allowed to load classes which appear in a
|
||||
hardcoded whitelist that can be found in
|
||||
https://github.com/elastic/elasticsearch/blob/{branch}/core/src/main/java/org/elasticsearch/script/ClassPermission.java[`org.elasticsearch.script.ClassPermission`].
|
||||
|
||||
|
||||
In a script, attempting to load a class that does not appear in the whitelist
|
||||
_may_ result in a `ClassNotFoundException`. For instance, this script:
|
||||
|
||||
[source,json]
|
||||
------------------------------
|
||||
GET _search
|
||||
{
|
||||
"script_fields": {
|
||||
"the_hour": {
|
||||
"script": "use(java.math.BigInteger); new BigInteger(1)"
|
||||
}
|
||||
}
|
||||
}
|
||||
------------------------------
|
||||
|
||||
will return the following exception:
|
||||
|
||||
[source,json]
|
||||
------------------------------
|
||||
{
|
||||
"reason": {
|
||||
"type": "script_exception",
|
||||
"reason": "failed to run inline script [use(java.math.BigInteger); new BigInteger(1)] using lang [groovy]",
|
||||
"caused_by": {
|
||||
"type": "no_class_def_found_error",
|
||||
"reason": "java/math/BigInteger",
|
||||
"caused_by": {
|
||||
"type": "class_not_found_exception",
|
||||
"reason": "java.math.BigInteger"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
------------------------------
|
||||
|
||||
However, classloader issues may also result in more difficult to interpret
|
||||
exceptions. For instance, this script:
|
||||
|
||||
[source,groovy]
|
||||
------------------------------
|
||||
use(groovy.time.TimeCategory); new Date(123456789).format('HH')
|
||||
------------------------------
|
||||
|
||||
Returns the following exception:
|
||||
|
||||
[source,json]
|
||||
------------------------------
|
||||
{
|
||||
"reason": {
|
||||
"type": "script_exception",
|
||||
"reason": "failed to run inline script [use(groovy.time.TimeCategory); new Date(123456789).format('HH')] using lang [groovy]",
|
||||
"caused_by": {
|
||||
"type": "missing_property_exception",
|
||||
"reason": "No such property: groovy for class: 8d45f5c1a07a1ab5dda953234863e283a7586240"
|
||||
}
|
||||
}
|
||||
}
|
||||
------------------------------
|
||||
|
||||
[float]
|
||||
== Dealing with Java Security Manager issues
|
||||
|
||||
If you encounter issues with the Java Security Manager, you have three options
|
||||
for resolving these issues:
|
||||
|
||||
[float]
|
||||
=== Fix the security problem
|
||||
|
||||
The safest and most secure long term solution is to change the code causing
|
||||
the security issue. We recognise that this may take time to do correctly and
|
||||
so we provide the following two alternatives.
|
||||
|
||||
[float]
|
||||
=== Disable the Java Security Manager
|
||||
|
||||
deprecated[2.2.0,The ability to disable the Java Security Manager will be removed in a future version]
|
||||
|
||||
You can disable the Java Security Manager entirely with the
|
||||
`security.manager.enabled` command line flag:
|
||||
|
||||
[source,sh]
|
||||
-----------------------------
|
||||
./bin/elasticsearch --security.manager.enabled false
|
||||
-----------------------------
|
||||
|
||||
WARNING: This disables the Security Manager entirely and makes Elasticsearch
|
||||
much more vulnerable to attacks! It is an option that should only be used in
|
||||
the most urgent of situations and for the shortest amount of time possible.
|
||||
Optional security is not secure at all because it **will** be disabled and
|
||||
leave the system vulnerable. This option will be removed in a future version.
|
||||
|
||||
[float]
|
||||
=== Customising the classloader whitelist
|
||||
|
||||
The classloader whitelist can be customised by tweaking the local Java
|
||||
Security Policy either:
|
||||
|
||||
* system wide: `$JAVA_HOME/lib/security/java.policy`,
|
||||
* for just the `elasticsearch` user: `/home/elasticsearch/.java.policy`, or
|
||||
* from a file specified on the command line: `-Djava.security.policy=someURL`
|
||||
|
||||
Permissions may be granted at the class, package, or global level. For instance:
|
||||
|
||||
[source,js]
|
||||
----------------------------------
|
||||
grant {
|
||||
permission org.elasticsearch.script.ClassPermission "java.util.Base64"; // allow class
|
||||
permission org.elasticsearch.script.ClassPermission "java.util.*"; // allow package
|
||||
permission org.elasticsearch.script.ClassPermission "*"; // allow all (disables filtering basically)
|
||||
};
|
||||
----------------------------------
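
The class, package and global forms shown above follow the wildcard naming convention of `java.security.BasicPermission`. The sketch below uses a hypothetical stand-in class, `DemoClassPermission`, to show how such names match. It assumes that `org.elasticsearch.script.ClassPermission` follows the same convention, so treat it as an illustration rather than a description of the real class.

```java
import java.security.BasicPermission;

final class WhitelistSketch {

    /** Hypothetical stand-in for a class-loading permission; not the Elasticsearch class. */
    static final class DemoClassPermission extends BasicPermission {
        private static final long serialVersionUID = 1L;

        DemoClassPermission(String name) {
            super(name);
        }
    }

    /** Returns true if a grant with the given name would cover the requested class name. */
    static boolean allows(String granted, String className) {
        return new DemoClassPermission(granted).implies(new DemoClassPermission(className));
    }

    public static void main(String[] args) {
        System.out.println(allows("java.util.Base64", "java.util.Base64"));  // true  (exact class)
        System.out.println(allows("java.util.Base64", "java.util.List"));    // false (different class)
        System.out.println(allows("java.util.*", "java.util.Base64"));       // true  (package wildcard)
        System.out.println(allows("java.util.*", "java.math.BigInteger"));   // false (other package)
        System.out.println(allows("*", "groovy.time.TimeCategory"));         // true  (allow all)
    }
}
```

Note that with `BasicPermission` a name such as `java.util.*` also covers classes in subpackages such as `java.util.concurrent`, which is one more reason to prefer the narrowest grant that works.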
|
||||
|
||||
Here is an example of how to enable the `groovy.time.TimeCategory` class:
|
||||
|
||||
[source,js]
|
||||
----------------------------------
|
||||
grant {
|
||||
permission org.elasticsearch.script.ClassPermission "java.lang.Class";
|
||||
permission org.elasticsearch.script.ClassPermission "groovy.time.TimeCategory";
|
||||
};
|
||||
----------------------------------
|
||||
|
||||
[TIP]
|
||||
======================================
|
||||
|
||||
Before adding classes to the whitelist, consider the security impact that it
|
||||
will have on Elasticsearch. Do you really need an extra class or can your code
|
||||
be rewritten in a more secure way?
|
||||
|
||||
It is quite possible that we have not whitelisted a generically useful and
|
||||
safe class. If you have a class that you think should be whitelisted by
|
||||
default, please open an issue on GitHub and we will consider the impact of
|
||||
doing so.
|
||||
|
||||
======================================
|
||||
|
||||
See http://docs.oracle.com/javase/7/docs/technotes/guides/security/PolicyFiles.html for more information.
|
||||
|