Merge pull request #16332 from clintongormley/classloader_docs
Java Security Manager and scripting
This commit is contained in:
commit
35ef382369
|
@ -3,7 +3,7 @@
|
|||
|
||||
:version: 3.0.0-beta1
|
||||
:major-version: 3.x
|
||||
:branch: 3.0
|
||||
:branch: master
|
||||
:jdk: 1.8.0_25
|
||||
:defguide: https://www.elastic.co/guide/en/elasticsearch/guide/current
|
||||
:plugins: https://www.elastic.co/guide/en/elasticsearch/plugins/master
|
||||
|
|
|
@ -4,13 +4,20 @@
|
|||
This section discusses the changes that you need to be aware of when migrating
|
||||
your application to Elasticsearch 2.2.
|
||||
|
||||
* <<breaking_22_index_apis>>
|
||||
[float]
|
||||
=== Scripting and security
|
||||
|
||||
[[breaking_22_index_apis]]
|
||||
=== Index APIs
|
||||
The Java Security Manager is being used to lock down the privileges available
|
||||
to the scripting languages and to restrict the classes they are allowed to
|
||||
load to a predefined whitelist. These changes may cause scripts which worked
|
||||
in earlier versions to fail. See <<modules-scripting-security>> for more
|
||||
details.
|
||||
|
||||
==== Field stats API
|
||||
[float]
|
||||
=== Field stats API
|
||||
|
||||
The field stats' response format has been changed for number based and date
|
||||
fields. The `min_value` and `max_value` elements now return values as number
|
||||
and the new `min_value_as_string` and `max_value_as_string` return the values
|
||||
as string.
|
||||
|
||||
The field stats' response format has been changed for number based and date fields. The `min_value` and
|
||||
`max_value` elements now return values as number and the new `min_value_as_string` and `max_value_as_string`
|
||||
return the values as string.
|
||||
|
|
|
@ -93,8 +93,6 @@ include::modules/plugins.asciidoc[]
|
|||
|
||||
include::modules/scripting.asciidoc[]
|
||||
|
||||
include::modules/advanced-scripting.asciidoc[]
|
||||
|
||||
include::modules/snapshots.asciidoc[]
|
||||
|
||||
include::modules/threadpool.asciidoc[]
|
||||
|
|
|
@ -1,691 +1,6 @@
|
|||
[[modules-scripting]]
|
||||
== Scripting
|
||||
include::scripting/scripting.asciidoc[]
|
||||
|
||||
The scripting module allows to use scripts in order to evaluate custom
|
||||
expressions. For example, scripts can be used to return "script fields"
|
||||
as part of a search request, or can be used to evaluate a custom score
|
||||
for a query and so on.
|
||||
include::scripting/advanced-scripting.asciidoc[]
|
||||
|
||||
The scripting module uses by default http://groovy-lang.org/[groovy]
|
||||
(previously http://mvel.codehaus.org/[mvel] in 1.3.x and earlier) as the
|
||||
scripting language with some extensions. Groovy is used since it is extremely
|
||||
fast and very simple to use.
|
||||
include::scripting/security.asciidoc[]
|
||||
|
||||
.Groovy dynamic scripting off by default from v1.4.3
|
||||
[IMPORTANT]
|
||||
===================================================
|
||||
|
||||
Groovy dynamic scripting is off by default, preventing dynamic Groovy scripts
|
||||
from being accepted as part of a request or retrieved from the special
|
||||
`.scripts` index. You will still be able to use Groovy scripts stored in files
|
||||
in the `config/scripts/` directory on every node.
|
||||
|
||||
To convert an inline script to a file, take this simple script
|
||||
as an example:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
GET /_search
|
||||
{
|
||||
"script_fields": {
|
||||
"my_field": {
|
||||
"inline": "1 + my_var",
|
||||
"params": {
|
||||
"my_var": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
-----------------------------------
|
||||
|
||||
Save the contents of the `inline` field as a file called `config/scripts/my_script.groovy`
|
||||
on every data node in the cluster:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
1 + my_var
|
||||
-----------------------------------
|
||||
|
||||
Now you can access the script by file name (without the extension):
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
GET /_search
|
||||
{
|
||||
"script_fields": {
|
||||
"my_field": {
|
||||
"script": {
|
||||
"file": "my_script",
|
||||
"params": {
|
||||
"my_var": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
-----------------------------------
|
||||
|
||||
===================================================
|
||||
|
||||
|
||||
Additional `lang` plugins are provided to allow to execute scripts in
|
||||
different languages. All places where a script can be used, a `lang` parameter
|
||||
can be provided to define the language of the script. The following are the
|
||||
supported scripting languages:
|
||||
|
||||
[cols="<,<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Language |Sandboxed |Required plugin
|
||||
|groovy |no |built-in
|
||||
|expression |yes |built-in
|
||||
|mustache |yes |built-in
|
||||
|javascript |no |{plugins}/lang-javascript.html[elasticsearch-lang-javascript]
|
||||
|python |no |{plugins}/lang-python.html[elasticsearch-lang-python]
|
||||
|=======================================================================
|
||||
|
||||
To increase security, Elasticsearch does not allow you to specify scripts for
|
||||
non-sandboxed languages with a request. Instead, scripts must be placed in the
|
||||
`scripts` directory inside the configuration directory (the directory where
|
||||
elasticsearch.yml is). The default location of this `scripts` directory can be
|
||||
changed by setting `path.scripts` in elasticsearch.yml. Scripts placed into
|
||||
this directory will automatically be picked up and be available to be used.
|
||||
Once a script has been placed in this directory, it can be referenced by name.
|
||||
For example, a script called `calculate-score.groovy` can be referenced in a
|
||||
request like this:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
$ tree config
|
||||
config
|
||||
├── elasticsearch.yml
|
||||
├── logging.yml
|
||||
└── scripts
|
||||
└── calculate-score.groovy
|
||||
--------------------------------------------------
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
$ cat config/scripts/calculate-score.groovy
|
||||
log(_score * 2) + my_modifier
|
||||
--------------------------------------------------
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"lang": "groovy",
|
||||
"file": "calculate-score",
|
||||
"params": {
|
||||
"my_modifier": 8
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
The name of the script is derived from the hierarchy of directories it
|
||||
exists under, and the file name without the lang extension. For example,
|
||||
a script placed under `config/scripts/group1/group2/test.py` will be
|
||||
named `group1_group2_test`.
|
||||
|
||||
[float]
|
||||
=== Indexed Scripts
|
||||
Elasticsearch allows you to store scripts in an internal index known as
|
||||
`.scripts` and reference them by id. There are REST endpoints to manage
|
||||
indexed scripts as follows:
|
||||
|
||||
Requests to the scripts endpoint look like :
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
/_scripts/{lang}/{id}
|
||||
-----------------------------------
|
||||
Where the `lang` part is the language the script is in and the `id` part is the id
|
||||
of the script. In the `.scripts` index the type of the document will be set to the `lang`.
|
||||
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XPOST localhost:9200/_scripts/groovy/indexedCalculateScore -d '{
|
||||
"script": "log(_score * 2) + my_modifier"
|
||||
}'
|
||||
-----------------------------------
|
||||
|
||||
This will create a document with id: `indexedCalculateScore` and type: `groovy` in the
|
||||
`.scripts` index. The type of the document is the language used by the script.
|
||||
|
||||
This script can be accessed at query time by using the `id` script parameter and passing
|
||||
the script id:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"id": "indexedCalculateScore",
|
||||
"lang" : "groovy",
|
||||
"params": {
|
||||
"my_modifier": 8
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
The script can be viewed by:
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XGET localhost:9200/_scripts/groovy/indexedCalculateScore
|
||||
-----------------------------------
|
||||
|
||||
This is rendered as:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
'{
|
||||
"script": "log(_score * 2) + my_modifier"
|
||||
}'
|
||||
-----------------------------------
|
||||
|
||||
Indexed scripts can be deleted by:
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XDELETE localhost:9200/_scripts/groovy/indexedCalculateScore
|
||||
-----------------------------------
|
||||
|
||||
|
||||
|
||||
[float]
|
||||
[[enable-dynamic-scripting]]
|
||||
=== Enabling dynamic scripting
|
||||
|
||||
We recommend running Elasticsearch behind an application or proxy, which
|
||||
protects Elasticsearch from the outside world. If users are allowed to run
|
||||
inline scripts (even in a search request) or indexed scripts, then they have
|
||||
the same access to your box as the user that Elasticsearch is running as. For
|
||||
this reason dynamic scripting is allowed only for sandboxed languages by default.
|
||||
|
||||
First, you should not run Elasticsearch as the `root` user, as this would allow
|
||||
a script to access or do *anything* on your server, without limitations. Second,
|
||||
you should not expose Elasticsearch directly to users, but instead have a proxy
|
||||
application inbetween. If you *do* intend to expose Elasticsearch directly to
|
||||
your users, then you have to decide whether you trust them enough to run scripts
|
||||
on your box or not.
|
||||
|
||||
It is possible to enable scripts based on their source, for
|
||||
every script engine, through the following settings that need to be added to the
|
||||
`config/elasticsearch.yml` file on every node.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.inline: true
|
||||
script.indexed: true
|
||||
|
||||
-----------------------------------
|
||||
|
||||
While this still allows execution of named scripts provided in the config, or
|
||||
_native_ Java scripts registered through plugins, it also allows users to run
|
||||
arbitrary scripts via the API. Instead of sending the name of the file as the
|
||||
script, the body of the script can be sent instead or retrieved from the
|
||||
`.scripts` indexed if previously stored.
|
||||
|
||||
There are three possible configuration values for any of the fine-grained
|
||||
script settings:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Value |Description
|
||||
| `false` |scripting is turned off completely, in the context of the setting being set.
|
||||
| `true` |scripting is turned on, in the context of the setting being set.
|
||||
| `sandbox` |scripts may be executed only for languages that are sandboxed
|
||||
|=======================================================================
|
||||
|
||||
The default values are the following:
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.inline: sandbox
|
||||
script.indexed: sandbox
|
||||
script.file: true
|
||||
|
||||
-----------------------------------
|
||||
|
||||
NOTE: Global scripting settings affect the `mustache` scripting language.
|
||||
<<search-template,Search templates>> internally use the `mustache` language,
|
||||
and will still be enabled by default as the `mustache` engine is sandboxed,
|
||||
but they will be enabled/disabled according to fine-grained settings
|
||||
specified in `elasticsearch.yml`.
|
||||
|
||||
It is also possible to control which operations can execute scripts. The
|
||||
supported operations are:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Value |Description
|
||||
| `aggs` |Aggregations (wherever they may be used)
|
||||
| `search` |Search api, Percolator api and Suggester api (e.g filters, script_fields)
|
||||
| `update` |Update api
|
||||
| `plugin` |Any plugin that makes use of scripts under the generic `plugin` category
|
||||
|=======================================================================
|
||||
|
||||
Plugins can also define custom operations that they use scripts for instead
|
||||
of using the generic `plugin` category. Those operations can be referred to
|
||||
in the following form: `${pluginName}_${operation}`.
|
||||
|
||||
The following example disables scripting for `update` and `mapping` operations,
|
||||
regardless of the script source, for any engine. Scripts can still be
|
||||
executed from sandboxed languages as part of `aggregations`, `search`
|
||||
and plugins execution though, as the above defaults still get applied.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.update: false
|
||||
script.mapping: false
|
||||
|
||||
-----------------------------------
|
||||
|
||||
Generic settings get applied in order, operation based ones have precedence
|
||||
over source based ones. Language specific settings are supported too. They
|
||||
need to be prefixed with the `script.engine.<engine>` prefix and have
|
||||
precedence over any other generic settings.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.engine.groovy.file.aggs: true
|
||||
script.engine.groovy.file.mapping: true
|
||||
script.engine.groovy.file.search: true
|
||||
script.engine.groovy.file.update: true
|
||||
script.engine.groovy.file.plugin: true
|
||||
script.engine.groovy.indexed.aggs: true
|
||||
script.engine.groovy.indexed.mapping: false
|
||||
script.engine.groovy.indexed.search: true
|
||||
script.engine.groovy.indexed.update: false
|
||||
script.engine.groovy.indexed.plugin: false
|
||||
script.engine.groovy.inline.aggs: true
|
||||
script.engine.groovy.inline.mapping: false
|
||||
script.engine.groovy.inline.search: false
|
||||
script.engine.groovy.inline.update: false
|
||||
script.engine.groovy.inline.plugin: false
|
||||
|
||||
-----------------------------------
|
||||
|
||||
[float]
|
||||
=== Default Scripting Language
|
||||
|
||||
The default scripting language (assuming no `lang` parameter is provided) is
|
||||
`groovy`. In order to change it, set the `script.default_lang` to the
|
||||
appropriate language.
|
||||
|
||||
[float]
|
||||
=== Automatic Script Reloading
|
||||
|
||||
The `config/scripts` directory is scanned periodically for changes.
|
||||
New and changed scripts are reloaded and deleted script are removed
|
||||
from preloaded scripts cache. The reload frequency can be specified
|
||||
using `resource.reload.interval` setting, which defaults to `60s`.
|
||||
To disable script reloading completely set `script.auto_reload_enabled`
|
||||
to `false`.
|
||||
|
||||
[[native-java-scripts]]
|
||||
[float]
|
||||
=== Native (Java) Scripts
|
||||
|
||||
Sometimes `groovy` and `expressions` aren't enough. For those times you can
|
||||
implement a native script.
|
||||
|
||||
The best way to implement a native script is to write a plugin and install it.
|
||||
The plugin {plugins}/plugin-authors.html[documentation] has more information on
|
||||
how to write a plugin so that Elasticsearch will properly load it.
|
||||
|
||||
To register the actual script you'll need to implement `NativeScriptFactory`
|
||||
to construct the script. The actual script will extend either
|
||||
`AbstractExecutableScript` or `AbstractSearchScript`. The second one is likely
|
||||
the most useful and has several helpful subclasses you can extend like
|
||||
`AbstractLongSearchScript`, `AbstractDoubleSearchScript`, and
|
||||
`AbstractFloatSearchScript`. Finally, your plugin should register the native
|
||||
script by declaring the `onModule(ScriptModule)` method.
|
||||
|
||||
If you squashed the whole thing into one class it'd look like:
|
||||
|
||||
[source,java]
|
||||
--------------------------------------------------
|
||||
public class MyNativeScriptPlugin extends Plugin {
|
||||
@Override
|
||||
public String name() {
|
||||
return "my-native-script";
|
||||
}
|
||||
@Override
|
||||
public String description() {
|
||||
return "my native script that does something great";
|
||||
}
|
||||
public void onModule(ScriptModule scriptModule) {
|
||||
scriptModule.registerScript("my_script", MyNativeScriptFactory.class);
|
||||
}
|
||||
|
||||
public static class MyNativeScriptFactory implements NativeScriptFactory {
|
||||
@Override
|
||||
public ExecutableScript newScript(@Nullable Map<String, Object> params) {
|
||||
return new MyNativeScript();
|
||||
}
|
||||
@Override
|
||||
public boolean needsScores() {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
public static class MyNativeScript extends AbstractFloatSearchScript {
|
||||
@Override
|
||||
public float runAsFloat() {
|
||||
float a = (float) source().get("a");
|
||||
float b = (float) source().get("b");
|
||||
return a * b;
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
||||
You can execute the script by specifying its `lang` as `native`, and the name
|
||||
of the script as the `id`:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"id": "my_script",
|
||||
"lang" : "native"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
|
||||
[float]
|
||||
=== Lucene Expressions Scripts
|
||||
|
||||
experimental[The Lucene expressions module is undergoing significant development and the exposed functionality is likely to change in the future]
|
||||
|
||||
Lucene's expressions module provides a mechanism to compile a
|
||||
`javascript` expression to bytecode. This allows very fast execution,
|
||||
as if you had written a `native` script. Expression scripts can be
|
||||
used in `script_score`, `script_fields`, sort scripts and numeric aggregation scripts.
|
||||
|
||||
See the link:http://lucene.apache.org/core/4_9_0/expressions/index.html?org/apache/lucene/expressions/js/package-summary.html[expressions module documentation]
|
||||
for details on what operators and functions are available.
|
||||
|
||||
Variables in `expression` scripts are available to access:
|
||||
|
||||
* Single valued document fields, e.g. `doc['myfield'].value`
|
||||
* Single valued document fields can also be accessed without `.value` e.g. `doc['myfield']`
|
||||
* Parameters passed into the script, e.g. `mymodifier`
|
||||
* The current document's score, `_score` (only available when used in a `script_score`)
|
||||
|
||||
Variables in `expression` scripts that are of type `date` may use the following member methods:
|
||||
|
||||
* getYear()
|
||||
* getMonth()
|
||||
* getDayOfMonth()
|
||||
* getHourOfDay()
|
||||
* getMinutes()
|
||||
* getSeconds()
|
||||
|
||||
The following example shows the difference in years between the `date` fields date0 and date1:
|
||||
|
||||
`doc['date1'].getYear() - doc['date0'].getYear()`
|
||||
|
||||
There are a few limitations relative to other script languages:
|
||||
|
||||
* Only numeric fields may be accessed
|
||||
* Stored fields are not available
|
||||
* If a field is sparse (only some documents contain a value), documents missing the field will have a value of `0`
|
||||
|
||||
[float]
|
||||
=== Score
|
||||
|
||||
In all scripts that can be used in aggregations, the current
|
||||
document's score is accessible in `_score`.
|
||||
|
||||
[float]
|
||||
=== Computing scores based on terms in scripts
|
||||
|
||||
see <<modules-advanced-scripting, advanced scripting documentation>>
|
||||
|
||||
[float]
|
||||
=== Document Fields
|
||||
|
||||
Most scripting revolve around the use of specific document fields data.
|
||||
The `doc['field_name']` can be used to access specific field data within
|
||||
a document (the document in question is usually derived by the context
|
||||
the script is used). Document fields are very fast to access since they
|
||||
end up being loaded into memory (all the relevant field values/tokens
|
||||
are loaded to memory). Note, however, that the `doc[...]` notation only
|
||||
allows for simple valued fields (can’t return a json object from it)
|
||||
and makes sense only on non-analyzed or single term based fields.
|
||||
|
||||
The following data can be extracted from a field:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Expression |Description
|
||||
|`doc['field_name'].value` |The native value of the field. For example,
|
||||
if its a short type, it will be short.
|
||||
|
||||
|`doc['field_name'].values` |The native array values of the field. For
|
||||
example, if its a short type, it will be short[]. Remember, a field can
|
||||
have several values within a single doc. Returns an empty array if the
|
||||
field has no values.
|
||||
|
||||
|`doc['field_name'].empty` |A boolean indicating if the field has no
|
||||
values within the doc.
|
||||
|
||||
|`doc['field_name'].multiValued` |A boolean indicating that the field
|
||||
has several values within the corpus.
|
||||
|
||||
|`doc['field_name'].lat` |The latitude of a geo point type.
|
||||
|
||||
|`doc['field_name'].lon` |The longitude of a geo point type.
|
||||
|
||||
|`doc['field_name'].lats` |The latitudes of a geo point type.
|
||||
|
||||
|`doc['field_name'].lons` |The longitudes of a geo point type.
|
||||
|
||||
|`doc['field_name'].distance(lat, lon)` |The `plane` distance (in meters)
|
||||
of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceWithDefault(lat, lon, default)` |The `plane` distance (in meters)
|
||||
of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].distanceInMiles(lat, lon)` |The `plane` distance (in
|
||||
miles) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceInMilesWithDefault(lat, lon, default)` |The `plane` distance (in
|
||||
miles) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].distanceInKm(lat, lon)` |The `plane` distance (in
|
||||
km) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceInKmWithDefault(lat, lon, default)` |The `plane` distance (in
|
||||
km) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistance(lat, lon)` |The `arc` distance (in
|
||||
meters) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
meters) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistanceInMiles(lat, lon)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceInMilesWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistanceInKm(lat, lon)` |The `arc` distance (in
|
||||
km) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceInKmWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
km) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].factorDistance(lat, lon)` |The distance factor of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].factorDistance(lat, lon, default)` |The distance factor of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].geohashDistance(geohash)` |The `arc` distance (in meters)
|
||||
of this geo point field from the provided geohash.
|
||||
|
||||
|`doc['field_name'].geohashDistanceInKm(geohash)` |The `arc` distance (in km)
|
||||
of this geo point field from the provided geohash.
|
||||
|
||||
|`doc['field_name'].geohashDistanceInMiles(geohash)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided geohash.
|
||||
|=======================================================================
|
||||
|
||||
[float]
|
||||
=== Stored Fields
|
||||
|
||||
Stored fields can also be accessed when executing a script. Note, they
|
||||
are much slower to access compared with document fields, as they are not
|
||||
loaded into memory. They can be simply accessed using
|
||||
`_fields['my_field_name'].value` or `_fields['my_field_name'].values`.
|
||||
|
||||
[float]
|
||||
=== Accessing the score of a document within a script
|
||||
|
||||
When using scripting for calculating the score of a document (for instance, with
|
||||
the `function_score` query), you can access the score using the `_score`
|
||||
variable inside of a Groovy script.
|
||||
|
||||
[float]
|
||||
=== Source Field
|
||||
|
||||
The source field can also be accessed when executing a script. The
|
||||
source field is loaded per doc, parsed, and then provided to the script
|
||||
for evaluation. The `_source` forms the context under which the source
|
||||
field can be accessed, for example `_source.obj2.obj1.field3`.
|
||||
|
||||
Accessing `_source` is much slower compared to using `doc`
|
||||
but the data is not loaded into memory. For a single field access `_fields` may be
|
||||
faster than using `_source` due to the extra overhead of potentially parsing large documents.
|
||||
However, `_source` may be faster if you access multiple fields or if the source has already been
|
||||
loaded for other purposes.
|
||||
|
||||
|
||||
[float]
|
||||
=== Groovy Built In Functions
|
||||
|
||||
There are several built in functions that can be used within scripts.
|
||||
They include:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Function |Description
|
||||
|`sin(a)` |Returns the trigonometric sine of an angle.
|
||||
|
||||
|`cos(a)` |Returns the trigonometric cosine of an angle.
|
||||
|
||||
|`tan(a)` |Returns the trigonometric tangent of an angle.
|
||||
|
||||
|`asin(a)` |Returns the arc sine of a value.
|
||||
|
||||
|`acos(a)` |Returns the arc cosine of a value.
|
||||
|
||||
|`atan(a)` |Returns the arc tangent of a value.
|
||||
|
||||
|`toRadians(angdeg)` |Converts an angle measured in degrees to an
|
||||
approximately equivalent angle measured in radians
|
||||
|
||||
|`toDegrees(angrad)` |Converts an angle measured in radians to an
|
||||
approximately equivalent angle measured in degrees.
|
||||
|
||||
|`exp(a)` |Returns Euler's number _e_ raised to the power of value.
|
||||
|
||||
|`log(a)` |Returns the natural logarithm (base _e_) of a value.
|
||||
|
||||
|`log10(a)` |Returns the base 10 logarithm of a value.
|
||||
|
||||
|`sqrt(a)` |Returns the correctly rounded positive square root of a
|
||||
value.
|
||||
|
||||
|`cbrt(a)` |Returns the cube root of a double value.
|
||||
|
||||
|`IEEEremainder(f1, f2)` |Computes the remainder operation on two
|
||||
arguments as prescribed by the IEEE 754 standard.
|
||||
|
||||
|`ceil(a)` |Returns the smallest (closest to negative infinity) value
|
||||
that is greater than or equal to the argument and is equal to a
|
||||
mathematical integer.
|
||||
|
||||
|`floor(a)` |Returns the largest (closest to positive infinity) value
|
||||
that is less than or equal to the argument and is equal to a
|
||||
mathematical integer.
|
||||
|
||||
|`rint(a)` |Returns the value that is closest in value to the argument
|
||||
and is equal to a mathematical integer.
|
||||
|
||||
|`atan2(y, x)` |Returns the angle _theta_ from the conversion of
|
||||
rectangular coordinates (_x_, _y_) to polar coordinates (r,_theta_).
|
||||
|
||||
|`pow(a, b)` |Returns the value of the first argument raised to the
|
||||
power of the second argument.
|
||||
|
||||
|`round(a)` |Returns the closest _int_ to the argument.
|
||||
|
||||
|`random()` |Returns a random _double_ value.
|
||||
|
||||
|`abs(a)` |Returns the absolute value of a value.
|
||||
|
||||
|`max(a, b)` |Returns the greater of two values.
|
||||
|
||||
|`min(a, b)` |Returns the smaller of two values.
|
||||
|
||||
|`ulp(d)` |Returns the size of an ulp of the argument.
|
||||
|
||||
|`signum(d)` |Returns the signum function of the argument.
|
||||
|
||||
|`sinh(x)` |Returns the hyperbolic sine of a value.
|
||||
|
||||
|`cosh(x)` |Returns the hyperbolic cosine of a value.
|
||||
|
||||
|`tanh(x)` |Returns the hyperbolic tangent of a value.
|
||||
|
||||
|`hypot(x, y)` |Returns sqrt(_x2_ + _y2_) without intermediate overflow
|
||||
or underflow.
|
||||
|=======================================================================
|
||||
|
|
|
@ -0,0 +1,691 @@
|
|||
[[modules-scripting]]
|
||||
== Scripting
|
||||
|
||||
The scripting module allows to use scripts in order to evaluate custom
|
||||
expressions. For example, scripts can be used to return "script fields"
|
||||
as part of a search request, or can be used to evaluate a custom score
|
||||
for a query and so on.
|
||||
|
||||
The scripting module uses by default http://groovy-lang.org/[groovy]
|
||||
(previously http://mvel.codehaus.org/[mvel] in 1.3.x and earlier) as the
|
||||
scripting language with some extensions. Groovy is used since it is extremely
|
||||
fast and very simple to use.
|
||||
|
||||
.Groovy dynamic scripting off by default from v1.4.3
|
||||
[IMPORTANT]
|
||||
===================================================
|
||||
|
||||
Groovy dynamic scripting is off by default, preventing dynamic Groovy scripts
|
||||
from being accepted as part of a request or retrieved from the special
|
||||
`.scripts` index. You will still be able to use Groovy scripts stored in files
|
||||
in the `config/scripts/` directory on every node.
|
||||
|
||||
To convert an inline script to a file, take this simple script
|
||||
as an example:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
GET /_search
|
||||
{
|
||||
"script_fields": {
|
||||
"my_field": {
|
||||
"inline": "1 + my_var",
|
||||
"params": {
|
||||
"my_var": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
-----------------------------------
|
||||
|
||||
Save the contents of the `inline` field as a file called `config/scripts/my_script.groovy`
|
||||
on every data node in the cluster:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
1 + my_var
|
||||
-----------------------------------
|
||||
|
||||
Now you can access the script by file name (without the extension):
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
GET /_search
|
||||
{
|
||||
"script_fields": {
|
||||
"my_field": {
|
||||
"script": {
|
||||
"file": "my_script",
|
||||
"params": {
|
||||
"my_var": 2
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
-----------------------------------
|
||||
|
||||
===================================================
|
||||
|
||||
|
||||
Additional `lang` plugins are provided to allow to execute scripts in
|
||||
different languages. All places where a script can be used, a `lang` parameter
|
||||
can be provided to define the language of the script. The following are the
|
||||
supported scripting languages:
|
||||
|
||||
[cols="<,<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Language |Sandboxed |Required plugin
|
||||
|groovy |no |built-in
|
||||
|expression |yes |built-in
|
||||
|mustache |yes |built-in
|
||||
|javascript |no |{plugins}/lang-javascript.html[elasticsearch-lang-javascript]
|
||||
|python |no |{plugins}/lang-python.html[elasticsearch-lang-python]
|
||||
|=======================================================================
|
||||
|
||||
To increase security, Elasticsearch does not allow you to specify scripts for
|
||||
non-sandboxed languages with a request. Instead, scripts must be placed in the
|
||||
`scripts` directory inside the configuration directory (the directory where
|
||||
elasticsearch.yml is). The default location of this `scripts` directory can be
|
||||
changed by setting `path.scripts` in elasticsearch.yml. Scripts placed into
|
||||
this directory will automatically be picked up and be available to be used.
|
||||
Once a script has been placed in this directory, it can be referenced by name.
|
||||
For example, a script called `calculate-score.groovy` can be referenced in a
|
||||
request like this:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
$ tree config
|
||||
config
|
||||
├── elasticsearch.yml
|
||||
├── logging.yml
|
||||
└── scripts
|
||||
└── calculate-score.groovy
|
||||
--------------------------------------------------
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
$ cat config/scripts/calculate-score.groovy
|
||||
log(_score * 2) + my_modifier
|
||||
--------------------------------------------------
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"lang": "groovy",
|
||||
"file": "calculate-score",
|
||||
"params": {
|
||||
"my_modifier": 8
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
The name of the script is derived from the hierarchy of directories it
|
||||
exists under, and the file name without the lang extension. For example,
|
||||
a script placed under `config/scripts/group1/group2/test.py` will be
|
||||
named `group1_group2_test`.
|
||||
|
||||
[float]
|
||||
=== Indexed Scripts
|
||||
Elasticsearch allows you to store scripts in an internal index known as
|
||||
`.scripts` and reference them by id. There are REST endpoints to manage
|
||||
indexed scripts as follows:
|
||||
|
||||
Requests to the scripts endpoint look like :
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
/_scripts/{lang}/{id}
|
||||
-----------------------------------
|
||||
Where the `lang` part is the language the script is in and the `id` part is the id
|
||||
of the script. In the `.scripts` index the type of the document will be set to the `lang`.
|
||||
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XPOST localhost:9200/_scripts/groovy/indexedCalculateScore -d '{
|
||||
"script": "log(_score * 2) + my_modifier"
|
||||
}'
|
||||
-----------------------------------
|
||||
|
||||
This will create a document with id: `indexedCalculateScore` and type: `groovy` in the
|
||||
`.scripts` index. The type of the document is the language used by the script.
|
||||
|
||||
This script can be accessed at query time by using the `id` script parameter and passing
|
||||
the script id:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"id": "indexedCalculateScore",
|
||||
"lang" : "groovy",
|
||||
"params": {
|
||||
"my_modifier": 8
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
The script can be viewed by:
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XGET localhost:9200/_scripts/groovy/indexedCalculateScore
|
||||
-----------------------------------
|
||||
|
||||
This is rendered as:
|
||||
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
'{
|
||||
"script": "log(_score * 2) + my_modifier"
|
||||
}'
|
||||
-----------------------------------
|
||||
|
||||
Indexed scripts can be deleted by:
|
||||
[source,js]
|
||||
-----------------------------------
|
||||
curl -XDELETE localhost:9200/_scripts/groovy/indexedCalculateScore
|
||||
-----------------------------------
|
||||
|
||||
|
||||
|
||||
[float]
|
||||
[[enable-dynamic-scripting]]
|
||||
=== Enabling dynamic scripting
|
||||
|
||||
We recommend running Elasticsearch behind an application or proxy, which
|
||||
protects Elasticsearch from the outside world. If users are allowed to run
|
||||
inline scripts (even in a search request) or indexed scripts, then they have
|
||||
the same access to your box as the user that Elasticsearch is running as. For
|
||||
this reason dynamic scripting is allowed only for sandboxed languages by default.
|
||||
|
||||
First, you should not run Elasticsearch as the `root` user, as this would allow
|
||||
a script to access or do *anything* on your server, without limitations. Second,
|
||||
you should not expose Elasticsearch directly to users, but instead have a proxy
|
||||
application inbetween. If you *do* intend to expose Elasticsearch directly to
|
||||
your users, then you have to decide whether you trust them enough to run scripts
|
||||
on your box or not.
|
||||
|
||||
It is possible to enable scripts based on their source, for
|
||||
every script engine, through the following settings that need to be added to the
|
||||
`config/elasticsearch.yml` file on every node.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.inline: true
|
||||
script.indexed: true
|
||||
|
||||
-----------------------------------
|
||||
|
||||
While this still allows execution of named scripts provided in the config, or
|
||||
_native_ Java scripts registered through plugins, it also allows users to run
|
||||
arbitrary scripts via the API. Instead of sending the name of the file as the
|
||||
script, the body of the script can be sent instead or retrieved from the
|
||||
`.scripts` indexed if previously stored.
|
||||
|
||||
There are three possible configuration values for any of the fine-grained
|
||||
script settings:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Value |Description
|
||||
| `false` |scripting is turned off completely, in the context of the setting being set.
|
||||
| `true` |scripting is turned on, in the context of the setting being set.
|
||||
| `sandbox` |scripts may be executed only for languages that are sandboxed
|
||||
|=======================================================================
|
||||
|
||||
The default values are the following:
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.inline: sandbox
|
||||
script.indexed: sandbox
|
||||
script.file: true
|
||||
|
||||
-----------------------------------
|
||||
|
||||
NOTE: Global scripting settings affect the `mustache` scripting language.
|
||||
<<search-template,Search templates>> internally use the `mustache` language,
|
||||
and will still be enabled by default as the `mustache` engine is sandboxed,
|
||||
but they will be enabled/disabled according to fine-grained settings
|
||||
specified in `elasticsearch.yml`.
|
||||
|
||||
It is also possible to control which operations can execute scripts. The
|
||||
supported operations are:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Value |Description
|
||||
| `aggs` |Aggregations (wherever they may be used)
|
||||
| `search` |Search api, Percolator api and Suggester api (e.g filters, script_fields)
|
||||
| `update` |Update api
|
||||
| `plugin` |Any plugin that makes use of scripts under the generic `plugin` category
|
||||
|=======================================================================
|
||||
|
||||
Plugins can also define custom operations that they use scripts for instead
|
||||
of using the generic `plugin` category. Those operations can be referred to
|
||||
in the following form: `${pluginName}_${operation}`.
|
||||
|
||||
The following example disables scripting for `update` and `mapping` operations,
|
||||
regardless of the script source, for any engine. Scripts can still be
|
||||
executed from sandboxed languages as part of `aggregations`, `search`
|
||||
and plugins execution though, as the above defaults still get applied.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.update: false
|
||||
script.mapping: false
|
||||
|
||||
-----------------------------------
|
||||
|
||||
Generic settings get applied in order, operation based ones have precedence
|
||||
over source based ones. Language specific settings are supported too. They
|
||||
need to be prefixed with the `script.engine.<engine>` prefix and have
|
||||
precedence over any other generic settings.
|
||||
|
||||
[source,yaml]
|
||||
-----------------------------------
|
||||
script.engine.groovy.file.aggs: true
|
||||
script.engine.groovy.file.mapping: true
|
||||
script.engine.groovy.file.search: true
|
||||
script.engine.groovy.file.update: true
|
||||
script.engine.groovy.file.plugin: true
|
||||
script.engine.groovy.indexed.aggs: true
|
||||
script.engine.groovy.indexed.mapping: false
|
||||
script.engine.groovy.indexed.search: true
|
||||
script.engine.groovy.indexed.update: false
|
||||
script.engine.groovy.indexed.plugin: false
|
||||
script.engine.groovy.inline.aggs: true
|
||||
script.engine.groovy.inline.mapping: false
|
||||
script.engine.groovy.inline.search: false
|
||||
script.engine.groovy.inline.update: false
|
||||
script.engine.groovy.inline.plugin: false
|
||||
|
||||
-----------------------------------
|
||||
|
||||
[float]
|
||||
=== Default Scripting Language
|
||||
|
||||
The default scripting language (assuming no `lang` parameter is provided) is
|
||||
`groovy`. In order to change it, set the `script.default_lang` to the
|
||||
appropriate language.
|
||||
|
||||
[float]
|
||||
=== Automatic Script Reloading
|
||||
|
||||
The `config/scripts` directory is scanned periodically for changes.
|
||||
New and changed scripts are reloaded and deleted script are removed
|
||||
from preloaded scripts cache. The reload frequency can be specified
|
||||
using `resource.reload.interval` setting, which defaults to `60s`.
|
||||
To disable script reloading completely set `script.auto_reload_enabled`
|
||||
to `false`.
|
||||
|
||||
[[native-java-scripts]]
|
||||
[float]
|
||||
=== Native (Java) Scripts
|
||||
|
||||
Sometimes `groovy` and `expressions` aren't enough. For those times you can
|
||||
implement a native script.
|
||||
|
||||
The best way to implement a native script is to write a plugin and install it.
|
||||
The plugin {plugins}/plugin-authors.html[documentation] has more information on
|
||||
how to write a plugin so that Elasticsearch will properly load it.
|
||||
|
||||
To register the actual script you'll need to implement `NativeScriptFactory`
|
||||
to construct the script. The actual script will extend either
|
||||
`AbstractExecutableScript` or `AbstractSearchScript`. The second one is likely
|
||||
the most useful and has several helpful subclasses you can extend like
|
||||
`AbstractLongSearchScript`, `AbstractDoubleSearchScript`, and
|
||||
`AbstractFloatSearchScript`. Finally, your plugin should register the native
|
||||
script by declaring the `onModule(ScriptModule)` method.
|
||||
|
||||
If you squashed the whole thing into one class it'd look like:
|
||||
|
||||
[source,java]
|
||||
--------------------------------------------------
|
||||
public class MyNativeScriptPlugin extends Plugin {
|
||||
@Override
|
||||
public String name() {
|
||||
return "my-native-script";
|
||||
}
|
||||
@Override
|
||||
public String description() {
|
||||
return "my native script that does something great";
|
||||
}
|
||||
public void onModule(ScriptModule scriptModule) {
|
||||
scriptModule.registerScript("my_script", MyNativeScriptFactory.class);
|
||||
}
|
||||
|
||||
public static class MyNativeScriptFactory implements NativeScriptFactory {
|
||||
@Override
|
||||
public ExecutableScript newScript(@Nullable Map<String, Object> params) {
|
||||
return new MyNativeScript();
|
||||
}
|
||||
@Override
|
||||
public boolean needsScores() {
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
public static class MyNativeScript extends AbstractFloatSearchScript {
|
||||
@Override
|
||||
public float runAsFloat() {
|
||||
float a = (float) source().get("a");
|
||||
float b = (float) source().get("b");
|
||||
return a * b;
|
||||
}
|
||||
}
|
||||
}
|
||||
--------------------------------------------------
|
||||
|
||||
You can execute the script by specifying its `lang` as `native`, and the name
|
||||
of the script as the `id`:
|
||||
|
||||
[source,js]
|
||||
--------------------------------------------------
|
||||
curl -XPOST localhost:9200/_search -d '{
|
||||
"query": {
|
||||
"function_score": {
|
||||
"query": {
|
||||
"match": {
|
||||
"body": "foo"
|
||||
}
|
||||
},
|
||||
"functions": [
|
||||
{
|
||||
"script_score": {
|
||||
"script": {
|
||||
"id": "my_script",
|
||||
"lang" : "native"
|
||||
}
|
||||
}
|
||||
}
|
||||
]
|
||||
}
|
||||
}
|
||||
}'
|
||||
--------------------------------------------------
|
||||
|
||||
|
||||
[float]
|
||||
=== Lucene Expressions Scripts
|
||||
|
||||
experimental[The Lucene expressions module is undergoing significant development and the exposed functionality is likely to change in the future]
|
||||
|
||||
Lucene's expressions module provides a mechanism to compile a
|
||||
`javascript` expression to bytecode. This allows very fast execution,
|
||||
as if you had written a `native` script. Expression scripts can be
|
||||
used in `script_score`, `script_fields`, sort scripts and numeric aggregation scripts.
|
||||
|
||||
See the link:http://lucene.apache.org/core/4_9_0/expressions/index.html?org/apache/lucene/expressions/js/package-summary.html[expressions module documentation]
|
||||
for details on what operators and functions are available.
|
||||
|
||||
Variables in `expression` scripts are available to access:
|
||||
|
||||
* Single valued document fields, e.g. `doc['myfield'].value`
|
||||
* Single valued document fields can also be accessed without `.value` e.g. `doc['myfield']`
|
||||
* Parameters passed into the script, e.g. `mymodifier`
|
||||
* The current document's score, `_score` (only available when used in a `script_score`)
|
||||
|
||||
Variables in `expression` scripts that are of type `date` may use the following member methods:
|
||||
|
||||
* getYear()
|
||||
* getMonth()
|
||||
* getDayOfMonth()
|
||||
* getHourOfDay()
|
||||
* getMinutes()
|
||||
* getSeconds()
|
||||
|
||||
The following example shows the difference in years between the `date` fields date0 and date1:
|
||||
|
||||
`doc['date1'].getYear() - doc['date0'].getYear()`
|
||||
|
||||
There are a few limitations relative to other script languages:
|
||||
|
||||
* Only numeric fields may be accessed
|
||||
* Stored fields are not available
|
||||
* If a field is sparse (only some documents contain a value), documents missing the field will have a value of `0`
|
||||
|
||||
[float]
|
||||
=== Score
|
||||
|
||||
In all scripts that can be used in aggregations, the current
|
||||
document's score is accessible in `_score`.
|
||||
|
||||
[float]
|
||||
=== Computing scores based on terms in scripts
|
||||
|
||||
see <<modules-advanced-scripting, advanced scripting documentation>>
|
||||
|
||||
[float]
|
||||
=== Document Fields
|
||||
|
||||
Most scripting revolve around the use of specific document fields data.
|
||||
The `doc['field_name']` can be used to access specific field data within
|
||||
a document (the document in question is usually derived by the context
|
||||
the script is used). Document fields are very fast to access since they
|
||||
end up being loaded into memory (all the relevant field values/tokens
|
||||
are loaded to memory). Note, however, that the `doc[...]` notation only
|
||||
allows for simple valued fields (can’t return a json object from it)
|
||||
and makes sense only on non-analyzed or single term based fields.
|
||||
|
||||
The following data can be extracted from a field:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Expression |Description
|
||||
|`doc['field_name'].value` |The native value of the field. For example,
|
||||
if its a short type, it will be short.
|
||||
|
||||
|`doc['field_name'].values` |The native array values of the field. For
|
||||
example, if its a short type, it will be short[]. Remember, a field can
|
||||
have several values within a single doc. Returns an empty array if the
|
||||
field has no values.
|
||||
|
||||
|`doc['field_name'].empty` |A boolean indicating if the field has no
|
||||
values within the doc.
|
||||
|
||||
|`doc['field_name'].multiValued` |A boolean indicating that the field
|
||||
has several values within the corpus.
|
||||
|
||||
|`doc['field_name'].lat` |The latitude of a geo point type.
|
||||
|
||||
|`doc['field_name'].lon` |The longitude of a geo point type.
|
||||
|
||||
|`doc['field_name'].lats` |The latitudes of a geo point type.
|
||||
|
||||
|`doc['field_name'].lons` |The longitudes of a geo point type.
|
||||
|
||||
|`doc['field_name'].distance(lat, lon)` |The `plane` distance (in meters)
|
||||
of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceWithDefault(lat, lon, default)` |The `plane` distance (in meters)
|
||||
of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].distanceInMiles(lat, lon)` |The `plane` distance (in
|
||||
miles) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceInMilesWithDefault(lat, lon, default)` |The `plane` distance (in
|
||||
miles) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].distanceInKm(lat, lon)` |The `plane` distance (in
|
||||
km) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].distanceInKmWithDefault(lat, lon, default)` |The `plane` distance (in
|
||||
km) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistance(lat, lon)` |The `arc` distance (in
|
||||
meters) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
meters) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistanceInMiles(lat, lon)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceInMilesWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].arcDistanceInKm(lat, lon)` |The `arc` distance (in
|
||||
km) of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].arcDistanceInKmWithDefault(lat, lon, default)` |The `arc` distance (in
|
||||
km) of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].factorDistance(lat, lon)` |The distance factor of this geo point field from the provided lat/lon.
|
||||
|
||||
|`doc['field_name'].factorDistance(lat, lon, default)` |The distance factor of this geo point field from the provided lat/lon with a default value.
|
||||
|
||||
|`doc['field_name'].geohashDistance(geohash)` |The `arc` distance (in meters)
|
||||
of this geo point field from the provided geohash.
|
||||
|
||||
|`doc['field_name'].geohashDistanceInKm(geohash)` |The `arc` distance (in km)
|
||||
of this geo point field from the provided geohash.
|
||||
|
||||
|`doc['field_name'].geohashDistanceInMiles(geohash)` |The `arc` distance (in
|
||||
miles) of this geo point field from the provided geohash.
|
||||
|=======================================================================
|
||||
|
||||
[float]
|
||||
=== Stored Fields
|
||||
|
||||
Stored fields can also be accessed when executing a script. Note, they
|
||||
are much slower to access compared with document fields, as they are not
|
||||
loaded into memory. They can be simply accessed using
|
||||
`_fields['my_field_name'].value` or `_fields['my_field_name'].values`.
|
||||
|
||||
[float]
|
||||
=== Accessing the score of a document within a script
|
||||
|
||||
When using scripting for calculating the score of a document (for instance, with
|
||||
the `function_score` query), you can access the score using the `_score`
|
||||
variable inside of a Groovy script.
|
||||
|
||||
[float]
|
||||
=== Source Field
|
||||
|
||||
The source field can also be accessed when executing a script. The
|
||||
source field is loaded per doc, parsed, and then provided to the script
|
||||
for evaluation. The `_source` forms the context under which the source
|
||||
field can be accessed, for example `_source.obj2.obj1.field3`.
|
||||
|
||||
Accessing `_source` is much slower compared to using `doc`
|
||||
but the data is not loaded into memory. For a single field access `_fields` may be
|
||||
faster than using `_source` due to the extra overhead of potentially parsing large documents.
|
||||
However, `_source` may be faster if you access multiple fields or if the source has already been
|
||||
loaded for other purposes.
|
||||
|
||||
|
||||
[float]
|
||||
=== Groovy Built In Functions
|
||||
|
||||
There are several built in functions that can be used within scripts.
|
||||
They include:
|
||||
|
||||
[cols="<,<",options="header",]
|
||||
|=======================================================================
|
||||
|Function |Description
|
||||
|`sin(a)` |Returns the trigonometric sine of an angle.
|
||||
|
||||
|`cos(a)` |Returns the trigonometric cosine of an angle.
|
||||
|
||||
|`tan(a)` |Returns the trigonometric tangent of an angle.
|
||||
|
||||
|`asin(a)` |Returns the arc sine of a value.
|
||||
|
||||
|`acos(a)` |Returns the arc cosine of a value.
|
||||
|
||||
|`atan(a)` |Returns the arc tangent of a value.
|
||||
|
||||
|`toRadians(angdeg)` |Converts an angle measured in degrees to an
|
||||
approximately equivalent angle measured in radians
|
||||
|
||||
|`toDegrees(angrad)` |Converts an angle measured in radians to an
|
||||
approximately equivalent angle measured in degrees.
|
||||
|
||||
|`exp(a)` |Returns Euler's number _e_ raised to the power of value.
|
||||
|
||||
|`log(a)` |Returns the natural logarithm (base _e_) of a value.
|
||||
|
||||
|`log10(a)` |Returns the base 10 logarithm of a value.
|
||||
|
||||
|`sqrt(a)` |Returns the correctly rounded positive square root of a
|
||||
value.
|
||||
|
||||
|`cbrt(a)` |Returns the cube root of a double value.
|
||||
|
||||
|`IEEEremainder(f1, f2)` |Computes the remainder operation on two
|
||||
arguments as prescribed by the IEEE 754 standard.
|
||||
|
||||
|`ceil(a)` |Returns the smallest (closest to negative infinity) value
|
||||
that is greater than or equal to the argument and is equal to a
|
||||
mathematical integer.
|
||||
|
||||
|`floor(a)` |Returns the largest (closest to positive infinity) value
|
||||
that is less than or equal to the argument and is equal to a
|
||||
mathematical integer.
|
||||
|
||||
|`rint(a)` |Returns the value that is closest in value to the argument
|
||||
and is equal to a mathematical integer.
|
||||
|
||||
|`atan2(y, x)` |Returns the angle _theta_ from the conversion of
|
||||
rectangular coordinates (_x_, _y_) to polar coordinates (r,_theta_).
|
||||
|
||||
|`pow(a, b)` |Returns the value of the first argument raised to the
|
||||
power of the second argument.
|
||||
|
||||
|`round(a)` |Returns the closest _int_ to the argument.
|
||||
|
||||
|`random()` |Returns a random _double_ value.
|
||||
|
||||
|`abs(a)` |Returns the absolute value of a value.
|
||||
|
||||
|`max(a, b)` |Returns the greater of two values.
|
||||
|
||||
|`min(a, b)` |Returns the smaller of two values.
|
||||
|
||||
|`ulp(d)` |Returns the size of an ulp of the argument.
|
||||
|
||||
|`signum(d)` |Returns the signum function of the argument.
|
||||
|
||||
|`sinh(x)` |Returns the hyperbolic sine of a value.
|
||||
|
||||
|`cosh(x)` |Returns the hyperbolic cosine of a value.
|
||||
|
||||
|`tanh(x)` |Returns the hyperbolic tangent of a value.
|
||||
|
||||
|`hypot(x, y)` |Returns sqrt(_x2_ + _y2_) without intermediate overflow
|
||||
or underflow.
|
||||
|=======================================================================
|
|
@ -0,0 +1,160 @@
|
|||
[[modules-scripting-security]]
|
||||
=== Scripting and the Java Security Manager
|
||||
|
||||
Elasticsearch runs with the https://docs.oracle.com/javase/tutorial/essential/environment/security.html[Java Security Manager]
|
||||
enabled by default. The security policy in Elasticsearch locks down the
|
||||
permissions granted to each class to the bare minimum required to operate.
|
||||
The benefit of doing this is that it severely limits the attack vectors
|
||||
available to a hacker.
|
||||
|
||||
Restricting permissions is particularly important with scripting languages
|
||||
like Groovy and Javascript which are designed to do anything that can be done
|
||||
in Java itself, including writing to the file system, opening sockets to
|
||||
remote servers, etc.
|
||||
|
||||
[float]
|
||||
=== Script Classloader Whitelist
|
||||
|
||||
Scripting languages are only allowed to load classes which appear in a
|
||||
hardcoded whitelist that can be found in
|
||||
https://github.com/elastic/elasticsearch/blob/{branch}/core/src/main/java/org/elasticsearch/script/ClassPermission.java[`org.elasticsearch.script.ClassPermission`].
|
||||
|
||||
|
||||
In a script, attempting to load a class that does not appear in the whitelist
|
||||
_may_ result in a `ClassNotFoundException`, for instance this script:
|
||||
|
||||
[source,json]
|
||||
------------------------------
|
||||
GET _search
|
||||
{
|
||||
"script_fields": {
|
||||
"the_hour": {
|
||||
"script": "use(java.math.BigInteger); new BigInteger(1)"
|
||||
}
|
||||
}
|
||||
}
|
||||
------------------------------
|
||||
|
||||
will return the following exception:
|
||||
|
||||
[source,json]
|
||||
------------------------------
|
||||
{
|
||||
"reason": {
|
||||
"type": "script_exception",
|
||||
"reason": "failed to run inline script [use(java.math.BigInteger); new BigInteger(1)] using lang [groovy]",
|
||||
"caused_by": {
|
||||
"type": "no_class_def_found_error",
|
||||
"reason": "java/math/BigInteger",
|
||||
"caused_by": {
|
||||
"type": "class_not_found_exception",
|
||||
"reason": "java.math.BigInteger"
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
------------------------------
|
||||
|
||||
However, classloader issues may also result in more difficult to interpret
|
||||
exceptions. For instance, this script:
|
||||
|
||||
[source,groovy]
|
||||
------------------------------
|
||||
use(groovy.time.TimeCategory); new Date(123456789).format('HH')
|
||||
------------------------------
|
||||
|
||||
Returns the following exception:
|
||||
|
||||
[source,json]
|
||||
------------------------------
|
||||
{
|
||||
"reason": {
|
||||
"type": "script_exception",
|
||||
"reason": "failed to run inline script [use(groovy.time.TimeCategory); new Date(123456789).format('HH')] using lang [groovy]",
|
||||
"caused_by": {
|
||||
"type": "missing_property_exception",
|
||||
"reason": "No such property: groovy for class: 8d45f5c1a07a1ab5dda953234863e283a7586240"
|
||||
}
|
||||
}
|
||||
}
|
||||
------------------------------
|
||||
|
||||
[float]
|
||||
== Dealing with Java Security Manager issues
|
||||
|
||||
If you encounter issues with the Java Security Manager, you have three options
|
||||
for resolving these issues:
|
||||
|
||||
[float]
|
||||
=== Fix the security problem
|
||||
|
||||
The safest and most secure long term solution is to change the code causing
|
||||
the security issue. We recognise that this may take time to do correctly and
|
||||
so we provide the following two alternatives.
|
||||
|
||||
[float]
|
||||
=== Disable the Java Security Manager
|
||||
|
||||
deprecated[2.2.0,The ability to disable the Java Security Manager will be removed in a future version]
|
||||
|
||||
You can disable the Java Security Manager entirely with the
|
||||
`security.manager.enabled` command line flag:
|
||||
|
||||
[source,sh]
|
||||
-----------------------------
|
||||
./bin/elasticsearch --security.manager.enabled false
|
||||
-----------------------------
|
||||
|
||||
WARNING: This disables the Security Manager entirely and makes Elasticsearch
|
||||
much more vulnerable to attacks! It is an option that should only be used in
|
||||
the most urgent of situations and for the shortest amount of time possible.
|
||||
Optional security is not secure at all because it **will** be disabled and
|
||||
leave the system vulnerable. This option will be removed in a future version.
|
||||
|
||||
[float]
|
||||
=== Customising the classloader whitelist
|
||||
|
||||
The classloader whitelist can be customised by tweaking the local Java
|
||||
Security Policy either:
|
||||
|
||||
* system wide: `$JAVA_HOME/lib/security/java.policy`,
|
||||
* for just the `elasticsearch` user: `/home/elasticsearch/.java.policy`, or
|
||||
* from a file specified on the command line: `-Djava.security.policy=someURL`
|
||||
|
||||
Permissions may be granted at the class, package, or global level. For instance:
|
||||
|
||||
[source,js]
|
||||
----------------------------------
|
||||
grant {
|
||||
permission org.elasticsearch.script.ClassPermission "java.util.Base64"; // allow class
|
||||
permission org.elasticsearch.script.ClassPermission "java.util.*"; // allow package
|
||||
permission org.elasticsearch.script.ClassPermission "*"; // allow all (disables filtering basically)
|
||||
};
|
||||
----------------------------------
|
||||
|
||||
Here is an example of how to enable the `groovy.time.TimeCategory` class:
|
||||
|
||||
[source,js]
|
||||
----------------------------------
|
||||
grant {
|
||||
permission org.elasticsearch.script.ClassPermission "java.lang.Class";
|
||||
permission org.elasticsearch.script.ClassPermission "groovy.time.TimeCategory";
|
||||
};
|
||||
----------------------------------
|
||||
|
||||
[TIP]
|
||||
======================================
|
||||
|
||||
Before adding classes to the whitelist, consider the security impact that it
|
||||
will have on Elasticsearch. Do you really need an extra class or can your code
|
||||
be rewritten in a more secure way?
|
||||
|
||||
It is quite possible that we have not whitelisted a generically useful and
|
||||
safe class. If you have a class that you think should be whitelisted by
|
||||
default, please open an issue on GitHub and we will consider the impact of
|
||||
doing so.
|
||||
|
||||
======================================
|
||||
|
||||
See http://docs.oracle.com/javase/7/docs/technotes/guides/security/PolicyFiles.html for more information.
|
||||
|
Loading…
Reference in New Issue