Commit Graph

127 Commits

Author SHA1 Message Date
Adrien Grand d84c643f58 Use the new points API to index numeric fields. #17746
This makes all numeric fields including `date`, `ip` and `token_count` use
points instead of the inverted index as a lookup structure. This is expected
to perform worse for exact queries, but faster for range queries. It also
requires less storage.

Notes about how the change works:
 - Numeric mappers have been split into a legacy version that is essentially
   the current mapper, and a new version that uses points, eg.
   LegacyDateFieldMapper and DateFieldMapper.
 - Since new and old fields have the same names, the decision about which one
   to use is made based on the index creation version.
 - If you try to force using a legacy field on a new index or a field that uses
   points on an old index, you will get an exception.
 - IP addresses now support IPv6 via Lucene's InetAddressPoint and store them
   in SORTED_SET doc values using the same encoding (fixed length of 16 bytes
   and sortable).
 - The internal MappedFieldType that is stored by the new mappers does not have
   any of the points-related properties set. Instead, it keeps setting the index
   options when parsing the `index` property of mappings and does
   `if (fieldType.indexOptions() != IndexOptions.NONE) { // add point field }`
   when parsing documents.

Known issues that won't fix:
 - You can't use numeric fields in significant terms aggregations anymore since
   this requires document frequencies, which points do not record.
 - Term queries on numeric fields will now return constant scores instead of
   giving better scores to the rare values.

Known issues that we could work around (in follow-up PRs, this one is too large
already):
 - Range queries on `ip` addresses only work if both the lower and upper bounds
   are inclusive (exclusive bounds are not exposed in Lucene). We could either
   decide to implement it, or drop range support entirely and tell users to
   query subnets using the CIDR notation instead.
 - Since IP addresses now use a different representation for doc values,
   aggregations will fail when running a terms aggregation on an ip field on a
   list of indices that contains both pre-5.0 and 5.0 indices.
 - The ip range aggregation does not work on the new ip field. We need to either
   implement range aggs for SORTED_SET doc values or drop support for ip ranges
   and tell users to use filters instead. #17700

Closes #16751
Closes #17007
Closes #11513
2016-04-14 17:56:23 +02:00
Adrien Grand a14db8e17e Remove MappedFieldType.useTermQueryWithQueryString() and isNumeric(). #17599
In both cases, what elasticsearch is really interested in is whether the field
is an analyzed string field. So it can just check `tokenized()` instead.
2016-04-12 08:45:28 +02:00
Adrien Grand 068c788ec8 Disable fielddata on text fields by defaults. #17386
`text` fields will have fielddata disabled by default. Fielddata can still be
enabled on an existing index by setting `fielddata=true` in the mappings.
2016-03-30 14:35:32 +02:00
Jack Conradson dfec4547ea Added one minor comment for expressions tests. 2016-03-14 13:19:52 -07:00
Alexander Kazakov 8e6b2b3909 Check that _value is used in aggregations script before setting value to specialValue #14262 2016-03-14 12:04:06 +03:00
Colin Goodheart-Smithe 5d9d91b761 Merge branch 'master' into feature/aggs-refactoring 2016-02-03 14:45:16 +00:00
Simon Willnauer 818a9eefb2 Make settings validation strict
This commit enableds strict settings validation on node startup. All settings
passed to elasticsearch either through system properties, yaml files or any other
way to pass settings must be registered and valid. Settings that are unknown ie. due to
typos or due to deprecation or removal will cause the node to NOT start up. Plugins
have to declare all their settings on the `SettingsModule#registerSetting` and settings for
plugins that are not installed must be removed.

This commit also removes the ability to specify the nodes name via `-Des.name` or just `name` in the
configuration files. The node name must be prefixed with the node prexif like `node.name: Boom`. Left over
usage of `name` will also cause startup to fail.
2016-02-02 11:32:44 +01:00
Colin Goodheart-Smithe 187009c12c Merge branch 'master' into feature/aggs-refactoring 2016-01-27 14:54:12 +00:00
Jason Tedor 284cc3a048 Script mode settings as booleans
This commit modifies the accept values for script mode settings from
"on", "off", and "sandbox" to "true", "false", and "sandbox".
2016-01-27 06:26:58 -05:00
Jason Tedor 9944573449 Rename methods on ScriptEngineService
This commit method renames the ScriptEngineService interface methods
types, extensions, and sandboxed to getTypes, getExtensions, and
isSandboxed, respectively.
2016-01-27 06:26:04 -05:00
Jason Tedor 087e55cc51 Script mode settings
This commit converts the script mode settings to the new settings
infrastructure. This is a major refactoring of the handling of script
mode settings. This refactoring is necessary because these settings are
determined at runtime based on the registered script engines and the
registered script contexts.
2016-01-27 06:26:04 -05:00
Colin Goodheart-Smithe 11bafa18e1 Removes Aggregation Builders in place of AggregatorFactory implementations 2016-01-26 15:13:43 +00:00
Ryan Ernst df24019261 Merge pull request #16038 from rjernst/remove_site_plugin
Plugins: Remove site plugins
2016-01-21 12:32:22 -08:00
Nik Everett e1e73d9914 Create default for ExecutableScript#unwrap
Tons of scripts just return the variable they are passed and that is the
most intuitive behavior so that may as well be the default implementation.
2016-01-19 13:17:11 -05:00
Ryan Ernst ef4f0a8699 Test: Make rest test framework accept http directly for the test cluster
The rest test framework, because it used to be tightly integrated with
ESIntegTestCase, currently expects the addresses for the test cluster to
be passed using the transport protocol port. However, it only uses this
to then find the http address.

This change makes ESRestTestCase extend from ESTestCase instead of
ESIntegTestCase, and changes the sysprop used to tests.rest.cluster,
which now takes the http address.

closes #15459
2016-01-18 16:44:14 -08:00
Ryan Ernst 3b78267c71 Plugins: Remove site plugins
Site plugins used to be used for things like kibana and marvel, but
there is no longer a need since kibana (and marvel as a kibana plugin)
uses node.js. This change removes site plugins, as well as the flag for
jvm plugins. Now all plugins are jvm plugins.
2016-01-16 22:45:37 -08:00
Nik Everett 0786c506dc Remove a few more Xlint skips 2016-01-06 23:28:13 -05:00
Adrien Grand d8d8666877 Remove `index_name` back compat.
Since 2.0 we enforce that fields have the same full and index names. So in 3.x
we can remove the ability to have different names on the same field.
2015-12-23 14:55:26 +01:00
Ryan Ernst 4ea19995cf Remove wildcard imports 2015-12-18 12:43:47 -08:00
Jack Conradson 4523eaec88 Added plumbing for compile time script parameters.
Closes #15464
2015-12-16 18:29:21 -08:00
Robert Muir 1e8f9558a0 Remove now-dead code in expressions (fixed in https://issues.apache.org/jira/browse/LUCENE-6920) 2015-12-10 14:50:32 -05:00
Robert Muir 2741888498 Remove RuntimePermission("accessDeclaredMembers")
Upgrades lucene to 5.5.0-1719088, randomizedtesting to 2.3.2, and securemock to 1.2
2015-12-10 14:26:55 -05:00
Robert Muir 3c419c2186 do expressions consistently with other engines 2015-12-05 22:08:40 -05:00
Robert Muir 2169a123a5 Filter classes loaded by scripts
Since 2.2 we run all scripts with minimal privileges, similar to applets in your browser.
The problem is, they have unrestricted access to other things they can muck with (ES, JDK, whatever).
So they can still easily do tons of bad things

This PR restricts what classes scripts can load via the classloader mechanism, to make life more difficult.
The "standard" list was populated from the old list used for the groovy sandbox: though
a few more were needed for tests to pass (java.lang.String, java.util.Iterator, nothing scary there).

Additionally, each scripting engine typically needs permissions to some runtime stuff.
That is the downside of this "good old classloader" approach, but I like the transparency and simplicity,
and I don't want to waste my time with any feature provided by the engine itself for this, I don't trust them.

This is not perfect and the engines are not perfect but you gotta start somewhere. For expert users that
need to tweak the permissions, we already support that via the standard java security configuration files, the
specification is simple, supports wildcards, etc (though we do not use them ourselves).
2015-12-05 21:46:52 -05:00
Robert Muir 46377778a9 Merge branch 'master' into getClassLoader 2015-12-04 15:58:36 -05:00
Robert Muir 7160c5ec15 list modules separately in pluginservice 2015-12-04 01:13:17 -05:00
Ryan Ernst 0a4a81afaf Added modules, distributions now include them (just plugins installed in
a diff dir)
2015-12-03 14:18:26 -08:00