Commit Graph

120 Commits

Author SHA1 Message Date
Nick Knize ec0dc2c0e9
[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#36751)
* [Geo] Expose BKDBackedGeoShapes as new VECTOR strategy

This commit exposes lucene's LatLonShape field as a new
strategy in GeoShapeFieldMapper. To use the new indexing
approach, strategy should be set to "vector" in the
geo_shape field mapper. If the tree parameter is set
the mapper will throw an IAE. Note the following:

When using vector strategy:

* geo_shape query does not support querying by POINT,
MULTIPOINT, or GEOMETRYCOLLECTION.
* LINESTRING and MULTILINESTRING queries do not support
WITHIN relation.
* CONTAINS relation is not supported.
* The tree, precision, tree_levels, distance_error_pct,
and points_only parameters will not throw an exception
but they have no effect and will be marked as
deprecated..

All other features are supported.

* revert change to PercolatorFieldMapper

* fix ExistsQuery for geo_shape vector strategy

* add deprecation logging for tree, precision, tree_levels, distance_error_pct, and points_only

* initial update to geoshape docs, including mapping migration updates

* initial support for GeoCollection queries

* fix docs and javadoc errors

* clean up geocollection queries

* set deprecated mapping tests to NOTCONSOLE

* fix geo-shape mapper asciidoc mapping and test warnings

* add support for point queries using LatLonShapeBoundingBoxQuery

* update GeoShapeQueryBuilderTests to include POINT queries for VECTOR strategy. Other comment cleanups

* add lucene geometry build testing to ShapeBuilder tests

* remove deprecated prefix tree mapping from geo-shape.asciidoc

* refactor GeoShapeFieldMapper into LegacyGeoShapeFieldMapper and GeoShapeFieldMapper

Both classes derive from BaseGeoShapeFieldMapper that provides shared parameters:
coerce, ignoreMalformed, ignore_z_value, orientation.

* update docs to remove vector strategy

* fix GeometryCollectionBuilder#buildLucene to return the object created by the shape builder

* fix LineLength failure in GeoJsonShapeParserTests

* ShapeMapper refactor changes from PR feedback

* fix typo in geo-shape.asciidoc

* ignore circle test in docs

* update indexing-approach ref to geoshape-indexing-approach

* add warnings check for LegacyGeoShapeFieldMapper to AbstractBuilderTestCase

* fix deprecatedParameters setup

* update indexing approach

* fixing unexpected warnings failures

* move orientation back to field type

* remove if in LegacyGeoShapeFieldMapper#doXContent. Fix GeoShapeFieldMapper to work with double array as a point

* fix indexing-approach link in circle section of geoshape docs

* add strategy to deprecation warnings check

* fix test failures

* fix typo in QueryStringQueryBuilderTests

* fix total hits to totalHits().value

* fix version number

* add version check to BaseGeoShapeFieldMapper

* fix line length!

* revert version check in BaseGeoShapeFieldMapper

* Fix serialization of mappings of legacy shapes.
2018-12-18 09:54:56 -06:00
Nicholas Knize 96d279ed83 Revert "[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320)"
This reverts commit 5bc7822562.
2018-12-17 20:09:46 -06:00
Nick Knize 5bc7822562
[Geo] Integrate Lucene's LatLonShape (BKD Backed GeoShapes) as default `geo_shape` indexing approach (#35320)
This commit  exposes lucene's LatLonShape field as the
default type in GeoShapeFieldMapper. To use the new 
indexing approach, simply set "type" : "geo_shape" in 
the mappings without setting any of the strategy, precision, 
tree_levels, or distance_error_pct parameters. Note the 
following when using the new indexing approach:

* geo_shape query does not support querying by 
MULTIPOINT.
* LINESTRING and MULTILINESTRING queries do not 
yet support WITHIN relation.
* CONTAINS relation is not yet supported.
The tree, precision, tree_levels, distance_error_pct, 
and points_only parameters are deprecated.
2018-12-17 14:38:14 -06:00
Jeff Hajewski f1f3b28f5c Delete deprecated getValues from ScriptDocValues (#36183)
* Adds deprecation logging to ScriptDocValues#getValues.

First commit addressing issue #22919.

`ScriptDocValues#getValues` was added for backwards compatibility but no
longer needed. Scripts using the syntax `doc['foo'].values` when
`doc['foo']` is a list should be using `doc['foo']` instead.

* Fixes two build errors in #34279

* Removes unused import in ScriptDocValuesDatesTest
* Removes used of `.values` in example in diversified-sampler-aggregation.asciidoc

* Removes use of .values from painless test.

Part of #34279

* Updates tests to use `doc[foo]` syntax rather than `doc[foo].values`.

* Removes use of `getValues()` and replaces use of `doc[foo].values` with `doc[foo]`.

* Indentation fix.

* Remove unnecessary list construction at previous `getValues()` callsite in ScriptDocValues.GeoPoints.

* Update migration doc and add link to `getValue` in ScriptDocValues javadoc.

* Fix compile

* Fix javadoc issue

* Removes ScriptDocValues#getValues usage from painless whitelist.
2018-12-14 07:56:47 -05:00
Yu d01b30acba lower fielddata circuit breaker's default limit (#27162)
* Lower fielddata circuit breaker default limit

Lower fielddata circuit breaker default limit from 60% to 40% as we have
moved to doc_values for most of the cases.

* merge master in

* update tests

* update docs
2018-12-11 11:30:58 +01:00
Gordon Brown 58a5ad1f1e
Add Tribe removal to breaking changes list (#36239) 2018-12-10 10:23:58 -07:00
David Turner c32e4fb83f
[Zen2] Best-effort cluster formation if unconfigured (#36215)
In real deployments it is important that clusters are properly configured to
avoid accidentally forming multiple independent clusters at cluster
bootstrapping time. However we also expect to be able to unpack Elasticsearch
and start up one or more nodes without any up-front configuration, and have
them do their best to find each other and form a cluster after a few seconds.

This change adds a delayed automatic bootstrapping process to nodes that start
up with no relevant settings set to support the desired out-of-the-box
experience without compromising safety in properly-configured deployments.
2018-12-07 12:47:09 +00:00
Jason Tedor fc85c37efc
Fix typo in migration node for one shard per index 2018-12-06 19:20:10 -05:00
Jason Tedor e8fe624570
Add migration note on the number of shards
This commit adds a migration note regarding the default number of shards
changing from five to one.

Relates #30539
2018-12-06 19:15:45 -05:00
Jim Ferenczi 18866c4c0b
Make hits.total an object in the search response (#35849)
This commit changes the format of the `hits.total` in the search response to be an object with
a `value` and a `relation`. The `value` indicates the number of hits that match the query and the
`relation` indicates whether the number is accurate (in which case the relation is equals to `eq`)
or a lower bound of the total (in which case it is equals to `gte`).
This change also adds a parameter called `rest_total_hits_as_int` that can be used in the
search APIs to opt out from this change (retrieve the total hits as a number in the rest response).
Note that currently all search responses are accurate (`track_total_hits: true`) or they don't contain
`hits.total` (`track_total_hits: true`). We'll add a way to get a lower bound of the total hits in a
follow up (to allow numbers to be passed to `track_total_hits`).

Relates #33028
2018-12-05 19:49:06 +01:00
Julie Tibshirani 59ee8b5c69
Remove the deprecated _termvector endpoint. (#36131) 2018-12-03 10:22:42 -08:00
Jim Ferenczi 74aca756b8
Remove the distinction between query and filter context in QueryBuilders (#35354)
When building a query Lucene distinguishes two cases, queries that require to produce a score and queries that only need to match. We cloned this mechanism in the QueryBuilders in order to be able to produce different queries based on whether they need to produce a score or not. However the only case in es that require this distinction is the BoolQueryBuilder that sets a different minimum_should_match when a `bool` query is built in a filter context..
This behavior doesn't seem right because it makes the matching of `should` clauses different when the score is not required.

Closes #35293
2018-12-03 11:49:11 +01:00
Jason Tedor 78ac12d952
Remove deprecated Graph endpoints (#35956)
We had some endpoints in Graph that are deprecated for removal in
7.0.0. This commit removes these deprecated endpoints.
2018-11-28 08:24:15 -05:00
Simon Willnauer ad1f0dccd4
Validate metdata on `_msearch` (#35938)
MultiSearchRequests issues through `_msearch` now validate all keys
in the metadata section. Previously unknown keys were ignored
while now an exception is thrown.

Closes #35869
2018-11-27 17:08:24 +01:00
Christophe Bismuth b95a4db6e6 Throw a parsing exception when boost is set in span_or query (#28390) (#34112) 2018-11-26 12:15:59 -05:00
Mayya Sharipova b6014d971c
Forbid negative scores in functon_score query (#35709)
* Forbid negative scores in functon_score query

- Throw an exception when scores are negative in field_value_factor
function
- Throw an exception when scores are negative in script_score
function

Relates to #33309
2018-11-22 06:08:48 -05:00
Tim Vernum b0acd709d7
DOCS: Add SecurityExtension breaking change (#35358)
We changed the way realm settings are defined, and this affects custom
realms in SecurityExtensions. This change adds those details to the
breaking changes docs.

Relates: #30241
2018-11-09 16:57:49 +11:00
Armin Braun 02b4e28534
#31608 Add S3 Setting to Force Path Type Access (#34721)
* SNAPSHOTS: Use Path Style Access in S3

* Use path style access pattern to fix #31608
* closes #31608
2018-11-09 05:07:26 +01:00
Albert Zaharovits 025a0c82e5
Remove deprecated audit settings (#35205)
Removes `.prefix` deprecated settings for the logfile
auditing and also documents it in the migrate asciidoc.
2018-11-08 14:06:47 +02:00
Tim Vernum 7cd27d1a96 [DOCS] Fix missing float in breaking changes 2018-11-06 16:59:11 +11:00
Tim Vernum 574ec6686e
Include realm type in Security Realm setting keys (#30241)
This moves all Realm settings to an Affix definition.
However, because different realm types define different settings
(potentially conflicting settings) this requires that the realm type
become part of the setting key.

Thus, we now need to define realm settings as:

    xpack.security.authc.realms:
      file.file1:
        order: 0

      native.native1:
        order: 1

- This is a breaking change to realm config
- This is also a breaking change to custom security realms (SecurityExtension)
2018-11-06 14:56:50 +11:00
lipsill d181d1bab1 Remove deprecated url parameters `_source_include` and `_source_exclude` (#35097)
Removes `_source_include` and `_source_exclude` url parameters. 
These parameters have been deprecated in #33475.

Closes #22792
2018-10-31 17:11:59 -04:00
Christoph Büscher 389910f11d
[Docs] Add migration note about expanded fields limit (#34920)
Adds a note to warn users about the limit introduced in #26541.
2018-10-29 10:23:07 +01:00
Gordon Brown da20dfd81c
Add cluster-wide shard limit warnings (#34021)
In a future major version, we will be introducing a soft limit on the
number of shards in a cluster based on the number of nodes in the
cluster. This limit will be configurable, and checked on operations
which create or open shards and issue a warning if the operation would
take the cluster over the limit.

There is an option to enable strict enforcement of the limit, which
turns the warnings into errors.  In a future release, the option will be
removed and strict enforcement will be the default (and only) behavior.
2018-10-23 16:35:10 -06:00
Zachary Tong 299d044bfc
Collapse pipeline aggs into single package (#34658)
- Restrict visibility of Aggregators and Factories
- Move PipelineAggregatorBuilders up a level so it is consistent with
AggregatorBuilders
- Checkstyle line length fixes for a few classes
- Minor odds/ends (swapping to method references, formatting, etc)
2018-10-23 16:01:01 -04:00
Igor Motov 94bde37bcf
Geo: Don't flip longitude of envelopes crossing dateline (#34535)
When a envelope that crosses the dateline is specified as a part of
geo_shape query is parsed it shouldn't have its left and right points
flipped.

Fixes #34418
2018-10-19 13:53:54 -04:00
Jim Ferenczi 7b49beb9b0
Fix threshold frequency computation in Suggesters (#34312)
The `term` and `phrase` suggesters have different options to filter candidates
based on their frequencies. The `popular` mode for instance filters candidate
terms that occur in less docs than the original term. However when we compute this threshold
we use the total term frequency of a term instead of the document frequency. This is not inline
with the actual filtering which is always based on the document frequency. This change fixes
this discrepancy and clarifies the meaning of the different frequencies in use in the suggesters.
It also ensures that the threshold doesn't overflow the maximum allowed value (Integer.MAX_VALUE).

Closes #34282
2018-10-19 13:33:19 +02:00
Jim Ferenczi 544de13d8e
Disallow negative query boost (#34486)
This change disallows negative query boosts. Negative scores are not allowed in Lucene 8 so
it is easier to just disallow negative boosts entirely. We should also deprecate negative boosts
in 6x in order to ensure that users are aware when they'll upgrade to ES 7.

Relates #33309
2018-10-16 11:31:53 +01:00
Mayya Sharipova 8f10c771e6 Add migration info for missing values in script
Relates to #30975
2018-10-03 11:56:18 -04:00
albendz f09190c14d Require combine and reduce scripts in scripted metrics aggregation (#33452)
* Make text message not required in constructor for slack

* Remove unnecessary comments in test file

* Throw exception when reduce or combine is not provided; update tests

* Update integration tests for scripted metrics to always include reduce and combine

* Remove some old changes from previous branches

* Rearrange script presence checks to be earlier in build

* Change null check order in script builder for aggregated metrics; correct test scripts in IT

* Add breaking change details to PR
2018-10-03 15:22:01 +01:00
Vladimir Dolzhenko 84111e9607 fix broken doc due to `elasticsearch-translog` removal 2018-10-01 17:54:32 +02:00
Vladimir Dolzhenko 2e2ae19b97
drop elasticsearch-translog for 7.0 (#33373)
#32281 adds elasticsearch-shard to provide bwc version of elasticsearch-translog for 6.x; have to remove elasticsearch-translog for 7.0

Relates to #31389
2018-10-01 16:21:14 +02:00
Lisa Cawley 37be3e713c
[DOCS] Synchronize location of Breaking Changes (#33588) 2018-09-27 08:41:38 -07:00
Nik Everett cac93949fe
API: Drop deprecated methods from Retry (#33925)
We deprecated the `Retry.withBackoff` flavors with `Settings` in 6.5
because they were no longer needed. This drops them form 7.0.
2018-09-21 07:55:50 -04:00
Nik Everett 26c4f1fb6c
Core: Default node.name to the hostname (#33677)
Changes the default of the `node.name` setting to the hostname of the
machine on which Elasticsearch is running. Previously it was the first 8
characters of the node id. This had the advantage of producing a unique
name even when the node name isn't configured but the disadvantage of
being unrecognizable and not being available until fairly late in the
startup process. Of particular interest is that it isn't available until
after logging is configured. This forces us to use a volatile read
whenever we add the node name to the log.

Using the hostname is available immediately on startup and is generally
recognizable but has the disadvantage of not being unique when run on
machines that don't set their hostname or when multiple elasticsearch
processes are run on the same host. I believe that, taken together, it
is better to default to the hostname.

1. Running multiple copies of Elasticsearch on the same node is a fairly
advanced feature. We do it all the as part of the elasticsearch build
for testing but we make sure to set the node name then.
2. That the node.name defaults to some flavor of "localhost" on an
unconfigured box feels like it isn't going to come up too much in
production. I expect most production deployments to at least set the
hostname.

As a bonus, production deployments need no longer set the node name in
most cases. At least in my experience most folks set it to the hostname
anyway.
2018-09-19 15:21:29 -04:00
David Turner 421f58e172
Remove discovery-file plugin (#33257)
In #33241 we moved the file-based discovery functionality to core
Elasticsearch, but preserved the `discovery-file` plugin, and support for the
existing location of the `unicast_hosts.txt` file, for BWC reasons. This commit
completes the removal of this plugin.
2018-09-18 12:01:16 +01:00
Jay Modi 3914a980f7
Security: remove wrapping in put user response (#33512)
This change removes the wrapping of the created field in the put user
response. The created field was added as a top level field in #32332,
while also still being wrapped within the `user` object of the
response. Since the value is available in both formats in 6.x, we can
remove the wrapped version for 7.0.
2018-09-13 14:40:36 -06:00
Jason Tedor c023f67c5d
Add migration note for remote cluster settings (#33632)
The remote cluster settings search.remote.* have been renamed to
cluster.remote.* and are automatically upgraded in the cluster state on
gateway recovery, and on put. This commit adds a note to the migration
docs for these changes.
2018-09-12 13:37:11 -04:00
lcawl 6b780e9926 [DOCS] Fixing formatting issues in breaking changes 2018-09-07 16:53:36 -07:00
Jim Ferenczi 79cd6385fe
Collapse package structure for metrics aggs (#33463)
This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`.
It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package.

Relates #22868
2018-09-07 10:58:06 +02:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Nik Everett f28cddf951
LLREST: Drop deprecated methods (#33223)
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. In a
long series of PRs I've changed all of the old style requests. This
drops the deprecated methods and will be released with 7.0.
2018-09-01 11:11:25 -04:00
Vladimir Dolzhenko 00b272af32 completely drop `index.shard.check_on_startup: fix` for 7.0 (#33194)
Relates to #32279
2018-08-31 22:08:28 +02:00
Mark Tozzi 84b61d0738
Scroll queries asking for rescore are considered invalid (#32918)
This PR changes our behavior from silently ignoring rescore in a scroll query to instead report to the user that such a query is invalid.

Closes #31775
2018-08-28 15:48:23 -04:00
Jonathan Little 9d92a87ae6 Remove support for deprecated params._agg/_aggs for scripted metric aggregations (#32979) 2018-08-28 09:27:43 +01:00
Luca Cavanna 393eec1482
Set maxScore for empty TopDocs to Nan rather than 0 (#32938)
We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.Nan` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with lucene which set `maxScore` to `Float.Nan` when merging empty `TopDocs` (see `TopDocs#merge`).
2018-08-22 17:23:54 +02:00
Igor Motov da6b61e8ef
Make Geo Context Mapping Parsing More Strict (#32821)
Currently, if geo context is represented by something other than
geo_point or an object with lat and lon fields, the parsing of it
as a geo context can result in ignoring the context altogether,
returning confusing errors such as number_format_exception or trying
to parse the number specifying as long-encoded hash code. It would also
fail if the geo_point was stored.

This commit makes the mapping parsing more strict and will fail during
mapping update or index creation if the geo context doesn't point to
a geo_point field.

Supersedes #32412

Closes #32202
2018-08-17 08:13:16 -07:00
Luca Cavanna 00a6ad0e9e
Remove aliases resolution limitations when security is enabled (#31952)
Resolving wildcards in aliases expression is challenging as we may end
up with no aliases to replace the original expression with, but if we
replace with an empty array that means _all which is quite the opposite.
Now that we support and serialize the original requested aliases,
whenever aliases are replaced we will be able to know what was
initially requested. `MetaData#findAliases` can then be updated to not
return anything in case it gets empty aliases, but the original aliases
were not empty. That means that empty aliases are interpreted as _all
only if they were originally requested that way.

Relates to #31516
2018-07-20 09:23:32 +02:00
Alan Woodward a01e26a39b
Correct spelling of AnalysisPlugin#requriesAnalysisSettings (#32025)
Because this is a static method on a public API, and one that we encourage
plugin authors to use, the method with the typo is deprecated in 6.x
rather than just renamed.
2018-07-13 13:13:21 +01:00
Daniel Mitterdorfer f174f72fee
Circuit-break based on real memory usage
With this commit we introduce a new circuit-breaking strategy to the parent
circuit breaker. Contrary to the current implementation which only accounts for
memory reserved via child circuit breakers, the new strategy measures real heap
memory usage at the time of reservation. This allows us to be much more
aggressive with the circuit breaker limit so we bump it to 95% by default. The
new strategy is turned on by default and can be controlled  with the new cluster
setting `indices.breaker.total.userealmemory`.

Note that we turn it off for all integration tests with an internal test cluster
because it leads to spurious test failures which are of no value (we cannot
fully control heap memory usage in tests). All REST tests, however, will make
use of the real memory circuit breaker.

Relates #31767
2018-07-13 10:08:28 +02:00