Commit Graph

94 Commits

Author SHA1 Message Date
Jim Ferenczi 7b49beb9b0
Fix threshold frequency computation in Suggesters (#34312)
The `term` and `phrase` suggesters have different options to filter candidates
based on their frequencies. The `popular` mode for instance filters candidate
terms that occur in less docs than the original term. However when we compute this threshold
we use the total term frequency of a term instead of the document frequency. This is not inline
with the actual filtering which is always based on the document frequency. This change fixes
this discrepancy and clarifies the meaning of the different frequencies in use in the suggesters.
It also ensures that the threshold doesn't overflow the maximum allowed value (Integer.MAX_VALUE).

Closes #34282
2018-10-19 13:33:19 +02:00
Jim Ferenczi 544de13d8e
Disallow negative query boost (#34486)
This change disallows negative query boosts. Negative scores are not allowed in Lucene 8 so
it is easier to just disallow negative boosts entirely. We should also deprecate negative boosts
in 6x in order to ensure that users are aware when they'll upgrade to ES 7.

Relates #33309
2018-10-16 11:31:53 +01:00
Mayya Sharipova 8f10c771e6 Add migration info for missing values in script
Relates to #30975
2018-10-03 11:56:18 -04:00
albendz f09190c14d Require combine and reduce scripts in scripted metrics aggregation (#33452)
* Make text message not required in constructor for slack

* Remove unnecessary comments in test file

* Throw exception when reduce or combine is not provided; update tests

* Update integration tests for scripted metrics to always include reduce and combine

* Remove some old changes from previous branches

* Rearrange script presence checks to be earlier in build

* Change null check order in script builder for aggregated metrics; correct test scripts in IT

* Add breaking change details to PR
2018-10-03 15:22:01 +01:00
Vladimir Dolzhenko 84111e9607 fix broken doc due to `elasticsearch-translog` removal 2018-10-01 17:54:32 +02:00
Vladimir Dolzhenko 2e2ae19b97
drop elasticsearch-translog for 7.0 (#33373)
#32281 adds elasticsearch-shard to provide bwc version of elasticsearch-translog for 6.x; have to remove elasticsearch-translog for 7.0

Relates to #31389
2018-10-01 16:21:14 +02:00
Lisa Cawley 37be3e713c
[DOCS] Synchronize location of Breaking Changes (#33588) 2018-09-27 08:41:38 -07:00
Nik Everett cac93949fe
API: Drop deprecated methods from Retry (#33925)
We deprecated the `Retry.withBackoff` flavors with `Settings` in 6.5
because they were no longer needed. This drops them form 7.0.
2018-09-21 07:55:50 -04:00
Nik Everett 26c4f1fb6c
Core: Default node.name to the hostname (#33677)
Changes the default of the `node.name` setting to the hostname of the
machine on which Elasticsearch is running. Previously it was the first 8
characters of the node id. This had the advantage of producing a unique
name even when the node name isn't configured but the disadvantage of
being unrecognizable and not being available until fairly late in the
startup process. Of particular interest is that it isn't available until
after logging is configured. This forces us to use a volatile read
whenever we add the node name to the log.

Using the hostname is available immediately on startup and is generally
recognizable but has the disadvantage of not being unique when run on
machines that don't set their hostname or when multiple elasticsearch
processes are run on the same host. I believe that, taken together, it
is better to default to the hostname.

1. Running multiple copies of Elasticsearch on the same node is a fairly
advanced feature. We do it all the as part of the elasticsearch build
for testing but we make sure to set the node name then.
2. That the node.name defaults to some flavor of "localhost" on an
unconfigured box feels like it isn't going to come up too much in
production. I expect most production deployments to at least set the
hostname.

As a bonus, production deployments need no longer set the node name in
most cases. At least in my experience most folks set it to the hostname
anyway.
2018-09-19 15:21:29 -04:00
David Turner 421f58e172
Remove discovery-file plugin (#33257)
In #33241 we moved the file-based discovery functionality to core
Elasticsearch, but preserved the `discovery-file` plugin, and support for the
existing location of the `unicast_hosts.txt` file, for BWC reasons. This commit
completes the removal of this plugin.
2018-09-18 12:01:16 +01:00
Jay Modi 3914a980f7
Security: remove wrapping in put user response (#33512)
This change removes the wrapping of the created field in the put user
response. The created field was added as a top level field in #32332,
while also still being wrapped within the `user` object of the
response. Since the value is available in both formats in 6.x, we can
remove the wrapped version for 7.0.
2018-09-13 14:40:36 -06:00
Jason Tedor c023f67c5d
Add migration note for remote cluster settings (#33632)
The remote cluster settings search.remote.* have been renamed to
cluster.remote.* and are automatically upgraded in the cluster state on
gateway recovery, and on put. This commit adds a note to the migration
docs for these changes.
2018-09-12 13:37:11 -04:00
lcawl 6b780e9926 [DOCS] Fixing formatting issues in breaking changes 2018-09-07 16:53:36 -07:00
Jim Ferenczi 79cd6385fe
Collapse package structure for metrics aggs (#33463)
This change collapses all metrics aggregations classes into a single package `org.elasticsearch.aggregations.metrics`.
It also restricts the visibility of some classes (aggregators and factories) that should not be used outside of the package.

Relates #22868
2018-09-07 10:58:06 +02:00
Jim Ferenczi 7ad71f906a
Upgrade to a Lucene 8 snapshot (#33310)
The main benefit of the upgrade for users is the search optimization for top scored documents when the total hit count is not needed. However this optimization is not activated in this change, there is another issue opened to discuss how it should be integrated smoothly.
Some comments about the change:
* Tests that can produce negative scores have been adapted but we need to forbid them completely: #33309

Closes #32899
2018-09-06 14:42:06 +02:00
Nik Everett f28cddf951
LLREST: Drop deprecated methods (#33223)
In #29623 we added `Request` object flavored requests to the low level
REST client and in #30315 we deprecated the old `performRequest`s. In a
long series of PRs I've changed all of the old style requests. This
drops the deprecated methods and will be released with 7.0.
2018-09-01 11:11:25 -04:00
Vladimir Dolzhenko 00b272af32 completely drop `index.shard.check_on_startup: fix` for 7.0 (#33194)
Relates to #32279
2018-08-31 22:08:28 +02:00
Mark Tozzi 84b61d0738
Scroll queries asking for rescore are considered invalid (#32918)
This PR changes our behavior from silently ignoring rescore in a scroll query to instead report to the user that such a query is invalid.

Closes #31775
2018-08-28 15:48:23 -04:00
Jonathan Little 9d92a87ae6 Remove support for deprecated params._agg/_aggs for scripted metric aggregations (#32979) 2018-08-28 09:27:43 +01:00
Luca Cavanna 393eec1482
Set maxScore for empty TopDocs to Nan rather than 0 (#32938)
We used to set `maxScore` to `0` within `TopDocs` in situations where there is really no score as the size was set to `0` and scores were not even tracked. In such scenarios, `Float.Nan` is more appropriate, which gets converted to `max_score: null` on the REST layer. That's also more consistent with lucene which set `maxScore` to `Float.Nan` when merging empty `TopDocs` (see `TopDocs#merge`).
2018-08-22 17:23:54 +02:00
Igor Motov da6b61e8ef
Make Geo Context Mapping Parsing More Strict (#32821)
Currently, if geo context is represented by something other than
geo_point or an object with lat and lon fields, the parsing of it
as a geo context can result in ignoring the context altogether,
returning confusing errors such as number_format_exception or trying
to parse the number specifying as long-encoded hash code. It would also
fail if the geo_point was stored.

This commit makes the mapping parsing more strict and will fail during
mapping update or index creation if the geo context doesn't point to
a geo_point field.

Supersedes #32412

Closes #32202
2018-08-17 08:13:16 -07:00
Luca Cavanna 00a6ad0e9e
Remove aliases resolution limitations when security is enabled (#31952)
Resolving wildcards in aliases expression is challenging as we may end
up with no aliases to replace the original expression with, but if we
replace with an empty array that means _all which is quite the opposite.
Now that we support and serialize the original requested aliases,
whenever aliases are replaced we will be able to know what was
initially requested. `MetaData#findAliases` can then be updated to not
return anything in case it gets empty aliases, but the original aliases
were not empty. That means that empty aliases are interpreted as _all
only if they were originally requested that way.

Relates to #31516
2018-07-20 09:23:32 +02:00
Alan Woodward a01e26a39b
Correct spelling of AnalysisPlugin#requriesAnalysisSettings (#32025)
Because this is a static method on a public API, and one that we encourage
plugin authors to use, the method with the typo is deprecated in 6.x
rather than just renamed.
2018-07-13 13:13:21 +01:00
Daniel Mitterdorfer f174f72fee
Circuit-break based on real memory usage
With this commit we introduce a new circuit-breaking strategy to the parent
circuit breaker. Contrary to the current implementation which only accounts for
memory reserved via child circuit breakers, the new strategy measures real heap
memory usage at the time of reservation. This allows us to be much more
aggressive with the circuit breaker limit so we bump it to 95% by default. The
new strategy is turned on by default and can be controlled  with the new cluster
setting `indices.breaker.total.userealmemory`.

Note that we turn it off for all integration tests with an internal test cluster
because it leads to spurious test failures which are of no value (we cannot
fully control heap memory usage in tests). All REST tests, however, will make
use of the real memory circuit breaker.

Relates #31767
2018-07-13 10:08:28 +02:00
Jim Ferenczi 584fa261cc
Remove the ability to index or query context suggestions without context (#31007)
This is a follow up of #30712 that removes the ability to index or query
and context enabled completion field without context.

Relates #30712
2018-07-09 16:01:01 +02:00
Sohaib Iftikhar 40b822c878 Scripting: Remove support for deprecated StoredScript contexts (#31394)
Removes support for storing scripts without the usual json around the
script. So You can no longer do:
```
POST _scripts/<templatename>
{
    "query": {
        "match": {
            "title": "{{query_string}}"
        }
    }
}
```

and must instead do:
```
POST _scripts/<templatename>
{
    "script": {
        "lang": "mustache",
        "source": {
            "query": {
                "match": {
                    "title": "{{query_string}}"
                }
            }
        }
    }
}
```

This improves error reporting when you attempt to store a script but don't
quite get the syntax right. Before, there was a good chance that we'd
think of it as a "raw" template and just store it. Now we won't do that.
Nice.
2018-07-05 09:30:08 -04:00
Daniel Mitterdorfer 3d53daeb2f
Account for XContent overhead in in-flight breaker
So far the in-flight request circuit breaker has only accounted for the
on-the-wire representation of a request. However, we convert the raw
request into XContent internally which increases the overhead.
Therefore, we increase the value of the corresponding setting
`network.breaker.inflight_requests.overhead` from one to two. While this
value is still rather conservative (we assume that the representation as
structured objects has no overhead compared to the byte[]), it is closer
to reality than the current value.

Relates #31613
2018-07-03 09:17:16 +02:00
Jonathan Little 8e4768890a Migrate scripted metric aggregation scripts to ScriptContext design (#30111)
* Migrate scripted metric aggregation scripts to ScriptContext design #29328

* Rename new script context container class and add clarifying comments to remaining references to params._agg(s)

* Misc cleanup: make mock metric agg script inner classes static

* Move _score to an accessor rather than an arg for scripted metric agg scripts

This causes the score to be evaluated only when it's used.

* Documentation changes for params._agg -> agg

* Migration doc addition for scripted metric aggs _agg object change

* Rename "agg" Scripted Metric Aggregation script context variable to "state"

* Rename a private base class from ...Agg to ...State that I missed in my last commit

* Clean up imports after merge
2018-06-25 12:01:33 +01:00
Ryan Ernst c0961b79be
Docs: Add note about removing prepareExecute from the java client (#31401)
relates #30966
2018-06-19 07:21:58 -07:00
Ryan Ernst f3297ed23a
Packaging: Remove windows bin files from the tar distribution (#30596)
This commit removes windows specific files from the tar distribution.
Windows users use the zip, linux users use the tar.
2018-06-18 19:02:51 +02:00
Luca Cavanna 24163d10b7
REST hl client: cluster health to default to cluster level (#31268)
With #29331 we added support for the cluster health API to the
high-level REST client. The transport client does not support the level
parameter, and it always returns all the info needed for shards level
rendering. We have maintained that behaviour when adding support for
cluster health to the high-level REST client, to ease migration, but the
correct thing to do is to default the high-level REST client to
`cluster` level, which is the same default as when going through the
Elasticsearch REST layer.
2018-06-13 15:06:13 +02:00
Luca Cavanna 92eb324776
REST high-level Client: remove deprecated API methods (#31200)
This commit removes all the API methods that accept a `Header` varargs
argument, in favour of the newly introduced API methods that accept a
`RequestOptions` argument.

Relates to #31069
2018-06-12 21:00:06 +02:00
Simon Willnauer f825a530b8
Limit the number of concurrent requests per node (#31206)
With `max_concurrent_shard_requests` we used to throttle / limit
the number of concurrent shard requests a high level search request
can execute per node. This had several problems since it limited the
number on a global level based on the number of nodes. This change
now throttles the number of concurrent requests per node while still
allowing concurrency across multiple nodes.

Closes #31192
2018-06-11 08:49:18 +02:00
Alan Woodward 852df128a5
Match phrase queries against non-indexed fields should throw an exception (#31060)
When `lenient=false`, attempts to create match phrase queries with custom analyzers against non-text fields will throw an IllegalArgumentException.

Also changes `*Match*QueryBuilderTests` so that it avoids this scenario

Fixes #31061
2018-06-04 19:12:45 +01:00
Christoph Büscher 1ea9f11b03
Change ScriptException status to 400 (bad request) (#30861)
Currently failures to compile a script usually lead to a ScriptException, which
inherits the 500 INTERNAL_SERVER_ERROR from ElasticsearchException if it does
not contain another root cause. Instead, this should be a 400 Bad Request error.
This PR changes this more generally for script compilation errors by changing 
ScriptException to return 400 (bad request) as status code.

Closes #12315
2018-05-30 14:00:07 +02:00
Jim Ferenczi f582418ada Fix missing option serialization after backport
Relates #29465
2018-05-30 12:55:31 +02:00
Vladimir Dolzhenko 81eb8ba0f0
Include size of snapshot in snapshot metadata (#29602)
Include size of snapshot in snapshot metadata

Adds difference of number of files (and file sizes) between prev and current snapshot. Total number/size reflects total number/size of files in snapshot.

Closes #18543
2018-05-25 21:04:50 +02:00
Igor Motov cf0e0606af
Use geohash cell instead of just a corner in geo_bounding_box (#30698)
Treats geohashes as grid cells instead of just points when the
geohashes are used to specify the edges in the geo_bounding_box
query. For example, if a geohash is used to specify the top_left
corner, the top left corner of the geohash cell will be used as the
corner of the bounding box.

Closes #25154
2018-05-24 14:46:15 -04:00
Tim Brooks d7040ad7b4
Reintroduce mandatory http pipelining support (#30820)
This commit reintroduces 31251c9 and 63a5799. These commits introduced a
memory leak and were reverted. This commit brings those commits back
and fixes the memory leak by removing unnecessary retain method calls.
2018-05-23 14:38:52 -06:00
Colin Goodheart-Smithe 4fd0a3e492 Revert "Make http pipelining support mandatory (#30695)" (#30813)
This reverts commit 31251c9 introduced in #30695.

We suspect this commit is causing the OOME's reported in #30811 and we will use this PR to test this assertion.
2018-05-23 10:54:46 -06:00
Tim Brooks 31251c9a6d
Make http pipelining support mandatory (#30695)
This is related to #29500 and #28898. This commit removes the abilitiy
to disable http pipelining. After this commit, any elasticsearch node
will support pipelined requests from a client. Additionally, it extracts
some of the http pipelining work to the server module. This extracted
work is used to implement pipelining for the nio plugin.
2018-05-22 09:29:31 -06:00
Tanguy Leroux 74474e99d6 [Docs] Fix broken cross link in documentation 2018-05-22 16:03:33 +02:00
Ryan Ernst 34180f2285
Scripting: Remove getDate methods from ScriptDocValues (#30690)
The getDate() and getDates() existed prior to 5.x on long fields in
scripting. In 5.x, a new Date type for ScriptDocValues was added. The
getDate() and getDates() methods were left on long fields and added to date
fields to ease the transition. This commit removes those methods for
7.0.
2018-05-18 21:26:26 -07:00
Jason Tedor d68c44b76c
Default copy settings to true and deprecate on the REST layer (#30598)
This commit defaults the copy_settings REST parameter to the shrink and
split APIs to true, and deprecates the parameter.
2018-05-18 10:12:08 -04:00
Ryan Ernst fb0aa562a5
Network: Remove http.enabled setting (#29601)
This commit removes the http.enabled setting. While all real nodes (started with bin/elasticsearch) will always have an http binding, there are many tests that rely on the quickness of not actually needing to bind to 2 ports. For this case, the MockHttpTransport.TestPlugin provides a dummy http transport implementation which is used by default in ESIntegTestCase.

closes #12792
2018-05-02 11:42:05 -07:00
Ryan Ernst fba2f00a73
Packaging: Unmark systemd service file as a config file (#29004)
Systemd overrides should happen through /etc/systemd/system, not
directly editing the service file. This commit removes marking the
service file as configuration for rpm and deb packages.
2018-05-02 09:48:49 -07:00
Jason Tedor 5de6f4ff7b Adjust copy settings on resize BWC version
This commit adjusts the BWC version for copy settings on resize
operations after the behavior was backported to 6.x.
2018-05-01 08:49:16 -04:00
Jason Tedor 50535423ff
Allow copying source settings on resize operation (#30255)
Today when an index is created from shrinking or splitting an existing
index, the target index inherits almost none of the source index
settings. This is surprising and a hassle for operators managing such
indices. Given this is the default behavior, we can not simply change
it. Instead, we start by introducing the ability to copy settings. This
flag can be set on the REST API or on the transport layer and it has the
behavior that it copies all settings from the source except non-copyable
settings (a property of a setting introduced in this
change). Additionally, settings on the request will always override.

This change is the first step in our adventure:
 - this flag is added here in 7.0.0 and immediately deprecated
 - this flag will be backported to 6.4.0 and remain deprecated
 - then, we will remove the ability to set this flag to false in 7.0.0
 - finally, in 8.0.0 we will remove this flag and the only behavior will
   be for settings to be copied
2018-05-01 08:48:19 -04:00
Jason Tedor f381e2a00c
Add migration note on thread pool API changes (#29192)
A previous change modified the output of the thread pool info contained
in the nodes info API. This commit adds a note to the migration docs for
this change.
2018-04-28 00:11:17 -04:00
Julie Tibshirani f5978d6d33
In the field capabilities API, remove support for providing fields in the request body. (#30185) 2018-04-27 16:14:11 -07:00