Commit Graph

1646 Commits

Author SHA1 Message Date
Martijn van Groningen 07a57cc131
Move number of language analyzers to analysis-common module (#31143)
The following analyzers were moved from server module to analysis-common module:
`snowball`, `arabic`, `armenian`, `basque`, `bengali`, `brazilian`, `bulgarian`,
`catalan`, `chinese`, `cjk`, `czech`, `danish`, `dutch`, `english`, `finnish`,
`french`, `galician` and `german`.

Relates to #23658
2018-06-08 08:58:46 +02:00
Jim Ferenczi b30aa3137d
Reject long regex in query_string (#31136)
This change applies the existing `index.max_regex_length` to regex queries
produced by the `query_string` query.

Relates #28344
2018-06-07 09:29:26 +02:00
Lee Hinman b22a055bcf
Add get mappings support to high-level rest client (#30889)
This adds support for the get mappings API to the high level rest client.

Relates to #27205
2018-06-04 14:31:08 -06:00
Jim Ferenczi f94a75778c
Fix index prefixes to work with span_multi (#31066)
* Fix index prefixes to work with span_multi

Text fields that use `index_prefixes` can rewrite `prefix` queries into
`term` queries internally. This commit fix the handling of this rewriting
in the `span_multi` query.
This change also copies the index options of the text field into the
prefix field in order to be able to run positional queries. This is mandatory
for `span_multi` to work but this could also be useful to optimize `match_phrase_prefix`
queries in a follow up. Note that this change can only be done on indices created
after 6.3 since we set the index options to doc only in this version.

Fixes #31056
2018-06-04 21:48:56 +02:00
Alan Woodward 0427339ab0
Index phrases (#30450)
Specifying `index_phrases: true` on a text field mapping will add a subsidiary
[field]._index_phrase field, indexing two-term shingles from the parent field.
The parent analysis chain is re-used, wrapped with a FixedShingleFilter.

At query time, if a phrase match query is executed, the mapping will redirect it
to run against the subsidiary field.

This should trade faster phrase querying for a larger index and longer indexing
times.

Relates to #27049
2018-06-04 08:50:35 +01:00
Michael Basnight d826cb36c3
Remove version read/write logic in Verify Response (#30879)
Since master will always communicate with a >=6.4 node, the logic for
checking if the node is 6.4 and conditionally reading and writing based
on that can be removed from master. This logic will stay in 6.x as it is
the bridge to the cleaner response in master. This also unmutes the
failing test due to this bwc change.

Closes #30807
2018-05-31 12:10:01 -05:00
Jim Ferenczi 0f5e570184
Deprecates indexing and querying a context completion field without context (#30712)
This change deprecates completion queries and documents without context that target a
context enabled completion field. Querying without context degrades the search
performance considerably (even when the number of indexed contexts is low).
This commit targets master but the deprecation will take place in 6.x and the functionality
will be removed in 7 in a follow up.

Closes #29222
2018-05-31 16:09:48 +02:00
Jim Ferenczi f582418ada Fix missing option serialization after backport
Relates #29465
2018-05-30 12:55:31 +02:00
Jim Ferenczi e33d107f84
Add missing_bucket option in the composite agg (#29465)
This change adds a new option to the composite aggregation named `missing_bucket`.
This option can be set by source and dictates whether documents without a value for the
source should be ignored. When set to true, documents without a value for a field emits
an explicit `null` value which is then added in the composite bucket.
The `missing` option that allows to set an explicit value (instead of `null`) is deprecated in this change and will be removed in a follow up (only in 7.x).
This commit also changes how the big arrays are allocated, instead of reserving
the provided `size` for all sources they are created with a small intial size and they grow
depending on the number of buckets created by the aggregation:
Closes #29380
2018-05-30 09:48:40 +02:00
Alan Woodward 67905c85a5
Rename index_prefix to index_prefixes (#30932)
This commit also adds index_prefixes tests to TextFieldMapperTests to ensure that cloning and wire-serialization work correctly
2018-05-30 08:32:31 +01:00
Martijn van Groningen 544822c78b
Moved keyword tokenizer to analysis-common module (#30642)
Relates to #23658
2018-05-29 19:22:28 +02:00
Vladimir Dolzhenko b55b079a90
Include size of snapshot in snapshot metadata #18543, bwc clean up (#30890) 2018-05-26 21:20:44 +02:00
Vladimir Dolzhenko 81eb8ba0f0
Include size of snapshot in snapshot metadata (#29602)
Include size of snapshot in snapshot metadata

Adds difference of number of files (and file sizes) between prev and current snapshot. Total number/size reflects total number/size of files in snapshot.

Closes #18543
2018-05-25 21:04:50 +02:00
Tom Callahan 36fbb4cb48
Harmonize include_defaults tests (#30700)
This PR breaks the include_defaults functionality of the get settings API into its own
test, which is skipped for mixed-mode clusters containing pre-6.4 nodes.
2018-05-25 09:41:16 -04:00
David Roberts 40534ccabc [TEST] Mute {p0=snapshot.get_repository/10_basic/Verify created repository} YAML test
Issue is #30807
2018-05-25 12:58:02 +01:00
Michael Basnight e1ffbeb824
Fix bad version check writing Repository nodes (#30846)
The writeTo method of VerifyRepositoryResponse incorrectly used its
local version to determine what it was receiving, rather than the
sender's version. This fixes a bug that ocassionally happened when a 6.4
master node sent data to a 7.0 client, causing the number of bytes to be
improperly read. This also unmutes the test.

Closes #30807
2018-05-24 19:21:57 -05:00
Julie Tibshirani f55b09bae4 Update the version checks around ip_range bucket keys, now that the change was backported. 2018-05-24 12:04:18 -07:00
Julie Tibshirani 638a719370
Ensure that ip_range aggregations always return bucket keys. (#30701) 2018-05-24 08:55:14 -07:00
Adrien Grand 405eb7a751 Change serialization version of doc-value fields.
Relates #29639
2018-05-23 18:34:05 +02:00
Adrien Grand a19df4ab3b
Add a `format` option to `docvalue_fields`. (#29639)
This commit adds the ability to configure how a docvalue field should be
formatted, so that it would be possible eg. to return a date field
formatted as the number of milliseconds since Epoch.

Closes #27740
2018-05-23 14:39:04 +02:00
Colin Goodheart-Smithe 483b25330b
Mustes {p0=snapshot.get_repository/10_basic/*} YAML test
This is awaiting a fix for
https://github.com/elastic/elasticsearch/issues/30807
2018-05-23 11:32:58 +01:00
olcbean af8ad8d172 Add more yaml tests for get alias API (#29513) 2018-05-22 11:48:28 +02:00
Jason Tedor d68c44b76c
Default copy settings to true and deprecate on the REST layer (#30598)
This commit defaults the copy_settings REST parameter to the shrink and
split APIs to true, and deprecates the parameter.
2018-05-18 10:12:08 -04:00
Zachary Tong d120fb222c [TEST] Adjust version skips for movavg/movfn tests
Since the MovFn PR was backported to 6.x, we can adjust
the version skip numbers in master to correctly
match 6.3.99 instead of 6.4.0
2018-05-17 18:07:52 +00:00
Mayya Sharipova 3dfa93ef7c
Improve explanation in rescore (#30629)
Currently in a rescore request if window_size is smaller than
the top N documents returned (N=size), explanation of scores could be incorrect
for documents that were a part of topN and not part of rescoring.
This PR corrects this, but saving in RescoreContext docIDs of documents
for which rescoring was applied, and adding rescoring explanation
only for these docIDs.

Closes #28725
2018-05-17 07:09:18 -04:00
Tanguy Leroux 2ac1f9fe89
Fix _cluster/state to always return cluster_uuid (#30656)
Since #30143, the Cluster State API should always returns the current
cluster_uuid in the response body, regardless of the metrics filters.

This is not exactly true as it is returned only if metadata metrics and
no specific indices are requested.

This commit fixes the behavior to always return the cluster_uuid and
add new test.
2018-05-17 10:58:25 +02:00
Zachary Tong df853c49c0
Add a MovingFunction pipeline aggregation, deprecate MovingAvg agg (#29594)
This pipeline aggregation gives the user the ability to script functions that "move" across a window
of data, instead of single data points.  It is the scripted version of MovingAvg pipeline agg.

Through custom script contexts, we expose a number of convenience methods:

 - MovingFunctions.max()
 - MovingFunctions.min()
 - MovingFunctions.sum()
 - MovingFunctions.unweightedAvg()
 - MovingFunctions.linearWeightedAvg()
 - MovingFunctions.ewma()
 - MovingFunctions.holt()
 - MovingFunctions.holtWinters()
 - MovingFunctions.stdDev()

The user can also define any arbitrary logic via their own scripting, or combine with the above methods.
2018-05-16 10:57:00 -04:00
Simon Willnauer b50cf3c6b0
Side-step pending deletes check (#30571)
When we split/shrink an index we open several IndexWriter instances
causeing file-deletes to be pending on windows. This subsequently fails
when we open an IW to bootstrap the index history due to pending deletes.
This change sidesteps the check since we know our history goes forward
in terms of files and segments.

Closes #30416
2018-05-15 11:51:54 +02:00
Jason Tedor 4e33443690
Adjust versions for resize copy settings (#30578)
Now that the change to deprecate copy settings and disallow it being
explicitly set to false is backported, this commit adjusts the BWC
versions in master.
2018-05-14 16:41:25 -04:00
Jason Tedor 4a4e3d70d5
Default to one shard (#30539)
This commit changes the default out-of-the-box configuration for the
number of shards from five to one. We think this will help address a
common problem of oversharding. For users with time-based indices that
need a different default, this can be managed with index templates. For
users with non-time-based indices that find they need to re-shard with
the split API in place they no longer need to resort only to
reindexing.

Since this has the impact of changing the default number of shards used
in REST tests, we want to ensure that we still have coverage for issues
that could arise from multiple shards. As such, we randomize (rarely)
the default number of shards in REST tests to two. This is managed via a
global index template. However, some tests check the templates that are
in the cluster state during the test. Since this template is randomly
there, we need a way for tests to skip adding the template used to set
the number of shards to two. For this we add the default_shards feature
skip. To avoid having to write our docs in a complicated way because
sometimes they might be behind one shard, and sometimes they might be
behind two shards we apply the default_shards feature skip to all docs
tests. That is, these tests will always run with the default number of
shards (one).
2018-05-14 12:22:35 -04:00
Martijn van Groningen 7b95470897
Moved tokenizers to analysis common module (#30538)
The following tokenizers were moved: classic, edge_ngram,
letter, lowercase, ngram, path_hierarchy, pattern, thai, uax_url_email and
whitespace.

Left keyword tokenizer factory in server module, because
normalizers directly depend on it.This should be addressed on a
follow up change.

Relates to #23658
2018-05-14 07:55:01 +02:00
Jason Tedor 593fdd40ed
Deprecate not copy settings and explicitly disallow (#30404)
We want copying settings to be the default behavior. This commit
deprecates not copying settings, and disallows explicitly not copying
settings. This gives users a transition path to the future default
behavior.
2018-05-13 10:30:05 -04:00
Tal Levy 34f92df2d3 AwaitsFix IntegTestZipClientYamlTestSuiteIT#indices.split tests
there are two tests that have failed multiple times in one day on windows CI.

This commit AwaitsFixes them until their timeout issues are resolved.

tracking here: https://github.com/elastic/elasticsearch/issues/30503
2018-05-09 18:25:25 -07:00
Boaz Leskes 9f5fe49cec Disable REST default settings testing until #29229 is back-ported
That PR changed the execution path of index settings default to be on the master
until the PR is back-ported the old master will not return default settings.
2018-05-07 13:30:14 +02:00
tomcallahan 0a93956194
Add Get Settings API support to java high-level rest client (#29229)
This PR adds support for the Get Settings API to the java high-level rest client.
Furthermore, logic related to the retrieval of default settings has been moved from the rest layer into the transport layer and now default settings may be retrieved consistency via both the rest API and the transport API.
2018-05-04 11:14:28 -04:00
Adrien Grand bcdf3d5c61 Post backport of #29658. 2018-05-02 11:43:50 +02:00
Adrien Grand 231a63fdf8
Remove useless version checks in REST tests. (#30165)
Many tests are added with a version check so that they do not run against a
version that doesn't have the feature yet. Master is 7.0, so all tests that
do not run against 6.0+ can be removed and the version check can be removed
on all tests that always run on 6.0+.
2018-05-02 11:34:15 +02:00
Adrien Grand 7358946bda
Add a new `_ignored` meta field. (#29658)
This adds a new `_ignored` meta field which indexes and stores fields that have
been ignored at index time because of the `ignore_malformed` option. It makes
malformed documents easier to identify by using `exists` or `term(s)` queries
on the `_ignored` field.

Closes #29494
2018-05-02 10:47:02 +02:00
Jason Tedor 5de6f4ff7b Adjust copy settings on resize BWC version
This commit adjusts the BWC version for copy settings on resize
operations after the behavior was backported to 6.x.
2018-05-01 08:49:16 -04:00
Jason Tedor 50535423ff
Allow copying source settings on resize operation (#30255)
Today when an index is created from shrinking or splitting an existing
index, the target index inherits almost none of the source index
settings. This is surprising and a hassle for operators managing such
indices. Given this is the default behavior, we can not simply change
it. Instead, we start by introducing the ability to copy settings. This
flag can be set on the REST API or on the transport layer and it has the
behavior that it copies all settings from the source except non-copyable
settings (a property of a setting introduced in this
change). Additionally, settings on the request will always override.

This change is the first step in our adventure:
 - this flag is added here in 7.0.0 and immediately deprecated
 - this flag will be backported to 6.4.0 and remain deprecated
 - then, we will remove the ability to set this flag to false in 7.0.0
 - finally, in 8.0.0 we will remove this flag and the only behavior will
   be for settings to be copied
2018-05-01 08:48:19 -04:00
Chris Earle 421bd9bd7a
_cluster/state Skip Test for pre-6.4, not pre-7.0 (#30264)
This updates the skip section for the new `_cluster/state` responses to
include 6.4+ now that it has been backported.
2018-04-30 14:53:48 -04:00
Chris Earle 725a5af2c6
_cluster/state should always return cluster_uuid (#30143)
Currently, the only way to get the REST response for the `/_cluster/state`
call to return the `cluster_uuid` is to request the `metadata` metrics,
which is one of the most expensive response structures. However, external
monitoring agents will likely want the `cluster_uuid` to correlate the
response with other API responses whether or not they want cluster
metadata.
2018-04-30 10:16:11 -04:00
Julie Tibshirani f5978d6d33
In the field capabilities API, remove support for providing fields in the request body. (#30185) 2018-04-27 16:14:11 -07:00
Alexander Reelsen e1a16a6018
REST: Remove GET support for clear cache indices (#29525)
Clearing the cache indices can be done via GET and POST. As GET should
only support read only operations, this removes the support for using
GET for clearing the indices caches.
2018-04-27 08:41:36 +02:00
Jason Tedor c12c2a6cc9 Rename the bulk thread pool to write thread pool (#29593)
This commit renames the bulk thread pool to the write thread pool. This
is to better reflect the fact that the underlying thread pool is used to
execute any document write request (single-document index/delete/update
requests, and bulk requests).

With this change, we add support for fallback settings
thread_pool.bulk.* which will be supported until 7.0.0.

We also add a system property so that the display name of the thread
pool remains as "bulk" if needed to avoid breaking users.
2018-04-19 08:18:58 -04:00
Martijn van Groningen 8afa7c174f
Added painless execute api. (#29164)
Added an api that allows to execute an arbitrary script and a result to be returned.

```
POST /_scripts/painless/_execute
{
  "script": {
    "source": "params.var1 / params.var2",
    "params": {
      "var1": 1,
      "var2": 1
    }
  }
}
```

Relates to #27875
2018-04-19 09:33:34 +02:00
Jason Tedor 2b47d67d95
Remove the index thread pool (#29556)
Now that single-document indexing requests are executed on the bulk
thread pool the index thread pool is no longer needed. This commit
removes this thread pool from Elasticsearch.
2018-04-18 09:18:08 -04:00
Adrien Grand d223bcf7ab
Add the `include_type_name` option to the search and document APIs. (#29506)
This commit add the `include_type_name` option to the `index`, `update`,
`delete`, `get`, `bulk` and `search` APIs. When set to `false`, the response
will omit the `_type` in the response. This option doesn't work if the endpoint
contains a type. For instance, the following call would succeed:

```
GET index/_doc/1?include_type_name=false
```

But the following one would fail:

```
GET index/some_type/1?include_type_name=false
```

Relates #15613
2018-04-17 11:29:08 +02:00
olcbean b3e3b80f1b REST high-level client: add support for Indices Update Settings API [take 2] (#29327)
Relates to #27205
2018-04-16 21:39:11 +02:00
Ke Li 0bfb59dcf2 Using ObjectParser in UpdateRequest (#29293)
CRUD: Parsing changes for UpdateRequest (#29293)

Use `ObjectParser` to parse `UpdateRequest` so we reject unknown fields
and drop support for the `_fields` parameter because it was deprecated
in 5.x.
2018-04-16 08:39:35 -04:00