Commit Graph

4384 Commits

Author SHA1 Message Date
Adrien Grand 1af6d20efe Fix docs build. 2018-06-05 14:55:40 +02:00
Adrien Grand 984523dda9
Clarify docs about boolean operator precedence. (#30808)
Unfortunately, the classic queryparser does not honor the usual precedence
rules of boolean operators. See
https://issues.apache.org/jira/browse/LUCENE-3674.
2018-06-05 08:59:17 +02:00
Adrien Grand 21fe6159d4
Docs: remove notes on sparsity. (#30905)
Sparsity is less of a concern since 6.0.

Closes #30833
2018-06-05 08:58:52 +02:00
Adrien Grand 500094f5c8
Improve documentation of dynamic mappings. (#30952)
Closes #30939
2018-06-05 08:51:52 +02:00
Adrien Grand f5073813ef
Docs: Clarify constraints on scripted similarities. (#31076)
Scripted similarities provide a lot of flexibility but they still need to obey
some rules to not confuse Lucene.
2018-06-05 08:51:00 +02:00
Dmitri Moore c7c0acc2c7 Update get.asciidoc (#31084)
Fix typo

(cherry picked from commit 51fbf8a3d3486fcf8a66dd6a6fe0940482b0f2d8)
2018-06-04 15:44:00 -07:00
Jim Ferenczi f94a75778c
Fix index prefixes to work with span_multi (#31066)
* Fix index prefixes to work with span_multi

Text fields that use `index_prefixes` can rewrite `prefix` queries into
`term` queries internally. This commit fix the handling of this rewriting
in the `span_multi` query.
This change also copies the index options of the text field into the
prefix field in order to be able to run positional queries. This is mandatory
for `span_multi` to work but this could also be useful to optimize `match_phrase_prefix`
queries in a follow up. Note that this change can only be done on indices created
after 6.3 since we set the index options to doc only in this version.

Fixes #31056
2018-06-04 21:48:56 +02:00
Alan Woodward 852df128a5
Match phrase queries against non-indexed fields should throw an exception (#31060)
When `lenient=false`, attempts to create match phrase queries with custom analyzers against non-text fields will throw an IllegalArgumentException.

Also changes `*Match*QueryBuilderTests` so that it avoids this scenario

Fixes #31061
2018-06-04 19:12:45 +01:00
Itamar Syn-Hershko 526f906927 Remove usage of explicit type in docs (#29667) 2018-06-04 09:42:29 -04:00
Jim Ferenczi fa6b7266eb Remove wrong link in index phrases doc
Relates #30450
2018-06-04 12:13:55 +02:00
Colin Goodheart-Smithe 360b09f148
[DOCS] Fixes accounting setting names (#30863)
The documentation for the account circuit breaker listed the settings for it's limit and overhead to be `network.breaker.accounting.limit` and `network.breaker.accounting.overhead` when in `HieratchyCircuitBreakerService` it seems the settings are actually `indices.breaker.accounting.limit` and `indices.breaker.accounting.overhead`.
2018-06-04 09:20:54 +01:00
Colin Goodheart-Smithe 1efb1aae28
[DOCS] Rewords _field_names documentation (#31029)
* [DOCS] Rewords _field_names documentation

Corrects the language around when we write to `_field_names` and when you might want to disable it given that n recent versions it does not carry the indexing overhead it once did.

Relates to #30862

* Update wording following review
2018-06-04 09:17:11 +01:00
Alan Woodward 0427339ab0
Index phrases (#30450)
Specifying `index_phrases: true` on a text field mapping will add a subsidiary
[field]._index_phrase field, indexing two-term shingles from the parent field.
The parent analysis chain is re-used, wrapped with a FixedShingleFilter.

At query time, if a phrase match query is executed, the mapping will redirect it
to run against the subsidiary field.

This should trade faster phrase querying for a larger index and longer indexing
times.

Relates to #27049
2018-06-04 08:50:35 +01:00
Igor Motov 7376c35960
[DOCS] Make geoshape docs less memory hungry (#31014)
Reduces shape size and precision in geo shape mapper examples to reduce
amount of memory required to check docs.

Fixes #23836
2018-06-01 15:05:37 -04:00
lipsill a8c643e833 [Docs] Fix a typo in Create Index naming limitation (#30891) 2018-06-01 18:15:24 +02:00
Jim Ferenczi 0791f93dbd
Add an option to split keyword field on whitespace at query time (#30691)
This change adds an option named `split_queries_on_whitespace` to the `keyword`
field type. When set to true full text queries (`match`, `multi_match`, `query_string`, ...) that target the field will split the input on whitespace to build the query terms. Defaults to `false`.
Closes #30393
2018-06-01 09:47:03 +02:00
Jim Ferenczi 0f5e570184
Deprecates indexing and querying a context completion field without context (#30712)
This change deprecates completion queries and documents without context that target a
context enabled completion field. Querying without context degrades the search
performance considerably (even when the number of indexed contexts is low).
This commit targets master but the deprecation will take place in 6.x and the functionality
will be removed in 7 in a follow up.

Closes #29222
2018-05-31 16:09:48 +02:00
Christoph Büscher 4777d8a2df
[Docs] Fix typo in Min Aggregation reference (#30899) 2018-05-31 15:05:03 +02:00
Christoph Büscher 1ea9f11b03
Change ScriptException status to 400 (bad request) (#30861)
Currently failures to compile a script usually lead to a ScriptException, which
inherits the 500 INTERNAL_SERVER_ERROR from ElasticsearchException if it does
not contain another root cause. Instead, this should be a 400 Bad Request error.
This PR changes this more generally for script compilation errors by changing 
ScriptException to return 400 (bad request) as status code.

Closes #12315
2018-05-30 14:00:07 +02:00
Jim Ferenczi f582418ada Fix missing option serialization after backport
Relates #29465
2018-05-30 12:55:31 +02:00
Jim Ferenczi e33d107f84
Add missing_bucket option in the composite agg (#29465)
This change adds a new option to the composite aggregation named `missing_bucket`.
This option can be set by source and dictates whether documents without a value for the
source should be ignored. When set to true, documents without a value for a field emits
an explicit `null` value which is then added in the composite bucket.
The `missing` option that allows to set an explicit value (instead of `null`) is deprecated in this change and will be removed in a follow up (only in 7.x).
This commit also changes how the big arrays are allocated, instead of reserving
the provided `size` for all sources they are created with a small intial size and they grow
depending on the number of buckets created by the aggregation:
Closes #29380
2018-05-30 09:48:40 +02:00
Alan Woodward 67905c85a5
Rename index_prefix to index_prefixes (#30932)
This commit also adds index_prefixes tests to TextFieldMapperTests to ensure that cloning and wire-serialization work correctly
2018-05-30 08:32:31 +01:00
Lisa Cawley 6ce86a8d7f
[DOCS] Reset edit links (#30909) 2018-05-29 08:16:53 -07:00
David Turner 89869a2d0d
Improve allocation-disabling instructions (#30248)
Clarify the “one minute” in the instructions to disable the shard allocation
when doing maintenance to say that it is configurable.
2018-05-29 08:34:20 +01:00
Vladimir Dolzhenko b55b079a90
Include size of snapshot in snapshot metadata #18543, bwc clean up (#30890) 2018-05-26 21:20:44 +02:00
Vladimir Dolzhenko 81eb8ba0f0
Include size of snapshot in snapshot metadata (#29602)
Include size of snapshot in snapshot metadata

Adds difference of number of files (and file sizes) between prev and current snapshot. Total number/size reflects total number/size of files in snapshot.

Closes #18543
2018-05-25 21:04:50 +02:00
Zachary Tong 6909a05f3d
[DOCS] Document index name limitations (#30826)
Also tidy up the docs a bit, there's no yaml example anymore, etc
2018-05-25 10:21:09 -04:00
Peter Dyson adc2d408d3 [Docs] Add reindex.remote.whitelist example (#30828) 2018-05-25 11:17:55 +02:00
Igor Motov cf0e0606af
Use geohash cell instead of just a corner in geo_bounding_box (#30698)
Treats geohashes as grid cells instead of just points when the
geohashes are used to specify the edges in the geo_bounding_box
query. For example, if a geohash is used to specify the top_left
corner, the top left corner of the geohash cell will be used as the
corner of the bounding box.

Closes #25154
2018-05-24 14:46:15 -04:00
Julie Tibshirani 638a719370
Ensure that ip_range aggregations always return bucket keys. (#30701) 2018-05-24 08:55:14 -07:00
Christoph Büscher 3f78b3f5e1
[Docs] Explain incomplete dates in range queries (#30689)
The current documentation isn't very clear about how incomplete dates are
treated when specifying custom formats in a `range` query. This change adds a
note explaining how missing month or year coordinates translate to dates that
have the missings slots filled with unix time start date (1970-01-01)

Closes #30634
2018-05-24 11:20:00 +02:00
Tim Brooks d7040ad7b4
Reintroduce mandatory http pipelining support (#30820)
This commit reintroduces 31251c9 and 63a5799. These commits introduced a
memory leak and were reverted. This commit brings those commits back
and fixes the memory leak by removing unnecessary retain method calls.
2018-05-23 14:38:52 -06:00
Igor Motov 4b6915976c
Add support for indexed shape routing in geo_shape query (#30760)
Adds ability to specify the routing value for the indexed shape in the
geo_shape query.

Closes #7663
2018-05-23 15:15:19 -04:00
Colin Goodheart-Smithe 4fd0a3e492 Revert "Make http pipelining support mandatory (#30695)" (#30813)
This reverts commit 31251c9 introduced in #30695.

We suspect this commit is causing the OOME's reported in #30811 and we will use this PR to test this assertion.
2018-05-23 10:54:46 -06:00
Adrien Grand a19df4ab3b
Add a `format` option to `docvalue_fields`. (#29639)
This commit adds the ability to configure how a docvalue field should be
formatted, so that it would be possible eg. to return a date field
formatted as the number of milliseconds since Epoch.

Closes #27740
2018-05-23 14:39:04 +02:00
Adrien Grand 886db84ad2
Expose Lucene's FeatureField. (#30618)
Lucene has a new `FeatureField` which gives the ability to record numeric
features as term frequencies. Its main benefit is that it allows to boost
queries with the values of these features and efficiently skip non-competitive
documents at the same time using block-max WAND and indexed impacts.
2018-05-23 08:55:21 +02:00
Fernando Medina Corey 739bb4f0ec Fix a grammatical error in the 'search types' documentation.
Simple grammatical fix.
2018-05-22 22:09:04 -07:00
Christoph Büscher f7b5986682
[Docs] Fix script-fields snippet execution (#30693)
Currently the first snippet in the documentation test in script-fields.asciidoc
isn't executed, although it has the CONSOLE annotation. Adding a test setup
annotation to it seems to fix the problem.
2018-05-22 20:22:42 +02:00
Tim Brooks 31251c9a6d
Make http pipelining support mandatory (#30695)
This is related to #29500 and #28898. This commit removes the abilitiy
to disable http pipelining. After this commit, any elasticsearch node
will support pipelined requests from a client. Additionally, it extracts
some of the http pipelining work to the server module. This extracted
work is used to implement pipelining for the nio plugin.
2018-05-22 09:29:31 -06:00
Lee Jones 37f67d9e21 [Docs] Fix typo in circuit breaker docs (#29659)
The previous description had a part that didn't fit and was probably
from a copy/paste of the in flight requests description above.
2018-05-22 16:43:45 +02:00
Itamar Syn-Hershko 5f172b6795 [Feature] Adding a char_group tokenizer (#24186)
=== Char Group Tokenizer

The `char_group` tokenizer breaks text into terms whenever it encounters
a
character which is in a defined set. It is mostly useful for cases where
a simple
custom tokenization is desired, and the overhead of use of the
<<analysis-pattern-tokenizer, `pattern` tokenizer>>
is not acceptable.

=== Configuration

The `char_group` tokenizer accepts one parameter:

`tokenize_on_chars`::
    A string containing a list of characters to tokenize the string on.
Whenever a character
    from this list is encountered, a new token is started. Also supports
escaped values like `\\n` and `\\f`,
    and in addition `\\s` to represent whitespace, `\\d` to represent
digits and `\\w` to represent letters.
    Defaults to an empty list.

=== Example output

```The 2 QUICK Brown-Foxes jumped over the lazy dog's bone for $2```

When the configuration `\\s-:<>` is used for `tokenize_on_chars`, the
above sentence would produce the following terms:

```[ The, 2, QUICK, Brown, Foxes, jumped, over, the, lazy, dog's, bone,
for, $2 ]```
2018-05-22 16:26:31 +02:00
Tanguy Leroux 74474e99d6 [Docs] Fix broken cross link in documentation 2018-05-22 16:03:33 +02:00
Jim Ferenczi bdb79d021a
Fix docs failure on language analyzers (#30722)
This commit fixes docs failure on language analyzers when compared to the built in analyzers.
The `elision` filters used by the rebuilt language analyzers should be case insensitive to match
the definition of the prebuilt analyzers.

Closes #30557
2018-05-22 09:58:12 +02:00
Tanguy Leroux c351b51ac4
[Docs] Fix inconsistencies in snapshot/restore doc (#30480)
Closes #30444
2018-05-22 09:19:07 +02:00
Adam Chalkley 7cc38ab45a Fix default shards count in create index docs (#30747)
Update the default number of primary shards to match doc update work
done in #30539.
2018-05-20 14:59:28 -04:00
Ryan Ernst 34180f2285
Scripting: Remove getDate methods from ScriptDocValues (#30690)
The getDate() and getDates() existed prior to 5.x on long fields in
scripting. In 5.x, a new Date type for ScriptDocValues was added. The
getDate() and getDates() methods were left on long fields and added to date
fields to ease the transition. This commit removes those methods for
7.0.
2018-05-18 21:26:26 -07:00
Lisa Cawley 6846d2c94a
[DOCS] Removes redundant index.asciidoc files (#30707) 2018-05-18 11:05:40 -07:00
Lisa Cawley e750462e0c
[DOCS] Moves X-Pack configurationg pages in table of contents (#30702) 2018-05-18 10:26:03 -07:00
Jason Tedor d68c44b76c
Default copy settings to true and deprecate on the REST layer (#30598)
This commit defaults the copy_settings REST parameter to the shrink and
split APIs to true, and deprecates the parameter.
2018-05-18 10:12:08 -04:00
lcawl 663295d635 [DOCS] Replace X-Pack terms with attributes 2018-05-17 09:57:11 -07:00