3841 Commits

Author SHA1 Message Date
Jim Ferenczi
5e8b569255 fix highlighting docs 2017-06-09 14:42:08 +02:00
Jim Ferenczi
8250aa4267 Remove the postings highlighter and make unified the default highlighter choice (#25028)
This change removes the `postings` highlighter. This highlighter has been removed from Lucene master (7.x) because it behaves
exactly like the `unified` highlighter when index_options is set to `offsets`:
https://issues.apache.org/jira/browse/LUCENE-7815

It also makes the `unified` highlighter the default choice for highlighting a field (if `type` is not provided).
The strategy used internally by this highlighter remains the same as before: it checks `term_vectors` first, then `postings`, and ultimately re-analyzes the text.
This change also rewrites the docs so that the options that the `unified` highlighter cannot handle are clearly marked as such.
There are a few features that the `unified` highlighter is not able to handle, which is why the other highlighters (`plain` and `fvh`) are still available.
I'll open separate issues for these features and we'll deprecate the `fvh` and `plain` highlighters when full support for these features has been added to the `unified` highlighter.
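As an illustration only, the fallback order described in this message could be sketched as below. This is not the Elasticsearch implementation; `OffsetSource` and `choose_offset_source` are hypothetical names, and the check on `index_options` is an assumption drawn from the message text.

```python
from enum import Enum


class OffsetSource(Enum):
    """Hypothetical labels for where highlight offsets can come from."""
    TERM_VECTORS = "term_vectors"
    POSTINGS = "postings"
    ANALYSIS = "analysis"


def choose_offset_source(has_term_vector_offsets: bool, index_options: str) -> OffsetSource:
    """Pick an offset source in the order the commit message describes."""
    if has_term_vector_offsets:
        return OffsetSource.TERM_VECTORS
    if index_options == "offsets":
        return OffsetSource.POSTINGS
    # Nothing pre-computed is available: fall back to re-analyzing the text.
    return OffsetSource.ANALYSIS
```

The point of the ordering is that re-analysis is only the last resort, used when neither term vectors nor postings offsets are available.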
2017-06-09 14:09:57 +02:00
Pandiyan Murugan
34c3d1d5bf Fix typo in shards.asciidoc (#25143) 2017-06-09 12:45:43 +02:00
Andrey Groshev
e4fd8485ce Made the same length of opening and closing lines (#23583) 2017-06-09 00:50:43 -07:00
Jim Ferenczi
ad905924ae update docs that claim that classic is the default similarity 2017-06-09 09:22:48 +02:00
Deb Adair
dbe2de0891 [DOCS] Fixed callout reference error. 2017-06-08 16:47:13 -07:00
Tal Levy
a771912a22 Add Ingest-Processor specific Rest Endpoints & Add Grok endpoint (#25059)
This PR enables Ingest plugins to leverage processor-scoped REST
endpoints. The first of these is the Grok endpoint, which lets users
retrieve all the built-in Grok patterns.
Example usage: Kibana Grok Autocomplete!
2017-06-08 15:24:35 -07:00
Guillaume Le Floch
3f6d80aa66 Allow removing multiple fields in ingest processor (#24750)
* Allow removing multiple fields in ingest processor

* Iteration 2

* Few fixes
2017-06-08 13:17:44 -07:00
Jim Ferenczi
36a5cf8f35 Automatically early terminate search query based on index sorting (#24864)
This commit refactors the query phase in order to be able
to automatically detect queries that can be early terminated.
If the index sort matches the query sort, the top docs collection is early terminated
on each segment and the computing of the total number of hits that match the query is delegated to a simple TotalHitCountCollector.
This change also adds a new parameter to the search request called `track_total_hits`.
It indicates if the total number of hits that match the query should be tracked.
If false, queries sorted by the index sort will not try to compute this information
and will limit the collection to the first N documents per segment.
Aggregations are not impacted and will continue to see every document
even when the index sort matches the query sort and `track_total_hits` is false.
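A rough sketch of the idea, under the assumption that every segment is already stored in the query's sort order (ascending integers stand in for sort values here). `collect_sorted_segments` is a hypothetical helper, not an Elasticsearch or Lucene API.

```python
from typing import List, Optional, Tuple


def collect_sorted_segments(
    segments: List[List[int]], n: int, track_total_hits: bool
) -> Tuple[List[int], Optional[int]]:
    """Return the global top-n values and, if requested, the total hit count.

    Each segment is assumed to be sorted in the query's sort order, which is
    what allows collection to stop after n documents per segment.
    """
    candidates: List[int] = []
    for segment in segments:
        # Early termination: only the first n docs of a sorted segment can
        # appear in the global top n, so the rest are never visited.
        candidates.extend(segment[:n])
    top = sorted(candidates)[:n]
    # Hit counting is handled separately from top-docs collection and is
    # skipped entirely when the caller does not ask for it.
    total = sum(len(s) for s in segments) if track_total_hits else None
    return top, total
```

With `track_total_hits` set to false the function returns `None` for the count, mirroring the message's point that the total is simply not computed in that case.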

Relates #6720
2017-06-08 12:10:46 +02:00
Christian Hofstaedtler
c3ec6a1714 Honor masking of systemd-sysctl.service
During package install on systemd-based systems, some sysctl settings
should be set (e.g. vm.max_map_count).

In some environments, changing sysctl settings plainly does not work;
previously a global environment variable named
ES_SKIP_SET_KERNEL_PARAMETERS was introduced to skip calling sysctl, but
this causes trouble for:
 - configuration management systems, which usually cannot apply an env
   var when running a package manager
 - package upgrades, which will not have the env var set any more, thus
   leaving the package management system in a bad state (possibly
   half-way upgraded, which can be very hard to recover from)

This removes the env var again and instead of calling systemd-sysctl
manually, tells systemd to restart the wrapper unit - which itself can
be masked by system administrators or management tools if it is known
that sysctl does not work in a given environment.

The restart is not silent on systems in their default configuration, but
is ignored if the unit is masked.

Relates #24234
2017-06-06 10:44:41 -04:00
Yibin Lin
fbf2e3d574 Tiny correction in inner-hits.asciidoc (#25066) 2017-06-06 13:26:37 +02:00
Clinton Gormley
8b9c201224 Added release notes for 6.0.0-alpha2 2017-06-06 11:52:18 +02:00
olcbean
0d5f3958e7 Expand index expressions against indices only when managing aliases (#23997)
The index parameter in the update-aliases, put-alias, and delete-alias APIs no longer accepts alias names. Instead, it accepts only index names (or wildcards which will expand to matching indices).

Closes #23960
2017-06-06 11:01:38 +02:00
Tal Levy
e51246023a add exclude_keys option to KeyValueProcessor (#24876)
and modify data-structure of `include_keys` and `exclude_keys` to be
backed by a HashSet
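For illustration, a minimal sketch of key filtering backed by sets. `parse_key_values` is a hypothetical function, and the rule that `exclude_keys` is applied after `include_keys` is an assumption, not something this message states.

```python
from typing import Dict, Iterable, Optional


def parse_key_values(
    text: str,
    field_split: str = " ",
    value_split: str = "=",
    include_keys: Optional[Iterable[str]] = None,
    exclude_keys: Optional[Iterable[str]] = None,
) -> Dict[str, str]:
    """Split text into key/value pairs and filter the keys."""
    # Sets give constant-time membership checks, similar in spirit to the
    # HashSet mentioned in the commit message.
    include = set(include_keys) if include_keys is not None else None
    exclude = set(exclude_keys) if exclude_keys is not None else set()
    result: Dict[str, str] = {}
    for pair in text.split(field_split):
        key, sep, value = pair.partition(value_split)
        if not sep:
            continue  # no separator found, so this is not a key/value pair
        if include is not None and key not in include:
            continue
        if key in exclude:
            continue
        result[key] = value
    return result
```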
2017-06-05 14:12:48 -07:00
Lee Hinman
a32d1b91fa Remove comma-separated feature parsing for GetIndicesAction
This removes the parsing of things like `GET /idx/_aliases,_mappings`. Instead,
a user must choose between retrieving all index metadata with `GET /idx`, or only
a specific form such as `GET /idx/_settings`.

Relates to (and is a prerequisite of) #24437
2017-06-02 14:43:38 -06:00
Colin Goodheart-Smithe
5e7a79636d [DOCS] Clarify behaviour of scripted-metric arg with empty parent buckets 2017-06-02 11:00:27 +01:00
Luca Cavanna
018c6c38fe [DOCS] Clarify connections and gateway nodes selection in cross cluster search docs (#24859)
Closes #24836
2017-06-02 11:13:47 +02:00
olcbean
6dea5f14c3 Java api: Remove unneeded getTookInMillis method (#23923)
Some response classes in the java api expose both `getTook()` which returns a `TimeValue` and `getTookInMillis` which returns a `long` value. `getTook()` is enough as one can do `getTook().millis()` to obtain the same result as `getTookInMillis()`, which can be removed.
2017-06-02 11:11:05 +02:00
Colin Goodheart-Smithe
779fb9a1c0 Adds nodes usage API to monitor usages of actions (#24169)
* Adds nodes usage API to monitor usages of actions

The nodes usage API has 2 main endpoints

/_nodes/usage and /_nodes/{nodeIds}/usage return the usage statistics
for all nodes and the specified node(s) respectively.

At the moment only one type of usage statistics is available: REST
action usage. This records the number of times each REST action class is
called. When the nodes usage API is called, it returns a map from REST
action class name to a long representing the number of times that action
class has been called.
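A simplified sketch of such a counter follows. The `UsageService` class here is a hypothetical stand-in that counts centrally; it is not the actual class from this change.

```python
from collections import Counter
from typing import Dict


class UsageService:
    """Counts how many times each named REST action has been invoked."""

    def __init__(self) -> None:
        self._counts: Counter = Counter()

    def record(self, action_name: str) -> None:
        """Register one invocation of the given action."""
        self._counts[action_name] += 1

    def rest_action_usage(self) -> Dict[str, int]:
        """Return a map of action name to invocation count."""
        # Return a copy so callers cannot mutate the live counters.
        return dict(self._counts)
```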

Still to do:

* [x] Create usage service to store usage statistics
* [x] Record usage in REST layer
* [x] Add Transport Actions
* [x] Add REST Actions
* [x] Tests
* [x] Documentation

* Refactors UsageService so counts are done by the handlers

* Fixing up docs tests

* Adds a name to all rest actions

* Addresses review comments
2017-06-02 08:46:38 +01:00
Tanguy Leroux
528bd25fa7 Add superset size to Significant Term REST response (#24865)
This commit adds a new bg_count field to the REST response of
SignificantTerms aggregations. Similarly to the bg_count that already
exists in significant terms buckets, this new bg_count field is set at
the aggregation level and is populated with the superset size value.
2017-06-02 09:45:15 +02:00
Adrien Grand
bbdf50f6bd Docs: More search speed advice. (#24802) 2017-06-01 17:23:22 +02:00
Adrien Grand
ebf806d38f Reorganize docs of global ordinals. (#24982)
Currently global ordinals are documented under `fielddata`. This moves them to
their own file since they also work with doc values and fielddata is on the way
out.

Closes #23101
2017-06-01 16:47:44 +02:00
Clinton Gormley
1b0c93b07c Documented the level parameter to nodes stats
Closes #24999
2017-06-01 12:11:21 +02:00
Sergey Novikov
a7b21534b1 Docs: Fix typo in docker docs (#24988)
`boostrap.memory_lock` -> `bootstrap.memory_lock`
2017-05-31 13:42:47 -04:00
David Cho-Lerat
491dc1186a Add missing word to terms-query.asciidoc (#24960) 2017-05-30 09:42:07 -04:00
David Cho-Lerat
c939bcb7f5 Correct some spelling in match-phrase-prefix docs (#24956) 2017-05-30 09:02:01 -04:00
Tanguy Leroux
28d97df67c Add document count to Matrix Stats aggregation response (#24776)
This commit adds a `doc_count` field to the response body of Matrix
Stats aggregation. It exposes the number of documents involved in
the computation of statistics, a value that can already be retrieved using
the method MatrixStats.getDocCount() in the Java API.
2017-05-30 09:39:41 +02:00
propulkit
25516868fe Correcting api name (#24924)
As per REST request signature for reroute, API has no underscore.
2017-05-29 13:58:31 +02:00
Clinton Gormley
0656d0236b Update context-suggest.asciidoc
Removed incorrect parameter
2017-05-26 17:41:40 +02:00
Matt Weber
601a61a91c Support Multiple Collapse Inner Hits
Support multiple named inner hits on a field collapsing
request.
2017-05-26 13:23:57 +02:00
Tal Levy
dfe2ecaa28 add docs example for Ingest scripts manipulating document metadata (#24875)
It may not be clear to users that the Ingest ScriptProcessor context object `ctx` can 
manipulate document metadata like `_index` and `_type`.
2017-05-25 07:45:19 -07:00
Brian Lesperance
959990728b Docs: Fix grammar in aliases doc (#24852) 2017-05-24 10:18:25 -04:00
markharwood
b7197f5e21 SignificantText aggregation - like significant_terms, but for text (#24432)
* SignificantText aggregation - like significant_terms but doesn’t require fielddata=true. Recommended for use with the `sampler` agg to limit the expense of tokenizing docs, and takes an optional `filter_duplicate_text`:true setting to avoid stats skew from repeated sections of text in search results.

Closes #23674
2017-05-24 13:46:43 +01:00
António Ribeiro
85a1b2b406 Fix link to perl docs (#24842)
* Fixes Elasticsearch issue #24606.

* Issue #24606 - Changed the link text to Search::Elasticsearch::Client::5_0::Bulk and
Search::Elasticsearch::Client::5_0::Scroll.
2017-05-24 11:43:54 +02:00
Nik Everett
13a86fec99 Add magic $_path stash key to docs tests (#24724)
Adds a "magic" key to the yaml testing stash mostly for use with
documentation tests. When unstashing an object, `$_path` is the
path into the current position in the object you are unstashing.
This means that in docs tests you can use
`// TESTRESPONSE[s/somevalue/$body.${_path}/]` to mean "replace
`somevalue` with whatever is the response in the same position."

Compare how you must carefully mock out all the numbers in the profile
response without this change:
```
// TESTRESPONSE[s/"id": "\[2aE02wS1R8q_QFnYu6vDVQ\]\[twitter\]\[1\]"/"id": $body.profile.shards.0.id/]
// TESTRESPONSE[s/"rewrite_time": 51443/"rewrite_time": $body.profile.shards.0.searches.0.rewrite_time/]
// TESTRESPONSE[s/"score": 51306/"score": $body.profile.shards.0.searches.0.query.0.breakdown.score/]
// TESTRESPONSE[s/"time_in_nanos": "1873811"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.time_in_nanos/]
// TESTRESPONSE[s/"build_scorer": 2935582/"build_scorer": $body.profile.shards.0.searches.0.query.0.breakdown.build_scorer/]
// TESTRESPONSE[s/"create_weight": 919297/"create_weight": $body.profile.shards.0.searches.0.query.0.breakdown.create_weight/]
// TESTRESPONSE[s/"next_doc": 53876/"next_doc": $body.profile.shards.0.searches.0.query.0.breakdown.next_doc/]
// TESTRESPONSE[s/"time_in_nanos": "391943"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.children.0.time_in_nanos/]
// TESTRESPONSE[s/"score": 28776/"score": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.score/]
// TESTRESPONSE[s/"build_scorer": 784451/"build_scorer": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.build_scorer/]
// TESTRESPONSE[s/"create_weight": 1669564/"create_weight": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.create_weight/]
// TESTRESPONSE[s/"next_doc": 10111/"next_doc": $body.profile.shards.0.searches.0.query.0.children.0.breakdown.next_doc/]
// TESTRESPONSE[s/"time_in_nanos": "210682"/"time_in_nanos": $body.profile.shards.0.searches.0.query.0.children.1.time_in_nanos/]
// TESTRESPONSE[s/"score": 4552/"score": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.score/]
// TESTRESPONSE[s/"build_scorer": 42602/"build_scorer": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.build_scorer/]
// TESTRESPONSE[s/"create_weight": 89323/"create_weight": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.create_weight/]
// TESTRESPONSE[s/"next_doc": 2852/"next_doc": $body.profile.shards.0.searches.0.query.0.children.1.breakdown.next_doc/]
// TESTRESPONSE[s/"time_in_nanos": "304311"/"time_in_nanos": $body.profile.shards.0.searches.0.collector.0.time_in_nanos/]
// TESTRESPONSE[s/"time_in_nanos": "32273"/"time_in_nanos": $body.profile.shards.0.searches.0.collector.0.children.0.time_in_nanos/]
```

To how you can cavalierly mock all the numbers at once with this change:
```
// TESTRESPONSE[s/(?<=[" ])\d+(\.\d+)?/$body.$_path/]
```
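To make the effect of `$_path` concrete, here is a small sketch that resolves the placeholder while walking an expected-response object. `resolve_path_placeholders` is a hypothetical helper and only approximates what the yaml test stash does.

```python
from typing import Any


def resolve_path_placeholders(expected: Any, path: str = "") -> Any:
    """Replace "$_path" in string values with the dotted path to that value."""
    if isinstance(expected, dict):
        return {
            key: resolve_path_placeholders(value, f"{path}.{key}" if path else str(key))
            for key, value in expected.items()
        }
    if isinstance(expected, list):
        return [
            resolve_path_placeholders(value, f"{path}.{i}" if path else str(i))
            for i, value in enumerate(expected)
        ]
    if isinstance(expected, str):
        # The placeholder expands to the current position in the object.
        return expected.replace("$_path", path)
    return expected
```

Under these assumptions, a value of `$body.$_path` stored at `profile.shards.0.id` expands to `$body.profile.shards.0.id`, which is the form written out by hand in the longer listing above.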
2017-05-23 15:33:48 -04:00
Clinton Gormley
086abe6216 Marked fixed_auto_queue_size as experimental
Relates to https://github.com/elastic/elasticsearch/pull/23884
2017-05-22 10:03:31 +02:00
olcbean
e08e92d934 Deleting a document from a non-existing index should not auto create it, unless using EXTERNAL* versioning (#24518)
Currently a `delete document` request against a non-existing index actually **creates** this index.

With this change the `delete document` no longer creates the previously non-existing index and throws an `index_not_found` exception instead.

However as discussed in https://github.com/elastic/elasticsearch/pull/15451#issuecomment-165772026, if an external version is explicitly used, the current behavior is preserved and the index is still created and the document is marked for deletion.

Fixes #15425
2017-05-22 10:00:22 +03:00
archana
a5358f34b3 Update mappings.asciidoc
typo
2017-05-20 13:39:05 -05:00
Oleksandr Chychkan
065d91bccc Typo in setup/configuration.asciidoc (#24797) 2017-05-19 10:49:56 -04:00
Jack Conradson
0aa380b770 Fix search template documentation reference to scripting security. 2017-05-18 14:27:58 -07:00
Jack Conradson
1196dfb6bb Remove Deprecated Script Settings (#24756)
Removes all fine-grained script settings replaced by scripts.types_allowed and scripts.contexts_allowed.
2017-05-18 13:32:46 -07:00
Ryan Ernst
b214b80e6c GCS Repository: Remove specifying credential file on disk (#24727)
This commit removes the ability to specify the google credential json
file on disk, which is deprecated in 5.5.0.
2017-05-18 10:22:29 -07:00
Ryan Ernst
26e2e933f5 Scripting: Remove native scripts (#24726)
Native scripts have been replaced in documentation by implementing
a ScriptEngine and they were deprecated in 5.5.0. This commit
removes the native script infrastructure for 6.0.

closes #19966
2017-05-17 14:49:24 -07:00
Ryan Ernst
463fe2f4d4 Scripting: Remove file scripts (#24627)
This commit removes file scripts, which were deprecated in 5.5.

closes #21798
2017-05-17 14:42:25 -07:00
Zachary Tong
a2845c86fe CONSOLEify some more aggregation docs
Related #18160
2017-05-16 17:25:24 -04:00
Jack Conradson
b7f0df626a [DOCS] Added Painless Language Spec content 2017-05-16 12:46:56 -07:00
Lee Hinman
d09e64323f Add ability to automatically adjust search threadpool queue_size
This PR adds a new thread pool type: `fixed_auto_queue_size`. This thread pool
behaves like a regular `fixed` threadpool, except that every
`auto_queue_frame_size` operations (default: 10,000) in the thread pool,
[Little's Law](https://en.wikipedia.org/wiki/Little's_law) is calculated and
used to adjust the pool's `queue_size` either up or down by 50. A minimum and
maximum are also taken into account. When the min and max are the same value, a
regular fixed executor is used instead.

The `SEARCH` threadpool is changed to use this new type of thread pool. However,
the min and max are both set to 1000, meaning auto adjustment is opt-in rather
than opt-out.
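A minimal sketch of one adjustment step, assuming the pool compares its current queue size with the size suggested by Little's Law (L = λW). `adjust_queue_size` and the `target_response_seconds` parameter are hypothetical names introduced for illustration; the real calculation may differ.

```python
def adjust_queue_size(
    current_size: int,
    tasks_completed: int,
    elapsed_seconds: float,
    target_response_seconds: float,
    min_size: int,
    max_size: int,
    step: int = 50,
) -> int:
    """Return the new queue size after one measurement frame."""
    # Little's Law: L = lambda * W. With lambda as the measured throughput
    # and W as the tolerated response time (an assumed input), L is the
    # queue length the pool can sustain.
    throughput = tasks_completed / elapsed_seconds
    optimal_size = throughput * target_response_seconds
    if optimal_size > current_size:
        proposed = current_size + step
    elif optimal_size < current_size:
        proposed = current_size - step
    else:
        proposed = current_size
    # Never leave the configured bounds.
    return max(min_size, min(max_size, proposed))
```

When `min_size` and `max_size` are equal, the clamp always returns that value, which is consistent with the message's note that a regular fixed executor is used in that case.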

Resolves #3890
2017-05-16 11:13:16 -06:00
Ryan Ernst
97d2657e18 Remove script access to term statistics (#19462)
In scripts (at least some of the languages), the terms dictionary and
postings can be accessed with the special _index variable. This is for
very advanced use cases which want to do their own scoring. The problem
is segment level statistics must be recomputed for every document.
Additionally, this is not friendly to the terms index caching as the
order of looking up terms should be controlled by lucene.

This change removes _index from scripts. Anyone using it can and should
instead write a Similarity plugin, which is explicitly designed to allow
doing the calculations needed for a relevance score.

closes #19359
2017-05-16 09:10:09 -07:00
Simon Willnauer
1cae850cf5 Add a cluster block that allows deleting indices that are read-only (#24678)
Today when an index is `read-only`, the index is also blocked from
being deleted, which is sometimes undesired since, in order to make
changes to a cluster, indices must be deleted to free up space. This is
a likely scenario in a hosted environment where disk space is limited: indices
are switched to read-only, but deletions should still be allowed to free up space.
2017-05-16 17:34:37 +02:00
Daniel Mitterdorfer
77762fcbb0 Use correct script name in docs for Windows
With this commit we correct the name of the ES batch script to
`elasticsearch.bat` in the docs and use backslashes in path names.
2017-05-16 15:57:05 +02:00