Commit Graph

8012 Commits

Author SHA1 Message Date
Adrien Grand cb868d2f5e
Introduce a `constant_keyword` field. (#49713) (#53024)
This field is a specialization of the `keyword` field for the case when all
documents have the same value. It typically performs more efficiently than
keywords at query time by figuring out whether all or none of the documents
match at rewrite time, like `term` queries on `_index`.

The name is up for discussion. I liked including `keyword` in it, so that we
still have room for a `singleton_numeric` in the future. However I'm unsure
whether to call it `singleton`, `constant` or something else, any opinions?

For this field there is a choice between
 1. accepting values in `_source` when they are equal to the value configured
    in mappings, but rejecting mapping updates
 2. rejecting values in `_source` but then allowing updates to the value that
    is configured in the mapping
This commit implements option 1, so that it is possible to reindex from/to an
index that has the field mapped as a keyword with no changes to the source.

Backport of #49713
2020-03-03 16:01:47 +01:00
James Rodewig bcb68c860c [DOCS] Reorganize EQL requirements page 2020-03-03 07:02:30 -05:00
James Rodewig 5cffa14f45 [DOCS] Fix typo in EQL docs 2020-03-02 16:09:03 -05:00
Costin Leau 712e0c05cd EQL: Add implicit ordering on timestamp (#53004)
QL: Move Sort base class from SQL to QL
(cherry picked from commit 798015b7bbd565e9c4222724614baeb432c7c2b3)
2020-03-02 22:41:36 +02:00
Orhan Toy ad2f630795 [DOCS] Fix formatting of simulate ingest pipeline API docs (#52754) 2020-03-02 11:46:27 -05:00
István Zoltán Szabó a267849def [DOCS] Adds transform-settings page and its subpage to redirects (#52944)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-03-02 16:36:40 +01:00
Lisa Cawley 4fbe1b0550
[DOCS] Adds cat anomaly detectors API (#52866) (#52970) 2020-03-02 07:28:55 -08:00
Hendrik Muhs a328a8eaf1
[7.x][Transform] implement node.transform to control where to… (#52998)
implement transform node attributes to disable transform on certain nodes and
test which nodes are allowed to do remote connections

closes #52200
closes #50033
closes #48734

backport #52712
2020-03-02 16:10:57 +01:00
James Rodewig db64029919
[7.x] [DOCS] Add parameter examples to EQL search tutorial (#52953)
Makes the following updates to the EQL search tutorial:

* Adds an API response to the basic tutorial
* Adds an example using the `event_type_field` parm
* Adds an example using the `timestamp_field`parm
* Adds an example using the `query` parm
* Updates example dataset to support more EQL query variety
2020-03-02 10:08:03 -05:00
Aleksandr Maus 89ed857c79
EQL: Change request parameter query to filter and rule to query (#52971) (#53006)
Related to https://github.com/elastic/elasticsearch/issues/52911
2020-03-02 09:26:23 -05:00
James Rodewig d336faa0b0 [DOCS] Reformat trim token filter docs (#51649)
Makes the following changes to the `trim` token filter docs:

* Updates description
* Adds a link to the related Lucene filter
* Adds tip about removing whitespace using tokenizers
* Adds detailed analyze snippets
* Adds custom analyzer snippet
2020-03-02 07:48:23 -05:00
James Rodewig f5bccad847 [DOCS] Correct guidance for `index_options` mapping parm (#52899)
Adds a warning admonition stating that the `index_options` mapping
parameter is intended only for `text` fields.

Removes an outdated statement regarding default values for numeric
and other datatypes.
2020-03-02 07:39:35 -05:00
rhymes 7eb4c07f1f [DOCS] Fix typo in index and search analysis docs (#52988) 2020-03-02 07:25:01 -05:00
Yang Wang 712191f2af
[DOCS] Fix typo and ensure asterisks render properly (#52991) (#52992)
Fix a typo and rendering issue for doc test example.
2020-03-02 11:56:42 +11:00
Dimitris Athanasiou 85b4e45093
[7.x]ML] Parse and report memory usage for DF Analytics (#52778) (#52980)
Adds reporting of memory usage for data frame analytics jobs.
This commit introduces a new index pattern `.ml-stats-*` whose
first concrete index will be `.ml-stats-000001`. This index serves
to store instrumentation information for those jobs.

Backport of #52778 and #52958
2020-02-29 13:03:40 +02:00
Rory Hunter b1be7dcd2d Document how to change GC logging behaviour (#52879)
Closes #43990. Describe how to change the default GC settings without changing
the default `jvm.options`. Give examples using `jvm.options.d`, and
`ES_JAVA_OPTS` with Docker.
2020-02-28 21:27:45 +00:00
Martijn van Groningen 6aa9aaa2c6
Add validation for dynamic templates (#52890)
Backport of #51233 to the seven dot x branch.

Tries to load a `Mapper` instance for the mapping snippet of a dynamic template.
This should catch things like using an analyzer that is undefined or mapping attributes that are unused.

This is best effort:
* If `{{name}}` placeholder is used in the mapping snippet then validation is skipped.
* If `match_mapping_type` is not specified then validation is performed for all mapping types.
  If parsing succeeds with a single mapping type then this the dynamic mapping is considered valid.

If is detected that a dynamic template mapping snippet is invalid at mapping update time then the mapping update is failed for indices created on 8.0.0-alpha1 and later. For indices created on prior version a deprecation warning is omitted instead. In 7.x clusters the mapping update will never fail in case of an invalid dynamic template mapping snippet and a deprecation warning will always be omitted.

Closes #17411
Closes #24419

Co-authored-by: Adrien Grand <jpountz@gmail.com>
2020-02-28 10:35:04 +01:00
Nik Everett 1d1956ee93
Add size support to `top_metrics` (backport of #52662) (#52914)
This adds support for returning the top "n" metrics instead of just the
very top.

Relates to #51813
2020-02-27 16:12:52 -05:00
Benjamin Trent eac38e9847
[ML] Add indices_options to datafeed config and update (#52793) (#52905)
This adds a new configurable field called `indices_options`. This allows users to create or update the indices_options used when a datafeed reads from an index.

This is necessary for the following use cases:
 - Reading from frozen indices
 - Allowing certain indices in multiple index patterns to not exist yet

These index options are available on datafeed creation and update. Users may specify them as URL parameters or within the configuration object.

closes https://github.com/elastic/elasticsearch/issues/48056
2020-02-27 13:43:25 -05:00
Nattachai Suteerapongpan 14f847cc8f [DOCS] Fix typo in task management API docs (#52881) 2020-02-27 11:31:11 -05:00
Josh Devins 68ba571f70
Adds recall@k metric to rank eval API (#52889)
This change adds the recall@k metric and refactors precision@k to match
the new metric.

Recall@k is an important metric to use for learning to rank (LTR)
use-cases. Candidate generation or first ranking phase ranking functions
are often optimized for high recall, in order to generate as many
relevant candidates in the top-k as possible for a second phase of
ranking. Adding this metric allows tuning that base query for LTR.

See: https://github.com/elastic/elasticsearch/issues/51676
Backports: https://github.com/elastic/elasticsearch/pull/52577
2020-02-27 16:04:24 +01:00
István Zoltán Szabó 8785f57dfe [DOCS] Reformats cat DFA API docs. (#52885) 2020-02-27 14:21:52 +01:00
István Zoltán Szabó 4a33352a94 [DOCS] Adds cat trained model API documentation (#52824) 2020-02-27 12:54:11 +01:00
Costin Leau 40bc06f6ad EQL: Hook engine to Elasticsearch (#52828)
Add query execution and return actual results returned from
Elasticsearch inside the tests

(cherry picked from commit 3e039282bf991af87604a6d4f8eada19d5e33842)
2020-02-27 11:22:22 +02:00
David Turner 69b78f7f8a "Adding nodes" instructions only work on localhost (#52677)
The introductory sections of the reference manual contains some simplified
instructions for adding a node to the cluster. Unfortunately they are a little
too simplified and only really work for clusters running on `localhost`. If you
try and follow these instructions for a distributed cluster then the new node
will, confusingly, auto-bootstrap itself into a distinct one-node cluster.

Multiple nodes running on localhost is a valid config, of course, but we should
spell out that these instructions are really only for experimentation and that
it takes a bit more work to add nodes to a distributed cluster. This commit
does so.

Also, the "important config" instructions for discovery say that you MUST set
`discovery.seed_hosts` whereas in fact it is fine to ignore this setting and
use a dynamic discovery mechanism instead. This commit weakens this statement
and links to the docs for dynamic discovery mechanisms.

Finally, this section is also overloaded with some technical details that are
not important for this context and are adequately covered elsewhere, and
completely fails to note that the default discovery port is 9300. This commit
addresses this.
2020-02-27 09:18:37 +00:00
James Rodewig f5253d20f7 [DOCS] Update term vectors snippet to prevent CI failure (#52819)
Adds the `?refresh=wait_for` query argument to an index API snippet in
the term vectors API docs.

This should ensure the document is indexed and available before a
subsequent term vectors API request executes.

Fixes #52814.
2020-02-26 12:41:40 -05:00
Lisa Cawley b788ec7157 [DOCS] Adds cat datafeeds API (#52738) 2020-02-26 09:28:57 -08:00
Jake Landis 8d311297ca
[7.x] Smarter copying of the rest specs and tests (#52114) (#52798)
* Smarter copying of the rest specs and tests (#52114)

This PR addresses the unnecessary copying of the rest specs and allows
for better semantics for which specs and tests are copied. By default
the rest specs will get copied if the project applies
`elasticsearch.standalone-rest-test` or `esplugin` and the project
has rest tests or you configure the custom extension `restResources`.

This PR also removes the need for dozens of places where the x-pack
specs were copied by supporting copying of the x-pack rest specs too.

The plugin/task introduced here can also copy the rest tests to the
local project through a similar configuration.

The new plugin/task allows a user to minimize the surface area of
which rest specs are copied. Per project can be configured to include
only a subset of the specs (or tests). Configuring a project to only
copy the specs when actually needed should help with build cache hit
rates since we can better define what is actually in use.
However, project level optimizations for build cache hit rates are
not included with this PR.

Also, with this PR you can no longer use the includePackaged flag on
integTest task.

The following items are included in this PR:
* new plugin: `elasticsearch.rest-resources`
* new tasks: CopyRestApiTask and CopyRestTestsTask - performs the copy
* new extension 'restResources'
```
restResources {
  restApi {
    includeCore 'foo' , 'bar' //will include the core specs that start with foo and bar
    includeXpack 'baz' //will include x-pack specs that start with baz
  }
  restTests {
    includeCore 'foo', 'bar' //will include the core tests that start with foo and bar
    includeXpack 'baz' //will include the x-pack tests that start with baz
  }
}

```
2020-02-26 08:13:41 -06:00
Bogdan Pintea 304e1e69b8 remove references to the SQL API from ODBC config (#52765)
Remove reference to an "SQL API" which could suggest that one needs to
treat this in a special way when configuring the ODBC driver.

(cherry picked from commit 451c341e0193b542409e8891ec2a31e62529a5e7)
2020-02-26 13:39:54 +01:00
István Zoltán Szabó f57422bbfd [DOCS] Adds cat data frame analytics API (#52764)
Co-authored-by: Lisa Cawley <lcawley@elastic.co>
2020-02-26 11:10:42 +01:00
Lisa Cawley 05f1cd74a6 [DOCS] Fixes monitoring links (#52790) 2020-02-25 18:08:23 -08:00
Lisa Cawley 924f0bd243 [DOCS] Updates custom rules example (#52731) 2020-02-25 09:32:52 -08:00
Andrei Stefan 51c6aefa55
SQL: Use calendar_interval of 1d for HISTOGRAMs with 1 DAY intervals (#52749) (#52771)
(cherry picked from commit 556f5fa33be88570c4f8550cb8f784323d26a707)
2020-02-25 18:44:02 +02:00
Pius 563f033511 Update ilm-settings.asciidoc (#51577) 2020-02-25 10:18:55 -05:00
bellengao d2db16e046 [DOCS] Correct policy name in ILM docs example (#52354)
Updates an example snippet to use a consistent policy name.
2020-02-25 09:36:22 -05:00
David Pilato 6c6ab8fa47 [DOS] Fix typo in CSV processor docs (#52649)
Corrects an example array in a snippet of the CSV processor docs.
2020-02-25 08:48:50 -05:00
bellengao 49f37989c4 [DOCS] Fix typo in ingest node docs (#52671) 2020-02-25 07:57:52 -05:00
David Roberts cf122d13b8 [ML] Use event.timezone in file_structure_finder ingest pipeline (#52720)
This is because beat.timezone was renamed to event.timezone in
elastic/beats#9458
2020-02-25 12:33:53 +00:00
James Rodewig 9b05f6a668 [DOCS] Add admonition for app using cat APIs (#52727)
Adds an explicit "important" admonition discouraging apps from using
cat APIs.

cat APIs are intended for human consumption via the command line or
Kibana console only. They are not intended for consumption by
applications.
2020-02-25 07:20:33 -05:00
James Rodewig 1a14ae4e1b [DOCS] Document `include_in_*` nested mapping parms (#52648)
Adds documentation for the `include_in_parent` and `include_in_root`
mapping parameters for the `nested` mapping datatype.
2020-02-25 07:13:49 -05:00
Adrien Grand 5f81906fcf Discourage from opting in for the `niofs` store. (#52638)
Indices open with the `niofs` store type load much more data on-heap than
indices open with the `mmapfs` store type. This limitation is now documented
and examples have been updated to show how to update settings to use the
`mmapfs` store type rather than `niofs`.
2020-02-25 08:54:11 +01:00
Adrien Grand 9b0ddc1c03 Clarify the resiliency trade-off of disabling replicas to speed up indexing. (#52714)
We should be more explicit about the downsides of disabling replicas and
explain that users should be ready to re-do the entire load in case of
issues mid-way.
2020-02-25 08:54:10 +01:00
Adrien Grand 5ce66b8b3c Document how CCR may be used to speed up indexing. (#52717)
One architecture that we have recommended to several users to speed up
indexing involved using CCR to prevent searching from stealing resources
from indexing.
2020-02-25 08:54:10 +01:00
Bob Blank 28d4b71947
Clarified http.max_content_length description (#52329)
Adding "greater than" based on discussion with @jasontedor for clarity.
2020-02-24 21:01:14 -05:00
Andrei Stefan ed6b10bc03
SQL: use a calendar interval for histograms over 1 month intervals (#52586) (#52715)
(cherry picked from commit 928b11a34ec92d90d082abdf4fa09f7ce1d7c0c4)
2020-02-25 01:41:51 +02:00
Julie Tibshirani ba0401ecfd Correct the name of the search timeout parameter. (#52733)
The request body parameter is called 'timeout', not 'search_timeout'.
2020-02-24 14:59:06 -08:00
lcawl c6e35b460e [DOCS] Adds anchor for custom rules 2020-02-24 11:39:15 -08:00
Mayya Sharipova 034b1c0ba3
Correct boost calculation in script_score query (#52478) (#52724)
Before boost in script_score query was wrongly applied only to the subquery.
This commit makes sure that the boost is applied to the whole score
that comes out of script.

Closes #48465
2020-02-24 13:48:21 -05:00
Przemko Robakowski e72cb79476
Add docs for errors in GetAlias API (#51850) (#52716)
Closes #31499

Co-authored-by: Maxim <timonin.maksim@mail.ru>
2020-02-24 18:22:09 +01:00
James Rodewig 5e48811585 [DOCS] Document CCS-supported APIs (#52708)
Explicitly notes the Elasticsearch API endpoints that support CCS.

This should deter users from attempting to use CCS with other API
endpoints, such as `GET <index>/_doc/<_id>`.
2020-02-24 09:59:08 -05:00