Commit Graph

7762 Commits

Author SHA1 Message Date
James Rodewig 9d1567b13b [DOCS] Add overview page to analysis topic (#50515)
Adds a 'text analysis overview' page to the analysis topic docs.

The goals of this page are:

* Concisely summarize the analysis process while avoiding in-depth concepts, tutorials, or API examples
* Explain why analysis is important, largely through highlighting problems with full-text searches missing analysis
* Highlight how analysis can be used to improve search results
2020-01-08 12:54:00 -06:00
István Zoltán Szabó 0444da944e [DOCS] Adds DFA resources as deleted page to redirects. (#50756) 2020-01-08 19:01:16 +01:00
James Rodewig f87e61ec30 [DOCS] Add default index-time analyzer example (#50501)
The Analysis docs mention including a default analyzer in the index settings. However, no example snippet is included.

This adds an example snippet that users can easily copy and adjust.
2020-01-08 11:07:49 -06:00
Valentin Crettaz f3ddd4066a [DOCS] Fixed typos (_op => op) in Painless context docs (#50301) 2020-01-08 10:54:06 -06:00
blueSky1825821 5ff6eafb4b [Docs] Update similarity.asciidoc (#50719)
DFRSimilarity -> DFR similarity
2020-01-08 17:48:26 +01:00
Adrien Grand 31158ab3d5
Add per-field metadata. (#50333)
This PR adds per-field metadata that can be set in the mappings and is later
returned by the field capabilities API. This metadata is completely opaque to
Elasticsearch but may be used by tools that index data in Elasticsearch to
communicate metadata about fields with tools that then search this data. A
typical example that has been requested in the past is the ability to attach
a unit to a numeric field.

In order to not bloat the cluster state, Elasticsearch requires that this
metadata be small:
 - keys can't be longer than 20 chars,
 - values can only be numbers or strings of no more than 50 chars - no inner
   arrays or objects,
 - the metadata can't have more than 5 keys in total.

Given that metadata is opaque to Elasticsearch, field capabilities don't try to
do anything smart when merging metadata about multiple indices, the union of
all field metadatas is returned.

Here is how the meta might look like in mappings:

```json
{
  "properties": {
    "latency": {
      "type": "long",
      "meta": {
        "unit": "ms"
      }
    }
  }
}
```

And then in the field capabilities response:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms" ]
      }
    }
  }
}
```

When there are no conflicts, values are arrays of size 1, but when there are
conflicts, Elasticsearch includes all unique values in this array, without
giving ways to know which index has which metadata value:

```json
{
  "latency": {
    "long": {
      "searchable": true,
      "aggreggatable": true,
      "meta": {
        "unit": [ "ms", "ns" ]
      }
    }
  }
}
```

Closes #33267
2020-01-08 16:21:18 +01:00
James Rodewig d3094f9d23 [DOCS] Fix typo in mapping date format docs 2020-01-08 07:55:51 -06:00
Christoph Büscher d8c907d648 Remove _reload_search_analyzer experimental status (#50696)
Removing the experimental status in the docs and the rest specs.
2020-01-08 10:35:19 +01:00
James Rodewig de6b62f789 [DOCS] Fuzzy wildcard not supported in `query_string` (#50466)
The `query_string` does not support mixing wildcards with fuzziness.
This adds a related warning to the `query_string` docs.
2020-01-07 12:54:50 -06:00
James Rodewig 20eba1e410 [DOCS] Reformat reverse token filter docs (#50672)
* Updates the description and adds a Lucene link
* Adds analyze and custom analyzer snippets
2020-01-07 11:01:55 -06:00
James Rodewig 8009b07ccb [DOCS] Reformat truncate token filter docs (#50687)
* Updates the description and adds a Lucene link
* Adds analyze, custom analyzer, and custom filter snippets
* Adds parameter documentation
2020-01-07 10:33:57 -06:00
arkel-s d5f4790f90 [DOCS] Add example format for `date_optional_time` (#50458)
Adds an example format for `date_optional_time` to the `format` mapping
parameter docs.

Closes #50457
2020-01-07 10:13:34 -06:00
James Rodewig 0753915eed [DOCS] Update SQL REST API pages for new structure (#50690)
#43007 restructured the SQL REST API docs so they display across several pages.

This updates up a reference that assumes a single page in the "Paginating through a large response" section. It also reformats a tip for the Kibana console.

Closes #50688
2020-01-07 09:27:34 -06:00
James Rodewig 074866256b [DOCS] Remove unneeded redirects (#50510)
The docs/reference/redirects.asciidoc file stores a list of relocated or
deleted pages for the Elasticsearch Reference documentation.

This prunes several older redirects that are no longer needed.
2020-01-06 09:11:48 -06:00
James Rodewig 1299dda437 [DOCS] Warn about using `geo_centroid` as sub-agg to `geohash_grid` (#50038)
If `geo_point fields` are multi-valued, using `geo_centroid` as a
sub-agg to `geohash_grid` could result in centroids outside of bucket
boundaries.

This adds a related warning to the geo_centroid agg docs.
2020-01-06 07:47:54 -06:00
Nhat Nguyen b71490b06b
Deprecate indices without soft-deletes (#50502) (#50634)
Soft-deletes will be enabled for all indices in 8.0. Hence, we should
deprecate new indices without soft-deletes in 7.x.

Backport of #50502
2020-01-06 08:44:30 -05:00
Lisa Cawley 62969c35cd [DOCS] Adds missing timing_stats descriptions (#50574) 2020-01-03 09:14:09 -08:00
Nik Everett 4d58656065
Declare remaining parsers `final` (#50571) (#50615)
We have about 800 `ObjectParsers` in Elasticsearch, about 700 of which
are final. This is *probably* the right way to declare them because in
practice we never mutate them after they are built. And we certainly
don't change the static reference. Anyway, this adds `final` to these
parsers.

I found the non-final parsers with this:
```
diff \
  <(find . -type f -name '*.java' -exec grep -iHe 'static.*PARSER\s*=' {} \+ | sort) \
  <(find . -type f -name '*.java' -exec grep -iHe 'static.*final.*PARSER\s*=' {} \+ | sort) \
  2>&1 | grep '^<'
```
2020-01-03 11:48:11 -05:00
Orhan Toy 44827e577e [DOCS] Fix missing quote in script-score-query.asciidoc (#50590) 2020-01-03 16:15:45 +01:00
István Zoltán Szabó 0bcbddecf8 [DOCS] Fine-tunes training_percent definition. (#50601) 2020-01-03 14:51:03 +01:00
James Rodewig e6a469cc74 [DOCS] Reformat uppercase token filter docs (#50555)
* Updates the description and adds a Lucene link
* Adds analyze and custom analyzer snippets
2020-01-03 08:39:08 -05:00
Dimitris Athanasiou ca0828ba07
[7.x][ML] Implement force deleting a data frame analytics job (#50553) (#50589)
Adds a `force` parameter to the delete data frame analytics
request. When `force` is `true`, the action force-stops the
jobs and then proceeds to the deletion. This can be used in
order to delete a non-stopped job with a single request.

Closes #48124

Backport of #50553
2020-01-03 13:46:02 +02:00
Alan Woodward 8b362c657b Add fuzzy intervals source (#49762)
This intervals source will return terms that are similar to an input term, up to
an edit distance defined by fuzziness, similar to FuzzyQuery.

Closes #49595
2020-01-03 09:59:19 +00:00
István Zoltán Szabó a34b3f133c [DOCS] Specifies the possible data types of classification dependent_variable (#50582) 2020-01-03 10:42:56 +01:00
Lee Hinman 0d78aa2708
Don't dump a stacktrace for invalid patterns when executing elasticsearch-croneval (#49744) (#50578)
Co-authored-by: bellengao <gbl_long@163.com>
2020-01-02 16:57:51 -07:00
Lisa Cawley 81a9cff16f
[7.x][DOCS] Remove redundant results from ML APIs (#50565) 2020-01-02 11:23:26 -08:00
James Rodewig 338dd642c4 [DOCS] Correct typos in Painless datetime docs (#50563)
Fixes several typos and grammar errors raised by @glenacota in #47512.

Co-authored-by: Guido Lena Cota <guido.lenacota@gmail.com>
2020-01-02 13:16:56 -05:00
David Turner b209e7b6ff
Remove included docs (#50557)
In #50499 we accidentally duplicated the docs for the `?local` parameter to the
`GET _cat/nodes` API. This commit removes the duplicate docs.
2020-01-02 17:19:42 +00:00
Nik Everett 55107ce8ae Docs: Refine note about `after_key` (#50475)
* Docs: Refine note about `after_key`

I was curious about composite aggregations, specifically I wanted to
know how to write a composite aggregation that had all of its buckets
filtered out so you *had* to use the `after_key`. Then I saw that we've
declared composite aggregations not to work with pipelines in #44180. So
I'm not sure you *can* do that any more. Which makes the note about
`after_key` inaccurate. This rejiggers that section of the docs a little
so it is more obvious that you send the `after_key` back to us. And so
it is more obvious that you should *only* use the `after_key` that we
give you rather than try to work it out for yourself.

* Apply suggestions from code review

Co-Authored-By: James Rodewig <james.rodewig@elastic.co>

Co-authored-by: James Rodewig <james.rodewig@elastic.co>
2020-01-02 10:03:23 -05:00
Oleg 7539fbb30f Deprecate the 'local' parameter of /_cat/nodes (#50499)
The cat nodes API performs a `ClusterStateAction` then a `NodesInfoAction`.
Today it accepts the `?local` parameter and passes this to the
`ClusterStateAction` but this parameter has no effect on the `NodesInfoAction`.
This is surprising, because `GET _cat/nodes?local` looks like it might be a
completely local call but in fact it still depends on every node in the
cluster.

This commit deprecates the `?local` parameter on this API so that it can be
removed in 8.0.

Relates #50088
2020-01-02 14:53:56 +00:00
Lisa Cawley ab5a69d1e2
[7.x][DOCS] Move machine learning results definitions into APIs (#50543) 2019-12-31 13:21:17 -08:00
Lisa Cawley f8eef43fc6
[7.x][DOCS] Move model snapshot resource definitions into APIs (#50540) 2019-12-31 10:53:05 -08:00
Lisa Cawley 3fb4f1b5bf
[DOCS] Moves job count resource definitions into API (#50529) 2019-12-30 14:55:36 -08:00
Jake ecf295fa77 [DOCS] Bump copyright to 2019 for Java HLRC license (#50206) 2019-12-30 15:39:53 -05:00
Gilad Gal 9fdfb075bb
Deleted 'a' before plural 'messages'
Deleted 'a' before plural 'messages'
2019-12-30 21:25:15 +02:00
Lisa Cawley 4b829db593
[7.x][DOCS] Move datafeed resource definitions into APIs (#50516) 2019-12-30 09:35:16 -08:00
riverbuilding e7d5443bf5 [DOCS] Correct Painless operator typos (#50472) 2019-12-30 08:48:32 -05:00
Lisa Cawley 72840c0cb2
[7.x][DOCS] Move anomaly detection job resource definitions into APIs (#50490) 2019-12-27 13:30:26 -08:00
James Rodewig 7a14607a25 [DOCS] Abbreviate token filter titles (#50511) 2019-12-27 11:01:52 -05:00
James Rodewig 3f7f31b6b0 [DOCS] Fix search request body links (#50500)
PR #44238 changed several links related to the Elasticsearch search request body API. This updates several places still using outdated links or anchors.

This will ultimately let us remove some redirects related to those link changes.
2019-12-26 14:31:09 -05:00
James Rodewig 261566154b [DOCS] Fix search request body link (#50498) 2019-12-26 12:34:09 -05:00
James Rodewig ef467cc6f5 [DOCS] Remove unneeded redirects (#50476)
The docs/reference/redirects.asciidoc file stores a list of relocated or
deleted pages for the Elasticsearch Reference documentation.

This prunes several older redirects that are no longer needed and
don't require work to fix broken links in other repositories.
2019-12-26 08:29:28 -05:00
James Rodewig 1cec87c0e6 [DOCS] Document `transport` and `http` node stats (#50473)
Documents the `transport` and `http` parameters returned by the
`_nodes/stats` API.
2019-12-26 07:43:14 -05:00
Lisa Cawley d479e0563a
[7.x][DOCS] Augments ML shared definitions (#50487) 2019-12-24 10:22:05 -08:00
Martijn van Groningen 10ed1ae1d2
Add remote info to the HLRC (#50483)
The additional change to the original PR (#49657), is that `org.elasticsearch.client.cluster.RemoteConnectionInfo` now parses the initial_connect_timeout field as a string instead of a TimeValue instance.

The reason that this is needed is because that the initial_connect_timeout field in the remote connection api is serialized for human consumption, but not for parsing purposes.
Therefore the HLRC can't parse it correctly (which caused test failures in CI, but not in the PR CI
:( ). The way this field is serialized needs to be changed in the remote connection api, but that is a breaking change. We should wait making this change until rest api versioning is introduced.

Co-Authored-By: j-bean <anton.shuvaev91@gmail.com>

Co-authored-by: j-bean <anton.shuvaev91@gmail.com>
2019-12-24 15:11:58 +01:00
Orhan Toy 6a3d1a077e [DOCS] Fixes "enables you to" typos (#50225) 2019-12-23 14:39:14 -05:00
James Rodewig 694b119f0a [DOCS] Percentile aggs are non-deterministic (#50468)
Percentile aggregations are non-deterministic. A percentile aggregation
can produce different results even when using the same data.

Based on [this discuss post][0], the non-deterministic property stems
from processes in Lucene that can affect the order in which docs are
provided to the aggregation.

This adds a warning stating that the aggregation is non-deterministic
and what that means.

[0]: https://discuss.elastic.co/t/different-results-for-same-query/111757
2019-12-23 13:13:34 -05:00
Nik Everett 01293ebad5
Fix docs typos (#50365) (#50464)
Fixes a few typos in the docs.

Co-authored-by: Xiang Dai <764524258@qq.com>
2019-12-23 12:38:17 -05:00
Igor Motov 339d10c16f Geo: Switch generated GeoJson type names to camel case (#50400)
Switches generated GeoJson type names to camel case
to conform to the standard.

Closes #49568
2019-12-20 15:37:22 -05:00
James Rodewig 27ae9a1435 [DOCS] Remove outdated file scripts refererence (#50437)
File scripts were removed in 6.0 with #24627.

This removes an outdated file scripts reference from the conditional clauses section of the search templates docs.
2019-12-20 14:53:40 -05:00