From 6054d33a63299da12f6158b4d20db3ff384de85a Mon Sep 17 00:00:00 2001
From: James Rodewig <40268737+jrodewig@users.noreply.github.com>
Date: Wed, 29 Jul 2020 14:14:01 -0400
Subject: [PATCH] [DOCS] Replace `twitter` dataset in API conventions + README (#60408) (#60410)

---
 README.asciidoc                         | 166 ++++++++++++------------
 docs/reference/api-conventions.asciidoc |  54 ++++----
 2 files changed, 107 insertions(+), 113 deletions(-)

diff --git a/README.asciidoc b/README.asciidoc
index 94a0588efd8..e91e46df961 100644
--- a/README.asciidoc
+++ b/README.asciidoc
@@ -35,68 +35,67 @@ First of all, DON'T PANIC. It will take 5 minutes to get the gist of what Elasti

 * https://www.elastic.co/downloads/elasticsearch[Download] and unpack the Elasticsearch official distribution.
 * Run `bin/elasticsearch` on Linux or macOS. Run `bin\elasticsearch.bat` on Windows.
-* Run `curl -X GET http://localhost:9200/`.
-* Start more servers ...
+* Run `curl -X GET http://localhost:9200/` to verify Elasticsearch is running.

 === Indexing

-Let's try and index some twitter like information. First, let's index some tweets (the `twitter` index will be created automatically):
+First, index some sample JSON documents. The first request automatically creates
+the `my-index-000001` index.

 ----
-curl -XPUT 'http://localhost:9200/twitter/_doc/1?pretty' -H 'Content-Type: application/json' -d '
+curl -X POST 'http://localhost:9200/my-index-000001/_doc?pretty' -H 'Content-Type: application/json' -d '
 {
-    "user": "kimchy",
-    "post_date": "2009-11-15T13:12:00",
-    "message": "Trying out Elasticsearch, so far so good?"
+    "@timestamp": "2099-11-15T13:12:00",
+    "message": "GET /search HTTP/1.1 200 1070000",
+    "user": {
+        "id": "kimchy"
+    }
 }'

-curl -XPUT 'http://localhost:9200/twitter/_doc/2?pretty' -H 'Content-Type: application/json' -d '
+curl -X POST 'http://localhost:9200/my-index-000001/_doc?pretty' -H 'Content-Type: application/json' -d '
 {
-    "user": "kimchy",
-    "post_date": "2009-11-15T14:12:12",
-    "message": "Another tweet, will it be indexed?"
+    "@timestamp": "2099-11-15T14:12:12",
+    "message": "GET /search HTTP/1.1 200 1070000",
+    "user": {
+        "id": "elkbee"
+    }
 }'

-curl -XPUT 'http://localhost:9200/twitter/_doc/3?pretty' -H 'Content-Type: application/json' -d '
+curl -X POST 'http://localhost:9200/my-index-000001/_doc?pretty' -H 'Content-Type: application/json' -d '
 {
-    "user": "elastic",
-    "post_date": "2010-01-15T01:46:38",
-    "message": "Building the site, should be kewl"
-}'
-----
-
-Now, let's see if the information was added by GETting it:
-
-----
-curl -XGET 'http://localhost:9200/twitter/_doc/1?pretty=true'
-curl -XGET 'http://localhost:9200/twitter/_doc/2?pretty=true'
-curl -XGET 'http://localhost:9200/twitter/_doc/3?pretty=true'
-----
-
-=== Searching
-
-Mmm search..., shouldn't it be elastic?
-Let's find all the tweets that `kimchy` posted:
-
-----
-curl -XGET 'http://localhost:9200/twitter/_search?q=user:kimchy&pretty=true'
-----
-
-We can also use the JSON query language Elasticsearch provides instead of a query string:
-
-----
-curl -XGET 'http://localhost:9200/twitter/_search?pretty=true' -H 'Content-Type: application/json' -d '
-{
-    "query" : {
-        "match" : { "user": "kimchy" }
+    "@timestamp": "2099-11-15T01:46:38",
+    "message": "GET /search HTTP/1.1 200 1070000",
+    "user": {
+        "id": "elkbee"
     }
 }'
 ----

-Just for kicks, let's get all the documents stored (we should see the tweet from `elastic` as well):
+=== Search
+
+Next, use a search request to find any documents with a `user.id` of `kimchy`.

 ----
-curl -XGET 'http://localhost:9200/twitter/_search?pretty=true' -H 'Content-Type: application/json' -d '
+curl -X GET 'http://localhost:9200/my-index-000001/_search?q=user.id:kimchy&pretty=true'
+----
+
+Instead of a query string, you can use Elasticsearch's
+https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html[Query
+DSL] in the request body.
+
+----
+curl -X GET 'http://localhost:9200/my-index-000001/_search?pretty=true' -H 'Content-Type: application/json' -d '
+{
+    "query" : {
+        "match" : { "user.id": "kimchy" }
+    }
+}'
+----
+
+You can also retrieve all documents in `my-index-000001`.
+
+----
+curl -X GET 'http://localhost:9200/my-index-000001/_search?pretty=true' -H 'Content-Type: application/json' -d '
 {
     "query" : {
         "match_all" : {}
@@ -104,64 +103,61 @@ curl -XGET 'http://localhost:9200/twitter/_search?pretty=true' -H 'Content-Type:
 }'
 ----

-We can also do range search (the `post_date` was automatically identified as date)
+During indexing, Elasticsearch automatically mapped the `@timestamp` field as a
+date. This lets you run a range search.

 ----
-curl -XGET 'http://localhost:9200/twitter/_search?pretty=true' -H 'Content-Type: application/json' -d '
+curl -X GET 'http://localhost:9200/my-index-000001/_search?pretty=true' -H 'Content-Type: application/json' -d '
 {
     "query" : {
         "range" : {
-            "post_date" : { "from" : "2009-11-15T13:00:00", "to" : "2009-11-15T14:00:00" }
+            "@timestamp": {
+                "from": "2099-11-15T13:00:00",
+                "to": "2099-11-15T14:00:00"
+            }
         }
     }
 }'
 ----

-There are many more options to perform search, after all, it's a search product no? All the familiar Lucene queries are available through the JSON query language, or through the query parser.
+=== Multiple indices

-=== Multi Tenant - Indices
+Elasticsearch supports multiple indices. The previous examples used an index
+called `my-index-000001`. You can create another index, `my-index-000002`, to
+store additional data when `my-index-000001` reaches a certain age or size. You
+can also use separate indices to store different types of data.

-Man, that twitter index might get big (in this case, index size == valuation). Let's see if we can structure our twitter system a bit differently in order to support such large amounts of data.
-
-Elasticsearch supports multiple indices. In the previous example we used an index called `twitter` that stored tweets for every user.
-
-Another way to define our simple twitter system is to have a different index per user (note, though that each index has an overhead). Here is the indexing curl's in this case:
+You can configure each index differently. The following request
+creates `my-index-000002` with two primary shards rather than the default of
+one. This may be helpful for larger indices.

 ----
-curl -XPUT 'http://localhost:9200/kimchy/_doc/1?pretty' -H 'Content-Type: application/json' -d '
-{
-    "user": "kimchy",
-    "post_date": "2009-11-15T13:12:00",
-    "message": "Trying out Elasticsearch, so far so good?"
-}'
-
-curl -XPUT 'http://localhost:9200/kimchy/_doc/2?pretty' -H 'Content-Type: application/json' -d '
-{
-    "user": "kimchy",
-    "post_date": "2009-11-15T14:12:12",
-    "message": "Another tweet, will it be indexed?"
-}'
-----
-
-The above will index information into the `kimchy` index. Each user will get their own special index.
-
-Complete control on the index level is allowed. As an example, in the above case, we might want to change from the default 1 shards with 1 replica per index, to 2 shards with 1 replica per index (because this user tweets a lot). Here is how this can be done (the configuration can be in yaml as well):
-
-----
-curl -XPUT http://localhost:9200/another_user?pretty -H 'Content-Type: application/json' -d '
+curl -X PUT 'http://localhost:9200/my-index-000002?pretty' -H 'Content-Type: application/json' -d '
 {
     "settings" : {
-        "index.number_of_shards" : 2,
-        "index.number_of_replicas" : 1
+        "index.number_of_shards" : 2
     }
 }'
 ----

-Search (and similar operations) are multi index aware. This means that we can easily search on more than one
-index (twitter user), for example:
+You can then add a document to `my-index-000002`.

 ----
-curl -XGET 'http://localhost:9200/kimchy,another_user/_search?pretty=true' -H 'Content-Type: application/json' -d '
+curl -X POST 'http://localhost:9200/my-index-000002/_doc?pretty' -H 'Content-Type: application/json' -d '
+{
+    "@timestamp": "2099-11-16T13:12:00",
+    "message": "GET /search HTTP/1.1 200 1070000",
+    "user": {
+        "id": "kimchy"
+    }
+}'
+----
+
+You can search and perform other operations on multiple indices with a single
+request. The following request searches `my-index-000001` and `my-index-000002`.
+
+----
+curl -X GET 'http://localhost:9200/my-index-000001,my-index-000002/_search?pretty=true' -H 'Content-Type: application/json' -d '
 {
     "query" : {
         "match_all" : {}
@@ -169,10 +165,10 @@ curl -XGET 'http://localhost:9200/kimchy,another_user/_search?pretty=true' -H 'C
 }'
 ----

-Or on all the indices:
+You can omit the index from the request path to search all indices.

 ----
-curl -XGET 'http://localhost:9200/_search?pretty=true' -H 'Content-Type: application/json' -d '
+curl -X GET 'http://localhost:9200/_search?pretty=true' -H 'Content-Type: application/json' -d '
 {
     "query" : {
         "match_all" : {}
@@ -180,9 +176,7 @@ curl -XGET 'http://localhost:9200/_search?pretty=true' -H 'Content-Type: applica
 }'
 ----

-And the cool part about that? You can easily search on multiple twitter users (indices), with different boost levels per user (index), making social search so much simpler (results from my friends rank higher than results from friends of my friends).
-
-=== Distributed, Highly Available
+=== Distributed, highly available

 Let's face it, things will fail....

@@ -194,7 +188,7 @@ In order to play with the distributed nature of Elasticsearch, simply bring more

 We have just covered a very small portion of what Elasticsearch is all about. For more information, please refer to the https://www.elastic.co/products/elasticsearch[elastic.co] website. General questions can be asked on the https://discuss.elastic.co[Elastic Forum] or https://ela.st/slack[on Slack]. The Elasticsearch GitHub repository is reserved for bug reports and feature requests only.

-=== Building from Source
+=== Building from source

 Elasticsearch uses https://gradle.org[Gradle] for its build system.

diff --git a/docs/reference/api-conventions.asciidoc b/docs/reference/api-conventions.asciidoc
index 8f162110d28..b9811c1fb6e 100644
--- a/docs/reference/api-conventions.asciidoc
+++ b/docs/reference/api-conventions.asciidoc
@@ -229,9 +229,9 @@ separated list of filters expressed with the dot notation:

 [source,console]
 --------------------------------------------------
-GET /_search?q=elasticsearch&filter_path=took,hits.hits._id,hits.hits._score
+GET /_search?q=kimchy&filter_path=took,hits.hits._id,hits.hits._score
 --------------------------------------------------
-// TEST[setup:twitter]
+// TEST[setup:my_index]

 Responds:

@@ -259,7 +259,7 @@ of a field's name:
 --------------------------------------------------
 GET /_cluster/state?filter_path=metadata.indices.*.stat*
 --------------------------------------------------
-// TEST[s/^/PUT twitter\n/]
+// TEST[s/^/PUT my-index-000001\n/]

 Responds:

@@ -268,7 +268,7 @@ Responds:
 {
   "metadata" : {
     "indices" : {
-      "twitter": {"state": "open"}
+      "my-index-000001": {"state": "open"}
     }
   }
 }
@@ -282,7 +282,7 @@ of every segment with this request:
 --------------------------------------------------
 GET /_cluster/state?filter_path=routing_table.indices.**.state
 --------------------------------------------------
-// TEST[s/^/PUT twitter\n/]
+// TEST[s/^/PUT my-index-000001\n/]

 Responds:

@@ -291,7 +291,7 @@ Responds:
 {
   "routing_table": {
     "indices": {
-      "twitter": {
+      "my-index-000001": {
         "shards": {
           "0": [{"state": "STARTED"}, {"state": "UNASSIGNED"}]
         }
@@ -307,7 +307,7 @@ It is also possible to exclude one or more fields by prefixing the filter with t
 --------------------------------------------------
 GET /_count?filter_path=-_shards
 --------------------------------------------------
-// TEST[setup:twitter]
+// TEST[setup:my_index]

 Responds:

@@ -326,7 +326,7 @@ inclusive filters:
 --------------------------------------------------
 GET /_cluster/state?filter_path=metadata.indices.*.state,-metadata.indices.logstash-*
 --------------------------------------------------
-// TEST[s/^/PUT index-1\nPUT index-2\nPUT index-3\nPUT logstash-2016.01\n/]
+// TEST[s/^/PUT my-index-000001\nPUT my-index-000002\nPUT my-index-000003\nPUT logstash-2016.01\n/]

 Responds:

@@ -335,9 +335,9 @@ Responds:
 {
   "metadata" : {
     "indices" : {
-      "index-1" : {"state" : "open"},
-      "index-2" : {"state" : "open"},
-      "index-3" : {"state" : "open"}
+      "my-index-000001" : {"state" : "open"},
+      "my-index-000002" : {"state" : "open"},
+      "my-index-000003" : {"state" : "open"}
     }
   }
 }
@@ -384,46 +384,46 @@ The `flat_settings` flag affects rendering of the lists of settings. When the

 [source,console]
 --------------------------------------------------
-GET twitter/_settings?flat_settings=true
+GET my-index-000001/_settings?flat_settings=true
 --------------------------------------------------
-// TEST[setup:twitter]
+// TEST[setup:my_index]

 Returns:

 [source,console-result]
 --------------------------------------------------
 {
-  "twitter" : {
+  "my-index-000001" : {
     "settings": {
       "index.number_of_replicas": "1",
       "index.number_of_shards": "1",
       "index.creation_date": "1474389951325",
       "index.uuid": "n6gzFZTgS664GUfx0Xrpjw",
       "index.version.created": ...,
-      "index.provided_name" : "twitter"
+      "index.provided_name" : "my-index-000001"
     }
   }
 }
 --------------------------------------------------
-// TESTRESPONSE[s/1474389951325/$body.twitter.settings.index\\\\.creation_date/]
-// TESTRESPONSE[s/n6gzFZTgS664GUfx0Xrpjw/$body.twitter.settings.index\\\\.uuid/]
-// TESTRESPONSE[s/"index.version.created": \.\.\./"index.version.created": $body.twitter.settings.index\\\\.version\\\\.created/]
+// TESTRESPONSE[s/1474389951325/$body.my-index-000001.settings.index\\\\.creation_date/]
+// TESTRESPONSE[s/n6gzFZTgS664GUfx0Xrpjw/$body.my-index-000001.settings.index\\\\.uuid/]
+// TESTRESPONSE[s/"index.version.created": \.\.\./"index.version.created": $body.my-index-000001.settings.index\\\\.version\\\\.created/]

 When the `flat_settings` flag is `false`, settings are returned in a more
 human readable structured format:

 [source,console]
 --------------------------------------------------
-GET twitter/_settings?flat_settings=false
+GET my-index-000001/_settings?flat_settings=false
 --------------------------------------------------
-// TEST[setup:twitter]
+// TEST[setup:my_index]

 Returns:

 [source,console-result]
 --------------------------------------------------
 {
-  "twitter" : {
+  "my-index-000001" : {
     "settings" : {
       "index" : {
         "number_of_replicas": "1",
@@ -433,15 +433,15 @@ Returns:
         "version": {
           "created": ...
         },
-        "provided_name" : "twitter"
+        "provided_name" : "my-index-000001"
       }
     }
   }
 }
 --------------------------------------------------
-// TESTRESPONSE[s/1474389951325/$body.twitter.settings.index.creation_date/]
-// TESTRESPONSE[s/n6gzFZTgS664GUfx0Xrpjw/$body.twitter.settings.index.uuid/]
-// TESTRESPONSE[s/"created": \.\.\./"created": $body.twitter.settings.index.version.created/]
+// TESTRESPONSE[s/1474389951325/$body.my-index-000001.settings.index.creation_date/]
+// TESTRESPONSE[s/n6gzFZTgS664GUfx0Xrpjw/$body.my-index-000001.settings.index.uuid/]
+// TESTRESPONSE[s/"created": \.\.\./"created": $body.my-index-000001.settings.index.version.created/]

 By default `flat_settings` is set to `false`.

@@ -578,7 +578,7 @@ invalid `size` parameter to the `_search` API:

 [source,console]
 ----------------------------------------------------------------------
-POST /twitter/_search?size=surprise_me
+POST /my-index-000001/_search?size=surprise_me
 ----------------------------------------------------------------------
 // TEST[s/surprise_me/surprise_me&error_trace=false/ catch:bad_request]
 // Since the test system sends error_trace=true by default we have to override
@@ -610,7 +610,7 @@ But if you set `error_trace=true`:

 [source,console]
 ----------------------------------------------------------------------
-POST /my-index-000001/_search?size=surprise_me&error_trace=true
+POST /my-index-000001/_search?size=surprise_me&error_trace=true
 ----------------------------------------------------------------------
 // TEST[catch:bad_request]
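Reviewer note: the patch uses the same request shapes in both files (an optional comma-separated index list in the path, then `_search`, then query parameters such as `filter_path`). As a rough sketch of how those pieces compose, the helper below only assembles and prints a URL; it does not contact a cluster. The function name `search_url` and its argument order are hypothetical and are not part of Elasticsearch or of this patch.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not an Elasticsearch tool):
# builds a _search URL from a host, an optional comma-separated index list,
# and an optional query-parameter string.
search_url() {
  host=$1      # e.g. http://localhost:9200
  indices=$2   # e.g. my-index-000001,my-index-000002 (empty means all indices)
  params=$3    # e.g. filter_path=took,hits.hits._id&pretty=true
  url=$host
  # Omitting the index list targets every index, as in the README example.
  if [ -n "$indices" ]; then url="$url/$indices"; fi
  url="$url/_search"
  if [ -n "$params" ]; then url="$url?$params"; fi
  printf '%s\n' "$url"
}

# Prints the URL only; nothing is sent over the network.
search_url 'http://localhost:9200' 'my-index-000001,my-index-000002' \
  'filter_path=took,hits.hits._id,hits.hits._score&pretty=true'
```

To run the request against a local node, the printed URL could be passed to `curl -X GET` in single quotes, as the README examples do.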