746 lines
25 KiB
Plaintext
Executable File
746 lines
25 KiB
Plaintext
Executable File
[[getting-started]]
|
||
= Getting started with {es}
|
||
|
||
[partintro]
|
||
--
|
||
Ready to take {es} for a test drive and see for yourself how you can use the
|
||
REST APIs to store, search, and analyze data?
|
||
|
||
Step through this getting started tutorial to:
|
||
|
||
. Get an {es} cluster up and running
|
||
. Index some sample documents
|
||
. Search for documents using the {es} query language
|
||
. Analyze the results using bucket and metrics aggregations
|
||
|
||
|
||
Need more context?
|
||
|
||
Check out the <<elasticsearch-intro,
|
||
{es} Introduction>> to learn the lingo and understand the basics of
|
||
how {es} works. If you're already familiar with {es} and want to see how it works
|
||
with the rest of the stack, you might want to jump to the
|
||
{stack-gs}/get-started-elastic-stack.html[Elastic Stack
|
||
Tutorial] to see how to set up a system monitoring solution with {es}, {kib},
|
||
{beats}, and {ls}.
|
||
|
||
TIP: The fastest way to get started with {es} is to
|
||
{ess-trial}[start a free 14-day
|
||
trial of {ess}] in the cloud.
|
||
--
|
||
|
||
[[getting-started-install]]
|
||
== Get {es} up and running
|
||
|
||
To take {es} for a test drive, you can create a
|
||
{ess-trial}[hosted deployment] on
|
||
the {ess} or set up a multi-node {es} cluster on your own
|
||
Linux, macOS, or Windows machine.
|
||
|
||
[discrete]
|
||
[[run-elasticsearch-hosted]]
|
||
=== Run {es} on Elastic Cloud
|
||
|
||
When you create a deployment on the {es} Service, the service provisions
|
||
a three-node {es} cluster along with Kibana and APM.
|
||
|
||
To create a deployment:
|
||
|
||
. Sign up for a {ess-trial}[free trial]
|
||
and verify your email address.
|
||
. Set a password for your account.
|
||
. Click **Create Deployment**.
|
||
|
||
Once you've created a deployment, you're ready to <<getting-started-index>>.
|
||
|
||
[discrete]
|
||
[[run-elasticsearch-local]]
|
||
=== Run {es} locally on Linux, macOS, or Windows
|
||
|
||
When you create a deployment on the {ess}, a master node and
|
||
two data nodes are provisioned automatically. By installing from the tar or zip
|
||
archive, you can start multiple instances of {es} locally to see how a multi-node
|
||
cluster behaves.
|
||
|
||
To run a three-node {es} cluster locally:
|
||
|
||
. Download the {es} archive for your OS:
|
||
+
|
||
Linux: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-{version}-linux-x86_64.tar.gz[elasticsearch-{version}-linux-x86_64.tar.gz]
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-{version}-linux-x86_64.tar.gz
|
||
--------------------------------------------------
|
||
// NOTCONSOLE
|
||
+
|
||
macOS: https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-{version}-darwin-x86_64.tar.gz[elasticsearch-{version}-darwin-x86_64.tar.gz]
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
curl -L -O https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-{version}-darwin-x86_64.tar.gz
|
||
--------------------------------------------------
|
||
// NOTCONSOLE
|
||
+
|
||
Windows:
|
||
https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-{version}-windows-x86_64.zip[elasticsearch-{version}-windows-x86_64.zip]
|
||
|
||
. Extract the archive:
|
||
+
|
||
Linux:
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
tar -xvf elasticsearch-{version}-linux-x86_64.tar.gz
|
||
--------------------------------------------------
|
||
+
|
||
macOS:
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
tar -xvf elasticsearch-{version}-darwin-x86_64.tar.gz
|
||
--------------------------------------------------
|
||
+
|
||
Windows PowerShell:
|
||
+
|
||
["source","powershell",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
Expand-Archive elasticsearch-{version}-windows-x86_64.zip
|
||
--------------------------------------------------
|
||
|
||
. Start {es} from the `bin` directory:
|
||
+
|
||
Linux and macOS:
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
cd elasticsearch-{version}/bin
|
||
./elasticsearch
|
||
--------------------------------------------------
|
||
+
|
||
Windows:
|
||
+
|
||
["source","powershell",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
cd elasticsearch-{version}\bin
|
||
.\elasticsearch.bat
|
||
--------------------------------------------------
|
||
+
|
||
You now have a single-node {es} cluster up and running!
|
||
|
||
. Start two more instances of {es} so you can see how a typical multi-node
|
||
cluster behaves. You need to specify unique data and log paths
|
||
for each node.
|
||
+
|
||
Linux and macOS:
|
||
+
|
||
["source","sh",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
./elasticsearch -Epath.data=data2 -Epath.logs=log2
|
||
./elasticsearch -Epath.data=data3 -Epath.logs=log3
|
||
--------------------------------------------------
|
||
+
|
||
Windows:
|
||
+
|
||
["source","powershell",subs="attributes,callouts"]
|
||
--------------------------------------------------
|
||
.\elasticsearch.bat -E path.data=data2 -E path.logs=log2
|
||
.\elasticsearch.bat -E path.data=data3 -E path.logs=log3
|
||
--------------------------------------------------
|
||
+
|
||
The additional nodes are assigned unique IDs. Because you're running all three
|
||
nodes locally, they automatically join the cluster with the first node.
|
||
|
||
. Use the cat health API to verify that your three-node cluster is up running.
|
||
The cat APIs return information about your cluster and indices in a
|
||
format that's easier to read than raw JSON.
|
||
+
|
||
You can interact directly with your cluster by submitting HTTP requests to
|
||
the {es} REST API. If you have Kibana installed and running, you can also
|
||
open Kibana and submit requests through the Dev Console.
|
||
+
|
||
TIP: You'll want to check out the
|
||
https://www.elastic.co/guide/en/elasticsearch/client/index.html[{es} language
|
||
clients] when you're ready to start using {es} in your own applications.
|
||
+
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /_cat/health?v=true
|
||
--------------------------------------------------
|
||
+
|
||
The response should indicate that the status of the `elasticsearch` cluster
|
||
is `green` and it has three nodes:
|
||
+
|
||
[source,txt]
|
||
--------------------------------------------------
|
||
epoch timestamp cluster status node.total node.data shards pri relo init unassign pending_tasks max_task_wait_time active_shards_percent
|
||
1565052807 00:53:27 elasticsearch green 3 3 6 3 0 0 0 0 - 100.0%
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/1565052807 00:53:27 elasticsearch/\\d+ \\d+:\\d+:\\d+ integTest/]
|
||
// TESTRESPONSE[s/3 3 6 3/\\d+ \\d+ \\d+ \\d+/]
|
||
// TESTRESPONSE[s/0 0 -/0 \\d+ (-|\\d+(\.\\d+)?(micros|ms|s))/]
|
||
// TESTRESPONSE[non_json]
|
||
+
|
||
NOTE: The cluster status will remain yellow if you are only running a single
|
||
instance of {es}. A single node cluster is fully functional, but data
|
||
cannot be replicated to another node to provide resiliency. Replica shards must
|
||
be available for the cluster status to be green. If the cluster status is red,
|
||
some data is unavailable.
|
||
|
||
[discrete]
|
||
[[gs-curl]]
|
||
=== Talking to {es} with cURL commands
|
||
|
||
Most of the examples in this guide enable you to copy the appropriate cURL
|
||
command and submit the request to your local {es} instance from the command line.
|
||
|
||
A request to Elasticsearch consists of the same parts as any HTTP request:
|
||
|
||
[source,sh]
|
||
--------------------------------------------------
|
||
curl -X<VERB> '<PROTOCOL>://<HOST>:<PORT>/<PATH>?<QUERY_STRING>' -d '<BODY>'
|
||
--------------------------------------------------
|
||
// NOTCONSOLE
|
||
|
||
This example uses the following variables:
|
||
|
||
`<VERB>`:: The appropriate HTTP method or verb. For example, `GET`, `POST`,
|
||
`PUT`, `HEAD`, or `DELETE`.
|
||
`<PROTOCOL>`:: Either `http` or `https`. Use the latter if you have an HTTPS
|
||
proxy in front of {es} or you use {es} {security-features} to encrypt HTTP
|
||
communications.
|
||
`<HOST>`:: The hostname of any node in your {es} cluster. Alternatively, use
|
||
+localhost+ for a node on your local machine.
|
||
`<PORT>`:: The port running the {es} HTTP service, which defaults to `9200`.
|
||
`<PATH>`:: The API endpoint, which can contain multiple components, such as
|
||
`_cluster/stats` or `_nodes/stats/jvm`.
|
||
`<QUERY_STRING>`:: Any optional query-string parameters. For example, `?pretty`
|
||
will _pretty-print_ the JSON response to make it easier to read.
|
||
`<BODY>`:: A JSON-encoded request body (if necessary).
|
||
|
||
If the {es} {security-features} are enabled, you must also provide a valid user
|
||
name (and password) that has authority to run the API. For example, use the
|
||
`-u` or `--u` cURL command parameter. For details about which security
|
||
privileges are required to run each API, see <<rest-apis>>.
|
||
|
||
{es} responds to each API request with an HTTP status code like `200 OK`. With
|
||
the exception of `HEAD` requests, it also returns a JSON-encoded response body.
|
||
|
||
[discrete]
|
||
[[gs-other-install]]
|
||
=== Other installation options
|
||
|
||
Installing {es} from an archive file enables you to easily install and run
|
||
multiple instances locally so you can try things out. To run a single instance,
|
||
you can run {es} in a Docker container, install {es} using the DEB or RPM
|
||
packages on Linux, install using Homebrew on macOS, or install using the MSI
|
||
package installer on Windows. See <<install-elasticsearch>> for more information.
|
||
|
||
[[getting-started-index]]
|
||
== Index some documents
|
||
|
||
Once you have a cluster up and running, you're ready to index some data.
|
||
There are a variety of ingest options for {es}, but in the end they all
|
||
do the same thing: put JSON documents into an {es} index.
|
||
|
||
You can do this directly with a simple PUT request that specifies
|
||
the index you want to add the document, a unique document ID, and one or more
|
||
`"field": "value"` pairs in the request body:
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
PUT /customer/_doc/1
|
||
{
|
||
"name": "John Doe"
|
||
}
|
||
--------------------------------------------------
|
||
|
||
This request automatically creates the `customer` index if it doesn't already
|
||
exist, adds a new document that has an ID of `1`, and stores and
|
||
indexes the `name` field.
|
||
|
||
Since this is a new document, the response shows that the result of the
|
||
operation was that version 1 of the document was created:
|
||
|
||
[source,console-result]
|
||
--------------------------------------------------
|
||
{
|
||
"_index" : "customer",
|
||
"_type" : "_doc",
|
||
"_id" : "1",
|
||
"_version" : 1,
|
||
"result" : "created",
|
||
"_shards" : {
|
||
"total" : 2,
|
||
"successful" : 2,
|
||
"failed" : 0
|
||
},
|
||
"_seq_no" : 26,
|
||
"_primary_term" : 4
|
||
}
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/]
|
||
// TESTRESPONSE[s/"successful" : \d+/"successful" : $body._shards.successful/]
|
||
// TESTRESPONSE[s/"_primary_term" : \d+/"_primary_term" : $body._primary_term/]
|
||
|
||
|
||
The new document is available immediately from any node in the cluster.
|
||
You can retrieve it with a GET request that specifies its document ID:
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /customer/_doc/1
|
||
--------------------------------------------------
|
||
// TEST[continued]
|
||
|
||
The response indicates that a document with the specified ID was found
|
||
and shows the original source fields that were indexed.
|
||
|
||
[source,console-result]
|
||
--------------------------------------------------
|
||
{
|
||
"_index" : "customer",
|
||
"_type" : "_doc",
|
||
"_id" : "1",
|
||
"_version" : 1,
|
||
"_seq_no" : 26,
|
||
"_primary_term" : 4,
|
||
"found" : true,
|
||
"_source" : {
|
||
"name": "John Doe"
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/"_seq_no" : \d+/"_seq_no" : $body._seq_no/ ]
|
||
// TESTRESPONSE[s/"_primary_term" : \d+/"_primary_term" : $body._primary_term/]
|
||
|
||
[discrete]
|
||
[[getting-started-batch-processing]]
|
||
=== Indexing documents in bulk
|
||
|
||
If you have a lot of documents to index, you can submit them in batches with
|
||
the {ref}/docs-bulk.html[bulk API]. Using bulk to batch document
|
||
operations is significantly faster than submitting requests individually as it minimizes network roundtrips.
|
||
|
||
The optimal batch size depends on a number of factors: the document size and complexity, the indexing and search load, and the resources available to your cluster. A good place to start is with batches of 1,000 to 5,000 documents
|
||
and a total payload between 5MB and 15MB. From there, you can experiment
|
||
to find the sweet spot.
|
||
|
||
To get some data into {es} that you can start searching and analyzing:
|
||
|
||
. Download the https://github.com/elastic/elasticsearch/blob/master/docs/src/test/resources/accounts.json?raw=true[`accounts.json`] sample data set. The documents in this randomly-generated data set represent user accounts with the following information:
|
||
+
|
||
[source,js]
|
||
--------------------------------------------------
|
||
{
|
||
"account_number": 0,
|
||
"balance": 16623,
|
||
"firstname": "Bradshaw",
|
||
"lastname": "Mckenzie",
|
||
"age": 29,
|
||
"gender": "F",
|
||
"address": "244 Columbus Place",
|
||
"employer": "Euron",
|
||
"email": "bradshawmckenzie@euron.com",
|
||
"city": "Hobucken",
|
||
"state": "CO"
|
||
}
|
||
--------------------------------------------------
|
||
// NOTCONSOLE
|
||
|
||
. Index the account data into the `bank` index with the following `_bulk` request:
|
||
+
|
||
[source,sh]
|
||
--------------------------------------------------
|
||
curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_bulk?pretty&refresh" --data-binary "@accounts.json"
|
||
curl "localhost:9200/_cat/indices?v=true"
|
||
--------------------------------------------------
|
||
// NOTCONSOLE
|
||
+
|
||
////
|
||
This replicates the above in a document-testing friendly way but isn't visible
|
||
in the docs:
|
||
+
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /_cat/indices?v=true
|
||
--------------------------------------------------
|
||
// TEST[setup:bank]
|
||
////
|
||
+
|
||
The response indicates that 1,000 documents were indexed successfully.
|
||
+
|
||
[source,txt]
|
||
--------------------------------------------------
|
||
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
|
||
yellow open bank l7sSYV2cQXmu6_4rJWVIww 5 1 1000 0 128.6kb 128.6kb
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/128.6kb/\\d+(\\.\\d+)?[mk]?b/]
|
||
// TESTRESPONSE[s/l7sSYV2cQXmu6_4rJWVIww/.+/ non_json]
|
||
|
||
[[getting-started-search]]
|
||
== Start searching
|
||
|
||
Once you have ingested some data into an {es} index, you can search it
|
||
by sending requests to the `_search` endpoint. To access the full suite of
|
||
search capabilities, you use the {es} Query DSL to specify the
|
||
search criteria in the request body. You specify the name of the index you
|
||
want to search in the request URI.
|
||
|
||
For example, the following request retrieves all documents in the `bank`
|
||
index sorted by account number:
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": { "match_all": {} },
|
||
"sort": [
|
||
{ "account_number": "asc" }
|
||
]
|
||
}
|
||
--------------------------------------------------
|
||
// TEST[continued]
|
||
|
||
By default, the `hits` section of the response includes the first 10 documents
|
||
that match the search criteria:
|
||
|
||
[source,console-result]
|
||
--------------------------------------------------
|
||
{
|
||
"took" : 63,
|
||
"timed_out" : false,
|
||
"_shards" : {
|
||
"total" : 5,
|
||
"successful" : 5,
|
||
"skipped" : 0,
|
||
"failed" : 0
|
||
},
|
||
"hits" : {
|
||
"total" : {
|
||
"value": 1000,
|
||
"relation": "eq"
|
||
},
|
||
"max_score" : null,
|
||
"hits" : [ {
|
||
"_index" : "bank",
|
||
"_type" : "_doc",
|
||
"_id" : "0",
|
||
"sort": [0],
|
||
"_score" : null,
|
||
"_source" : {"account_number":0,"balance":16623,"firstname":"Bradshaw","lastname":"Mckenzie","age":29,"gender":"F","address":"244 Columbus Place","employer":"Euron","email":"bradshawmckenzie@euron.com","city":"Hobucken","state":"CO"}
|
||
}, {
|
||
"_index" : "bank",
|
||
"_type" : "_doc",
|
||
"_id" : "1",
|
||
"sort": [1],
|
||
"_score" : null,
|
||
"_source" : {"account_number":1,"balance":39225,"firstname":"Amber","lastname":"Duke","age":32,"gender":"M","address":"880 Holmes Lane","employer":"Pyrami","email":"amberduke@pyrami.com","city":"Brogan","state":"IL"}
|
||
}, ...
|
||
]
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/"took" : 63/"took" : $body.took/]
|
||
// TESTRESPONSE[s/\.\.\./$body.hits.hits.2, $body.hits.hits.3, $body.hits.hits.4, $body.hits.hits.5, $body.hits.hits.6, $body.hits.hits.7, $body.hits.hits.8, $body.hits.hits.9/]
|
||
|
||
The response also provides the following information about the search request:
|
||
|
||
* `took` – how long it took {es} to run the query, in milliseconds
|
||
* `timed_out` – whether or not the search request timed out
|
||
* `_shards` – how many shards were searched and a breakdown of how many shards
|
||
succeeded, failed, or were skipped.
|
||
* `max_score` – the score of the most relevant document found
|
||
* `hits.total.value` - how many matching documents were found
|
||
* `hits.sort` - the document's sort position (when not sorting by relevance score)
|
||
* `hits._score` - the document's relevance score (not applicable when using `match_all`)
|
||
|
||
Each search request is self-contained: {es} does not maintain any
|
||
state information across requests. To page through the search hits, specify
|
||
the `from` and `size` parameters in your request.
|
||
|
||
For example, the following request gets hits 10 through 19:
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": { "match_all": {} },
|
||
"sort": [
|
||
{ "account_number": "asc" }
|
||
],
|
||
"from": 10,
|
||
"size": 10
|
||
}
|
||
--------------------------------------------------
|
||
// TEST[continued]
|
||
|
||
Now that you've seen how to submit a basic search request, you can start to
|
||
construct queries that are a bit more interesting than `match_all`.
|
||
|
||
To search for specific terms within a field, you can use a `match` query.
|
||
For example, the following request searches the `address` field to find
|
||
customers whose addresses contain `mill` or `lane`:
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": { "match": { "address": "mill lane" } }
|
||
}
|
||
--------------------------------------------------
|
||
// TEST[continued]
|
||
|
||
To perform a phrase search rather than matching individual terms, you use
|
||
`match_phrase` instead of `match`. For example, the following request only
|
||
matches addresses that contain the phrase `mill lane`:
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": { "match_phrase": { "address": "mill lane" } }
|
||
}
|
||
--------------------------------------------------
|
||
// TEST[continued]
|
||
|
||
To construct more complex queries, you can use a `bool` query to combine
|
||
multiple query criteria. You can designate criteria as required (must match),
|
||
desirable (should match), or undesirable (must not match).
|
||
|
||
For example, the following request searches the `bank` index for accounts that
|
||
belong to customers who are 40 years old, but excludes anyone who lives in
|
||
Idaho (ID):
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": {
|
||
"bool": {
|
||
"must": [
|
||
{ "match": { "age": "40" } }
|
||
],
|
||
"must_not": [
|
||
{ "match": { "state": "ID" } }
|
||
]
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// TEST[continued]
|
||
|
||
Each `must`, `should`, and `must_not` element in a Boolean query is referred
|
||
to as a query clause. How well a document meets the criteria in each `must` or
|
||
`should` clause contributes to the document's _relevance score_. The higher the
|
||
score, the better the document matches your search criteria. By default, {es}
|
||
returns documents ranked by these relevance scores.
|
||
|
||
The criteria in a `must_not` clause is treated as a _filter_. It affects whether
|
||
or not the document is included in the results, but does not contribute to
|
||
how documents are scored. You can also explicitly specify arbitrary filters to
|
||
include or exclude documents based on structured data.
|
||
|
||
For example, the following request uses a range filter to limit the results to
|
||
accounts with a balance between $20,000 and $30,000 (inclusive).
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"query": {
|
||
"bool": {
|
||
"must": { "match_all": {} },
|
||
"filter": {
|
||
"range": {
|
||
"balance": {
|
||
"gte": 20000,
|
||
"lte": 30000
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// TEST[continued]
|
||
|
||
[[getting-started-aggregations]]
|
||
== Analyze results with aggregations
|
||
|
||
{es} aggregations enable you to get meta-information about your search results
|
||
and answer questions like, "How many account holders are in Texas?" or
|
||
"What's the average balance of accounts in Tennessee?" You can search
|
||
documents, filter hits, and use aggregations to analyze the results all in one
|
||
request.
|
||
|
||
For example, the following request uses a `terms` aggregation to group
|
||
all of the accounts in the `bank` index by state, and returns the ten states
|
||
with the most accounts in descending order:
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"size": 0,
|
||
"aggs": {
|
||
"group_by_state": {
|
||
"terms": {
|
||
"field": "state.keyword"
|
||
}
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// TEST[continued]
|
||
|
||
The `buckets` in the response are the values of the `state` field. The
|
||
`doc_count` shows the number of accounts in each state. For example, you
|
||
can see that there are 27 accounts in `ID` (Idaho). Because the request
|
||
set `size=0`, the response only contains the aggregation results.
|
||
|
||
[source,console-result]
|
||
--------------------------------------------------
|
||
{
|
||
"took": 29,
|
||
"timed_out": false,
|
||
"_shards": {
|
||
"total": 5,
|
||
"successful": 5,
|
||
"skipped" : 0,
|
||
"failed": 0
|
||
},
|
||
"hits" : {
|
||
"total" : {
|
||
"value": 1000,
|
||
"relation": "eq"
|
||
},
|
||
"max_score" : null,
|
||
"hits" : [ ]
|
||
},
|
||
"aggregations" : {
|
||
"group_by_state" : {
|
||
"doc_count_error_upper_bound": 20,
|
||
"sum_other_doc_count": 770,
|
||
"buckets" : [ {
|
||
"key" : "ID",
|
||
"doc_count" : 27
|
||
}, {
|
||
"key" : "TX",
|
||
"doc_count" : 27
|
||
}, {
|
||
"key" : "AL",
|
||
"doc_count" : 25
|
||
}, {
|
||
"key" : "MD",
|
||
"doc_count" : 25
|
||
}, {
|
||
"key" : "TN",
|
||
"doc_count" : 23
|
||
}, {
|
||
"key" : "MA",
|
||
"doc_count" : 21
|
||
}, {
|
||
"key" : "NC",
|
||
"doc_count" : 21
|
||
}, {
|
||
"key" : "ND",
|
||
"doc_count" : 21
|
||
}, {
|
||
"key" : "ME",
|
||
"doc_count" : 20
|
||
}, {
|
||
"key" : "MO",
|
||
"doc_count" : 20
|
||
} ]
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// TESTRESPONSE[s/"took": 29/"took": $body.took/]
|
||
|
||
|
||
You can combine aggregations to build more complex summaries of your data. For
|
||
example, the following request nests an `avg` aggregation within the previous
|
||
`group_by_state` aggregation to calculate the average account balances for
|
||
each state.
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"size": 0,
|
||
"aggs": {
|
||
"group_by_state": {
|
||
"terms": {
|
||
"field": "state.keyword"
|
||
},
|
||
"aggs": {
|
||
"average_balance": {
|
||
"avg": {
|
||
"field": "balance"
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// TEST[continued]
|
||
|
||
Instead of sorting the results by count, you could sort using the result of
|
||
the nested aggregation by specifying the order within the `terms` aggregation:
|
||
|
||
[source,console]
|
||
--------------------------------------------------
|
||
GET /bank/_search
|
||
{
|
||
"size": 0,
|
||
"aggs": {
|
||
"group_by_state": {
|
||
"terms": {
|
||
"field": "state.keyword",
|
||
"order": {
|
||
"average_balance": "desc"
|
||
}
|
||
},
|
||
"aggs": {
|
||
"average_balance": {
|
||
"avg": {
|
||
"field": "balance"
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
}
|
||
--------------------------------------------------
|
||
// TEST[continued]
|
||
|
||
In addition to basic bucketing and metrics aggregations like these, {es}
|
||
provides specialized aggregations for operating on multiple fields and
|
||
analyzing particular types of data such as dates, IP addresses, and geo
|
||
data. You can also feed the results of individual aggregations into pipeline
|
||
aggregations for further analysis.
|
||
|
||
The core analysis capabilities provided by aggregations enable advanced
|
||
features such as using machine learning to detect anomalies.
|
||
|
||
[[getting-started-next-steps]]
|
||
== Where to go from here
|
||
|
||
Now that you've set up a cluster, indexed some documents, and run some
|
||
searches and aggregations, you might want to:
|
||
|
||
* {stack-gs}/get-started-elastic-stack.html#install-kibana[Dive in to the Elastic
|
||
Stack Tutorial] to install Kibana, Logstash, and Beats and
|
||
set up a basic system monitoring solution.
|
||
|
||
* {kibana-ref}/add-sample-data.html[Load one of the sample data sets into Kibana]
|
||
to see how you can use {es} and Kibana together to visualize your data.
|
||
|
||
* Try out one of the Elastic search solutions:
|
||
** https://swiftype.com/documentation/site-search/crawler-quick-start[Site Search]
|
||
** https://swiftype.com/documentation/app-search/getting-started[App Search]
|
||
** https://swiftype.com/documentation/enterprise-search/getting-started[Enterprise Search]
|