[Docs] Unify spelling of Elasticsearch (#27567)
Removes occurrences of "elasticsearch" or "ElasticSearch" in favour of "Elasticsearch" where appropriate.

This commit is contained in:
parent 547f006118
commit 0d11b9fe34
@@ -129,7 +129,7 @@ The following project appears to be abandoned:
== Lua

* https://github.com/DhavalKapil/elasticsearch-lua[elasticsearch-lua]:
Lua client for elasticsearch
Lua client for Elasticsearch

[[dotnet]]
== .NET

@@ -218,7 +218,7 @@ Also see the {client}/ruby-api/current/index.html[official Elasticsearch Ruby cl
Tiny client with built-in zero-downtime migrations and ActiveRecord integration.

* https://github.com/toptal/chewy[chewy]:
Chewy is ODM and wrapper for official elasticsearch client
Chewy is an ODM and wrapper for the official Elasticsearch client

* https://github.com/ankane/searchkick[Searchkick]:
Intelligent search made easy
@@ -13,7 +13,7 @@ The first type is to simply provide the request as a Closure, which
automatically gets resolved into the respective request instance (for
the index API, its the `IndexRequest` class). The API returns a special
future, called `GActionFuture`. This is a groovier version of
elasticsearch Java `ActionFuture` (in turn a nicer extension to Java own
Elasticsearch Java `ActionFuture` (in turn a nicer extension to Java own
`Future`) which allows to register listeners (closures) on it for
success and failures, as well as blocking for the response. For example:

@@ -1,7 +1,7 @@
[[client]]
== Client

Obtaining an elasticsearch Groovy `GClient` (a `GClient` is a simple
Obtaining an Elasticsearch Groovy `GClient` (a `GClient` is a simple
wrapper on top of the Java `Client`) is simple. The most common way to
get a client is by starting an embedded `Node` which acts as a node
within the cluster.
@@ -11,7 +11,7 @@ within the cluster.
=== Node Client

A Node based client is the simplest form to get a `GClient` to start
executing operations against elasticsearch.
executing operations against Elasticsearch.

[source,groovy]
--------------------------------------------------
@@ -29,7 +29,7 @@ GClient client = node.client();
node.close();
--------------------------------------------------

Since elasticsearch allows to configure it using JSON based settings,
Since Elasticsearch allows to configure it using JSON based settings,
the configuration itself can be done using a closure that represent the
JSON:

@@ -6,7 +6,7 @@ include::../Versions.asciidoc[]
== Preface

This section describes the http://groovy-lang.org/[Groovy] API
elasticsearch provides. All elasticsearch APIs are executed using a
Elasticsearch provides. All Elasticsearch APIs are executed using a
<<client,GClient>>, and are completely
asynchronous in nature (they either accept a listener, or return a
future).
@@ -8,7 +8,7 @@ You can use the *Java client* in multiple ways:
existing cluster
* Perform administrative tasks on a running cluster

Obtaining an elasticsearch `Client` is simple. The most common way to
Obtaining an Elasticsearch `Client` is simple. The most common way to
get a client is by creating a <<transport-client,`TransportClient`>>
that connects to a cluster.
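
A minimal sketch of that pattern, assuming the 6.x `PreBuiltTransportClient`; the cluster name, host and port below are illustrative:

[source,java]
--------------------------------------------------
import java.net.InetAddress;

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.TransportAddress;
import org.elasticsearch.transport.client.PreBuiltTransportClient;

public class TransportClientSketch {
    public static void main(String[] args) throws Exception {
        // Illustrative cluster name; it must match the cluster you connect to.
        Settings settings = Settings.builder()
                .put("cluster.name", "my-cluster")
                .build();
        try (TransportClient client = new PreBuiltTransportClient(settings)
                .addTransportAddress(new TransportAddress(InetAddress.getByName("localhost"), 9300))) {
            // The client is now ready for index, search and admin calls.
            System.out.println(client.connectedNodes());
        }
    }
}
--------------------------------------------------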
@@ -69,7 +69,7 @@ After this, the client will call the internal cluster state API on those nodes
to discover available data nodes. The internal node list of the client will
be replaced with those data nodes only. This list is refreshed every five seconds by default.
Note that the IP addresses the sniffer connects to are the ones declared as the 'publish'
address in those node's elasticsearch config.
address in those node's Elasticsearch config.

Keep in mind that the list might possibly not include the original node it connected to
if that node is not a data node. If, for instance, you initially connect to a
@@ -78,7 +78,7 @@ BulkProcessor bulkProcessor = BulkProcessor.builder(
BackoffPolicy.exponentialBackoff(TimeValue.timeValueMillis(100), 3)) <9>
.build();
--------------------------------------------------
<1> Add your elasticsearch client
<1> Add your Elasticsearch client
<2> This method is called just before bulk is executed. You can for example see the numberOfActions with
`request.numberOfActions()`
<3> This method is called after bulk execution. You can for example check if there was some failing requests
@@ -138,7 +138,7 @@ all bulk requests to complete then returns `true`, if the specified waiting time
[[java-docs-bulk-processor-tests]]
==== Using Bulk Processor in tests

If you are running tests with elasticsearch and are using the `BulkProcessor` to populate your dataset
If you are running tests with Elasticsearch and are using the `BulkProcessor` to populate your dataset
you should better set the number of concurrent requests to `0` so the flush operation of the bulk will be executed
in a synchronous manner:
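
A compact sketch of the pattern discussed here, assuming the 5.x/6.x `BulkProcessor` builder; thresholds and listener bodies are illustrative, and `setConcurrentRequests(0)` gives the synchronous flush recommended for tests:

[source,java]
--------------------------------------------------
import org.elasticsearch.action.bulk.BackoffPolicy;
import org.elasticsearch.action.bulk.BulkProcessor;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.unit.TimeValue;

public class BulkProcessorSketch {
    static BulkProcessor build(Client client) {
        return BulkProcessor.builder(client, new BulkProcessor.Listener() {
            @Override
            public void beforeBulk(long executionId, BulkRequest request) {
                // e.g. inspect request.numberOfActions() before the bulk is executed
            }
            @Override
            public void afterBulk(long executionId, BulkRequest request, BulkResponse response) {
                // e.g. check response.hasFailures() after the bulk executed
            }
            @Override
            public void afterBulk(long executionId, BulkRequest request, Throwable failure) {
                // the whole bulk request failed
            }
        })
        .setBulkActions(1000)
        .setConcurrentRequests(0) // flush synchronously, useful in tests
        .setBackoffPolicy(BackoffPolicy.exponentialBackoff(TimeValue.timeValueMillis(100), 3))
        .build();
    }
}
--------------------------------------------------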
@@ -5,8 +5,8 @@ include::../Versions.asciidoc[]

[preface]
== Preface
This section describes the Java API that elasticsearch provides. All
elasticsearch operations are executed using a
This section describes the Java API that Elasticsearch provides. All
Elasticsearch operations are executed using a
<<client,Client>> object. All
operations are completely asynchronous in nature (either accepts a
listener, or returns a future).

@@ -2,7 +2,7 @@
== Indexed Scripts API

The indexed script API allows one to interact with scripts and templates
stored in an elasticsearch index. It can be used to create, update, get,
stored in an Elasticsearch index. It can be used to create, update, get,
and delete indexed scripts and templates.

[source,java]
@@ -16,10 +16,10 @@ The javadoc for the REST client sniffer can be found at {rest-client-sniffer-jav
=== Maven Repository

The REST client sniffer is subject to the same release cycle as
elasticsearch. Replace the version with the desired sniffer version, first
Elasticsearch. Replace the version with the desired sniffer version, first
released with `5.0.0-alpha4`. There is no relation between the sniffer version
and the elasticsearch version that the client can communicate with. Sniffer
supports fetching the nodes list from elasticsearch 2.x and onwards.
and the Elasticsearch version that the client can communicate with. Sniffer
supports fetching the nodes list from Elasticsearch 2.x and onwards.


==== Maven configuration
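
A minimal sketch of wiring a sniffer onto a low-level REST client, assuming the `Sniffer` builder shipped with this artifact; host and sniff interval are illustrative:

[source,java]
--------------------------------------------------
import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.sniff.Sniffer;

public class SnifferSketch {
    public static void main(String[] args) throws Exception {
        RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200, "http")).build();
        // Refresh the client's node list from the cluster every 60 seconds.
        Sniffer sniffer = Sniffer.builder(restClient)
                .setSniffIntervalMillis(60000)
                .build();
        // ... use restClient ...
        sniffer.close();
        restClient.close();
    }
}
--------------------------------------------------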
@@ -17,10 +17,10 @@ http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.elasticsearch.client%22[Ma
Central]. The minimum Java version required is `1.7`.

The low-level REST client is subject to the same release cycle as
elasticsearch. Replace the version with the desired client version, first
Elasticsearch. Replace the version with the desired client version, first
released with `5.0.0-alpha4`. There is no relation between the client version
and the elasticsearch version that the client can communicate with. The
low-level REST client is compatible with all elasticsearch versions.
and the Elasticsearch version that the client can communicate with. The
low-level REST client is compatible with all Elasticsearch versions.

[[java-rest-low-usage-maven-maven]]
==== Maven configuration
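
A minimal sketch of using the low-level client once the dependency is declared; the host and endpoint are illustrative:

[source,java]
--------------------------------------------------
import java.util.Collections;

import org.apache.http.HttpHost;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class LowLevelClientSketch {
    public static void main(String[] args) throws Exception {
        RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200, "http")).build();
        // Plain GET against the cluster root; the response body is raw JSON.
        Response response = restClient.performRequest("GET", "/", Collections.<String, String>emptyMap());
        System.out.println(EntityUtils.toString(response.getEntity()));
        restClient.close();
    }
}
--------------------------------------------------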
@@ -46,7 +46,7 @@ can fill in the necessary values in the `build.gradle` file for your plugin.
Use the system property `java.specification.version`. Version string must be a sequence
of nonnegative decimal integers separated by "."'s and may have leading zeros.

|`elasticsearch.version` |String | version of elasticsearch compiled against.
|`elasticsearch.version` |String | version of Elasticsearch compiled against.

|=======================================================================

@@ -57,7 +57,7 @@ If you need other resources, package them into a resources jar.
.Plugin release lifecycle
==============================================

You will have to release a new version of the plugin for each new elasticsearch release.
You will have to release a new version of the plugin for each new Elasticsearch release.
This version is checked when the plugin is loaded so Elasticsearch will refuse to start
in the presence of plugins with the incorrect `elasticsearch.version`.

@@ -86,7 +86,7 @@ with a large warning, and they will have to confirm them when installing the
plugin interactively. So if possible, it is best to avoid requesting any
spurious permissions!

If you are using the elasticsearch Gradle build system, place this file in
If you are using the Elasticsearch Gradle build system, place this file in
`src/main/plugin-metadata` and it will be applied during unit tests as well.

Keep in mind that the Java security model is stack-based, and the additional
@@ -37,7 +37,7 @@ discovery:
.Binding the network host
==============================================

The keystore file must be placed in a directory accessible by elasticsearch like the `config` directory.
The keystore file must be placed in a directory accessible by Elasticsearch like the `config` directory.

It's important to define `network.host` as by default it's bound to `localhost`.

@@ -95,7 +95,7 @@ The following are a list of settings that can further control the discovery:
`discovery.azure.endpoint.name`::

When using `public_ip` this setting is used to identify the endpoint name
used to forward requests to elasticsearch (aka transport port name).
used to forward requests to Elasticsearch (aka transport port name).
Defaults to `elasticsearch`. In Azure management console, you could define
an endpoint `elasticsearch` forwarding for example requests on public IP
on port 8100 to the virtual machine on port 9300.
@@ -131,7 +131,7 @@ discovery:
We will expose here one strategy which is to hide our Elasticsearch cluster from outside.

With this strategy, only VMs behind the same virtual port can talk to each
other. That means that with this mode, you can use elasticsearch unicast
other. That means that with this mode, you can use Elasticsearch unicast
discovery to build a cluster, using the Azure API to retrieve information
about your nodes.

@@ -177,7 +177,7 @@ cat azure-cert.pem azure-pk.pem > azure.pem.txt
openssl pkcs12 -export -in azure.pem.txt -out azurekeystore.pkcs12 -name azure -noiter -nomaciter
----

Upload the `azure-certificate.cer` file both in the elasticsearch Cloud Service (under `Manage Certificates`),
Upload the `azure-certificate.cer` file both in the Elasticsearch Cloud Service (under `Manage Certificates`),
and under `Settings -> Manage Certificates`.

IMPORTANT: When prompted for a password, you need to enter a non empty one.
@@ -354,7 +354,7 @@ sudo dpkg -i elasticsearch-{version}.deb
----
// NOTCONSOLE

Check that elasticsearch is running:
Check that Elasticsearch is running:

[source,js]
----
@@ -393,11 +393,11 @@ This command should give you a JSON result:
// So much s/// but at least we test that the layout is close to matching....

[[discovery-azure-classic-long-plugin]]
===== Install elasticsearch cloud azure plugin
===== Install Elasticsearch cloud azure plugin

[source,sh]
----
# Stop elasticsearch
# Stop Elasticsearch
sudo service elasticsearch stop

# Install the plugin
@@ -428,7 +428,7 @@ discovery:
# path.data: /mnt/resource/elasticsearch/data
----

Restart elasticsearch:
Restart Elasticsearch:

[source,sh]
----
@@ -181,10 +181,10 @@ to use include the `discovery.ec2.tag.` prefix. For example, setting `discovery.
filter instances with a tag key set to `stage`, and a value of `dev`. Several tags set will require all of those tags
to be set for the instance to be included.

One practical use for tag filtering is when an ec2 cluster contains many nodes that are not running elasticsearch. In
One practical use for tag filtering is when an ec2 cluster contains many nodes that are not running Elasticsearch. In
this case (particularly with high `discovery.zen.ping_timeout` values) there is a risk that a new node's discovery phase
will end before it has found the cluster (which will result in it declaring itself master of a new cluster with the same
name - highly undesirable). Tagging elasticsearch ec2 nodes and then filtering by that tag will resolve this issue.
name - highly undesirable). Tagging Elasticsearch ec2 nodes and then filtering by that tag will resolve this issue.

[[discovery-ec2-attributes]]
===== Automatic Node Attributes
|
||||
--------------------------------------------------
|
||||
|
||||
[[discovery-gce-usage-long-install-plugin]]
|
||||
===== Install elasticsearch discovery gce plugin
|
||||
===== Install Elasticsearch discovery gce plugin
|
||||
|
||||
Install the plugin:
|
||||
|
||||
@ -231,7 +231,7 @@ discovery:
|
||||
--------------------------------------------------
|
||||
|
||||
|
||||
Start elasticsearch:
|
||||
Start Elasticsearch:
|
||||
|
||||
[source,sh]
|
||||
--------------------------------------------------
|
||||
@ -354,9 +354,9 @@ For example, setting `discovery.gce.tags` to `dev` will only filter instances ha
|
||||
set will require all of those tags to be set for the instance to be included.
|
||||
|
||||
One practical use for tag filtering is when an GCE cluster contains many nodes that are not running
|
||||
elasticsearch. In this case (particularly with high `discovery.zen.ping_timeout` values) there is a risk that a new
|
||||
Elasticsearch. In this case (particularly with high `discovery.zen.ping_timeout` values) there is a risk that a new
|
||||
node's discovery phase will end before it has found the cluster (which will result in it declaring itself master of a
|
||||
new cluster with the same name - highly undesirable). Adding tag on elasticsearch GCE nodes and then filtering by that
|
||||
new cluster with the same name - highly undesirable). Adding tag on Elasticsearch GCE nodes and then filtering by that
|
||||
tag will resolve this issue.
|
||||
|
||||
Add your tag when building the new instance:
|
||||
@ -385,8 +385,8 @@ discovery:
|
||||
[[discovery-gce-usage-port]]
|
||||
==== Changing default transport port
|
||||
|
||||
By default, elasticsearch GCE plugin assumes that you run elasticsearch on 9300 default port.
|
||||
But you can specify the port value elasticsearch is meant to use using google compute engine metadata `es_port`:
|
||||
By default, Elasticsearch GCE plugin assumes that you run Elasticsearch on 9300 default port.
|
||||
But you can specify the port value Elasticsearch is meant to use using google compute engine metadata `es_port`:
|
||||
|
||||
[[discovery-gce-usage-port-create]]
|
||||
===== When creating instance
|
||||
|
@@ -60,10 +60,10 @@ releases 2.0 and later do not support rivers.
The Java Database Connection (JDBC) importer allows to fetch data from JDBC sources for indexing into Elasticsearch (by Jörg Prante)

* https://github.com/reachkrishnaraj/kafka-elasticsearch-standalone-consumer/tree/branch2.0[Kafka Standalone Consumer(Indexer)]:
Kafka Standalone Consumer [Indexer] will read messages from Kafka in batches, processes(as implemented) and bulk-indexes them into ElasticSearch. Flexible and scalable. More documentation in above GitHub repo's Wiki.(Please use branch 2.0)!
Kafka Standalone Consumer [Indexer] will read messages from Kafka in batches, processes(as implemented) and bulk-indexes them into Elasticsearch. Flexible and scalable. More documentation in above GitHub repo's Wiki.(Please use branch 2.0)!

* https://github.com/ozlerhakan/mongolastic[Mongolastic]:
A tool that clones data from ElasticSearch to MongoDB and vice versa
A tool that clones data from Elasticsearch to MongoDB and vice versa

* https://github.com/Aconex/scrutineer[Scrutineer]:
A high performance consistency checker to compare what you've indexed
@@ -106,7 +106,7 @@ releases 2.0 and later do not support rivers.
indexing in Elasticsearch.

* https://camel.apache.org/elasticsearch.html[Apache Camel Integration]:
An Apache camel component to integrate elasticsearch
An Apache camel component to integrate Elasticsearch

* https://metacpan.org/release/Catmandu-Store-ElasticSearch[Catmanadu]:
An Elasticsearch backend for the Catmandu framework.
@@ -164,7 +164,7 @@ releases 2.0 and later do not support rivers.
Nagios.

* https://github.com/radu-gheorghe/check-es[check-es]:
Nagios/Shinken plugins for checking on elasticsearch
Nagios/Shinken plugins for checking on Elasticsearch

* https://github.com/mattweber/es2graphite[es2graphite]:
Send cluster and indices stats and status to Graphite for monitoring and graphing.
@@ -206,7 +206,7 @@ These projects appear to have been abandoned:
Daikon Elasticsearch CLI

* https://github.com/fullscale/dangle[dangle]:
A set of AngularJS directives that provide common visualizations for elasticsearch based on
A set of AngularJS directives that provide common visualizations for Elasticsearch based on
D3.
* https://github.com/OlegKunitsyn/eslogd[eslogd]:
Linux daemon that replicates events to a central Elasticsearch server in realtime
|
||||
`default` is the default account name which will be used by a repository unless you set an explicit one.
|
||||
|
||||
You can set the client side timeout to use when making any single request. It can be defined globally, per account or both.
|
||||
It's not set by default which means that elasticsearch is using the
|
||||
It's not set by default which means that Elasticsearch is using the
|
||||
http://azure.github.io/azure-storage-java/com/microsoft/azure/storage/RequestOptions.html#setTimeoutIntervalInMs(java.lang.Integer)[default value]
|
||||
set by the azure client (known as 5 minutes).
|
||||
|
||||
|
@ -42,7 +42,7 @@ The bucket should now be created.
|
||||
The plugin supports two authentication modes:
|
||||
|
||||
* The built-in <<repository-gcs-using-compute-engine, Compute Engine authentication>>. This mode is
|
||||
recommended if your elasticsearch node is running on a Compute Engine virtual machine.
|
||||
recommended if your Elasticsearch node is running on a Compute Engine virtual machine.
|
||||
|
||||
* Specifying <<repository-gcs-using-service-account, Service Account>> credentials.
|
||||
|
||||
@ -61,7 +61,7 @@ instance details at the section "Cloud API access scopes".
|
||||
|
||||
[[repository-gcs-using-service-account]]
|
||||
===== Using a Service Account
|
||||
If your elasticsearch node is not running on Compute Engine, or if you don't want to use Google's
|
||||
If your Elasticsearch node is not running on Compute Engine, or if you don't want to use Google's
|
||||
built-in authentication mechanism, you can authenticate on the Storage service using a
|
||||
https://cloud.google.com/iam/docs/overview#service_account[Service Account] file.
|
||||
|
||||
|
@ -95,7 +95,7 @@ methods are supported by the plugin:
|
||||
`simple`::
|
||||
|
||||
Also means "no security" and is enabled by default. Uses information from underlying operating system account
|
||||
running elasticsearch to inform Hadoop of the name of the current user. Hadoop makes no attempts to verify this
|
||||
running Elasticsearch to inform Hadoop of the name of the current user. Hadoop makes no attempts to verify this
|
||||
information.
|
||||
|
||||
`kerberos`::
|
||||
|
@ -284,7 +284,7 @@ You may further restrict the permissions by specifying a prefix within the bucke
|
||||
// NOTCONSOLE
|
||||
|
||||
The bucket needs to exist to register a repository for snapshots. If you did not create the bucket then the repository
|
||||
registration will fail. If you want elasticsearch to create the bucket instead, you can add the permission to create a
|
||||
registration will fail. If you want Elasticsearch to create the bucket instead, you can add the permission to create a
|
||||
specific bucket like this:
|
||||
|
||||
[source,js]
|
||||
@ -305,6 +305,6 @@ specific bucket like this:
|
||||
[float]
|
||||
==== AWS VPC Bandwidth Settings
|
||||
|
||||
AWS instances resolve S3 endpoints to a public IP. If the elasticsearch instances reside in a private subnet in an AWS VPC then all traffic to S3 will go through that VPC's NAT instance. If your VPC's NAT instance is a smaller instance size (e.g. a t1.micro) or is handling a high volume of network traffic your bandwidth to S3 may be limited by that NAT instance's networking bandwidth limitations.
|
||||
AWS instances resolve S3 endpoints to a public IP. If the Elasticsearch instances reside in a private subnet in an AWS VPC then all traffic to S3 will go through that VPC's NAT instance. If your VPC's NAT instance is a smaller instance size (e.g. a t1.micro) or is handling a high volume of network traffic your bandwidth to S3 may be limited by that NAT instance's networking bandwidth limitations.
|
||||
|
||||
Instances residing in a public subnet in an AWS VPC will connect to S3 via the VPC's internet gateway and not be bandwidth limited by the VPC's NAT instance.
|
||||
|
@ -2,7 +2,7 @@
|
||||
=== Date Histogram Aggregation
|
||||
|
||||
A multi-bucket aggregation similar to the <<search-aggregations-bucket-histogram-aggregation,histogram>> except it can
|
||||
only be applied on date values. Since dates are represented in elasticsearch internally as long values, it is possible
|
||||
only be applied on date values. Since dates are represented in Elasticsearch internally as long values, it is possible
|
||||
to use the normal `histogram` on dates as well, though accuracy will be compromised. The reason for this is in the fact
|
||||
that time based intervals are not fixed (think of leap years and on the number of days in a month). For this reason,
|
||||
we need special support for time based data. From a functionality perspective, this histogram supports the same features
|
||||
|
@ -449,7 +449,7 @@ a consolidated review by the reducing node before the final selection. Obviously
|
||||
will cause extra network traffic and RAM usage so this is quality/cost trade off that needs to be balanced. If `shard_size` is set to -1 (the default) then `shard_size` will be automatically estimated based on the number of shards and the `size` parameter.
|
||||
|
||||
|
||||
NOTE: `shard_size` cannot be smaller than `size` (as it doesn't make much sense). When it is, elasticsearch will
|
||||
NOTE: `shard_size` cannot be smaller than `size` (as it doesn't make much sense). When it is, Elasticsearch will
|
||||
override it and reset it to be equal to `size`.
|
||||
|
||||
===== Minimum document count
|
||||
|
@ -99,7 +99,7 @@ Response:
|
||||
--------------------------------------------------
|
||||
// TESTRESPONSE[s/\.\.\.//]
|
||||
<1> an upper bound of the error on the document counts for each term, see <<search-aggregations-bucket-terms-aggregation-approximate-counts,below>>
|
||||
<2> when there are lots of unique terms, elasticsearch only returns the top terms; this number is the sum of the document counts for all buckets that are not part of the response
|
||||
<2> when there are lots of unique terms, Elasticsearch only returns the top terms; this number is the sum of the document counts for all buckets that are not part of the response
|
||||
<3> the list of the top buckets, the meaning of `top` being defined by the <<search-aggregations-bucket-terms-aggregation-order,order>>
|
||||
|
||||
By default, the `terms` aggregation will return the buckets for the top ten terms ordered by the `doc_count`. One can
|
||||
@ -210,7 +210,7 @@ one can increase the accuracy of the returned terms and avoid the overhead of st
|
||||
the client.
|
||||
|
||||
|
||||
NOTE: `shard_size` cannot be smaller than `size` (as it doesn't make much sense). When it is, elasticsearch will
|
||||
NOTE: `shard_size` cannot be smaller than `size` (as it doesn't make much sense). When it is, Elasticsearch will
|
||||
override it and reset it to be equal to `size`.
|
||||
|
||||
|
||||
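
A minimal sketch of setting `shard_size` next to `size` through the Java API, assuming the 5.x/6.x aggregation builders; the names and numbers are illustrative:

[source,java]
--------------------------------------------------
import org.elasticsearch.search.aggregations.AggregationBuilders;
import org.elasticsearch.search.aggregations.bucket.terms.TermsAggregationBuilder;

public class TermsShardSizeSketch {
    static TermsAggregationBuilder topTags() {
        // Ask each shard for more candidate terms than the final size to improve accuracy.
        return AggregationBuilders.terms("tags")
                .field("tag")
                .size(10)
                .shardSize(100);
    }
}
--------------------------------------------------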
@@ -740,7 +740,7 @@ expire then we may be missing accounts of interest and have set our numbers too
* increase the `size` parameter to return more results per partition (could be heavy on memory) or
* increase the `num_partitions` to consider less accounts per request (could increase overall processing time as we need to make more requests)

Ultimately this is a balancing act between managing the elasticsearch resources required to process a single request and the volume
Ultimately this is a balancing act between managing the Elasticsearch resources required to process a single request and the volume
of requests that the client application must issue to complete a task.

==== Multi-field terms aggregation
@@ -161,7 +161,7 @@ counting millions of items.
On string fields that have a high cardinality, it might be faster to store the
hash of your field values in your index and then run the cardinality aggregation
on this field. This can either be done by providing hash values from client-side
or by letting elasticsearch compute hash values for you by using the
or by letting Elasticsearch compute hash values for you by using the
{plugins}/mapper-murmur3.html[`mapper-murmur3`] plugin.

NOTE: Pre-computing hashes is usually only useful on very large and/or
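
A minimal sketch of aggregating on such a hash sub-field, assuming a `mapper-murmur3` sub-field mapped as `user_id.hash` (an illustrative mapping):

[source,java]
--------------------------------------------------
import org.elasticsearch.search.aggregations.AggregationBuilder;
import org.elasticsearch.search.aggregations.AggregationBuilders;

public class CardinalitySketch {
    static AggregationBuilder uniqueUsers() {
        // "user_id.hash" assumes a murmur3 sub-field created by the mapper-murmur3 plugin.
        return AggregationBuilders.cardinality("unique_users").field("user_id.hash");
    }
}
--------------------------------------------------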
@@ -3,7 +3,7 @@

[partintro]
--
The *elasticsearch* REST APIs are exposed using <<modules-http,JSON over HTTP>>.
The *Elasticsearch* REST APIs are exposed using <<modules-http,JSON over HTTP>>.

The conventions listed in this chapter can be applied throughout the REST
API, unless otherwise specified.
@@ -228,7 +228,7 @@ Some examples are:
=== Response Filtering

All REST APIs accept a `filter_path` parameter that can be used to reduce
the response returned by elasticsearch. This parameter takes a comma
the response returned by Elasticsearch. This parameter takes a comma
separated list of filters expressed with the dot notation:

[source,js]
@@ -360,7 +360,7 @@ Responds:
--------------------------------------------------
// TESTRESPONSE

Note that elasticsearch sometimes returns directly the raw value of a field,
Note that Elasticsearch sometimes returns directly the raw value of a field,
like the `_source` field. If you want to filter `_source` fields, you should
consider combining the already existing `_source` parameter (see
<<get-source-filtering,Get API>> for more details) with the `filter_path`
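
A minimal sketch of passing `filter_path` through the low-level REST client; the endpoint and filter are illustrative:

[source,java]
--------------------------------------------------
import java.util.Collections;

import org.apache.http.HttpHost;
import org.apache.http.util.EntityUtils;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;

public class FilterPathSketch {
    public static void main(String[] args) throws Exception {
        RestClient restClient = RestClient.builder(new HttpHost("localhost", 9200, "http")).build();
        // Keep only the hit ids in the response body.
        Response response = restClient.performRequest("GET", "/_search",
                Collections.singletonMap("filter_path", "hits.hits._id"));
        System.out.println(EntityUtils.toString(response.getEntity()));
        restClient.close();
    }
}
--------------------------------------------------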
@@ -32,7 +32,7 @@ All these exposed metrics come directly from Lucene APIs.
1. As the number of documents and deleted documents shown in this are at the lucene level,
it includes all the hidden documents (e.g. from nested documents) as well.

2. To get actual count of documents at the elasticsearch level, the recommended way
2. To get actual count of documents at the Elasticsearch level, the recommended way
is to use either the <<cat-count>> or the <<search-count>>

[float]

@@ -133,7 +133,7 @@ version number is equal to zero.
A nice side effect is that there is no need to maintain strict ordering
of async indexing operations executed as a result of changes to a source
database, as long as version numbers from the source database are used.
Even the simple case of updating the elasticsearch index using data from
Even the simple case of updating the Elasticsearch index using data from
a database is simplified if external versioning is used, as only the
latest version will be used if the index operations are out of order for
whatever reason.
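
A minimal sketch of indexing with an external version through the Java API, assuming the 5.x/6.x client; index, type, id and source are illustrative:

[source,java]
--------------------------------------------------
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.xcontent.XContentType;
import org.elasticsearch.index.VersionType;

public class ExternalVersionSketch {
    static IndexResponse indexFromDatabase(Client client, long dbVersion) {
        // The database's own version number drives conflict resolution in Elasticsearch.
        return client.prepareIndex("products", "doc", "1")
                .setSource("{\"name\":\"widget\"}", XContentType.JSON)
                .setVersion(dbVersion)
                .setVersionType(VersionType.EXTERNAL)
                .get();
    }
}
--------------------------------------------------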
@@ -355,7 +355,7 @@ and isn't able to compare it against the new source.
There isn't a hard and fast rule about when noop updates aren't acceptable.
It's a combination of lots of factors like how frequently your data source
sends updates that are actually noops and how many queries per second
elasticsearch runs on the shard with receiving the updates.
Elasticsearch runs on the shard with receiving the updates.

[float]
[[timeout]]

@@ -1030,7 +1030,7 @@ PUT metricbeat-2016.05.31/beat/1?refresh
----------------------------------------------------------------
// CONSOLE

The new template for the `metricbeat-*` indices is already loaded into elasticsearch
The new template for the `metricbeat-*` indices is already loaded into Elasticsearch
but it applies only to the newly created indices. Painless can be used to reindex
the existing documents and apply the new template.

@@ -16,7 +16,7 @@
terms stored in the index.
+
It is this process of analysis (both at index time and at search time)
that allows elasticsearch to perform full text queries.
that allows Elasticsearch to perform full text queries.
+
Also see <<glossary-text,text>> and <<glossary-term,term>>.

@@ -29,7 +29,7 @@

[[glossary-document]] document ::

A document is a JSON document which is stored in elasticsearch. It is
A document is a JSON document which is stored in Elasticsearch. It is
like a row in a table in a relational database. Each document is
stored in an <<glossary-index,index>> and has a <<glossary-type,type>> and an
<<glossary-id,id>>.
@@ -82,7 +82,7 @@

[[glossary-node]] node ::

A node is a running instance of elasticsearch which belongs to a
A node is a running instance of Elasticsearch which belongs to a
<<glossary-cluster,cluster>>. Multiple nodes can be started on a single
server for testing purposes, but usually you should have one node per
server.
@@ -136,7 +136,7 @@
[[glossary-shard]] shard ::

A shard is a single Lucene instance. It is a low-level “worker” unit
which is managed automatically by elasticsearch. An index is a logical
which is managed automatically by Elasticsearch. An index is a logical
namespace which points to <<glossary-primary-shard,primary>> and
<<glossary-replica-shard,replica>> shards.
+
@@ -159,7 +159,7 @@

[[glossary-term]] term ::

A term is an exact value that is indexed in elasticsearch. The terms
A term is an exact value that is indexed in Elasticsearch. The terms
`foo`, `Foo`, `FOO` are NOT equivalent. Terms (i.e. exact values) can
be searched for using _term_ queries. +
See also <<glossary-text,text>> and <<glossary-analysis,analysis>>.

@@ -4,7 +4,7 @@
[float]
=== Disable the features you do not need

By default elasticsearch indexes and adds doc values to most fields so that they
By default Elasticsearch indexes and adds doc values to most fields so that they
can be searched and aggregated out of the box. For instance if you have a numeric
field called `foo` that you need to run histograms on but that you never need to
filter on, you can safely disable indexing on this field in your
@@ -30,7 +30,7 @@ PUT index

<<text,`text`>> fields store normalization factors in the index in order to be
able to score documents. If you only need matching capabilities on a `text`
field but do not care about the produced scores, you can configure elasticsearch
field but do not care about the produced scores, you can configure Elasticsearch
to not write norms to the index:

[source,js]
@@ -54,7 +54,7 @@ PUT index
<<text,`text`>> fields also store frequencies and positions in the index by
default. Frequencies are used to compute scores and positions are used to run
phrase queries. If you do not need to run phrase queries, you can tell
elasticsearch to not index positions:
Elasticsearch to not index positions:

[source,js]
--------------------------------------------------
@@ -75,7 +75,7 @@ PUT index
// CONSOLE

Furthermore if you do not care about scoring either, you can configure
elasticsearch to just index matching documents for every term. You will
Elasticsearch to just index matching documents for every term. You will
still be able to search on this field, but phrase queries will raise errors
and scoring will assume that terms appear only once in every document.


@@ -64,13 +64,13 @@ process by <<setup-configuration-memory,disabling swapping>>.
=== Give memory to the filesystem cache

The filesystem cache will be used in order to buffer I/O operations. You should
make sure to give at least half the memory of the machine running elasticsearch
make sure to give at least half the memory of the machine running Elasticsearch
to the filesystem cache.

[float]
=== Use auto-generated ids

When indexing a document that has an explicit id, elasticsearch needs to check
When indexing a document that has an explicit id, Elasticsearch needs to check
whether a document with the same id already exists within the same shard, which
is a costly operation and gets even more costly as the index grows. By using
auto-generated ids, Elasticsearch can skip this check, which makes indexing

@@ -6,7 +6,7 @@

Elasticsearch heavily relies on the filesystem cache in order to make search
fast. In general, you should make sure that at least half the available memory
goes to the filesystem cache so that elasticsearch can keep hot regions of the
goes to the filesystem cache so that Elasticsearch can keep hot regions of the
index in physical memory.

[float]
@@ -275,8 +275,8 @@ merging to the background merge process.
Global ordinals are a data-structure that is used in order to run
<<search-aggregations-bucket-terms-aggregation,`terms`>> aggregations on
<<keyword,`keyword`>> fields. They are loaded lazily in memory because
elasticsearch does not know which fields will be used in `terms` aggregations
and which fields won't. You can tell elasticsearch to load global ordinals
Elasticsearch does not know which fields will be used in `terms` aggregations
and which fields won't. You can tell Elasticsearch to load global ordinals
eagerly at refresh-time by configuring mappings as described below:

[source,js]
@@ -300,7 +300,7 @@ PUT index
[float]
=== Warm up the filesystem cache

If the machine running elasticsearch is restarted, the filesystem cache will be
If the machine running Elasticsearch is restarted, the filesystem cache will be
empty, so it will take some time before the operating system loads hot regions
of the index into memory so that search operations are fast. You can explicitly
tell the operating system which files should be loaded into memory eagerly

@@ -3,7 +3,7 @@

beta[]

When creating a new index in elasticsearch it is possible to configure how the Segments
When creating a new index in Elasticsearch it is possible to configure how the Segments
inside each Shard will be sorted. By default Lucene does not apply any sort.
The `index.sort.*` settings define which fields should be used to sort the documents inside each Segment.

@@ -112,7 +112,7 @@ before activating this feature.
[[early-terminate]]
=== Early termination of search request

By default in elasticsearch a search request must visit every document that match a query to
By default in Elasticsearch a search request must visit every document that match a query to
retrieve the top documents sorted by a specified sort.
Though when the index sort and the search sort are the same it is possible to limit
the number of documents that should be visited per segment to retrieve the N top ranked documents globally.

@@ -1,7 +1,7 @@
[[index-modules-merge]]
== Merge

A shard in elasticsearch is a Lucene index, and a Lucene index is broken down
A shard in Elasticsearch is a Lucene index, and a Lucene index is broken down
into segments. Segments are internal storage elements in the index where the
index data is stored, and are immutable. Smaller segments are periodically
merged into larger segments to keep the index size at bay and to expunge

@@ -8,7 +8,7 @@ The store module allows you to control how index data is stored and accessed on
=== File system storage types

There are different file system implementations or _storage types_. By default,
elasticsearch will pick the best implementation based on the operating
Elasticsearch will pick the best implementation based on the operating
environment.

This can be overridden for all indices by adding this to the
@@ -76,7 +76,7 @@ compatibility.

NOTE: This is an expert setting, the details of which may change in the future.

By default, elasticsearch completely relies on the operating system file system
By default, Elasticsearch completely relies on the operating system file system
cache for caching I/O operations. It is possible to set `index.store.preload`
in order to tell the operating system to load the content of hot index
files into memory upon opening. This setting accept a comma-separated list of
@@ -115,7 +115,7 @@ The default value is the empty array, which means that nothing will be loaded
into the file-system cache eagerly. For indices that are actively searched,
you might want to set it to `["nvd", "dvd"]`, which will cause norms and doc
values to be loaded eagerly into physical memory. These are the two first
extensions to look at since elasticsearch performs random access on them.
extensions to look at since Elasticsearch performs random access on them.

A wildcard can be used in order to indicate that all files should be preloaded:
`index.store.preload: ["*"]`. Note however that it is generally not useful to

@@ -257,9 +257,9 @@ Here are some examples of potentially useful dynamic templates:

===== Structured search

By default elasticsearch will map string fields as a `text` field with a sub
By default Elasticsearch will map string fields as a `text` field with a sub
`keyword` field. However if you are only indexing structured content and not
interested in full text search, you can make elasticsearch map your fields
interested in full text search, you can make Elasticsearch map your fields
only as `keyword`s. Note that this means that in order to search those fields,
you will have to search on the exact same value that was indexed.

@@ -290,7 +290,7 @@ PUT my_index
On the contrary to the previous example, if the only thing that you care about
on your string fields is full-text search, and if you don't plan on running
aggregations, sorting or exact search on your string fields, you could tell
elasticsearch to map it only as a text field (which was the default behaviour
Elasticsearch to map it only as a text field (which was the default behaviour
before 5.0):

[source,js]
@@ -357,7 +357,7 @@ remove it as described in the previous section.

===== Time-series

When doing time series analysis with elasticsearch, it is common to have many
When doing time series analysis with Elasticsearch, it is common to have many
numeric fields that you will often aggregate on but never filter on. In such a
case, you could disable indexing on those fields to save disk space and also
maybe gain some indexing speed:

@@ -1,7 +1,7 @@
[[modules-discovery-zen]]
=== Zen Discovery

The zen discovery is the built in discovery module for elasticsearch and
The zen discovery is the built in discovery module for Elasticsearch and
the default. It provides unicast discovery, but can be extended to
support cloud environments and other forms of discovery.

@@ -1,7 +1,7 @@
[[modules-http]]
== HTTP

The http module allows to expose *elasticsearch* APIs
The http module allows to expose *Elasticsearch* APIs
over HTTP.

The http mechanism is completely asynchronous in nature, meaning that
@@ -74,7 +74,7 @@ allowed. If you prepend and append a `/` to the value, this will
be treated as a regular expression, allowing you to support HTTP and HTTPs.
for example using `/https?:\/\/localhost(:[0-9]+)?/` would return the
request header appropriately in both cases. `*` is a valid value but is
considered a *security risk* as your elasticsearch instance is open to cross origin
considered a *security risk* as your Elasticsearch instance is open to cross origin
requests from *anywhere*.

|`http.cors.max-age` |Browsers send a "preflight" OPTIONS-request to

@@ -1,7 +1,7 @@
[[modules-memcached]]
== memcached

The memcached module allows to expose *elasticsearch*
The memcached module allows to expose *Elasticsearch*
APIs over the memcached protocol (as closely
as possible).

@@ -18,7 +18,7 @@ automatically detecting the correct one to use.
=== Mapping REST to Memcached Protocol

Memcached commands are mapped to REST and handled by the same generic
REST layer in elasticsearch. Here is a list of the memcached commands
REST layer in Elasticsearch. Here is a list of the memcached commands
supported:

[float]

@@ -4,7 +4,7 @@
[float]
=== Plugins

Plugins are a way to enhance the basic elasticsearch functionality in a
Plugins are a way to enhance the basic Elasticsearch functionality in a
custom manner. They range from adding custom mapping types, custom
analyzers (in a more built in fashion), custom script engines, custom discovery
and more.

@@ -319,7 +319,7 @@ GET /_snapshot/my_backup/snapshot_1
// TEST[continued]

This command returns basic information about the snapshot including start and end time, version of
elasticsearch that created the snapshot, the list of included indices, the current state of the
Elasticsearch that created the snapshot, the list of included indices, the current state of the
snapshot and the list of failures that occurred during the snapshot. The snapshot `state` can be

[horizontal]
@@ -343,7 +343,7 @@ snapshot and the list of failures that occurred during the snapsho

`INCOMPATIBLE`::

The snapshot was created with an old version of elasticsearch and therefore is incompatible with
The snapshot was created with an old version of Elasticsearch and therefore is incompatible with
the current version of the cluster.


@@ -2,7 +2,7 @@
== Thrift

The https://thrift.apache.org/[thrift] transport module allows to expose the REST interface of
elasticsearch using thrift. Thrift should provide better performance
Elasticsearch using thrift. Thrift should provide better performance
over http. Since thrift provides both the wire protocol and the
transport, it should make using Elasticsearch more efficient (though it has limited
documentation).

@@ -12,7 +12,7 @@ that there is no blocking thread waiting for a response. The benefit of
using asynchronous communication is first solving the
http://en.wikipedia.org/wiki/C10k_problem[C10k problem], as well as
being the ideal solution for scatter (broadcast) / gather operations such
as search in ElasticSearch.
as search in Elasticsearch.

[float]
=== TCP Transport

@@ -401,7 +401,7 @@ Looking at the previous example:
We see a single collector named `SimpleTopScoreDocCollector` wrapped into `CancellableCollector`. `SimpleTopScoreDocCollector` is the default "scoring and sorting"
`Collector` used by Elasticsearch. The `reason` field attempts to give a plain english description of the class name. The
`time_in_nanos` is similar to the time in the Query tree: a wall-clock time inclusive of all children. Similarly, `children` lists
all sub-collectors. The `CancellableCollector` that wraps `SimpleTopScoreDocCollector` is used by elasticsearch to detect if the current
all sub-collectors. The `CancellableCollector` that wraps `SimpleTopScoreDocCollector` is used by Elasticsearch to detect if the current
search was cancelled and stop collecting documents as soon as it occurs.

It should be noted that Collector times are **independent** from the Query times. They are calculated, combined

@@ -143,7 +143,7 @@ favor of the options documented above.
===== Nested sorting examples

In the below example `offer` is a field of type `nested`.
The nested `path` needs to be specified; otherwise, elasticsearch doesn't know on what nested level sort values need to be captured.
The nested `path` needs to be specified; otherwise, Elasticsearch doesn't know on what nested level sort values need to be captured.

[source,js]
--------------------------------------------------
@@ -171,7 +171,7 @@ POST /_search
// CONSOLE

In the below example `parent` and `child` fields are of type `nested`.
The `nested_path` needs to be specified at each level; otherwise, elasticsearch doesn't know on what nested level sort values need to be captured.
The `nested_path` needs to be specified at each level; otherwise, Elasticsearch doesn't know on what nested level sort values need to be captured.

[source,js]
--------------------------------------------------
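
A minimal sketch of the equivalent call through the Java API, assuming the 5.x/6.x sort builders; the `offer` object and `offer.price` field are illustrative:

[source,java]
--------------------------------------------------
import org.elasticsearch.search.builder.SearchSourceBuilder;
import org.elasticsearch.search.sort.SortBuilders;
import org.elasticsearch.search.sort.SortOrder;

public class NestedSortSketch {
    static SearchSourceBuilder sortByOfferPrice() {
        // Sort by a field inside the nested "offer" object; the nested path must be set.
        return new SearchSourceBuilder()
                .sort(SortBuilders.fieldSort("offer.price")
                        .order(SortOrder.ASC)
                        .setNestedPath("offer"));
    }
}
--------------------------------------------------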
@@ -26,7 +26,7 @@ For more information on how Mustache templating and what kind of templating you
can do with it check out the http://mustache.github.io/mustache.5.html[online
documentation of the mustache project].

NOTE: The mustache language is implemented in elasticsearch as a sandboxed
NOTE: The mustache language is implemented in Elasticsearch as a sandboxed
scripting language, hence it obeys settings that may be used to enable or
disable scripts per type and context as described in the
<<allowed-script-types-setting, scripting docs>>

@@ -60,8 +60,8 @@ GET /_search?q=tag:wow
// CONSOLE
// TEST[setup:twitter]

By default elasticsearch doesn't reject any search requests based on the number
of shards the request hits. While elasticsearch will optimize the search execution
By default Elasticsearch doesn't reject any search requests based on the number
of shards the request hits. While Elasticsearch will optimize the search execution
on the coordinating node a large number of shards can have a significant impact
CPU and memory wise. It is usually a better idea to organize data in such a way
that there are fewer larger shards. In case you would like to configure a soft
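
A minimal sketch of configuring such a soft limit through the Java API; it assumes the `action.search.shard_count.limit` cluster setting is the one being referred to, and the threshold is illustrative:

[source,java]
--------------------------------------------------
import org.elasticsearch.action.admin.cluster.settings.ClusterUpdateSettingsResponse;
import org.elasticsearch.client.Client;
import org.elasticsearch.common.settings.Settings;

public class ShardCountLimitSketch {
    static ClusterUpdateSettingsResponse limitShardsPerSearch(Client client) {
        // Reject search requests that would hit more than 100 shards.
        return client.admin().cluster().prepareUpdateSettings()
                .setTransientSettings(Settings.builder()
                        .put("action.search.shard_count.limit", 100))
                .get();
    }
}
--------------------------------------------------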
@@ -19,7 +19,7 @@
information, check the
https://github.com/torvalds/linux/blob/master/Documentation/sysctl/vm.txt[linux kernel documentation]
about `max_map_count`. This is set via `sysctl` before starting
elasticsearch. Defaults to `262144`.
Elasticsearch. Defaults to `262144`.

`ES_PATH_CONF`::

@@ -34,7 +34,7 @@
`RESTART_ON_UPGRADE`::

Configure restart on package upgrade, defaults to `false`. This means you
will have to restart your elasticsearch instance after installing a
will have to restart your Elasticsearch instance after installing a
package manually. The reason for this is to ensure, that upgrades in a
cluster do not result in a continuous shard reallocation resulting in high
network traffic and reducing the response times of your cluster.

@@ -2,11 +2,11 @@
=== Secure Settings

Some settings are sensitive, and relying on filesystem permissions to protect
their values is not sufficient. For this use case, elasticsearch provides a
their values is not sufficient. For this use case, Elasticsearch provides a
keystore, which may be password protected, and the `elasticsearch-keystore`
tool to manage the settings in the keystore.

NOTE: All commands here should be run as the user which will run elasticsearch.
NOTE: All commands here should be run as the user which will run Elasticsearch.

NOTE: Only some settings are designed to be read from the keystore. See
documentation for each setting to see if it is supported as part of the keystore.

@@ -3,7 +3,7 @@

[partintro]
--
This section is about utilizing elasticsearch as part of your testing infrastructure.
This section is about utilizing Elasticsearch as part of your testing infrastructure.

[float]
[[testing-header]]

@@ -3,7 +3,7 @@

[[testing-intro]]

Testing is a crucial part of your application, and as information retrieval itself is already a complex topic, there should not be any additional complexity in setting up a testing infrastructure, which uses elasticsearch. This is the main reason why we decided to release an additional file to the release, which allows you to use the same testing infrastructure we do in the elasticsearch core. The testing framework allows you to setup clusters with multiple nodes in order to check if your code covers everything needed to run in a cluster. The framework prevents you from writing complex code yourself to start, stop or manage several test nodes in a cluster. In addition there is another very important feature called randomized testing, which you are getting for free as it is part of the elasticsearch infrastructure.
Testing is a crucial part of your application, and as information retrieval itself is already a complex topic, there should not be any additional complexity in setting up a testing infrastructure, which uses Elasticsearch. This is the main reason why we decided to release an additional file to the release, which allows you to use the same testing infrastructure we do in the Elasticsearch core. The testing framework allows you to setup clusters with multiple nodes in order to check if your code covers everything needed to run in a cluster. The framework prevents you from writing complex code yourself to start, stop or manage several test nodes in a cluster. In addition there is another very important feature called randomized testing, which you are getting for free as it is part of the Elasticsearch infrastructure.


@@ -16,9 +16,9 @@ All of the tests are run using a custom junit runner, the `RandomizedRunner` pro


[[using-elasticsearch-test-classes]]
=== Using the elasticsearch test classes
=== Using the Elasticsearch test classes

First, you need to include the testing dependency in your project, along with the elasticsearch dependency you have already added. If you use maven and its `pom.xml` file, it looks like this
First, you need to include the testing dependency in your project, along with the Elasticsearch dependency you have already added. If you use maven and its `pom.xml` file, it looks like this

[source,xml]
--------------------------------------------------
@@ -50,7 +50,7 @@ We provide a few classes that you can inherit from in your own test classes whic
[[unit-tests]]
=== unit tests

If your test is a well isolated unit test which doesn't need a running elasticsearch cluster, you can use the `ESTestCase`. If you are testing lucene features, use `ESTestCase` and if you are testing concrete token streams, use the `ESTokenStreamTestCase` class. Those specific classes execute additional checks which ensure that no resources leaks are happening, after the test has run.
If your test is a well isolated unit test which doesn't need a running Elasticsearch cluster, you can use the `ESTestCase`. If you are testing lucene features, use `ESTestCase` and if you are testing concrete token streams, use the `ESTokenStreamTestCase` class. Those specific classes execute additional checks which ensure that no resources leaks are happening, after the test has run.


[[integration-tests]]
@@ -58,7 +58,7 @@ If your test is a well isolated unit test which doesn't need a running elasticse

These kind of tests require firing up a whole cluster of nodes, before the tests can actually be run. Compared to unit tests they are obviously way more time consuming, but the test infrastructure tries to minimize the time cost by only restarting the whole cluster, if this is configured explicitly.

The class your tests have to inherit from is `ESIntegTestCase`. By inheriting from this class, you will no longer need to start elasticsearch nodes manually in your test, although you might need to ensure that at least a certain number of nodes are up. The integration test behaviour can be configured heavily by specifying different system properties on test runs. See the `TESTING.asciidoc` documentation in the https://github.com/elastic/elasticsearch/blob/master/TESTING.asciidoc[source repository] for more information.
The class your tests have to inherit from is `ESIntegTestCase`. By inheriting from this class, you will no longer need to start Elasticsearch nodes manually in your test, although you might need to ensure that at least a certain number of nodes are up. The integration test behaviour can be configured heavily by specifying different system properties on test runs. See the `TESTING.asciidoc` documentation in the https://github.com/elastic/elasticsearch/blob/master/TESTING.asciidoc[source repository] for more information.


[[number-of-shards]]
@@ -100,8 +100,8 @@ The `InternalTestCluster` class is the heart of the cluster functionality in a r
`stopRandomNode()`:: Stop a random node in your cluster to mimic an outage
`stopCurrentMasterNode()`:: Stop the current master node to force a new election
`stopRandomNonMaster()`:: Stop a random non master node to mimic an outage
`buildNode()`:: Create a new elasticsearch node
`startNode(settings)`:: Create and start a new elasticsearch node
`buildNode()`:: Create a new Elasticsearch node
`startNode(settings)`:: Create and start a new Elasticsearch node


[[changing-node-settings]]
@@ -165,7 +165,7 @@ data nodes will be allowed to become masters as well.
[[changing-node-configuration]]
==== Changing plugins via configuration

As elasticsearch is using JUnit 4, using the `@Before` and `@After` annotations is not a problem. However you should keep in mind, that this does not have any effect in your cluster setup, as the cluster is already up and running when those methods are run. So in case you want to configure settings - like loading a plugin on node startup - before the node is actually running, you should overwrite the `nodePlugins()` method from the `ESIntegTestCase` class and return the plugin classes each node should load.
As Elasticsearch is using JUnit 4, using the `@Before` and `@After` annotations is not a problem. However you should keep in mind, that this does not have any effect in your cluster setup, as the cluster is already up and running when those methods are run. So in case you want to configure settings - like loading a plugin on node startup - before the node is actually running, you should overwrite the `nodePlugins()` method from the `ESIntegTestCase` class and return the plugin classes each node should load.

[source,java]
-----------------------------------------
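
A minimal sketch of that override, assuming the `ESIntegTestCase` base class; `MyPlugin` is a placeholder for whatever plugin the nodes should load:

[source,java]
--------------------------------------------------
import java.util.Collection;
import java.util.Collections;

import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.test.ESIntegTestCase;

public class MyPluginIT extends ESIntegTestCase {

    // Placeholder plugin class; replace with the real plugin under test.
    public static class MyPlugin extends Plugin {
    }

    @Override
    protected Collection<Class<? extends Plugin>> nodePlugins() {
        // Loaded by every node of the internal test cluster at startup.
        return Collections.<Class<? extends Plugin>>singletonList(MyPlugin.class);
    }

    public void testClusterComesUp() {
        ensureGreen();
    }
}
--------------------------------------------------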
@@ -191,7 +191,7 @@ The next step is to convert your test using static test data into a test using r
* Changing your response sizes/configurable limits with each run
* Changing the number of shards/replicas when creating an index

So, how can you create random data. The most important thing to know is, that you never should instantiate your own `Random` instance, but use the one provided in the `RandomizedTest`, from which all elasticsearch dependent test classes inherit from.
So, how can you create random data. The most important thing to know is, that you never should instantiate your own `Random` instance, but use the one provided in the `RandomizedTest`, from which all Elasticsearch dependent test classes inherit from.

[horizontal]
`getRandom()`:: Returns the random instance, which can recreated when calling the test with specific parameters
@@ -221,7 +221,7 @@ If you want to debug a specific problem with a specific random seed, you can use
[[assertions]]
=== Assertions

As many elasticsearch tests are checking for a similar output, like the amount of hits or the first hit or special highlighting, a couple of predefined assertions have been created. Those have been put into the `ElasticsearchAssertions` class. There is also a specific geo assertions in `ElasticsearchGeoAssertions`.
As many Elasticsearch tests are checking for a similar output, like the amount of hits or the first hit or special highlighting, a couple of predefined assertions have been created. Those have been put into the `ElasticsearchAssertions` class. There is also a specific geo assertions in `ElasticsearchGeoAssertions`.

[horizontal]
`assertHitCount()`:: Checks hit count of a search or count request