528 lines
20 KiB
Plaintext
528 lines
20 KiB
Plaintext
[glossary]
|
||
[[glossary]]
|
||
= Glossary of terms
|
||
|
||
[glossary]
|
||
[[glossary-analysis]] analysis ::
|
||
+
|
||
--
|
||
// tag::analysis-def[]
|
||
Analysis is the process of converting <<glossary-text,full text>> to
|
||
<<glossary-term,terms>>. Depending on which analyzer is used, these phrases:
|
||
`FOO BAR`, `Foo-Bar`, `foo,bar` will probably all result in the
|
||
terms `foo` and `bar`. These terms are what is actually stored in
|
||
the index.
|
||
|
||
A full text query (not a <<glossary-term,term>> query) for `FoO:bAR` will
|
||
also be analyzed to the terms `foo`,`bar` and will thus match the
|
||
terms stored in the index.
|
||
|
||
It is this process of analysis (both at index time and at search time)
|
||
that allows Elasticsearch to perform full text queries.
|
||
|
||
Also see <<glossary-text,text>> and <<glossary-term,term>>.
|
||
// end::analysis-def[]
|
||
--
|
||
|
||
[[glossary-api-key]] API key ::
|
||
// tag::api-key-def[]
|
||
A unique identifier that you can use for authentication when submitting {es} requests.
|
||
When TLS is enabled, all requests must be authenticated using either basic authentication
|
||
(user name and password) or an API key.
|
||
// end::api-key-def[]
|
||
|
||
|
||
[[glossary-auto-follow-pattern]] auto-follow pattern ::
|
||
// tag::auto-follow-pattern-def[]
|
||
An <<glossary-index-pattern,index pattern>> that automatically configures new indices as
|
||
<<glossary-follower-index,follower indices>> for <<glossary-ccr,{ccr}>>.
|
||
For more information, see {ref}/ccr-auto-follow.html[Managing auto follow patterns].
|
||
// end::auto-follow-pattern-def[]
|
||
|
||
[[glossary-cluster]] cluster ::
|
||
// tag::cluster-def[]
|
||
One or more <<glossary-node,nodes>> that share the
|
||
same cluster name. Each cluster has a single master node, which is
|
||
chosen automatically by the cluster and can be replaced if it fails.
|
||
// end::cluster-def[]
|
||
|
||
[[glossary-cold-phase]] cold phase ::
|
||
// tag::cold-phase-def[]
|
||
The third possible phase in the <<glossary-index-lifecycle,index lifecycle>>.
|
||
In the cold phase, an index is no longer updated and seldom queried.
|
||
The information still needs to be searchable, but it’s okay if those queries are slower.
|
||
// end::cold-phase-def[]
|
||
|
||
[[glossary-component-template]] component template ::
|
||
// tag::component-template-def[]
|
||
A building block for constructing <<indices-templates,index templates>> that specifies index
|
||
<<mapping,mappings>>, <<index-modules-settings,settings>>, and <<indices-aliases,aliases>>.
|
||
// end::component-template-def[]
|
||
|
||
[[glossary-ccr]] {ccr} (CCR)::
|
||
// tag::ccr-def[]
|
||
A feature that enables you to replicate indices in remote clusters to your
|
||
local cluster. For more information, see
|
||
{ref}/xpack-ccr.html[{ccr-cap}].
|
||
// end::ccr-def[]
|
||
|
||
[[glossary-ccs]] {ccs} (CCS)::
|
||
|
||
The {ccs} feature enables any node to act as a federated client across
|
||
multiple clusters. See <<modules-cross-cluster-search>>.
|
||
|
||
[[glossary-data-stream]] data stream ::
|
||
+
|
||
--
|
||
// tag::data-stream-def[]
|
||
A named resource used to ingest, search, and manage time series data in {es}. A
|
||
data stream's data is stored across multiple hidden, auto-generated
|
||
<<glossary-index,indices>>. You can automate management of these indices to more
|
||
efficiently store large data volumes.
|
||
|
||
See {ref}/data-streams.html[Data streams].
|
||
// end::data-stream-def[]
|
||
--
|
||
|
||
[[glossary-delete-phase]] delete phase ::
|
||
// tag::delete-phase-def[]
|
||
The last possible phase in the <<glossary-index-lifecycle,index lifecycle>>.
|
||
In the delete phase, an index is no longer needed and can safely be deleted.
|
||
// end::delete-phase-def[]
|
||
|
||
[[glossary-document]] document ::
|
||
|
||
A document is a JSON document which is stored in Elasticsearch. It is
|
||
like a row in a table in a relational database. Each document is
|
||
stored in an <<glossary-index,index>> and has a <<glossary-type,type>> and an
|
||
<<glossary-id,id>>.
|
||
+
|
||
A document is a JSON object (also known in other languages as a hash /
|
||
hashmap / associative array) which contains zero or more
|
||
<<glossary-field,fields>>, or key-value pairs.
|
||
+
|
||
The original JSON document that is indexed will be stored in the
|
||
<<glossary-source_field,`_source` field>>, which is returned by default when
|
||
getting or searching for a document.
|
||
|
||
[[glossary-field]] field ::
|
||
|
||
A <<glossary-document,document>> contains a list of fields, or key-value
|
||
pairs. The value can be a simple (scalar) value (eg a string, integer,
|
||
date), or a nested structure like an array or an object. A field is
|
||
similar to a column in a table in a relational database.
|
||
+
|
||
The <<glossary-mapping,mapping>> for each field has a field _type_ (not to
|
||
be confused with document <<glossary-type,type>>) which indicates the type
|
||
of data that can be stored in that field, eg `integer`, `string`,
|
||
`object`. The mapping also allows you to define (amongst other things)
|
||
how the value for a field should be analyzed.
|
||
|
||
[[glossary-filter]] filter ::
|
||
// tag::filter-def[]
|
||
A filter is a non-scoring <<glossary-query,query>>,
|
||
meaning that it does not score documents.
|
||
It is only concerned about answering the question - "Does this document match?".
|
||
The answer is always a simple, binary yes or no. This kind of query is said to be made
|
||
in a {ref}/query-filter-context.html[filter context],
|
||
hence it is called a filter. Filters are simple checks for set inclusion or exclusion.
|
||
In most cases, the goal of filtering is to reduce the number of documents that have to be examined.
|
||
// end::filter-def[]
|
||
|
||
[[glossary-flush]] flush ::
|
||
// tag::flush-def[]
|
||
Peform a Lucene commit to write index updates in the transaction log (translog) to disk.
|
||
Because a Lucene commit is a relatively expensive operation,
|
||
{es} records index and delete operations in the translog and
|
||
automatically flushes changes to disk in batches.
|
||
To recover from a crash, operations that have been acknowledged but not yet committed
|
||
can be replayed from the translog.
|
||
Before upgrading, you can explicitly call the {ref}/indices-flush.html[Flush] API
|
||
to ensure that all changes are committed to disk.
|
||
// end::flush-def[]
|
||
|
||
[[glossary-follower-index]] follower index ::
|
||
// tag::follower-index-def[]
|
||
The target index for <<glossary-ccr,{ccr}>>. A follower index exists
|
||
in a local cluster and replicates a <<glossary-leader-index,leader index>>.
|
||
// end::follower-index-def[]
|
||
|
||
[[glossary-force-merge]] force merge ::
|
||
// tag::force-merge-def[]
|
||
// tag::force-merge-def-short[]
|
||
Manually trigger a merge to reduce the number of segments in each shard of an index
|
||
and free up the space used by deleted documents.
|
||
// end::force-merge-def-short[]
|
||
You should not force merge indices that are actively being written to.
|
||
Merging is normally performed automatically, but you can use force merge after
|
||
<<glossary-rollover,rollover>> to reduce the shards in the old index to a single segment.
|
||
See the {ref}/indices-forcemerge.html[force merge API].
|
||
// end::force-merge-def[]
|
||
|
||
[[glossary-freeze]] freeze ::
|
||
// tag::freeze-def[]
|
||
// tag::freeze-def-short[]
|
||
Make an index read-only and minimize its memory footprint.
|
||
// end::freeze-def-short[]
|
||
Frozen indices can be searched without incurring the overhead of of re-opening a closed index,
|
||
but searches are throttled and might be slower.
|
||
You can freeze indices to reduce the overhead of keeping older indices searchable
|
||
before you are ready to archive or delete them.
|
||
See the {ref}/freeze-index-api.html[freeze API].
|
||
// end::freeze-def[]
|
||
|
||
[[glossary-frozen-index]] frozen index ::
|
||
// tag::frozen-index-def[]
|
||
An index reduced to a low overhead state that still enables occasional searches.
|
||
Frozen indices use a memory-efficient shard implementation and throttle searches to conserve resources.
|
||
Searching a frozen index is lower overhead than re-opening a closed index to enable searching.
|
||
// end::frozen-index-def[]
|
||
|
||
[[glossary-hot-phase]] hot phase ::
|
||
// tag::hot-phase-def[]
|
||
The first possible phase in the <<glossary-index-lifecycle,index lifecycle>>.
|
||
In the hot phase, an index is actively updated and queried.
|
||
// end::hot-phase-def[]
|
||
|
||
[[glossary-id]] id ::
|
||
|
||
The ID of a <<glossary-document,document>> identifies a document. The
|
||
`index/id` of a document must be unique. If no ID is provided,
|
||
then it will be auto-generated. (also see <<glossary-routing,routing>>)
|
||
|
||
[[glossary-index]] index ::
|
||
+
|
||
--
|
||
// tag::index-def[]
|
||
// tag::index-def-short[]
|
||
An optimized collection of JSON documents. Each document is a collection of fields,
|
||
the key-value pairs that contain your data.
|
||
// end::index-def-short[]
|
||
|
||
An index is a logical namespace which maps to one or more
|
||
<<glossary-primary-shard,primary shards>> and can have zero or more
|
||
<<glossary-replica-shard,replica shards>>.
|
||
// end::index-def[]
|
||
--
|
||
|
||
[[glossary-index-alias]] index alias ::
|
||
+
|
||
--
|
||
// tag::index-alias-def[]
|
||
// tag::index-alias-desc[]
|
||
An index alias is a secondary name
|
||
used to refer to one or more existing indices.
|
||
|
||
Most {es} APIs accept an index alias
|
||
in place of an index name.
|
||
// end::index-alias-desc[]
|
||
|
||
See {ref}/indices-add-alias.html[Add index alias].
|
||
// end::index-alias-def[]
|
||
--
|
||
|
||
[[glossary-index-lifecycle]] index lifecycle ::
|
||
// tag::index-lifecycle-def[]
|
||
The four phases an index can transition through:
|
||
<<glossary-hot-phase,hot>>, <<glossary-warm-phase,warm>>,
|
||
<<glossary-cold-phase,cold>>, and <<glossary-delete-phase,delete>>.
|
||
For more information, see {ref}/ilm-policy-definition.html[Index lifecycle].
|
||
// end::index-lifecycle-def[]
|
||
|
||
[[glossary-index-lifecycle-policy]] index lifecycle policy ::
|
||
// tag::index-lifecycle-policy-def[]
|
||
Specifies how an index moves between phases in the index lifecycle and
|
||
what actions to perform during each phase.
|
||
// end::index-lifecycle-policy-def[]
|
||
|
||
[[glossary-index-pattern]] index pattern ::
|
||
// tag::index-pattern-def[]
|
||
A string that can contain the `*` wildcard to match multiple index names.
|
||
In most cases, the index parameter in an {es} request can be the name of a specific index,
|
||
a list of index names, or an index pattern.
|
||
For example, if you have the indices `datastream-000001`, `datastream-000002`, and `datastream-000003`,
|
||
to search across all three you could use the `datastream-*` index pattern.
|
||
// end::index-pattern-def[]
|
||
|
||
[[glossary-index-template]] index template ::
|
||
+
|
||
--
|
||
// tag::index-template-def[]
|
||
// tag::index-template-def-short[]
|
||
Defines settings and mappings to apply to new indexes that match a simple naming pattern, such as _logs-*_.
|
||
// end::index-template-def-short[]
|
||
An index template can also attach a lifecycle policy to the new index.
|
||
Index templates are used to automatically configure indices created during <<glossary-rollover,rollover>>.
|
||
// end::index-template-def[]
|
||
--
|
||
|
||
[[glossary-leader-index]] leader index ::
|
||
// tag::leader-index-def[]
|
||
The source index for <<glossary-ccr,{ccr}>>. A leader index exists
|
||
on a remote cluster and is replicated to
|
||
<<glossary-follower-index,follower indices>>.
|
||
|
||
[[glossary-local-cluster]] local cluster ::
|
||
// tag::local-cluster-def[]
|
||
The cluster that pulls data from a <<glossary-remote-cluster,remote cluster>> in {ccs} or {ccr}.
|
||
// end::local-cluster-def[]
|
||
|
||
[[glossary-mapping]] mapping ::
|
||
|
||
A mapping is like a _schema definition_ in a relational database. Each
|
||
<<glossary-index,index>> has a mapping, which defines a <<glossary-type,type>>,
|
||
plus a number of index-wide settings.
|
||
+
|
||
A mapping can either be defined explicitly, or it will be generated
|
||
automatically when a document is indexed.
|
||
|
||
[[glossary-node]] node ::
|
||
|
||
A node is a running instance of Elasticsearch which belongs to a
|
||
<<glossary-cluster,cluster>>. Multiple nodes can be started on a single
|
||
server for testing purposes, but usually you should have one node per
|
||
server.
|
||
+
|
||
At startup, a node will use unicast to discover an existing cluster with
|
||
the same cluster name and will try to join that cluster.
|
||
|
||
[[glossary-primary-shard]] primary shard ::
|
||
|
||
Each document is stored in a single primary <<glossary-shard,shard>>. When
|
||
you index a document, it is indexed first on the primary shard, then
|
||
on all <<glossary-replica-shard,replicas>> of the primary shard.
|
||
+
|
||
By default, an <<glossary-index,index>> has one primary shard. You can specify
|
||
more primary shards to scale the number of <<glossary-document,documents>>
|
||
that your index can handle.
|
||
+
|
||
You cannot change the number of primary shards in an index, once the index is
|
||
created. However, an index can be split into a new index using the
|
||
<<indices-split-index, split API>>.
|
||
+
|
||
See also <<glossary-routing,routing>>
|
||
|
||
[[glossary-query]] query ::
|
||
|
||
A request for information from {es}. You can think of a query as a question,
|
||
written in a way {es} understands. A search consists of one or more queries
|
||
combined.
|
||
+
|
||
There are two types of queries: _scoring queries_ and _filters_. For more
|
||
information about query types, see <<query-filter-context>>.
|
||
|
||
[[glossary-recovery]] recovery ::
|
||
+
|
||
--
|
||
Shard recovery is the process
|
||
of syncing a <<glossary-replica-shard,replica shard>>
|
||
from a <<glossary-primary-shard,primary shard>>.
|
||
Upon completion,
|
||
the replica shard is available for search.
|
||
|
||
// tag::recovery-triggers[]
|
||
Recovery automatically occurs
|
||
during the following processes:
|
||
|
||
* Node startup or failure.
|
||
This type of recovery is called a *local store recovery*.
|
||
* <<glossary-replica-shard,Primary shard replication>>.
|
||
* Relocation of a shard to a different node in the same cluster.
|
||
* {ref}/snapshots-restore-snapshot.html[Snapshot restoration].
|
||
// end::recovery-triggers[]
|
||
--
|
||
|
||
[[glossary-reindex]] reindex ::
|
||
+
|
||
--
|
||
// tag::reindex-def[]
|
||
Copies documents from a _source_ to a _destination_. The source and
|
||
destination can be any pre-existing index, index alias, or
|
||
{ref}/data-streams.html[data stream].
|
||
|
||
You can reindex all documents from a source or select a subset of documents to
|
||
copy. You can also reindex to a destination in a remote cluster.
|
||
|
||
A reindex is often performed to update mappings, change static index settings,
|
||
or upgrade {es} between incompatible versions.
|
||
// end::reindex-def[]
|
||
--
|
||
|
||
[[glossary-remote-cluster]] remote cluster ::
|
||
|
||
// tag::remote-cluster-def[]
|
||
A separate cluster, often in a different data center or locale, that contains indices that
|
||
can be replicated or searched by the <<glossary-local-cluster,local cluster>>.
|
||
The connection to a remote cluster is unidirectional.
|
||
// end::remote-cluster-def[]
|
||
|
||
[[glossary-replica-shard]] replica shard ::
|
||
|
||
Each <<glossary-primary-shard,primary shard>> can have zero or more
|
||
replicas. A replica is a copy of the primary shard, and has two
|
||
purposes:
|
||
+
|
||
1. increase failover: a replica shard can be promoted to a primary
|
||
shard if the primary fails
|
||
2. increase performance: get and search requests can be handled by
|
||
primary or replica shards.
|
||
+
|
||
By default, each primary shard has one replica, but the number of
|
||
replicas can be changed dynamically on an existing index. A replica
|
||
shard will never be started on the same node as its primary shard.
|
||
|
||
[[glossary-rollover]] rollover ::
|
||
+
|
||
--
|
||
// tag::rollover-def[]
|
||
// tag::rollover-def-short[]
|
||
Creates a new index for a rollover target when the existing index reaches a certain size, number of docs, or age.
|
||
A rollover target can be either an <<indices-aliases, index alias>> or a <<data-streams, data stream>>.
|
||
// end::rollover-def-short[]
|
||
|
||
For example, if you're indexing log data, you might use rollover to create daily or weekly indices.
|
||
See the {ref}/indices-rollover-index.html[rollover index API].
|
||
// end::rollover-def[]
|
||
--
|
||
|
||
[[glossary-rollup]] rollup ::
|
||
// tag::rollup-def[]
|
||
Summarize high-granularity data into a more compressed format to
|
||
maintain access to historical data in a cost-effective way.
|
||
// end::rollup-def[]
|
||
|
||
[[glossary-rollup-index]] rollup index ::
|
||
// tag::rollup-index-def[]
|
||
A special type of index for storing historical data at reduced granularity.
|
||
Documents are summarized and indexed into a rollup index by a <<glossary-rollup-job,rollup job>>.
|
||
// end::rollup-index-def[]
|
||
|
||
[[glossary-rollup-job]] rollup job ::
|
||
// tag::rollup-job-def[]
|
||
A background task that runs continuously to summarize documents in an index and
|
||
index the summaries into a separate rollup index.
|
||
The job configuration controls what information is rolled up and how often.
|
||
// end::rollup-job-def[]
|
||
|
||
[[glossary-routing]] routing ::
|
||
|
||
When you index a document, it is stored on a single
|
||
<<glossary-primary-shard,primary shard>>. That shard is chosen by hashing
|
||
the `routing` value. By default, the `routing` value is derived from
|
||
the ID of the document or, if the document has a specified parent
|
||
document, from the ID of the parent document (to ensure that child and
|
||
parent documents are stored on the same shard).
|
||
+
|
||
This value can be overridden by specifying a `routing` value at index
|
||
time, or a <<mapping-routing-field,routing
|
||
field>> in the <<glossary-mapping,mapping>>.
|
||
|
||
[[glossary-searchable-snapshot]] searchable snapshot ::
|
||
// tag::searchable-snapshot-def[]
|
||
A <<glossary-snapshot, snapshot>> of an index that has been mounted as a
|
||
<<glossary-searchable-snapshot-index, searchable snapshot index>> and can be
|
||
searched as if it were a regular index.
|
||
// end::searchable-snapshot-def[]
|
||
|
||
[[glossary-searchable-snapshot-index]] searchable snapshot index ::
|
||
// tag::searchable-snapshot-index-def[]
|
||
An <<glossary-index, index>> whose data is stored in a <<glossary-snapshot,
|
||
snapshot>> that resides in a separate <<glossary-snapshot-repository,snapshot
|
||
repository>> such as AWS S3. Searchable snapshot indices do not need
|
||
<<glossary-replica-shard,replica>> shards for resilience, since their data is
|
||
reliably stored outside the cluster.
|
||
// end::searchable-snapshot-index-def[]
|
||
|
||
[[glossary-shard]] shard ::
|
||
+
|
||
--
|
||
// tag::shard-def[]
|
||
A shard is a single Lucene instance. It is a low-level “worker” unit
|
||
which is managed automatically by Elasticsearch. An index is a logical
|
||
namespace which points to <<glossary-primary-shard,primary>> and
|
||
<<glossary-replica-shard,replica>> shards.
|
||
+
|
||
Other than defining the number of primary and replica shards that an
|
||
index should have, you never need to refer to shards directly.
|
||
Instead, your code should deal only with an index.
|
||
+
|
||
Elasticsearch distributes shards amongst all <<glossary-node,nodes>> in the
|
||
<<glossary-cluster,cluster>>, and can move shards automatically from one
|
||
node to another in the case of node failure, or the addition of new
|
||
nodes.
|
||
// end::shard-def[]
|
||
--
|
||
|
||
[[glossary-shrink]] shrink ::
|
||
// tag::shrink-def[]
|
||
// tag::shrink-def-short[]
|
||
Reduce the number of primary shards in an index.
|
||
// end::shrink-def-short[]
|
||
You can shrink an index to reduce its overhead when the request volume drops.
|
||
For example, you might opt to shrink an index once it is no longer the write index.
|
||
See the {ref}/indices-shrink-index.html[shrink index API].
|
||
// end::shrink-def[]
|
||
|
||
[[glossary-snapshot]] snapshot ::
|
||
// tag::snapshot-def[]
|
||
Captures the state of the whole cluster or of particular indices or data
|
||
streams at a particular point in time. Snapshots provide a back up of a running
|
||
cluster, ensuring you can restore your data in the event of a failure. You can
|
||
also mount indices or datastreams from snapshots as read-only
|
||
{ref}/glossary.html#glossary-searchable-snapshot-index[searchable snapshots].
|
||
// end::snapshot-def[]
|
||
|
||
[[glossary-snapshot-lifecycle-policy]] snapshot lifecycle policy ::
|
||
// tag::snapshot-lifecycle-policy-def[]
|
||
Specifies how frequently to perform automatic backups of a cluster and
|
||
how long to retain the resulting snapshots.
|
||
// end::snapshot-lifecycle-policy-def[]
|
||
|
||
[[glossary-snapshot-repository]] snapshot repository ::
|
||
// tag::snapshot-repository-def[]
|
||
Specifies where snapshots are to be stored.
|
||
Snapshots can be written to a shared filesystem or to a remote repository.
|
||
// end::snapshot-repository-def[]
|
||
|
||
[[glossary-source_field]] source field ::
|
||
|
||
By default, the JSON document that you index will be stored in the
|
||
`_source` field and will be returned by all get and search requests.
|
||
This allows you access to the original object directly from search
|
||
results, rather than requiring a second step to retrieve the object
|
||
from an ID.
|
||
|
||
[[glossary-term]] term ::
|
||
|
||
A term is an exact value that is indexed in Elasticsearch. The terms
|
||
`foo`, `Foo`, `FOO` are NOT equivalent. Terms (i.e. exact values) can
|
||
be searched for using _term_ queries.
|
||
+
|
||
See also <<glossary-text,text>> and <<glossary-analysis,analysis>>.
|
||
|
||
[[glossary-text]] text ::
|
||
|
||
Text (or full text) is ordinary unstructured text, such as this
|
||
paragraph. By default, text will be <<glossary-analysis,analyzed>> into
|
||
<<glossary-term,terms>>, which is what is actually stored in the index.
|
||
+
|
||
Text <<glossary-field,fields>> need to be analyzed at index time in order to
|
||
be searchable as full text, and keywords in full text queries must be
|
||
analyzed at search time to produce (and search for) the same terms
|
||
that were generated at index time.
|
||
+
|
||
See also <<glossary-term,term>> and <<glossary-analysis,analysis>>.
|
||
|
||
[[glossary-type]] type ::
|
||
|
||
A type used to represent the _type_ of document, e.g. an `email`, a `user`, or a `tweet`.
|
||
Types are deprecated and are in the process of being removed.
|
||
See {ref}/removal-of-types.html[Removal of mapping types].
|
||
// end::type-def[]
|
||
|
||
[[glossary-warm-phase]] warm phase ::
|
||
// tag::warm-phase-def[]
|
||
The second possible phase in the <<glossary-index-lifecycle,index lifecycle>>.
|
||
In the warm phase, an index is generally optimized for search and no longer updated.
|
||
// end::warm-phase-def[]
|