OpenSearch/docs/reference/indices/segments.asciidoc
Clinton Gormley ff4a2519f2 Update experimental labels in the docs (#25727)
Relates https://github.com/elastic/elasticsearch/issues/19798

Removed experimental label from:
* Painless
* Diversified Sampler Agg
* Sampler Agg
* Significant Terms Agg
* Terms Agg document count error and execution_hint
* Cardinality Agg precision_threshold
* Pipeline Aggregations
* index.shard.check_on_startup
* index.store.type (added warning)
* Preloading data into the file system cache
* foreach ingest processor
* Field caps API
* Profile API

Added experimental label to:
* Moving Average Agg Prediction


Changed experimental to beta for:
* Adjacency matrix agg
* Normalizers
* Tasks API
* Index sorting

Labelled experimental in Lucene:
* ICU plugin custom rules file
* Flatten graph token filter
* Synonym graph token filter
* Word delimiter graph token filter
* Simple pattern tokenizer
* Simple pattern split tokenizer

Replaced experimental label with warning that details may change in the future:
* Analysis explain output format
* Segments verbose output format
* Percentile Agg compression and HDR Histogram
* Percentile Rank Agg HDR Histogram
2017-07-18 14:06:22 +02:00


[[indices-segments]]
== Indices Segments
Provides low-level information about the segments that a Lucene index
(shard level) is built with. Use it to learn more about the state of a
shard and an index, such as possible optimization information or data
"wasted" on deletes.
The endpoint can return segments for a specific index, several indices, or
all indices:
[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/test/_segments'
curl -XGET 'http://localhost:9200/test1,test2/_segments'
curl -XGET 'http://localhost:9200/_segments'
--------------------------------------------------
Response:
[source,js]
--------------------------------------------------
{
    ...
    "_3": {
        "generation": 3,
        "num_docs": 1121,
        "deleted_docs": 53,
        "size_in_bytes": 228288,
        "memory_in_bytes": 3211,
        "committed": true,
        "search": true,
        "version": "4.6",
        "compound": true
    }
    ...
}
--------------------------------------------------
_3:: The key of the JSON document is the name of the segment. This name
is used to generate file names: all files in the shard directory
whose names start with this segment name belong to this segment.
generation:: A generation number that is incremented each time a new
segment is written. The segment name is derived from this
generation number.
num_docs:: The number of non-deleted documents that are stored in this segment.
deleted_docs:: The number of deleted documents that are stored in this segment.
It is perfectly fine if this number is greater than 0; the space is
reclaimed when this segment is merged.
size_in_bytes:: The amount of disk space that this segment uses, in bytes.
memory_in_bytes:: Segments need to keep some data in memory in order to be
searched efficiently. This value is the number of bytes
used for that purpose. A value of -1 indicates that
Elasticsearch was not able to compute this number.
committed:: Whether the segment has been synced to disk. Segments that are
committed survive a hard reboot. A value of false is no cause for
concern: the data from uncommitted segments is also stored in
the transaction log, so Elasticsearch is able to replay
changes on the next start.
search:: Whether the segment is searchable. A value of false would most
likely mean that the segment has been written to disk but no
refresh occurred since then to make it searchable.
version:: The version of Lucene that has been used to write this segment.
compound:: Whether the segment is stored in a compound file. When true, this
means that Lucene merged all files from the segment into a single
file in order to save file descriptors.
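
The fields described above lend themselves to a few quick checks. The following Python sketch is only an illustration: it computes the fraction of a segment's documents that are deleted and reports the `committed` and `search` flags, using the values from the example response. The helper names `summarize_segment` and `generation_from_name` are hypothetical, and the base-36 parsing of the segment name is an assumption about how Lucene names segments, not something this page specifies.

[source,python]
--------------------------------------------------
def generation_from_name(name):
    """Parse the generation out of a segment name such as "_3".

    Assumption: the name is "_" followed by the generation in base 36.
    """
    return int(name[1:], 36)


def summarize_segment(name, segment):
    """Return a small summary dict for one segment entry of the response."""
    total_docs = segment["num_docs"] + segment["deleted_docs"]
    deleted_ratio = segment["deleted_docs"] / total_docs if total_docs else 0.0
    return {
        "name": name,
        "deleted_ratio": deleted_ratio,                          # share of docs awaiting merge
        "committed": segment["committed"],                       # synced to disk
        "searchable": segment["search"],                         # visible to searches
        "memory_known": segment["memory_in_bytes"] != -1,        # -1 means "not computed"
    }


# Values copied from the example response above.
segment = {
    "generation": 3,
    "num_docs": 1121,
    "deleted_docs": 53,
    "size_in_bytes": 228288,
    "memory_in_bytes": 3211,
    "committed": True,
    "search": True,
    "version": "4.6",
    "compound": True,
}

summary = summarize_segment("_3", segment)
print(summary)
--------------------------------------------------

For the example segment, the deleted fraction is 53 / (1121 + 53), or about 4.5%.
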
[float]
=== Verbose mode
To add additional information that can be used for debugging, use the `verbose` flag.
NOTE: The format of this additional detail is labelled experimental in Lucene and may change in the future.
[source,js]
--------------------------------------------------
curl -XGET 'http://localhost:9200/test/_segments?verbose=true'
--------------------------------------------------
Response:
[source,js]
--------------------------------------------------
{
    ...
    "_3": {
        ...
        "ram_tree": [
            {
                "description": "postings [PerFieldPostings(format=1)]",
                "size_in_bytes": 2696,
                "children": [
                    {
                        "description": "format 'Lucene50_0' ...",
                        "size_in_bytes": 2608,
                        "children": [ ... ]
                    },
                    ...
                ]
            },
            ...
        ]
    }
    ...
}
--------------------------------------------------
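
Because `ram_tree` is nested, it can be easier to read once flattened. The Python sketch below walks the tree recursively and prints one indented line per node. It assumes only the three keys visible in the response above (`description`, `size_in_bytes`, `children`); since the verbose format may change, treat it as an illustration rather than a stable parser. The function name `flatten_ram_tree` is hypothetical, and the sample data contains only the two nodes shown above, with the elided children left empty.

[source,python]
--------------------------------------------------
def flatten_ram_tree(nodes, depth=0):
    """Yield (depth, description, size_in_bytes) for every node, parents first."""
    for node in nodes:
        yield depth, node["description"], node["size_in_bytes"]
        # Leaf nodes may have no "children" key, so default to an empty list.
        yield from flatten_ram_tree(node.get("children", []), depth + 1)


# Only the nodes visible in the example response; elided parts are omitted.
ram_tree = [
    {
        "description": "postings [PerFieldPostings(format=1)]",
        "size_in_bytes": 2696,
        "children": [
            {
                "description": "format 'Lucene50_0' ...",
                "size_in_bytes": 2608,
                "children": [],  # elided in the response; left empty here
            },
        ],
    },
]

rows = list(flatten_ram_tree(ram_tree))
for depth, description, size in rows:
    print("  " * depth + f"{size:>6}  {description}")
--------------------------------------------------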