OpenSearch/docs/reference/cat/segments.asciidoc
Jason Tedor 4a4e3d70d5
Default to one shard (#30539)
This commit changes the default out-of-the-box configuration for the
number of shards from five to one. We think this will help address a
common problem of oversharding. For users with time-based indices that
need a different default, this can be managed with index templates. For
users with non-time-based indices that find they need to re-shard with
the split API in place they no longer need to resort only to
reindexing.

Since this has the impact of changing the default number of shards used
in REST tests, we want to ensure that we still have coverage for issues
that could arise from multiple shards. As such, we randomize (rarely)
the default number of shards in REST tests to two. This is managed via a
global index template. However, some tests check the templates that are
in the cluster state during the test. Since this template is randomly
there, we need a way for tests to skip adding the template used to set
the number of shards to two. For this we add the default_shards feature
skip. To avoid having to write our docs in a complicated way because
sometimes they might be behind one shard, and sometimes they might be
behind two shards we apply the default_shards feature skip to all docs
tests. That is, these tests will always run with the default number of
shards (one).
2018-05-14 12:22:35 -04:00

74 lines
3.5 KiB
Plaintext

[[cat-segments]]
== cat segments
The `segments` command provides low level information about the segments
in the shards of an index. It provides information similar to the
link:indices-segments.html[_segments] endpoint. For example:
[source,js]
--------------------------------------------------
GET /_cat/segments?v
--------------------------------------------------
// CONSOLE
// TEST[s/^/PUT \/test\/test\/1?refresh\n{"test":"test"}\nPUT \/test1\/test\/1?refresh\n{"test":"test"}\n/]
might look like:
["source","txt",subs="attributes,callouts"]
--------------------------------------------------
index shard prirep ip segment generation docs.count docs.deleted size size.memory committed searchable version compound
test 0 p 127.0.0.1 _0 0 1 0 3kb 2042 false true {lucene_version} true
test1 0 p 127.0.0.1 _0 0 1 0 3kb 2042 false true {lucene_version} true
--------------------------------------------------
// TESTRESPONSE[s/3kb/\\d+(\\.\\d+)?[mk]?b/ s/2042/\\d+/ _cat]
The output shows information about index names and shard numbers in the first
two columns.
If you only want to get information about segments in one particular index,
you can add the index name in the URL, for example `/_cat/segments/test`. Also,
several indexes can be queried like `/_cat/segments/test,test1`
The following columns provide additional monitoring information:
prirep:: Whether this segment belongs to a primary or replica shard.
ip:: The ip address of the segment's shard.
segment:: A segment name, derived from the segment generation. The name
is internally used to generate the file names in the directory
of the shard this segment belongs to.
generation:: The generation number is incremented with each segment that is written.
The name of the segment is derived from this generation number.
docs.count:: The number of non-deleted documents that are stored in this segment.
Note that these are Lucene documents, so the count will include hidden
documents (e.g. from nested types).
docs.deleted:: The number of deleted documents that are stored in this segment.
It is perfectly fine if this number is greater than 0, space is
going to be reclaimed when this segment gets merged.
size:: The amount of disk space that this segment uses.
size.memory:: Segments store some data into memory in order to be searchable efficiently.
This column shows the number of bytes in memory that are used.
committed:: Whether the segment has been sync'ed on disk. Segments that are
committed would survive a hard reboot. No need to worry in case
of false, the data from uncommitted segments is also stored in
the transaction log so that Elasticsearch is able to replay
changes on the next start.
searchable:: True if the segment is searchable. A value of false would most
likely mean that the segment has been written to disk but no
refresh occurred since then to make it searchable.
version:: The version of Lucene that has been used to write this segment.
compound:: Whether the segment is stored in a compound file. When true, this
means that Lucene merged all files from the segment in a single
one in order to save file descriptors.