mirror of
https://github.com/honeymoose/OpenSearch.git
synced 2025-02-06 21:18:31 +00:00
fadbe0de08
Today we require users to prepare their indices for split operations. Yet, we can do this automatically when an index is created which would make the split feature a much more appealing option since it doesn't have any 3rd party prerequisites anymore. This change automatically sets the number of routinng shards such that an index is guaranteed to be able to split once into twice as many shards. The number of routing shards is scaled towards the default shard limit per index such that indices with a smaller amount of shards can be split more often than larger ones. For instance an index with 1 or 2 shards can be split 10x (until it approaches 1024 shards) while an index created with 128 shards can only be split 3x by a factor of 2. Please note this is just a default value and users can still prepare their indices with `index.number_of_routing_shards` for custom splitting. NOTE: this change has an impact on the document distribution since we are changing the hash space. Documents are still uniformly distributed across all shards but since we are artificually changing the number of buckets in the consistent hashign space document might be hashed into different shards compared to previous versions. This is a 7.0 only change.
74 lines
3.5 KiB
Plaintext
74 lines
3.5 KiB
Plaintext
[[cat-segments]]
|
|
== cat segments
|
|
|
|
The `segments` command provides low level information about the segments
|
|
in the shards of an index. It provides information similar to the
|
|
link:indices-segments.html[_segments] endpoint. For example:
|
|
|
|
[source,js]
|
|
--------------------------------------------------
|
|
GET /_cat/segments?v
|
|
--------------------------------------------------
|
|
// CONSOLE
|
|
// TEST[s/^/PUT \/test\/test\/1?refresh\n{"test":"test"}\nPUT \/test1\/test\/1?refresh\n{"test":"test"}\n/]
|
|
|
|
might look like:
|
|
|
|
["source","txt",subs="attributes,callouts"]
|
|
--------------------------------------------------
|
|
index shard prirep ip segment generation docs.count docs.deleted size size.memory committed searchable version compound
|
|
test 4 p 127.0.0.1 _0 0 1 0 3kb 2042 false true {lucene_version} true
|
|
test1 4 p 127.0.0.1 _0 0 1 0 3kb 2042 false true {lucene_version} true
|
|
--------------------------------------------------
|
|
// TESTRESPONSE[s/3kb/\\d+(\\.\\d+)?[mk]?b/ s/2042/\\d+/ _cat]
|
|
|
|
The output shows information about index names and shard numbers in the first
|
|
two columns.
|
|
|
|
If you only want to get information about segments in one particular index,
|
|
you can add the index name in the URL, for example `/_cat/segments/test`. Also,
|
|
several indexes can be queried like `/_cat/segments/test,test1`
|
|
|
|
|
|
The following columns provide additional monitoring information:
|
|
|
|
prirep:: Whether this segment belongs to a primary or replica shard.
|
|
|
|
ip:: The ip address of the segment's shard.
|
|
|
|
segment:: A segment name, derived from the segment generation. The name
|
|
is internally used to generate the file names in the directory
|
|
of the shard this segment belongs to.
|
|
|
|
generation:: The generation number is incremented with each segment that is written.
|
|
The name of the segment is derived from this generation number.
|
|
|
|
docs.count:: The number of non-deleted documents that are stored in this segment.
|
|
Note that these are Lucene documents, so the count will include hidden
|
|
documents (e.g. from nested types).
|
|
|
|
docs.deleted:: The number of deleted documents that are stored in this segment.
|
|
It is perfectly fine if this number is greater than 0, space is
|
|
going to be reclaimed when this segment gets merged.
|
|
|
|
size:: The amount of disk space that this segment uses.
|
|
|
|
size.memory:: Segments store some data into memory in order to be searchable efficiently.
|
|
This column shows the number of bytes in memory that are used.
|
|
|
|
committed:: Whether the segment has been sync'ed on disk. Segments that are
|
|
committed would survive a hard reboot. No need to worry in case
|
|
of false, the data from uncommitted segments is also stored in
|
|
the transaction log so that Elasticsearch is able to replay
|
|
changes on the next start.
|
|
|
|
searchable:: True if the segment is searchable. A value of false would most
|
|
likely mean that the segment has been written to disk but no
|
|
refresh occurred since then to make it searchable.
|
|
|
|
version:: The version of Lucene that has been used to write this segment.
|
|
|
|
compound:: Whether the segment is stored in a compound file. When true, this
|
|
means that Lucene merged all files from the segment in a single
|
|
one in order to save file descriptors.
|