Add column type to sys table docs (#7359)

* Add column type

* oops should be used=1
Surekha 2019-03-27 20:21:57 -07:00 committed by Fangjin Yang
parent db0125b709
commit be318f4de3
1 changed files with 44 additions and 44 deletions

@@ -590,21 +590,21 @@ Segments table provides details on all Druid segments, whether they are publishe
 #### CAVEAT
 Note that a segment can be served by more than one stream ingestion tasks or Historical processes, in that case it would have multiple replicas. These replicas are weakly consistent with each other when served by multiple ingestion tasks, until a segment is eventually served by a Historical, at that point the segment is immutable. Broker prefers to query a segment from Historical over an ingestion task. But if a segment has multiple realtime replicas, for eg. kafka index tasks, and one task is slower than other, then the sys.segments query results can vary for the duration of the tasks because only one of the ingestion tasks is queried by the Broker and it is not gauranteed that the same task gets picked everytime. The `num_rows` column of segments table can have inconsistent values during this period. There is an open [issue](https://github.com/apache/incubator-druid/issues/5915) about this inconsistency with stream ingestion tasks.
-|Column|Notes|
-|------|-----|
-|segment_id|Unique segment identifier|
-|datasource|Name of datasource|
-|start|Interval start time (in ISO 8601 format)|
-|end|Interval end time (in ISO 8601 format)|
-|size|Size of segment in bytes|
-|version|Version string (generally an ISO8601 timestamp corresponding to when the segment set was first started). Higher version means the more recently created segment. Version comparing is based on string comparison.|
-|partition_num|Partition number (an integer, unique within a datasource+interval+version; may not necessarily be contiguous)|
-|num_replicas|Number of replicas of this segment currently being served|
-|num_rows|Number of rows in current segment, this value could be null if unkown to Broker at query time|
-|is_published|Boolean is represented as long type where 1 = true, 0 = false. 1 represents this segment has been published to the metadata store|
-|is_available|Boolean is represented as long type where 1 = true, 0 = false. 1 if this segment is currently being served by any server(Historical or realtime)|
-|is_realtime|Boolean is represented as long type where 1 = true, 0 = false. 1 if this segment is being served on any type of realtime tasks|
-|payload|JSON-serialized data segment payload|
+|Column|Type|Notes|
+|------|-----|-----|
+|segment_id|STRING|Unique segment identifier|
+|datasource|STRING|Name of datasource|
+|start|STRING|Interval start time (in ISO 8601 format)|
+|end|STRING|Interval end time (in ISO 8601 format)|
+|size|LONG|Size of segment in bytes|
+|version|STRING|Version string (generally an ISO8601 timestamp corresponding to when the segment set was first started). Higher version means the more recently created segment. Version comparing is based on string comparison.|
+|partition_num|LONG|Partition number (an integer, unique within a datasource+interval+version; may not necessarily be contiguous)|
+|num_replicas|LONG|Number of replicas of this segment currently being served|
+|num_rows|LONG|Number of rows in current segment, this value could be null if unkown to Broker at query time|
+|is_published|LONG|Boolean is represented as long type where 1 = true, 0 = false. 1 represents this segment has been published to the metadata store with `used=1`|
+|is_available|LONG|Boolean is represented as long type where 1 = true, 0 = false. 1 if this segment is currently being served by any process(Historical or realtime)|
+|is_realtime|LONG|Boolean is represented as long type where 1 = true, 0 = false. 1 if this segment is being served on any type of realtime tasks|
+|payload|STRING|JSON-serialized data segment payload|
 For example to retrieve all segments for datasource "wikipedia", use the query:
@@ -629,16 +629,16 @@ ORDER BY 2 DESC
 ### SERVERS table
 Servers table lists all data servers(any server that hosts a segment). It includes both Historicals and Peons.
-|Column|Notes|
-|------|-----|
-|server|Server name in the form host:port|
-|host|Hostname of the server|
-|plaintext_port|Unsecured port of the server, or -1 if plaintext traffic is disabled|
-|tls_port|TLS port of the server, or -1 if TLS is disabled|
-|server_type|Type of Druid service. Possible values include: Historical, realtime and indexer_executor(Peon).|
-|tier|Distribution tier see [druid.server.tier](#../configuration/index.html#Historical-General-Configuration)|
-|current_size|Current size of segments in bytes on this server|
-|max_size|Max size in bytes this server recommends to assign to segments see [druid.server.maxSize](#../configuration/index.html#Historical-General-Configuration)|
+|Column|Type|Notes|
+|------|-----|-----|
+|server|STRING|Server name in the form host:port|
+|host|STRING|Hostname of the server|
+|plaintext_port|LONG|Unsecured port of the server, or -1 if plaintext traffic is disabled|
+|tls_port|LONG|TLS port of the server, or -1 if TLS is disabled|
+|server_type|STRING|Type of Druid service. Possible values include: Historical, realtime and indexer_executor(Peon).|
+|tier|STRING|Distribution tier see [druid.server.tier](#../configuration/index.html#Historical-General-Configuration)|
+|current_size|LONG|Current size of segments in bytes on this server|
+|max_size|LONG|Max size in bytes this server recommends to assign to segments see [druid.server.maxSize](#../configuration/index.html#Historical-General-Configuration)|
 To retrieve information about all servers, use the query:
@@ -650,10 +650,10 @@ SELECT * FROM sys.servers;
 SERVER_SEGMENTS is used to join servers with segments table
-|Column|Notes|
-|------|-----|
-|server|Server name in format host:port (Primary key of [servers table](#SERVERS-table))|
-|segment_id|Segment identifier (Primary key of [segments table](#SEGMENTS-table))|
+|Column|Type|Notes|
+|------|-----|-----|
+|server|STRING|Server name in format host:port (Primary key of [servers table](#SERVERS-table))|
+|segment_id|STRING|Segment identifier (Primary key of [segments table](#SEGMENTS-table))|
 JOIN between "servers" and "segments" can be used to query the number of segments for a specific datasource,
 grouped by server, example query:
@@ -673,21 +673,21 @@ GROUP BY servers.server;
 The tasks table provides information about active and recently-completed indexing tasks. For more information
 check out [ingestion tasks](#../ingestion/tasks.html)
-|Column|Notes|
-|------|-----|
-|task_id|Unique task identifier|
-|type|Task type, for example this value is "index" for indexing tasks. See [tasks-overview](../ingestion/tasks.html)|
-|datasource|Datasource name being indexed|
-|created_time|Timestamp in ISO8601 format corresponding to when the ingestion task was created. Note that this value is populated for completed and waiting tasks. For running and pending tasks this value is set to 1970-01-01T00:00:00Z|
-|queue_insertion_time|Timestamp in ISO8601 format corresponding to when this task was added to the queue on the Overlord|
-|status|Status of a task can be RUNNING, FAILED, SUCCESS|
-|runner_status|Runner status of a completed task would be NONE, for in-progress tasks this can be RUNNING, WAITING, PENDING|
-|duration|Time it took to finish the task in milliseconds, this value is present only for completed tasks|
-|location|Server name where this task is running in the format host:port, this information is present only for RUNNING tasks|
-|host|Hostname of the server where task is running|
-|plaintext_port|Unsecured port of the server, or -1 if plaintext traffic is disabled|
-|tls_port|TLS port of the server, or -1 if TLS is disabled|
-|error_msg|Detailed error message in case of FAILED tasks|
+|Column|Type|Notes|
+|------|-----|-----|
+|task_id|STRING|Unique task identifier|
+|type|STRING|Task type, for example this value is "index" for indexing tasks. See [tasks-overview](../ingestion/tasks.html)|
+|datasource|STRING|Datasource name being indexed|
+|created_time|STRING|Timestamp in ISO8601 format corresponding to when the ingestion task was created. Note that this value is populated for completed and waiting tasks. For running and pending tasks this value is set to 1970-01-01T00:00:00Z|
+|queue_insertion_time|STRING|Timestamp in ISO8601 format corresponding to when this task was added to the queue on the Overlord|
+|status|STRING|Status of a task can be RUNNING, FAILED, SUCCESS|
+|runner_status|STRING|Runner status of a completed task would be NONE, for in-progress tasks this can be RUNNING, WAITING, PENDING|
+|duration|LONG|Time it took to finish the task in milliseconds, this value is present only for completed tasks|
+|location|STRING|Server name where this task is running in the format host:port, this information is present only for RUNNING tasks|
+|host|STRING|Hostname of the server where task is running|
+|plaintext_port|LONG|Unsecured port of the server, or -1 if plaintext traffic is disabled|
+|tls_port|LONG|TLS port of the server, or -1 if TLS is disabled|
+|error_msg|STRING|Detailed error message in case of FAILED tasks|
 For example, to retrieve tasks information filtered by status, use the query