Web console: targetRowsPerSegment for hashed partitioning (#10500)

* Web console: targetRowsPerSegment for hashed partitioning

Added `targetRowsPerSegment` to the web console for hashed partitioning, in both
the auto compaction view and the ingestion workflow.

The help text was also updated to indicate when a user should care about
updating these fields.

* code review

* update test snapshots

* oops
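For context, the two hashed-partitioning knobs this change surfaces are mutually exclusive. The sketch below is illustrative, not taken from the commit: the spec values are made up, and `isValidHashed` is a hypothetical name mirroring the validation the change adds.

```typescript
// Illustrative: the two mutually exclusive ways to configure hashed
// partitioning that the console now exposes (values are made up).
const byTargetRows = {
  tuningConfig: {
    partitionsSpec: {
      type: 'hashed',
      targetRowsPerSegment: 5000000, // aim for ~5M rows per segment
    },
  },
};

const byNumShards = {
  tuningConfig: {
    partitionsSpec: {
      type: 'hashed',
      numShards: 10, // use when the optimal shard count is already known
    },
  },
};

// A valid hashed spec sets at most one of the two fields
// (hypothetical helper mirroring the validation in this change):
const isValidHashed = (spec: any) =>
  !(
    Boolean(spec.tuningConfig.partitionsSpec.targetRowsPerSegment) &&
    Boolean(spec.tuningConfig.partitionsSpec.numShards)
  );

console.log(isValidHashed(byTargetRows), isValidHashed(byNumShards)); // true true
```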
Suneet Saldanha 2020-10-11 16:55:28 -07:00 committed by GitHub
parent ad437dd655
commit b45a56f989
3 changed files with 156 additions and 13 deletions


@@ -80,7 +80,26 @@ exports[`CompactionDialog matches snapshot with compactionConfig (dynamic partit
Object {
"defined": [Function],
"info": <React.Fragment>
Directly specify the number of shards to create. If this is specified and 'intervals' is specified in the granularitySpec, the index task can skip the determine intervals/partitions pass through the data.
<p>
If the segments generated are a sub-optimal size for the requested partition dimensions, consider setting this field.
</p>
<p>
A target row count for each partition. Each partition will have a row count close to the target assuming evenly distributed keys. Defaults to 5 million if numShards is null.
</p>
</React.Fragment>,
"label": "Target rows per segment",
"name": "tuningConfig.partitionsSpec.targetRowsPerSegment",
"type": "number",
},
Object {
"defined": [Function],
"info": <React.Fragment>
<p>
If you know the optimal number of shards and want to speed up the time it takes for compaction to run, set this field.
</p>
<p>
Directly specify the number of shards to create. If this is specified and 'intervals' is specified in the granularitySpec, the index task can skip the determine intervals/partitions pass through the data.
</p>
</React.Fragment>,
"label": "Num shards",
"name": "tuningConfig.partitionsSpec.numShards",
@@ -298,7 +317,26 @@ exports[`CompactionDialog matches snapshot with compactionConfig (hashed partiti
Object {
"defined": [Function],
"info": <React.Fragment>
Directly specify the number of shards to create. If this is specified and 'intervals' is specified in the granularitySpec, the index task can skip the determine intervals/partitions pass through the data.
<p>
If the segments generated are a sub-optimal size for the requested partition dimensions, consider setting this field.
</p>
<p>
A target row count for each partition. Each partition will have a row count close to the target assuming evenly distributed keys. Defaults to 5 million if numShards is null.
</p>
</React.Fragment>,
"label": "Target rows per segment",
"name": "tuningConfig.partitionsSpec.targetRowsPerSegment",
"type": "number",
},
Object {
"defined": [Function],
"info": <React.Fragment>
<p>
If you know the optimal number of shards and want to speed up the time it takes for compaction to run, set this field.
</p>
<p>
Directly specify the number of shards to create. If this is specified and 'intervals' is specified in the granularitySpec, the index task can skip the determine intervals/partitions pass through the data.
</p>
</React.Fragment>,
"label": "Num shards",
"name": "tuningConfig.partitionsSpec.numShards",
@@ -516,7 +554,26 @@ exports[`CompactionDialog matches snapshot with compactionConfig (single_dim par
Object {
"defined": [Function],
"info": <React.Fragment>
Directly specify the number of shards to create. If this is specified and 'intervals' is specified in the granularitySpec, the index task can skip the determine intervals/partitions pass through the data.
<p>
If the segments generated are a sub-optimal size for the requested partition dimensions, consider setting this field.
</p>
<p>
A target row count for each partition. Each partition will have a row count close to the target assuming evenly distributed keys. Defaults to 5 million if numShards is null.
</p>
</React.Fragment>,
"label": "Target rows per segment",
"name": "tuningConfig.partitionsSpec.targetRowsPerSegment",
"type": "number",
},
Object {
"defined": [Function],
"info": <React.Fragment>
<p>
If you know the optimal number of shards and want to speed up the time it takes for compaction to run, set this field.
</p>
<p>
Directly specify the number of shards to create. If this is specified and 'intervals' is specified in the granularitySpec, the index task can skip the determine intervals/partitions pass through the data.
</p>
</React.Fragment>,
"label": "Num shards",
"name": "tuningConfig.partitionsSpec.numShards",
@@ -734,7 +791,26 @@ exports[`CompactionDialog matches snapshot without compactionConfig 1`] = `
Object {
"defined": [Function],
"info": <React.Fragment>
Directly specify the number of shards to create. If this is specified and 'intervals' is specified in the granularitySpec, the index task can skip the determine intervals/partitions pass through the data.
<p>
If the segments generated are a sub-optimal size for the requested partition dimensions, consider setting this field.
</p>
<p>
A target row count for each partition. Each partition will have a row count close to the target assuming evenly distributed keys. Defaults to 5 million if numShards is null.
</p>
</React.Fragment>,
"label": "Target rows per segment",
"name": "tuningConfig.partitionsSpec.targetRowsPerSegment",
"type": "number",
},
Object {
"defined": [Function],
"info": <React.Fragment>
<p>
If you know the optimal number of shards and want to speed up the time it takes for compaction to run, set this field.
</p>
<p>
Directly specify the number of shards to create. If this is specified and 'intervals' is specified in the granularitySpec, the index task can skip the determine intervals/partitions pass through the data.
</p>
</React.Fragment>,
"label": "Num shards",
"name": "tuningConfig.partitionsSpec.numShards",


@@ -74,16 +74,44 @@ const COMPACTION_CONFIG_FIELDS: Field<CompactionConfig>[] = [
info: <>Total number of rows in segments waiting for being pushed.</>,
},
// partitionsSpec type: hashed
{
name: 'tuningConfig.partitionsSpec.targetRowsPerSegment',
label: 'Target rows per segment',
type: 'number',
defined: (t: CompactionConfig) =>
deepGet(t, 'tuningConfig.partitionsSpec.type') === 'hashed' &&
!deepGet(t, 'tuningConfig.partitionsSpec.numShards'),
info: (
<>
<p>
If the segments generated are a sub-optimal size for the requested partition dimensions,
consider setting this field.
</p>
<p>
A target row count for each partition. Each partition will have a row count close to the
target assuming evenly distributed keys. Defaults to 5 million if numShards is null.
</p>
</>
),
},
{
name: 'tuningConfig.partitionsSpec.numShards',
label: 'Num shards',
type: 'number',
defined: (t: CompactionConfig) => deepGet(t, 'tuningConfig.partitionsSpec.type') === 'hashed',
defined: (t: CompactionConfig) =>
deepGet(t, 'tuningConfig.partitionsSpec.type') === 'hashed' &&
!deepGet(t, 'tuningConfig.partitionsSpec.targetRowsPerSegment'),
info: (
<>
Directly specify the number of shards to create. If this is specified and 'intervals' is
specified in the granularitySpec, the index task can skip the determine intervals/partitions
pass through the data.
<p>
If you know the optimal number of shards and want to speed up the time it takes for
compaction to run, set this field.
</p>
<p>
Directly specify the number of shards to create. If this is specified and 'intervals' is
specified in the granularitySpec, the index task can skip the determine
intervals/partitions pass through the data.
</p>
</>
),
},
@@ -211,7 +239,12 @@ function validCompactionConfig(compactionConfig: CompactionConfig): boolean {
deepGet(compactionConfig, 'tuningConfig.partitionsSpec.type') || 'dynamic';
switch (partitionsSpecType) {
// case 'dynamic': // Nothing to check for dynamic
// case 'hashed': // Nothing to check for hashed
case 'hashed':
return !(
Boolean(deepGet(compactionConfig, 'tuningConfig.partitionsSpec.targetRowsPerSegment')) &&
Boolean(deepGet(compactionConfig, 'tuningConfig.partitionsSpec.numShards'))
);
break;
case 'single_dim':
if (!deepGet(compactionConfig, 'tuningConfig.partitionsSpec.partitionDimension')) {
return false;
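The same mutual exclusivity drives the form fields' `defined` predicates in the hunks above: each hashed field is shown only while the other is unset. A standalone sketch of that behavior, with a toy `deepGet` standing in for the console's own utility:

```typescript
// Toy deepGet, standing in for the web console's utility function.
function deepGet(obj: Record<string, any>, path: string): any {
  return path.split('.').reduce((o, k) => (o == null ? undefined : o[k]), obj);
}

// Mirrors the `defined` predicates in the diff: targetRowsPerSegment is
// shown only while numShards is unset, and vice versa.
const targetRowsShown = (t: Record<string, any>) =>
  deepGet(t, 'tuningConfig.partitionsSpec.type') === 'hashed' &&
  !deepGet(t, 'tuningConfig.partitionsSpec.numShards');

const numShardsShown = (t: Record<string, any>) =>
  deepGet(t, 'tuningConfig.partitionsSpec.type') === 'hashed' &&
  !deepGet(t, 'tuningConfig.partitionsSpec.targetRowsPerSegment');

const config = {
  tuningConfig: { partitionsSpec: { type: 'hashed', numShards: 10 } },
};
console.log(targetRowsShown(config), numShardsShown(config)); // false true
```

Setting `numShards` therefore hides the "Target rows per segment" field in the form, which is how the console steers users toward specifying only one of the two.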


@@ -2114,6 +2114,11 @@ export function invalidTuningConfig(tuningConfig: TuningConfig, intervals: any):
if (!intervals) return true;
switch (deepGet(tuningConfig, 'partitionsSpec.type')) {
case 'hashed':
return (
Boolean(deepGet(tuningConfig, 'partitionsSpec.targetRowsPerSegment')) &&
Boolean(deepGet(tuningConfig, 'partitionsSpec.numShards'))
);
case 'single_dim':
if (!deepGet(tuningConfig, 'partitionsSpec.partitionDimension')) return true;
const hasTargetRowsPerSegment = Boolean(
@@ -2181,16 +2186,45 @@ export function getPartitionRelatedTuningSpecFormFields(
info: <>Total number of rows in segments waiting for being pushed.</>,
},
// partitionsSpec type: hashed
{
name: 'partitionsSpec.targetRowsPerSegment',
label: 'Target rows per segment',
type: 'number',
defined: (t: TuningConfig) =>
deepGet(t, 'partitionsSpec.type') === 'hashed' &&
!deepGet(t, 'partitionsSpec.numShards'),
info: (
<>
<p>
If the segments generated are a sub-optimal size for the requested partition
dimensions, consider setting this field.
</p>
<p>
A target row count for each partition. Each partition will have a row count close to
the target assuming evenly distributed keys. Defaults to 5 million if numShards is
null.
</p>
</>
),
},
{
name: 'partitionsSpec.numShards',
label: 'Num shards',
type: 'number',
defined: (t: TuningConfig) => deepGet(t, 'partitionsSpec.type') === 'hashed',
defined: (t: TuningConfig) =>
deepGet(t, 'partitionsSpec.type') === 'hashed' &&
!deepGet(t, 'partitionsSpec.targetRowsPerSegment'),
info: (
<>
Directly specify the number of shards to create. If this is specified and 'intervals'
is specified in the granularitySpec, the index task can skip the determine
intervals/partitions pass through the data.
<p>
If you know the optimal number of shards and want to speed up the time it takes for
compaction to run, set this field.
</p>
<p>
Directly specify the number of shards to create. If this is specified and
'intervals' is specified in the granularitySpec, the index task can skip the
determine intervals/partitions pass through the data.
</p>
</>
),
},