Update kinesis-ingestion.md (#11767)

* Update kinesis-ingestion.md

It seems that we are declaring (a final int) recordsPerFetch as 400 and fetchDelayMillis as 0 in https://github.com/implydata/druid/blob/imply-2021.09/extensions-core/kinesis-indexing-service/src/main/java/org/apache/druid/indexing/kinesis/KinesisIndexTaskIOConfig.java#L36

```
public static final int DEFAULT_RECORDS_PER_FETCH = 4000;
public static final int DEFAULT_FETCH_DELAY_MILLIS = 0;
```

updating `recordsPerFetch` and `fetchDelayMillis` to actual default values as hardcoded above .

* Update docs/development/extensions-core/kinesis-ingestion.md

Co-authored-by: Charles Smith <techdocsmith@gmail.com>
This commit is contained in:
Vaibhav 2021-10-04 23:56:53 +05:30 committed by GitHub
parent d02d2d9d56
commit 3c4bba1478
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23
1 changed files with 6 additions and 6 deletions

View File

@ -102,8 +102,8 @@ Where the file `supervisor-spec.json` contains a Kinesis supervisor spec:
"taskCount": 1, "taskCount": 1,
"replicas": 1, "replicas": 1,
"taskDuration": "PT1H", "taskDuration": "PT1H",
"recordsPerFetch": 2000, "recordsPerFetch": 4000,
"fetchDelayMillis": 1000 "fetchDelayMillis": 0
}, },
"tuningConfig": { "tuningConfig": {
"type": "kinesis", "type": "kinesis",
@ -141,8 +141,8 @@ Where the file `supervisor-spec.json` contains a Kinesis supervisor spec:
|`completionTimeout`|ISO8601 Period|The length of time to wait before declaring a publishing task as failed and terminating it. If this is set too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|no (default == PT6H)| |`completionTimeout`|ISO8601 Period|The length of time to wait before declaring a publishing task as failed and terminating it. If this is set too low, your tasks may never publish. The publishing clock for a task begins roughly after `taskDuration` elapses.|no (default == PT6H)|
|`lateMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject messages with timestamps earlier than this period before the task was created; for example if this is set to `PT1H` and the supervisor creates a task at *2016-01-01T12:00Z*, messages with timestamps earlier than *2016-01-01T11:00Z* will be dropped. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments (e.g. a realtime and a nightly batch ingestion pipeline).|no (default == none)| |`lateMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject messages with timestamps earlier than this period before the task was created; for example if this is set to `PT1H` and the supervisor creates a task at *2016-01-01T12:00Z*, messages with timestamps earlier than *2016-01-01T11:00Z* will be dropped. This may help prevent concurrency issues if your data stream has late messages and you have multiple pipelines that need to operate on the same segments (e.g. a realtime and a nightly batch ingestion pipeline).|no (default == none)|
|`earlyMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject messages with timestamps later than this period after the task reached its taskDuration; for example if this is set to `PT1H`, the taskDuration is set to `PT1H` and the supervisor creates a task at *2016-01-01T12:00Z*, messages with timestamps later than *2016-01-01T14:00Z* will be dropped. **Note:** Tasks sometimes run past their task duration, for example, in cases of supervisor failover. Setting earlyMessageRejectionPeriod too low may cause messages to be dropped unexpectedly whenever a task runs past its originally configured task duration.|no (default == none)| |`earlyMessageRejectionPeriod`|ISO8601 Period|Configure tasks to reject messages with timestamps later than this period after the task reached its taskDuration; for example if this is set to `PT1H`, the taskDuration is set to `PT1H` and the supervisor creates a task at *2016-01-01T12:00Z*, messages with timestamps later than *2016-01-01T14:00Z* will be dropped. **Note:** Tasks sometimes run past their task duration, for example, in cases of supervisor failover. Setting earlyMessageRejectionPeriod too low may cause messages to be dropped unexpectedly whenever a task runs past its originally configured task duration.|no (default == none)|
|`recordsPerFetch`|Integer|The number of records to request per GetRecords call to Kinesis. See 'Determining Fetch Settings' below.|no (default == 2000)| |`recordsPerFetch`|Integer|The number of records to request per call to fetch records from Kinesis. See [Determining fetch settings](#determining-fetch-settings).|no (default == 4000)|
|`fetchDelayMillis`|Integer|Time in milliseconds to wait between subsequent GetRecords calls to Kinesis. See 'Determining Fetch Settings' below.|no (default == 1000)| |`fetchDelayMillis`|Integer|Time in milliseconds to wait between subsequent calls to fetch records from Kinesis. See [Determining fetch settings](#determining-fetch-settings).|no (default == 0)|
|`awsAssumedRoleArn`|String|The AWS assumed role to use for additional permissions.|no| |`awsAssumedRoleArn`|String|The AWS assumed role to use for additional permissions.|no|
|`awsExternalId`|String|The AWS external id to use for additional permissions.|no| |`awsExternalId`|String|The AWS external id to use for additional permissions.|no|
|`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. See below for details.|no| |`deaggregate`|Boolean|Whether to use the de-aggregate function of the KCL. See below for details.|no|
@ -247,8 +247,8 @@ The following example demonstrates a supervisor spec with `lagBased` autoScaler
"taskCount": 1, "taskCount": 1,
"replicas": 1, "replicas": 1,
"taskDuration": "PT1H", "taskDuration": "PT1H",
"recordsPerFetch": 2000, "recordsPerFetch": 4000,
"fetchDelayMillis": 1000 "fetchDelayMillis": 0
}, },
"tuningConfig": { "tuningConfig": {
"type": "kinesis", "type": "kinesis",