mirror of https://github.com/apache/druid.git
Fix docs and flipped boolean in ScanQueryLimitRowIterator
parent 35692680fc
commit 8a6bb1127c
@@ -158,7 +158,7 @@ The format of the result when resultFormat equals `compactedList`:
 The Scan query currently supports ordering based on timestamp for non-legacy queries. Note that using time ordering
 will yield results that do not indicate which segment rows are from (`segmentId` will show up as `null`). Furthermore,
 time ordering is only supported where the result set limit is less than `druid.query.scan.maxRowsQueuedForOrdering`
-rows **or** fewer than `druid.query.scan.maxSegmentPartitionsOrderedInMemory` segments are scanned per Historical. The
+rows **or** all segments scanned have fewer than `druid.query.scan.maxSegmentPartitionsOrderedInMemory` partitions. The
 reasoning behind these limitations is that the implementation of time ordering uses two strategies that can consume too
 much heap memory if left unbounded. These strategies (listed below) are chosen on a per-Historical basis depending on
 query result set limit and the number of segments being scanned.
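Note: the corrected sentence above describes the two bounds that gate time ordering. A minimal sketch of how a per-Historical strategy choice could follow from those bounds; the names and the preference order are assumptions for illustration, not Druid's actual code or configuration classes:

```java
// Illustrative only: a strategy choice matching the limits described in the
// docs hunk above. Preference order is an assumption, not Druid's code.
enum TimeOrderingStrategy
{
  PRIORITY_QUEUE, // bounded heap over the whole result set
  N_WAY_MERGE,    // streaming merge over per-partition iterators
  REJECT          // neither bound holds; refuse time ordering to protect heap
}

final class StrategyChooser
{
  static TimeOrderingStrategy choose(
      long resultSetLimit,
      int partitionsToScan,
      long maxRowsQueuedForOrdering,
      int maxSegmentPartitionsOrderedInMemory
  )
  {
    if (resultSetLimit < maxRowsQueuedForOrdering) {
      return TimeOrderingStrategy.PRIORITY_QUEUE;
    }
    if (partitionsToScan < maxSegmentPartitionsOrderedInMemory) {
      return TimeOrderingStrategy.N_WAY_MERGE;
    }
    return TimeOrderingStrategy.REJECT;
  }
}
```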
@@ -170,12 +170,12 @@ priority queue are streamed back to the Broker(s) in batches. Attempting to loa
 risk of Historical nodes running out of memory. The `druid.query.scan.maxRowsQueuedForOrdering` property protects
 from this by limiting the number of rows in the query result set when time ordering is used.
 
-2. N-Way Merge: Each segment on a Historical is opened in parallel. Since each segment's rows are already
-time-ordered, an n-way merge can be performed on the results from each segment. This approach doesn't persist the entire
+2. N-Way Merge: For each segment, each partition is opened in parallel. Since each partition's rows are already
+time-ordered, an n-way merge can be performed on the results from each partition. This approach doesn't persist the entire
 result set in memory (like the Priority Queue) as it streams back batches as they are returned from the merge function.
-However, attempting to query too many segments could also result in high memory usage due to the need to open
+However, attempting to query too many partitions could also result in high memory usage due to the need to open
 decompression and decoding buffers for each. The `druid.query.scan.maxSegmentPartitionsOrderedInMemory` limit protects
-from this by capping the number of segments opened per historical when time ordering is used.
+from this by capping the number of partitions opened at any time when time ordering is used.
 
 Both `druid.query.scan.maxRowsQueuedForOrdering` and `druid.query.scan.maxSegmentPartitionsOrderedInMemory` are
 configurable and can be tuned based on hardware specs and number of dimensions being queried.
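The n-way merge in point 2 can be sketched with plain Java iterators: one already time-ordered stream per partition, and a heap that holds only each stream's current head. This is a self-contained illustration of the technique, not Druid's implementation:

```java
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Sketch of the n-way merge strategy described in the docs hunk above. The
// heap holds one pending row per open partition, so heap usage scales with
// the number of partitions (hence maxSegmentPartitionsOrderedInMemory), not
// with the result set size.
final class NWayMergeSketch
{
  // Pairs an iterator with its current head so it can sit in the heap.
  private static final class Head<T>
  {
    T value;
    final Iterator<T> rest;

    Head(T value, Iterator<T> rest)
    {
      this.value = value;
      this.rest = rest;
    }
  }

  static <T> Iterator<T> merge(List<Iterator<T>> partitions, Comparator<T> timeOrder)
  {
    final PriorityQueue<Head<T>> heap =
        new PriorityQueue<>(Comparator.comparing((Head<T> h) -> h.value, timeOrder));
    for (Iterator<T> partition : partitions) {
      if (partition.hasNext()) {
        heap.add(new Head<>(partition.next(), partition));
      }
    }
    return new Iterator<T>()
    {
      @Override
      public boolean hasNext()
      {
        return !heap.isEmpty();
      }

      @Override
      public T next()
      {
        Head<T> smallest = heap.poll();
        T row = smallest.value;
        if (smallest.rest.hasNext()) {
          smallest.value = smallest.rest.next();
          heap.add(smallest); // re-insert with the partition's next head
        }
        return row;
      }
    };
  }
}
```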
@@ -97,9 +97,10 @@ public class ScanQueryLimitRowIterator implements CloseableIterator<ScanResultVa
       throw new UOE(ScanQuery.ResultFormat.RESULT_FORMAT_VALUE_VECTOR + " is not supported yet");
     }
 
-    // We want to perform batching if we are not time-ordering or are at the outer level if we are re time-ordering
+    // We want to perform multi-event ScanResultValue limiting if we are not time-ordering or are at the
+    // outer level if we are time-ordering
     if (query.getOrder() == ScanQuery.Order.NONE ||
-        !query.getContextBoolean(ScanQuery.CTX_KEY_OUTERMOST, true)) {
+        query.getContextBoolean(ScanQuery.CTX_KEY_OUTERMOST, true)) {
       ScanResultValue batch = yielder.get();
       List events = (List) batch.getEvents();
       if (events.size() <= limit - count) {
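Restated, the fixed condition takes the multi-event path when the query is not time-ordered at all, or when it is time-ordered and this is the outermost iterator; the removed `!` had inverted the second case. A hypothetical standalone restatement, not Druid code:

```java
// Hypothetical restatement of the fixed predicate; names are illustrative.
final class LimitPathChooser
{
  // Multi-event limiting applies when the query is not time-ordered, or when
  // it is time-ordered but we are at the outermost level of the query stack
  // (CTX_KEY_OUTERMOST in the real code, defaulting to true).
  static boolean useMultiEventLimiting(boolean timeOrdered, boolean outermost)
  {
    return !timeOrdered || outermost;
  }
}
```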
@@ -114,8 +115,8 @@ public class ScanQueryLimitRowIterator implements CloseableIterator<ScanResultVa
         return new ScanResultValue(batch.getSegmentId(), batch.getColumns(), events.subList(0, numLeft));
       }
     } else {
-      // Perform single-event ScanResultValue batching. Each scan result value in this case will only have one event
-      // so there's no need to iterate through events.
+      // Perform single-event ScanResultValue batching. Each scan result value from the yielder in this case will only
+      // have one event so there's no need to iterate through events.
       int batchSize = query.getBatchSize();
       List<Object> eventsToAdd = new ArrayList<>(batchSize);
       List<String> columns = new ArrayList<>();
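The single-event branch above repacks one-event values into batches of `batchSize`. A simplified sketch of that repacking loop, with an assumed generic event type in place of ScanResultValue:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified sketch of the single-event batching path shown above: each
// upstream value carries exactly one event, so events are repacked into
// batches of up to batchSize, stopping early once the row limit is reached.
final class SingleEventBatcher
{
  static <E> List<List<E>> batch(Iterator<E> singleEvents, int batchSize, long limit)
  {
    List<List<E>> batches = new ArrayList<>();
    List<E> current = new ArrayList<>(batchSize);
    long count = 0;
    while (singleEvents.hasNext() && count < limit) {
      current.add(singleEvents.next());
      count++;
      if (current.size() == batchSize) {
        batches.add(current);
        current = new ArrayList<>(batchSize);
      }
    }
    if (!current.isEmpty()) {
      batches.add(current); // flush the final partial batch
    }
    return batches;
  }
}
```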