mirror of https://github.com/apache/druid.git
Fix docs and flipped boolean in ScanQueryLimitRowIterator
This commit is contained in:
parent 35692680fc
commit 8a6bb1127c
@@ -158,7 +158,7 @@ The format of the result when resultFormat equals `compactedList`:
 The Scan query currently supports ordering based on timestamp for non-legacy queries. Note that using time ordering
 will yield results that do not indicate which segment rows are from (`segmentId` will show up as `null`). Furthermore,
 time ordering is only supported where the result set limit is less than `druid.query.scan.maxRowsQueuedForOrdering`
-rows **or** fewer than `druid.query.scan.maxSegmentPartitionsOrderedInMemory` segments are scanned per Historical. The
+rows **or** all segments scanned have fewer than `druid.query.scan.maxSegmentPartitionsOrderedInMemory` partitions. The
 reasoning behind these limitations is that the implementation of time ordering uses two strategies that can consume too
 much heap memory if left unbounded. These strategies (listed below) are chosen on a per-Historical basis depending on
 query result set limit and the number of segments being scanned.
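The corrected sentence makes the two bounds an either/or: time ordering is available when the result set limit fits under the row cap, or when every scanned segment stays under the partition cap. A minimal Java sketch of that gating rule as described above (the method and parameter names are illustrative, not Druid's actual validation code; the two long parameter names merely mirror the config keys):

// Hypothetical restatement of the documented rule; not Druid source.
static boolean timeOrderingSupported(
    long resultSetLimit,
    long maxRowsQueuedForOrdering,
    int[] partitionsPerScannedSegment,
    int maxSegmentPartitionsOrderedInMemory
)
{
  // Bound 1: the whole result set fits in the bounded priority queue.
  if (resultSetLimit < maxRowsQueuedForOrdering) {
    return true;
  }
  // Bound 2: every scanned segment has few enough partitions to n-way merge.
  for (int partitions : partitionsPerScannedSegment) {
    if (partitions >= maxSegmentPartitionsOrderedInMemory) {
      return false;
    }
  }
  return true;
}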
@@ -170,12 +170,12 @@ priority queue are streamed back to the Broker(s) in batches. Attempting to loa
 risk of Historical nodes running out of memory. The `druid.query.scan.maxRowsQueuedForOrdering` property protects
 from this by limiting the number of rows in the query result set when time ordering is used.
 
-2. N-Way Merge: Each segment on a Historical is opened in parallel. Since each segment's rows are already
-time-ordered, an n-way merge can be performed on the results from each segment. This approach doesn't persist the entire
+2. N-Way Merge: For each segment, each partition is opened in parallel. Since each partition's rows are already
+time-ordered, an n-way merge can be performed on the results from each partition. This approach doesn't persist the entire
 result set in memory (like the Priority Queue) as it streams back batches as they are returned from the merge function.
-However, attempting to query too many segments could also result in high memory usage due to the need to open
+However, attempting to query too many partitions could also result in high memory usage due to the need to open
 decompression and decoding buffers for each. The `druid.query.scan.maxSegmentPartitionsOrderedInMemory` limit protects
-from this by capping the number of segments opened per historical when time ordering is used.
+from this by capping the number of partitions opened at any time when time ordering is used.
 
 Both `druid.query.scan.maxRowsQueuedForOrdering` and `druid.query.scan.maxSegmentPartitionsOrderedInMemory` are
 configurable and can be tuned based on hardware specs and number of dimensions being queried.
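For readers unfamiliar with the second strategy, here is a compact, self-contained Java sketch of an n-way merge over already time-ordered per-partition iterators, driven by a priority queue. It is illustrative only, not Druid's implementation; `Row`, `Cursor`, and `merge` are invented names.

import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

class NWayMergeSketch
{
  record Row(long timestamp, Object event) {}

  // Pairs a source iterator with the row it will yield next.
  private static final class Cursor
  {
    final Iterator<Row> source;
    Row head;

    Cursor(Iterator<Row> source)
    {
      this.source = source;
      this.head = source.next();
    }
  }

  static Iterator<Row> merge(List<Iterator<Row>> partitions)
  {
    // Heap ordered by each cursor's next timestamp; holds one row per partition.
    PriorityQueue<Cursor> heap =
        new PriorityQueue<>(Comparator.comparingLong((Cursor c) -> c.head.timestamp()));
    for (Iterator<Row> it : partitions) {
      if (it.hasNext()) {
        heap.add(new Cursor(it));
      }
    }
    return new Iterator<Row>()
    {
      @Override
      public boolean hasNext()
      {
        return !heap.isEmpty();
      }

      @Override
      public Row next()
      {
        Cursor c = heap.poll();
        Row out = c.head;
        if (c.source.hasNext()) {
          c.head = c.source.next();
          heap.add(c); // re-insert the cursor with its next row
        }
        return out;
      }
    };
  }
}

Calling merge with one iterator per partition yields a single globally time-ordered stream while holding only one row per open partition in memory, which is why capping the number of open partitions bounds heap use.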
@@ -97,9 +97,10 @@ public class ScanQueryLimitRowIterator implements CloseableIterator<ScanResultVa
       throw new UOE(ScanQuery.ResultFormat.RESULT_FORMAT_VALUE_VECTOR + " is not supported yet");
     }
 
-    // We want to perform batching if we are not time-ordering or are at the outer level if we are re time-ordering
+    // We want to perform multi-event ScanResultValue limiting if we are not time-ordering or are at the
+    // outer level if we are time-ordering
     if (query.getOrder() == ScanQuery.Order.NONE ||
-        !query.getContextBoolean(ScanQuery.CTX_KEY_OUTERMOST, true)) {
+        query.getContextBoolean(ScanQuery.CTX_KEY_OUTERMOST, true)) {
       ScanResultValue batch = yielder.get();
       List events = (List) batch.getEvents();
       if (events.size() <= limit - count) {
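The removed `!` is the flipped boolean named in the commit message: with the negation, multi-event limiting ran only when the iterator was not at the outermost level, the opposite of what the comment describes. Restated as a standalone predicate (a sketch; the enum is stubbed and the method name is hypothetical):

// Sketch of the corrected condition; not the actual ScanQueryLimitRowIterator.
enum Order { NONE, ASCENDING, DESCENDING }

static boolean useMultiEventLimiting(Order order, boolean isOutermost)
{
  // Limit across multi-event ScanResultValues when the query is not
  // time-ordered, or when it is time-ordered and we sit at the outermost
  // merge level. The old code negated isOutermost, inverting this.
  return order == Order.NONE || isOutermost;
}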
@@ -114,8 +115,8 @@ public class ScanQueryLimitRowIterator implements CloseableIterator<ScanResultVa
         return new ScanResultValue(batch.getSegmentId(), batch.getColumns(), events.subList(0, numLeft));
       }
     } else {
-      // Perform single-event ScanResultValue batching. Each scan result value in this case will only have one event
-      // so there's no need to iterate through events.
+      // Perform single-event ScanResultValue batching. Each scan result value from the yielder in this case will only
+      // have one event so there's no need to iterate through events.
       int batchSize = query.getBatchSize();
       List<Object> eventsToAdd = new ArrayList<>(batchSize);
       List<String> columns = new ArrayList<>();
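For context on this `else` branch: when time ordering is active, each ScanResultValue off the yielder carries exactly one event, and the iterator regroups those singletons into batches of `batchSize` while honoring the overall row limit. A self-contained sketch of that regrouping under those assumptions (illustrative only; the real code also tracks segment ids and column lists):

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Illustrative re-batching of single-event values into groups of batchSize,
// respecting an overall row limit. Not the actual ScanQueryLimitRowIterator.
static List<List<Object>> rebatch(Iterator<Object> singleEvents, int batchSize, long limit)
{
  List<List<Object>> batches = new ArrayList<>();
  List<Object> current = new ArrayList<>(batchSize);
  long count = 0;
  while (singleEvents.hasNext() && count < limit) {
    current.add(singleEvents.next());
    count++;
    if (current.size() == batchSize) {
      batches.add(current);
      current = new ArrayList<>(batchSize);
    }
  }
  if (!current.isEmpty()) {
    batches.add(current); // flush the final, possibly short, batch
  }
  return batches;
}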