Add warning comments to Granularity.getIterable. (#13888)

This function is notorious for causing memory exhaustion and excessive
CPU usage; so much so that it was valuable to work around it in the
SQL planner in #13206. Hopefully, a warning comment will encourage
developers to stay away and come up with solutions that do not involve
computing all possible buckets.
Gian Merlino 2023-03-06 22:57:10 -08:00 committed by GitHub
parent 38b6373bf7
commit fcfb7b8ff6
2 changed files with 15 additions and 0 deletions

@@ -209,6 +209,19 @@ public abstract class Granularity implements Cacheable
    return vals;
  }

  /**
   * Return an iterable of granular buckets that overlap a particular interval.
   *
   * In cases where the number of granular buckets is very large, the Iterable returned by this method will take
   * an excessive amount of time to compute, and materializing it into a collection will take an excessive amount
   * of memory. For example, this happens in the extreme case of an input interval of
   * {@link org.apache.druid.java.util.common.Intervals#ETERNITY} and any granularity other than
   * {@link Granularities#ALL}, as well as cases like an input interval of ten years with {@link Granularities#SECOND}.
   *
   * To avoid issues stemming from large numbers of buckets, this method should be avoided, and code that uses
   * this method should be rewritten to use some other approach. For example: rather than computing all possible
   * buckets in a wide time range, only process buckets related to actual data points that appear.
   */
  public Iterable<Interval> getIterable(final Interval input)
  {
    return new IntervalIterable(input);
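
The alternative suggested in the Javadoc above (process only buckets that contain actual data points) can be sketched roughly as follows. This is a hypothetical illustration, not code from this commit or from Druid: the class `SparseBuckets` and the method `bucketStarts` are invented names, and the sketch uses `java.time` with a fixed-size `ChronoUnit` in place of Druid's `Granularity` and Joda-Time `Interval`.

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical illustration only; not part of Druid.
final class SparseBuckets
{
  private SparseBuckets()
  {
  }

  /**
   * Returns the start of every bucket that contains at least one of the given timestamps.
   * The work done is proportional to the number of data points, not to the width of the
   * time range they span. Assumes a unit no longer than a day (SECONDS, MINUTES, HOURS, DAYS),
   * which is what Instant#truncatedTo supports.
   */
  static SortedSet<Instant> bucketStarts(final List<Instant> timestamps, final ChronoUnit unit)
  {
    final SortedSet<Instant> starts = new TreeSet<>();
    for (final Instant timestamp : timestamps) {
      // Map each data point to the start of its own bucket; duplicates collapse in the set.
      starts.add(timestamp.truncatedTo(unit));
    }
    return starts;
  }

  public static void main(final String[] args)
  {
    // Three data points spread across ten years, bucketed by second.
    final List<Instant> timestamps = List.of(
        Instant.parse("2013-01-01T00:00:00.500Z"),
        Instant.parse("2013-01-01T00:00:00.900Z"),
        Instant.parse("2023-01-01T12:34:56.100Z")
    );

    // Only the two occupied buckets are produced, rather than every second in the range.
    for (final Instant start : bucketStarts(timestamps, ChronoUnit.SECONDS)) {
      System.out.println(start);
    }
  }
}
```

With this shape, a ten-year range at second granularity costs as much as the number of rows seen, whereas enumerating every bucket in the range would visit hundreds of millions of intervals.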

@@ -871,6 +871,8 @@ public class DruidQuery
   * <p>
   * Necessary because some combinations are unsafe, mainly because they would lead to the creation of too many
   * time-granular buckets during query processing.
   *
   * @see Granularity#getIterable(Interval) the problematic method call we are trying to avoid
   */
  private static boolean canUseQueryGranularity(
      final DataSource dataSource,
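
To make "too many time-granular buckets" concrete, the sketch below estimates how many buckets an interval would produce and compares the result against a limit. It is an illustration under stated assumptions, not the logic of `canUseQueryGranularity`: the class `BucketCountEstimate`, its methods, and the one-million threshold are all hypothetical, and it uses `java.time` with fixed-size buckets whose boundaries align with the interval start.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only; not part of Druid.
final class BucketCountEstimate
{
  private BucketCountEstimate()
  {
  }

  /**
   * Estimates how many fixed-size buckets cover [start, end), rounding a trailing
   * partial bucket up. Assumes the interval start is aligned to a bucket boundary.
   */
  static long estimateBuckets(final Instant start, final Instant end, final Duration bucketSize)
  {
    if (bucketSize.isZero() || bucketSize.isNegative()) {
      throw new IllegalArgumentException("bucketSize must be positive");
    }
    if (!end.isAfter(start)) {
      return 0;
    }
    final Duration span = Duration.between(start, end);
    // Duration#dividedBy(Duration) (Java 9+) returns the number of whole buckets.
    long buckets = span.dividedBy(bucketSize);
    if (!span.minus(bucketSize.multipliedBy(buckets)).isZero()) {
      buckets++; // count the trailing partial bucket
    }
    return buckets;
  }

  /** Returns true if the estimated bucket count does not exceed the caller's limit. */
  static boolean isSafe(final long estimatedBuckets, final long maxBuckets)
  {
    return estimatedBuckets <= maxBuckets;
  }

  public static void main(final String[] args)
  {
    final Instant start = Instant.parse("2013-01-01T00:00:00Z");
    final Instant end = Instant.parse("2023-01-01T00:00:00Z");
    final long maxBuckets = 1_000_000L; // hypothetical threshold

    final long bySecond = estimateBuckets(start, end, Duration.ofSeconds(1));
    final long byDay = estimateBuckets(start, end, Duration.ofDays(1));

    System.out.println("per-second buckets: " + bySecond + ", safe: " + isSafe(bySecond, maxBuckets));
    System.out.println("per-day buckets: " + byDay + ", safe: " + isSafe(byDay, maxBuckets));
  }
}
```

For the ten-year interval in `main`, second granularity yields 315,532,800 buckets while day granularity yields 3,652, which shows why the same interval can be harmless at one granularity and prohibitive at another.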