mirror of
https://github.com/apache/druid.git
synced 2025-02-25 04:16:07 +00:00
Workaround non-thread-safe use of HLL aggregators. (#3578)
Despite the non-thread-safety of HyperLogLogCollector, it is actually currently used by multiple threads during realtime indexing. HyperUniquesAggregator's "aggregate" and "get" methods can be called simultaneously by OnheapIncrementalIndex, since its "doAggregate" and "getMetricObjectValue" methods are not synchronized. This means that the optimization of HyperLogLogCollector.fold in #3314 (saving and restoring position rather than duplicating the storage buffer of the right-hand side) could cause corruption in the face of concurrent writes. This patch works around the issue by duplicating the storage buffer in "get" before returning a collector. The returned collector still shares data with the original one, but the situation is no worse than before #3314. In the future we may want to consider making a thread safe version of HLLC that avoids these kinds of problems in realtime indexing. But for now I thought it was best to do a small change that restored the old behavior.
This commit is contained in:
parent
c2ae734848
commit
285516bede
@ -39,10 +39,15 @@ import java.nio.ByteBuffer;
|
||||
* }
|
||||
* </code>
|
||||
*
|
||||
* This class is *not* multi-threaded. It can be passed among threads, but it is written with the assumption that
|
||||
* This class is *not* multi-threaded. It can be passed among threads, but it is written with the assumption that
|
||||
* only one thread is ever calling methods on it.
|
||||
*
|
||||
* If you have multiple threads calling methods on this concurrently, I hope you manage to get correct behavior
|
||||
* If you have multiple threads calling methods on this concurrently, I hope you manage to get correct behavior.
|
||||
*
|
||||
* Note that despite the non-thread-safety of this class, it is actually currently used by multiple threads during
|
||||
* realtime indexing. HyperUniquesAggregator's "aggregate" and "get" methods can be called simultaneously by
|
||||
* OnheapIncrementalIndex, since its "doAggregate" and "getMetricObjectValue" methods are not synchronized. So, watch
|
||||
* out for that.
|
||||
*/
|
||||
public abstract class HyperLogLogCollector implements Comparable<HyperLogLogCollector>
|
||||
{
|
||||
|
@ -57,7 +57,8 @@ public class HyperUniquesAggregator implements Aggregator
|
||||
@Override
|
||||
public Object get()
|
||||
{
|
||||
return collector;
|
||||
// Workaround for OnheapIncrementalIndex's penchant for calling "aggregate" and "get" simultaneously.
|
||||
return HyperLogLogCollector.makeCollector(collector.getStorageBuffer().duplicate());
|
||||
}
|
||||
|
||||
@Override
|
||||
|
Loading…
x
Reference in New Issue
Block a user