SOLR-13082: A trigger that creates trigger events more frequently than the cool down period can starve other triggers.

This is mitigated to some extent by randomly choosing the trigger to resume after cool down. It is recommended that scheduled triggers not be used for very frequent operations to avoid this problem.
Shalin Shekhar Mangar 2019-01-02 11:59:00 +05:30
parent 2532a5d31c
commit 5016959ce8
3 changed files with 36 additions and 2 deletions


@@ -193,6 +193,11 @@ Bug Fixes
* SOLR-13080: The "terms" QParser's "automaton" method semi-required that the input terms/IDs be sorted. This
query parser now does this. Unclear if this is a perf issue or actual bug. (Daniel Lowe, David Smiley)
* SOLR-13082: A trigger that creates trigger events more frequently than the cool down period can starve other triggers.
This is mitigated to some extent by randomly choosing the trigger to resume after cool down. It is recommended that
scheduled triggers not be used for very frequent operations to avoid this problem.
(ab, shalin)
Improvements
----------------------


@@ -28,6 +28,7 @@ import java.util.Iterator;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Random;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
@@ -88,6 +89,18 @@ public class ScheduledTriggers implements Closeable {
DEFAULT_PROPERTIES.put(ACTION_THROTTLE_PERIOD_SECONDS, DEFAULT_ACTION_THROTTLE_PERIOD_SECONDS);
}
protected static final Random RANDOM;
static {
// We try to make things reproducible in the context of our tests by initializing the random instance
// based on the current seed
String seed = System.getProperty("tests.seed");
if (seed == null) {
RANDOM = new Random();
} else {
RANDOM = new Random(seed.hashCode());
}
}
private final Map<String, TriggerWrapper> scheduledTriggerWrappers = new ConcurrentHashMap<>();
/**
@@ -381,9 +394,13 @@ public class ScheduledTriggers implements Closeable {
* @lucene.internal
*/
public synchronized void resumeTriggers(long afterDelayMillis) {
-scheduledTriggerWrappers.forEach((s, triggerWrapper) -> {
+List<Map.Entry<String, TriggerWrapper>> entries = new ArrayList<>(scheduledTriggerWrappers.entrySet());
+Collections.shuffle(entries, RANDOM);
+entries.forEach(e -> {
+String key = e.getKey();
+TriggerWrapper triggerWrapper = e.getValue();
 if (triggerWrapper.scheduledFuture.isCancelled()) {
-log.debug("Resuming trigger: {} after {}ms", s, afterDelayMillis);
+log.debug("Resuming trigger: {} after {}ms", key, afterDelayMillis);
triggerWrapper.scheduledFuture = scheduledThreadPoolExecutor.scheduleWithFixedDelay(triggerWrapper, afterDelayMillis,
cloudManager.getTimeSource().convertDelay(TimeUnit.SECONDS, triggerDelay.get(), TimeUnit.MILLISECONDS), TimeUnit.MILLISECONDS);
}
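
The two ideas in this file's change (a `Random` seeded from a test property, and shuffling a snapshot of the map entries before resuming) can be tried in isolation. The following is only a minimal sketch under stated assumptions: `ResumeOrderSketch`, `randomFor`, `resumeOrder`, and the trigger names are hypothetical and are not part of Solr; only `Random`, `Collections.shuffle`, and `ConcurrentHashMap` are real JDK APIs.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only; the class and method names are hypothetical, not Solr code.
class ResumeOrderSketch {

  // Same seeding idea as the static initializer in the diff: a seed string yields a
  // repeatable Random, while a missing seed yields a normally seeded one.
  static Random randomFor(String seed) {
    return seed == null ? new Random() : new Random(seed.hashCode());
  }

  // Copy the keys into a list and shuffle the copy; the map itself is not modified.
  static List<String> resumeOrder(Map<String, ?> triggers, Random random) {
    List<String> names = new ArrayList<>(triggers.keySet());
    Collections.shuffle(names, random);
    return names;
  }

  public static void main(String[] args) {
    Map<String, Runnable> triggers = new ConcurrentHashMap<>();
    // Hypothetical trigger names, used only to populate the map.
    for (String name : new String[] {"node_lost", "node_added", "scheduled", "search_rate"}) {
      triggers.put(name, () -> {});
    }
    List<String> first = resumeOrder(triggers, randomFor("DEADBEEF"));
    List<String> second = resumeOrder(triggers, randomFor("DEADBEEF"));
    System.out.println(first);
    System.out.println("same order for same seed: " + first.equals(second));
  }
}
```

Shuffling a copy rather than iterating the `ConcurrentHashMap` directly is what makes the order vary between calls; with a fixed seed and an unmodified map, the order is repeatable, which is presumably why the commit ties the seed to `tests.seed`.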


@@ -527,3 +527,15 @@ This trigger applies the `every` date math expression on the `startTime` or the
Apart from the common event properties described in the <<Event Types>> section, the trigger adds an additional `actualEventTime` event property which has the actual event time as opposed to the scheduled time.
For example, if the scheduled time was `2018-01-31T15:30:00Z` and grace time was `+15MINUTES` then an event may be fired at `2018-01-31T15:45:00Z`. Such an event will have `eventTime` as `2018-01-31T15:30:00Z`, the scheduled time, but the `actualEventTime` property will have a value of `2018-01-31T15:45:00Z`, the actual time.
.Frequently scheduled events and trigger starvation
[CAUTION]
====
Be cautious with scheduled triggers that are set to run as frequently as, or more frequently than, the trigger cooldown period (defaults to 5 seconds).
Solr pauses all triggers for a cooldown period after a trigger fires so that the system has some time to stabilize. An aggressive scheduled trigger can prevent all other triggers from
ever executing if a new scheduled event is ready as soon as the cooldown period is over. The same starvation scenario can happen to the scheduled trigger as well.
Solr randomizes the order in which the triggers are resumed after the cooldown period to mitigate this problem. However, it is recommended that scheduled triggers
not be used with low `every` values and that an external scheduling process such as cron be used for such cases instead.
====
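
The starvation scenario in the caution above can be illustrated with a toy model. This is a rough sketch, not Solr's scheduler: it assumes every trigger always has an event ready and that, in each cooldown cycle, only the trigger resumed first gets to fire before everything is paused again. `StarvationSketch` and `simulate` are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Toy model of the cooldown cycle; not Solr's scheduler. All names are hypothetical.
class StarvationSketch {

  // Assumption: every trigger always has an event ready, and in each cycle only the
  // trigger that is resumed first fires, after which all triggers are paused again.
  static int[] simulate(int triggerCount, int cycles, boolean shuffle, Random random) {
    List<Integer> order = new ArrayList<>();
    for (int i = 0; i < triggerCount; i++) {
      order.add(i);
    }
    int[] fired = new int[triggerCount];
    for (int c = 0; c < cycles; c++) {
      if (shuffle) {
        // Randomize the resume order for this cycle.
        Collections.shuffle(order, random);
      }
      fired[order.get(0)]++;
    }
    return fired;
  }

  public static void main(String[] args) {
    // Fixed resume order: the first trigger fires in every cycle, the others never do.
    System.out.println(Arrays.toString(simulate(3, 1000, false, new Random(42))));
    // Shuffled resume order: fires are spread across all triggers.
    System.out.println(Arrays.toString(simulate(3, 1000, true, new Random(42))));
  }
}
```

In this model, randomizing the order spreads the fires across triggers, but each trigger still gets only a share of the cycles, which is consistent with the advice to avoid very low `every` values rather than rely on the shuffle alone.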