[7.x][ML] Do not fail DFA task that is stopped during reindexing (#55659) (#55663)

Although we catch `TaskCancelledException` while waiting for
reindexing to complete, we missed the fact that this exception
may be wrapped in a multi-node cluster. This is why we may
still fail the task when stop is called during reindexing.

Sometimes we're lucky and the exception is thrown by the same
node that runs the job. Then the exception is not wrapped and
things work fine. But when that is not the case, the exception
is wrapped, we fail to catch it, and the task is set to failed.

The fix is to simply unwrap the exception before checking
whether it is a `TaskCancelledException`.
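
To illustrate why the unwrapping matters, here is a minimal,
self-contained sketch. Every class and method in it is a
hypothetical stand-in written for this example, not the real
Elasticsearch code: `RemoteWrapperException` plays the role of
whatever wrapper is added when the error comes from another
node, and `unwrapCause` only approximates what
`ExceptionsHelper.unwrapCause` is assumed to do.

```java
// Hypothetical stand-ins for illustration; not the real Elasticsearch classes.
class UnwrapSketch {

    /** Stand-in for the cancellation exception named in the commit message. */
    static class TaskCancelledException extends RuntimeException {
        TaskCancelledException(String message) {
            super(message);
        }
    }

    /** Assumed wrapper type, added when an error crosses a node boundary. */
    static class RemoteWrapperException extends RuntimeException {
        RemoteWrapperException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Peels off wrapper layers (bounded depth) until a non-wrapper is found. */
    static Throwable unwrapCause(Throwable t) {
        Throwable current = t;
        int depth = 0;
        while (current instanceof RemoteWrapperException
                && current.getCause() != null
                && depth++ < 10) {
            current = current.getCause();
        }
        return current;
    }

    /** The old check: only matches an unwrapped exception. */
    static boolean isCancellationBefore(Throwable error) {
        return error instanceof TaskCancelledException;
    }

    /** The new check: unwraps first, so wrapped cancellations match too. */
    static boolean isCancellationAfter(Throwable error) {
        return unwrapCause(error) instanceof TaskCancelledException;
    }

    public static void main(String[] args) {
        Throwable local = new TaskCancelledException("task cancelled");
        Throwable remote = new RemoteWrapperException("remote failure", local);

        System.out.println(isCancellationBefore(local));  // true
        System.out.println(isCancellationBefore(remote)); // false: the missed case
        System.out.println(isCancellationAfter(local));   // true
        System.out.println(isCancellationAfter(remote));  // true
    }
}
```

In this sketch the old check only recognises the cancellation
when it arrives unwrapped, which mirrors the "lucky" same-node
case described above.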

Closes #55068

Backport of #55659
Dimitris Athanasiou 2020-04-23 15:57:01 +03:00 committed by GitHub
parent 8669766a81
commit 4b11adf074
1 changed files with 1 additions and 1 deletions

@@ -225,7 +225,7 @@ public class DataFrameAnalyticsManager {
startAnalytics(task, config);
},
error -> {
-if (error instanceof TaskCancelledException && task.isStopping()) {
+if (ExceptionsHelper.unwrapCause(error) instanceof TaskCancelledException && task.isStopping()) {
LOGGER.debug(new ParameterizedMessage("[{}] Caught task cancelled exception while task is stopping",
config.getId()), error);
task.markAsCompleted();