-
- Downloads
[SPARK-16230][CORE] CoarseGrainedExecutorBackend to self kill if there is an...
[SPARK-16230][CORE] CoarseGrainedExecutorBackend to self kill if there is an exception while creating an Executor ## What changes were proposed in this pull request? With the fix from SPARK-13112, I see that `LaunchTask` is always processed after `RegisteredExecutor` is done and so it gets chance to do all retries to startup an executor. There is still a problem that if `Executor` creation itself fails and there is some exception, it gets unnoticed and the executor is killed when it tries to process the `LaunchTask` as `executor` is null : https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/CoarseGrainedExecutorBackend.scala#L88 So if one looks at the logs, it does not tell that there was problem during `Executor` creation and thats why it was killed. This PR explicitly catches exception in `Executor` creation, logs a proper message and then exits the JVM. Also, I have changed the `exitExecutor` method to accept `reason` so that backends can use that reason and do stuff like logging to a DB to get an aggregate of such exits at a cluster level ## How was this patch tested? I am relying on existing tests Author: Tejas Patil <tejasp@fb.com> Closes #14202 from tejasapatil/exit_executor_failure.
Please register or sign in to comment