    148af608
[SPARK-2454] Do not ship spark home to Workers
    Andrew Or authored
When standalone Workers launch executors, they inherit the Spark home set by the driver. This means that if the worker machines do not share the same directory structure as the driver node, the Workers will attempt to run scripts (e.g. `bin/compute-classpath.sh`) that do not exist locally, and fail. This is a common scenario when the driver is launched from outside the cluster.
    
The solution is simply not to pass the driver's Spark home to the Workers. This PR also attempts to avoid overloading `spark.home`, which is now used only for setting the executor Spark home on Mesos and in Python.
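The behavior change can be sketched as follows. This is an illustrative sketch only, not the actual Spark internals: the function names and paths are hypothetical, chosen to show why resolving the executor's Spark home on the worker side avoids the missing-script failure.

```python
def executor_spark_home_before(driver_spark_home: str, worker_dir: str) -> str:
    # Before this change (hypothetical sketch): the Worker reused whatever
    # path the driver shipped, e.g. "/opt/spark-driver". If the worker
    # machine has a different layout, bin/compute-classpath.sh under that
    # path does not exist and launching the executor fails.
    return driver_spark_home


def executor_spark_home_after(driver_spark_home: str, worker_dir: str) -> str:
    # After this change (hypothetical sketch): the driver's Spark home is
    # ignored entirely; the Worker always uses its own working directory,
    # which is guaranteed to exist on the worker machine.
    return worker_dir


# The driver's layout ("/opt/spark-driver") never reaches the executor:
print(executor_spark_home_after("/opt/spark-driver", "/var/lib/spark/work"))
# → /var/lib/spark/work
```

The key design choice is that the worker-local path wins unconditionally, so executor launch no longer depends on the driver and workers sharing a filesystem layout.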
    
This is based on top of #1392 and was originally reported by YanTangZhai. Tested on a standalone cluster.
    
    Author: Andrew Or <andrewor14@gmail.com>
    
    Closes #1734 from andrewor14/spark-home-reprise and squashes the following commits:
    
    f71f391 [Andrew Or] Revert changes in python
    1c2532c [Andrew Or] Merge branch 'master' of github.com:apache/spark into spark-home-reprise
    188fc5d [Andrew Or] Avoid using spark.home where possible
    09272b7 [Andrew Or] Always use Worker's working directory as spark home