[SPARK-22495] Fix setup of SPARK_HOME variable on Windows
## What changes were proposed in this pull request?

This fixes the way `SPARK_HOME` is resolved on Windows. While the previous logic worked with the built release download, the directory layout is slightly different for a PySpark `pip` or `conda` install. This had already been reflected in the Linux scripts in `bin`, but not in the Windows `cmd` files.
    
The first fix improves how the `jars` directory is located, since the existing lookup prevented a Windows `pip/conda` install from working; the JARs were not found during Session/Context setup.
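
As a hedged illustration of the first fix, here is a minimal `cmd` sketch of the kind of two-location fallback involved; the variable names and the `assembly\target` path follow the standard layout of a Spark source build, and `SPARK_SCALA_VERSION` is assumed to be set by the surrounding launcher script:

```cmd
rem Minimal sketch: resolve the Spark jars directory under SPARK_HOME.
rem A pip/conda install ships the JARs directly in %SPARK_HOME%\jars,
rem while a source checkout keeps them under assembly\target\...\jars.
if exist "%SPARK_HOME%\jars" (
  set SPARK_JARS_DIR="%SPARK_HOME%\jars"
) else (
  set SPARK_JARS_DIR="%SPARK_HOME%\assembly\target\scala-%SPARK_SCALA_VERSION%\jars"
)

if not exist %SPARK_JARS_DIR% (
  echo Failed to find Spark jars directory.
  exit /b 1
)
```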
    
The second fix adds a `find-spark-home.cmd` script which, like the Linux version, uses the `find_spark_home.py` script to resolve `SPARK_HOME`. It is based on the `find-spark-home` bash script, though some operations are done in a different order due to limitations of the `cmd` script language. If the `SPARK_HOME` environment variable is already set, the Python script `find_spark_home.py` is not run. The process can fail if Python is not installed, but this code path is mostly exercised when PySpark is installed via `pip/conda`, in which case some Python is necessarily present on the system.
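
A minimal sketch of that flow, assuming the `cmd` script sits next to `find_spark_home.py` as in the pip-installed layout; the `PYTHON_RUNNER` handling here is illustrative:

```cmd
@echo off
rem Sketch of the find-spark-home.cmd flow described above.

rem Prefer an explicitly configured Python interpreter, if any.
set PYTHON_RUNNER=python
if not "x%PYSPARK_PYTHON%"=="x" set PYTHON_RUNNER=%PYSPARK_PYTHON%

rem Only resolve SPARK_HOME when it is not already set.
if "x%SPARK_HOME%"=="x" (
  if not exist "%~dp0find_spark_home.py" (
    rem Not a pip/conda install: fall back to the parent directory.
    set SPARK_HOME=%~dp0..
  ) else (
    rem pip/conda install: let the Python helper locate the package.
    for /f "delims=" %%i in ('%PYTHON_RUNNER% "%~dp0find_spark_home.py"') do (
      set SPARK_HOME=%%i
    )
  )
)
```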
    
## How was this patch tested?

Tested on a local installation.

Author: Jakub Nowacki <j.s.nowacki@gmail.com>

Closes #19370 from jsnowacki/fix_spark_cmds.