Commit b4edafa9 authored by Jakub Nowacki, committed by hyukjinkwon
[SPARK-22495] Fix setup of SPARK_HOME variable on Windows

## What changes were proposed in this pull request?

This fixes how `SPARK_HOME` is resolved on Windows. While the previous version worked with the built release download, the directory layout changed slightly for the PySpark `pip` or `conda` install. This had been reflected in the Linux scripts in `bin`, but not in the Windows `cmd` files.

The first fix improves how the `jars` directory is found, as this was stopping the Windows `pip/conda` install from working; the JARs were not found during Session/Context setup.
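For illustration, a minimal sketch of the kind of `jars` lookup involved, assuming the `cmd` logic mirrors the Linux `spark-class`, which prefers an existing `jars` directory over the source-build output location; treat this as a sketch, not the precise patch:

```bat
rem Sketch: prefer an existing jars\ directory (release and pip/conda layouts)
rem and fall back to the source-build output location otherwise.
if exist "%SPARK_HOME%\jars" (
  set SPARK_JARS_DIR="%SPARK_HOME%\jars"
) else (
  set SPARK_JARS_DIR="%SPARK_HOME%\assembly\target\scala-%SPARK_SCALA_VERSION%\jars"
)
```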

The second fix adds a `find-spark-home.cmd` script which, like the Linux version, uses the `find_spark_home.py` script to resolve `SPARK_HOME`. It is based on the `find-spark-home` bash script, though some operations are done in a different order due to limitations of the `cmd` script language. If the `SPARK_HOME` environment variable is already set, the Python script `find_spark_home.py` is not run. The process can fail if Python is not installed, but this path is mostly taken when PySpark is installed via `pip/conda`, in which case some Python is present on the system.
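In outline, the resolution order is: keep a pre-set `SPARK_HOME`; otherwise, if `find_spark_home.py` is not alongside the script, assume the standard layout one level above `bin\`; otherwise ask the Python helper. A condensed sketch of that logic (the full script is in the diff below):

```bat
rem Condensed sketch of bin\find-spark-home.cmd's resolution order.
if "x%SPARK_HOME%"=="x" (
  if not exist "%~dp0find_spark_home.py" (
    rem Not a pip/conda install: SPARK_HOME is the parent of bin\
    set SPARK_HOME=%~dp0..
  ) else (
    rem pip/conda install: capture the path printed by the Python helper
    for /f "delims=" %%i in ('python "%~dp0find_spark_home.py"') do set SPARK_HOME=%%i
  )
)
```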

## How was this patch tested?

Tested on a local installation.

Author: Jakub Nowacki <j.s.nowacki@gmail.com>

Closes #19370 from jsnowacki/fix_spark_cmds.
parent 1edb3175
appveyor.yml

@@ -33,6 +33,7 @@ only_commits:
     - core/src/main/scala/org/apache/spark/api/r/
     - mllib/src/main/scala/org/apache/spark/ml/r/
     - core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala
+    - bin/*.cmd
 
 cache:
   - C:\Users\appveyor\.m2
bin/find-spark-home.cmd (new file)

@echo off
rem
rem Licensed to the Apache Software Foundation (ASF) under one or more
rem contributor license agreements. See the NOTICE file distributed with
rem this work for additional information regarding copyright ownership.
rem The ASF licenses this file to You under the Apache License, Version 2.0
rem (the "License"); you may not use this file except in compliance with
rem the License. You may obtain a copy of the License at
rem
rem http://www.apache.org/licenses/LICENSE-2.0
rem
rem Unless required by applicable law or agreed to in writing, software
rem distributed under the License is distributed on an "AS IS" BASIS,
rem WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
rem See the License for the specific language governing permissions and
rem limitations under the License.
rem
rem Path to the Python script that resolves SPARK_HOME
set FIND_SPARK_HOME_PYTHON_SCRIPT=%~dp0find_spark_home.py
rem Default to the standard python interpreter unless told otherwise
set PYTHON_RUNNER=python
rem If PYSPARK_DRIVER_PYTHON is set, it overrides the Python interpreter
if not "x%PYSPARK_DRIVER_PYTHON%"=="x" (
  set PYTHON_RUNNER=%PYSPARK_DRIVER_PYTHON%
)
rem If PYSPARK_PYTHON is set, it overrides the Python interpreter (taking precedence)
if not "x%PYSPARK_PYTHON%"=="x" (
  set PYTHON_RUNNER=%PYSPARK_PYTHON%
)
rem If no Python executable can be found and SPARK_HOME is not set,
rem fall back to this script's parent directory
where %PYTHON_RUNNER% >nul 2>&1
if %ERRORLEVEL% neq 0 (
  if not exist %PYTHON_RUNNER% (
    if "x%SPARK_HOME%"=="x" (
      echo Missing Python executable '%PYTHON_RUNNER%', defaulting to '%~dp0..' for the ^
SPARK_HOME environment variable. Please install Python or specify the correct Python executable in ^
the PYSPARK_DRIVER_PYTHON or PYSPARK_PYTHON environment variable to detect SPARK_HOME safely.
      set SPARK_HOME=%~dp0..
    )
  )
)
rem Only attempt to find SPARK_HOME if it is not set.
if "x%SPARK_HOME%"=="x" (
  if not exist "%FIND_SPARK_HOME_PYTHON_SCRIPT%" (
    rem If we are not in the same directory as find_spark_home.py, we are not pip-installed,
    rem so we don't need to search the Python directories for a Spark installation.
    rem Note that if the user has pip-installed PySpark but is directly calling pyspark-shell
    rem or spark-submit in another directory, we want to use that version of PySpark rather
    rem than the pip-installed one.
    set SPARK_HOME=%~dp0..
  ) else (
    rem We are pip-installed; use the Python script to resolve a reasonable SPARK_HOME
    for /f "delims=" %%i in ('%PYTHON_RUNNER% %FIND_SPARK_HOME_PYTHON_SCRIPT%') do set SPARK_HOME=%%i
  )
)
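Callers consume this with `call`, which runs the script in the current cmd environment, so the `SPARK_HOME` it sets persists for the calling script; a minimal hypothetical caller:

```bat
rem Hypothetical caller (mirrors the bin\*.cmd changes below)
call "%~dp0find-spark-home.cmd"
echo Using SPARK_HOME=%SPARK_HOME%
```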
bin/pyspark2.cmd

@@ -18,7 +18,7 @@ rem limitations under the License.
 rem
 rem Figure out where the Spark framework is installed
-set SPARK_HOME=%~dp0..
+call "%~dp0find-spark-home.cmd"
 call "%SPARK_HOME%\bin\load-spark-env.cmd"
 set _SPARK_CMD_USAGE=Usage: bin\pyspark.cmd [options]
bin/run-example.cmd

@@ -17,7 +17,9 @@ rem See the License for the specific language governing permissions and
 rem limitations under the License.
 rem
-set SPARK_HOME=%~dp0..
+rem Figure out where the Spark framework is installed
+call "%~dp0find-spark-home.cmd"
 set _SPARK_CMD_USAGE=Usage: ./bin/run-example [options] example-class [example args]
 rem The outermost quotes are used to prevent Windows command line parse error
bin/spark-class2.cmd

@@ -18,7 +18,7 @@ rem limitations under the License.
 rem
 rem Figure out where the Spark framework is installed
-set SPARK_HOME=%~dp0..
+call "%~dp0find-spark-home.cmd"
 call "%SPARK_HOME%\bin\load-spark-env.cmd"
bin/spark-shell2.cmd

@@ -17,7 +17,9 @@ rem See the License for the specific language governing permissions and
 rem limitations under the License.
 rem
-set SPARK_HOME=%~dp0..
+rem Figure out where the Spark framework is installed
+call "%~dp0find-spark-home.cmd"
 set _SPARK_CMD_USAGE=Usage: .\bin\spark-shell.cmd [options]
 rem SPARK-4161: scala does not assume use of the java classpath,
bin/sparkr2.cmd

@@ -18,7 +18,7 @@ rem limitations under the License.
 rem
 rem Figure out where the Spark framework is installed
-set SPARK_HOME=%~dp0..
+call "%~dp0find-spark-home.cmd"
 call "%SPARK_HOME%\bin\load-spark-env.cmd"