diff --git a/docs/python-programming-guide.md b/docs/python-programming-guide.md
index a840b9b34bb2fbecfde182acaa3fba99bbb79e29..4e84d23edf56b0a6b6fe3291744c269be0085b28 100644
--- a/docs/python-programming-guide.md
+++ b/docs/python-programming-guide.md
@@ -67,13 +67,20 @@ The script automatically adds the `pyspark` package to the `PYTHONPATH`.
 
 # Interactive Use
 
-The `pyspark` script launches a Python interpreter that is configured to run PySpark jobs.
-When run without any input files, `pyspark` launches a shell that can be used explore data interactively, which is a simple way to learn the API:
+The `pyspark` script launches a Python interpreter that is configured to run PySpark jobs. To use `pyspark` interactively, first build Spark, then launch it directly from the command line without any options:
+
+{% highlight bash %}
+$ sbt/sbt package
+$ ./pyspark
+{% endhighlight %}
+
+The Python shell can be used to explore data interactively and is a simple way to learn the API:
 
 {% highlight python %}
 >>> words = sc.textFile("/usr/share/dict/words")
 >>> words.filter(lambda w: w.startswith("spar")).take(5)
 [u'spar', u'sparable', u'sparada', u'sparadrap', u'sparagrass']
+>>> help(pyspark) # Show all pyspark functions
 {% endhighlight %}
 
 By default, the `pyspark` shell creates SparkContext that runs jobs locally.
diff --git a/python/pyspark/shell.py b/python/pyspark/shell.py
index f6328c561f56259b5d36ad5b87d9d1b6379320ed..54ff1bf8e7c3ea63db72a5bd9015e26705948b9d 100644
--- a/python/pyspark/shell.py
+++ b/python/pyspark/shell.py
@@ -4,6 +4,7 @@ An interactive shell.
 This file is designed to be launched as a PYTHONSTARTUP script.
 """
 import os
+import pyspark
 from pyspark.context import SparkContext
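
A note on the `shell.py` hunk: the `pyspark` launcher starts Python with `PYTHONSTARTUP` pointing at `shell.py`, so every name bound in that file is already defined at the `>>>` prompt. The new `import pyspark` is therefore what makes the `help(pyspark)` line added to the docs resolve. Below is a minimal sketch of what the resulting startup script looks like; the `sc` bootstrap and the `MASTER` environment variable are assumptions inferred from the docs' "runs jobs locally" default, not part of this patch:

```python
"""
An interactive shell.

This file is designed to be launched as a PYTHONSTARTUP script.
"""
import os
import pyspark  # added by this patch: lets users run help(pyspark) in the shell
from pyspark.context import SparkContext

# Assumed bootstrap (not shown in this diff): bind a SparkContext to `sc` so
# the doc examples work immediately. Jobs run locally unless MASTER names a
# cluster master URL.
sc = SparkContext(os.environ.get("MASTER", "local"), "PySparkShell")
print("SparkContext available as sc.")
```

Launching `./pyspark` then drops the user into a session where `os`, `pyspark`, and `sc` are all pre-defined, which is why the documentation's examples can call `sc.textFile(...)` without any setup.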