[SPARK-12091] [PYSPARK] Deprecate the JAVA-specific deserialized storage levels
The current default storage level of the Python persist API is MEMORY_ONLY_SER, which differs from the default level MEMORY_ONLY in the official documentation and the RDD APIs. davies Is this inconsistency intentional? Thanks!

Update: Since the data is always serialized on the Python side, the Java-specific deserialized storage levels, such as MEMORY_ONLY, are not removed.

Update (based on the reviewers' feedback): In Python, stored objects will always be serialized with the [Pickle](https://docs.python.org/2/library/pickle.html) library, so it does not matter whether you choose a serialized level. The available storage levels in Python include `MEMORY_ONLY`, `MEMORY_ONLY_2`, `MEMORY_AND_DISK`, `MEMORY_AND_DISK_2`, `DISK_ONLY`, `DISK_ONLY_2` and `OFF_HEAP`.

Author: gatorsmile <gatorsmile@gmail.com>

Closes #10092 from gatorsmile/persistStorageLevel.
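For context, a minimal sketch of how the Python `persist` API is used with the storage levels listed above. The `SparkContext` setup and the sample data are illustrative assumptions, not part of this change:

```python
from pyspark import SparkContext, StorageLevel

# Assumes a working Spark installation; app name is arbitrary.
sc = SparkContext(appName="persist-storage-levels-example")

rdd = sc.parallelize(range(1000))

# Objects are always pickled on the Python side, so MEMORY_ONLY here
# caches serialized bytes; there is no separate *_SER level to choose.
rdd.persist(StorageLevel.MEMORY_ONLY)
print(rdd.count())

# To switch levels, unpersist first, then persist with a level that
# spills to disk when cached partitions do not fit in memory.
rdd.unpersist()
rdd.persist(StorageLevel.MEMORY_AND_DISK)
print(rdd.count())

sc.stop()
```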
Showing 10 changed files with 45 additions and 31 deletions:
- docs/configuration.md (4 additions, 3 deletions)
- docs/programming-guide.md (6 additions, 4 deletions)
- python/pyspark/rdd.py (4 additions, 4 deletions)
- python/pyspark/sql/dataframe.py (3 additions, 3 deletions)
- python/pyspark/storagelevel.py (21 additions, 10 deletions)
- python/pyspark/streaming/context.py (1 addition, 1 deletion)
- python/pyspark/streaming/dstream.py (2 additions, 2 deletions)
- python/pyspark/streaming/flume.py (2 additions, 2 deletions)
- python/pyspark/streaming/kafka.py (1 addition, 1 deletion)
- python/pyspark/streaming/mqtt.py (1 addition, 1 deletion)