  • [SPARK-3073] [PySpark] use external sort in sortBy() and sortByKey() · f1e71d4c
    Davies Liu authored
    Use external sort to support sorting large datasets in the reduce stage.
    
    Author: Davies Liu <davies.liu@gmail.com>
    
    Closes #1978 from davies/sort and squashes the following commits:
    
    bbcd9ba [Davies Liu] check spilled bytes in tests
    b125d2f [Davies Liu] add test for external sort in rdd
    eae0176 [Davies Liu] choose different disks from different processes and instances
    1f075ed [Davies Liu] Merge branch 'master' into sort
    eb53ca6 [Davies Liu] Merge branch 'master' into sort
    644abaf [Davies Liu] add license in LICENSE
    19f7873 [Davies Liu] improve tests
    55602ee [Davies Liu] use external sort in sortBy() and sortByKey()
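
The commit message refers to an external sort: sorting data that does not fit in memory by spilling sorted runs to disk and merging them. The following is a minimal sketch of that general technique in plain Python, not the code from this commit. The names `external_sorted`, `_spill`, `_read_run`, and the `chunk_size` parameter are hypothetical and chosen only for illustration; the actual PySpark implementation may decide when to spill differently (for example, based on memory usage rather than a fixed item count).

```python
import heapq
import itertools
import os
import pickle
import tempfile


def _spill(sorted_chunk, directory):
    """Write one sorted run to a temporary file and return its path."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        for item in sorted_chunk:
            pickle.dump(item, f)
    return path


def _read_run(path):
    """Lazily yield the items of one spilled run, in stored order."""
    with open(path, "rb") as f:
        while True:
            try:
                yield pickle.load(f)
            except EOFError:
                return


def external_sorted(iterable, key=None, reverse=False, chunk_size=10000):
    """Yield the items of `iterable` in sorted order.

    Hypothetical helper: at most `chunk_size` items are held in memory
    while building runs; the runs are then merged lazily from disk.
    """
    iterator = iter(iterable)
    with tempfile.TemporaryDirectory() as directory:
        paths = []
        while True:
            # Take the next chunk that is assumed to fit in memory.
            chunk = list(itertools.islice(iterator, chunk_size))
            if not chunk:
                break
            chunk.sort(key=key, reverse=reverse)
            paths.append(_spill(chunk, directory))
        # k-way merge of the sorted runs; only one item per run is
        # held in memory at a time.
        runs = [_read_run(p) for p in paths]
        yield from heapq.merge(*runs, key=key, reverse=reverse)
```

For example, `list(external_sorted(records, key=lambda kv: kv[0], chunk_size=1000))` would sort key-value pairs by key while keeping at most 1000 of them in memory during the run-building phase. A fixed chunk size keeps the sketch simple; a real implementation would more likely spill when a memory limit is reached.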
This project is licensed under the Apache License 2.0.