    [SPARK-7687] [SQL] DataFrame.describe() should cast all aggregates to String · c9fa870a
    Josh Rosen authored
    In `DataFrame.describe()`, the `count` aggregate produces an integer, the `avg` and `stdev` aggregates produce doubles, and the `min` and `max` aggregates produce values whose type depends on the type of the column they are applied to. As a result, we should cast all aggregate results to String so that `describe()`'s output types match its declared output schema.
    
    Author: Josh Rosen <joshrosen@databricks.com>
    
    Closes #6218 from JoshRosen/SPARK-7687 and squashes the following commits:
    
    146b615 [Josh Rosen] Fix R test.
    2974bd5 [Josh Rosen] Cast to string type instead
    f206580 [Josh Rosen] Cast to double to fix SPARK-7687
    307ecbf [Josh Rosen] Add failing regression test for SPARK-7687
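
The casting idea described in the commit message can be illustrated with a minimal sketch in plain Python. This is not Spark's implementation: the function name `describe_column`, the statistic labels, and the choice of a population standard deviation are all assumptions made for illustration. It only shows why aggregates with mixed result types are converted to strings before being returned under a single declared type.

```python
import math
from typing import Dict, Sequence


def describe_column(values: Sequence[float]) -> Dict[str, str]:
    """Hypothetical stand-in for a describe()-style summary of one column.

    Every statistic is returned as a string so that the result has one
    uniform type, regardless of what each aggregate naturally produces.
    """
    n = len(values)
    mean = sum(values) / n
    # Assumption: population standard deviation, chosen only to keep the sketch short.
    stddev = math.sqrt(sum((x - mean) ** 2 for x in values) / n)

    # The raw aggregates have mixed types: int, float, float, and the column's own type.
    raw = {
        "count": n,
        "mean": mean,
        "stddev": stddev,
        "min": min(values),
        "max": max(values),
    }

    # Cast every aggregate to a string so the output matches a string-only schema.
    return {name: str(value) for name, value in raw.items()}


summary = describe_column([1, 2, 3, 4])
```

Without the final cast, `count` would be an `int`, `mean` a `float`, and `min`/`max` whatever type the column holds, which is the kind of mismatch against a declared all-string schema that the commit message describes.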
test_sparkSQL.R 26.51 KiB