    a28728a9
    goldmedal authored
    [SPARK-21513][SQL][FOLLOWUP] Allow UDF to_json support converting MapType to json for PySpark and SparkR
    
    ## What changes were proposed in this pull request?
    In the previous work SPARK-21513, we allowed `MapType` and `ArrayType` of `MapType`s to be converted to a JSON string, but only for the Scala API. In this follow-up PR, we make Spark SQL support it for PySpark and SparkR, too. We also fix some minor bugs and comments from the previous work.
    
    ### For PySpark
    ```
    >>> data = [(1, {"name": "Alice"})]
    >>> df = spark.createDataFrame(data, ("key", "value"))
    >>> df.select(to_json(df.value).alias("json")).collect()
    [Row(json=u'{"name":"Alice"}')]
    >>> data = [(1, [{"name": "Alice"}, {"name": "Bob"}])]
    >>> df = spark.createDataFrame(data, ("key", "value"))
    >>> df.select(to_json(df.value).alias("json")).collect()
    [Row(json=u'[{"name":"Alice"},{"name":"Bob"}]')]
    ```
    ### For SparkR
    ```
    # Converts a map into a JSON object
    df2 <- sql("SELECT map('name', 'Bob') as people")
    df2 <- mutate(df2, people_json = to_json(df2$people))
    # Converts an array of maps into a JSON array
    df2 <- sql("SELECT array(map('name', 'Bob'), map('name', 'Alice')) as people")
    df2 <- mutate(df2, people_json = to_json(df2$people))
    ```
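    For reference, the compact JSON strings shown in the examples above can be reproduced with plain Python's standard `json` module using compact separators. This is only an illustration of the expected output format (an assumption based on the example outputs), not Spark code:

    ```python
    import json

    # A MapType value serializes to a single JSON object.
    single_map = {"name": "Alice"}
    print(json.dumps(single_map, separators=(",", ":")))
    # {"name":"Alice"}

    # An ArrayType of MapTypes serializes to a JSON array of objects.
    map_array = [{"name": "Alice"}, {"name": "Bob"}]
    print(json.dumps(map_array, separators=(",", ":")))
    # [{"name":"Alice"},{"name":"Bob"}]
    ```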
    ## How was this patch tested?
    Add unit test cases.
    
    cc viirya HyukjinKwon
    
    Author: goldmedal <liugs963@gmail.com>
    
    Closes #19223 from goldmedal/SPARK-21513-fp-PySaprkAndSparkR.