    [SPARK-19395][SPARKR] Convert coefficients in summary to matrix · ce112cec
    actuaryzhang authored
    ## What changes were proposed in this pull request?
    The `coefficients` component of a model summary should be a matrix, but its underlying structure is actually a list. This affects several models; `AFTSurvivalRegressionModel` is the exception, as it already has the correct implementation. The fix is to `unlist` the coefficients returned from `callJMethod` before converting them to a matrix. The following example illustrates the issue:
    
    ```
    data(iris)
    df <- createDataFrame(iris)
    model <- spark.glm(df, Sepal_Length ~ Sepal_Width, family = "gaussian")
    s <- summary(model)
    
    > str(s$coefficients)
    List of 8
     $ : num 6.53
     $ : num -0.223
     $ : num 0.479
     $ : num 0.155
     $ : num 13.6
     $ : num -1.44
     $ : num 0
     $ : num 0.152
     - attr(*, "dim")= int [1:2] 2 4
     - attr(*, "dimnames")=List of 2
      ..$ : chr [1:2] "(Intercept)" "Sepal_Width"
      ..$ : chr [1:4] "Estimate" "Std. Error" "t value" "Pr(>|t|)"
    > s$coefficients[, 2]
    $`(Intercept)`
    [1] 0.4788963
    
    $Sepal_Width
    [1] 0.1550809
    ```
    
    Indexing a column returns a list of elements rather than a numeric vector, which shows that the underlying structure of `coefficients` is still a `list`.
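
    The behavior and the proposed fix can be reproduced in base R without a Spark session. The snippet below is a minimal sketch, not the SparkR source: `raw` is a hypothetical stand-in for the list returned by `callJMethod`, filled with the rounded values from the `str()` output above, and `broken` and `fixed` are illustrative names.

    ```r
    # Hypothetical stand-in for the list of numerics returned by callJMethod.
    raw <- list(6.53, -0.223, 0.479, 0.155, 13.6, -1.44, 0, 0.152)
    rn <- c("(Intercept)", "Sepal_Width")
    cn <- c("Estimate", "Std. Error", "t value", "Pr(>|t|)")

    # Without unlist: matrix() keeps the list type and only adds dim/dimnames.
    broken <- matrix(raw, nrow = 2, ncol = 4, dimnames = list(rn, cn))
    is.list(broken)     # TRUE
    is.numeric(broken)  # FALSE

    # With unlist first: the result is an ordinary numeric matrix.
    fixed <- matrix(unlist(raw), nrow = 2, ncol = 4, dimnames = list(rn, cn))
    is.numeric(fixed)   # TRUE
    fixed[, 2]          # a named numeric vector of standard errors
    ```

    With the `unlist` step in place, column indexing and arithmetic on the coefficients behave as they would for any numeric matrix.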
    
    felixcheung wangmiao1981
    
    Author: actuaryzhang <actuaryzhang10@gmail.com>
    
    Closes #16730 from actuaryzhang/sparkRCoef.