[SPARK-1969][MLlib] Online summarizer APIs for mean, variance, min, and max
This moves the private ColumnStatisticsAggregator class out of RowMatrix into a publicly available DeveloperApi, with documentation and unit tests.

Changes:
1) Moved the private implementation from org.apache.spark.mllib.linalg.ColumnStatisticsAggregator to org.apache.spark.mllib.stat.MultivariateOnlineSummarizer.
2) The number of columns is no longer needed in the constructor when creating an OnlineSummarizer object. It is determined when users add the first sample.
3) Added API documentation for MultivariateOnlineSummarizer.
4) Added unit tests for MultivariateOnlineSummarizer.

Author: DB Tsai <dbtsai@dbtsai.com>

Closes #955 from dbtsai/dbtsai-summarizer and squashes the following commits:

b13ac90 [DB Tsai] dbtsai-summarizer
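For readers unfamiliar with the technique, the idea behind an online summarizer whose column count is fixed by the first sample can be sketched in plain Scala. This is a hypothetical illustration, not the code in this commit: the class name `OnlineSummarizerSketch`, the dense `Array[Double]` input, and the choice of Welford-style updates with the unbiased sample variance are all assumptions made for the example.

```scala
// Hypothetical sketch, NOT the Spark implementation: a dense per-column
// online summarizer whose dimension is set by the first sample it sees.
final class OnlineSummarizerSketch extends Serializable {
  private var n = 0                    // number of columns; 0 until the first sample
  private var cnt = 0L                 // number of samples seen so far
  private var mu: Array[Double] = _    // running mean per column
  private var m2: Array[Double] = _    // running sum of squared deviations per column
  private var hi: Array[Double] = _    // running max per column
  private var lo: Array[Double] = _    // running min per column

  /** Adds one sample, updating all statistics in a single pass (Welford update). */
  def add(sample: Array[Double]): this.type = {
    if (n == 0) {
      require(sample.nonEmpty, "sample must have at least one column")
      n = sample.length
      mu = Array.fill(n)(0.0)
      m2 = Array.fill(n)(0.0)
      hi = Array.fill(n)(Double.NegativeInfinity)
      lo = Array.fill(n)(Double.PositiveInfinity)
    }
    require(sample.length == n, s"expected $n columns but got ${sample.length}")
    cnt += 1
    var i = 0
    while (i < n) {
      val x = sample(i)
      val delta = x - mu(i)
      mu(i) += delta / cnt
      m2(i) += delta * (x - mu(i))
      if (x > hi(i)) hi(i) = x
      if (x < lo(i)) lo(i) = x
      i += 1
    }
    this
  }

  /** Folds another summarizer into this one, e.g. when combining partitions. */
  def merge(other: OnlineSummarizerSketch): this.type = {
    if (other.cnt == 0L) {
      // Nothing to merge.
    } else if (cnt == 0L) {
      n = other.n
      cnt = other.cnt
      mu = other.mu.clone(); m2 = other.m2.clone()
      hi = other.hi.clone(); lo = other.lo.clone()
    } else {
      require(n == other.n, s"dimension mismatch: $n vs ${other.n}")
      val total = cnt + other.cnt
      var i = 0
      while (i < n) {
        val delta = other.mu(i) - mu(i)
        m2(i) += other.m2(i) + delta * delta * cnt * other.cnt / total
        mu(i) += delta * other.cnt / total
        if (other.hi(i) > hi(i)) hi(i) = other.hi(i)
        if (other.lo(i) < lo(i)) lo(i) = other.lo(i)
        i += 1
      }
      cnt = total
    }
    this
  }

  def count: Long = cnt

  def mean: Array[Double] = { require(cnt > 0, "no samples added"); mu.clone() }

  /** Unbiased sample variance; all zeros when fewer than two samples were added. */
  def variance: Array[Double] = {
    require(cnt > 0, "no samples added")
    if (cnt > 1) m2.map(_ / (cnt - 1)) else Array.fill(n)(0.0)
  }

  def max: Array[Double] = { require(cnt > 0, "no samples added"); hi.clone() }

  def min: Array[Double] = { require(cnt > 0, "no samples added"); lo.clone() }
}
```

Under these assumptions, a caller would construct the summarizer with no arguments, call `add` once per row, and call `merge` to combine partial results from different partitions, which is the shape an aggregate over a distributed dataset needs. The real MultivariateOnlineSummarizer may differ in input types, sparse handling, and the exact statistics it exposes.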
Showing
- mllib/src/main/scala/org/apache/spark/mllib/linalg/distributed/RowMatrix.scala: 2 additions, 134 deletions
- mllib/src/main/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizer.scala: 201 additions, 0 deletions
- mllib/src/test/scala/org/apache/spark/mllib/stat/MultivariateOnlineSummarizerSuite.scala: 209 additions, 0 deletions
- mllib/src/test/scala/org/apache/spark/mllib/util/TestingUtils.scala: 45 additions, 0 deletions
- project/MimaExcludes.scala: 1 addition, 0 deletions