SPARK-1215 [MLLIB]: Clustering: Index out of bounds error (2)
Added check to LocalKMeans.scala: kMeansPlusPlus initialization to handle case with fewer distinct data points than clusters k. Added two related unit tests to KMeansSuite.

(Re-submitting PR after tangling commits in PR 1407: https://github.com/apache/spark/pull/1407)

Author: Joseph K. Bradley <joseph.kurata.bradley@gmail.com>

Closes #1468 from jkbradley/kmeans-fix and squashes the following commits:

4e9bd1e [Joseph K. Bradley] Updated PR per comments from mengxr
6c7a2ec [Joseph K. Bradley] Added check to LocalKMeans.scala: kMeansPlusPlus initialization to handle case with fewer distinct data points than clusters k. Added two related unit tests to KMeansSuite.
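For readers without the diff at hand, here is a minimal sketch of the kind of guard the message describes. It is not the patch itself: the object `KMeansPlusPlusSketch`, the method `initCenters`, and the plain `Array[Double]` point type are assumptions made for illustration, and the real LocalKMeans.scala code may be structured differently. The sketch assumes the failure arises because, once every remaining point coincides with an already chosen center, all sampling costs are zero, the cumulative-cost scan never advances, and indexing with `j - 1` falls out of bounds.

```scala
import scala.util.Random

// Hypothetical, self-contained sketch; names do not come from Spark.
object KMeansPlusPlusSketch {
  type Point = Array[Double]

  def squaredDistance(a: Point, b: Point): Double = {
    var sum = 0.0
    var i = 0
    while (i < a.length) {
      val d = a(i) - b(i)
      sum += d * d
      i += 1
    }
    sum
  }

  // Cost of a point: squared distance to its nearest chosen center.
  def pointCost(centers: Array[Point], p: Point): Double =
    centers.map(c => squaredDistance(c, p)).min

  def initCenters(points: Array[Point], k: Int, seed: Long): Array[Point] = {
    require(points.nonEmpty, "need at least one point")
    require(k > 0, "k must be positive")
    val rand = new Random(seed)
    val centers = new Array[Point](k)
    centers(0) = points(rand.nextInt(points.length)).clone()

    for (i <- 1 until k) {
      val chosen = centers.take(i)
      val costs = points.map(p => pointCost(chosen, p))
      // Sample an index with probability proportional to its cost.
      val r = rand.nextDouble() * costs.sum
      var cumulative = 0.0
      var j = 0
      while (j < points.length && cumulative < r) {
        cumulative += costs(j)
        j += 1
      }
      // Guard: if every cost is zero (no distinct points left), r is 0
      // and the loop never advances, so j stays 0. Without this check,
      // points(j - 1) would be points(-1) and throw.
      centers(i) =
        if (j == 0) points(0).clone() // fall back to a duplicate center
        else points(j - 1).clone()
    }
    centers
  }
}
```

Under these assumptions, calling `KMeansPlusPlusSketch.initCenters` with three identical points and `k = 5` returns five copies of that point instead of throwing, which is the situation the two new unit tests are described as covering.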
Showing 2 changed files:
- mllib/src/main/scala/org/apache/spark/mllib/clustering/LocalKMeans.scala (7 additions, 1 deletion)
- mllib/src/test/scala/org/apache/spark/mllib/clustering/KMeansSuite.scala (26 additions, 0 deletions)