Commit 1537e556 authored by Kai Jiang, committed by Joseph K. Bradley

[SPARK-12041][ML][PYSPARK] Add columnSimilarities to IndexedRowMatrix

Add `columnSimilarities` to IndexedRowMatrix for PySpark spark.mllib.linalg.

Author: Kai Jiang <jiangkai@gmail.com>

Closes #10158 from vectorijk/spark-12041.
parent ff899755
@@ -297,6 +297,20 @@ class IndexedRowMatrix(DistributedMatrix):
        """
        return self._java_matrix_wrapper.call("numCols")

    def columnSimilarities(self):
        """
        Compute all cosine similarities between columns.

        >>> rows = sc.parallelize([IndexedRow(0, [1, 2, 3]),
        ...                        IndexedRow(6, [4, 5, 6])])
        >>> mat = IndexedRowMatrix(rows)
        >>> cs = mat.columnSimilarities()
        >>> print(cs.numCols())
        3
        """
        java_coordinate_matrix = self._java_matrix_wrapper.call("columnSimilarities")
        return CoordinateMatrix(java_coordinate_matrix)

    def toRowMatrix(self):
        """
        Convert this matrix to a RowMatrix.
......
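
For readers who want to see what "cosine similarities between columns" means for the doctest data, here is a minimal pure-Python sketch that runs without Spark. The helper name `column_cosine_similarities` is hypothetical and not part of PySpark. It assumes the result holds one entry per column pair (i, j) with i < j, which is how the underlying Scala `RowMatrix.columnSimilarities` is documented (an upper-triangular matrix); treat that as an assumption rather than something this diff shows. The row indices 0 and 6 from the doctest are dropped, since all-zero rows add nothing to a column dot product or norm.

```python
import math
from itertools import combinations


def column_cosine_similarities(rows):
    """Return {(i, j): cosine similarity} for every column pair with i < j.

    Hypothetical local helper for illustration; not a PySpark API.
    """
    num_cols = len(rows[0])
    # Euclidean norm of each column.
    norms = [math.sqrt(sum(row[j] ** 2 for row in rows)) for j in range(num_cols)]
    sims = {}
    for i, j in combinations(range(num_cols), 2):
        # Dot product of column i and column j.
        dot = sum(row[i] * row[j] for row in rows)
        # Skip pairs that involve an all-zero column (similarity undefined).
        if norms[i] > 0 and norms[j] > 0:
            sims[(i, j)] = dot / (norms[i] * norms[j])
    return sims


# Same values as the doctest rows, without the row indices.
rows = [[1, 2, 3], [4, 5, 6]]
sims = column_cosine_similarities(rows)
for (i, j), value in sorted(sims.items()):
    print(i, j, value)
```

With three columns there are three pairs, (0, 1), (0, 2) and (1, 2), which is consistent with the 3 x 3 shape that the doctest checks through `cs.numCols()`.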