layout: global
title: Clustering - ML
displayTitle: <a href="ml-guide.html">ML</a> - Clustering

In this section, we introduce the pipeline API for clustering in MLlib.

Latent Dirichlet allocation (LDA)

LDA is implemented as an Estimator that supports both EMLDAOptimizer and OnlineLDAOptimizer, and generates an LDAModel as the base model. Expert users may cast an LDAModel generated by EMLDAOptimizer to a DistributedLDAModel if needed.

Refer to the Scala API docs for more details.

{% include_example scala/org/apache/spark/examples/ml/LDAExample.scala %}
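The cast from an LDAModel to a DistributedLDAModel described earlier can be sketched as follows. This is a minimal illustration, not the bundled example: it assumes a Spark 2.x or later build (where feature vectors come from `org.apache.spark.ml.linalg`), a local master, and a tiny made-up term-count dataset. The object name `LDACastSketch` and the data are hypothetical. The EM optimizer is selected with `setOptimizer("em")`, and a pattern match is used instead of an unchecked cast so the code still behaves sensibly if the online optimizer is chosen.

```scala
import org.apache.spark.ml.clustering.{DistributedLDAModel, LDA, LDAModel}
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.sql.SparkSession

// Hypothetical sketch: object name and input data are illustrative only.
object LDACastSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("LDACastSketch")
      .master("local[*]") // assumption: run locally for demonstration
      .getOrCreate()

    // Made-up term-count vectors over a 4-word vocabulary.
    val dataset = spark.createDataFrame(Seq(
      (0L, Vectors.dense(1.0, 0.0, 3.0, 0.0)),
      (1L, Vectors.dense(0.0, 2.0, 0.0, 4.0)),
      (2L, Vectors.dense(2.0, 0.0, 1.0, 0.0)),
      (3L, Vectors.dense(0.0, 3.0, 0.0, 1.0))
    )).toDF("id", "features")

    // "em" selects the EM optimizer; "online" selects the online optimizer.
    val lda = new LDA().setK(2).setMaxIter(10).setOptimizer("em")
    val model: LDAModel = lda.fit(dataset)

    // Only models trained with the EM optimizer are distributed.
    model match {
      case distributed: DistributedLDAModel =>
        // Members available only on the distributed model.
        println(s"training log likelihood: ${distributed.trainingLogLikelihood}")
        val local = distributed.toLocal // convert to a local model if needed
        local.describeTopics(3).show(false)
      case other =>
        println(s"model is local (isDistributed = ${other.isDistributed})")
    }

    spark.stop()
  }
}
```

Matching on the model type avoids a `ClassCastException` when the optimizer is switched to `"online"`, which yields a local model instead.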

Refer to the Java API docs for more details.

{% include_example java/org/apache/spark/examples/ml/JavaLDAExample.java %}