Commit 93b96b44 authored by Nick Pentreath

Adding implicit feedback ALS to MLlib user guide

parent c6ceaeae
@@ -144,10 +144,9 @@ Available algorithms for clustering:
# Collaborative Filtering
[Collaborative filtering](http://en.wikipedia.org/wiki/Recommender_system#Collaborative_filtering)
is commonly used for recommender systems. These techniques aim to fill in the
missing entries of a user-item association matrix. MLlib currently supports
model-based collaborative filtering, in which users and products are described
by a small set of latent factors that can be used to predict missing entries.
In particular, we implement the [alternating least squares
@@ -158,7 +157,24 @@ following parameters:
* *numBlocks* is the number of blocks used to parallelize computation (set to -1 to auto-configure).
* *rank* is the number of latent factors in our model.
* *iterations* is the number of iterations to run.
* *lambda* specifies the regularization parameter in ALS.
* *implicitPrefs* specifies whether to use the *explicit feedback* ALS variant or one adapted for *implicit feedback* data.
* *alpha* is a parameter applicable to the implicit feedback variant of ALS that governs the *baseline* confidence in preference observations.
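
To make the roles of *rank* and *lambda* more concrete, the textbook regularized least-squares objective for explicit-feedback matrix factorization can be written as below. The symbols $x_u$, $y_i$, $r_{ui}$, $k$, and $\Omega$ are notation introduced here for illustration only, and MLlib's implementation may scale the regularization term differently.

$$
\min_{X,\,Y} \sum_{(u,i) \in \Omega} \left( r_{ui} - x_u^\top y_i \right)^2
+ \lambda \left( \sum_u \lVert x_u \rVert^2 + \sum_i \lVert y_i \rVert^2 \right),
\qquad x_u,\, y_i \in \mathbb{R}^{k}
$$

Under this reading, $k$ plays the role of *rank*, $\lambda$ the role of *lambda*, and $\Omega$ is the set of observed user-item entries. Alternating least squares holds $Y$ fixed and solves a ridge regression for each $x_u$, then holds $X$ fixed and solves for each $y_i$; *iterations* roughly counts how many such alternations are run.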

## Explicit vs Implicit Feedback

The standard approach to matrix factorization-based collaborative filtering treats
the entries in the user-item matrix as *explicit* preferences given by the user to the item.
In many real-world use cases, however, only *implicit feedback* is available
(e.g., views, clicks, purchases, likes, shares). The approach used in MLlib to deal with
such data is taken from
[Collaborative Filtering for Implicit Feedback Datasets](http://research.yahoo.com/pub/2433).
Essentially, instead of trying to model the matrix of ratings directly, this approach treats the data as
a combination of binary preferences and *confidence values*. The ratings are then related
to the level of confidence in observed user preferences, rather than being treated as explicit ratings given to items.
The model then tries to find latent factors that can be used to predict the expected preference of a user
for an item.
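
The split into binary preferences and confidence values can be illustrated with a small sketch. In the cited paper, a raw observation `r` is commonly mapped to a preference `p = 1` if `r > 0` (else `0`) and a confidence `c = 1 + alpha * r`. The plain-Python code below assumes that form; the function name, the sample data, and the value of `alpha` are hypothetical, and this is not MLlib's API or necessarily how MLlib computes these values internally.

```python
# Sketch of the preference/confidence transform from the implicit feedback
# formulation. All names and values here are illustrative assumptions.

def to_preference_and_confidence(r, alpha):
    """Map a raw implicit-feedback strength r (e.g. a click count) to (p, c)."""
    preference = 1.0 if r > 0 else 0.0  # binary: did the user interact at all?
    confidence = 1.0 + alpha * r        # baseline of 1, growing linearly with r
    return preference, confidence

# Hypothetical observations: (user, item) -> number of interactions.
observations = {
    ("u1", "itemA"): 0,
    ("u1", "itemB"): 3,
    ("u2", "itemA"): 10,
}

alpha = 40.0  # arbitrary value chosen for illustration

weighted = {
    key: to_preference_and_confidence(r, alpha)
    for key, r in observations.items()
}

for key, (p, c) in sorted(weighted.items()):
    print(key, "preference =", p, "confidence =", c)
```

Under these assumptions, an unobserved pair still contributes a preference of 0 with the baseline confidence of 1, while frequently observed pairs contribute a preference of 1 with a much larger weight. The factorization then fits the preferences, with each squared error weighted by its confidence.
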
Available algorithms for collaborative filtering:
......