The following problem appeared as a project in the edX course ColumbiaX: CSMM.102x Machine Learning; the problem description below is taken from the course project itself.
The following figures show how the idea of Matrix Factorization can be extended to Probabilistic Matrix Factorization by assuming a Gaussian generative process.
Here is the probabilistic matrix factorization algorithm:
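As a rough sketch of the algorithm described above, the MAP coordinate-ascent updates for PMF alternate between solving a ridge-regression problem for each user vector (holding V fixed) and for each object vector (holding U fixed). The hyperparameter values below (d, λ, σ²) are illustrative defaults, not taken from the text:

```python
import numpy as np

def pmf(ratings, n_users, n_objects, d=5, lam=2.0, sigma2=0.1, iters=50):
    """MAP coordinate-ascent sketch for Probabilistic Matrix Factorization.

    ratings: iterable of (user_index, object_index, rating) triples (0-based).
    Returns U (n_users x d) and V (n_objects x d).
    """
    rng = np.random.default_rng(0)
    U = rng.normal(0.0, 1.0 / lam, size=(n_users, d))
    V = rng.normal(0.0, 1.0 / lam, size=(n_objects, d))

    # Index the observed entries per user and per object.
    by_user = [[] for _ in range(n_users)]
    by_obj = [[] for _ in range(n_objects)]
    for u, o, r in ratings:
        by_user[int(u)].append((int(o), r))
        by_obj[int(o)].append((int(u), r))

    for _ in range(iters):
        # Update each user vector: a ridge-regression solve against V.
        for i, rated in enumerate(by_user):
            if not rated:
                continue
            Vi = V[[o for o, _ in rated]]
            ri = np.array([r for _, r in rated])
            A = lam * sigma2 * np.eye(d) + Vi.T @ Vi
            U[i] = np.linalg.solve(A, Vi.T @ ri)
        # Update each object vector symmetrically, holding U fixed.
        for j, rated in enumerate(by_obj):
            if not rated:
                continue
            Uj = U[[u for u, _ in rated]]
            rj = np.array([r for _, r in rated])
            A = lam * sigma2 * np.eye(d) + Uj.T @ Uj
            V[j] = np.linalg.solve(A, Uj.T @ rj)
    return U, V
```

Each inner update has a closed form because, with the other factor fixed, the MAP objective is a quadratic in the vector being updated.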
The first few rows of the input ratings dataset are shown below; it is a comma-separated file in which each row contains three values that correspond, in order, to: user_index, object_index, rating. There are 1000 users, 100 objects and a total of 2000 ratings provided by the users.
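Parsing this format is straightforward; the sketch below reads a few synthetic rows from an in-memory string, whereas in the project the same call would point at the provided CSV file instead:

```python
import io
import numpy as np

# Synthetic rows in the same (user_index, object_index, rating) format;
# in the project these would come from the supplied ratings file.
csv_text = "1,3,4.0\n2,3,2.0\n1,7,5.0\n"
data = np.genfromtxt(io.StringIO(csv_text), delimiter=",")
users = data[:, 0].astype(int)
objects = data[:, 1].astype(int)
ratings = data[:, 2]
```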
The PMF algorithm factorizes the low-rank sparse rating matrix into two matrices, U (user) and V (object), where d is the dimension of the latent feature space, as shown below:
The following figure shows how the log likelihood value (the MAP objective function on the log scale, as shown in the figure above) increases sharply over the first few iterations of the PMF algorithm and then gradually converges, as expected:
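The quantity being tracked in that figure can be computed directly; a minimal sketch of the PMF log joint likelihood (the MAP objective, up to an additive constant), assuming the standard Gaussian likelihood with variance σ² and zero-mean Gaussian priors with precision λ on the rows of U and V:

```python
import numpy as np

def map_objective(ratings, U, V, lam=2.0, sigma2=0.1):
    """PMF log joint likelihood (MAP objective), up to an additive constant.

    ratings: iterable of (user_index, object_index, rating) triples.
    """
    # Squared reconstruction error over the observed entries only.
    sq_err = sum((r - U[int(u)] @ V[int(o)]) ** 2 for u, o, r in ratings)
    return (-sq_err / (2.0 * sigma2)
            - (lam / 2.0) * np.sum(U ** 2)
            - (lam / 2.0) * np.sum(V ** 2))
```

Evaluating this after every coordinate-ascent sweep and plotting the values reproduces the monotonically increasing curve: each closed-form update can only increase (or leave unchanged) the objective.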
The following figures and the animation show how the U and V matrices change with the number of iterations.
Finally, the following figure and animation show how the proximity (similarity) of the users and the objects in the latent embedding space changes with the number of iterations, shown for the first few users and objects (the blue points are users and the green ones are objects):
In the next animation, the blue points represent the users and the orange ones the objects.