In this article, the output of GMM-EM soft clustering is compared with that of KMeans hard clustering on an image.
- The apples and oranges image shown below is used for comparing the two clustering techniques.
- The two color channels R and G are used as the variables for this image data.
- Two 2-dimensional Gaussian models are used as the initial models for GMM-EM: the first with the red mean vector (1,0) and the second with the green mean vector (0,1), each with a random covariance matrix.
- The same two points, (1,0) and (0,1), are also used as the initial cluster centroids for KMeans clustering.
- The steps of the EM algorithm for GMM and the change in the Gaussian contours over the iterations (until convergence) are shown in the next animation.
- The change in the centroids over the iterations (until convergence) for KMeans clustering is shown in the next animation.
- Finally, after both algorithms converge, the pixels assigned to one of the resulting clusters are marked black, for each algorithm. The next figure shows that GMM-EM can separate the orange from the apples but KMeans cannot.
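
The setup described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code used to produce the animations: the pixel data is a synthetic stand-in (two blobs in (R,G) space, so it is not expected to reproduce the difference seen on the real image), and the helper names `gaussian_pdf`, `gmm_em` and `kmeans`, the random-covariance recipe, the iteration limits and the tolerances are all hypothetical choices. Only the initial means (1,0) and (0,1) come from the text.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Density of a multivariate Gaussian evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    expo = -0.5 * np.einsum("ij,jk,ik->i", diff, inv, diff)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(expo) / norm

def gmm_em(X, means, covs, n_iter=100, tol=1e-6):
    """EM for a Gaussian mixture; returns soft assignments and parameters."""
    means, covs = means.copy(), covs.copy()
    k, d = means.shape
    weights = np.full(k, 1.0 / k)
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        dens = np.stack(
            [weights[j] * gaussian_pdf(X, means[j], covs[j]) for j in range(k)],
            axis=1,
        )
        total = dens.sum(axis=1, keepdims=True) + 1e-300
        resp = dens / total
        # M-step: re-estimate weights, means and covariances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / len(X)
        means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j]
            covs[j] += 1e-6 * np.eye(d)  # small ridge keeps it invertible
        # Stop when the log-likelihood no longer changes noticeably.
        ll = np.log(total).sum()
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return resp, means, covs

def kmeans(X, centroids, n_iter=100):
    """Lloyd's algorithm; returns hard labels and final centroids."""
    centroids = centroids.copy()
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

rng = np.random.default_rng(0)
# Assumption: synthetic (R, G) values in [0, 1] standing in for image pixels.
X = np.vstack([
    rng.normal([0.8, 0.2], 0.05, (200, 2)),
    rng.normal([0.3, 0.7], 0.05, (200, 2)),
]).clip(0.0, 1.0)

init_means = np.array([[1.0, 0.0], [0.0, 1.0]])  # red and green, as in the text
A = rng.uniform(-0.5, 0.5, (2, 2, 2))
init_covs = A @ A.transpose(0, 2, 1) + 0.1 * np.eye(2)  # random SPD matrices

resp, gmm_means, gmm_covs = gmm_em(X, init_means, init_covs)
gmm_labels = resp.argmax(axis=1)  # harden the soft assignment for display
km_labels, centroids = kmeans(X, init_means)
```

For a real image, `X` would instead be the R and G channels reshaped to one row per pixel, and the pixels of one cluster could be blacked out by reshaping `gmm_labels == 0` (or `km_labels == 0`) back to the image height and width and using it as a mask.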