In this article, the Expectation-Maximization (EM) algorithm is used with a Gaussian mixture model (GMM) to find outliers in images.
- We shall use the following apples and oranges image for the outlier detection.
- The R, G and B color channels form the variables for this image data, so each pixel is a 3-dimensional datapoint.
- Two 3-dimensional Gaussian models are used as the initial models: the first with a red mean vector (255,0,0) and the second with a green mean vector (0,255,0), each with a random covariance matrix.
- The E-step and the M-step are then applied iteratively until convergence (no further change in the estimated parameter values).
- First, the E-step computes the (normalized) probability of assignment of each datapoint (pixel) to each of the two models.
- Then the M-step computes the maximum likelihood estimates (MLE) of the mean and covariance matrix of each model as averages over the datapoints, weighted by those assignment probabilities.
- The EM algorithm for GMM is shown below:
- The convergence of the GMM EM over the iterations, along with the change in the estimated parameters, is shown in the animation below (the contours of the two Gaussian models change over the iterations).
- The following figure shows how the change in the estimated mean decreases with each iteration, leading the EM algorithm to converge.
- Finally, after the algorithm converges, the pixels assigned to one of the Gaussian models are marked as black pixels. The figure below shows that the algorithm identified the orange among the apples as a different cluster.
- The following animation shows how the image changes per iteration of the EM algorithm:
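
The procedure described in the list above can be sketched in a few lines of NumPy. This is only a minimal illustration under stated assumptions, not the code used to produce the figures: the image is a small synthetic stand-in (a reddish background with an orange patch), the helper names (`log_gaussian`, `em_gmm`, `random_cov`) are hypothetical, and the mixing weights, the tolerance `tol`, the regularization term `reg` and the scale of the random initial covariances are choices made here rather than taken from the article. The densities are evaluated in the log domain to avoid numerical underflow on 0-255 pixel values.

```python
import numpy as np

def log_gaussian(X, mean, cov):
    """Log density of a multivariate normal at each row of X."""
    d = X.shape[1]
    diff = X - mean
    maha = np.sum(diff * np.linalg.solve(cov, diff.T).T, axis=1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def em_gmm(X, means, covs, max_iter=200, tol=1e-3, reg=1e-6):
    """EM for a Gaussian mixture; returns parameters, responsibilities, mean shifts."""
    means = np.array(means, dtype=float)
    covs = np.array(covs, dtype=float)
    k, d = means.shape
    weights = np.full(k, 1.0 / k)  # assumption: equal initial mixing weights
    shifts = []
    for _ in range(max_iter):
        # E-step: normalized probability of each pixel under each model
        log_p = np.stack([np.log(weights[j]) + log_gaussian(X, means[j], covs[j])
                          for j in range(k)], axis=1)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: responsibility-weighted estimates of mean and covariance
        nk = resp.sum(axis=0) + 1e-12
        new_means = (resp.T @ X) / nk[:, None]
        for j in range(k):
            diff = X - new_means[j]
            covs[j] = (resp[:, j, None] * diff).T @ diff / nk[j] + reg * np.eye(d)
        weights = nk / len(X)
        # Convergence check: how far the estimated means moved
        shifts.append(np.linalg.norm(new_means - means))
        means = new_means
        if shifts[-1] < tol:
            break
    return means, covs, weights, resp, shifts

def random_cov(rng, d=3, scale=10.0, base=60.0):
    """Random symmetric positive-definite matrix (assumed initialization)."""
    A = rng.normal(0, scale, (d, d))
    return A @ A.T + base ** 2 * np.eye(d)

# Synthetic stand-in image: reddish pixels with a small orange patch.
rng = np.random.default_rng(0)
h, w = 20, 20
truth = np.zeros((h, w), dtype=bool)
truth[6:11, 4:12] = True
img = np.where(truth[..., None], [230.0, 150.0, 40.0], [200.0, 40.0, 40.0])
img = np.clip(img + rng.normal(0, 8, img.shape), 0, 255)
X = img.reshape(-1, 3)

# Initial models: red and green mean vectors with random covariances.
init_means = [[255, 0, 0], [0, 255, 0]]
init_covs = [random_cov(rng), random_cov(rng)]
means, covs, weights, resp, shifts = em_gmm(X, init_means, init_covs)

# Treat the lower-weight component as the outlier cluster and paint it black.
labels = resp.argmax(axis=1).reshape(h, w)
mask = labels == np.argmin(weights)
out = img.copy()
out[mask] = 0
```

Plotting `shifts` against the iteration number gives a curve of the kind described above for the change in the estimated mean. Picking the lower-weight component as the outlier cluster is a heuristic chosen for this sketch; the article simply marks the pixels of one of the two models.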