In this article, both linear PCA and kernel PCA are applied to a few shape datasets, to show whether each method preserves the structure of the data (in terms of distinct clusters) when it is projected from a higher-dimensional space down to a lower one.

- For the linear PCA, as usual, the dataset is first *z-score normalized* and then the *eigen-analysis* of the *covariance matrix* is done. To reduce the dimension, the dataset is projected onto the first few principal components (the dominant eigenvectors of the covariance matrix).
- For the kernel PCA, a Gaussian kernel is used to compute the similarities between the datapoints and the *kernel matrix* is computed (with the *kernel trick*), then normalized (centered). Next, the *eigen-analysis* of the *kernel matrix* is done. To reduce the dimension, the first few dominant eigenvectors of the kernel matrix are chosen; these implicitly represent the data already projected onto the principal components of the infinite-dimensional feature space.

The next figure shows the algorithms for the two methods.
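The two procedures above can be sketched in a few lines of NumPy. This is a minimal illustration, not the article's exact implementation: the kernel width `gamma` and the double-centering of the kernel matrix are standard choices assumed here.

```python
import numpy as np

def linear_pca(X, n_components=2):
    # z-score normalize each feature
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    # eigen-analysis of the covariance matrix
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]
    # project onto the dominant principal components
    return Xs @ eigvecs[:, order]

def kernel_pca(X, n_components=2, gamma=1.0):
    # Gaussian (RBF) kernel matrix from pairwise squared distances
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    K = np.exp(-gamma * d2)
    # center the kernel matrix (normalization in feature space)
    n = K.shape[0]
    one_n = np.ones((n, n)) / n
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    # eigen-analysis of the centered kernel matrix
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    # the scaled dominant eigenvectors are the projected coordinates
    return eigvecs[:, order] * np.sqrt(np.maximum(eigvals[order], 0.0))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 5))
    print(linear_pca(X).shape)   # (100, 2)
    print(kernel_pca(X).shape)   # (100, 2)
```

Note that linear PCA yields an explicit projection matrix that can map new points, whereas this plain kernel PCA only produces embeddings for the training points themselves.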
