In this article, both linear PCA and kernel PCA are applied to a few shape datasets to show whether the structure of the data in the higher-dimensional space (in terms of distinct clusters) is preserved in the lower-dimensional space by each method.
- For linear PCA, as usual, the dataset is first z-score normalized and then the eigen-analysis of its covariance matrix is performed. To reduce the dimension, the dataset is then projected onto the first few principal components (the dominant eigenvectors of the covariance matrix).
- For kernel PCA, a Gaussian kernel is applied to the pairwise distances between the data points to compute the kernel matrix (with the kernel trick), which is then normalized. Next, the eigen-analysis of the kernel matrix is performed. To reduce the dimension, the first few dominant eigenvectors of the kernel matrix are chosen; these implicitly represent the data already projected onto the principal components of the infinite-dimensional feature space. The next figure shows the algorithms for the two methods.
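
The two procedures described above can be sketched in NumPy as follows. This is a minimal illustration rather than the exact implementation used for the results: the function names, the `gamma` width parameter of the Gaussian kernel, and the toy two-circle dataset are all assumptions made here. The sketch also assumes that "normalizing" the kernel matrix means centering it in feature space, and it scales the chosen eigenvectors by the square roots of their eigenvalues, which only changes the scale of each axis.

```python
import numpy as np

def linear_pca(X, n_components=2):
    # z-score normalize each feature (assumes no constant columns)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # covariance matrix of the standardized data (features in columns)
    C = np.cov(Z, rowvar=False)
    # eigh handles symmetric matrices; eigenvalues come back in ascending order
    eigvals, eigvecs = np.linalg.eigh(C)
    order = np.argsort(eigvals)[::-1][:n_components]
    # project the data onto the dominant eigenvectors
    return Z @ eigvecs[:, order]

def gaussian_kernel_pca(X, n_components=2, gamma=1.0):
    # gamma is a hypothetical kernel width: k(x, y) = exp(-gamma * ||x - y||^2)
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # clip tiny negatives from round-off
    K = np.exp(-gamma * sq_dists)
    # assumption: "normalizing" the kernel matrix means centering in feature space
    n = K.shape[0]
    one_n = np.full((n, n), 1.0 / n)
    Kc = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[order], 0.0)
    # the dominant eigenvectors already hold the projected training points;
    # scaling by sqrt(eigenvalue) is an optional convention assumed here
    return eigvecs[:, order] * np.sqrt(lam)

# toy dataset (an assumption for illustration): two noisy concentric circles
rng = np.random.default_rng(0)
angles = rng.uniform(0.0, 2.0 * np.pi, 100)
radii = np.repeat([1.0, 3.0], 50)
X = np.column_stack([radii * np.cos(angles), radii * np.sin(angles)])
X += rng.normal(scale=0.05, size=X.shape)

Y_lin = linear_pca(X, n_components=2)
Y_ker = gaussian_kernel_pca(X, n_components=2, gamma=1.0)
print(Y_lin.shape, Y_ker.shape)
```

Note that the linear version needs an explicit projection step, while the kernel version never forms the feature-space coordinates at all: everything is computed from the n × n kernel matrix, so its cost grows with the number of points rather than with the dimension of the feature space.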