
Principal Components Analysis

PCA is a useful statistical technique that has found application in fields such as face recognition and image compression, and it is a common technique for finding patterns in high-dimensional data. PCA identifies patterns in data and expresses the data in a way that highlights their similarities and differences. Since patterns can be hard to find in data of high dimension, where the luxury of graphical representation is not available, PCA is a powerful tool for analysing such data.

The other main advantage of PCA is that once you have found these patterns, you can compress the data, i.e. reduce the number of dimensions, without much loss of information. This technique is used in image compression, as we will see in a later section.

In this work, I will show PCA in six steps:
  1. Get some data
  2. Subtract the mean
  3. Calculate the covariance matrix
  4. Calculate the eigenvectors and eigenvalues of the covariance matrix
  5. Choosing components and forming a feature vector
  6. Deriving the new data set
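The six steps above can also be sketched outside of Matlab. Below is a minimal pure-Python version (standard library only, not a library-grade implementation) on the same ten data points used in the Matlab listing; for a 2x2 covariance matrix the eigenvalues come straight from the characteristic polynomial, so no linear-algebra package is needed.

```python
import math

# Step 1: get some data (the same ten points as the Matlab code below)
X = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
     (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]

# Step 2: subtract the mean of each dimension
n = len(X)
mx = sum(p[0] for p in X) / n
my = sum(p[1] for p in X) / n
A = [(x - mx, y - my) for x, y in X]

# Step 3: sample covariance matrix [[cxx, cxy], [cxy, cyy]] (divide by n-1)
cxx = sum(a * a for a, _ in A) / (n - 1)
cyy = sum(b * b for _, b in A) / (n - 1)
cxy = sum(a * b for a, b in A) / (n - 1)

# Step 4: eigenvalues from the characteristic polynomial
#   lam^2 - (cxx + cyy)*lam + (cxx*cyy - cxy^2) = 0
tr, det = cxx + cyy, cxx * cyy - cxy * cxy
disc = math.sqrt(tr * tr - 4 * det)
lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2   # lam1 >= lam2

# Unit eigenvector for the dominant eigenvalue (valid while cxy != 0):
# (cxx - lam1)*vx + cxy*vy = 0  =>  direction (cxy, lam1 - cxx)
vx, vy = cxy, lam1 - cxx
norm = math.hypot(vx, vy)
vx, vy = vx / norm, vy / norm

# Steps 5-6: project the mean-adjusted data onto the first component
PC1 = [a * vx + b * vy for a, b in A]

print(lam1, lam2)   # variance kept (about 1.2840) vs. discarded (about 0.0491)
print(vx, vy)       # matches the eigenvector hard-coded as f1 below, up to sign
```

This reproduces the eigenvector that the Matlab listing hard-codes as `f1` (up to an overall sign, which is arbitrary for eigenvectors).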

Matlab Code

clear all, clc, close all
%% Step 1: Get some data
X = [2.5 2.4; 
    0.5 0.7;
    2.2 2.9;
    1.9 2.2;
    3.1 3.0;
    2.3 2.7;
    2 1.6;
    1 1.1;
    1.5 1.6;
    1.1 0.9];
axis equal
axis([-1 4 -1 4])
hold on
plot(X(:,1), X(:,2),'k+');

pause
plot([0 0],[-1 4],'k:');
plot([-1 4], [0 0],'k:');
title('Original PCA Data');

%% Step 2: Subtract the mean 
% XAdjust = X-repmat(mean(X),size(X,1),1)
m = mean(X);
p1 = X(:,1)-m(1,1);
p2 = X(:,2)-m(1,2);
XAdjust = [p1 p2];

%% Step 3: Calculate the covariance matrix
CM = cov(XAdjust); % identical to cov(X), since cov subtracts the mean itself

%% Step 4: Calculate the eigenvectors and eigenvalues of the covariance matrix
[V D] = eig(CM);

%% Step 5: Choosing components and forming a feature vector
pause
plot(XAdjust(:,1),XAdjust(:,2),'r+');

pause
A = 10*V'; % transpose so each ROW of A is a scaled eigenvector (eig returns them as columns)
plot([-A(1,1) A(1,1)],[-A(1,2) A(1,2)],'b-.')
plot([-A(2,1) A(2,1)],[-A(2,2) A(2,2)],'b-.')

pause
% close all

figure
axis equal
axis([-2 2 -2 2])
hold on
plot(XAdjust(:,1),XAdjust(:,2),'r+');
plot([-A(1,1) A(1,1)],[-A(1,2) A(1,2)],'b-.');
plot([-A(2,1) A(2,1)],[-A(2,2) A(2,2)],'b-.');
plot([0 0],[-2,2],'k:');
plot([-2 2],[0,0],'k:');
title('Mean adjusted data with eigenvectors overlayed');

%% Step 6: Deriving the new data set
pause
%  f1 = V(:,2)'
%  V(:,find(D==max(D))); % can be used to find the maximum-variance direction.
f1 = [-0.677873399 -0.735178656];
PC1 = f1*XAdjust'
PC1 = PC1'

f2 = [-0.735178656 0.677873399];
%f2 = V(:,1)'
PC2 = f2*XAdjust';
PC2 = PC2'

F = [f1; f2];

Y = [PC1 PC2]
figure

plot(Y(:,1),Y(:,2),'r+')
axis equal
axis([-2 2 -2 2])

Cy = cov(Y);
[Vy Dy] = eig(Cy);
hold on
A = 10 * Vy';
plot([-A(1,1) A(1,1)],[-A(1,2) A(1,2)],'g:');
plot([-A(2,1) A(2,1)],[-A(2,2) A(2,2)],'b:');
title('Data transformed with 2 eigenvectors');

pause;
% Variance
figure;
hold off
bar(diag(D));
xlabel('Projection dimension');
ylabel('Variance');

OUTPUTS

(Figures: original PCA data; mean adjusted data with eigenvectors overlayed; data transformed with 2 eigenvectors; variance per projection dimension.)