PCA is a useful statistical technique that has found application in fields such as face recognition and image compression, and it is a common method for finding patterns in high-dimensional data. PCA identifies patterns in data and expresses the data in a way that highlights their similarities and differences. Since patterns can be hard to spot in high dimensions, where the luxury of graphical representation is not available, PCA is a powerful tool for analysing such data.
The other main advantage of PCA is that, once you have found these patterns, you can compress the data, i.e. reduce the number of dimensions, without much loss of information. This technique is used in image compression, as we will see in a later section.
- Get some data
- Subtract the mean
- Calculate the covariance matrix
- Calculate the eigenvectors and eigenvalues of the covariance matrix
- Choose components and form a feature vector
- Derive the new data set (a compact sketch of all six steps follows this list)
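Condensed, the six steps amount to only a few lines of Matlab. The sketch below is a minimal illustration, not part of the demo that follows; it assumes a data matrix X whose rows are observations, and its variable names are placeholders:

% Minimal PCA sketch: any data matrix whose rows are observations
X = randn(10, 2);                            % step 1: get some data
XAdj = X - repmat(mean(X), size(X,1), 1);    % step 2: subtract the mean
C = cov(X);                                  % step 3: covariance matrix
[V, D] = eig(C);                             % step 4: eigenvectors and eigenvalues
[vals, idx] = sort(diag(D), 'descend');      % order components by variance
F = V(:, idx)';                              % step 5: feature vector (rows = components)
Y = (F * XAdj')';                            % step 6: derive the new data set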
Matlab Code
clear all, clc, close all

%% Step 1: Get some data
X = [2.5 2.4; 0.5 0.7; 2.2 2.9; 1.9 2.2; 3.1 3.0; 2.3 2.7; 2 1.6; 1 1.1; 1.5 1.6; 1.1 0.9];
axis equal
axis([-1 4 -1 4])
hold on
plot(X(:,1), X(:,2), 'k+');
pause
plot([0 0], [-1 4], 'k:');
plot([-1 4], [0 0], 'k:');
title('Original PCA Data');

%% Step 2: Subtract the mean
% XAdjust = X - repmat(mean(X), size(X,1), 1)   % one-line alternative
m = mean(X);
p1 = X(:,1) - m(1,1);
p2 = X(:,2) - m(1,2);
XAdjust = [p1 p2];

%% Step 3: Calculate the covariance matrix
CM = cov(X);   % cov subtracts the mean itself, so cov(X) equals cov(XAdjust)

%% Step 4: Calculate the eigenvectors and eigenvalues of the covariance matrix
[V D] = eig(CM);

%% Step 5: Choose components and form a feature vector
pause
plot(XAdjust(:,1), XAdjust(:,2), 'r+');
pause
A = 10*V';   % transpose so each row of A is an eigenvector direction
plot([-A(1,1) A(1,1)], [-A(1,2) A(1,2)], 'b-.')
plot([-A(2,1) A(2,1)], [-A(2,2) A(2,2)], 'b-.')
pause
% close all
figure
axis equal
axis([-2 2 -2 2])
hold on
plot(XAdjust(:,1), XAdjust(:,2), 'r+');
plot([-A(1,1) A(1,1)], [-A(1,2) A(1,2)], 'b-.');
plot([-A(2,1) A(2,1)], [-A(2,2) A(2,2)], 'b-.');
plot([0 0], [-2 2], 'k:');
plot([-2 2], [0 0], 'k:');
title('Mean adjusted data with eigenvectors overlayed');

%% Step 6: Derive the new data set
pause
% f1 = V(:,2)'              % eig returns eigenvalues in ascending order here
% V(:,find(D==max(D)));     % can be used to find the direction of maximum variance
f1 = [-0.677873399 -0.735178656];
PC1 = f1*XAdjust'
PC1 = PC1'
f2 = [-0.735178656 0.677873399];   % f2 = V(:,1)'
PC2 = f2*XAdjust';
PC2 = PC2'
F = [f1; f2];
Y = [PC1 PC2]
figure
plot(Y(:,1), Y(:,2), 'r+')
axis equal
axis([-2 2 -2 2])
Cy = cov(Y);
[Vy Dy] = eig(Cy);
hold on
A = 10*Vy';
plot([-A(1,1) A(1,1)], [-A(1,2) A(1,2)], 'g:');
plot([-A(2,1) A(2,1)], [-A(2,2) A(2,2)], 'b:');
title('Data transformed with 2 eigenvectors');
pause;

% Variance of each principal component
figure; hold off
bar(diag(D));
xlabel('Projection dimension');
ylabel('Variance');
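The script keeps both eigenvectors, so Y is only a rotation of the mean-adjusted data. To see the compression promised earlier, project onto PC1 alone and reconstruct. The snippet below is not part of the original script; it is a sketch that reuses the variables f1, PC1, m and X, assuming the script above has just been run in the same workspace:

% Rank-1 reconstruction: approximate the data from PC1 alone
XApprox = PC1 * f1 + repmat(m, size(X,1), 1);   % project back and re-add the mean
figure
plot(X(:,1), X(:,2), 'k+'); hold on
plot(XApprox(:,1), XApprox(:,2), 'ro');
axis equal
title('Original data (+) and reconstruction from PC1 only (o)');

The reconstructed points all lie on the PC1 line yet stay close to the originals, which shows the "fewer dimensions without much loss of information" claim in numbers. As a cross-check, the Statistics Toolbox function princomp (pca in newer releases) returns the same components up to sign.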