PCA is a way of identifying patterns in data, and expressing the data in such a way as to highlight their similarities and differences. Below are the steps of the algorithm:
close all;
clear all;
clc;
%
data = load('Data/eeg.mat');
data = data.data{1,2};
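The load above assumes the EEG recording Data/eeg.mat from the original experiment. Readers without that file can substitute any numeric matrix with at least 32 rows and 6 columns for data; the snippet below is only an illustrative placeholder, not part of the experiment.
% Illustrative placeholder in place of the EEG file:
% 128 samples of 8 channels of scaled Gaussian noise (any matrix >= 32x6 works).
data = 50 * randn(128, 8);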
Step 1
– Initialize the dataset: 6 vectors of 32 samples each.
%
% Step 1 - Initialize the dataset, 6 vectors of 32 sample data
%
X = data(1:32,1:6);
Step 2
– Subtract the mean from each of the data dimensions. The mean subtracted is the average across each dimension.
Y = X - 1*m', where 1 is a column vector of ones and m is the vector of column means of X.
%
% Step 2 - Subtract the mean from each of the data dimensions.
%   - The mean subtracted is the average across each dimension.
%
Y = X - repmat(mean(X),size(X,1),1);
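A quick sanity check on the centring, using the same variables as above: the column means of Y should now be zero up to floating-point error.
% Column means of the centred data should be numerically zero.
max(abs(mean(Y)))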
Step 3
– Calculate the covariance matrix of Y
%
% Step 3 - Calculate the covariance matrix of Y
%
C = cov(Y);
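Because Y is already mean-centred, cov(Y) coincides with the scatter matrix Y'*Y normalised by n-1, which is easy to confirm (n is introduced here only for the check):
% For mean-centred Y, cov(Y) equals Y'*Y/(n-1).
n = size(Y, 1);
max(max(abs(C - (Y' * Y) / (n - 1))))   % numerically zero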
Step 4
– Calculate the eigenvectors and eigenvalues of the covariance matrix C.
%
% Step 4 - Calculate the eigenvectors and eigenvalues of the covariance matrix C.
%
[eigenvectors, lambda] = eig(C, 'vector');
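Each column of eigenvectors pairs with the corresponding entry of lambda, which can be verified against the defining relation C*v = lambda*v:
% Verify the decomposition: C*V should equal V*diag(lambda), up to round-off.
max(max(abs(C * eigenvectors - eigenvectors * diag(lambda))))   % numerically zero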
Step 5
– Sort the eigenvectors by eigenvalue, highest to lowest.
%
% Step 5 - Sort the eigenvectors by eigenvalue, highest to lowest.
%
[lambda, order] = sort(lambda, 'descend');
eigenvectors = eigenvectors(:, order);
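A one-line check that the ordering is correct:
% Sanity check: eigenvalues are now in non-increasing order.
all(diff(lambda) <= 0)   % returns logical 1 (true)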
Step 6
– Select eigenvectors and form a feature matrix. Keep enough eigenvectors to explain at least 97% of the total variance.
F = [e1 e2 ... ek], where e1, ..., ek are the eigenvectors with the k largest eigenvalues and k depends on the level of compression.
%
% Step 6 - Select eigenvectors and form a feature matrix.
%   - Keep enough components to explain at least 97% of the variance.
%
percentage = 0;
until = 0;
while percentage < 97
    percentage = sum(lambda(1:until+1)) * 100 / sum(lambda);
    until = until + 1;
end
features = eigenvectors(:, 1:until);
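An equivalent, loop-free way to arrive at the same count uses the cumulative sum of the sorted eigenvalues; the variable k below is introduced only for this check and should match until:
% Loop-free equivalent: smallest number of components whose eigenvalues
% account for at least 97% of the total variance.
k = find(cumsum(lambda) / sum(lambda) >= 0.97, 1);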
Step 7
– Derive the new data set. We take the transpose of the feature matrix and multiply it on the left of the transposed mean-adjusted data set: Xn = (F' * Y')'.
%
% Step 7 - Derive the new data set.
%   - We take the transpose of the feature matrix and multiply it on the left
%     of the transposed mean-adjusted data set.
%
Xn = (features' * Y')';
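A useful property to check at this point: the columns (components) of Xn are uncorrelated, so cov(Xn) is numerically diagonal, with the retained eigenvalues on its diagonal.
% The projected components are uncorrelated: cov(Xn) is diagonal and its
% diagonal holds the retained eigenvalues.
Cn = cov(Xn);
max(max(abs(Cn - diag(lambda(1:until)))))   % numerically zero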
Step 8
– Get the original data back. We take the feature matrix and multiply it on the left of the transposed PCA-transformed data set Xn. This produces the Y matrix; to get the actual X data we then add the mean of each vector back (see Step 2). If fewer than all eigenvectors were kept in Step 6, this recovers only an approximation of X.
Xr = (F * Xn')' + 1*m', where F is the feature matrix from Step 6 and m is the vector of column means subtracted in Step 2.
% Extra
% Step 8 - Get the original data back.
%   - We take the feature matrix and multiply it on the left
%     of the transposed PCA-transformed data set Xn. This produces the Y
%     matrix. To get the actual X data we then add the mean of
%     each vector back (see Step 2).
%
Xr = (features * Xn')' + repmat(mean(X),size(X,1),1);
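The quality of the recovery can be quantified with a relative reconstruction error; it is numerically zero when all six eigenvectors are kept and grows as more components are discarded.
% Relative reconstruction error of the recovered data.
reconstruction_error = norm(X - Xr, 'fro') / norm(X, 'fro')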
Step 9
– Plotting.
% Extra
% Step 9 - Plotting.
%
figure;
subplot(2,3,1); plot(X); ylim([-100 100]);
xlabel('X samples'); ylabel('vectors');
subplot(2,3,2); bar(mean(X)); ylim([0 35]);
xlabel('vectors of X'); ylabel('Mean of each vector');
subplot(2,3,3); bar([mean(mean(X)),mean(mean(Y))]); ylim([0 20]);
xlabel('X Y'); ylabel('mean(mean(X)),mean(mean(Y))');
subplot(2,3,4); plot(Y); ylim([-100 100]);
xlabel('Y samples'); ylabel('vectors');
subplot(2,3,5); bar(mean(Y)); ylim([0 35]);
xlabel('vectors of Y'); ylabel('Mean of each vector (almost 0)');
subplot(2,3,6); bar([mean(mean(Y)),mean(mean(X))]); ylim([0 20]);
xlabel('Y X'); ylabel('mean(mean(Y)),mean(mean(X))');

figure;
subplot(4,1,1); plot(X); ylim([-100 100]);
xlabel('X samples'); ylabel('vectors');
subplot(4,1,2); plot(Y); ylim([-100 100]);
xlabel('Y samples'); ylabel('vectors');
subplot(4,1,3); plot(Xn); ylim([-100 100]);
xlabel('Xn (aka PCA(X)) samples'); ylabel('vectors');
subplot(4,1,4); plot(Xr); ylim([-100 100]);
xlabel('Recovered Xr (aka recover_PCA(PCA(X),features)) samples'); ylabel('vectors');
Results And Figures