09 Sep 2018

Machine Learning – Kernel Methods: Implementation for Support Vector Machines (SVM) for Classification (Gaussian)

Implementing the soft-margin SVM algorithm with a Gaussian kernel to classify handwritten digits (3 and 8), with step-by-step Matlab code.

Contents

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% SVM Gaussian Kernel                                                                          %
% Step by step detailed implementation for soft margin SVM using Gaussian Kernel             %
% Santosh Tirunagari                                                                         %
%                                                                                            %
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Parameters

$$k_{GAU}(x_i, x_j) = \exp\left(-\frac{\|x_i - x_j\|_2^2}{s^2}\right)$$ $$C = 1, \quad s \in \{1, 2, 4, 8, 16, 32, 64, 128\}$$
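As a quick illustration of the kernel definition, here is a minimal sketch separate from the main script (the names s, k_gau, xi and xj are illustrative only):

s = 8;                                        % one of the kernel widths listed above
k_gau = @(xi, xj) exp(-norm(xi - xj)^2/s^2);  % Gaussian kernel for two row vectors
k_gau([1 0 1], [0 1 1])                       % example evaluation: exp(-2/64)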

Datasets

The handwritten-digit data consist of a training set and a test set, each with 500 samples and 256 features. Class labels for the training and test sets are $+1$ for $3$ and $-1$ for $8$. There are 250 samples each of handwritten 3s and 8s in both the training and test sets.
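As an optional sanity check of this class balance, a small snippet to run once the label files are loaded in the steps below:

sum(usps_3_vs_8_train_y == 1)   % expected: 250 (handwritten 3s)
sum(usps_3_vs_8_train_y == -1)  % expected: 250 (handwritten 8s)
sum(usps_3_vs_8_test_y == 1)    % expected: 250
sum(usps_3_vs_8_test_y == -1)   % expected: 250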

clear all
close all
clc

Loading the data

Load the data and z-score normalize both the training and test sets using Matlab's zscore command.

load usps_3_vs_8_train_X.txt;
load usps_3_vs_8_train_y.txt;
load usps_3_vs_8_test_X.txt;
load usps_3_vs_8_test_y.txt;

Zscore normalization

Y = usps_3_vs_8_train_y;
X = zscore(usps_3_vs_8_train_X);
X_t = zscore(usps_3_vs_8_test_X);
Y_t = usps_3_vs_8_test_y;
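For reference, zscore standardizes each column (feature) to zero mean and unit standard deviation. A manual equivalent using bsxfun, shown with the throwaway names Xraw and Xn (a sketch, not part of the pipeline):

Xraw = usps_3_vs_8_train_X;
Xn = bsxfun(@rdivide, bsxfun(@minus, Xraw, mean(Xraw)), std(Xraw)); % same result as zscore(Xraw)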

Initialise vectors

result_train = [];  % training misclassification rate for each value of s
result_test = [];   % test misclassification rate for each value of s
support = [];       % percentage of training points kept as support vectors
S = [1, 2, 4, 8, 16, 32, 64, 128];  % kernel widths to evaluate
C = 1;              % soft-margin penalty

for k = 1:length(S)
%K = exp(-bsxfun(@plus,sum(X.^2,2),bsxfun(@plus,sum(X.^2,2)',-2.*X*X'))./(S(k).^2));

Compute Gaussian kernel

K = exp(-pdist2(X,X).^2/S(k).^2); % efficient way to compute the Gaussian kernel matrix
H = (Y*Y').*K;       % quadratic term of the dual objective
A = [];              % no linear inequality constraints
Aeq = Y';            % equality constraint: Y'*alpha = 0
l = zeros(500,1);    % lower bound: alpha >= 0
c = -1*ones(500,1);  % linear term (sign flipped because quadprog minimizes)
b = [];
beq = 0;
u = C*ones(500, 1);  % upper bound: alpha <= C (soft margin)

options = optimset('Algorithm','interior-point-convex','MaxIter',1500);
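For reference, these arrays encode the standard soft-margin SVM dual problem that quadprog is asked to solve:

$$\min_{\alpha}\;\tfrac{1}{2}\,\alpha^\top H\,\alpha - \mathbf{1}^\top\alpha \quad \text{s.t.} \quad Y^\top\alpha = 0,\;\; 0 \le \alpha_i \le C,$$

where $H_{ij} = y_i y_j\,k_{GAU}(x_i, x_j)$. Since quadprog minimizes, the usual maximization of $\sum_i \alpha_i - \tfrac{1}{2}\alpha^\top H \alpha$ appears with its sign flipped, which is why the linear term is c = -1*ones(500,1).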

Solve the QP with quadprog to find the alpha values

alpha = quadprog(H, c, A, b, Aeq, beq, l, u,[],options);
Minimum found that satisfies the constraints.

Snap near-boundary coefficients to the bounds

alpha(alpha < C * 0.001) = 0;         % numerically zero: not support vectors
alpha(alpha > C*0.999999999999) = C;  % numerically at the upper bound: bound support vectors
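This snapping reflects the Karush-Kuhn-Tucker conditions of the soft-margin dual (standard theory, stated here for orientation): with $f$ the decision function,

$$\alpha_i = 0 \;\Rightarrow\; y_i f(x_i) \ge 1, \qquad 0 < \alpha_i < C \;\Rightarrow\; y_i f(x_i) = 1, \qquad \alpha_i = C \;\Rightarrow\; y_i f(x_i) \le 1.$$

Only the middle group, the points lying exactly on the margin, is used to estimate the bias $b$ below.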

Get support vector indices

sv = find(alpha > 0 & alpha < C);  % indices of the margin support vectors
sv_one = zeros(500,1);
sv_one(sv,1) = 1;                  % 0/1 indicator vector for the margin support vectors

Compute bias b

b = sv_one'*(Y-((alpha.*Y)'*K')')/sum(sv_one);  % average the per-vector bias estimates over the margin support vectors
s = length(sv);
s = (s/500)*100;  % margin support vectors as a percentage of the training set
support = [support;s];
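For any margin support vector ($0 < \alpha_i < C$) the condition $y_i f(x_i) = 1$ yields one estimate of the bias; averaging these estimates over the margin set $\mathcal{S}$ gives

$$b = \frac{1}{|\mathcal{S}|}\sum_{i \in \mathcal{S}} \Big( y_i - \sum_{j=1}^{500} \alpha_j y_j\, k_{GAU}(x_j, x_i) \Big),$$

which is exactly what the vectorized expression above computes via the indicator vector sv_one.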

Decision function for training set
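Predictions come from the kernel expansion of the decision function,

$$\hat{y}(x) = \mathrm{sign}\Big( \sum_{i \in sv} \alpha_i y_i\, k_{GAU}(x_i, x) + b \Big),$$

where, as in the code, the sum runs over the support-vector indices sv found above, so only the rows X(sv,:) are needed to build the cross-kernel Ki.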

Ki = exp(-bsxfun(@plus,sum(X(sv,:).^2,2),bsxfun(@plus,sum(X.^2,2)',-2.*X(sv,:)*X'))./(S(k).^2)); % cross-kernel between support vectors and training points
temp = bsxfun(@plus,Ki'*(alpha(sv,:).*Y(sv,:)),b);
res = temp;
% thresholding
res(res>=0) = 1;
res(res<0) = -1;
r = sum(res~=Y);  % number of misclassified training samples
r = r/500;        % misclassification rate
result_train = [result_train;r];

Decision function for test set

Ki = exp(-bsxfun(@plus,sum(X(sv,:).^2,2),bsxfun(@plus,sum(X_t.^2,2)',-2.*X(sv,:)*X_t'))./(S(k).^2)); % cross-kernel between support vectors and test points
temp = bsxfun(@plus,Ki'*(alpha(sv,:).*Y(sv,:)),b);
res = temp;
% thresholding
res(res>=0) = 1;
res(res<0) = -1;
r = sum(res~=Y_t);  % number of misclassified test samples
r = r/500;          % misclassification rate
result_test = [result_test;r];
end

Plot the graphs

hut = log2(S);
h = figure;
plot(hut,result_train*10,'k-*','LineWidth',5,'MarkerSize',10); % error rates scaled by 10 for display
hold on
plot(hut,result_test*10,'r-o','LineWidth',5,'MarkerSize',10);
axis tight;
set(gca,'YTick',[0 5 10])
set(gca,'XTick',log2(S))
xlabel('log2(S)');
ylabel('Mis-classification Error');
hleg1 = legend('Train','Test');
title('Gaussian Kernel');
set(hleg1,'Location','NorthWest')
set(hleg1,'Interpreter','none')

saveas(h,'gk_msc','png')

The misclassification error on the test data is lowest at $S = 8$.

Plot the support vector count percentage

hut = log2(S);
h = figure;
plot(hut,support,'k-*','LineWidth',5,'MarkerSize',10); % percentage of margin support vectors
set(gca,'YTick',[0 50 100])
set(gca,'XTick',log2(S))
xlabel('log2(S)');
ylabel('Support Vectors (%)');
title('Gaussian Kernel');
saveas(h,'gk_sv','png')

The results show that the number of support vectors used has no clear dependence on the value of $S$.

Download Datasets

usps_3_vs_8_test_X      usps_3_vs_8_test_y     usps_3_vs_8_train_X     usps_3_vs_8_train_y

Implementation with Linear kernel

http://santosh-tirunagari.com/2018/09/04/machine-learning-kernel-methods-implementation-for-support-vector-machines-svm-for-classification-linear/

machine learning • matlab • teaching
