## Implementing the soft-margin SVM algorithm with a Gaussian kernel for classification of handwritten digits (3 vs. 8), with step-by-step Matlab code

## Contents

- Parameters
- Datasets
- Loading the data
- Zscore normalization
- Initialise vectors
- Compute Gaussian kernel
- Solve quadprog to find alpha values
- Calculate near boundary coefficients
- Get support vector indices
- Compute bias b
- Decision function for training set
- Decision function for test set
- Plot the graphs
- Plot the support vector count percentage

```
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% SVM Gaussian Kernel
% Step-by-step detailed implementation of soft-margin SVM using a Gaussian kernel
% Santosh Tirunagari
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
```

## Parameters

The kernel bandwidth $$S$$ is varied over powers of two from 1 to 128, and the soft-margin penalty is fixed at $$C = 1$$ (both are set below under "Initialise vectors").

## Datasets

The handwritten-digit data consist of a training set and a test set, each with 500 samples and 256 features. Class labels for the training and test sets are denoted as $$+1$$ and $$-1$$. Each set contains 250 samples of handwritten 3s and 250 of handwritten 8s.
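For reference, this is the standard soft-margin SVM formulation being trained here, with $$\phi$$ the feature map induced by the Gaussian kernel and $$n = 500$$:

$$\min_{w,\,b,\,\xi}\;\; \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.} \quad y_i\big(w^\top \phi(x_i) + b\big) \ge 1 - \xi_i,\quad \xi_i \ge 0.$$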

```
clear all; close all; clc;
```

## Loading the data

Load the four data files, then z-score normalize both the training and test inputs using the `zscore` Matlab command.

```
load usps_3_vs_8_train_X.txt;
load usps_3_vs_8_train_y.txt;
load usps_3_vs_8_test_X.txt;
load usps_3_vs_8_test_y.txt;
```

## Zscore normalization

```
Y = usps_3_vs_8_train_y;
X = zscore(usps_3_vs_8_train_X);
X_t = zscore(usps_3_vs_8_test_X);
Y_t = usps_3_vs_8_test_y;
```
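As a cross-check on the normalization step, here is a minimal NumPy sketch of the same transform. The `zscore` function below is an illustrative stand-in for Matlab's `zscore`, which standardizes each column using the sample standard deviation:

```python
import numpy as np

def zscore(X):
    """Column-wise standardization, matching Matlab's zscore (sample std, ddof=1)."""
    return (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

# Toy matrix standing in for the 500x256 USPS feature matrices
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])
Z = zscore(X)  # each column now has mean 0 and sample standard deviation 1
```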

## Initialise vectors

```
result_train = [];
result_test = [];
support = [];
```

```
S = [1, 2, 4, 8, 16, 32, 64, 128]; % kernel bandwidth values to sweep
C = 1;                             % soft-margin penalty
for k = 1:numel(S)
```

```
% Alternative: compute the Gaussian kernel explicitly without pdist2
%K = exp(-bsxfun(@plus,sum(X.^2,2),bsxfun(@plus,sum(X.^2,2)',-2.*X*X'))./(S(k).^2));
```

## Compute Gaussian kernel

```
K = exp(-pdist2(X,X).^2/S(k).^2); % efficient way to compute the Gaussian kernel matrix
H = (Y*Y').*K;                    % quadratic term of the dual objective
c = -1*ones(500,1);               % linear term of the dual objective
A = []; b = [];                   % no linear inequality constraints
Aeq = Y'; beq = 0;                % equality constraint: sum(alpha.*Y) = 0
l = zeros(500,1);                 % lower bound: alpha >= 0
u = C*ones(500, 1);               % upper bound: alpha <= C
options = optimset('Algorithm','interior-point-convex','MaxIter',1500);
```
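The same kernel matrix can be sketched in NumPy without `pdist2`, mirroring the commented-out `bsxfun` expression. The `gaussian_kernel` name is illustrative; the bandwidth convention $$\exp(-\|x_i-x_j\|^2/S^2)$$ follows the Matlab code above:

```python
import numpy as np

def gaussian_kernel(X1, X2, S):
    """K[i, j] = exp(-||X1[i] - X2[j]||^2 / S^2), the convention used in the Matlab code."""
    sq = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :]
          - 2.0 * X1 @ X2.T)
    sq = np.maximum(sq, 0.0)  # clip tiny negatives caused by floating-point round-off
    return np.exp(-sq / S**2)

X = np.random.default_rng(0).normal(size=(5, 3))
K = gaussian_kernel(X, X, S=2.0)  # 5x5 symmetric matrix with unit diagonal
```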

## Solve quadprog to find alpha values

```
alpha = quadprog(H, c, A, b, Aeq, beq, l, u, [], options);
```

`quadprog` reports: *Minimum found that satisfies the constraints.*
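`quadprog` minimizes $$\frac{1}{2}\alpha^\top H \alpha + c^\top \alpha$$ subject to the bounds and the equality constraint set up above. With $$H = (yy^\top)\circ K$$ and $$c = -\mathbf{1}$$, this is exactly the soft-margin SVM dual:

$$\max_{\alpha}\;\; \sum_{i=1}^{n}\alpha_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\alpha_i\alpha_j\, y_i y_j\, K(x_i,x_j) \quad \text{s.t.} \quad 0 \le \alpha_i \le C,\quad \sum_{i=1}^{n}\alpha_i y_i = 0.$$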

## Calculate near boundary coefficients

```
alpha(alpha < C*0.001) = 0;          % snap near-zero multipliers to exactly 0
alpha(alpha > C*0.999999999999) = C; % snap near-C multipliers to the bound
```

## Get support vector indices

```
sv = find(alpha > 0 & alpha < C); % indices of the free support vectors
sv_one = zeros(500,1);
sv_one(sv,1) = 1;                 % indicator vector of the free support vectors
```

## Compute bias b

```
b = sv_one'*(Y-((alpha.*Y)'*K')')/sum(sv_one); % average residual over the free support vectors
s = length(sv);
s = (s/500)*100;                               % support vector count as a percentage
support = [support;s];
```
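The bias computed above averages, over the set $$\mathcal{S}$$ of free support vectors (those with $$0 < \alpha_i < C$$), the residual between the label and the kernel expansion:

$$b = \frac{1}{|\mathcal{S}|}\sum_{i\in\mathcal{S}}\Big(y_i - \sum_{j=1}^{n}\alpha_j\, y_j\, K(x_j, x_i)\Big).$$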

## Decision function for training set

```
Ki = exp(-bsxfun(@plus,sum(X(sv,:).^2,2),bsxfun(@plus,sum(X.^2,2)',-2.*X(sv,:)*X'))./(S(k).^2));
temp = bsxfun(@plus,Ki'*(alpha(sv,:).*Y(sv,:)),b);
res = temp;
% thresholding
res(res>=0) = 1;
res(res<0) = -1;
r = sum(res~=Y);  % number of misclassified training samples
r = r/500;        % training error rate
result_train = [result_train;r];
```
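Written out, the thresholding above implements the SVM decision function:

$$f(x) = \operatorname{sign}\Big(\sum_{i \in \mathcal{SV}} \alpha_i\, y_i\, K(x_i, x) + b\Big),$$

where the sum runs only over the support vectors, which is why `Ki` is built from `X(sv,:)`.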

## Decision function for test set

```
Ki = exp(-bsxfun(@plus,sum(X(sv,:).^2,2),bsxfun(@plus,sum(X_t.^2,2)',-2.*X(sv,:)*X_t'))./(S(k).^2));
temp = bsxfun(@plus,Ki'*(alpha(sv,:).*Y(sv,:)),b);
res = temp;
% thresholding
res(res>=0) = 1;
res(res<0) = -1;
r = sum(res~=Y_t);  % number of misclassified test samples
r = r/500;          % test error rate
result_test = [result_test;r];
```
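Both decision-function blocks follow the same pattern: the kernel between the support vectors and the evaluation points, weighted by $$\alpha_i y_i$$, plus the bias, then thresholded at zero. A minimal NumPy sketch of that pattern (the function names and toy data are illustrative, not part of the original code):

```python
import numpy as np

def gaussian_kernel(X1, X2, S):
    """Same bandwidth convention as the Matlab code: exp(-||x1 - x2||^2 / S^2)."""
    sq = (np.sum(X1**2, axis=1)[:, None]
          + np.sum(X2**2, axis=1)[None, :] - 2.0 * X1 @ X2.T)
    return np.exp(-np.maximum(sq, 0.0) / S**2)

def svm_predict(X_sv, alpha_sv, y_sv, b, X_new, S):
    """sign(sum_i alpha_i y_i K(x_i, x) + b), summing over the support vectors only."""
    f = gaussian_kernel(X_sv, X_new, S).T @ (alpha_sv * y_sv) + b
    return np.where(f >= 0, 1.0, -1.0)

# Toy example: two support vectors of opposite class
X_sv = np.array([[0.0, 0.0], [4.0, 4.0]])
alpha_sv = np.array([0.5, 0.5])
y_sv = np.array([1.0, -1.0])
X_new = np.array([[0.1, 0.1], [3.9, 3.9]])  # one point near each support vector
preds = svm_predict(X_sv, alpha_sv, y_sv, b=0.0, X_new=X_new, S=1.0)
```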

```
end
```

## Plot the graphs

```
hut = log2(S);
h = figure;
plot(hut,result_train*10,'k-*','LineWidth',5,'MarkerSize',10);
hold on
plot(hut,result_test*10,'r-o','LineWidth',5,'MarkerSize',10);
axis tight;
set(gca,'YTick',[0 5 10])
set(gca,'XTick',log2(S))
xlabel('log2(S)');
ylabel('Mis-classification Error');
hleg1 = legend('Train','Test');
title('Gaussian Kernel');
set(hleg1,'Location','NorthWest')
set(hleg1,'Interpreter','none')
saveas(h,'gk_msc','png')
```

### The misclassification error on the test data is lowest at $$S=8$$.

## Plot the support vector count percentage

```
hut = log2(S);
h = figure;
plot(hut,support,'k-*','LineWidth',5,'MarkerSize',10);
% set(gca,'YTick',[0 50 100])
set(gca,'XTick',log2(S))
xlabel('log2(S)');
ylabel('Support Vector %');
title('Gaussian Kernel');
saveas(h,'gk_sv','png')
```

#### The results show that the percentage of support vectors used does not vary systematically with the value of S.

## Download Datasets

- usps_3_vs_8_test_X
- usps_3_vs_8_test_y
- usps_3_vs_8_train_X
- usps_3_vs_8_train_y