Support Vector Machine

Suppose dataset related to label , find and bias such that

which leads to

The distance between optimal hyperplanes is , so maximizing the margin equals to minimizing .

Primary problem : min ,

leads to

and ,

leads to

Dual problem : max ,

s.t. ,

Only support vectors have nonzero coefficients .

Approaches : sub-gradient descent and coordinate descent.