
When using support vector machines (SVMs), we encounter three types of lines (in the 2D case): one is the decision boundary and the other two are the margins:

[Figure: decision boundary of an SVM with its two margin lines]

Why do we use $+1$ and $-1$ as the values after the $=$ sign while writing the equations for the SVM margins? What's so special about $1$ in this case?

For example, if $x$ and $y$ are two features, then the decision boundary is $ax+by+c=0$. Why are the two marginal boundaries written as $ax+by+c=+1$ and $ax+by+c=-1$?
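To make the setup concrete, here is a quick numeric check (a sketch, not part of the original question; it assumes NumPy and scikit-learn): fit a nearly hard-margin linear SVM on separable data and evaluate $w\cdot x+b$ at the support vectors; they land on the two margin lines, i.e. the value is roughly $\pm 1$.

```python
# Sketch: verify numerically that support vectors of a (nearly) hard-margin
# linear SVM satisfy w·x + b ≈ ±1, i.e. they lie on the two margin lines.
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
# Two linearly separable 2-D blobs (the two features x and y from the question)
X_pos = rng.randn(20, 2) + np.array([2.0, 2.0])
X_neg = rng.randn(20, 2) + np.array([-2.0, -2.0])
X = np.vstack([X_pos, X_neg])
y = np.array([1] * 20 + [-1] * 20)

clf = SVC(kernel="linear", C=1e6)   # very large C ≈ hard margin
clf.fit(X, y)

w, b = clf.coef_[0], clf.intercept_[0]
# Decision boundary: w·x + b = 0; margins: w·x + b = ±1
print("w·x + b at the support vectors:")
print(X[clf.support_] @ w + b)      # each value ≈ +1 or -1 when the data are separable
```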

user1825567

2 Answers


It's important for the optimization formulation of the SVM that $y_i \in \{-1,1\}$, which is why it makes sense to also have the output $y \in \{-1,1\}$. If we look at the soft-margin linear SVM, we want to minimize:

$\left[\frac{1}{n}\sum_{i=1}^n\max{(0,1-y_i(w\cdot x_i+b))}\right]+\lambda\| w\| ^2$

The label $y_i$ is either $+1$ or $-1$, which flips the sign of $w\cdot x_i+b$ inside the hinge loss, so the same term penalizes points of either class that end up on the wrong side of their margin.
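As an illustration (a minimal sketch assuming NumPy, not part of the original answer), the objective above can be written out directly; it also shows why the labels have to be $\pm 1$: if $y_i$ were $0$, the hinge term would be $\max(0,1-0)=1$ no matter what the model predicts, which is the point made in the comment below.

```python
# Sketch: the soft-margin objective from the answer,
#   (1/n) * sum_i max(0, 1 - y_i (w·x_i + b)) + lambda * ||w||^2
# Labels must be in {-1, +1}; with y_i = 0 the hinge term is
# max(0, 1 - 0) = 1 regardless of the prediction.
import numpy as np

def soft_margin_objective(w, b, X, y, lam):
    margins = y * (X @ w + b)               # signed margins y_i (w·x_i + b)
    hinge = np.maximum(0.0, 1.0 - margins)  # hinge loss per point
    return hinge.mean() + lam * np.dot(w, w)

# Toy example: one point per class, both outside the margin -> hinge = 0
X = np.array([[2.0, 0.0], [-2.0, 0.0]])
y = np.array([1.0, -1.0])
w = np.array([1.0, 0.0]); b = 0.0
print(soft_margin_objective(w, b, X, y, lam=0.01))
```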

Jan van der Vegt
  • I think you didn't understand my question. I have given some more details in my question. Please see it – user1825567 May 31 '17 at 17:19
  • 1
    I think he understood it. $y$ is not a feature; it's the response, and it was chosen the way it is for the loss function to have the tractable form given above. Try substituting $y=0$ as you suggested and examining the effect of changing the prediction $w\cdot x + b$. – Emre May 31 '17 at 17:28

This is just mathematical convenience.

Suppose we have $W\cdot X+b \ge k$ for positive points and $W\cdot X+b \le -k$ for negative points, for some $k>0$. Dividing both inequalities by $k$ gives $W'\cdot X+b' \ge 1$ and $W'\cdot X+b' \le -1$, where $W'= W/k$ and $b'= b/k$. This rescaling changes neither the decision boundary nor the optimization problem for $W$ and $b$, so we may as well fix the margin value at $1$.
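A small numeric sketch of the rescaling argument (assuming NumPy; the parameter values are hypothetical): dividing $W$ and $b$ by $k$ leaves the sign of $W\cdot X+b$, and hence every prediction, unchanged, while the margin constraints become $\ge 1$ and $\le -1$.

```python
# Sketch: rescaling (W, b) by 1/k moves the margin value from k to 1 but
# leaves the sign of W·x + b, i.e. the predicted class, unchanged everywhere.
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(100, 2)

W = np.array([3.0, -1.5]); b = 0.75; k = 2.5   # hypothetical parameters
W_scaled, b_scaled = W / k, b / k

preds = np.sign(X @ W + b)
preds_scaled = np.sign(X @ W_scaled + b_scaled)
print(np.array_equal(preds, preds_scaled))      # True: same classifier, margin now at ±1
```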

Rui Li