
Suppose we have data points $\{(x_i,y_i)\}$, where each $x_i$ is a real number and each $y_i$ is zero or one. We want to fit the logistic function $\hat{y}=\frac{1}{1+e^{-(\beta_1 x +\beta_0)}}$ to the data by choosing $\beta_1$ and $\beta_0$ to minimize the residual sum of squares, $RSS = \sum_i (y_i-\hat{y}_i)^2$ (never mind that cross-entropy is a better cost function). I believe there is only a single minimum. How can I know? Does anyone know of a proof of this, or a counterexample?
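For concreteness, here is a minimal sketch of the objective being minimized; the toy data and parameter values are made up for illustration:

```python
import math

def rss(beta1, beta0, xs, ys):
    # residual sum of squares for the logistic model
    # y_hat = 1 / (1 + exp(-(beta1 * x + beta0)))
    total = 0.0
    for x, y in zip(xs, ys):
        y_hat = 1.0 / (1.0 + math.exp(-(beta1 * x + beta0)))
        total += (y - y_hat) ** 2
    return total

# toy data (illustrative only)
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [0, 0, 0, 1, 1]
print(rss(1.0, 0.0, xs, ys))
```

The question is whether this function of $(\beta_1, \beta_0)$ can have more than one local minimum.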

sebastianspiegel

1 Answer


The sigmoid itself is not a convex function (see this), and a squared loss based on the sigmoid, such as $ \left( A - \frac{1}{1 + e^{-z}} \right)^2 $, is not convex either.

Simply plotting the squared loss with $A = 5$ shows that it is convex for $z > 0$ and concave otherwise:

[plot of the squared loss $\left( 5 - \frac{1}{1 + e^{-z}} \right)^2$]

If you want a mathematical proof, take the second derivative and you will see that it is neither everywhere positive nor everywhere negative.
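A quick numerical check of that claim (a sketch, assuming $A = 5$ as in the plot above): approximate the second derivative by central finite differences and observe that its sign flips across $z = 0$.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(z, A=5.0):
    # squared loss between target A and sigmoid output
    return (A - sigmoid(z)) ** 2

def second_derivative(z, h=1e-4):
    # central finite-difference approximation of loss''(z)
    return (loss(z + h) - 2.0 * loss(z) + loss(z - h)) / h**2

# curvature changes sign: positive (convex) for z > 0,
# negative (concave) for z < 0
print(second_derivative(2.0))
print(second_derivative(-2.0))
```

Since the second derivative takes both signs, the loss is neither convex nor concave on the whole real line.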

Tu N.
  • What is the point? What does the residual squared being neither convex nor concave have to do with whether there is only one local minimum, the global minimum? – sebastianspiegel Mar 22 '16 at 02:30
  • Sorry if it wasn't clear: if a function is convex, every local minimum is a global minimum, and if it is strictly convex there is exactly one global minimum. Convexity helps a lot in optimization procedures; if the function is not convex, finding minima is difficult. – Tu N. Mar 22 '16 at 04:33
  • Ah, so I believe you are saying that we cannot use convexity to prove whether a minimum is unique or not. – sebastianspiegel Mar 22 '16 at 06:40