Suppose we have some data points $\{(x_i,y_i)\}$ where $x_i$ is a real number and $y_i$ is zero or one. We want to fit the logistic function to the data, where the logistic function is $\hat{y}=\frac{1}{1+e^{-(\beta_1 x +\beta_0)}}$. To do this we would like to choose $\beta_1$ and $\beta_0$ to minimize the residual sum of squares, $RSS = \sum_i (y_i-\hat{y}_i)^2$. (Never mind that cross-entropy is a better cost function.) I believe there is only a single minimum. How can I know? Does anyone know of a proof of this, or a counterexample?
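For concreteness, here is a minimal sketch of the objective described above. The data points `xs`, `ys` are made-up illustrative values, not from the question:

```python
# Sketch of the RSS objective for a logistic fit, as defined in the question.
import math

def rss(beta1, beta0, xs, ys):
    """Residual sum of squares for y_hat = 1 / (1 + exp(-(beta1*x + beta0)))."""
    total = 0.0
    for x, y in zip(xs, ys):
        y_hat = 1.0 / (1.0 + math.exp(-(beta1 * x + beta0)))
        total += (y - y_hat) ** 2
    return total

# Made-up, linearly separable toy data:
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [0, 0, 0, 1, 1]

# A steeper, well-placed sigmoid fits this toy data better than a flat one:
print(rss(10.0, -5.0, xs, ys))  # small
print(rss(0.0, 0.0, xs, ys))    # larger (y_hat = 0.5 everywhere)
```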
1 Answer
The sigmoid itself is not a convex function, and a squared loss built on the sigmoid, such as $\left( A - \frac{1}{1 + e^{-z}} \right)^2$, is not convex either.
Simply plotting the squared loss with $A = 5$ shows that it is convex for $z > 0$ and concave otherwise.
If you want a mathematical proof, take the second derivative and you'll see that it is neither strictly positive nor strictly negative.
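As a quick check of this claim, one can evaluate the analytic second derivative of $f(z) = \left(A - \frac{1}{1+e^{-z}}\right)^2$ at a couple of points (using $A = 5$ as in the answer) and see that it takes both signs:

```python
# Checking that the squared loss on a sigmoid is neither convex nor concave:
# f(z) = (A - sigmoid(z))^2, with A = 5 as chosen in the answer above.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss_second_derivative(z, A=5.0):
    """Analytic f''(z) for f(z) = (A - sigmoid(z))^2.

    Uses sigmoid' = s(1 - s) and sigmoid'' = s(1 - s)(1 - 2s).
    """
    s = sigmoid(z)
    s1 = s * (1 - s)        # first derivative of sigmoid
    s2 = s1 * (1 - 2 * s)   # second derivative of sigmoid
    return 2 * s1**2 - 2 * (A - s) * s2

# The second derivative changes sign, so f is neither convex nor concave:
print(loss_second_derivative(0.0))   # positive (locally convex)
print(loss_second_derivative(-5.0))  # negative (locally concave)
```

Because $f''$ changes sign, no convexity argument can rule out multiple local minima here; that is the point being made.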

Tu N.
-
What is the point? What does the squared residual being neither convex nor concave have to do with whether there is only one local minimum, the global minimum? – sebastianspiegel Mar 22 '16 at 02:30
-
Sorry if it's not clear: for a convex function, every local minimum is a global minimum, and for a strictly convex function there is only one global minimum. The fact that a function is convex helps a lot in an optimization procedure. If it's not convex, then it's difficult to find the minimum. – Tu N. Mar 22 '16 at 04:33
-
Ah, so I believe you are saying that we cannot use convexity to prove whether the minimum is unique or not. – sebastianspiegel Mar 22 '16 at 06:40