
In machine learning optimization problems, a "regularization" term is often added to the objective to reduce overfitting:

$$\min_{\theta} \; L(\theta) + \lambda \, \|\theta\|_2^2$$

I have noticed that in the case of L2-norm regularization, this term is a convex function, since the squared L2 norm $\|\theta\|_2^2$ is quadratic in $\theta$.

My question: in the L2-norm case, the optimization problem without this regularization term is typically non-convex, and we then add a convex term to it. Does adding a convex term to a non-convex optimization problem automatically make the problem convex?

I do not think that this is the case, seeing as:

  • Convex Optimization Problems are generally easier to solve than Non-Convex Optimization Problems

  • Anecdotally, I have heard of Regularized Loss Functions (e.g. for Neural Networks) that are considered to be "very difficult" optimization problems - even though they have this Convex Term. This informally leads me to believe that in the case of L2 Regularization, the fundamental optimization problem remains Non-Convex.

However, "anecdotal and informal logic" is generally never acceptable in understanding mathematics.

Can someone please comment on this?

Thanks!

stats_noob
  • Certainly not automatically; if lambda is small enough then your equation will be indistinguishable from the original. – Steven Stadnicki Mar 13 '22 at 18:03
  • Non-convex plus convex is non-convex. One of the main issues with the optimization problems for Neural Networks is the large amount of data and the large number of variables/constraints. But it is also true that the costs may be pretty ugly. – KBS Mar 13 '22 at 18:24
  • Thank you everyone for your replies! Much Appreciated! – stats_noob Mar 13 '22 at 19:21
  • @KBS That is a good rule of thumb but is not always true. – RobPratt Mar 13 '22 at 19:48
  • @RobPratt Yes, you are right. – KBS Mar 13 '22 at 20:22

1 Answer


Yes, adding a large enough convex term can make a problem convex. For example, consider the nonconvex function $-x^2$ and the convex function $x^2$. For constant $\lambda \ge 1$, the sum $-x^2 + \lambda x^2=(\lambda-1)x^2$ is convex.
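To make the dependence on $\lambda$ concrete, here is a minimal sketch (not part of the original answer) that checks convexity of $-x^2 + \lambda x^2$ via its constant second derivative $2(\lambda - 1)$:

```python
# Illustrative sketch: f(x) = -x^2 + lam * x^2 has constant second
# derivative 2 * (lam - 1), so it is convex exactly when lam >= 1.
def second_derivative(lam: float) -> float:
    return 2.0 * (lam - 1.0)

for lam in (0.5, 1.0, 2.0):
    verdict = "convex" if second_derivative(lam) >= 0.0 else "non-convex"
    print(f"lambda = {lam}: {verdict}")
```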

This is also a standard trick in binary quadratic programming, where $x_i$ is a binary decision variable and the objective is to minimize the multivariate quadratic function $$\sum_i \sum_j q_{ij} x_i x_j + \sum_i c_i x_i$$ subject to linear constraints. Let $\lambda$ be the absolute value of the smallest (negative) eigenvalue of $Q=(q_{ij})$. Then adding $\lambda \sum_i (x_i^2-x_i)$, which is $0$ when each $x_i$ is binary, makes the objective function convex.
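As a rough illustration of that eigenvalue shift (a sketch using an assumed small random symmetric $Q$, not anything from the original post), one can check numerically that shifting the diagonal by $\lambda$ makes the quadratic form positive semidefinite:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))
Q = (A + A.T) / 2                 # assumed symmetric, generally indefinite Q

# lambda = |smallest (negative) eigenvalue|, or 0 if Q is already PSD
lam = abs(min(np.linalg.eigvalsh(Q).min(), 0.0))

# Adding lam * (x_i^2 - x_i) for every i shifts the quadratic part by lam * I
# (and the linear part by -lam); on binary x the added terms are all 0.
Q_shifted = Q + lam * np.eye(5)

print("original eigenvalues:", np.linalg.eigvalsh(Q))
print("shifted eigenvalues: ", np.linalg.eigvalsh(Q_shifted))  # all >= 0
```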

RobPratt
  • @RobPratt: Thank you so much for your answer! How "large" is "large enough"? In the case of the L2 norm regularization problem that I posted, does the L2 norm regularization term automatically make the optimization problem convex? Thank you so much! – stats_noob Mar 13 '22 at 19:50
  • If you have time - could you please take a look at this related question over here? https://math.stackexchange.com/questions/4402076/how-can-the-loss-functions-of-neural-networks-be-non-convex Thank you so much! – stats_noob Mar 13 '22 at 19:51
  • Is there a typo on the binary QP addition term, i.e., it should be the negative of what's there, i.e., need to be adding positive number to the diagonal? – Mark L. Stone Mar 15 '22 at 17:19
  • @MarkL.Stone Yes, corrected. Thanks! – RobPratt Mar 15 '22 at 17:52