26

I'm writing some notes about analysis and want to use the fact that $|x|^p$ is convex for every $p>1$ to prove Minkowski's inequality. However, I haven't written anything about derivatives or limits yet. Is there a simple way to prove this?

EDIT: The only non-trivial inequalities proved so far are the triangle inequality and the Cauchy-Schwarz inequality.

Gabriel

4 Answers

10

Update

I let the solution stand as is, even though it uses non-allowed tools. Maybe someone in the future will find it useful.

Old solution

Here is a solution using the Hölder inequality. If that one is not allowed I suggest that you specify more clearly what is and what is not allowed to use among the "standard tools".

We want to show that (here $w_1$ and $w_2$ are non-negative numbers with $w_1+w_2=1$ and $x$ and $y$ are real numbers) $$ |w_1x+w_2y|^p\leq w_1|x|^p+w_2|y|^p, $$ that is $$ |w_1x+w_2y|\leq \bigl(w_1|x|^p+w_2|y|^p\bigr)^{1/p} $$ (If any of the involved quantities is $0$ we just note that the inequality is true).

We let $q$ be the dual exponent, i.e. defined such that $1/p+1/q=1$. Then, we have, using first the triangle inequality and then the Hölder inequality, $$ \begin{aligned} |w_1x+w_2y|&\leq w_1|x|+w_2|y|\\ &=\bigl(w_1^{1/p}|x|\bigr)(w_1)^{1/q}+\bigl(w_2^{1/p}|y|\bigr)(w_2)^{1/q}\\ &\leq\bigl(w_1|x|^p+w_2|y|^p\bigr)^{1/p}\bigl(w_1+w_2\bigr)^{1/q}\\ &=\bigl(w_1|x|^p+w_2|y|^p\bigr)^{1/p}. \end{aligned} $$ Since requested, by Hölder inequality, I mean $$ a_1b_1+a_2b_2\leq(a_1^p+a_2^p)^{1/p}(b_1^q+b_2^q)^{1/q}, $$ where $$ a_1=w_1^{1/p}|x|,\quad a_2=w_2^{1/p}|y|,\quad b_1=w_1^{1/q},\quad\text{and}\quad b_2=w_2^{1/q}. $$
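As a quick numerical sanity check (not part of the argument), the final inequality can be tested on random inputs; here is a small Python sketch with an arbitrarily chosen exponent $p=2.5$:

```python
# Spot-check |w1*x + w2*y|^p <= w1*|x|^p + w2*|y|^p for p > 1
# on random weights and points (a sanity check, not a proof).
import random

random.seed(0)
p = 2.5  # any exponent p > 1 will do

for _ in range(10_000):
    w1 = random.random()
    w2 = 1.0 - w1
    x = random.uniform(-10.0, 10.0)
    y = random.uniform(-10.0, 10.0)
    lhs = abs(w1 * x + w2 * y) ** p
    rhs = w1 * abs(x) ** p + w2 * abs(y) ** p
    assert lhs <= rhs + 1e-12, (w1, x, y)

print("convexity inequality holds on all samples")
```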

Another update, in response to a request about strict convexity.

If $x$ and $y$ have different signs, then we have strict inequality in the triangle inequality. Thus, we can assume that $x$ and $y$ have the same sign.

Further, one has equality in the Hölder inequality precisely when there exists a constant $c$ such that $$ |b_1|=c|a_1|^{p-1},\quad\text{and}\quad |b_2|=c|a_2|^{p-1}. $$ In our case this simplifies to $$ w_1^{1/q}=cw_1^{(p-1)/p}|x|^{p-1},\quad\text{and}\quad w_2^{1/q}=cw_2^{(p-1)/p}|y|^{p-1}. $$ Since $1/q=(p-1)/p$ this simplifies to $$ 1=c|x|^{p-1},\quad\text{and}\quad 1=c|y|^{p-1}, $$ and hence to $|x|=|y|$. Since we assumed that $x$ and $y$ had the same sign, we conclude that we have equality precisely when $x=y$. All this implies that we have strict convexity.
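To illustrate the strictness numerically (again just a sketch, with $w_1=w_2=1/2$ and $p=2.5$ chosen for the example), the midpoint gap is positive unless $x=y$:

```python
# Midpoint gap of t -> |t|^p: positive for x != y, zero for x == y.
p = 2.5

def gap(x, y):
    """Convexity gap at the midpoint of x and y for t -> |t|^p."""
    return 0.5 * abs(x) ** p + 0.5 * abs(y) ** p - abs(0.5 * (x + y)) ** p

print(gap(1.0, 3.0) > 0)     # strict for distinct points
print(gap(-2.0, 5.0) > 0)    # strict also for mixed signs
print(gap(2.0, 2.0) == 0.0)  # equality only when x == y
```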

mickep
  • @mickep Can you explain how you apply the Hölder inequality to prove that $(w_1^{1/p}|x|)w_1^{1/q}+(w_2^{1/p}|y|)w_2^{1/q} \leq (w_1|x|^p+w_2|y|^p)^{1/p}(w_1+w_2)^{1/q}$? – Vrouvrou Feb 02 '18 at 10:11
  • Hello, @mickep. Can you please tell me how your answer shows that $|x|^p$ ($p>1$) is strictly convex? – Tan Jan 17 '21 at 07:31
  • @Gabriel +1 for this great answer. It's fair to use Hölder. Jensen → AM–GM relies on log's concavity; AM–GM → Young is simply a careful algebraic substitution; Young → Hölder is again a careful algebraic substitution with a smart observation of homogeneity. In the whole chain of thought, no differentiation is involved, so it's the most elementary possible. To define $x^p$ with an arbitrary real number $p$, both exponential and log functions are needed, and their constructions require both limits and continuity. As a result, I find using log's concavity fine---there's an elementary proof for that. – GNUSupporter 8964民主女神 地下教會 Oct 26 '23 at 23:37
8

I think yes. The composition of a convex function with an increasing convex one is still convex. More precisely, if $f$ is convex and $g$ is convex and increasing, then $(g\circ f)(x)=g\bigl(f(x)\bigr)$ is convex. Now $f(x)=|x|$ is convex by the triangle inequality, and $g(x)=x^p$ is increasing and convex; here we could use derivatives.
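The two steps of this composition argument (monotonicity of $g$ applied to convexity of $f$, then convexity of $g$) can be illustrated numerically; a sketch with $f(x)=|x|$, $g(t)=t^p$, and $p=2.5$ chosen for the example:

```python
# Check both midpoint steps of the composition argument:
#   g(f((x+y)/2)) <= g((f(x)+f(y))/2)        (f convex, g increasing)
#   g((f(x)+f(y))/2) <= (g(f(x))+g(f(y)))/2  (g convex on [0, inf))
import random

random.seed(0)
p = 2.5
f = abs
g = lambda t: t ** p

for _ in range(10_000):
    x = random.uniform(-10.0, 10.0)
    y = random.uniform(-10.0, 10.0)
    mid_f = 0.5 * (f(x) + f(y))
    assert g(f(0.5 * (x + y))) <= g(mid_f) + 1e-9
    assert g(mid_f) <= 0.5 * (g(f(x)) + g(f(y))) + 1e-9

print("both midpoint steps hold on all samples")
```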

szw1710
    How do you know $x^p$ is convex? – Wojowu Mar 23 '17 at 18:10
  • The 2nd derivative is $p(p-1)x^{p-2}$ and $p(p-1)>0\iff p<0\vee p>1$. Of course, it is easy to use derivatives here. But knowing the convexity of a power function, we don't need to use derivatives to check that $|x|^p$ is convex. – szw1710 Mar 23 '17 at 18:13
    "However I didn't wrote anything about derivatives nor limits yet" <- from the question. I assume this means one tries to avoid using derivatives... – Wojowu Mar 23 '17 at 18:14
  • Yes, I think the same. But now our task reduces to a power function. We could also check that difference quotients are increasing. There is also a nice determinantal condition of convexity; I have in mind a determinant of Vandermonde type: $g$ is convex iff $$\begin{vmatrix}1&1&1\\ x&y&z\\ g(x)&g(y)&g(z)\end{vmatrix}\ge 0$$ for any $x<y<z$. Another way is to show an affine support at any interior point of the domain. This reduces to showing that $py^{p-1}(x-y)\le x^p-y^p$ for any $x,y>0$. – szw1710 Mar 23 '17 at 18:17
  • @szw1710 And how do you prove either of those with only elementary inequalities (no Hölder, no Bernoulli, no generalized means etc)? – dxiv Mar 23 '17 at 21:33
3

Here is an elementary argument which uses only the MVT (mean value theorem).

Claim: for $x \geq 0$ we have $x^{p}=\sup \{a^{p}+(x-a)pa^{p-1}: a\geq 0\}$. To prove it, apply the MVT to $t\mapsto t^{p}$ on the interval between $a$ and $x$, considering the cases $a<x$ and $a \geq x$ separately; this gives $a^{p}+(x-a)pa^{p-1}\leq x^{p}$. To see that the supremum of the left-hand side equals $x^{p}$, just note that $a^{p}+(x-a)pa^{p-1}=x^{p}$ when $a=x$. (The idea comes from the geometric fact that a convex function is the upper envelope of its tangent lines.)

Now define $f_a(x)=a^{p}+(x-a)pa^{p-1}$. Then $f_a\bigl(\frac {x+y} {2}\bigr)=\frac {f_a(x)+f_a(y)} 2$, since $f_a$ is an affine function. Taking the supremum over $a \geq 0$ gives the desired inequality.
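The claim can be illustrated numerically by maximising the affine minorant $f_a$ over a grid of values of $a$; a small Python sketch (with $p=2.5$ and $x=1.7$ chosen arbitrarily):

```python
# Illustrate x^p = sup_{a >= 0} { a^p + (x - a) * p * a^(p-1) }
# by maximising over a grid of a-values; the sup is attained at a = x.
p = 2.5
x = 1.7

def f_a(a, x):
    """Tangent line to t -> t^p at the point a, evaluated at x."""
    return a ** p + (x - a) * p * a ** (p - 1)

best = max(f_a(k / 1000.0, x) for k in range(1, 5000))
assert abs(best - x ** p) < 1e-6
print("supremum of tangent lines matches x**p")
```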

2

I think you might consider starting with the generalised AM–GM inequality $$x^{\alpha}y^{1-\alpha} \leq \alpha x + (1-\alpha)y \qquad x,y>0, \quad 0 \leq \alpha \leq 1, \tag{1} $$ then go directly to Hölder's inequality (of which Cauchy–Schwarz is a simple special case), and then use it for Minkowski as in the answer by mickep.
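Inequality (1) is easy to spot-check numerically before proving it; a quick Python sketch over random positive inputs:

```python
# Spot-check the weighted AM-GM inequality
#   x^a * y^(1-a) <= a*x + (1-a)*y   for x, y > 0 and 0 <= a <= 1.
import random

random.seed(0)
for _ in range(10_000):
    x = random.uniform(0.01, 100.0)
    y = random.uniform(0.01, 100.0)
    a = random.random()
    assert x ** a * y ** (1 - a) <= a * x + (1 - a) * y + 1e-9

print("weighted AM-GM holds on all samples")
```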


Now, how to prove (1)? My preference would be to prove Jensen's inequality first and show the logarithm is concave using differentiation, since both of these are useful things in any analysis course, but it's not necessary, nor, I believe, what you want.

One can instead start by proving the unweighted AM–GM inequality for $n$ variables (various elementary proofs are available on the Wikipedia page, or one can simply iterate the near-triviality $x_1x_2<\left( \frac{x_1+x_2}{2} \right)^2 \iff x_1 \neq x_2$ to obtain the strict inequality for $2^n$ variables, and replace "rational" with "dyadic rational" in what follows). We then take rational approximations $q_n/n \to \alpha$, setting $q_n$ of the $x_i$ equal to $x$ and the remaining $n-q_n$ equal to $y$. The case of equality is then obtained by careful juggling with the exponents and the fact that we have the strict result for (possibly dyadic) rational exponents. For the rest of the details, the easiest place to look is Hardy, Littlewood and Pólya, Inequalities, pp. 16–18.


Another method uses the logarithm: we have the inequality $\log{y} < y-1 $ for $y \neq 1$. Depending on your definition of the logarithm, this is either trivial ($n(x^{1/n}-1)$ is decreasing in $n$, or $\int_1^x dt/t < 1 \cdot (x-1)$ for $x>1$) or a consequence of the even more obvious "$1+x<e^x$ unless $x=0$". Then, writing $A=\alpha x + (1-\alpha)y$, $$ \alpha (\log{x}-\log{A}) + (1-\alpha) (\log{y}-\log{A}) < \alpha \left(\frac{x}{A}-1\right)+(1-\alpha)\left(\frac{y}{A}-1\right) = 0, $$ and rearranging gives $$ \log{x^{\alpha}y^{1-\alpha}} < \log{A}. $$ (Cf. HLP, p. 138.) This used only the inequality above, the identities $a\log{x} = \log{(x^a)}$ and $\log{x}+\log{y}=\log{(xy)}$, and the fact that the logarithm is increasing; in particular, convexity itself is not required. The nice thing about this proof is that it extends trivially to more $x$s, and even to functions, with no modification.
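The key lemma $\log{y} < y-1$ for $y \neq 1$ can likewise be spot-checked numerically (staying a little away from $y=1$, where the two sides agree to second order):

```python
# Spot-check the lemma log(y) < y - 1 for y != 1.
import math
import random

random.seed(0)
for _ in range(10_000):
    y = random.uniform(0.01, 10.0)
    if abs(y - 1.0) > 1e-6:
        assert math.log(y) < y - 1.0

print("lemma holds on all samples")
```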

Chappers