Finding a compact set where to use Weierstrass theorem with the distance function from a fixed point to the graph of a convex function

Question

Let $f : \mathbb{R}^M\to \mathbb{R}$ be convex. Fix $x \in \mathbb{R}^M$ and $t \in \mathbb{R}$ with $t < f(x)$. (This assumption may or may not be needed here, because this question is just the initial part of a longer proof of another fact, namely that $f$ is convex iff $f= \sup\{\varphi :\mathbb{R}^M\to \mathbb{R} \mid \varphi \le f,\ \varphi \text{ affine}\}$.)

We claim that there exists a point $Q$ on the graph of $f$ at which the distance from $P=(x,t)$ to the graph is attained: i.e., writing $S$ for the graph of $f$, $\exists Q\in S$ s.t. $d(P,S)=\inf\{d(P,R) : R \in S \}=d(P,Q)$

My lecturer said I should use Weierstrass theorem and the distance function

I am trying to fill in the details:

The function $d(Q)=|P-Q|$, $Q\in \mathbb{R}^{M+1}$, is continuous. If I could find a compact subset $K$ of Graph$(f)$, I could conclude by the Weierstrass theorem that the distance function attains a minimum over this subset. I am not sure how to define this subset. Using the assumption that $t < f(x)$, I could take the closed ball $\bar B(P,r)$ of radius $r=f(x)-t$ around $P=(x,t)$, and take $K:= \bar B(P,r)\cap$ Graph$(f)$.

Would this do the job? Is there anything I am missing to make this rigorous? I am not sure how to justify that this set is compact, since the graph of $f$ is not. How do I choose a compact subset of the graph large enough for this to make sense?

The problem is that the $K$ constructed is compact, but the point in $K$ minimizing the distance to $(x,t)$ may not be the same as the point in the graph of $f$ which minimizes the distance to $(x,t)$. Instead, you're very close and just need to adjust the value of $r$. To give you the answer, let $D$ be the infimum of all distances from points of the graph to $(x,t)$(This is a finite number because it's at most $f(x)-t$). If you consider the ball of radius $2D$ around $(x,t)$ intersected with the graph of $f$, then that is going to contain a minimizer. — Sarvesh Ravichandran Iyer, May 13 '23 at 13:20
Also, the graph of $f$ is compact as long as you restrict the domain of $f$ to a compact interval. For that, see here. (Alternately, you can show that the graph is closed by continuity of $f$, and bounded by the compactness of the domain of $f$ and continuity of $f$. Then Heine Borel does the rest). Also, if you could mention the entire "longer fact", I'd be interested to know. Perhaps it is that if $f$ is strongly convex then the minimizer is unique? — Sarvesh Ravichandran Iyer, May 13 '23 at 13:25
@SarveshRavichandranIyer How do I argue that K is compact? Given that is the intersection of a compact set, i.e. the closed ball and the Graph(f) which is not? — some_math_guy, May 13 '23 at 13:31
I kind of left out some very tiny details for you to figure out (the statement is true, but to prove it you need to twist a few things here and there). Give me about 5 minutes : if there's no good question I can redirect you to, then I'll answer this question. Remember that convex functions on open intervals are continuous. — Sarvesh Ravichandran Iyer, May 13 '23 at 13:33
I found the same question behind a paywall of a poor website, and couldn't access the answer. There's no neat answer on here as well (and one can answer this particular question in two different ways) so I'll answer this question. I need an hour or so. — Sarvesh Ravichandran Iyer, May 13 '23 at 13:40
@SarveshRavichandranIyer See the post for the full theorem.... see also the picture I added, from which it looked to me that a ball of radius f(x) -t was enough... I don't get why I need to double the radius? — some_math_guy, May 13 '23 at 14:21
I was probably being careful by including $2D$ in place of $D$. A ball of radius $f(x)-t$ may well suffice. — Sarvesh Ravichandran Iyer, May 13 '23 at 14:22

Sarvesh Ravichandran Iyer · Answer 1 · 2023-05-16T12:31:22.900

I forgot to take cognizance of the fact that $f$ is defined on $\mathbb R^m, m \geq 1$, and not just on $\mathbb R$. We use the following fact :

If $f$ is a convex function on $\mathbb R^m$ then $f$ is continuous on $\mathbb R^m$.

A proof of this fact can be found here. If $m=1$ then a much simpler proof can be given via the so-called "three-slopes-property" that convex functions enjoy on intervals in $\mathbb R$.

Let's see how one can use this particular fact to produce a proof of the assertion, using the weaker assumption of continuity.

Suppose that $f : \mathbb R^M \to \mathbb R$ is convex. Then, $f$ is continuous on $\mathbb R^M$ by the fact we mentioned earlier. Throughout the rest of this proof, the variables $x_n,y$ will be used to denote various points in $\mathbb R^M$, while the variables $r,s$ will be used to denote various points in $\mathbb R$. When we write $(y,s) \in \mathbb R^{M+1}$, this is the point whose first $M$ coordinates are given by those of $y$ (in that order) and whose last coordinate equals $s$. $P,Q$ will be used to represent arbitrary points in $\mathbb R^{M+1}$, whose coordinates will be given by $P_i,Q_i , i =1,\ldots,M+1$.

Let $x \in \mathbb R^M$ and $t\in \mathbb R$ be such that $t < f(x)$ (these will be fixed for the rest of the proof). Let the graph of $f$, $G(f) \subset \mathbb R^{M+1}$ be defined by $$ G(f) = \{(y,f(y)) : y \in \mathbb R^M\}. $$

In particular, $(x,t)$ doesn't lie on $G(f)$. Let $d_{M+1} : \mathbb R^{M+1}\times \mathbb R^{M+1} \to \mathbb R$ be given by the usual Euclidean distance $$ d_{M+1}(P,Q) = \sqrt{\sum_{i=1}^{M+1} (P_i-Q_i)^2}. $$ An analogous definition holds for the distance $d_M : \mathbb R^M \times \mathbb R^M \to \mathbb R$.

We write $\overline{B_{M+1}(P,r)} = \{P'\in \mathbb R^{M+1} : d_{M+1}(P,P') \leq r\}$ for the closed ball of radius $r$ around $P$. The definition of $\overline{B_{M}(y,r)}$ is similar but with $d_{M+1}$ replaced by $d_{M}$.

Let $D = \inf\{d_{M+1}((x,t),P) : P \in G(f)\}$ be the infimum of all distances from points in $G(f)$ to the point $(x,t)$. Note that $D$ is finite since $(x,f(x)) \in G(f)$ gives $D \leq d_{M+1}((x,t),(x,f(x))) = f(x)-t$. We will show that for some point $P' \in G(f)$, $$ D = d_{M+1}((x,t),P'). $$ Then, $P'$ is the desired point.

Our strategy, which resembles that of the OP (more on that later) is the following :

Identify a compact subset $G'(f) \subset G(f)$ such that for $P \in G(f) \setminus G'(f)$, $d((x,t),P) > D$. We will show that there are two such candidates for $G'(f)$ : one whose compactness is easier to prove, and the other which is more natural but whose compactness follows as a corollary of the computations for the first candidate.
Use the compactness of $G'(f)$ and the continuity of $d$ along with Weierstrass' theorem to find a minimizer $P' \in G'(f)$ and complete the result.

Let us identify the candidates for the subset $G'(f) \subset G(f)$. I state them below :

$G'_1(f) = \{(y,f(y)) : y \in \overline{B_M(x,D+1)}\}$.
$G_2'(f) = G(f) \cap \overline{B_{M+1}((x,t),f(x)-t)}$.

The second matches the thinking of the OP. The first provides an easier-to-argue alternative.

It is clear that $G'_1(f),G'_2(f) \subset G(f)$. We will now prove that for $(y,s) \in G(f) \setminus G'_1(f)$ or $(y,s) \in G(f) \setminus G'_2(f)$, we have $d_{M+1}((x,t),(y,s)) > D$. Indeed, $$ (y,s) \in G(f) \setminus G'_1(f) \implies d_{M+1}((x,t),(y,s)) \geq d_M(x,y) > D+1 > D, $$ and $$ (y,s) \in G(f) \setminus G'_2(f) \implies (y,s) \notin \overline{B_{M+1}((x,t),f(x)-t)} \implies d_{M+1}((x,t),(y,s)) > f(x)-t \geq D. $$
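As a numerical sanity check of these two exclusion estimates, here is a minimal sketch for $M=1$. The function $f(y)=y^2$, the point $(x,t)=(1,-1)$ and the grid window are my own illustrative assumptions, not part of the question. The grid minimum `D_est` is only an upper estimate of $D$, so this is an illustration and not a proof.

```python
import math

def f(y):
    # Hypothetical convex example (assumption, not from the original post).
    return y * y

x, t = 1.0, -1.0  # fixed point (x, t) with t < f(x) = 1

def dist(y):
    # Euclidean distance in R^2 from (x, t) to the graph point (y, f(y)).
    return math.hypot(y - x, f(y) - t)

# Grid over the window [-10, 10]; its minimum is an upper estimate of D.
grid = [-10 + k * 0.001 for k in range(20001)]
D_est = min(dist(y) for y in grid)

# Grid points whose graph point lies outside the first / second candidate set.
outside_1 = [y for y in grid if abs(y - x) > D_est + 1]
outside_2 = [y for y in grid if dist(y) > f(x) - t]

# First estimate: d_{M+1} >= d_M > D + 1 outside G'_1.
ok_1 = all(dist(y) >= abs(y - x) > D_est + 1 for y in outside_1)
# Second estimate: d_{M+1} > f(x) - t >= D outside G'_2.
ok_2 = all(dist(y) > D_est for y in outside_2)

print(D_est, ok_1, ok_2)
```

Both flags come out `True` on this grid, which matches the claim that no point outside either candidate set can compete for the infimum.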

Our next task is to prove that $G'_1(f)$ and $G'_2(f)$ are both compact.

To see that $G'_1(f)$ is compact, note that $\overline{B_M(x,D+1)}$ is compact. The function $g : \overline{B_M(x,D+1)} \to \mathbb R^{M+1}$ given by $g(y) = (y,f(y))$ is a function whose components are continuous, hence $g$ is continuous. Now, by definition, $G'_1(f) = g(\overline{B_M(x,D+1)})$ is the image of a compact set under a continuous function. Hence, $G'_1(f)$ is a compact set.

To show that $G'_2(f)$ is compact, we first show that $G(f)$ is closed by showing that it contains all its limit points.

Let $(x_n,f(x_n)) \in G(f)$ be such that it converges to some point $(y,s)$ in $\mathbb R^{M+1}$. Then, $x_n \to y$ in $\mathbb R^M$ and $f(x_n) \to s$ in $\mathbb R$ because the components of a convergent sequence also converge. However, $f$ is continuous, hence $f(x_n) \to f(y)$ in $\mathbb R$ as well. Hence, $s= f(y)$ by uniqueness of limits. Thus, $(y,s) = (y,f(y)) \in G(f)$. Since the choice of convergent sequence $(x_n,f(x_n)) \in G(f)$ was arbitrary, $G(f)$ contains all its limit points, hence it's closed.

Now, $\overline{B_{M+1}((x,t),f(x)-t)}$ is a compact set. So $G'_2(f)$ is the intersection of a closed and a compact set, hence it's a compact set itself.

We let $H(f) \subset G(f)$ be equal to either of $G'_1(f)$ or $G'_2(f)$ in what follows. I want to show that from here, the choice doesn't matter. $H(f)$ is compact and if $P \in G(f) \setminus H(f)$, $d_{M+1}((x,t),P)>D$.

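The function $d$ given by $d(P) = d_{M+1}((x,t),P)$ is continuous on $\mathbb R^{M+1}$, hence on $H(f)$. As $H(f)$ is compact, by the Weierstrass theorem $d$ attains its minimum on the set $H(f)$, so there is a point $P' \in H(f)$ such that $\min_{P \in H(f)} d(P) = d(P')$.

We claim that $D = d(P')$. Since $H(f) \subset G(f)$, $$ D = \inf_{P \in G(f)} d(P) \leq \min_{P \in H(f)} d(P) = d(P') $$ because an infimum over a larger set can only be smaller. For the reverse inequality it suffices to show that $d(P) \geq d(P')$ for every $P \in G(f)$. This holds for $P \in H(f)$ by the choice of $P'$. For $P \in G(f) \setminus H(f)$ we use the estimates proved earlier. If $H(f) = G'_2(f)$, then $(x,f(x)) \in H(f)$, so $d(P) > f(x)-t = d((x,f(x))) \geq d(P')$. If $H(f) = G'_1(f)$, then $d(P) > D+1$; by the definition of the infimum there is some $Q \in G(f)$ with $d(Q) < D+1$, and such a $Q$ must lie in $H(f)$, so $d(P) > D+1 > d(Q) \geq d(P')$. Hence the infimum over $G(f)$ is attained and $$ D = \min_{P \in G(f)} d(P) = \min_{P \in H(f)} d(P) = d(P'), $$ so $P'$ is the desired point.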
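To illustrate the conclusion, here is a small sketch comparing a grid minimum over a wide window with a grid minimum over the compact window corresponding to $G'_1(f)$. The non-smooth convex function $f(y)=|y|$, the point $(x,t)=(2,0)$ and the helper `grid_min` are hypothetical choices of mine. In this example the closest graph point is $(1,1)$, at distance $\sqrt 2$.

```python
import math

def f(y):
    # Hypothetical convex, non-smooth example (assumption for illustration).
    return abs(y)

x, t = 2.0, 0.0  # fixed point (x, t) with t < f(x) = 2

def d(y):
    # Distance from (x, t) to the graph point (y, f(y)).
    return math.hypot(y - x, f(y) - t)

def grid_min(lo, hi, n=200_000):
    # Hypothetical helper: (smallest distance, location) on an even grid of [lo, hi].
    ys = (lo + (hi - lo) * k / n for k in range(n + 1))
    return min((d(y), y) for y in ys)

# "Whole graph", approximated by a wide window; gives an estimate of D.
wide_val, wide_arg = grid_min(-50.0, 50.0)
D_est = wide_val

# Compact window playing the role of G'_1(f): y in the closed ball of radius D + 1 around x.
h_val, h_arg = grid_min(x - (D_est + 1), x + (D_est + 1))

print(wide_val, wide_arg)
print(h_val, h_arg)
```

The two minima agree up to grid resolution, and the common value is below $f(x)-t=2$, so the minimizer also lies in the ball used for $G'_2(f)$.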

Some points :

We never used the fact that $f(x)>t$ , really. We can replace $f(x)-t$ by $|f(x)-t|$ everywhere and obtain the same statement. Essentially, $(x,t)$ is only required to not lie on $G(f)$.
This question is a vast generalization of the above fact. Indeed, in infinite dimensions some of the arguments above break down because closed balls are not compact in infinite dimension, for example. In this case, convexity is strongly used in the proof, which I won't talk about.

Looks like you changed the assumptions or notation in my post, like $f : R \to R$ instead of $f: R^M \to R$, and (t, x) instead of (x,t), which combined with $t<f(x)$ makes a difference — some_math_guy, May 13 '23 at 17:57
Also notice (x,t) is a fixed point, so using x as variable of functions is confusing — some_math_guy, May 13 '23 at 18:54
I apologize for taking a long time @some_math_guy, I think I was too caught up to give this the attention it deserved, so I hope I've done that now. — Sarvesh Ravichandran Iyer, May 16 '23 at 12:31
Thanks for your detailed answer, I still have a doubt though here "$D = \min_{P \in G(f)} d(P) \leq \min_{P \in H(f)} d(P) = d(P')$". Weierstrass theorem says there is a minimum in H(f) which is compact so: $\min_{P \in H(f)} d(P) = d(P')$. And certainly the minimum over a larger set is smaller, but how do we know that if by enlarging the set, in the case that d(P) has values over the G(f)\H(f) that are smaller than those over H(f), there's still a minimum and not just an infimum: $D = \inf_{P \in G(f)} d(P) \leq \min_{P \in H(f)} d(P) = d(P')$ — some_math_guy, May 21 '23 at 18:49
