
Definition. For a continuous, strictly monotone function $f \colon I \to J$, where $I, J \subset \mathbb R$ are intervals, we can define the $f$-mean of two numbers $p, q \in I$ via $$ M_f \colon I \times I \to I, \qquad (p, q) \mapsto f^{-1}\left( \frac{f(p) + f(q)}{2}\right). $$ If $f$ is either (increasing and strictly concave) or (decreasing and strictly convex) on $I$, then the induced semimetric is $$ D_f \colon I \times I \to [0, \infty), \qquad (p, q) \mapsto \sqrt{p + q - 2 M_f(p, q)}. $$
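The definition translates directly into code. A minimal sketch (the helper names `f_mean` and `f_semimetric` are mine, not from any library), illustrated with $f = \ln$, which is increasing and strictly concave on $(0, \infty)$:

```python
import math

def f_mean(f, f_inv, p, q):
    """f-mean M_f(p, q) = f^{-1}((f(p) + f(q)) / 2)."""
    return f_inv((f(p) + f(q)) / 2)

def f_semimetric(f, f_inv, p, q):
    """Induced semimetric D_f(p, q) = sqrt(p + q - 2 M_f(p, q))."""
    return math.sqrt(p + q - 2 * f_mean(f, f_inv, p, q))

# For f = ln, M_ln is the geometric mean and D_ln(p, q) = |sqrt(p) - sqrt(q)|.
print(f_mean(math.log, math.exp, 4.0, 9.0))        # ~ 6.0 (geometric mean)
print(f_semimetric(math.log, math.exp, 4.0, 9.0))  # ~ 1.0 = |sqrt(4) - sqrt(9)|
```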

Remark (Properties of $D_f$). $D_f$ is clearly symmetric. Furthermore, it is positive definite: $D_f(p, p) = \sqrt{p + p - 2 p} = 0$ and $D_f(p, q) = 0$ if and only if $f\left(\frac{p + q}{2}\right) = \frac{f(p) + f(q)}{2}$ if and only if $p = q$ by the strict concavity / convexity of $f$.

Example 1. If $I = (0, \infty)$ and $f = \ln$, then \begin{align*} M_{\ln}(p, q) & = \exp\left(\frac{1}{2} \ln(p) + \frac{1}{2} \ln(q) \right) = \exp\left(\ln(\sqrt{p}) + \ln(\sqrt{q})\right) \\ & = \exp(\ln(\sqrt{p})) \exp(\ln(\sqrt{q})) = \sqrt{p} \sqrt{q} \end{align*} and so $D_{\ln}(p, q) = | \sqrt{p} - \sqrt{q} |$, which is a metric.

How can we characterise the functions $f$ such that $D_f$ fulfills the triangle inequality?

Examples.

  • If $f(x) = \frac{1}{x}$ and $I = (0, \infty)$, then $M_f(p, q) = \left( \frac{1}{2} \left( \frac{1}{p} + \frac{1}{q} \right) \right)^{-1} = \frac{2 p q}{p + q}$ and $$D_f(p, q) = \sqrt{p + q - \frac{4 p q}{p + q}} = \frac{| p - q |}{\sqrt{p + q}}$$ might fulfill the triangle inequality.

  • If $f(x) = \sqrt{x}$ and $I = [0, \infty)$, then $M_f(p, q) = \frac{1}{4} (\sqrt{p} + \sqrt{q})^2$ and $D_f(p, q) = \sqrt{p + q - \frac{1}{2} (\sqrt{p} + \sqrt{q})^2} = \frac{1}{\sqrt{2}} | \sqrt{p} - \sqrt{q} |$ is a metric.

  • Other choices of $f$ could be $x \mapsto a^{-x}$ for $0 < a < 1$, $x \mapsto x^{-p}$ for $p \ge 1$, or $x \mapsto x - e^{-x} + 1$.


Yet another perspective: since $M_f(P, P) = P$, we can write $D_f(P, Q) = \sqrt{M_f(P, P) + M_f(Q, Q) - 2 M_f(P, Q)}$, which is reminiscent of a metric built from a kernel function $M$.


The second part of the above definition is inspired by Section 1.3 of the paper Quantum Optimal Transport for Tensor Field Processing (arXiv link, published non-open-access version here). Through answers to this question I hope to make progress on this question of mine, namely to find out whether the "quantum analogue" of $D_f$ for $f = \ln$ is a metric, that is, whether $$ D \colon \mathcal S_+^d \times \mathcal S_+^d \to [0, \infty), \qquad (P, Q) \mapsto \sqrt{\text{tr}\left(P + Q - 2 \exp\left(\frac{1}{2} \log(P) + \frac{1}{2} \log(Q)\right)\right)} $$ is a metric, where $\mathcal S_+^d$ is the cone of symmetric positive semidefinite matrices in $\mathbb{R}^{d \times d}$; see also the discussion here for the interpretation of the $M_{\ln}$ part. We at least have that $\ln$ is operator monotone and operator concave. Maybe this distance is related to the Bures-Wasserstein distance of positive definite matrices? This construction is also reminiscent of the metric $d$ in An extension of Kakutani's theorem on infinite product measures to the tensor product of semifinite $w^*$-algebras by Donald Bures.
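The matrix candidate can at least be spot-checked numerically on positive definite matrices. A sketch, not part of the source: the helpers `sym_logm`, `sym_expm`, and `random_spd` are my own names, implemented via eigendecomposition, which is valid for symmetric matrices.

```python
import numpy as np

def sym_logm(A):
    """Matrix logarithm of a symmetric positive definite matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.log(w)) @ V.T

def sym_expm(A):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.T

def D(P, Q):
    """Candidate distance on positive definite matrices from the question."""
    M = sym_expm(0.5 * sym_logm(P) + 0.5 * sym_logm(Q))
    # Clip tiny negative traces caused by floating-point roundoff.
    return np.sqrt(max(np.trace(P + Q - 2 * M), 0.0))

rng = np.random.default_rng(0)

def random_spd(d):
    """Random symmetric positive definite d x d matrix."""
    A = rng.standard_normal((d, d))
    return A @ A.T + d * np.eye(d)

P, Q, S = (random_spd(3) for _ in range(3))
print(np.isclose(D(P, Q), D(Q, P)))          # symmetry: True
print(D(P, P) < 1e-6)                        # D(P, P) = 0: True
print(D(P, S) <= D(P, Q) + D(Q, S) + 1e-10)  # spot-check of the open triangle inequality
```

Such random trials are evidence only; they cannot settle the question of whether $D$ is a metric.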


Some ideas to prove that $D(p, q) := \frac{| p - q |}{\sqrt{p + q}}$ fulfills the triangle inequality:

Case 1: $p > s > q$. Then $\frac{1}{\sqrt{p + s}} < \frac{1}{\sqrt{p + q}} < \frac{1}{\sqrt{s + q}}$, so \begin{align*} D(p, q) + D(q, s) & = \frac{p - q}{\sqrt{p + q}} + \frac{s - q}{\sqrt{s + q}} > \frac{p - q}{\sqrt{p + q}} + \frac{s - q}{\sqrt{p + q}} > \frac{p - q}{\sqrt{p + q}} - \frac{s - q}{\sqrt{p + q}} \\ & = \frac{p - s}{\sqrt{p + q}} > \frac{p - s}{\sqrt{p + s}} = D(p, s). \end{align*} But I haven't been able to use the same strategy for the other cases.
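Before tackling the remaining cases analytically, the triangle inequality can be checked by brute force on a grid; a numerical sketch (evidence, not a proof), where `D` mirrors the formula above:

```python
import itertools
import math

def D(p, q):
    """Semimetric induced by f(x) = 1/x: D(p, q) = |p - q| / sqrt(p + q)."""
    return abs(p - q) / math.sqrt(p + q)

# Test D(p, s) <= D(p, q) + D(q, s) for all orderings of p, q, s on a grid in (0, oo).
grid = [0.1 * k for k in range(1, 60)]
violations = sum(
    1
    for p, q, s in itertools.product(grid, repeat=3)
    if D(p, s) > D(p, q) + D(q, s) + 1e-12
)
print(violations)  # 0
```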

ViktorStein

1 Answer


TL;DR: if the left endpoint of $f(I)$ is $0$, then it is sufficient that $(f^{-1})''$ is log-convex.


Based on @Carl Schildkraut's answer and a comment by @Martin R on the other answer to that question, we can deduce the following sufficient condition (on top of the constraints placed on $f$ in the question), where we additionally require (can this be relaxed?) that the left endpoint of $J := f(I)$ is $0$ and that $I \subset [0, \infty)$.

Let $f_0 := f^{-1} \colon J \to I$. If $f_0$ and $\ln \circ (f_0'')$ are convex, then $D_f$ is a metric on $I$.

Remark. Since $f \colon I \to J$ is bijective and either (strictly convex and decreasing) or (strictly concave and increasing), we automatically have that $f^{-1}$ is strictly convex, see e.g. here. Hence $f_0''(x) > 0$ for all $x \in J$, so that $\ln \circ (f_0'')$ is well-defined.

Examples.

  • The above conditions are (nearly, see below) satisfied for the functions $f \colon (0, \infty) \to (0, \infty)$, $x \mapsto \frac{1}{x}$, $f \colon (0, \infty) \to \mathbb R$, $x \mapsto \ln(x)$, and $f \colon [0, \infty) \to [0, \infty)$, $x \mapsto \sqrt{x}$ mentioned above. Their inverse functions are $f^{-1}(x) = \frac{1}{x}$, $f^{-1}(x) = e^x$ and $f^{-1}(x) = x^2$, respectively, so that $(\ln \circ f_0'')(x)$ equals $\ln(2) - 3 \ln(x)$, $x$, and $\ln(2)$, respectively; all three are convex. Note that for the logarithm the endpoint condition is not fulfilled (here $J = \mathbb R$), but the corresponding semimetric still satisfies the triangle inequality.

  • The function $f_a \colon \mathbb R \to (0, \infty)$, $x \mapsto a^x$ for $a \in (0, 1)$ is strictly decreasing and convex. We have $f_a^{-1}(x) = \log_a(x)$ and thus $(f_a^{-1})''(x) = - \frac{1}{x^2 \ln(a)} > 0$, hence $(\ln \circ f_0'')(x) = - \ln(-\ln(a)) - 2 \ln(x)$, which is convex on $(0, \infty)$ for every $a \in (0, 1)$. Note, however, that here $I = \mathbb R \not\subset [0, \infty)$, so, as for the logarithm above, the requirements are not fully met.

  • The function $f_p \colon (0, \infty) \to (0, \infty)$, $x \mapsto x^{-\frac{1}{p}}$ for $p \ne 0$ is strictly convex and decreasing for $p > 0$ and strictly concave and increasing for $p < - 1$. Its inverse is $f_p^{-1}(x) = x^{-p}$, which has the second derivative $(f_{p}^{-1})''(x) = p (p + 1) x^{- p - 2}$, whose logarithm $\ln((f_p^{-1})''(x)) = \ln(p (p + 1)) - (p + 2) \ln(x)$ is convex if and only if $p \ge - 2$, so the induced semimetric fulfills the triangle inequality for $p \in [-2, -1) \cup (0, \infty)$.

  • The function $f \colon [0, \infty) \to [0, \infty)$, $x \mapsto \sinh^{-1}(x)$ is strictly increasing and concave. Its inverse is $f_0(x) = \sinh(x)$, which has second derivative $f_0''(x) = \sinh(x)$, whose logarithm $\ln(\sinh(x)) = \ln\left(\frac{e^x - e^{-x}}{2}\right)$ is concave, as $\frac{d^2}{dx^2} \ln(\sinh(x)) = - \text{csch}^2(x) \le 0$, so the sufficient condition does not apply here.
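The convexity checks above can also be confirmed with finite differences; a small numerical sketch (the helper name `logconvexity_gap` and the sample points are mine), here for the family $f_0(x) = x^{-p}$ with $f_0''(x) = p(p+1)x^{-p-2}$:

```python
import math

def logconvexity_gap(f0_pp, x, h=1e-3):
    """Second central difference of ln(f0'') at x; nonnegative values indicate local convexity."""
    g = lambda t: math.log(f0_pp(t))
    return g(x - h) - 2.0 * g(x) + g(x + h)

# f_p(x) = x^{-1/p}  =>  f0(x) = x^{-p} and f0''(x) = p (p + 1) x^{-p-2}.
for p in (-2.5, -1.5, 1.0, 3.0):
    f0_pp = lambda x, p=p: p * (p + 1) * x ** (-p - 2)
    convex = all(
        logconvexity_gap(f0_pp, x) >= -1e-12 for x in (0.5, 1.0, 2.0, 5.0)
    )
    print(p, convex)  # convex exactly when p >= -2
```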

Proof. Let $$g \colon \tfrac{1}{2} J \times \tfrac{1}{2} J \to (0, \infty), \qquad (u, v) \mapsto f_0''(u + v),$$ which is well-defined because the left endpoint of the interval $J$ is $0$, so that $\tfrac{1}{2} J + \tfrac{1}{2} J \subset J$. Let $u, v, s, t \in \tfrac{1}{2} J$ be such that $u \ge s$ and $v \ge t$. Now define $$x_1 := u + v, \quad x_2 := s + t, \quad y_1 := \max(s + v, t + u) \quad \text{and} \quad y_2 := \min(s + v, t + u). $$ Then $(x_1, x_2)$ majorizes $(y_1, y_2)$ because

  • $x_1 \ge x_2$ and $y_1 \ge y_2$,
  • $x_1 \ge y_1$,
  • $x_1 + x_2 = y_1 + y_2$.

As $\ln(f_0'')$ is convex by assumption, Karamata's inequality states that $$ \ln(f_0''(x_1)) + \ln(f_0''(x_2)) \ge \ln(f_0''(y_1)) + \ln(f_0''(y_2)), $$ that is, $$ \ln\big(g(u, v) g(s, t)\big) \ge \ln\big(g(s, v) g(t, u)\big), $$ which, due to $\ln$ being strictly monotonically increasing, is equivalent to $$ g(u, v) g(s, t) \ge g(s, v) g(t, u). $$ Now, by @Carl Schildkraut's answer the function \begin{align*} h(x, y) & = \sqrt{\int_{x}^{y} \int_{x}^{y} g(u, v) \; \text{d}u \text{ d}v} = \sqrt{\int_{x}^{y} \int_{x}^{y} f_0''(u + v) \; \text{d}u \text{ d}v} \\ & = \sqrt{f_0(2 x) + f_0(2 y) - 2 f_0(x + y)} \end{align*} fulfills the triangle inequality.
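The key product inequality can be sanity-checked numerically; a sketch for $f(x) = \frac{1}{x}$, where $f_0(x) = \frac{1}{x}$, $f_0''(x) = \frac{2}{x^3}$ and $\ln \circ f_0''$ is convex (the grid points are my own choice):

```python
import itertools

# g(u, v) = f0''(u + v) with f0(x) = 1/x, i.e. f0''(x) = 2 / x^3.
g = lambda u, v: 2.0 / (u + v) ** 3

# Check g(u, v) g(s, t) >= g(s, v) g(t, u) whenever u >= s and v >= t.
ok = all(
    g(u, v) * g(s, t) >= g(s, v) * g(t, u) * (1 - 1e-12)
    for u, s, v, t in itertools.product((0.1, 0.5, 1.0, 2.0, 5.0), repeat=4)
    if u >= s and v >= t
)
print(ok)  # True
```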

The proof is now completed by noting that $$ h(x, y) = D_f\big(f_0(2 x), f_0(2 y)\big). $$ Indeed, let $p_1, p_2, p_3 \in I$. Then for $x_k := \frac{1}{2} f(p_k) \in \frac{1}{2} J$ (here we need $\frac{1}{2} J \subset J$, which is why we required the left endpoint of $J$ to be $0$) we have $p_k = f_0(2 x_k)$ for $k \in \{ 1, 2, 3 \}$. Hence \begin{align} D_f(p_1, p_2) + D_f(p_2, p_3) & = D_f\big(f_0(2 x_1), f_0(2 x_2)\big) + D_f\big(f_0(2 x_2), f_0(2 x_3)\big) \\ & = h(x_1, x_2) + h(x_2, x_3) \ge h(x_1, x_3) \\ & = D_f\big(f_0(2 x_1), f_0(2 x_3)\big) = D_f(p_1, p_3). \end{align}
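As a final sanity check, the identity $h(x, y) = D_f\big(f_0(2x), f_0(2y)\big)$ can be verified numerically for $f = \ln$, $f_0 = \exp$, where $D_{\ln}(p, q) = |\sqrt{p} - \sqrt{q}|$ (a sketch with sample points of my own choosing):

```python
import math

def h(x, y, f0=math.exp):
    """h(x, y) = sqrt(f0(2x) + f0(2y) - 2 f0(x + y))."""
    return math.sqrt(f0(2 * x) + f0(2 * y) - 2 * f0(x + y))

def D_ln(p, q):
    """D_f for f = ln: the geometric-mean semimetric |sqrt(p) - sqrt(q)|."""
    return math.sqrt(p + q - 2 * math.sqrt(p * q))

x, y = 0.3, 1.1
print(math.isclose(h(x, y), D_ln(math.exp(2 * x), math.exp(2 * y))))  # True
```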

ViktorStein