Background: For $f:D\subset\mathbb{R}^d \to \mathbb{R}$ we call $$f^*:D^*\subset\mathbb{R}^d \to \mathbb{R}, ~ y\mapsto \underset{x\in D}{\sup}(\langle y, x \rangle - f(x))$$ the conjugate of $f$, where $D^*:=\{y\in\mathbb{R}^d | \underset{x\in D}{\sup}(\langle y, x \rangle - f(x)) < \infty \}$. From the definition we directly get the Fenchel inequality: $$ \langle y, x \rangle \leq f(x) + f^*(y) $$ If we set $D=\mathbb{R}^d, f(x)=\lVert x \rVert$ for some vector norm $\lVert \cdot \rVert$ and consider the dual norm $ \lVert y \rVert^* = \underset{\lVert x \rVert=1}{\sup}\langle y, x \rangle$ (which in fact is a dual norm thanks to Riesz' representation theorem) we get $$ f^*(y) = \begin{cases} 0, &\lVert y \rVert^* \leq 1 \\ \infty, &\mathrm{otherwise}, \end{cases} $$ and therefore $\langle y, x \rangle \leq \lVert x \rVert$ whenever $\lVert y \rVert^* \leq 1$.
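For reference, here is a short sketch of why the conjugate of a norm takes this form. It is the standard argument, written with the symbols above; $x_0$ and $t$ are auxiliary names introduced only for this step. If $\lVert y \rVert^* \leq 1$, then for every $x$ $$ \langle y, x \rangle - \lVert x \rVert \leq \lVert y \rVert^* \lVert x \rVert - \lVert x \rVert \leq 0, $$ with equality at $x = 0$, so $f^*(y) = 0$. If $\lVert y \rVert^* > 1$, pick $x_0$ with $\lVert x_0 \rVert = 1$ and $\langle y, x_0 \rangle > 1$; then for $t > 0$ $$ \langle y, t x_0 \rangle - \lVert t x_0 \rVert = t\,(\langle y, x_0 \rangle - 1) \xrightarrow{t \to \infty} \infty, $$ so $f^*(y) = \infty$.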
My questions: The author of a paper I'm reading defines what he calls a matrix dual norm by $$ \lVert W \rVert^* := \underset{\lVert u \rVert = \lVert v \rVert = 1}{\sup} u^TWv $$ for some vector norm $\lVert \cdot \rVert$ and $W\in\mathbb{R}^{d\times d}$. He then claims that $$ \langle A, B \rangle_{\mathrm{F}} \leq \lVert A \rVert\qquad (*) $$ holds by the Fenchel-Young inequality whenever $\lVert B \rVert^* \leq 1$, where $\langle \cdot, \cdot \rangle_{\mathrm{F}}$ is the Frobenius inner product and, as far as I can tell (the paper does not define it explicitly), $\lVert \cdot \rVert$ on matrices is the matrix norm induced by the aforementioned vector norm $\lVert \cdot \rVert$.
- In what sense is the matrix norm $\lVert \cdot \rVert^*$ defined above a dual norm? I'm only familiar with $\mathbb{R}^{d\times d}$ as a Hilbert space when equipped with the Frobenius inner product, but the norm defined here does not seem to coincide with the dual norm obtained from that pairing.
- Since the author does not use the Frobenius norm as the dual norm, can we still apply Fenchel-Young to obtain the inequality (*)?
Thanks a lot!
Edit: I just realized that invoking Fenchel-Young here may be overkill, since for all $x \neq 0, y \in \mathbb{R}^d$ with $\lVert y \rVert^* \leq 1$ we have $$ \langle x, y \rangle = \langle x/\lVert x \rVert, y \rangle \lVert x \rVert \leq \lVert y \rVert^* \lVert x \rVert \leq \lVert x \rVert. $$ My second question therefore reduces to whether the following is true for $A \neq 0$: $$ \langle A, B \rangle_F = \langle A/\lVert A \rVert, B \rangle_F \lVert A \rVert \overset{\mathrm{?}}{\leq} \lVert B \rVert^* \lVert A \rVert. $$ I'm pretty sure it is not: choosing $\lVert \cdot \rVert = \lVert \cdot \rVert_1$ gives $$ \lVert B \rVert^* = \underset{\lVert u \rVert_1 = \lVert v \rVert_1 = 1}{\sup} u^TBv = \underset{i,j}{\max}|b_{ij}|, $$ but if $A$ has exactly one nonzero entry $\pm 1$ per column, placed in a row where $|b_{ij}|$ is largest within that column and carrying the sign of that $b_{ij}$ (so that the induced norm, the maximum absolute column sum, is $\lVert A \rVert_1 = 1$), then $$ \langle A, B \rangle_F = \sum_{j=1}^d \underset{i=1, ..., d}{\max} |b_{ij}| \geq \lVert B \rVert^*, $$ with strict inequality as soon as $B$ has at least two nonzero columns. Could someone confirm?
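To make the suspected counterexample concrete, here is a minimal numerical sketch in Python with NumPy. It assumes the $\ell_1$ vector norm, so that the induced matrix norm is the maximum absolute column sum and the "matrix dual norm" from the paper reduces to $\max_{i,j}|b_{ij}|$. The function names and the particular $2\times 2$ matrix `B` are my own choices for illustration and do not come from the paper.

```python
import numpy as np

def dual_norm_l1(B):
    # For the l1 vector norm, sup of u^T B v over ||u||_1 = ||v||_1 = 1
    # is attained at signed standard basis vectors, i.e. max |b_ij|.
    return np.max(np.abs(B))

def induced_l1_norm(A):
    # Matrix norm induced by the l1 vector norm: max absolute column sum.
    return np.max(np.sum(np.abs(A), axis=0))

def build_candidate(B):
    # One nonzero entry per column: the sign of b_ij, placed in the row
    # where |b_ij| is largest within that column.
    d = B.shape[1]
    rows = np.argmax(np.abs(B), axis=0)
    cols = np.arange(d)
    A = np.zeros_like(B, dtype=float)
    A[rows, cols] = np.sign(B[rows, cols])
    return A

# Illustrative matrix (an assumption, not taken from the paper).
B = np.array([[1.0, -2.0],
              [3.0,  0.5]])
A = build_candidate(B)

frob = np.sum(A * B)  # Frobenius inner product <A, B>_F
print("<A,B>_F      =", frob)
print("||B||^*      =", dual_norm_l1(B))
print("||A||_1      =", induced_l1_norm(A))
print("violates bound:", frob > dual_norm_l1(B) * induced_l1_norm(A))
```

For this `B` the candidate is $A = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}$, giving $\langle A, B \rangle_F = 5$ while $\lVert B \rVert^* \lVert A \rVert_1 = 3$, which is consistent with the suspicion that the inequality fails for this choice of norms.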