
While reading the Wasserstein GAN paper, I found that Appendix A says:

The norm topology is very strong. Therefore, we can expect that not many functions $\theta \mapsto \mathbb{P}_\theta$ will be continuous when measuring distances between distributions with $\delta$

From what I understand, a strong topology has more open sets than a weak topology, hence I assumed that continuity of a function in the weak topology implies continuity in the strong topology but not vice versa. Hence there would be more continuous functions in the strong topology than in the weak topology. However, several other posts have shown that continuity in the strong topology implies continuity in the weak topology. So my questions are:

  1. Does continuity in the strong topology imply continuity in the weak topology?
  2. Does continuity in the weak topology imply continuity in the strong topology?
  3. How do we know that the weak topology has more continuous functions than the strong topology?
MoneyBall

1 Answer


It seems that what the authors of the paper are saying is that the topology of total variation on the space of finite measures is too restrictive. The weak topology they are referring to there is the weak topology $\sigma(\mathcal{M}(X),\mathcal{C}_b(X))$, where $\mathcal{M}(X)$ is the space of finite measures on $X$ and $\mathcal{C}_b(X)$ is the space of bounded continuous functions on $X$. A local base at $0$ of the topology $\sigma(\mathcal{M}(X),\mathcal{C}_b(X))$ is given by sets of the form $$V(f_1,\ldots,f_n;\varepsilon)=\{\mu\in\mathcal{M}(X):|\mu(f_j)|<\varepsilon,\ j=1,\ldots,n\}$$ where $\mu(f_j)=\int_Xf_j\,d\mu$, $f_1,\ldots,f_n\in\mathcal{C}_b(X)$, $n\in\mathbb{N}$, and $\varepsilon>0$. In this topology, a net $(\mu_\alpha:\alpha \in D)$ converges to $\mu$ iff for any $f\in\mathcal{C}_b(X)$ $$\lim_\alpha \mu_\alpha(f)=\mu(f)$$

I assume for the rest of this posting that $X$ is a complete separable metric space.

As for your questions:

  1. If $\mu_n$ converges to $\mu$ in total variation, then $\mu_n$ converges to $\mu$ in $\sigma(\mathcal{M}(X),\mathcal{C}_b(X))$. Indeed, for $f\in\mathcal{C}_b(X)$ $$\Big|\int_Xf\,d\mu_n-\int_Xf\,d\mu\Big|\leq\int_X|f|\,d|\mu_n-\mu|\leq\|f\|_u\|\mu_n-\mu\|_{TV}\xrightarrow{n\rightarrow\infty}0$$

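  2. Consider a sequence $x_n\in X\setminus\{x\}$ such that $x_n\xrightarrow{n\rightarrow\infty}x$ in $X$. Consider the measures $\delta_{x_n}$ and $\delta_x$, where $\delta_{x_n}(A)=\mathbb{1}_{A}(x_n)$ (similarly for $\delta_x$). You can easily check that $\delta_{x_n}$ converges weakly to $\delta_x$ (i.e. in the $\sigma(\mathcal{M}(X),\mathcal{C}_b(X))$ topology), since $\int_Xf\,d\delta_{x_n}=f(x_n)\rightarrow f(x)$ for every $f\in\mathcal{C}_b(X)$; however, $\|\delta_{x_n} -\delta_x\|_{TV}=1$ for all $n$ (with the normalization $\sup_A|\mu(A)|$; the value is $2$ with the normalization $|\mu|(X)$), so $\delta_{x_n}$ does not converge to $\delta_x$ in total variation.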

  3. I leave this for the OP to think about.
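The counterexample in item 2 can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes $X=\mathbb{R}$, $x=0$, $x_n=1/n$, and two arbitrarily chosen bounded continuous test functions. The helper names (`dirac`, `integrate`, `tv_distance`, `weak_gap`) are hypothetical, and `tv_distance` uses the normalization $\sup_A|\mu(A)-\nu(A)|$, which gives the value $1$ quoted above. Checking finitely many test functions only illustrates weak convergence; it does not prove it.

```python
import math

def dirac(x):
    """Dirac measure at x, stored as a dict {atom: mass}."""
    return {x: 1.0}

def integrate(f, mu):
    """Integral of f against a finitely supported measure mu."""
    return sum(mass * f(atom) for atom, mass in mu.items())

def tv_distance(mu, nu):
    """sup_A |mu(A) - nu(A)| for finitely supported probability measures,
    computed as half the l1 distance between the mass functions."""
    atoms = set(mu) | set(nu)
    return 0.5 * sum(abs(mu.get(a, 0.0) - nu.get(a, 0.0)) for a in atoms)

def weak_gap(f, mu, nu):
    """|int f dmu - int f dnu| for a single test function f."""
    return abs(integrate(f, mu) - integrate(f, nu))

# Assumed setup: x = 0 and x_n = 1/n, so x_n -> x with x_n != x.
x = 0.0
test_functions = {"cos": math.cos, "atan": math.atan}  # bounded, continuous

for n in (1, 10, 100, 1000):
    mu_n, mu = dirac(x + 1.0 / n), dirac(x)
    gaps = {name: weak_gap(f, mu_n, mu) for name, f in test_functions.items()}
    # The gaps shrink as n grows, while the TV distance stays at 1.
    print(n, gaps, tv_distance(mu_n, mu))
```

The integrals against the test functions get closer as $n$ grows, while the total variation distance stays the same, because the two Dirac measures never share an atom.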

Mittens
  • Thank you for the detailed answer. If I understood it correctly, can we say that $\mu$ is the linear functional on $C_b(X)$ so the space of all $\mu$ is the dual space of $C_b(X)$? If so, if $\mu_n$ converges to $\mu$, isn't that a weak convergence and not a strong convergence? (Also I think there are some minor typos: $\mathcal{B}_b(X)$ should be $\mathcal{C}_b(X)$? and is $D$ some arbitrary set when you defined a net? , and $x_n \to x$ in X) – MoneyBall Apr 23 '22 at 23:39
  • I see. I'm not too familiar with topology but I get why you used nets. Coming from some functional analysis background, I thought strong convergence meant when $f_n \to f$ in $C_b(X)$, then $\mu(f_n) \to \mu(f)$, since $\mu$ is a linear functional, and weak convergence would be if $\mu(f_n) \to \mu(f)$ then $f_n \to f$ in $C_b(X)$. Is this not correct? – MoneyBall Apr 23 '22 at 23:56
  • That would be the weak topology $\sigma(\mathcal{C}_b(X),\mathcal{M}(X))$. In Statistics and Probability it is the $\sigma(\mathcal{M}(X),\mathcal{C}_b(X))$-topology that is of interest (think of the Central Limit Theorem, for example). – Mittens Apr 24 '22 at 00:07
  • Okay, so we have a space $(\mathcal{M}(X), \mathcal{C}_b(X))$ and a linear functional $\int_X f_j d\mu$ that maps the measure function and continuous bounded function to the real space. Is the set of all these linear functionals the dual space of $(\mathcal{M}(X), \mathcal{C}_b(X))$? You used the notation $\mu$ to be the measure function and the linear functional which is slightly confusing. – MoneyBall Apr 24 '22 at 00:36
  • For the last question: with total variation, convergence in the strong topology implies convergence in the weak topology and not vice versa, so there should be more functions that are continuous in the weak topology. Is this generally the case, or are there other metrics where the other implication also holds? – MoneyBall Apr 24 '22 at 00:53
  • Correct, there are more continuous functions with respect to the total variation norm topology. In Polish spaces, the weak topology $\sigma(\mathcal{M}(X),\mathcal{C}_b(X))$, restricted to the convex set of probability measures, is metrizable (by the Prohorov metric, for example) and versions of Arzelà-Ascoli hold (sequential compactness). – Mittens Apr 24 '22 at 01:07
  • in the former, for (1) implies (2) in your posting. That is the problem with the total variation metric: there are way too many continuous functions, which makes convergence of probability measures too restrictive. – Mittens Apr 24 '22 at 01:44
  • The total variation function $\mu\mapsto\|\mu\|_{TV}$ is one example of a function that is continuous in the TV norm topology but not weakly continuous. – Mittens Apr 24 '22 at 01:55