
I am working a bit in applied mathematics, but not in theoretical mathematics. I dabble a bit in general topology, but if possible would like to ask you to consider my level of know-how and add some intuition into your answer. I can handle some formulas, but since I am not a professional mathematician I am not fluent in all the lingo... so basic terminology is appreciated.


In data analytics / data science, we have premetrics, like Dynamic time warping, which can be calculated between two real-valued sequences, even if the sequences do not have the same length (so they are not vectors). As such, we can say that the space created by the premetric is a topological space.

When we deal with nominal-valued sequences (aka strings), like human words, we have only equality at our disposal as an operation for comparing elements of the sequences. To calculate a metric distance between such sequences, we have e.g. the Levenshtein distance. This metric only outputs integer values as distances. This would create a metric space, I guess, however not in the normal sense that we have infinitesimally close neighbors for each sequence/point.

Usually a metric space would induce a topological space. Usually topological spaces concretize the notion of what "the closest next point" is. Here, the "closest next point" is not infinitesimally close. Can we still talk about topological spaces in this context?

I have not read anywhere that the codomain of the metric of a metric space needs to be a field (which integers are not), but are we still talking about a metric space for the Levenshtein distance?

How should I think about the fact that the Levenshtein distance has integers as output?

Make42

1 Answer


This question is predicated on a false assumption, namely that

in the normal sense [of "metric space"] we have infinitesimally close neighbors for each sequence/point.

Most basically, there are no such things as "infinitesimally close points" in a metric space: the distance between any two points is always a (nonnegative) real number, and there are no infinitesimal real numbers.

Moreover, the claim that "topological spaces concretize the notion of what 'the closest next point' is" is wrong for the same reason: we almost never think about "closest next points."

Even ignoring that, it sounds like you're thinking of metric spaces without isolated points (or maybe complete metric spaces, or complete and connected metric spaces, or something similar). These are just a particular type of metric space; there's no requirement that all metric spaces "look like that."

Indeed, a metric space is merely any set $X$ equipped with a function $\delta: X\times X\rightarrow\mathbb{R}_{\ge 0}$ satisfying three basic properties:

  • $\delta(x,y)=0\leftrightarrow x=y$.

  • $\delta(x,y)=\delta(y,x)$.

  • $\delta(x,y)+\delta(y,z)\ge\delta(x,z)$.

The metric function $\delta$ could be (nonnegative-)integer valued; that's totally fine.
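To make this concrete, here is a minimal Python sketch that checks the three axioms for the Levenshtein distance on a small sample of strings. The helper names (`levenshtein`, `satisfies_metric_axioms`) and the sample words are made up for illustration, and a check on finitely many strings is only a sanity test, not a proof.

```python
from itertools import product


def levenshtein(a: str, b: str) -> int:
    """Edit distance: minimum number of single-character insertions,
    deletions, or substitutions needed to turn a into b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = curr
    return prev[-1]


def satisfies_metric_axioms(points, dist) -> bool:
    """Check the three axioms on every pair and triple of a finite sample."""
    for x, y in product(points, repeat=2):
        if (dist(x, y) == 0) != (x == y):  # zero distance iff equal
            return False
        if dist(x, y) != dist(y, x):       # symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if dist(x, y) + dist(y, z) < dist(x, z):  # triangle inequality
            return False
    return True


# Hypothetical sample of strings, chosen only for illustration.
words = ["", "cat", "cart", "art", "kitten", "sitting"]
print(levenshtein("kitten", "sitting"))             # 3
print(satisfies_metric_axioms(words, levenshtein))  # True
```

All distances here are nonnegative integers, and nothing in the axioms objects to that.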


That said, there is a sense in which integer-valued metrics are "less topology-flavored" than one might expect.

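Suppose $(X,\delta)$ is a metric space with $\operatorname{ran}(\delta)\subseteq\mathbb{Z}$. Then the topology on $X$ induced by $\delta$ is discrete: every set is open. (Distinct points are at distance at least $1$, so the open ball of radius $1$ around any point $x$ is just $\{x\}$; every singleton is open, and hence so is every union of singletons.) In particular, any two integer-valued metrics on the same set yield the same topology on that set.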
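A small Python sketch of this, under stated assumptions: to keep it short, the Hamming distance on binary strings of length 3 stands in for the Levenshtein distance (both are integer valued), and it is compared with the 0/1 discrete metric on the same set. The function names are made up for illustration. The two metrics disagree about distances, yet under both of them every open ball of radius 1 is a single point, so both induce the discrete topology.

```python
from itertools import product


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(ca != cb for ca, cb in zip(a, b))


def discrete_metric(a: str, b: str) -> int:
    """The 0/1 metric: 0 for equal points, 1 otherwise."""
    return 0 if a == b else 1


def open_ball(center, radius, points, dist):
    """All sample points at distance strictly less than radius from center."""
    return {p for p in points if dist(center, p) < radius}


# The common underlying set: all binary strings of length 3.
points = ["".join(bits) for bits in product("01", repeat=3)]

# The two metrics give different distances ...
print(hamming("000", "111"), discrete_metric("000", "111"))  # 3 1

# ... but under both, the open ball of radius 1 around p is just {p}.
singletons_open = all(
    open_ball(p, 1, points, dist) == {p}
    for dist in (hamming, discrete_metric)
    for p in points
)
print(singletons_open)  # True
```

Larger balls do differ between the two metrics (for example radius 2), which is exactly the extra information a metric carries beyond its topology.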

This means that if I want to compare integer-valued metrics - or more generally, discrete metric spaces, which are spaces where for each point $x$ there is some $u>0$ such that every point other than $x$ is at distance at least $u$ from $x$ - I have to use "finer-grained" notions than the usual ones coming from topology. But that doesn't mean that integer-valued metrics aren't allowed, it just means that I might have to be careful about what questions I ask about them if I want to get meaningful answers.
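As a small illustration of why "discrete metric space" is the more general notion (a standard example, not taken from the question): let $X=\{1/n : n=1,2,3,\dots\}$ with the usual distance $\delta(x,y)=|x-y|$. The nearest other point to $1/n$ is $1/(n+1)$, so

$$u_n=\frac{1}{n}-\frac{1}{n+1}=\frac{1}{n(n+1)}>0$$

works as the gap $u$ for the point $x=1/n$. Every point is isolated, so the space is discrete, even though $\delta$ is not integer valued and no single $u$ works for all points at once (since $u_n\to 0$).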

Noah Schweber
  • I am not familiar with $ran(·)$. I tried to get an idea at https://en.wikipedia.org/wiki/Ran_space (which I did not understand), but maybe you mean something different anyway. Can you explain this part (maybe also with an example, as I am not a full-time mathematician)? – Make42 Nov 09 '20 at 19:30
  • According to https://en.wikipedia.org/wiki/Discrete_space, the image of a metric of a discrete metric space takes only 0 and 1 as values, but my example metric takes other integer values. What am I misunderstanding, why do you say that a discrete metric space is more general than an integer-valued space? I would have expected that an integer-valued space is more general. – Make42 Nov 09 '20 at 19:47
  • @Make42 "ran" just means range - the set of values the function takes on. "Image" is also used here. As to "discrete space," there's an annoying difference between the discrete metric (which is as you describe) and a discrete metric space - the latter being a metric space whose induced topology is discrete. This can be annoying. The discrete metric is an integer-valued metric, and every integer-valued metric yields the discrete topology. – Noah Schweber Nov 09 '20 at 19:52
  • Thank you for the clarifications. 1) The range/image would only be the positive integers, while the codomain would be the integers - did I get this right? Or are there also metrics that use all integers? 2) If two integer-valued metrics have very different ways of calculating distances, why do they result in the same topology? – Make42 Nov 09 '20 at 20:40
  • @Make42 (1) Yes, metrics only output nonnegative reals, so I could have written "$\mathbb{N}$" (under the convention that $0\in\mathbb{N}$) instead of "$\mathbb{Z}$." (2) Basically because a metric has more information than a topology: two very different metrics may induce the same topology. (On the other hand there are topologies which don't come from metrics at all.) You should look into the definition of "topological space" for this. Regardless we're straying from the main point, which is that your integer-valued metrics are just ordinary metrics. – Noah Schweber Nov 09 '20 at 22:00
  • Back to topic: https://en.wikipedia.org/wiki/Topological_space writes "The definition of a topological space [..] is the most general notion of a mathematical space that allows for the definition of concepts such as continuity, connectedness, and convergence." This does not seem to be the case anymore if the underlying set has isolated points. – Make42 Nov 10 '20 at 12:27
  • @Make42 Regardless, they are perfectly valid topological spaces. Read the formal definition as opposed to the vague one-sentence summary. (And moreover you're misinterpreting that sentence: it's saying that the language of topological spaces lets us define connectedness etc., not that every topological space is connected, etc.) – Noah Schweber Nov 10 '20 at 16:04