
In machine learning we want to train a model, and if the dimension of the data is high, we run into a problem (the curse of dimensionality), so we want to reduce the dimension of our data.

We know that $\mathbb{R}^n$ and $\mathbb{R}$ have the same cardinality, so we always have some space-filling curve that maps each point uniquely in both directions; i.e., we can always map any $n$-dimensional data bijectively to one-dimensional data. So what is the problem in the first place?

I can think of two problems that may still remain:

  1. With this space-filling-curve map we can't reduce the size of the data; i.e., we have to increase the precision when we write it in one dimension (see the sketch after this list).
  2. This is where I have doubts. I think that when we represent data in $\mathbb{R}^n$ we have more information than when we represent it in $\mathbb{R}$. There is some structure when we write data in $\mathbb{R}^n$, such as which point is near to which point, that is lost under this map (the space-filling curve); i.e., the map is not structure-preserving (not a homeomorphism).
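
For concreteness, here is a minimal sketch of what I mean in point 1 (plain Python; the helper name is just for illustration): if we encode two coordinates into a single number by interleaving their decimal digits, the one-dimensional representation needs twice as many digits, i.e., twice the precision.

```python
def interleave_digits(x_digits: str, y_digits: str) -> str:
    """Interleave the decimal digit strings of two coordinates into one string."""
    assert len(x_digits) == len(y_digits)
    return "".join(a + b for a, b in zip(x_digits, y_digits))

x, y = "1415", "2653"          # two coordinates, 4 digits of precision each
z = interleave_digits(x, y)    # "12461553": 8 digits, so the precision has doubled
print(z, len(z))               # 12461553 8
```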

My question:

  1. Is what I am thinking right?
  2. I don't know exactly what information I am talking about in point 2. Could you help me make it more rigorous?
  3. Is there any other problem?

Example:

Suppose we have training data with $x^i \in \mathbb{R}^n$ and labels $y^i \in \mathbb{R}$, where $i \in \{1,2,\dots,N\}$, and we train a neural network to fit this data. If we change the order of the coordinates, i.e., instead of

$$x^i = (x_1^i,x_2^i,\dots,x_n^i)$$

we take

$$\tilde{x}^i = (x_{\rho(1)}^i,x_{\rho(2)}^i,\dots,x_{\rho(n)}^i)$$

where $\rho$ is some permutation (the same for all $i$), then the training of the neural network is not affected. So if the order of coordinates doesn't matter, what if I transform all the data from $\mathbb{R}^n$ to $\mathbb{R}$? That shouldn't matter either. But it does matter; otherwise there would be no such thing as the curse of dimensionality.
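
To check the permutation claim concretely, here is a small sketch (NumPy; the toy data and sizes are made up) showing that applying one fixed permutation $\rho$ to every $x^i$ leaves all pairwise distances between the data points unchanged, which is why the geometry the network sees is the same:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))      # toy data set: N = 100 points in R^5
rho = rng.permutation(5)           # one fixed permutation, the same for every i
X_perm = X[:, rho]                 # tilde{x}^i = (x_{rho(1)}^i, ..., x_{rho(n)}^i)

def pairwise_distances(A):
    """All pairwise Euclidean distances between the rows of A."""
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# The distance matrix is identical before and after permuting the coordinates.
print(np.allclose(pairwise_distances(X), pairwise_distances(X_perm)))  # True
```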

I think that when we transform the data from $\mathbb{R}^n$ to $\mathbb{R}$, we lose some information or do something wrong. What is it?

  • Generic bijective maps $\Bbb R\to\Bbb R^n$ do exist, but space-filling curves (curve = continuous map from an interval in $\Bbb R$ to somewhere) onto a larger $\Bbb R^n$ can never be injective. This is essentially a consequence of Baire category theorem. –  Sep 01 '21 at 07:30
  • Vice versa, as you kind of mention, there is no injective continuous map $\Bbb R^n\to \Bbb R$ (surjective or otherwise) for $n>1$. This goes by the name of the theorem of invariance of domain. –  Sep 01 '21 at 07:36
  • Equality of cardinality, space filling maps, and dimension reduction of data are fundamentally different phenomena which, to my knowledge, have essentially nothing to do with one another. – Kajelad Sep 01 '21 at 10:38

1 Answer


The fact that $\mathbb{R}$ and $\mathbb{R}^n$ have the same cardinality is seriously overrated and, quite frankly, not really useful information. For example, from a set-theoretic perspective there is no difference between a sphere and a line, but I'm sure you agree that these sets (treated as metric subspaces) are quite different, and we need different machinery (models) to describe them.

So this boils down to the fact that once you look at $\mathbb{R}^n$ as a mere set, instead of as a richer structure (a normed vector space), you lose important information. And indeed, the information we are mostly interested in is metric/topological, and sometimes the algebraic structure.

Note that "good" transformations are possible and are being done. Like if we preserve metric (isometry) or at least topology (homeomorphism). But $\mathbb{R}^n$ is not homeomorphic to $\mathbb{R}^m$ for $n\neq m$, let alone isometric. One of the examples of such a "good" transformation is what you've given us: permutation of coordinates (but only if a fixed permutation is applied to the entire data set). It is an isometry from $\mathbb{R}^n$ to itself.

Also note that space-filling curves are not useful here either, since they are not injective, and therefore you lose even more information: you now map different points to the same point. Also (arguably more importantly), space-filling curves generally behave wildly: they deform shapes beyond recognition, they are hard to compute, and they are unstable to use. Maybe you can do some analysis after such a reduction, but I don't think anyone has tried it in a serious way (I might be wrong, though).
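
To illustrate the "behave wildly" point, here is a small sketch using Morton (Z-order) bit interleaving, a discrete stand-in for a space-filling curve (the functions below are written just for this example): points whose codes are consecutive on the 1D line can be far apart in the plane, so neighborhoods are badly distorted.

```python
def morton_encode(x, y, bits=8):
    """Interleave the bits of x and y into a single integer (Z-order / Morton code)."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)        # bits of x go to even positions
        code |= ((y >> i) & 1) << (2 * i + 1)    # bits of y go to odd positions
    return code

def morton_decode(code, bits=8):
    """Inverse of morton_encode."""
    x = y = 0
    for i in range(bits):
        x |= ((code >> (2 * i)) & 1) << i
        y |= ((code >> (2 * i + 1)) & 1) << i
    return x, y

# Consecutive 1D codes can decode to distant 2D points:
for c in (7, 31):
    (x1, y1), (x2, y2) = morton_decode(c), morton_decode(c + 1)
    dist = ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5
    print(f"codes {c} and {c + 1} -> {(x1, y1)} and {(x2, y2)}, 2D distance {dist:.2f}")
# codes 7 and 8   -> (3, 1) and (0, 2), 2D distance 3.16
# codes 31 and 32 -> (7, 3) and (0, 4), 2D distance 7.07
```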

In general there is no way to reduce the dimension of an arbitrary data set while preserving the distances between points. Hence the curse. Sometimes we do, however, apply dimensionality reduction and simply accept the fact that we lose some information: we slightly move points around, or perhaps remove some points, so that the data set changes but its general shape stays more or less the same. As long as the loss is not too big, we are fine with it. Note that these techniques are far from trivial.
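
For illustration only (this is just one standard technique, not the only one): a Gaussian random projection, in the spirit of the Johnson-Lindenstrauss lemma, reduces dimension while keeping pairwise distances approximately, but never exactly, preserved. The sizes below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(42)
N, n, k = 100, 500, 30
X = rng.normal(size=(N, n))                  # N points in R^n

# Gaussian random projection R^n -> R^k with k << n.
R = rng.normal(size=(n, k)) / np.sqrt(k)
Y = X @ R

def pairwise_distances(A):
    diff = A[:, None, :] - A[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

D_high, D_low = pairwise_distances(X), pairwise_distances(Y)
off_diag = ~np.eye(N, dtype=bool)
ratios = D_low[off_diag] / D_high[off_diag]

# Ratios cluster around 1: distances are roughly preserved, but there is always some loss.
print(ratios.min(), ratios.max())
```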

freakish
  • You can't ensure injectivity for new data anyway when you do some dimensionality reduction (and your vectors are approximations of the real data by a non-injective process anyway). What I find problematic with the OP's idea for reducing dimension is that non-injectivity makes one add complete artifacts to the new data set that clearly obfuscate the underlying patterns (just look at one space-filling curve). It's worthwhile to add that space-filling curves have terrible regularity (at best $\alpha$-Hölder with $\alpha \leq 1/(d-1)$), so they are very impractical to compute and unstable to use. – Aphelli Sep 01 '21 at 16:12
  • @Mindlack good point, I've slightly modified the answer. – freakish Sep 01 '21 at 16:49