I'm trying to come up with a rationalisation for using the Euclidean distance in an application of mine. Any thoughts on why it is the fundamental choice?

Thanks

Conor
  • I am not sure whether I understand your question. The Euclidean distance is the "normal distance in the real world". In other words, if you take a ruler and measure the distance between two points, then it's the Euclidean distance. – Simon Markett Aug 21 '12 at 14:09
  • Is there another metric you're considering? – axblount Aug 21 '12 at 14:10
  • At the moment it sounds like you want to rationalize it because you want to use it. Shouldn't you rationalize it because the evidence indicates it is the best choice? If that is the case, then we need information about your application, so we can reason that it is the best choice. – rschwieb Aug 21 '12 at 14:12
  • I guess I was considering $\left(\sum_{i=1}^{n} |x_i - y_i|^k\right)^{1/k}$, where $k=2$ is the standard. – Conor Aug 21 '12 at 14:16
  • This is a very good question. I don't understand why it got downvoted. – Christian Blatter Aug 21 '12 at 14:43
  • I would suggest that you migrate the question to http://physics.stackexchange.com/. I agree with Christian Blatter that the question is good, but (some) mathematicians don't like questions about why something is natural in relation to the real world. However, maybe physicists would be able to say more about finding distances in the real world and why the Euclidean metric is "natural". From what I understand, this might actually not be the case when you look out into all of space. – Thomas Aug 21 '12 at 15:42
  • If the situation naturally fits the metric then there's no reason to "come up with" a rationalization. Inventing a rationalization without appealing to the situation seems dishonest. There probably are such reasons the OP has in mind, so I wish @conor would let us in on them. – rschwieb Aug 21 '12 at 15:59
  • @Thomas I don't think this question has any relation to physics at all - indeed, my first presumption was that it was closer to a fitness measure for statistical applications! I think this is a perfectly natural question here. – Steven Stadnicki Aug 21 '12 at 16:40
  • @StevenStadnicki: You are probably right. It is still not clear to me though what the OP is asking about. He hasn't said anything about the "application" of his. As to the general question about why the Euclidean metric is the natural choice, I would still say that might be better answered in physics. – Thomas Aug 21 '12 at 16:51
  • @Conor: I don't understand the question. Depending on the application, it may not be the natural choice. What is the application? – Qiaochu Yuan Aug 21 '12 at 17:32
  • A better formulation of this question might be: "Under what conditions is the Euclidean metric the best choice, and why does this set of conditions prevail in such a diverse set of applications?" – Waylon Flinn Dec 07 '13 at 04:28

2 Answers

Beginning with the one-dimensional case: On ${\mathbb R}$ we have translations $t\mapsto t+a$, $\, a$ fixed, and reflexions $t\mapsto -t$ as natural "geometric" isomorphisms. A translation and reflexion invariant metric then necessarily is of the form $d(x,y)=\phi\bigl(|x-y|\bigr)$ where $\phi$ should satisfy some "technical" conditions to make $d$ a metric which in addition is compatible with the inborn topological structure of ${\mathbb R}$. There are many such $\phi$; e.g. the definition $d(x,y):=\tanh\bigl(|x-y|\bigr)$ would turn ${\mathbb R}$ into a bona-fide metric space where all distances are $<1$.
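As a quick numerical sanity check of the $\tanh$ example, here is a minimal sketch. The names `d_tanh` and `check_tanh_metric`, the sampling range, and the tolerances are made up for illustration, and a random spot-check is of course not a proof. It tests translation invariance, reflexion invariance, the bound $d<1$, and the triangle inequality on random samples:

```python
import math
import random

def d_tanh(x: float, y: float) -> float:
    """Bounded, translation-invariant distance on R: tanh(|x - y|)."""
    return math.tanh(abs(x - y))

def check_tanh_metric(trials: int = 1000, seed: int = 0) -> bool:
    """Spot-check metric properties on random samples (illustrative only)."""
    rng = random.Random(seed)
    tol = 1e-9  # allowance for floating-point rounding
    for _ in range(trials):
        # Points are kept in [-5, 5] so that tanh stays strictly below 1.0
        # in double precision.
        x, y, z, a = (rng.uniform(-5.0, 5.0) for _ in range(4))
        # Translation invariance: shifting both points by a changes nothing.
        if abs(d_tanh(x + a, y + a) - d_tanh(x, y)) > tol:
            return False
        # Reflexion invariance: t -> -t changes nothing.
        if abs(d_tanh(-x, -y) - d_tanh(x, y)) > tol:
            return False
        # Triangle inequality.
        if d_tanh(x, z) > d_tanh(x, y) + d_tanh(y, z) + tol:
            return False
        # All distances are below 1.
        if not d_tanh(x, y) < 1.0:
            return False
    return True

print(check_tanh_metric())
```

The triangle inequality holds here because $\tanh$ is increasing and concave on $[0,\infty)$ with $\tanh(0)=0$, hence subadditive.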

But in ${\mathbb R}$ we have an additional set of "geometric" isomorphisms, namely scalings. If we want our metric to behave in a reasonable way under scalings $T_\lambda: \ x\mapsto \lambda x$, $\,\lambda>0$ fixed, then the only $\phi$s left are the functions $\phi(u)= cu$, $\, c>0$ fixed, and we may as well choose $c=1$, so that we arrive at $d(x,y)=|x-y|$.
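To spell out the scaling step: one natural reading of "in a reasonable way" (an assumption here, since the condition is not stated explicitly) is homogeneity, $d(\lambda x,\lambda y)=\lambda\, d(x,y)$ for all $\lambda>0$. Since $d(\lambda x,\lambda y)=\phi\bigl(\lambda|x-y|\bigr)$, this reads

$$\phi(\lambda u)=\lambda\,\phi(u)\quad(\lambda>0,\ u\geq 0)\qquad\Longrightarrow\qquad \phi(\lambda)=\lambda\,\phi(1)=c\lambda\ ,\quad c:=\phi(1)>0\ .$$

The $\tanh$ example fails this test: $\tanh(2)\approx 0.964$, whereas $2\tanh(1)\approx 1.523$.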

Now the two-dimensional case: Symmetry considerations like the above imply that we should choose a direction-dependent $\phi:\ S^1\to {\mathbb R}_{>0}$ which is even and satisfies a certain convexity condition; then we should put $$d(x,y):=\phi\left({x-y\over |x-y|}\right)\ |x-y|\ .$$ This metric is translation invariant and behaves correctly under scalings.

But again, in ${\mathbb R}^2$ new sets of "geometric" isomorphisms are available, namely compact one-parameter groups of "rotations". If we want our $d$ to be invariant under such a group the only candidates left are of the form $$\|x\|^2:=\bigl(d(x,0)\bigr)^2= x'Qx\ ,$$ where $Q$ is a positive definite quadratic form of the coordinate variables $x_1$, $x_2$. Introducing a coordinate system adapted to the $Q$ at hand we then arrive at $\|x\|^2=x_1^2+x_2^2$, i.e., the Euclidean distance function.
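The rotation argument can be illustrated numerically. The sketch below assumes the standard rotations of the plane (that is, coordinates already adapted to $Q$); the helper names `p_norm` and `rotate` are hypothetical. It rotates the vector $(1,0)$ by $45^\circ$ and compares its length before and after in the $\ell_1$, $\ell_2$ and $\ell_\infty$ norms:

```python
import math

def p_norm(v, p):
    """l_p norm of a vector; p = math.inf gives the maximum norm."""
    if math.isinf(p):
        return max(abs(c) for c in v)
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def rotate(v, theta):
    """Rotate a 2D vector counterclockwise by theta radians."""
    x, y = v
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)

v = (1.0, 0.0)
w = rotate(v, math.pi / 4)  # the same vector turned by 45 degrees

for p in (1, 2, math.inf):
    # Print the norm before and after the rotation for each p.
    print(p, p_norm(v, p), p_norm(w, p))
```

Only for $p=2$ do the two values agree (up to rounding): the rotated vector has $\ell_1$ length $\sqrt2$ and $\ell_\infty$ length $\sqrt2/2$, while its Euclidean length is still $1$.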

  • Note that the argument for higher-dimensional actions such as rotations breaks down when the different coordinates of data are heterogeneous - for instance, in a statistical application, categories like Age and Income - and that metrics like the $\ell_1$ or even $\ell_\infty$ metrics can be better choices. – Steven Stadnicki Aug 21 '12 at 16:42

The Euclidean norm, $\|\cdot\|:\mathbb{R}^{n}\to\mathbb{R}$, is the most intuitive of the norms: it gives us the straight-line distance (as we are used to thinking about it) from the origin to the position defined by the vector in question. The Euclidean norm is defined as:

$$\|\vec{v}\|=\sqrt{\sum_{i=1}^{n}{|v_{i}|^{2}}}$$

There are of course other norms, such as the Taxicab norm, $\|\cdot\|_{1}:\mathbb{R}^{n}\to\mathbb{R}$ defined as:

$$\|\vec{v}\|_{1}=\sum_{i=1}^{n}|v_{i}|,$$

As Wikipedia puts it: "The name relates to the distance a taxi has to drive in a rectangular street grid to get from the origin to point $\vec{v}$."

More generally, for any real $x\geq 1$ we can define a norm $\|\cdot\|_{x}:\mathbb{R}^{n}\to\mathbb{R}$ by:

$$\|\vec{v}\|_{x}=\sqrt[x]{\sum_{i=1}^{n}{|v_{i}|^{x}}}$$
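To compare these norms on concrete data, here is a minimal sketch of the induced distance $\|\vec{u}-\vec{v}\|_{x}$. The function name `minkowski_distance` and the parameter name `exponent` (playing the role of $x$ above) are chosen for illustration only; `math.inf` is accepted as the limiting case, the maximum norm.

```python
import math
from typing import Sequence

def minkowski_distance(u: Sequence[float], v: Sequence[float],
                       exponent: float = 2.0) -> float:
    """Distance between u and v induced by the norm with the given exponent."""
    if len(u) != len(v):
        raise ValueError("points must have the same dimension")
    if exponent < 1:
        # Below 1 the triangle inequality fails, so this is no longer a norm.
        raise ValueError("exponent must be at least 1")
    diffs = [abs(a - b) for a, b in zip(u, v)]
    if math.isinf(exponent):
        # Limiting case: the largest coordinate difference.
        return max(diffs)
    return sum(d ** exponent for d in diffs) ** (1.0 / exponent)

origin, point = (0.0, 0.0), (3.0, 4.0)
for exponent in (1, 2, math.inf):
    print(exponent, minkowski_distance(origin, point, exponent))
```

For the points $(0,0)$ and $(3,4)$ this gives $7$ for the taxicab norm, $5$ for the Euclidean norm, and $4$ for the maximum norm, so the choice of norm changes the answer substantially even in two dimensions.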

Which of these norms is best for your application depends on what exactly you are trying to measure. We need more information to give you a better idea of which of these to use for your application. For most applications (such as games, CAD packages, etc.) we are interested in the real-world distance, so we use the Euclidean norm.

Thomas Russell