I found a paragraph in a book about $SSD$ and can't understand one thing:

Most commonly, the distance measure is the sum of squared differences. For two images $f(x, y)$ and $g (x, y)$ it is defined as $$ SSD(d_1,d_2) = \sum_{i=-n_1}^{n_1} \sum_{j=-n_2}^{n_2} \big(f(x+i,\,y+j)-g(x+i-d_1,\,y+j-d_2)\big)^2 $$

where the summation extends over the region of size $(2n_1 + 1) \times (2n_2 + 1)$.

I don't understand why $i$ runs from $-n_1$ to $n_1$ rather than from $0$ to $n_1$, and similarly for $j$. And why does the summation extend over a region of size $(2n_1 + 1) \times (2n_2 + 1)$ rather than $n_1 \times n_2$?

Sammy Black
  • 25,273
  • As defined, SSD(d1,d2) appears to be an image itself for each d1 and d2: It depends on x and y. This might be correct, but I am unsure. – willem Jun 21 '12 at 19:26
  • (0, 0) represents the pixel in question. In order to go both positive and negative in both axes it is required to use both strictly positive (e.g. n1) and strictly negative (e.g. -n1) offsets. –  Jun 21 '12 at 19:06
  • But in Matrix, there are no negative offsets oO –  Jun 21 '12 at 19:17
  • It seems that $(i, j)$ represents the offset from the point $(x, y)$. So, for example, $(i, j) = (0, 0)$ corresponds to the point $(x, y)$ itself. The collection of $(i, j)$ that you sum over forms a $2$-dimensional array (matrix if you like) that happens to be indexed where $(0, 0)$ is in the middle. This is okay and makes a lot of sense, given the context. You can define $k = i + n_1 + 1$ and see that it ranges from $1$ to $2n_1 + 1$ if you insist. Analogously, for $\ell = j + n_2 + 1$. – Sammy Black Dec 13 '13 at 22:01
  • because it is used for finding correlation...its similar to convolution...therefore consideration starts when the last pixel of first image is multiplied with the first pixel of the second image and continues till first pixel of the first image is multiplied with the last pixel of the second image... hint: refer to convolution (see esp overlapping technique) –  Jun 07 '14 at 18:31

2 Answers

Certainly one could iterate a sum from $i = 0$ to $n_1$ and from $j=0$ to $n_2.$ The question is, what is the purpose of the sum?

The indexing scheme in the formula is apparently intended to extract two rectangular regions out of two much larger images. The rectangular regions are parallel to the coordinate axes, but each rectangle has a width, a height, and a position that need to be specified.

For reasons that may be explained in the parts of the book that have been omitted from the question, the author wanted to specify the position of each rectangle by the coordinates of a pixel in the exact center of the rectangle. Perhaps this is due to considerations of symmetry, or perhaps it is because we are really supposed to be interested in what is happening at a specific pixel in each image, and the other pixels are considered only as "neighbors" of the interesting pixel.

Whatever the motivation is, we get a rectangle by selecting a certain pixel to be in the center of the rectangle, and then extending the rectangle $n_1$ pixels to the left and to the right and $n_2$ pixels upward and downward.

The coordinates of the center of the rectangle inside the $f$ image are $(x,y),$ so the rectangle runs left and right from $x - n_1$ to $x + n_1$ and vertically from $y - n_2$ to $y + n_2.$

The coordinates of the center of the rectangle inside the $g$ image are not necessarily the same as those of the $f$ rectangle. Instead of $(x,y),$ they are $(x-d_1,y-d_2),$ so the rectangle runs left and right from $(x - d_1) - n_1$ to $(x - d_1) + n_1$ and vertically from $(y - d_2) - n_2$ to $(y - d_2) + n_2.$

If you take all the integers from $x - n_1$ to $x + n_1,$ there are $2n_1 + 1$ of them, so the $f$ rectangle is $2n_1 + 1$ pixels wide. For similar reasons, it is $2n_2 + 1$ pixels high, and the $g$ rectangle is the same size.
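To make the indexing concrete, here is a minimal Python sketch of the formula from the question. It is only an illustration, not code from the book: the function name `ssd`, the use of nested lists, and the convention that an image is indexed as `f[x][y]` are all assumptions made for this example.

```python
def ssd(f, g, x, y, d1, d2, n1, n2):
    """Sum of squared differences between the window of f centered at (x, y)
    and the window of g centered at (x - d1, y - d2).

    Assumption: images are nested lists indexed as img[x][y].
    """
    # Both windows must lie inside their images; otherwise Python's negative
    # indices would silently wrap around to the other side of the image.
    for img, cx, cy in ((f, x, y), (g, x - d1, y - d2)):
        if (cx - n1 < 0 or cy - n2 < 0
                or cx + n1 >= len(img) or cy + n2 >= len(img[0])):
            raise ValueError("window extends past the image border")

    total = 0
    # i and j are offsets from the center pixel, so they run over
    # 2*n1 + 1 and 2*n2 + 1 values respectively.
    for i in range(-n1, n1 + 1):
        for j in range(-n2, n2 + 1):
            diff = f[x + i][y + j] - g[x + i - d1][y + j - d2]
            total += diff * diff
    return total


# A 7 x 7 test image, and a copy g displaced so that g[u][v] == f[u + 1][v + 2].
f = [[7 * r + c for c in range(7)] for r in range(7)]
g = [[f[(u + 1) % 7][(v + 2) % 7] for v in range(7)] for u in range(7)]

print(ssd(f, g, 3, 4, 1, 2, 1, 1))  # 0: the displacement (1, 2) lines the windows up
print(ssd(f, f, 3, 4, 1, 0, 1, 1))  # 441: nine pixels, each differing by 7
```

With $n_1 = n_2 = 1$ the two loops visit $3 \times 3 = 9$ offsets, which is the $(2n_1 + 1) \times (2n_2 + 1)$ count from the question.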

David K
  • 98,388

The smallest possible region is $1 \times 1$, which corresponds to $n_1 = n_2 = 0$. That is, $n_1$ and $n_2$ start from $0$, not from $1$.

The region defined by the formula always has odd side lengths, with windows of size $1 \times 1, 3 \times 3, 5 \times 5, \dots$ or non-square ones such as $1 \times 3, 1 \times 5, \dots, 3 \times 1$ and so on.

For example, if you are comparing two pixels (i.e. one pixel in each image), you have a region of $1$ pixel. Let's say it is the fifth pixel in the first row: $x = 0$, $y = 4$. The pixel values are $10$ and $3$ for $f$ and $g$ respectively. For a region of one pixel, $2n_1 + 1 = 1 \Rightarrow n_1 = 0$, and the same goes for $n_2$. With no displacement ($d_1 = d_2 = 0$), the sum has a single term, with $i = j = 0$:

$SSD= (f(x+i,y+j) - g(x+i,y+j))^2$

$SSD= (f(0+0,4+0) - g(0+0,4+0))^2$

$SSD= (f(0,4) - g(0,4))^2$

$SSD= (10 - 3)^2 = 49$

Basically, you are comparing the value of the same pixel location with no offset around it.

If the window is bigger, you sum the squared differences between the corresponding pixels in that window. Each difference is squared first and the squares are then added; the sum itself is not squared.
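As a rough check of the example above, the following Python sketch evaluates the formula with $d_1 = d_2 = 0$ for a $1 \times 1$ and a $3 \times 3$ window. The pixel values other than $10$ and $3$ are made up for illustration, and the helper name `ssd_no_shift` and the `img[x][y]` indexing (with $x$ as the row, as in the example) are assumptions.

```python
def ssd_no_shift(f, g, x, y, n1, n2):
    """SSD with d1 = d2 = 0: compare the same window location in f and g."""
    return sum(
        (f[x + i][y + j] - g[x + i][y + j]) ** 2
        for i in range(-n1, n1 + 1)
        for j in range(-n2, n2 + 1)
    )


# Two small 3 x 6 images; f[0][4] == 10 and g[0][4] == 3 as in the example.
f = [[1, 1, 1, 1, 10, 1],
     [1, 1, 1, 2, 5, 1],
     [1, 1, 1, 1, 1, 4]]
g = [[1, 1, 1, 1, 3, 1],
     [1, 1, 1, 4, 5, 1],
     [1, 1, 1, 1, 1, 1]]

# 1 x 1 window (n1 = n2 = 0) at x = 0, y = 4.
print(ssd_no_shift(f, g, 0, 4, 0, 0))  # 49 = (10 - 3)^2

# 3 x 3 window (n1 = n2 = 1) centered at x = 1, y = 4.
# The nonzero differences are 7, -2 and 3.
print(ssd_no_shift(f, g, 1, 4, 1, 1))  # 62 = 49 + 4 + 9

# Squaring the summed differences instead gives a different number.
diffs = [f[1 + i][4 + j] - g[1 + i][4 + j]
         for i in (-1, 0, 1) for j in (-1, 0, 1)]
print(sum(diffs) ** 2)  # 64 = (7 - 2 + 3)^2
```

The last line shows why the order matters: positive and negative differences can cancel if they are added before squaring, which is exactly what the SSD avoids.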

N. Osil
  • 139