6

As I understand it, convolution is one way to describe how two functions relate to each other.

According to Wikipedia:

The convolution of $f$ and $g$ is written $f∗g$, using an asterisk or star. It is defined as the integral of the product of the two functions after one is reversed and shifted.

I can more or less accept the shift operation. But why do we need to reverse one of the functions? If we just want to make the two functions overlap, I think shifting is enough.

ADD 1

An interesting article recommended by my lecturer: Understanding Convolution

  • 2
    Maybe if you pondered why polynomial multiplication makes sense (convolution is just an extension of that) you would think it is more natural. – rschwieb Apr 13 '17 at 09:27
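rschwieb's comment can be made concrete: the coefficients of a polynomial product are exactly the discrete convolution of the two coefficient lists. A quick sketch of this (my own addition, assuming NumPy):

```python
import numpy as np

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
p = [1, 2]   # coefficients of 1 + 2x, in ascending powers
q = [3, 4]   # coefficients of 3 + 4x
print(np.convolve(p, q))   # coefficients of the product: [3, 10, 8]
```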

4 Answers

13

Well, shifting is enough in the sense that reversing does not change the mathematical object of convolution in any essential way. But the conventional definition includes the reversal because of several conveniences. For example:

(1) The property of commutativity, that is, $f*g=g*f$, is lost without reversing;

(2) The property that convolution is multiplication on the Fourier side, that is $\mathcal{F}(f*g)=\mathcal{F}(f)\mathcal{F}(g)$ where $\mathcal{F}$ denotes the Fourier transform, is lost without reversing;

Etc.
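Both conveniences are easy to verify numerically. The following sketch (my own addition, using NumPy; not part of the original answer) checks commutativity with `np.convolve`, and checks the convolution theorem for the circular convolution via the discrete Fourier transform:

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal(8)
g = rng.standard_normal(8)

# (1) Commutativity: f*g == g*f.
assert np.allclose(np.convolve(f, g), np.convolve(g, f))

# (2) Convolution theorem for the circular convolution:
#     F(f*g) = F(f) . F(g), checked against the definition.
circ = np.fft.ifft(np.fft.fft(f) * np.fft.fft(g)).real
brute = np.array([sum(f[k] * g[(n - k) % 8] for k in range(8))
                  for n in range(8)])
assert np.allclose(circ, brute)
```

Note the index $n - k$ in the brute-force sum: the reversal is exactly what makes the Fourier identity come out cleanly.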

shrinklemma
  • 1,755
9

Here is one way to discover the discrete convolution operation. Let $S: \mathbb R^n \to \mathbb R^n$ be the circular shift operator defined by $$ S \begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{n-1} \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_0 \end{bmatrix}. $$ Assume that the linear operator $A:\mathbb R^n \to \mathbb R^n$ is "shift-invariant" in the sense that $A(Sx) = S(Ax)$ for all $x$. In other words, if you shift the input, the output is shifted the same way. We can easily check that $A S^k x = S^k A x$ for all integers $k$, positive or negative. Shift-invariant linear operators are of fundamental importance, and for example they are useful in signal processing and numerical differentiation.

Key idea: Let's find the solution to $Ax = b$, assuming that we have already found a "fundamental solution" $x_0$ which satisfies $$ Ax_0 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = e_0. $$

We are off to a good start if we notice that \begin{align} A S^{-1} x_0 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = e_1 \end{align} and in fact $$ A S^{-j} x_0 = e_j \quad \text{for } j = 0,\ldots,n-1. $$ It follows that \begin{align} A(b_0 x_0 + b_1 S^{-1} x_0 + b_2 S^{-2} x_0 + \cdots + b_{n-1} S^{-(n-1)} x_0) &= b_0 e_0 + b_1 e_1 + \cdots + b_{n-1} e_{n-1} \\&=b. \end{align} We have solved $Ax = b$.

The solution to $Ax = b$ is a particular combination of $x_0$ and $b$; this combination is called the "convolution" of $x_0$ and $b$, and is denoted $x_0 \ast b$. We have discovered the convolution operation. This explains why we care about convolution, and why convolution is defined in the way that it is.

We can discover the convolution of functions using a similar line of reasoning, with the delta function $\delta$ in place of $e_0$.
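The construction above can be checked numerically. Here is a sketch (my own addition, not littleO's, assuming NumPy and using a circulant matrix as the shift-invariant operator):

```python
import numpy as np

n = 5
rng = np.random.default_rng(1)

# A circulant matrix is shift-invariant: A[i, j] = c[(j - i) % n],
# so each row is a circular shift of the first row.
c = rng.standard_normal(n)
A = np.array([np.roll(c, i) for i in range(n)])

# Fundamental solution: A x0 = e0.
e0 = np.zeros(n)
e0[0] = 1.0
x0 = np.linalg.solve(A, e0)

# Solve A x = b via the convolution x = sum_j b_j S^{-j} x0,
# where S^{-j} is implemented as np.roll(., j).
b = rng.standard_normal(n)
x = sum(b[j] * np.roll(x0, j) for j in range(n))

assert np.allclose(A @ x, b)   # Ax = b, as claimed
```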

littleO
  • 51,938
6

It depends on what you want out of it: just as there are different ways to define the Fourier transform, with very similar but slightly different algebraic properties, one can also define different convolution-type integrals. The most common alternative to the convolution is the cross-correlation, which is frequently used in signal processing.

\begin{alignat}{3} \text{Convolution:} &\;\;[f*g](t) &= \int f(\tau)g(t-\tau){\,\rm d}\tau &= \int f(t-\tau)g(\tau){\,\rm d}\tau\\ \text{Cross correlation:} &\;\;[f\star g](t) \ &=\int \overline{f(\tau)}g(t+\tau){\,\rm d}\tau &=\int \overline{f(\tau -t)} g(\tau){\,\rm d}\tau \end{alignat}

Cross-correlation is nice since it measures how similar two signals are, making it very useful in applications. But mathematicians typically use the convolution instead, since it offers the nicest package of algebraic properties.

In particular, what stands out to me is that the Dirac delta $\delta$ acts like a multiplicative unit under convolution ($f*\delta = \delta*f = f$ for all $f$), which leads to the beautiful solution theory of linear PDEs, whereas for the cross-correlation $[f\star\delta](t)=\overline{f}(-t) \neq f = \delta\star f$.

On the other hand, properties like the convolution theorem $\mathcal F(f*g) = \mathcal F(f)\cdot \mathcal F(g)$ that shrinklemma mentioned are not that unique; for example, the cross-correlation satisfies the analogous $\mathcal F(f\star g) = \overline{\mathcal F(f)}\cdot \mathcal F(g)$.
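The contrast between the two operations shows up directly in the discrete case. A sketch (my own addition, using NumPy's `np.convolve` and `np.correlate`):

```python
import numpy as np

f = np.array([1.0, 2.0, 3.0])
g = np.array([0.0, 1.0, 0.5])
delta = np.array([1.0])   # discrete unit impulse

# Convolution is commutative; cross-correlation in general is not.
assert np.allclose(np.convolve(f, g), np.convolve(g, f))
print(np.correlate(f, g, mode='full'))
print(np.correlate(g, f, mode='full'))

# The delta is a unit for convolution ...
assert np.allclose(np.convolve(f, delta), f)
# ... but cross-correlating the delta against f reverses f,
# the discrete analogue of f * delta = conj(f)(-t).
print(np.correlate(delta, f, mode='full'))
```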

Hyperplane
  • 11,659
1

It's actually for a very simple and logical reason.

In essence, convolution is the mathematical operation that inserts the effects of a function f into another function g.

In the time domain we have f(t) and g(t). When we are dealing with the time domain, the order of events must respect the times at which they happened. That is, the future cannot precede the past.

For example, the event f(t = 3) cannot precede the event f(t = 2); the temporal order must be respected, so f(t = 2) precedes f(t = 3).

In this sense, if you convolve the function f(t) with the function g(t), one of them must be reversed so that the event f(t = initial) meets the event g(t = initial) correctly at the beginning of the convolution.

Otherwise, if neither is reversed, the event f(t = initial) meets the event g(t = final) at the beginning of the convolution, which violates the temporal order of the events of the functions.

Picture one function sliding over a stationary one (which is what convolution does). If the sliding function is not reversed (either one can be the slider, actually), you will combine the present time of the stationary function with the future time of the sliding function, which violates the order of time events; in the time domain you simply can't do that. But if the sliding one is reversed, then the present times of both functions line up correctly.

And that's why one function is reversed in convolution.
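The time-ordering argument can be seen in a small discrete example (my own addition, assuming NumPy). Each input impulse launches a scaled, delayed copy of the impulse response, and it is the reversed index n − k in the convolution sum that makes the delays line up with time:

```python
import numpy as np

f = np.array([1.0, 0.0, 0.0, 2.0])   # input: impulses at t = 0 and t = 3
g = np.array([1.0, 0.5, 0.25])       # impulse response, decaying in time

# y[n] = sum_k f[k] g[n - k]: the reversed index pairs the input's
# past with the response's matching delay, preserving temporal order.
y = np.convolve(f, g)
print(y)   # [1.  0.5  0.25  2.  1.  0.5]
```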