CNN - Confused on the output shape of second convolutional layer

Question

I'm attempting to write a forward pass of a CNN but I'm stuck on the second convolutional layer.

From what I understand, given an image of size 28x28, a first filter of size 10x3x3, a second filter of size 20x3x3, and max pooling after each filter, the shape should go 28x28 > 10x28x28 > 10x14x14 > 20x14x14 > 20x7x7 (assuming a stride of 1 and 'same' padding).

My initial intuition says the shape after the second filter should be 20x10x14x14. However, after some digging, my understanding is that the sliding window is treated as a sliding 'block'.

Currently my program outputs 20x10x14x14 using the scipy.signal.correlate function:

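import numpy as np
import scipy.signal

# x_train is assumed to be loaded earlier; x_train[0] is one 28x28 image.
input_ = x_train[0]

# First layer: 20 filters of size 3x3, each correlated with the 2-D image.
filters = np.random.randn(20, 3, 3)
temp = []
for i in range(20):
    temp.append(scipy.signal.correlate(input_, filters[i], 'same'))
input_ = np.array(temp)
print(input_.shape)  # (20, 28, 28)

# Second layer: 10 filters of size 20x3x3, each correlated with the whole 3-D stack.
# 'same' mode keeps the channel axis, so each result is (20, 28, 28).
filters = np.random.randn(10, 20, 3, 3)
temp = []
for i in range(10):
    temp.append(scipy.signal.correlate(input_, filters[i], 'same'))
input_ = np.array(temp)
print(input_.shape)  # (10, 20, 28, 28)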

I would be grateful for any help.

What is the intention behind using signal.correlate? In a NN you do the matrix multiplication A*X+b. In a CNN the weights are just applied by sliding a window. Maybe I am talking about something different, but then I do not understand your reasons. — keiv.fly, Feb 28 '20 at 22:07
The correlate function performs the convolution (sliding window) between the input image and the filter. So I iterate through each filter, correlate it with the image, and append each result to an initially empty list. — Yeti.91, Feb 29 '20 at 03:23
You need two nested loops in the second layer and also a sum over one of the dimensions. — keiv.fly, Feb 29 '20 at 13:48
Aren't you missing the max pooling? It also doesn't quite seem like this implementation gets the second filter right. It's a function from 10 filters to 20 filters, but you end up with a dimension of 20 x 10. — Sean Owen, Mar 02 '20 at 03:05
Sorry, yeah, I did miss the max pooling. Just assume the max pooling is there. I didn't include it because it wasn't really part of my question and it would just complicate the code. Regarding the output, when I run it on Google Colab, it prints the following: (20, 28, 28) (10, 20, 28, 28). — Yeti.91, Mar 02 '20 at 06:18
@keiv.fly Forgot to thank you for solving my problem. I didn't realize I needed to perform a summation. It's not usually explained in CNN explanations. — Yeti.91, Mar 12 '20 at 15:25
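
For anyone landing here with the same confusion, the fix suggested in the comments (two nested loops plus a sum over the input channels) might look roughly like the sketch below. It is not the asker's final code. The helper names conv_layer and max_pool_2x2 are made up for illustration, a random array stands in for x_train[0], bias and activation are left out, and the filter counts follow the question text (10 filters, then 20) rather than the posted snippet.

```python
import numpy as np
from scipy.signal import correlate


def conv_layer(x, filters):
    """Hypothetical helper: stride-1, 'same'-padded convolutional layer.

    x has shape (C_in, H, W); filters has shape (C_out, C_in, kH, kW).
    Each output channel is the SUM over input channels of 2-D correlations,
    so the result has shape (C_out, H, W) and the C_in axis disappears.
    """
    c_out, c_in = filters.shape[:2]
    assert x.shape[0] == c_in, "filter depth must match the number of input channels"
    out = np.zeros((c_out,) + x.shape[1:])
    for k in range(c_out):       # loop over output filters
        for c in range(c_in):    # loop over input channels, accumulating the sum
            out[k] += correlate(x[c], filters[k, c], mode='same')
    return out


def max_pool_2x2(x):
    """Hypothetical helper: non-overlapping 2x2 max pooling (H and W must be even)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).max(axis=(2, 4))


rng = np.random.default_rng(0)
image = rng.standard_normal((28, 28))     # stand-in for x_train[0] (assumption)

x = image[np.newaxis]                     # add a channel axis: (1, 28, 28)
w1 = rng.standard_normal((10, 1, 3, 3))   # first layer: 10 filters over 1 channel
w2 = rng.standard_normal((20, 10, 3, 3))  # second layer: 20 filters over 10 channels

a1 = conv_layer(x, w1)
p1 = max_pool_2x2(a1)
a2 = conv_layer(p1, w2)
p2 = max_pool_2x2(a2)
print(a1.shape, p1.shape, a2.shape, p2.shape)
# (10, 28, 28) (10, 14, 14) (20, 14, 14) (20, 7, 7)
```

The summation inside conv_layer is what collapses the extra axis. Calling correlate on the whole 3-D stack with 'same' mode also slides the filter along the channel axis, which is why the original snippet ends up with a 4-D result.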