I have a Convolutional Neural Network that is structured as a binary classifier. It has two relatively standard convolutional/ReLU/pooling layers followed by a 2-layer fully connected network, which outputs to a softmax-with-loss layer for the binary classification. However, I observed something unusual:
In Version 1 of the network, both convolutional layers derived 10 features; upon initialisation I had a cross entropy error of around 28.
In Version 2 I increased the number of features in the convolutional layers to 64. Despite keeping the same fully connected layers and the same softmax, my cross entropy error at initialisation jumped to around 340.
My question is: why would this happen? Surely the random initialisation is comparable in both cases, and the softmax-with-loss function should normalise the two outputs so that they add up to 1. So why would the cross entropy suddenly jump so high?
My understanding of how cross entropy behaves when the network outputs large numbers was helped by this answer.
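In case it helps, here is the kind of behaviour I mean as a minimal NumPy sketch (the logit values below are made up for illustration, not taken from my actual network): the softmax outputs sum to 1 in both cases, but when the raw values entering the softmax are large in magnitude, the probability assigned to the true class can be vanishingly small and the cross entropy blows up.

```python
import numpy as np

def softmax(logits):
    # Shift by the max for numerical stability; the outputs still sum to 1.
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()

def cross_entropy(probs, true_class):
    # Negative log of the probability assigned to the correct class.
    return -np.log(probs[true_class])

# Hypothetical pre-softmax outputs when few features feed the FC layer.
small_logits = np.array([1.5, -0.5])
p = softmax(small_logits)
print(p, p.sum(), cross_entropy(p, true_class=1))  # probs sum to 1, loss ~ 2.1

# Hypothetical pre-softmax outputs when many more features are summed in,
# so the raw values entering the softmax are much larger in magnitude.
big_logits = np.array([200.0, -150.0])
p = softmax(big_logits)
print(p, p.sum(), cross_entropy(p, true_class=1))  # probs still sum to 1, loss ~ 350
```

So the normalisation itself is not the issue; what I don't understand is why simply adding more convolutional features would produce such large values at initialisation in the first place.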