
While implementing AlexNet (model-code), one of the things I needed to do was initialize the biases of the convolutional layers and fully connected layers.

Normally we initialize biases to 0, but the paper says:

We initialized the neuron biases in the second, fourth, and fifth convolutional layers, as well as in the fully-connected hidden layers, with the constant 1.
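
For context, the initialization I'm describing looks roughly like this (a PyTorch-style sketch; `model.features` and `model.classifier` are placeholder names for the conv and FC stacks, not necessarily how the model is actually organized):

```python
import torch.nn as nn

def init_biases_paper(model):
    """Initialize biases as the paper describes: constant 1 for the
    2nd, 4th, and 5th conv layers and the fully-connected hidden
    layers, 0 elsewhere. `model.features` / `model.classifier` are
    placeholder nn.Sequential containers for the conv and FC stacks."""
    conv_layers = [m for m in model.features if isinstance(m, nn.Conv2d)]
    for i, conv in enumerate(conv_layers):
        # indices 1, 3, 4 correspond to the 2nd, 4th, and 5th conv layers
        nn.init.constant_(conv.bias, 1.0 if i in (1, 3, 4) else 0.0)
    fc_layers = [m for m in model.classifier if isinstance(m, nn.Linear)]
    for fc in fc_layers[:-1]:                    # hidden FC layers get bias 1
        nn.init.constant_(fc.bias, 1.0)
    nn.init.constant_(fc_layers[-1].bias, 0.0)   # output layer left at 0
```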

So I went ahead and initialized the biases to 1 as the paper says, but then the network didn't learn at all. The last fully connected layer was producing mostly 0s, which is otherwise known as the dying-ReLU problem. Out of 4096 neurons, only 40 or 50 were producing non-zero outputs.

After a lot of debugging, I realized that if I set the fully connected layers' biases to 0, the network learns nicely and the loss decreases as expected.
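
This is roughly the kind of check I used to see how many units were still alive (again a PyTorch-style sketch; `model`, `layer`, and `x` are placeholders for the network, the last hidden FC layer, and one input batch):

```python
import torch

def fraction_alive(model, layer, x):
    """Run one batch `x` through `model`, capture the output of `layer`
    with a forward hook, and return the fraction of positive activations.
    A value near 0 indicates the dying-ReLU behaviour described above.
    `layer` can be the FC layer itself (counting positive pre-activations)
    or the ReLU module that follows it."""
    captured = {}
    handle = layer.register_forward_hook(lambda m, inp, out: captured.update(a=out))
    with torch.no_grad():
        model(x)
    handle.remove()
    return (captured["a"] > 0).float().mean().item()
```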

Now I'm wondering:

  • How does the bias play a role in the dying-ReLU problem here?
  • Can every dying-ReLU problem be fixed by searching over bias initializations?
