I watched Andrew Ng's Deep Learning course, and he said we should initialize the weights W with small values, like:

parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) * 0.01
But in the last application assignment, they chose another way:
layers_dims = [12288, 20, 7, 5, 1]
import numpy as np

def initialize_parameters_deep(layer_dims):
    np.random.seed(3)
    parameters = {}
    L = len(layer_dims)
    for l in range(1, L):
        parameters['W' + str(l)] = np.random.randn(layer_dims[l], layer_dims[l - 1]) / np.sqrt(layer_dims[l - 1])
        parameters['b' + str(l)] = np.zeros((layer_dims[l], 1))
        assert parameters['W' + str(l)].shape == (layer_dims[l], layer_dims[l - 1])
        assert parameters['b' + str(l)].shape == (layer_dims[l], 1)
    return parameters
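For comparison, here is a small sketch (assuming the same layers_dims as above) showing how different the weight scales actually are for a hidden layer. The layer shape (7, 20) corresponds to the third layer in layers_dims; the specific seed is arbitrary:

```python
import numpy as np

np.random.seed(3)
fan_in = 20  # number of inputs feeding the layer (layer_dims[l - 1])

# Course-style "small" initialization: scale by a fixed 0.01
w_small = np.random.randn(7, fan_in) * 0.01

# Assignment-style scaling: divide by sqrt of the fan-in
w_scaled = np.random.randn(7, fan_in) / np.sqrt(fan_in)

print(np.std(w_small))   # roughly 0.01, regardless of fan_in
print(np.std(w_scaled))  # roughly 1/sqrt(20), about 0.22
```

So the two schemes give comparable weights only when the fan-in happens to be near 10,000 (as for the 12288-input first layer), but for the narrow later layers the fixed 0.01 is more than twenty times smaller than the 1/sqrt(fan-in) scale.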
The result of this way is very good, but if I initialize W the old way shown above, the model only reaches about 34% accuracy. Can someone explain why?