
I started working with my own implementation of the backpropagation algorithm, which I wrote five years ago. For each training sample (input-output pair), I do a forward pass (to compute the output of each neuron), a backward pass (to compute the "delta" for each neuron), and then I update the weights.

When there are 3 layers of neurons (input layer, hidden layer, output layer), there are "two layers of weights". In a forward pass, I need to go through all the weights (both layers). But in a backward pass, I skip the first layer of weights (the one between the input layer and the hidden layer).
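
Roughly, a training step of the kind I describe could be sketched like this in NumPy (the sigmoid activations, squared error, and the names `W1`/`W2` are assumptions for illustration, not my exact code; the layer sizes match the 400 : 60 : 10 example below). The comments mark which weight layers each step touches:

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid, n_out = 400, 60, 10   # layer sizes from the example below

W1 = rng.normal(0.0, 0.1, (n_hid, n_in))   # weights: input -> hidden
b1 = np.zeros(n_hid)
W2 = rng.normal(0.0, 0.1, (n_out, n_hid))  # weights: hidden -> output
b2 = np.zeros(n_out)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, t, lr=0.1):
    global W1, b1, W2, b2

    # Forward pass: goes through BOTH weight layers (W1 and W2).
    h = sigmoid(W1 @ x + b1)
    y = sigmoid(W2 @ h + b2)

    # Backward pass (computing the deltas): only W2 is read here.
    # The output deltas need no weights at all, and the hidden deltas
    # are propagated back through W2 only -- W1 never appears.
    delta_out = (y - t) * y * (1.0 - y)            # squared error + sigmoid
    delta_hid = (W2.T @ delta_out) * h * (1.0 - h)

    # Weight update: both layers get updated, but the gradient for W1 is
    # built from delta_hid and the input x, not from W1 itself.
    W2 -= lr * np.outer(delta_out, h)
    b2 -= lr * delta_out
    W1 -= lr * np.outer(delta_hid, x)
    b1 -= lr * delta_hid

# usage: one training sample (a flattened 20x20 image, one-hot target)
x = rng.random(n_in)
t = np.zeros(n_out); t[3] = 1.0
train_step(x, t)
```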

This seems wrong, as I thought a backward pass requires going through all the weights. But it seems to work quite well. So is it true that not all weights are used in a backward pass?

When I use it on 20x20 px images (layers 400 : 60 : 10), a backward pass is about 40x faster than a forward pass, as it processes only the 60x10 weights and omits the 60x400 weights.
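
A rough multiply count (ignoring biases and activation functions, and counting only the delta computation as the backward pass) matches that ratio:

$$
\begin{aligned}
\text{forward pass} &\approx 400 \cdot 60 + 60 \cdot 10 = 24\,600 \text{ multiplications},\\
\text{backward pass (deltas only)} &\approx 60 \cdot 10 = 600 \text{ multiplications},\\
\text{ratio} &\approx 24\,600 / 600 \approx 41.
\end{aligned}
$$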

  • I've read a paper on kernel methods earlier this year that uses a random weight matrix to approximate the mapping into the feature space of the Gaussian kernel; they only learn the linear classifier that classifies the mapped vector. So maybe something similar is happening here. – Pedro Henrique Monforte Nov 15 '19 at 04:10
  • @PedroHenriqueMonforte My question is about the regular 30-year-old backprop method. I think the fact that the backward pass skips the first layer of weights is quite interesting, but I have not seen it mentioned explicitly anywhere. – Ivan Kuckir Nov 15 '19 at 10:21
  • I got it. I am just saying: as in the case of this Gaussian kernel approximation, maybe the first weights don't necessarily need to be updated to achieve good results. – Pedro Henrique Monforte Nov 16 '19 at 05:20

0 Answers