
Are there any other materials that derive LSTM backpropagation and the constant error carousel, apart from the original paper? (I could not understand it, sorry.)

I tried deriving it myself, got stuck, and asked the following question: LSTMs - Data Science Stack Exchange question. However, it doesn't seem that many people are interested in hand derivations of LSTMs.

Thanks

user1157751

1 Answer


One tool that may be helpful is Aidan Gomez's blog post. Its main strength is that he works through a toy numerical example, which, when paired with the original paper/thesis, serves as a great foundation.

I did take a look at the site you mentioned in your other question; it's actually an excellent resource. I'll hop over and try to clarify what I can for you when I get a chance. It looks like you've misunderstood or overlooked some notation, which can easily happen since there are so many components involved.

It may also be worth taking a look at some code. Siraj Raval has a great video on LSTMs, and the link I've included contains the code. No libraries. I wouldn't dive too deep, but it's a great way to see the inner workings of the network.
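In the same no-libraries spirit, here is a minimal sketch of one forward and one backward step of an LSTM cell. It is not the code from the video: it assumes scalar inputs and states, the common forget-gate variant without peepholes (not the exact formulation of the original paper), and a toy loss equal to the hidden state `h`. The names `lstm_step`, `lstm_step_backward`, and `numeric_grad`, as well as the parameter values, are made up for illustration. The hand-derived gradients are checked against central finite differences, which is a handy way to catch notation mistakes in your own derivation.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, p):
    """One forward step of a scalar LSTM cell.
    p maps a gate name to its (w_x, w_h, b) parameters."""
    pre = {k: w_x * x + w_h * h_prev + b for k, (w_x, w_h, b) in p.items()}
    i = sigmoid(pre["i"])    # input gate
    f = sigmoid(pre["f"])    # forget gate
    o = sigmoid(pre["o"])    # output gate
    g = math.tanh(pre["g"])  # candidate cell value
    c = f * c_prev + i * g   # new cell state
    h = o * math.tanh(c)     # new hidden state
    return h, c, (x, h_prev, c_prev, i, f, o, g, c)

def lstm_step_backward(dh, dc_next, cache, p):
    """Given dL/dh and the dL/dc arriving from the next time step,
    return parameter gradients plus dL/dh_prev and dL/dc_prev."""
    x, h_prev, c_prev, i, f, o, g, c = cache
    tc = math.tanh(c)
    # The cell state receives error from h and from the next step's cell.
    dc = dc_next + dh * o * (1.0 - tc * tc)
    # Gradients with respect to each gate's pre-activation.
    dpre = {
        "i": dc * g * i * (1.0 - i),
        "f": dc * c_prev * f * (1.0 - f),
        "o": dh * tc * o * (1.0 - o),
        "g": dc * i * (1.0 - g * g),
    }
    dparams = {k: (d * x, d * h_prev, d) for k, d in dpre.items()}
    dh_prev = sum(dpre[k] * p[k][1] for k in dpre)
    dc_prev = dc * f
    return dparams, dh_prev, dc_prev

# Arbitrary illustrative parameters and inputs.
params = {"i": (0.5, -0.3, 0.1), "f": (0.8, 0.2, 0.0),
          "o": (-0.4, 0.6, 0.2), "g": (0.3, 0.7, -0.1)}
x, h_prev, c_prev = 1.5, 0.2, -0.4

h, c, cache = lstm_step(x, h_prev, c_prev, params)
# Toy loss L = h, so dL/dh = 1 and nothing flows in from a later step.
dparams, dh_prev, dc_prev = lstm_step_backward(1.0, 0.0, cache, params)

def numeric_grad(gate, idx, eps=1e-6):
    """Central finite difference of L = h with respect to one parameter."""
    def loss(delta):
        q = dict(params)
        vals = list(q[gate])
        vals[idx] += delta
        q[gate] = tuple(vals)
        return lstm_step(x, h_prev, c_prev, q)[0]
    return (loss(eps) - loss(-eps)) / (2.0 * eps)

max_err = max(abs(dparams[k][j] - numeric_grad(k, j))
              for k in params for j in range(3))
print("largest analytic vs numeric difference:", max_err)
```

If the printed difference is tiny, the hand-derived formulas agree with the numerical gradients for this step. The same check carries over to vectors and to several time steps once the scalar case makes sense.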

As far as the CEC (constant error carousel) goes, there is a Reddit post. If you're looking for a more rigorous treatment of this topic, you can reference either the original paper or the often-cited paper On the Difficulty of Training Recurrent Neural Networks.
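For a rough intuition about the CEC, here is a small sketch comparing how much error survives a trip back through time in two toy recurrences. It makes simplifying assumptions: scalar states, a plain RNN of the form `h_t = tanh(w * h_{t-1})` with no input, and a cell without a forget gate, `c_t = c_{t-1} + i_t * g_t`, whose gate values are treated as given numbers rather than functions of the state. The function names and all numbers are made up for illustration.

```python
import math

def rnn_backflow(w, h0, steps):
    """Product of d h_t / d h_{t-1} over time for h_t = tanh(w * h_{t-1})."""
    h, flow = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        flow *= w * (1.0 - h * h)  # chain rule factor for this step
    return flow

def cec_final_state(c0, writes):
    """Run c_t = c_{t-1} + i_t * g_t; the self-connection has fixed weight 1."""
    c = c0
    for i_t, g_t in writes:
        c = c + i_t * g_t
    return c

steps = 50
# Arbitrary gate and candidate values, held fixed for the illustration.
writes = [(0.5, math.sin(t)) for t in range(steps)]

rnn_flow = rnn_backflow(w=0.9, h0=0.5, steps=steps)

# d c_T / d c_0 estimated by central finite differences.
eps = 1e-3
cec_flow = (cec_final_state(0.3 + eps, writes)
            - cec_final_state(0.3 - eps, writes)) / (2.0 * eps)

print("plain RNN backflow after", steps, "steps:", rnn_flow)
print("CEC backflow after", steps, "steps:", cec_flow)
```

In the plain RNN each step multiplies the error by a factor smaller than 1 in magnitude, so it shrinks geometrically. Along the cell state the factor is exactly 1 at every step, which is the "constant error" part of the name. With a forget gate that factor becomes `f_t`, so the carousel is only approximately constant when `f_t` stays close to 1.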