..., to create proper consecutive batches, where the nth input sequence in a batch starts off exactly where the nth input sequence ended in the previous batch.
Géron, Aurélien. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow (Kindle Locations 12018-12020). O'Reilly Media. Kindle Edition.
So the data for a stateful RNN looks like this:
Why does a stateful RNN use batches like the bottom one? As far as I understand, the bottom layout gives the hidden state a longer memory: by the time the model reaches seq 4, the state carries information from seqs 1-2-3, whereas in the top layout the state before seq 4 only reflects seq 3.
Is there some reason for this?
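To make the quoted batching scheme concrete, here is a minimal sketch (my own illustration, not the book's code) of building consecutive batches from one long token series, so that row n of each batch starts exactly where row n of the previous batch ended:

```python
import numpy as np

def stateful_batches(data, batch_size, seq_len):
    # Split the long series into batch_size contiguous streams:
    # stream i holds tokens [i * stream_len, (i + 1) * stream_len).
    stream_len = len(data) // batch_size
    streams = np.array(data[:stream_len * batch_size]).reshape(batch_size, stream_len)
    # Each batch takes the next seq_len tokens from every stream, so
    # row i of batch t+1 continues exactly where row i of batch t ended.
    for start in range(0, stream_len - seq_len + 1, seq_len):
        yield streams[:, start:start + seq_len]

data = list(range(12))  # tokens 0..11
batches = list(stateful_batches(data, batch_size=2, seq_len=3))
# batches[0] is [[0, 1, 2], [6, 7, 8]]
# batches[1] is [[3, 4, 5], [9, 10, 11]]
# Row 0 of batch 1 ([3, 4, 5]) continues row 0 of batch 0 ([0, 1, 2]),
# so the hidden state kept for that row stays valid across batches.
```

With this layout, resetting the states only at epoch boundaries lets each row's hidden state flow through the whole series.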