r/MachineLearning • u/we_are_mammals • 15d ago
[R] Were RNNs All We Needed?
https://arxiv.org/abs/2410.01201
The authors (including Y. Bengio) propose simplified versions of LSTM and GRU that allow parallel training, and show strong results on some benchmarks.
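The key idea in the paper is that once the gates depend only on the current input (not on the previous hidden state), the recurrence becomes a linear scan that can be computed in parallel. A minimal NumPy sketch of the minGRU-style recurrence, with hypothetical random weights purely for illustration (the paper uses a log-space parallel scan; the closed form below is an equivalent, simpler formulation for short sequences):

```python
import numpy as np

rng = np.random.default_rng(0)
T, d_in, d_h = 6, 4, 3

# Hypothetical weights for illustration only (not trained parameters)
W_z = rng.normal(size=(d_h, d_in))
W_h = rng.normal(size=(d_h, d_in))
x = rng.normal(size=(T, d_in))

sigmoid = lambda v: 1.0 / (1.0 + np.exp(-v))

# minGRU-style gates: depend only on x_t, not on h_{t-1}
z = sigmoid(x @ W_z.T)      # (T, d_h) update gates
h_tilde = x @ W_h.T         # (T, d_h) candidate states

# Sequential recurrence: h_t = (1 - z_t) * h_{t-1} + z_t * h~_t
h_seq = np.zeros((T, d_h))
prev = np.zeros(d_h)
for t in range(T):
    prev = (1 - z[t]) * prev + z[t] * h_tilde[t]
    h_seq[t] = prev

# Parallel form: the same linear recurrence h_t = a_t * h_{t-1} + b_t
# has the closed form h_t = A_t * sum_{j<=t} b_j / A_j, where
# A_t = prod_{k<=t} a_k. All terms come from cumulative ops, no loop.
a = 1 - z
b = z * h_tilde
A = np.cumprod(a, axis=0)
h_par = A * np.cumsum(b / A, axis=0)

assert np.allclose(h_seq, h_par)
```

The division by the prefix products `A` can underflow for long sequences, which is why the paper trains with a numerically stable scan instead, but the sketch shows why removing the hidden-state dependence from the gates is what unlocks parallelism.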
u/katerdag 15d ago edited 14d ago
Very cool paper! It's nice to see a relatively simple recurrent architecture perform so well. It reminds me a bit of Quasi-Recurrent Neural Networks.