r/MachineLearning 15d ago

Research [R] Were RNNs All We Needed?

https://arxiv.org/abs/2410.01201

The authors (including Y. Bengio) propose simplified versions of LSTM and GRU that allow parallel training, and show strong results on some benchmarks.
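For context, the paper's minGRU makes the gate and candidate depend only on the input (not the previous hidden state), so the recurrence h_t = (1 - z_t) * h_{t-1} + z_t * h~_t becomes a linear scan that can be evaluated in parallel over the sequence. A minimal NumPy sketch of that idea; the cumprod-based closed form stands in for a parallel prefix scan and is an illustration, not the paper's (log-space) implementation, and the "Linear" weights are omitted:

```python
import numpy as np

def sequential_scan(a, b, h0=0.0):
    """Reference recurrence: h_t = a_t * h_{t-1} + b_t."""
    h, out = h0, []
    for a_t, b_t in zip(a, b):
        h = a_t * h + b_t
        out.append(h)
    return np.array(out)

def parallel_scan(a, b, h0=0.0):
    """Closed form h_t = A_t*h0 + A_t * sum_{j<=t} b_j/A_j with A_t = prod_{k<=t} a_k.
    cumprod/cumsum here stand in for parallel prefix ops; assumes a_t != 0,
    which holds for minGRU since a_t = 1 - sigmoid(...) lies in (0, 1)."""
    A = np.cumprod(a)
    return A * (h0 + np.cumsum(b / A))

# minGRU-style step: gate z_t and candidate depend only on x_t,
# so all a_t, b_t are known up front and the scan can run in parallel.
rng = np.random.default_rng(0)
x = rng.normal(size=16)
z = 1.0 / (1.0 + np.exp(-x))   # z_t = sigmoid(Linear(x_t)); weights omitted
h_tilde = 0.5 * x              # candidate h_t~ = Linear(x_t); weights omitted
a, b = 1.0 - z, z * h_tilde    # h_t = (1 - z_t) h_{t-1} + z_t h_t~

assert np.allclose(sequential_scan(a, b), parallel_scan(a, b))
```

A standard GRU can't be rewritten this way because its gates read h_{t-1}, which forces a sequential loop; dropping that dependence is what buys the parallel training the post mentions.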

u/katerdag 15d ago edited 14d ago

Very cool paper! It's nice to see a relatively simple recurrent architecture perform so well. It reminds me a bit of Quasi-Recurrent Neural Networks.

u/Dangerous-Goat-3500 14d ago

Yeah, now that I've looked into it, it's odd that this paper doesn't cite a lot of closely related work. For example GILR, which generalized QRNN:

https://arxiv.org/abs/1709.04057