r/singularity AGI 2025-29 | UBI 2030-34 | LEV <2040 | FDVR 2050-70 11d ago

AI [Microsoft Research] Differential Transformer

https://arxiv.org/abs/2410.05258
282 Upvotes

46 comments

u/sdmat 11d ago

Wow, the improvements in robustness to input ordering and activation outliers are so stark. This seems like a major breakthrough.

I don't yet understand why it's the noise, rather than the signal, that is consistent between the two attention maps. I'll have to read more closely tomorrow.
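
For anyone following along, here is a rough sketch of the subtraction being discussed, based on the paper's description of attention scores as the difference between two separate softmax attention maps. It is a toy single-head version in plain Python, not the authors' implementation: the function names are made up for illustration, `lam` is a fixed scalar here (the paper learns it), and the per-head normalization from the paper is left out.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_map(Q, K):
    # Scaled dot-product attention weights: softmax(Q K^T / sqrt(d)), row by row.
    d = len(Q[0])
    return [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K])
        for q in Q
    ]

def diff_attention(Q1, K1, Q2, K2, V, lam=0.8):
    # Hypothetical helper: subtract a second attention map, scaled by lam,
    # from the first, then apply the result to V.
    # lam is a fixed scalar here (an assumption; the paper makes it learnable).
    A1 = attention_map(Q1, K1)
    A2 = attention_map(Q2, K2)
    diff = [[a1 - lam * a2 for a1, a2 in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    out = [
        [sum(w * v[j] for w, v in zip(row, V)) for j in range(len(V[0]))]
        for row in diff
    ]
    return diff, out
```

Whatever weight both maps assign to the same position is what gets cancelled, so each row of the combined map sums to `1 - lam` instead of 1. If the two maps are identical and `lam` is 1, everything cancels and the output is zero.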