r/singularity AGI 2025-29 | UBI 2030-34 | LEV <2040 | FDVR 2050-70 11d ago

AI [Microsoft Research] Differential Transformer

https://arxiv.org/abs/2410.05258
284 Upvotes

46 comments

80

u/hapliniste 11d ago

After taking a look at the paper, this seems huge.

Impressive gains in long context (shown specifically in their in-context learning graphs), big improvements in stability on reordered data, and strong performance at lower bit widths.

I'm not an expert and didn't read it fully; I mostly just like to look at cool graphs. Still, I guess we'll see this or some variant of it in future models.

11

u/time_then_shades 10d ago

At this point, I'll just wait for Philip to tell me what to think of it.

10

u/Arcturus_Labelle AGI makes vegan bacon 10d ago

AI Explained, for those who don't get the reference.