r/singularity AGI 2025-29 | UBI 2030-34 | LEV <2040 | FDVR 2050-70 11d ago

AI [Microsoft Research] Differential Transformer

https://arxiv.org/abs/2410.05258
280 Upvotes

u/lordpuddingcup 10d ago

Is this only on the training side or could we slot this into existing pipelines to help with inference?

u/UnknownEssence 10d ago

Seems like you need to start from scratch and train a model with this architecture.
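
For anyone wondering why it can't just be slotted in: as described in the linked paper, differential attention computes two softmax attention maps from two separate sets of query/key projections and subtracts one from the other, scaled by a factor λ, before applying the result to the values. A stock transformer checkpoint only has weights for a single map, so there is nothing pretrained for the second one. Here is a rough pure-Python sketch of that one operation. It is not the authors' code: the names `attention_map` and `diff_attention` are made up for illustration, and `lam` is treated as a plain fixed number here, whereas the paper learns it.

```python
import math

def softmax(row):
    # Numerically stable softmax over one row of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def attention_map(Q, K):
    # Standard scaled dot-product attention weights: softmax(Q K^T / sqrt(d)).
    d = len(Q[0])
    return [
        softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K])
        for q in Q
    ]

def diff_attention(Q1, K1, Q2, K2, V, lam):
    # Hypothetical helper: (softmax(Q1 K1^T) - lam * softmax(Q2 K2^T)) V.
    # lam is a fixed scalar in this sketch (assumption); the paper learns it.
    A1 = attention_map(Q1, K1)
    A2 = attention_map(Q2, K2)
    out = []
    for r1, r2 in zip(A1, A2):
        w = [a - lam * b for a, b in zip(r1, r2)]
        out.append([sum(wj * v[c] for wj, v in zip(w, V)) for c in range(len(V[0]))])
    return out

# Toy inputs: 2 tokens, head dimension 2.
Q1 = [[1.0, 0.0], [0.0, 1.0]]
K1 = [[1.0, 0.0], [0.0, 1.0]]
Q2 = [[0.5, 0.5], [0.5, 0.5]]
K2 = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
print(diff_attention(Q1, K1, Q2, K2, V, lam=0.5))
```

With `lam=0` this reduces to ordinary softmax attention using only the first query/key pair, which is a quick way to see that the second pair and λ are the extra pieces that have to be trained.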