Imanol Schlag

@ImanolSchlag

AI Researcher training LLMs in Switzerland, for Switzerland.

Imanol Schlag Reposted

MoEUT: Mixture-of-Experts Universal Transformers

Their UT model, for the first time, slightly outperforms standard Transformers on LM tasks such as BLiMP and PIQA, while using significantly less compute and memory.

repo: github.com/robertcsordas/…
abs: arxiv.org/abs/2405.16039
