Shital Shah

@sytelus

Mostly research and code. If the universe is an optimizer, what is its loss function? All opinions are my own.

Similar users

Noam Brown @polynoamial
anton @abacaj
Jürgen Schmidhuber @SchmidhuberAI
Devendra Chaplot @dchaplot
Abhi Venigalla @ml_hardware
Tengyu Ma @tengyuma
Felix Hill @FelixHill84
Zoubin Ghahramani @ZoubinGhahrama1
Jonathan Frankle @jefrankle
David @DavidSHolz
Shayne Longpre @ShayneRedford
Wojciech Zaremba @woj_zaremba
Ofir Press @OfirPress
Xin Wang @xinw_ai
Tim Rocktäschel @_rockt
Pinned

Phi-3 14B model from our team is available now! This was trained with 512 H100s on 4.8T tokens achieving MMLU of 78 (comparable with Llama3 70B!!). huggingface.co/microsoft/Phi-…


The real winner in this election is AI.


Elections are (hopefully) over and we can all use some cooling down. But you know what else can use some cooldown? Your LR schedule! I wrote a note about this last year and now things are becoming very real. Some people are calling it "WSD schedule" while others are calling it…

Just learned something very cool about LR schedules. This one is so huge it surprises me that it's not in its own paper but rather tucked away. Problem: Most training runs use cosine/linear decay, but this requires specifying the number of steps in advance. This is quite troublesome. 🧵
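
To make the problem concrete, here is a minimal Python sketch contrasting a cosine schedule with a warmup-stable-decay ("WSD") style schedule. This is an illustration, not the method from the thread: the function names, the linear warmup, the linear cooldown shape, and every parameter (`peak_lr`, `warmup_steps`, `cooldown_start`, `cooldown_steps`) are assumptions made for the example. The point it tries to show is that the cosine schedule needs `total_steps` before training starts, while the constant phase of the second schedule can run for as long as you like and the cooldown can be chosen later.

```python
import math


def cosine_lr(step, total_steps, peak_lr, min_lr=0.0):
    """Cosine decay from peak_lr to min_lr; total_steps must be known up front."""
    progress = min(step / total_steps, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def wsd_lr(step, peak_lr, warmup_steps, cooldown_start=None,
           cooldown_steps=1, min_lr=0.0):
    """Hypothetical warmup-stable-decay schedule (assumed shape, linear cooldown).

    The LR stays at peak_lr indefinitely until cooldown_start is set, so the
    training length does not have to be fixed in advance.
    """
    if step < warmup_steps:
        # Linear warmup up to peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    if cooldown_start is None or step < cooldown_start:
        # Stable phase: constant learning rate.
        return peak_lr
    # Cooldown phase: linear decay from peak_lr to min_lr.
    progress = min((step - cooldown_start) / max(cooldown_steps, 1), 1.0)
    return peak_lr + (min_lr - peak_lr) * progress
```

With this sketch, `wsd_lr(step, 1e-3, 100)` stays at the peak for any `step >= 100`; once you decide to stop, passing `cooldown_start` and `cooldown_steps` anneals from that point, and an earlier checkpoint from the constant phase can be reused for a longer run.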



PSA: Dental floss from popular in-store brands contains plastics and other forever chemicals! The solution is silk-based biodegradable products. youtu.be/V-8oKejN9EE


Shital Shah Reposted

From AI Frontiers, Yadong++ have released OmniParser (microsoft.github.io/OmniParser/), which parses screens better than vision models. The code is open source and the model is on Hugging Face: huggingface.co/microsoft/Omni…

