Vaishaal Shankar

@Vaishaal

ML research @ apple. Trying to find artificial intelligence. Opinions are my own.

Similar Users

Ludwig Schmidt (@lschmidt3)
Pavel Izmailov (@Pavel_Izmailov)
Shimon Whiteson (@shimon8282)
Jonathan Frankle (@jefrankle)
Yin Cui (@YinCuiCV)
Jacob Steinhardt (@JacobSteinhardt)
Clémentine Dominé 🍊 (@ClementineDomi6)
Francesco Locatello (@FrancescoLocat8)
Andrew Ilyas (@andrew_ilyas)
Gintare Karolina Dziugaite (@gkdziugaite)
Ananya Kumar (@ananyaku)
Katherine Lee (@katherine1ee)
Jerry Li (@jerryzli)
Surya Ganguli (@SuryaGanguli)
Mazda Moayeri (@MLMazda)

Pinned

I am really excited to introduce DataComp for Language Models (DCLM), our new testbed for controlled dataset experiments aimed at improving language models. 1/x


Vaishaal Shankar Reposted

As Apple Intelligence is rolling out to our beta users today, we are proud to present a technical report on our Foundation Language Models that power these features on devices and in the cloud: machinelearning.apple.com/research/apple…. 🧵


Vaishaal Shankar Reposted

DataComp-LM (DCLM) was presented today at the ICML FOMO workshop. DCLM is a data-centric benchmark for LLMs. It also comes with the state-of-the-art open-source LLM and the state-of-the-art open training dataset. Probably the most important finding is that data curation algorithms that…


DCLM models keep on coming! This time we release (by far) the best open-data 1B model!

Excited to share our new-and-improved 1B models trained with DataComp-LM!
- 1.4B model trained on 4.3T tokens
- 5-shot MMLU 47.5 (base model) => 51.4 (w/ instruction tuning)
- Fully open models: public code, weights, dataset!
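For readers unfamiliar with the metric quoted above, here is a minimal sketch of how a 5-shot MMLU-style prompt is typically assembled: five solved dev-set examples are prepended to the test question, and the model is scored on the answer letter it completes. The helper names and records below are hypothetical illustrations, not code from the DCLM release.

```python
# Sketch of 5-shot multiple-choice prompt construction (hypothetical
# helpers; not code from the DCLM release).
CHOICES = ["A", "B", "C", "D"]

def format_example(question, options, answer=None):
    # One multiple-choice item; the answer letter is filled in for the
    # in-context "shots" and left blank for the question being scored.
    lines = [question] + [f"{c}. {o}" for c, o in zip(CHOICES, options)]
    lines.append(f"Answer: {answer}" if answer else "Answer:")
    return "\n".join(lines)

def build_five_shot_prompt(dev_examples, test_question, test_options):
    # Five solved examples followed by the unanswered test question.
    shots = [format_example(q, opts, ans) for q, opts, ans in dev_examples[:5]]
    shots.append(format_example(test_question, test_options))
    return "\n\n".join(shots)
```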



Vaishaal Shankar Reposted

Kudos to Apple. They publish their new 7B model not only with open weights, but also with an open dataset! And in this ranking Apple even takes 1st place! An outstanding achievement that others should take as an example and be just as transparent. huggingface.co/datasets/mlfou…
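The dataset link above is truncated, so as a minimal sketch, here is how one might stream the released pretraining data with the Hugging Face `datasets` library; the dataset id below is an assumption, not taken from the tweet.

```python
# Minimal sketch: streaming the released open pretraining data with the
# Hugging Face `datasets` library. The dataset id is an assumption, since
# the link above is truncated; streaming avoids downloading the full corpus.
from datasets import load_dataset

ds = load_dataset("mlfoundations/dclm-baseline-1.0", split="train", streaming=True)
for record in ds:
    print(record)  # inspect a single document, then stop
    break
```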


Kudos to @Apple for dropping a game-changer! 🚀 Their new 7B model outshines Mistral 7B and is fully open-sourced, complete with the pretraining dataset! 🔥🌟 huggingface.co/apple/DCLM-7B
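Since the checkpoint is on the Hugging Face Hub, a minimal loading sketch follows. The repo id comes from the link in the tweet; that the checkpoint loads through the stock AutoModelForCausalLM API is an assumption here (the model card may require extra packages such as open_lm).

```python
# Minimal sketch: loading apple/DCLM-7B with transformers. Assumes the
# checkpoint works with the stock AutoModelForCausalLM API; the model card
# may require additional packages.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/DCLM-7B"  # repo id from the link above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Machine learning is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```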



Vaishaal Shankar Reposted

🚀 Exciting news! @Apple has released its own open-source LLM, DCLM-7B. Everything is open-source, including the model weights and datasets.
💡 Why should you be excited?
1. The datasets and tools released as part of this research lay the groundwork for future advancements in…


Vaishaal Shankar Reposted

Apple shows off open AI prowess: new models outperform Mistral and Hugging Face offerings venturebeat.com/ai/apple-shows…


Vaishaal Shankar Reposted

Apple joined the race of small models. DCLM-7B, released a few days back, is open-source in every respect, including weights, training code, and dataset! 👀
🧠 7B base model, trained on 2.5T tokens from open datasets.
🌐 Primarily English data with a 2048 context window.
📈 …


I will be at ICML in Vienna next week. DM if you want to talk about language models, dataset design or any other exciting research :)


Vaishaal Shankar Reposted

Apple at it again with "truly open-source models" 🔥🔥🔥

We have released our DCLM models on Hugging Face! To our knowledge these are by far the best-performing truly open-source models (open data, open weights, open training code). 1/5



Vaishaal Shankar Reposted

Okay, first DFN and now this: Apple is the king of open-source datasets, both vision and NLP.

Apple released a 7B model that beats Mistral 7B - but the kicker is that they fully open-sourced everything, including the pretraining dataset 🤯 huggingface.co/apple/DCLM-7B




Vaishaal Shankar Reposted

Apple has entered the game! @Apple just released a 7B open-source LLM with weights, training code, and dataset! 👀
TL;DR:
🧠 7B base model, trained on 2.5T tokens from open datasets
🌐 Primarily English data and a 2048 context window
📈 Combined DCLM-BASELINE, StarCoder, and…



