Vaishaal Shankar

@Vaishaal

ML research @ apple. Trying to find artificial intelligence. Opinions are my own.

Similar Users

Ludwig Schmidt (@lschmidt3)
Pavel Izmailov (@Pavel_Izmailov)
Shimon Whiteson (@shimon8282)
Jonathan Frankle (@jefrankle)
Yin Cui (@YinCuiCV)
Jacob Steinhardt (@JacobSteinhardt)
Clémentine Dominé 🍊 (@ClementineDomi6)
Francesco Locatello (@FrancescoLocat8)
Andrew Ilyas (@andrew_ilyas)
Gintare Karolina Dziugaite (@gkdziugaite)
Ananya Kumar (@ananyaku)
Katherine Lee (@katherine1ee)
Jerry Li (@jerryzli)
Surya Ganguli (@SuryaGanguli)
Mazda Moayeri (@MLMazda)

Pinned

I am really excited to introduce DataComp for Language Models (DCLM), our new testbed for controlled dataset experiments aimed at improving language models. 1/x


Vaishaal Shankar Reposted

As Apple Intelligence is rolling out to our beta users today, we are proud to present a technical report on our Foundation Language Models that power these features on devices and in the cloud: machinelearning.apple.com/research/apple…. 🧵


Vaishaal Shankar Reposted

DataComp-LM (DCLM) was presented today at the ICML FOMO workshop. DCLM is a data-centric benchmark for LLMs. It also comes with the state-of-the-art open-source LLM and the state-of-the-art open training dataset. Probably the most important finding is that data curation algorithms that…


DCLM models keep on coming! This time we release (by far) the best open-data 1B model!

Excited to share our new-and-improved 1B models trained with DataComp-LM!
- 1.4B model trained on 4.3T tokens
- 5-shot MMLU 47.5 (base model) => 51.4 (w/ instruction tuning)
- Fully open models: public code, weights, dataset!
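For readers unfamiliar with the metric quoted above, here is a minimal sketch of how a 5-shot MMLU-style prompt is typically assembled: five solved dev-set examples are prepended to the test question, and the model is scored on the answer letter it completes. The helper names and records below are hypothetical illustrations, not code from the DCLM release.

```python
# Sketch of 5-shot multiple-choice prompt construction (hypothetical
# helpers; not code from the DCLM release).
CHOICES = ["A", "B", "C", "D"]

def format_example(question, options, answer=None):
    # One multiple-choice item; the answer letter is filled in for the
    # in-context "shots" and left blank for the question being scored.
    lines = [question] + [f"{c}. {o}" for c, o in zip(CHOICES, options)]
    lines.append(f"Answer: {answer}" if answer else "Answer:")
    return "\n".join(lines)

def build_five_shot_prompt(dev_examples, test_question, test_options):
    # Five solved examples followed by the unanswered test question.
    shots = [format_example(q, opts, ans) for q, opts, ans in dev_examples[:5]]
    shots.append(format_example(test_question, test_options))
    return "\n\n".join(shots)
```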



Vaishaal Shankar Reposted

Kudos to Apple. They publish their new 7B model not only with open weights, but also with an open dataset! And in this ranking Apple even takes 1st place! An outstanding achievement that others should take as an example and be just as transparent. huggingface.co/datasets/mlfou…
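The dataset link above is truncated, so as a minimal sketch, here is how one might stream the released pretraining data with the Hugging Face `datasets` library; the dataset id below is an assumption, not taken from the tweet.

```python
# Minimal sketch: streaming the released open pretraining data with the
# Hugging Face `datasets` library. The dataset id is an assumption, since
# the link above is truncated; streaming avoids downloading the full corpus.
from datasets import load_dataset

ds = load_dataset("mlfoundations/dclm-baseline-1.0", split="train", streaming=True)
for record in ds:
    print(record)  # inspect a single document, then stop
    break
```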


Kudos to @Apple for dropping a game-changer! 🚀 Their new 7B model outshines Mistral 7B and is fully open-sourced, complete with the pretraining dataset! 🔥🌟 huggingface.co/apple/DCLM-7B
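Since the checkpoint is on the Hugging Face Hub, a minimal loading sketch follows. The repo id comes from the link in the tweet; that the checkpoint loads through the stock AutoModelForCausalLM API is an assumption here (the model card may require extra packages such as open_lm).

```python
# Minimal sketch: loading apple/DCLM-7B with transformers. Assumes the
# checkpoint works with the stock AutoModelForCausalLM API; the model card
# may require additional packages.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "apple/DCLM-7B"  # repo id from the link above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Machine learning is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```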



Vaishaal Shankar Reposted

🚀 Exciting news! @Apple has released its own open-source LLM, DCLM-7B. Everything is open-source, including the model weights and datasets.
💡 Why should you be excited?
1. The datasets and tools released as part of this research lay the groundwork for future advancements in…


Vaishaal Shankar Reposted

Apple shows off open AI prowess: new models outperform Mistral and Hugging Face offerings venturebeat.com/ai/apple-shows…


Vaishaal Shankar Reposted

Apple joined the race of small models. DCLM-7B, released a few days back, is open-source in every respect, including weights, training code, and dataset! 👀
🧠 7B base model, trained on 2.5T tokens from open datasets.
🌐 Primarily English data with a 2048 context window.
📈 …


I will be at ICML in Vienna next week. DM if you want to talk about language models, dataset design or any other exciting research :)


Vaishaal Shankar Reposted

Apple at it again with "truly open-source models" 🔥🔥🔥

We have released our DCLM models on Hugging Face! To our knowledge these are by far the best-performing truly open-source models (open data, open weights, open training code). 1/5



Vaishaal Shankar Reposted

Okay, first DFN and now this: Apple is the king of open-source datasets, both vision and NLP.

Apple released a 7B model that beats Mistral 7B - but the kicker is that they fully open-sourced everything, including the pretraining dataset 🤯 huggingface.co/apple/DCLM-7B




Vaishaal Shankar Reposted

Apple has entered the game! @Apple just released a 7B open-source LLM with weights, training code, and dataset! 👀
TL;DR:
🧠 7B base model, trained on 2.5T tokens from open datasets
🌐 Primarily English data and a 2048 context window
📈 Combined DCLM-BASELINE, StarCoder, and…



