Soumith Chintala @soumithchintala Twitter Profile

Soumith Chintala

@soumithchintala

Cofounded and lead @PyTorch at Meta. Also dabble in robotics at NYU. AI is delicious when it is accessible and open-source.

4K Posts, 211K Followers, 991 Following

Soumith Chintala Reposted

Soumith Chintala

@soumithchintala

21 h

If you could change one thing about PyTorch what would it be?

Soumith Chintala Reposted

If you want robots that can just live with you & help 24/7, it needs to build & update its memory on the fly. Current semantic memory representations like VoxelMap from OK-Robot can't change with the world. That's why we built DynaMem: dynamic memory for a changing, open world!

Soumith Chintala

@soumithchintala

31 Oct

its super fun to build very very large clusters, train llama on them, and release it for y'all to enjoy -- and talk in great detail about how we did it! It's also really fun to partner with @Ahmad_Al_Dahle in creating this disruptive chaos 😀 Join us, there's lots of work to do!

Ahmad Al-Dahle

@Ahmad_Al_Dahle

31 Oct

Great to visit one of our data centers where we're training Llama 4 models on a cluster bigger than 100K H100’s! So proud of the incredible work we’re doing to advance our products, the AI field and the open source community. We’re hiring top researchers to work on reasoning,…

Soumith Chintala Reposted

Soumith Chintala

@soumithchintala

29 Oct

Scent Teleportation Update: WE DID IT! #Osmo #TechNews #AI #Scent

Soumith Chintala Reposted

Soumith Chintala

@soumithchintala

25 Oct

🤖 How can robot policies zero-shot generalize to any new environment and any new object? Introducing our new project: 🚀Data Scaling Laws in Imitation Learning for Robotic Manipulation🚀—bringing us closer to the dream of having robots work as waiters in hot pot restaurants! 🍲

Soumith Chintala Reposted

Soumith Chintala

@soumithchintala

17 Oct

Transformers can be trained to solve a 132-years old open problem: discovering global Lyapunov functions. New paper on Arxiv (accepted in NeurIPS 2024), with @albe_alfa and @Amaury_Hayat arxiv.org/abs/2410.08304 1/8

Soumith Chintala

@soumithchintala

17 Oct

this is correct. i find the Tesla humanoids extremely impressive. teleoping hands is not a big deal for now.

Whole Mars Catalog

@WholeMarsBlog

17 Oct

You can’t “teleoperate” balance, stability, or walking. Which other humanoids have demoed untethered in a live crowd again?

Soumith Chintala Reposted

Soumith Chintala

@soumithchintala

17 Oct

Some updates on our autonomy capabilities in this video:
* Can dock and charge itself
* Navigate around humans
* Carry a tray
* Walk up stairs
* Give snacks to humans
None of these shots are teleoperated.

Tesla Optimus

@Tesla_Optimus

17 Oct

Navigating by myself

Soumith Chintala

@soumithchintala

15 Oct

really cool work, building-scale robot tasks!

Rutav

@rutavms

14 Oct

🤖 Want your robot to grab you a drink from the kitchen downstairs? 🚀 Introducing BUMBLE: a framework to solve building-wide mobile manipulation tasks by harnessing the power of Vision-Language Models (VLMs). 👇 (1/5) 🌐 robin-lab.cs.utexas.edu/BUMBLE

Soumith Chintala

@soumithchintala

13 Oct

SpaceX continues to be incredible. The engineering feats are absolutely nuts. What I find more incredible is SpaceX as an organization -- can execute structured long-term research and engineering bets without bureaucracy and with high velocity. 99.999% of organizations at this…

SpaceX

@SpaceX

13 Oct

Mechazilla has caught the Super Heavy booster!

Soumith Chintala Reposted

Soumith Chintala

@soumithchintala

11 Oct

Announcing INTELLECT-1: the first-ever decentralized training of a 10B model
Scaling decentralized training 10x beyond prior efforts.
Anyone can join us to build open-source AGI 🦋

Soumith Chintala Reposted

Soumith Chintala

@soumithchintala

4 Oct

So how did we get to these amazing videos for Meta Movie Gen? One of the things I’m proudest of is that we released a very detailed technical report (ai.meta.com/research/movie…) Lets dive into a technical summary of what we did & learnt 🧵 1/n x.com/AIatMeta/statu…

AI at Meta

@AIatMeta

4 Oct

🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to-date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in…

Soumith Chintala Reposted

Soumith Chintala

@soumithchintala

4 Oct

So, this is what we were up to for a while :)
Building SOTA foundation models for media -- text-to-video, video editing, personalized videos, video-to-audio
One of the most exciting projects I got to tech lead at my time in Meta!

AI at Meta

@AIatMeta

4 Oct

Soumith Chintala

@soumithchintala

4 Oct

yes, Meta released a full scientific paper on MovieGen, with a lot of details that'll help the field move forward.

near

@nearcyan

4 Oct

meta movie gen announced this morning
• 30B video model, 7B upsampler
• 1080p 4-16s videos
• info on sampling, training, basic dataset curation pipeline details less frequently shared
• 'potential release' after much time and feedback...
page: ai.meta.com/research/movie…

Soumith Chintala

@soumithchintala

4 Oct

the MovieGen stack got a huge upgrade. compared to others, the Editing and Personalization features are really on a different level! producing personalized videos from a single photo of a person was quite mind-blowing to me.

Ahmad Al-Dahle

@Ahmad_Al_Dahle

4 Oct

I couldn’t be more excited to share our latest AI research breakthrough. We call it Meta Movie Gen and it’s a collection of state-of-the-art models that combine to deliver the most advanced video generation capability ever created. Check it out: ai.meta.com/research/movie…

Soumith Chintala

@soumithchintala

2 Oct

"How to train a model on 10k H100 GPUs?" has now been immortalized on my blog: soumith.ch/blog/2024-10-0…

Soumith Chintala

@soumithchintala

2 Oct

There's three parts.
1. Fitting as large of a network and as large of a batch-size as possible onto the 10k/100k/1m H100s -- parallelizing and using memory-saving tricks.
2. Communicating state between these GPUs as quickly as possible
3. Recovering from failures (hardware,…