Vithu Thangarasa
@vithursant19
Machine Learning Research at @CerebrasSystems, previously at @Tesla and @UberAILabs, and former grad student at @uoguelph_mlrg and @VectorInst. Thamilan ௐ.
🚀 2100+ tokens/s with Llama3.1-70B Instruct on @CerebrasSystems Wafer Scale Engine—Zero Loss in Model Quality! Proud to hit this milestone in LLM inference speed, setting a new standard in AI hardware performance. Grateful for the team effort! Let's keep pushing limits with ML…
🚨 Cerebras Inference is now 3x faster: Llama3.1-70B just broke 2,100 tokens/s
- 16x faster than the fastest GPU solution
- 8x faster than GPUs running Llama *3B*
- It's like the perf of a new hardware generation in a single software release
Available now at…
2:4 Sparsity + @AIatMeta Llama-3.1: At @neuralmagic, we've developed a recipe to produce very competitive sparse LLMs, and we are starting by open-sourcing the first one: Sparse-Llama-3.1-8B-2of4. We also show how to leverage it for blazingly fast inference in @vllm_project
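For context, here is a minimal sketch (not Neural Magic's official example) of how a 2:4 sparse checkpoint like this could be served with @vllm_project. The Hugging Face repo id is an assumption pieced together from the names in the tweet; verify it on the Hub.

```python
# Sketch: serving a 2:4 sparse Llama checkpoint with vLLM.
# The repo id below is an assumption based on the tweet, not a confirmed path.
from vllm import LLM, SamplingParams

llm = LLM(model="neuralmagic/Sparse-Llama-3.1-8B-2of4")  # assumed repo id
params = SamplingParams(temperature=0.0, max_tokens=128)

outputs = llm.generate(["Explain 2:4 structured sparsity in one sentence."], params)
print(outputs[0].outputs[0].text)
```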
Here is what instant 405B looks like: Cerebras vs. fastest GPU cloud:
Llama 3.1 405B is now running on Cerebras!
– 969 tokens/s, frontier AI now runs at instant speed
– 12x faster than GPT-4o, 18x Claude, 12x fastest GPU cloud
– 128K context length, 16-bit weights
– Industry's fastest time-to-first-token @ 240ms
Cerebras is capable of offering Llama 3.1 405B at 969 output tokens/s and they have announced they will soon be offering a public inference endpoint 🏁 We have independently benchmarked a private endpoint shared by @CerebrasSystems and have measured 969 output tokens/s, >10X…
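As a rough worked example of what these numbers mean end to end, using only the figures quoted above (240 ms time-to-first-token, 969 output tokens/s):

```python
# Back-of-envelope latency for a streamed 405B response:
# total ≈ time-to-first-token + output_tokens / throughput
TTFT_S = 0.240        # 240 ms time-to-first-token (from the announcement)
TOKENS_PER_S = 969    # measured output speed for Llama 3.1 405B

def response_latency(output_tokens: int) -> float:
    """Approximate seconds until the full response has streamed."""
    return TTFT_S + output_tokens / TOKENS_PER_S

print(f"{response_latency(500):.2f} s for a 500-token answer")  # ~0.76 s
```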
There are 4500+ NeurIPS papers... 🤯 The NeurIPS Navigator lets you search, summarize and instantly chat with the 4500+ papers accepted into NeurIPS 2024, powered by Llama3.1-70b on Cerebras. 👉 neurips.cerebras.ai
Last week, I spoke at @CerebrasSystems's Llamapalooza in front of 400+ people.
But the day before, they dropped a huge announcement: Llama3.1-70b at 2148 tokens/second 🤯
The morning of, I decided to drop everything and make three open-source demos from scratch 👇
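A hedged sketch of the kind of demo involved, assuming the Cerebras Cloud Python SDK (pip install cerebras_cloud_sdk) and the "llama3.1-70b" model id; both are taken from public docs, not from this thread:

```python
# Minimal chat completion against the Cerebras Inference API.
# SDK package name and model id are assumptions from public docs.
import os
from cerebras.cloud.sdk import Cerebras

client = Cerebras(api_key=os.environ["CEREBRAS_API_KEY"])
resp = client.chat.completions.create(
    model="llama3.1-70b",
    messages=[{"role": "user", "content": "Give me a one-line demo idea."}],
)
print(resp.choices[0].message.content)
```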
Congrats @andrewdfeldman and @CerebrasSystems for a huge leap forward and setting a new speed record for serving Llama 3.1-70B. 2100 tokens/sec is blazingly fast for a 70B model. This is great for agentic AI!
Cerebras has launched a major upgrade and is now achieving >2,000 output tokens/s on Llama 3.1 70B, >3x their prior speeds. This is a dramatic new world record for language model inference. @CerebrasSystems' language model inference offering runs on their custom "wafer scale" AI…
Embrace the speed. Let's go fast and far, together! 🧡 @CerebrasSystems
This is how I imagine the future with AR/VR
I’ve walked through poor neighbourhoods in India, Africa and LatAm many times. Yet, I recently walked through one of the most depressing ones in terms of poverty, drug abuse, and sheer hopelessness: San Francisco. Giant tech AI companies promise to make the world a better…
1/5 - Our paper "Self-Data Distillation for Pruned LLMs" has been accepted at the NeurIPS 2024 Workshop on Machine Learning and Compression (neuralcompression.github.io/workshop24), organized by @nyuniversity, @AIatMeta, @UCIrvine. Paper: arxiv.org/abs/2410.09982
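A rough sketch of the core idea as I read it: the original (unpruned) model regenerates each fine-tuning target, and the pruned model is then fine-tuned on those self-generated labels, keeping the training distribution close to the teacher's own. The model id and prompt format below are illustrative assumptions, not the paper's exact recipe (see arxiv.org/abs/2410.09982).

```python
# Illustrative sketch of self-data distillation for pruned LLMs.
from transformers import AutoModelForCausalLM, AutoTokenizer

TEACHER_ID = "meta-llama/Llama-3.1-8B-Instruct"  # assumed base model
tok = AutoTokenizer.from_pretrained(TEACHER_ID)
teacher = AutoModelForCausalLM.from_pretrained(TEACHER_ID)

def self_distill(example: dict) -> dict:
    # Condition the original model on the task and the reference answer so
    # the regenerated label stays close to the teacher's own distribution.
    prompt = (f"{example['instruction']}\n\n"
              f"Reference answer: {example['output']}\n\nRewritten answer:")
    ids = tok(prompt, return_tensors="pt").input_ids
    gen = teacher.generate(ids, max_new_tokens=256, do_sample=False)
    example["output"] = tok.decode(gen[0, ids.shape[1]:], skip_special_tokens=True)
    return example

# distilled = dataset.map(self_distill)  # then fine-tune the pruned model on it
```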
Announcing Llamapalooza NYC on Oct 25! 🦙 Join Cerebras for a one-of-a-kind event around fine-tuning and using llama models in production! Headliners include talks from Hugging Face, Cerebras, Crew AI. We'll also have food and drinks 🍹🍟 RSVP here: lu.ma/d3e81idy…
BREAKING NEWS The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Physics to John J. Hopfield and Geoffrey E. Hinton “for foundational discoveries and inventions that enable machine learning with artificial neural networks.”
Cerebras continues to deliver output speed improvements, breaking the 2,000 tokens/s barrier on Llama 3.1 8B and 550 tokens/s on 70B. Since launching less than a month ago, @CerebrasSystems has continued to improve inference output speed on their custom chips. We…
🚨 Major perf update: Llama3.1-70B now runs at 560 tokens/s
24% faster in 3 weeks
Available now on Cerebras Inference API and chat: inference.cerebras.ai
🚀 Introducing the Dream Machine API. Developers can now build and scale creative products with the world's most popular and intuitive video generation model without building complex tools in their apps. Start today lumalabs.ai/dream-machine/… #LumaDreamMachine
With @CerebrasSystems' Llama3.1-8B now at 1927 tokens/s (up from 1800) and Llama3.1-70B reaching 481 tokens/s (up from 450), it's clear that not all Llama3.1 models are created equal—we remain the most accurate and fastest inference provider in the world! 🥇 🚀 To ensure the…