Michael Carbin

@mcarbin

Associate Professor in EECS at @MIT | Founding Advisor at @mosaicml | Programming Systems | Neural Networks | Approximate Computing

Joined September 2007
Similar Users

Jonathan Frankle @jefrankle
Sebastien Bubeck @SebastienBubeck
Zoubin Ghahramani @ZoubinGhahrama1
Zico Kolter @zicokolter
Greg Yang @TheGregYang
Sanja Fidler @FidlerSanja
Sham Kakade @ShamKakade6
Aleksander Madry @aleks_madry
Tom Goldstein @tomgoldsteincs
Elad Hazan @HazanPrinceton
Gintare Karolina Dziugaite @gkdziugaite
Neeraja Yadwadkar @NeerajaJY
Roger Grosse @RogerGrosse
Isil Dillig @IsilDillig
John Langford @JohnCLangford

Pinned

Meet MPT-7B and its children! Proud of the @MosaicML team for delivering a new open standard for LLMs. Built with the same data, training, and eval tools that we make available to you to build your own LLM.

MPT is here! Check out our shiny new LLMs, open-source w/commercial license. The base MPT-7B model is 7B params trained on 1T tokens and reaches LLaMA-7B quality. We also created Instruct (commercial), Chat, and (my favorite) StoryWriter-65k+ variants. 🧵 mosaicml.com/blog/mpt-7b



Michael Carbin Reposted

Super excited to join @databricks Mosaic Research team @DbrxMosaicAI! Looking forward to solving important research challenges in foundation models with my amazing team members and unlocking their full potential to accurately solve real-world problems.


Michael Carbin Reposted

Excited to announce our new work: Critique-out-Loud (CLoud) reward models. CLoud reward models first produce a chain of thought critique of the input before predicting a scalar reward, allowing reward models to reason explicitly instead of implicitly! arxiv.org/abs/2408.11791
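A minimal sketch of the critique-then-reward idea described in the tweet, assuming a Hugging Face-style causal LM and tokenizer; the class, prompt format, and pooling choice below are illustrative stand-ins, not the paper's implementation:

import torch
import torch.nn as nn

class CloudStyleRewardModel(nn.Module):
    """Toy two-step reward model: generate a critique, then score conditioned on it."""
    def __init__(self, causal_lm, tokenizer, hidden_size):
        super().__init__()
        self.lm = causal_lm                          # generates the critique and encodes text
        self.tok = tokenizer
        self.reward_head = nn.Linear(hidden_size, 1) # maps a pooled hidden state to a scalar reward

    def forward(self, prompt, response):
        # Step 1: produce an explicit chain-of-thought critique of the response.
        critique_in = self.tok(f"Critique this response.\n{prompt}\n{response}", return_tensors="pt")
        critique_ids = self.lm.generate(**critique_in, max_new_tokens=128)
        critique = self.tok.decode(critique_ids[0], skip_special_tokens=True)
        # Step 2: condition the scalar reward on prompt, response, and the critique.
        scored = self.tok(f"{prompt}\n{response}\n{critique}", return_tensors="pt")
        last_hidden = self.lm(**scored, output_hidden_states=True).hidden_states[-1]
        return self.reward_head(last_hidden[:, -1, :])  # reward read off the final token's state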


Michael Carbin Reposted

Personal News: I started today at @Databricks @DbrxMosaicAI as a Research Scientist Intern in the Bay Area! 🧑🏻‍💻 ✨ I'm excited to push research forward on the RAG team, build standardized benchmarks, and improve RAG-based evaluations. Anyone in SF, let's hang out and meet up! 🤝


Michael Carbin Reposted

Function calling significantly enhances the utility of LLMs in real-world applications; however, evaluating and improving this capability isn't easy — and no one benchmark tells the whole story. Learn more about our approach in the latest blog from @databricks:…


Michael Carbin Reposted

Does long context solve RAG? We found that many long-context models fail in specific and weird ways as you grow context length, making the optimal system design non-obvious. Some models tend to say there's a copyright issue, some tend to summarize, etc. databricks.com/blog/long-cont…


Michael Carbin Reposted

How well do the latest long-context LLMs (Llama-3.1-405b, GPT-4o-mini, and Claude-3.5-sonnet) perform on RAG? We benchmarked 13 popular OSS and commercial models on context lengths from 2k to 125k, and the results are very interesting! Full post: databricks.com/blog/long-cont…


Michael Carbin Reposted

New paper where we explore using a small LM’s perplexity to prune the pretraining data for larger LMs. We find that small LMs can prune data for up to 30x larger LMs, data pruning works in the overtrained and data-constrained regimes, and more! arxiv.org/abs/2405.20541
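A minimal sketch of the pruning idea, assuming a Hugging Face causal LM as the small reference model; gpt2 and the "keep the lowest-perplexity 70%" rule are placeholders, since the paper studies several selection criteria:

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")               # small reference LM (placeholder choice)
lm = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def perplexity(text):
    ids = tok(text, return_tensors="pt", truncation=True, max_length=1024).input_ids
    with torch.no_grad():
        loss = lm(ids, labels=ids).loss                   # mean next-token cross-entropy
    return torch.exp(loss).item()

corpus = ["first candidate document ...", "second candidate document ..."]  # placeholder docs
ranked = sorted(corpus, key=perplexity)
kept = ranked[: int(0.7 * len(ranked))]                   # prune the highest-perplexity tail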


Michael Carbin Reposted

You know your CTO (@matei_zaharia) got the dog in him when the company is worth 40B+ and he's still looking at data and labeling


Michael Carbin Reposted

DSPy x DBRX 🔥

Ready to use a programmatic approach to prompting #LLMs and building #RAG applications? The @stanfordnlp #dspy repo includes support for @databricks Model Serving and Vector Search! Details: databricks.com/blog/dspy-data…
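A hedged sketch of what the programmatic approach looks like in DSPy; the Databricks serving-endpoint string and the example text are placeholders, not taken from the blog post:

import dspy

# Point DSPy at a served model; the endpoint name below is hypothetical.
lm = dspy.LM("databricks/my-serving-endpoint")
dspy.settings.configure(lm=lm)

# Declare the task as a signature instead of hand-writing a prompt.
rag_answer = dspy.ChainOfThought("context, question -> answer")
result = rag_answer(context="DBRX is an open MoE LLM from Databricks.",
                    question="Who released DBRX?")
print(result.answer)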



Michael Carbin Reposted

I tried getting GPT-4-turbo to generate useful code from the OpenAI Assistants docs. It failed. Claude Opus did better, but it's bad at coding. The new DBRX absolutely spanked the other models. chatcraft.org/api/share/tara…


Michael Carbin Reposted

@DbrxMosaicAI DBRX outperforms @OpenAI GPT-4 on realistic, domain-specific benchmark datasets. For example, on a customer support summarization use case 👇👇👇 Still neck and neck, but it shows that open models can be the no-brainer choice for actual enterprise applications.


Michael Carbin Reposted

Speaking of Mosaic/Databricks, I've ported so much code to versions of Composer/Streaming. It's just so good.

It’s finally here 🎉🥳 In case you missed us, MosaicML/Databricks is back at it with a new best-in-class open-weight LLM named DBRX: an MoE with 132B total parameters and 36B active, a 32k context length, and trained on 12T tokens 🤯



Michael Carbin Reposted

If you're curious about how DBRX was trained come by!

Curious about #DBRX and how it was trained? Join @abhi_venigalla and @ajaysaini725 to learn about the model and the @databricks platform that trained it! Hosted by our own Eric Peter, and the AI Alliance's @TimBonnemann and @ChiefScientist! lu.ma/kiidiyeb



How to Science: 1) replicate your comparisons where possible, 2) build the strongest baselines you can think of, 3) ablate your work to death. If your idea survives all that, it might stand a chance. Oh yeah, and report it all in the appendix! Pass that science along.

I'm writing this cause I'm a bit salty. We've implemented so many seemingly promising, published & popular papers only for them to utterly flop. At least I like to think that my personal bs Big Model paper classifier is now pretty good given my extensive training data.



Michael Carbin Reposted

Hi all, a few updates on MegaBlocks 🧵 github.com/databricks/meg…


Michael Carbin Reposted

It begins...


Michael Carbin Reposted

Best open model right now and >3x more efficient to serve than GPT-3.5 and 4 on our platform!

Meet DBRX, a new SOTA open LLM from @databricks. It's a 132B MoE with 36B active params trained from scratch on 12T tokens. It sets a new bar on all the standard benchmarks, and, as an MoE, inference is blazingly fast. Simply put, it's the model your data has been waiting for.
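To make the "active vs. total parameters" point concrete, here is a toy top-k-routed MoE layer; the sizes and the per-token Python loop are for illustration only and are not DBRX's configuration:

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoELayer(nn.Module):
    def __init__(self, d_model=64, d_ff=256, num_experts=16, top_k=4):
        super().__init__()
        self.router = nn.Linear(d_model, num_experts)     # scores each expert per token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )
        self.top_k = top_k

    def forward(self, x):                                 # x: [num_tokens, d_model]
        weights, idx = self.router(x).topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for t in range(x.size(0)):                        # simple per-token dispatch
            for slot in range(self.top_k):
                out[t] += weights[t, slot] * self.experts[int(idx[t, slot])](x[t])
        return out

# Each token touches only top_k / num_experts of the expert weights (4/16 here),
# which is why an MoE like DBRX runs far fewer parameters per token than its total size.
x = torch.randn(8, 64)
print(TinyMoELayer()(x).shape)                            # torch.Size([8, 64])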



Michael Carbin Reposted

Scoop: Grok had a good run but there’s a new open source model that beats out the rest: DBRX. I got an inside look at the impressive work that went into building it: wired.com/story/dbrx-ins…

