Alex Havrilla

@Dahoas1

Georgia Tech ML Researcher studying neural network learning theory and LLMs for mathematical reasoning. Intern at FAIR, MSR. Co-founder of CarperAI.


Alex Havrilla Reposted

Excited to share work from my internship at @AIatMeta! LLM devs often tweak decoding temperature: low for analytical tasks, and high for creative ones. Why not learn this from the data? Introducing the AdaptiveDecoder! (1/3)🧵

🚨 Adaptive Decoding via Latent Preference Optimization 🚨 - New layer added to Transformer, selects decoding params automatically *per token* - Learnt via new method, Latent Preference Optimization - Outperforms any fixed temperature decoding method, choosing creativity or…
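The per-token idea above can be sketched in a few lines. This is illustrative only, not the paper's implementation: `temp_head` stands in for the learned layer that scores candidate temperatures from the hidden state, and the candidate temperature set is made up.

```python
import math
import random

def softmax(xs, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    zs = [x / temperature for x in xs]
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    s = sum(es)
    return [e / s for e in es]

def adaptive_decode_step(logits, hidden, temp_head, temps=(0.1, 0.7, 1.2), rng=None):
    """One decoding step with a learned per-token temperature choice.

    `temp_head` is a hypothetical callable mapping the hidden state to scores
    over the candidate temperatures; the sampled temperature then scales the
    token logits before the token itself is sampled.
    """
    rng = rng or random.Random(0)
    temp_probs = softmax(temp_head(hidden))               # which temperature?
    t = rng.choices(temps, weights=temp_probs, k=1)[0]    # sample one
    token_probs = softmax(logits, temperature=t)          # decode with it
    token = rng.choices(range(len(logits)), weights=token_probs, k=1)[0]
    return token, t
```

A low-scoring head would concentrate mass on the cold temperature for analytical spans and the hot one for creative spans; here the uniform stand-in just makes the control flow concrete.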

Alex Havrilla Reposted

✨ New Evaluation Benchmark for Reward Models - We Go Multilingual! ✨ Introducing M-RewardBench: A massively multilingual RM evaluation benchmark covering 23 typologically different languages across 5 tasks. Paper, code, dataset: m-rewardbench.github.io Our contributions: 1/9

Alex Havrilla Reposted

Do you work in AI? Do you find things uniquely stressful right now, like never before? Have you ever suffered from a mental illness? Read my personal experience of those challenges here: docs.google.com/document/d/1aE…


Super cool work from SynthLabs on generative reward modeling!

🎭 Introducing Generative Reward Models (GenRM): A novel framework for preference learning that improves traditional reward models by up to 45%! 🤖 CoT-GenRM leverages self-generated reasoning traces and iterative training to enable test-time compute for better alignment of…
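The generative-RM loop can be made concrete with a minimal sketch: instead of a scalar reward head, the judge model generates a reasoning trace and a verdict that gets parsed out. The prompt template and verdict format below are hypothetical stand-ins, not the actual GenRM prompts.

```python
def genrm_prompt(question, answer_a, answer_b):
    """Build a CoT-style judging prompt (illustrative template only)."""
    return (
        f"Question: {question}\n"
        f"Answer A: {answer_a}\n"
        f"Answer B: {answer_b}\n"
        "Think step by step about which answer is better, "
        "then finish with 'Verdict: A' or 'Verdict: B'."
    )

def parse_verdict(generation):
    """Extract the preference label from the model's reasoning trace."""
    for line in reversed(generation.strip().splitlines()):
        if "Verdict:" in line:
            return line.split("Verdict:")[-1].strip()[:1]
    return None  # malformed generation: no verdict found
```

The generation step itself (calling the judge LLM on `genrm_prompt(...)`) is where test-time compute enters: longer or repeated reasoning traces can be sampled before parsing a final verdict.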

Alex Havrilla Reposted

PERSONA A Reproducible Testbed for Pluralistic Alignment The rapid advancement of language models (LMs) necessitates robust alignment with diverse user values. However, current preference optimization approaches often fail to capture the plurality of user opinions,

Presenting now at number 2716!

I'm at ICML presenting GLoRe (arxiv.org/abs/2402.10963) and Teaching Reasoning with RL (arxiv.org/abs/2403.04642)! If you'd like to chat about synthetic data, process-based rewards, open-endedness, or theoretical foundations of scaling laws (or anything else) my DMs are open!


Come help us build a PRM benchmark!

Interested in benchmarking and improving Process Based Reward models (PRMs)? Come join our project to extend RewardBench by building a benchmark with fine-grained step-level feedback across many hard STEM and agentic type tasks! Project doc: docs.google.com/document/d/1S2… Discord:…
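To illustrate what step-level (process) feedback means versus outcome-only grading, here is a toy scoring loop. `step_scorer` is a stand-in for a real PRM, and the all-steps-pass rule with a fixed threshold is an assumption for illustration.

```python
def score_solution(steps, step_scorer, threshold=0.5):
    """Grade a multi-step solution with per-step (process) rewards.

    `step_scorer` maps (context steps so far, current step) to a score in
    [0, 1]; a fine-grained benchmark checks every step, not just the answer.
    """
    scores = [step_scorer(steps[:i], step) for i, step in enumerate(steps)]
    return scores, all(s >= threshold for s in scores)

# Toy PRM stand-in: flags any step marked "wrong".
toy_prm = lambda context, step: 0.2 if "wrong" in step else 0.9
```

A solution with a correct final answer but a flawed intermediate step fails this check, which is exactly the signal an outcome-only reward model misses.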



Alex Havrilla Reposted

🚨 Pass by our #ICLR2024 workshop on Generative AI for Decision Making tomorrow, Saturday May 11! 🚨 We have a content-heavy day, including an exciting lineup of invited and contributed talks, as well as two poster sessions! Details: iclr.cc/virtual/2024/w…


Alex Havrilla Reposted

[1/4] Introducing “A Primer on the Inner Workings of Transformer-based Language Models”, a comprehensive survey on interpretability methods and the findings into the functioning of language models they have led to. ArXiv: arxiv.org/pdf/2405.00208

In my humble opinion the recent Stream of Search paper (arxiv.org/abs/2404.03683) is truly outstanding. Everyone should give it a thorough read.


Alex Havrilla Reposted

The 3 key elements of a good dataset: 1. quality 2. diversity 3. quantity You can only easily measure the last one but the performance is a sensitive function of all three. Super interesting topic ty for #longread :)!
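Quantity is just `len(dataset)`. A common cheap proxy for the diversity element is distinct-n, the fraction of unique n-grams in the corpus; it is an imperfect measure (quality still needs human or model review), shown here only to make the point concrete.

```python
from collections import Counter

def distinct_n(texts, n=2):
    """Diversity proxy: unique n-grams / total n-grams across a corpus."""
    ngrams = []
    for text in texts:
        tokens = text.split()
        ngrams += [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)
```

Exact duplicates drag the score down hard, which is the failure mode this proxy catches; paraphrased near-duplicates slip past it, which is why diversity is the hard one to measure.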


Alex Havrilla Reposted

I've finally uploaded the thesis on arXiv: arxiv.org/abs/2404.12150 It ties together a bunch of papers exploring some alternatives to RL for finetuning LMs, including pretraining with human preferences and minimizing KL divergences from pre-defined target distributions.

I was very impressed with @tomekkorbak's thesis! Some really nice insights into LLM alignment: 1) RL is not the way --> distribution matching lets us target constraints like "generate as many of these as of those" 2) fine-tuning is not the way --> PHF aligns during pre-training
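The distribution-matching view boils down to minimizing a KL divergence between a pre-defined target distribution and the model's distribution. A tiny worked example, where a target encoding a toy constraint like "as many a's as b's" is my own illustration, not the thesis's setup:

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions given as dicts token -> prob."""
    kl = 0.0
    for tok, p_tok in p.items():
        if p_tok > 0:
            kl += p_tok * math.log(p_tok / max(q.get(tok, 0.0), eps))
    return kl

target = {"a": 0.5, "b": 0.5}  # constraint: a's and b's equally often
model  = {"a": 0.9, "b": 0.1}  # a biased model to be corrected
```

Training pushes `model` toward `target` until this divergence hits zero, i.e. the constraint is satisfied exactly in distribution rather than rewarded sample by sample as in RL.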



Alex Havrilla Reposted

I am super excited to share our Llama3 preview models (8B and 70B). I am proud to have been a part of this amazing effort over the past 8 months. We still have some super cool stuff coming up in the coming months... until then, enjoy playing with these preview models…

Had a great time during our discussion, thanks again for having me!

Today we're joined by @Dahoas1 from @GeorgiaTech to discuss the reasoning capability of language models and the potential to improve it with traditional RL methods 🎧 / 🎥 Listen to the episode at: twimlai.com/go/680. 📖 CHAPTERS 00:00 - Introduction 02:19 - RL vs RLHF…



Alex Havrilla Reposted

How to define Diversity in the context of CodeLMs and Programming Languages? 1. Diversity is positively correlated with Performance in solving a problem. 2. Shortcomings of diversity in small CodeLMs. 3. Code Embedding models don't capture semantics. reshinthadithyan.github.io/blog/2023/code…


Alex Havrilla Reposted

Happy to share our work on reproducing RLHF scaling behaviors in @OpenAI's work in summarizing from feedback. We built an RLHF pipeline from scratch and enumerated over 20+ implementation details 🚀 Fun collab with @mnoukhov, @arianTBD, @krasul, @weixunwang, and @_lewtun 📜…

