Alex Havrilla

@Dahoas1

Georgia Tech ML Researcher studying neural network learning theory and LLMs for mathematical reasoning. Intern at FAIR, MSR. Co-founder of CarperAI.


Alex Havrilla Reposted

Excited to share work from my internship at @AIatMeta! LLM devs often tweak decoding temperature: low for analytical tasks, and high for creative ones. Why not learn this from the data? Introducing the AdaptiveDecoder! (1/3)🧵

🚨 Adaptive Decoding via Latent Preference Optimization 🚨 - New layer added to Transformer, selects decoding params automatically *per token* - Learnt via new method, Latent Preference Optimization - Outperforms any fixed temperature decoding method, choosing creativity or…
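The per-token idea above can be sketched in a few lines. This is illustrative only, not the paper's implementation: `temp_head` stands in for the learned layer that scores candidate temperatures from the hidden state, and the candidate temperature set is made up.

```python
import math
import random

def softmax(xs, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    zs = [x / temperature for x in xs]
    m = max(zs)
    es = [math.exp(z - m) for z in zs]
    s = sum(es)
    return [e / s for e in es]

def adaptive_decode_step(logits, hidden, temp_head, temps=(0.1, 0.7, 1.2), rng=None):
    """One decoding step with a learned per-token temperature choice.

    `temp_head` is a hypothetical callable mapping the hidden state to scores
    over the candidate temperatures; the sampled temperature then scales the
    token logits before the token itself is sampled.
    """
    rng = rng or random.Random(0)
    temp_probs = softmax(temp_head(hidden))               # which temperature?
    t = rng.choices(temps, weights=temp_probs, k=1)[0]    # sample one
    token_probs = softmax(logits, temperature=t)          # decode with it
    token = rng.choices(range(len(logits)), weights=token_probs, k=1)[0]
    return token, t
```

A low-scoring head would concentrate mass on the cold temperature for analytical spans and the hot one for creative spans; here the uniform stand-in just makes the control flow concrete.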

Alex Havrilla Reposted

✨ New Evaluation Benchmark for Reward Models - We Go Multilingual! ✨ Introducing M-RewardBench: A massively multilingual RM evaluation benchmark covering 23 typologically different languages across 5 tasks. Paper, code, dataset: m-rewardbench.github.io Our contributions: 1/9

Alex Havrilla Reposted

Do you work in AI? Do you find things uniquely stressful right now, like never before? Have you ever suffered from a mental illness? Read my personal experience of those challenges here: docs.google.com/document/d/1aE…


Super cool work from SynthLabs on generative reward modeling!

🎭 Introducing Generative Reward Models (GenRM): A novel framework for preference learning that improves traditional reward models by up to 45%! 🤖 CoT-GenRM leverages self-generated reasoning traces and iterative training to enable test-time compute for better alignment of…
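The generative-RM loop can be made concrete with a minimal sketch: instead of a scalar reward head, the judge model generates a reasoning trace and a verdict that gets parsed out. The prompt template and verdict format below are hypothetical stand-ins, not the actual GenRM prompts.

```python
def genrm_prompt(question, answer_a, answer_b):
    """Build a CoT-style judging prompt (illustrative template only)."""
    return (
        f"Question: {question}\n"
        f"Answer A: {answer_a}\n"
        f"Answer B: {answer_b}\n"
        "Think step by step about which answer is better, "
        "then finish with 'Verdict: A' or 'Verdict: B'."
    )

def parse_verdict(generation):
    """Extract the preference label from the model's reasoning trace."""
    for line in reversed(generation.strip().splitlines()):
        if "Verdict:" in line:
            return line.split("Verdict:")[-1].strip()[:1]
    return None  # malformed generation: no verdict found
```

The generation step itself (calling the judge LLM on `genrm_prompt(...)`) is where test-time compute enters: longer or repeated reasoning traces can be sampled before parsing a final verdict.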

Alex Havrilla Reposted

PERSONA A Reproducible Testbed for Pluralistic Alignment The rapid advancement of language models (LMs) necessitates robust alignment with diverse user values. However, current preference optimization approaches often fail to capture the plurality of user opinions,

Presenting now at number 2716!

I'm at ICML presenting GLoRe (arxiv.org/abs/2402.10963) and Teaching Reasoning with RL (arxiv.org/abs/2403.04642)! If you'd like to chat about synthetic data, process-based rewards, open-endedness, or theoretical foundations of scaling laws (or anything else) my DMs are open!


Come help us build a PRM benchmark!

Interested in benchmarking and improving Process Based Reward models (PRMs)? Come join our project to extend RewardBench by building a benchmark with fine-grained step-level feedback across many hard STEM and agentic type tasks! Project doc: docs.google.com/document/d/1S2… Discord:…
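To illustrate what step-level (process) feedback means versus outcome-only grading, here is a toy scoring loop. `step_scorer` is a stand-in for a real PRM, and the all-steps-pass rule with a fixed threshold is an assumption for illustration.

```python
def score_solution(steps, step_scorer, threshold=0.5):
    """Grade a multi-step solution with per-step (process) rewards.

    `step_scorer` maps (context steps so far, current step) to a score in
    [0, 1]; a fine-grained benchmark checks every step, not just the answer.
    """
    scores = [step_scorer(steps[:i], step) for i, step in enumerate(steps)]
    return scores, all(s >= threshold for s in scores)

# Toy PRM stand-in: flags any step marked "wrong".
toy_prm = lambda context, step: 0.2 if "wrong" in step else 0.9
```

A solution with a correct final answer but a flawed intermediate step fails this check, which is exactly the signal an outcome-only reward model misses.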



Alex Havrilla Reposted

🚨 Pass by our #ICLR2024 workshop on Generative AI for Decision Making tomorrow, Saturday May 11! 🚨 We have a content-heavy day, including an exciting lineup of invited and contributed talks, as well as two poster sessions! Details: iclr.cc/virtual/2024/w…


Alex Havrilla Reposted

[1/4] Introducing “A Primer on the Inner Workings of Transformer-based Language Models”, a comprehensive survey on interpretability methods and the findings into the functioning of language models they have led to. ArXiv: arxiv.org/pdf/2405.00208

In my humble opinion the recent Stream of Search paper (arxiv.org/abs/2404.03683) is truly outstanding. Everyone should give it a thorough read.


Alex Havrilla Reposted

The 3 key elements of a good dataset: 1. quality 2. diversity 3. quantity You can only easily measure the last one but the performance is a sensitive function of all three. Super interesting topic ty for #longread :)!
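Quantity is just `len(dataset)`. A common cheap proxy for the diversity element is distinct-n, the fraction of unique n-grams in the corpus; it is an imperfect measure (quality still needs human or model review), shown here only to make the point concrete.

```python
from collections import Counter

def distinct_n(texts, n=2):
    """Diversity proxy: unique n-grams / total n-grams across a corpus."""
    ngrams = []
    for text in texts:
        tokens = text.split()
        ngrams += [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)
```

Exact duplicates drag the score down hard, which is the failure mode this proxy catches; paraphrased near-duplicates slip past it, which is why diversity is the hard one to measure.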


Alex Havrilla Reposted

I've finally uploaded the thesis on arXiv: arxiv.org/abs/2404.12150 It ties together a bunch of papers exploring some alternatives to RL for finetuning LMs, including pretraining with human preferences and minimizing KL divergences from pre-defined target distributions.

I was very impressed with @tomekkorbak's thesis! Some really nice insights into LLM alignment: 1) RL is not the way --> distribution matching lets us target constraints like "generate as many of these as of those" 2) fine-tuning is not the way --> PHF aligns during pre-training
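The distribution-matching view boils down to minimizing a KL divergence between a pre-defined target distribution and the model's distribution. A tiny worked example, where a target encoding a toy constraint like "as many a's as b's" is my own illustration, not the thesis's setup:

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for discrete distributions given as dicts token -> prob."""
    kl = 0.0
    for tok, p_tok in p.items():
        if p_tok > 0:
            kl += p_tok * math.log(p_tok / max(q.get(tok, 0.0), eps))
    return kl

target = {"a": 0.5, "b": 0.5}  # constraint: a's and b's equally often
model  = {"a": 0.9, "b": 0.1}  # a biased model to be corrected
```

Training pushes `model` toward `target` until this divergence hits zero, i.e. the constraint is satisfied exactly in distribution rather than rewarded sample by sample as in RL.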



Alex Havrilla Reposted

I am super excited to share our Llama3 preview models (8B and 70B). I am proud to have been a part of this amazing effort over the past 8 months. We still have some super cool stuff coming up in the coming months... until then, enjoy playing with these preview models…

Had a great time during our discussion, thanks again for having me!

Today we're joined by @Dahoas1 from @GeorgiaTech to discuss the reasoning capability of language models and the potential to improve it with traditional RL methods 🎧 / 🎥 Listen to the episode at: twimlai.com/go/680. 📖 CHAPTERS 00:00 - Introduction 02:19 - RL vs RLHF…



Alex Havrilla Reposted

How to define Diversity in the context of CodeLMs and Programming Languages? 1. Diversity is positively correlated with Performance in solving a problem. 2. Shortcomings of diversity in small CodeLMs. 3. Code Embedding models don't capture semantics. reshinthadithyan.github.io/blog/2023/code…


Alex Havrilla Reposted

Happy to share our work on reproducing RLHF scaling behaviors in @OpenAI's work in summarizing from feedback. We built an RLHF pipeline from scratch and enumerated over 20+ implementation details 🚀 Fun collab with @mnoukhov, @arianTBD, @krasul, @weixunwang, and @_lewtun 📜…

