Similar User
@chitanerk
@doveZYM
@chenfly6
@Br_MonMx_Roy
@bfdlin
@liuyanxue2
@canselozann
@owlkery0702
@MKGodknows
@xiang088
@sulczgabor
@Moneyvism
@StevenChen2022
@carmentjwern
@wilsen_0xfa6606
I don’t know why I didn’t work on this at early OpenAI, despite going around everywhere giving talks about the magic of autoregressive language models around that time. I went deep into RL like everyone else that time. Biggest, most confusing research career mistake ever
"LLMs can't reason, look at how their accuracy drops if you change the numbers in the problem!!!1!" The accuracy drop (%):
Ah, i missed this new discrete diffusion in town. GPT is hitting the wall?
Steerability is the next frontier of generative models! Having knobs that control the behavior of AI systems will greatly improve their safety & usability. I’m very excited to present ✨Conditional Language Policy (CLP)✨, a multi-objective RL framework for steering language…
There is a nuanced but important difference between chain-of-thought before and after o1. Before the o1 paradigm (i.e., chain-of-thought prompting), there was a mismatch between what chain of thought was and what we wanted it to be. We wanted chain of thought to reflect the…
RLHF provably can't teach models any new knowledge. If you need to teach new skills, you need to look at pre-training and SFT. Why? 👇
Same here. In the original "Let's Verify Step by Step" paper, the process reward model represents the immediate reward in RL, but in many recent papers, the term process reward model means the value function actually
📢Annoucing EDLM, our brand-new Energy-based Language Model embedded with Diffusion framework! Key results: 1. We (for the first time?) almost match AR perplexity. 2. Significantly improved generation quality. 3. Considerable sampling speedup without quality drop. 🧵1/n
I'm very excited to announce that I'm joining @AnthropicAI this week, after 10 amazing years at @GoogleDeepMind ! Thank you also to all the amazing people I got to meet and work with, and I'm really looking forward to meeting all my new colleagues 💖 ! furidamu.org/blog/2024/10/2…
Autoregressive language models, despite their impressive capabilities, sometimes struggle with complex reasoning and long-term planning tasks. Can we go beyond autoregression on these challanges?🤔
Video generation can serve as world models and embodied planning tools, but they must be grounded in the physical world. Check out: VideoAgent for self-improving video generation using feedback from VLMs and action executions. Paper: arxiv.org/abs/2410.10076 Code:…
Meta have shared an LLM pretraining codebase github.com/facebookresear…
I just tried out playing Counter-Strike in a neural network on my MacBook. In my first run, it diverged into mush pretty quickly. The recording is sped up 5x.
🚨 Exciting new results with dense process reward models (PRMs) for reasoning. Our PRMs scale ✅ search compute by 1.5-5x ✅ RL sample efficiency by 6x ✅ 3-4x ⬆️ accuracy gains vs prior works ❌ human supervision What's the secret sauce 🤔?: See 🧵 ⬇️ arxiv.org/pdf/2410.08146
We just released Pixtral 12B paper on Arxiv: arxiv.org/abs/2410.07073
The cool thing: This does not only apply to papers. It works whenever you consume information. Watching a youtube video, listening to a podcast. Make it active! Reflect on why you consume it, extract the important bits, then repeat them actively to memorize. Hope it helps 🤗
🚨This week’s top AI/ML research papers: - MovieGen - Were RNNs All We Needed? - Contextual Document Embeddings - RLEF - ENTP - VinePPO - When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1 - LLMs Know More Than…
deepseek multi latent attention is wild cool to see aggressive innovation like this
macOS Sequoia 不再允许使用「Option +数字/字母」配置成快捷键的官方回答,不出意外,是安全原因…行,服,这很 Apple。 forums.developer.apple.com/forums/thread/…
United States Trends
- 1. #AskShadow 2.383 posts
- 2. GOTY 40,6 B posts
- 3. #AskSonic N/A
- 4. Mika 107 B posts
- 5. Balatro 22,2 B posts
- 6. #TheGameAwards 50,1 B posts
- 7. Elden Ring 30,9 B posts
- 8. San Marino 32 B posts
- 9. #ysltrial 3.914 posts
- 10. #Strobepopcat N/A
- 11. Morning Joe 87,6 B posts
- 12. Shadow of the Erdtree 11,7 B posts
- 13. Katie Hobbs 8.805 posts
- 14. Bakkt 7.699 posts
- 15. Metaphor 27,1 B posts
- 16. SOTE 4.020 posts
- 17. Ichiro 4.752 posts
- 18. Astro Bot 14,8 B posts
- 19. Game of the Year 41,1 B posts
- 20. Geoff 10,5 B posts
Who to follow
-
chitaner
@chitanerk -
不萬能風
@doveZYM -
chenfly
@chenfly6 -
BMo
@Br_MonMx_Roy -
bfdlin
@bfdlin -
CryptoNinja
@liuyanxue2 -
Cansel.eth
@canselozann -
貓頭鷹
@owlkery0702 -
bwyz 🐐
@MKGodknows -
xiang꧁IP꧂
@xiang088 -
SulczG ♥️ Memecoin
@sulczgabor -
Fedvity
@Moneyvism -
chdonger ♦️ Tabi 🟧 🛸(🌸, 🌿)🛡️🩸💧꧁IP꧂
@StevenChen2022 -
Carmen Sandiego Teoh
@carmentjwern -
Will-0xfa
@wilsen_0xfa6606
Something went wrong.
Something went wrong.