Kevin Lu
@_kevinlu · @openai. ex-@berkeley_ai, advised by @pabbeel & @imordatch
Come check out o1-mini: SoTA math reasoning in a small package openai.com/index/openai-o… with @ren_hongyu @shengjia_zhao @Eric_Wallace_ & the rest of the OpenAI team
A Codeforces contestant used o1-mini in a live contest today codeforces.com/blog/entry/133… and achieved near master-level performance! Agree with the decision to restrict AI in competition going forwards, and it'll be interesting to see how the contest scene evolves.
Thrilled to release o1-mini, a model near and dear to my heart 💙. o1-mini is an efficient model in the o1 series that’s super performant in STEM reasoning, especially math and coding. I can’t wait to see what you all build with o1-mini!! openai.com/index/openai-o…
“OpenAI says that more than 200 million people use ChatGPT each week […] while API usage has doubled following the release of the company’s cheaper and smarter model GPT-4o mini” Has @OpenAI API usage really doubled in the past five weeks since 4o-mini? theverge.com/2024/8/29/2423…
Exciting Chatbot Arena Update -- GPT-4o mini's result is out! With 4K+ user votes, GPT-4o mini climbs to the top of the leaderboard, now joint #1 with GPT-4o while being 20x cheaper! Significantly better than its early version ("upcoming-gpt-mini") in Arena across the boards.…
Excited to release our new small model, developed by a killer crew of team players. Intelligence per $ is very strong with GPT-4o mini. Your turn, developers! omniminiomniminiomnimini (say it 5 times fast)
I recently joined OpenAI! Come check out our new model: 82% MMLU at 60 cents per 1M output tokens! openai.com/index/gpt-4o-m…
In our new work - Algorithm Distillation - we show that transformers can improve themselves autonomously through trial and error without ever updating their weights. No prompting, no finetuning. A single transformer collects its own data and maximizes rewards on new tasks. 1/N
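To make the mechanism concrete, here is a minimal sketch of the in-context RL idea: a causal transformer is behavior-cloned on whole learning histories of a source RL algorithm, and at test time its weights stay frozen, so any improvement comes only from the growing context. Shapes, dimensions, and the toy training step below are placeholders, not the Algorithm Distillation implementation.

```python
# Minimal sketch of in-context RL via distilling learning histories.
# All shapes and hyperparameters are illustrative placeholders.
import torch
import torch.nn as nn

class InContextPolicy(nn.Module):
    def __init__(self, obs_dim=4, n_actions=3, d_model=64, n_layers=2, max_len=256):
        super().__init__()
        # one token per (obs, prev_action, prev_reward) triple
        self.embed = nn.Linear(obs_dim + n_actions + 1, d_model)
        self.pos = nn.Embedding(max_len, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, n_actions)

    def forward(self, obs, prev_act_onehot, prev_rew):
        # obs: (B,T,obs_dim), prev_act_onehot: (B,T,n_actions), prev_rew: (B,T,1)
        x = torch.cat([obs, prev_act_onehot, prev_rew], dim=-1)
        T = x.shape[1]
        h = self.embed(x) + self.pos(torch.arange(T, device=x.device))
        causal = torch.triu(torch.full((T, T), float("-inf"), device=x.device), diagonal=1)
        return self.head(self.backbone(h, mask=causal))  # action logits per step

# Training: predict the source algorithm's action at every point in its history.
model = InContextPolicy()
opt = torch.optim.Adam(model.parameters(), lr=3e-4)
obs = torch.randn(8, 128, 4)                 # placeholder cross-episodic histories
acts = torch.randint(0, 3, (8, 128))
prev_act = nn.functional.one_hot(acts.roll(1, dims=1), 3).float()
prev_rew = torch.randn(8, 128, 1)
logits = model(obs, prev_act, prev_rew)
loss = nn.functional.cross_entropy(logits.reshape(-1, 3), acts.reshape(-1))
loss.backward(); opt.step()
```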
LLMs like GPT-3 and Codex contain rich world knowledge. In this fun study, we ask whether GPT-like models can plan actions for embodied agents. It turns out that, with apt sanity checks, even vanilla LLMs without any finetuning can generate good high-level plans given a low-level controller.
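The "apt sanity checks" amount to grounding the model's free-form steps in actions the low-level controller actually accepts. A rough sketch of that matching step is below; the bag-of-words embedding, the action set, and the threshold are stand-ins for the pretrained sentence embeddings used in practice.

```python
# Sketch: translate a free-form generated step into the closest admissible
# action, or reject it if nothing in the action set is a plausible match.
from collections import Counter
import math

ADMISSIBLE = ["walk to the kitchen", "open the fridge", "grab the milk", "close the fridge"]

def embed(text):
    # stand-in for a pretrained sentence embedding
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[k] * b[k] for k in a)
    den = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def to_admissible(generated_step, threshold=0.3):
    score, action = max((cosine(embed(generated_step), embed(a)), a) for a in ADMISSIBLE)
    # reject steps the controller cannot ground instead of executing nonsense
    return action if score >= threshold else None

print(to_admissible("Go over to the kitchen"))   # -> "walk to the kitchen"
print(to_admissible("Compose a symphony"))       # -> None
```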
Come chat with us about sequence modeling for reinforcement learning @NeurIPSConf tomorrow (Thurs 12/9) at 8:30-10am PT! gather.town/app/XRWlik7kvt…
Can RL algorithms be replaced with transformer-based language models? We’ve looked at this question with our work on Decision Transformer: Website: sites.google.com/corp/berkeley.… Code: github.com/kzl/decision-t… 1/8
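For intuition, here is a hedged sketch of the return-conditioned token layout, where each timestep contributes (return-to-go, state, action) tokens and the action is predicted from the state token. Dimensions and modules are placeholders rather than the released code at github.com/kzl/decision-t…

```python
# Sketch of the Decision Transformer token layout; placeholder dimensions.
import torch
import torch.nn as nn

class DecisionTransformerSketch(nn.Module):
    def __init__(self, state_dim=11, act_dim=3, d_model=128, max_T=64):
        super().__init__()
        self.embed_rtg = nn.Linear(1, d_model)          # return-to-go token
        self.embed_state = nn.Linear(state_dim, d_model)
        self.embed_action = nn.Linear(act_dim, d_model)
        self.embed_t = nn.Embedding(max_T, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.backbone = nn.TransformerEncoder(layer, num_layers=3)
        self.predict_action = nn.Linear(d_model, act_dim)

    def forward(self, rtg, states, actions, timesteps):
        # rtg: (B,T,1)  states: (B,T,state_dim)  actions: (B,T,act_dim)  timesteps: (B,T)
        B, T = timesteps.shape
        t_emb = self.embed_t(timesteps)
        tokens = torch.stack([
            self.embed_rtg(rtg) + t_emb,
            self.embed_state(states) + t_emb,
            self.embed_action(actions) + t_emb,
        ], dim=2).reshape(B, 3 * T, -1)                 # interleave R_t, s_t, a_t
        mask = torch.triu(torch.full((3 * T, 3 * T), float("-inf"), device=tokens.device), diagonal=1)
        h = self.backbone(tokens, mask=mask)
        return self.predict_action(h[:, 1::3])          # predict a_t from the s_t token

model = DecisionTransformerSketch()
B, T = 4, 16
out = model(torch.randn(B, T, 1), torch.randn(B, T, 11),
            torch.randn(B, T, 3), torch.arange(T).repeat(B, 1))
print(out.shape)  # torch.Size([4, 16, 3])
```

Conditioning on a desired return-to-go is what lets the same sequence model be steered toward high-reward behavior at evaluation time.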
It is currently challenging to measure progress in unsupervised RL without common tasks and a shared evaluation protocol. To take a step toward addressing this issue, we release our #NeurIPS2021 paper: the Unsupervised RL Benchmark (URLB)! Paper: bit.ly/3bwHhY8 Code: bit.ly/3bAvI1S 1/N
Really exciting work looking at how to utilize frozen language models for multimodal tasks! Great to see more successes in this direction.
Our new paper shows how to prompt a pre-trained text language model with a combination of text AND images (🖼️,🔤, 🖼️,🔤, 🖼️,🔤). Keep the language model 🧊 frozen 🧊 and train a vision encoder to embed images into the same space as word sequences. arxiv.org/abs/2106.13884 (1/12)
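The recipe, in essence: keep the language model frozen and train only a vision encoder that emits a few "visual word" embeddings, prepended to the word embeddings as one interleaved prompt. The small self-contained sketch below uses placeholder networks, not the models from the paper.

```python
# Sketch of the "frozen LM + trainable vision encoder" recipe; the tiny
# randomly initialized LM below stands in for a real pretrained model.
import torch
import torch.nn as nn

d_model, vocab, n_visual_tokens = 256, 1000, 2

# stand-in "pretrained" LM: word embeddings + causal transformer + LM head, all frozen
word_emb = nn.Embedding(vocab, d_model)
lm = nn.TransformerEncoder(nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True), num_layers=2)
lm_head = nn.Linear(d_model, vocab)
for module in (word_emb, lm, lm_head):
    for p in module.parameters():
        p.requires_grad_(False)        # the language model stays 🧊 frozen

# trainable vision encoder: image -> a few "visual word" embeddings
vision = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 512), nn.ReLU(),
    nn.Linear(512, n_visual_tokens * d_model),
)
opt = torch.optim.Adam(vision.parameters(), lr=1e-4)

def caption_logits(image, caption_ids):
    vis = vision(image).reshape(image.shape[0], n_visual_tokens, d_model)
    seq = torch.cat([vis, word_emb(caption_ids)], dim=1)   # 🖼️ then 🔤, one prompt
    T = seq.shape[1]
    causal = torch.triu(torch.full((T, T), float("-inf")), diagonal=1)
    return lm_head(lm(seq, mask=causal))[:, n_visual_tokens - 1:-1]  # next-token logits for the caption

image = torch.randn(8, 3, 32, 32)
caption = torch.randint(0, vocab, (8, 12))
loss = nn.functional.cross_entropy(caption_logits(image, caption).reshape(-1, vocab),
                                   caption.reshape(-1))
loss.backward()    # gradients flow through the frozen LM but update only the vision encoder
opt.step()
```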
Excited to share our new work on applying language modeling ideas to RL policy optimization! Jointly led with @lchen915 and other amazing collaborators.
Complementary blog post to our paper, Pretrained Transformers as Universal Computation Engines, has been released! bair.berkeley.edu/blog/2021/03/2…
What are the limits to the generalization of large pretrained transformer models? We find minimal fine-tuning (~0.1% of params) performs as well as training from scratch on a completely new modality! with @_kevinlu, @adityagrover_, @pabbeel paper: arxiv.org/abs/2103.05247 1/8
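Concretely, the ~0.1% figure comes from tuning only the input projection, output head, and layer norms while the pretrained attention and feed-forward blocks stay frozen. Below is a hedged sketch of that parameter selection with a stand-in backbone in place of the pretrained language model.

```python
# Sketch of the frozen-pretrained-transformer fine-tuning recipe: freeze the
# backbone except its layer norms, and train new input/output layers for a
# new modality. The randomly initialized backbone is a placeholder.
import torch
import torch.nn as nn

d_model, n_classes, patch_dim = 256, 10, 64

layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
backbone = nn.TransformerEncoder(layer, num_layers=4)   # pretend this is pretrained
embed_in = nn.Linear(patch_dim, d_model)                # new-modality input projection
head = nn.Linear(d_model, n_classes)                    # task output head

# Freeze everything in the backbone except the layer norms.
for name, p in backbone.named_parameters():
    p.requires_grad_("norm" in name)

trainable = [p for p in backbone.parameters() if p.requires_grad]
trainable += list(embed_in.parameters()) + list(head.parameters())
print(sum(p.numel() for p in trainable), "trainable /",
      sum(p.numel() for p in backbone.parameters()), "backbone params")

opt = torch.optim.Adam(trainable, lr=1e-3)
x = torch.randn(32, 16, patch_dim)                      # e.g. bit or image-patch tokens
y = torch.randint(0, n_classes, (32,))
logits = head(backbone(embed_in(x)).mean(dim=1))
loss = nn.functional.cross_entropy(logits, y)
loss.backward(); opt.step()
```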
sites.google.com/corp/berkeley.… Excited to share work led by amazing Kevin Lu in collaboration with @adityagrover_ and @pabbeel! What’s holding us back from agents that learn in reset-free, lifelong settings?
Who to follow
- Igor Mordatch (@IMordatch)
- Abhishek Gupta (@abhishekunique7)
- Dhruv Shah (@shahdhruv_)
- Hao Liu (@haoliuhl)
- Jason Weston (@jaseweston)
- Stefano Ermon (@StefanoErmon)
- Clémentine Dominé 🍊 (@ClementineDomi6)
- Kimin (@kimin_le2)
- Zhuohan Li (@zhuohan123)
- Lerrel Pinto (@LerrelPinto)
- Yu Bai (@yubai01)
- Siddharth Karamcheti (@siddkaramcheti)
- Archit Sharma (@archit_sharma97)
- Denis Yarats (@denisyarats)
- Kuan Fang (@KuanFang)