Yingru Li
@RichardYRLiAI, RL, LLMs, Data Science | PhD @cuhksz. | ex @MSFTResearch @TencentGlobal | On Job Market
Discover theoretical advancements and applications in #GenAI Reasoning & Agents at #INFORMS2024! 🚀 Sessions on: #LLM Agents, RL, Exploration, Alignment, Gene-Editing, Math Reasoning & more! Oct 20 & 21, Summit-342, Seattle Convention Center. @Zanette_ai @RichardYRLi…
Heading to INFORMS for the first time! On Monday, I will be giving a tutorial in the Applied Probability Society distinguished lecture session on our line of work on the statistical complexity of RL and Decision-Estimation Coefficient. Come say hi if you are around!
I'll be at #INFORMS2024 next week, giving a talk on Sunday and will be around Seattle for a few days afterwards! Hit me up if you'd like to chat :).
I will be giving a talk at #INFORMS2024 (Seattle) about CRISPR-GPT, a semi-automatic LLM multi-agent framework that speeds up CRISPR gene-editing experimental designs. I am extremely passionate about the future of scientific research agents! Check out our v1 preprint at…
Our team's LLM ensemble method ranked first on AlpacaEval 2.0!🚀 joint work with @WenzheLiTHU, Yong Lin, @xiamengzhou More details will be released soon.
1/ 🚀 Excited to introduce another amazing open-source project: Open O1, a powerful alternative to proprietary models like OpenAI's O1! 🤖✨ Our mission is to empower everyone with advanced AI capabilities. Stay tuned for more! Homepage: Open-Source O1…
Excited to share our paper "Why Transformers Need Adam: A Hessian Perspective", accepted at @NeurIPSConf. Intriguing question: Adam significantly outperforms SGD on Transformers, including LLM training (Fig 1). Why? Our explanation: 1) Transformer's block-Hessians are…
🚀 Excited to share our paper "Why Transformers Need Adam: A Hessian Perspective", which is accepted at #NeurIPS2024! We delve into why SGD lags behind Adam, uncovering the "block heterogeneity" in Transformers' Hessian spectra that Adam navigates more adeptly. 🧠💡 Our…
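The paper itself isn't excerpted in these posts; as a rough illustration of the block-heterogeneity diagnostic they describe, here is a minimal PyTorch sketch (not the authors' code, all names hypothetical) that estimates the top Hessian eigenvalue of each parameter block via power iteration on Hessian-vector products, so the spectra of different blocks can be compared:

```python
import torch

def block_top_eigenvalue(loss, param, n_iter=20):
    """Top eigenvalue of the Hessian block of `param`, via power
    iteration on Hessian-vector products (double backprop)."""
    grad = torch.autograd.grad(loss, param, create_graph=True)[0]
    v = torch.randn_like(param)
    v = v / v.norm()
    eig = 0.0
    for _ in range(n_iter):
        hv = torch.autograd.grad(grad, param, grad_outputs=v, retain_graph=True)[0]
        eig = torch.dot(hv.flatten(), v.flatten()).item()  # Rayleigh quotient
        v = hv / (hv.norm() + 1e-12)
    return eig

# Hypothetical usage: compare top eigenvalues across a model's blocks.
# for name, p in model.named_parameters():
#     print(name, block_top_eigenvalue(loss, p))
```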
I gave my first guest lecture today in a grad course on LLMs as a (soon-to-be) adjunct prof at McGill. Putting the slides here, maybe useful to some folks ;) drive.google.com/file/d/1komQ7s…
Visionary
Great keynote by David Silver, arguing that we need to re-focus on RL to get out of the LLM Valley @RL_Conference
Super excited to finally share what I have been working on at OpenAI! o1 is a model that thinks before giving the final answer. In my own words, here are the biggest updates to the field of AI (see the blog post for more details): 1. Don’t do chain of thought purely via…
We're releasing a preview of OpenAI o1—a new series of AI models designed to spend more time thinking before they respond. These models can reason through complex tasks and solve harder problems than previous models in science, coding, and math. openai.com/index/introduc…
Diversity is beneficial for test-time compute😃 Llama-3-8B SFT-tuned models can easily achieve 1) ~90% accuracy on the math reasoning task GSM8K and 2) an ~80% pass rate on code generation, without domain-specific data! 🔥
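As a hedged sketch of why sample diversity pays off at test time: draw N diverse samples at temperature > 0, then aggregate. Self-consistency voting suits GSM8K-style tasks and pass@N suits code; `generate_answer` and `passes_tests` below are hypothetical stubs, not anything from the post:

```python
from collections import Counter

def majority_vote(question, generate_answer, n=32, temperature=0.8):
    """Self-consistency: most frequent final answer among n diverse samples."""
    answers = [generate_answer(question, temperature=temperature) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]

def pass_at_n(question, generate_answer, passes_tests, n=32, temperature=0.8):
    """pass@n for code: success if any of the n samples passes the tests."""
    return any(passes_tests(generate_answer(question, temperature=temperature))
               for _ in range(n))
```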
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search > New SOTA: 63.5% on miniF2F (high school) & 25.3% on ProofNet (undergrad) > Introduces RMaxTS: Novel MCTS for diverse proof generation > Features RLPAF:…
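The post names RMaxTS but doesn't spell out the algorithm; as a hedged sketch of the general idea only (UCT selection with an RMax-style optimistic value for unvisited children, which pushes search toward novel proof paths), assuming nodes with hypothetical `visits` and `value` fields:

```python
import math

def uct_select(children, r_max=1.0, c=1.4):
    """Pick the child maximizing mean value + exploration bonus;
    unvisited children get the optimistic value r_max (RMax-style)."""
    total = sum(ch.visits for ch in children) + 1
    def score(ch):
        if ch.visits == 0:
            return r_max + c * math.sqrt(math.log(total))  # optimism drives novelty
        return ch.value / ch.visits + c * math.sqrt(math.log(total) / ch.visits)
    return max(children, key=score)
```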
RLHF training involves both generation (of long sequences) and inference (calculating log-probability). 🧠📝 The straightforward implementation is slow. 🐢 ReaL provides an efficient implementation of RL algorithms like ReMax and PPO. 🚀🔥 github.com/openpsi-projec… This…
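For the "inference" half the post mentions — scoring a generated sequence's tokens under a model, as PPO/ReMax need for policy ratios — a minimal sketch using the standard Hugging Face causal-LM API (ReaL's optimized version lives in the linked repo; this is not its code):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def sequence_logprobs(model, input_ids):
    """Per-token log-prob of each token given its prefix.
    input_ids: [batch, seq]; returns [batch, seq-1]."""
    logits = model(input_ids).logits                # [batch, seq, vocab]
    logp = F.log_softmax(logits[:, :-1], dim=-1)    # position t predicts token t+1
    return logp.gather(-1, input_ids[:, 1:].unsqueeze(-1)).squeeze(-1)
```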
I couldn't be prouder of my colleagues at the @UAlberta! The work led by @s_dohare, in collaboration w/ J. F. H.-Garcia, @LanceLan3, @rahman_parash, @rupammahmood, & @RichardSSutton on continual learning and loss of plasticity is now published at @Nature! nature.com/articles/s4158…
The 1st RL conference concluded a week ago, but I think it's still worth sharing a powerful insight from RL pioneer David Silver: While RL may not be as hyped in the LLM era, it could still be the key to achieving superhuman intelligence (Figure 1). Silver emphasized that while…
LLaMA-Factory now incorporates Adam-mini! 27% total memory reduction when fine-tuning Qwen2-1.5B. To run it immediately: github.com/hiyouga/LLaMA-…
🚀We've integrated the Adam-mini optimizer into LLaMA-Factory, slashing the memory footprint of full-finetuning Qwen2-1.5B from 33GB to 24GB! 🔥 Image source: github.com/zyushun/Adam-m…
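As a hedged sketch of where the memory saving comes from (per the Adam-mini repo: one second-moment scalar per parameter block instead of Adam's per-coordinate v), with bias correction omitted for brevity — see github.com/zyushun/Adam-m… for the real optimizer:

```python
import torch

@torch.no_grad()
def adam_mini_block_step(p, grad, m, v_scalar, lr=1e-4, b1=0.9, b2=0.999, eps=1e-8):
    """One update for a single parameter block. `m` is per-coordinate
    (as in Adam); `v_scalar` is a single scalar for the whole block,
    which is where the optimizer-state memory saving comes from."""
    m.mul_(b1).add_(grad, alpha=1 - b1)                       # first moment
    v_scalar = b2 * v_scalar + (1 - b2) * grad.pow(2).mean()  # blockwise second moment
    p.add_(m / (v_scalar.sqrt() + eps), alpha=-lr)
    return v_scalar
```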
I'm a scientist working on creating better reward models for agents, and I disagree with the main point of this post. Not only is RL with a reward you can't totally trust still RL; I would argue it is the RL we should be doing research on. Yes, without any doubt RL maximally shines…
# RLHF is just barely RL Reinforcement Learning from Human Feedback (RLHF) is the third (and last) major stage of training an LLM, after pretraining and supervised finetuning (SFT). My rant on RLHF is that it is just barely RL, in a way that I think is not too widely…
We’re extremely excited to announce the NeurIPS Workshop on Bayesian Decision-making and Uncertainty: from probabilistic and spatiotemporal modeling to sequential experiment design! This will take place at NeurIPS 2024, in Vancouver, BC, Canada, either on December 14th or 15th.
At #ISMP2024, today at 2pm in 510C, I will talk about the adaptive Barzilai-Borwein (BB) method. It is a line-search-free, parameter-free gradient method and a very simple modification of the BB method. We prove an O(1/k) convergence rate for general convex functions.
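The adaptive variant from the talk isn't detailed in the post; for reference, a minimal NumPy sketch of the classic BB gradient method it modifies, using the BB1 step size built from consecutive iterate and gradient differences:

```python
import numpy as np

def bb_gradient_descent(grad_f, x0, n_iter=100, alpha0=1e-3):
    """Gradient descent with the Barzilai-Borwein (BB1) step size:
    alpha_k = <s, s> / <s, y>, where s = x_k - x_{k-1}, y = g_k - g_{k-1}."""
    x_prev, g_prev = x0, grad_f(x0)
    x = x_prev - alpha0 * g_prev                 # one plain step to initialize
    for _ in range(n_iter):
        g = grad_f(x)
        s, y = x - x_prev, g - g_prev
        alpha = s.dot(s) / (s.dot(y) + 1e-12)    # BB1 step, no line search
        x_prev, g_prev = x, g
        x = x - alpha * g
    return x

# Hypothetical usage: minimize f(x) = 0.5 * x @ A @ x for a PSD matrix A.
# x_star = bb_gradient_descent(lambda x: A @ x, np.ones(10))
```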