
Yingru Li

@RichardYRLi

AI, RL, LLMs, Data Science | PhD @cuhksz. | ex @MSFTResearch @TencentGlobal | On Job Market


Pinned

Discover theoretical advancements and applications in #GenAI Reasoning & Agents at #INFORMS2024! 🚀 Sessions on: #LLM Agents, RL, Exploration, Alignment, Gene-Editing, Math Reasoning & more! Oct 20 & 21, Summit-342, Seattle Convention Center. @Zanette_ai @RichardYRLi

Tweet Image 1

Yingru Li Reposted

Heading to INFORMS for the first time! On Monday, I will be giving a tutorial in the Applied Probability Society distinguished lecture session on our line of work on the statistical complexity of RL and Decision-Estimation Coefficient. Come say hi if you are around!

Tweet Image 1
Tweet Image 2

Yingru Li Reposted

I'll be at #INFORMS2024 next week, giving a talk on Sunday and will be around Seattle for a few days afterwards! Hit me up if you'd like to chat :).



Yingru Li Reposted

I will be giving a talk at #INFORMS2024 (Seattle) about CRISPR-GPT, a semi-automatic LLM multi-agent framework that speeds up CRISPR gene-editing experimental designs. I am extremely passionate about the future of scientific research agents! Check out our v1 preprint at…



Yingru Li Reposted

Our team's LLM ensemble method ranked first on AlpacaEval 2.0! 🚀 Joint work with @WenzheLiTHU, Yong Lin, and @xiamengzhou. More details will be released soon.

Tweet Image 1
Tweet Image 2

Yingru Li Reposted

1/ 🚀 Excited to introduce another amazing open-source project: Open O1, a powerful alternative to proprietary models like OpenAI's O1! 🤖✨ Our mission is to empower everyone with advanced AI capabilities. Stay tuned for more! Homepage: Open-Source O1…

Tweet Image 1

Yingru Li Reposted

Excited to share our paper "Why Transformers Need Adam: A Hessian Perspective", accepted at @NeurIPSConf. Intriguing question: Adam significantly outperforms SGD on Transformers, including LLM training (Fig 1). Why? Our explanation: 1) Transformer’s block-Hessians are…

Tweet Image 1
Tweet Image 2

Yingru Li Reposted

🚀 Excited to share our paper "Why Transformers Need Adam: A Hessian Perspective", which is accepted at #NeurIPS2024! We delve into why SGD lags behind Adam, uncovering the "block heterogeneity" in Transformers' Hessian spectra that Adam navigates more adeptly. 🧠💡 Our…

Tweet Image 1
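The "block heterogeneity" claim can be illustrated on a toy diagonal quadratic whose two coordinates stand in for two Transformer blocks with very different curvature. A minimal sketch: the curvatures, learning rates, and step counts are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

# Toy stand-in for "block heterogeneity": a diagonal quadratic whose two
# coordinates play the role of two Transformer blocks with very different
# curvature.  All constants here are illustrative, not the paper's setup.
H = np.array([100.0, 0.01])          # per-"block" curvature
grad = lambda x: H * x               # gradient of 0.5 * x^T diag(H) x

x_sgd = np.array([1.0, 1.0])
x_adam = np.array([1.0, 1.0])
m, v = np.zeros(2), np.zeros(2)
beta1, beta2, eps, lr = 0.9, 0.999, 1e-8, 0.005

for t in range(1, 3001):
    # SGD: one global step size must satisfy lr < 2/100 for the stiff
    # block, which leaves the flat block (curvature 0.01) nearly frozen.
    x_sgd = x_sgd - 0.015 * grad(x_sgd)
    # Adam: per-coordinate normalization adapts to each block's scale.
    g = grad(x_adam)
    m = beta1 * m + (1 - beta1) * g
    v = beta2 * v + (1 - beta2) * g**2
    x_adam = x_adam - lr * (m / (1 - beta1**t)) / (np.sqrt(v / (1 - beta2**t)) + eps)

print("SGD final iterate: ", x_sgd)   # flat coordinate barely moves
print("Adam final iterate:", x_adam)  # both coordinates near the optimum
```

Because Adam's update is invariant to the gradient's scale, both coordinates make progress at the same rate, while SGD's single step size is held hostage by the stiffest block.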

Yingru Li Reposted

I gave my first guest lecture today in a grad course on LLMs as a (soon-to-be) adjunct prof at McGill. Putting the slides here, maybe useful to some folks ;) drive.google.com/file/d/1komQ7s…

Tweet Image 1

Visionary

Great keynote by David Silver, arguing that we need to re-focus on RL to get out of the LLM Valley @RL_Conference

Tweet Image 1
Tweet Image 2


Yingru Li Reposted

Super excited to finally share what I have been working on at OpenAI! o1 is a model that thinks before giving the final answer. In my own words, here are the biggest updates to the field of AI (see the blog post for more details): 1. Don’t do chain of thought purely via…

We're releasing a preview of OpenAI o1—a new series of AI models designed to spend more time thinking before they respond. These models can reason through complex tasks and solve harder problems than previous models in science, coding, and math. openai.com/index/introduc…



Yingru Li Reposted

Diversity is beneficial for test-time compute 😃 Llama-3-8B SFT-tuned models can easily achieve 1) ~90% accuracy on the math reasoning task GSM8K and 2) ~80% pass rate on a code generation task, without domain-specific data! 🔥

Tweet Image 1
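For context, code-generation pass rates of this kind are usually reported with the unbiased pass@k estimator popularized by the HumanEval benchmark; a minimal sketch, where n (samples per problem), c (correct samples), and k are hypothetical inputs:

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples drawn without replacement from n generations is correct,
    given that c of the n generations are correct."""
    if n - c < k:
        return 1.0  # fewer than k incorrect samples exist, so success is certain
    return float(1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

# e.g. 10 samples per problem, 3 of them correct:
print(pass_at_k(10, 3, 1))  # pass@1 equals c/n = 0.3
print(pass_at_k(10, 3, 5))
```

Sampling a diverse pool of generations raises c across problems, which is exactly what drives pass@k up as k grows.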

Yingru Li Reposted

DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search > New SOTA: 63.5% on miniF2F (high school) & 25.3% on ProofNet (undergrad) > Introduces RMaxTS: Novel MCTS for diverse proof generation > Features RLPAF:…

Tweet Image 1

Yingru Li Reposted

RLHF training involves both generation (of long sequences) and inference (calculating log-probabilities). 🧠📝 The straightforward implementation is slow. 🐢 ReaL provides an efficient implementation of RL algorithms like ReMax and PPO. 🚀🔥 github.com/openpsi-projec… This…

Tweet Image 1
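The "inference (calculating log-probabilities)" step can be sketched independently of ReaL: given the logits a model assigns to an already-generated sequence, gather and sum the per-token log-probabilities. A minimal numpy sketch with toy shapes, not ReaL's actual API:

```python
import numpy as np

def sequence_logprob(logits: np.ndarray, tokens: np.ndarray) -> float:
    """Sum of log-probabilities of an already-generated token sequence.

    logits: (T, V) unnormalized scores; row t scores the t-th position.
    tokens: (T,) integer ids of the tokens actually generated.
    This is the "inference" computation RLHF needs (e.g. for PPO's
    importance ratio or ReMax's baseline), as opposed to generation.
    """
    z = logits - logits.max(axis=-1, keepdims=True)          # stable log-softmax
    logp = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return float(logp[np.arange(len(tokens)), tokens].sum())

rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 8))   # toy: 4 positions, vocabulary of 8
tokens = np.array([1, 5, 0, 3])
print(sequence_logprob(logits, tokens))
```

Generation is autoregressive and sequential, while this scoring pass is a single parallel forward over the whole sequence, which is why systems like ReaL schedule the two workloads differently.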

Yingru Li Reposted

I couldn't be prouder of my colleagues at the @UAlberta! The work led by @s_dohare, in collaboration w/ J. F. H.-Garcia, @LanceLan3, @rahman_parash, @rupammahmood, & @RichardSSutton on continual learning and loss of plasticity is now published at @Nature! nature.com/articles/s4158…


Yingru Li Reposted

The 1st RL conference concluded a week ago, but I think it's still worth sharing a powerful insight from RL pioneer David Silver: While RL may not be as hyped in the LLM era, it could still be the key to achieving superhuman intelligence (Figure 1). Silver emphasized that while…

Tweet Image 1
Tweet Image 2
Tweet Image 3
Tweet Image 4

Yingru Li Reposted

Now LLaMA-Factory incorporates Adam-mini! 27% total memory reduction when fine-tuning Qwen2-1.5B. To run it immediately: github.com/hiyouga/LLaMA-…

🚀We've integrated the Adam-mini optimizer into LLaMA-Factory, slashing the memory footprint of full-finetuning Qwen2-1.5B from 33GB to 24GB! 🔥 Image source: github.com/zyushun/Adam-m…

Tweet Image 1
Tweet Image 2
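Adam-mini's saving comes from keeping a single second-moment scalar per parameter block instead of one per parameter. A toy per-block update in that spirit; the block partition and hyperparameters below are illustrative assumptions, not the library's implementation:

```python
import numpy as np

def adam_mini_step(params, grads, m, v_block, blocks, t,
                   lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One toy Adam-mini-style step: per-parameter momentum `m`, but only
    ONE second-moment scalar per block in `v_block`, driven by the block's
    mean squared gradient.  The v-state shrinks from len(params) floats
    to len(blocks) floats."""
    for b, idx in enumerate(blocks):
        g = grads[idx]
        m[idx] = beta1 * m[idx] + (1 - beta1) * g
        v_block[b] = beta2 * v_block[b] + (1 - beta2) * np.mean(g**2)
        mhat = m[idx] / (1 - beta1**t)
        vhat = v_block[b] / (1 - beta2**t)
        params[idx] -= lr * mhat / (np.sqrt(vhat) + eps)
    return params, m, v_block

params = np.ones(6)
grads = np.full(6, 0.1)
blocks = [np.arange(0, 3), np.arange(3, 6)]   # two toy parameter blocks
m = np.zeros(6)
v_block = np.zeros(len(blocks))               # one scalar per block
params, m, v_block = adam_mini_step(params, grads, m, v_block, blocks, t=1)
print(params)
```

In real training the m-state, gradients, and weights still dominate, which is why the end-to-end saving is a fraction of total memory rather than half.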


Yingru Li Reposted

I'm a scientist working on creating better reward models for agents, and I disagree with the main point of this post. Not only is RL with a reward you can't totally trust still RL, but I would argue it is the RL we should do research on. Yes, without any doubt RL maximally shines…

# RLHF is just barely RL Reinforcement Learning from Human Feedback (RLHF) is the third (and last) major stage of training an LLM, after pretraining and supervised finetuning (SFT). My rant on RLHF is that it is just barely RL, in a way that I think is not too widely…

Tweet Image 1


Yingru Li Reposted

We’re extremely excited to announce the NeurIPS Workshop on Bayesian Decision-making and Uncertainty: from probabilistic and spatiotemporal modeling to sequential experiment design! This will take place at NeurIPS 2024, in Vancouver, BC, Canada, either on December 14th or 15th.

Tweet Image 1

Yingru Li Reposted

At #ISMP2024, today at 2pm in 510C, I will talk about the adaptive Barzilai-Borwein method. It is a line-search-free, parameter-free gradient method: a very simple modification to the BB method. We prove an O(1/k) convergence rate for general convex functions.

Tweet Image 1
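For reference, the classical Barzilai-Borwein step size that the talk's adaptive variant builds on is alpha_k = (s^T s) / (s^T y) with s = x_k - x_{k-1} and y = g_k - g_{k-1}. A minimal sketch of the classical method only; the adaptive modification from the talk is not reproduced here:

```python
import numpy as np

def bb_gradient_descent(grad, x0, n_iter=100):
    """Gradient descent with the classical Barzilai-Borwein step size
    alpha = <s, s> / <s, y>, where s = x_k - x_{k-1} and y = g_k - g_{k-1}.
    No line search and no tuned step size after the first iteration."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    alpha = 1e-3                  # small initial step before BB kicks in
    for _ in range(n_iter):
        x_new = x - alpha * g
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s @ y
        if sy > 0:                # BB step is only well-defined when s^T y > 0
            alpha = (s @ s) / sy
        x, g = x_new, g_new
    return x

# toy test: minimize 0.5 * x^T diag(1, 10) x, whose minimizer is the origin
x_star = bb_gradient_descent(lambda x: np.array([1.0, 10.0]) * x, x0=[1.0, 1.0])
print(x_star)
```

On a convex quadratic, s^T y = s^T H s is positive whenever s is nonzero, so the safeguard almost never triggers; it matters for general convex functions, where curvature along s can vanish.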
