Hao Zhang

@haozhangml

Asst. Prof. @HDSIUCSD and @ucsd_cse running @haoailab. Cofounder and runs @lmsysorg. 20% with @SnowflakeDB

Similar Users

Zhuohan Li (@zhuohan123)
Tri Dao (@tri_dao)
SkyPilot (@skypilot_org)
Lianmin Zheng (@lm_zheng)
lmarena.ai (formerly lmsys.org) (@lmarena_ai)
Piotr Nawrot (@p_nawrot)
Woosuk Kwon (@woosuk_k)
Xinyun Chen (@xinyun_chen_)
Joey Gonzalez (@profjoeyg)
Zico Kolter (@zicokolter)
Eric Xing (@ericxing)
Ying Sheng (@ying11231)
Dacheng Li (@DachengLi177)
Tianqi Chen (@tqchenml)
Lei Li (@lileics)

Pinned

Check out our latest blog post discussing a better metric -- goodput (throughput subject to latency constraints) -- for LLM serving, and our new technique, prefill-decode disaggregation, which optimizes goodput and achieves lower cost per query and higher service quality at the same time!

Still optimizing throughput for LLM serving? Think again: goodput might be a better choice! Splitting prefill from decode onto different GPUs yields:
- up to 4.48x goodput
- up to 10.2x stricter latency criteria
Blog: hao-ai-lab.github.io/blogs/distserv… Paper: arxiv.org/abs/2401.09670
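The goodput idea above can be made concrete with a small sketch: count only the requests whose latency targets are met, then divide by wall-clock time. This is a minimal illustration, not DistServe's actual code; the field names and SLO values are assumptions for the example.

```python
# Hypothetical sketch of the "goodput" metric: throughput counted only
# over requests that meet their latency SLOs. Field names and SLO
# thresholds are illustrative, not taken from the DistServe paper.
from dataclasses import dataclass

@dataclass
class RequestStats:
    ttft: float  # time to first token (s), dominated by the prefill phase
    tpot: float  # time per output token (s), dominated by the decode phase

def goodput(stats, duration_s, ttft_slo=0.2, tpot_slo=0.05):
    """Requests per second that satisfy BOTH latency SLOs."""
    ok = sum(1 for r in stats if r.ttft <= ttft_slo and r.tpot <= tpot_slo)
    return ok / duration_s

# Example: 3 of 4 requests meet both SLOs over a 2-second window.
reqs = [RequestStats(0.15, 0.04), RequestStats(0.18, 0.03),
        RequestStats(0.30, 0.04), RequestStats(0.10, 0.02)]
print(goodput(reqs, duration_s=2.0))  # 1.5 req/s
```

Under this metric, a system that finishes many requests late scores worse than one that finishes fewer requests on time, which is the motivation for separating prefill and decode onto different GPUs.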



Hao Zhang Reposted

📈 more to come

Yeah, Gemini-exp-1114 is pretty good :)



Glad to see both projects we have been working on since last year get recognition. More to come ✊🤟

Thank you @sequoia for supporting the open source community! We believe openness is the way for the future infrastructure of AI, and humbled by our amazing users, contributors, and supporters! ❤️



I will be at this event in person tonight -- happy to chat about anything about AI/LLMs! 🙂

The last in-person vLLM meetup of the year is happening in two weeks, on November 13, at @SnowflakeDB HQ! Join the vLLM developers and engineers from Snowflake AI Research to chat about the latest LLM inference optimizations and your 2025 vLLM wishlist! lu.ma/h0qvrajz



Hao Zhang Reposted

It's official! We are now an incubation project @LFAIDataFdn We firmly believe in open governance and are committed to ensuring that vLLM remains a shared community project. ❤️

🚀 Introducing vLLM, LF AI & Data’s newest incubation project! vLLM is a high-throughput, memory-efficient engine for LLM inference & serving, solving the challenge of slow model serving. Read the full announcement ➡️ hubs.la/Q02V_5270 #opensource #oss



Speculative decoding is a really interesting problem; the community has published many papers in a short amount of time, but very few of them discuss how it can be made genuinely useful in a real serving system (they mostly assume batch size = 1). This is a nice step.…

This is the first time we formally introduce speculative decoding in vLLM. Actually, this feature has been there for a long time, since @cdnamz built the general framework months ago. It has been a long road and a big community effort to make it really work.
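The accept/verify loop at the heart of speculative decoding can be sketched in a few lines. This is a toy greedy-case illustration, independent of vLLM's actual implementation: a cheap draft model proposes k tokens, the target model verifies them in one pass, and the longest matching prefix plus one corrected (or bonus) token is kept.

```python
# Toy sketch of greedy speculative decoding's accept/verify step.
# Token IDs here are arbitrary integers; a real system would compare
# draft-model proposals against target-model outputs from one batched pass.
def speculative_step(draft_tokens, target_tokens):
    """draft_tokens: k tokens proposed by the draft model.
    target_tokens: k+1 tokens the target model would emit greedily.
    Returns the tokens actually accepted this step."""
    accepted = []
    for d, t in zip(draft_tokens, target_tokens):
        if d == t:
            accepted.append(d)   # draft guess matches the target: accept it
        else:
            accepted.append(t)   # mismatch: take the target's token and stop
            return accepted
    # All k drafts accepted: the verify pass also yields one bonus token.
    accepted.append(target_tokens[len(draft_tokens)])
    return accepted

print(speculative_step([5, 7, 9], [5, 7, 2, 4]))  # [5, 7, 2]
print(speculative_step([5, 7, 9], [5, 7, 9, 4]))  # [5, 7, 9, 4]
```

The serving-system difficulty the thread alludes to is that with batch size > 1, the extra verification work competes with other requests for GPU time, so acceptance rate alone no longer determines whether speculation helps.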



exciting new development!!

🔥New benchmark: Preference Proxy Evaluations (PPE) Can reward models guide RLHF? Can LLM judge replace real human evals? PPE addresses these questions! Highlights: - Real-world human preference from Chatbot Arena💬 - 16,000+ prompts and 32,000+ diverse model responses🗿 -…



Congrats @_parasj @ajayj_. This is huge for the community, and my students are all excited about what the model can enable!!!

Introducing Mochi 1 preview. A new SOTA in open-source video generation. Apache 2.0. magnet:?xt=urn:btih:441da1af7a16bcaa4f556964f8028d7113d21cbb&dn=weights&tr=udp://tracker.opentrackr.org:1337/announce



wow, congrats @angcharlesli @haojian_jin and @HDSIUCSD !!

Thrilled to share that our collaborative work with @haojian_jin got the Distinguished Paper Award at CCS'24. Congrats to all the co-authors! @eceumd paper: arxiv.org/pdf/2408.07728



whoo, UCSD is doing great!!

And big changes in the ranking of various AI labs in the Zeta Alpha top-100 most cited papers. Microsoft made Google dance, also in the most cited AI papers of 2023. Big moves up in the ranking for AI Research at @CarnegieMellon @MIT @hkust and @UCSanDiego



Hao Zhang Reposted

Join us on Wednesday, October 16th at 4pm PT to explore how we’re optimizing LLM serving under stricter latency requirements by maximizing goodput! 🙌 Link: hubs.la/Q02T9pcR0

Join us for our next #PyTorch Expert Exchange Webinar on Wednesday, October 16th at 4 PM PT ➡️ DistServe: disaggregating prefill and decoding for goodput-optimized LLM inference, with @haozhangml, Asst. Prof. at @HDSIUCSD & @ucsd_cse. Tune in at: hubs.la/Q02T9pcR0



I'll talk about DistServe (hao-ai-lab.github.io/blogs/distserv…) at the PyTorch webinar this Wed. Looking forward 😃. Thanks to @PyTorch and @AIatMeta for the support!

Join us for our next #PyTorch Expert Exchange Webinar on Wednesday, October 16th at 4 PM PT ➡️ DistServe: disaggregating prefill and decoding for goodput-optimized LLM inference, with @haozhangml, Asst. Prof. at @HDSIUCSD & @ucsd_cse. Tune in at: hubs.la/Q02T9pcR0



Hao Zhang Reposted

My @Google colleague and longtime @UCBerkeley faculty member David Patterson has a great essay out in this month's Communications of the ACM (@TheOfficialACM): 🎉 "Life Lessons from the First Half-Century of My Career," sharing 16 life lessons and nine magic words. I saw an…


Hao Zhang Reposted

BREAKING NEWS The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Chemistry with one half to David Baker “for computational protein design” and the other half jointly to Demis Hassabis and John M. Jumper “for protein structure prediction.”


Hao Zhang Reposted

📢📢 We are releasing TxT360: a globally deduplicated dataset for LLM pretraining 🌐 99 Common Crawls 📘 14 Curated Sources 👨‍🍳 recipe to easily adjust data weighting and train the most performant models Dataset: huggingface.co/datasets/LLM36… Blog: huggingface.co/spaces/LLM360/…


Hao Zhang Reposted

✨ Check out our revamped repo! Analysis360: Open Implementations of LLM Analyses 🔗 github.com/LLM360/Analysi… Featuring tutorials on: 💾 Data memorization 🧠 LLM unlearning ⚖️ AI safety, toxicity, & bias 🔍 Mechanistic interpretability 📊 Evaluation metrics


Hao Zhang Reposted

Former @SCSatCMU faculty member Geoffrey Hinton has been awarded the 2024 Nobel Prize in Physics! 👏 Hinton, now at @UofT, was recognized alongside John J. Hopfield of @Princeton for their work in machine learning with artificial neural networks. ➡️ cmu.is/Hinton-Nobel-P…


Hao Zhang Reposted

Geoff and John are a truly inspired choice for the Nobel Prize in Physics. Not only because they have done groundbreaking work for machine learning research, but also since this choice reflects an understanding that machine learning methods are changing how science is done (1/2)


Congrats @geoffreyhinton, who really laid the foundations for deep learning that enabled so many breakthroughs. I quite like @bschoelkopf's interpretation x.com/bschoelkopf/st… that AI/ML is gradually yet fundamentally changing *how science is done*, and this is truly…

BREAKING NEWS The Royal Swedish Academy of Sciences has decided to award the 2024 #NobelPrize in Physics to John J. Hopfield and Geoffrey E. Hinton “for foundational discoveries and inventions that enable machine learning with artificial neural networks.”



Hao Zhang Reposted

Join our online meetup on Oct. 16 for efficient LLM deployment and serving, co-hosted by SGLang, FlashInfer, and MLC LLM! 🥳 You are all welcome to join by filling out the Google form forms.gle/B3YeedLxmrrhL1… It will cover topics such as low CPU overhead scheduling, DeepSeek MLA…

