Zefan Cai @ EMNLP 2024 @Zefan_Cai Twitter Profile

Zefan Cai @ EMNLP 2024

@Zefan_Cai

Now Ph.D student @UWMadison Previous @PKU1898

Joined May 2023

43 Posts 173 Followers 394 Following

Zefan Cai @ EMNLP 2024 Reposted

Anthropic

@AnthropicAI

19 Nov

New Anthropic research: Adding Error Bars to Evals. AI model evaluations don’t usually include statistics or uncertainty. We think they should. Read the blog post here: anthropic.com/research/stati…

A statistical approach to model evaluations

Source: https://t.co/jwT73WsyFe

Zefan Cai @ EMNLP 2024 Reposted

Sean Yun-Shiuan Chuang @ EMNLP 2024

@SeanChuang4

12 Nov

Excited to present my paper on role-playing LLM agents at #EMNLP2024! 🎉 Paper Title: “Beyond Demographics: Aligning Role-playing LLM-based Agents Using Human Belief Networks” Come say hi and let's chat about some exciting research! 🤖🧠✨ TL;DR: How can we make LLM agents…

Zefan Cai @ EMNLP 2024 Reposted

Junjie Hu

@JunjieHu12

12 Nov

We are excited to share our latest work on benchmarking current LLM-based machine translation and traditional NMT on culture-specific concepts. Chat with us at 4-5:30pm on the poster session #EMNLP2024! Joint work w/ @Binnie8545 @SeleenaJiang @Diyi_Yang

Binwei Yao

@Binnie8545

12 Nov

🚀 Excited to present our paper "Benchmarking Machine Translation with Cultural Awareness" at #EMNLP2024! We build CAMT, a novel parallel corpus enriched with culture-specific item annotations, and evaluate how well NMT and LLM-MT systems handle cultural entities.

Zefan Cai @ EMNLP 2024

@Zefan_Cai

11 Nov

Arriving in #Miami for #EMNLP2024! Excited to see friends! Would like to chat about inference acceleration. Feel free to reach out!

Zefan Cai @ EMNLP 2024 Reposted

Qingxiu Dong

@qx_dong

11 Nov

About to arrive in #Miami 🌴 after a 30-hour flight for #EMNLP2024! Excited to see new and old friends :) I’d love to chat about data synthesis and deep reasoning for LLMs (or anything else) —feel free to reach out!

Zefan Cai @ EMNLP 2024 Reposted

Haoyi Qiu ✈️ NeurIPS24

@HaoyiQiu

4 Nov

🌐 Are LLM agents prepared to navigate the rich diversity of cultural and social norms? 🏠 CASA tests them on real-world tasks like online shopping and social discussion forums, revealing that current agents show less than 10% awareness and over 40% norm violations. 🧠 We’re…

Zefan Cai @ EMNLP 2024

@Zefan_Cai

8 Nov

Thanks for sharing! Our new work HeadKV intelligently compresses LLM memory by identifying and prioritizing crucial attention heads. Specifically, this is the first work to target global memory allocation across 32 heads in 32 layers inside the Llama-3 model.

Rohan Paul

@rohanpaul_ai

1 Nov

Not all brain cells are equal - same goes for LLM attention heads! 💡 Why store everything when you can just remember the important stuff? Smart KV cache compression that knows which attention heads matter most. Hence, HeadKV intelligently compresses LLM memory by identifying…

Zefan Cai @ EMNLP 2024 Reposted

Rob Tang

@XiangruTang

6 Nov

@OpenAI @junshernchan @ChowdhuryNeil Thank you for your work on MLE-bench. I wanted to bring to your attention our highly relevant work in 2023: "ML-BENCH: Evaluating Large Language Models and Agents for Machine Learning Tasks" (arxiv.org/abs/2311.09835). We'd appreciate…

OpenAI

@OpenAI

10 Oct

We’re releasing a new benchmark, MLE-bench, to measure how well AI agents perform at machine learning engineering. The benchmark consists of 75 machine learning engineering-related competitions sourced from Kaggle. openai.com/index/mle-benc…

Zefan Cai @ EMNLP 2024 Reposted

Shuhuai-Ren

@RenShuhuai

28 Oct

Progress in this field is truly rapid!

Wenhao Chai

@wenhaocha1

26 Oct

🔥 MovieChat recently received its 100th citation. Thank you all for your support! A year after its release, we’ve updated MovieChat in CVPR 2024, the first large multimodal model designed for long video understanding. Thanks to its training-free design, we’ve upgraded the…

Zefan Cai @ EMNLP 2024 Reposted

Xiang Yue ✈️ NeurIPS2024

@xiangyue96

22 Oct

🌍 I’ve always had a dream of making AI accessible to everyone, regardless of location or language. However, current open MLLMs often respond in English, even to non-English queries! 🚀 Introducing Pangea: A Fully Open Multilingual Multimodal LLM supporting 39 languages! 🌐✨…

Zefan Cai @ EMNLP 2024

@Zefan_Cai

21 Oct

Looking forward to seeing how Lex reacts to Anthropic

Lex Fridman

@lexfridman

20 Oct

I'm doing a podcast with Dario Amodei, CEO of Anthropic (creator of Claude) soon, all about AI. Let me know if you have questions/topic suggestions. Also, I'll stop by SF for a bit. Let me know if you have suggestions of who I should talk to.

Zefan Cai @ EMNLP 2024

@Zefan_Cai

20 Oct

Thanks for sharing!

Ben Hagag @EMNLP 🏝️

@BenHagag20

19 Oct

6/ **Large Language Models are not Fair Evaluators** Want to use LLMs as evaluators? There are many things to be aware of, one of them is positional bias! This paper not only shows that but also develops simple yet effective calibration mechanisms to align LLM judgments more…

Zefan Cai @ EMNLP 2024 Reposted

Qingxiu Dong

@qx_dong

12 Oct

(Perhaps a bit late) Excited to announce our survey on ICL has been accepted to #EMNLP2024 main conf and been cited 1,000+ times! Thanks to all collaborators and contributors to this field! We've updated the survey arxiv.org/abs/2301.00234. Excited to keep pushing boundaries!

Zefan Cai @ EMNLP 2024

@Zefan_Cai

11 Oct

Our previous work ML-Bench also evaluates how well agents perform ML development! Super excited that this high-quality dataset is released to help develop code agents! ARXIV: arxiv.org/abs/2311.09835 Code: github.com/gersteinlab/ML…

OpenAI

@OpenAI

10 Oct

GitHub - gersteinlab/ML-Bench: The Official Repo of ML-Bench: Evaluating Large Language Models and...

Source: https://t.co/nEdryhlZip

Zefan Cai @ EMNLP 2024 Reposted

Qingxiu Dong

@qx_dong

11 Oct

How can we guide LLMs to continually expand their own capabilities with limited annotation? SynPO: a self-boosting paradigm training LLM to auto-learn generative rewards and synthesize preference data. After 4 iterations, Llama3&Mistral achieve over 22.1% win rate improvements

Zefan Cai @ EMNLP 2024 Reposted

Heming Xia

@hemingkx

10 Oct

🤔How much potential do LLMs have for self-acceleration through layer sparsity? 🚀 🚨 Excited to share our latest work: SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration. Arxiv: arxiv.org/abs/2410.06916 🧵1/n

Zefan Cai @ EMNLP 2024 Reposted

Fei Liu

@feiliu_nlp

9 Oct

Need to address my earlier tweet: the #ACL2025 deadline has now been updated to February 15. Be sure to check out the updated CFP for all the details at 2025.aclweb.org/calls/main_con…. Thank you for your understanding as we navigate these changes! 📝✨

Zefan Cai @ EMNLP 2024

@Zefan_Cai

10 Oct

Do you still think VQ cannot do text reconstruction? DnD-Transformer can definitely change your mind! We empirically show that auto-regressive Transformers can generate images with rich text and graphical elements.

Liang Chen

@liangchen5518

9 Oct

✨A Spark of Vision-Language Intelligence! We introduce DnD-Transformer, a new auto-regressive image gen model that beats GPT/Llama w/o extra cost. AR gen beats diffusion in joint VL modeling in a self-supervised way! Github: github.com/chenllliang/Dn… Paper: huggingface.co/papers/2410.01…