Lechen Zhang @EMNLP
@leczhang MSc @UMSI @UMich | BEng @SJTU1896. Actively looking for 25Fall PhD opportunities. Interested in #NLProc & #AI.
[1/12] Optimizing prompts for specific tasks has been key to improving LLM performance, but what if we optimize prompts at the system level to work well on *all* tasks? Check out 🌱SPRIG, a genetic system prompt optimizer that helps unlock LLMs' full potential: arxiv.org/abs/2410.14826
Heard of the Alaska-Hawaii merger?🤔Wonder if LLMs know it’s pending government approval before it can happen? They stumble, but we’ve got a fix⚒️! Dive into my #EMNLP2024 work 𝐍𝐚𝐫𝐫𝐚𝐭𝐢𝐯𝐞-𝐨𝐟-𝐓𝐡𝐨𝐮𝐠𝐡𝐭—a special prompting technique to unlock LLMs’ temporal reasoning
At #EMNLP2024 we will present our paper on LLM values and opinions! We introduce tropes: repeated and consistent phrases which LLMs generate to argue for political stances. Read the paper to learn more! arxiv.org/abs/2406.19238 Work done @CopeNLU & @AiCentreDK
Ever wonder about the factuality of Language Models in the wild 🌍? Check out our new benchmark, 𝐅𝐚𝐜𝐭𝐁𝐞𝐧𝐜𝐡! I am proud to have contributed to the entire process of building 𝐅𝐚𝐜𝐭𝐁𝐞𝐧𝐜𝐡 and its supporting evaluation pipeline, 𝕍𝔼ℝ𝕀𝔽𝕐.
🌍 How Verifiable Are LM Responses in the Wild? A Three-Way Factuality Benchmark Meet 𝐅𝐚𝐜𝐭𝐁𝐞𝐧𝐜𝐡 – an updatable benchmark for evaluating language models' factuality in real-world scenarios. 🔗 huggingface.co/spaces/launch/… @launchnlp @michigan_AI @UMichCSE
🎙️ What if the way we prompt LLMs might actually hold them back? 🚨 Assigning personas like "helpful assistant" in system prompts might *not* be as helpful as we think! ✨ Check out our work accepted to Findings of @emnlpmeeting ✨ 📜 arxiv.org/abs/2311.10054 🧵 [1/7]
🌿I am on the faculty job market this year!🌿 I work on reliable natural language processing, including: ✅ Factuality 💪 Robustness 🌿 Sustainability Feel free to reach out and DM! I will also be at #EMNLP2024 and #NeurIPS2024 and would love to chat in person!
👩🏼💻 Real or Robotic? 🤖 Can LLMs accurately simulate qualities of human responses in dialogue? Human conversations with LLMs are great for assessing the capabilities of LLMs. But having lots of folks chat with LLMs is challenging (💰⏳🕵️). Could we have another LLM *simulate*…
🤔What does it mean for a model to have a value? To answer, we first ask, are large language models 🤖 consistent over value-laden questions? 🧵
Such a great opportunity to connect during the Michigan AI #NAACL2024 meetup! 🌟 Looking forward to the next one! 😀