Rogerio Chaves
@_rchaves_🍷 LLM sommelier 🧙 Open Sourcerer 📊 DSPy Visualization 🚀 Building LangWatch - https://t.co/ez7kW1C6z9 🇳🇱🇧🇷
leetcode never made any sense, if you want to hire good devs, chill pair programming is the best technical interview there is, trial working with the person, in their comfortable editor, ai and all, debate and decide things together, because well, that's what it will be irl
one of the biggest unexpected shifts with AI B2B SaaS vs traditional SaaS: companies want to run on prem. things we are automating are just too sensitive, not only for enterprise, SMBs too. choose portability over scalability, you'll thank me later
Google figured out something big, and it ain’t stopping
watch llms getting surprised by their own naming conventions
Seems like the other network is getting more and more traction, follow me there too, let’s bring more LLM and DSPy talk there, the place can use some bsky.app/profile/rchave…
anyone trying Windsurf editor? Is the tab completion better than Cursor?
Forget about the wall, this is the next phase of AI we are entering: getting the accuracy on things that matter impressively close to 100%
We just hit a milestone in document processing — 91 page scanned PDF w/ nested tables, handwriting, and complex layout. 10,400 data points extracted with 100% accuracy.
Ouch, I hadn't thought about it from that perspective, but she is totally right: AI should help us be better and more productive, not fake stupid stuff. Do not make it easier for people to keep up the bullshit, normalize being real instead
Apple Intelligence commercials make it clear that Apple’s latest software is for idiots — and using some of the new tools feels pretty stupid, too. @BridgetCarey points out the problems with early Apple Intelligence features, why you should feel good about turning it off.
that's why you should evaluate on your own data, don't go picking models based on somebody else's benchmark (but if you do, choose sonnet 3.5)
the fun thing about designing unconventional benchmarks is that you can instantly see which models were desperately overfit to LMSYS to please managers vs which were focused on raw intelligence (hint, the new 3.5 sonnet)
forget about QA, the best way to find bugs is giving live demos
LLM-as-a-judge is super hard y'all, especially when it's smarter than you. it classified "Melanzane alla parmigiana" as not vegetarian, so I went to debug it, what is this hallucinating machine talking about? but it was true, TIL
I've heard more and more people at dev meetups say they are using gemini 1.5, I guess google finally coming up with easy to use API keys without all the google cloud shenanigans really made a difference
tomorrow I'll present a little hack I glued together using whisper, ffmpeg, and LLMs to automate my video editing at AI Tinkerers, if you are in Amsterdam, come check it out amsterdam.aitinkerers.org/p/ai-tinkerers…
just tested 4 different AI meeting note taking apps right now, and @getshadowai was by far the best, best user experience by not trying to do much "magic", transcription runs locally on my device, the summary is on par with more expensive competitors, it's simple and awesome
I know it seems like the field moves very fast, but there is still a long way to go until good practices become mature in the AI industry, so take your time, focus on doing the right thing, and you will be quite ahead already
I've been AI consulting for ~2 years. Client: "The AI isn't working in XYZ scenario." Me: "Can we look at a trace together?" ~70%: no traces, no logging. ~20%: log traces, but never look at them. ~10%: actively looking at data. Unbelievable alpha in looking at data.
Was fascinating to see how MIPRO prompt optimization fared for this pipeline, across six LMs. As much as a 41% increase in quality and a 68% decrease in leakage, straight out of the box. Not bad.
We use DSPy to optimize the prompts for drafting the private prompt and synthesizing the personalized output. After prompt optimization, Llama-3.1-8B performs quite well at using the untrusted 4o-mini as a tool. Retains quality for 85% of the queries and privacy 93% of the time!
This meme was a small moment of enlightenment for me a long long while back