Rogerio Chaves @_rchaves_ Twitter Profile

Rogerio Chaves

@_rchaves_

🍷 LLM sommelier 🧙 Open Sourcerer 📊 DSPy Visualization 🚀 Building LangWatch - https://t.co/ez7kW1C6z9 🇳🇱🇧🇷

10K Posts, 992 Followers, 494 Following

Rogerio Chaves

@_rchaves_

13 h

leetcode never made any sense. if you want to hire good devs, chill pair programming is the best technical interview there is: trial working with the person, in their comfortable editor, ai and all, debating and deciding things together, because well, that's what it will be irl

Rogerio Chaves

@_rchaves_

23 h

one of the biggest unexpected shifts with AI B2B SaaS vs traditional SaaS: companies want to run on prem. things we are automating are just too sensitive, not only for enterprise, SMBs too. choose portability over scalability, you'll thank me later

Rogerio Chaves

@_rchaves_

21 Nov

Google figured out something big, and it ain’t stopping

lmarena.ai (formerly lmsys.org)

@lmarena_ai

21 Nov

Woah, huge news again from Chatbot Arena🔥 @GoogleDeepMind’s just released Gemini (Exp 1121) is back stronger (+20 points), tied #1🏅Overall with the latest GPT-4o-1120 in Arena!
Ranking gains since Gemini-Exp-1114:
- Overall #3 → #1
- Overall (StyleCtrl): #5 → #2
- Hard…

Rogerio Chaves

@_rchaves_

21 Nov

watch llms getting surprised by their own naming conventions

Rogerio Chaves

@_rchaves_

21 Nov

not themselves...

Rogerio Chaves

@_rchaves_

21 Nov

Seems like the other network is getting more and more traction, follow me there too, let’s bring more LLM and DSPy talk there, the place can use some bsky.app/profile/rchave…

Omar Khattab

@lateinteraction

20 Nov

I'm late to the other interaction as well. Same @!

Rogerio Chaves

@_rchaves_

19 Nov

anyone trying Windsurf editor? Is the tab completion better than Cursor?

Rogerio Chaves

@_rchaves_

18 Nov

word

Rogerio Chaves

@_rchaves_

17 Nov

Forget about the wall, this is the next phase of AI we are entering: getting the accuracy on things that matter impressively close to 100%

Kushal Byatnal

@kushalbyatnal

15 Nov

We just hit a milestone in document processing — 91 page scanned PDF w/ nested tables, handwriting, and complex layout. 10,400 data points extracted with 100% accuracy.

Rogerio Chaves

@_rchaves_

17 Nov

Ouch, I hadn’t thought about it from that perspective, but she is totally right: AI should help us be better and more productive, not fake stupid stuff. Do not make it easier for people to keep the bullshit, normalize being real instead

CNET

@CNET

15 Nov

Apple Intelligence commercials make it clear that Apple’s latest software is for idiots — and using some of the new tools feels pretty stupid, too. @BridgetCarey points out the problems with early Apple Intelligence features, why you should feel good about turning it off.

Rogerio Chaves

@_rchaves_

17 Nov

that's why you should evaluate on your own data, don't go picking models based on somebody else's benchmark (but if you do, choose sonnet 3.5)

James Campbell

@jam3scampbell

17 Nov

the fun thing about designing unconventional benchmarks is that you can instantly see which models were desperately overfit to LMSYS to please managers vs which were focused on raw intelligence (hint, the new 3.5 sonnet)

Rogerio Chaves

@_rchaves_

15 Nov

forget about QA, the best way to find bugs is giving live demos

Rogerio Chaves

@_rchaves_

13 Nov

LLM-as-a-judge is super hard y'all, especially when it's smarter than you. it classified "Melanzane alla parmigiana" as not vegetarian, so I went to debug it, what is this hallucinating machine talking about? but it was true, TIL

Rogerio Chaves

@_rchaves_

13 Nov

I've heard more and more people at dev meetups say they are using gemini 1.5, I guess google finally coming up with easy to use API keys without all the google cloud shenanigans really made a difference

Rogerio Chaves

@_rchaves_

11 Nov

tomorrow at AI Tinkerers I'll present a little hack I glued together using whisper, ffmpeg, and LLMs to automate my video editing, if you are in Amsterdam, come check it out amsterdam.aitinkerers.org/p/ai-tinkerers…

Rogerio Chaves

@_rchaves_

1 Nov

just tested 4 different AI meeting note taking apps right now, and @getshadowai was by far the best, best user experience by not trying to do much "magic", transcription runs locally on my device, the summary is on par with more expensive competitors, it's simple and awesome

Rogerio Chaves

@_rchaves_

30 Oct

I know it seems like the field moves very fast, but there is still a long way to go until good practices mature in the AI industry, so take your time, focus on doing the right thing, and you will be quite ahead already

Hamel Husain

@HamelHusain

30 Oct

I've been AI consulting for ~ 2 years.
Client: "The AI isn't working in XYZ scenario"
Me: "Can we look at a trace together?"
~70%: No traces, no logging
~20%: Log traces, but never look at them
~10%: Actively looking at data
Unbelievable alpha in looking at data.

Rogerio Chaves Reposted

Omar Khattab

@lateinteraction

29 Oct

Was fascinating to see how MIPRO prompt optimization fared for this pipeline, across six LMs. As much as a 41% increase in quality and a 68% decrease in leakage, straight out of the box. Not bad.

Siyan Sylvia Li 🦋

@Sylvia_Sparkle

28 Oct

We use DSPy to optimize the prompts for drafting the private prompt and synthesizing the personalized output. After prompt optimization, Llama-3.1-8B performs quite well at using the untrusted 4o-mini as a tool. Retains quality for 85% of the queries and privacy 93% of the time!