Gautam Goel
@gautamcgoel
Postdoc studying ML at the Simons Institute at UC Berkeley.
Similar Users
@SebastienBubeck
@ShamKakade6
@zicokolter
@neu_rips
@TheGregYang
@prfsanjeevarora
@shortstein
@bneyshabur
@thegautamkamath
@yisongyue
@nanjiang_cs
@AlexGDimakis
@HazanPrinceton
@jasondeanlee
@ZeyuanAllenZhu
My sympathies are with the authors here.
This was a great talk, highly recommend!
Talk I gave at @SimonsInstitute on a line of work trying to understand statistical properties of score-like losses is finally up. The talk slot was a touch longer than usual and allowed some breathing room, so there are more technical details and musings than in the usual 50 min talk! youtube.com/watch?v=mdwxbQ…
Let F map R^d to R. McDiarmid's inequality says that if changing any single coordinate changes the value of F by at most a constant, then F(X) is close to E[F(X)] whp. Is there an analog where F: R^d -> R^d and changing a coordinate changes F by at most a constant in 2-norm?
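For reference, the scalar statement and one easy reduction (a sketch in the notation of the question; the c_i are the bounded-difference constants):

\[
\Pr\bigl[\,|F(X) - \mathbb{E} F(X)| \ge t\,\bigr] \le 2\exp\!\Bigl(-\frac{2t^2}{\sum_{i=1}^d c_i^2}\Bigr),
\quad \text{where } |F(x) - F(x')| \le c_i \text{ whenever } x, x' \text{ differ only in coordinate } i.
\]

A partial answer to the vector question: if \(\|F(x) - F(x')\|_2 \le c_i\) under the same single-coordinate perturbation, then \(g(x) = \|F(x) - \mathbb{E} F(X)\|_2\) has the same bounded differences by the triangle inequality, so McDiarmid applied to g gives concentration of the norm around \(\mathbb{E}[g]\); the remaining work is bounding \(\mathbb{E}[g]\) itself.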
A wise man once told me this rule: NeurIPS/ICML/ICLR/AISTATS if you can pretend it works, COLT/ALT if you don't have time to make it work, STOC/FOCS if there is no hope to make it work
Suppose you come up with an exciting learning theory result. There are three sets of conferences you could send it to: COLT/ALT, or NeurIPS/ICML/ICLR/AISTATS, or STOC/FOCS. How do you pick? When should you choose STOC over COLT?
I've been tearing my hair out, trying to prove a concentration inequality for the softmax of N standard Gaussians. I've tried the standard tricks but come up empty. Do you guys have any ideas? @ccanonne_ @aryehazan @neu_rips @gaussianmeasure
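Not a proof, but a quick empirical sanity check one could run first; a minimal sketch, assuming "softmax of N standard Gaussians" means the vector p_i = exp(g_i) / sum_j exp(g_j) and tracking the fluctuations of its max entry:

import numpy as np

rng = np.random.default_rng(0)

def softmax_max(N, trials=10_000):
    # Sample `trials` draws of N standard Gaussians and return the
    # max entry of the softmax of each draw.
    g = rng.standard_normal((trials, N))
    z = np.exp(g - g.max(axis=1, keepdims=True))  # stabilized softmax
    p = z / z.sum(axis=1, keepdims=True)
    return p.max(axis=1)

for N in (10, 100, 1000):
    m = softmax_max(N)
    print(f"N={N:5d}  mean={m.mean():.4f}  std={m.std():.4f}")

Watching how the std scales with N at least suggests what rate a concentration bound should target.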
Check out this article my brother coauthored on how over 2/3 of elections in the US are uncontested - no one bothers to challenge the incumbent! governing.com/magazine/ameri…
I have a very similar (maybe the same?) question. I have a sequence of i.i.d. zero-mean random variables. These variables are not subexponential, but every moment is finite. I need a concentration inequality for the average. Any thoughts?
What if the random variables are not bounded and instead we have a bound on the k-th moment? E.g., k=100. Surely we can still get a similar bound, but it seems like it would need a different proof technique.🤔 The union bound gets you something, but it seems rather weak.
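One standard route (a sketch, assuming the X_i are i.i.d., zero-mean, with \(\mathbb{E}|X_1|^k < \infty\) for a fixed \(k \ge 2\)): apply Markov's inequality to the k-th moment of the average and control that moment with Rosenthal's inequality,

\[
\Pr\bigl[\,|\bar X_n| \ge t\,\bigr] \le \frac{\mathbb{E}|\bar X_n|^k}{t^k},
\qquad
\mathbb{E}\Bigl|\sum_{i=1}^n X_i\Bigr|^k \le C_k\Bigl(\bigl(n\,\mathbb{E}X_1^2\bigr)^{k/2} + n\,\mathbb{E}|X_1|^k\Bigr),
\]

which gives \(\Pr[\,|\bar X_n| \ge t\,] = O_k\bigl(n^{-k/2} t^{-k}\bigr)\): only a polynomial tail in t, but with the right \(n^{-k/2}\) scaling in n.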
Lovely talk by Vatsal Sharan on using Transformers to discover data structures at the @SimonsInstitute
Prediction: she will join Ilya's company @ssi
I shared the following note with the OpenAI team today.
right in the feels
‘The PhD student is someone who forgoes current income in order to forgo future income.’ - Peter Greenberg
Lots of progress on bandit convex optimization recently arxiv.org/abs/2406.18672 arxiv.org/abs/2406.06506 arxiv.org/abs/2302.05371, I wish I could follow it more closely ... looks like Conjecture 1 from arxiv.org/abs/1607.03084 is going to be resolved soon!!!
Surya describing connections between LLMs, statistical mechanics, and neuroscience at the @SimonsInstitute
Ankur kicking off the year-long program on LLMs and Transformers at the @SimonsInstitute
every day we stray further from God
Mark Thursday, August 24, 2024: PDF has overtaken all of the Abrahamic religions. The trajectory over the last 9 years has been wild
I didn't read any of the papers and still know the answer is 'no'.
Everyone's talking about Sakana's AI scientist. But no one's answering the big question: is its output good? I spent hours reading its generated papers and research logs. Read on to find out x.com/SakanaAILabs/s…
The first review I ever got: "This looks fine."
R2 from my 1st PhD paper: "the proposed algorithms need to have at least 2 of the following 3 dimensions: 1-should solve difficult problems 2-should provide near-optimal solutions 3-solve a previously unsolved problem. Unfortunately, this paper achieves none of these 3 aspects."
I just disproved something I'd been trying to prove for a week. It's hard to argue with a counterexample!
Question for ML Twitter. Let f be a function, g the gradient of f, and H the Hessian of f. What is the significance of g'Hg? Intuitively this measures how quickly the function is growing in directions of high curvature. Is there any literature on this value?
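Two standard places this quantity shows up (a sketch, not an exhaustive answer): it is the second-order correction to a gradient step, and minus half the rate of change of the squared gradient norm along gradient flow:

\[
f(x - \eta g) = f(x) - \eta\,\|g\|^2 + \frac{\eta^2}{2}\, g^\top H g + O(\eta^3),
\qquad
\frac{d}{dt}\,\|\nabla f(x(t))\|^2 = -2\, g^\top H g \ \text{ along } \dot x = -\nabla f(x).
\]

Equivalently, \(g^\top H g / \|g\|^2\) is the second directional derivative of f in the gradient direction, matching the "growth in directions of high curvature" intuition.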