
edward hicksford

@citizenhicks

posts about ai and occasionally apple.

i never really understood the use case for ipad mini. then i saw it being used as a digital ‘menu’ in a restaurant. it is the perfect weight and size for that.


this is a very good read. i personally do not look at benchmarks anymore. kinda useless really. stochasm.blog/posts/scaling_…

first blog post! around 2000 words, link in replies. first time writing something like this.



you know you are in 2024 when the waiter asks: ‘do you want to take a photo first? then we can cut your dish…’ well, we took a photo.


elections are over and everyone started to panic about a wall.


we need more of this guy.

Releasing two trillion tokens in the open. huggingface.co/blog/Pclanglai…



this paper from @Apple machine learning research introduces the concept of ‘super weights’ in llms. pruning even a single parameter, which they call a super weight, can drastically impair an llm's ability to generate text. this is surprising because previous work had only found…
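for intuition, a minimal sketch of the ablation the paper performs: zero out a single scalar weight and compare generations before and after. the checkpoint and the weight coordinates below are arbitrary placeholders, not the paper's reported super-weight locations.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# any llama-style checkpoint works for the sketch (placeholder choice)
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def generate(prompt: str) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=30, do_sample=False)
    return tok.decode(out[0], skip_special_tokens=True)

prompt = "the capital of france is"
print("before:", generate(prompt))

# ablate one scalar parameter (placeholder coordinates, for illustration)
with torch.no_grad():
    model.model.layers[2].mlp.down_proj.weight[100, 200] = 0.0

print("after:", generate(prompt))
```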


one wonders how alphafold3 was released 6 months ago and only now got open sourced, albeit with somewhat gated weights. for me, it signals that they already have something much more capable in the pipeline.


given that @GoogleDeepMind open sourced (for academic use) alphafold 3, it’s worth re-visiting the model architecture. at the foundation of the architecture are the input embeddings, which consist of an input embedder and relative position encoding. following the input…
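as a rough sketch of the relative position encoding half of that input block: clip pairwise residue offsets, one-hot them, and project into the pair representation. the dimensions and clipping range here are assumptions for illustration, not the published alphafold 3 hyperparameters.

```python
import torch
import torch.nn as nn

class RelativePositionEncoding(nn.Module):
    def __init__(self, r_max: int = 32, c_pair: int = 128):
        super().__init__()
        self.r_max = r_max
        # offsets clipped to [-r_max, r_max] -> 2*r_max + 1 buckets
        self.proj = nn.Linear(2 * r_max + 1, c_pair)

    def forward(self, residue_index: torch.Tensor) -> torch.Tensor:
        # residue_index: (n,) integer positions along the chain
        offset = residue_index[:, None] - residue_index[None, :]    # (n, n)
        offset = offset.clamp(-self.r_max, self.r_max) + self.r_max
        one_hot = nn.functional.one_hot(offset, 2 * self.r_max + 1).float()
        return self.proj(one_hot)                                   # (n, n, c_pair)

pair_bias = RelativePositionEncoding()(torch.arange(10))
print(pair_bias.shape)  # torch.Size([10, 10, 128])
```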


can’t wait.

The priority isn't the outside world, it's protecting those within the Silo. #Silo Season 2 — This Friday on Apple TV+



this paper introduces regularised best-of-n (rbon) sampling, a novel approach to mitigating reward hacking in large language model alignment. bon sampling is a decode-time alignment method that selects the best response from n samples based on a reward model. however, bon can…
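a minimal sketch of the two selection rules being compared, assuming a stand-in `reward` function and a `ref_logprob` giving the reference policy's log-likelihood; the proximity regulariser below is one plausible choice, not necessarily the paper's exact formulation.

```python
def best_of_n(candidates, reward):
    # plain bon: take the highest-reward sample; vulnerable to reward hacking
    return max(candidates, key=reward)

def regularised_best_of_n(candidates, reward, ref_logprob, beta=0.5):
    # rbon: penalise responses the reference policy finds unlikely,
    # trading reward against proximity to the reference distribution
    return max(candidates, key=lambda y: reward(y) + beta * ref_logprob(y))

# toy illustration: response "b" has an inflated reward but is wildly
# off-distribution, so rbon prefers "c" while plain bon falls for "b"
cands = ["a", "b", "c"]
r = {"a": 1.2, "b": 3.1, "c": 2.0}
lp = {"a": -5.0, "b": -40.0, "c": -6.0}
print(best_of_n(cands, r.get))                            # "b"
print(regularised_best_of_n(cands, r.get, lp.get))        # "c"
```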


engineering just cannot keep up with r&d…

"Of course that's your contention. You just found out about entropix on X, saw that pretty quadrant plot in the README and got sold on varentropy for confidence estimation. That'll last until you'll find llmri and see what those distributions really look like. Same patterns, same…



all token entropy wants to be low.


apple mlx is great but where is my torch.distributions.Dirichlet(), sir?!
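for reference, the torch call in question, plus one possible stopgap: build a dirichlet sample from normalised gamma draws, which is the standard construction. the numpy detour is my assumption here, since i'm not relying on mlx shipping a gamma sampler.

```python
import numpy as np
import torch
import mlx.core as mx

alpha = [2.0, 3.0, 5.0]

# what torch gives you out of the box
sample_torch = torch.distributions.Dirichlet(torch.tensor(alpha)).sample()

# a diy dirichlet for mlx: gamma(alpha_i, 1) draws, normalised to the simplex
g = np.random.gamma(shape=alpha, scale=1.0)
sample_mlx = mx.array(g / g.sum())

print(sample_torch, sample_mlx)
```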


this looks promising.

Mathematics offers a unique window into AI's reasoning capabilities. Discover why we've launched FrontierMath—a benchmark of hundreds of unpublished, expert-level math problems—to understand the frontier of artificial intelligence.



the paper below investigates the differences between low-rank adaptation (lora) and full fine-tuning methods for adapting pre-trained language models. while lora has been shown to match full fine-tuning performance on many tasks with fewer trainable parameters, the paper…
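for context, a minimal sketch of the lora update the paper studies: freeze the pretrained weight and learn a low-rank correction on top of it. the rank and alpha values below are arbitrary illustration choices.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)        # frozen pretrained W
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))  # zero init: update starts at 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W x + scale * B A x, with only A and B trainable
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(512, 512)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 8192 trainable params vs 262144 for full fine-tuning
```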


when your girlfriend greets your new m4 max macbook with ‘how is the new pc?’ i can’t.


entropy is everywhere.

Entropy on stage.



you lot just click on anything.

