
edward hicksford

@citizenhicks

posts about ai and occasionally apple.

i never really understood the use case for ipad mini. then i saw it being used as a digital ‘menu’ in a restaurant. it is the perfect weight and size for that.


this is a very good read. i personally do not look at benchmarks anymore. kinda useless really. stochasm.blog/posts/scaling_…

first blog post! around 2000 words, link in replies. first time writing something like this.



you know you are in 2024 when the waiter asks: ‘do you want to take a photo first? then we can cut your dish…’ well, we took a photo.


elections are over and everyone started to panic about a wall.


we need more of this guy.

Releasing two trillion tokens in the open. huggingface.co/blog/Pclanglai…



this paper from @Apple machine learning research introduces the concept of ‘super weights’ in llms. pruning even a single parameter, which they call a super weight, can drastically impair an llm's ability to generate text. this is surprising because previous work had only found…
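for intuition, a minimal sketch of the ablation the paper performs: zero out a single scalar weight and compare generations before and after. the checkpoint and the weight coordinates below are arbitrary placeholders, not the paper's reported super-weight locations.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# any llama-style checkpoint works for the sketch (placeholder choice)
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

def generate(prompt: str) -> str:
    ids = tok(prompt, return_tensors="pt").input_ids
    out = model.generate(ids, max_new_tokens=30, do_sample=False)
    return tok.decode(out[0], skip_special_tokens=True)

prompt = "the capital of france is"
print("before:", generate(prompt))

# ablate one scalar parameter (placeholder coordinates, for illustration)
with torch.no_grad():
    model.model.layers[2].mlp.down_proj.weight[100, 200] = 0.0

print("after:", generate(prompt))
```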


one wonders how alphafold3 was released 6 months ago and only now got open sourced, albeit with somewhat gated weights. for me, it signals that they already have something much more capable in the pipeline.


given that @GoogleDeepMind open sourced (for academic use) alphafold 3, it’s worth re-visiting the model architecture. at the foundation of the architecture are the input embeddings, which consist of an input embedder and relative position encoding. following the input…
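as a rough sketch of the relative position encoding half of that input block: clip pairwise residue offsets, one-hot them, and project into the pair representation. the dimensions and clipping range here are assumptions for illustration, not the published alphafold 3 hyperparameters.

```python
import torch
import torch.nn as nn

class RelativePositionEncoding(nn.Module):
    def __init__(self, r_max: int = 32, c_pair: int = 128):
        super().__init__()
        self.r_max = r_max
        # offsets clipped to [-r_max, r_max] -> 2*r_max + 1 buckets
        self.proj = nn.Linear(2 * r_max + 1, c_pair)

    def forward(self, residue_index: torch.Tensor) -> torch.Tensor:
        # residue_index: (n,) integer positions along the chain
        offset = residue_index[:, None] - residue_index[None, :]    # (n, n)
        offset = offset.clamp(-self.r_max, self.r_max) + self.r_max
        one_hot = nn.functional.one_hot(offset, 2 * self.r_max + 1).float()
        return self.proj(one_hot)                                   # (n, n, c_pair)

pair_bias = RelativePositionEncoding()(torch.arange(10))
print(pair_bias.shape)  # torch.Size([10, 10, 128])
```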


can’t wait.

The priority isn't the outside world, it's protecting those within the Silo. #Silo Season 2 — This Friday on Apple TV+



this paper introduces regularised best-of-n (rbon) sampling, a novel approach to mitigating reward hacking in large language model alignment. bon sampling is a decode-time alignment method that selects the best response from n samples based on a reward model. however, bon can…
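a minimal sketch of the two selection rules being compared, assuming a stand-in `reward` function and a `ref_logprob` giving the reference policy's log-likelihood; the proximity regulariser below is one plausible choice, not necessarily the paper's exact formulation.

```python
def best_of_n(candidates, reward):
    # plain bon: take the highest-reward sample; vulnerable to reward hacking
    return max(candidates, key=reward)

def regularised_best_of_n(candidates, reward, ref_logprob, beta=0.5):
    # rbon: penalise responses the reference policy finds unlikely,
    # trading reward against proximity to the reference distribution
    return max(candidates, key=lambda y: reward(y) + beta * ref_logprob(y))

# toy illustration: response "b" has an inflated reward but is wildly
# off-distribution, so rbon prefers "c" while plain bon falls for "b"
cands = ["a", "b", "c"]
r = {"a": 1.2, "b": 3.1, "c": 2.0}
lp = {"a": -5.0, "b": -40.0, "c": -6.0}
print(best_of_n(cands, r.get))                            # "b"
print(regularised_best_of_n(cands, r.get, lp.get))        # "c"
```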


engineering just cannot keep up with r&d…

"Of course that's your contention. You just found out about entropix on X, saw that pretty quadrant plot in the README and got sold on varentropy for confidence estimation. That'll last until you'll find llmri and see what those distributions really look like. Same patterns, same…



all token entropy wants to be low.


apple mlx is great but where is my torch.distributions.Dirichlet(), sir?!
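for reference, the torch call in question, plus one possible stopgap: build a dirichlet sample from normalised gamma draws, which is the standard construction. the numpy detour is my assumption here, since i'm not relying on mlx shipping a gamma sampler.

```python
import numpy as np
import torch
import mlx.core as mx

alpha = [2.0, 3.0, 5.0]

# what torch gives you out of the box
sample_torch = torch.distributions.Dirichlet(torch.tensor(alpha)).sample()

# a diy dirichlet for mlx: gamma(alpha_i, 1) draws, normalised to the simplex
g = np.random.gamma(shape=alpha, scale=1.0)
sample_mlx = mx.array(g / g.sum())

print(sample_torch, sample_mlx)
```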


this looks promising.

Mathematics offers a unique window into AI's reasoning capabilities. Discover why we've launched FrontierMath—a benchmark of hundreds of unpublished, expert-level math problems—to understand the frontier of artificial intelligence.



the paper below investigates the differences between low-rank adaptation (lora) and full fine-tuning methods for adapting pre-trained language models. while lora has been shown to match full fine-tuning performance on many tasks with fewer trainable parameters, the paper…
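for context, a minimal sketch of the lora update the paper studies: freeze the pretrained weight and learn a low-rank correction on top of it. the rank and alpha values below are arbitrary illustration choices.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, d_in: int, d_out: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)        # frozen pretrained W
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))  # zero init: update starts at 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # W x + scale * B A x, with only A and B trainable
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(512, 512)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(trainable)  # 8192 trainable params vs 262144 for full fine-tuning
```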


when your girlfriend greets your new m4 max macbook with ‘how is the new pc?’ i can’t.


entropy is everywhere.

Entropy on stage.



you lot just click on anything.

