
Harpreet Singh

@harpreetmann24


Harpreet Singh Reposted

Microsoft released a groundbreaking model that can be used for web automation, with an MIT license 🔥👏 OmniParser is a state-of-the-art UI parsing/understanding model that outperforms GPT-4V in parsing. 👏


Harpreet Singh Reposted

Planning GenAI Right: Your Blueprint for Reliable Business Transformation 🚀 For enterprises 🏢, reliability is essential when integrating GenAI into business processes 🤖. The paper "LLMs Still Can’t Plan; Can LRMs?" reveals that even advanced models like OpenAI’s o1 face…


Harpreet Singh Reposted

♥️ this writeup from @AnthropicAI for so many reasons: • Reiterating bm25 + semantic retrieval is standard RAG • Not just sharing what worked but also what didn't work • Evals on various data (code, fiction, arXiv) + embeddings • Breaking down gains from each step More of…
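The "bm25 + semantic retrieval" point boils down to scoring each chunk with both a lexical score and an embedding similarity, then merging the two. A minimal sketch of that hybrid setup, assuming the rank_bm25 and sentence-transformers packages; the corpus, weights, and model choice are illustrative, not from the thread:

```python
# Hybrid retrieval sketch: BM25 (lexical) + embedding similarity (semantic).
# Corpus, fusion weights, and embedding model are illustrative placeholders.
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

corpus = [
    "Contextual retrieval prepends a short summary to each chunk before indexing.",
    "BM25 scores documents by term frequency and inverse document frequency.",
    "Dense retrievers embed queries and chunks into the same vector space.",
]
query = "how does bm25 rank documents?"

# Lexical side: BM25 over whitespace-tokenized chunks.
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
bm25_scores = list(bm25.get_scores(query.lower().split()))

# Semantic side: cosine similarity between query and chunk embeddings.
model = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = model.encode(corpus, convert_to_tensor=True)
query_emb = model.encode(query, convert_to_tensor=True)
cos_scores = util.cos_sim(query_emb, doc_emb)[0].tolist()

# Merge: weighted sum after min-max normalizing each score list.
def normalize(xs):
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo + 1e-9) for x in xs]

merged = [0.5 * b + 0.5 * c for b, c in zip(normalize(bm25_scores), normalize(cos_scores))]
for doc, score in sorted(zip(corpus, merged), key=lambda t: -t[1]):
    print(f"{score:.3f}  {doc}")
```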


Harpreet Singh Reposted

The most comprehensive overview of LLM-as-a-Judge! READ IT‼️ "Evaluating the Effectiveness of LLM-Evaluators (aka LLM-as-Judge)" summarizes and analyzes two dozen papers on different LLM-as-Judge approaches.🤯 TL;DR; ⚖️ Direct scoring is suitable for objective evaluations, while…
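For intuition, direct scoring just means asking a model to grade one answer against a rubric. A minimal sketch; `call_llm` is a hypothetical stand-in for whatever chat-completion client you use, and the rubric and 1-5 scale are illustrative, not from the paper:

```python
# Direct-scoring LLM-as-Judge sketch. `call_llm` is a hypothetical helper for any
# chat-completion API; the rubric and 1-5 scale are illustrative only.
import json

JUDGE_PROMPT = """You are grading an answer for factual correctness.
Question: {question}
Answer: {answer}
Return JSON: {{"score": <integer 1-5>, "reason": "<one sentence>"}}"""

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your chat-completion API")

def judge(question: str, answer: str) -> dict:
    raw = call_llm(JUDGE_PROMPT.format(question=question, answer=answer))
    verdict = json.loads(raw)          # expect {"score": ..., "reason": ...}
    assert 1 <= verdict["score"] <= 5  # guard against out-of-range scores
    return verdict
```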


Harpreet Singh Reposted

Here's an engaging intro to evals by @sridatta and @iamwil. They've clearly put a lot of care and effort into it: the content is well organized, with plenty of illustrations throughout. Across 60 pages, they explain model vs. system evals, vibe checks and property-based…
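One way to read "property-based" here is asserting invariants that any acceptable output must satisfy, rather than comparing against a gold answer. A hedged sketch of that idea; the properties and the `summarize` function are made up for illustration:

```python
# Property-based check sketch: assert invariants on the output instead of
# comparing to a reference. `summarize` is a hypothetical system-under-test.
import re

def summarize(text: str) -> str:
    raise NotImplementedError("call your summarization model or pipeline here")

def check_summary_properties(source: str, summary: str) -> list[str]:
    failures = []
    if not summary.strip():
        failures.append("summary is empty")
    if len(summary.split()) >= len(source.split()):
        failures.append("summary is not shorter than the source")
    # No hallucinated numbers: every digit string in the summary must appear in the source.
    for num in re.findall(r"\d+", summary):
        if num not in source:
            failures.append(f"number {num} not present in source")
    return failures
```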


Harpreet Singh Reposted

That's big! ⛰️ FRAMES released by @GoogleAI! FRAMES is a comprehensive evaluation dataset designed to test Retrieval-Augmented Generation (RAG) applications on factuality, retrieval accuracy, and reasoning. It includes multi-hop questions that demand sophisticated retrieval and…
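A quick sketch of pulling the eval set from the Hugging Face Hub; the repo id "google/frames-benchmark" and the column names are assumptions, so check the dataset card for the exact schema:

```python
# Sketch: load FRAMES from the Hub and peek at one multi-hop question.
# Repo id and field names are assumptions; verify against the dataset card.
from datasets import load_dataset

frames = load_dataset("google/frames-benchmark")
print(frames)                                         # inspect splits and columns first
split = list(frames.keys())[0]
example = frames[split][0]
print({k: str(v)[:80] for k, v in example.items()})   # truncated view of one record
```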


Harpreet Singh Reposted

🚨 Why Are So Many GenAI Projects Failing? 📉 And How to Fix Them! 💡 GenAI has huge potential, but many projects fall short. Why? 🤔 Here are 6 common reasons, with actionable solutions to avoid them: 1️⃣ Treating GenAI as simple automation 🤖 2️⃣ Over-relying on a single…


Harpreet Singh Reposted

After years eclipsed by its big brothers, gpt-2 resurgent? 🤔

The hype for finding out what is "gpt2-chatbot" on lmsys chatbot arena is real 😅



Harpreet Singh Reposted

arxiv.org/abs/2404.16811 "our study presents information-intensive (IN2) training, a purely data-driven solution to overcome lost-in-the-middle."
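Being "purely data-driven" means the fix is in how training samples are built: the answer-bearing segment is placed at a random depth inside a long context so the model learns to use mid-context information. A sketch under that reading; the prompt format, lengths, and filler documents are illustrative, not the paper's code:

```python
# Sketch of an IN2-style training sample: hide the key segment at a random depth
# inside a long synthetic context. Format and lengths are illustrative only.
import random

def build_in2_sample(key_segment: str, question: str, answer: str,
                     filler_segments: list[str], context_len: int = 20) -> dict:
    segments = random.sample(filler_segments, k=min(context_len, len(filler_segments)))
    insert_at = random.randint(0, len(segments))   # uniform depth, including the middle
    segments.insert(insert_at, key_segment)
    context = "\n\n".join(segments)
    prompt = f"{context}\n\nQuestion: {question}\nAnswer:"
    return {"prompt": prompt, "completion": " " + answer, "answer_depth": insert_at}
```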


Harpreet Singh Reposted

Easily Fine-tune @AIatMeta Llama 3 70B! 🦙 I am excited to share a new guide on how to fine-tune Llama 3 70B with @PyTorch FSDP, Q-Lora, and Flash Attention 2 (SDPA) using @huggingface, built for consumer-size GPUs (4x 24GB). 🚀 Blog: philschmid.de/fsdp-qlora-lla… The blog covers: 👨‍💻…
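The core of that recipe is 4-bit NF4 quantization plus LoRA adapters on top, with SDPA attention. A skeleton of that setup; hyperparameters are placeholders, and the FSDP launch config and training loop themselves are in the linked blog post:

```python
# Q-LoRA setup sketch: 4-bit NF4 base model + LoRA adapters + SDPA attention.
# Hyperparameters are placeholders; see the linked post for the FSDP training run.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "meta-llama/Meta-Llama-3-70B-Instruct"  # gated repo; request access first

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    attn_implementation="sdpa",   # scaled dot-product attention kernel
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()   # only the LoRA adapters are trainable
```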


Harpreet Singh Reposted

Phi-3 has "only" been trained on 5x fewer tokens than Llama 3 (3.3 trillion instead of 15 trillion) Phi-3-mini less has "only" 3.8 billion parameters, less than half the size of Llama 3 8B. Despite being small enough to be deployed on a phone (according to the technical…

I can't believe microsoft just dropped phi-3 less than a week after llama 3 arxiv.org/abs/2404.14219. And it looks good!
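A quick sketch of trying Phi-3-mini locally with transformers; the prompt is just an example, and at release the repo required trust_remote_code:

```python
# Sketch: run Phi-3-mini with transformers. Prompt is illustrative; at release
# the repo needed trust_remote_code=True.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3-mini-4k-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

messages = [{"role": "user", "content": "Explain why small models can still be strong."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```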



Harpreet Singh Reposted

Just learned that the RedPajama-V2 pretraining dataset is actually 30T tokens. 2x the size used for Llama 3 🤯 github.com/togethercomput…
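At that scale you stream rather than download. A sketch of streaming a slice with datasets; the config name "sample" is an assumption from memory of the dataset card, so check the Hub page for the exact configs and fields:

```python
# Sketch: stream a small slice of RedPajama-V2 instead of materializing 30T tokens.
# Config name "sample" is an assumption; verify against the dataset card.
from itertools import islice
from datasets import load_dataset

rpv2 = load_dataset(
    "togethercomputer/RedPajama-Data-V2",
    name="sample",
    split="train",
    streaming=True,          # lazy iteration, no full download
)
for doc in islice(rpv2, 3):
    print(list(doc.keys()))  # inspect the schema before building a pipeline
```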


Harpreet Singh Reposted

"... do SSMs truly have an advantage (over transformers) in expressive power for state tracking? Surprisingly, the answer is no ... Thus, despite its recurrent formulation, the 'state' in an SSM is an illusion" 🎤✋🔥 arxiv.org/abs/2404.08819


Harpreet Singh Reposted

Llama 3 was trained on 15 trillion tokens of public data. But where can you find such datasets and recipes? Here comes the first release of 🍷 FineWeb: a high-quality, large-scale filtered web dataset outperforming all current datasets of its scale. We trained 200+ ablation…

We have just released 🍷 FineWeb: 15 trillion tokens of high quality web data. We filtered and deduplicated all CommonCrawl between 2013 and 2024. Models trained on FineWeb outperform RefinedWeb, C4, DolmaV1.6, The Pile and SlimPajama!
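A sketch of streaming FineWeb from the Hub rather than downloading 15T tokens; the repo id comes from the release, but the "sample-10BT" config name and the record fields are assumptions to verify on the dataset card:

```python
# Sketch: stream FineWeb from the Hub. The "sample-10BT" subset name and record
# fields are assumptions; check the dataset card for available configs.
from itertools import islice
from datasets import load_dataset

fineweb = load_dataset("HuggingFaceFW/fineweb", name="sample-10BT",
                       split="train", streaming=True)
for doc in islice(fineweb, 2):
    print(doc["url"], len(doc["text"]))   # raw text plus crawl metadata per record
```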



Harpreet Singh Reposted

Introducing Meta Llama 3: the most capable openly available LLM to date. Today we’re releasing 8B & 70B models that deliver on new capabilities such as improved reasoning and set a new state-of-the-art for models of their sizes. Today's release includes the first two Llama 3…
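A minimal sketch of running the 8B instruct model with transformers; the repo is gated (accept the license on the Hub first), and the prompt is illustrative:

```python
# Sketch: chat with Llama 3 8B Instruct via transformers. Gated repo; prompt is
# just an example.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "What changed between Llama 2 and Llama 3?"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(input_ids, max_new_tokens=200)
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```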


Harpreet Singh Reposted

Meta Llama 3 70B Instruct in Hugging Chat! Go have fun! huggingface.co/chat/models/me…


Harpreet Singh Reposted

Announcing our latest addition to the OLMo family, OLMo 1.7! 🎉 Our team's efforts to improve data quality, training procedures and model architecture have led to a leap in performance. See how OLMo 1.7 stacks up against its peers and peek into the technical details on the blog:…


Harpreet Singh Reposted

We can do it! 🙌 First open LLM outperforms @OpenAI GPT-4 (March) on MT-Bench. WizardLM 2 is a fine-tuned and preference-trained Mixtral 8x22B! 🤯 TL;DR; 🧮 Mixtral 8x22B based (141B-A40 MoE) 🔓 Apache 2.0 license 🤖 First > 9.00 on MT-Bench with an open LLM 🧬 Used multi-step…


Harpreet Singh Reposted

New open model from @MistralAI! 🧠 Last night, Mistral released Mixtral 8x22B, a 176B MoE, via magnet link. 🔗🤯 What we know so far: 🧮 176B MoE with ~40B active 📜 context length of 65k tokens. 🪨 Base model can be fine-tuned 👀 ~260GB VRAM in fp16, 73GB in int4 📜 Apache…
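VRAM figures like these are essentially parameter count times bytes per parameter, ignoring activations and KV cache. A quick check of the arithmetic, using the 141B total quoted in the WizardLM 2 repost above (the 176B here was an early estimate):

```python
# Back-of-the-envelope weight memory: parameters x bits / 8, ignoring activations
# and KV cache. Quantized formats add some overhead on top of the raw number.
def weight_memory_gb(n_params_billion: float, bits_per_param: int) -> float:
    return n_params_billion * bits_per_param / 8  # billions of params -> GB directly

for bits in (16, 4):
    print(f"141B params @ {bits}-bit ≈ {weight_memory_gb(141, bits):.0f} GB")
# 16-bit -> ~282 GB, 4-bit -> ~71 GB
```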


Harpreet Singh Reposted

We introduce LLM2Vec, a simple approach to transform any decoder-only LLM into a text encoder. We achieve SOTA performance on MTEB in the unsupervised and supervised category (among the models trained only on publicly available data). 🧵1/N Paper: arxiv.org/abs/2404.05961
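The basic idea of reading embeddings out of a decoder-only LLM can be illustrated with masked mean pooling over its hidden states. This is only a simplified sketch: the actual LLM2Vec recipe additionally enables bidirectional attention and trains with MNTP + SimCSE, so see the paper for the real method; the model id is just an example of a gated decoder-only checkpoint:

```python
# Simplified illustration: text embeddings from a decoder-only LLM via masked mean
# pooling. The real LLM2Vec method also enables bidirectional attention and adds
# MNTP + SimCSE training; this is not that.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"   # any decoder-only LLM; gated repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token   # Llama tokenizers ship without a pad token
model = AutoModel.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

def embed(texts: list[str]) -> torch.Tensor:
    batch = tokenizer(texts, padding=True, return_tensors="pt").to(model.device)
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state          # (batch, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1)           # zero out padding positions
    pooled = (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # masked mean pooling
    return torch.nn.functional.normalize(pooled, dim=-1)

print(embed(["retrieval query", "a passage about retrieval"]).shape)
```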

