
Edvard Avagyan

@DarkOdysseus

Data Engineering and Deep Learning enthusiast | M.Sc. Data Engineering and Analytics at TUM | GitHub: https://t.co/4HQRij9Lbo

Edvard Avagyan Reposted

Llama 3 was trained on 15 trillion tokens of public data. But where can you find such datasets and recipes? Here comes the first release of 🍷 FineWeb: a high-quality, large-scale filtered web dataset outperforming all current datasets of its scale. We trained 200+ ablation…

We have just released 🍷 FineWeb: 15 trillion tokens of high-quality web data. We filtered and deduplicated all of CommonCrawl between 2013 and 2024. Models trained on FineWeb outperform RefinedWeb, C4, DolmaV1.6, The Pile, and SlimPajama!
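The filtering-and-deduplication step mentioned above is typically done by hashing document content; the toy exact-dedup pass below is illustrative only (FineWeb's actual pipeline is more involved, including near-deduplication, and the function name here is invented for this sketch):

```python
import hashlib

def dedup_exact(docs):
    """Keep the first occurrence of each document, keyed by content hash."""
    seen = set()
    kept = []
    for text in docs:
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(text)
    return kept

docs = ["the cat sat", "a web page", "the cat sat"]
print(dedup_exact(docs))  # the duplicate third document is dropped
```

Hashing lets a pipeline compare documents by fixed-size digests instead of full text, which is what makes dedup feasible at trillion-token scale.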

Tweet Image 1


Edvard Avagyan Reposted

One of the best pieces of career advice I received came indirectly from @myleott, via @stephenroller, which was to always "optimize for knowledge." I've found that this holds through every layer of abstraction, from company/team choice down to the specifics of daily experiments


Edvard Avagyan Reposted

Today I have a huge announcement. The dataset used to create Open Hermes 2.5 and Nous-Hermes 2 is now PUBLIC! Available Here: huggingface.co/datasets/tekni… This dataset was the culmination of all my work on curating, filtering, and generating datasets, with over 1M Examples from…

Tweet Image 1

Edvard Avagyan Reposted

If you're a Python programmer looking to get started with CUDA, this weekend I'll be doing a free 1 hour tutorial on the absolute basics. Thanks to @neurosp1ke, @marksaroufim, and @ThomasViehmann for hosting this on the CUDA MODE server. :D Click here: discord.gg/6z79K5Yh?event…
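As a rough mental model of the "absolute basics" such a tutorial covers, plain Python can mimic how a CUDA kernel assigns one thread per array element via block/thread indexing (this is a conceptual sketch, not real CUDA code; the function names are invented):

```python
def vector_add_kernel(block_idx, thread_idx, block_dim, a, b, out):
    """One 'thread' of an element-wise add, using CUDA-style indexing."""
    i = block_idx * block_dim + thread_idx
    if i < len(a):          # guard: some threads fall past the array end
        out[i] = a[i] + b[i]

def launch(kernel, grid_dim, block_dim, *args):
    """Simulate launching grid_dim blocks of block_dim threads (sequentially)."""
    for b in range(grid_dim):
        for t in range(block_dim):
            kernel(b, t, block_dim, *args)

a = [1, 2, 3, 4, 5]
b = [10, 20, 30, 40, 50]
out = [0] * len(a)
launch(vector_add_kernel, 2, 4, a, b, out)
print(out)  # [11, 22, 33, 44, 55]
```

On a real GPU all those threads run in parallel, and the `i < len(a)` bounds check is the standard guard because the thread count is rounded up to a multiple of the block size.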


Edvard Avagyan Reposted

🌊🌊🌊 Introducing WaveCoder-Ultra-6.7B, with the closest capabilities to GPT-4 so far. arXiv: arxiv.org/abs/2312.14187 WaveCoder-Ultra-6.7B is the newest SOTA open-source code LLM on multiple tasks.

Tweet Image 1
Tweet Image 2
Tweet Image 3
Tweet Image 4

Edvard Avagyan Reposted

Accepted to oral #ICLR2024! *Interpreting CLIP's Image Representation via Text-Based Decomposition* CLIP produces image representations that are useful for various downstream tasks. But what information is actually encoded in these representations? [1/8]

Tweet Image 1

Edvard Avagyan Reposted

🚢 Python DPO dataset This uses items from Vezora/Tested-22k-Python-Alpaca as the "chosen" responses, and 13B/7B generations as the "rejected" ones (assumed to be worse, not ranked/validated). huggingface.co/datasets/jondu…
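The chosen/rejected pairing described above can be sketched as a simple record-construction step. The field names follow the common DPO convention (prompt/chosen/rejected); the function name and the toy inputs are illustrative, not taken from the actual dataset pipeline:

```python
def build_dpo_record(prompt, tested_answer, weaker_generation):
    """Pair a tested answer (chosen) with an unvalidated weaker one (rejected)."""
    return {
        "prompt": prompt,
        "chosen": tested_answer,        # from the tested/verified source
        "rejected": weaker_generation,  # assumed worse, not ranked or validated
    }

records = [
    build_dpo_record(
        "Write a function that reverses a string.",
        "def rev(s):\n    return s[::-1]",
        "def rev(s):\n    return reversed(s)  # iterator, not a str",
    )
]
print(records[0]["chosen"])
```

The caveat in the tweet matters for training: because the "rejected" side is only assumed worse, some pairs may be mislabeled, which is noise a DPO run has to tolerate.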


It's pretty amusing seeing your weekend finetune getting downloads on HuggingFace.

Tweet Image 1

Edvard Avagyan Reposted

You'll never know if you don't try. Just fucking do it!


Edvard Avagyan Reposted

Check us out if you're looking to make some easy JS open-source contributions. I'm trying to convert my Python library to JavaScript, so I need help converting examples. Check out our issues for more details: github.com/jxnl/instructo…


Edvard Avagyan Reposted

Let's go 2024 🚀: 🆕 training script in 🧨 @diffuserslib leveraging techniques from the community: ① pivotal tuning (from @cloneofsimo cog-sdxl) ② prodigy optimizer (from kohya's scripts) + more tricks, compatibility with AUTO1111 ♾️ ⏩ all here huggingface.co/blog/sdxl_lora…


Edvard Avagyan Reposted

Who knew that coffee destroyed the fabric of society 😨☕

Here's the story of another technology that faced massive backlash in its time, one that will sound very familiar to today's battles over #AI: coffee. A thread.

Tweet Image 1


Edvard Avagyan Reposted

Evaluating Language-Model Agents on Realistic Autonomous Tasks Explores the ability of LM agents to acquire resources, create copies of themselves, and adapt to novel challenges they encounter in the wild arxiv.org/abs/2312.11671

Tweet Image 1

Edvard Avagyan Reposted

I wrote a comprehensive blog on latent consistency models (LCMs). > aaaaaaaaaa.org/lcm The blog explains everything you need to know about LCMs, including their math, architecture, finetuning, and code.

Tweet Image 1
