Update: likely OpenAI-o1 system prompt for multi-step reasoning ^^ github.com/Mr-Jack-Tung/J…
Unusual couplings and gears.
Snowy owl mum protecting her chicks from snow 📹AI
"Open-source developer platform to power your entire infra and turn scripts into webhooks, workflows and UIs. Fastest workflow engine (13x vs Airflow). Open-source alternative to Retool and Temporal."
This song was not created by a real human. It was made with Suno V4. We have now passed the threshold where human-made music could still be distinguished from AI-generated music. The video, too. It's over for good old Warner Music and the others.
China seems to be in the robot lead. We need to accelerate more. No question about it.
You can finetune Qwen-2.5-Coder-14B for free on Colab now! Unsloth makes finetuning 2x faster & uses 60% less VRAM with no accuracy loss. We extended context lengths from 32K to 128K with YaRN & uploaded GGUFs: huggingface.co/collections/un… Finetuning Colab: colab.research.google.com/drive/18sN803s…
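For context, the Unsloth QLoRA flow looks roughly like this; a minimal sketch where the model id and LoRA hyperparameters are illustrative assumptions, not the Colab's exact settings:

```python
# Hedged sketch of an Unsloth QLoRA finetune; model id and LoRA settings
# below are illustrative assumptions, not the Colab's exact configuration.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-Coder-14B",  # assumed repo id
    max_seq_length=32768,                    # extendable to 128K via YaRN
    load_in_4bit=True,                       # 4-bit base weights for low VRAM
)
model = FastLanguageModel.get_peft_model(
    model,
    r=16,                                    # LoRA rank (illustrative)
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)
# From here, train with a standard TRL SFTTrainer on your dataset.
```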
M4 Mac AI Coding Cluster Uses @exolabs to run LLMs (here Qwen 2.5 Coder 32B at 18 tok/sec) distributed across 4 M4 Mac Minis (Thunderbolt 5 80Gbps) and a MacBook Pro M4 Max. Local alternative to @cursor_ai (benchmark comparison soon).
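For a sense of how such a cluster is consumed: exo exposes a ChatGPT-compatible HTTP API, so any OpenAI-style client works. The port and model id below are assumptions (recent exo versions document 52415; older ones used 8000):

```python
# Hedged sketch: querying an exo cluster via its ChatGPT-compatible endpoint.
# Port and model id are assumptions; check what your `exo` instance prints
# on startup.
import requests

resp = requests.post(
    "http://localhost:52415/v1/chat/completions",
    json={
        "model": "qwen-2.5-coder-32b",  # assumed model id
        "messages": [{"role": "user",
                      "content": "Write a binary search in Python."}],
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```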
The Rhythm In Anything (TRIA) An AI system to map arbitrary sound to high-fidelity drum recordings! Mind-blowing 🤯 we've probably watched this video a thousand times now 🥁 [SOUND: ON🎵]
Humans can learn to reason in an "unfamiliar" world, like new games. How far are LLMs from this? Check out our recent work @NeurIPS2024 D&B Track: "LogiCity: Advancing Neuro-Symbolic AI with Abstract Urban Simulation". Page: jaraxxus-me.github.io/LogiCity/
.@Microsoft just dropped TinyTroupe! Described as "an experimental Python library that allows the simulation of people with specific personalities, interests, and goals." These agents can listen, reply back, and go about their lives in simulated TinyWorld environments.
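A minimal sketch of the kind of usage TinyTroupe's README shows; the example factory functions and method names follow its documented examples at release time and may have changed since:

```python
# Hedged sketch based on TinyTroupe's README examples; names here follow
# the documented examples but are not guaranteed against later versions.
from tinytroupe.examples import (create_lisa_the_data_scientist,
                                 create_oscar_the_architect)
from tinytroupe.environment import TinyWorld

lisa, oscar = create_lisa_the_data_scientist(), create_oscar_the_architect()

world = TinyWorld("Chat Room", [lisa, oscar])
world.make_everyone_accessible()  # lets the agents address each other
lisa.listen("Ask Oscar what he thinks of modernist architecture.")
world.run(4)                      # simulate four rounds in the TinyWorld
```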
💫 Introducing Mixture-of-Transformers (MoT), our latest work advancing modality-aware sparse architectures for multimodal foundation models, led by @liang_weixin, in collaboration w/ amazing colleagues at @AIatMeta arxiv.org/abs/2411.04996 (1/n)
How can we reduce pretraining costs for multi-modal models without sacrificing quality? We study this question in our new work: arxiv.org/abs/2411.04996 At @AIatMeta, we introduce Mixture-of-Transformers (MoT), a sparse architecture with modality-aware sparsity for every non-embedding…
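Roughly, "modality-aware sparsity for every non-embedding parameter" means each modality gets its own copy of the dense layers while attention stays global over the mixed sequence. A conceptual sketch of that idea (not the paper's implementation; only the FFN is split here, whereas the paper also untangles norms and projections):

```python
# Conceptual sketch of modality-aware sparsity: per-modality FFNs with
# globally shared self-attention. Norms and per-modality attention
# projections from the paper are omitted for brevity.
import torch
import torch.nn as nn

class MoTBlock(nn.Module):
    def __init__(self, d_model: int, n_heads: int, n_modalities: int):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # One FFN per modality (e.g. text, image, speech).
        self.ffns = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_modalities)
        )

    def forward(self, x: torch.Tensor, modality: torch.Tensor) -> torch.Tensor:
        # Global attention over the full multimodal sequence.
        x = x + self.attn(x, x, x, need_weights=False)[0]
        # Deterministic routing: each token uses the FFN of its modality.
        out = torch.zeros_like(x)
        for m, ffn in enumerate(self.ffns):
            mask = modality == m          # (batch, seq) bool mask
            out[mask] = ffn(x[mask])
        return x + out
```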
At last, curriculum learning that works: one curriculum for pretraining and another for instruction tuning. @l__ranaldi @Giuli12P2 @andrenfreitas @znz8 aclanthology.org/2024.lrec-main… aclanthology.org/2023.ranlp-1.1…
🚨 New Paper!! How can we train LLMs using 100M words? In our @babyLMchallenge paper, we introduce a new self-synthesis training recipe to tackle this question! 🍼💻 This was a fun project co-led by me, @yingtian80536, @akgokce0, w/ @HannesMehrer & @martin_schrimpf 🧵⬇️
"Full stack, modern web application template. Using FastAPI, React, SQLModel, PostgreSQL, Docker, GitHub Actions, automatic HTTPS and more."
Voice-Pro: a Gradio web UI for transcription, translation, and text-to-speech.
M4 Mac Mini AI Cluster Uses @exolabs with Thunderbolt 5 interconnect (80Gbps) to run LLMs distributed across 4 M4 Pro Mac Minis. The cluster is small (iPhone for reference). It’s running Nemotron 70B at 8 tok/sec and scales to Llama 405B (benchmarks soon).
✒️Kiroku is a multi-agent system that helps you organize and write documents Really complex agent (see the diagram below!) that HEAVILY involves a "human in the loop" flow Great resource for anyone looking to create a writing agent github.com/cnunescoelho/k…
Model merging is tricky when model weights aren’t aligned Introducing KnOTS 🪢: a gradient-free framework to merge LoRA models. KnOTS is plug-and-play, boosting SoTA merging methods by up to 4.3%🚀 📜: arxiv.org/abs/2410.19735 💻: github.com/gstoica27/KnOTS
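As I read the abstract, the core trick is to express the per-task LoRA updates in a shared SVD basis before applying an existing merging rule, so misaligned weights get aligned first. A simplified sketch under that reading (not the authors' exact algorithm; they pair the alignment with stronger merging rules like TIES):

```python
# Hedged sketch of the KnOTS idea: align LoRA task updates in a shared
# SVD basis, merge in that basis, then map back to weight space.
import torch

def knots_style_merge(deltas: list[torch.Tensor]) -> torch.Tensor:
    # deltas: per-task weight updates B_i @ A_i, each of shape (out, in).
    stacked = torch.cat(deltas, dim=0)           # stack along the output dim
    U, S, Vh = torch.linalg.svd(stacked, full_matrices=False)
    # Vh spans a shared input-side basis; express each task in it.
    coeffs = [d @ Vh.T for d in deltas]          # per-task coordinates
    merged = torch.stack(coeffs).mean(0)         # simple averaging stand-in
    return merged @ Vh                           # back to weight space
```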
👉🏻Thrilled to introduce BitNet a4.8, enabling 4-bit activations for 1.58-bit LLMs!🚀🚀 Paper: arxiv.org/abs/2411.04965 HF page: huggingface.co/papers/2411.04… 🔥🔥2B BitNet a4.8 trained with 2T tokens achieves 50.30% acc on MMLU, almost no degradation to BitNet b1.58.
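For intuition, the two quantizers at play look roughly like this; the scaling and rounding follow the absmean/absmax style of the earlier BitNet papers, not necessarily a4.8's exact scheme:

```python
# Hedged illustration of the quantization levels in BitNet a4.8: ternary
# {-1, 0, +1} weights (1.58 bits) paired with 4-bit activations.
import torch

def weight_quant_ternary(w: torch.Tensor) -> torch.Tensor:
    scale = w.abs().mean().clamp(min=1e-5)            # absmean scaling
    return (w / scale).round().clamp(-1, 1) * scale   # {-1, 0, 1} * scale

def act_quant_4bit(x: torch.Tensor) -> torch.Tensor:
    scale = x.abs().max(dim=-1, keepdim=True).values.clamp(min=1e-5) / 7
    return (x / scale).round().clamp(-8, 7) * scale   # signed int4 range
```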
43% of the speedup in the new NanoGPT record is due to a variant of value residual learning that I developed. Value residual learning (recently proposed by arxiv.org/abs/2410.17897) allows all blocks in the transformer to access the values computed by the first block. The paper…
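The idea is simple enough to sketch: each layer's attention values become a mix of its own values and those computed by the first block. The fixed scalar below stands in for the learned per-layer mixing coefficients used in the record run (an assumption for clarity):

```python
# Hedged sketch of value residual learning: mix each layer's value vectors
# with the values from block 0 before attention uses them.
import torch

def mix_values(v_current: torch.Tensor,
               v_first: torch.Tensor,
               lam: float = 0.5) -> torch.Tensor:
    # v_* have shape (batch, heads, seq, head_dim); v_first is cached
    # from block 0 and reused by every later block.
    return lam * v_current + (1.0 - lam) * v_first
```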