
Josh Meyer

@_josh_meyer_

https://t.co/PxzriWj2jt

Similar Users

Samuele Cornell (@SamueleCornell)
arXiv Sound (@ArxivSound)
INTERSPEECH 2025 (@ISCAInterspeech)
SpeechBrain (@SpeechBrain1)
BUT Speech (@ButSpeech)
Shinji Watanabe (@shinjiw_at_cmu)
Wei-Ning Hsu (@mhnt1580)
Neil Zeghidour (@neilzegh)
Eduardo Fonseca (@edfonseca_)
Mirco Ravanelli (@mirco_ravanelli)
AlphaCephei (@alphacep)
WAVLab | @CarnegieMellon (@WavLab)
Desh Raj (@rdesh26)
erogol (@erogol)
Hervé (@hbredin)

Pinned

My dissertation as a podcast :)


Josh Meyer Reposted

After spending a few hours on F5, I found the motivation to finalize this short post. I've been saying this for quite some time already, though. alphacephei.com/nsh/2024/10/18…


Josh Meyer Reposted

Awesome new project: Whisper Turbo MLX by Josef Albers. A clean, single-file (<250 lines), blazing-fast implementation of Whisper Turbo in MLX:

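For a quick way to try Whisper Turbo on Apple Silicon, here is a minimal sketch using the separate mlx-whisper package rather than the single-file project above; the checkpoint repo id is an assumption:

    import mlx_whisper

    # Transcribe a local file with a Whisper Turbo checkpoint converted to MLX.
    # "mlx-community/whisper-large-v3-turbo" is an assumed repo id.
    result = mlx_whisper.transcribe(
        "sample.wav",
        path_or_hf_repo="mlx-community/whisper-large-v3-turbo",
    )
    print(result["text"])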

190ms TTFB 👀

Today we’re introducing our latest Text-To-Speech model, Play 3.0 mini. It’s faster, more accurate, handles multiple languages, supports streaming from LLMs, and it’s more cost-efficient than ever before. Try it out here: play.ht/playground/?ut…



Josh Meyer Reposted

Inspired by @AIatMeta's Chameleon and Llama Herd papers, llama3-s (Ichigo) is an early-fusion audio-and-text multimodal model. We're conducting this research entirely in the open, with an open-source codebase, open data, and open weights. 2/10

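For illustration, early fusion here means the codec's discrete audio tokens and the text tokens share one input sequence (and one embedding table) in the LLM. A conceptual sketch, with vocabulary sizes as assumptions:

    import torch

    TEXT_VOCAB = 32000   # assumed text vocabulary size
    AUDIO_VOCAB = 1024   # assumed codec codebook size

    def fuse(audio_codes: torch.Tensor, text_ids: torch.Tensor) -> torch.Tensor:
        # Offset audio codes past the text vocabulary so both token types
        # index one shared embedding table.
        audio_ids = audio_codes + TEXT_VOCAB
        # Early fusion: one flat sequence of audio and text tokens for the LLM.
        return torch.cat([audio_ids, text_ids], dim=-1)

    print(fuse(torch.tensor([3, 900, 12]), torch.tensor([5, 17, 42])))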

Josh Meyer Reposted

3 steps to run @huggingface "Parler TTS" AI voice on your local machine. New tutorial video out now 😊! My step-by-step technical tutorial is now available on my "Thorsten-Voice" YouTube channel. youtu.be/1X2LxAGn9tU
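For reference, local inference with Parler-TTS follows the project's README closely; a minimal sketch, assuming the parler-tts package is installed (both strings are examples):

    import soundfile as sf
    from parler_tts import ParlerTTSForConditionalGeneration
    from transformers import AutoTokenizer

    repo = "parler-tts/parler-tts-mini-v1"
    model = ParlerTTSForConditionalGeneration.from_pretrained(repo)
    tokenizer = AutoTokenizer.from_pretrained(repo)

    # The voice is steered by a free-text description of the speaker.
    description = "A calm female speaker with very clear audio."
    text = "Hello from a locally running text-to-speech model."

    input_ids = tokenizer(description, return_tensors="pt").input_ids
    prompt_ids = tokenizer(text, return_tensors="pt").input_ids
    audio = model.generate(input_ids=input_ids, prompt_input_ids=prompt_ids)
    sf.write("out.wav", audio.cpu().numpy().squeeze(), model.config.sampling_rate)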


Josh Meyer Reposted

We just released the Pixtral 12B paper on arXiv: arxiv.org/abs/2410.07073


Josh Meyer Reposted

🍏 Apple ML research in Paris has multiple open internship positions!🍎 We are looking for Ph.D. students interested in generative modeling, optimization, large-scale learning or uncertainty quantification, with applications to challenging scientific problems. Details below 👇


Josh Meyer Reposted

I’ll be presenting a deep dive into how Moshi works at the next NLP Meetup in Paris, this Wednesday the 9th at 7pm. Register if you want to attend! 🧩🔎🟢 meetup.com/fr-FR/paris-nl…


impressive

🎥 Today we’re premiering Meta Movie Gen: the most advanced media foundation models to-date. Developed by AI research teams at Meta, Movie Gen delivers state-of-the-art results across a range of capabilities. We’re excited for the potential of this line of research to usher in…



👀

Looking forward to tomorrow … 👀



Josh Meyer Reposted

Under-appreciated that Moshi (by @kyutai_labs) is a big simplification over more traditional speech-to-speech pipelines. It's really just two models:
- A speech encoder/decoder (like EnCodec)
- An LLM (trained to input and output speech tokens)
Traditionally building something…
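That two-model pipeline reduces to a short loop; a conceptual sketch, where codec and llm are placeholders for the encoder/decoder and the speech-token LLM, not kyutai's actual API:

    def speech_to_speech(waveform, codec, llm):
        in_tokens = codec.encode(waveform)    # speech -> discrete audio tokens
        out_tokens = llm.generate(in_tokens)  # LLM consumes and emits audio tokens
        return codec.decode(out_tokens)       # audio tokens -> speech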


Josh Meyer Reposted

"MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages," Marco Gaido, Sara Papi, Luisa Bentivogli, Alessio Brutti, Mauro Cettolo, Roberto Gretter, Marco Matassoni, Mohamed Nabih, Matteo Negri, ift.tt/visfyaK


Today I let my team know that I'll be leaving Rabbit. I'm immensely grateful to have worked with such a driven team. We made strides in pushing the boundaries of AI in everyday life, and we consistently shipped at high velocity with excellent partners. I want to thank my team…


Josh Meyer Reposted

My key takeaways from the first 17 pages of the Moshi technical report, which details the models and architecture (a thread):


Josh Meyer Reposted

Behold: NeMo ASR now easily runs 2000-6000x faster than realtime (RTFx) on @nvidia GPUs. We developed a series of optimizations to make RNN-T, TDT, and CTC models go brrrrrrr!🔥 In addition to topping the HF Open ASR Leaderboard, they are now fast and cheap. All in pure PyTorch!

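Trying one of these models locally takes a few lines with NeMo; a minimal sketch, assuming the toolkit is installed (the checkpoint name is illustrative):

    import nemo.collections.asr as nemo_asr

    # Load a pretrained TDT model from the hub and transcribe a local file.
    asr_model = nemo_asr.models.ASRModel.from_pretrained("nvidia/parakeet-tdt-1.1b")
    transcripts = asr_model.transcribe(["sample.wav"])
    print(transcripts[0])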

Josh Meyer Reposted

New paper: efficient multimodal machine translation training (EMMeTT). It's a milestone on our road to providing everybody with multimodal foundation model training infra. Result: a single multimodal model handles both speech and text translation without loss of NMT performance.


Josh Meyer Reposted

I'm excited to share that Pindo Voice AI is now in beta! After sending 120M+ texts, we found that SMS & USSD are hard to access for many in Africa. @pindoio helps African businesses engage customers in their native languages. Join waitlist - pindo.ai/waitlist


Josh Meyer Reposted

We're releasing updated versions of Command R (35B) and Command R+ (104B). Command R (now with GQA) in particular should perform significantly better multilingually. 🤗 model weights: - ⌘ R 08-2024: huggingface.co/CohereForAI/c4… - ⌘ R+ 08-2024: huggingface.co/CohereForAI/c4…

We’re releasing improved versions of the Command R series, our enterprise-grade AI models optimized for business use cases. You can access them on our API, @awscloud Sagemaker, and additional platforms soon. cohere.com/blog/command-s…
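GQA (grouped-query attention) shares each key/value head across a group of query heads, shrinking the KV cache with little quality loss. A minimal PyTorch sketch, assuming the query head count divides evenly by the KV head count:

    import math
    import torch

    def grouped_query_attention(q, k, v):
        # q: (B, Hq, T, D); k, v: (B, Hkv, T, D) with Hq a multiple of Hkv.
        B, Hq, T, D = q.shape
        group = Hq // k.shape[1]
        # Each KV head serves a group of query heads: repeat KV along heads.
        k = k.repeat_interleave(group, dim=1)
        v = v.repeat_interleave(group, dim=1)
        scores = q @ k.transpose(-2, -1) / math.sqrt(D)
        return torch.softmax(scores, dim=-1) @ v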



Josh Meyer Reposted

Hi all, this is the third call for papers for the SynData4GenAI workshop. Good news! While submissions were originally due on June 18th, we've extended the deadline to June 24th. Please submit your papers at syndata4genai.org. We look forward to your submissions!

This is the second call for papers for the SynData4GenAI workshop. Please mark your calendars for the submission due date (June 18, 2024, after the Interspeech acceptance notification)! I'm also pasting the CFP.


