
Tobias Weyand

@0xtob

Researcher / Software Engineer @GoogleDeepMind working on video understanding.

Joined June 2008
Similar Users

3D vision fanatic. Professor @cornell_tech & Researcher @GoogleDeepmind. He or they. https://t.co/m7Rs5xUFfG

@Jimantha

Associate Professor at CTU in Prague - Visual Recognition Group. Computer Vision Researcher

@giotolias

Computer Vision & Machine Learning researcher @naverlabseurope
she/her - more active there 🦋

@dlarlus

Research Scientist at ByteDance
Prev: PhD in CS from RWTH Aachen, MSc. from TUM

@aliathar94

Building something new. ex-Google Gemini, Meta, Stanford, IIT-B

@pararths

Assistant Professor of Computer Science at the University of British Columbia

@kwangmoo_yi

Computer Vision researcher, academic

@SattlerTorsten

I am a Professor of Computer Science at EPFL in Switzerland. My main research interests are in Computer Vision, Machine Learning, and Biomedical imaging.

@FuaPv

senior researcher @MSFTResearch | AI for biomedicine |
instructor @MITDeepLearning | biophysics @Harvard | alumna @MIT

@avapamini

Postdoc @ Real Virtual Humans, University of Tübingen, Germany @uni_tue.
Previously @RWTHVisionLab.

@Istvan_Sarandi

Senior research scientist @GoogleDeepMind @GoogleAI. PhD from @Oxford_VGG, before that @Cambridge_Uni. 🇮🇳 🇬🇧 🇺🇸. she/her

@NagraniArsha

CS, Math @UofT |
Research ML, Vision @UofT, Vector |
RT @kubernetesio 1.26-9

@rishit_dagli

Assistant Professor at @cs_cornell and @cornell_tech

@ElorHadar

Assistant Prof. in Machine Learning at University of Glasgow. I research computer vision/graphics, deep generative models, ML for sciences & healthcare

@pmh47_ml

Senior Research Scientist @ Microsoft. Previously @ETH, @Inria, @ENS_ULM.

@mihaidusmanu

Excited to share Long-Video Masked Autoencoder (LVMAE), which our team just published at @NeurIPSConf! We boost the context length of video models using an adaptive decoder and a dual-masking strategy, and achieve SotA on several video benchmarks. Paper: arxiv.org/abs/2411.13683

Training video understanding models on longer contexts is computationally intensive. To address this, we present a novel approach that reduces the computational load while also improving the quality of the learned representations. More at: goo.gle/4fW5aIc
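The tweet mentions a dual-masking strategy for reducing compute, but doesn't spell out the scheme here. As a toy sketch of the general dual-masking idea in masked autoencoders (the function name and ratios below are made up for illustration, not the paper's actual configuration): the encoder sees only a small random subset of tokens, and the decoder reconstructs only a random subset of the remaining masked tokens rather than all of them.

```python
import numpy as np

def dual_mask(num_tokens, encoder_keep_ratio=0.25, decoder_target_ratio=0.5, seed=0):
    """Toy dual-masking for a masked autoencoder over a long token sequence.

    Returns the indices of tokens the encoder processes and the (disjoint)
    subset of masked tokens the decoder is asked to reconstruct.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_tokens)
    n_visible = int(num_tokens * encoder_keep_ratio)
    visible = perm[:n_visible]            # small subset fed to the encoder
    masked = perm[n_visible:]             # tokens hidden from the encoder
    n_targets = int(len(masked) * decoder_target_ratio)
    targets = rng.permutation(masked)[:n_targets]  # decoder reconstructs only these
    return visible, targets

visible, targets = dual_mask(num_tokens=1024)
print(len(visible), len(targets))  # 256 visible tokens, 384 reconstruction targets
```

Because both the encoder input and the decoder's reconstruction targets are subsampled, the cost of both stages shrinks, which is what makes longer video contexts affordable.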



Thank you @JeffDean, very much appreciate the boost! This is really a team effort with my amazing colleagues @NagraniArsha, Mingda Zhang, @raminia, Rachel Hornung, @nitesh_ai, @under_fitting, Austin Meyers, @zhouxy2017, @BoqingGo, @CordeliaSchmid, @sirotenko_m, @ZhuZhu66595

A nice new benchmark for long video understanding by Tobias Weyand @0xtob and others. This is likely to be one of the new frontiers of capabilities for large-scale multimodal models, and it's great to have a new benchmark to assess others in this area.



Excited that our work on Long video understanding is being featured by @GoogleAI !

Can #AI truly understand long videos? Tobias Weyand & the Google Research team are testing the limits w/ Neptune, an open-source benchmark for long video understanding. Dive into the details & see how AI tackles temporal reasoning, cause & effect, & more → goo.gle/4esTTNM



The other day I let my kids talk to Gemini live. Today my 3 year old asked my 6 year old: "Can you tell me a joke?" - 6 year old: "Sorry, I'm just a language model."


Excited to share what our team has been working on! With expanding context lengths, frontier models are able to process longer and longer videos. But how well do they really understand them? Today we release Neptune, a challenging benchmark for long video understanding.

Datasets for evaluation of long video understanding are rare. So with this in mind, today we describe Neptune, an open-source evaluation dataset that includes multiple-choice and open-ended questions for videos of variable lengths up to 15 minutes. More → goo.gle/3B41nZV



New long video understanding benchmark from my colleagues @GoogleDeepMind pushing LLMs to their limits!

Can current LLMs solve video reasoning questions like: over 1 hour, when does the camera holder go down stairs? Watch the teaser: can you distinguish going up vs. down stairs? (P.S. the stairs are not visible when you go down.) youtu.be/Ddgvr4OReL4 Hour-Long PerceptionTest VQA @eccvconf



Tobias Weyand Reposted

Congratulations to the authors of "VideoPoet: A Large Language Model for Zero-Shot Video Generation" for winning one of this year's @icmlconf Best Paper Awards! #ICML2024 Paper: openreview.net/forum?id=LRkJw… Blog post: goo.gle/4atanoj


Tobias Weyand Reposted

Computer Vision conferences' acceptance criteria these days: #CVPR2024 #ECCV2024 #AI #ComputerVision


Tobias Weyand Reposted

Introducing VideoPrism, a single model for general-purpose video understanding that can handle a wide range of tasks, including classification, localization, retrieval, captioning and question answering. Learn how it works at goo.gle/49ltEXW


New work from my colleagues: NeRF without the need for SfM to obtain camera poses!

Presenting MELON, a technique that can determine object-centric camera poses entirely from scratch while reconstructing the object in 3D. MELON can easily be integrated into existing NeRF methods and requires as few as 4–6 images of an object. Learn more →…



My 5yo daughter is already coming up with image generation prompts to test generalization beyond the training data: "Unicorn kitty in space", "Princess astronaut". Or maybe she's just asking me to print coloring pages for her, idk.


Tobias Weyand Reposted

Introducing SANPO, a multi-attribute video dataset for outdoor human egocentric scene understanding composed of both real-world and synthetic data, including depth maps and video panoptic masks with a wide variety of semantic class labels. Read more → goo.gle/3ZISInU


Tobias Weyand Reposted

I just spent 3 days with dear friends, all of whom have kids ages 8mo to 4y. Something I need to get off my chest about being a parent of young kids and the culture we live in:


Tobias Weyand Reposted

New work from our team. We studied how various video foundation models perform on different benchmarks and with different adaptation methods.

VideoGLUE: Video General Understanding Evaluation of Foundation Models paper page: huggingface.co/papers/2307.03… We evaluate existing foundation models' video understanding capabilities using a carefully designed experiment protocol consisting of three hallmark tasks (action…



Very clear and concise tutorial on transformers

My Transformer tutorial slides are now available at lucasb.eyer.be/transformer I'll append recordings to this thread as I get them. If you want to use some of the slides for your lecture, you may, as long as you credit me. If you'd like me to give the lecture: maybe; e-mail me.



Tobias Weyand Reposted

Our team is looking for student researchers to work on foundation video models. You'll work with @BoqingGo and @0xtob. DM if you're interested!


Tobias Weyand Reposted

The Universal Image Embedding Challenge (kaggle.com/competitions/g…) of our @eccvconf Instance-Level Recognition workshop (ilr-workshop.github.io/ECCVW2022/) is online now! The workshop is co-organized by @0xtob and @giotolias among others.


Tobias Weyand Reposted

🖼️The Met Dataset: a large-scale dataset for instance-level recognition in the artwork domain. Consists of 400k images from more than 224k classes. It can be used for research in few-shot learning, self-supervised and supervised contrastive learning. paperswithcode.com/dataset/met

