Sho Yokoi @sho_yokoi_ Twitter Profile

Sho Yokoi

@sho_yokoi_

Researcher at #NLProc team @NlpTohoku & at #statistics team @RIKEN_AIP_EN | interested in geometry of word embedding space and optimal transport | Ja:@sho_yokoi

79Posts 379Followers 150Following

Similar User

@blankeyelephant

@yans_official

@tohoku_nlp

@nlp_colloquium

@jqk09a

@futsaludy

@shot4410

@goro_koba

@rtokuhisa

@esindurmusnlp

@moguranosenshi

@yuji_research

@hmtd223

@KaoriAbe11

@shunkiyono

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

4 Nov

This is a good piece of work (finally online): arxiv.org/abs/2411.00680 Word embeddings typically assume uniform word frequencies, but in reality, they follow Zipf’s law. Simply applying PCA whitening weighted by empirical word frequencies can improve task performance.

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

4 Nov

Zipfian Whitening ift.tt/lBr1gDA

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

27 Sep

Interested in word frequency / geometry of language representations / contrastive learning? Check out our paper “Zipfian Whitening” at #NerIPS2024! See you all in Vancouver! 🇨🇦

Sho Yokoi

@sho_yokoi_

27 Sep

Our paper “Zipfian Whitening” got accepted to #NeurIPS2024🎉 Joint work w/ amazing @levelfour_ @hiroto_kurita (@tohoku_nlp) & @hshimodaira! A simple idea—consider word frequency when taking expectations—yields rich empirical/theoretical insights into language representation🔬

Sho Yokoi

@sho_yokoi_

27 Sep

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

20 Jan

[CL] Analyzing Feed-Forward Blocks in Transformers through the Lens of Attention Map G Kobayashi, T Kuribayashi, S Yokoi, K Inui [Tohoku University & MBZUAI] (2023) arxiv.org/abs/2302.00456 - Transformers have become ubiquitous in NLP tasks, so interpreting their internals is…

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

19 Jan

Happy to share our paper on LM’s Feed-forward network (FF) analysis has been accepted as an #ICLR2024 spotlight! 💡FF boosts attention between words forming compound nouns, named entities, etc. 💡FF and LayerNorm cancel out each other’s effects 📄arxiv.org/abs/2302.00456

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

8 Jan

[ARR Co-CTO here] Quick correction - our policy is based on your arXiv submission time, which we acknowledge will sometimes mean the paper appears on arXiv inside the anonymity window (see the FAQ on the ARR author page for details: aclrollingreview.org/authors #faq)

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

7 Dec

The Tohoku NLP Lab will present nine works at #EMNLP2023. See you in Singapore! #NLProc nlp.ecei.tohoku.ac.jp

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

25 Oct 2023

It's now on arXiv! arxiv.org/abs/2310.15921

Hiroto Kurita

@hiroto_kurita

9 Oct 2023

Thrilled to announce my first paper, “Contrastive Learning for Sentence Encoder Induces Word Weighting by Information-Theoretic Quantities” got accepted at #EMNLP2023 findings! Our work connects sentence encoders with information theory. Will upload the paper to arXiv soon! 💨

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

9 Oct 2023

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

8 Oct 2023

Three full papers and four findings papers have been accepted to #EMNLP2023 ! #NLProc nlp.ecei.tohoku.ac.jp/news-release/9…

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

25 Jul 2023

Cool idea! We actually observed that BERT tries to turn off attention by shrinking the value vector of [CLS] [SEP] punctuations and paying unnecessary attention there: arxiv.org/abs/2004.10102 I'd like to see his simpler trick will make Transformer better.

Evan Miller

@EvMill

24 Jul 2023

I hit a bug in the Attention formula that’s been overlooked for 8+ years. All Transformer models (GPT, LLaMA, etc) are affected. Researchers isolated the bug last month – but they missed a simple solution… Why LLM designers should stop using Softmax 👇 evanmiller.org/attention-is-o…

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

8 Jun 2023

Introducing OTAlign, optimal transport (OT) based monolingual word alignment. We established a connection between the family of OT and monolingual word alignment concerning the null alignment ratio. (1/3) 📄arxiv.org/abs/2306.04116 w/ @levelfour_ and @sho_yokoi_ #ACL2023NLP

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

2 Jun 2023

How the “prediction head” works in LMs? Our #ACL2023 Findings paper shows this module at the end of LMs adjusts word frequency in prediction 📊 (bonus) This property can easily be used to promote more diverse outputs of LMs! 📜 arxiv.org/abs/2305.18294

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

3 May 2023

Monolingual alignment, anyone? Our #ACL2023NLP paper "Unbalanced Optimal Transport for Unbalanced Word Alignment" w/ @levelfour_ and @sho_yokoi shows the family of optimal transport are natural and powerful tools for this problem. More information coming soon. #NLProc

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

2 Feb 2023

Hey! We've uploaded two pre-prints! One focuses on the intersection of vision, language, and cognitive science. arxiv.org/abs/2302.00667 The other examines the role of feedforward in Transformers. arxiv.org/abs/2302.00456 These are still **in progress**. Stay tuned for updates!

Sho Yokoi

@sho_yokoi_

17 Dec 2022

The separation of “(recognizing) textual entailment” and “natural language inference” in the peer-review field selection form for ACL 2023 is disturbing.

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

26 Oct 2022

[CL] Subspace-based Set Operations on a Pre-trained Word Embedding Space Y Ishibashi, S Yokoi, K Sudoh, S Nakamura [NAIST & Tohoku University] (2022) arxiv.org/abs/2210.13034 #MachineLearning #ML #AI #NLP #NLProc

Sho Yokoi Reposted

Sho Yokoi

@sho_yokoi_

8 Apr 2022

One paper entitled "Word Tour: One-dimensional Word Embeddings via the Traveling Salesman Problem" has been accepted to #NAACL2022 (main conference, short paper) 🎉