Faster Problem Solving with Pandas by Ian Ozsvald
Rasa Chats: Who Does NLP Work For, Fairness and Access | Podcast
[Long Review] Axial Attention in Multidimensional Transformers
UMass CS685 (Advanced NLP) F20: Transformers and sequence-to-sequence models
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning
Kaggle Reading Group: Generating Long Sequences with Sparse Transformers | Kaggle
Piero Molino - Word Embeddings: History, Present and Future | AIWTB 2017
Flash Attention 2.0 with Tri Dao (author)! | Discord server talks