The Transformer 1
Transformers have transformed (pun intended) all domains of machine learning – computer vision, natural language processing, or speech processing. The original paper is old by ML standards but the general idea of having an encoder/decoder architecture with an attention is…