This project aims to implement the paper "Attention Is All You Need" (Vaswani et al., Google, 2017).

TODO:
- Transformer architecture
- Multi-head attention mechanism
- Encoder-decoder structure
- Positional encoding
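As a starting point for the attention items above, here is a minimal sketch of the paper's scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k))V, written in plain NumPy. The function name and toy shapes are illustrative choices, not part of the project yet.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Compute softmax(Q K^T / sqrt(d_k)) V for single-head attention."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                  # (seq_q, seq_k)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V                               # (seq_q, d_v)

# Toy example: 3 query positions, 4 key/value positions, d_k = d_v = 8.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 8)
```

Multi-head attention then runs several of these in parallel on learned linear projections of Q, K, and V, and concatenates the results; this sketch covers only the single-head core.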