Brief:

Before reading this page, you may want to skim Transformer, Self-Attention & Multi-Head Attention.

Reference:

1907.00235.pdf