Brief:
Before reading this page, it helps to skim Transformer, Self-Attention, and Multi-Head Attention.
Reference:
1907.00235.pdf