Lesson 3·35 min·Free
The Attention Mechanism
The mathematical heart of every transformer
Why Attention Matters
Attention is the mechanism that allows a transformer to understand context. When you read the sentence "The bank of the river was muddy," your brain instantly resolves "bank" as a riverbank, not a financial institution. Attention is how a transformer does the same thing — by looking at surrounding words to determine meaning.
The mathematical formulation is elegantly simple: Attention(Q, K, V) = softmax(QKᵀ/√d_k)V
Let's break down exactly what this means.
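As a first pass, the formula can be sketched directly in NumPy. This is a minimal illustration of scaled dot-product attention, not a production implementation; the shapes and random inputs here are chosen purely for demonstration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Q, K: (seq_len, d_k); V: (seq_len, d_v)
    d_k = Q.shape[-1]
    # Similarity of every query with every key, scaled by sqrt(d_k).
    scores = Q @ K.T / np.sqrt(d_k)        # (seq_len, seq_len)
    # Each row becomes a probability distribution over positions.
    weights = softmax(scores, axis=-1)
    # Weighted average of the value vectors.
    return weights @ V

# Tiny example: 3 tokens, d_k = d_v = 4 (sizes are illustrative).
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
out = attention(Q, K, V)
print(out.shape)  # (3, 4)
```

Each output row is a context-dependent mixture of the value vectors, with the mixing weights determined by how strongly that token's query matches every key.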