What is reward prediction error?
Reward prediction error is the prediction error specific to rewards: the difference between the reward that was expected and the reward that was actually received. It is encoded in the firing of midbrain dopamine neurons and serves as a key teaching signal for learning which actions and stimuli lead to positive outcomes.
How it works
Wolfram Schultz’s recordings of midbrain dopamine neurons revealed three response patterns: the neurons fire in response to unexpected rewards (positive prediction error), drop below their baseline firing rate when an expected reward is omitted (negative prediction error), and show no response to fully predicted rewards. This pattern closely matches the error term of the temporal difference (TD) learning algorithm from reinforcement learning, which suggests that the brain implements a form of computational learning theory. Reward prediction errors drive the updating of value representations in the striatum and prefrontal cortex.
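The three response patterns described above can be reproduced with a minimal sketch of the TD error, delta = r + gamma * V(next) - V(current). This is an illustration under simplifying assumptions, not a model of the recorded neurons: a single cue is always followed by a reward of 1.0, the trial ends after the reward (so the next state's value is 0), and the function names, learning rate (alpha) and discount factor (gamma) are hypothetical choices.

```python
def td_error(reward, value_next, value_current, gamma=1.0):
    """Temporal difference error: delta = r + gamma * V(next) - V(current)."""
    return reward + gamma * value_next - value_current


def train_cue_value(n_trials=200, alpha=0.1, reward=1.0):
    """Learn the value of a cue that is always followed by the same reward.

    Assumption: the trial ends after the reward, so V(next) is 0.
    """
    value = 0.0  # a naive learner expects nothing
    for _ in range(n_trials):
        delta = td_error(reward, value_next=0.0, value_current=value)
        value += alpha * delta  # move the expectation toward the outcome
    return value


naive_value = 0.0
learned_value = train_cue_value()

# Unexpected reward: nothing was predicted, so the error is positive.
unexpected_reward = td_error(1.0, 0.0, naive_value)
# Fully predicted reward: the expectation matches, so the error is near zero.
predicted_reward = td_error(1.0, 0.0, learned_value)
# Omitted reward: a reward was predicted but not delivered, so the error is negative.
omitted_reward = td_error(0.0, 0.0, learned_value)

print(unexpected_reward, predicted_reward, omitted_reward)
```

In this sketch the sign of the error plays the role of the dopamine response: a burst for the positive value, no change near zero, and a dip below baseline for the negative value.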
Applied example
When a vending machine dispenses two candy bars instead of one, the unexpected bonus triggers a dopamine burst (positive reward prediction error) that strengthens the association between using that machine and getting a reward. After several double-dispenses, the brain comes to expect two bars, and receiving only one now produces a negative prediction error and disappointment.
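The vending machine scenario can be worked through numerically with a simple error-driven update. The sketch below assumes a starting expectation of one bar, six double-dispenses in a row, and a learning rate of 0.5; these numbers and the helper name update_expectation are illustrative assumptions, not measured quantities.

```python
def update_expectation(expected, received, alpha=0.5):
    """Return the new expectation and the prediction error for one purchase."""
    error = received - expected  # reward prediction error
    return expected + alpha * error, error


expected_bars = 1.0  # assumption: the buyer starts out expecting one bar
errors = []

# Several double-dispenses in a row: each surprise is smaller than the last.
for _ in range(6):
    expected_bars, error = update_expectation(expected_bars, received=2.0)
    errors.append(error)

# The machine now dispenses a single bar again.
_, single_bar_error = update_expectation(expected_bars, received=1.0)

print(errors)            # [1.0, 0.5, 0.25, 0.125, 0.0625, 0.03125]
print(expected_bars)     # 1.984375
print(single_bar_error)  # -0.984375
```

The positive errors shrink as the expectation approaches two bars, and the single bar that was once a perfectly good outcome now yields an error close to -1.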
Why it matters
Reward prediction error is widely regarded as a core mechanism of reinforcement learning in the brain, explaining how organisms learn from experience to predict and pursue rewards.



