What is Reward prediction error in neuroscience?

What is Reward prediction error?

Reward prediction error is the difference between the reward an organism expected and the reward it actually received. It is encoded by midbrain dopamine neurons and serves as the brain’s primary signal for learning which actions and stimuli lead to positive outcomes.

How it works

Wolfram Schultz’s recordings of midbrain dopamine neurons revealed three response patterns: the neurons fire in response to unexpected rewards (positive prediction error), drop below their baseline firing rate when an expected reward is omitted (negative prediction error), and show no change in response to fully predicted rewards. This signal closely matches the error term of the temporal difference (TD) learning algorithm from reinforcement learning, which suggests a neurobiological implementation of computational learning theory. Reward prediction errors drive the updating of value representations in the striatum and prefrontal cortex.
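The three response patterns can be illustrated with the standard TD error, delta = r + gamma * V(next) - V(current). The Python sketch below is only an illustration under simplifying assumptions: the function name td_error, the reward size of 1.0, and a discount factor of 1.0 are choices made for this example, and the reward is treated as the final event of a trial, so the value of the next state is 0.

```python
def td_error(reward, value_next, value_current, gamma=1.0):
    """Temporal difference error: reward + gamma * V(next) - V(current)."""
    return reward + gamma * value_next - value_current


# Each case is evaluated at the moment the reward is (or is not) delivered.
# value_current is how much reward was predicted; value_next is 0 because
# the trial ends after the outcome (an assumption of this sketch).

# Unexpected reward: nothing was predicted, but a reward arrives.
unexpected_reward = td_error(reward=1.0, value_next=0.0, value_current=0.0)

# Omitted reward: a reward was predicted, but nothing arrives.
omitted_reward = td_error(reward=0.0, value_next=0.0, value_current=1.0)

# Fully predicted reward: the reward matches the prediction exactly.
predicted_reward = td_error(reward=1.0, value_next=0.0, value_current=1.0)

print(unexpected_reward, omitted_reward, predicted_reward)
```

Under these assumptions the error is positive for the unexpected reward, negative for the omitted reward, and zero for the fully predicted reward, mirroring the burst, dip, and unchanged firing described above.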

Applied example

When a vending machine dispenses two candy bars instead of one, the unexpected bonus triggers a dopamine burst (positive reward prediction error) that strengthens the association between using that machine and getting a reward. After several double-dispenses, the brain comes to expect two bars, and receiving only one now produces a negative prediction error and disappointment.
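The vending machine scenario can be sketched as a simple error-driven update, in which the expectation moves a fraction of the way toward each outcome. This is a simplified delta-rule illustration, not a model of actual dopamine dynamics; the function name update_expectation, the learning rate of 0.5, and the count of five double-dispenses are all assumptions made for the example.

```python
def update_expectation(expected, received, learning_rate=0.5):
    """Return the new expectation and the prediction error that drove it."""
    error = received - expected  # positive if the outcome beat the expectation
    return expected + learning_rate * error, error


expected = 1.0  # the customer starts out expecting one candy bar
errors = []

# Five visits in a row where the machine dispenses two bars.
for _ in range(5):
    expected, error = update_expectation(expected, received=2.0)
    errors.append(error)

# A later visit where the machine dispenses only one bar.
_, final_error = update_expectation(expected, received=1.0)

print(errors)       # shrinking positive errors as two bars become expected
print(final_error)  # negative error once the expectation has shifted
```

The positive errors shrink with each double-dispense as the expectation approaches two bars, and the single bar that was once the normal outcome now produces a negative error.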

Why it matters

Reward prediction error is the core mechanism of reinforcement learning in the brain, explaining how organisms learn to predict and pursue rewards through experience.

Sources and further reading

Schultz, W., Dayan, P., & Montague, P. R. (1997). A neural substrate of prediction and reward. Science, 275(5306), 1593-1599.
