Learn Behav. 2011 Sep;39(3):245-58. doi: 10.3758/s13420-011-0025-7.

Pigeon and human performance in a multi-armed bandit task in response to changes in variable interval schedules.

Author information

1. Western Carolina University, Cullowhee, NC, USA.

Abstract

The tension between exploitation of the best options and exploration of alternatives is a ubiquitous problem that all organisms face. To examine this trade-off across species, pigeons and people were trained on an eight-armed bandit task in which the options were rewarded on a variable interval (VI) schedule. At regular intervals, each option's VI changed, thus encouraging dynamic increases in exploration in response to these anticipated changes. Both species showed sensitivity to the payoffs that was often well modeled by Luce's (1963) decision rule. For pigeons, exploration of alternative options was driven by experienced changes in the payoff schedules, not the beginning of a new session, even though each session signaled a new schedule. In contrast, people quickly learned to explore in response to signaled changes in the payoffs.
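
As an illustration of the decision rule named in the abstract, the following is a minimal sketch of Luce's ratio rule, in which the probability of choosing an option is its value divided by the sum of all option values. The arm values, the function names, and the uniform fallback when all values are zero are assumptions made for this example. They are not taken from the study, and the paper's fitted model may include parameters not shown here.

```python
import random


def luce_choice_probabilities(values):
    """Luce's ratio rule: P(i) = v_i / sum_j v_j for non-negative values."""
    if any(v < 0 for v in values):
        raise ValueError("values must be non-negative")
    total = sum(values)
    if total == 0:
        # Assumption for this sketch: with no information, choose uniformly.
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def choose_arm(values, rng=random):
    """Sample one arm index according to the Luce probabilities."""
    probs = luce_choice_probabilities(values)
    return rng.choices(range(len(values)), weights=probs, k=1)[0]


# Hypothetical payoff estimates for an eight-armed bandit (not study data).
arm_values = [1.0, 2.0, 1.0, 4.0, 0.0, 0.0, 1.0, 1.0]
probs = luce_choice_probabilities(arm_values)
chosen = choose_arm(arm_values)
```

Under this rule an arm with a value of zero is never chosen, so any exploration of such arms has to come from how the values themselves are updated, for example after an experienced or signaled change in the schedule.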

PMID: 21380732
DOI: 10.3758/s13420-011-0025-7
[Indexed for MEDLINE]
