Learning, attentional control and action video games
Abstract
While humans have an incredible capacity to acquire new skills and alter their behavior as a result of experience, enhancements in performance are typically narrowly restricted to the parameters of the training environment, with little evidence of generalization to different, even seemingly highly related, tasks. Such specificity is a major obstacle for the development of many real-world training or rehabilitation paradigms, which necessarily seek to promote more general learning. In contrast to these typical findings, research over the past decade has shown that training on ‘action video games’ produces learning that transfers well beyond the training task. This has led to substantial interest among those working on rehabilitation (for instance, after stroke or to treat amblyopia) or on training for various precision-demanding jobs (for instance, endoscopic surgery or piloting unmanned aerial drones). Although the predominant focus of the field has been on outlining the breadth of possible action-game-related enhancements, recent work has concentrated on uncovering the mechanisms that underlie these changes, an important first step towards the goal of designing and using video games for more definite purposes. Game playing may not confer an immediate advantage on new tasks (increased performance from the very first trial), but rather the true effect of action video game playing may be to enhance the ability to learn new tasks. Such a mechanism may serve as a signature of training regimens that are likely to produce transfer of learning.
Introduction
Prompted in part by a growing market of aging baby boomers, the past decade has seen a surge of interest in so-called ‘brain training’. Products and paradigms as varied as diet, aerobic exercise, social interactions, pharmacological interventions, meditation, working memory training, and video games have been promoted for their potential ability to enhance memory, speed processing, boost executive functions, and/or augment fluid intelligence [1–8]. For all the excitement, however, there is also much skepticism as to whether such regimens truly produce benefits of sufficient size and scope to noticeably improve the quality of day-to-day existence. For instance, in the case of cognitive training regimens, the tendency of human learning to be highly specific to the exact characteristics of the training environment is a major obstacle that must be overcome before real-world cognitive enhancement can be realized [9].
Here, we take stock of the current scientific understanding of specificity and transfer in learning, in particular the principles that can be drawn from the highly general learning that is produced by experience with action video games, in an attempt to define a path towards learning that is widely applicable. More specifically, we note that rather than attempting to teach individuals an extensive variety of individual skills, it may be more useful to enhance attentional and executive control [10]. By facilitating the identification of task relevant information and the suppression of irrelevant, potentially distracting sources of information, improvements in attentional control could enable individuals to more swiftly adapt to new environments or to more quickly learn new skills. The proposal that action game play fosters such ‘learning to learn’ would naturally account for the wide variety of skills enhanced by action game play.
Within- and between-individual differences in learning
The primary focus of this review is extrinsic factors in learning — in other words, the characteristics that training regimens need to incorporate in order to successfully enhance behavioral performance. However, this is by no means meant to discount the role intrinsic factors play in determining learning outcomes. For instance, it is well known that the ability of a given individual to learn changes dramatically as a function of age [11]. Indeed, the dominant view in the neurosciences through the middle portion of the previous century was of a brain that, while quite plastic in infancy, childhood, and adolescence, became reasonably fixed and inflexible in adulthood and old age. Consistent with this idea was, for example, the finding of sensitive periods, wherein the potential for large-scale learning/plasticity was most present only early in life. This foundational concept, initially pioneered by Hubel and Wiesel in the study of the development of binocular vision [12], has been echoed in domains ranging from tactile perception [13] to language acquisition [14].
But although it is unarguably the case that the malleability of the brain declines with age, recent research shows that, given appropriate training, the adult brain has the capacity for far more substantial plasticity than previously believed [11,15]. For example, the recent successes in recovering function lost following cortical damage, such as strokes [16], and in improving vision in adults with amblyopia [17], are both examples of plasticity in circumstances where training was previously believed to be futile. These, and many other similar findings, offer substantial hope for the development of learning paradigms across the lifespan.
In addition to within-individual changes in plasticity, such as occur with normal aging, between-individual differences in learning ability have been noted for millennia. As early as the Zhou Dynasty in China, 2500 years ago, Confucius perhaps first outlined the principle of what would today be called ‘differentiated instruction’, essentially the idea that effective teaching needs to be tailored to the abilities of individuals, which can vary substantially [18]. In the late 19th century, Sir Francis Galton (who coined the term ‘eugenics’) suggested that individual differences in life outcome could be attributed to heritable differences in the “adequate power of doing a great deal of very laborious work” (quoted in [19]). This belief presaged work in the field of expertise starting in the 1960s, where the capacity for ‘deliberate practice’ (choosing to train on effortful activities, such as playing chess or the piano) was identified as the best predictor of learning and eventual expertise [19].
Thus, it is worth considering that the exact implementation needed to realize the principles of effective training regimens may differ greatly depending on the precise circumstances. Nevertheless, we believe the overall principles that will be discussed in the remainder of the article are generally applicable regardless of age, individual aptitude, or most other individual differences.
The ‘curse’ of learning specificity
The realization that even the least plastic humans are capable of acquiring new skills (or reacquiring lost skills), coupled with a rapidly aging population, has led to a growing interest in ‘brain fitness’, with the most popular approach being the ‘mini-task’ approach, wherein the individual undergoes repeated practice on a relatively small set of tasks. Often these tasks are, at their base, reasonably standard experimental paradigms (for example, task-switching, multiple-object tracking, the Tower of Hanoi, Raven’s Progressive Matrices) that have been ‘dressed up’ with visual and sound effects to appear less sterile (often satirically dubbed the ‘chocolate-covered broccoli’ approach). But although there is no question that given sufficient time on task, individuals will show substantial improvements on the trained task(s), it is less clear that such training will benefit untrained tasks, such as those that might be encountered outside the lab during normal day-to-day life. Yet, it is these everyday activities that should, after all, be the true goal of such regimens.
For example, one may hypothesize that training on a ‘gamified’ version of the Tower of Hanoi task — made to look more like a video game and less like a psychological experiment — would promote an increase in high-level planning or spatial working memory, which would in turn lead to enhanced performance on many tasks that require high-level planning and spatial working memory ability both within and outside of the lab. Yet, such training could also simply teach an individual the specific rules and strategies necessary to solve the Tower of Hanoi task, a result that would not benefit any tasks other than the Tower of Hanoi task itself.
Indeed, one striking feature of much human learning is its specificity to the exact learning task and context. And while such specificity has been seen in every sub-field of psychology that focuses on learning — for example, motor learning [20], expertise [21], memory [22], education [23] — it has been perhaps most thoroughly demonstrated in the sub-field of perceptual learning, where the learning is often specific to seemingly exceedingly low-level attributes of the task environment (Figure 1). For example, learning of visual tasks is often specific to retinal position [24]. Individuals trained to determine the orientation of a target that always appears in one quadrant of the visual field will show clear evidence of learning — improvements in performance over time that persist over the course of at least days, and typically months to years. But when the target position is moved to an untrained quadrant, performance returns to baseline levels and the subject must learn this new position ‘from scratch’. Similarly, individuals trained to discriminate whether a field of moving dots has a global direction of motion of 3° clockwise versus 3° counterclockwise from straight up show no benefits of this initial learning when tested on the same task, but around a different reference angle, such as 3° clockwise or counterclockwise from straight down [25]. Comparably specific learning has been documented for myriad low-level features including spatial frequency, line/grating orientation, background texture, and even which eye was trained [26–28].
Learning is highly specific to the training conditions.
(A) Subjects trained to discriminate motion direction around one reference angle (45°) improve substantially over the course of training (improvement shown in red). However, when tested on the same task, but the opposite direction (225°), no benefit of training is observed (transfer, or lack thereof, illustrated in blue). Adapted from [25]. (B) Subjects trained to produce one nonsense string while the mouth is physically perturbed show substantial learning (red bar). However, the learning does not transfer to a task wherein the same perturbation is used, but a new nonsense string must be produced (blue bar). Adapted from [90]. (C) Expert baseball players show substantial reaction-time advantages over non-experts in a task with similar processing demands as those that occur while batting in baseball (Go/NoGo = Swing/Don’t Swing; red bar). However, no such advantage is observed in a simple reaction-time task, which has no such baseball equivalent (blue bar). Adapted from [91]. (D) Expert chess players show a substantial advantage over non-experts in the ability to recall the position of chess pieces only when the pieces are in real-game typical positions (red bar). When pieces are positioned randomly, no advantage is observed (blue bar). Adapted from [21].
For those whose interest in the development of training regimens is practical in nature it is obvious why this tendency toward specificity is a problem that needs to be overcome. After all, it is of little use to improve the ability to detect a triangle target in peripheral vision if this training does not increase the ability to detect an approaching car in an intersection. Similarly, improving one’s ability to remember a sequence of digits when presented in the lab has little use if it does not improve one’s ability to remember a phone number at home.
To generalize or not to generalize?
Although specificity is obviously a ‘curse’ if the goal is, for instance, rehabilitation after stroke, it is perhaps more useful to think of specificity and generalization as being two ends of a continuum, either of which can theoretically be the ‘ideal’ learning solution depending on the training conditions. The available literature points to two factors in particular that appear to be key in pushing learning toward one extreme or the other: the number of training trials experienced, and the amount of variability in the training set.
Concerning the first factor, several authors have suggested a possible distinction between an early phase of learning (when relatively few trials have been experienced), which is reasonably fast and tends to generalize well across contexts, and a later phase of learning (when the number of trials experienced increases into the several hundreds or thousands), which is slow and much more specific to the exact characteristics of the task [29]. This trend is also echoed in much of the expertise literature, where increases in automaticity (which is necessarily specific) are only seen with substantial experience [30].
As for the second factor, the fact that more general learning is produced by greater variety in task/stimuli has been noted for over a century [31] and has been demonstrated in a multitude of domains [32]. As just one example, Catalano and Kleiner [33] manipulated the number of distinct timings subjects experienced in a coincident timing task. In this task subjects were seated in front of a row of lights, which lit up one at a time starting with the most distant from the subject and moving toward the closest to the subject. The subject’s goal was to press a button at the exact time that the final light turned on. One group of subjects was trained on a single inter-light timing (a constant speed), while another group of subjects was trained on a variable set of timings. Then, following training, both groups were tested on timings neither group had previously experienced. While the single-timing group showed the greatest amount of learning during training (consistent with the view that the most effective training for a single task is repeated experience with that one task), the multiple-timing group far outperformed the single-timing group in the transfer conditions. In short, there is no free lunch. If the goal is to train a very specific skill that needs to be executed repeatedly and flawlessly, then the appropriate training regimen should include substantial numbers of trials of that very task. Conversely, if the goal is to produce performance that is less skilled on any one individual task, but more applicable to a wide range of tasks, then the appropriate training regimen should include fewer trials of experience on any one task, and a much broader variety of stimuli/tasks [32].
Hierarchical learning
Variety is an essential characteristic of training regimens that lead to more general learning. The precise role variety plays in learning can be easily captured in a theoretical framework that recognizes that tasks can be organized hierarchically — or in other words, that tasks can share component processes [34]. Thus, variety, by allowing the opportunity to experience many tasks with such shared component processes, will foster the ability to learn at this ‘meta’ level.
As a toy example, imagine a training regimen that consisted of one hour of making peanut butter and jelly sandwiches, one hour of making ham and cheese sandwiches with mayonnaise, and one hour of making Nutella and banana sandwiches. While each different sandwich type represents a distinct ‘task’, there are obviously shared components amongst the tasks (for example, taking a semi-solid product from a jar and spreading it on bread). Learning this shared component will benefit tasks that also involve it, such as making a marshmallow fluff sandwich, but not tasks that do not, such as pouring cereal.
The broad framework that tasks are inherently hierarchical in nature, and generalization results from learning at meta-levels, encompasses a wide swath of more specific theories. For instance, Thorndike’s theory of identical elements [31] states that some tasks involve identical processing components. The more identical processing elements two tasks share, the more learning on one will benefit the other — for example, large transfer from learning to drive a car to learning to drive a truck, but less transfer to learning to drive a boat. Similarly, both motor schema theory [20] and Harlow’s theory of ‘learning set formation’ [35] emphasize that seemingly different tasks may nonetheless have similar rules at their roots.
For example, in Harlow’s work, monkeys were given a series of different learning tasks to solve [35]; on each trial, the animal had to choose which of two food wells (covered by different contextual objects) to look in for a food reward which, within a block of trials, was always hidden under the same contextual object. Thus, when a new block of trials began, complete with new contextual objects, there was no way for the animal to know which object to search under. Nevertheless, it was also the case that a single rule of ‘win-stay, lose-switch’ would always result in a food reward on the second and all subsequent trials; that is, if the chosen object resulted in food on the previous trial, it should be picked again, and if the chosen object did not result in food, the other object should be chosen on the next trial. Through experience with many blocks, the animals eventually learned this rule and in doing so greatly improved their performance on all ensuing transfer blocks (Figure 2; see also work on transfer of metacognitive skills – [36]). In the domain of perceptual learning, the ‘double-training’ or ‘training-plus-exposure’ procedures pioneered by Yu and colleagues [37,38] could also be construed to fall under the umbrella of hierarchical learning. In these experiments, exposure to multiple tasks/stimuli results in a degree of transfer not observed when subjects are trained on the tasks in isolation.
Learning to learn in Harlow’s learning-set task.
Although experience with many tasks with the same underlying structure did not result in enhanced performance on trial number one (indeed, it cannot: the best the animal can do on the first trial is guess), it did result in a substantial increase in the rate at which the new tasks were learned. Adapted from [35].
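The logic of the ‘win-stay, lose-switch’ rule can be made concrete with a small simulation. The sketch below is only an illustration under simplifying assumptions: it models an idealized learner that already applies the rule perfectly (or not at all), and the function names, block counts, and two-object coding are hypothetical rather than taken from [35]. It shows why such a rule cannot help on trial one but guarantees a reward from trial two onward.

```python
import random


def run_block(n_trials, rng, use_rule=True):
    """Simulate one block of a two-object problem; return a list of wins per trial."""
    rewarded = rng.choice([0, 1])  # object hiding the food for the whole block
    choice = rng.choice([0, 1])    # the first choice is necessarily a guess
    outcomes = []
    for _ in range(n_trials):
        won = choice == rewarded
        outcomes.append(won)
        if use_rule:
            # Win-stay, lose-switch: repeat a rewarded choice, otherwise switch.
            choice = choice if won else 1 - choice
        else:
            # Baseline learner with no rule: guess again on every trial.
            choice = rng.choice([0, 1])
    return outcomes


def accuracy_by_trial(n_blocks, n_trials, use_rule, seed=0):
    """Average proportion correct at each trial position across many blocks."""
    rng = random.Random(seed)
    totals = [0] * n_trials
    for _ in range(n_blocks):
        for t, won in enumerate(run_block(n_trials, rng, use_rule)):
            totals[t] += won
    return [total / n_blocks for total in totals]


if __name__ == "__main__":
    print("with rule:   ", accuracy_by_trial(2000, 6, use_rule=True))
    print("without rule:", accuracy_by_trial(2000, 6, use_rule=False))
```

With the rule, accuracy sits near chance on the first trial and is perfect on every later trial; without it, accuracy stays near chance throughout. Real animals approach the rule gradually over many blocks, which this idealized sketch does not attempt to model.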
Finally, analogous ideas have been explored in the educational psychology domain. For instance, although learning in school typically focuses on content (for example, the names of the state capitals or the number of neutrons, protons, and electrons in an atom of carbon), Binet [39], at the turn of the 20th century, noted that many skills underlie the ability to successfully acquire content. One such skill, required by nearly all school learning, was the ability to remain still, quiet, and focused. He thus invented many games to play with his young students that would each teach this higher-level ability (for example, they would play ‘statue’ where, when given a command, all the students had to freeze in their current position until a second command was given). Only when these basic skills had been developed did he attempt to teach content – a process that was accomplished far more efficiently than would have been the case without such training.
This hierarchical view thus calls for the identification of task components that are most commonly shared across many different tasks. We are certainly not the first to suggest that the ability to flexibly and effortlessly allocate attentional and executive resources is key in our ability to manage almost any challenging task [6,40]. Such abilities foster efficient filtering of noise and enhancement of signal, which in turn underlie performance improvements in essentially all perceptual tasks [10,41,42]. The automatization of resource allocation also holds the potential to free executive systems for more complex problems including discovering the underlying structure of a task. Indeed, one cannot excel at a task without having developed proper representations to handle the task itself.
The importance of developing meaningful units that can serve as pointers for top-down attention, and thus guide learning, is highlighted by studies on multi-stimulus learning (or roving) in the perceptual learning literature. The mixing of multiple stimuli can hinder learning as the dimensions along which learning should occur constantly vary. Yet proper spatio-temporal patterning can rescue learning by providing the necessary cues to identify the units over which learning should occur [43]. While the process of identifying the correct learning space is often reduced to template matching in studies of perception, it can take more complex forms when it comes to learning rich hierarchical structures as children do during many aspects of their conceptual development [44]. A direct consequence of this realization has been the search for training regimens that result in the automatization of resource allocation. In children and older adults where executive skills are somewhat weak, several promising alternatives have emerged including playing a musical instrument [45] or computer-based brain training [40,46]. Here, we shall evaluate the possibility that playing action-packed, first-person shooter video games may augment attentional control and executive functioning in young adults at the prime of their capacity [47].
Action video game play enhances attentional control
Over the past two decades, myriad reports have documented the beneficial effects of playing video games [7,48,49]. While the earliest work in this field did not explicitly distinguish between genres of games [50], more recent work has identified one particular genre (so-called ‘action’ video games) as containing games that promote the broadest benefits to perceptual and attentional abilities. Games in this genre are distinguished from those in other genres (such as strategy or role-playing) by their speed (both in having transient objects that quickly pop into and out of the visual field and in the velocity of moving objects), by high perceptual, cognitive, and motor loads (for example, multiple characters to monitor simultaneously, and many possible motor plans to keep active before making a selection), and by an emphasis on peripheral visual field processing and divided attention (items of interest often first appear at the edges of the screen at the same time as events are occurring at the center of the screen). Furthermore, these games require players to constantly make predictions regarding future game events both spatially (“Where is an enemy most likely to appear?”) and temporally (“When is an enemy most likely to appear?”). The latter occurs at many different time scales, from the millisecond range when considering enemy appearance, to minutes for knowledge of the lay of the land, to hours or days for meta parameters such as achieving the goal of a particular game level. Finally, as the game unfolds, players constantly receive feedback as to the accuracy of their predictions, a key step in engaging the reward system and thus producing learning [51–53]. A distinguishing feature of these games is the layering of events/actions at many different time scales, resulting in a rather complex pattern of reward in time. This feature may explain, in part, why action video game players seem to maximize reward rate in a variety of tasks [54].
Although this complex reward schedule likely has an important role to play in conjunction with changes in attention, when considering learning to learn in this review, we will focus primarily on the well-documented changes in attentional control.
Indeed, there is certainly considerable evidence in the literature that action game play enhances various aspects of attention, including selective attention over space and time and attention to objects [54–71]. Although we separate these elements in the following section, this is primarily in the service of organization, rather than being representative of true differences in mechanism or outcome. As we will see, the effects appear to be in the control of top-down attention, independent of the exact instantiation.
Selective attention in space
The ability to focus attention on a target and ignore distracting information is the essence of selective attention. Correspondingly, action game play results in more effective localization of a target whether presented in isolation, as in Goldmann perimetry [67], or amongst distracting, irrelevant information, as in the Useful Field of View [55,56,59,60] or in standard visual search [72] (Figure 3). Crowding thresholds, the minimum distance between a target and a distractor wherein the target can still be individuated and identified, are often thought to be indicative of the spatial resolution of visual attention. These thresholds are also reduced as a result of action game play, an effect that occurs both within and outside of the trained portion of the visual field, thus indicating that the effect is not retinotopically specific [57]. Furthermore, the benefits to spatial attention are not limited to quickly presented static displays. West and colleagues [64] utilized a dynamic display wherein subjects were presented with many moving ‘swimmer’ stimuli at two levels of perceptual load, either 15 or 30 moving circles with oscillating line arms, in a wide field of view (a circular field with a 30° radius). Subjects monitored this display for between 1.5 and 3.5 seconds for the onset of a target stimulus, wherein one of the swimmers stopped moving and increased the oscillation of its arms. Video-game players far outperformed non-gamers at both levels of perceptual load and at all possible target eccentricities (from 10° to 30°).
Improved selective spatial attention after action game play.
(A) Several versions of the Useful Field of View Task (different timings, masks, targets, and so on) have been employed to test changes in selective spatial attention that arise due to action video game experience [92]. (B) Across all task versions, avid action game players (blue) demonstrate enhanced performance as compared to non-action game players (green). (C) A causal link between playing action video games and enhanced performance on the Useful Field of View task has been assessed in a number of training studies. Training non-action game players on action games leads to an increase in Useful Field of View performance (blue bars highlight performance before and after action training), while training on non-action games leads to lesser, or no such improvement (green bars highlight performance before and after control training). Adapted from [55,56,59,60,73].
Selective attention in time
Action game play also enhances the ability to select relevant information over time. When viewing a rapid 10 Hz stream of visually presented letters, wherein all the letters are black except one letter that is white, participants can typically easily identify the white letter. However, doing so creates a momentary blink in attention, leading them to be unaware of the next few items following the white letter. This effect, termed the attentional blink, is believed to measure a fundamental bottleneck in the dynamics of attentional allocation. Action game training significantly reduces the magnitude of this blink, with some expert gamers failing to show a blink at all [56,73,74]. Also consistent with an enhancement of attention in time is the finding that action game training reduces the negative impact of backward masking [70] and the report that action gamers perceive the timing of visual events more veridically than nongamers [63].
Selective attention to objects
A third aspect of attention documented to change for the better after action game play is attention to objects [56,60]. In the multiple object tracking task, action game players can successfully track more independently moving objects than non-video-game players, and can track the same number of objects at faster rates [58,68,73,75]. This skill requires efficient allocation of attentional resources as objects move and bounce off one another in the display.
Toward more efficient attentional control
The proposal that action game play enhances top-down aspects of attention by allowing gamers to more flexibly allocate their resources is supported by several independent lines of evidence. For example, it has been shown that action video game players better overcome attentional capture. Chisholm et al. [71] compared the performance of gamers and non-gamers on a target search task, manipulating whether a singleton distractor, known to automatically capture attention, was or was not present. Although the singleton distractor captured attention in both groups, it did so to a much lesser extent in action gamers than in non-gamers, suggesting that gamers may better employ executive strategies to reduce the effects of distraction.
Recently, Mishra et al. [65] made use of the steady-state visual evoked potentials technique to understand the neural bases of the attentional enhancement noted in action gamers. They found that action gamers more efficiently suppress unattended, potentially distracting information. Participants viewed four different streams of rapidly flashed alphanumeric characters. Each stream flashed at a distinct temporal frequency allowing retrieval of the brain signals evoked by each stream independently at all times. Thus, not only could the brain activation evoked by the attended stream be retrieved, but also those evoked by each of the unattended and potentially distracting streams. Action gamers suppressed irrelevant streams to a greater extent than non-gamers and the extent of the suppression predicted the speed of response, thus supporting the view that action video game playing sharpens attentional skills by allowing players to better focus on the task at hand by ignoring other sources of potentially distracting information.
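The reason distinct flicker frequencies allow each stream's response to be recovered separately can be illustrated with a toy frequency-tagging example. This is a minimal sketch on synthetic, noise-free data, not the analysis pipeline of [65]: the sampling rate, the tag frequencies, and the amplitudes are all hypothetical values chosen so that each frequency completes a whole number of cycles in the analysis window.

```python
import cmath
import math

FS = 1000         # sampling rate in Hz (assumed for illustration)
DURATION_S = 1.0  # analysis window in seconds

# Hypothetical streams: name -> (tag frequency in Hz, response amplitude).
# The amplitudes are arbitrary illustrative numbers, not measured data.
STREAMS = {
    "attended": (8, 1.0),
    "unattended_1": (12, 0.5),
    "unattended_2": (15, 0.4),
    "unattended_3": (20, 0.3),
}


def synthesize(streams, fs=FS, duration_s=DURATION_S):
    """Sum one sinusoid per stream into a single mixed recording."""
    n_samples = int(fs * duration_s)
    return [
        sum(amp * math.sin(2 * math.pi * freq * t / fs)
            for freq, amp in streams.values())
        for t in range(n_samples)
    ]


def amplitude_at(signal, freq_hz, fs=FS):
    """Estimate the amplitude at one frequency by projecting the signal
    onto a complex exponential (a single-bin discrete Fourier transform)."""
    n_samples = len(signal)
    projection = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * t / fs)
        for t, x in enumerate(signal)
    )
    return 2 * abs(projection) / n_samples


if __name__ == "__main__":
    mixed = synthesize(STREAMS)
    for name, (freq, _) in STREAMS.items():
        print(f"{name}: {amplitude_at(mixed, freq):.3f} at {freq} Hz")
```

Because the four sinusoids are orthogonal over the window, reading out the mixed signal at each tag frequency returns that stream's amplitude alone, even though all four are superimposed. Real recordings add noise and require averaging over many trials, which this sketch omits.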
Sustained attention and impulsivity
Selective visual attention is not the only aspect of attention that changes for the better. There is some evidence that sustained attention also benefits from action video game play. Using the ‘Test of Variables of Attention’, a computerized test often used in the screening of attention deficit disorder, Dye et al. [61] found that gamers responded faster and made no more mistakes than non-gamers. Briefly, this test requires participants to respond as fast as possible to shapes appearing at the target location, while ignoring the same shapes if they appear at another location. By manipulating the frequency of appearance at the target location, the Test of Variables of Attention offers a measure of both impulsivity (is the observer able to withhold a response to a non-target when most of the stimuli are targets?) and a measure of sustained attention (is the observer able to stay on task and respond quickly to a target when most of the stimuli are non-targets?). In all cases, gamers were faster but no less accurate, indicating if anything enhanced performance on these aspects of attention as compared to non-gamers.
It may be worth noting that gamer responses were often so fast that the built-in data analysis software of the Test of Variables of Attention considered many of their reaction times to be anticipatory (200 ms or less; note, the fact that nearly 100% of these trials were correct responses indicates that they were not in fact ‘anticipatory’). In summary, gamers are faster but not more impulsive than non-gamers and equally capable of sustaining their attention. Although correlational studies indicate a link between technology use and attention-deficit disorder [76], in the case of playing action games it appears that, to the contrary, this activity actually sharpens attention.
Not all aspects of attention are altered
Action games are full of abrupt onsets of highly salient visual objects, which are typically also very behaviorally relevant (for example, an enemy that springs out of a door). Thus, it seems reasonable to hypothesize that playing this type of game would enhance exogenous attention. However, the available literature suggests this is not the case. Three separate studies [62,72,77] have now used classic Posner cueing techniques [78] to examine whether game players differ in the extent to which exogenous cues pull attention, and no effects have been observed (however, see [64] for a positive result using a different measure of exogenous attention).
Summary of effects of action video game experience
The overall literature appears clear in that the positive effects of action game play are greatest on tasks where performance is limited by top-down attention or the processes that control and regulate attentional allocation and resource management. The extent to which executive functions are also changed for the better remains to be further explored, although enhancements have been observed in several specific constructs that fall under the broader category of executive function. These include the enhancements in selective attention discussed previously as well as enhancements in task-switching, multi-tasking, and visual short-term memory tasks [59,79–81].
Learning to learn as the goal of general learning
We conclude by considering the role of enhanced attentional control in explaining the observed differences in behavior noted as a result of action video game play. While some viewpoints may assume that enhanced attention is the proximal ‘cause’ of the superior performance (in other words, an end in and of itself), we have recently considered the possibility that enhanced attention is instead a means to an end, with that end being better probabilistic inference [54]. As an example, take a standard classification task we face every day: letter recognition. Any given letter can take many forms given variations in handwriting and the many computer fonts available; yet we can reliably infer whether a given letter is an ‘a’ or a ‘b’. Converging evidence suggests that our nervous system addresses such computational challenges by calculating the probability that an individual letter is a member of category ‘A’ or category ‘B’, given the evidence that is available (the image that is presented). By Bayes’ rule, this value is proportional to the probability of the evidence given the category (also known as the statistics of the evidence, or likelihood), weighted by the prior probability of the category. Importantly, these statistics cannot be known in the absence of experience. In the case of letters, for example, we have learned over many years of experience to perform such statistical inferences over Roman alphabet letters, but may find ourselves at a loss with an Arabic font. In the laboratory setting, stimuli and task demands are often entirely novel and quite unlike stimuli we experience in everyday life (such as low contrast Gabor functions or random dot kinematograms). There is simply no way for the nervous system to know what the evidence will look like for one answer versus the other without first experiencing examples associated with each answer. What is needed then for performance to improve, and what would be the effect of enhanced attention in such tasks?
Enhancements in spatial or temporal attention will allow for better evidence to be available to the system. However, this does not change the fact that on the first trial, the subject cannot make a reasonable inference as to what that evidence indicates. Instead, the effect of enhanced attention will manifest itself on each subsequent trial: having more or better evidence will lead to more informed choices on the individual trial and will also allow more accurate knowledge of the statistics of the evidence to be accrued, which in turn leads to more informed choices [54]. Behaviorally, an individual with enhanced attentional capabilities will learn to perform new tasks at a faster rate than an individual without such capabilities; in other words, they will have ‘learned to learn’.
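As a rough illustration of this argument, the toy simulation below compares two observers who must learn a novel two-category task from feedback. It is a sketch under simplifying assumptions and is not the model presented in [54]: the evidence is a single noisy number, ‘enhanced attention’ is modeled simply as lower sensory noise, the category statistics are summarized by running means, and all names and parameter values (`simulate_learner`, `noise_sd`, the category means of -1 and +1) are hypothetical.

```python
import random

def simulate_learner(noise_sd, n_trials=60, n_runs=2000, seed=0):
    """Return the proportion of correct responses on each trial.

    The observer starts with no knowledge of the evidence statistics and
    estimates the mean evidence for each category from feedback.
    """
    rng = random.Random(seed)
    correct = [0] * n_trials
    for _ in range(n_runs):
        sums = {"A": 0.0, "B": 0.0}   # summed evidence seen per category
        counts = {"A": 0, "B": 0}     # number of examples seen per category
        for t in range(n_trials):
            category = rng.choice("AB")
            true_mean = -1.0 if category == "A" else 1.0
            evidence = rng.gauss(true_mean, noise_sd)
            if counts["A"] == 0 or counts["B"] == 0:
                # No learned statistics for at least one category: guess.
                response = rng.choice("AB")
            else:
                mean_a = sums["A"] / counts["A"]
                mean_b = sums["B"] / counts["B"]
                # With equal priors and equal-variance Gaussian likelihoods,
                # the more probable category is the one whose estimated
                # mean lies closer to the evidence.
                closer_to_a = abs(evidence - mean_a) < abs(evidence - mean_b)
                response = "A" if closer_to_a else "B"
            correct[t] += response == category
            # Feedback reveals the category, so its statistics are updated.
            sums[category] += evidence
            counts[category] += 1
    return [c / n_runs for c in correct]

# Hypothetical observers: lower noise stands in for enhanced attention.
sharp_acc = simulate_learner(noise_sd=1.0)
noisy_acc = simulate_learner(noise_sd=2.0)

for t in (0, 4, 19, 59):
    print(f"trial {t + 1:>2}: sharp {sharp_acc[t]:.2f}, noisy {noisy_acc[t]:.2f}")
```

In this sketch both observers are at chance on the first trial, since neither has any knowledge of the evidence statistics. The observer with less noisy evidence pulls ahead only as experience with the task accumulates.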
We are now examining the possibility that the enhanced performance noted as a result of action video game play is in fact the result of enhanced ‘learning to learn’. Rather than enhancements in performance on trial one, this view holds that reasonably equivalent performance between groups should be seen early on when performing a new task, with the action gaming advantage appearing and then increasing through experience with the task. This account is attractive, as it would naturally explain the breadth of tasks shown to benefit from action game training.
In sum, action video game play is an appealing tool to probe the limits of plastic changes in perception, attention and cognition, opening new windows on how to foster learning and brain plasticity across a wide variety of tasks and domains. We have focused here on how attentional control could foster learning to learn, but we recognize that other factors in the video game experience are highly likely to facilitate such an outcome. First, video games incorporate many characteristics of good pedagogy beyond those discussed above, including the ratio of massed versus distributed practice, personalized difficulty levels, just-right increment steps during learning, and fun and engagement (see [8,82]). Second, as previously noted, action games include extremely effective reinforcement and reward scheduling, which can be critical for efficient learning [52,83].
However, we note that factors like fun and engagement are also present in other, less effective games. For example, during our training studies, participants required to play action games report the same level of engagement as those asked to play our control games, as measured by the Flow Questionnaire [84,85]. Yet, action trainees improve more. Therefore, a key consideration for future work will be to continue characterizing game play factors, or likely combinations thereof, that are not only necessary but also sufficient for fostering learning to learn.
Potential practical applications include the development of more efficient rehabilitation regimens or more engaging educational software. Yet, this is not to say that action video game play is expected to change performance for the better in all domains. There is now strong evidence that action game play fosters performance in tasks where the statistics need to be derived from the environment. However, it remains an open question whether higher cognitive tasks which require similar statistical inferences but over memory representations (e.g. problem solving) will benefit.
Training study design.
Individuals who report playing few or no video games (both males and females) are recruited and pre-tested on measures of interest. The pre-test measures are specifically designed to minimize task-specific learning (for example, small numbers of trials, no feedback). Following pre-test, participants are randomly assigned to play either an action game or a non-action, control game. The games are matched as closely as possible for as many aspects of game play as possible (identification with character, fun, ‘flow’, and so on) while leaving attentional and action demands different. Subjects come to the lab to play the game one to two hours a day (maximum of 10 hours a week) for anywhere from 10 to 50 hours depending on the study. Once the training is completed (and at least 24 hours after the last training session ends, to ensure that any observed effects are not due to transient changes in physiology/arousal), subjects complete tasks similar to those used at pre-test. A causal role of action game playing is indicated by a larger change from pre- to post-test in the action-trained group than in the non-action-trained group.
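The comparison at the heart of this design, a larger pre- to post-test change in one group than in the other, can be sketched as a simple difference of mean gains. The scores below are invented purely for illustration and do not come from any study. The names `scores`, `mean_gain`, and `training_effect` are likewise hypothetical, and an actual analysis would test the group by session interaction statistically rather than only compare means.

```python
from statistics import mean

# Invented scores on some measure of interest (higher is better).
# Each position in a list is one participant, tested before and after training.
scores = {
    "action":  {"pre": [52, 48, 55, 50], "post": [63, 58, 66, 61]},
    "control": {"pre": [51, 49, 54, 50], "post": [54, 51, 57, 52]},
}

def mean_gain(group):
    """Average within-participant change from pre-test to post-test."""
    return mean(post - pre for pre, post in zip(group["pre"], group["post"]))

action_gain = mean_gain(scores["action"])
control_gain = mean_gain(scores["control"])

# The control group's gain estimates what re-testing alone produces.
# The training effect is whatever the action group gains beyond that.
training_effect = action_gain - control_gain

print(f"action gain:     {action_gain:.2f}")
print(f"control gain:    {control_gain:.2f}")
print(f"training effect: {training_effect:.2f}")
```

Subtracting the control group's gain is what separates an effect of the action game itself from improvements that come simply from taking the test a second time.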
Acknowledgements
We thank Ted Jacques for help with figure preparation. This work was supported in part by the McDonnell Foundation “Critical Period Revisited Network”, as well as by National Institutes of Health grant EY016880 and Office of Naval Research MURI grant N00014-07-1-0937 to Daphne Bavelier.




