Learn Mem. 2002 May; 9(3): 138–150.
PMCID: PMC182592

The Time Course of Neural Changes Underlying Auditory Perceptual Learning


Improvement in perception takes place within the training session and from one session to the next. The present study aims at determining the time course of perceptual learning as revealed by changes in auditory event-related potentials (ERPs) reflecting preattentive processes. Subjects were trained to discriminate two complex auditory patterns in a single session. ERPs were recorded just before and after training, while subjects read a book and ignored stimulation. ERPs showed a negative wave called mismatch negativity (MMN)—which indexes automatic detection of a change in a homogeneous auditory sequence—just after subjects learned to consciously discriminate the two patterns. ERPs were recorded again 12, 24, 36, and 48 h later, just before testing performance on the discrimination task. Additional behavioral and neurophysiological changes were found several hours after the training session: an enhanced P2 at 24 h followed by shorter reaction times, and an enhanced MMN at 36 h. These results indicate that gains in performance on the discrimination of two complex auditory patterns are accompanied by different learning-dependent neurophysiological events evolving within different time frames, supporting the hypothesis that fast and slow neural changes underlie the acquisition of improved perception.

Perceptual learning is defined as gains in performance on perceptual tasks as a result of experience-dependent changes in the properties of neurons and in the functional organization of the cerebral cortex. Improvement in performance has been reported not only during the training period after presentation of several trials (Poggio et al. 1992), but also after several hours (Karni and Sagi 1993) or several days (Schoups et al. 1995; Schoups and Orban 1996). These differences in the time course over which perceptual learning takes place can be explained by neural changes evolving within different temporal windows. Thus, the improvement in perceptual sensitivity within the training session is assumed to occur as a result of fast neural changes, probably reflected by rapid receptive field modulation of cortical neurons (Gilbert 1994; Kapadia et al. 1994). In contrast, in the period between sessions, slower neural changes develop in the absence of stimulation. These slower changes are thought to be the result of reorganization of cortical representations. Consistent with this hypothesis, physiological studies have found slow behavioral improvements to correlate with neuronal changes in the somatosensory (Recanzone et al. 1992a,d), auditory (Recanzone et al. 1993), and visual cortex (Zohary et al. 1994) of monkeys. The slow neural changes have been proposed to be responsible for the consolidation of information in long-term memory (see Galván and Weinberger 2002), and might account for the behavioral improvements several hours after training, as well as for their maintenance for long time periods (Karni and Sagi 1993). Nevertheless, the time course of these neural and behavioral changes in perceptual sensitivity has not yet been established.

Neurophysiological changes underlying behavioral improvements in performance on perceptual tasks have been studied in humans by using event-related potentials (ERPs). ERPs provide information about the temporal relationship between the presentation of a stimulus and the corresponding cortical response. The specific temporal relationship among certain ERP components (e.g., mismatch negativity [MMN], N2, and P3b) has led to the hypothesis that they index different but dependent neural events (e.g., the automatic change-detection process reflected by MMN and the target-identification processes reflected by N2 and P3b) that are equivalent to distinct stages of information processing (Novak et al. 1992). Unlike behavioral measures, ERPs can be used as objective indices of different preattentive processes. The automatic detection of a change occurring in a homogeneous sequence, determined by single or abstract characteristics, is reflected by a negative ERP component named MMN (Näätänen 1990). The MMN has been used as an electrophysiological index of short- and long-term learning-dependent changes in the central auditory system associated with the automatic discrimination of single tones (Menning et al. 2000), complex auditory patterns (Näätänen et al. 1993; Koelsch et al. 1999; Tervaniemi et al. 2001), and speech sounds (Kraus et al. 1995), as well as with the learning of the native language in infancy (Cheour et al. 1998) or a foreign language in adulthood (Winkler et al. 1999). Thus, in one of these studies (Näätänen et al. 1993), the MMN elicited by changes in complex auditory patterns became larger as performance on a discrimination task within a single session improved. This progressive increase of the MMN amplitude in parallel with improvement in performance is thought to index fast neural changes evolving within the training session.

The time course of slow neurophysiological and behavioral changes associated with perceptual learning has also been evaluated using ERPs (Tremblay et al. 1998). In this study, subjects were trained to discriminate two speech sounds differing in voice onset time (/ba/ versus /mba/) on different days. Performance and ERPs were tested on days after each training session. Behavioral and neurophysiological changes showed different time courses. Specifically, neurophysiological changes were observed to precede behavioral improvement. Nevertheless, it was difficult to dissociate, from these results, which neurophysiological changes were the consequence of fast learning during the training session and which were the result of slow changes between sessions. The main goal of the present study is to determine the time course of learning-related fast (within session) and slow (between sessions) neural changes by using variations in the ERPs as an electrophysiological measure.

The fast and slow neural changes underlying the acquisition and consolidation of perceptual learning, respectively, are highly dependent on attention at the time of encoding (Näätänen et al. 1993; Tervaniemi et al. 2001). In fact, if subjects attend to the stimuli but not to the task-relevant aspect of the stimuli, no improvement is found (Shiu and Pashler 1992; Ahissar and Hochstein 1993, 2000; Harris and Fahle 1998). In contrast, ERP studies indicate that attention is not always necessary at the time of retrieval (Näätänen et al. 1993; Tremblay et al. 1997, 1998, 2001; Tervaniemi et al. 2001). Therefore, the improvement in perceptual sensitivity can be evaluated with or without attention, because the training seems to affect the cortical memory traces used by the automatic neural mechanisms of information processing. In the present study, improvements in the preattentive and attentive discrimination of two complex auditory patterns were measured within the training session and every 12 h afterward for a period of 48 h from the beginning of the training (see Fig. 1 for a scheme of the experimental design). Different neural mechanisms are thought to explain fast and slow changes underlying behavioral improvement. Consistent with this hypothesis, different electrophysiological changes within different time frames are expected to be found in parallel with behavioral improvement on the discrimination task.

Figure 1
Stimulus and experimental design. A scheme of auditory patterns is shown on the top (left corner). The black bar on the sixth segment (signaled with a black arrow) denotes the frequency change introduced within the deviant pattern at 225 msec from stimulus ...


Behavioral Results

Figure 2 shows the number of hits and false alarms (top) and the Pr index (see Materials and Methods) obtained in the first and last three blocks of the training session. Pr and reaction time (RT) obtained at the end of the training phase (after averaging results obtained in the two last blocks) and in the remaining phases of study are shown in Figure 3.

Figure 2
(Top) Mean hits and false alarms obtained for the first (b1, b2, b3) and last three blocks (b4, b5, b6) of the training phase. (Bottom) Mean Pr discrimination index obtained in the first and last three blocks of the training phase. Pr was computed by ...
Figure 3
Mean Pr discrimination index (hit rate minus false alarm rate) and mean reaction times obtained at the end of the training phase and in the different tests (t1, t2, t3, t4). The training condition results from averaging data of the last two blocks in ...

Subjects showed an improvement in performance within the training phase and from one experimental phase to the next. All subjects reached the established learning criteria (80% hits and no more than three errors in each of two consecutive blocks) after several blocks (between six and 22) during training. The analyses of variance (ANOVA) showed significant differences in the Pr values obtained in the first and last three blocks of the training phase [F(5,45) = 26.35, P < 0.0001, ɛ = 0.58]. The discrimination was significantly better in the last three blocks compared with the first three blocks of the training phase [F(5,54) = 15.99, P < 0.0001]. However, RT (measured in the last five blocks of the training period because only in those blocks did all subjects provide correct responses) did not vary from one block to another during training (mean±SD: b1 = 743.2±144.5; b2 = 733.9±77.7; b3 = 748.9±64.1; b4 = 721.9±55.2; b5 = 733.4±91.2). Correct responses to the target stimulus, as reflected by Pr, did not differ significantly between the different experimental phases, in part because the hit percentage required to reach the learning criterion was very high. However, RT significantly decreased across time [F(4,36) = 7.67; P < 0.0001; ɛ = 0.81]. Responses in test 3 and test 4 (36 and 48 h, respectively) were significantly faster than those obtained during the training phase (P < 0.04). Therefore, a significant additional improvement in performance, particularly in the speed of information processing, was observed after a period of 36 h.
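As an illustration of the behavioral measures above, the following minimal Python sketch computes the Pr index as hit rate minus false-alarm rate and checks a learning criterion of the kind described (at least 80% hits and no more than three errors in each of two consecutive blocks). The function names, the per-block data layout, and the way errors are counted are assumptions made for this example; they are not taken from the study's analysis procedure.

```python
def pr_index(hits, n_targets, false_alarms, n_standards):
    """Pr discrimination index: hit rate minus false-alarm rate."""
    if n_targets <= 0 or n_standards <= 0:
        raise ValueError("stimulus counts must be positive")
    return hits / n_targets - false_alarms / n_standards


def first_criterion_block(blocks, min_hit_rate=0.80, max_errors=3):
    """Return the 1-based index of the block that completes the first pair of
    consecutive passing blocks, or None if the criterion is never met.

    `blocks` is a sequence of (hit_rate, errors) tuples, one per block.
    How "errors" are counted is an assumption of this sketch.
    """
    previous_passed = False
    for index, (hit_rate, errors) in enumerate(blocks, start=1):
        passed = hit_rate >= min_hit_rate and errors <= max_errors
        if passed and previous_passed:
            return index
        previous_passed = passed
    return None


# Hypothetical block data, for illustration only.
example_blocks = [(0.70, 5), (0.85, 2), (0.75, 1), (0.90, 3), (0.95, 0)]
criterion_block = first_criterion_block(example_blocks)
```

Requiring two consecutive passing blocks, rather than one, guards against a single lucky block being counted as learning.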

Neurophysiological Results

Superimposed grand-average waveforms for ERPs to standard and deviant patterns recorded from Fz and T6 derivations, as well as the difference waves (deviant minus standard) obtained in each phase of the experiment, are shown in Figure 4. ERPs elicited by the two types of auditory patterns (standard and deviant) showed the typical P1-N1-P2 complex wave in response to the stimulus onset in all conditions, and a negative wave at ∼350 msec after stimulus onset, which was linked to the sudden decrease in frequency in the sixth segment of the pattern.

Figure 4
(Left and right) Superimposed grand-average event-related potentials (ERPs; n = 10) to standard (thin line) and deviant (thick line) stimuli at Fz and T6, respectively. (Middle) Grand-average difference waves at Fz and T6 obtained by subtracting ...

As was expected, cortical responses elicited by the deviant pattern were significantly different from those obtained for the standard pattern once subjects learned to consciously distinguish the two complex stimuli. A clear negative waveform with maximum amplitude over frontocentral areas (Fig. 5), peaking 150 to 200 msec after the introduction of the frequency deviance in the sixth segment, was observed in the difference waves. The reversal polarity at posterior derivations (T5 and T6) was especially evident in the last two tests (36 and 48 h), as shown in Figure 6. Both latency and scalp distribution indicate that such a negative wave is the MMN. This deviance-related negativity was significantly different from zero in all experimental phases after the training on the task, as shown in Table 1. This result indicates that the deviant pattern was differentially processed from the standard pattern at preattentive level, but only after subjects were able to consciously detect the difference between the two auditory patterns.
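The difference-wave measure described above (deviant minus standard, with the negativity sought in a latency window after the deviance) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the function names, the use of plain lists of samples, and the toy voltages and latency window are all hypothetical and do not reproduce the recording or measurement procedure of the study.

```python
def difference_wave(deviant, standard):
    """Sample-by-sample difference: deviant ERP minus standard ERP."""
    if len(deviant) != len(standard):
        raise ValueError("waveforms must have the same number of samples")
    return [d - s for d, s in zip(deviant, standard)]


def peak_negativity(wave, times_ms, window):
    """Return (latency_ms, amplitude) of the most negative sample whose
    latency falls inside the closed interval `window` = (start_ms, end_ms)."""
    start, end = window
    candidates = [(v, t) for v, t in zip(wave, times_ms) if start <= t <= end]
    if not candidates:
        raise ValueError("no samples inside the requested window")
    amplitude, latency = min(candidates)  # smallest voltage = largest negativity
    return latency, amplitude


# Toy values (microvolts and msec), chosen for illustration only.
times = [350, 375, 400, 425]
deviant_erp = [0.0, -1.0, -3.0, -2.0]
standard_erp = [0.0, 0.5, 0.5, 0.0]

diff = difference_wave(deviant_erp, standard_erp)
mmn_latency, mmn_amplitude = peak_negativity(diff, times, window=(375, 425))
```

Because the MMN is a negative deflection, a larger MMN corresponds to a more negative value of the difference wave in the chosen window.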

Figure 5
Scalp topography of the grand-average difference waveforms (n = 10) obtained at all recording sites in the pre- and post-training conditions in session 1 by subtracting the event-related potentials elicited by the standard pattern from ...
Figure 6
Scalp topography of the grand-average difference waveforms (n = 10) obtained at all recording sites at 12, 24, 36, and 48 h after training by subtracting the event-related potentials elicited by the standard pattern from the event-related ...
Table 1
Mismatch Negativity Mean Amplitudes

The one-way ANOVA performed to assess changes in the MMN amplitude at Fz between experimental phases yielded significant differences [F(5,45) = 16.9; P < 0.00001; ɛ = 0.55]. Although there was a progressive increase in MMN amplitude with the passage of time (Fig. 4; Table 1), post-hoc tests revealed that the MMN recorded in test 3 and test 4 (36 and 48 h after training) was significantly larger than that obtained in test 1 and the posttraining phase (P < 0.00001). In relation to the speed of preattentive processing measured as changes in the MMN latency, the ANOVA yielded no significant differences among phases.

The MMN was not the only ERP component showing learning-related changes in its amplitude. P2 was also found to increase its amplitude with the passage of time (Fig. 4; Table 2). The ANOVA confirmed the existence of significant differences among phases [F(5,45) = 9.6, P < 0.00005, ɛ = 0.74]. Post-hoc analysis showed that the P2 amplitude was significantly larger from test 2 forward, that is, beginning 24 h after the training session and thereafter (P < 0.005). Changes in P2 amplitude were not accompanied by significant changes in N1 amplitude. ANOVA yielded no main effect of the experimental phase on the amplitude of this negative potential.

Table 2
P2 and Late Positivity Mean Amplitudes

Training-induced effects observed on P2 amplitude could be the result of an increase in a superimposed positive component. Individual waveforms for the standard patterns revealed two separated positive waves in five subjects within different temporal windows: an early positivity with a peak latency of ∼150 to 200 msec from the stimulus onset, and a later positive component peaking between 275 and 330 msec from stimulus onset. Both components showed maximum amplitude at frontocentral locations, larger over the right hemisphere, and reversal polarity at temporal derivations, especially clear in the case of P2. The remaining subjects showed either one positivity at the P2 interval (n = 2) or a slower positivity that extended until the end of the second above-mentioned late positive wave (n = 3). The early positive wave corresponding to the obligatory P2 component showed the same topographic distribution across time, as confirmed by the absence of a significant electrode × phase interaction. This ANOVA was performed after scaling the voltage for each electrode and experimental condition by dividing by the square root of the sum of the squares of the voltage values at all electrodes (McCarthy and Wood 1985). The amplitude of the late positivity, measured in all subjects as the maximum positive value between 275 and 320 msec from stimulus onset at F4 (right frontal), showed its maximum amplitude at 48 h after training (Fig. 7; Table 2), but this increase did not reach statistical significance. The fact that training mainly affected the P2 amplitude and that its topographic distribution remained constant with the passage of time indicates a genuine effect of training on this obligatory component. However, a possible contribution of a slower superimposed positive component to this enhanced P2 cannot be ruled out from the present results.
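The scaling step mentioned above (dividing each electrode's voltage by the square root of the sum of squared voltages across all electrodes, after McCarthy and Wood 1985) amounts to normalizing each condition's scalp distribution to unit vector length. A minimal Python sketch follows; the function name and the two-electrode toy values are assumptions for illustration, not the study's data or code.

```python
import math


def vector_scale(voltages):
    """Divide each electrode voltage by the root of the summed squares of
    the voltages at all electrodes for one condition."""
    norm = math.sqrt(sum(v * v for v in voltages))
    if norm == 0:
        raise ValueError("cannot scale an all-zero topography")
    return [v / norm for v in voltages]


# Two hypothetical topographies with the same shape but different overall size.
small = vector_scale([3.0, 4.0])
large = vector_scale([6.0, 8.0])
```

After scaling, topographies that differ only in overall amplitude become identical, so a remaining electrode × phase interaction would reflect a change in the shape of the distribution rather than in its size.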

Figure 7
Superimposed grand-average (n = 10) event-related potentials to standard patterns at Fz, Cz, and T6 for all experimental conditions. The early (P2) and late positive waves are indicated with two black arrows in the panel corresponding ...

Correlation between Behavioral and Neurophysiological Results

Because changes in MMN amplitude have been shown to predict variations in hit rates and RT (Aaltonen et al. 1994; Tiitinen et al. 1994; Kraus et al. 1996; Amenedo and Escera 2000; Menning et al. 2000), separate correlation analyses were conducted between changes in MMN amplitude and changes in RT and in Pr. A negative correlation was found between the MMN amplitude and the Pr discrimination index (r = −0.83, P < 0.08), indicating that voltage values of MMN were more negative (i.e., increased amplitude) as Pr increased. A positive correlation accounted for the relationship between the MMN amplitude and RT (r = 0.84, P < 0.07), indicating a shortening of RT as the MMN became larger. In both cases, the correlation coefficient (r) was quite high, although it failed to reach statistical significance, indicating that the time courses of behavioral and neurophysiological changes were not identical.
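The sign convention in the correlations above can be made concrete with a small Python sketch of the Pearson coefficient. It assumes, purely for illustration, that one MMN amplitude and one Pr value are paired per observation; which values were actually paired in the study is not specified here, and the numbers below are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values: MMN grows more negative while Pr rises.
mmn_amplitude = [-1.0, -2.0, -3.0]
pr_values = [0.70, 0.80, 0.90]
r_mmn_pr = pearson_r(mmn_amplitude, pr_values)
```

Because MMN amplitude is a negative voltage, a larger MMN paired with a higher Pr yields a negative r, and a larger MMN paired with a shorter RT yields a positive r, matching the signs reported above.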

Individual Differences

Figure 8 shows the gain in MMN and P2 amplitudes during the post-training phases compared with the pretraining phase, as well as gains in RT reached in each test session compared with the mean RT obtained in the last two blocks of the training session for all subjects. The mean value for the group was also superimposed. Eight out of 10 subjects showed an increase in MMN amplitude just after training in session 1. The additional increase in MMN amplitude was observed at 24 h after training for one subject and at 36 h for the remaining subjects except one, who showed no changes across time (denoted by an empty diamond). Similar amplitude values were observed at the 48-h test except with three subjects, whose MMNs were quite similar to those obtained at 24 h. The results for P2 were less homogeneous across time. Seven subjects showed an enhanced P2 at 24 h that was maintained at 48 h but not in the 36-h test, in which half of the subjects showed a small decrease in P2 amplitude. Regarding RT, one subject showed a huge decrease in the first test (12 h), whereas six of the remaining subjects reached the lowest RT values when they were tested 36 h after training. RT was maintained at the same level in the last test except for one subject, who recovered the values obtained in previous tests.

Figure 8
Training-induced changes. Superimposed individual (thin lines with different symbols) and group (thick line without symbol) data. Changes in mismatch negativity (MMN) and P2 amplitudes (left and middle columns) in all post-training phases compared with ...


Effects of Fast and Slow Neural Changes on Attentive Processing

Intensive training on discrimination of complex auditory patterns within a single session produced fast and slow behavioral changes. The fast improvement was shown by an increased ability to detect the deviant stimulus across the blocks during the training phase, as revealed by an increase in the number of correct responses to the deviant pattern (hits) and a decrease in the number of responses to the standard pattern (false alarms). Because of the ceiling effect reached in the training phase, the training-related slow changes in performance were revealed after 36 h only by a shortening of RT, indicating an increase in the processing speed for consciously detecting the deviant pattern.

Fast and slow behavioral improvements have previously been reported in the visual (Polat and Sagi 1994; Schoups et al. 1995; Ahissar and Hochstein 1996) and somatosensory modalities (Sathian and Zangaladze 1997), as well as in the learning of motor skills (Karni et al. 1995; Brashers-Krugh et al. 1996; Shadmehr and Holcomb 1997; for a review, see Karni et al. 1998). These two stages in the acquisition of improved perception were first indicated by Karni and Sagi (1993). They reported changes in the performance on a texture discrimination task during the practice session and an additional improvement 8 h after the last training session. These investigators referred to this latent phase of learning as the “consolidation process.” The main difference between this previous study and the present study is the different time period during which the consolidation process took place. This process has been reported to vary from several hours to several weeks. This variability in the time course of learning seems to depend on the experimental design (e.g., task difficulty and amount of training) rather than on the sensory modality or the locus of the learning-dependent changes (for review, see Karni and Bertini 1997).

Effects of Fast and Slow Neural Changes on Preattentive Processing

One hypothesis that may be derived from the behavioral results is that different cortical responses with different time courses occur as a consequence of training. Fast neurophysiological changes evolving across stimulus blocks were manifested by the elicitation of MMN by the deviant pattern just after the training phase. In the absence of additional stimulation, slow neural changes, which are thought to underlie the consolidation process of perceptual learning, were revealed by two different neurophysiological changes several hours after the training session: First, there was a significant increase in P2 amplitude 24 h after the training phase; second, there was a significant additional increase in MMN amplitude for deviant patterns 36 h later.

Effects of fast neural changes on preattentive processing are consistent with prior results using the same complex auditory patterns (Näätänen et al. 1993). In this previous study, MMN amplitude increased during the course of the session only in those subjects whose discrimination performance also improved as the session progressed. When the same number of stimuli was presented but subjects were not asked to consciously discriminate the two auditory patterns, the deviant stimulus elicited a small MMN that did not undergo any further increase during the course of the session (Näätänen et al. 1993). In accordance with this electrophysiological result, the performance at the end of the session was also very poor. Likewise, Tervaniemi et al. (2001) found that even accurate subjects like musicians had to listen to the sounds consciously before their auditory cortex could detect changes in melodic patterns under no-attention conditions. Taken together, these results indicate that at least in an initial stage, attention is a necessary condition for inducing the earliest neural changes that will affect both preattentive and attentive processing, as well as the subsequent slow neurophysiological changes underlying the consolidation process. This hypothesis is supported by neurophysiological results obtained in the somatosensory modality (Recanzone et al. 1992d). In this study, the investigators observed changes in the sensitivity of cortical neurons while monkeys practiced a somatosensory discrimination task, but no changes were found when the skin surface was passively stimulated. Behavioral results obtained in the visual modality also support the role of attention in the acquisition of perceptual learning in nonexpert humans (Ahissar and Hochstein 1993, 2000; Joseph et al. 1997; Braun 1998). For example, Joseph et al. (1997) found impairments in detecting orientation oddballs by the additional imposition of an attention-demanding rapid serial visual presentation task involving letter identification. These findings together with the present results indicate that a change in a stimulus feature can be preattentively detected, and that attention is critical in triggering the earliest neural changes that improve preattentive processing and subsequently guide stimulus information from this early stage of processing to an eventual perceptual awareness.

Training-related slow changes were revealed at preattentive level by changes in different ERP components. A significantly enhanced P2 was observed to occur 24 h after the training session. This was followed by an additional increase in the MMN amplitude at 36 h. From this neurophysiological activity pattern, two different stages of processing that are affected by auditory perceptual learning can be hypothesized: an earlier one occurring before the deviance was introduced in the auditory pattern, and another later one, specifically associated with the frequency deviance. The increase in P2 might be indexing learning-dependent changes in the coordinated activity of a dynamically associated ensemble of cells that represents a particular constellation of features characterizing both auditory patterns (Singer 1995). The level of auditory processing indexed by the P2 component indicates that such changes might happen in primary auditory cortex (Vaughan et al. 1980). The subsequent increase in MMN amplitude might be indexing changes in the coordinated activity of an ensemble of cells that specifically respond to the frequency deviance within the pattern. Unlike P2, nonprimary subdivisions of the thalamus and auditory cortex are thought to be involved in generation of MMN (Kraus et al. 1994a,b).

Our findings are consistent with results obtained in monkeys that were trained to discriminate the temporal features of a tactile stimulus (Recanzone et al. 1992a,c,d). In these experiments, progressive improvements in performance were correlated with increases in the size of receptive fields (Recanzone et al. 1992c) and with cortical representational changes (Recanzone et al. 1992a,d). Nevertheless, behavioral improvement was mainly accounted for by a progressive change in the distributed response coherences of neurons that represent behaviorally important stimuli in cortical area 3b (Recanzone et al. 1992d). This increase in neuronal coherence is hypothesized to result from an increase in the strength of positive coupling across the cortical network (for review, see Merzenich and Sameshima, 1993). According to this hypothesis, the first increase in MMN amplitude observed in the post-training phase would be the result of increased temporal coherence in cortical neuron activity that results in the formation of a distributed cell assembly. With the passage of time, the coupling of this cell assembly becomes progressively stronger, leading to a larger temporal coherence, as revealed by a second increase in MMN amplitude.

Interestingly, the present study yielded no experience-dependent changes in the N1 component. However, previous studies reported experience-induced changes in the magnetic counterpart of N1 (N1m) to single tones after seven training sessions in which subjects learned to detect deviant tones differing from the standard tone by progressively smaller frequency shifts (Menning et al. 2000). A possible explanation for these apparently contradictory results could rely on the different stimulation used in the two studies. Menning et al. (2000) presented pure tones of 1,000 Hz as standard stimulus and small frequency changes of that tone as deviant stimuli. Unlike complex auditory patterns, information required for the correct discrimination of single tones differing in frequency is provided at the beginning of the stimulus, and the enhanced N1 after several training sessions might reflect changes in the processing of the repetitive stimulus as a result of practice. In our case, however, the correct discrimination of the deviant pattern probably relies on the spectrotemporal organization of stimulus (i.e., integration or segregation of high and low tones within the pattern), which cannot be fully extracted from the initial part of the auditory pattern. An alternative explanation could also arise from the different experimental designs used in both studies. Our subjects were exposed to a single training session and required to reach a very good performance at the end of the session. However, subjects in the Menning et al. study were exposed to 15 daily training sessions, and performance progressively improved across sessions. It is possible that the ceiling effect reached in the first training session in the present experiment prevented the increase in N1 amplitude observed by Menning and colleagues (2000). 
In agreement with this hypothesis, Cansino and Williamson (1997) found a decrease in the N1m after 200 discrimination training sessions, probably as a consequence of a saturation effect, which was interpreted as reflecting the use of fewer resources for automatic processing after extensive practice.

Previous studies that reported learning-dependent neurophysiological changes were focussed either on short-term effects (Näätänen et al. 1993) or on long-term effects (Karni et al. 1995; Kraus et al. 1995). The time course of these neurophysiological changes has been assessed in two studies (Tremblay et al. 1998; Menning et al. 2000). In one of them, the investigators used different training sessions and tested ERPs and performance in the days after the training sessions (Tremblay et al. 1998). As a consequence, they were not able to determine which of the training-associated changes were owing to the fast component and which to the slow component of perceptual learning. In the other study, the above-mentioned study of Menning et al. (2000), neuromagnetic fields were recorded before training, after seven training sessions, at the end of the training (after 14 sessions of training), and 3 weeks later. As in the study of Tremblay and coworkers (1998), changes in the magnetic counterparts of ERPs, specifically in N1m and MMNm, related to fast and slow neural changes could not be dissociated. Therefore, this is the first study that provides different objective indices of the time course of fast and slow neural changes underlying behavioral improvements in an auditory perceptual task in humans. Nevertheless, given that subjects repeated the discrimination task after each ERP recording session, we cannot rule out the possibility that changes in electrophysiological and behavioral measures stem from the previous discrimination sessions. Against these carryover effects, in a recent study developed in our laboratory (M. Atienza, J.L. Cantero, E. Dominguez-Marin, R.M. Salas, and R. Stickgold, in prep.), we found a highly significant increase in MMN amplitude with the same auditory patterns used in the present experiment at 48 h after training in the same discrimination task, when no extra stimulation was presented in between.
These results support the hypothesis that additional neurophysiological changes evolve in the time frame following the training period that, in turn, may lead to an improved information processing at preattentive level.

Top-Down Influences on Preattentive Auditory Processing

Evidence from ERP studies performed in the auditory modality indicates that information can be accessed from long-term memory to improve sensory processing. These top-down effects have been divided into short- and long-term effects (Pantev and Lütkenhöner 2000; Schröger 2000). In the present study, intensive training in one single session exerted short- and longer-term effects on attentive and preattentive processing. We hypothesize that the training-dependent behavioral and electrophysiological changes found within different temporal windows index changes in the accuracy of memory traces available in short-term memory. Thus, the increase observed in P2 amplitude 24 h after training might be reflecting the contents of short-term memory as a result of gathering information from different memory systems. This interpretation is supported by a previous study in which P2 was found to increase its amplitude in a linear way with memory load (Conley et al. 1999). In this study, a list of digits that varied in size was presented and followed by a probe digit. In one condition, subjects were required to remember whether the probe digit had or had not been presented previously. In the other condition, subjects ignored the digits of the list and reported whether the probe digit was odd or even. P2 amplitude to probes increased with the number of digits in the list only in the memory task. This result indicates that P2 amplitude is dependent on the amount of information contained in short-term memory during memory scanning (for similar results, see Wolach and Pratt, 2001). In the present study, short-term memory, including elements of sensory and longer lasting memories, might underlie the observed increase in P2 amplitude. The information provided by longer lasting memories might improve the quality of the memory trace pattern contained in short-term memory, facilitating the change detection process reflected in the enhanced MMN.
Previous studies have reported this top-down effect on preattentive processing indexed by an increase in the P2 (associated to improved speech perception; see Tremblay et al. 2001) and MMN amplitude (Näätänen et al. 1993, 1997; Tremblay et al. 1998; Koelsch et al. 1999; Tervaniemi et al. 2001), and support the assumption that information not only flows from sensory memory toward other memory systems, but that the flow of information is also possible in the reverse direction (Schröger 2000).

In conclusion, different behavioral changes and cortical responses with different time courses were observed as a result of training on an auditory perceptual task. Our findings support the hypothesis that perceptual learning is acquired in two different stages (Karni and Sagi 1993; see also Sagi and Tanne 1994; Karni et al. 1998). The earliest neural changes took place during training, after the presentation of several stimulus blocks. Such changes are thought to affect not only preattentive and attentive processing just after training, but also the subsequent slow neural changes evolving in the absence of additional stimulation for a period of several hours. These slow neural changes are assumed to underlie the consolidation process whereby information is stored in long-term memory (Karni and Sagi 1993). As a consequence of this consolidation process, the effects of slow changes on different stages of processing were found before and after the frequency change was introduced within the auditory pattern. The neurophysiological changes before frequency deviance, reflected by an enhanced P2, might index the activation of information from sensory and longer lasting memories in short-term stores. These differences in the contents of short-term memory, guided by top-down mechanisms, might improve the memory trace of the standard auditory pattern, facilitating the activation of the change-detector mechanism by the frequency deviance within the pattern, as revealed by an enhanced MMN. According to the generators of these cortical responses (P2 and MMN), fast and slow neural changes underlying the learning of the present auditory discrimination task would develop in primary and secondary subdivisions of the auditory pathway.


Subjects and Experimental Design

Ten healthy adult subjects (five men and five women; age range, 18 to 30 years) participated voluntarily in the present study. Written informed consent was obtained from each one before the experiment. All subjects were screened for health status with a structured medical interview. Medical illness, psychiatric/psychological disturbance, substance abuse, and/or neurological disorders were criteria for exclusion. Subjects were asked to refrain from caffeine, alcohol, and medication during the 48 h before the experiment and on the days of experiment.


Subjects were presented with two complex auditory patterns composed of segments of different frequencies. These stimuli are well established in psychophysical and electrophysiological research (Watson et al. 1975; Spiegel and Watson 1981; Espinoza-Varas and Watson 1989; Schröger et al. 1992; Näätänen et al. 1993; Schröger 1994; Atienza and Cantero 2001). The standard stimulus (87%) consisted of eight segments of 50 msec (including rise and fall times of 5 msec), each one of a different frequency (720, 500, 638, 1040, 117, 565, 815, 920 Hz). The fall and rise of consecutive segments overlapped. The same segments were used for the deviant pattern (13%), except the sixth (650 Hz instead of 565 Hz). The total duration of each pattern was 365 msec, and the frequency deviance was introduced 225 msec after the stimulus onset. An example of the stimulus pattern is included in Figure 1, in which the black segment added to the sixth segment denotes the frequency increase for the deviant pattern. Stimuli were delivered binaurally via airtube earphones (Etymotic Research, Model ER-3A), with an intensity of 70 dB SPL. A total of 200 stimuli (26 deviant stimuli) were presented in each block in pseudorandom order (two deviant patterns were always separated by at least two standard patterns) with an interstimulus interval of 975 msec; the duration of a block was 4.5 min.
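The timing figures above follow from the segment overlap, and the spacing rule can be met by a simple slot-filling procedure. The sketch below is only an illustration: the function names are hypothetical, the randomization scheme is an assumption (the text does not say how the sequences were actually generated), and the block-duration estimate assumes that the interstimulus interval runs from the offset of one pattern to the onset of the next.

```python
import random

SEGMENT_MS = 50       # duration of each segment, including rise and fall
OVERLAP_MS = 5        # fall of one segment overlaps the rise of the next
N_SEGMENTS = 8
DEVIANT_SEGMENT = 6   # 1-based position of the changed segment
ISI_MS = 975          # assumed to be offset-to-onset


def pattern_duration_ms():
    # Each of the 7 junctions between segments saves 5 msec.
    return N_SEGMENTS * SEGMENT_MS - (N_SEGMENTS - 1) * OVERLAP_MS


def deviance_onset_ms():
    # Every earlier segment advances the onset by 50 - 5 = 45 msec.
    return (DEVIANT_SEGMENT - 1) * (SEGMENT_MS - OVERLAP_MS)


def block_duration_min(n_stimuli=200):
    return n_stimuli * (pattern_duration_ms() + ISI_MS) / 60000


def make_block(n_stimuli=200, n_deviants=26, min_gap=2, seed=None):
    """Return 'S'/'D' labels with at least min_gap standards between deviants.

    Hypothetical procedure: reserve min_gap standards for every gap between
    deviants, then scatter the remaining standards over all slots at random.
    """
    rng = random.Random(seed)
    n_standards = n_stimuli - n_deviants
    spare = n_standards - (n_deviants - 1) * min_gap
    if spare < 0:
        raise ValueError("not enough standards for the spacing constraint")
    # Slot 0 precedes the first deviant; the last slot follows the last one.
    slots = [0] * (n_deviants + 1)
    for i in range(1, n_deviants):
        slots[i] = min_gap
    for _ in range(spare):
        slots[rng.randrange(n_deviants + 1)] += 1
    sequence = []
    for i, count in enumerate(slots):
        sequence.extend("S" * count)
        if i < n_deviants:
            sequence.append("D")
    return sequence
```

Under these assumptions the pattern lasts 365 msec, the deviance starts at 225 msec, and a block of 200 stimuli takes just under 4.5 min, in agreement with the values reported above.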


Discrimination Training

Subjects were trained on the discrimination of two auditory patterns in a single session. They had to respond to the deviant stimulus by pressing a key as quickly as possible. The training phase began with a block of 25 stimuli as a probe. Subsequently, blocks of 200 stimuli each were presented. Every subject was presented with a minimum of six blocks, but the total number of blocks varied between six and 22 (mean±, 9.5±4.8), depending on how many blocks each subject needed to reach a previously established learning criterion: 80% hits (correct responses to the deviant pattern) in each of two consecutive blocks with a maximum of three false alarms (responses to the standard pattern). Feedback was provided at the end of each block. Although feedback has been found not to be necessary for perceptual learning (Ball and Sekuler 1987), some studies have reported different effects on learning with and without feedback (Shiu and Pashler 1992; Fahle and Edelman 1993). For example, Shiu and Pashler (1992) reported improvements in performance from one session to the next, but not within each session, under no-feedback conditions. In contrast, no differences were found between subjects given trial-to-trial feedback and those given block-to-block feedback. To avoid changes in performance from one session to the next as a consequence of lacking feedback, subjects in the present study were informed about their hit and false alarm percentages after each block.
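The stopping rule for training can be expressed as a short check over the per-block results. This is a minimal sketch with hypothetical function names; it assumes that the limit of three false alarms applies to each block separately, which the wording above does not make fully explicit.

```python
def block_meets_criterion(hits, n_deviants, false_alarms,
                          min_hit_rate=0.80, max_false_alarms=3):
    # Assumption: the false-alarm limit is applied per block.
    return hits / n_deviants >= min_hit_rate and false_alarms <= max_false_alarms


def blocks_to_criterion(blocks, min_blocks=6):
    """Return the 1-based block number at which training would stop.

    blocks is a list of (hits, n_deviants, false_alarms) tuples in
    presentation order. Returns None if the criterion is never reached.
    """
    for i in range(1, len(blocks)):
        reached_minimum = i + 1 >= min_blocks
        if (reached_minimum
                and block_meets_criterion(*blocks[i - 1])
                and block_meets_criterion(*blocks[i])):
            return i + 1
    return None
```

With 26 deviants per block, an 80% hit rate corresponds to at least 21 hits, since 20 of 26 is only about 77%.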

Discrimination Tests

Performance in the task was tested every 12 h for a period of 48 h, in four sessions. In the test phase, only one block of 200 stimuli was delivered. Subjects were tested using the same oddball stimulus sequences and the same response task used during training. However, no feedback was given at the end of the block. Figure 1 schematically shows the procedure of the experiment under conditions of attention and no attention.

ERP Recordings

ERPs to auditory patterns were always recorded while subjects read a book of their own choice and ignored the stimuli. ERPs to nine stimulus blocks were recorded before and after the training phase (these two phases were termed pre- and post-training phases). The other four phases were termed test 1, test 2, test 3, and test 4, and took place 12, 24, 36, and 48 h, respectively, after the beginning of the pre-training phase. In the test phases, ERPs were also recorded under ignore conditions (nine blocks) just before testing the performance on the discrimination task. ERP recording always began at 9:00 a.m. or 9:00 p.m. depending on the session (see Fig. 1). The first session lasted about 2.5 to 4 h, and the remaining sessions lasted 1 h each.

Electrophysiological Recordings

Electroencephalographic (EEG) activity was recorded in an acoustically shielded room from Ag/AgCl scalp electrodes placed at 19 different locations according to the 10-20 system (Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, O1, and O2). An electrode placed on the tip of the nose was used as reference. Vertical and horizontal ocular movements were also recorded, with two pairs of electrodes placed above and below the left eye and 1 cm from the outer canthi of both eyes, respectively. Electrophysiological activity was amplified and digitized by a MEDICID 4 system (Neuronic) at 250 Hz, and bandpass filtered between 0.1 and 40 Hz (−3 dB points of a 24 dB/octave roll-off curve). Impedance of all electrodes was kept below 5,000 ohms.

For each artifact-free trial, an epoch of 800 msec, including a 100-ms prestimulus baseline, was selected. Trials with artifacts exceeding ±100 μV were rejected from analyses. Trials immediately after deviant stimuli were not included in the average because they can elicit small MMNs. The first five trials of each stimulus block were excluded from analyses. Before averaging, the signals were digitally filtered with a cutoff frequency of 30 Hz (−3 dB), and the drift artifacts were corrected by using linear detrending. Separate ERPs from individual subjects were computed for each type of stimulus pattern (standard and deviant) as well as for each experimental phase (pre-training, post-training, test 1, test 2, test 3, and test 4).
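The trial-selection rules in this paragraph can be summarized as a single predicate. The sketch below uses hypothetical names and assumes that each epoch is stored as a list of channels, each a list of voltages in microvolts, and that the ±100 μV limit is applied to every sample of every channel; filtering and detrending are not shown.

```python
SAMPLING_RATE_HZ = 250
EPOCH_MS = 800          # includes the 100-msec prestimulus baseline
REJECT_UV = 100.0
N_SKIPPED_AT_START = 5  # first five trials of each block are excluded


def samples_per_epoch():
    return EPOCH_MS * SAMPLING_RATE_HZ // 1000


def keep_trial(index, labels, epoch_uv):
    """Decide whether a trial enters the average.

    index    -- 0-based position of the trial within its block
    labels   -- 'S'/'D' labels for the whole block
    epoch_uv -- list of channels, each a list of voltages in microvolts
    """
    if index < N_SKIPPED_AT_START:
        return False
    if labels[index - 1] == "D":
        # Trials right after a deviant can elicit small MMNs.
        return False
    # Assumption: any sample beyond the limit rejects the whole trial.
    return all(abs(v) <= REJECT_UV for channel in epoch_uv for v in channel)
```

At 250 Hz, an 800-msec epoch holds 200 samples, 25 of which belong to the prestimulus baseline.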

Data Analysis

Behavioral Data

Two measures of performance were obtained: (1) an index of recognition memory performance based on the hit rate and false alarm rate, and (2) RT. Given that subjects were required to develop a high hit percentage during the training phase, improvements in the recognition memory were not expected to be significantly different in the remaining experimental phases. RT was measured to assess possible improvements in the speed of the information processing with the passage of time.

To measure recognition memory, we used the discrimination index Pr based on the two-high-threshold model (Snodgrass and Corwin 1988). This model defines discrete memory states (standard recognition, deviant recognition, and uncertainty) rather than a continuum of familiarity or memory strength, as in signal detection models. The Pr index provides a measurement of true recognitions of the standard and deviant patterns, correcting for the probability of lucky guesses from the uncertain state. Pr was computed by subtracting the false alarm rate (FA) from the hit rate (H).

$$\mathrm{Pr} = H - \mathrm{FA}$$

The hit rate is the probability of responding correctly to the deviant stimulus, and the false alarm rate is the probability of responding to the standard stimulus. Both hit and false alarm rates were corrected by adding 0.5 and dividing by the number of deviant or standard stimuli plus 1.

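$$H = \frac{\text{number of hits} + 0.5}{\text{number of deviant stimuli} + 1}$$

$$\mathrm{FA} = \frac{\text{number of false alarms} + 0.5}{\text{number of standard stimuli} + 1}$$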
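As a worked illustration of the corrected rates and the Pr index described in this section, a minimal sketch follows. The function names are hypothetical, and the response counts in the usage note are invented for illustration only.

```python
def corrected_rate(n_responses, n_stimuli):
    # Add 0.5 to the response count and 1 to the stimulus count so that
    # the rate can never be exactly 0 or 1.
    return (n_responses + 0.5) / (n_stimuli + 1)


def pr_index(hits, n_deviants, false_alarms, n_standards):
    # Pr = H - FA, using the corrected hit and false alarm rates.
    hit_rate = corrected_rate(hits, n_deviants)
    false_alarm_rate = corrected_rate(false_alarms, n_standards)
    return hit_rate - false_alarm_rate
```

For a block of 26 deviants and 174 standards, `pr_index(21, 26, 3, 174)` evaluates 21.5/27 − 3.5/175, which is roughly 0.78.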

Differences in the Pr discrimination index and RT obtained in the two last blocks of the training session and in the remaining sessions were assessed using two separate one-way ANOVAs with repeated measures.

Electrophysiological data

The MMN amplitude and latency were measured in the “difference wave” obtained by subtracting ERPs to the standard pattern from ERPs to the deviant pattern. In each phase of the experiment and for each individual subject, the maximum peak of the MMN was obtained at Fz between 370 and 505 msec from the stimulus onset (145 to 280 msec from the introduction of the frequency deviance). MMN amplitude was measured as the mean voltage within a temporal window of 24 msec (12 msec before and after the maximum negative peak) minus the mean voltage value obtained between the stimulus onset and the introduction of the frequency deviance (225 msec). MMN latency was defined as the time from deviance onset to the maximum negative peak.
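The measurement described above can be sketched on a single difference wave. The function name and data layout are hypothetical: the wave is assumed to be a list of voltages at Fz sampled at 250 Hz, with index 0 at stimulus onset, so that sample times fall on a 4-msec grid and the windows include only samples lying inside the stated limits.

```python
def measure_mmn(difference_wave_uv, fs=250, deviance_onset_ms=225,
                search_ms=(370, 505), half_window_ms=12):
    """Return (amplitude, latency_ms) of the MMN in a deviant-minus-standard wave."""

    def time_ms(i):
        return i * 1000 / fs

    def mean_between(lo_ms, hi_ms):
        values = [v for i, v in enumerate(difference_wave_uv)
                  if lo_ms <= time_ms(i) <= hi_ms]
        return sum(values) / len(values)

    # Most negative sample inside the search window.
    candidates = [i for i in range(len(difference_wave_uv))
                  if search_ms[0] <= time_ms(i) <= search_ms[1]]
    peak = min(candidates, key=lambda i: difference_wave_uv[i])
    peak_ms = time_ms(peak)

    # Mean around the peak minus the mean from stimulus onset to deviance onset.
    around_peak = mean_between(peak_ms - half_window_ms, peak_ms + half_window_ms)
    reference = mean_between(0, deviance_onset_ms)
    return around_peak - reference, peak_ms - deviance_onset_ms
```

Latency is returned relative to deviance onset, so a peak 400 msec after stimulus onset yields a latency of 175 msec.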

Two-tailed t tests were used to determine whether the mean amplitudes within the temporal window of MMN differed significantly from zero. Differences in amplitude and peak latency in each condition were assessed using two different one-way ANOVAs with repeated measures, including phases (pre-training, post-training, test 1, test 2, test 3, test 4) as within-factor. Significance levels of the F ratios were adjusted with the Greenhouse-Geisser correction. Post-hoc analyses (Newman-Keuls test) were performed to assess the main effects.
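For the test of MMN amplitude against zero, the statistic itself is simple to compute. The sketch below uses a hypothetical function name and returns only the t value and the degrees of freedom; obtaining the two-tailed p value requires the t distribution, which the Python standard library does not provide (`scipy.stats.ttest_1samp` is one existing option). The repeated-measures ANOVAs and their corrections are not covered here.

```python
import math
import statistics


def one_sample_t(values, mu=0.0):
    """Return (t, degrees_of_freedom) for a one-sample t test against mu."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    t = (mean - mu) / (sd / math.sqrt(n))
    return t, n - 1
```

With ten subjects, each test of this kind would have 9 degrees of freedom.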

Visual inspection of the obtained waveforms revealed another time window (∼180 msec from the stimulus onset) within which the ERPs seemed to differ among phases. The latency and polarity of the wave indicated changes in P2 amplitude. Because changes in P2 tend to be associated with changes in N1, the amplitudes of these two potentials were measured within temporal windows of 90 to 150 msec from stimulus onset for N1, and of 140 to 220 msec for P2. In both cases, the amplitude was analyzed for the standard pattern. N1 amplitude was defined as the maximum negative peak and P2 amplitude as the maximum positive peak relative to the prestimulus baseline at the Cz derivation (at which these potentials reach their maximum). The voltage values for N1 and P2 were entered into two separate one-way repeated-measures ANOVAs to study differences between experimental phases.
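The N1 and P2 peak measures can be sketched in the same way as the MMN measure. The function name and data layout are hypothetical: the epoch is assumed to be a list of voltages at Cz sampled at 250 Hz whose first 100 msec are the prestimulus baseline, and the baseline is taken as the mean of those samples.

```python
def n1_p2_amplitudes(cz_epoch_uv, fs=250, baseline_ms=100,
                     n1_window_ms=(90, 150), p2_window_ms=(140, 220)):
    """Return (N1, P2) peak amplitudes relative to the prestimulus baseline."""
    n_baseline = baseline_ms * fs // 1000
    baseline = sum(cz_epoch_uv[:n_baseline]) / n_baseline
    # Re-reference to baseline; index 0 of post now marks stimulus onset.
    post = [v - baseline for v in cz_epoch_uv[n_baseline:]]

    def window(lo_ms, hi_ms):
        return [v for i, v in enumerate(post) if lo_ms <= i * 1000 / fs <= hi_ms]

    n1 = min(window(*n1_window_ms))  # maximum negative peak
    p2 = max(window(*p2_window_ms))  # maximum positive peak
    return n1, p2
```

Because the two windows overlap between 140 and 150 msec, a sample in that range can contribute to either measure.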


We are grateful to Erich Schröger (Institut für Allgemeine Psychologie, Universität Leipzig), Edward Pace-Schott (Department of Psychiatry, Harvard Medical School), and the two anonymous reviewers for their valuable comments and suggestions on an earlier version of the manuscript.

The publication costs of this article were defrayed in part by payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 USC section 1734 solely to indicate this fact.


E-MAIL mercedes_atienza@hms.harvard.edu; Fax (617) 734-7851.

Article and publication are at http://www.learnmem.org/cgi/doi/10.1101/lm.46502.


  • Aaltonen O, Eerola O, Lang AH, Uusipaikka E, Tuomainen J. Automatic discrimination of phonetically relevant and irrelevant vowel parameters as reflected by mismatch negativity. J Acoust Soc Am. 1994;96:1489–1493. [PubMed]
  • Ahissar M, Hochstein S. The role of attention in early perceptual learning. Proc Natl Acad Sci. 1993;90:5718–5722. [PMC free article] [PubMed]
  • ————— Learning pop-out detection: Specificities to stimulus characteristics. Vision Res. 1996;36:3487–3500. [PubMed]
  • ————— The spread of attention and learning in feature search: Effects of target distribution and task difficulty. Vision Res. 2000;40:1349–1364. [PubMed]
  • Amenedo H, Escera C. The accuracy of sound duration representation in the human brain determines the accuracy of behavioral perception. Eur J Neurosci. 2000;12:1570–1574. [PubMed]
  • Atienza M, Cantero JL. Complex sound processing during human REM sleep by recovering information from long-term memory as revealed by the mismatch negativity (MMN). Brain Res. 2001;901:151–160. [PubMed]
  • Ball K, Sekuler R. Direction-specific improvement in motion discrimination. Vision Res. 1987;27:953–965. [PubMed]
  • Brashers-Krugh T, Shadmehr R, Bizzi E. Consolidation in human motor memory. Nature. 1996;382:252–255. [PubMed]
  • Braun J. Vision and attention: The role of training. Nature. 1998;393:424–425. [PubMed]
  • Cansino S, Williamson SJ. Neuromagnetic fields reveal cortical plasticity when learning an auditory discrimination task. Brain Res. 1997;764:53–66. [PubMed]
  • Cheour M, Ceponiene R, Lehtokoski A, Luuk A, Allik J, Alho K, Näätänen R. Development of language-specific phoneme representations in the infant brain. Nature Neurosci. 1998;1:351–353. [PubMed]
  • Conley EM, Michalewski HJ, Starr A. The N100 auditory cortical evoked potential indexes scanning of auditory short-term memory. Clin Neurophysiol. 1999;110:2086–2093. [PubMed]
  • Espinoza-Varas B, Watson CS. Perception of complex auditory patterns by humans. In: Dooling RJ, Hulse St H, editors. The comparative psychology of audition: Perceiving complex sounds. Hillsdale, NJ: Erlbaum; 1989. pp. 67–94.
  • Fahle M, Edelman S. Long-term learning in vernier acuity: Effects of stimulus orientation, range and feedback. Vision Res. 1993;33:397–412. [PubMed]
  • Galván VV, Weinberger NM. Long-term consolidation and retention of learning-induced tuning plasticity in the auditory cortex of the guinea pig. Neurobiol Learn Mem. 2002;77:78–108. [PubMed]
  • Gilbert CD. Early perceptual learning. Proc Natl Acad Sci. 1994;91:1195–1197. [PMC free article] [PubMed]
  • Harris JP, Fahle M. The use of different orientation cues in vernier acuity. Percept Psychophys. 1998;60:405–426. [PubMed]
  • Joseph JS, Chun MM, Nakayama K. Attentional requirements in a ‘preattentive’ feature search task. Nature. 1997;387:805–807. [PubMed]
  • Kapadia MK, Gilbert CD, Westheimer G. A quantitative measure for short-term cortical plasticity in human vision. J Neurosci. 1994;14:451–457. [PubMed]
  • Karni A, Bertini G. Learning perceptual skills: Behavioral probes into adult cortical plasticity. Curr Opin Neurobiol. 1997;7:530–535. [PubMed]
  • Karni A, Sagi D. The time course of learning a visual skill. Nature. 1993;365:250–252. [PubMed]
  • Karni A, Meyer G, Jezzard P, Adams MM, Turner R. Functional MRI evidence for adult motor cortex plasticity during motor skill learning. Nature. 1995;377:155–158. [PubMed]
  • Karni A, Meyer G, Rey-Hipolito C, Jezzard P, Adams MM, Turner R, Ungerleider LG. The acquisition of skilled motor performance: Fast and slow experience-driven changes in primary motor cortex. Proc Natl Acad Sci. 1998;95:861–868. [PMC free article] [PubMed]
  • Koelsch S, Schröger E, Tervaniemi M. Superior pre-attentive auditory processing in musicians. Neuroreport. 1999;10:1309–1313. [PubMed]
  • Kraus N, McGee T, Carrell T, King C, Littman T, Nicol T. Discrimination of speech-like signals in auditory thalamus and cortex. J Acoust Soc Am. 1994a;96:2758–2768. [PubMed]
  • Kraus N, McGee T, Littman T, Nicol T, King C. Encoding of acoustic change involves non-primary auditory thalamus. J Neurophysiol. 1994b;72:1270–1277. [PubMed]
  • Kraus N, McGee T, Carrell TD, King C, Tremblay K, Nicol T. Central auditory system plasticity associated with speech discrimination training. J Cogn Neurosci. 1995;7:25–32. [PubMed]
  • Kraus N, McGee TJ, Carrell TD, Zecker SG, Nicol TG, Koch DB. Auditory neurophysiologic responses and discrimination deficits in children with learning problems. Science. 1996;273:971–973. [PubMed]
  • McCarthy G, Wood CC. Scalp distributions of event-related potentials: An ambiguity associated with analysis of variance models. EEG Clin Neurophysiol. 1985;62:203–208. [PubMed]
  • Menning H, Roberts LE, Pantev C. Plastic changes in the auditory cortex induced by intensive frequency discrimination training. Neuroreport. 2000;11:817–822. [PubMed]
  • Merzenich MM, Sameshima K. Cortical plasticity and memory. Curr Opin Neurobiol. 1993;3:187–196. [PubMed]
  • Näätänen R. The role of attention in auditory information processing as revealed by event-related potentials and other brain measures of cognitive function. Behav Brain Sci. 1990;13:201–288.
  • Näätänen R, Schröger E, Karakas S, Tervaniemi M, Paavilainen P. Development of a memory trace for a complex sound in the human brain. Neuroreport. 1993;4:503–506. [PubMed]
  • Näätänen R, Lehtokoski A, Lennes M, Cheour M, Huotilainen M, Iivonen A, Vainio M, Alku P, Ilmoniemi RJ, Luuk A, et al. Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature. 1997;385:432–434. [PubMed]
  • Novak G, Ritter W, Vaughan HG., Jr Mismatch detection and the latency of temporal judgements. Psychophysiology. 1992;29:398–411. [PubMed]
  • Pantev C, Lütkenhöner B. Magnetoencephalographic studies of functional organization and plasticity of the human auditory cortex. J Clin Neurophysiol. 2000;17:130–142. [PubMed]
  • Poggio T, Fahle M, Edelman S. Fast perceptual learning in visual hyperacuity. Science. 1992;256:1018–1021. [PubMed]
  • Polat U, Sagi D. Spatial interactions in human vision: From near to far via experience-dependent cascades of connections. Proc Natl Acad Sci. 1994;91:1206–1209. [PMC free article] [PubMed]
  • Recanzone GH, Jenkins WM, Hradek GH, Merzenich MM. Progressive improvement in discriminative abilities in adult owl monkeys performing a tactile frequency discrimination task. J Neurophysiol. 1992a;67:1015–1030. [PubMed]
  • Recanzone GH, Merzenich MM, Jenkins WM. Frequency discrimination training engaging a restricted skin surface results in an emergence of a cutaneous response zone in cortical area 3a. J Neurophysiol. 1992b;67:1057–1070. [PubMed]
  • Recanzone GH, Merzenich MM, Jenkins WM, Grajski KA, Dinse HR. Topographic reorganization of the hand representation in cortical area 3b of owl monkeys trained in a frequency-discrimination task. J Neurophysiol. 1992c;67:1031–1056. [PubMed]
  • Recanzone GH, Merzenich MM, Schreiner CE. Changes in the distributed temporal response properties of SI cortical neurons reflect improvements in performance on a temporally based tactile discrimination task. J Neurophysiol. 1992d;67:1071–1091. [PubMed]
  • Recanzone GH, Schreiner CE, Merzenich MM. Plasticity in the frequency representation of primary auditory cortex following discrimination training in adult owl monkeys. J Neurosci. 1993;13:87–103. [PubMed]
  • Sagi D, Tanne D. Perceptual learning: Learning to see. Curr Opin Neurobiol. 1994;4:195–199. [PubMed]
  • Sathian K, Zangaladze A. Tactile learning is task specific but transfers between fingers. Percept Psychophys. 1997;59:119–128. [PubMed]
  • Schoups AA, Orban GA. Interocular transfer in perceptual learning of a pop-out discrimination task. Proc Natl Acad Sci. 1996;93:7358–7362. [PMC free article] [PubMed]
  • Schoups AA, Vogels R, Orban GA. Human perceptual learning in identifying the oblique orientation: Retinotomy, orientation specificity and monocularity. J Physiol. 1995;483:797–810. [PMC free article] [PubMed]
  • Schröger E. An event-related potential study of sensory representations of unfamiliar tonal patterns. Psychophysiology. 1994;31:175–181. [PubMed]
  • ————— Top-down effects on auditory sensory memory processing. In: Schick A, et al., editors. Contributions to psychological acoustics. Oldenburg, Germany: Bibliotheks- und Informationssystem der Universität Oldenburg; 2000. pp. 337–353.
  • Schröger E, Näätänen R, Paavilainen P. Event-related brain potentials reveal how non-attended complex sound patterns are represented by the human brain. Neurosci Lett. 1992;146:183–186. [PubMed]
  • Shadmehr R, Holcomb HH. Neural correlates of motor memory consolidation. Science. 1997;277:821–825. [PubMed]
  • Shiu LP, Pashler H. Improvement in line orientation discrimination is retinally local but dependent on cognitive set. Percept Psychophys. 1992;52:582–588. [PubMed]
  • Singer W. Development and plasticity of cortical processing architectures. Science. 1995;270:758–764. [PubMed]
  • Snodgrass JG, Corwin J. Pragmatics of measuring recognition memory: Applications to dementia and amnesia. J Exp Psychol Gen. 1988;117:34–50. [PubMed]
  • Spiegel MF, Watson CS. Factors in the discrimination of tonal patterns, III: Frequency discrimination with components of well-learned patterns. J Acoust Soc Am. 1981;69:223–230. [PubMed]
  • Tervaniemi M, Rytkönen M, Schröger E, Ilmoniemi RJ, Näätänen R. Superior formation of cortical memory traces for melodic patterns in musicians. Learn Mem. 2001;8:295–300. [PMC free article] [PubMed]
  • Tiitinen H, May P, Reinikainen K, Näätänen R. Attentive novelty detection in humans is governed by pre-attentive sensory memory. Nature. 1994;372:90–92. [PubMed]
  • Tremblay K, Kraus N, Carrell T, McGee T. Central auditory system plasticity: Generalization to novel stimuli following listening training. J Acoust Soc Am. 1997;6:3762–3773. [PubMed]
  • Tremblay K, Kraus N, McGee T. The time course of auditory perceptual learning: Neurophysiological changes during speech-sound training. Neuroreport. 1998;9:3557–3560. [PubMed]
  • Tremblay K, Kraus N, McGee T, Ponton C, Otis B. Central auditory plasticity: Changes in the N1-P2 complex after speech-sound training. Ear and Hear. 2001;22:79–90. [PubMed]
  • Vaughan HG, Jr, Ritter W, Simson R. Topographic analysis of auditory event-related potentials. In: Kornhuber HH, Deecke L, editors. Motivation, motor and sensory processes of the brain: Electrical potentials, behaviour and clinical use. Amsterdam: Elsevier; 1980. pp. 279–290.
  • Watson CS, Wroton HW, Kelly WJ, Benbassat CA. Factors in the discrimination of tonal patterns, I: Component frequency, temporal position, and silent intervals. J Acoust Soc Am. 1975;57:1175–1185. [PubMed]
  • Winkler I, Kujala T, Tiitinen H, Sivonen P, Alku P, Lehtokoski A, Czigler I, Csépe V, Ilmoniemi RJ, Näätänen R. Brain responses reveal the learning of foreign language phonemes. Psychophysiology. 1999;36:638–642. [PubMed]
  • Wolach I, Pratt H. The mode of short-term memory encoding as indicated by event-related potentials in a memory scanning task with distractions. Clin Neurophysiol. 2001;112:186–197. [PubMed]
  • Zohary E, Celebrini S, Britten KH, Newsome WT. Neuronal plasticity that underlies improvement in perceptual performance. Science. 1994;263:1289–1292. [PubMed]

Articles from Learning & Memory are provided here courtesy of Cold Spring Harbor Laboratory Press