Identifying the Attended Speaker Using Electrocorticographic (ECoG) Signals.
- 1: Ctr for Adapt Neurotech, Wadsworth Center, New York State Department of Health, Albany, NY; Dept of Neurology, Albany Medical College, Albany, NY; Donders Inst for Brain, Cognition and Behaviour, Radboud Univ Nijmegen, The Netherlands.
- 2: Ctr for Adapt Neurotech, Wadsworth Center, New York State Department of Health, Albany, NY; Dept of Neurology, Albany Medical College, Albany, NY.
- 3: Ctr for Adapt Neurotech, Wadsworth Center, New York State Department of Health, Albany, NY; J. Crayton Pruitt Family Dept of Biomed Eng, Univ of Florida, Gainesville, FL.
- 4: Ctr for Adapt Neurotech, Wadsworth Center, New York State Department of Health, Albany, NY; Dept of Biomed Sci, State Univ of New York at Albany, Albany, NY.
- 5: Dept of Neurology, Albany Medical College, Albany, NY.
- 6: Donders Inst for Brain, Cognition and Behaviour, Radboud Univ Nijmegen, The Netherlands.
- 7: Ctr for Adapt Neurotech, Wadsworth Center, New York State Department of Health, Albany, NY; Dept of Neurology, Albany Medical College, Albany, NY; Dept of Biomed Sci, State Univ of New York at Albany, Albany, NY.
Abstract
People affected by severe neurodegenerative diseases (e.g., late-stage amyotrophic lateral sclerosis (ALS) or locked-in syndrome) eventually lose all muscular control. Thus, they cannot use traditional assistive communication devices that depend on muscle control, or brain-computer interfaces (BCIs) that depend on the ability to control gaze. While auditory and tactile BCIs can provide communication to such individuals, their use typically entails an artificial mapping between the stimulus and the communication intent. This makes these BCIs difficult to learn and use. In this study, we investigated the use of selective auditory attention to natural speech as an avenue for BCI communication. In this approach, the user communicates by directing his/her attention to one of two simultaneously presented speakers. We used electrocorticographic (ECoG) signals in the gamma band (70-170 Hz) to infer the identity of the attended speaker, thereby removing the need to learn such an artificial mapping. Our results from twelve human subjects show that a single cortical location over the superior temporal gyrus or pre-motor cortex is typically sufficient to identify the attended speaker within 10 s and with 77% accuracy (chance level: 50%). These results lay the groundwork for future studies that may determine the real-time performance of BCIs based on selective auditory attention to speech.
KEYWORDS:
Auditory Attention; Brain-Computer Interface (BCI); Cocktail Party; Electrocorticography (ECoG)
Figure 1. Electrode coverage
Electrode coverage and density varied across subjects. Electrode locations (black dots) included frontal, temporal, parietal and occipital cortical areas. Four subjects (4, 6, 8 and 12) were implanted with high-density grids (electrodes spaced 6 mm apart).
Brain Comput Interfaces (Abingdon). 2015;2(4):161-173.
Figure 2. Experimental setup and methods
(A) Subjects selectively directed auditory attention to one of two simultaneously presented speakers. (B) We extracted the envelope of ECoG signals in the high gamma band, as well as the envelopes of the attended and unattended speech stimuli (i.e., JFK and Obama). (C) The correlation between the envelopes of the ECoG gamma band and the attended speech stimulus, accumulated over time, is markedly larger than the accumulated correlation between the envelopes of the ECoG gamma band and the unattended speech stimulus.
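The processing in panels (B) and (C) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the FFT-based band-pass and envelope extraction, the function names (`band_envelope`, `pearson`, `identify_attended`), the sampling rate, and the synthetic signals are all hypothetical, and a single correlation over the whole segment stands in for the correlation accumulated over time that the caption describes.

```python
import numpy as np

def band_envelope(x, fs, lo, hi):
    """Envelope of x restricted to [lo, hi] Hz via an FFT-domain analytic signal."""
    spectrum = np.fft.fft(x)
    freqs = np.fft.fftfreq(len(x), d=1.0 / fs)
    # Keep only positive frequencies inside the band and double them; the
    # inverse FFT is then the analytic signal of the band-passed input.
    mask = (freqs >= lo) & (freqs <= hi)
    analytic = np.fft.ifft(2.0 * spectrum * mask)
    return np.abs(analytic)

def pearson(a, b):
    """Pearson correlation coefficient between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def identify_attended(ecog, env_a, env_b, fs, band=(70.0, 170.0)):
    """Pick the speech envelope that correlates best with the gamma envelope."""
    neural = band_envelope(ecog, fs, *band)
    r_a = pearson(neural, env_a)
    r_b = pearson(neural, env_b)
    return ("A" if r_a > r_b else "B"), r_a, r_b

# Synthetic example (made-up data): a 110 Hz carrier whose amplitude follows
# speaker A's envelope, buried in broadband noise.
rng = np.random.default_rng(0)
fs = 1000                      # assumed sampling rate in Hz
t = np.arange(10 * fs) / fs    # one 10 s segment
env_a = 1.0 + 0.8 * np.sin(2 * np.pi * 3.0 * t)
env_b = 1.0 + 0.8 * np.sin(2 * np.pi * 4.3 * t + 1.0)
ecog = env_a * np.sin(2 * np.pi * 110.0 * t) + 0.5 * rng.standard_normal(t.size)

label, r_a, r_b = identify_attended(ecog, env_a, env_b, fs)
```

In this toy setup the decision reduces to comparing two correlation coefficients for one channel, which mirrors the idea of a single cortical location being used to identify the attended speaker.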
Figure 3. Lag between speech presentation and neural response
This figure shows the correlation between the neural response and the attended speech (green), averaged across subjects, for corrected lags between 0 and 250 ms. The correlation peaks at a lag of 100 ms.
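A lag scan of this kind might be computed as in the sketch below. It is an illustration only: the helper `lagged_correlation`, the 10 ms lag step, the sampling rate, and the synthetic envelopes (with a built-in 100 ms delay) are assumptions, and the sketch uses a plain time shift because the caption does not define how lags were corrected.

```python
import numpy as np

def lagged_correlation(neural, stimulus, fs, lags_ms):
    """Pearson correlation between neural and stimulus envelopes for each lag."""
    r = []
    for lag_ms in lags_ms:
        k = int(round(lag_ms * fs / 1000.0))  # lag in samples
        # Compare neural activity at time t with the stimulus at time t - lag.
        r.append(np.corrcoef(neural[k:], stimulus[:len(stimulus) - k])[0, 1])
    return np.array(r)

# Synthetic check (made-up data): a smoothed-noise "speech envelope" and a
# neural envelope that follows it with a 100 ms delay plus noise.
rng = np.random.default_rng(1)
fs = 1000  # assumed sampling rate in Hz
stimulus = np.convolve(rng.standard_normal(20 * fs), np.ones(50) / 50, mode="same")
delay = 100  # samples, i.e. 100 ms at fs = 1000 Hz
neural = np.concatenate([np.zeros(delay), stimulus[:-delay]])
neural = neural + 0.05 * rng.standard_normal(neural.size)

lags_ms = np.arange(0, 251, 10)
r = lagged_correlation(neural, stimulus, fs, lags_ms)
best_lag_ms = int(lags_ms[np.argmax(r)])
```

Here `best_lag_ms` recovers the delay that was built into the synthetic data; with real recordings the curve would be averaged across subjects before locating the peak.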
Figure 4. Neural tracking of attended (●) and unattended (○) speech
Neural tracking is measured as the correlation between the high gamma ECoG envelope and the attended or unattended speech envelope. Color gives the magnitude of this correlation expressed as an activation index (−log(p)).
Figure 5. Accuracy with which the attended speech could be identified, using a univariate (blue) or multivariate (orange) classification method
(A) Accuracy per subject, sorted by average performance. For subjects 1-7 (‘significant subjects’), accuracy is significantly larger than chance for at least one classification method (adjusted for multiple comparisons using a false discovery rate with q = 0.05). Significance is marked with an asterisk. (B) Average accuracy across subjects for subjects with statistically significant performance.
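The caption states that significance was adjusted with a false discovery rate of q = 0.05 but does not name the procedure. The sketch below assumes the common Benjamini-Hochberg step-up procedure; the function name and the p-values are invented for illustration and are not taken from the study.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Boolean mask of hypotheses rejected at FDR level q (BH step-up, assumed)."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)                      # indices of p-values, ascending
    thresholds = q * np.arange(1, m + 1) / m   # k-th smallest p is compared to q*k/m
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        # Step-up rule: reject everything up to the largest rank that passes,
        # even if some smaller ranks failed their own threshold.
        k = np.max(np.nonzero(passed)[0])
        reject[order[:k + 1]] = True
    return reject

# Hypothetical per-subject p-values (not from the paper).
p_values = [0.029, 0.6, 0.001, 0.025, 0.2]
significant = benjamini_hochberg(p_values, q=0.05)
```

With these made-up values, 0.025 fails its own threshold (0.02) but is still rejected because the next-ranked value, 0.029, passes its threshold (0.03); this is the step-up behavior that distinguishes the procedure from a simple per-test cutoff.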
Figure 6. Neural tracking of attended (●) and unattended (○) speech
Two averages are displayed: (A) Subjects for which performance was significantly better than chance for at least one classification method and (B) subjects for which performance was at chance level. For the significant subjects, the tracking of the attended speech is both stronger and more widely distributed than the tracking of the unattended speech. For the non-significant subjects, the overall activation index is smaller. In addition, there is only a marginal difference in spatial distribution between attended and unattended stimuli.
Figure 7. Neural tracking of attended and unattended speech across different frequencies
(A) and (B) show correlation coefficients across different frequencies, averaged across subjects, and for attended (orange trace) and unattended speech (blue trace). (A) Subjects for which performance was significantly better than chance for at least one classification method. (B) Subjects for which performance was at chance level. For the significant subjects, the tracking of the attended speech is stronger across all frequency bands, especially in the high gamma band (70–170 Hz, gray shaded). The tracking shows a negative relationship in the low frequency band (10–30 Hz). For the non-significant subjects, this negative relationship at lower frequencies is not apparent and the tracking of attended and unattended speech at higher frequencies is at the same low level.
Figure 8. Accuracy for different segment lengths for univariate (blue) and multivariate methods (orange)
The classification accuracy increases steadily with segment length for both classification methods. Multivariate classification results in higher average accuracy than univariate classification for all segment lengths.
Figure 9. Effect of ‘tuning-in’ on correlation (A) and classification accuracy (B)
For the first second, the difference between the ‘attended’ and ‘unattended’ correlation remains zero, resulting in a classification accuracy around chance level (i.e., 50%). Subsequently, correlation and classification accuracy trend upward, with superimposed cycles of higher and lower correlation and classification accuracy.