NCBI Bookshelf. A service of the National Library of Medicine, National Institutes of Health.

National Academy of Sciences; Striedter GF, Avise JC, Ayala FJ, editors. In the Light of Evolution: Volume VI: Brain and Behavior. Washington (DC): National Academies Press (US); 2013 Jan 25.




15 Integration of Faces and Vocalizations in Ventral Prefrontal Cortex: Implications for the Evolution of Audiovisual Speech


The integration of facial gestures and vocal signals is an essential process in human communication and relies on an interconnected circuit of brain regions, including language regions in the inferior frontal gyrus (IFG). Studies have determined that ventral prefrontal cortical regions in macaques [e.g., the ventrolateral prefrontal cortex (VLPFC)] share cytoarchitectonic features with cortical areas in the human IFG, suggesting structural homology. Anterograde and retrograde tracing studies show that macaque VLPFC receives afferents from the superior and inferior temporal gyrus, which provide complex auditory and visual information, respectively. Moreover, physiological studies have shown that single neurons in VLPFC integrate species-specific face and vocal stimuli. Although bimodal responses may be found across a wide region of prefrontal cortex, vocalization-responsive cells, which also respond to faces, are mainly found in anterior VLPFC. This suggests that VLPFC may be specialized to process and integrate social communication information, just as the IFG is specialized to process and integrate speech and gestures in the human brain.

The area dedicated to language processing in the frontal lobe is located within the inferior frontal gyrus (IFG), which can be further subdivided into the pars opercularis (most posterior portion of the IFG), the pars triangularis, and the pars orbitalis (cortex inferior and anterior to the horizontal ramus of the lateral fissure). These subdivisions include Brodmann areas 44, 45, and 47. Our understanding of the functions within this specialized area of cortex is hampered by the fact that no other mammal has a frontal lobe of similar organization or complexity, leaving few animal models to investigate. Among nonhuman primates, only catarrhines have a well-developed frontal lobe with cytoarchitectonic evidence of Brodmann areas 44, 45, and 47 (Petrides and Pandya, 2002). In contrast, New World monkeys including marmosets and squirrel monkeys have a lissencephalic frontal lobe with previously identified motor and premotor cortices but less clearly defined prefrontal regions (Preuss, 2007). Recent work has identified an area in the ventrolateral prefrontal cortex (VLPFC) of rhesus macaques (Macaca mulatta) that is involved in the processing and integration of vocalizations and faces. We have hypothesized that the VLPFC became specialized for the processing and integration of auditory and visual communication signals, at least in early anthropoid primates, and that this region was ultimately modified and lateralized to the left cerebral hemisphere to subserve language in modern humans.


Organization of VLPFC

The frontal lobe of the macaque monkey has been studied more extensively with anatomical, electrophysiological, and functional methodologies than that of other primate species. The VLPFC of the macaque monkey, also referred to as the inferior convexity of the PFC, includes the cortical region ventral to the principal sulcus and anterior to the inferior limb of the arcuate sulcus (Fig. 15.1). The cytoarchitectonic areas of VLPFC in the macaque are arranged in a similar fashion to that of the human frontal lobe and include regions on the lateral frontal surface: area 45, which lies just anterior to the inferior arcuate sulcus; area 12 (or 12/47), which lies anterior to area 45 and ventral to area 46; and the most ventrolateral extent of the inferior convexity, which wraps around the inferior gyral surface and extends to the lateral orbital sulcus: area 12 orbital (Preuss and Goldman-Rakic, 1991). Additional architectonic studies have described areas within the arcuate sulcus and premotor cortex as well as the subdivisions of the orbital cortex (Carmichael and Price, 1995; Petrides and Pandya, 2002; Saleem et al., 2008; Gerbella et al., 2010), but we will confine our discussion to VLPFC, areas 45 and 12/47. We will refer to area 12 in general as area 12/47 to convey the homology of area 12 in the macaque with human area 47 as introduced by Petrides and Pandya (2002). However, to differentiate between the ventrolateral region below the principal sulcus and the lateral orbital cortex we will use 12 ventrolateral (12vl) and 12 orbital (12o), respectively, as defined by Preuss and Goldman-Rakic (1991).

FIGURE 15.1. Organization of ventral PFC. Maps of the cytoarchitectonic organization of ventral PFC are shown for the human (A and B) and macaque brain (C–E). (A) Brodmann map (1909) of the human brain with areas 44, 45, and 47 marked on the IFG. Reproduced ...

The VLPFC also commonly includes a small sulcus—termed the inferior frontal sulcus by Winters et al. (1969) and the infraprincipal dimple or the inferior prefrontal dimple (IPD) by others (Paxinos et al., 2000; Petrides and Pandya, 2002; Petrides et al., 2005)—which varies in its position and depth in M. mulatta. Some schematics of VLPFC depict the IPD as running in a rostral-to-caudal direction and separating area 45A from area 46 (Petrides and Pandya, 2002). However, in our neurophysiological recordings (Romanski and Goldman-Rakic, 2002; Romanski et al., 2005) and in other studies (Petrides et al., 2005), it is depicted as running dorsal to ventral and separating area 45 from area 12/47. It is not always described or visible in studies of other subspecies of macaque monkeys. Thus, there is variability in the position of area 12/47 and area 45 not only between the subspecies of macaques but also within M. mulatta individuals. As explained later, the IPD is the primary location in which auditory responsive neurons and audiovisual responsive cells have been reliably located in several studies, and may be a critical landmark for delineating the functional auditory responsive prefrontal region in macaques. Whether it defines the border of areas 12/47 and 45 is unclear.

In the human brain, areas 44 and 45 have been associated with language processing, an association confirmed by electrical stimulation, PET, and functional MRI (fMRI). However, the areas that control vocalization production in Old World monkeys are not as well understood; they could include VLPFC, although other studies have implicated ventral premotor cortex and the cingulate vocalization area (Petrides et al., 2005; Jürgens, 2009; Coudé et al., 2011).

Cytoarchitectonic Organization of VLPFC

The cytoarchitectonic descriptions here are taken from Preuss and Goldman-Rakic (1991), who described the frontal lobe of the rhesus macaque, M. mulatta, which is the same species that has been examined in most neurophysiology studies of VLPFC. These descriptions are in general agreement with Petrides and Pandya (2002). Area 45 is located ventral to the caudal principal sulcus within the ventral limb of the arcuate and extends onto the cortical surface (Fig. 15.1C–E). It is composed of large pyramidal cells in layer V and deep layer III. Layer IV is thick with densely packed small cells, with some of the larger pyramidal cells from deep layer III and superficial layer V intruding on layer IV. It is densely myelinated. Area 45 is bordered dorsally by area 8a (Preuss and Goldman-Rakic, 1991), which can be distinguished from area 45 by the presence of extremely large pyramidal cells in layer Va of area 8. Area 12 (areas 12vl and 12o), which covers the surface of the ventrolateral convexity and extends onto the lateral orbital surface as far as the lateral orbital sulcus, can be distinguished from area 46 by its more heavily myelinated appearance. The disappearance of the large layer III pyramidal cells marks the transition from area 45 to area 12vl. Area 12vl on the ventrolateral surface has been distinguished from area 12o by the more diffuse myelinated appearance of 12o and the more granular layer IV of 12vl. A series of comparative cytoarchitectonic studies have examined the similarities of area 12 in macaque and area 47 in the human brain (Petrides and Pandya, 2002). As a result, area 12 has been referred to as area 12/47 even though the assignation of 12 was renewed in a recent analysis of VLPFC connections (Gerbella et al., 2010). In addition, studies by Petrides and Pandya (2002) and Gerbella et al. (2010) have suggested that area 45 be divided into subdivisions 45B, closest to and within the anterior limb of the anterior bank of the arcuate sulcus, and area 45A, located rostral to 45B and extending across the surface of the inferior convexity to the IPD, with area 12vl (i.e., 47) bordering 45A rostrally and ventrally (Fig. 15.1). The precise location of area 12vl relative to area 45A may vary somewhat in individuals and may be better determined from a combination of connectivity studies and physiological recordings.

Cortical Connectivity of VLPFC

Connectivity of VLPFC with Cortical Visual Processing Regions

Much of what we know about the cellular functions of the primate PFC is based on the processing of visual information. Thus, it is not surprising that many studies have examined projections from visual association cortex to the primate PFC. Results indicate that VLPFC receives afferents from extrastriate visual cortical areas in the inferotemporal cortex, including area TE. Early anatomical studies by Barbas, Pandya, and others examined the innervation of the entire prefrontal mantle by visual association areas (Chavis and Pandya, 1976; Barbas, 1988; Barbas and Pandya, 1989). Barbas was among the first to note that basoventral prefrontal cortices were more strongly connected with extrastriate, ventral visual areas, which have been implicated in object recognition and feature discrimination. In contrast, medial and dorsal prefrontal cortices are more densely connected with medial and dorsolateral occipital and parietal areas, which are associated with visuospatial functions (Barbas, 1988). This dissociation was confirmed by Bullier et al. (1996), who found a segregation of inputs to caudal PFC when paired injections of tracers were placed into temporal and parietal visual processing regions. In their study, visual temporal cortex projected mainly to ventrolateral PFC, area 45, whereas parietal cortex sent projections to ventrolateral PFC and dorsolateral PFC (areas 8a and 46) (Bullier et al., 1996). Tracing and lesion studies by Ungerleider et al. (1989) showed that area TE projected specifically to three ventral prefrontal targets, including the ventral limb of the arcuate sulcus (area 45), the inferior convexity just ventral to the principal sulcus (area 12vl), and within the lateral orbital cortex (areas 11 and 12o). These projections are via the uncinate fasciculus (Ungerleider et al., 1989). 
Furthermore, ventrolateral PFC areas 12vl and 45, which contain object- and face-selective neurons (Wilson et al., 1993; Ó Scalaidhe et al., 1997, 1999), were shown to be connected with inferotemporal areas TE and TEO (Webster et al., 1994), with the strongest innervation of ventrolateral PFC and orbitofrontal areas 11 and 12o originating in TE.

Auditory Projections to PFC

In contrast to the visual pathways, the prefrontal targets of central auditory pathways have not been studied as extensively despite the accepted role of the frontal lobe in language. In early anatomical studies, lesion/degeneration techniques were used to reveal projections from the caudal superior temporal gyrus (STG) to the principal sulcus region, the arcuate cortex, and the inferior convexity of the frontal lobe, and from the middle and rostral STG to the rostral principal sulcus and orbital regions (Pandya et al., 1969; Jones and Powell, 1970; Chavis and Pandya, 1976). Additional studies revealed connections between the lateral PFC and cortical areas within the STG (Galaburda and Pandya, 1983; Barbas and Mesulam, 1985; Barbas, 1988; Barbas and Pandya, 1989). There was a suggestion of rostrocaudal topography in these studies whereby anterior and middle aspects of the principal sulcus, including areas 9, 10, and rostral 46, were connected with the middle STG, whereas area 8 received projections from mostly caudal STG (Barbas and Mesulam, 1985; Barbas, 1988; Petrides and Pandya, 1988). It became clear that VLPFC received afferents from the STG, inferotemporal cortex, and multisensory regions within the superior temporal sulcus (STS).

Importantly, detailed anatomical studies by Morel et al. (1993), Jones et al. (1995), and Hackett et al. (1998), together with parallel neurophysiological studies by Rauschecker et al. (1995), provided evidence that primate auditory cortices were organized as a core-belt system with a third zone, the parabelt just lateral to the belt (Morel et al., 1993; Jones et al., 1995; Hackett et al., 1998). A series of landmark neurophysiology studies provided the first electrophysiological evidence for three separate tonotopic regions (AL, ML, and CL) in the belt cortex that could be distinguished from the core A1 (Rauschecker et al., 1995). Additional studies have described functional dissociations of anterior and posterior belt and parabelt regions (Tian et al., 2001). Rauschecker et al. (1995), Morel et al. (1993), and Hackett et al. (1998) used a common terminology to delineate auditory cortex in anatomical and physiological studies, which enabled cross-talk and comparisons that fostered progress in the study of auditory cortical processing and organization.

Combining physiological recording with anatomical tract tracing, Romanski et al. (1999b) analyzed the connections of physiologically defined areas of the belt and parabelt auditory cortex. They determined that rostral and ventral PFC receives projections from the anterior auditory association cortex (areas AL and anterior parabelt) and caudal prefrontal regions are innervated by posterior auditory cortex (areas CL and caudal parabelt; Fig. 15.2). Together with auditory physiological recordings from the lateral belt (Tian et al., 2001) and from the PFC (Romanski and Goldman-Rakic, 2002; Romanski et al., 2005), these studies suggest that separate auditory streams originate in the caudal and rostral auditory cortex and target dorsolateral spatial and anterior-ventrolateral object domains in the frontal lobe, respectively (Romanski, 2007). This is similar to the dorsal and ventral streams described for the visual system (Ungerleider and Mishkin, 1982). Ultimately, this also implies that auditory and visual afferents target similar functional domains of dorsal and ventral PFC (Romanski et al., 2005). The convergence of auditory and visual ventral stream inputs to the same VLPFC domain suggests that they may be integrated and combined to serve a similar function, for example, that of object recognition, which is aided by the integration of multiple sensory inputs.

FIGURE 15.2. Dual streams of auditory afferents target the PFC. (A) The auditory cortex core (A1) surrounded by the belt (box delineates the lateral belt cortex shown in B). Injections placed into the auditory belt cortex at similar frequency-mapped locations in AL ...

Examination of the connections of VLPFC without accompanying physiology has suggested that area 45A receives greater inputs from the STG than from inferotemporal cortex (Petrides and Pandya, 2002; Gerbella et al., 2010). This is in contrast to previous analysis of anterograde projections of the STG and of inferotemporal cortex. These anterograde studies suggest that STG and STS innervate area 12/47 whereas inferotemporal and STS cortex project to area 45 and area 12/47. Much of the debate appears centered on where the boundary between area 45 and area 12/47 occurs, and may be clarified with additional neurophysiological recordings and combined anatomical connectivity studies.


Visual Processing in VLPFC

Decades of research have demonstrated the frontal lobe's involvement in cognitive functions including working memory, decision making, and social communication processes such as language and face–voice processing. Single unit recording studies in animal models have characterized dorsolateral prefrontal cortex (DLPFC) neuronal involvement in visuospatial processing, saccadic eye movements, and working memory (Bruce and Goldberg, 1985; Funahashi et al., 1989, 1993; Quintana and Fuster, 1992; Chafee and Goldman-Rakic, 1998). Further investigations have emphasized a process-oriented role for DLPFC and have described single-unit activity of prefrontal neurons during decision making, categorization, numerosity, and the coding of abstract rules (Kim and Shadlen, 1999; Miller and Cohen, 2001; Nieder et al., 2002; Freedman and Miller, 2008).

In contrast, investigation of the cellular activity in the VLPFC has focused on object processing and social communication. Early studies of VLPFC showed that neurons in this region were responsive to simple and complex visual stimuli presented at the fovea (Rosenkilde et al., 1981; Suzuki and Azuma, 1983). Face-responsive neurons were documented by Thorpe et al. (1983) and Rolls et al. (2006) and later described in detail by Goldman-Rakic and coworkers (Ó Scalaidhe et al., 1997, 1999; Wilson et al., 1993). In these studies, Wilson et al. (1993) showed that DLPFC and VLPFC neurons responded differentially to spatial and object features of visual stimuli. These studies were the first to demonstrate a functional dissociation between DLPFC and VLPFC by using single-unit electrophysiology. Wilson et al. (1993) showed that DLPFC neurons were selectively engaged by visuospatial memory tasks and VLPFC neurons were selective for color, shape, or type of visual objects. An earlier study by Mishkin and Manning (1978) showed that lesions of VLPFC in nonhuman primates interfere with the processing of nonspatial information, including color and form. Electrophysiological recordings demonstrated that VLPFC face cells had a twofold increase in firing rate to face stimuli compared with nonface stimuli during passive presentations or during working memory tasks (Ó Scalaidhe et al., 1997, 1999). Face cells were found only in the VLPFC and not in DLPFC, and were localized to three small parts of VLPFC, including a patch on the lateral convexity close to the lower limb of the arcuate sulcus (area 45), within and around the IPD (area 12vl), and a small number of cells in the lateral orbital cortex (Ó Scalaidhe et al., 1997). VLPFC face cells were sensitive to changes in facial features, expressions, or the angle of gaze, much like the inferotemporal cortical regions, which project to these VLPFC cells. 
These studies have suggested that VLPFC cells may encode identity, expression, and face view (Ó Scalaidhe et al., 1997, 1999; Rolls et al., 2006; Romanski and Diehl, 2011). Data from the single-unit recordings have been confirmed with fMRI studies in macaque monkeys (Tsao et al., 2008b), which have demonstrated activation of face-responsive “patches” in the same arcuate, ventrolateral, and orbitofrontal locations shown by Ó Scalaidhe et al. (1997, 1999). Demonstration by both methods of visual responsiveness and face selectivity substantiates the notion that VLPFC in the macaque monkey is involved in object and face processing (Fig. 15.3).
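The twofold firing-rate difference reported for VLPFC face cells can be illustrated with a small calculation. The Python sketch below is only an illustration under stated assumptions: the function name `is_face_cell`, the use of mean firing rates in spikes per second, and the default `ratio` of 2.0 are hypothetical choices made here, and the sketch does not reproduce the selection criteria actually used by Ó Scalaidhe et al. (1997, 1999).

```python
from statistics import mean


def is_face_cell(face_rates, nonface_rates, ratio=2.0):
    """Return True if the mean response to faces is at least `ratio`
    times the mean response to nonface stimuli.

    Rates are firing rates in spikes/s. The function name and the
    default ratio are hypothetical, chosen only to mirror the
    "twofold increase" described in the text.
    """
    nonface_mean = mean(nonface_rates)
    if nonface_mean <= 0:
        raise ValueError("nonface responses must have a positive mean rate")
    return mean(face_rates) / nonface_mean >= ratio


# Made-up example rates: faces average 22 spikes/s, nonfaces 10 spikes/s.
print(is_face_cell([20, 24, 22], [10, 9, 11]))
```

A ratio of means is the simplest possible summary; an actual analysis would normally also test whether the difference is statistically reliable across trials.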

FIGURE 15.3. Face-responsive neurons in the VLPFC. (A and B) Face-responsive neurons recorded by Ó Scalaidhe et al. (1997) are depicted. Adapted from Ó Scalaidhe et al. (1997). (A) Region recorded in the PFC is indicated with a circle on the lateral ...

Auditory Responses and Function in Ventral PFC

The ventral frontal lobe has long been linked with complex auditory function through its association with language functions in the IFG. The results of some studies have suggested parcellation of function in the human IFG. The anterior region, the pars triangularis (area 45), along with the pars orbitalis (area 47), has been suggested to be more involved in semantic processing, comprehension, and auditory working memory (Démonet et al., 1992; Paulesu et al., 1993; Buckner et al., 1995; Demb et al., 1995; Stromswold et al., 1996; Price, 1998; Poldrack et al., 1999; Gelfand and Bookheimer, 2003). In contrast, the pars opercularis (area 44) and ventral premotor cortex are more active during phonological processing and speech production. The precise neuronal mechanisms that occur in the frontal lobe during the processing of complex auditory information are unknown but might be indirectly assessed with neurophysiological recordings in animals with similar ventral frontal lobe regions, such as macaque monkeys.

Neuronal responses to acoustic stimuli have been sporadically noted in the frontal lobes of Old and New World monkeys (Newman and Lindsley, 1976; Wollberg and Sela, 1980; Tanila et al., 1993). However, when recordings targeted cortical areas that had been shown to receive projections from acoustically characterized regions of the auditory belt and parabelt cortex (Romanski et al., 1999b), a discrete auditory responsive region was localized in VLPFC (Romanski and Goldman-Rakic, 2002). This VLPFC cortical region is thought to be the termination of a ventral auditory processing stream, specialized for the processing of nonspatial (i.e., object) auditory information (Romanski et al., 1999b; Romanski, 2007; Cohen et al., 2009; Romanski and Averbeck, 2009). The auditory responsive region of VLPFC is located rostral to the ventral limb of the arcuate sulcus below the principal sulcus, in the area of the IPD. This region receives projections from ventral stream auditory cortical regions and polymodal cortex of the STS, as discussed earlier (Carmichael and Price, 1995; Hackett et al., 1999; Romanski et al., 1999a,b; Petrides and Pandya, 2002). VLPFC auditory neurons are responsive to complex auditory stimuli, including vocalization and complex nonvocal stimuli (Romanski and Goldman-Rakic, 2002). This small ventrolateral prefrontal auditory region has also been shown to be active in neuroimaging studies in rhesus monkeys during presentation of complex acoustic stimuli (Poremba and Mishkin, 2007).

The VLPFC auditory area was analyzed with a large library of rhesus macaque vocalizations to test selectivity to specific call categories, as previous analysis had implied some selectivity for calls with common functions (Gifford et al., 2005). Analysis of these vocalization responses with exemplars from 10 different types of calls demonstrated that neurons tended to respond to two or three vocalization types that had similar acoustic morphology rather than similar behavioral referents (Fig. 15.4; Romanski et al., 2005). Additional electrophysiological recording studies by Gifford et al. (2005) and Russ et al. (2007) have suggested that VLPFC neuronal activity is modulated during categorization of acoustic stimuli and in auditory decision making (Lee et al., 2009). These combined data are consistent with a role for VLPFC in a ventral auditory processing stream for auditory objects, including vocalizations. The localization of this auditory processing area to the ventral prefrontal region of Old World monkeys suggests a functional similarity between it and human language-processing regions in the ventral or inferior frontal lobe of the human brain (Deacon, 1992; Romanski and Goldman-Rakic, 2002; Aboitiz, 2012).

FIGURE 15.4. Auditory responsive neurons in VLPFC. A single-cell example of responses to four different vocalization stimuli is shown in the top part of the figure. The response is shown as raster and shaded spike density function in response to a “grunt” ...

Multisensory Responses in VLPFC

The initial physiological studies of VLPFC suggested that auditory and visual object processing regions were located adjacent to one another in VLPFC. Any overlap in these auditory and visual responsive zones could thus be sites for multisensory integration of complex auditory and visual information. As neurons in this region are face- and vocalization-responsive, multisensory neurons in the macaque VLPFC might integrate face and vocal information. Given that the percentage of neurons responsive to visual stimuli was much greater than the percentage of auditory responsive cells (55% vs. 18%), we reasoned that multisensory cells are more likely to be located in regions where auditory cells had been recorded and predicted that multisensory neurons might be found only in this region. In our neurophysiological investigation, we presented movies of familiar monkeys vocalizing to macaque monkeys while single neurons were recorded from the VLPFC (Sugihara et al., 2006). These movies were separated into audio and video streams, and neural responses to the unimodal stimuli were compared with the responses to the combined audiovisual stimuli. Interestingly, approximately half the neurons recorded in the VLPFC were multisensory, either because they responded to both unimodal auditory and unimodal visual stimuli (bimodal responses) or because they showed an enhanced or decreased response to the combined audiovisual stimulus (face and vocalization) compared with the response to the unimodal stimuli (Sugihara et al., 2006). This is likely to be an underestimate of the percentage of multisensory responses because we used a limited set of audiovisual stimuli and neurons were found to be selective for particular face–vocalization pairs.
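The two routes to a multisensory label described above (bimodal responsiveness, or a changed response to the combined stimulus) can be sketched as a simple decision rule. The Python example below is a minimal illustration and not the statistical analysis used by Sugihara et al. (2006): the function name `classify_neuron`, the use of mean firing rates, and the fixed thresholds `resp_thresh` and `interact_thresh` are assumptions introduced here, whereas a real analysis would rely on statistical tests across trials.

```python
def classify_neuron(baseline, aud, vis, av,
                    resp_thresh=2.0, interact_thresh=2.0):
    """Label a neuron from mean firing rates (spikes/s).

    baseline: spontaneous rate; aud, vis: rates to the unimodal
    stimuli; av: rate to the combined audiovisual stimulus.
    Both thresholds are hypothetical stand-ins for a statistical test.
    """
    aud_resp = abs(aud - baseline) >= resp_thresh
    vis_resp = abs(vis - baseline) >= resp_thresh

    # The unimodal response that deviates most from baseline.
    best_unimodal = max(aud, vis, key=lambda r: abs(r - baseline))
    # Interaction: the combined response is enhanced or decreased
    # relative to that best unimodal response.
    interaction = abs(av - best_unimodal) >= interact_thresh

    if aud_resp and vis_resp:
        return "multisensory (bimodal)"
    if interaction:
        return "multisensory (interaction)"
    if aud_resp:
        return "unimodal auditory"
    if vis_resp:
        return "unimodal visual"
    return "unresponsive"


# Made-up rates: a visual neuron whose response drops when the
# vocalization is added.
print(classify_neuron(baseline=5, aud=5.5, vis=15, av=9))
```

Under this rule a neuron with a clear response to only one modality still counts as multisensory if the combined stimulus changes its response, which matches the second criterion in the text.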

VLPFC neurons exhibited multisensory enhancement or suppression (Fig. 15.5) just as neurons do in the superior colliculus, the STS, and auditory cortex during multisensory integration (Stein and Meredith, 1993; Barraclough et al., 2005; Ghazanfar et al., 2005; Lakatos et al., 2009). It was also interesting that face/voice stimuli evoked multisensory responses more frequently than nonface/nonvoice audiovisual stimuli. This adds support to the notion that VLPFC is part of a circuit that is specialized for the integration of social communication information rather than sensory stimuli in a general sense. In localizing these multisensory responses to the PFC, there appeared to be two somewhat separate VLPFC regions for multisensory processing. Interestingly, these two separate clusters of multisensory neurons overlap with two prefrontal face patches described by Ó Scalaidhe et al. (1997) and Tsao et al. (2008b) in the arcuate and ventrolateral PFC areas. In our study, there was a large pool of unimodal visual neurons with a small number of multisensory cells located in posterior VLPFC (area 45). Unimodal neurons in this area are mostly visual and respond to faces and nonface stimuli such as objects, shapes, and patterns. The multisensory neurons in this arcuate region (Fig. 15.5) have strong visual responses modulated by the simultaneous presentation of auditory stimuli. There are strong projections to this area from the inferotemporal cortex and the polymodal STS, which have been associated with the processing of facial identity and facial expression. Previous studies in nonhuman primates of visual working memory, decision making, and visual search (Wilson et al., 1993; Kim and Shadlen, 1999; Freedman and Miller, 2008) have noted responsive neurons within this arcuate region as well as in the more commonly recorded principal sulcus region.
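Multisensory enhancement and suppression are often summarized with an index that compares the combined response with the larger of the two unimodal responses, a convention associated with the superior colliculus literature (Stein and Meredith, 1993). The Python sketch below illustrates that convention only; the function names, the tolerance `tol`, and the example rates are assumptions made here, and the sketch is not the analysis applied to the VLPFC data.

```python
def multisensory_index(aud, vis, av):
    """Percent change of the audiovisual response relative to the
    larger unimodal response: 100 * (AV - max(A, V)) / max(A, V).

    Inputs are assumed to be evoked rates in spikes/s.
    """
    best = max(aud, vis)
    if best <= 0:
        raise ValueError("the larger unimodal response must be positive")
    return 100.0 * (av - best) / best


def interaction_type(aud, vis, av, tol=10.0):
    """Label the interaction; `tol` (in percent) is a hypothetical
    cutoff standing in for a proper statistical test."""
    idx = multisensory_index(aud, vis, av)
    if idx > tol:
        return "enhancement"
    if idx < -tol:
        return "suppression"
    return "no clear interaction"


# Made-up rates: the combined response is well below the visual one.
print(multisensory_index(10, 20, 12), interaction_type(10, 20, 12))
```

A positive index indicates enhancement and a negative index indicates suppression; comparing the combined response with the sum of the unimodal responses is a common alternative when the question is whether the interaction is nonlinear.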

FIGURE 15.5. Multisensory neurons in the VLPFC. The responses of two single units are shown in A and B as raster/spike density plots to an auditory stimulus alone (a vocalization, AUD) and face (Vis) and both presented simultaneously (AV). Right: Bar graph of mean ...

A smaller, potentially more specialized pool of multisensory neurons is located in VLPFC, anterior and lateral to the first pool (Fig. 15.5). These neurons are found near the IPD, and within its banks in area 12vl of Preuss and Goldman-Rakic (1991). This is the region where unimodal auditory responsive neurons were predominantly localized in previous studies (Romanski and Goldman-Rakic, 2002; Romanski et al., 2005). These anterolateral VLPFC neurons respond to vocalizations and to faces, but only weakly to other visual stimuli (Romanski and Goldman-Rakic, 2002; Sugihara et al., 2006). This area receives afferents from mainly polymodal STS cortical regions and also from auditory association cortex, including a small amount of afferents from the belt, more from the parabelt, and the largest contribution from the rostral temporal lobe (Romanski et al., 1999a,b; Hackett et al., 1999). Multisensory responses here favor vocalizations and their corresponding faces, suggesting a more specialized role in the integration of social communication information. Face-responsive cells recorded in this area, which were selective for forward gaze, such as that which occurs in face-to-face communication, were also more likely to be auditory responsive (Romanski and Diehl, 2011). Most of the multisensory neurons exhibited suppression rather than enhancement. This nonlinear interaction has also been noted in auditory cortex (Ghazanfar et al., 2005; Lakatos et al., 2009). This anterolateral pool of multisensory neurons may be specialized for the integration of social communication sounds with facial gestures and other communication-oriented information. In contrast, the more posterior multisensory neurons may serve a more general integrative purpose.


How does this compare with the ventral frontal lobe in the human brain? Whereas many associate the human IFG with only spoken language and verbal processing, communication is, in fact, a multisensory process. Several well-known illusions owe their effects to specific aspects of multisensory integration, including the McGurk and ventriloquist effects (McGurk and MacDonald, 1976; Bertelson and Aschersleben, 2003). Although cross-modal integration takes place over a network of areas in the brain (Driver and Noesselt, 2008; Stein and Stanford, 2008), the same areas that underlie speech and language processing in the temporal and frontal cortex play an essential role in the integration of audiovisual communication information. The STS and the ventral frontal lobe are both sites of activation during the processing and integration of speech and gestures (Homae et al., 2002; Jones and Callan, 2003; Beauchamp et al., 2010; Noppeney et al., 2010). In an fMRI study of speech and gesture, Xu et al. (2009) found overlap of activation in two regions of the IFG when subjects viewed gestures or listened to a voicing of the phrase that fit the gesture. The activated regions included a large cluster in the pars triangularis and pars opercularis (areas 44 and 45) and a smaller focal cluster in pars orbitalis, area 47 (Fig. 15.6). Xu et al. (2009) argued that the IFG most likely plays a larger role in communication than classical auditory-speech processing and theorized that linking meaning with acoustic or visual symbols may be the essential function of these inferior frontal regions.

FIGURE 15.6. Convergence of speech- and gesture-responsive regions in the human frontal lobe. Shown are three activation clusters that were active for speech and gesture conditions. There is a large activation cluster in the posterior temporal lobe and in the IFG ...

Thus, the process of linking, or integrating, phonological constructs with auditory objects results in the perception of spoken words, whereas integration of a visual image of letters with their learned meanings conveys the concept of a word. Integration might then be one of many basic processes the human frontal lobe performs during speech, language, and communication. The linking, or integrating, of face and vocal information in the macaque monkey frontal lobe could be seen as a precursor to the more complex functions that the IFG performs in the human brain, whereby abstract concepts are united with images and sounds. In the human brain, words, sounds, gestures, and visual images are each integrated with meaning and with each other (Xu et al., 2009). In more primitive primates, such as macaques, in which abstraction is not likely to occur, faces are integrated with vocalizations. It has been previously suggested that the ventral PFC is essential in associating a visual cue with an action (Passingham et al., 2000). Our data suggest that VLPFC may associate auditory cues with gestural actions, which is necessary during communication.

Depending on the aspects of the stimuli that are integrated, several communication-relevant functions may be accomplished through integration. In humans, adding mouth movements or facial expressions to spoken words can clarify or even alter the meaning of an utterance (McGurk and MacDonald, 1976). Incongruency between face identity and a voice or between a facial expression and a vocal sound is detected by humans and activates prefrontal and temporal cortical regions. Some human neuroimaging studies have demonstrated a decrease in ventral prefrontal activity for incongruent faces and voices (Calvert et al., 2001; Homae et al., 2002; Jones and Callan, 2003). Others report increased activations during incongruent stimuli (Miller and D'Esposito, 2005; Ojanen et al., 2005; Hein et al., 2007). Recordings of macaque VLPFC neurons show that incongruent stimuli also evoke an increase or a decrease in neuronal activity depending on the original response to bimodal stimuli (Romanski and Diehl, 2011). The sign of the neuronal response may also be affected by facial features and emotional valence of the audiovisual stimuli.

Identity, or recognition, is another process that greatly benefits from the integration of face and vocal information [reviewed in Campanella and Belin (2007)], and studies have shown that animals match faces and corresponding voices as humans do (Jordan et al., 2005; Sliwa et al., 2011). The circuit for the processing of face identity includes the cortex within the STS and inferotemporal cortex, and single cells in these areas respond to facial identity and facial expression (Sugase et al., 1999; Eifuku et al., 2004). How multisensory neurons in the STS integrate face and vocalization information to enhance recognition is not known at the single-cell level, even though pairing of incongruent faces and vocalizations alters activity in this region. The STS has a robust connection with VLPFC and is likely to send unimodal and multisensory identity information to VLPFC neurons. Selectivity of face-responsive cells in VLPFC has been shown for particular individuals, expressions, or categories of face stimuli (Ó Scalaidhe et al., 1997, 1999; Rolls et al., 2006; Romanski and Diehl, 2011).

Evidence accumulated to date shows that cells in the ventral PFC of the macaque monkey respond to and integrate audiovisual information. VLPFC cells respond optimally to face and vocalization stimuli and exhibit multisensory enhancement or suppression when face and vocalization stimuli are combined. Thus, the ventral frontal lobe of nonhuman primates may have some basic functional homologies to the human frontal lobe, although more evidence from additional primate species is needed. The basic process of associating a face, or facial gesture, with a vocal stimulus, which occurs in the macaque PFC, may be a precursor to the more complex functions of the human frontal lobe, where semantic meaning is linked with acoustic or visual symbols.


Department of Neurobiology & Anatomy, University of Rochester School of Medicine, Rochester, NY 14642. E-mail: liz_romanski@urmc

Copyright 2013 by the National Academy of Sciences. All rights reserved.
Bookshelf ID: NBK207155

