Recognition Memory for Realistic Synthetic Faces

1Volen Center for Complex Systems, Brandeis University
2Department of Psychology, University of Pennsylvania
3Centre for Vision Research, York University

Correspondence concerning this article may be addressed to Robert Sekuler; e-mail may be sent to sekuler/at/brandeis.edu. The publisher's final edited version of this article is available free at Mem Cognit.

Abstract

A series of experiments examined short-term recognition memory for trios of briefly presented, synthetic human faces derived from three real human faces. The stimuli were graded series of faces, which differed by varying known amounts from the face of the average female. Faces based on each of the three real faces were transformed so as to lie along orthogonal axes in a 3-D face space. Experiment 1 showed that the synthetic faces' perceptual similarity structure strongly influenced recognition memory. Results were fit by NEMo, a noisy exemplar model of perceptual recognition memory. The fits revealed that recognition memory was influenced both by the similarity of the probe to series items, and by the similarities among the series items themselves. Non-metric multi-dimensional scaling (MDS) showed that faces' perceptual representations largely preserved the 3-D space in which the face stimuli were arrayed. NEMo gave a better account of the results when similarity was defined as perceptual, MDS similarity rather than physical proximity of one face to another. Experiment 2 confirmed the importance of within-list homogeneity directly, without mediation of a model. We discuss the affinities and differences between visual memory for synthetic faces and memory for simpler stimuli.

Kahana and Sekuler (2002) developed a computational model that successfully accounts for short-term recognition memory with low-dimensional stimuli, compound sinusoidal gratings whose spatial frequency and phase varied. 
Building on Nosofsky's (1984, 1986) Generalized Context Model (GCM), Kahana and Sekuler's Noisy Exemplar Model, NEMo, combines core aspects of GCM with new key assumptions. NEMo follows the tradition of multidimensional signal-detection theory (e.g., Ashby & Maddox, 1998) in assuming that stimulus representations are coded in a noisy manner, with different levels of noise associated with various dimensions. NEMo augments the summed-similarity framework of item recognition (Clark & Gronlund, 1996; Nosofsky, 1991, 1992; Humphreys, Pike, Bain, & Tehan, 1989) with the idea that recognition decisions are influenced not only by probe-to-list-item similarity, but also by the similarity of list items to one another, a variable that is called within-list homogeneity. Specifically, subjects appear to interpret probe-to-list similarity in light of within-list homogeneity, with greater homogeneity leading to a greater tendency to reject lures that are similar to one or more of the studied items. This impact of within-list homogeneity has been confirmed by Nosofsky and Kantner (in press), using color patches as stimuli, and by Kahana, Zhou, Geller, and Sekuler (revision submitted), using compound gratings that were adjusted to reflect individual subjects' visual thresholds. In contrast to compound sinusoidal gratings, essential aspects of visual processing of human faces take place several synapses beyond the primary visual cortex (Loffler, Gordon, Wilkinson, Goren, & Wilson, 2005; Loffler, Yourganov, Wilkinson, & Wilson, 2005). 
Because the primary visual cortex participates not only in visual encoding but also in visual memory and related phenomena (Magnussen & Greenlee, 1999; Kosslyn, Thompson, Kim, & Alpert, 1995; Klein, Paradis, Poline, Kosslyn, & Le Bihan, 2000), differences between visual processing of compound gratings and of human faces might produce corresponding differences in recognition memory for the two kinds of stimuli (Hole, 1996), and thereby undermine NEMo's applicability to face stimuli. Our aim here is not to explore or adjudicate among competing psychophysical or neural accounts of face perception and/or memory for faces (for example, Gauthier, Skudlarski, Gore, & Anderson, 2000; Joseph & Gathers, 2002; Grill-Spector, Knouf, & Kanwisher, 2004; Riesenhuber, Jarudi, Gilad, & Sinha, 2004), but to test NEMo's extensibility to memory for high dimensional stimuli, and to refine the methodology for measuring and modeling short-term memory. Studies of face perception and/or face memory have used a range of stimuli, including simple cartoons, such as the Brunswik faces (Brunswik & Reiter, 1937; Sigala, Gabbiani, & Logothetis, 2002; Peters, Gabbiani, & Koch, 2003), photographs collected in convenience samples from sources such as school yearbooks, or images whose properties have been tailored to the study's specific purposes (e.g., Gold, Bennett, & Sekuler, 1999). We chose to work with realistic, synthetic human faces generated by the methods introduced by Wilson, Loffler, and Wilkinson (2002). As test stimuli, Wilson faces mitigate problems arising from the availability of non-facial information or from the availability of distinctive featural differences among faces (for example, Duchaine & Weidenfeld, 2003; Sadr, Jarudi, & Sinha, 2003). Most importantly for model-related research, the generating algorithm for Wilson faces makes it easy to manipulate perceptual differences among faces. 
Furthermore, sets of Wilson faces can be represented in n-dimensional perceptual spaces whose properties can be tailored to suit some particular theoretical objective. For example, the axes of a face space might be orthogonalized, or individual exemplar faces might be made equidistant from some mean or reference face. Also, small, graded differences between faces make it difficult for subjects to learn and name each face in a reliable, consistent fashion. This is important because naming can subvert mnemonic reliance on visual information (for example, Ashby & Ell, 2001; Goldstein & Chance, 1971; Hwang et al., 2005). To reinforce reliance on visual information per se, we limited rehearsal by permitting subjects only a brief glimpse of each face, and then allowing only a short interval between successive faces. Wilson's synthesized faces are geometrically simple, compared to actual, unprocessed gray scale photographs of faces, but these synthesized faces manage to convey sufficient information to permit identification of individual faces (Wilson et al., 2002). This individuality, which is important in episodic memory, is absent from some commonly used face stimuli, such as Brunswik faces (Brunswik & Reiter, 1937). To preview, Experiment 1's design allowed us to apply the NEMo model of visual short-term recognition memory to performance on individual stimulus lists (i.e., diverse series of stimulus items). The model was applied in alternative modes, for example expressing “similarity” either in terms of faces' physical coordinates, or in terms of perceptual coordinates, assessed using multidimensional scaling. The design of Experiment 2 provided a direct, model-free demonstration of within-list homogeneity's influence on recognition decisions.

Experiment 1

Perceptual similarity among stimuli plays a central role in the structure of NEMo, and in other models of recognition. 
With compound gratings as stimuli, similarity has been defined by representing stimuli in a metric based on subjects' discrimination thresholds for spatial frequency (Zhou, Kahana, & Sekuler, 2004; Kahana et al., revision submitted). That same approach would likely fail with face stimuli because the representational space is clearly anisotropic. For example, the difference threshold for discriminating between faces varies substantially from one region of face space to another, with smallest difference thresholds in the neighborhood of the average face (Wilson et al., 2002). To permit full expression of potential anisotropies, we used non-metric multidimensional scaling (MDS) to characterize the perceptual space within which our face stimuli were located, and to quantify the distances between faces in that space. The data used for MDS came from oddity judgments made on simultaneously-presented trios of faces. Subjects Two male and six female volunteers aged from 20 to 25 years participated in the main experiment. All were naive to the purpose of this experiment. They had normal or corrected-to-normal visual acuity as measured with Snellen targets, and normal contrast sensitivity as measured with Pelli-Robson charts (Pelli, Robson, & Wilkins, 1988). Stimuli Stimuli were generated and displayed using Matlab and extensions from the Psychophysics and Video Toolboxes (Brainard, 1997; Pelli, 1997). Stimuli were presented on a 15-inch computer monitor with a refresh rate of 95 Hz, and resolution set to 800 by 600 pixels. Routines from the Video Toolbox calibrated and linearized the display. Mean screen luminance was fixed at 36 cd/m2. The Wilson faces used in all our experiments were based on photographs of three Caucasian females whom we designate A, B, and C. From mA, the vector of 37 measurements taken on actual face A, we synthesized a realistic version of that face in a stimulus space of high dimensionality (n = 37). 
Vectors of measurements taken on faces A, B and C were transformed mathematically so as to be mutually orthogonal, by Gram-Schmidt orthogonalization (Diamantaras & Kung, 1996; Principe, Euliano, & Lefebvre, 2000). Consequently, variation in one face's geometric properties is independent of the variation in geometric properties of the other two faces (Wilson et al., 2002). Additional details of the faces' construction and properties are given in an Appendix. After pre-processing and normalization, vectors of measurements from several different faces can be combined to generate mavg, the vector of measurements for an average face. To illustrate, the synthesized mean female face is shown at the left of Figure 1.
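The orthogonalization step can be illustrated with a minimal sketch. It assumes each face is a plain list of 37 numbers; the random vectors below are hypothetical stand-ins, not the measurements of faces A, B, and C, and the helper names `dot` and `gram_schmidt` are introduced here only for illustration.

```python
import random

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return mutually orthogonal vectors spanning the same space as `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            # Remove from the running residual its projection onto each
            # earlier basis vector (modified Gram-Schmidt).
            coeff = dot(w, b) / dot(b, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        basis.append(w)
    return basis

# Hypothetical stand-ins for the 37-element measurement vectors of faces A, B, C.
rng = random.Random(0)
faces = [[rng.gauss(0.0, 1.0) for _ in range(37)] for _ in range(3)]
ortho = gram_schmidt(faces)
```

After this step every pair of output vectors has an inner product of zero (up to rounding), which is the sense in which variation along one face's axis is independent of variation along the other two.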
The graded series for faces A-C are shown in the upper three rows of Figure 1. A final set of faces, D, was generated by averaging corresponding exemplars of A, B, and C. The resulting faces are shown in the bottom row of Figure 1.
Procedure

On each trial, a study set of three faces, the study series, was followed by a single probe face (p). Subjects judged whether p had been among the items in the study series. We use the term target to designate a p that had been in the study series, and the term lure to designate a p that had not been in the study series. Correspondingly, we can designate any trial as either a target trial or a lure trial. Because the study series varied from trial to trial, subjects were forced to base each “yes”-“no” recognition judgment on the items they had just seen. Each study face was presented for 110 msec, with an inter-stimulus interval of 200 msec. The use of brief presentations was inspired by Wilson et al.'s (2002) use of this same duration in their studies of face discrimination, and by the fact that fairly detailed processing of a face can be completed within the first 100 msec of viewing (Lehky, 2000). A warning tone followed the study series. Then, after a 1200 msec retention interval, a p face was presented for 110 msec. For each study list, p was chosen at random from the entire set of faces, with two constraints. First, on half of all trials, p was forced to replicate one of the items in the study set (on the other half, p differed from all study items). Second, when p matched one of the study items, it matched items in each serial position equally often. Distinctive tones following each response gave subjects trial-wise knowledge of results. Although there were small differences in size from face to face, each was approximately 5.5 degrees high by 3.8 degrees wide. To eliminate the usefulness of vernier-type cues, the vertical and horizontal position of each face was perturbed on each presentation by adding a pair of random displacements drawn from a uniform distribution with mean = 12.6 minarc, and range = 2.1–23.1 minarc. Sixty different stimulus series were generated and used. 
A preliminary study with different subjects and many more stimulus series identified these 60 series as likely to span a wide range of recognition performance (Yotsumoto, Kahana, Wilson, & Sekuler, 2004). We reasoned that a wide range of performance would allow for the strongest test of NEMo. Of the 60 stimulus series, half comprised target trials, in which p replicated one of three study items; the remaining series comprised lure trials, in which p replicated none of the study items. Subjects participated in four one-hour sessions of 490 trials each. The first 10 trials of any session were treated as practice, and were eliminated from data analysis; this left 32 replications for each stimulus series and subject. During testing, a subject sat with head supported by a chin-and-forehead rest, viewing the computer display binocularly from a distance of 114 cm. Trials were self-initiated.

Multidimensional Scaling

To characterize the perceptual similarity structure of the synthetic faces (Lee, Byatt, & Rhodes, 2000), the subjects who would serve in the recognition memory experiment first took part in a study with non-metric multidimensional scaling (MDS). The data required for such scaling were generated using the method of triads (Ennis, Mullen, Frijters, & Tindall, 1989). On each trial, three faces were presented simultaneously, side by side, for 500 ms. Simultaneous presentation was used in order to minimize likely effects of memory. From the set of three faces, subjects chose the one face that seemed most different from the other two (Romney, Brewer, & Batchelder, 1993; Wexler & Romney, 1972). We did not specify the characteristic(s) on which similarity judgments should be based. To minimize the possibility that vernier-type cues might contribute to the dissimilarity judgments, each face's vertical position was randomly offset by a sample from a uniform distribution spanning ±16.8 min. 
Each possible stimulus pair (Face1, Face2) was presented with every other possible stimulus, for example Face3. If stimulus Face3 were selected as the stimulus most different from the others, then by default the remaining stimuli, Face1 and Face2, were assumed to be the most similar to one another. A similarity matrix was constructed by counting the number of times that a stimulus pair (e.g., Face1, Face2) was designated as “similar” when placed in combination with various other stimuli (e.g., Face3…Face21). To control the number of trials required for multidimensional scaling, we used a Balanced Incomplete Block design (Weller & Romney, 1988). For this design we generated triads of faces (“blocks”), whose members were drawn from the complete set of 21 faces. This selection was constrained so that each of the 210 pairs of faces occurred in the context of 30 triads. This arrangement meant that the 30 trials whose triads included any particular pair of faces were likely to have different faces as their third member. The displacements of the three faces were randomly determined for each trial. Each subject participated in three one-hour sessions of 710 trials each. The first 10 trials of each session were treated as practice, and were eliminated from our data analysis. The remaining 2100 triadic comparisons per subject were converted into a dissimilarity matrix, which was processed by SPSS' ALSCAL and INDSCAL routines, using a Euclidean distance model.

Results

Recognition Memory

Figure 3
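The triad bookkeeping described above can be sketched as follows. The first part checks the design arithmetic stated in the text (21 faces, 210 pairs, 2100 scored triads, 30 triads per pair); the second shows one way the "similar pair" counts might be tallied. The function name `similarity_counts` and the toy responses are hypothetical, not taken from the study.

```python
from math import comb

N_FACES = 21
N_TRIADS = 2100  # scored triads per subject: 3 sessions x (710 - 10) trials

n_pairs = comb(N_FACES, 2)               # distinct face pairs
pair_slots = 3 * N_TRIADS                # every triad contains 3 pairs
triads_per_pair = pair_slots // n_pairs  # triads in which a given pair appears

def similarity_counts(judgments, n_faces=N_FACES):
    """Tally how often each pair was left as the implicitly 'similar' pair.

    judgments: iterable of (triad, odd_one), where triad is a 3-tuple of
    face indices and odd_one is the face chosen as most different.
    Returns a symmetric n_faces x n_faces matrix of counts.
    """
    sim = [[0] * n_faces for _ in range(n_faces)]
    for triad, odd_one in judgments:
        i, j = [f for f in triad if f != odd_one]
        sim[i][j] += 1
        sim[j][i] += 1
    return sim

# Toy responses for illustration only (not data from the experiment).
toy = [((0, 1, 2), 2), ((0, 1, 3), 3), ((0, 2, 3), 0)]
sim = similarity_counts(toy)
```

Under this tally, a pair that is never split by an oddity choice accumulates a count equal to the number of triads it appeared in, and a dissimilarity value can then be obtained by subtracting each count from that maximum.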
Multidimensional Scaling MDS solutions were obtained for representations in one-to-six dimensions. Values of r2, which represent the proportion of variance accounted for in the scaled data, increased as the number of dimensions varied from one to three, but saturated thereafter. Based on these values and also on the dimensionality of the stimuli, subsequent modeling was based on the three dimensional MDS solution. In that solution, Kruskal's Stress measure was 0.26, and r2 was 0.62. The three-dimensional solution generated by MDS is shown in Figure 2B We used Procrustes analysis (Dryden & Mardia, 1998) to visualize relationships between this three-dimensional description of similarity space and the three-dimensional structure in which the faces were generated. The Procrustes analysis linearly transformed the matrix of values from the MDS solution to bring that matrix into best conformity with the matrix of pairwise distances in the faces' physical space. The outcome, shown in Figure 2B Clearly, although the transformed MDS solution does resemble the arrangement of the face stimuli themselves, residual differences remain between the perceptual space, as represented by MDS, and the physical space. After the Procrustes transformation, the sum of squared residual discrepancies between the physical representations and the transformed-MDS representations was 0.26. To provide an intuition about the magnitude of this value, we used Monte Carlo methods to put this sum of squares into the same units as were used for the Euclidean physical space (Figure 2A To examine the residuals on a finer scale, the mean residuals between the MDS solution and the faces' physical coordinates were calculated and then sorted into bins according to the distance between faces in a study series and the mean of the 21 faces. These values are plotted in Figure 4
As a further comparison between the MDS and physical representations of our 21 faces, we computed the vector angles between perceptual exemplars of A, B, and C. These vector angles are shown in Table 1. Faces just 4% away from the mean face were excluded from these calculations because in MDS space those faces clustered tightly around the mean face, which made angle measurements for those faces meaningless. The mean angles based on the 8, 12, and 16% data were 89, 70 and 93°, suggesting that the perceptual similarity space preserved much, but not all of the orthogonality that had been built into the faces' original, 3-D space. However, all of the angle estimates dropped when the 0.20 faces were included in the calculations, which confirms the demonstration in Figure 4
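The vector-angle comparison can be illustrated with a short sketch. It assumes each face's MDS position is expressed as a 3-D coordinate and that angles are measured between the vectors running from the mean face to two exemplars; the coordinates below are hypothetical, not values from Table 1.

```python
import math

def angle_deg(u, v):
    """Angle in degrees between vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_theta))

def relative_to(point, origin):
    """Vector from `origin` (e.g., the mean face) to `point`."""
    return tuple(p - o for p, o in zip(point, origin))

# Hypothetical MDS coordinates for the mean face and two exemplars.
mean_face = (0.01, -0.02, 0.00)
face_a = (0.13, -0.01, 0.00)
face_b = (0.01, 0.08, 0.02)
theta = angle_deg(relative_to(face_a, mean_face), relative_to(face_b, mean_face))
```

An angle near 90° indicates that the two face series remain close to orthogonal in the perceptual space. The sketch also makes clear why faces clustered tightly around the mean are excluded: when the vectors are very short, small positional errors produce large swings in the computed angle.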
Model

We applied NEMo to the recognition memory data. As mentioned before, NEMo departs from the classic summed similarity models of item recognition (e.g., McKinley & Nosofsky, 1996; Nosofsky, 1986) by allowing recognition judgments to be determined not only by the similarity between the probe, p, on one hand, and each study stimulus, on the other, but also by similarities among study items themselves. Given a series of L study items, s1…sL, and a probe, p, NEMo responds “yes” if:

$$\sum_{i=1}^{L} \alpha_i \, \eta(p, s_i + \epsilon_i) \; + \; \frac{2}{L(L-1)} \, \beta \sum_{i=1}^{L-1} \sum_{j=i+1}^{L} \eta(s_i + \epsilon_i, \, s_j + \epsilon_j) \; > \; C_L \tag{1}$$

where η(p, si) is the perceptual similarity between p and the ith study item (see Equation 2, below); εi is a vector representing the noise associated with each stimulus dimension, αi is the weight given the ith study item, and CL represents an optimal criterion for a series of L study items. To allow for the possibility that subjects' decision rule might incorporate within-list homogeneity, NEMo adds together (i) summed similarity and (ii) within-list homogeneity, weighting the latter by a parameter β. If β = 0 the model reduces to a standard summed similarity model (Nosofsky, 1986) with noisy item representations (Ennis, 1988) and a deterministic decision rule. If β < 0, when s1…sL are similar to one another, a given lure becomes less tempting, that is, it attracts fewer “yes” responses. The opposite effect would accompany β > 0, that is, study items similar to one another would attract more “yes” responses.

In NEMo, similarity, η(si, sj), between item representations, si and sj, is given by:

$$\eta(s_i, s_j) = a \, e^{-\tau \, d(s_i, s_j)^{c}} \tag{2}$$

where d(si, sj) is the distance between the two items, τ is the exponent of the similarity tuning function, a is its y-intercept, and c determines the function's shape.
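A minimal simulation sketch of the decision rule described above follows. Several choices here are assumptions made for illustration rather than details confirmed by the text: similarity is taken as a·exp(−τ·d) with Euclidean d (c = 1), within-list homogeneity is the mean similarity over all pairs of noisy study items, and each σ is used as a standard deviation when sampling Gaussian noise. The function names and the example coordinates, α, β, and σ values are hypothetical; only τ = 9.20, a = 0.84, and the criterion of 0.5 come from the text.

```python
import math
import random

def similarity(x, y, tau, a=1.0):
    """Exponential similarity a * exp(-tau * d), with d the Euclidean distance."""
    return a * math.exp(-tau * math.dist(x, y))

def nemo_says_yes(probe, study, alphas, beta, sigmas, tau, a, criterion, rng):
    """Simulate one trial; return True if the model responds 'yes'."""
    # Perturb every dimension of every study item with independent noise.
    noisy = [tuple(c + rng.gauss(0.0, s) for c, s in zip(item, sigmas))
             for item in study]
    # (i) Weighted summed similarity between the probe and the noisy items.
    summed = sum(al * similarity(probe, s, tau, a)
                 for al, s in zip(alphas, noisy))
    # (ii) Within-list homogeneity: mean similarity over all item pairs.
    pairs = [(i, j) for i in range(len(noisy))
             for j in range(i + 1, len(noisy))]
    homogeneity = sum(similarity(noisy[i], noisy[j], tau, a)
                      for i, j in pairs) / len(pairs)
    return summed + beta * homogeneity > criterion

def p_yes(probe, study, n_trials=1000, seed=0, **params):
    """Proportion of 'yes' responses over n_trials simulated trials."""
    rng = random.Random(seed)
    hits = sum(nemo_says_yes(probe, study, rng=rng, **params)
               for _ in range(n_trials))
    return hits / n_trials

# Hypothetical study list and parameter values, for illustration only.
params = dict(alphas=(0.6, 0.8, 1.0), beta=-0.5, sigmas=(0.02, 0.02, 0.02),
              tau=9.20, a=0.84, criterion=0.5)
study = [(0.08, 0.0, 0.0), (0.0, 0.12, 0.0), (0.0, 0.0, 0.16)]
target_rate = p_yes(study[2], study, **params)
```

With β negative, a list whose items lie close together raises the homogeneity term and therefore lowers the left-hand side of the decision rule, so the same lure attracts fewer “yes” responses than it would after a heterogeneous list.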
With similarity defined in either physical or perceptual (MDS) space, we ran parallel simulations of NEMo, one set of simulations using each definition of similarity. Among parameters in Equation 2, we fixed c = 1, implementing a simple exponential generalization function as suggested by previous empirical results (Kahana & Sekuler, 2002). To reduce NEMo's free parameters further, we used independent, empirical estimates of τ and a. We estimated similarity in physical space using data from a large-scale preliminary experiment (Yotsumoto et al., 2004) that generated an empirical approximation of the similarity tuning function in physical space.1 The similarity tuning function's exponent and y-intercept, 9.20 and 0.84, were used for τ and a respectively, in one set of model simulations. Then, reusing results from the preliminary experiment, we transformed inter-face distances according to values from the MDS analysis, and fit a second exponential similarity function. This generated a similarity tuning function in perceptual space. The function's exponent and y-intercept, 11.14 and 0.91, were used as τ and a respectively, in a second set of model simulations. Finally, we fixed one other parameter, setting NEMo's criterion to 0.5, which is the empirical value found in previous model fits (Kahana & Sekuler, 2002).

Simulations and Application of Model

We fit NEMo to the value of P(yes) obtained for each of the 60 different stimulus lists in Experiment 1. A genetic algorithm (Mitchell, 1996) found NEMo's best fitting parameter set by minimizing the root-mean-squared difference (RMSD) between observed and predicted recognition scores. The genetic algorithm allowed a population of 1000 random parameter sets to evolve for 20 generations. At the end of every generation each of the 500 least-fit parameter sets was replaced with a new parameter set, which randomly drew each of its parameter values from one of the non-replaced, 500 best-fit parameter sets. 
Then the non-replaced 500 best-fit parameter sets were mutated by a single, Gaussian parameter change with a standard deviation of 30% of a parameter's range. Finally, to produce an estimate of RMSD, each parameter set ran for 1000 simulated trials for each stimulus list. Results of Model Simulations We fit subjects' average performance twice, expressing NEMo's inter-face similarity values, τ , a, and d(si, sj), either as physical distances between faces, or as values from the MDS descriptions of subjects' perceptual space. Table 2 gives the best fitting model parameters for NEMo derived from the genetic algorithm. The column Physical shows the best parameters using stimulus distances in physical space, and the column MDS shows the best parameters using stimulus distances in MDS, perceptual space. The first three parameters, σ1, σ2, and σ3, are the variances of noise distributions: one for each dimension of the three dimensional, perceptual space. σ1, σ2, and σ3 correspond to dimension 1, 2 and 3 in Figure 2
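The genetic-algorithm search described above (replace the least-fit half, then mutate the survivors with a single Gaussian change whose standard deviation is 30% of the parameter's range) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each parameter of a replacement set is copied from an independently chosen survivor, mutated values are clipped to the parameter bounds, a copy of the best set seen so far is retained, and the loss is a deterministic toy RMSD rather than the RMSD between observed and simulated P(yes). The names `evolve`, `toy_rmsd`, `target`, and `bounds` are hypothetical.

```python
import math
import random

def evolve(loss, bounds, pop_size=1000, generations=20, seed=0):
    """Search for a parameter list within `bounds` that minimizes `loss`."""
    rng = random.Random(seed)
    half = pop_size // 2
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    best = list(min(pop, key=loss))  # copy, so later mutation cannot alter it
    for _ in range(generations):
        pop.sort(key=loss)
        if loss(pop[0]) < loss(best):
            best = list(pop[0])
        survivors = pop[:half]
        # Replace the least-fit half: each parameter value is copied from a
        # randomly chosen survivor (assumption: one draw per parameter).
        children = [[rng.choice(survivors)[k] for k in range(len(bounds))]
                    for _ in range(pop_size - half)]
        # Mutate each survivor with a single Gaussian parameter change,
        # SD = 30% of that parameter's range, clipped to the bounds.
        for p in survivors:
            k = rng.randrange(len(bounds))
            lo, hi = bounds[k]
            p[k] = min(hi, max(lo, p[k] + rng.gauss(0.0, 0.3 * (hi - lo))))
        pop = survivors + children
    return min(pop + [best], key=loss)

# Toy objective: RMSD between a parameter list and a hypothetical target.
target = [0.3, -0.2]
bounds = [(0.0, 1.0), (-1.0, 1.0)]

def toy_rmsd(p):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, target)) / len(p))

fit = evolve(toy_rmsd, bounds, pop_size=200, generations=20)
```

In the actual fits the loss for a parameter set would itself be estimated by simulation (1000 simulated trials per stimulus list), so it is noisy; retaining the best set seen so far is one simple way to keep a good solution from being lost to mutation.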
As explained earlier, β represents the contribution of within-list homogeneity. Its negative sign for both simulations, indicates that when study items were similar to one another, NEMo became more conservative, decreasing any tendency to treat a lure as a target. Note, finally, that the RMSD associated with the perception-based fit, 0.101, is smaller than the RMSD associated with the fit based on the faces' physical representation, 0.123. Each of the best-fit parameter sets was used to generate NEMo's predictions for each of the 60 lists. Because NEMo has three non-deterministic, noise parameters, running the model with any single trio of noise samples σ1 …σ3 could not produce a singular, exact set of predictions. Therefore, NEMo was used to simulate 1000 trials for each of the 60 study-p lists, with new, independent random noise samples drawn for each trial. From these 1000 trials, we obtained predicted proportion yes responses for each list. Figure 5
The reliability of the MDS solutions To assess the consistency of subjects' triadic judgments we computed two different mean MDS solutions. One solution was based on subjects' dissimilarity judgments on all odd-numbered trials (that is, first, third, fifth, etc.), the second solution was based on judgments from all even-numbered trials (that is, second, fourth, sixth, etc.).2 For each three dimensional solution, the Euclidean distances between all face pairs were taken, and the correlation calculated between pairwise distances from odd trials and pairwise distances from even trials. The results are shown as a scatterplot in Figure 6
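The split-half check described above can be sketched as a Pearson correlation between the pairwise distances of two MDS solutions. The coordinates below are hypothetical and cover only four faces; with all 21 faces each solution would yield 210 pairwise distances.

```python
import math
from itertools import combinations

def pairwise_distances(coords):
    """Euclidean distances for all unordered pairs of points, in a fixed order."""
    return [math.dist(p, q) for p, q in combinations(coords, 2)]

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical 3-D MDS coordinates from odd- and even-numbered trials.
odd = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
even = [(0.1, 0.0, 0.0), (0.9, 0.1, 0.0), (0.0, 2.1, 0.0), (0.0, 0.1, 2.8)]
r = pearson_r(pairwise_distances(odd), pairwise_distances(even))
```

Comparing distances rather than raw coordinates is a deliberate choice in this sketch: MDS solutions are defined only up to rotation, reflection, and translation, all of which leave inter-point distances unchanged.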
As noted earlier, NEMo's predictions were more accurate when psychophysical (MDS) rather than purely physical similarities were taken into account, but those predictions had a number of clear outliers. These were stimulus series on which the model failed badly, that is, deviated by 0.20 or more from the predicted P(yes). To identify the origin of these failures, we examined the makeup of these series. Of the five outliers, three contained two or more faces that deviated from the mean face by the largest value possible, namely 0.20. To this outcome, Monte Carlo simulation assigned a probability 0.01 < p < 0.02. So, faces with the greatest deformation relative to the mean face produced the largest errors in NEMo's predictions. As shown in Figure 4

Discussion

NEMo's account of the recognition results gives further support to the idea that recognition decisions depend upon both summed similarity and within-list homogeneity. Discrepancies between the faces' representation in physical space (Figure 2

Torgerson (1958) described other variants of the method of triadic comparisons in which subjects had to make multiple, explicit pairwise judgments per trial. Note that the single explicit judgment required on each trial in our application implies that subjects have made one or more pairwise comparisons, although such comparisons are not made explicit. Letting the stimuli in the triad be i, j, and k, our subjects' identification of one item as most dissimilar could reflect evaluations of inequalities among |i – j|, |i – k|, and |j – k|. Of course, when subjects are not forced to make such evaluations explicit, one cannot rule out the possibility that, particularly with time pressures, subjects might sometimes make fewer than all pairwise comparisons.

Experiment 2

The simulations applied to data from Experiment 1 revealed that visual memory performance could be well predicted by a model that takes account of both summed similarity and within-list homogeneity. 
Because the within-list homogeneity term is a novel addition to the summed similarity framework, we sought an additional, direct, model-free, demonstration that within-list homogeneity was actually important in face recognition memory. Therefore we designed stimulus series in which variation in both summed similarity and in within-list homogeneity were controlled. We expected that the responses produced by various combinations of the two factors would directly demonstrate the contribution of each factor, without the mediation of a computational model. Methods Apparatus, Stimuli and Procedure The apparatus and stimuli were the same as in Experiment 1 except that multiple subjects were tested simultaneously, using computers in a classroom cluster. Although subjects did not use chin rests, they were encouraged to maintain a constant viewing distance of approximately 57 cm from their computer. The procedure was the same as Experiment 1 except that the three study faces for each trial were forced to come from three different categories of faces, A…D (as shown in Figure 1 Subjects Twenty nine Brandeis undergraduates participated as part of a course requirement; during the session, each subject gave 436 trials. All subjects were naive to the experimental purpose, and none had taken part in our other experiments. Results and Discussion For each series on which p was a lure, we used the faces' physical coordinates to calculate summed similarity between p and all study items, and the within-list homogeneity. For this purpose, similarity was assumed to be monotonic with Euclidean distance in the physical space represented in Figure 2A
Note that the directions of the two effects observed here reproduced the corresponding effects seen in simulations with NEMo for Experiment 1. Of particular interest was the direct confirmation that within-list homogeneity and summed similarity operate in opposed directions to influence recognition judgments, just as NEMo demonstrated they did.

General Discussion

Physical coordinates vs MDS solutions

NEMo's account of visual recognition memory performance for face stimuli was improved when the model incorporated faces' perceptual similarity rather than purely physical coordinates. It is important to note that, with either perceptual or physical representations of similarity, NEMo had the same number of free parameters; therefore, this difference did not result from a difference in model complexity. We should note that not every related study has demonstrated an advantage from describing stimuli in perceptual (MDS) terms rather than physical ones. Peters et al. (2003) found that the performance of categorization models was either unchanged or even slightly diminished when face stimuli were described using MDS rather than a native, physical metric. In their study, subjects were trained to categorize stimuli including schematic Brunswik-Reiter faces (1937), and slightly more elaborate, cartoon faces. These faces were defined in 4-dimensional physical spaces, which subjects learned to bifurcate using a simple, linearly separable criterion. After learning the category membership of various exemplars, subjects' categorization was tested with mixtures of previously-seen faces and new ones. Peters et al.'s (2003) modeling results favored the proposition that subjects stored a sparse, abstracted representation of category properties, rather than the characteristics of individual exemplars. 
This outcome with longer-term learning of stable category properties is quite different from the case in our experiments on episodic recognition, where subjects seem to store individual exemplars, at least for the brief duration of a single trial. In categorization tasks like that of Peters et al. (2003), subjects can learn abstract rules over the course of their experience with many exemplars. However, such a strategy would not work in an episodic recognition task in which targets and lures are drawn randomly from a common stimulus space, and in which no simple rule could work in the face of trial-to-trial variation in the study and test items. Of course, if some simplifying consistent bias were introduced into a recognition experiment so that lures were always drawn from one category, say Wilson faces from class A, and targets were always drawn from a different category, say Wilson faces from class B, subjects undoubtedly would eventually learn the rule and be able to ignore the study items. Clearly, though, such an experiment would no longer be an experiment on episodic recognition.

Differences from compound gratings

One of our purposes was to investigate short term visual memory with higher dimensional stimuli, particularly by applying NEMo to recognition of synthetic faces. The best fitting parameters obtained here preserved the general characteristics observed in Kahana and Sekuler's (2002) studies with memory for compound gratings. For example, in both studies, recency effects were captured by values of α, and the empirical effects of within-list homogeneity were reflected in significantly negative values of β. 
Moreover, despite the fact that τ values were derived in different ways for recognition of gratings and of faces, the resulting τ values were similar across the two studies (8.8 and 10.7 with compound gratings, 9.20 and 11.14 with synthetic faces).3 It seems, then, that similar similarity-distance functions operate for both low-dimensional (gratings) and high-dimensional stimuli (synthetic faces). However, memory for synthetic faces may differ in an important way from memory for compound gratings. Even though we found a recency effect with synthetic faces as well as with gratings, attributes of the recency effect observed in this study differed from those found with gratings. With gratings, the recency effect persisted across the entire list, with performance increasing systematically from the least to the most recently presented item (Kahana & Sekuler, 2002). Here, though, the serial positions preceding the last one produced essentially equivalent performance, though, as mentioned earlier, performance at all serial positions is better than the chance level defined by the false alarm rate. This suggests that with higher dimensional stimuli, instead of forgetting previously-seen items gradually, the last seen item may diminish the memory for all previously seen items equally. This resembles a result reported previously (Phillips, 1974, 1983; Phillips & Christie, 1977). Because procedures differed between the studies, we must be cautious in attributing various discrepancies in the results to differences in stimulus dimensionality alone. But we do believe that parallel studies of recognition memory for gratings, faces, and other high-dimensional stimuli, as well as comparable stimuli from other sensory modalities (Visscher, Kahana, & Sekuler, 2006), will ultimately help us understand how the number and structure of perceptual dimensions that make up stimuli contribute to human short term recognition memory. 
We should note at least one possible application of the present results, to the validity of police lineups. When a witness views a simultaneous lineup comprising several individuals, the homogeneity of those individuals is an analogue of the within-list homogeneity whose potency was demonstrated in Experiments 1 and 2. Extrapolating from our results, one would expect that within-list homogeneity in a lineup would strongly affect a potential witness's recognition response. Recently, law enforcement officials in the United States have been encouraged to substitute sequential lineups for the traditional simultaneous procedure (Turtle, Lindsay, & Wells, 2003). Advocates of this procedure have claimed that sequential lineups somehow enhance discriminability, and thereby promote accuracy. In sequential lineups, witnesses view one lineup member at a time and decide whether or not that person is the perpetrator before viewing the next lineup member. This yes-no judgment on each individual face is meant to minimize false alarms by discouraging the sort of relative judgments that can contaminate simultaneous lineups (Lindsay et al., 1991; Steblay, Dysart, Fulero, & Lindsay, 2001). However, Gronlund (2004) has shown that the beneficial effect of sequential lineups arises not from heightened discriminability, but from a change in criterion, with a sequential lineup promoting a more conservative criterion. Whatever their benefit, though, sequential lineups would be immune to influences from within-list homogeneity only if there were zero carryover of memory from one lineup member to another, an assumption that begs to be evaluated.

Finally, in all the studies we have reported here, possible contamination of face recognition by emotional cues was intentionally avoided. In fact, the set of Wilson faces that we used substituted constant generic shapes for features that would change shape as emotions are expressed.
In addition to the faces being equated on basic dimensions such as contrast and mean luminance, their consistent neutral expression ruled out emotion as an aid to recognition memory. Given emotional expression's known role in recognition (e.g., Gallegos & Tranel, 2005; Johansson, Mecklinger, & Treese, 2004; Kaufmann & Schweinberger, 2004), it is worth noting that systematic variation in the position and/or orientation of eyes, mouth, and brows (Ekman & Friesen, 1975) can generate Wilson faces that express distinct, easily identified emotions, and to varying degrees (e.g. ?). We plan to evaluate how the presence of emotion signals of varying, calibrated strength might combine with other facial information to influence short-term face recognition.

Acknowledgments

The authors acknowledge support from AFOSR grant F49620-03-1-0376 and National Institutes of Health grants MH55687 and EY002158. Yuko Yotsumoto is now at the Martinos Center for Biomedical Imaging, Massachusetts General Hospital, Charlestown, MA.

Appendix

Wilson et al. (2002) introduced a method for generating synthetic faces that are well suited for model-driven research on various topics, including visual memory. In their scheme, individual synthetic faces are derived from gray-scale face photographs by digitizing 37 key points: 14 points defining head shape, 9 points for the hairline, 4 points for eye locations, 4 points for nose length and width, 5 points defining the mouth and lips, and one point for brow height. Synthetic faces are then reconstructed from these 37 measurements and bandpass filtered with a 2.0-octave-wide difference-of-Gaussians filter with a peak frequency of 10.0 cycles per face width (Wilson et al., 2002). Several studies have shown that such filtering preserves the frequencies needed for face recognition (Gold et al., 1999; Näsänen, 1999). By design, synthetic faces eliminate textures such as skin, hair, and wrinkles, and focus instead on geometric characteristics of faces.
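As a rough illustration of the bandpass step described above, the following sketch finds the two Gaussian widths of a difference-of-Gaussians (DoG) filter that peaks at 10.0 cycles per face width with a 2.0-octave bandwidth. The exact parameterization used by Wilson et al. (2002) is not given here, so several things are assumptions: the filter is taken to be the difference of two unit-area Gaussians, bandwidth is measured as full width at half of the peak amplitude, widths are expressed in face widths, and all function names are hypothetical.

```python
import numpy as np

def dog_gain(f, sigma_c, sigma_s):
    """Frequency response of a difference of two unit-area Gaussians.

    sigma_c < sigma_s are spatial SDs in face widths; f is in cycles per face width.
    """
    f = np.asarray(f, dtype=float)
    center = np.exp(-2.0 * (np.pi * sigma_c * f) ** 2)
    surround = np.exp(-2.0 * (np.pi * sigma_s * f) ** 2)
    return center - surround

def peak_frequency(sigma_c, sigma_s):
    """Analytic location of the maximum of dog_gain."""
    return np.sqrt(np.log(sigma_s / sigma_c)
                   / (np.pi ** 2 * (sigma_s ** 2 - sigma_c ** 2)))

def bandwidth_octaves(sigma_c, sigma_s, n=20001):
    """Full width at half of the peak amplitude, in octaves (numerical estimate)."""
    fp = peak_frequency(sigma_c, sigma_s)
    f = np.geomspace(fp / 64.0, fp * 64.0, n)
    g = dog_gain(f, sigma_c, sigma_s)
    passband = f[g >= 0.5 * g.max()]
    return float(np.log2(passband[-1] / passband[0]))

def design_dog(peak=10.0, octaves=2.0):
    """Bisect on the ratio sigma_s / sigma_c for the bandwidth, then rescale for the peak."""
    lo, hi = 1.01, 4.0  # bandwidth grows with the ratio over this bracket
    for _ in range(50):
        mid = 0.5 * (lo + hi)
        if bandwidth_octaves(1.0, mid) < octaves:
            lo = mid
        else:
            hi = mid
    ratio = 0.5 * (lo + hi)
    # Scaling both SDs by k divides the peak frequency by k.
    scale = peak_frequency(1.0, ratio) / peak
    return scale, scale * ratio

sigma_c, sigma_s = design_dog(peak=10.0, octaves=2.0)
print(sigma_c, sigma_s, peak_frequency(sigma_c, sigma_s), bandwidth_octaves(sigma_c, sigma_s))
```

Under these assumptions the bandwidth depends only on the ratio of the two widths, so the sketch first solves for that ratio and then rescales both widths to place the peak. Applying the filter to an image would additionally require converting face widths to pixels.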
However, the elimination of texture raises the important question of whether synthetic faces are sufficiently accurate representations of their original faces to be useful in psychophysical experimentation. This question has been answered by requiring observers to identify the gray-scale photograph from which a synthetic face was derived in a four-alternative forced-choice experiment. Mean performance across five observers was 97.4% correct for matching between front-view synthetic faces and photographs, and even for matching between 20° side-view photographs and front-view synthetic faces (or vice versa), performance averaged 90.7% correct (Wilson et al., 2002). As chance performance is 25% in these experiments, these data clearly demonstrate that synthetic faces capture salient aspects of individual face geometry. Furthermore, the database captures known gender differences in faces: synthetic female faces have significantly smaller heads, rounder chins, thicker lips, and higher eyebrows than synthetic male faces. Finally, fMRI signals from the fusiform face area show that synthetic faces produce BOLD activation that is nearly as large as that produced by the original gray-scale faces from which they are derived (Loffler et al., 2005). Although the Wilson faces are relatively simple geometrically, they still convey sufficient information to characterize individual faces. This essential individuality, which is important in episodic memory, is absent from some commonly used face stimuli, such as Brunswik faces (Brunswik & Reiter, 1937; Sigala et al., 2002; Peters et al., 2003).

Footnotes

1. The similarity tuning function was based on errors in same-different judgments with just a single study item, which minimized memory load, and with timing identical to that used here.

2. The balanced incomplete block design forced us to take this indirect approach. Differences in the makeup of triads on successive trials prevented us from comparing the judgments themselves. As a result, our comparisons had to be mediated via MDS.
3. We ran simulations with various τ values, and found that differences of this size did not appreciably alter the model's resulting RMSD. So differences in the similarity-distance functions were not critical to the success of model fits.

References