Facial emotion recognition in autistic adult females correlates with alexithymia, not autism

Research on predominantly male autistic samples has indicated that impairments in facial emotion recognition typically associated with autism spectrum conditions are instead due to co-occurring alexithymia. However, whether this could be demonstrated using more realistic facial emotion recognition stimuli and applied to autistic females was unclear. In all, 83 females diagnosed with autism spectrum condition completed online self-report measures of autism spectrum condition severity and alexithymia, and a facial emotion recognition task that assessed their ability to identify multimodal displays of complex emotions. Higher levels of alexithymia, but not autism spectrum condition severity, were associated with less accurate facial emotion recognition. Difficulty identifying one’s own feelings and externally oriented thinking were the components of alexithymia that were specifically related to facial emotion recognition accuracy. However, alexithymia (and autism spectrum condition severity) was not associated with speed of emotion processing. The findings are primarily discussed with reference to the theoretical view that perceiving and experiencing emotions share the same neural networks; thus, being able to recognise one’s own emotions may facilitate the ability to recognise others’. This study is in line with previous similar research on autistic males and suggests impairments in facial emotion recognition in autistic females should be attributed to co-occurring alexithymia.

Lay abstract
Research with autistic males has indicated that difficulties in recognising facial expressions of emotion, commonly associated with autism spectrum conditions, may instead be due to co-occurring alexithymia (a condition involving lack of emotional awareness, difficulty describing feelings and difficulty distinguishing feelings from physical bodily sensations) rather than to autism itself.
We wanted to explore if this would be true for autistic females, as well as to use more realistic stimuli for emotional expression. In all, 83 females diagnosed with autism spectrum condition completed self-report measures of autism spectrum condition traits and alexithymia and completed a visual test that assessed their ability to identify multimodal displays of complex emotions. Higher levels of alexithymia, but not autism spectrum condition features, were associated with less accuracy in identifying emotions. Difficulty identifying one’s own feelings and externally oriented thinking were the components of alexithymia that were specifically related to facial emotion recognition accuracy. However, alexithymia (and levels of autism spectrum condition traits) was not associated with speed of emotion processing. We discuss the findings in terms of possible underlying mechanisms and the implications for our understanding of emotion processing and recognition in autism.


Introduction
Association, 2013; ASC is synonymous with the term autism spectrum disorder used within the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5)). Symptoms of ASC are typically observable in early development, yet some social and behavioural difficulties may only be recognised when a child struggles to meet the social and educational demands of life in later childhood (Baio, 2014). As autistic individuals often experience problems with social interaction, much research has investigated whether they are impaired at processing faces, the most fundamental source of social information. Unlike neurotypical individuals, autistic infants typically show reduced attention to faces in their environment (Chawarska et al., 2013), while autistic adults have shown an absence of activity during face-matching tasks in the fusiform face area (FFA), a brain region central to the detection and recognition of faces (Hoffman & Haxby, 2000; Schultz et al., 2003). Autistic children and adolescents are poorer than controls at matching unfamiliar faces (Riby et al., 2009; Wolf et al., 2008), and autistic adults similarly show impaired facial identity recognition (Greimel et al., 2014; Weigelt et al., 2012). However, contrasting studies have found that autistic individuals have typical FFA activity (Hadjikhani et al., 2004) and no deficit in recognising facial identity (Deruelle et al., 2004; Neil et al., 2016); thus, the evidence for a face processing impairment in ASC is inconclusive.
Equally inconsistent findings have come from research investigating whether individuals with ASC are impaired at recognising emotion from faces. Although a deficit in facial emotion recognition (FER) is not explicitly part of the DSM-5 criteria for ASC, it has consistently been considered a diagnostic marker (Uljarevic & Hamilton, 2013). As FER is vital in social interaction and social-emotional reciprocity, both of which are part of ASC's diagnostic criteria (American Psychiatric Association, 2013), impaired FER has reasonably been thought to underlie these autistic deficits (Schultz, 2005). Yet research investigating this has been mixed. Numerous studies have found that compared to controls, individuals with ASC are less accurate at labelling photographs of faces displaying the six basic emotions (happiness, sadness, anger, fear, surprise and disgust; Bölte & Poustka, 2003;Tanaka et al., 2012). However, others have found no basic FER deficit in ASC (Castelli, 2005;Homer & Rutherford, 2008). Similar inconsistencies have been seen in the recognition of complex emotions such as insincerity, as some have found impairments in ASC (Black et al., 2020;Golan et al., 2006;Peñuelas-Calvo et al., 2019), while others have not (Tracy et al., 2011). Research has also investigated whether autistic individuals are only impaired at recognising specific emotions. Yet, while there is some support for the view that they have a deficit in recognising negative emotions (Ashwin et al., 2006;Shanok et al., 2019), alternative studies have presented contrary findings (Adolphs et al., 2001). Furthermore, although a meta-analysis that pooled an abundance of these studies concluded that autistic individuals do have an FER deficit (Uljarevic & Hamilton, 2013), high statistical heterogeneity meant studies substantially varied both in whether they found a significant ASC deficit and in the strength of this effect, limiting the reliability of the overall finding. 
Finally, and although less researched, findings have also been inconclusive regarding whether individuals with ASC are slower at FER compared to controls. Some have found that ASC is associated with slower response times (RTs) for FER with autistic children (Bal et al., 2010), adolescents (Dalton et al., 2005) and a large sample of autistic adults (Sucksmith et al., 2013), while others have concluded that those with ASC are just as quick as controls at FER (Tracy et al., 2011). Therefore, although an FER deficit has been considered a core feature of ASC, research demonstrating this has produced considerably varied results.
To explain this inconsistency, several factors have been proposed. First, differences in participants' age could account for why the presence (or absence) of an FER deficit in ASC varies across studies (Harms et al., 2010). Neurotypical FER ability improves with age in childhood and into adolescence (Durand et al., 2007). However, as children and adults with ASC did not differ in their FER ability (O'Connor et al., 2005), and as there was no correlation between age and FER performance in ASC (Gepner et al., 2001), FER ability in ASC may reach its peak much earlier on (Tanaka et al., 2012). Second, autistic individuals may range from above average IQ and limited verbal difficulties through to low IQ and profound language difficulties (Burack & Volkmar, 1992). As the inclusion of different presentations of ASC varies both across and within FER studies, this could determine whether an autistic individual is impaired at FER (Harms et al., 2010).
Third, mixed results could be attributed to methodological differences between studies (Tanaka et al., 2012). FER tasks involving matching emotional expressions could be less sensitive to detecting ASC deficits as they may allow for autistic individuals to use compensatory strategies such as only processing the stimuli's surface characteristics without fully understanding the emotion displayed (Hariri et al., 2000;Teunisse & de Gelder, 2001). By contrast, although forced-choice labelling tasks may allow for the guessing of correct answers, which is particularly an issue in studies that had only two label options (e.g. Baron-Cohen et al., 1997), labelling tasks that require the individual to freely generate their answer may be affected by verbal ability and thus may be representative of language rather than FER ability (Uljarevic & Hamilton, 2013). Fourth, the variables on which the ASC group was matched to controls could contribute to inconsistent results as some variables may mask or enhance FER performance (Tanaka et al., 2012). As cited in Tanaka et al. (2012), studies have shown FER deficits in ASC when participants were matched on non-verbal intelligence, yet no ASC deficit when matched on verbal intelligence (Fein et al., 1992;Ozonoff et al., 1990).
Research has recently concluded that the strongest explanation for the inconsistency in FER results is the alexithymia hypothesis (Kinnaird et al., 2019; Poquérusse et al., 2018). Alexithymia is a subclinical phenomenon characterised by difficulties in recognising and describing one's own emotional state (Nemiah et al., 1976). Because it involves having problems with understanding what bodily or emotional sensations one experiences, communicating this to others, and having a cognitive style that is focussed on the external details of life rather than one's inner experience (Bagby et al., 1994), it often leads to more distant interpersonal relationships (Kafetsios & Hess, 2019). While the prevalence of alexithymia in the general population is estimated at 10% (Linden et al., 1995), 40%-65% of individuals with ASC meet its diagnostic criteria (Griffin et al., 2016). It is important to note, however, that alexithymia and ASC are distinct constructs, as alexithymia is neither necessary nor sufficient for an ASC diagnosis and vice versa. The alexithymia hypothesis states that impaired FER is not a core feature of ASC, but instead reflects co-occurring alexithymia.
An increasing number of studies have supported the alexithymia hypothesis (Kinnaird et al., 2019). Alexithymia, independent of ASC, is strongly associated with a deficit in FER (Grynberg et al., 2012). Neurotypical individuals who score highly on the 20-item Toronto Alexithymia Scale (TAS-20; Bagby et al., 1994), the most reliable and widely used self-report tool for measuring alexithymia (Prkachin et al., 2009), are poorer at FER than those with low alexithymia scores (e.g. Lane et al., 2000; Swart et al., 2009). Recent studies have found that co-occurring alexithymia, and not ASC diagnosis, predicts poor FER ability in autistic individuals (Kinnaird et al., 2019). Cook et al. (2013) found no differences between adults with ASC and neurotypical controls in their ability to identify which of the six basic emotions was displayed on the faces of morphed stimuli. As there was a negative correlation between alexithymia level and FER ability, it was concluded that the groups did not differ because there was an equal distribution of those with high alexithymia in each group. With no correlation between ASC severity and FER, these findings support the alexithymia hypothesis. Similar results have been found in a younger sample, as autistic adolescents who were higher in alexithymia had more parent-reported emotional difficulties and performed worse on FER than those low in alexithymia (Milosavljevic et al., 2016). Importantly, participants' ASC severity scores were unrelated to FER (Milosavljevic et al., 2016). Interestingly, support for the alexithymia hypothesis is not restricted to emotional deficits regarding faces, as autistic individuals with high levels of alexithymia, and not those with low alexithymia, were impaired at recognising emotion from voices (Heaton et al., 2012) and from music (Allen et al., 2013).
It is important to note, however, that alexithymia may not explain all emotional symptoms associated with ASC, as a recent study found that it was autistic adults' autistic traits, and not their level of alexithymia, that predicted reduced eye fixation on emotional facial stimuli (Stephenson et al., 2019).
Yet most studies that support the alexithymia hypothesis have fundamental limitations. First, most studies have used FER tasks composed of facial stimuli that lack ecological validity. In the real world, people recognise briefly presented multimodal emotional expressions, meaning that facial, vocal and bodily cues are simultaneously displayed (Loth et al., 2018; Schlegel & Scherer, 2016). As most tasks in these studies used Ekman and Friesen's (1976) grayscale static photographs of actors posing exaggerated prototypical emotions (e.g. Cook et al., 2013; Milosavljevic et al., 2016; J. D. Parker et al., 1993), performance on them cannot fully reflect how individuals process facial expressions in daily life. Second, almost all studies supporting the alexithymia hypothesis have only measured the six basic emotions, which are not fully representative of the range of complex emotions perceived in life (e.g. Cook et al., 2013; Heaton et al., 2012). Although one study supported the alexithymia hypothesis using the Reading the Mind in the Eyes Test (RMET; Baron-Cohen et al., 2001), which tests a range of complex emotions identified from the eye region (Oakley et al., 2016), the RMET's stimuli are still only static, grayscale photographs of eye regions that are identified without the holistic context of all the facial features that would be available in real life (Fertuck et al., 2009). Third, these studies typically have small samples. With samples as small as 16 or 13 participants (Bird et al., 2011), studies like these may lack the statistical power to detect differences in FER ability between ASC and controls. Fourth, studies investigating the alexithymia hypothesis typically do not have FER tests that measure how long participants take to identify emotions (e.g. Cook et al., 2013; Milosavljevic et al., 2016).
Although one recent study has measured RT of FER within the context of alexithymia and found that alexithymia was not a significant predictor of how long those with and without ASC took to identify emotions, this was only measured for a small subset of emotions (fear, joy, anger, neutral; Stephenson et al., 2019). Including RT for a wider range of emotions could help determine whether alexithymia does predict slower processing speed of emotions (Ketelaars et al., 2016).
Finally, perhaps one of the most important limitations is that all but one of the studies examining the alexithymia hypothesis in ASC have used all-male (e.g. Bird et al., 2010) or predominantly male samples with ASC (e.g. Bird et al., 2011; Cook et al., 2013; Oakley et al., 2016). Because ASC is approximately three times more prevalent in males than females (Loomes et al., 2017), most other findings on the mechanisms that underlie ASC symptoms are also based on samples dominated by males (e.g. Peterson et al., 2016). Thus, one must be cautious about automatically assuming findings apply to autistic females, due to sex differences within ASC that may reflect a different ASC phenotype for females (Ketelaars et al., 2016; Rivet & Matson, 2011).
Research on the FER ability in females has been as inconsistent as studies that included predominantly autistic male samples (Uljarevic & Hamilton, 2013). One study with a sufficient number of participants did find an FER deficit in autistic females (Lai et al., 2012), while another did not (Ketelaars et al., 2016). Ketelaars et al. (2016) measured autistic female adults' and matched controls' levels of alexithymia, their ASC symptomology, and had them label facial emotions of varying intensity. No differences in FER ability were found between the ASC and control groups, but females from either group with higher levels of alexithymia were significantly worse at recognising subtle low-intensity emotions compared to those with lower alexithymia. As ASC symptomatology was not associated with FER ability, this supports the alexithymia hypothesis for autistic females. However, this study shares several of the common methodological limitations among studies investigating the alexithymia hypothesis. Only grayscale images of the basic emotions were used in the FER test, and RT was not measured to provide more conclusive results. Further research in autistic females that addresses these limitations is therefore vital.
The present study aimed to address these limitations by using more ecologically valid methodology and a large all-female sample to determine whether the alexithymia hypothesis applies to autistic females. It aimed to investigate whether higher alexithymia levels, and not ASC severity, is associated with poorer FER ability in autistic females, and which sub-categories of alexithymia, if any at all, are specifically associated with FER. Female adults with ASC self-reported their level of alexithymia and ASC severity. They also completed the Geneva Emotion Recognition Test-Short (GERT-S; Schlegel & Scherer, 2016): an FER test that consists of video clips of actors multimodally portraying a range of complex emotions in full colour that measures both accuracy and RT. The hypotheses were (1) there will be a negative correlation between alexithymia and FER ability: autistic females who report higher levels of alexithymia (TAS-20 score) will have poorer FER ability (lower percentage of correct responses in the GERT-S and slower average RT) than those with lower alexithymia levels and (2) there will be no association between ASC severity (Autism Spectrum Quotient total score; Baron-Cohen et al., 2001) and FER ability.

Participants
A total of 217 females participated in the study. As 92 started the study but did not complete it, this left 125 participants with complete data sets. Clinical diagnosis was determined by having participants select online whether they have a clinical diagnosis of ASC, self-identify as having ASC, or neither self-identify nor have a clinical diagnosis; 42 participants self-identified as having ASC, and 1 participant neither self-identified nor was diagnosed with ASC. These participants were excluded from the core analysis in order to ensure a 'valid' ASC sample. Thus, core analyses were conducted on 83 females who stated they have a clinical diagnosis of ASC. In this final sample (n = 83), participants were aged between 19 and 65 years, with a mean age of 38.5 years (standard deviation (SD) = 11.72 years). All participants volunteered to take part by following the study's online Qualtrics link (Qualtrics, 2019) advertised in various autism groups or charity pages on Facebook (e.g. 'Aspire: The Female Autism Network'), as well as through Twitter. The only exclusion criteria (noted on the advertisement) were being male, being under the age of 18 years and/or neither having a diagnosis of ASC nor self-identifying as having ASC.

Materials
Autism spectrum disorder severity. Severity of autistic traits was measured using the Autism Spectrum Quotient (AQ; Baron-Cohen et al., 2001). The AQ is a 50-item self-administered questionnaire that measures level of traits associated with autism spectrum disorder (ASD). It is divided into five subscales of 10 items each: Social Skill, Attention Switching, Attention to Detail, Communication and Imagination. Participants record whether they 'strongly agree', 'agree', 'disagree', or 'strongly disagree' with each item. The maximum score is 50, the minimum is 0. Higher AQ scores reflect a greater number of ASC traits, with a cut-off score of 32 and above reflecting a clinically significant level of autistic traits. The AQ has good test-retest reliability (r = 0.79; Hoekstra et al., 2008) and has been validated as a reliable measure of ASC severity in the autistic population (Broadbent et al., 2013). In addition, it has often been used to measure ASC severity in previous studies investigating the alexithymia hypothesis in ASC, justified by the correlation between total AQ score and ASC diagnosis (e.g. Cook et al., 2013; Shah et al., 2016).
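As a rough illustration of how a 0-50 total arises from four response options, the sketch below awards one point whenever a response falls in the trait-consistent direction for that item, regardless of strength, which is the binary convention described by Baron-Cohen et al. (2001). The set of agree-keyed items used in the example is a placeholder chosen for illustration and is not the published AQ scoring key.

```python
# Sketch of binary AQ scoring. The agree-keyed item set passed in below is a
# HYPOTHETICAL placeholder, not the published AQ key.
AGREE = {"strongly agree", "agree"}
DISAGREE = {"disagree", "strongly disagree"}
AQ_CUTOFF = 32  # cut-off reported in the text (32 and above)


def score_aq(responses, agree_keyed):
    """Return the AQ total (0-50).

    responses: dict mapping item number (1-50) to a response label.
    agree_keyed: set of item numbers where agreement indicates an autistic trait;
                 all other items score a point for disagreement.
    """
    if set(responses) != set(range(1, 51)):
        raise ValueError("expected one response for each of items 1-50")
    total = 0
    for item, answer in responses.items():
        if answer not in AGREE and answer not in DISAGREE:
            raise ValueError(f"unrecognised response for item {item}: {answer!r}")
        keyed = AGREE if item in agree_keyed else DISAGREE
        total += answer in keyed  # True counts as 1, False as 0
    return total


# Hypothetical key for demonstration only: odd-numbered items are agree-keyed.
demo_key = {i for i in range(1, 51) if i % 2 == 1}
demo_responses = {i: "agree" for i in range(1, 51)}
demo_total = score_aq(demo_responses, demo_key)
print(demo_total, demo_total >= AQ_CUTOFF)
```

Under this convention the strength of agreement does not change the score, which is why the total is bounded at 50 for 50 items.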
Alexithymia. Levels of alexithymia were measured using the 20-item Toronto Alexithymia Scale (TAS-20; Bagby et al., 1994), a self-administered instrument assessing the degree of difficulty one has in recognising and describing their own emotional state. It has three subscales: Difficulty Identifying Feelings, Difficulty Describing Feelings and Externally Oriented Thinking. Items are scored with a 5-point Likert-type scale (5 = strongly agree, 4 = agree, 3 = neither agree nor disagree, 2 = disagree, 1 = strongly disagree), with 5 of the 20 items being reverse scored. With a minimum score of 20 and a maximum of 100, scoring 51 or less indicates no alexithymia; between 52 and 60 indicates possible alexithymia; and 61 or above indicates alexithymia. The TAS-20 is viewed by Berthoz and Hill (2005) as a validated and reliable psychometric instrument for identifying alexithymia in the autistic population, with high test-retest reliability (r = 0.77), convergent validity (r = 0.62) and internal consistency (Cronbach's α = 0.81; Bagby et al., 1994).
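The TAS-20 scoring rule can be sketched as follows. This is a minimal illustration rather than the study's scoring script: the function names are invented here, the reverse-keyed item numbers (4, 5, 10, 18, 19) are the commonly cited ones and should be checked against the published key, and the category boundaries follow the thresholds used in the Results (⩽ 51, 52-60, ⩾ 61).

```python
# Sketch of TAS-20 scoring with reverse-keyed items and threshold categories.
# REVERSE_ITEMS is an assumption (commonly cited key); verify before real use.
REVERSE_ITEMS = frozenset({4, 5, 10, 18, 19})


def score_tas20(responses, reverse_items=REVERSE_ITEMS):
    """Return the TAS-20 total (20-100).

    responses: dict mapping item number (1-20) to a rating from 1 to 5.
    """
    if set(responses) != set(range(1, 21)):
        raise ValueError("expected one rating for each of items 1-20")
    total = 0
    for item, rating in responses.items():
        if rating not in range(1, 6):
            raise ValueError(f"rating for item {item} must be 1-5, got {rating!r}")
        # Reverse scoring maps 1<->5 and 2<->4, leaving 3 unchanged.
        total += (6 - rating) if item in reverse_items else rating
    return total


def classify_tas20(total):
    """Map a TAS-20 total onto the three threshold categories."""
    if total >= 61:
        return "alexithymia"
    if total >= 52:
        return "possible alexithymia"
    return "no alexithymia"


neutral = {item: 3 for item in range(1, 21)}
print(score_tas20(neutral), classify_tas20(score_tas20(neutral)))
```

Because reverse scoring leaves a rating of 3 unchanged, a respondent who answers every item neutrally scores 60, which sits at the top of the "possible alexithymia" band.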
FER ability. FER was assessed with the computer-administered GERT-S (Schlegel & Scherer, 2016), which has demonstrated good construct validity and internal consistency (Schlegel & Scherer, 2016). It consists of 42 video clips, each lasting 1-3 s, of five male and female professional French-Swiss actors expressing 14 different emotions: pride, anger, joy, irritation, amusement, disgust, pleasure, sadness, relief, despair, interest, fear, surprise and anxiety. The clips were selected from the Geneva Multimodal Emotion Portrayals database and had increased authenticity as they were recorded as part of a continuous interaction between a director and actor role-playing a real-life scenario (Bänziger et al., 2012). Each clip presented an actor's upper torso and face portraying an emotion through their facial expression, gestures and voice as actors pronounced a phrase that had no semantic meaning but conveyed an emotional tone (see Figure 1 for screenshots of example video clips); 12 of the emotions were selected because they covered the four core emotional categories of the combinations of high/low arousal and positive/negative valence as introduced by Bänziger et al. (2012), while disgust and surprise were included as these are traditionally assessed in most FER tests.
The test presented participants with simple instructions, followed by a short definition of each of the 14 emotions. For each trial, participants clicked inside the video to play it and were then presented with an emotion wheel to select the correct emotion the actor in the video was intending to portray (see Figure 2 for GERT-S response format). The 14 emotion options were arranged in a circle to facilitate participants' ability to orient among the options. No choice feedback was provided at the end of each trial or the test. Participants first completed two practice trials, followed by the remaining 42 trials. The time taken to choose the correct emotion, and the number of times participants clicked in the emotion wheel, was not restricted. However, each video could only be played once. The GERT-S is typically completed within 10 min and is scored by allocating 1 point for each correctly identified trial and 0 for incorrect trials. The test also recorded RT from when each video ended to when participants clicked on their final choice in the emotion wheel.
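The scoring described above (1 point per correct trial, RT recorded per trial) could be summarised per participant along the following lines. This is a sketch under assumed data structures: the field names `target`, `chosen` and `rt_ms` are hypothetical and do not come from the GERT-S software.

```python
# Sketch of per-participant GERT-S summary scores.
# The trial field names ("target", "chosen", "rt_ms") are hypothetical.
from statistics import mean


def summarise_gert(trials):
    """Return percentage accuracy and mean RT (ms) for correct and incorrect trials.

    trials: list of dicts with the intended emotion, the selected emotion and
            the response time in milliseconds.
    """
    if not trials:
        raise ValueError("no trials supplied")
    correct_rts = [t["rt_ms"] for t in trials if t["chosen"] == t["target"]]
    incorrect_rts = [t["rt_ms"] for t in trials if t["chosen"] != t["target"]]
    return {
        "accuracy_pct": 100 * len(correct_rts) / len(trials),
        # A mean is undefined when a participant has no trials of that kind.
        "mean_rt_correct": mean(correct_rts) if correct_rts else None,
        "mean_rt_incorrect": mean(incorrect_rts) if incorrect_rts else None,
    }


demo_trials = [
    {"target": "pride", "chosen": "pride", "rt_ms": 4000},
    {"target": "relief", "chosen": "pleasure", "rt_ms": 7000},
    {"target": "fear", "chosen": "fear", "rt_ms": 5000},
    {"target": "anger", "chosen": "anger", "rt_ms": 3000},
]
print(summarise_gert(demo_trials))
```

Keeping RT separate for correct and incorrect trials matters because the two can differ systematically within the same participant.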

Design
The study used a correlational design with the variables: total AQ score (and scores for each of the five AQ subscales), total TAS-20 score (and scores for its three subscales), as well as GERT-S score (for both percentage accuracy and average RT).

Procedure
The study was presented on an online Qualtrics survey and participants followed a Qualtrics link to complete it on a computer/laptop via a web browser. The opening screen presented information explaining what participants need to do. All participants gave full written consent to participate and were provided with a randomly generated 5-digit unique ID to withdraw their data if necessary. Participants first completed the AQ, followed by the TAS-20, by selecting the response that best describes how strongly they felt each statement applies to them. They then completed the GERT-S and were instructed that they will watch each video once with headphones; that they will not be able to understand what the actors are saying; and that they are to select the emotion word that best describes the emotion the actor wanted to express. Participants were finally debriefed about the aims of the study. Total time taken to complete the study was approximately 30 min, and participants were given the opportunity to be rewarded for their time by following an alternative Qualtrics link to enter a prize draw to win one of the 12 £25 Amazon Vouchers, funded by Newcastle University.

Results
AQ and TAS-20 scores. Participants scored between 11 and 49 on the AQ (M = 38.3, SD = 7.45) and 84.3% of the sample scored above clinical cut-off (AQ ⩾ 32). TAS-20 scores ranged from 32 to 88 (M = 65.2, SD = 11.4). According to the thresholds of the TAS-20, 72.3% of participants could be identified as having alexithymia (TAS-20 ⩾ 61), 15.7% had borderline alexithymia (TAS-20: 52-60), and 12% had no alexithymia (TAS-20 ⩽ 51). However, given that small differences in scores across thresholds could result in inaccurate group allocation and that these group sizes were unequal, all further analyses treated TAS-20 and AQ scores as being on a continuous scale, with participants having higher or lower levels of alexithymia and ASC severity.
GERT-S performance. Figure 3 shows participants' average accuracy and response time for each individual emotion on the GERT-S. To measure participants' FER ability, the total number of correct answers on the GERT-S for each participant (accuracy) and the average time taken to correctly respond to each trial (RT) were recorded. Overall, there were two outliers for average RT. Participants' average accuracy was 54.1% (SD = 14.5%). A paired-samples t-test showed that participants took significantly longer to identify the emotion on trials they got incorrect (M = 6482 ms, SD = 3027 ms) than on trials they got correct (M = 4401 ms, SD = 2144 ms), t(82) = -9.59, p < 0.001.
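The paired comparison above can be sketched as below. The numbers are invented toy values, not study data, and the function name is illustrative. The statistic is the mean of the within-participant differences divided by its standard error, with n - 1 degrees of freedom, which is why 83 participants give t(82).

```python
# Sketch of a paired-samples t statistic using only the standard library.
# The RT values below are invented for illustration; they are NOT study data.
from math import sqrt
from statistics import mean, stdev


def paired_t(first, second):
    """Return (t, df) for paired observations first[i] and second[i]."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(first, second)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error, len(diffs) - 1


# Mean RT (ms) per participant on correct and incorrect trials (toy values).
rt_correct = [4000, 4500, 3800, 5200]
rt_incorrect = [6100, 6400, 5500, 7900]
t_value, df = paired_t(rt_correct, rt_incorrect)
print(f"t({df}) = {t_value:.2f}")
```

The sign of t depends only on the order of subtraction: with correct minus incorrect, slower incorrect trials produce a negative value, as in the reported result.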
Examining the alexithymia hypothesis. To assess whether the alexithymia hypothesis does apply to female adults diagnosed with ASC, Spearman's rho correlations were conducted between alexithymia (TAS-20 total score and subscales) and FER ability (GERT-S average percentage accuracy and RT) and between ASC severity (AQ total score and subscales) and FER ability. A non-parametric test was implemented as all variables were checked for normality using Shapiro-Wilk and were not normally distributed. Spearman's rho correlation indicated there was a significant positive relationship between total TAS-20 and total AQ (r_s(83) = 0.35, p = 0.001), thus greater ASC severity was related to higher alexithymia levels. Consistent with the alexithymia hypothesis, while there was no relationship between total AQ and accuracy on the GERT-S (r_s(83) = -0.16, p = 0.16), there was a significant negative correlation between total TAS-20 score and GERT-S accuracy (r_s(83) = -0.27, p = 0.012). Therefore, higher alexithymia, not ASC severity, was associated with poorer FER accuracy.
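For readers unfamiliar with Spearman's rho, it is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below shows this with a handful of invented scores (not study data); in practice a statistics package would normally be used, and the function names here are illustrative.

```python
# Sketch of Spearman's rho as a Pearson correlation on average ranks.
# The scores below are invented for illustration; they are NOT study data.
from math import sqrt


def average_ranks(values):
    """Rank values from 1 upwards, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1..end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman_rho(x, y):
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    return pearson(average_ranks(x), average_ranks(y))


tas20_totals = [45, 52, 61, 70, 83]
gert_accuracy = [71, 64, 66, 50, 43]
print(round(spearman_rho(tas20_totals, gert_accuracy), 2))
```

Because only ranks enter the calculation, the coefficient is unaffected by the non-normal score distributions that motivated the non-parametric test.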
Correlations between the subscales of the AQ and TAS-20 with GERT-S performance revealed which specific components of alexithymia (and ASC) were associated with FER ability. As summarised in Table 1, higher alexithymia scores on both Difficulty Identifying Feelings and Externally Oriented Thinking were significantly associated with poorer FER accuracy, while Difficulty Describing Feelings was unrelated to FER. As Table 2 shows, the only AQ subscale that was significantly correlated to FER accuracy was Communication. No further components of ASC were associated with FER.
However, because ASC severity and alexithymia were correlated, ASC severity could account for part of the variance in FER accuracy attributed to alexithymia. Hierarchical regression analyses were therefore conducted to determine whether TAS-20 predicts FER accuracy above and beyond ASC severity. With GERT-S average percentage accuracy as the outcome variable, total AQ was entered into the first step of the model and total TAS-20 was entered into the second step. While total AQ was not a significant predictor of FER accuracy (F(1, 81) = 0.390, p = 0.534, R² = 0.005), total TAS-20 was a significant predictor of FER accuracy (F(2, 80) = 4.722, p = 0.012, R² = 0.106). Alexithymia significantly improved the model, increasing the variance accounted for by 10.1% (from R² = 0.005 to R² = 0.106). Considering RT on the GERT-S, as shown in both Tables 1 and 2, no significant correlations were observed between TAS-20 scores and GERT-S average RT, and the only significant correlation between AQ and RT was between the AQ subscale of Attention to Detail and GERT-S average RT. Thus, those who took longer on average to correctly identify an emotion had a higher attention to detail. A hierarchical multiple regression analysis was conducted with GERT-S average RT as the outcome variable, with total AQ entered at the first step of the model, and total TAS-20 entered at the second step. Neither total AQ (F(1, 81) = 0.273, p = 0.603, R² = 0.003) nor total TAS-20 (F(2, 80) = 1.459, p = 0.239, R² = 0.035) was a significant predictor of GERT-S average RT. Therefore, alexithymia levels (and ASC severity) were not associated with the speed of emotion processing in autistic females.
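The relationship between the reported R² values, the change in variance explained and the model F statistics can be checked with a few lines of arithmetic. The sketch below uses only the rounded values reported above for the accuracy model (n = 83); the function name is illustrative, and because R² is given to three decimals, the recomputed F values are expected to differ slightly from the reported 0.390 and 4.722.

```python
# Worked arithmetic for the hierarchical regression on GERT-S accuracy,
# using the rounded values reported in the text.


def model_f(r_squared, n, k):
    """Overall F for a regression with k predictors and n participants:
    F = (R^2 / k) / ((1 - R^2) / (n - k - 1)), with df = (k, n - k - 1)."""
    return (r_squared / k) / ((1 - r_squared) / (n - k - 1))


N = 83            # participants in the core analysis
R2_STEP1 = 0.005  # step 1: total AQ only
R2_STEP2 = 0.106  # step 2: total AQ plus total TAS-20

delta_r2 = R2_STEP2 - R2_STEP1
print(round(delta_r2, 3))  # 0.101, i.e. the 10.1% increase in variance explained

# Recomputed from rounded R^2, so only approximately equal to the reported F values.
print(round(model_f(R2_STEP1, N, 1), 2))  # df = (1, 81)
print(round(model_f(R2_STEP2, N, 2), 2))  # df = (2, 80)
```

The degrees of freedom follow directly from the sample size: with 83 participants, one predictor leaves 81 residual degrees of freedom and two predictors leave 80, matching F(1, 81) and F(2, 80).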
These analyses were also conducted on a sample (n = 117) that included females with a clinically significant level of autistic traits (total AQ ⩾ 32) who self-identified as having ASC. While the pattern of results was the same for this sample regarding alexithymia and FER accuracy, as well as RT findings, results differed in that total AQ, AQ Communication and AQ Imagination significantly correlated with FER accuracy (see Appendix 1 for full analyses). However, multiple regression analyses revealed that ASC severity was not a significant predictor of GERT-S accuracy (F(1, 123) = 1.344, p = 0.249, R² = 0.011), while alexithymia was (F(2, 122) = 5.881, p = 0.004, R² = 0.088).

Discussion
By addressing methodological limitations of previous research, this study aimed to determine whether the alexithymia hypothesis applies to autistic females; to identify which sub-category of alexithymia, if any, is associated with FER ability; and to explore prevalence of alexithymia in autistic females. As hypothesised, higher levels of alexithymia and not ASC severity were associated with poorer FER accuracy. Having greater difficulty in identifying one's own feelings and externally oriented thinking was related to poorer FER accuracy. Difficulty in describing feelings was the only sub-category of alexithymia not associated with FER accuracy. However, the finding that participants higher in alexithymia were not slower at FER than those lower in alexithymia did not support the original hypothesis. Thus, the alexithymia hypothesis applied to autistic females' accuracy in FER, but not to their speed of emotion processing.
This study provided a valuable indication of the prevalence of alexithymia in females with ASC. The finding that the large majority of autistic females in this study could be identified as having high alexithymia is consistent with previous findings that rates of alexithymia tend to be significantly higher in ASC compared to neurotypical samples (Heaton et al., 2012; Honkalampi et al., 2000). However, prevalence in this sample was higher than in other reported autistic samples. While almost three quarters of participants in this study reported high alexithymia, the prevalence of alexithymia has previously been reported as between 40% and 65% in male (Hill et al., 2004; Mul et al., 2018) and female autistic samples (Ketelaars et al., 2016). One explanation for why the prevalence was so high among this sample could be selection bias. As the study was advertised as investigating emotion recognition in ASC, it is possible that autistic women who struggle more with FER (and thus those higher in alexithymia) were more likely to self-select. However, there is currently little evidence to explain why the prevalence of alexithymia is generally higher in the autistic population. While this could be because alexithymia and ASC both involve difficulties in mentalising (Kinnaird et al., 2019), Bird and Cook (2013) have speculated that it may be the result of a genetic vulnerability to developing atypical neural connectivity. Individuals with poorer neural connectivity within social cognition networks may develop 'pure' ASC, and those with connectivity issues among emotional networks may develop 'pure' alexithymia, but it is more likely for individuals to have poorer connectivity across both networks, leading to more frequent co-occurrence of alexithymia and ASC. This hypothesis warrants further investigation.
The main and most compelling finding was that autistic females' level of alexithymia, and not ASC severity, was associated with FER ability. Those with greater difficulty in recognising their own emotions were less accurate at recognising the emotions of others. This is in line with research that supports a general FER deficit in alexithymia regardless of ASC (Grynberg et al., 2012; Lane et al., 1996) and adds to the small but growing number of studies that have supported this alexithymia hypothesis in ASC (Allen et al., 2013; Cook et al., 2013; Heaton et al., 2012; Ketelaars et al., 2016; Milosavljevic et al., 2016). Ketelaars et al. (2016) similarly found that it was alexithymia in autistic females that predicted FER of low-intensity emotions. The present study builds upon this with a larger sample of autistic women. While previous studies were only able to demonstrate that alexithymia is associated with recognition of static photographs of basic emotions, the present study has supported the alexithymia hypothesis with greater ecological validity, suggesting autistic women who are higher in alexithymia are poorer at recognising multimodal displays of a range of emotions in the way they typically would be perceived in real life.
It is possible that perceiving and experiencing emotions share the same underlying neural networks (Wingbermühle et al., 2012), consistent with the shared network model of social cognition (Preston & De Waal, 2002). When perceiving another's emotional expression, a neural simulation or representation of this emotion is directly experienced (Preston & De Waal, 2002). This enables individuals to automatically use their experience of the emotion they perceive to help them to recognise it. Damage to neural structures that underlie the experience of a certain emotion can lead to an impaired ability to recognise it in others (Adolphs et al., 2005; Heberlein & Atkinson, 2009), and neurotypical individuals with high alexithymia (who also had impaired FER) show weaker neural responses in regions that are typically activated when processing various emotions (Grynberg et al., 2012; Ihme et al., 2014a).
The present study also indicated that FER ability in autistic women was associated with two out of three core components of alexithymia. While greater issues in identifying one's own emotions and externally oriented thinking were related to poorer FER accuracy, having difficulty describing one's own feelings was not. Since no previous study involving autistic participants appears to have assessed which components of alexithymia are associated with FER, the current study may have been the first to demonstrate this. Interestingly, this differs from research on nonclinical samples. Ihme et al. (2014a) and Swart et al. (2009) found that difficulty describing feelings was most strongly associated with basic FER, perhaps because this ability, like attributing a label to a facial expression, involves semantically processing emotional information (Ihme et al., 2014a). Difficulty identifying feelings was instead associated with the automatic neural processing of briefly presented emotional faces (Ihme et al., 2014a). Contrasting results in the present study may be due to the realistic nature of the FER test. Because the GERT-S presented complex and multimodal emotions, accurate performance may have relied on automatic emotional processing. Although participants did label emotional expressions in this task, being able to describe their feelings may not have been relevant in facilitating performance. The finding that externally oriented thinking was related to poorer FER accuracy in autistic women coincides with studies showing that higher scores on this cognitive style correlated with poorer recognition of emotion from faces (Prkachin et al., 2009) and music (Taruffi et al., 2017).
The finding that participants with higher autistic traits in communication had poorer FER accuracy was somewhat surprising considering that difficulty describing feelings was not associated with FER. However, it is possible that having poorer FER, associated with alexithymia, impacts one's communication ability. If an autistic female struggles to accurately identify the emotion on another's face, they may not understand the true meaning of another's words, how to respond to them appropriately or, for example, whether another person is indicating to them that they are talking excessively. All of these non-verbal cues that rely on accurate FER are vital for effective communication (Kothari et al., 2013).
The finding that alexithymia was not associated with the speed of emotion processing is of interest. Most studies that have previously investigated the alexithymia hypothesis in ASC have not measured RT of FER (e.g. Cook et al., 2013; Ketelaars et al., 2016). Research investigating neurotypical participants has found that individuals with greater difficulty in describing their feelings were slower at recognising the negative basic emotions (Ihme et al., 2014b) or struggled to identify them when they were presented very briefly (P. D. Parker et al., 2005). Such research concluded those high in alexithymia need more time to recognise an emotion perhaps because they do not generate automatic representations of another's emotion (Grynberg et al., 2012), or because they use processing strategies that are not as efficient as those used by individuals with low alexithymia (Vermeulen et al., 2008). The finding in the current study is more in line with those that have similarly shown no relation between RT of FER and alexithymia (Mériau et al., 2006; Vermeulen et al., 2008). Studies that have shown slower FER in ASC compared to controls may not reflect higher levels of alexithymia in the autistic sample (Bal et al., 2010), but could reflect an overall slower processing speed in ASC in general (Travers et al., 2014).
It is also important to discuss the secondary analysis in our study, which demonstrated that when those who self-identified as having an ASC were included in the sample alongside women with a formal diagnosis of ASC, ASC severity (as well as alexithymia) was significantly correlated with FER accuracy. Although ASC severity did not significantly predict FER accuracy for the formally diagnosed sample, it is recommended that future studies continue to investigate the alexithymia hypothesis in female samples to determine more conclusively whether ASC has any association with FER ability.
Although this study has addressed several limitations of previous similar studies, it is not without limitations of its own. Perhaps the largest limitation concerns the sample. Although relying on convenience sampling meant that we were able to sample a large number of participants, we were unable to fully verify that those who stated they had a clinical diagnosis of ASC actually did. Given that more extensive demographic information about the sample was not gathered (e.g. education), it may also be difficult to know to whom these results generalise. One further limitation is that the AQ was used as a measure of participants' ASC severity. Although the AQ is typically used for this purpose in similar research (e.g. Cook et al., 2013), it is not a diagnostic tool and thus higher scores on it do not always reflect the presence of autism. While all females in the core analysis did have an ASC diagnosis, the key finding that poorer FER was not associated with higher scores on the AQ, although indicative, may not conclusively suggest that poorer FER is not part of autism. This leads to the third limitation: because this study did not include a matched sample of neurotypical females, it was unable to determine whether there is any difference between females with or without ASC in FER ability. Future research would benefit from the inclusion of control groups to determine whether it is alexithymia, regardless of diagnosis, that predicts FER ability. Furthermore, although the TAS-20 is a fast instrument to complete and administer, it may not have estimated accurate levels of alexithymia in this particular sample. Self-report measures require a basic level of emotional awareness, yet individuals high in alexithymia and those with ASC may struggle with self-insight and with identifying or describing aspects of themselves (Griffin et al., 2016); levels of alexithymia (and ASC) may therefore have been underestimated. Future research should be aware of this issue and consider exploring ways to measure alexithymia that do not require self-insight. Finally, while this study used an FER test with significantly increased ecological validity compared to static photographs of just a subset of emotions, identifying emotions from short video clips with actors expressing incomprehensible phrases is still not how humans process emotions in everyday life. Studies could benefit from implementing FER tests that are even closer to real life, perhaps by having participants recognise emotions in vivo rather than on a computer screen.

Conclusion
Overall, by counteracting some of the limitations of previous research, this study has provided evidence for how the alexithymia hypothesis applies to females diagnosed with ASC. Alexithymia in autistic women, specifically in terms of difficulty identifying their own emotions and externally oriented thinking, but not their ASC, was associated with FER ability. However, this was only for FER accuracy, as speed of emotion processing was unrelated to both ASC and alexithymia.
To our knowledge, the present study was the first to implement an ecologically valid FER test assessing a range of complex emotions and RT when investigating the alexithymia hypothesis in ASC. Future studies could use similar FER tests to explore alexithymia and FER ability in autistic males. Furthermore, while the prevalence of alexithymia is elevated in the autistic population (e.g. Brewer et al., 2015), there is limited evidence explaining why the two conditions tend to co-occur. It would be useful to investigate the underlying mechanisms of alexithymia and why this overlaps with ASC. Finally, it is important to explore whether the alexithymia hypothesis explains the emotional impairments associated with conditions other than ASC. For example, FER ability in anorexia nervosa may also be explained by higher prevalence of alexithymia in this population (Zonnevijlle-Bendek et al., 2002). Studies could explore whether this applies to other clinical conditions, particularly those characterised by a high prevalence of alexithymia and inconsistent FER results, such as schizophrenia (Cedro et al., 2001; Kring & Elis, 2013).

Table A1. Spearman's rho correlations of alexithymia (TAS-20) and FER ability (GERT-S) in diagnosed autistic females and those scoring AQ ⩾ 32 who self-identified as having ASC (n = 117).

GERT-S total accuracy
GERT-S average RT