Strategic identity signaling in heterogeneous networks

Significance Much of online conversation today consists of signaling one's political identity. Although many signals are obvious to everyone, others are covert, recognizable to one's ingroup while obscured from the outgroup. This type of covert identity signaling is critical for collaborations in a diverse society, but measuring covert signals has been difficult, slowing down theoretical development. We develop a method to detect covert and overt signals in tweets posted before the 2020 US presidential election and use a behavioral experiment to test predictions of a mathematical theory of covert signaling. Our results show that covert political signaling is more common when the perceived audience is politically diverse, and they open the door to a better understanding of communication in politically polarized societies.

1 Twitter data collection

Table S1: News sites used to determine Twitter account engagement with far-left/right politics. The initial 8 news accounts (from which followers were drawn to provide the initial 80,000 seed accounts) are in bold.

Figure S1: Relationship between heterogeneity and homogeneity scores among the initially considered 16,398 Twitter users (green dots; r = .99). Orange crosses denote the 1,409 Twitter users selected for the rating study because they were in the top 20% by either homogeneity or heterogeneity score and had tweets that matched our selection criterion.

We used CloudResearch, a panel service associated with Mechanical Turk (MTurk), to select 3,027 individuals who were identified in the CloudResearch screening survey as having a "moderate", "very liberal", or "very conservative" political orientation. They were invited to a brief survey in which they were asked to solve a political literacy quiz and to report on their news-following habits, the political groups they identify with, their familiarity with different political groups and movements, and their opinions about different political issues. Based on the survey responses, we selected four groups of raters as described in Figure S2. To select far-left and far-right raters, we focused on their preferred news sources and identities. We included individuals who reported regularly using at least one far-left or far-right source to keep up with the news (see Table S3 in the SI Appendix) or who identified with a far-left or far-right identity (see Table S4 in the SI Appendix). In addition, we screened out participants who held inconsistent political views (e.g., followed a far-right news source but self-described as very liberal) or inconsistent political identities (e.g., identified with a far-left identity and as a Republican). Although it may be perfectly reasonable to follow cross-partisan news accounts, we excluded these individuals to reduce a potential source of noise in our data.

Figure S2: The process of selection of raters. Superscripts refer to the survey question (see Questionnaire below) used for each criterion. The flowchart combines the following criteria:

Political literacy for far left/right (original criteria): follows politics (q1), knows politics (q4-6), and positive about far-left/right movements (q3); or only one of the above and uses social media for news (q2).

Adjustments for enough ratings (far right): somewhat conservative (q25), knows about far-right movements (q3), and engaged (either identifies as radical (q28) or uses social media for news (q2)); or very conservative (q25) and knows about far-right movements (q3); or Republican (q26) and feels positive about many far-right movements (q3); or Republican (q26) and feels negative about many far-left movements (q3).

Political literacy for moderates: follows politics (q1), knows politics (q4-6), and knows about far-left/right movements (q3); or only one of the above and uses social media for news (q2).
In order to select raters with at least a minimum of political literacy, we selected participants who reported following the news at least once a week, answered two out of three questions correctly in the political literacy quiz, and were in the top 50% by their positivity toward their own far-left/right political movements (see Table S5 in the SI Appendix). Because our analysis focused on rating or sharing tweets, we wanted to oversample people familiar with social media; we therefore also added to the pool participants who met only one of the political literacy criteria but listed a social media site as their regular source of news (Twitter, Facebook, Reddit, 4chan, 8kun, ctree, and forums). To select moderate left and right raters, we included all participants who identified as Democrat or Republican and were not marked as having a far-left/right ideology. In addition, we used the same political literacy criteria as for far-left/right participants, except that we selected those who were in the top 75% in knowledge of far-left/right movements (regardless of whether they felt positive about them).
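The literacy screen just described reduces to a small boolean rule. The sketch below is illustrative only: the field names are hypothetical stand-ins for the screening-survey variables, and it assumes the percentile cutoffs have already been computed.

```python
def passes_far_literacy_screen(r):
    """Illustrative political-literacy screen for far-left/right raters.
    `r` is a dict of (hypothetical) screening-survey fields."""
    criteria = [
        r["follows_news_weekly"],           # follows the news at least once a week
        r["quiz_correct"] >= 2,             # 2 of 3 literacy-quiz questions correct
        r["top50_positive_own_movements"],  # top 50% positivity toward own movements
    ]
    # Full screen: all three criteria; relaxed screen: exactly one criterion
    # plus a social media site listed as a regular news source.
    return all(criteria) or (sum(criteria) == 1 and r["social_media_news"])
```

Under these assumptions, a respondent meeting all three criteria passes regardless of news source, while a respondent meeting exactly one criterion passes only if they get news from social media.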
Given these criteria, our pool of potential raters included fewer far-right participants compared to all other groups. This is not surprising given that people who participate in Amazon's Mechanical Turk platform tend to lean Democrat [2]. We therefore broadened our criteria for far-right raters, while still retaining the focus on far-right ideology, identity, and a minimum of political literacy. We included participants who identified as Republican and, compared to all right-leaning participants, were in the top 10% for feeling positive about many far-right movements or in the top 10% for feeling negative about far-left movements. We also added participants who self-described as very conservative and were in the top 75% of right-leaning individuals according to their knowledge of far-right movements. Finally, we added to the pool participants who identified as somewhat conservative, were in the top 75% according to their knowledge of far-right movements, and were engaged online (either self-identified as radical or used social media for news). We used this pre-selected pool for both the rating of tweets and the behavioral experiment.

Political opinion items. Each of the following items asked "Which of the following statements comes closest to your view?" with two options:

- The government should do more to help needy Americans, even if it means going deeper into debt vs. The government today can't afford to do much more to help the needy
- Racial discrimination is the main reason why many black people can't get ahead these days vs. Black people who can't get ahead in this country are mostly responsible for their own condition
- The best way to ensure peace is through military strength vs. Good diplomacy is the best way to ensure peace
- Government is almost always wasteful and inefficient vs. Government often does a better job than people give it credit for
- Government regulation of business is necessary to protect the public interest vs. Government regulation of business usually does more harm than good
- Poor people today have it easy because they can get government benefits without doing anything in return vs. Poor people have hard lives because government benefits don't go far enough to help them live decently
- Immigrants today strengthen our country because of their hard work and talents vs. Immigrants today are a burden on our country because they take our jobs, housing and health care
- Most people who want to get ahead can make it if they're willing to work hard vs. Hard work and determination are no guarantee of success for most people
- Business corporations make too much profit vs. Most corporations make a fair and reasonable amount of profit
- Stricter environmental laws and regulations cost too many jobs and hurt the economy vs. Stricter environmental laws and regulations are worth the cost
- Homosexuality should be accepted by society vs. Homosexuality should be discouraged by society
- In foreign policy, the U.S. should take into account the interests of its allies even if it means making compromises with them vs. In foreign policy, the U.S. should follow its own national interests even when its allies strongly disagree
- It's best for the future of our country to be active in world affairs vs. We should pay less attention to problems overseas and concentrate on problems here at home
- Our country has made the changes needed to give blacks equal rights with whites vs. Our country needs to continue making changes to give blacks equal rights with whites
- The economic system in this country unfairly favors powerful interests vs. The economic system in this country is generally fair to most Americans
- The obstacles that once made it harder for women than men to get ahead are now largely gone vs. There are still significant obstacles that make it harder for women to get ahead than men

Specific identities [q23]
The following list contains various words that people might use to describe their views. Which of these words would you say apply to you? Feel free to choose as many words as you like. (Response options included identity labels such as Alt-right, plus an "Other (specify)" field.)

What is the percent chance that you will vote in the upcoming 2020 U.S. Presidential election? Please enter a number from 0 to 100, where 0 means that you will definitely not vote, and 100 that you will definitely vote. If you have already voted, please enter 100.

Own voting intentions [q30]
If you do vote in the election, what is the percent chance you will vote for each of the candidates below? For each candidate, please enter a number from 0 to 100, where 0 means that you will definitely not vote for that candidate, and 100 that you will definitely vote for that candidate.

[q31] What percentage of your social contacts that live in your state are likely to vote in the upcoming 2020 U.S. Presidential election? Please enter a number from 0 to 100, where 0 means that you think none of your social contacts will vote, and 100 means that all of your social contacts will vote.

Social circle voting intentions [q32]
Out of all your social contacts who live in your state and are likely to vote in the 2020 U.S. Presidential election, what percentage do you think will vote for each of the candidates below? For each candidate, please enter a number from 0 to 100, where 0 means that none of your social contacts will vote for that candidate, and 100 means that all of your social contacts will vote for that candidate.
Rating and classification of tweets
Online participants answered four questions for 5,215 tweets in total (Figure S3). Of those, 4,752 received enough ratings to be evaluated according to various criteria (Table S6). A total of 164 tweets were pre-selected as covert or overt tweets based on these criteria.

Figure S3: Online questionnaire for rating tweets.

Figure S4 shows the mean tweet-level scores for each of the items within the two dimensions used to automatically pre-select tweets, separately for four groups of raters defined in relation to the political orientation of the tweet: copartisan far-left/right raters, copartisan moderate raters, cross-partisan far-left/right raters, and cross-partisan moderate raters (n = 1,992 raters in total). For example, in the case of a right-leaning tweet, copartisans would be right-leaning raters and cross-partisans left-leaning raters. Of all tweets, 93 were marked as overtly political and 71 as covertly political. As described above, each tweet was given a score based on the average ratings for each of the four groups; Figure S4 presents the mean of the scores across all tweets.

Table S6: Criteria used to select covert and overt tweets for the behavioral experiment, based on ratings given by different political groups.

Criteria for left (right) covert tweets:
- Copartisan difference in tweets' perceived political orientation: the difference between average political orientation ratings* given by far- and moderate left (right) raters is above the 80th percentile.
- Copartisan difference in affective response: the difference between average negative affect* experienced by moderate and far-left (right) raters is above the 80th percentile.
- Neutrality of moderates' rated political orientation: the average rating of political orientation given by left (right) moderates is neutral or mildly left (right).
- Moderates' affective response: the average negative affect experienced by left (right) moderates is low (<=4 on a 7-point scale).
- Political content: at least 75% of far-left (right) raters think the tweet is political.

Criteria for left (right) overt tweets:
- Cross-partisan difference in political orientation: the difference between average political orientation ratings* given by left and right raters is below the 20th percentile.
- Cross-partisan difference in affective response: the difference between average negative affect* experienced by left and right raters is above the 80th percentile.
- Copartisan similarity in affective response: the difference between average negative affect* experienced by far- and moderate left (right) raters is below the 20th percentile.
- Political content: at least 75% of all raters think the tweet is political.

*To make differences comparable across tweets, differences were divided by the sum of standard deviations of ratings given by each group of raters.
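As a concrete illustration of how the Table S6 thresholds compose, the sketch below applies the covert criteria for one political side to synthetic per-tweet summary statistics. The field names, and the assumption that ratings arrive as per-group means and standard deviations, are ours, not the authors':

```python
import numpy as np

def standardized_diff(mean_a, mean_b, sd_a, sd_b):
    # Differences are scaled by the sum of the two groups' rating standard
    # deviations, as in the footnote to Table S6.
    return (mean_a - mean_b) / (sd_a + sd_b)

def flag_covert(tweets):
    """Flag tweets meeting the covert criteria for one political side.
    `tweets` is a list of dicts with hypothetical summary-statistic keys."""
    orient = np.array([standardized_diff(t["orient_far"], t["orient_mod"],
                                         t["orient_far_sd"], t["orient_mod_sd"])
                       for t in tweets])
    affect = np.array([standardized_diff(t["neg_mod"], t["neg_far"],
                                         t["neg_mod_sd"], t["neg_far_sd"])
                       for t in tweets])
    hi_orient = orient >= np.percentile(orient, 80)   # copartisan orientation gap
    hi_affect = affect >= np.percentile(affect, 80)   # copartisan affect gap
    return [bool(o and a
                 and t["neg_mod"] <= 4                 # moderates' negative affect low
                 and t["pct_political_far"] >= 0.75)   # >=75% of far raters: political
            for t, o, a in zip(tweets, hi_orient, hi_affect)]
```

The percentile thresholds are computed within the candidate pool, so the rule selects the tweets with the largest standardized copartisan gaps that also satisfy the absolute cutoffs.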
According to the definition of covert signaling, all raters should receive the political signal of overt tweets, but only ingroup (copartisan) far-left/right raters should receive the political signal of covert tweets. This is reflected in the first two rows of Figure S4. On the one hand, overt tweets were rarely viewed as not political by any of the four groups (orange bar in the first row of Figure S4). On the other hand, covert tweets were rarely marked as not political by copartisan far-left/right raters but considerably more often by the three other groups (grey bar in the first row of Figure S4). Regarding guesses about the political identity of the tweet author, overt tweets were considered extreme left or right by all groups, but covert tweets were considered less extreme by copartisan moderate raters than by copartisan far-left/right raters (grey bar in the second row, first two columns of Figure S4). Cross-partisan raters still considered covert tweets fairly politically extreme. We believe this reflects the opposite political side being more sensitive to any signal from the other political group, even when they do not quite understand it (and tend to mark it as not political). Both cross-partisan groups, however, still marked covert tweets as less politically extreme than copartisan far-left/right raters did.
As expected, cross-partisan raters disliked all tweets from the opposite side of the political spectrum, but disliked overt tweets more than covert tweets. Copartisan raters did not view the selected overt and covert tweets negatively (third row, first two columns of Figure S4), nor did they view them as insulting to people like them (fourth row, first two columns of Figure S4). As expected, copartisan far-left/right raters were more positive and viewed the tweets as less insulting than copartisan moderate raters did. However, for copartisan groups there was no difference in the perception of overt versus covert tweets. The difference in affective response between covert and overt tweets appears only within the ratings of cross-partisan members (last two columns of the last two rows of Figure S4). Members of the opposite political side viewed the selected overt tweets as more negative and more insulting to their group than covert tweets. These differences in average ratings across measures provide some evidence that the pre-selected tweets represented overt and covert political signaling at the time of the 2020 presidential election. We further test this by comparing their usage on Twitter in the next Supplementary section.

Figure S4: Mean tweet-level scores on four items from copartisan, cross-partisan, moderate, and far-left/right raters, n = 164. Error bars represent standard errors of the mean.

Description of covert and overt tweets' networks
One of the predictions of the general theory of covert signaling is that covert signaling should be more prevalent among individuals in more heterogeneous communities. To explore whether this was true for our pre-selected tweets in a real-world context, we compared the Twitter networks (heterogeneous vs. homogeneous) of the Twitter accounts for the full sample of tweets pre-selected as covert and overt (n = 164). As shown in Figure S5, both overt and covert political tweets were more likely to come from Twitter accounts with homogeneous follower networks than from accounts with heterogeneous ones (61% of overt tweets and 58% of covert tweets were in homogeneous contexts). Covert tweets were, however, more likely than overt tweets to appear in a heterogeneous context: 42% of covert tweets were used in heterogeneous contexts compared to 39% of overt tweets. In other words, all political speech is more likely to happen in homogeneous Twitter networks than in heterogeneous ones, but when political speech happens in heterogeneous networks it is more likely to be covert. These results are in line with the theory of covert signaling and support the validity of our measures. Although the difference in the relative use of covert signaling in homogeneous and heterogeneous networks on Twitter was in the predicted direction, it was relatively small. This is reasonable, given the small sample of selected covert and overt tweets as well as the vast number of factors that influence online communication in a public forum such as Twitter. We hope future research pursues this line of inquiry, with a wider data collection process and measures of covertness that allow a formal comparison of the use of covert political identity signaling on Twitter.

Figure S6: First part of the experiment: rating of each of the 80 tweets selected as covert and overt from both co- and cross-partisan sources.
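The comparison above amounts to tallying, for each signal type, the share of tweets whose account network is heterogeneous. A minimal sketch, with made-up data mirroring the reported 39% (overt) vs. 42% (covert) split:

```python
from collections import Counter

def heterogeneous_share(tweets):
    """Share of each signal type ('overt'/'covert') appearing in
    heterogeneous network contexts. `tweets` is a list of
    (signal_type, context) pairs."""
    het, total = Counter(), Counter()
    for signal, context in tweets:
        total[signal] += 1
        if context == "heterogeneous":
            het[signal] += 1
    return {s: het[s] / total[s] for s in total}
```

On counts matching those reported above, the covert share in heterogeneous contexts exceeds the overt share, which is the pattern the theory predicts.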

PLEASE READ THE INSTRUCTIONS CAREFULLY, THE TEXT IN RED WILL CHANGE FOR EACH PAGE.
You are now interacting with ten other people who are also participating in this study.
[Size group 1] described themselves as being on the left of the political spectrum, and [size group 2] described themselves as being on the right.
For each person who likes most of the tweets you share, you will receive an additional 1 cent.
For each person who dislikes most of the tweets you share, you will lose [cost of dislike].
In other words, for this page, if all ten people like most of the tweets you share, you will receive an additional 10 cents. If all ten people dislike most of the tweets you share, you will lose [total possible cost].

PLEASE READ THE INSTRUCTIONS CAREFULLY, THE TEXT IN RED WILL CHANGE FOR EACH PAGE.
You are now interacting with ten other people who are also participating in this study.
[Size group 1] described themselves as being on the far left (right) of the political spectrum, and [Size group 2] described themselves as being mainstream left (right).
For each person who likes most of the tweets you share, you will receive an additional 1 cent.
For each person who dislikes most of the tweets you share, you will lose [cost of dislike].
In other words, for this page, if all ten people like most of the tweets you share, you will receive an additional 10 cents. If all ten people dislike most of the tweets you share, you will lose [total possible cost].

Cross-partisan audience: n = 243. Copartisan audience: n = 235.

Each participant received 8 different combinations of group sizes and costs of dislike: (1) size of group 1 (and group 2): 1 (9), 4 (6), 6 (4), or 9 (1); (2) cost of dislike (and total possible cost): 0.5 cent each (5 cents total) or 1 cent each (10 cents total).

Figure S8: Bonuses received at the end of the behavioral experiment for participants with cross-partisan audiences, n = 243. Average predictive margins for the use of covert tweets with 95% confidence intervals by total payoff (a) and sample distribution of total bonuses received at the end of the eight experimental rounds (b). Estimates are from multilevel Poisson models with random intercepts by individuals over experimental conditions, averaging predicted number of tweets shared across sample values for number of tweets shared, political leaning, age, gender, race, and education.
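The eight experimental cells and the stated incentive scheme can be written down directly. The sketch below is a straightforward encoding of the instructions above (amounts in cents); the function name is ours:

```python
from itertools import product

# Eight within-subject cells: outgroup size (out of a ten-person audience)
# crossed with the per-dislike cost.
CONDITIONS = list(product([1, 4, 6, 9], [0.5, 1.0]))

def round_bonus(n_like, n_dislike, cost_per_dislike):
    """Bonus for one round: +1 cent per audience member who likes most of
    the shared tweets, minus the per-dislike cost for each member who
    dislikes most of them. Ten audience members per round."""
    assert n_like + n_dislike <= 10
    return n_like * 1.0 - n_dislike * cost_per_dislike
```

For example, a round in which all ten members like the shared tweets pays 10 cents, while a round in which all ten dislike them costs 5 or 10 cents depending on the condition.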

Power analysis
For the behavioral experiment, we used two independent samples (copartisan and cross-partisan comparisons) and four groups of raters (far left, moderate left, moderate right, and far right). Given our budget and time constraints, we estimated that we could recruit 60 individuals per category (a total of 480 participants). We calculated the power for detecting a linear increase in the total number of covert tweets shared over outgroup size with one predictor (comparison), a within-subject effect (cost), two independent samples (cross-partisan raters and copartisan raters), and controlling for the total number of tweets shared, for a total sample size of 480 participants.
For the cross-partisan high-cost condition, we assumed a conservative increase in the average number of covert tweets shared for each additional outgroup audience member, from zero covert tweets when there are no outgroup members to one covert tweet when there are only outgroup members (see estimates for our actual outgroup sizes in Table S7). We also assumed that for the low-cost condition (where dislikes were half as costly), the increase would be a third less than in the high-cost condition. We also assumed that participants in the copartisan condition would generally share half of the number of covert tweets compared to the cross-partisan condition, but the effect of outgroup size would not be different than in the cross-partisan condition.
We assumed the standard deviation of the total number of covert tweets shared would be around two tweets, and that it would increase with outgroup size and be higher in the low-cost condition. We expected the total number of tweets shared to have a standard deviation of eight and to be highly correlated with the total number of covert tweets shared, although decreasingly so over outgroup size.

Figure S9: Proportion of rounds by the number of covert and total tweets shared.
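A power calculation of this kind can be sketched by simulation. The code below is our rough reconstruction, not the authors' exact procedure: covert-tweet counts are drawn from a Poisson whose mean rises linearly with outgroup share (from near 0 to ~1, the high-cost cross-partisan assumption above), and power is the fraction of simulated datasets in which a simple least-squares slope test rejects at the 5% level:

```python
import numpy as np

def simulate_power(n_per_cell=60, n_sims=200, seed=0):
    """Fraction of simulated experiments detecting a linear increase in
    covert tweets shared over outgroup size."""
    rng = np.random.default_rng(seed)
    x = np.repeat(np.array([1, 4, 6, 9]) / 10.0, n_per_cell)  # outgroup share
    rejections = 0
    for _ in range(n_sims):
        y = rng.poisson(np.clip(x, 0.05, None))  # assumed mean = outgroup share
        slope, intercept = np.polyfit(x, y, 1)
        resid = y - (intercept + slope * x)
        se = np.sqrt(resid.var(ddof=2) / ((x - x.mean()) ** 2).sum())
        rejections += abs(slope / se) > 1.96     # two-sided z test on the slope
    return rejections / n_sims
```

With these assumptions the design is comfortably powered; smaller assumed slopes or halved cell sizes can be examined by changing the arguments.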

Statistical model
To compare the effects of the different experimental conditions and individual-level characteristics, we fit a generalized mixed-effects Poisson count model to the number of covert tweets shared per round, with the outgroup experimental conditions nested within individuals. An analysis of the corresponding linear mixed-effects model showed enough variation between individuals to warrant random intercepts for each individual (intra-class correlations between 0.304 and 0.775 across models, and significant Breusch-Pagan tests for cluster-level variation, 583 < χ² < 3973). Given that we wanted to analyze counts of all tweets and of covert tweets shared, we approximated the outcome distribution with a Poisson distribution. We compared including outgroup size as a continuous or a categorical variable and found the latter to be a better fit. The resulting point estimates are in Table S9.

Table S9: Poisson mixed model estimates on the count of covert (a and b) and total (c and d) tweets shared in each round in the copartisan (a and c) and cross-partisan (b and d) conditions. Estimates used for Figure 5. Standard errors in parentheses.
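The random-intercept check can be illustrated with a one-way intraclass correlation, the quantity behind the reported ICC range of 0.304-0.775. The function below is a standard textbook ICC(1) computed from ANOVA mean squares on synthetic data, not the authors' code:

```python
import numpy as np

def icc1(values, groups):
    """One-way ANOVA intraclass correlation: the share of outcome variance
    attributable to between-individual differences (assumes roughly equal
    cluster sizes)."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    ids, counts = np.unique(groups, return_counts=True)
    grand = values.mean()
    means = np.array([values[groups == g].mean() for g in ids])
    ss_between = (counts * (means - grand) ** 2).sum()
    ss_within = sum(((values[groups == g] - m) ** 2).sum()
                    for g, m in zip(ids, means))
    ms_between = ss_between / (len(ids) - 1)
    ms_within = ss_within / (len(values) - len(ids))
    k = counts.mean()  # average cluster size
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

When within-person responses cluster tightly around person-specific means, the ICC approaches 1 and random intercepts are clearly warranted; when responses vary as much within as between people, it falls toward 0.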
In Table S9, outgroup size is entered categorically, with 1 outgroup member out of 10 as the reference category. In Table S10 we present the linear mixed model of the number of covert tweets shared in the copartisan and cross-partisan conditions, centered by group.

Table S10: Linear mixed model estimates on the count of covert tweets shared in each round in the copartisan (a) and cross-partisan (b) conditions with covariance matrices centered by group. Standard errors in parentheses.

The main results are substantively similar to those in Table S9. However, the linear models enabled the estimation of different covariance matrices for each political group. The variance of the residuals shows that in the copartisan condition (a), participants from the far left and far right had a larger variance than those in the mainstream. However, in the cross-partisan condition (b), participants from the far left and far right had lower variance than those who were more mainstream.