As a dynamic, mask-like form of digital self-representation, Animoji on the Apple iPhone X afford interesting new possibilities for mediated communication. However, the effect that Animoji animators seek to create depends heavily on the characteristics of their spoken voice. We investigate spoken behavior in Animoji video clips shared publicly on YouTube.com and Twitter.com in the first 14 months after the iPhone X’s commercial release, using quantitative and qualitative discourse analysis methods. Through modifications in vocal quality, prosody, and lexis, the clips were found to enact playful verbal performances that varied in degree and nature according to the specific Animoji used, the gender of the animator, and the gender of the performed character. At the same time, some performances invoked stereotypes that denigrate members of certain groups. We conclude by discussing the broader social and ethical implications of Animoji and other forms of filtered digital self-representation.
The recent development of ‘deepfake’ technologies, in which an algorithm superimposes images and videos onto source images and uses synthesized voices to show someone saying and doing things that they never did in reality (Chesney & Citron, 2018), is a sophisticated example of a broader phenomenon, namely, the growing availability of technology that allows even people with limited technical skills to filter and modify their online self-presentation. We refer to this phenomenon as Filtered Digital Self-Representation, or FDSR for short. FDSR includes filters on Snapchat and Instagram that overlay a user’s image and alter its appearance, as well as the recent phenomenon of Animoji on the iPhone X, through which users can video chat and send video clips of themselves speaking through large-format emoji that mirror movements of the sender’s head, mouth, eyes, and eyebrows in real time. In addition to popular emoji characters such as poop, unicorn, and robot, Animoji users can create and animate custom human “Memoji” that represent their appearance (or how they wish to appear) (Tillman, 2018).
‘Deepfakes’ have been used to make politicians appear to make provocative statements and celebrities appear to engage in pornographic acts (Chesney & Citron, 2018; Parkin, 2019; Schwartz, 2018); their near-perfect realism raises the specter of a “post-authentic” digital world, in which we can no longer trust the evidence of our eyes and ears (e.g., Schwartz, 2018).1 Snapchat filters and Animoji are mainly intended for playful interaction, but their use can also have serious consequences. Plastic surgeons report that young people are increasingly requesting to have their bodies surgically modified to resemble their Snapchat-filtered appearance (McAfee, 2018), blurring the distinction between virtual and physical reality and exacerbating the problem of body dysmorphia, especially among teens and young women.
As mask-like alternate personae that can be used to communicate via video, Animoji also have the potential to affect communication and social behavior. Although no research has been published on Animoji to date, we know from research on avatars in virtual environments that users prefer idealized representations over representations of their actual self (Jin, 2011) and that an individual’s behavior is shaped by their digital self-representation, independent of how others perceive them (Yee & Bailenson, 2007). However, an area as yet entirely unexplored is how digital self-representation of any type affects how people speak through these filters in video messaging or video chat. This is a notable gap, in that how the spoken voice is used can significantly support or undermine the effect that the animator of an FDSR intends to create.
To begin to address this gap, we investigate spoken behavior in Animoji and Memoji video clips that were shared publicly on two popular social media platforms, YouTube.com and Twitter.com. Using methods of qualitative and quantitative discourse analysis, we address the following research questions:
RQ1: How do iPhone X users speak in Animoji and Memoji clips as regards voice quality, accent stylization, and word choice?
RQ2: How, if at all, does this speech vary according to Animoji type?
RQ3: How, if at all, does this speech vary according to user gender?
Our analyses show that speaking through Animoji filters encourages playful verbal performances of stereotyped characters and personae. The nature and degree of the performance are influenced by the specific Animoji/Memoji used, as well as by the gender of the person who animates it and of the character that is performed. These findings have interactional, social, and ethical implications.
FDSR is the use of a technologically-modified version of oneself in a digitally-mediated environment. Increasingly seen in web, mobile, and virtual reality applications, FDSR manifests in its most basic form as modified selfies. It is also growing in use in both asynchronous (e.g., images, video clips) and synchronous (e.g., video chat, avatar chat) computer-mediated communication (CMC). Visual modifications in FDSR include decorations, makeup, costumes, accessories, and backgrounds; change of gender or age; cartoonification; and non-human characteristics, as illustrated in Figure 1 for the mobile platforms Snapchat, Instagram, AR Emoji (Samsung), and Animoji (Apple), respectively.2

Figure 1. Examples of mobile Filtered Digital Self-Representation
Voice modification is also possible via third-party apps and video editing software,3 and it is a built-in feature of some FDSR mobile apps, such as iMoji.4 Here we are interested in spoken communication using Animoji and Memoji, independent of whether the voice has been technologically altered or not. In fact, several instances of technological voice modification occurred in our data, but they were not frequent.
Animoji were first introduced on the Apple iPhone X, which launched commercially in November 2017. Memoji, along with several new Animoji, were introduced several months later when the iPhone’s operating system was upgraded to iOS 12.5 Animoji are large-format emoji, and only a limited set of them (20 in iOS 12) are available on the iPhone X. In contrast, Memoji are customizable cartoon human heads, and their variety is theoretically unlimited. Examples of each type are shown in Figure 2.


Figure 2. Examples of Animoji (top rows) and Memoji (bottom rows)
The term ‘Animoji’ is used henceforth as a cover term to refer to both Animoji and Memoji, except when a contrast between the two types is intended.
Animoji are animated in real time through movements of the user’s head, mouth, eyes, and eyebrows. As of this writing, they can be used to send short asynchronous video clips (e.g., in text messages or email) and in FaceTime video chat originating from an iPhone X. People generally speak when recording Animoji video clips, and they talk to other people in FaceTime chats; thus, Animoji use typically involves speech. The nature and characteristics of that speech are the focus of this study.
Tiidenberg and Whelan (2017, p. 14) posit that self-representation is an “emergent, recognizable, intertextual genre” that includes a variety of practices. These practices are significantly shaped by the technology available at a given time. The technological means for representing one’s self online have become progressively richer throughout the history of CMC, from textual descriptors to graphical avatars in anonymous and pseudonymous chat rooms and message boards, to profile photos and selfies in ‘nonymous’ social media sites (Kapidzic & Herring, 2014). Following the anonymity and nonymity of these earlier trends, a more recent trend has emerged, which we term ‘polynymity.’ Polynymity (literally, ‘many names’) refers to the tendency for social media users to create and maintain multiple identities across a variety of online spaces and sometimes within a single platform. FDSR facilitates polynymity by providing visual and auditory enhancements to support and reflect multiple identities. Online self-representation has been attracting the attention of researchers since before polynymity was a trend, however, and, indeed, since the early days of the internet.
One of the earliest topics to attract research interest was play with gender identity. In text-based virtual worlds such as Multi-User Dungeons (MUDs) in the early 1990s, cis-men and women would choose to swap genders textually for a variety of reasons, including to get attention (primarily for cis-male users), to be left alone (for cis-female users), or simply to explore other aspects of their identity (Bruckman, 1993). However, subsequent research found that in asynchronous forums, and even in anonymous social environments such as MUDs and chat rooms, users generally presented as their offline gender (e.g., Herring & Stoerger, 2014). Further, in online role-playing environments where participants did textually assume a different gender (or race, sexual orientation, or species; see, e.g., McRae, 1996), their linguistic choices – style, word choice, politeness, etc. – rarely matched those of their assumed gender and instead tended to reflect stereotypes of cis men and women (Herring & Martinson, 2004).
As the internet became more multimodal, and particularly with the rise of social network sites, the trend toward anonymity in CMC was replaced with ‘nonymity’ (Zhao, Grasmuck, & Martin, 2008). In social network sites, users represent themselves visually via profile pictures, as well as maintaining accounts that link to their real names. At the same time, users tend to exaggerate or omit certain features to present themselves in a favorable light. This can be done through the selective curation of photos (“selfies”) (Bakhshi et al., 2015; Manago et al., 2008), as well as with image filters. However, choice of filters is not always determined by a desire to appear attractive. Rios, Ketterer, and Wohn (2018) interviewed 18 Snapchat users about how they decided which “digital masks” or filters called Face Lenses to use and found that filters were chosen based on goals, personality, and a “scroll-first, decide-later mindset.”6
There is evidence that use of a graphical digital filter can affect one’s communication style. Palomares and Lee (2010) found that in a computerized trivia game, women hedged and apologized more when using a female avatar and were more aggressive and bold when using a male avatar. Yee and Bailenson (2007) identified what they called the Proteus Effect, whereby users in online environments become deindividuated (cf. Lea & Spears, 1991) and use avatars as the entire basis of their self-representation. Under this effect, users conform to the behavior that they believe others expect of their avatar. Given this tendency, Yee and Bailenson (2007) investigated whether having more/less attractive and shorter/taller avatars impacted how users behaved while “piloting” these avatars. They found that users given more attractive avatars were more likely to self-disclose and approach opposite-gendered avatars. Participants given taller avatars were more willing to make unfair splits in an economic game where a player proposes how to divide a sum of money with a second player, while users given shorter avatars were more likely to accept unfair splits from other avatars.
Polynymity online reflects the truth that “people’s lives are ‘faceted’; that is, people maintain social boundaries and show different facets or sides of their character according to the demands of the current social situation” (Farnham & Churchill, 2011, p. 359). Various online platforms with different technical affordances and cultural norms may encourage users to behave or present themselves in a particular manner.7 For example, Instagram’s marketing and design is structured around encouraging users to create artful, polished content (Duguay, 2015). Users may also choose to have multiple accounts on the same platform to reach different audiences. Dewar, Islam, Resor, and Salehi (2019) found that about 18% of surveyed Instagram users, who were mainly young, white, and female, reported having multiple accounts, one of which was a ‘fake’ Instagram account, or ‘finsta.’ Finsta users report that they use multiple accounts as a way to avoid context collapse, not have to worry about maintaining “aesthetic” or their “ideal authentic self” (cf. Ward, 2016), and (ironically) to be more authentic.
Females tend to use image filters more than males. In a study by Dhir et al. (2016), teenage girls in Norway were found to use filters more than young adults or adults did. In another study, Thelwall and Vis (2017) surveyed image sharing practices on UK Facebook, Twitter, Instagram, Snapchat, and WhatsApp. They found that women were more likely to share photos overall, despite being more concerned than men about privacy across all platforms.
Finally, Tiidenberg and Whelan (2017) looked at how “not-selfies” (i.e., reaction images and reaction GIFs) act as a means of self-representation when the subject of the image or GIF is in some way similar to the poster, but does not necessarily physically resemble the poster. Animoji in the form of animals, robots, and poop are also “not-selfies” in the sense of Tiidenberg and Whelan (2017), who argue that self-representation is not a specific kind of text (i.e., a selfie), but rather a genre where an image is chosen with the intentionality to convey a perceived representativeness of the image for the poster. This broad conceptualization includes technologies that allow users to create a cartoon version of themselves, such as Memoji. Some cartoonified representations, such as modifiable stickers using the Bitmoji app (and other similar apps), can be further personalized with clothing, make-up, and accessories and can be shown expressing an almost limitless variety of emotions and actions (Elder, 2018).8
Much as an avatar or Snapchat filter draws viewers’ attention to the visual manipulation involved in filtered digital self-representation, verbal performance draws an audience’s attention to the use of language as an act of expression (Bauman, 1975). In verbal performance, the performer consciously or unconsciously shapes the linguistic features of their speech in order to imbue language with cultural and ideological meanings beyond the propositional content of the spoken words. These meanings may alter or even cancel out the text’s propositional content (Bauman, 1975). This has been observed in numerous studies where performers engage in style crossing (e.g., using a language variety outside the bounds of an individual’s regularly used repertoire; Rampton, 2000) to express subversive gender identities (Barrett, 1999), to recontextualize mock ethnic styles (Chun, 2004), or to align themselves with perceived audience members (Coupland, 2001). In these studies, style crossing was used as a keying device (Goffman, 1974) to signal to the audience that a performance frame was being invoked.
Language stylization plays an integral role in indexing the performative frame being used and in providing clues to the performer’s intention. Stylization is the process whereby a performer takes the voice of another and repurposes it to suit the performer’s objective. Bakhtin asserts that stylized utterances are inherently multi-voiced and contain “varying degrees of otherness or varying degrees of ‘our-own-ness’... [and] carry with them their own evaluative tone, which we assimilate, rework, and re-accentuate” (Bakhtin, 1986, p. 89). Coupland (2001) posits that stylized utterances project “altera personae” that are drawn from familiar socio-cultural repertoires which include archetypes, stereotypes, public personae, and well-known media characters. Moreover, stylizations convey ideological values and ideational meanings beyond the immediate context (Coupland, 2001).
For example, Bucholtz and Lopez (2011) observed the use of dialect stylization to depict racial stereotypes in Hollywood film portrayals of race. Characters such as the white-washed African-American female and the hypermasculine African-American male were stylized using Standard American English or African-American Vernacular English (AAVE), the latter drawing on features such as postvocalic /r/ and /l/ deletion and consonant cluster simplification. Bucholtz and Lopez (2011) dubbed this kind of stylization a form of “linguistic minstrelsy,” whereby language varieties are deauthenticated, decoupled from their original intertextual meanings, and indexically regimented to the point where the persona being played is reduced to a narrow set of stereotypical indexical meanings. In this way, stylization can work toward building stereotypical personae that can perpetuate negative perspectives on racial and ethnic minorities. Features of AAVE are sometimes used in the Animoji performances in our data to suggest racial stereotypes.
In addition to race and ethnicity, stereotypes exist for gender and sexual identity. In a study of perceptions of gender-typed occupations, activities, and traits, Blashill and Powlishta (2009) found that gay males were perceived as less masculine/more feminine than heterosexual males, and lesbians were viewed as more masculine/less feminine than heterosexual females. These results were consistent with findings from 20 years previous (Blashill & Powlishta, 2009), indicating that stereotypes persist over time. Stereotypes are readily available, widely recognizable features that, for better or worse, can be drawn on easily, particularly in impromptu performances (e.g., Barrett, 1999; Bucholtz & Lopez, 2011; Chun, 2004). However, stereotypes make over-generalized inferences about certain social groups and fail to recognize the uniqueness of each individual. Davis and Harris (1998) characterize stereotypes as “negative and/or misleading” when they are used to predict and explain behavior, since they assume that all members of a group share similar traits (Gibson, 1989).
Stylization and the projection of alternate personae can also serve less pernicious purposes. For instance, use of mock Asian varieties in the performances of comedian Margaret Cho, as studied by Chun (2004), demonstrates a strategic re-shaping of a mock language variety for the purpose of providing social commentary on Asian-American stereotypes. Cho differentiated her Asian-American identity from her mother’s Asian identity by using mock Asian linguistic features like the neutralization of the phonemic distinction between /r/ and /l/ (e.g., flied lice for fried rice) when speaking in her mother’s voice. In this way, her mother’s voice contrasted with Cho’s own Californian variety of English. Similarly, Barrett’s (1999) examination of African-American drag queen performances demonstrated how the use of hedges, declaratives with rising intonation, and other superpolite forms to perform white, heterosexual, upper-class females enacts a polyphonous identity that is in opposition to African-American stereotypes and Western conceptions of masculinity. Thus, stylization can be a vehicle for constructing multilayered identities and personae.
Verbal performances can also be informed by commonly-known archetypes derived from the “collective unconscious” of mankind (Jung, 1964). An archetype is a basic pattern that can be represented in various manifestations featuring specific attributes (Jung, 1964). Contemporary theorists regularly identify 12 Jungian-based archetypes, including the caregiver, innocent, jester, lover, magician, and sage. One of the key features of archetypes is that they are unconscious mental models (Faber & Mayer 2009). This makes archetypes easily understandable stock characters that any improvising performer can draw on readily from memory and know that their audience will “get” the character almost instantly. Some common archetypes, such as villain, damsel in distress, and handsome prince, are invoked in the Animoji clips in our data.
Another relevant concept is that of the persona, or the social role assumed by an individual. According to Jung (1976), the persona is the mask one presents to the world to appear more socially desirable. A person can have multiple personae in different contexts: in the workplace, with family, or in a romantic relationship, depending on when and how they engage (Waisanen & Becker, 2015). Researchers have examined the intended public roles or personae assumed by political figures (Aden, Crowley, Phillips, & Weger, 2016), celebrity stars (Meyers, 2009), and media practitioners (Higgins, 2010). The social web permits users to create online personae mediated by new tools and communicative strategies, such as by constructing virtual characters in online games (Bessière, Seay, & Kiesler, 2007) or purchasing avatars and decorative objects in networked communities (Kim, Chan, & Kankanhalli, 2012). In our data, Animoji users sometimes took on the personae of public figures such as Barack Obama, Donald Trump, Dr. Phil, and popular YouTubers, as well as media characters such as Bugs Bunny, Darth Vader, and various Disney princesses.
Analyzing verbal performances is a way to identify the communicative means that are used as performative shorthand and “the degree of intensity with which the performance frame operates” (Bauman, 1975, p. 297) in different communities. Both the means and the degree of performance are analyzed in the present study, which aims at understanding the relationship between verbal performance and filtered digital self-representation via Animoji. Although much research has been done to understand how language stylization is used to perform alternate identities or personae, little has been done on verbal performance in CMC settings or the influence that digital communicative resources have on verbal performances.
Informed by the literature on verbal performance, stylization, stereotypes, and alternate personae reviewed in the previous section, we refined our first research question about how mediation by Animoji affects the speech produced to ask specifically:
RQ1a. What paralinguistic patterns of voice quality and vocalization are used in Animoji clips that are shared on social media?
RQ1b. How ‘performed’ are the voices in the clips overall, in terms of degree of departure from the Animoji animator’s normal speaking voice?
RQ1c. How frequently are phonological features produced that are stereotypically associated with sounds made by animals/objects (for Animoji) or with certain kinds of speakers (for Memoji)?
RQ1d. How frequently are direct and indirect references made to the Animoji?
To address the research questions, Animoji and Memoji videos were collected from YouTube.com and Twitter.com between November 2017 and January 2019. Animoji videos were located using the search terms Animoji and Animoji jokes/funny/cute/etc. In this way we collected 49 videos (29 from YouTube and 20 from Twitter) created by 39 individuals. Memoji videos were located using the search terms Memoji and Memoji stories. This resulted in nine videos, all from YouTube, created by six different individuals. The term 'animator' is used in this article to refer to the person whose voice and facial movements animate the Animoji/Memoji in a video clip. All but three of the animators appear to be ordinary social media users; of the remaining three, one YouTuber is a professional voice actor, another specializes in voice impressions online, and a third is an online comedian.
The 58 Animoji and Memoji videos were all that we were able to locate at the time that met our sampling criteria. Videos that did not include speech (e.g., that contained only laughter or karaoke singing) were excluded from the collection process. Although it was not possible to determine the precise age of each poster, from their voices all appeared to be over 18. The few videos that were obviously produced by children were excluded. All of the videos were in English, and most seemed to have been produced by native speakers of American English.
Each of the 58 videos was first broken down into clips, which we operationalized as sequences involving a single emoji (usually) and bounded by shifts in key (Goffman, 1974), indicating that a performance frame was being initiated or ended. This method led to the identification of 366 Animoji clips (346 from YouTube and 20 from Twitter)9 and 31 Memoji clips, for a total of 397 clips, in the collected videos.
For the quantitative discourse analysis, four trained coders coded each clip for: 1) the gender of the animator and the gender of the Animoji/Memoji character (the gender options were female, male, and undetermined/other10); 2) whether the Animoji/Memoji was mentioned by name; 3) whether characteristics of the Animoji/Memoji were referenced; 4) whether sounds/accents characteristic of the Animoji/Memoji were produced; and 5) the degree of spoken performance. Degree of performance was intended to capture “the degree of intensity with which the performance frame operates” (Bauman, 1975, p. 297); it was operationalized as deviation from the speaker’s inferred normal speaking voice, on a scale from 0 to 3. A score of 0 was assigned to clips that were deemed to represent the speaker’s habitual or baseline speaking voice. A score of 1 indicates speech that was stylized in pitch, cadence, pronunciation, and/or prosody (e.g., intonation, loudness, duration, stress) but that was still readily identifiable as the speaker’s own voice. A score of 2 indicates more extreme articulatory and/or prosodic deviations from normal voice quality, albeit not yet a performance of a specific, recognizable entity. Finally, a score of 3 indicates a full-blown performance (imitation or impersonation) of a stereotype, archetype, or famous character, regardless of its accuracy. These categories emerged from the data through an iterative coding process and discussion among the coders. [Click here to view video clips illustrating the four degrees of performance.]
Interrater reliability measures were calculated for categories 1-5, and over 88% agreement was reached for the first four. For category 5 (degree of spoken performance), initial agreement among the coders was 73%; after discussion and recoding of a sample of the data, it reached 78%; and a third iteration produced an agreement rate of 84.6%. At that point, the first author reviewed the other coders’ codes and, observing that most of the disagreements were due to one coder, adjusted some codes to bring them in line with those of the other coders. The quantitative results are presented using descriptive statistics.
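As an illustration of this reliability measure, the sketch below shows one way pairwise percent agreement could be computed for the 0-3 degree-of-performance codes. It is not the procedure or code used in the study; the coder names, clip IDs, and scores are hypothetical, and simple percent agreement (rather than a chance-corrected statistic such as Cohen’s kappa) is assumed, in keeping with the agreement figures reported above.

```python
# Minimal sketch of pairwise percent agreement for the 'degree of performance'
# variable (0-3). All coder names, clip IDs, and scores below are hypothetical.
from itertools import combinations

ratings = {
    "coder1": {"clip01": 3, "clip02": 1, "clip03": 0, "clip04": 2},
    "coder2": {"clip01": 3, "clip02": 1, "clip03": 1, "clip04": 2},
    "coder3": {"clip01": 3, "clip02": 2, "clip03": 0, "clip04": 2},
    "coder4": {"clip01": 3, "clip02": 1, "clip03": 0, "clip04": 3},
}

def percent_agreement(a: dict, b: dict) -> float:
    """Proportion of commonly coded clips on which two coders gave the same score."""
    shared = a.keys() & b.keys()
    matches = sum(1 for clip in shared if a[clip] == b[clip])
    return matches / len(shared)

# Averaging agreement over all coder pairs gives one overall reliability figure,
# which could be recomputed after each round of discussion and recoding.
pairs = list(combinations(ratings, 2))
overall = sum(percent_agreement(ratings[x], ratings[y]) for x, y in pairs) / len(pairs)
print(f"Mean pairwise agreement: {overall:.1%}")
```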
For the qualitative analysis, we noted paralinguistic keying features such as shifts in pitch, volume, and voice quality for each performance. We also generated open-ended characterizations of the performances in terms of the language varieties (registers or dialects) that we were able to recognize in the performances. For example, performances were described as ‘Indian-accented English’; ‘African American English with a lisp’; and ‘voice-modulated robotic voice.’ After individually describing each video clip, we collectively formulated descriptions of a smaller set of clips that typified main trends in our data. Illustrations of these trends are provided at the end of the Findings section.
Overall, the clips posted to YouTube and Twitter mostly represented males, especially for Animoji proper as compared to Memoji. That is, most clips were posted by men (Animoji: 90.9% male, Memoji: 58.1% male) and represented male characters (Animoji: 84.9% male, Memoji: 54.8% male). Women were more likely to post Memoji than Animoji, with 41.9% of Memoji being produced by women as compared to only 7.3% of Animoji.
For the most part, the gender of the performed characters matched the apparent gender of the Animoji/Memoji animators, although there were some exceptions. Both male and female animators posted gender-switched performances, although women did so only for Memoji. For Animoji, moreover, men gender-switched only 4.4% of the time (Table 1). Both male and female animators were much more likely to perform a different gender in the Memoji clips, where the performances were often of famous people, and women did this slightly more than men (38.5% vs. 33.3%). See Tables 1 and 2.


Both Animoji and Memoji clips show a high average degree of performance. The overwhelming majority (more than 84%) of the clips for both Animoji and Memoji showed deviations from normal voice quality, scoring 1, 2, or 3 on the scale of performativity. These clips are henceforth considered to be ‘performed.’ Slightly more Memoji (90.3%) than Animoji (84.1%) were performed according to this measure. Men performed Animoji at a higher rate than women did (84.9% vs. 70.4%). However, women performed Memoji at a higher rate than men did (92.4% vs. 88.9%), and women performed Memoji more often than they performed Animoji (92.4% vs. 70.4%). See Tables 3 and 4.


Male animators’ performances showed more extreme deviations from baseline voice quality than female animators’ performances for both Animoji and Memoji. For Animoji, the male performativity average was twice as high (1.88 vs. 0.96), and for Memoji, it was somewhat higher (1.94 vs. 1.85). Women were most likely to have a degree of performance of 1 for Animoji, whereas male performances were mostly scored 3. The distribution of scores by gender was somewhat more balanced for Memoji, as shown in Tables 3 and 4.
Related to modifications in vocal quality, the creators of the video clips also produced sounds and accents characteristic of the Animoji or Memoji through which they were speaking. For Animoji, characteristic sounds included noises typically produced by animals, such as barking and panting for the Dog and purring for the Cat, as well as sounds conventionally associated with non-animal entities, such as monotonous pitch and beeping sounds for the Robot. Characteristic speech included ethnic and regional accents for generic Memoji (e.g., a southern drawl to represent a “country guy”; AAVE features to represent a “black mom”) and specific speech styles for Memoji intended to represent specific persons or characters (e.g., a choppy delivery and deliberate pauses to mimic Barack Obama’s speech style; raised pitch to voice the persona of a female user as a little girl; and an African accent to perform T'Challa, the Black Panther in the Marvel Comics universe). (The Memoji in these examples were also customized to visually resemble the characters.) The use of characteristic sounds and speech styles was more evident with Memoji use (64.5%) than Animoji use (23.1%) in our data. Men were more likely than women to produce sounds or speech characteristic of both Animoji and Memoji. See Table 5.
Table 5 also shows the results for two other coding variables that provide evidence of how choice of Animoji affects, and is reflected in, the behavior of the user. In roughly one-third of all clips, the spoken content refers to characteristics of the Animoji/Memoji. Characteristics of Animoji include references to stereotypical associations about animals and other entities depicted by Animoji, including cultural associations. For example, in our data the Cat mentions its nine lives; the Alien talks about outer space, spaceships, and planets; and the Monkey talks about bananas and throwing feces. Characteristics of Memoji include associations with the personae or archetypes performed. For example, a Memoji clip performing Donald Trump calls out to “Melania,” the name of Trump’s wife; cowboy Woody, a character in the Disney movie ‘Toy Story,’ refers to the movie by name; and a performance of the U.S. television personality Dr. Phil starts with “On today’s show, ….” Both genders were more likely to mention characteristics of a Memoji than an Animoji.
The last variable is direct mention of the name of the Animoji or the Memoji character. For Animoji, mentioning the name consisted of saying, for example, “poop” when using the Poop Animoji, as when a user gave as a reason for buying the iPhone X: “I can be a sexy poop.” For Memoji, this involved saying the word “memoji” or (more commonly) the name of the Memoji character being performed, such as when the user performing Trump proclaimed, “I’m Donald J. Trump.” The creators of the video clips were more likely to mention an Animoji’s name (33.1%) than a Memoji’s (19.4%), and men were more likely than women to mention the names of both Animoji and Memoji, as summarized in Table 5. Indeed, male animators exhibited more of all three performance-related behaviors in Table 5 than female animators did, consistent with the results for modification of voice quality and degree of modification presented in Tables 3 and 4.

As the examples in the previous section suggest, the Animoji producers’ behavior reflects the particular Animoji or Memoji character they are speaking through (e.g., barking when speaking through the Dog Animoji). Men and women differed in which Animoji they used, and thus gender must be considered when interpreting Animoji users’ speech production. Table 6 shows which Animoji were used by male, female, and unknown gender animators, and for clips representing male, female, and unknown gender characters. Since each Memoji was uniquely customized, it was not possible to conduct a similar gender breakdown for Memoji. Female animators and characters most favored the Chicken, followed by the Unicorn, the Pig, and then the Cat. Male animators and characters most favored the Dog, followed by the Monkey, the Poop, and then the Alien and the Robot. No woman spoke through or performed the Poop, Alien, or Robot, even though men used these Animoji quite often.
The differences between the Animoji of the animators and of the performed characters were relatively minor for males. However, women used the Unicorn, Pig, and Monkey proportionately more than those Animoji were used (by both genders) to perform females, while conversely, the Chicken, Cat, and Fox were performed as females proportionately more than female animators used those Animoji. These discrepancies suggest a difference between the female animators’ self-identification and how women are stereotypically perceived (e.g., as cackling hens, finicky felines, and sexy ‘foxes’).

In addition to gender preferences, the manner of performance also varies by Animoji type. The performance measures discussed earlier are broken down by Animoji type in Table 7. Although the average degrees of performance of each type are roughly similar except for a few outliers (e.g., the Fox and the Lion, as discussed further below), different Animoji favor different performance strategies. For example, users were most likely to directly mention the name of the Animoji for the Poop (e.g., “I’m a sexy poop”, but also using the synonyms “crap” and “shit”) and the Bear. Indirect reference to the Animoji being performed through mentioning characteristics associated with it was most common for the Robot and the Poop, followed closely by the Rabbit and the Dog. For example, one performance was of a rabbit pediatrician who “get[s] paid in carrots.” A dog talked about getting spayed, a poop talked about wanting to “marry a peepee,” and a robot made a suggestive reference to his “joystick.”
As for using characteristic sounds/accents, the Robot was performed in this way most often, through beeps and boops and a robotic, flat voice quality, imitating stereotypes of robots in popular culture such as cartoons and movies. Over half of the robot performances in our dataset followed these stereotypes. The Chicken and the Dog were also especially likely to elicit characteristic, stereotypical sounds. When using the Chicken, users tended to employ a high-pitched tone and make clucking sounds or speak with a clucking quality. When using the Dog, users panted and made barking sounds. For other Animoji whose referents either do not typically produce sound (e.g., the Poop and the Skull) or whose sounds are less well known (e.g., the Panda), this performance feature was used less often. However, some of these, such as the Skull, the Bear, and the Lion, were newer Animoji at the time of data collection and were used less often; thus their data may not be reliable.
The Animoji with the highest degree of performance on average was the Fox. This is because of the internet meme based around the song “What does the fox say?” by Ylvis. Most of the users who performed with the Fox Animoji referenced this song in their performances, typically breaking out into song, which elevated their level of performance to a 3. At the low end of the scale are the Lion and the Skull. However, these appeared too infrequently in our corpus to provide reliable data for analysis.

In this section, we expand on the quantitative patterns identified in the previous sections through close, qualitative analysis of two performances. The first is an example of a performance that is accomplished primarily through vocal quality stylization, whereas dialect stylization plays a key role in the second performance.
Stylization is the process whereby a performer takes the voice of another and repurposes it to suit his or her objective, which in the case of our data is entertainment. The Animoji performances employed language stylizations that included a wide array of dialects and marked phonological and prosodic features to evoke stereotypes and reference cultural tropes. At the lower end of the performance spectrum, simple shifts in pitch (e.g., falsetto voice) and voice quality (e.g., vocal fry) were often used to key a “joke” or humor frame. At the higher end of the performance spectrum, dialect stylizations involving prosodic and phonological features associated with the speech of a particular race/ethnicity or geographical area were used to perform culturally-recognizable characters or personae. The excerpts described below illustrate these two types of performance.
One source of data for this study was a video in which a number of “dad jokes” were told through different Animoji. One such joke was told by a woman through the Pig. The text of the joke was: “What’s blue and not heavy at all? Light blue.” The choice of the Pig Animoji had no apparent relation to this particular joke, and the woman was not speaking in the voice of any particular character, yet the two utterances in this eight-second segment were delivered with stylized pitch and cadence to index a “joke” frame, and thus it was assigned a 1 for degree of performance. Specifically, the first utterance (line 1) was spoken at a slow pace with elongated vowels in the words “blue” and “all.” It was followed by a three-second pause to generate suspense before the punchline: “light blue!” These words (line 2) were delivered at a faster pace and exhibited the most marked vocal modification: raised pitch (falsetto voice) and a long falling intonation on the word “blue.” [Click here to view video Excerpt 1.] These modifications emphasize and draw attention to the punchline of the joke.

The stylized cadence and pitch of this delivery, along with the syntactic structure (“What’s X…?”), are typical of two-part question-answer jokes.12 Modified pitch, elongated vowels, and marked cadences were among the most commonly used stylizations and occurred in clips at the low end of the performance scale as well as clips at the high end of the scale.
Speech was also stylized in ways that reflected the visual identity of the Animoji/Memoji. One YouTuber from whom we collected data posted a series of videos depicting humorous vignettes of medical professionals featuring different Animoji. The next excerpt comes from a video in which Animoji medical professionals described their “side hustles.” In this 15-second segment, the Monkey is a primary care physician who is describing his secondary occupation as a Lyft/Uber driver.
This particular performance uses a shared repertoire to depict common medical and ethnic tropes for comic effect and relies on lexis and dialect stylization to index medical and ethnic identities. First, the use of phonological features characteristic of African-American Vernacular English (AAVE) can be observed throughout the segment. For instance, the diphthong /ai/ is pronounced as the monophthong /a/ or /a:/ in the words “I’m,” “my,” and “arrived” in lines 1, 2, and 4; postvocalic /r/ and /l/ deletion occurs in the words “your,” “well,” and “here” in lines 2 and 4; and the final /t/ in “right” is glottalized in line 4. The use of these features is not entirely consistent (the vowel in “like” in lines 1 and 4 is not monophthongized, for example, and postvocalic /r/ is pronounced in “you’re” in line 4), but the speaker is clearly aiming for AAVE. Additionally, revoicing is used to create contrast between the AAVE of the monkey primary care doctor and his Lyft/Uber passengers, whom the doctor refers to as his “patients” in line 1, and who are voiced in Standard American English (SAE) in line 3. [Click here to view video Excerpt 2.]
This segment received a ‘degree of performance’ rating of 3. It constructs a character that is stereotyped both as a medical professional (primary care doctors are incompetent, even as Lyft drivers) and ethnically (African-Americans are unintelligent). Moreover, the use of the Monkey Animoji evokes the racist trope that African Americans are “apes” (Hund & Mills, 2018). This might be considered a form of “linguistic minstrelsy” (Bucholtz & Lopez, 2011).

The longer video from which the example above was excerpted includes many other examples of mocking dialect stylization in the performance of stereotyped ethnic and/or gendered identities. These include the successful and ambitious Asian doctor (the Panda, speaking with a Chinese accent); the proctologist “Dr. Poopmoji” (the Poop, speaking with a South Asian accent “because I am brown”); and the “catty” female nurse (the Cat, speaking with a Filipino accent). Besides these ethnic medical caricatures, language stylization and stereotyping occur in performances of archetypes such as the American cowboy and well-known public figures such as Presidents Trump and Obama (via Memoji), as well as media figures such as Bugs Bunny (via the Rabbit Animoji).
Our first research question (RQ1) asked: “How do iPhone X users speak in Animoji and Memoji clips as regards voice quality, accent stylization, and word choice?” In our dataset of Animoji and Memoji video clips shared on social media in the first 14 months after the iPhone X was released, we found that the speech in the clips overwhelmingly took on a performed quality, with the overall degree of performance averaging 1.83 on a scale of 0-3 (RQ1b). Consciously or unconsciously, the performances invoked stereotypes of creatures or persons, real or imagined, whose “voices” the users performed, imbuing their speech – in many cases, such as those described in the previous section – with cultural and ideological meanings beyond the propositional content of their spoken words (cf. Bauman, 1975). They did this via phonological and prosodic modifications to their normal speaking voice (RQ1a), by imitating the characteristic sounds, dialect, or speech style of the character being performed (RQ1c), as well as through the content of their speech, which often referred directly or indirectly to that Animoji or Memoji (RQ1d).
Our second research question (RQ2) asked: “How, if at all, does this speech vary according to Animoji type?” We found that both the nature and degree of the performance are influenced by the specific Animoji or Memoji used. Although overall degree of performance did not vary greatly for the different Animoji types, individual Animoji performances favored different strategies (Table 7). This can be explained by the Proteus Effect (Yee & Bailenson, 2007), according to which an individual’s behavior conforms to their digital self-representation. Somewhat more Memoji were performed (90.3%) than Animoji (84.1%) (Tables 3 and 4), possibly because the Memoji in our data mostly represented entities other than the self, whereas more Animoji seemed to function as “not-selfies” (Tiidenberg & Whelan, 2017). It may also be more intuitive for users to speak as human-like Memoji than as animal or anthropomorphized object Animoji. That said, certain Animoji, such as the Robot, seemed to almost irresistibly trigger the use of stereotypical vocal features in these data.
Finally, RQ3 asked: “How, if at all, does this speech vary according to user gender?” The frequency and degree of the Animoji and Memoji performances were influenced by the animator’s gender and the gender of the character being performed. Women posted far fewer Animoji clips than men did in the time period studied. Moreover, female animators in general had a lower average degree of performance than male animators. Most of the clips are humorous, and males are typically thought to be the funnier sex (Mickes et al., 2011); men also tend to feel more entitled to take up public space online (Herring & Stoerger, 2014). Taken together, these factors help explain the larger number of men posting humorous Animoji clips on social media. However, women posted nearly as many Memoji clips as men did. Their preference for Memoji, which represent humans, over Animoji, which represent animals and objects, is consistent with the tendency for women to express greater interest than men in people (e.g., Su et al., 2009).
Further, the preferences of men and women for certain Animoji reflect gendered stereotypes, for example, ‘Men are dogs; women are cats’ (Charron, 2011). The appearance of the Animoji themselves is arguably gendered: The Animoji most associated with females are pastel-colored: pink (Pig), purple (Unicorn), and yellow (Chicken; Cat). Those associated with males are blue (Robot), brown (Poop; Monkey; Dog), and grey (Alien), consistent with gendered color preferences dating from early childhood (e.g., Weisgram, Fulcher, & Dinella, 2014). Finally, male preferences for scatological and robotic Animoji are consistent with traditional “boy culture” (Jenkins, 2007).
The above findings are based on Animoji video clips that were publicly shared on social media in the first 14 months after Animoji were released with the iPhone X; this is undoubtedly a small subset of the Animoji in use today. Some of the Animoji video clips we examined were created with the express purpose of being posted to YouTube.com, and it appears that most of the clips were shared because the creator believed that other people would find them entertaining. These videos may not reflect private Animoji usage, since people typically behave differently in private than in public, and their audiences and reasons for use may differ. Research is clearly needed on Animoji-mediated communication in private and non-humorous contexts of use. Further, our sample is biased toward Animoji that represent animals and objects, since data collection began before Memoji were released. More research on Memoji use, especially in private communication, is needed as well.
The main contributions of this study can be summarized as follows. First, it reports on a public use of Animoji (in the broadest sense) by early adopters of the technology, and it is the first study to do so. Second, it provides evidence in support of the Proteus Effect and extends the effect of using a particular avatar to vocal behavior – not just what users of Animoji say, but how they speak. Third and last, the study has revealed a tendency for Animoji use to lead, consciously or unconsciously, to social stereotyping.
Meanwhile, filtered digital self-representations are expanding in scope to include virtual reality chat avatars, holographic images, and telepresence robots (Herring, 2016). As such, they are part of a broader paradigm shift from static means of digital self-representation, such as selfies and Bitmoji stickers, to more dynamic and interactive avatars that increasingly communicate through speech. This shift has several implications.
On the positive side, new forms of filtered digital self-representation (FDSR) such as Animoji encourage playful verbal performances and facilitate the presentation of polynymous identities in online spaces (cf. Dewar et al., 2019). They afford play with identity and provide new resources for relationship formation and management. For users concerned about privacy or who fear harassment if they reveal their identity, they provide a more anonymous way to engage. At the same time, the increasing use of facial filters raises questions about how facial cues are communicated and interpreted in mediated interaction. On the negative side, filtered digital self-representation may promote stereotyping, and stereotyping reifies prejudicial attitudes. With regard to animal Animoji in particular, as Hund and Mills (2018, n.p.) note, “[a]nimalization [is a] widespread element[ ] of racist dehumanization.” Yet the playful nature of these facial filters provides plausible deniability, which may allow these behaviors to go unnoticed and unchecked.
Finally, FDSR increases the potential for deception in self- and other-representation. Digital masks can be used to deceive, and voices can be technologically altered. Several years ago, researchers at the University of Washington used AI methods to generate a highly realistic ‘deepfake’ video of Barack Obama (Suwajanakorn, Seitz, & Kemelmacher-Shlizerman, 2017). Cartoonish masks such as Memoji and Snapchat filters may seem a far cry from deepfakes, but in reality they are not. All belong to a family of digital tools that allow for altering one’s appearance and voice in mediated communication. A Memoji can be customized to impersonate someone else. Photographs of, for example, Obama can be transformed into 3D representations that can be animated,14 and recorded speech can be played back in Obama’s voice.15 Most chilling of all, easy-to-use software16 is already available that lets users create and share their own realistic deepfake videos.
As Ehrlich (2019) points out, people tend to uncritically accept video content as a faithful representation of reality. Thus one solution to counter the threat of deception (and disinformation) posed by deepfakes is to raise critical awareness of the inherently constructed nature of video, much like awareness of Photoshop has made laypeople more critical viewers of digital photographs. The potential for malicious misuses of digital masks and filters, such as for bullying or catfishing, constitutes another threat that might, however, be reduced by emphasizing the importance of not sharing one’s phone or passcode with others. To verify the identity of the transmitter of private FDSR messages, secret passwords or encryption keys could be used and shared only with trusted contacts. Other proposed technological solutions aim to flag the occurrence of deepfakes on social media platforms in real time and to establish immutable authenticity trails for individuals (Chesney & Citron, 2018).
Yet the notion of a unitary “authentic self” (cf. Ward, 2016) is itself a construct, one whose relevance has arguably been declining ever since textual CMC first made selective self-presentation easy and available on an unprecedented scale (cf. Walther, 1996). FDSR renders the “authentic self” increasingly irrelevant by making it easier than ever to digitally alter one’s physical appearance and voice. Indeed, FDSR and the trend toward polynymity in social media may together be changing the definition of authenticity itself. Dewar et al. (2019), for example, suggest that “authentic” in the fake Instagram accounts of young users means performed in accordance with the expectations of the audience, rather than in accordance with one’s core conception of self. As these trends advance, it will be crucial to understand how digital filters affect usage, behavior, social norms, and trust in interpersonal interactions and in social media more generally.
Special thanks are due to Language@Internet Editorial Board member Tuija Virtanen for serving as Associate Editor for this article, and to the reviewers, especially Reviewer 1, for their thoughtful comments and suggestions.
-
1. Although videos are technically constructed artifacts that can be edited, lay people tend to uncritically accept them as reliable evidence, similar to eye-witness testimony (Ehrlich, 2019).
2. FDSR takes place on other platforms, as well, but in this article the focus is on mobile FDSR.
3. See, e.g., https://appuals.com/the-5-best-voice-changer-software-to-use/, accessed September 23, 2019.
4. https://apps.apple.com/ua/app/imoji-ar-voice-changer-emojis/id1437104455
5. Enhancements to the Memoji creation options were introduced in iOS 13 in September 2019. As the data for this study were collected prior to the introduction of iOS 13, all Animoji and Memoji examples given here are from iOS 12.
6. "[W]hen deciding on a Face Lens, ... [t]he user ... has the option to scroll through this list and apply the Face Lens synchronously, meaning the user does not have to take the picture first, they are free to click on one, see how it looks, move their head around, perform an action to access animations, and much more" (Rios et al., 2018, n.p.).
7. A tongue-in-cheek illustration of polynymity is a meme that recently went viral on social media of four images depicting how the same person self-presents differently on LinkedIn, Facebook, Instagram, and Tinder (https://www.dailymail.co.uk/femail/article-7920007/Dolly-Parton-sparks-LinkedIn-Facebook-Instagram-Tinder-photo-challenge.html, retrieved March 1, 2020).
8. Bitmoji sets can now be generated from Memoji in iOS 13.
9. Most of the Twitter videos consisted of a single clip. Several longer YouTube videos employed a split-screen format where a person was visible on one side, and their phone interface showing their Animoji was visible on the other side. For those videos, only segments where the YouTuber was speaking through the Animoji, rather than speaking about them, were included as data.
10. Gender was determined according to the intuition of the coders based on evidence in the video such as voice quality. Research has shown that normal-hearing speakers in non-noisy environments are able to identify voice gender based on acoustic properties of speech that contribute to voice quality (Fu, Chinchilla, & Galvin, 2004).
11. The following transcription conventions are used in the excerpts: a colon (:) indicates a lengthened vowel, ↑ rising pitch, and ↓ falling pitch.
12. The Pig’s wide grin at the end also serves to key this clip as a joke.
13. Pronunciations that diverge from Standard American English are represented in the International Phonetic Alphabet (https://en.wikipedia.org/wiki/International_Phonetic_Alphabet) and enclosed in square brackets, and standard pronunciations (broadly understood to include vowel lengthening and other prosodic modifications) are represented in regular English orthography. These notational conventions are used to make the nonstandard pronunciations easy to identify in the transcription.
14. Using, e.g., CamMask (http://www.cammask.com).
15. Using, e.g., the iMoji app (https://apps.apple.com/ua/app/imoji-ar-voice-changer-emojis/id1437104455).
16. E.g., FakeApp and open-source alternatives such as Faceswap and DeepFaceLab (Wikipedia, 2020).
Aden, R. C., Crowley, K., Phillips, E., & Weger, G. (2016). Doubling down: President Barack Obama’s doubled persona after the Zimmerman verdict. Communication Studies, 67(5), 605-622.
Bakhshi, S., Shamma, D. A., Kennedy, L., & Gilbert, E. (2015). Why we filter our photos and how it impacts engagement. In Proceedings of the International AAAI Conference on Web and Social Media (pp. 12–21). Oxford, UK: AAAI.
Bakhtin, M. (1986). Speech genres and other late essays. (C. Emerson, & M. Holquist, Trans.) Austin, TX: University of Texas Press.
Barrett, R. (1999). Indexing polyphonous identity in the speech of African American drag queens. In M. Bucholtz, A. C. Liang, & L. L. Sutton (Eds.), Reinventing identities: The gendered self in discourse (pp. 313-330). New York, NY: Oxford University Press.
Bauman, R. (1975). Verbal art as performance. American Anthropologist, 77(2), 290-311.
Bessière, K., Seay, A. F., & Kiesler, S. (2007). The ideal elf: Identity exploration in World of Warcraft. Cyberpsychology & Behavior, 10(4), 530-535.
Blashill, A. J., & Powlishta, K. K. (2009). Gay stereotypes: The use of sexual orientation as a cue for gender-related attributes. Sex Roles, 61(11-12), 783-793.
Bruckman, A. (1993). Gender swapping on the internet. Paper presented at The Internet Society 1993, Reston, VA.
Bucholtz, M., & Lopez, Q. (2011). Performing blackness, forming whiteness: Linguistic minstrelsy in Hollywood film. Journal of Sociolinguistics, 15(5), 680-706.
Charron, N. L. (2011). Why men are like dogs and women are like cats. Xlibris Corporation.
Chesney, R., & Citron, D. (2018, February 20). Deep fakes: A looming challenge for privacy, democracy, and national security. Lawfare. Retrieved from https://www.lawfareblog.com/deep-fakes-looming-crisis-national-security-democracy-and-privacy
Chun, E. (2004). Ideologies of legitimate mockery: Margaret Cho's revoicings of Mock Asian. Pragmatics, 14(2), 263-289.
Coupland, N. (2001). Dialect stylization in radio talk. Language in Society, 30(3), 345-375.
Davis, L. R., & Harris, O. (1998). Race and ethnicity in U.S. sports media. In L. A. Wenner (Ed.), Media sport (pp. 154–169). New York, NY: Routledge.
Dewar, S., Islam, S., Resor, E., & Salehi, N. (2019). Finsta: Creating "fake" spaces for authentic performance. In Extended Abstracts of the 2019 CHI Conference on Human Factors in Computing Systems (p. LBW1214). New York, NY: ACM.
Dhir, A., Pallesen, S., Torsheim, T., & Andreassen, C. S. (2016). Do age and gender differences exist in selfie-related behaviours? Computers in Human Behavior, 63, 549-555.
Duguay, S. (2015). Is being #instagay different from an #lgbttakeover? A cross-platform investigation of sexual and gender identity performances. In SM&S: Social Media and Society 2015 International Conference. Ryerson University, Toronto, Canada.
Ehrlich, S. (2019). ‘Well, I saw the picture’: Semiotic ideologies and the unsettling of normative conceptions of female sexuality in the Steubenville rape trial. Gender and Language, 13(2), 251–269. doi:10.1558/genl.35019
Elder, A. M. (2018). What words can’t say. Journal of Information, Communication and Ethics in Society, 16(1), 2–15. doi:10.1108/jices-08-2017-0050
Faber, M. A., & Mayer, J. D. (2009). Resonance to archetypes in media: There’s some accounting for taste. Journal of Research in Personality, 43(3), 307-322.
Farnham, S. D., & Churchill, E. F. (2011, March). Faceted identity, faceted lives: Social and technical issues with being yourself online. In Proceedings of the ACM 2011 Conference on Computer Supported Cooperative Work (pp. 359-368). New York, NY: ACM.
Fu, Q-J., Chinchilla, S., & Galvin, J. J. (2004). The role of spectral and temporal cues in voice gender discrimination by normal-hearing listeners and cochlear implant users. Journal of the Association for Research in Otolaryngology, 5(3), 253–260.
Gibson, L. R. (1988). Beyond the apron: Archetypes, stereotypes, and alternative portrayals of mothers in children's literature. Children's Literature Association Quarterly, 13(4), 177-181.
Goffman, E. (1959). The presentation of self in everyday life. New York, NY: Anchor Books.
Goffman, E. (1974). Frame analysis. Harmondsworth: Penguin.
Herring, S. C. (2016). Robot-mediated communication. In R. A. Scott, M. Buchmann, & S. M. Kosslyn (Eds.), Emerging trends in the social and behavioral sciences: An interdisciplinary, searchable, and linkable resource (pp. 1–16). Hoboken, NJ: John Wiley & Sons.
Herring, S. C., & Martinson, A. (2004). Assessing gender authenticity in computer-mediated language use: Evidence from an identity game. Journal of Language and Social Psychology, 23(4), 424-446.
Herring, S. C., & Stoerger, S. (2014). Gender and (a)nonymity in computer-mediated communication. In S. Ehrlich, M. Meyerhoff, & J. Holmes (Eds.), The handbook of language, gender, and sexuality, 2nd edition (pp. 567-586). Chichester, UK: John Wiley & Sons, Ltd.
Higgins, M. (2010). The “Public Inquisitor” as media celebrity. Cultural Politics, 6(1), 93-110.
Hund, W. D., & Mills, C. W. (2018, January 14). For centuries the West has found it useful to compare black people to monkeys. Quartzy. Retrieved from https://qz.com/quartzy/1179366/hm-monkey-hoodie-why-have-black-people-been-compared-to-monkeys-by-racists/
Jenkins, H. (2007). The wow climax: Tracing the emotional impact of popular culture. New York: New York University Press.
Jin, S. A. A. (2011). “It feels right. Therefore, I feel present and enjoy”: The effects of regulatory fit and the mediating roles of social presence and self-presence in avatar-based 3D virtual environments. Presence: Teleoperators and Virtual Environments, 20(2), 105-116.
Jung, C. (1964). Man and his symbols. London: Aldus Books.
Jung, C. (1976). Psychological types. Princeton, NJ: Princeton University Press.
Kapidzic, S., & Herring, S. C. (2014). Race, gender, and self-presentation in teen profile photographs. New Media & Society. doi: 10.1177/1461444813520301
Kim, H. W., Chan, H. C., & Kankanhalli, A. (2012). What motivates people to purchase digital items on virtual community websites? The desire for online self-presentation. Information Systems Research, 23(4), 1232-1245.
Lea, M., & Spears, R. (1991). Computer-mediated communication, de-individuation and group decision-making. International Journal of Man Machine Studies, 34, 283–301.
Manago, A. M., Graham, M. B., Greenfield, P. M., & Salimkhan, G. (2008). Self-presentation and gender on MySpace. Journal of Applied Developmental Psychology, 29, 446–458.
McAfee, T. (2018, August 6). 'Snapchat dysmorphia' causes people to seek plastic surgery to replicate social media filters. People Magazine. Retrieved from https://people.com/health/snapchat-dysmorphia-plastic-surgery-trend/
McRae, S. (1996). Coming apart at the seams: Sex, text and the virtual body. In L. Cherny & E. Weise (Eds.), Wired_Women (pp. 242-263). Seattle: Seal Press.
Meyers, E. (2009). “Can you handle my truth?”: Authenticity and the celebrity star image. The Journal of Popular Culture, 42(5), 890-907.
Mickes, L., Walker, D. E., Parris, J. L., Mankoff, R., & Christenfeld, N. J. (2012). Who’s funny: Gender stereotypes, humor production, and memory bias. Psychonomic Bulletin & Review, 19(1), 108-112.
Palomares, N. A., & Lee, E. (2010). Virtual gender identity: The linguistic assimilation to gendered avatars in computer-mediated communication. Journal of Language and Social Psychology, 29(1), 5–23.
Parkin, S. (2019, June 22). The rise of the deepfake and the threat to democracy. The Guardian. Retrieved from https://www.theguardian.com/technology/ng-interactive/2019/jun/22/the-rise-of-the-deepfake-and-the-threat-to-democracy
Rampton, B. (2000). Crossing. Journal of Linguistic Anthropology, 9(1-2), 54-56.
Reich, A. R. (1981). Detecting the presence of vocal disguise in the male voice. The Journal of the Acoustical Society of America, 69, 1458. doi:10.1121/1.385778
Rios, J. S., Ketterer, D. J., & Wohn, D. Y. (2018). How users choose a face lens on Snapchat. In Companion of the 2018 ACM Conference on Computer Supported Cooperative Work and Social Computing (pp. 321-324). New York, NY: ACM.
Schwartz, O. (2018, November 12). You thought fake news was bad? Deep fakes are where truth goes to die. The Guardian. Retrieved from https://www.theguardian.com/technology/2018/nov/12/deep-fakes-fake-news-truth
Su, R., Rounds, J., & Armstrong, P. I. (2009). Men and things, women and people: A meta-analysis of sex differences in interests. Psychological Bulletin, 135(6), 859-884.
Suwajanakorn, S., Seitz, S. M., & Kemelmacher-Shlizerman, I. (2017, July). Synthesizing Obama: Learning lip sync from audio. ACM Transactions on Graphics, 36(4), 95:1–95:13. doi:10.1145/3072959.3073640
Thelwall, M., & Vis, F. (2017). Gender and image sharing on Facebook, Twitter, Instagram, Snapchat and WhatsApp in the UK: Hobbying alone or filtering for friends? Aslib Journal of Information Management. doi:10.1108/AJIM-04-2017-0098
Tiidenberg, K., & Whelan, A. (2017). Sick bunnies and pocket dumps: “Not-selfies” and the genre of self-representation. Popular Communication, 15(2), 141-153.
Tillman, M. (2018, September 12). What are Animoji? How to create and use Apple's animated emoji. Pocket-lint. Retrieved from https://www.pocket-lint.com/apps/news/apple/142230-what-are-animoji-how-to-use-apple-s-animated-emoji
Waisanen, D. J., & Becker, A. B. (2015). The problem with being Joe Biden: Political comedy and circulating personae. Critical Studies in Media Communication, 32(4), 256-271.
Walther, J. (1996). Computer-mediated communication: Impersonal, interpersonal, and hyperpersonal interaction. Communication Research, 23(1), 3–43.
Ward, J. (2016). Swiping, matching, chatting: Self-presentation and self-disclosure on mobile dating apps. Human IT, 13(2), 81-95.
Weisgram, E. S., Fulcher, M., & Dinella, L. M. (2014). Pink gives girls permission: Exploring the roles of explicit gender labels and gender-typed colors on preschool children's toy preferences. Journal of Applied Developmental Psychology, 35, 401–409.
Wikipedia. (2020). Deepfake. Retrieved February 26, 2020 from https://en.wikipedia.org/wiki/Deepfake
Yee, N., & Bailenson, J. (2007). The Proteus effect: The effect of transformed self-representation on behavior. Human Communication Research, 33(3), 271-290.
Zhao, S., Grasmuck, S., & Martin, J. (2008). Identity construction on Facebook: Digital empowerment in anchored relationships. Computers in Human Behavior, 24(5), 1816-1836.
Biographical Notes
Susan C. Herring [herring@indiana.edu] is Professor of Information Science and Linguistics and Director of the Center for Computer-Mediated Communication at Indiana University, Bloomington. Her current research interests include the use of graphical elements in computer-mediated discourse.
Ashley R. Dainas [ardainas@indiana.edu] is a Ph.D. candidate in Information Science at Indiana University, Bloomington. Her current research interests include the uses and interpretations of “graphicons” such as emoticons, emoji, images, GIFs, and Animoji in computer-mediated discourse.
Holly Lopez Long [hdlopezl@iu.edu] is a doctoral student in Information Science at Indiana University, Bloomington. Her research explores topics in computer-mediated discourse, channel selection, and innovation adoption.
Ying Tang [yt11@iu.edu] is currently a Postdoctoral Fellow in Informatics at Indiana University, Bloomington. Her current research focuses on technology enhanced learning, including how the use of graphical icons can impact interaction and engagement.
License
Any party may pass on this Work by electronic means and make it available for download under the terms and conditions of the Digital Peer Publishing License. The text of the license may be accessed and retrieved at http://www.dipp.nrw.de/lizenzen/dppl/dppl/DPPL_v2_en_06-2004.html.
Recommended citation
Herring, S. C., Dainas, A. R., Lopez Long, H., & Tang, Y. (2020). Animoji performances: "Cuz I can be a sexy poop". Language@Internet, 18, article 1. (urn:nbn:de:0009-7-50465)
Please provide the exact URL and date of your last visit when citing this article.