The effect of dialect and gender on the representation of consonants in Jordanian chat (Volume 9, 2012)



Computer-mediated communication (CMC) has led to the emergence of new forms of written language represented by such communicative practices as chat and text messaging via mobile phones. The non-standard nature of the language used in these forms of writing seems to be shared among many languages used on the Internet, and the existence of these non-standard forms online suggests that certain common motivations and needs have led to their emergence and use. Since language, in its spoken form, usually shows variation, dialectal or otherwise, so too the orthographic representation of language on the Internet is expected to reflect this variation.2 In spoken language, the different variants of phonemes and words are often associated with social factors such as the speakers’ dialectal background, gender, geographical location, and age. It is expected that these differences will also be evident in the choice and use of language variants that are transcribed on the Internet, as typography and orthography “are the primary ‘physical’ cues available to users to express themselves and to convey information about their identity” (Zelenkauskaite & Herring, 2006, p. 1) in textual CMC.

The focus in this study is on CMC that takes place in (near-)real time and that can be referred to as “text-based online chat” (Simpson, 2002). The language variety under investigation is Spoken Jordanian Arabic (SJA) that is represented in chat through the use of ASCII (i.e., American Standard Code for Information Interchange) symbols. In general terms, the ASCII code consists of 128 characters: the most common Latin letters (upper and lower case) that are used in European languages, numerals, punctuation marks, and other common symbols (Palfreyman & al Khalil, 2003). Jordanian chat (JC) makes use of many ASCII characters to represent colloquial varieties of Jordanian Arabic in CMC.
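The ASCII constraint described above can be checked mechanically. The following is a minimal sketch in Python and is not part of the method used in this study; the function name `is_ascii_only` is an illustrative assumption. It tests whether every character of a string falls within the 128 ASCII code points (0 to 127), using the form <7amil> and the Arabic letter ح, both of which are discussed later in this article.

```python
ASCII_SIZE = 128  # ASCII defines code points 0 through 127


def is_ascii_only(text: str) -> bool:
    """Return True if every character in text is one of the 128 ASCII characters."""
    return all(ord(ch) < ASCII_SIZE for ch in text)


# Numerals and English letters are ASCII; Arabic-script letters are not.
print(is_ascii_only("7amil"))
print(is_ascii_only("ح"))
```

A check of this kind only establishes that a message is typable with ASCII symbols; it says nothing about which Arabic sound a given symbol is intended to represent.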

As regards the representation of spoken language in CMC, some researchers mention that chatters tend to use “phonological spellings” (Hentschel, 1998) or “pronunciation spellings” in order to “convey the fine details of actual speech production” (Nishimura, 2003, n.p.) or “to mirror spoken features” (Palfreyman & al Khalil, 2003, n.p.). If the dialectal variation that is found in SJA is evident in JC, this can be taken as evidence that chatters using JC are able to encode the “fine details” of their spoken varieties in the orthographic representation of these varieties while chatting on the Internet. More generally, this study contributes to the investigation of the distinctive features of CMC in languages “with different sounds and different writing systems” than those of English on “the multilingual Internet” (Danet and Herring, 2003, n.p.).

In addition to investigating the representation of SJA consonants in JC, this study also examines the sociolinguistic indications related to the graphemic (i.e., orthographic) representations of the variants of four “dialectal” consonants in JC, and thus addresses the need for sociolinguistic research that focuses on how people communicate on the Internet (see Gass, 2008, pp. 429-430).3 Specifically, the analysis examines the contexts in which the phonetic variants of /q/, /ð/, /D/, and /θ/ are represented in JC. In addition, the study investigates the associations between the use of the variants of /q/ and the gender of chatters and their dialectal background.4 In everyday use in Jordanian society, it is common for males to use the rural variants of these phonemes; this includes many males who speak the urban variety of SJA (the latter variety would still be recognizable through how vowels and other consonants are used). However, the urban variants are considered more “suitable” or appropriate for females in urban areas and also for young females enrolled in universities, or who are working, in urban areas, regardless of their original dialectal background. This form of dialectal variation can be labeled “gendered,” since the urban variants are stereotypically associated with femininity, while the rural variants are associated with masculinity. It is hypothesized here that this phonetic variation would be evident in the written form of SJA represented in JC.

According to Herring and Zelenkauskaite (2008, p. 75), “[l]ittle empirical research has investigated gender differences encoded through non-standard spelling in any mode of CMC.” To our knowledge, no research has focused on gender differences in the orthographic representation of the phonetic variants of SJA consonants in CMC. In particular, this study aims to answer the following research questions:

RQ1: What are the graphemic variants of /q/, /ð/, /D/, and /θ/ in JC?

RQ2: Does JC reflect the dialectal variation found in SJA?

RQ3: How do these representations relate to the gender of chatters in JC?

It is expected that answering these questions will lead to an understanding of why SJA phonemes are orthographically represented as they are in JC and to what extent these representations convey features of the chatters’ social identity, particularly that related to dialect and gender.

Review of Related Literature

Latinization, Anglicization, and ASCII-ization

The writing system used in Standard Arabic is entirely different from that of languages that use the Latin orthographic system, such as English. Nonetheless, the representation of SJA in JC depends on the use of ASCII symbols, which are based on and include the Latin alphabet. JC, however, does not use all the letters that are present in the Latin alphabet, which includes letters and diacritics found in languages such as French, Spanish, and German. Instead, JC uses only characters that are found in the alphabet of English. For this reason, it is technically incorrect to describe JC as a “latinized” version of SJA. At the same time, although JC uses English letters, it cannot be described as an “anglicized” version of SJA, since it also uses other symbols that are not found in the English alphabet. According to Ivković (forthcoming, p. 3), “anglicization” is “a stabilization of transliteration norms drawing on the orthographic conventions of English.” We draw a distinction in the present study between “latinization,” “anglicization,” and “ASCII-ization” in order to show that JC is an ASCII-ization of SJA.

In Anglicized Arabic (AA), the English alphabet is used to represent letters of Arabic words.5 Many Arabic words can be represented by symbols from the English alphabet due to phonemic similarity, even though Arabic and English use two different orthographic systems. For example, the Arabic word هذا ‘this’ can be represented in CMC as <hatha>, which resembles the Arabic pronunciation /haða/ (see the Appendix and Research Method section below for the symbols used in this article). Some other Arabic letters/sounds have no equivalents in the English alphabet; these are anglicized following the conventions of English for these sounds (see Table 1 below). For example, the sounds /x/ and /ġ/ appear as ‘kh’ and ‘gh,’ respectively. However, this practice can lead to ambiguities, since the distinctions between some Arabic sounds are lost. As Palfreyman and al Khalil (2003) note:

For example, the character sequence <kh> has in principle two readings: either as a digraph representing /x/, as in the name <Shaikha> (/∫eixa/), or as a sequence of two consonants /k/ and /h/, as for example in <samakha> (/səməkha/, ‘her fish’), formed from ‘samak’ (‘fish’) and the possessive particle ‘-ha’ (‘her’). (n.p.)

The problem, therefore, is that AA words are not usually accurate representations of how these words are pronounced in SJA. In addition, some distinct Arabic letters are represented in AA by the same single character from the English alphabet; for example, /H/ and /h/ both appear as ‘h’ in AA, thus rendering words like ‘hamil’ ambiguous between /Hamil/ ‘pregnant’ and /hamil/ ‘a bum.’ However, the distinction between these two sounds is well established in JC, since /H/ is ASCII-ized as <7> as in <7amil> ‘pregnant,’ while /h/ is ASCII-ized as <h> as in <hamil> ‘a bum.’ Thus, while JC is not simply a “latinized” version of SJA, it is also not a completely anglicized version of SJA, because the Arabic sounds that have no equivalents in English are not usually represented in JC following the conventions of English orthography.

In order to represent Arabic letters that have no phonemic equivalents in the Latin alphabet distinctly, chatters using JC resort to “creative orthographic and typographic ‘work-arounds’” (cf. Zelenkauskaite and Herring, 2006, p. 2). These “work-arounds” depend on replacing such letters/sounds with numerals based on visual resemblance (see Tseliga, 2007, who reports a similar practice in Greek CMC). In JC, numerals are used that visually resemble the shape of Arabic letters that do not exist in English. Since numerals and the English alphabet are both part of the ASCII code, JC is thus more properly considered an ASCII-ized rather than a “latinized” or “anglicized” form of SJA.

In their study of ASCII-ized Arabic used in Instant Messaging (IM) by female university students in the United Arab Emirates (UAE), Palfreyman and al Khalil (2003) note that phonemic and graphemic resemblances determine how the spoken Arabic used in UAE is represented in CMC. This is similar to the phenomenon in “Greeklish” where transliterations of Greek characters are either phonemic – “attempting to represent the Greek sounds/phonemes with Latin characters” – or orthographic, where Greek characters are represented with “visually equivalent Latin characters or, in case of absence, with numbers” (Koutsogiannis & Mitsikopoulou, 2003, n.p.). As regards phonemic similarity, the following English phonemes have equivalents or near equivalents in Arabic: /b/, /s/, /z/, /l/, /m/, /g/, /d/, /f/, /h/, /k/, /n/, /w/, /r/, /t/, /θ/, and /ð/. These sounds are represented in CMC by using their corresponding graphemes from English: <b>, <s>, <z>, etc. But the Arabic phonemes that do not exist in English are represented by numerals, on the basis of visual resemblance. This practice is the same for the Arabic consonants used in JC and those that Palfreyman and al Khalil (2003) described in IM that uses vernacular Arabic in the UAE.6 For example, the cursive Arabic letter ح /H/ is ASCII-ized as <7> because this numeral and the letter ح are similar in shape. The Arabic consonants that are represented by numerals are back consonants and emphatics: the velars /x/ and /ġ/, the pharyngeals /H/ and /`/, the glottal stop /’/, and the emphatics /T/, /S/, /Đ/, and /D/. Table 1 below shows the visual similarity between numerals and the corresponding Arabic consonants as found in the JC corpus, as well as showing the representation of the same consonants in AA, for purposes of comparison.

Table 1. The substitution of Arabic back consonants and emphatics by numerals on the basis of visual similarity in JC and their representation in AA using English letters

However, “visual resemblance is clearer in some cases than in others, and in some cases involves mirror-image reversal of all or part of the symbol” (Palfreyman & al Khalil, 2003); examples in Table 1 include the letters ء , ح, and ع, which are ASCII-ized as <2>, <7>, and <3>, respectively. In addition, users also add an apostrophe to some numerals in order to represent letters that have a dot in their Arabic orthography, as shown in Table 1.8 In Jordan, ASCII-ized Arabic is sometimes referred to as <3rabeeze> /’arabiizi/; this term is a blend of <3rabee> ‘Arabic’ and <Engleeze> ‘English,’ even though <3rabeeze> includes symbols that are not part of the English alphabet.

Since ASCII has a larger inventory of symbols than the English alphabet, ASCII-ized Arabic can be considered more accurate as a representation of SJA consonants than latinized or anglicized SJA. Arabic sounds are represented not only by English letters but also by numerals and by numerals combined with an apostrophe. For example, <3’>, <7’>, and <6’> are used to represent, respectively, the Arabic letters غ /ġ/, خ /x/, and ظ /D/, which include a dot in their Arabic orthography. The use of the apostrophe with a numeral is important to differentiate between pairs of Arabic letters that are normally discriminated (in writing) through the use of the diacritical dot; for example, the letters ح /H/ and خ /x/ are ASCII-ized as <7> and <7’>, respectively. Since numerals, and numerals combined with an apostrophe, are not part of Latin or English spelling, it can be said that these are ASCII, rather than Latin or English, symbols when used in JC to represent some Arabic letters. Their existence in JC further supports the view that JC is an ASCII-ization rather than a mere “latinization” or “anglicization” of SJA.9 Use of an apostrophe alone (i.e., not in combination with a numeral) is not an ASCII phenomenon in JC, since this symbol is common in Anglicized Arabic (AA) to represent the sounds /’/ and /`/ (see endnote 7).
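The numeral conventions described in this section can be summarized as a small lookup table. The sketch below is illustrative only: it includes just the six correspondences named in the running text (not the full contents of Table 1), and the names `NUMERAL_GRAPHEMES`, `split_graphemes`, and `arabic_letters` are assumptions introduced for the example. It treats a numeral followed by an apostrophe as a single grapheme, which is the distinction that separates <7> from <7’>.

```python
# Correspondences stated in the text: grapheme -> (Arabic letter, phoneme symbol).
NUMERAL_GRAPHEMES = {
    "2": ("ء", "/’/"),
    "3": ("ع", "/`/"),
    "7": ("ح", "/H/"),
    "3'": ("غ", "/ġ/"),
    "7'": ("خ", "/x/"),
    "6'": ("ظ", "/D/"),
}


def split_graphemes(word: str) -> list:
    """Split a JC word into graphemes, keeping numeral+apostrophe pairs together."""
    word = word.replace("’", "'")  # treat typographic and ASCII apostrophes alike
    graphemes = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if pair in NUMERAL_GRAPHEMES:  # e.g. 7' must not be read as 7 plus '
            graphemes.append(pair)
            i += 2
        else:
            graphemes.append(word[i])
            i += 1
    return graphemes


def arabic_letters(word: str) -> list:
    """List the (grapheme, Arabic letter) pairs for numeral graphemes in a word."""
    return [(g, NUMERAL_GRAPHEMES[g][0])
            for g in split_graphemes(word) if g in NUMERAL_GRAPHEMES]


print(split_graphemes("7amil"))
print(arabic_letters("7amil"))
```

Matching the two-character form before the single numeral is the design choice that matters here; without it, <7’> would be misread as ح followed by a stray apostrophe.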

In the literature on the use of languages on the Internet, it seems that each language has its own manifestations of and reasons for the representation of its sounds and writing systems in CMC. For instance, Ivković (forthcoming) investigates non-standard Latin variants that are used to represent Serbian CMC in online news forums. He describes Serbian as a language “whose speakers synchronously [sic] use two scripts – the Cyrillic and Latin alphabets – in varied discourses” (forthcoming, p. 1), but he finds that Serbian CMC users favor “the use of the Latin alphabet and the non-standard, undiacritized Latin forms over Cyrillic […]” (p. 30). These non-standard Latin variants, Ivković writes, have become standardized in Serbian CMC. Hentschel (1998), in her study of chat, noticed that the availability of ASCII code leads to different strategies in the linguistic representation of some European languages; for example, users in Germany replace the characters <ä>, <ö>, <ü>, and <ß> with <ae>, <oe>, <ue>, and <ss>, respectively, while Serbian chatters are satisfied with basic letters without diacritics. Hentschel (1998) also finds that Russian users (especially those living outside Russia) resort to transliteration of Russian words with English letters when they cannot use the Russian ones. It seems that the non-standard form of language used in CMC helps Internet users avoid problems related to the representation of their dialects in the alphabet of their original languages. According to Koutsogiannis and Mitsikopoulou (2003), this is the reason why Greek Internet users replace the Greek alphabet with ASCII symbols in online communication.
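The German substitutions that Hentschel (1998) reports amount to a fixed character-for-string replacement. A minimal sketch follows, assuming lower-case input only; the function name `asciiize_german` and the sample word are illustrative and are not drawn from Hentschel's data.

```python
# Replacements reported for German chat: each non-ASCII letter maps to an ASCII sequence.
GERMAN_ASCII = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def asciiize_german(text: str) -> str:
    """Replace ä, ö, ü, and ß with their ASCII letter sequences."""
    return text.translate(GERMAN_ASCII)


print(asciiize_german("grüße"))  # an illustrative word containing ü and ß
```

Unlike the numeral substitutions in JC, this strategy stays within the letters of the English alphabet, which is why it counts as a different response to the same ASCII constraint.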

Similarly, the availability of the ASCII code helps JC users represent their dialects in a way that the standard writing system does not allow. The language used in JC is non-standard because it represents spoken (dialectal) Jordanian Arabic. SJA differs from Standard Arabic in pronunciation and frequently also in vocabulary choice. Some sounds in SJA have no equivalents in Standard Arabic; examples include the variants of /θ/, /ð/, and /D/ that are used in SJA. Additionally, the phoneme /q/ is rarely pronounced in SJA, where one of its dialectal allophones, mostly [’] or [g], is usually used instead. These variations are common in JC as a written form of SJA, but they are never used in Standard Arabic as variants of /q/; in fact, [’] exists as an independent phoneme in Standard Arabic, and [g] does not exist in Standard Arabic at all. There is also the issue of variation in the pronunciation of most words between Standard Arabic and SJA due to the choice of different vowels for the same words. Adding to this “non-standardness” is the fact that ASCII-ized SJA, which is rarely written in domains other than CMC, is not written in the Arabic alphabet. The only written form of Arabic is Standard Arabic, which is not used for communication in everyday, real-life situations; it is the official language in Arab countries and thus is used mainly in news broadcasts, official documents, and in the mass media. It is also worth noting that while Standard Arabic is written from right to left, chatters write JC to be read from left to right, like English.

Sociophonetic Variation and Gender

Most sociolinguists agree that the gender of interlocutors plays an important role in phonetic and phonological variation. Trudgill (1972), in his study of phonetic and phonological variation in Norwich, England; Labov (1990, 2010), in his analysis of variation in Philadelphia; Abdel Jawad (1981) and Al-Wer (1991), in their investigations of phonetic variation in SJA in Amman, Jordan; and many other researchers generally support the idea that males use phonetic variants that are different from those used by females for some of the phonemes in their respective dialects (see also Al-Khatib, 1995; El Salman, 2003). These researchers point out that one variant of a phoneme is often associated with masculinity, toughness, rurality, or a working-class lifestyle, while another variant is more associated with, or indicative of, femininity, softness, prestige, urbanism, or a high-class lifestyle (see also Abdel Jawad, 1986; Wolfram & Schilling-Estes, 2006, pp. 237-245). These studies also note that females have a greater tendency than males to use phonetic variants that are considered more “prestigious.”

Gender differences have been investigated in different forms of CMC such as email, chat, and SMS with respect to non-standard typography and orthography (Herring & Zelenkauskaite, 2008; Koutsogiannis & Mitsikopoulou, 2003; Palfreyman & al Khalil, 2003; Zelenkauskaite and Herring, 2006). Herring and Zelenkauskaite (2008) investigated gender differences in mobile phone text messages (SMS) posted by viewers to a public interactive television program in Italy. They found that females posted more and longer SMS and used more non-standard forms, contrary to previous gender-related findings in the sociolinguistics and CMC literatures. More consistent with previous sociolinguistic findings, in a comparison of Lithuanian and Croatian typography in chat, Zelenkauskaite and Herring (2006) found that female “users tended to use ‘softer’ consonants, palatalizing them,” whereas male “users tended to use more innovative forms, associated with English and technology, and [to] disassociate themselves with the feminine ‘palatalizing’ tendency” (p. 15).

The Associations of Gender and Dialect with the Variants of /q/ in SJA

Dialectal variation in SJA is linked to three major varieties: Madani, which is used in urban areas; Fallahi, which is used in the countryside; and Bedouin, which is used in the desert of Jordan (see Al-Khatib, 1995). The major focus in the present study concerns the distinction between urban (Madani) and rural varieties (both Fallahi and Bedouin). The justification for this distinction is that variants of many vowels and consonants used in Fallahi and Bedouin SJA are mostly the same and are distinguishably different from variants of the same sounds used in Madani SJA.10 Differences in pronunciation between urban and rural SJA dialects involve variants of several consonants and vowels rather than being restricted to only one consonant or one vowel. However, Jordanians stereotypically distinguish one another’s dialect in terms of which variant of the phoneme /q/ a speaker uses. Speakers of urban SJA typically use the [’] variant, while speakers of rural SJA use the [g] variant, and these two variants have become the major distinguishing (i.e., most salient) feature of dialectal variation in Jordanian society. Accordingly, from a Jordanian folk perspective, a Jordanian’s dialect is usually described as one that uses [’] or one that uses [g] as variants of /q/.

The [’] variant, which is characteristic of urban SJA, is stereotypically associated with prestige, refinement, modernism, and, sometimes, a high-class image in Jordanian society. The [g] variant, in contrast, is stereotypically associated with a rural (sometimes working class) lifestyle, roughness, and/or vigor (Al-Khatib, 1995). These associations have produced conceptions about which variant is more socially appropriate, or more “becoming,” for use by a male or by a female. The [’] variant is commonly regarded as more becoming than [g] for use by females,11 whereas the [g] variant is usually associated with masculinity in Jordanian society, since it is related to roughness and virility. It is typically used by males and even by the majority of males who speak urban SJA (keeping in mind that SJA varieties can also be differentiated with regard to variants of other consonants such as those of /θ/, /ð/, and /D/, not to mention the use of some vowels). The differential use of [g] and [’] applies to most males and females living, working, studying, etc., in urban areas. Males and females living in rural areas both use the variant [g]; however, many young females from rural areas switch to [’] when communicating with urbanites due to the relatively stigmatized, rural image associated with [g] in such contexts. In studying the variants of /q/ in the urban dialect used in Irbid, a city in Jordan, El Salman (2003, p. 423) finds that the

men choose the variant that embodies […] toughness – i.e., the local variant [g], which was once the symbol of rurality and harshness of Bedouin life[, while women] are expected socially to reflect softness and urbanization. Thus, they have adopted the urban variant [ʔ], which best reflects these features. [Note: El Salman’s [ʔ] (glottal stop) is represented as /’/ in this study.]

El Salman (2003) also points to the prestigious status of the [’] variant. He believes that it is “normal for females to adopt a prestigious urban variant, as they tend to use the code identified as the code of prestige” (p. 423).

Research Method

For the purposes of this study, a corpus of online chat texts was collected from three Jordanian chat rooms over a period of nine months during different times of the day in 2009. Language use in these chat rooms was natural and spontaneous, since there were no predetermined conditions under which texts were gathered. Data were collected from a chat program called Internet Relay Chat (IRC). “Basically, IRC is a synchronous, multi-user, text-based chat technology” where people “connect to servers” all over the world and once “connected to a server, it’s possible to join a channel (i.e., a chatroom) for talking publicly in groups, or privately with just one other person” (Thurlow et al., 2004, p. 182; see also Hentschel, 1998). This 24-hour chat program was chosen because it was public and easily accessible. IRC chat sessions are usually conducted for the sake of socializing and entertainment; chatters talk about trends in society, universities, friends, weather, food, holidays, childhood, sports, and other topics. Users of this program can enter more than one channel and can create an unlimited number of new channels, naming them as they wish. In each channel, an operator controls the topic of chat. It is possible for operators to have their own regulations as regards language; for example, they can strictly require the use of Latin script only.12 Participants in each channel are identified by their “nicknames” and each message is automatically preceded by a nickname.

Data were collected from three channels: #Jordan, #Amman, and #Irbid. Chats from screens of public chat activity were collected at different times of the day in order to collect a broadly representative sample of ASCII-ization and variation in representing SJA consonants. Different times of the day include different chat groups, since time usually governs people’s participation in synchronous communications. For example, the number of young people between the ages of 16 and 23 decreases in chat channels from 7:00 a.m. to 3:00 p.m. because this is the time they attend classes at high schools and universities. This age group is found using IRC more after 3:00 p.m. and much more on weekends and during school and university vacations.

Text samples were collected once every week and sometimes every two weeks, mostly on Thursdays but occasionally on Fridays. Friday and Saturday are “official” holidays in Jordan, but since Saturday is followed by a working day (Sunday), most people dedicate Fridays and the remaining time of Thursdays to leisure, entertainment, and social activities; for this reason, the number of chatters on those days was usually greater than on the other days of the week. Samples were collected by number of messages (as one segment), and the aim was to collect, typically, around 150-200 messages on each day of data collection; the number of collected messages was sometimes increased to around 200 when the chat included females. On five different occasions, the number of messages was increased to around 700 to allow for the observation of the use of phonetic variants related to gender; these were cases in which the chat initially included males only for some time, and then females joined the chat with the same males (see “Shift from One Variety to Another in JC,” below). In all, 23 chat samples were collected, comprising a total of 6,835 messages.

The number of participants in each chat session was typically 10-15 chatters; the lowest number of chatters was 6 and the highest number was 26 on two different occasions in the collected data. According to the number of nicknames used, the total number of chatters in the collected corpus was 291, of whom 62 presented as females and 203 presented as males. A chatter usually refers to his/her sex in the opening of his/her chat by identifying ‘a/s/l/’ (i.e., age/sex/location); chatters’ nicknames also mostly indicate whether the chatter is a male or a female. In JC, one can generally easily recognize whether chatters identify themselves, or are addressed, as males or as females (see also Herring, 2003). However, because we cannot be entirely sure whether chatters are in fact males or females, we will henceforth use the terms “chatters as males” (CMs) and “chatters as females” (CFs) in this article, keeping in mind that chatters using JC are aware of the gender-related features of dialect use in their society.

Table 2 shows the breakdown of CMs and CFs across some topics of chat included in the sampled data:13

Table 2. Breakdown of CM and CF chatters according to participation in some chat topics

As Table 2 shows, CFs did not participate in all topics. The number of CFs and the frequency of their participation during any chat session seem to depend on their interest in the topic of the chat. The number of CFs generally increased. Conversely, the number of CFs noticeably decreased when the topic of chat related to local or international politics, economy, or sports. In all, CFs contributed 977 messages in the corpus, while CMs contributed 5,858 messages.
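The message counts reported in this section can be cross-checked with simple arithmetic. The sketch below uses only figures stated in the text (977 CF messages, 5,858 CM messages, 6,835 messages in total, and 23 samples); the derived percentages and the mean are computed here for illustration and are not values reported by the study.

```python
CF_MESSAGES = 977
CM_MESSAGES = 5858
TOTAL_MESSAGES = 6835
SAMPLES = 23

# The two gender subtotals should account for the whole corpus.
assert CF_MESSAGES + CM_MESSAGES == TOTAL_MESSAGES


def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


cf_share = share(CF_MESSAGES, TOTAL_MESSAGES)
cm_share = share(CM_MESSAGES, TOTAL_MESSAGES)
mean_per_sample = round(TOTAL_MESSAGES / SAMPLES, 1)

print(f"CF share of messages: {cf_share}%")
print(f"CM share of messages: {cm_share}%")
print(f"Mean messages per sample: {mean_per_sample}")
```

On these figures, CFs account for roughly one message in seven, which is worth keeping in mind when comparing percentages across the CM and CF columns of the later tables.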

In this study, examples taken from the JC corpus appear between angle brackets < >. Translation of words appears between single quotation marks, phonemic transcription appears between slashes / /, and phonetic variants (i.e., allophones) are shown between square brackets [ ]. The first step in the data analysis was identifying and studying instances of letters and other symbols (e.g., numerals) that are used to represent consonants in JC. The second step mainly consisted of identifying representations that stand for graphemic variants of the same phoneme.14 The SJA phonemes that showed the most variation in the collected corpus of JC were /q/, /θ/, /ð/, and /D/. Since one of the main focuses of this study is gendered variation associated with /q/, the variants of /q/ were investigated in greater depth in relation to the gender of chatters in order to determine the relationship between gender and use of the variants of this phoneme.

Data Analysis

ASCII-ization of the Phonetic Variants of /q/, /θ/, /ð/, and /D/ in JC

In response to the first research question articulated above, the phonetic variation found in SJA is reflected in JC and is evidenced by the presence of orthographic variants that represent different allophones of the same phoneme (see Table 4 below). This is manifested in JC particularly in ASCII-izations that represent the variants of /q/, /θ/, /ð/, and /D/. For example, the four phonetic variants of /q/ are usually ASCII-ized in JC as <2> or sometimes < > (i.e., no grapheme) for [’] (i.e., glottal stop)15, <g> for [g], <k> for [k], and <q> for [q], as in <2alm>, <galm>, <kalm>, and <qalam> ‘pencil’ (the first two variants are more common in SJA than the other two). The other phonemic variables are also ASCII-ized in different ways according to whether a chatter is using urban or rural variants, as illustrated in the following table:

Table 3. ASCII symbols used to represent the variants of /q/, /θ/, /ð/, and /D/ in JC

Despite orthographic similarity between some ASCII-izations, the intended sound can be distinguished by chatters, and also by the researchers, depending on the context in which the sound occurs. For example, <th> is interpreted as [θ] in <ktheer> [kθiir] ‘much’ but as [ð] in <etha> [iða] ‘if,’ as illustrated in Table 3 above. When <z> is used as a variant of /D/, the indicated sound is [Z] (i.e., emphatic alveolar voiced fricative) as in <zalem> [Zalim] ‘oppressor.’ However, <z> as indicating [Z] can be differentiated from <z> intended as /z/ based on context.

Table 4 below shows the most common urban and rural phonetic variants of /q/, /ð/, /D/, and /θ/ that CMs and CFs represent by different ASCII symbols in JC.

Table 4. Most common phonetic variants of /q/, /ð/, /D/, and /θ/ in urban and rural SJA by gender

It is notable in Table 4 that each of the phonemes /q/, /ð/, /D/, and /θ/ has more than one variant in urban SJA and that the ones associated with urban males are the same as the variants used by both males and females in rural SJA. However, a male’s dialect in SJA (and in JC) would still be recognizable as urban or rural based on how vowels are used in a word; the use of vowels in urban SJA is generally different from that in rural SJA. For example, most males using urban SJA would say /gahwi/ (ASCII-ized as <gahwe>) to mean ‘coffee,’ while males and females speaking rural SJA would say /gahwa/ (ASCII-ized as <gahwa>). In addition, the phonetic variants of /q/, /ð/, /D/, and /θ/ that are expected from urban females are distinct from those used by either urban males or rural males and females; thus, for instance, girls speaking urban SJA would say /’ahwe/ (ASCII-ized as <2hwe>) to mean ‘coffee.’

Variation in Representing the Phoneme /q/ in JC: Gender and Dialect

The corpus shows that the same chatters sometimes switched from one SJA variety to another and so also switched representations of specific sounds. The activity that accompanied this change was that new chatters self-identifying as females started to participate in the ongoing conversations. Therefore, analysis of variation in ASCII-izing the phonemic variables in JC should take the chatters’ identity, particularly their gender and dialect, into consideration, since these are mostly “embedded” in CMC during chat (see Howard, 2004, p. 2). In particular, it is essential to investigate the effect of the presence (or absence) of chatters who identify themselves as female on the choice and frequency of those consonantal graphemes that represent phonetic variants in JC. In this section, we examine the occurrences of the different variants of /q/ in messages by CMs and CFs. We also identify elements in the context of chat that might trigger a shift from one variant to another in JC, as given “communicative contexts may generate short-term effects on phonological patterns” (Foulkes, n.d., p. 645).

Correlation between Gender of Chatters and Representation of the Phonetic Variants of /q/

The collected data were analyzed in order to determine if the choice of the variants of /q/ is related to the gender of chatters and to compare the frequency of occurrence of these variants in male-male (CM-CM) chat with that in mixed gender (CM-CF) chat. Different logs of male-male chat within the corpus showed the dominance of the [g] variant for the variable /q/. Table 5 below shows the percentages of the different representations of the phonetic variants of /q/ in male-male chat in the corpus:

Table 5. Percentages of the representations of the phonetic variants of /q/ in CM-CM JC

As Table 5 shows, /q/ in CMs’ chat was ASCII-ized as <g> in 81% of its occurrences to indicate the use of the variant [g], while it was ASCII-ized as <2> or < > in 6% of its occurrences to indicate the use of [’].
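Percentages of this kind are relative frequencies: the number of /q/ tokens written with a given grapheme, divided by the total number of /q/ tokens in the relevant messages. The tally can be sketched in Python as follows. The function name `variant_percentages` and the ten-token list are hypothetical illustrations only; they are neither the corpus data nor a record of the procedure actually used in this study.

```python
from collections import Counter


def variant_percentages(tokens):
    """Return the percentage share of each grapheme label in `tokens`."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: 100 * n / total for label, n in counts.items()}


# Toy, made-up coding of ten /q/ tokens (NOT taken from the corpus):
# "g" = <g> for [g], "2" = <2> for the glottal stop,
# ""  = the "no grapheme" representation of the glottal stop.
toy_tokens = ["g", "g", "g", "2", "g", "g", "", "g", "g", "g"]
shares = variant_percentages(toy_tokens)
print(shares)
```

Coding the "no grapheme" case as an empty string keeps it countable as a category of its own, which is one possible design choice among several.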

In contrast, the frequency of occurrence of the variants of /q/ in a mixed male-female chat was clearly different from that of the same variants in male-male chat, in that the contributions showed an increase in the use of [’] (and a decrease in [g]) when chat included both CMs and CFs, as shown in Table 6 below. The table shows the frequency of each of the variants of /q/ in CMs’ messages compared with the frequency of the same variants in CFs’ messages when chat includes both CMs and CFs. The shift in the usage of phonetic variants (i.e., mainly from <g> to <2>) becomes especially apparent when we compare the percentages that represent the variants’ frequencies in Table 5 above with those in Table 6.

Table 6. Percentages of the representations of the phonetic variants of /q/ in CMs’ messages compared with those in CFs’ messages in mixed CM-CF JC

Table 6 shows that symbols indicating the use of the variant [’] generally predominate in mixed male-female chat, and, in contrast to what appears in Table 5, they are also the dominant variants in CMs’ contributions. This observation leads us to the issue of the role of the participation of CFs in triggering the shift by CMs from the variant [g] to the variant [’].

Shift from One Variety to Another in JC

There are occasions in the corpus that include chat by the same CMs in which the participation of CFs intervenes. That is, in one continuous sample of text in the corpus, the chatters are all CMs, then CMs and CFs, and later CMs only, followed by a return of CFs. One such occasion consists of the following periods of chat: (A) chat by CMs for 32 minutes, then (B) chat by CMs and CFs for 16 minutes, followed by (C) absence of CFs for 192 minutes, and finally (D) a return of CFs to chat with the same CMs for 30 minutes. This specific text sample was recorded over 4 hours and 30 minutes (from 11:58 to 16:32) and includes 693 messages from 26 chatters (23 CMs and 3 CFs). Table 7 shows the percentages of the representations of the phonetic variants of /q/ during these four periods in the sequence:

Table 7. Percentages of the ASCII-ized variants of /q/ during the periods A, B, C, and D

As shown in Table 7, the phonetic variant [’], which increases in occurrence in CMs’ chat during B and D, is the same variant that is almost always realized in CFs’ chat. Conversely, the occurrence of the variant [g] decreases in CMs’ chat during B and D, even though it is used most of the time in male-male chat. We can add to this the observation that in period C (i.e., during CFs’ absence), CMs return to using the variant [g], as was the case in period A. Therefore, it appears that switching from one dialectal variant to another by the same CMs did not occur by chance but rather was triggered by the participation of CFs.
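The comparison across periods amounts to assigning each message to a period and then computing variant shares separately for each period and each gender. A minimal sketch in Python is given below. It uses the period lengths reported above (32, 16, 192, and 30 minutes) and assumes, for illustration, that the periods are contiguous and that each message has a minute offset from the start of the sample. The names `period_of` and `shares_by_period` and the toy records are hypothetical and do not reproduce the corpus or the figures in Table 7.

```python
from bisect import bisect_right
from collections import Counter, defaultdict

# Period lengths in minutes as reported in the text.
PERIODS = [("A", 32), ("B", 16), ("C", 192), ("D", 30)]


def period_of(minute_offset):
    """Map minutes since the start of the sample to a period label.

    Assumption: the four periods follow one another without gaps.
    """
    ends = []
    elapsed = 0
    for _, length in PERIODS:
        elapsed += length
        ends.append(elapsed)  # cumulative end of each period: 32, 48, 240, 270
    index = bisect_right(ends, minute_offset)
    if minute_offset < 0 or index >= len(PERIODS):
        raise ValueError("offset outside the sample")
    return PERIODS[index][0]


def shares_by_period(records):
    """records: iterable of (minute_offset, gender, variant) tuples.

    Returns {(period, gender): {variant: percent}}.
    """
    counts = defaultdict(Counter)
    for minute, gender, variant in records:
        counts[(period_of(minute), gender)][variant] += 1
    return {
        key: {v: 100 * n / sum(c.values()) for v, n in c.items()}
        for key, c in counts.items()
    }


# Toy, made-up records (NOT corpus data): (minute offset, gender, variant).
toy = [
    (5, "CM", "g"), (10, "CM", "g"),
    (35, "CM", "2"), (40, "CF", "2"), (40, "CM", "g"),
    (100, "CM", "g"),
    (250, "CM", "2"),
]
result = shares_by_period(toy)
print(result)
```

Keeping the gender label in the grouping key makes it possible to inspect CMs’ shares on their own in periods B and D, which is the comparison the argument relies on.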

Discussion: Socio-Cultural Aspects and Motivations

Sociolinguistic Aspects of JC

The second research question mentioned in the Introduction asked whether JC can be taken as a manifestation of the dialectal variation found in SJA, while the third question asked about the socio-cultural reasons for the diversity of representing phonetic variants – specifically, how these representations relate to the gender of chatters in JC. In general, the phonetic features of SJA pronunciation, differentiating urban from rural, are reflected in JC; this is especially apparent in the ASCII-ization of the phonetic variants of /q/, in addition to the variants of /θ/, /ð/, and /D/, as shown in Tables 3 and 4. Thus, JC demonstrates that chatters accentuate their chat in order to represent their SJA varieties.

Specifically, the common differentiation between urban and rural SJA, as based primarily on the choice between [’] and [g] as variants of /q/, is evident in JC. This use of non-standard language in JC reflects the same dialectal variation that is characteristic of SJA in offline situations.

The results of this study support the traditional association of the [’] variant with femininity and refinement and the association of [g] with masculinity and rurality. In the JC corpus investigated, CMs typically use [g] (as <g>) in chatting with other CMs, while CFs use [’] (usually as <2>) in chatting with CMs. The differentiation between [’] and [g] as variants of /q/ is culturally associated with gender styles and “gender-linked differences” (Wardhaugh, 2010) in the use of language varieties.

Overall, the results of the analyses reflect Hentschel’s (1998, n.p.) claim that “the way IRC’ers use and spell their language shows very realistically the way in which the language is used in actual everyday conversation.”

JC as a Reflection of Sociolinguistic Accommodation and Convergence

In the corpus for this study, although CMs used the [g] variant during all-male chat, most of them switched to the [’] variant when a CF entered the chat channel. This behavior can be explained in light of Communication Accommodation Theory (CAT), which was developed by Giles in the 1970s (see Giles & Smith, 1979; Giles & Ogay, 2007). This theory emphasizes the socio-psychological aspects of interaction among speakers of different varieties or dialects of the same language. “In essence, accommodation refers to one speaker adapting his or her language variety in order to reduce differences with the language variety of another speaker” (Jones, 1999, p. 269) or in order “to boost social attractiveness” (Coupland, 2008, p. 268). It follows that when chatters are seeking approval or acceptance from those addressed by or viewing their messages, they should have a tendency to assimilate their language variety to that of their interlocutors (see Holmes, 2008, pp. 255-260; Hudson, 2001, pp. 229-237).

In addition, according to CAT, speakers try to reduce pronunciation dissimilarities in order to win their interlocutors’ social approval; this is referred to as “speech convergence” (Holmes, 2008). “Convergence occurs when speakers alter their speech to be more like that of their interlocutors because they want to establish solidarity with or elicit approval from their audience” (Adamson, 2009, p. 140). This type of accommodation was found in mixed male-female JC in the corpus. The corpus also shows that convergence involving the variants of /q/ is more prevalent than that in relation to the variants of other variables such as /θ/, /ð/, and /D/ (see endnote 4). This perhaps can be explained in relation to cultural stereotypes associating the different variants of /q/ with gender and dialectal background differences. Hudson (2001, p. 165) believes that “if we think of accommodation as a way of reducing social distance, i.e., as a strategy for protecting solidarity-face, there is no reason to expect accommodation on every single variable; a few variables might well be singled out for accommodation.”

As Tables 5, 6, and 7 and the related discussions above show, in mixed male-female JC, CMs show a higher tendency to converge towards CFs; CFs do not have a tendency to converge their dialect towards that of CMs. Taking CAT into consideration, we can speculate that male chatters’ convergence towards females’ chat is related to their desire to gain CFs’ approval and acceptance and to reduce social distance. Through their graphemic choices, CMs symbolically indicate that they belong to the same “group” as that of CFs (see Hudson, 2001; Holmes, 2008). That is, CMs tend to use the variant [’] when chatting with females, since it is commonly associated with prestige and urbanism, and they avoid the variant [g], since, in this context of dialects in contact, it is stigmatized as indicating rurality and crudeness. The urban variety is considered refined, stylish, elegant, and, frequently, also more “cultured” in contrast to the rural variety. Its use would suggest that a CM is himself refined and cultured, and so the CFs would presumably find him more attractive to respond to than a CM using the rural variety. The traits of the prestige variety are considered necessary by many females/CFs in order to respond to, and participate in discussions/chat with, males/CMs.

Conclusion, Limitations, and Suggestions for Further Research

The sociolinguistic aspects of JC conversations are a reflection of the same aspects that shape the use of SJA in everyday life. These aspects show that CMC texts can be a rich source of data for researchers working in the field of sociolinguistics, and perhaps also for those working in fields such as phonetics. Although this type of communication lacks some of the qualities of face-to-face interaction, the use of language in CMC chat can be described as natural, spontaneous, and communicatively efficient.

One limitation of this study is that SJA vowels were not investigated due to scope and space limitations. Speakers of SJA varieties generally use different vowels in uttering the same word. Variations in representing short and long vowels in JC and whether these uses function as sociolinguistic indicators should be comprehensively investigated to see if accommodation takes place, and if so, what factors stimulate it. In addition, this study focused on dialectal variations in JC related to pronunciation. A future study could focus on vocabulary choice in JC in relation to rural and urban SJA or even in Fallahi and Bedouin. Another limitation is that our analysis focused on only one mode of CMC, chat. Future research could analyze variation in JC in other modes of CMC such as email, instant messaging, and social network sites. It would be interesting to know if dialectal and/or gender-based accommodation occurs in private, dyadic CMC. In addition, this study did not investigate the actual (offline) dialects of the chatters to be able to determine if the JC realistically represented those dialects; such an investigation would have required making spoken contact with the chatters. Another issue that could be investigated in future research is the sociolinguistic aspects of female-female JC, since these were not taken into consideration in this study.

These limitations notwithstanding, it is hoped that this study, as a contribution to research on Arabic dialects used in CMC, will substantially add to understanding of the sociolinguistic aspects of communication on the multilingual Internet.


  1. We are especially grateful to Susan Herring, the editor of Language@Internet, for her constructive and thorough revisions. We are also grateful to the two anonymous referees for their useful comments and suggestions on earlier versions of this article.

  2. Communication in CMC, “despite its use of a written medium, takes place in an environment of social interaction and displays features of spoken language,” such as hedges, in addition to being “spontaneous and unedited” (Gass, 2008, p. 432).

  3. Investigating vowel representations in JC is beyond the scope of this study. However, as is the case with some consonants, vowels are important in differentiating SJA dialects.

  4. Due to considerations of space, the analysis of the effect of perceptions related to gender and dialect on the representation of /ð/, /D/, and /θ/ has not been included in this article. It is available upon request from the first author.

  5. Anglicized Arabic (i.e., AA) is sometimes referred to as Common Latinized Arabic (CLA) (see Palfreyman & al Khalil, 2003). AA had been common on the Internet (e.g., in email and chat) since before the emergence and spread of ASCII symbols. In addition, AA was commonly used to represent Arabic words, especially proper names, appearing on some shop signs (to indicate prestige), on street or road signs showing names of cities and streets (to guide non-native speakers of Arabic), and in graffiti (perhaps to show off or to indicate learnedness).

  6. However, there are differences between a few of the consonantal graphemes that are used in JC and those reported by Palfreyman and al Khalil (2003); these differences mainly relate to dialectal variations between SJA and the UAE vernacular. In particular, the numeral ‘2’ represents either the glottal stop as an independent phoneme or an (urban) allophone of /q/ in JC; however, the same numeral represents only the glottal stop in the UAE vernacular, since the glottal stop is not used as a variant of /q/ in UAE. These two dialects also differ with regard to the variants of /ð/, /D/, and /θ/, since the urban variants of these phonemes in SJA are not used in the UAE vernacular for the same phonemes.

  7. We are indebted to one of the anonymous reviewers of this article for pointing out the issue related to the representations of /`/ (i.e., the voiced pharyngeal fricative) and /q/. In AA, /`/ has no parallel grapheme, so <a> is usually used instead; in JC, <3> is used to stand for /`/. The case for /q/ is different, however, since it commonly has two variants in SJA: /g/ in rural and /’/ (glottal stop) in urban SJA. The first variant is represented as <g> in both AA and JC, while the second is mostly represented as <2> in JC but as <a> in AA. This latter case shows that /’/ as a variant of /q/ or as an independent phoneme has no grapheme in AA; thus, as in the case of /`/, the letter <a> is usually used instead in AA. It is worth mentioning that the mark represented by apostrophe is sometimes used to stand for /`/ or /’/ in AA; the addressee can recognize the intended sound through context, as in urban AA <rafe’>, which can mean either /rafii`/ ‘thin’ or /rafii’/ ‘companion’ (in JC, these are differentiated as <rafe3> ‘thin’ and <rafe2> ‘companion,’ respectively).

  8. It should be noted here that in JC these numerals are not used as homophones, such as using ‘4’ to replace ‘for’ in English. The resemblance in graphical shape suggests that they may be taken as (partial) homographs. For discussion of the use of numerals in CMC, see Koutsogiannis and Mitsikopoulou (2003) and Tseliga (2007) on Greek CMC and Anis (2007) on French CMC.

  9. These diacritics are not used in latinized or anglicized SJA. Loss of diacritics is reported as regards some languages when latinized in CMC; see, for instance, Ivković (forthcoming) and Hentschel (1998) on Serbian.

  10. However, Fallahi and Bedouin are also sometimes differentiated according to word choice.

  11. This does not mean that all women in Jordan use the urban variant [’] in everyday interactions. In fact, many of the women who speak rural SJA natively do not speak urban SJA “competently.” But this “failure” applies also to phonetic variants other than those of /q/; those young women use [’] as a variant of /q/, since it is the most stereotypically recognized variant indicating refinement, urbanity, and prestige. However, a good number of the younger generation of females who originally spoke rural dialects – provided that they grew up in urban areas or had long-term contact with urbanites – seem to be proficient at using urban SJA, and the female chatters in this study of JC were all young.

  12. One of the reasons for strictly requiring Latin script was that Arabic script lacks adequate or distinctive representations of some sounds like /g/ that exist in SJA. Moreover, requiring the use of ASCII-ized Arabic can be due “to positive, in-group local values of education, competence in English and peer group prestige” (Palfreyman & al Khalil, 2003, n.p.).

  13. This is not an exhaustive list of all the topics of chat in the collected data. Categorizing all the collected messages according to gender of chatters and in relation to topic of chat is beyond the space and scope considerations of this study. A good number of the topics of chat were represented by only a small number of messages, as a segment might include different topics of chat. Mentioning all of these topics, which would require a very long list, does not relate to the research questions of this study.

  14. The focus in this article is on investigating variation related to the second step of analysis. The first step was a general process intended to identify representations of all SJA phonemes in JC.

  15. We are grateful to one of the anonymous reviewers for pointing out that [’], as a phonetic variant of /q/, is sometimes represented by no grapheme < >. This “no grapheme” is mostly followed by the vowel <a> /a/; native speakers, however, will recognize the existence of [’] based on context. Examples from JC include words such as <asdak> [’Sdak] ‘you mean’ and <areeb> [’ariib] ‘close.’

  16. One of the reviewers of this article pointed out that “using <d> to represent /D/ loses the phonemic distinction between /D/ and /d/.”


Abdel Jawad, H. (1981). Lexical and phonological variation in spoken Arabic in Amman. Unpublished doctoral dissertation, University of Pennsylvania.

Adamson, H. (2009). Interlanguage variation in theoretical and pedagogical perspective. New York: Routledge.

Al-Khatib, M. (1995). The impact of interlocutor’s sex on linguistic accommodation: A case study of Jordan radio phone-in programs. Multilingua, 14(2), 133-150.

Al-Wer, E. (1991). Phonological variation in the speech of women from three urban areas in Jordan. Unpublished doctoral dissertation, University of Essex.

Anis, J. (2007). Neography: Unconventional spelling in French SMS text messages. In B. Danet & S. C. Herring (Eds.), The multilingual Internet: Language, communication, and culture online (pp. 87-115). New York: Oxford University Press.

Coupland, N. (2008). The delicate constitution of identity in face-to-face accommodation: A response to Trudgill. Language in Society, 37(2), 267-270.

Danet, B., & Herring, S. C., Eds. (2003). The multilingual Internet: Language, culture, and communication in Instant Messaging, email, and chat. Special issue of the Journal of Computer-Mediated Communication, 9(1). Retrieved February 16, 2012 from http://jcmc.indiana.edu/vol9/issue1/

El Salman, M. (2003). The use of [q] variant in the Arabic dialect of Tirat Haifa. Anthropological Linguistics, 45(4), 413-425.

Foulkes, P. (n.d.). Phonological variation – a global perspective. Retrieved July 4, 2011 from http://www-users.york.ac.uk/~pf11/Phonological%20variation-web.pdf

Gass, K. M. (2008). Language contact in computer-mediated communication: Afrikaans-English code switching on Internet-relay chat (IRC). South African Linguistics and Applied Language Studies, 26(4), 429-444.

Giles, H., & Smith, P. (1979). Accommodation theory: Optimal levels of convergence. In H. Giles & R. N. St. Clair (Eds.), Language and Social Psychology (pp. 45-65). Baltimore, MD: Basil Blackwell.

Giles, H., & Ogay, T. (2007). Communication accommodation theory. In B. B. Whaley & W. Samter (Eds.), Explaining communication: Contemporary theories and exemplars (pp. 293-310). Mahwah, NJ: Lawrence Erlbaum.

Hentschel, E. (1998). Communication on IRC. Linguistik Online, 1. Retrieved September 27, 2009 from http://www.linguistik-online.com/irc.htm

Herring, S. C. (2003). Gender and power in online communication. In J. Holmes & M. Meyerhoff (Eds.), The handbook of language and gender (pp. 202-228). Oxford, UK: Blackwell Publishers.

Herring, S. C., & Zelenkauskaite, A. (2008). Gendered typography: Abbreviation and insertion in Italian iTV SMS. In J. F. Siegel, T. C. Nagel, A. Lorente-Lapole, & J. Auger (Eds.), IUWPL7: Gender in language: Classic questions, new contexts (pp. 73-92). Bloomington, IN: IULC Publications.

Holmes, J. (2008). An introduction to sociolinguistics. London: Longman.

Howard, P. N. (2004). Embedded media: Who we know, what we know, and society online. In P. N. Howard & S. Jones (Eds.), Society online: The Internet in context (pp. 1-28). Thousand Oaks, CA: Sage Publications.

Hudson, R. (2001). Sociolinguistics. Cambridge, UK: Cambridge University Press.

Ivković, D. (forthcoming). Pragmatics meets ideology: Digraphia and non-standard orthographic practices in Serbian online news forums. Journal of Language and Politics.

Jones, B. M. (1999). The Welsh answering-system. New York: Mouton De Gruyter.

Koutsogiannis, D., & Mitsikopoulou, B. (2003). Greeklish and Greekness: Trends and discourses of “glocalness.” Journal of Computer-Mediated Communication, 9(1). Retrieved February 15, 2012 from http://jcmc.indiana.edu/vol9/issue1/kouts_mits.html

Labov, W. (1990). The intersection of sex and social class in the course of linguistic change. Language Variation and Change, 2(2), 205-254.

Labov, W. (2010). Principles of linguistic change: Cognitive and cultural factors (Vol. 2). Malden, MA: Wiley Blackwell.

Nishimura, Y. (2003). Linguistic innovations and interactional feature of casual online communication in Japanese. Journal of Computer Mediated Communication, 9(1). Retrieved January 20, 2010 from http://jcmc.indiana.edu/vol9/issue1/nishimura.html

Palfreyman, D., & al Khalil, M. (2003). A funky language for teenz to use: Representing Gulf Arabic in instant messaging. Journal of Computer-Mediated Communication, 9(1). Retrieved April 11, 2009 from http://jcmc.indiana.edu/vol9/issue1/palfreyman.html

Simpson, J. (2002). Computer-mediated communication. ELT Journal, 56(4), 414-415.

Thurlow, C., Lengel, L., & Tomic, A. (2004). Computer mediated communication: Social interaction and the Internet. London: Sage.

Trudgill, P. (1972). Sex, covert prestige and linguistic change in the urban British English of Norwich. Language in Society, 1(2), 179-195.

Tseliga, T. (2007). It’s all Greeklish to me! Linguistic and sociocultural perspectives on Roman-alphabeted Greek in asynchronous computer-mediated communication. In B. Danet & S. Herring (Eds.), The multilingual Internet: Language, communication, and culture online (pp. 116-141). New York: Oxford University Press.

Wardhaugh, R. (2010). An introduction to sociolinguistics (6th ed.). Malden, MA: Wiley-Blackwell.

Wolfram, W., & Schilling-Estes, N. (2006). American English: Dialects and variation. Malden, MA: Blackwell.

Zelenkauskaite, A., & Herring, S. C. (2006). Gender encoding of typographical elements in Lithuanian and Croatian IRC. In F. Sudweeks & C. Ess (Eds.), Proceedings of Cultural Attitudes Towards Technology and Communication 2006 (CATaC’06), Tartu, Estonia, June 28-July 1.


Biographical Notes

Samir Jarbou (samerjar@just.edu.jo) is an assistant professor at Jordan University of Science and Technology. His interests include anaphora, deixis, and computer-mediated communication.

Buthaina al-Share (bith-alshare@just.edu.jo) holds an M.A. from Jordan University of Science and Technology. Her interests include sociolinguistics and digital discourse.


Any party may pass on this Work by electronic means and make it available for download under the terms and conditions of the Digital Peer Publishing License. The text of the license may be accessed and retrieved at http://www.dipp.nrw.de/lizenzen/dppl/dppl/DPPL_v2_en_06-2004.html.