Disrupted turn adjacency and coherence maintenance in Instant Messaging conversations (Volume 6, 2009)



Instant Messaging (IM) is an increasingly popular communication medium. A recent survey found that 83% of Swedish youth (12-16 years old) use the Internet for chat in their spare time, and chatting most commonly takes place in systems where contacts have to be approved by the user, such as MSN Messenger (Medierådet, 2008). IM is also becoming an important tool for communication in the workplace (Lam & Mackiewicz, 2007; Nardi, Whittaker, & Bradner, 2000; Woerner, Yates, & Orlikowski, 2006).

One key to the success of IM is described by Baron (2008) as “volume control.” The semi-synchronicity and presence settings in IM make it possible for the receiver to choose his or her level of participation. However, semi-synchronicity also sometimes results in “disrupted turn adjacency,” which Herring (1999) describes as one of the obstacles to coherence in text-based computer-mediated communication (CMC): Since messages do not appear on the screen of the other participant(s) until the writer hits the return key, it is possible that other messages will be sent in the meantime, separating related messages from each other.

Subsequent research on the topic indicates that disrupted turn adjacency does not necessarily result in incoherent conversation (Simpson, 2005; Woerner et al., 2006). However, not very many details are provided concerning the linguistic means by which coherence is maintained in the specific circumstances of these studies, and the few studies that discuss strategies for coherence in intertwined threads deal with multiparty IM conversations (Markman, submitted; Woerner et al., 2006). It is here that the present study aims to contribute.

The aim of this article is to examine how coherence is maintained in IM interaction in light of the communicative affordances of this particular tool. The main focus is on instances of disrupted turn adjacency; I investigate whether this phenomenon causes coherence problems and what specific strategies are employed to create coherence in this context. As part of the results, a detailed account is given of the linguistic means by which coherence in intertwined threads is secured. Other signs of problematic coherence creation in IM are also presented and discussed, and recommendations for the design of IM clients are suggested.

The data analyzed were collected during an ethnographic study at a design school in Sweden in March 2007. A group of international students was observed for three weeks, with a special focus on a period of six days, during which log files from online in-group conversations were collected. Altogether, the data include log files from IM involving six people in eleven dyads (7,474 words and 1,689 messages).

Conversational Coherence and Communicative Affordances

Coherence, broadly defined, is that which in a discourse connects statements with statements, statements with people, and people with other people. It is, in short, the “glue” of text and conversation. (Erickson, Herring, & Sack, 2002, p. 2)

From a pragmatic perspective, coherence establishment and maintenance can be seen as “a kind of organization which actors accomplish or construct communicatively during their interactions” (Korolija, 1998, p. 26). In this process, connections can be established on different levels. In face-to-face interaction, three common strategies to establish coherence relate to cohesive devices, sequential structures, and continued attention. The first two are the main concern of the present article.

One way in which connections between related communicative actions can be made clear is through the employment of linguistic and grammatical devices that create textual cohesion. Building on a review of previous research, Tanskanen (2006) distinguishes between coherence and cohesion as follows:

Cohesion can be regarded as a property of the text, while coherence depends upon the communicators’ evaluation of the text. Cohesive devices, being on the surface of the text, can be observed, counted and analyzed and are therefore more objective. Coherence, on the other hand, is more subjective, and communicators may perceive it in different ways […]. There is an interplay between [the two phenomena] in that the presence of cohesive devices in a text facilitates the task of recognising its coherence. (p. 21)

Halliday and Hasan (1976) describe five strategies for creating cohesion in text: reference, where for example a pronoun is used to refer back to a previously mentioned subject; substitution, where one word is used instead of another; ellipsis, where connections become clear through the exclusion of certain words; conjunction, where segments are linked through specific linking words; and lexical cohesion, where links are created through lexical repetition. Halliday and Hasan present a detailed and complex coding scheme, which is much simplified in the current analysis, where the aim is not to describe the complete system of cohesive devices employed, but rather how participants create coherence despite disrupted turn adjacency. The different strategies, as invoked in the present study, will be exemplified in Table 1, along with excerpts from the log files analyzed (see “Data and coding”).

Another way in which connections between utterances are made apparent, and coherence is thus established, can be found in the structure of conversation. Schegloff (1990) claims that sequential structure is an important tool in creating conversational coherence, and that one can draw conclusions about links between seemingly unrelated utterances based on sequencing. The most striking example is the so-called adjacency pair (Schegloff, 1968), in which certain types of utterances—for instance, questions—set up an expectation concerning the following turn—for example, that it is likely to include a response.

However, some scholars claim that it is difficult to rely on sequencing alone to create coherence in CMC, as the patterns we find here are different from those of face-to-face interaction. Hutchby (2001) suggests that we should explain the relationship between technology and conversational structures through the concept of communicative affordances. He builds on the notion of affordance, first introduced by Gibson (1977, 1979):

The affordances of the environment are what it offers the animal, what it provides or furnishes, either for good or ill. The verb to afford is found in the dictionary, but the noun affordance is not. I have made it up. I mean by it something that refers to both the environment and the animal in a way that no existing term does. (Gibson, 1979, p. 127)

The concept was later introduced to the human-computer interaction (HCI) community, notably by Norman (1988, 1999) and Gaver (1991, 1996), in relation to technological artefacts.

According to Hutchby, the communicative affordances of a tool are reflected in conversational patterns, such as turn-taking, sequencing, and feedback, and can be identified by examining conversational interaction as it unfolds. In the present analysis, qualitative methods from conversation analysis are employed to investigate how participants in interaction act upon the communicative affordances provided by the tool of Instant Messaging. Both cohesive strategies and strategies related to sequencing are revealed as important in creating coherence in this context.

Conversational Coherence in CMC

Coherence as an issue has often been addressed in CMC research. Herring (1999) identified two main problems for coherence in CMC: lack of simultaneous feedback and disrupted turn adjacency, as evidenced in patterns of turn-taking and sequencing. Her analysis of synchronous CMC focuses on Internet Relay Chat (IRC) interaction, and she shows how users are able to adapt to the conditions of IRC by introducing innovative strategies for back-channels, turn-change signals, and address. Further, she points out that “[CMC] is both dysfunctionally and advantageously incoherent” and suggests that the “incoherence” of CMC might encourage linguistic playfulness and participation in multiple simultaneous threads within the same discussion. It should be noted here that “incoherence” in Herring’s definition does not refer to miscommunication, but rather to disrupted turn adjacency. Based on her observations, Herring makes suggestions for design improvements to text-based CMC systems in three domains that could enhance interactional coherence, while maintaining some of the benefits of what she refers to as incoherence: better logging and visualization possibilities, two-way interaction, and innovative ways of linking connected turns.

Herring’s (1999) observations have been applied and discussed in subsequent publications on the topic of coherence in CMC, and often there seems to be some confusion with regard to the terminology used. Some scholars have taken Herring’s reasoning as a claim that CMC leads to miscommunication. They have opposed the statement that CMC is incoherent, arguing that coherence is created and maintained despite disrupted turn adjacency both in asynchronous and synchronous CMC. In her study of coherence in asynchronous CMC, Lapadat (2007) shows that participants draw on techniques traditionally used in writing in order to create coherence. For instance, they use quotations to a large extent. This, she argues, indicates that turn sequencing might not be as relevant in CMC as in linear oral interaction. Similarly, Simpson (2005) claims that even though synchronous CMC lacks sequential coherence, participants are able to draw on background knowledge concerning this type of interaction in order to make sense of the unfolding discourse. Markman (submitted) investigates coherence in synchronous computer-mediated team meetings and explores how threading (more specifically embedded repeats, adjacency pairs, and sentence structure) and speaker selection are important in creating coherent interaction in synchronous CMC.

Of particular interest in relation to the present study are publications dealing specifically with Instant Messaging. Woerner et al. (2006) take the two problems identified by Herring (1999) as a starting point for their analysis of IM in the workplace among physically dispersed co-workers. In the context of the workplace, they identified two additional problems: multitasking and authority. In response to the lack of simultaneity, Woerner et al. show that participants in IM conversation use specific openings or preambles to notify others that they would like to converse, and that they make use of the persistent records of their IM and leave their IM client on continuously throughout the day. Further, they show that disrupted turn adjacency was uncommon in their data, possibly because of the work-related topics discussed. The strategies used to deal with potential sequencing problems during engagement in simultaneous IM conversations included keeping the different conversations in separate windows, making use of color coding to separate contributions from different participants, and putting off-turn information within brackets. Woerner et al. also show how verbal techniques, such as naming, partial sequences, and lexical repetition across IM conversations, were used to keep conversations on track. The strategies employed to deal with the challenges of multitasking also include separate windows and even separate screens for work-related and conversational tasks. The challenge of authority was met by adapting language to the style of the leader; a strategy used in response to both the challenge of multitasking and that of authority was simply going offline.

Lam and Mackiewicz (2007) also discuss coherence and IM in the workplace, focusing on the interaction of one dyad over 95 days. They concluded that the IM conversations investigated were not incoherent, as the researchers were able to identify very few examples of miscommunication. They identified three strategies that their participants used in order to maintain coherence: short, multiple and sequential transmissions, topicalization, and the use of performative verbs. It should be noted that in the present article, phenomena such as disrupted turn adjacency (which Herring, 1999, refers to as sequential incoherence) are not seen as signs of incoherence per se. Instead, the term incoherence is employed here to refer to miscommunication and ambiguities. It is further assumed that disrupted turn adjacency could potentially lead to miscommunication and ambiguities, but whether such is the case will be empirically investigated in the present study.

Data and Method

The data on which the current results are based were gathered in March 2007 as part of an ethnographic study at a design school in Sweden. The author spent three weeks in the environment, observing how a group of students communicated with each other, face-to-face and via IM. During six days, the students were asked to collect log files from their in-group interaction, and log files were provided by six group members, all involved in dyadic IM conversations (11 dyads total). The substantial data available through the complete log files were used as the basis for this particular study. Observational notes are not part of the current analysis, as such notes are not available for all of the IM conversations (Örnberg Berglund, forthcoming, takes a broader perspective and combines log files and observational notes in the analysis).

The participants were international master’s students of interaction design. The native languages of those involved in the IM conversations analyzed here are English, Chinese, Malay, Swedish and Portuguese, and the lingua franca was English. They were all experienced users of IM, and, at the time of investigation, they used it regularly to communicate with each other and with friends and family both in Sweden and in their native countries. During the period of observation, the students were working on individual tasks; thus IM was not used for formal task-related cooperation. The interaction mainly took place at the design school, but some examples of interaction from home are also included in the data. This means that the participants are not necessarily communicating about design, but often also about private issues. Whereas the fact that they were sometimes sharing the same physical environment can certainly influence interactional patterns (see Örnberg Berglund, forthcoming), it is argued here that the context exercises minimal influence on the main focus of the present investigation, coherence in instances of disrupted turn adjacency.

Although the log files do not, of course, represent the embodied practices of the participants in the same way as would, say, a video recording, they still provide an analyzable trace of the participants’ emergent interaction. It should further be noted that the log files provided by the participants reflect the specific IM tool used and hence contain different structural information. In addition, one participant copied her interaction directly off the screen and pasted it into an MS Word file instead. Figures 1-4 illustrate the different types of log files received.

Figure 1. Log file from MSN Messenger

The excerpt in Figure 1 is from an MSN Messenger log file. Participants were instructed to save the history of their IM conversations, which meant that their interaction was automatically logged by the system as an XML document. Here we find information about the date and time the message was sent; we can also see how the system has divided the messages into sessions (based on temporal criteria). The self-chosen names of the participants are also listed, as well as the message itself. We also find information about the font participants are using. If participants make use of pre-programmed emoticons, this is also visible in the log file. System-generated messages informing participants about, e.g., file transfer, appear both on the screens of the participants and in these log files.

Figure 2. Log file from Gtalk

The chat history from Gtalk is automatically saved as an email message in Gmail, as shown in Figure 2. These logs do not provide exact detail concerning the time of the message, but only report on minutes. Instead of dividing the messages into sessions, here we get information about how many minutes the participants are idle if longer pauses occur. In these log files, we also see the self-chosen names of the participants, but the account owner is represented as “me.”

Figure 3. Log file from the Miranda application

Figure 3 shows a file that was copied directly off the screen from the application Miranda and pasted into a text editor by the participant. The arrows before each message indicate whether it was incoming or outgoing. The participant provided information about date and interactants herself.

Figure 4. Log file from MSN; times estimated

Figure 4 illustrates the least detailed type of data collected. Again, the participant has copied the messages directly off the screen and pasted them into a text editor. No time stamps are available; instead, the participant supplied estimated times. It should be noted that most of the log files of the Figure 4 type were also contributed as regular MSN log files by the other participant in the interaction, which made it possible to get exact figures concerning timing. However, this was not always the case, and consequently some excerpts only show the estimated time when the session began.

For analysis purposes, these different types of log files were all transformed into consistently formatted tables in Excel. The columns included in these tables, as shown in the current article, were: Line and Mode (added by the researcher), Time, From, To, and Message.
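As a rough illustration of this normalization step, the sketch below maps entries from differently structured logs onto the column set named above. The raw field names (`sender`, `time`, `text`), the `normalize` helper, and the sample values are assumptions made for the example; the study itself used Excel, and the text does not describe how Mode was coded, so that column is left empty here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogRow:
    """One row of the unified table: Line, Mode, Time, From, To, Message."""
    line: int
    mode: Optional[str]   # added by the researcher; left empty in this sketch
    time: Optional[str]   # None when the log gives no time stamp
    sender: str           # the "From" column
    receiver: str         # the "To" column
    message: str


def normalize(raw_rows, owner, partner):
    """Convert raw dyadic log entries (hypothetical dicts) into LogRow objects.

    Gtalk logs show the account owner as "me", so that label is replaced
    by the owner's name; the receiver is inferred as the other dyad member.
    """
    rows = []
    for number, raw in enumerate(raw_rows, start=1):
        sender = owner if raw["sender"] == "me" else raw["sender"]
        receiver = partner if sender == owner else owner
        rows.append(
            LogRow(number, None, raw.get("time"), sender, receiver, raw["text"])
        )
    return rows


# Invented sample entries, using the fictional participant names from Figures 5 and 6.
sample = [
    {"sender": "me", "time": "10:02", "text": "hi"},
    {"sender": "2007participant", "time": "10:02", "text": "hello"},
    {"sender": "me", "text": "lunch?"},  # entry without a time stamp
]
table = normalize(sample, owner="07researcher", partner="2007participant")
```

Keeping Time optional reflects the fact that some of the copied logs carry only an estimated session start rather than per-message time stamps.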

The Tool

IM is a tool for online communication that allows for interaction with people added to a contact list. In this list, it is also possible to set status messages to indicate availability for interaction. IM can be used both for text and audio or video interaction, but in the current data, the focus is on text-based conversations. Interaction is semi-synchronous, which means that most of the time participants are engaged in real-time interaction, but messages do not appear on the other participant’s screen until the person typing them hits the return key. This might result in additional messages appearing on screen while new contributions are in preparation. However, once someone has started typing, this is indicated, and once a new message has been received, an alert appears. The messages are persistent, in that they remain visible or scrollable in the IM window throughout the complete session. Further, the application does not use much computer capacity, and the window is small, which makes it possible to have other windows open in the background, supporting multitasking. With many IM clients the interface can be personalized, for example with regard to the types of alerts received.

The specific IM clients used in this study were MSN Messenger and Gtalk. One of the differences between these tools as employed here concerns the indications provided when the other participant is typing. MSN shares information that the other “is writing,” and if he or she takes a break in the middle of the message, this notification will disappear, making it difficult to distinguish between having taken a pause in typing and having erased the message completely. Gtalk provides information that the other “is typing,” and if the person is idle for a while, the message will change and instead read that the other “has entered text,” indicating that the other participant is still composing the message. Apart from these two main IM systems, one participant had installed Miranda on her computer, which allowed her to collect interaction from different IM channels in one interface.

Figures 5 and 6 are screen shots from MSN Messenger and Gtalk showing interaction between the fictional participants 07researcher and 2007participant from the screen of 07researcher.

Figure 5. Screen shot from MSN Messenger 7.5

Figure 6. Screen shot from Gtalk

Analytical Approach

One basic assumption in this study is that one can analyze meaning making by gaining access to the information that participants in interaction have at their disposal when making sense of each other’s utterances. This means that in this context it is not an absolute requirement to have access to recordings of the computer screen of the participants, as this information is not available to the other participant in interaction either, but instead it is possible to rely on log files only for analytical purposes. However, it is important to keep in mind that some potentially important cues are not included in the log files. For example, the time coding is not exact, and we do not get any information about when the other person starts typing. Both of these cues are available to the participants and are potentially important, but also potentially ambiguous. In practice, it is impossible to know whether the person who has started typing is producing a long reply to an earlier message or editing the message completely as new messages appear on screen. In addition, we have no way of knowing whether participants do manage to pay attention to information appearing on screen while preparing their own messages, other than by investigating their contributions to the subsequent interaction.

Data and Coding

For analytical purposes, the messages sent by each dyad were divided into separate sessions, and the borders between sessions were identified based on temporal and linguistic criteria. When there was an unusually long pause in the interaction, it was judged whether the message occurring after the pause could be a response to a preceding message, thus belonging to the previous session, or if it could be seen as the beginning of a new session. Sometimes session boundaries were made explicit by the participants themselves, as they would initiate new sessions with greetings. It should further be noted that some adjacent sessions might exhibit topical coherence, that is, they might deal with similar topics, yet they were treated as separate sessions partly because they did not show explicit links and partly because they were separated in time.
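The temporal half of this procedure can be sketched as a simple gap-based split. This is only an approximation under stated assumptions: the 10-minute threshold, the function name `candidate_sessions`, and the sample messages are hypothetical, and the linguistic judgment described above (whether a message after a pause answers an earlier one) would still have to be made by hand for each candidate boundary.

```python
from datetime import datetime, timedelta


def candidate_sessions(messages, max_gap=timedelta(minutes=10)):
    """Group (timestamp, sender, text) tuples into provisional sessions.

    A new provisional session starts whenever the pause since the previous
    message exceeds max_gap. The 10-minute default is an arbitrary placeholder,
    not a value taken from the study.
    """
    sessions = []
    for msg in messages:
        if sessions and msg[0] - sessions[-1][-1][0] <= max_gap:
            sessions[-1].append(msg)   # short pause: same provisional session
        else:
            sessions.append([msg])     # first message or long pause: new one
    return sessions


# Invented example data.
start = datetime(2007, 3, 1, 12, 0)
log = [
    (start, "A", "hi"),
    (start + timedelta(minutes=1), "B", "hello"),
    (start + timedelta(minutes=45), "A", "back again"),
]
sessions = candidate_sessions(log)
```

Each boundary proposed this way is only a candidate; a message following a long pause may still be coded as belonging to the previous session if it responds to a preceding message.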

In total, the participants sent 1,689 messages containing 7,474 words, divided into 120 sessions. The sessions were on average 6.45 minutes long (interquartile range – IQR: 0.40-7.25 minutes) and contained on average 14 messages (IQR: 4.0-15.25 messages). One hundred and two messages were excluded because they were in sections where code switching occurred; thus 1,587 messages were included in the analysis. Of these, 670 messages (42%) follow directly upon a message sent by the same person, and 802 messages (51%) follow directly upon a message sent by the other participant in interaction. The messages making up the remaining 7% are the first message in each of the 120 sessions; these were not coded in the current analysis.
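The positional breakdown above can be reproduced with a few lines of arithmetic. The sketch below first shows, on an invented sender sequence, how a message might be labeled according to who sent the preceding message, and then recomputes the reported shares from the counts given in the text. The function name and the labels are assumptions for illustration only.

```python
def positional_coding(senders):
    """Label each message in one session by who sent the message before it."""
    labels = []
    for i, sender in enumerate(senders):
        if i == 0:
            labels.append("session_first")
        elif sender == senders[i - 1]:
            labels.append("follows_own")
        else:
            labels.append("follows_other")
    return labels


# Invented four-message session between participants "A" and "B".
labels = positional_coding(["A", "A", "B", "A"])

# Counts reported in the text.
total_sent, excluded = 1689, 102
analyzed = total_sent - excluded                      # messages included in the analysis
follows_own, follows_other = 670, 802
remainder = analyzed - follows_own - follows_other    # messages in neither category

shares = {
    "follows_own": round(100 * follows_own / analyzed, 1),
    "follows_other": round(100 * follows_other / analyzed, 1),
    "remainder": round(100 * remainder / analyzed, 1),
}
```

With the figures given, the remainder computed this way is 115 messages, or about 7.2% of the 1,587 analyzed messages.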

Almost all messages are syntactically complete in their context of appearance, and thus they could constitute turns by themselves. However, many turns continue beyond one message, as participants often choose to expand on their own contributions. Here it should be noted that even though the current study concerns disrupted turn adjacency, the analysis presented here deals with disruption on a message-by-message basis through the reconstruction of conversational threads.

Threads were identified by investigating cohesive devices and adjacency pair structures, and each message was coded as having a clear connection with the immediately previous message, as connecting with an earlier message, or as being the first message of a new thread. Ambiguous cases were grouped together with those connecting with an earlier message. In an attempt to investigate whether disrupted turn adjacency leads to communicative breakdown in these data, it was decided to look further into how coherence is created and maintained in the sections where different threads are intertwined. Thus, those messages that were coded as returning to previous threads (including ambiguous cases) were chosen for further analysis. These messages were first categorized as either second pair parts in adjacency pairs, as other types of comments on a contribution made by the other participant, or as own continuations.

In addition, the cohesive strategies employed were identified. First, I investigated whether one or more of three types of explicit cohesive devices were used: substitution, repetition, or conjunction. If none of these explicit strategies was identified, I looked to see whether the utterance contained anaphoric reference or elliptical feedback, strategies that are more ambiguous since the links are made implicitly. Table 1 illustrates these five cohesive strategies with quotes from the data (intervening messages omitted).

Table 1. Cohesive devices in the IM log files


Reference in Intertwined Threads: Strategies for Coherence Maintenance

In total, 144 messages were coded as relating to a previous message other than the directly preceding one (both following own and following other), and in 126 of these cases (87.5%), disrupted turn adjacency does not seem to cause problems with referencing. Detailed examples of the strategies employed are provided below.

Of the messages belonging to intertwined threads where reference is clear, 81 point back to messages posted by the other person, and 32 of these (39.5%) were coded as the second pair part of an adjacency pair. In these cases, the sequential structure of interaction is in itself an important clue in coherence creation. Another common coherence creation strategy is lexical repetition. Of the messages in this category, 28 (34.6%) include lexical repetition.

The remaining 45 messages with clear coherence are own continuations. Of these, 8 (17.8%) begin with a linking expression, showing a clear connection between the current message and the preceding one by the same person. Another common strategy is lexical substitution; 10 of the messages in this category (22.2%) are linked through this cohesive device.
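The counts and percentages reported in this section fit together as follows. The snippet is only a bookkeeping aid that recomputes the shares from the figures stated above; the variable names are chosen for the example.

```python
def pct(part, whole):
    """Percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


intertwined = 144                          # messages tied to an earlier, non-adjacent message
clear = 126                                # of these, reference is unproblematic
not_clear = intertwined - clear            # the rest of the intertwined messages

to_other = 81                              # clear cases pointing back to the other person
own_continuations = clear - to_other       # clear cases continuing one's own message

breakdown = {
    "clear / intertwined": pct(clear, intertwined),
    "second pair parts / to_other": pct(32, to_other),
    "lexical repetition / to_other": pct(28, to_other),
    "linking expression / own": pct(8, own_continuations),
    "lexical substitution / own": pct(10, own_continuations),
}
```

Note that the two groups use different denominators: the adjacency pair and repetition shares are taken over the 81 messages addressed to the other participant, whereas the linking expression and substitution shares are taken over the 45 own continuations.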

Both in messages relating to earlier posts by the other participant and in own continuations we find examples of anaphoric reference and general acknowledgement statements. However, coherence is frequently clear despite the potential ambiguity of these implicit strategies, as there is often only one potential antecedent.

As we shall see, the timing of the messages is also crucial. Again, it is important to remember that the participants in interaction had access to information about the time at which the other person started typing, whereas the log files only provide information about when the message was sent and appeared on the screen of the other person.

Excerpt 1 serves to illustrate the strategies for coherence creation identified. The arrows are provided as a visual aid to indicate which messages are intertwined.

Excerpt 1

In Excerpt 1, Bea contacts Felicia to hear what her plans are for dinner. She initiates with a greeting and a question in line 1 and chooses to elaborate on her question in line 2, while presenting a second question in line 3. Felicia in line 4 returns to the first question posted by Bea in line 1 and asks for a clarification. The connection between these messages is clear because of lexical repetition (‘home’). This clarification request never receives any reply from Bea. Instead, shortly after, a new message from Bea appears where she elaborates on her own contribution in line 3. Here, coherence is secured through lexical substitution (‘u and him’ instead of ‘u guys’). In line 6, Felicia then provides a reply to Bea’s question.

Lines 11-14 constitute another example of intertwined threads, as Felicia’s comment in line 12 appears on screen before Bea’s own continuation in line 13 of her contribution in line 11. Here, the timing of Bea’s message in combination with her use of a conjunction (‘unless’) and a lexical substitution (‘wanna stay’ instead of ‘go out’) helps create clear links between Bea’s contributions in lines 11 and 13. In line 14, she returns to the question posted by Felicia in line 12 by providing a second pair part to a first part of an adjacency pair (‘yeah’), and elaborates on her answer in lines 15-16. Felicia then delivers feedback through an onomatopoetic expression in line 17, by typing out laughter.

Reference in Intertwined Threads: Potential Problems with Coherence Creation

Eighteen messages potentially cause referencing problems (12.5% of the intertwined messages), and these occur in interaction involving five different dyads. In all of these instances, the potentially problematic interaction relates to the use of anaphoric reference (7 messages) and elliptical feedback (11 messages) with two potential antecedents. There is no co-textual information, such as lexical repetition or substitution, linking the message to a specific thread.

Most of these 18 potentially problematic messages become less challenging when seen in their interactional context. Important aids are overt clarifications, sequencing, distinctions between different types of feedback, and timing. In order to illustrate these strategies, detailed descriptions of excerpts from the log files are provided in what follows. First, Excerpt 2 illustrates overt clarifications.

Excerpt 2

Excerpt 2 consists of data that were not automatically logged but were copied into a text file by one of the participants. Here, we unfortunately only have access to the estimated time when the session started. In this excerpt, we find four cases of messages returning to previous threads (lines 10, 11, 12 and 14) and two examples of anaphora with unclear antecedents (lines 9 and 21), in both of which the following message by the same person clarifies what was meant by the utterance. The first case of unclear antecedents occurs in line 9, where Bea’s reference to ‘them’ is unclear. Without getting any prompt from Ella, she tries to clarify in line 11 with a lexical substitution (‘the address’ instead of ‘them’) but, in line 14, she changes her mind and replaces ‘address’ with ‘advertisement.’ In lines 15 and 17, Ella asks for further clarification, and Bea delivers this in multiple messages linked through lexical repetition (‘i love the …’) and a conjunction in the final example (‘and’).

Intertwined with this thread is a thread where Ella illustrates her utterance in line 6 by copying and pasting the company slogan into her message in line 10. Bea at first does not grasp this link of lexical substitution, something which she expresses through a general clarification request with one potential antecedent in line 12 (‘what?1’), but then she indicates that she has understood after all in line 13 (‘ooooo’).

The second case of unclear anaphoric reference occurs in line 21, where Ella’s anaphoric comment including ‘that one’ could refer back to any of the three advertisements that Bea mentions in lines 18-20. A clarification follows in line 22, but instead of defining ‘that one’ that she thinks ‘was funny,’ she chooses to identify one of the advertisements that she did not find ‘sooo funny.’ This repair suggests that participants are orienting towards an adjacency principle where it is assumed that the pronoun “that” should refer back to the immediately preceding message (cf. Sacks, 1992). Since other messages appear in between the linked messages, additional information is needed in order to make the references clear.

Excerpts 3 and 4 illustrate the importance of sequential structure, a distinction between different feedback types, and timing to maintaining coherence during disrupted turn adjacency.

Excerpt 3

Excerpt 3 includes three cases of potentially unclear references. In the message in line 4, ‘she’ could refer to either ‘Beay X’ in line 3 or to ‘therese’ in line 1. However, from the local context it is possible to conclude that the message in line 4 is a logical continuation of Charles’ response to the message in line 1: The researcher analyzing the log files might think that they are terrorists. The fact that Aaron’s question in line 3 receives a reply from Charles in line 5 further supports this analysis.

The other two cases of potentially problematic references are found in the feedback messages in lines 6-7, where ‘sure’ and ‘aaah’ could refer to either of the statements in lines 4 and 5. However, the sequentiality in the references becomes clear when a second feedback message is posted in line 7, keeping the threads separated and replicating the sequential structure in lines 4 and 5. In addition, the quality of the feedback statements differs, in that ‘sure’ in line 6 seems to relate to given information and to something that needs an evaluative comment, whereas ‘aaah’ in line 7 indicates that this statement relates to something that can be treated either as given or new information. More specifically, it can be seen as a “change-of-state token” (Heritage, 1984) which is used to indicate that a change in the current state of knowledge has taken place.

Excerpt 4

In Excerpt 4, timing is crucial in the interpretation of the messages. Other factors that could help create coherence here are sequential structure and a distinction between given and new information. Leaving the photo sequence in lines 4-7 aside, this sequence deals with the release of new software.

In line 8, Dina returns to Bea’s statement in line 3, linking the two through anaphoric reference (‘it’) and lexical repetition, while also asking for a clarification. In line 9, she then elaborates on the topic of the software by telling Bea about a new specific feature. Looking at the message structure only, it might be difficult to know whether Bea’s ‘yes’ in line 10, followed by ‘i saw it today’ in line 11, refers to Dina’s question in line 8 or to her elaboration in line 9. The quality of the feedback does not give us any clue either, as the messages in lines 10 and 11 could either be the second part of an adjacency pair, responding to line 8, or an indication that the information presented in line 9 is already given. However, when we also consider the timing of the messages, we see that the contributions in lines 9 and 10 were posted at almost exactly the same time, so Bea is unlikely to have read line 9 before posting line 10. This would suggest that line 10 constitutes a second adjacency pair part, where the first pair part is found in line 8.

As in the previous example, other indications that this is the case can be found in the sequential structure, where Bea’s responses in lines 10-13 correspond to the sequencing of Dina’s question and statement in lines 8-9. Also, the fact that Bea expresses surprise in line 12 indicates that this is a response to new information, something that Dina provides in line 9.
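The timing cue used in this interpretation can be restated as a rough heuristic. The following Python sketch is purely illustrative and was not part of the analysis: the `Message` structure, the `plausible_antecedents` function, the posting times, and the five-second reading threshold are all hypothetical, and the message texts are placeholders rather than data from Excerpt 4. The idea is simply that a message posted almost simultaneously with a reply cannot be what the reply responds to.

```python
from dataclasses import dataclass


@dataclass
class Message:
    line: int        # line number in the excerpt
    sender: str
    seconds: float   # posting time in seconds (hypothetical values)
    text: str


def plausible_antecedents(reply, log, min_read_gap=5.0):
    """Return earlier messages from the other participant that the
    reply's author had time to read before posting the reply.

    min_read_gap is an assumed threshold, not an empirical value.
    """
    return [
        m for m in log
        if m.sender != reply.sender
        and reply.seconds - m.seconds >= min_read_gap
    ]


# Placeholder log loosely modelled on the structure described above.
log = [
    Message(8, "Dina", 100.0, "(question about the software)"),
    Message(9, "Dina", 130.0, "(elaboration on a new feature)"),
    Message(10, "Bea", 131.0, "yes"),
    Message(12, "Bea", 150.0, "(expression of surprise)"),
]

first = [m.line for m in plausible_antecedents(log[2], log)]
later = [m.line for m in plausible_antecedents(log[3], log)]
print(first)  # [8]
print(later)  # [8, 9]
```

Under these assumed values, timing alone singles out line 8 as the antecedent of line 10, whereas for the later message both candidates remain, and cues such as sequential structure and feedback type are needed to decide between them.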

Three of the remaining messages that were coded for potentially unclear references in intertwined threads are more problematic. Two of them occur in Excerpt 5.

Excerpt 5

In Excerpt 5, Ella is sending mp3 files to Bea; system-generated messages describe this process. In this excerpt it is unclear what Bea’s message in line 8 (‘i love it’) refers to. It could either be a reference to the song that she has received in a file from Ella (see line 4), or it could refer to the immediately preceding message about grilled chicken and potato salad. The fact that Bea in line 9 explicitly refers to potato salad, and that she begins her message with ‘haha’ might indicate that ‘i love it’ in line 8 refers to something other than the ‘potatosallad’ in line 7, but the relations remain uncertain.

The potentially problematic aspects continue, as Ella’s question in line 10 (‘which one?’) could either have its antecedent in line 8 (‘i love it’) or in line 9, where the same linguistic construction appears (‘the one’). Bea’s comment in line 12 might indicate that she has understood the question in line 10 as relating to the transferred files.

The final example of problematic reference in intertwined threads is found in Excerpt 6.

Excerpt 6

It is unclear whether Dina’s message in line 6 refers to the fact that Bea can be reached by telephone and MSN, or to ‘mei mei’ mentioned in line 5. The feedback provided in line 7 could be seen as part of a sequential repetition of the structure in the messages in lines 4 and 5, which would mean that line 6 refers back to line 4, and line 7 refers back to line 5. However, it could also be seen as a continuation of the feedback delivered in line 6, which would mean that both lines 6 and 7 would refer back to line 5. This potential miscommunication is not resolved in the subsequent interaction. However, the participants never indicate that they find this problematic, either.

Discussion: Intertwined Threads

The analysis presented above shows that disrupted turn adjacency most often does not lead to miscommunication in these data. In this sense (see also Lam & Mackiewicz, 2007), these data are not incoherent. Unproblematic referencing was achieved through sequential structure, where it can be assumed that, for example, a question will receive a reply (adjacency pairs). Other strategies employed in order to minimize the risk of incoherent IM include lexical repetition and substitution. One orthographic IM-specific coherence strategy reported in previous research is to end the first message in a linking sequence with multiple dots (“…”). This was seen on a few occasions in the data; however, it was not only used as a linking strategy, but also to indicate contemplation (Simpson, 2005).

All potential problems identified relate to cases of ambiguous anaphoric references and ambiguous feedback. Despite the apparent risks of using anaphoric reference and non-specific feedback, surprisingly many examples of this are found in the data. This indicates that the participants do not view intertwined threads as very problematic. In addition, on most occasions, contextual information can be used to increase coherence. Examples include repeated sequential structure, distinctions among different feedback types, and timing. In some cases, overt clarifications are also used to set things straight.

One strategy that might be used to avoid potential communicative breakdown would be to wait until the current “speaker” has finished his or her turn. About half of the time in the present data, messages do appear on screen in a manner resembling turn-taking in face-to-face conversations—that is, a message posted by one participant is followed by a message posted by the other. However, waiting for one message from the other participant does not equal waiting for a complete turn to be finished, as turns often extend over several messages (Markman, submitted). Participants in IM interaction seem to be able to handle this, and with the specific affordances of the system in mind, coherence appears to be created on a level above that of individual messages.

Additional Signs of Problematic Coherence Creation in Instant Messaging

Since referencing in sections with disrupted turn adjacency did not cause very many problems in the data, I further investigated whether there were other signs of problematic coherence creation in IM. I decided to focus on instances where participants themselves initiate repair work and thus explicitly indicate that they find the prior interaction problematic.

Participants seem aware that they are supposed to reply quickly to questions asked over IM, as we find examples of repair work in connection with what they perceive to be belated replies. For instance, in interaction between Charles and Aaron, Aaron excuses his belated reply by informing Charles that the teacher just came by. Similarly, in interaction between Bea and Dina, Dina explains her temporary absence from the interaction by stating that she did not see Bea’s message.

The final example, shown in Excerpt 7, is of a less serious nature.

Excerpt 7

Here, Aaron asks Felicia to check something with the others in the studio. Felicia’s response is not really delayed in terms of time, as Aaron is well aware that there will be some delay. Rather, this is an example of how orthographic means can be used to emphasize urgency, even though in this particular case it is not meant in a serious manner (which Aaron specifies in line 7 by including an emoticon).

Other examples of repair work occur in connection with unclear exophoric reference, where participants refer to people or objects outside the boundaries of the IM conversation and mistakenly take a shared common ground for granted. For example, on one occasion Charles refers to a Mr A., and Aaron asks for a clarification. On another occasion, Bea states ‘I cant do it!!!’ in a message to Dina. This is not commented on, other than by Bea herself, who in the next message writes: ‘is okie.’ In addition, two examples of unclear reference between sessions were identified, where the initiating participant seems to assume that the other person can identify the correct antecedent, although this proves difficult. In one example, Aaron talks about ‘he’ in the first message in a session, referring back to the previous session with Charles, and Charles asks for a clarification. In another example, Bea writes to Dina about ‘him’ referring to another IM conversation that Bea has just completed with Aaron, and even though Dina does not clearly indicate that she does not understand, Bea clarifies. In these cases, the problem relates to discrepancies between perceived and actual common ground between participants.

An additional six examples of miscommunication were found. One already mentioned example occurred in Excerpt 2, when Ella tries to clarify an earlier message by including a substitution in line 10, but Bea at first does not grasp what role this substitution plays in the interaction and exclaims ‘what?1’ in line 12. She then realizes what Ella has done and types ‘ooooo’ in line 13. Another misunderstanding occurs in interaction between Aaron and Charles, where Aaron asks Charles whether he wants to come early to the party, but instead of replying, Charles posts ‘ok’; ‘perfect’. Aaron indicates that he thinks that there has been a misunderstanding by suggesting a disconnection and by restating the question. This session appears to have taken place during multitasking, since the pauses between the messages are unusually long.

Examples of miscommunication are also found in interaction between Bea and Felicia, as shown in Excerpt 8.

Excerpt 8

Here, Bea and Felicia are talking about Felicia’s boyfriend going back home to Austria, and Bea, who has some of his things in her dorm room, wants to make sure he gets them before he leaves. In line 1, she informs Felicia what time she plans on leaving the school. Felicia then makes a joke in line 2 which Bea does not understand and so she exclaims ‘wat?!’ in line 3 and specifies her reaction in line 4 (‘austrialian?!’). Instead of clarifying here, in line 5 Felicia lets Bea know that the boyfriend will stop by.

In lines 6 and 7, Bea seems to return to her statement in line 1, elaborating on when she is going to leave. Felicia, in line 10, however, takes the statement that Bea sent in line 7 as a question relating to what time ‘he’, mentioned in line 5, will leave, as it appears as if she produces a second adjacency pair part. Before this, though, she tries to clarify in line 8 what she meant by her joke in line 2, since Bea has indicated she did not understand. However, Felicia’s clarification does not make any sense and rather causes further confusion. Bea indicates that she does not understand by again typing ‘wat?!’ in line 9, and then ‘errr…’ and ‘Okie’ in lines 11-12. Felicia then clarifies in lines 13-15, indicating that she made a typing error.

There are many more typing errors and cases of non-standard spelling in the data, some accompanied by repairs. The fact that most participants are not native English speakers might have some effect here. The current example stands out since the produced outcome in fact is another lexical item, changing the meaning of the message and causing much confusion. This suggests that participants in IM interaction are able to overlook typing errors as long as they do not interfere with the meaning of an utterance.

Interesting to note here is that Felicia puts her final clarification within parentheses. This strategy to indicate that an utterance is misplaced in the local context was reported by Woerner et al. (2006), and we find eight instances of parenthesis usage in the current data: Felicia uses this technique three times in her interaction with Bea, and Bea puts onomatopoetic expressions, such as ‘huuu,’ within parentheses on four occasions (none of which is coded as intertwined) and uses brackets to indicate a side comment on one occasion.

The interactions between Bea and Dina contain two examples of misunderstandings. One involves word confusion, and the other is shown in Excerpt 9.

Excerpt 9

In Excerpt 9, Bea sends an indication that a message has not been correctly understood in line 2. Dina needs further information on what Bea has problems understanding and posts a question mark in line 3. Here we see one of the affordances of text-based interaction for the repair of intersubjectivity. Dina does not need to repeat herself, since her previous message is still visible on the screen. Instead, it is enough for her to use punctuation only, illustrating the most extreme case of elliptical feedback. By posting a question mark she can indicate something like “What is it in my previous message that you have trouble understanding?” Instead of continuing to explain herself here, Bea concludes in line 4 by stating ‘is okie’.

Discussion: Explicit Signs of Problematic Coherence Creation

Sequences in which participants explicitly oriented to trouble in the interaction through repair work were taken as additional signs of incoherent interaction in the analyzed data.

The potential reasons for these incoherent sections relate to problems in negotiating common ground on different levels. For instance, participants were multitasking and did not know whether the other person was available for interaction, or they used unclear exophoric references when referring across sessions. Here, the affordance to participate in many semi-simultaneous IM conversations as well as to have time lapses between sessions may cause some problems. In addition, repair work at times can be cumbersome, since the affordances of the medium make it more difficult to reach consensus, and negotiation takes more effort than it would in a face-to-face situation. Moreover, we saw that typing errors can cause confusion.

Conclusion and Design Implications

The study presented in the current article focused on potential obstacles to coherence creation in IM interaction. Based on an analysis of log files from 120 IM conversations, I investigated whether disrupted turn adjacency caused problems with coherence, in the sense of misunderstandings or conversational breakdowns. Other signs of problematic coherence creation were also identified. The fact that the analysis was based on log files has some important implications. Whereas log files for the most part provide the analyst with the same information that participants in interaction have access to, it is important to remember that the files also lack information about some potentially valuable cues, such as exact timing and information about when the other participant starts typing. These cues could be important affordances in relation to turn adjacency, as they make it possible to frame the next utterance relative to both the starting point of the typing of the messages and the actual time when the messages appear on the screen. However, as previously mentioned, this type of information can also be ambiguous.

The results confirm that disrupted turn adjacency occurs, even in dyadic conversation (cf. Herring, 1999). Further, they show that such sequential disruptions do not necessarily result in misunderstandings or confusion (cf. Lapadat, 2007; Markman, submitted; Simpson, 2005). Here, it is important to remember that whereas previous research mainly focused on multiparty interaction, which is likely to be quite complex, the current analysis deals with dyadic interaction. In the dyadic IM interaction analyzed here, the visual and persistent affordances of text interaction together with cues provided by timing and information about when others are typing make it possible to link logically adjacent utterances that are typographically separated. Cohesive devices such as lexical repetition and lexical substitution, as well as linking expressions, provide additional important cues. However, we also find surprisingly many instances of general anaphoric reference and elliptical feedback with no clear co-textual links, suggesting that the interacting participants do not always view explicit co-textual cohesive devices as necessary in this context.

As in other types of conversational interaction, other strategies in addition to cohesive devices are important to maintain coherence in IM interaction. For example, sequencing appears to be a valuable tool, despite the occasional occurrence of disrupted turn adjacency. Sequencing is relevant on two levels. First, as all interaction depends on mutual contributions and recognitions, it is possible to detect logical links between typographically separated utterances through the identification of adjacency pairs. Second, a strategy more closely connected to the specific affordances of this text-based medium relates to the fact that it is possible to find repeated structures in the data where the sequencing of, for example, adjacent questions is replicated in the reply structure. The data included examples of how multiple replies with several potential antecedents made sense because of sequential structure and sometimes distinctions among different feedback types, as well as timing. This is in some ways in conflict with the view of Markman (submitted, p. 7), who in her findings concerning multiparty CMC reports on both of these strategies, yet claims that “[t]he inevitability of disrupted turn adjacency means that sequential organization cannot be relied upon to provide coherence in chat interaction […]”. In contrast, it could be argued that sequential structure is important also in CMC, albeit in a different way than in face-to-face interaction.

Furthermore, participants appear to be aware that turns take different shapes in this medium and that turn-taking works in relation to the specific affordances of the tool. This is in line with Simpson’s (2005) reasoning that background knowledge is an important factor in coherence maintenance in synchronous CMC. Unquestionably, intertwined threads did occur in the dyadic IM interaction analyzed here, as well as a few cases of potentially problematic reference. However, there were only a few instances where participants initiated repair work and explicitly treated such sequences as problematic. It is also quite possible that the students adapted their interactional behaviour with regard to the task at hand. Had they been involved in work-related problem-solving discussions, they might have felt the need for more overt grounding and repair strategies than was the case in the present study, where interaction was mainly of a social nature.

As disrupted turn adjacency was not very problematic in the data analyzed, other obstacles to coherence maintenance were also identified by investigating cases of explicit repair work. In the sequences identified, all problems related to reaching common ground, and specifically to determining others’ availability for interaction (relating to Woerner et al.’s “multitasking” obstacle), problems with referring across sessions, and problems with resolving cases of miscommunication in the textual medium (relating to Herring’s “lack of simultaneous feedback” obstacle).

The instances of miscommunication identified could sometimes be related to the design of the tools employed; thus one might consider how design improvements could result in more coherent interaction. Most IM tools have features that support coherence creation, such as presence indicators (including information that a user has been inactive for a while) and information that the other is typing. However, this is not always enough. One way to improve the tool to better support referencing in intertwined threads and negotiation of common ground would be to borrow ideas from tools for asynchronous communication. For example, it might be useful to give users the possibility to keep threads separate on the interface, by actively choosing to start a new thread separated either spatially or graphically. It might also be valuable to increase the possibilities to link between utterances, for example by adding features for simplified quoting. This could be used both within the same session and when referring across sessions. However, the impact of such features on synchronicity would have to be evaluated further. Another suggestion to increase the possibilities for multitasking would be to allow participants access to more detailed information about current activities, automatically collected by the computer, but where one could choose what type of information to include. Here, issues relating to privacy (personal integrity) would also need to be considered.
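As a rough illustration of the quoting suggestion, the following Python sketch shows one way explicit links between utterances could be represented and used to separate intertwined threads. It is a hypothetical sketch, not a description of any existing IM client: the `Message` structure, the `quotes` field, and the `group_into_threads` function are invented for illustration, and the rule that an unlinked message starts a new thread is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Message:
    msg_id: int
    sender: str
    text: str
    quotes: Optional[int] = None  # id of the quoted message, if any (hypothetical field)


def group_into_threads(messages):
    """Group message ids into threads by following explicit quote links.

    Simplifying assumption: a message without a quote link starts a new thread.
    """
    thread_of = {}  # maps msg_id to the index of its thread
    threads = []
    for m in messages:
        if m.quotes is not None and m.quotes in thread_of:
            idx = thread_of[m.quotes]
        else:
            idx = len(threads)
            threads.append([])
        thread_of[m.msg_id] = idx
        threads[idx].append(m.msg_id)
    return threads


# Two questions posted in a row, followed by two replies that quote them.
session = [
    Message(1, "A", "(first question)"),
    Message(2, "A", "(second question)"),
    Message(3, "B", "(reply)", quotes=1),
    Message(4, "B", "(reply)", quotes=2),
]

threads = group_into_threads(session)
print(threads)  # [[1, 3], [2, 4]]
```

With such links, an interface could display each thread separately, either spatially or graphically. Whether the extra effort of quoting would slow down interaction is exactly the kind of effect on synchronicity that would need to be evaluated.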


Baron, N. (2008). Adjusting the volume: Technology and multitasking in discourse control. In J. E. Katz (Ed.), Handbook of mobile communication studies (pp. 177-194). Cambridge, MA: MIT Press.

Erickson, T., Herring, S., & Sack, W. (2002). Discourse architectures: Designing and visualizing computer-mediated conversation. Proceedings of ACM CHI 2002. Retrieved July 6, 2009 from http://hybrid.ucsc.edu/SocialComputingLab/Publications/erickson-herring-sack-chi01.pdf

Gaver, W. W. (1991). Technology affordances. Proceedings of the SIGCHI Conference on Human Factors in Computer Systems (CHI’91). Retrieved July 6, 2009 from http://www.goldsmiths.ac.uk/interaction/pdfs/04gaver.technologyAffordances.chi91.pdf

Gaver, W. W. (1996). Affordances for interaction: The social is material for design. Ecological Psychology, 8(2), 111-129. Retrieved July 6, 2009 from http://www.goldsmiths.ac.uk/interaction/pdfs/17gaver.socialAffs.96.pdf

Gibson, J. J. (1977). The theory of affordances. In R. E. Shaw & J. Bransford (Eds.), Perceiving, acting, and knowing (pp. 67-82). Hillsdale, NJ: Lawrence Erlbaum Associates.

Gibson, J. J. (1979). The ecological approach to visual perception. Boston: Houghton Mifflin.

Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English. London: Longman.

Heritage, J. (1984). A change-of-state token and aspects of its sequential placement. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action (pp. 299-345). Cambridge: Cambridge University Press.

Herring, S. C. (1999). Interactional coherence in CMC. Journal of Computer-Mediated Communication, 4(4). Retrieved July 6, 2009 from http://jcmc.indiana.edu/vol4/issue4/herring.html

Hutchby, I. (2001). Conversation and technology. From the telephone to the Internet. Cambridge, UK: Polity Press.

Lam, C., & Mackiewicz, J. (2007). A case study of coherence in workplace instant messaging. In Proceedings of the International Professional Communication Conference (IPCC) 2007. Retrieved July 6, 2009 from http://www.iit.edu/~lamchri/documents/IPCC_2007-jm-7-24-07.pdf

Lapadat, J. C. (2007). Discourse devices used to establish community, increase coherence, and negotiate agreement in an online university course. Journal of Distance Education, 21(3), 59-92. Retrieved July 6, 2009 from http://www.jofde.ca/index.php/jde/article/viewDownloadInterstitial/32/5

Markman, K. M. (submitted). Conversational coherence in computer-mediated team meetings. Submitted to S. C. Herring, D. Stein, & T. Virtanen (Eds.), The handbook of the pragmatics of computer-mediated communication. Berlin: Mouton.

Nardi, B., Whittaker, S., & Bradner, E. (2000). Interaction and outeraction: Instant Messaging in action. In Proceedings of the 2000 ACM conference on Computer Supported Cooperative Work. Retrieved July 6, 2009 from http://dis.shef.ac.uk/stevewhittaker/outeraction_cscw2000.pdf

Norman, D. A. (1988). The design of everyday things. New York: Doubleday.

Norman, D. A. (1999). Affordances, conventions, and design. Interactions, 6(3), 38-41.

Örnberg Berglund, T. (Forthcoming). Coherent conversation initiation in multiplex communicative ecologies. Submitted.

Sacks, H. (1992). Lectures on conversation. Ed. G. Jefferson. Oxford: Blackwell Publishers.

Schegloff, E. A. (1968). Sequencing in conversational openings. American Anthropologist, 70, 1075-1095.

Schegloff, E. A. (1990). On the organization of sequences as a source of "coherence" in talk-in-interaction. In B. Dorval (Ed.), Conversational organization and its development (pp. 51-77). Norwood, NJ: Ablex Publishing.

Simpson, J. (2005). Meaning-making online: Discourse and CMC in a language learning community. In A. Méndez-Vilas, B. González-Pereira, J. Mesa González, & J. A. Mesa González (Eds.), Recent research developments in learning technologies. Badajoz, Spain: FORMATEX. Retrieved July 6, 2009 from http://www.formatex.org/micte2005/36.pdf

Tanskanen, S. (2006). Collaborating towards coherence. Lexical cohesion in English discourse. Amsterdam/Philadelphia: John Benjamins Publishing Company.

Medierådet. (2008). Ungar & Medier 2008 (Art. nr. 531). Retrieved July 6, 2009 from http://www.mediaradet.se/upload/Rapporter_pdf/Ungar_&_Medier_2008.pdf

Woerner, S. L., Yates, J., & Orlikowski, W. J. (2007). Conversational coherence in Instant Messaging and getting work done. In Proceedings of the 40th Annual Hawaii International Conference on System Sciences. Retrieved July 6, 2009 from http://hdl.handle.net/1721.1/37284

Biographical Note

Therese Örnberg Berglund [terese.ornberg@humlab.umu.se] is a Ph.D. candidate at Umeå University in northern Sweden, affiliated with the Department of Language Studies and HUMlab. Her research interests concern contextual influences on strategies of coherence maintenance in online and mixed-mode contexts. She currently also works at Linköping University, coordinating activities to encourage exploration of digital technologies in teaching and research.


Any party may pass on this Work by electronic means and make it available for download under the terms and conditions of the Digital Peer Publishing License. The text of the license may be accessed and retrieved at http://www.dipp.nrw.de/lizenzen/dppl/dppl/DPPL_v2_en_06-2004.html.