Attending Multi-Party Videoconference Meetings: The Initial Problem
Volume 13 (2016)



In computer-mediated communication (CMC) and video-mediated communication (VMC), conveying availability at the beginning of an encounter may become an intricate interactional task to be solved jointly by participants. Based on the analysis of 18 openings in multi-party videoconference meetings in Spanish, this article addresses how participants initiate interaction without a moderator. Taking Conversation Analysis (CA) as a methodological point of departure, the study shows that being present online does not necessarily indicate being available; even so, availability is one of the first interactional issues to be addressed at the beginning of an online interaction. The analysis illustrates how participants use different resources in order to display availability progressively during the three phases of the pre-meeting, and how interaction unfolds until mutual availability is established. The findings lead to a discussion of the necessity for expanded notions like availability in CMC and VMC settings, in which the initial interactional problem is solved gradually.


Over the last decade, new practices of interaction have emerged in multimodal computer-mediated communication (CMC), including videoconferences, which have become an increasingly prevalent social activity in many settings. In institutional settings it is particularly common to meet online with partners or collaborators located in various places around the world. In educational settings, teacher-student and student-only videoconferencing is an increasingly well-established practice (e.g., Blake, 2005; Martin, 2005; Örnberg Berglund, 2009). This use of multimodal synchronous environments for daily communication has led in recent years to the description of new linguistic and social patterns of communication (e.g., Licoppe, 2012a, 2012b; Messina Dahlberg & Bagga-Gupta, 2013; Sindoni, 2014).

Yet as has been pointed out, there is still a need for empirical studies centered on interaction in synchronous environments (Hampel & Stickler, 2012; Jenks & Firth, 2013), particularly in environments using voice and video modes (Herring, 2010; Sindoni, 2014). The aim of this study is to examine how participants solve the problem of establishing mutual availability in multi-party videoconference encounters in Spanish that take place in Adobe Connect 7.0, multimodal desktop videoconferencing software that allows for multiple participation in three different channels simultaneously: video, voice, and text-based chat.

Based on 18 screen-recorded online encounters without a moderator, involving three or four participants each, this study contributes to the growing body of literature focused on multimodal CMC by demonstrating that opening a videoconference is a jointly negotiated and complex interactional activity that lasts a considerable amount of time, partly due to the intricacies of indicating availability in a setting that delivers CMC through three different channels. The study shows that the establishment of mutual availability becomes an interactional issue solved by online participants during the pre-meeting, and it illustrates how the accomplishment of such social action unfolds in each of the three phases of the pre-meeting, namely, the technological opening, the interactional opening, and the audiovisual opening.

As indicated in previous CMC research, individuals adapt to the medium in terms of interaction management (e.g., Anderson et al., 2010; Herring, 2014; Hutchby, 2001; Jenks & Brandt, 2013). This observation raises the question: How does the multi-party dimension affect the beginning of a multimodal interaction?

In Adobe Connect 7.0, as well as in other multimodal settings, being online does not necessarily mean being engaged in the conversation (Mondada, 2010). Indeed, in the encounters analyzed here, being available is a precondition for initiating interaction and the subsequent meeting. However, before the meeting starts, prospective participants must show that they are available to carry out audiovisual interaction. This article illustrates how different resources allow participants to display availability, showing that in a multimodal videoconference setting, the primary interactional task of solving the problem of availability is jointly accomplished during the successive openings of the different modes.

The Problem of Availability in Mediated Interaction

Every form of conversational exchange must be initiated. Conversations and meetings – mediated or unmediated, formal or informal – have procedures for initiation. As Sidnell (2010) explains, “the coordinated entry of parties into a conversation is a fundamental and utterly generic problem of social interaction” (p. 201), and in mediated settings these issues are more visible and possible to account for. In this regard, Schegloff’s contribution has been crucial, because he has described extensively how the problem of availability is solved through the summons-answer (SA) sequence (Schegloff, 1968, 2002, 2007). As he posited in his early work, in a telephone conversation the ring of the phone constitutes the summons, while answering the phone constitutes the second part of the summons-answer sequence, i.e., the answer (Schegloff, 1968), as is illustrated in the following example:

Today, the standardization of the use of new technologies in social settings, together with a continuing interest in opening sequences, has led to a larger strand of research on this issue and a growing body of literature describing various phenomena and practices in different technology-mediated environments (e.g., Jenks & Brandt, 2013; Markman, 2009; Mondada, 2010; Veyrier, 2012). Accounting for new orders of mediated interaction has led to challenging discussions about notions like summons, presence, and co-presence in relation to new mediated settings, which in turn has led to the development of new frameworks and approaches drawn from empirical work (e.g., Licoppe, 2012a, 2012b; Mondada, 2010; Rettie, 2004). Hence, the standard dyadic SA sequence proposed by Schegloff has been in some sense ‘surpassed’ by new accounts of settings of technology-mediated interaction in which participants are not physically co-present.

Telephone Openings and the SA Sequence

Schegloff (2007) defines the SA sequence as a generic “pre-sequence which is not directed to any sequence type, but rather is aimed at a feature generically relevant to the efficacy of talk-in-interaction” (p. 48). Research on this phenomenon began with telephone interaction, in which the performance of an SA sequence involves only two parties (Schegloff, 1968, 1979). Schegloff demonstrated that a summons is an attention-getting device that “is understood to serve to mobilize its addressee’s attention and provide for the addressee’s alignment as a recipient,” and for which “use is warranted only when the attention and aligned recipiency of the target are in question, or appeared to be impaired or attenuated – typically by involvement in some competing activity or activities” (Schegloff, 2002, p. 294). It can be argued that SA “sequences establish a framework of participation, a very basic kind of alignment, between the parties” (Sidnell, 2010, p. 202).

Two different types of SA sequences can be distinguished in telephone conversation: first, those produced in the very beginning of the interaction, when the summoner has no certain knowledge of the availability of the answerer; and second, those which appear during the ongoing interaction, where the summons seems to be mostly intended to confirm that the audio channel is functioning properly (Schegloff, 1968, p. 1077).

Most relevant to this work is the fact that SA sequences are germane to the problems of both availability and coordinated entry into a conversational exchange. Furthermore, Schegloff (1968) established a set of properties and interesting features of SA sequences. In particular, by virtue of their property of ‘non-terminality’ and the fact that “SA sequences align the roles of speaker and hearer providing a summoner with evidence of the availability or unavailability of a hearer, and a prospective hearer with notice of a prospective speaker” (p. 1093), SA sequences in telephone openings have constituted the standard pre-sequence through which solutions to both of these problems are simultaneously provided. In other words, through the accomplishment of an SA sequence, each conversational party demonstrates its availability and coordinates with the other – coordination that may then be sustained for further conversation. Therefore, one might claim that while the completion of an SA sequence establishes the mutual availability of the parties and allows the activity to continue, failure to complete the sequence establishes or claims unavailability.

The Diversity of (Pre-)Openings in Synchronous CMC

In synchronous CMC, interactional openings in various media have attracted the attention of researchers for years (e.g., Markman, 2009; Rintel, Mulholland, & Pittam, 2001). Particularly in the last decade, many studies have contributed to filling the empirical gap on this issue, and extensive research centered on different environments has been conducted. Much of this research has focused on (pre-)openings and has demonstrated that preparation for the main activity is determined by the affordances and constraints of the medium (Hutchby, 2001).

Rintel et al. (2001), for example, explored the openings of Internet Relay Chat (IRC) interactions, using Schegloff’s notion of summons. The authors show that turn coordination between participants who have newly joined a channel may be ambiguous, and as a result, “a massive number of opening utterances can be found” (Rintel et al., 2001, p. 7). An interesting result of their research is the evidence that the openings of IRC interactions consist of different phases which ascribe a progressive character to the openings. Another relevant observation in their study is that some electronic resources, such as the “automated joining event,” i.e., the display of a newly-joined user, which is produced as a result of entering a chat, may be considered “opening artifacts” within the progression of the whole opening (Rintel et al., 2001, pp. 13-26). In the same vein, Markman (2009) studied the openings and closings of virtual team meetings in quasi-synchronous chat (QSC). Her study confirms that starting informally structured meetings is a complex and difficult interactional activity.

The work of Jenks and Firth (2013) and of Jenks and Brandt (2013) on audio Skypecasts has shown that the multi-party character of the medium may also result in complex long opening sequences, indicating that offering a detailed description of pre-meetings in multimodal settings would not only be worthwhile but also necessary. Further, Jenks and Brandt (2013) demonstrate that even if the participants in a multi-party Skypecast have access to the list of logged-in participants, speakers usually need a verbal confirmation in order to be sure who is actively present. The authors point to the lack of a video display as a potential cause for this necessity, and in their analysis they consider Schegloff’s notion of summons, using the term electronic summons to refer to the summons produced on Skypecasts.

Regarding opening exchanges in online video conversations, Liddicoat (2011) suggests that “the opening spoken exchanges in the online conversational data are characterized by very little interactional work before moving to the first topic” and that “the first turns are oriented to the technological interface rather than to personal interaction” (Liddicoat, 2011, p. 53).

In his doctoral dissertation, Veyrier (2012) carries out a detailed analysis of the interactional practices conducted within the pre-openings of multi-party web-based conference meetings held in a corporate setting. Focusing on the arrangements made before opening the meeting (pre-réunion), Veyrier identifies several sequential phases that constitute the transition to the meeting itself (entrée en réunion). A relevant methodological point in his study, as in Rintel et al. (2001), is the application of a model based on entries for analyzing the transition through the three sequential phases. In his data, however, no summoning activity is observed. Veyrier’s analysis shows how these entries are coordinated by a moderator, who is responsible for opening the web-based conference room and starting the meeting. The problem of availability, however, seems to play a minor role in Veyrier’s work, which makes sense because the moderator is both the gatekeeper and the initiator of the interaction.

Regardless of the theoretical framework and methodology adopted, most of the researchers who take an empirical stance as a point of departure agree that multimodality in CMC adds complexity to the interactional work carried out by participants. This added complexity may be revealed in the initiation of an online interaction (Mondada, 2010; Satar, 2013). Although the body of literature centered on multimodal CMC is growing considerably, research on the interaction produced on multimodal platforms where communication can be delivered through video remains scarce (Herring, 2010; Sindoni, 2013, 2014). Lorenza Mondada is one of the researchers who have investigated such communication from a multimodal perspective.

In examining the potential of video data, Mondada (2008) points to the pre-beginning as one of the “three key sequential positions within the overall structure” of a call (Mondada, 2008, p. 36). In a later study, Mondada (2010) presents a fruitful and interesting examination of the pre-openings of videoconference encounters, identifying three endogenous categories. She differentiates the pre-opening and the initiation from the general opening and discusses notions such as co-presence, contact, and involvement (Mondada, 2010, p. 277).1 Her analysis reveals striking differences between videoconference openings and previous descriptions of the sequential organization proposed in telephone interaction and mobile telephone interaction, thus contributing to the understanding of these “new” forms of communication.2

Data and Method

The data set comprises approximately four hours of online interaction drawn from 18 screen-recorded, multi-party meetings in Spanish among university students studying at a distance.

The meetings were held in Adobe Connect 7.0, a desktop videoconferencing platform that allows participants to interact simultaneously through voice, video, and text-based chat. The purpose of the encounters, which took place over a time period of five months (January 2012–May 2012), was to facilitate oral interaction among the students, since they were following the same Spanish course but were not physically co-present during the course.

Although screen recording was a normal practice in the course, students were informed about the collection of data for academic purposes before it started, and only those students who signed a written consent form participated in the research project.

In total, 28 participants aged 19 to 65 (9 men and 19 women) were involved in the study. Of those, 9 participants are native Spanish speakers (NS), while 19 are non-native Spanish speakers (NN).3

Distributed in seven groups of four students each, the participants scheduled their respective meetings in advance, and one room in Adobe Connect was assigned to each group. No teacher was present during the online encounters, but before each meeting the participants received some instructions related to the contents of the course. The length of the meetings and the agenda were not pre-established.

Some Methodological Considerations

In this study, the primary methodological approach to the data is Conversation Analysis (CA), a methodology that developed from sociological origins and in which social action is seen as central to the organization of talk-in-interaction. CA research examines the communication processes that make human interaction possible; this includes the analysis of diverse forms of talk as well as visible conduct in numerous social settings (Bolden, in press, p. 1).

There are several factors that indicate the suitability of CA methodology for this study. The first is the fact that the CA empirical approach usually focuses on authentic interactions, which is a suitable description of the type of data collected. The second is the attested fact that even textual interactions in computer-mediated communication are experienced by users as ‘conversations’ (Herring, 2010). Moreover, CA has become a key methodological approach within CMC research (Giles et al., 2015), even if working with online multimodal interaction data may entail new methodological challenges.

In order for the description of the analysis that follows to be understandable, some remarks must be made about the technological affordances and constraints of the medium – that is, the specific technology in which the opening activity occurs. Adobe Connect 7.0 is a videoconferencing software program that allows participants located in distant places to communicate synchronously and simultaneously through three different channels, so they can see each other, talk to each other, and use text-based chat. In Adobe Connect 7.0, participants’ names are automatically displayed on the screen when they enter the room, and they thereby become visible to the co-participants. Moreover, once a participant enters the room, he or she is automatically enabled to type in the text-chat interface and to hear and see what others are typing or have typed. In contrast, in order to speak or activate their webcam, online participants must first click the appropriate “talk” and “video” icons in the interface. When this happens, a microphone symbol appears next to the speaker’s name on the participant list. Additionally, Adobe Connect allows more than one participant to speak simultaneously. Participants’ broadcasting appears in three or four central windows, which may be movable, i.e., participants’ position in the interface may change once a participant deactivates the webcam or leaves the videoconference room. A screenshot of the Adobe Connect interface is provided in Figure 1.

Figure 1. Multi-party conversation in Adobe Connect 7.0

Another important methodological consideration is that although the meetings have been transcribed primarily according to the Jeffersonian system as used in Hepburn and Bolden (2013), some other conventions have been added in order to accurately render the dynamics of encounters in Adobe Connect 7.0, in which online talk-in-interaction is considered as a whole comprised of video, talk, and text-based chat. Hence, I have consistently distinguished between the interaction taking place within the different modes (i.e., text-based chat and audiovisual interaction), and also between specific electronic actions such as entering the room or activating the webcam. These actions are relevant to participants, in the sense that they constitute resources in their own right and can receive other courses of action as a response (see the appendix for transcript notations).

Finally, it is important to keep in mind that because the data were recorded through the Adobe Connect 7.0 recording tool, there is no full access to the activities happening in the participants’ physical workspaces, but only to their online activity; this is also the extent of what the other participants can see on their computer screens. Gestural and other physical behaviors are not illustrated in the screenshots in the transcriptions in order to preserve the anonymity of the students. However, analysis procedures have taken into account this kind of physical behavior; hence in those cases where gestures or other physical moves are relevant to the analysis, they have been incorporated into the transcript according to the Jeffersonian system mentioned above.


Three core phases were identified within the overall structure of the beginnings of the multi-party videoconference encounters. I refer to them as technological openings, interactional openings, and audiovisual openings. In each phase, before the meeting starts, participants conduct specific interactional work, often moving progressively from entering the room to the initiation of the audiovisual interaction. Specific key interactional resources serve participants in shifting from being offline to initiating the audiovisual interaction. Through these successive shifts, online participants orient to the beginning of the meeting, gradually displaying their availability for further talk-in-interaction. The interactional work and practices carried out within each of these sequences are illustrated in the following subsections with representative excerpts from the collected data, focusing on the interactional meaning of each distinct and recognizable action within the online talk-in-interaction.

The Technological Opening: Entering the Room and Claiming Attendance

Entering the videoconference constitutes the first, and a necessary, step toward solving the problem of availability and initiating interaction in Adobe Connect 7.0. As in other technology-mediated contexts, the chat room is opened by a host or moderator (Veyrier, 2012); in educational settings, this host is likely to be the teacher.4 From that moment, participants may enter the room individually; this commonly occurs some minutes before the scheduled time for the meeting. These minutes leading up to the scheduled start time may not be filled with interaction; rather, after having entered the room, during the technological opening, online participants often remain silent and even invisible for a long period of time. Indeed, before a participant initiates an interaction, and especially if participants’ webcams are not active, it may not be possible to know whether the co-participants are sitting in front of their computers. In this sense, it can be argued that entering the room becomes a key interactional action before the meeting starts, but one that is not necessarily understood by participants as claiming attendance or displaying availability.5 Additionally, it is also clear that participants usually enter the chat room in advance of the scheduled start time in order to adjust their technical settings individually. This behavior is illustrated in Excerpt (1), which starts with Karl (NN) entering the chat room some minutes in advance of the scheduled time for the meeting. Even though he is not available for initiating the meeting, Karl activates his webcam before carrying out some kind of individual preparation related to the technology.

In (1), after activating his webcam (line 2), Karl performs several tests in order to confirm that his microphone is functioning (lines 3–4), seated in front of the computer with his webcam on. After some time, Irene enters the room (line 6); subsequently she activates her webcam (line 8), and initiates a greeting sequence in the chat interface (line 10). However, although Karl is online, he does not seem to be available to initiate the interaction yet, as seen in line 12, where he replies via chat to Irene’s greeting that was extended almost a minute earlier. Moreover, the Spanish ah in this context is comparable to the common English change-of-state token oh described in Heritage (1984). By using ah, Karl displays late recognition of Irene’s greeting, which in turn may indicate that before that moment he was not paying attention to the prospective online interaction. The fact that Karl is not available for interaction becomes more evident in lines 13–15, in which, after several minutes of silence, he removes his headphones and steps away from the computer.6

Entering the room does not necessarily mean that a participant will attend the meeting either, even though this is often treated as a recognizable action for claiming attendance. Even so, meeting attendance may be subsequently displayed through other actions such as greetings in the chat, as is the case in Excerpt (1). Further, participants can be online without being available for initiating interaction, and this lack of availability can be conveyed through various cues, both verbal and non-verbal. This observation is illustrated in Excerpt (2), when Joseph (a NN student not belonging to that group) enters the room (line 4) while the official participants are discussing their courses.

After Joseph enters the room, his classmates Miriam, Rakel, and Lili (NNs) continue their discussion about the difficulty of one of their Spanish courses without showing any sign of attention to Joseph. However, he stays online and even logs in again (line 19), so that his name appears twice on the participant list display (Joseph and Joseph2). The fact that Joseph logs in to the chat room twice may be oriented to getting the other participants’ attention, although this action does not disrupt the discussion, and his co-participants continue their conversation without addressing Joseph at all. The fact that the three official participants noticed Joseph’s online co-presence is evidenced in lines 57–62, after he leaves the chat room. Despite their awareness that a classmate is online – that is, despite their recognition that someone has entered the room – the participants do not assume that someone who enters the room plans on joining the meeting.

The Interactional Opening: Displaying Attendance and Availability

In the meetings that were analyzed, online participants displayed their attendance through various actions, including typing in the chat window or activating their webcams; these actions may also signify a certain availability for interaction. However, participants’ availability may vary during a given encounter, especially during the pre-meeting. In many cases, participants coordinate jointly to achieve complete availability for interacting through the various modes. However, most of the encounters are initiated via chat. One reason may be the affordances of the medium itself – that is, in order to engage in text-based chat, participants do not need to take any additional action such as activating a webcam or microphone, and the text-based chat also remains visible on the computer screen. So even if an online co-participant is not available to initiate the interaction at a certain moment, that participant can see the comment later and may reply to it at that time.

The analysis of the data reveals that interaction is most frequently initiated through the chat interface. In other words, participants typically use the chat interface for conducting the initial greeting sequences through which they confirm their mutual orientation. In turn, through the greeting sequence online participants confirm their attendance and initiate online talk-in-interaction. Typing in the chat is understood as a recognizable action conveying a certain degree of availability – that is, displaying that the participant is at least available for interacting via chat.

Excerpt (3) illustrates the beginning of an interactional opening in which Kian (NN) and Marta (NS) display attendance and availability via chat during the pre-meeting.

In Excerpt (3), Marta greets her co-participant and asserts the time that the meeting is scheduled to begin (lines 4–6). Kian replies to her greeting, after which Marta continues with the conversation through the chat window, implementing some greeting routines. However, despite their relative involvement in the already-initiated interaction, neither of the online participants activates their webcam or initiates vocal interaction.

As seen in Excerpt (3), interaction is usually initiated through greeting sequences in the chat. Greeting tokens such as hola may constitute verbal cues for displaying attendance at the meeting as well as availability for communicating via chat.

In this sense, as Schegloff (2012) noted while describing the functioning of SA sequences in dyadic telephone conversation, the first greeting token in this kind of encounter may function as a summoning device. As in the case of the telephone summons, because Kian’s attention is in question, the greeting token in line 4 seems to be understood as a device for getting Kian’s attention. In line 8, the addressee aligns as a recipient.

The Audiovisual Opening: Displaying Availability for Audiovisual Interaction

Availability for interacting via chat does not necessarily imply availability for audiovisual interaction. Indeed, availability for audiovisual interaction can only be achieved once participants become reciprocally visible and audible to their co-participants.

As previously discussed, there are several ways to display availability, such as activating the webcam, or activating the microphone and speaking. Nevertheless, the interactional meanings of activating the webcam and of speaking (which implies having activated the microphone) seem to differ considerably.7 This difference is usually shown through participants’ interactional behavior, which in turn may result in sequential implications. In other words, even if activating the webcam and performing spoken interaction are both considered preconditions for fulfilling availability, in terms of doing, there is an important accountable difference worth explaining: While activating the webcam can only be understood as a device for displaying availability for audiovisual interaction, this availability must be confirmed through the successful performance of speaking.

In the vast majority of the encounters, activating the webcam precedes the initiation of the spoken interaction. The activation of the webcam then becomes a relevant electronic action through which online participants orient to the beginning of audiovisual interaction. Excerpt (4) illustrates how availability is gradually achieved through different types of involvement and degrees of contact (Mondada, 2010) in a multi-party encounter.

In Excerpt (4), Karla’s webcam is already active when she initiates a greeting sequence in the chat (lines 8–17). By making her image visible to her co-participants, Karla displays her orientation toward starting the spoken interaction. Her availability to talk is subtly confirmed in line 13, when Karla produces an audible ‘uhm’, confirming that her microphone is open as well. However, the other participants have not activated their webcams yet, and the initiation of the spoken interaction is delayed until line 20, after Anna activates her webcam in line 18. It is precisely the electronic action of activating Anna’s webcam that seems to trigger the initiation of the spoken interaction, performed by Karla with a question addressed only to Anna (line 20). What Karla seeks by using the singular verbal form escuchás is a confirmation from Anna that she can hear her. Anna’s subsequent confirmation is typed in the chat (line 25). It is interesting that Anna, even though she had activated her webcam already, types in the chat when providing her answer to Karla’s question. Through the selection of this mode Anna indicates that she is not ready to talk yet, while it gives her the opportunity to confirm that Karla’s microphone is functioning properly. Indeed, establishing mutual availability within the audiovisual opening is also linked to the checking of the audio and video channels, which may be considered a characteristic sequence in videoconferencing (Santos Muñoz, in press).

Similar interactional behavior is shown in Excerpt (5), where the problem of availability is jointly solved again when the spoken interaction is initiated. This occurs only after two online participants have activated their respective webcams.

In Excerpt (5) Samuel connects his webcam (line 5) before launching a greeting that he explicitly addresses to all online participants (line 9). In line 11, Magnus activates his webcam and replies to the greeting via chat (line 13), thus confirming his attendance at the meeting. As in Excerpt (4), activating the webcam is understood as a device for displaying availability for audiovisual interaction. This action triggers, in turn, the initiation of spoken interaction, as shown in line 14, when Samuel initiates a new sequence particularly addressed to Magnus, in order to verify that the audiovisual interface is working. By uttering hola Magnus (hello Magnus), Samuel confirms that he can see Magnus’s image; his question, can you hear me? is a request for confirmation that the audio is functioning as well. Magnus’s availability is confirmed in line 16, when he aligns with Samuel by replying vocally with a new greeting token, through which they achieve mutual availability.8

As shown in Excerpt (4) and Excerpt (5), online participants do not usually initiate the spoken interaction until at least two participants have activated their respective webcams. Among the 18 openings I analyzed, activating one’s webcam seems to be the first step in displaying availability for audiovisual interaction. The recognizable electronic action of activating the webcam may thus constitute a resource for claiming availability for further audiovisual interaction, and one might argue that this action makes initiating spoken interaction relevant. In turn, however, carrying out spoken interaction is how participants achieve mutual availability, confirming that they are available to carry out audiovisual interaction.

Given that talk-in-interaction (mediated or unmediated) is constructed jointly, solving the problem of availability at the beginning of multi-party videoconference encounters constitutes an interactional task involving at least two participants. When two or more participants have demonstrated the same ‘available status’ at a certain moment of the ongoing talk-in-interaction, they modally align with each other. I refer to this as modal alignment, because it implies that co-participants’ attention and aligned recipiency are attainable through the same mode(s) (Santos Muñoz, 2015). Modal alignment differs from cross-modal exchanges, also characteristic of VMC, in which the production channel is different from the interlocutors’ preferred feedback channel (Rosenbaun et al., 2016, p. 29).

Discussion

As seen in the foregoing analysis, starting a multi-party meeting without a moderator in a videoconference setting entails intricate joint interactional work in which the first problem to be solved by participants is the problem of availability. Because in Adobe Connect 7.0 being online does not necessarily indicate availability for interaction, and no “electronic summons” can be performed, online participants may need some interactional resources to serve as cues for displaying their attention to the prospective meeting. While being online can be understood as claiming attendance at the meeting, both attendance and availability may be confirmed through other resources and recognizable actions such as typing in the chat interface, activating the webcam, or speaking. Thus, these actions become distinctive practices through which participants display their availability for interaction. However, availability for audiovisual interaction can only be achieved once co-participants have mutually displayed their availability to talk, which will, in turn, lead to the subsequent opening of the meeting. In this sense, the findings of this research align with Veyrier’s work, in which the pre-meeting is considered a transitional phase (Veyrier, 2012).

Further, even if, as suggested by Sellen (1992), gestures and gaze in videoconferences may not serve to secure co-participants’ attention through video, the data in this study reveal that activating the webcam is often experienced as an electronic action that triggers the initiation of audiovisual interaction. That is, it is an electronic resource that online participants use to attract the attention of co-participants, inviting them to respond to the initiated sequence and contributing to the efficacy of the prospective talk-in-interaction.

The foregoing analysis and findings suggest that the traditional notion of availability needs to be revisited and, consequently, that the notion of summons should be expanded to include new forms of technology-mediated interaction. Indeed, before the meeting starts, there are other resources that participants mobilize in order to solve the availability problem. Typing hola in the chat or uttering a new greeting token may also be considered interactional devices contributing to the accomplishment of the ongoing interactional task. Insofar as they match most of the features of the notion of summons, these resources may be interpreted as summoning devices, helping to solve the problem of availability in videoconference encounters.

Moreover, in telephone conversations, failure to respond to a summons signifies that the person is not available. Similarly, in the pre-meetings analyzed here, failure to respond to a greeting in the chat interface signals that the participant is not yet available for further interaction. However, in comparison to telephone conversation, videoconference shows a key difference concerning the status of the summoner: In telephone interaction the summoner knows nothing about the actual availability of the summoned, whereas at the beginning of videoconference meetings in Adobe Connect, the “summoner” may be aware to some extent of the availability of co-participants. Even so, availability remains a problem that is solved gradually, and the activation of the webcam plays an important role in solving it.

Nevertheless, because the medium affects the entire interaction, and because participant behavior also depends on the affordances and constraints of the medium, these assertions remain largely tentative, given that this kind of interaction is especially complex. Research should more closely examine relevant phenomena such as silence in videoconferences and the selection of the channels available for interacting (e.g., Liddicoat, 2011, 2012b; Sindoni, 2012, 2014), in order to gain a fuller understanding of this kind of interaction. Further, this study confirms that the interaction within the videoconference pre-meeting is relevant to the efficacy of talk-in-interaction, which in turn implies that varying levels of co-presence can be related to the participants’ degree of involvement in the activity (Mondada, 2010; Rosenbaun et al., 2016).

One final observation drawn from the analysis of the 18 online encounters concerns the selection of the mode or mode-switch, defined by Sindoni (2014, p. 327) as “the alternation of speech and writing in the same communicative event and in synchronous mode.” Within the pre-openings, mode selection may also become a resource for conveying availability. In turn, as shown in Rosenbaun et al. (2016), this selection of the channel may affect the interaction in terms of participation management and participatory status.

As revealed in the dataset, within the pre-openings, participants who achieve modal alignment display the same available status for interaction, which implies that they are reciprocally available through the use of a certain mode. Also, once a participant initiates spoken interaction, that participant may use his or her turn to seek a response from online co-participants. Then, co-participants may produce their responsive actions through various modes, and in making their selection, they will consider the initiator’s or addressees’ availability as well. Modal alignment is not necessary for interaction to be initiated, though (e.g., Rosenbaun et al., 2016). However, the findings of this study are consistent with previous findings confirming that in multiparty VMC the interplay among channels will affect participants’ mobility and roles (e.g., Sindoni, 2014; Rosenbaun et al., 2016).

Conclusion

This article addresses how participants in Adobe Connect 7.0 initiate interaction and consequently solve the problem of availability at the beginning of multi-party videoconference encounters. Because the problem of availability is a critical issue at the beginning of every interaction, I have outlined some of the most relevant studies in various settings in CMC and VMC where, as noted, participants orient their interactional work through different phases before initiating the interaction (e.g., Rintel et al., 2001; Mondada, 2010; Veyrier, 2012).

The analytic section focused on the various actions enacted by online participants in the multi-party videoconference encounters; in particular, those actions which become relevant as distinctive actions during the different interactional phases of the openings. As Heath and Luff (1992, p. 320) indicate, “participants in interaction presuppose reciprocity of perspectives or interchangeability of standpoints in producing their own conduct and in recognizing the actions of their counterparts.” In this sense, the present study has confirmed that some of the specific actions taken by meeting participants in Adobe Connect 7.0 are treated as recognizable and meaningful (Licoppe, 2012a).

Furthermore, this study confirms some of the previous findings in VMC. First, it provides evidence that the first turns of multi-party meetings are oriented to the technological interface (Liddicoat, 2011, p. 53); second, it shows that this salient characteristic within multi-party and synchronous VMC is similar to other technological settings (e.g., Brandt & Jenks, 2013; Markman, 2009; Rintel et al., 2001). Despite these similarities, it can be argued that multi-party videoconference meetings are different in nature from other exchanges, such as multi-party exchanges in public Google Hangouts in which not all users are expected to interact with each other (Rosenbaun et al., 2016).

The analysis has confirmed the important role played by the specific technology, even before the meeting starts; this supports the idea that the technology may lead to sequential implications in terms of actions (Hutchby, 2001; Mondada, 2010). However, online participants’ interactional behavior seems to be routinized (Kendon, 1990). Further, each of the actions through which participants coordinate – working together to complete the joint interactive task of establishing availability – seems to be understood as conveying some degree of availability. This finding implies that the SA sequence that ‘easily’ solved the problem of availability in the context of telephone conversations may also be extended to new orders of technology-mediated interaction (Rintel et al., 2001). Hence, it may be worth reconsidering the notion of availability in multimodal CMC environments, where it constitutes an interactional issue that seems to be solved gradually. Indeed, this progressive character of the establishment of mutual availability is also evidenced in the openings in Adobe Connect 7.0, where participants’ interactional behavior ultimately led to achieving availability to initiate the audiovisual interaction and, subsequently, beginning the meeting.

Dealing with some of the basic CA concepts in a videoconference setting has allowed me, first, to contribute to the growing body of literature focused on multimodal CMC, thereby enhancing the understanding of the organization of multi-party videoconference meetings. Second, it has led to a discussion of some theoretical notions in relation to the interactional behavior of online participants which may be characteristic of multimodal CMC. Nevertheless, this study focuses on a very limited community of practice, and certainly there is a diversity of practices for opening a synchronous multimodal computer-mediated encounter. In such variation, factors such as the type of exchange itself, the number of participants, and the available channels for interacting play a relevant role. For this reason, although I posit that the conclusions presented in this paper can be safely applied to other interactional contexts and settings within video-mediated communication such as Skype exchanges, only future empirical research can fully confirm this proposal.

Notes

  1. The words in italics correspond to our translation of the terms Vor-Eröffnung, Eröffnung, Beginn, Ko-Präsenz, Kontakt, and Teilnahme used in Mondada (2010).

  2. Despite the fact that the sequential organization of openings in mobile telephone interaction may differ from traditional telephone openings, for the present purpose, it can be assumed that the initial problem of availability remains the same in both modalities. For relevant discussion of the organizational structure of landline and mobile phone calls, see Arminen and Leinonen (2006), Hutchby (2005), Hutchby and Barnett (2005), and Arminen (2005).

  3. Even though the group of non-native Spanish speakers is quite heterogeneous, the vast majority of the participants have Swedish as a mother tongue, while two of them have French. Additionally, most of the participants have lived in Spanish-speaking countries or speak Spanish in their private life, as a result of which their oral proficiency in Spanish is very high.

  4. In our data, the only reason for the teacher to open the chat room is to screen-record the session. The teacher is the host, and only a host has the necessary permission to record sessions.

  5. In our analysis, the terms claiming and displaying are used as in previous CA research (Heritage, 2007).

  6. The notion of unavailability or “being unavailable” is not discussed in this article. However, given the new forms of presence and being available discussed in previous literature (e.g., Rettie, 2004; Licoppe, 2012a, 2012b), it would be interesting to explore these notions in future research.

  7. Even if the action of activating the microphone becomes necessary for displaying availability for audiovisual interaction, it is not always a recognizable action to the participants or to the transcriptionist, due to the fact that sometimes it may not be totally clear at which precise moment each participant activates or deactivates his or her microphone.

  8. The intonation contour of Magnus’s response suggests some technical problems; these are not addressed in this article.

References

Anderson, J. F., Beard, K. B., & Walther, J. B. (2010). Turn taking and the local management of conversation in a highly simultaneous computer-mediated communication system. Language@Internet, 7, article 7. Retrieved July 05, 2016, from http://www.languageatinternet.org/articles/2010/2804

Arminen, I. (2005). Sequential order and sequence structure – the case of incommensurable studies on mobile phone calls. Discourse Studies, 7(6), 649–662.

Arminen, I., & Leinonen, M. (2006). Mobile phone call openings – tailoring answers to personalized summons. Discourse Studies, 8(3), 339–368.

Blake, R. J. (2005). Bimodal CMC: The glue of language learning at a distance. CALICO Journal, 22(3), 497–511.

Bolden, G. (in press). Conversation analysis. In M. Allen (Ed.), The SAGE Encyclopedia of Communication Research Methods. Retrieved July 06, 2016, from https://www.academia.edu/20376615

Brandt, A., & Jenks, C. (2013). Computer-mediated spoken interaction: Aspects of trouble in multi-party chat rooms. Language@Internet, 10, article 5. Retrieved June 29, 2016, from http://www.languageatinternet.org/articles/2013/Brandt

Giles, D., Stommel, W., Paulus, T., Lester, J., & Reed, D. (2015). Microanalysis of online data: The methodological development of “Digital CA”. Discourse, Context and Media, 7, 45–71.

Hampel, R., & Stickler, U. (2012). The use of videoconferencing to support multimodal interaction in an online language classroom. ReCALL, 24, 116–137.

Heath, C., & Luff, P. (1992). Media space and communicative asymmetries: Preliminary observations of video-mediated interaction. Human-Computer Interaction, 7, 315–346.

Hepburn, A., & Bolden, G. B. (2013). The conversation analytic approach to transcription. In J. Sidnell & T. Stivers (Eds.), The handbook of conversation analysis (pp. 57–76). Cambridge: Wiley-Blackwell.

Heritage, J. (1984). A change-of-state token and aspects of its sequential placement. In J. M. Atkinson & J. Heritage (Eds.), Structures of social action (pp. 299–345). Cambridge: Cambridge University Press.

Heritage, J. (2007). Intersubjectivity and progressivity in person (and place) reference. In N. J. Enfield & T. Stivers (Eds.), Person reference in interaction. Linguistic, cultural and social perspectives (pp. 255–280). Cambridge: Cambridge University Press.

Herring, S. C. (2004). Computer-mediated discourse analysis: An approach to researching online behavior. In S. A. Barab, R. Kling, & J. H. Gray (Eds.), Designing for virtual communities in the service of learning (pp. 338–376). New York: Cambridge University Press.

Herring, S. C. (2010). Computer-mediated conversation: Introduction and overview. Language@Internet, 7, article 2. Retrieved January 7, 2016, from http://www.languageatinternet.org/articles/2010/2801

Hutchby, I. (2001). Conversation and technology: From the telephone to the Internet. Cambridge: Polity Press.

Hutchby, I. (2005). ‘Incommensurable’ studies of mobile phone conversation. A reply to Ilkka Arminen. Discourse Studies, 7(6), 663–670.

Hutchby, I., & Barnett, S. (2005). Aspects of the sequential organization of mobile phone conversation. Discourse Studies, 7(2), 147–171.

Jenks, C., & Brandt, A. (2013). Managing mutual orientation in the absence of physical copresence: Multiparty voice-based chat room interaction. Discourse Processes, 50(4), 227–248.

Jenks, C., & Firth, A. (2013). Synchronous voice-based computer-mediated communication. In S. C. Herring, D. Stein, & T. Virtanen (Eds.), Handbook of pragmatics of computer-mediated communication (pp. 217–241). Berlin: Mouton De Gruyter.

Kendon, A. (1990). Conducting interaction. Patterns of behavior in focused encounters. Cambridge: Cambridge University Press.

Licoppe, C. (2012a). Understanding mediated appearances and their proliferation: The case of the phone rings and the ‘crisis of the summons’. New Media & Society, 14. Retrieved October 29, 2014, from http://nms.sagepub.com/content/14/7/1073.abstract

Licoppe, C. (2012b). Les formes de la présence. Revue française des sciences de l’information et de la communication, 1. Retrieved September 4, 2014, from https://rfsic.revues.org/142

Liddicoat, A. J. (2011). Enacting participation: Hybrid modalities in online video conversation. In C. Develotte, R. Kern, & M. N. Lamy (Dirs.), Décrire la conversation en ligne. Le face à face distanciel (pp. 51–69). Lyon: Ens Editions.

Markman, K. M. (2009). “So what shall we talk about”: Openings and closings in chat-based virtual meetings. Journal of Business Communication, 46, 150–170.

Martin, M. (2005). Seeing is believing: The role of videoconferencing in distance learning. British Journal of Educational Technology, 36(3), 397–405.

Messina Dahlberg, G., & Bagga-Gupta, S. (2013). Communication in the virtual classroom in higher education: Languaging beyond the boundaries of time and space. Learning, Culture and Social Interaction, 2, 127–142.

Mondada, L. (2008). Using video for a sequential and multimodal analysis of social interaction: Videotaping institutional telephone calls. Forum Qualitative Sozialforschung / Forum: Qualitative Social Research, 9(3). Retrieved August 5, 2014, from http://www.qualitative-research.net/index.php/fqs/article/view/1161

Mondada, L. (2010). Eröffnungen und Prä-Eröffnungen in medienvermittelter Interaktion: Das Beispiel Videokonferenzen. In L. Mondada & R. Schmitt (Eds.), Situationseröffnungen. Zur multimodalen Herstellung fokussierter Interaktion (pp. 277–334). Göttingen: Narr Verlag.

Örnberg Berglund, T. (2009). Multimodal student interaction online: An ecological perspective. ReCALL, 21(2). Retrieved July 28, 2014, from http://journals.cambridge.org/abstract_S0958344009000184

Rettie, R. (2004, October). Using Goffman’s framework to explain presence and reality. Paper presented at the 7th Annual International Workshop on Presence, Valencia, Spain. Retrieved May 10, 2014, from http://core.kmi.open.ac.uk/download/pdf/90164.pdf

Rintel, E. S., Mulholland, J., & Pittam, J. (2001). First things first: Internet Relay Chat openings. Journal of Computer-Mediated Communication, 6(3). Retrieved December 29, 2015, from http://onlinelibrary.wiley.com/doi/10.1111/j.1083-6101.2001.tb00125.x/full

Rosenbaun, L., Rafaeli, S., & Kurzon, D. (2016). Participation frameworks in multiparty video chats cross-modal exchanges in public Google Hangouts. Journal of Pragmatics, 94, 29–46. Retrieved May 16, 2016, from http://www.sciencedirect.com/science/article/pii/S0378216616000047

Satar, H. M. (2013). Multimodal learner interactions via desktop videoconferencing within a framework of social presence: Gaze. ReCALL, 25. Retrieved April 20, 2014, from http://journals.cambridge.org/action/displayAbsctract?aid=8836008

Santos Muñoz, A. (2015). El alineamiento modal en el preámbulo de reuniones por videoconferencia. Lingüística en la Red, XIII, 1–22. Retrieved December 29, 2015, from http://www.linred.es/numero13_articulo_4.html

Santos Muñoz, A. (in press). La comprobación de los canales al inicio de encuentros por videoconferencia. Spanish in Context.

Schegloff, E. (1968). Sequencing in conversational openings. American Anthropologist, 70, 1075–1095.

Schegloff, E. (1979). Identification and recognition in telephone conversation openings. In G. Psathas (Ed.), Everyday language: Studies in ethnomethodology (pp. 23–78). New York: Irvington.

Schegloff, E. (2002). Beginnings in the telephone. In J. E. Katz & M. Aakhus (Eds.), Perpetual contact: Mobile communication, private talk, public performance (pp. 284–300). Cambridge: Cambridge University Press.

Schegloff, E. (2007). Sequence organization in interaction. A primer in conversation analysis I. Cambridge: Cambridge University Press.

Sellen, A. (1992). Speech patterns in video-mediated conversations. Proceedings of the CHI '92 Conference on Human Factors in Computing Systems (pp. 49–59). New York: ACM.

Sidnell, J. (2010). Conversation analysis. An introduction. Cambridge: Wiley-Blackwell.

Sindoni, M. G. (2012). Mode-switching: How oral and written modes alternate in videochats. In M. Cambria, C. Arizzi, & F. Coccetta (Eds.), Web genres and Web tools with contributions from the Living Knowledge Project (pp. 141–153). Como & Pavia: Ibis.

Sindoni, M. G. (2013). Spoken and written discourse in online interactions: A multimodal approach. London & New York: Routledge.

Sindoni, M. G. (2014). Through the looking glass: A social semiotic and linguistic perspective on the study of video chats. Text & Talk, 34(3), 325–347.

Veyrier, C. A. (2012). Les cinq premières minutes: organisation des ouvertures en (web)conférence. Analyse des pratiques interactionnelles en réunion professionnel. Doctoral thesis in Linguistics. Université Paul-Valéry III, Montpellier.


Biographical Note

Arantxa Santos Muñoz [aranzazu.santos@moderna.uu.se] is a Ph.D. candidate at Uppsala University. Her research interests include video-mediated communication and the use of new methodologies for the analysis of digital conversation.


Any party may pass on this Work by electronic means and make it available for download under the terms and conditions of the Digital Peer Publishing License. The text of the license may be accessed and retrieved at http://www.dipp.nrw.de/lizenzen/dppl/dppl/DPPL_v2_en_06-2004.html.