We describe the degree of formality of language in French politicians’ blogs, with specific focus on comparing blog posts and blog comments. The degree of formality is investigated in a corpus of posts and comments in 80 blogs through a cluster of features derived both from traditional French sociolinguistics and from studies of informal computer-mediated communication. The features examined are 1) syntactic (omission rate of the negative particle ne and forms of yes-no questions), 2) lexical (frequency of colloquialisms and of acronyms and non-standard spelling), and 3) prosodic (frequency of repetitive punctuation and emoticons). The analysis shows that the language used in French politicians’ blogs is overall relatively standard. However, the language politicians use in their blog posts is more standard than that of commenters, whose language ranges from strictly formal to highly colloquial.


Blogs written by politicians combine public political discourse with the medium of computer-mediated communication (CMC). This combination gives rise to contradictory expectations with regard to the degree of language formality: In lay conceptions, the style of political discourse is usually regarded as standard or even formal, while blogs are viewed as a relatively casual communication mode. A few researchers have previously addressed this contradiction. For instance, Suomela-Salmi (2009) argues that politicians may use colloquial language in their blogs to create the illusion of a close relationship between author and readers, but that politicians are only allowed to ‘flirt’ with non-standard language; excessive use runs the risk of impairing the author’s credibility. Similarly, Janoschka (2010, p. 232) claims that an informal style is a sign of the politicians attempting to place themselves at “a similar communication level” with the blog readers. Our aim in this article is to investigate, with corpus linguistic tools, to what extent French politicians use informal style in their blog posts and how their style differs from that used by their readers in the comments.

Our data consist of writings posted in 80 politicians’ blogs during September 2007. We examine both the blog posts written by the politicians and the comments on them by blog readers, focusing on an aggregate of features that may connote a certain degree of formality. The results are compared to corresponding frequencies in other CMC corpora, where available. The questions investigated are the following: How are certain variables used in politicians’ blogs? Does the use of these variables reflect the degree of formality associated with them in the previous literature? We hypothesize that the features connoting informality are more frequent in the comments than in the posts because of the different identities of those who write: any Internet user in the former vs. institutional politician in the latter case.

The features chosen for analysis are derived from two research fields: French sociolinguistics and CMC research. Features of the first type, sociolinguistic variables connoting a certain degree of formality in French such as the omission of the negative particle ne, have previously been studied in both computer-mediated and more traditional corpora (see, e.g., Armstrong, 1998; Coveney, 1996; van Compernolle, 2008a; van Compernolle & Williams, 2009). Features of the second type are typical in informal, especially synchronous, CMC modes; they include emoticons and repeated punctuation. These features have been described by numerous researchers (see, e.g., Anis, 2007; Bieswanger, 2013; Herring, 2012). The two types of features, however, have not previously been considered together.1 In our view, investigation of the degree of formality of computer-mediated French necessitates such a grouping.

In this article, we first describe blogs in general and politicians’ blogs in particular and identify factors that may affect stylistic choices in the politician blog genre. We then briefly discuss the socio-stylistic continuum of linguistic formality in French before proceeding to the method and data of the analysis. The results section reports the frequencies of the chosen features in the data corpus, together with representative examples of their use. We also discuss findings that show differences between blog posts and blog comments, and suggest a reconsideration of the connotations of some of the features. In concluding, we propose ways to further develop understanding of the degree of language formality in politicians’ blogs, in political discourse more generally, and in CMC.

Blog Genres

According to Garden (2012, p. 487), the blog is difficult to define because the “term refers to both its technological platform and its output.” As a technological platform, blogs are “modified web pages in which dated entries are listed in reverse chronological sequence” (Herring, Scheidt, Bonus, & Wright, 2004, p. 1). In terms of its output, the blog is “an author-driven, asynchronous and informal genre of CMC that uses various modalities and entails some interactivity” (Lomborg, 2009). Herring et al. (2005) point out that the blog is used in so many ways that it cannot be considered a single genre, but is rather a medium or a socio-technical format. Miller and Shepherd (2009) identify two main genres within the blog; the first, the personal blog, resembles a diary and is characterized by personal commentary, self-disclosure, and identity construction, while the second, the public affairs blog, is more of a soapbox, representing alternative journalism and aiming at action and social change. The approach chosen in the present study is more specific yet: We view the politician’s blog as a genre.

Genre is understood here as a type of social action, an abstraction of recurring rhetorical situations (see Miller, 1984). This view entails a fusion of the socially constructed conception of the communicative situation (context, participants, etc.) and the individual’s creative use of the genre within or beyond generic conventions (Devitt, 2004). Generic conventions are tacit “expected way[s] of acting” (Devitt, 2004, p. 86), including the style of language used. While some genres, for instance the tax form or the research article, impose rigid generic conventions in terms of language use, others, such as the tourist brochure or the politician’s blog, allow more variation. Indeed, Puschmann (2013) claims that blogs are a highly variable form of self-expression involving degrees of formality of language.

In September 2007, the period in which our data were collected, the blog was an emerging mode of communication for French politicians. Given their lack of previous experience, blogging politicians made innovative (and usually unconscious) decisions as to how to fill their blogs. In addition to choices concerning such issues as the nature of the topics raised (political or personal, for instance), blogging politicians had to decide what degree of linguistic formality to use in their blog posts. Blog commenters similarly had to decide what mode of address to use towards both the politician and each other, as well as the degree of formality of their language.

These choices, however, were not made in a vacuum. As Fairclough (2003, p. 66) notes, “new genres develop through combination of existing genres.” Lehti (2011) shows that in 2007 politicians mixed different genres, for instance diary and polemic, in their blogs. The stylistic choices made by blogging politicians were also framed by the styles associated with antecedent genres, as well as with the overall discourse type (political discourse). In the case of blog comments, the antecedent genres influencing their linguistic realization include the discussion forum and online chat. Texts representing antecedent and ambient genres and discourse types constitute stylistic “norms,” i.e., an intuitive model based on a body of reference, which guides the stylistic choices of discourse producers (Enkvist, 1991).

Politicians’ Blogs: Posts and Comments

The politician’s blog as a genre differs from the political blog. The latter is a blog about politics, written by any internet user, whereas in the case of the politician’s blog the author occupies the role of institutional politician, although the topics covered are not necessarily related to politics. In addition to the blog format, one principal characteristic of the politician’s blog genre is the power asymmetry of the participants (elected politician and the public). The communicative purposes of the genre include constructing a credible image of the author and interacting with voters (Janoschka, 2010; Lehti, 2011; Suomela-Salmi, 2009). One component in the project of image construction is the strategic use of linguistic style, including degree of formality, within and beyond generic conventions.

In this article we examine not only the blog posts written by the politicians but also the comments that follow the posts. We consider the blog comment and the blog post to be separate albeit interrelated modes of communication. That is, we conceive of ‘the politician’s blog post’ and ‘the comment in a politician’s blog’ as sub-genres of the genre ‘the politician’s blog.’

Aharony (2010) examined the comments in academic Library and Information Science blogs. Her results indicate that the comments were written for varied purposes, including giving advice and general reflection on the topic of the post. With regard to politicians’ blogs in particular, Sivenkova (2012) suggests that the communicative goals of the commenters include scrutinizing the politician’s political ideas, advancing their own opinions, and entering into debate with other discussants. Lehti (forthcoming 2013) shows that the comment sections in many politicians’ blogs are not very active overall. Further, even in the active sections, the blogging politician is most often absent from discussion, and the commenters consequently react not only to the blog post but also to each other’s comments.

Degree of Linguistic Formality

Degree of formality is signalled in language via variables which carry social meaning, such as, in French, the retention or omission of the negative particle ne. These variables do not change the denotational meaning of the utterance, but they carry various kinds of information, for instance about the utterer, the addressee, the situation, and the context (Coupland, 2007). French sociolinguistic and didactic research on social and situational variation has identified linguistic features which can connote a certain degree of formality.

The socio-stylistic continuum of French has been analysed by sociolinguists in terms of various scales. For instance, Gadet (2003) identifies four levels: formal (soutenu), standard (standard), informal (familier), and colloquial (populaire). Likewise, Arrivé, Gadet, and Galmiche (1986) propose a four-level scale: formal (soigné), everyday/medium (courant/moyen), informal (familier), and colloquial (populaire). Mougeon, Rehner, and Nadasdi (2004), in turn, present a tripartite division: vernacular, mildly marked, and formal. These levels are defined in terms of normativity and social and situational variation:

Vernacular variants do not conform to the rules of Standard French, are typical of informal speech, are inappropriate in formal settings, are associated with speakers from the lower social strata, and are usually stigmatized. Mildly marked variants, like vernacular variants, do not conform to Standard French and are typical of the informal register, but may also be used in formal situations. However, compared with vernacular variants, mildly marked variants demonstrate considerably less social or gender stratification and are not stigmatized. Formal variants conform to the rules of Standard French, are typical of careful speech and written French, and are strongly associated with members of the upper social strata. (Mougeon et al., 2004, p. 413)

The sociolinguistic and didactic study of the degree of formality of French language brings together a large body of research on different linguistic features connoting a given degree of formality; the features examined, however, are mainly those found in ‘traditional’ spoken and written modes of communication. In French, CMC has not been fully integrated into the study of degree of linguistic formality, and the possible (in)formality connotations, for example of emoticons or repeated punctuation, consequently have not been systematically explored.


In the present study we attempt to pave the way for the integration of CMC-specific features into the study of linguistic (in)formality of French language. We therefore make use of both sociolinguistic and CMC studies which indicate the possible (in)formality connotations of different variables. According to Gadet (2003, p. 44), the degree of formality in French can be defined within the areas of prosody, pronunciation, morphology, syntax, vocabulary, and discourse, applying different variables; these include, for example, omission of the silent e, omission of the negative particle ne, word order, and the use of colloquial vocabulary. French grammars (e.g., Arrivé et al., 1986; Riegel, Pellat, & Rioul, 1997 [1994]) also refer repeatedly to certain features in these areas as indicators of socio-stylistic variation. Mougeon et al. (2004) chose for their study a package of 13 features: nine grammatical, two lexical, and two phonetic.

Our own choice of features is based on past sociolinguistic and CMC studies and on the suitability of the features for quantitative analysis. For instance, certain sentence structures (e.g., pseudo-cleft constructions; Blanche-Benveniste, 2000) connote a certain degree of formality and could thus be examined in this study, but their analysis would require the development of an automatic syntactic analyser. In this study, we have chosen to study variables that can be detected by self-made Unix scripts and the automatic morphological analyser MElt (Denis & Sagot, 2009). Consequently, we explore six variables from the areas of syntax, vocabulary, and prosody, as shown in Table 1.

Table 1. List of variables analysed

In the present written material, ‘prosody’ refers to typographic practices that imitate spoken prosody (see, e.g., Herring, 2012).


The data for the study consist of 874 blog entries, posted on 80 French politicians’ blogs maintained by 79 politicians (one politician maintained two blogs) during the month of September 2007, together with the 3316 comments made on those entries. The total number of words in the corpus is 266,475 for the posts and 425,084 for the comments.

The politicians in question represent six different political parties, with representatives of the two main parties constituting the bulk of the data: the conservative UMP (Union pour un Mouvement Populaire, 25 bloggers) and the socialist PS (Parti Socialiste, 43 bloggers). The politicians also represent various political roles, ranging from municipal and regional councillors to Members of Parliament, Senators, Ministers, and Members of the European Parliament. Fifteen of the politicians are women and 64 are men. Their age could be determined in the case of 64 writers, of whom 20 were under 40 years old in 2007, 34 were 40-59 years old, and 10 were over 60 years old.

We are aware that politicians do not necessarily write their blog posts themselves; rather, there may be a blog team, for example, that produces the text. One of the criteria applied in compiling the data was that the politician appeared as the author of the blog post; thus blogs in which the politician was referred to in the third person, for instance, were excluded. The data consist of all French politicians’ blogs which were then accessible, and which met the following criteria:

  1. The blog possessed a commentary section and an archive that was reasonably easy to use.

  2. It was (at least ostensibly) written by the politician him- or herself.

  3. It was written by a politician elected to a specific political office; blogs maintained for instance by ‘active party members,’ Parliamentary assistants, or people interested in politics in general were not included.

The blogs were found using web search engines and through links displayed in blogs already found.



Omission of the Negative Particle ne

According to Coveney (1996, p. 55), the presence or absence of the negative particle ne is “possibly the best known sociolinguistic variable in contemporary French.” Prescriptively, negation in French is expressed by two elements surrounding the finite verb of the clause: The first, the negative particle ne (or n’ if followed by a vowel-initial word), is placed before the verb, and the second, the negative complement(s) pas (‘not’), plus (‘anymore’), jamais (‘never’), etc., follows the verb (Charaudeau, 1992). The first part of the negation, however, is often omitted in informal French, particularly in spoken language but also in informal written contexts such as synchronous chat (van Compernolle, 2008a). According to Blanche-Benveniste (2000, p. 39), this is the case in 95% of negations in ordinary spoken conversation, while in public, controlled communication, even when spoken, the retention of ne is much more frequent.

We examined the occurrence of the most common negative complements (see Armstrong & Smith, 2002, p. 23), pas, jamais, plus, rien, and personne; we found 2184 negations expressed by these complements in the blog posts and 4230 in the blog comments. We excluded from the count condensed negations (Charaudeau, 1992) in which ne is not expected, whether because no verb is present, as in the noun phrase pas de médicaments (‘no medicines’), or because the negation is part of a fixed expression, such as pourquoi pas (‘why not’) and pas mal (‘not bad’). In addition, constructions containing multiple negations were excluded, as they could not be detected with the same identification method on a Unix command line.
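The counting procedure described above can be illustrated with a minimal sketch. It is written in Python rather than as the Unix scripts used in the study, and it is not the authors' implementation: the function name, the look-back window of four tokens, and the tokenizing pattern are all assumptions made for illustration. The sketch skips only the two fixed expressions named in the text. It does not detect verbless negations such as pas de médicaments, nor does it disambiguate non-negative uses of plus, rien, or personne, all of which would still require manual checking.

```python
import re

# The five negative complements examined in the study.
COMPLEMENTS = {"pas", "jamais", "plus", "rien", "personne"}
# Fixed expressions named in the text, in which ne is not expected.
FIXED_EXPRESSIONS = {("pourquoi", "pas"), ("pas", "mal")}
# Assumption: ne is searched for in the four tokens before the complement.
WINDOW = 4
CLAUSE_BREAKS = {".", "!", "?", ";", ":"}

# Words (optionally ending in an apostrophe, e.g. n') or clause punctuation.
TOKEN_RE = re.compile(r"[a-zàâçéèêëîïôûùüÿœ]+'?|[.!?;:]", re.IGNORECASE)


def classify_negations(text):
    """Return (complement, ne_retained) pairs for each candidate negation."""
    tokens = [t.lower() for t in TOKEN_RE.findall(text.replace("’", "'"))]
    results = []
    for i, token in enumerate(tokens):
        if token not in COMPLEMENTS:
            continue
        previous = tokens[i - 1] if i > 0 else ""
        following = tokens[i + 1] if i + 1 < len(tokens) else ""
        if (previous, token) in FIXED_EXPRESSIONS or (token, following) in FIXED_EXPRESSIONS:
            continue  # condensed negation: excluded from the count
        retained = False
        # Walk backwards from the complement, stopping at a clause boundary.
        for earlier in reversed(tokens[max(0, i - WINDOW):i]):
            if earlier in CLAUSE_BREAKS:
                break
            if earlier in {"ne", "n'"}:
                retained = True
                break
        results.append((token, retained))
    return results


sample = "C’est pas possible. Ce n’est pas possible à Puteaux. Pourquoi pas ?"
print(classify_negations(sample))
```

Applied to the sample, the first pas is classified as an omission, the second as a retention, and pourquoi pas is skipped. The omission rate is then the share of pairs whose second element is False.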

In the blog posts the particle ne is present in nearly all negations (omitted in 1.6%), while the comments display a slightly higher degree of absence: ne is omitted in 4.2% of negations. The difference is statistically significant.2 The comparison of our results (see Table 2) to the omission of ne in French online personal ads in Quebec (van Compernolle, 2008b) and chats (van Compernolle, 2008a) shows that the omission rate in the blog comments is much lower than that of chats or online personal ads.

Table 2. Omission rates of negative particle ne

It should be noted that 75% of the omissions in the blog posts occurred in reported speech, for instance in the transcription of a spoken interview. This strengthens the conclusion that the omission of ne is not expected in a politician’s blog post and that the genre belongs to the “more monitored” or “conservative written” varieties of French which typically retain the negative particle ne (see Armstrong & Smith, 2002, pp. 23, 28). Further, the results are in line with those of Williams (2009): In moderated CMC contexts, ne deletion is rare. The comments sections of the politicians’ blogs can be considered moderated because the politicians can control which comments they publish and which not.

Although infrequent, the omission of ne is not altogether absent in the data. Its occurrences manifest variation in the degree of formality, especially within the blog comment sub-genre. Omissions of ne occur in sequences in which the style overall can be described as informal, such as the comment in (1):

(1) (Devedjian, http://www.blogdevedjian.com/, italics added)

On nous raconte que de la merde, de belles petites histoires pour nous faire rêver, mais derriere ils font que de la merde aussi. Juppé... c’est une plaisanterie. C’est pas possible de se foutre de la gueule du monde à ce point là.

‘They tell us nothing but bullshit, nice little stories to keep us happy, but behind they do nothing but shit. Juppé… it’s a joke. It’s not possible to mess with people to this point.’

Example (1) displays three ne omissions, although the first and the second occurrences, On Ø nous raconte que and ils Ø font que, are not included in our ne omission count because ne - que is not among the five most common negation pairs. The third negation, c’ Ø est pas, is the most common negative expression with an omitted ne in our data (54 of the 165 omissions). It can therefore be suggested that the expression c’est pas is a chunk in which the insertion of the standard negative particle ne is not always considered necessary.

The informal style displayed in (1) is a combination of different factors, including, in addition to the non-standard negations, colloquial lexicon, e.g., merde (‘shit’) and se foutre de la gueule (‘to mess with’). In a similar manner, the standard style displayed in comment (2) is conveyed by a combination of factors:

(2) (Jeanne, http://www.nadinejeanne.com/ , italics added)

On peut rêver de la situation qui existe dans de petites villes où les habitants constituent eux-mêmes leur liste en rayant les noms de ceux qui ne leur plaisent pas : ce n’est pas possible à Puteaux. Je ne connais pas d’autres organisations possibles qu’un parti pour prendre le pouvoir : il donne une colonne vertébrale et permet ensuite de dialoguer avec les militants des autres partis et les citoyens non engagés à partir du projet élaboré.

‘We can dream of the situation which exists in small towns where the inhabitants themselves make up the list by crossing out the names that do not appeal to them: it is not possible in Puteaux. I do not know of other possible organizations than a party to be assuming power: it gives a backbone and enables dialogue with active members of other parties and with non-committed citizens on the basis of a developing project.’

In (2), the retention of the negative particle ne (ce n’est pas and je ne connais pas) contributes to an interpretation of the text as standard. In addition, the vocabulary of the sequence is standard, as is the syntax and use of punctuation marks.

Forms of Yes-No Questions

Coveney (1996) argues that in addition to negation, another prominent syntactic indicator of socio-stylistic variation is the structure of the direct interrogative sentence. We focus on the 1633 yes-no questions (YNQ) found in our data, leaving aside WH- questions (WHQ) for the sake of simplicity: YNQs have fewer variants than WHQs (Coveney, 1996). YNQs in French can take three different forms: 1) SV (subject + verb), in which interrogation is signalled merely by the question mark (in writing) or by a rising intonation (in speech); 2) word order inversion, VS; and 3) the periphrastic interrogative structure ESV (est-ce que ‘is it that’). In most descriptive grammars, word order inversion (VS) is associated with written language and especially with formal style (Charaudeau, 1992; Riegel et al., 1997 [1994]). The present data, however, call for an addition to the forms because a considerable number of the YNQs do not contain a finite verb. We therefore divide YNQs into four categories: VS, ESV, SV, and non-finite (NF). The last category consists of constructions lacking a finite verb and containing only a noun phrase, prepositional phrase, or adjective functioning for instance as the subject of the question, which is indicated by a question mark. To our knowledge, the possible (in)formality connotations of this interrogative form have not yet been described; Frehner (2008) and Herring (2012), however, have observed that omission of parts of speech (e.g., subject or finite verb) is typical of informal CMC, especially in its synchronous modes.
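The four-way categorization above can be sketched as a simple decision procedure. The following Python sketch is illustrative only and is not the procedure used in the study. It assumes that each question has already been tokenized and part-of-speech tagged, and that finite verbs carry the tag "V"; that tag, the function names, and the other tags in the usage example are assumptions about the tagger output, not documented properties of any particular tool. A subject pronoun directly after a finite verb is treated as inversion whether or not a hyphen is present.

```python
import re

SUBJECT_CLITICS = {"je", "tu", "il", "elle", "on", "nous", "vous", "ils", "elles", "ce"}
# Assumption: the tagger marks finite verbs with the tag "V".
FINITE_TAGS = {"V"}


def _clitic_after_verb(form):
    """True if form looks like a postverbal subject pronoun (-nous, -t-il, nous)."""
    hyphenated = form.startswith("-")
    core = re.sub(r"^-?(?:t-)?", "", form)  # drop a leading hyphen and linking t
    if core == "ce":
        return hyphenated  # bare "ce" after a verb is usually a determiner
    return core in SUBJECT_CLITICS


def classify_ynq(tagged):
    """Classify a yes-no question, given as (form, tag) pairs, as ESV, NF, VS or SV."""
    forms = [form.lower().replace("’", "'") for form, _ in tagged]
    joined = re.sub(r"\s+-", "-", " ".join(forms))
    if joined.startswith(("est-ce que", "est-ce qu'")):
        return "ESV"
    finite = [i for i, (_, tag) in enumerate(tagged) if tag in FINITE_TAGS]
    if not finite:
        return "NF"
    for i in finite:
        # Pronoun attached to the verb token itself, e.g. "sommes-nous".
        attached = "-" in forms[i] and _clitic_after_verb("-" + forms[i].rsplit("-", 1)[-1])
        # Pronoun as the next token, with or without a hyphen.
        following = i + 1 < len(forms) and _clitic_after_verb(forms[i + 1])
        if attached or following:
            return "VS"
    return "SV"


question = [("Ne", "ADV"), ("sommes", "V"), ("nous", "CLS"), ("pas", "ADV"),
            ("en", "P"), ("retard", "NC"), ("?", "PONCT")]
print(classify_ynq(question))
```

The ESV test is applied first because est-ce que itself contains an inverted verb and pronoun. A heuristic of this kind will misclassify some cases (for example, inversions written with an apostrophe after the linking t), so its output would need to be checked by hand.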

Table 3 shows the proportions of VS, ESV, SV, and NF questions in the blog posts and blog comments and compares them to the results of van Compernolle and Williams (2009b), according to which in informal synchronous French CMC, 97.7% of the YNQs by native speakers follow the direct SV pattern.

Table 3. Relative frequencies of yes-no question structures

Table 3 shows that while the ESV pattern is marginal in all three modes, the other three structures display variation. In chats the informal SV structure is clearly the most common, while blog posts and blog comments show both informal SV and NF structures and formal VS ones. The formal VS pattern is more frequent in blog posts (58.3%) than in blog comments (43.5%). The difference between posts and comments is statistically significant.3 In both sub-genres the VS structure is much more frequent than in chats, where it is marginal.

The relatively high frequency of the VS structure in blogs can be interpreted as a sign of formal style. This nevertheless conflicts with our finding that many of the word-order inversions are written non-standardly, with a varied orthography. One common deviation from the norm is the omission of the hyphen between the verb and the pronoun, as in (3):

(3) (Mélenchon, http://www.jean-luc-melenchon.fr/ , italics added)

Ne sommes nous pas en retard sur ce plan?

(In standard French: Ne sommes-nous pas […] ?)

‘Are we not late in this respect?’

In addition to the omission of the hyphen, the use of the linking t between a vowel-ending verb and a third person singular pronoun il or elle is non-standard in many cases. According to the norm, the linking t should be preceded and followed by a hyphen (Riegel et al., 1997 [1994]). In (4), the verb ends with a latent consonant and the linking t is thus redundant. Further, the linking t is not surrounded by hyphens but is followed by an apostrophe:

(4) (Romero, http://www.romero-blog.fr/ , italics added)

YVES CALVI serait t’ il à extrémiste de gauche ou de pro life ?

(In standard French : […] serait-il […] ?)

‘Would YVES CALVI be an extremist on the left or the pro life?’

In standard French, the linking t would be introduced if the verb in (4) ended in a vowel, for instance if it were in the future tense instead of the present conditional: YVES CALVI sera-t-il.
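The orthographic norm described above can be stated compactly as a rule. The Python sketch below is a simplified illustration, not a complete account of French inversion: the function name is hypothetical, the pronoun on is added to il and elle on the basis of standard usage (va-t-on), and special cases such as verbs ending in -c are ignored.

```python
VOWELS = set("aeiouyàâéèêëîïôûù")
# Third person singular pronouns that trigger the linking t after a vowel.
LINKING_T_PRONOUNS = {"il", "elle", "on"}


def standard_inversion(verb, pronoun):
    """Spell an inverted verb + subject pronoun according to the written norm."""
    verb = verb.strip()
    pronoun = pronoun.strip().lower()
    if not verb or not pronoun:
        raise ValueError("verb and pronoun must be non-empty")
    needs_linking_t = pronoun in LINKING_T_PRONOUNS and verb[-1].lower() in VOWELS
    if needs_linking_t:
        return f"{verb}-t-{pronoun}"
    return f"{verb}-{pronoun}"


print(standard_inversion("serait", "il"))    # conditional, ends in a consonant
print(standard_inversion("sera", "il"))      # future, ends in a vowel
print(standard_inversion("sommes", "nous"))  # no linking t with nous
```

Comparing each attested inversion with the output of such a rule is one way to flag non-standard spellings like those in (3) and (4) for closer inspection.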

The non-standard spellings of the VS structure call for a more thorough scrutiny of the position of YNQ structures in terms of the degree of formality they connote. The frequency of VS structures with non-standard orthography in the data suggests that the status of this variant is undergoing a change; it may be spreading to less formal contexts, especially in written CMC. This may be due to the economy of the form: Inversion does not require more keystrokes than the SV form, unlike the ESV structure which is common in informal spoken French.

As shown in Table 3, 19.7% of the YNQs in the blog posts and 30.8% in the blog comments are realized in the non-finite form. Example (5) illustrates the use of a non-finite question in a blog post.

(5) (Baert, http://www.dominiquebaert.com/ )

[…] Surprise et malaise en tout cas, car chacun a compris que l’utilisation du mot « faillite », laissé échapper par M. Fillon, risque bien d’en annoncer un autre : celui de « rigueur » ! Mme Lagarde, ministre de l’Economie, l’a laissé échapper, elle, cet été. Un autre aveu ?

‘Surprise and unease in any case, because everyone has understood that the use of the word “bankruptcy”, let slip by Mr. Fillon, clearly tends to announce another: that of “austerity”! Ms Lagarde herself, Minister of Finance, let it slip this summer. Another confession?’

In (5), the question un autre aveu? (‘another confession?’) ends the entry. The lack of a finite verb (as well as a formal subject), as for instance in Est-ce un autre aveu? (‘Is it another confession?’), does not affect the denotational meaning of the utterance, and it is unclear what social meaning is connoted. The omission of sentence elements thus merits further exploration in future studies of (in)formality in CMC.



The frequency of colloquialisms in the data was examined with a list adapted from Armstrong’s (1998) study of lexical socio-stylistic variation. The list does not claim to be an exhaustive account of colloquial words in French. The colloquialisms listed, however, are very common and continue to be generally used in informal communicative situations; their presence or absence in a corpus of politicians’ blogs is thus emblematic of degree of formality. The original list contains 233 items,5 labelled familier (‘colloquial’), populaire (‘popular’), or vulgaire (‘vulgar’). Some of the words are colloquial only in one or some of their meanings; only the colloquial uses were taken into account.

We found that the colloquialisms listed occur in our data, albeit sparsely. The blog posts contained 41 items out of the 231, with a total of 117 occurrences. The comments contained 107 items, occurring a total of 722 times. In other words, the blog posts contain 0.439 of the listed colloquialisms per 1000 words, while the corresponding frequency in the comments is 1.698. This difference suggests that the use of colloquialisms is more typical in comments than in posts; in both sub-genres, however, their use is marginal. The numbers of occurrences of some of the items found in the data are given in the Appendix.
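The normalized frequencies follow directly from the raw counts and the sub-corpus sizes reported earlier (266,475 words of posts and 425,084 words of comments). The short Python sketch below reproduces the arithmetic; the function name is chosen here for illustration only.

```python
def per_thousand_words(occurrences, corpus_words):
    """Normalize a raw count to occurrences per 1000 words."""
    return 1000 * occurrences / corpus_words


posts = per_thousand_words(117, 266_475)
comments = per_thousand_words(722, 425_084)

print(f"posts:    {posts:.3f} per 1000 words")     # 0.439
print(f"comments: {comments:.3f} per 1000 words")  # 1.698
print(f"ratio comments/posts: {comments / posts:.2f}")
```

Normalizing in this way is what makes the two sub-genres comparable, since the comment sub-corpus is considerably larger than the post sub-corpus.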

One noteworthy feature of the colloquial lexicon in the comments is the frequency of strongly evaluative items, which are nearly absent in the blog posts. For instance, the strongly pejorative adjective and noun con 5 is more frequent (46 occurrences, see Appendix) than the corresponding and less insulting standard item stupide (11 occurrences). Other evaluative colloquialisms used in the comments include merde, connerie, the expressions en avoir marre and s’en foutre, the intensifier sacré, and the intensifier/adjective super. In comparison, the very common colloquialisms in informal spoken French sympa, truc, and fric are quite rare in the blog comments, while their respective standard equivalents (sympathique, chose, and argent) are more frequent. Even if sympa is also an evaluative item, it is semantically relatively trite and therefore much less strongly evaluative than the above-mentioned items. The low frequency of these three common colloquialisms renders the occurrences of the strongly evaluative ones even more noticeable. This finding suggests that the follow-ups include overt stance-taking and fierce self-positioning. Such expression of opinion is illustrated in the comment shown in (6):

(6) (Guilmart, http://thierryguilmart.blogspirit.com/)

Ce qui est chiant avec les retraites, à chaque changement de gouvernement, on nous change les règles du jeu! On pense à réformer, mais à cause de qui? Les précédents gestionnaires nous ont foutu dans la merde! Qu’on les face payer enfin!

‘What sucks with the pensions, with every change of government there are new rules of the game! They think they are fixing it, but because of whom? The previous administrators fucked everything up for us! We should finally make them pay for it!’

In (6), the commenter expresses his/her disappointment with the outcome of past governments and calls for payback. This highly subjective stance is reinforced by the vulgar lexical items chiant and foutu dans la merde, as well as the use of exclamation marks. The overall tone of the comment is one of anger, and the style is relatively informal; in addition to the colloquialisms, there is non-standard syntax in the first sentence (ce qui est chiant […] on nous change) and a spelling error in the last sentence (face instead of fasse).

A closer examination of the colloquial items found in the blog posts seems to further highlight the gap between the posts and the comments; the colloquialisms in the blog posts connote a less informal social meaning than those in the comments. Here several points need to be made. First, the most frequent colloquialism, vélo (‘bike’), is problematic. As the standard bicyclette occurs only once in the data, it is possible that the use of the truncated form is expanding from colloquial style towards the standard. The same is observed in the case of télé, the truncation of téléviseur (‘television set’); it is seven times more frequent than the standard full form. In the dictionary Larousse en ligne (http://www.larousse.com/en/dictionaries/french-monolingue), vélo and télé are still labelled as colloquial (familier) items, however.

Second, the next most frequent colloquialism in the posts is foot, a truncation of football. The standard form football occurs primarily in different blogs from those using the truncated colloquial word. The use of the truncation foot may be due to the nature of the topic: A blog post on sports, especially on football, is a chance for a politician to entertain the reader, as opposed to writing about strictly political topics. This entertainment function, and the consequent adding of a bit of informality and diversity to the author’s public image, can be emphasized by the use of a more colloquial vocabulary, such as the truncated foot.

Finally, a noteworthy feature of the use of colloquial items in the posts is that they quite frequently occur in reported speech, usually representing a spoken communicative event. Omitting vélo and télé (see above), 38% of the items labelled familier occur in reported speech. The same applies to all but one of the five words classified as populaire or vulgaire.
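
The comparison of colloquial items with their standard equivalents discussed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study: the pair list contains just the items named in this section (the full list is adapted from Armstrong, 1998), and the function name `variant_counts`, the tokenisation, and the crude plural handling (an optional final -s) are assumptions made for the example.

```python
import re
from collections import Counter

# Colloquial item -> standard equivalent, limited to pairs named in the text.
PAIRS = {
    "con": "stupide",
    "sympa": "sympathique",
    "truc": "chose",
    "fric": "argent",
    "vélo": "bicyclette",
    "télé": "téléviseur",
    "foot": "football",
}

def variant_counts(text, pairs=PAIRS):
    """Return {colloquial: (colloquial_count, standard_count)} for a text."""
    # Lower-case the text and keep runs of letters (accented ones included).
    tokens = Counter(re.findall(r"[^\W\d_]+", text.lower()))

    def freq(word):
        # Assumption: the singular and a regular -s plural are counted together.
        return tokens[word] + tokens[word + "s"]

    return {colloq: (freq(colloq), freq(std)) for colloq, std in pairs.items()}

# Invented sample sentence, for illustration only.
sample = "Il est sympa, ce truc. Des trucs sympas ! Une chose sympathique."
counts = variant_counts(sample)
print(counts["sympa"], counts["truc"])  # (2, 1) (2, 1)
```

A count of this kind ignores multi-word expressions such as en avoir marre and cannot tell whether an item occurs in reported speech; both would need separate handling.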


In addition to the ‘traditional’ colloquialisms, the data contain lexical choices typical of CMC that deviate from standard French. Anis (2007) labels the French writing system characteristic of some modes of CMC as ‘neography,’ including such devices as acronyms, phonetic spelling, syllabograms, rebus-like spellings, suppression of vowels, logograms, and truncations. Neography is found especially in SMS and chat, chiefly because of constraints on space and time in those modes of CMC, but also “to make the message more expressive, to exhibit the user’s ego, to play with language and communication, to contest standards, to express solidarity with the group, or to manifest adhesion to a counterculture” (Anis, 2007, p. 90; see also Herring, 2001). Danet and Herring (2007) associate neography features with speechlike informality.

In the present data, we found lexical neographic features only in the blog comments. The absence of neographic features in the blog posts further highlights the difference between the sub-genres: While standard writing is preferred in the posts, the blog comment sub-genre allows for more variation. Table 4 shows the frequency of two quantifiable features: vowel suppression and acronyms.

Table 4. Occurrences of some neography indicators in the data

Vowel suppression occurs in some comments; for instance: tjrs and tjs (toujours, ‘always’), ds (dans, ‘in’), and tt (tout, ‘all’). Phonetic spelling is also present in a few comments, as in (7):

(7) (Devedjian, http://www.blogdevedjian.com/)


(Standard spelling : Mais qu’est-ce qu’on a à faire de ça ?)

‘But what do we have to do with it?’

The most common neographic features in comments shown in Table 4 are acronyms expressing the writer’s laughter (LOL and MDR, with varied spellings). The use of the English variant is also a sign of code-switching, a phenomenon Danet and Herring (2007) find typical of informal CMC. These acronyms, both in English and in French, occur most often in sequences containing other informal features as well. Comment (8) illustrates this:

(8) (Albouy-Guidicelli, http://jmag77.typepad.com/)

Y'a eu la petite Marina qui habite Melun,(une amie) qui est passé par "star ac 6" qui va faire parler d'elle au printemps 2008 avec un album super, je peux le confirmer...Oup's! petite pub lol

‘There was the cutie Marina who lives in Melun,(a friend) who took part in "star ac 6" which will present her in spring 2008 with a great album, I can confirm…Oops ! a little ad lol’

In (8), linguistic features connoting informality include the abbreviation y’a for il y a, the diminutive petite (‘small,’ here in the sense of ‘cute’), the truncation star ac 6 for the TV show Star Academy 6, the interjection Oup’s, and the truncation pub for publicité, in addition to the acronym lol. The sequence can consequently be labelled informal.
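
Counts like those summarised in Table 4 could be approximated with simple pattern matching. The sketch below is an illustration under stated assumptions rather than the authors' procedure: the vowel-suppressed forms are only the four cited above, and the patterns for the ‘varied spellings’ of LOL and MDR (repeated letters, any case) are a guess at what such variation looks like.

```python
import re

# Vowel-suppressed forms cited in the text (tjrs, tjs, ds, tt).
VOWEL_SUPPRESSION = re.compile(r"\b(?:tjrs|tjs|ds|tt)\b", re.IGNORECASE)
# Laughter acronyms; letter repetition ("loool", "mdrrr") is an assumption.
LAUGHTER = re.compile(r"\b(?:l+o+l+|m+d+r+)\b", re.IGNORECASE)

def neography_counts(text):
    """Count the two quantifiable neography indicators in a text."""
    return {
        "vowel_suppression": len(VOWEL_SUPPRESSION.findall(text)),
        "laughter_acronyms": len(LAUGHTER.findall(text)),
    }

# End of comment (8) as quoted above.
comment_8 = "je peux le confirmer...Oup's! petite pub lol"
print(neography_counts(comment_8))
```

The word boundaries keep the pattern from matching letter sequences inside ordinary words (the tt in petite, for instance), but short forms such as ds would still need manual checking for false positives.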


In written CMC, typography and orthography take over the paralinguistic functions fulfilled in speech by sound (Herring, 2012). Punctuation is one of the means through which prosody and paralanguage are expressed, with non-standard uses that often express emotive and affective meanings (Anis, 1999) and different pragmatic meanings (Dresner & Herring, 2012), but which also connote informality (Anis, 2007; Danet & Herring, 2007).

Repetitive punctuation

In the present corpus, the repetition of question marks, exclamation marks, and periods is more frequent in the comments than in the posts (see Table 5).

Table 5. Occurrence of repeated punctuation marks (two or more) and emoticons per 1,000 words

The combinations of punctuation marks shown in Table 5 are non-standard apart from the use of three consecutive periods, which indicates ellipsis, hesitation, or trailing off of the writer’s thought. All of the forms, including ellipsis, also indicate expressiveness, as shown in (9) and (10):

(9) (Devedjian, http://www.blogdevedjian.com/)

Le Président serait bien inspiré de se séparer de notre dame des douleurs....mais voilà,image oblige......la com rien que la com.......

Pour le reste...on attend avec de plus en plus d’impatience des décisions de droite....comme en Alsace ou on trouve que la nomination de Bockel est une injure aux électeurs de cette même droite qui a porté Sarkozy au temple....

mais voilà ouverture oblige.......la com rien que la com......

‘The President would be well advised to part with our lady of the sorrows …. but here we are,image compels……nothing but words……

For the rest…we’re waiting with more and more impatience for the decisions of the Right….as in Alsace where they think the nomination of Bockel is an insult to voters of the same Right which carried Sarkozy to the temple….

but here we are openness compels……nothing but words……’

(10) (Cambadélis, no longer in use)

Bravo !tu es super. Pourras tu porter ce message librement ? j’aimerai une reponse a cela!!!!!!

‘Well done !you’re great. Can you release this message? i’d like to have a response to this!!!!!!’

In (9), the commenter writes in an enigmatic and nearly poetic style; all the constructions end in ellipsis and the utterances la com rien que la com (‘nothing but words’) and mais voilà, image/ouverture oblige (‘but here we are, image/openness compels’) are reiterated. In (10), the six exclamation marks at the end of the comment reinforce the commenter’s expectation of a response. They also contribute to the informal style manifested in (10). In addition to the repetitive punctuation, the orthography is non-standard, as indicated by the absence of a hyphen between the verb pourras and the pronoun tu, a capital letter at the beginning of the second and fourth sentences, a closing –s marking a conditional in j’aimerai, and the accents in reponse and a. Likewise, the use of the second person singular pronoun tu is marked in this context, as one would normally expect strangers to address a politician with the polite vous form (Riegel et al., 1997 [1994]). The use of the singular address pronoun in (10) reflects a tendency in some modes of synchronous French CMC, as observed for example by Williams and van Compernolle (2009): tu is preferred even when the writer is not acquainted with the addressee. Douglass (2009) observed great variation among blogs in the address form used in comments, towards both the blog owner and other commenters. However, Douglass investigated “personal blogs, current events blogs and sports and entertainment blogs” (p. 219) written by lay people; our data differ from these because of the bloggers’ institutional status.
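
The frequencies of repeated punctuation reported in Table 5 are normalised per 1,000 words. A minimal sketch of such a count follows; it assumes that a ‘repetition’ is a run of two or more identical marks and that words are runs of letters or digits, and the function names are hypothetical.

```python
import re
from collections import Counter

# A run of two or more identical marks: "??", "!!!", "....".
REPEATED = re.compile(r"([?!.])\1+")

def repeated_punctuation(text):
    """Count runs of repeated '?', '!' and '.' separately."""
    return Counter(m.group(1) for m in REPEATED.finditer(text))

def per_thousand_words(count, text):
    """Normalise a raw count by the number of word tokens in the text."""
    n_words = len(re.findall(r"\w+", text))
    return 1000 * count / n_words

# Comment (10) as quoted above.
comment_10 = ("Bravo !tu es super. Pourras tu porter ce message librement ? "
              "j’aimerai une reponse a cela!!!!!!")
runs = repeated_punctuation(comment_10)
print(runs["!"], runs["."], runs["?"])  # 1 0 0
```

This pattern also counts the standard three-period ellipsis, which Table 5 likewise includes; mixed sequences such as ?! are not captured and would need an additional pattern.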


Emoticons

Emoticons are another way of expressing affect, or, as Dresner and Herring (2012) put it, of reinforcing or clarifying the illocutionary force of the utterance. In the comment in (11), the emoticon at the end of the utterance mitigates the claim:

(11) (Urvoas, http://www.urvoas.org/ )

J’ai oublié, pardon : j’espère que vos camarades feront le bon choix mercredi !!! :)

‘I forgot, sorry: I hope your friends will make the right choice on Wednesday!!! :)’

This comment presses for a given electoral choice; the emoticon at the end mitigates the coercive tone by adding an element of humour to the utterance. This is a common way of using emoticons in the corpus. Anis (1999) reported the frequent usage of emoticons in French-language Internet Relay Chat (IRC), with 13.2% of the messages containing emoticons. The present corpus contains 35 emoticons (0.082 per 1,000 words) in the blog comments, while the blog posts contain none. This parameter thus further confirms that the style of the blog post sub-genre is chiefly standard, while the comments display more variation: Emoticons are marginal in this sub-genre as well, but the 35 occurrences found suggest that the sub-genre allows for their use.
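
The normalised emoticon frequency can be illustrated with a short Python sketch. It assumes a deliberately small set of Western-style emoticons (the forms actually counted are not listed here), and the back-calculated corpus size is only what the two reported figures jointly imply, not a number given in this section.

```python
import re

# Assumed emoticon inventory: eyes ':' or ';', optional nose '-', a mouth.
EMOTICON = re.compile(r"(?<!\w)[:;]-?[)(DP](?!\w)")

def count_emoticons(text):
    """Count emoticons matching the assumed inventory."""
    return len(EMOTICON.findall(text))

def rate_per_thousand(count, n_words):
    """Frequency per 1,000 words."""
    return 1000 * count / n_words

# End of comment (11) as quoted above.
comment_11 = "j’espère que vos camarades feront le bon choix mercredi !!! :)"
print(count_emoticons(comment_11))

# 35 emoticons at a reported 0.082 per 1,000 words imply a comment
# sub-corpus on the order of 427,000 words (approximate, because the
# reported rate is rounded).
implied_words = 35 / 0.082 * 1000
print(round(implied_words))
```

The lookarounds keep the pattern from firing inside URLs or words, which matters in blog comments that frequently contain links.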


The use of the pre-selected variables in the data suggests that the degree of formality in both sub-genres is by and large standard (or even formal, as the forms of YNQs might suggest), especially in comparison with other CMC genres. However, the frequencies found also indicate variation; although quantitatively infrequent, informal uses of the variables are not completely absent. These informal uses support the hypothesis presented in the introduction, that blog comments would be more informal than blog posts: The informal variants are more frequent in the comments than in the posts.

The colloquial style found in the comments was illustrated in examples (1), (6), (7), (8), (9), (10), and (11). In order to highlight the range of stylistic variation, the comment in (12) is presented here as a contrast to the above-mentioned colloquial examples. The example suggests that the antecedent genres influencing stylistic choices in the blog comments include not only informal chat discussion but also formal letters to a politician:

(12) (Kaltenbach, http://www.philippekaltenbach.com/, italics added)

Monsieur le Maire,

Sensibilisée aux problèmes dramatiques des familles en attente de régularisation, et en particulier au sort des enfants, je soutiens votre action en leur faveur et vous en remercie. Les orientations gouvernementales m’inquiètent. Il me semble qu’elles développent en France un sentiment de peur de "l’autre", du "pas pareil" dangereuse. À ce sujet, jai été effrayée très récemment des réactions déraisonnables de certains habitants de mon quartier au sujet d’un projet d’insertion de populations en difficultés. Croyez bien que sur ce sujet aussi, vous avez mon soutien.

‘Mr Mayor,

Being aware of the dramatic problems of the families in line for regularization, and especially of the fate of the children, I support your action in their favour and thank you for that. The government’s leanings worry me. It appears that these leanings are promoting in France a feeling of fear of the “other”, of the dangerous “not the same kind”. On this matter, I was frightened very recently by the unreasonable reactions of some of the inhabitants in my neighbourhood to the project of integration of the groups in difficulties. Please be assured that on this matter as well, you have my support.’

In (12), unlike (10) above, the commenter addresses the blogging politician in the polite vous form, starting with the official form of address Monsieur le Maire. In addition, some features in the comment reflect a clearly formal style, such as the formal register word Sensibilisée, the long premodifying non-finite clause Sensibilisée aux […] enfants, the nominalised complement en attente de régularisation instead of a subordinate clause, the abstract subject orientations gouvernementales, and the formal closing Croyez bien […]. Further, the orthography is standard, except for the absence of an apostrophe in jai (standardly: j’ai).

As mentioned, the blog posts display less variation than the comments. The few informal sequences stand out as atypical, suggesting strategic usage. The blog post in (13) illustrates the use of an informal style:

(13) (Albouy-Guidicelli http://jmag77.typepad.com/ )

jeunes de montereau en goguette

Comme vous pourrez le constater les jeunes de montereau sont tous sympas et bogosses .... Mesdemoiselles vous savez où il faut habiter....

Note rédigé et posté en présence des jeunes pour leur montrer le fonctionnement du programme typepad pour palm tréo 680. Ils étaient assurément sympathiques.

‘The youngsters of montereau ready to have fun

As you can see the young guys of montereau are all nice and cute …. Ladies you know where to live….

A post written in the presence of the young in order to show them the functioning of the typepad program for the palm tréo 680. They were extremely likeable.’

In the blog, the text of (13) is accompanied by a photograph showing three local teenage boys. The blogging politician, Jean-Marie Albouy-Guidicelli, starts the text with informal language, reflected by the colloquialisms en goguette, sympas, and bogosses.⁶ The heading is written casually, without an initial capital letter in the first word or in the name of the town (Montereau). Furthermore, the first two sentences of the body text end with repeated punctuation, namely the final ellipsis expressing a suspended, implicit thought and a humorous ambiguity. In these lines the author addresses the readers and even specifies them as Mesdemoiselles. The last two lines, however, demonstrate a change in audience design: the truncation sympa, for instance, changes to the standard sympathique. Most importantly, the author now refers to the teenagers in the third person plural: It thus seems that these last two lines are directed to the ‘actual’ audience of the blog, explaining the momentary deviation from the usual style.

If the use of colloquial style in (13) can be interpreted as an expression of solidarity towards the young, the following blog post (14) demonstrates the opposite: the use of an informal style as dissociation from a specific social group:

(14) (Cambadélis (no longer in use); underlining and italics added)

La réalité ce sont les affrontements violents des bandes en plein Paris. Les adolescents composant ces bandes sont majoritairement des descendants de migrants, qui renforcent volontairement leurs particularités les plus visibles. Leur sous culture négativiste et agressive, marquée par un hédonisme à court terme, s’inspire de la culture dominante et en retourne le sens. Elle se résume au chantage hystérique : « donnez-moi du fric ou je pète un câble et je fais des conneries ! ». Avec une violence qui traduit la frustration des enfants de ces milieux populaires qui n’ont pas les moyens de s’adapter à un monde dont les valeurs consuméristes et individualistes sont celles de la classe moyenne.

‘The reality is the violent clashes of gangs in central Paris. The teenagers who compose these gangs are for the most part the descendants of immigrants, who deliberately emphasize their most obvious characteristics. Their negativist and aggressive sub-culture, marked by a short-term hedonism, is inspired by the dominant culture and reverses its meaning. It is summed up as hysterical extortion: “gimme some money or I’ll blow a fuse and do something stupid!”. With a violence which translates the frustration of the children of these working-class quarters who do not have the means to adapt themselves to a world whose consumerist and individualist values are those of the middle class.’

Example (14) emphasizes the difference between the public, relatively formal language use of the blogging politician and a highly colloquial way of speaking. The underlined reported speech represents the ostensible discourse of aggressive youngsters. It contains three colloquial lexical items (in italics), and is in clear contrast with the rest of the sequence. The style of the majority of the sequence contains lexical choices reflecting standard and even formal or specialised language, e.g., sous culture négativiste et agressive, hédonisme, milieux populaires, and consumériste.

In sum, despite the affordances of the blog medium, i.e., “interaction and connection, immediacy, instant access, low overhead” (Miller & Shepherd, 2009, p. 283), French politicians appear not to deviate to any great extent from standard language in their blog posts. However, unlike many political genres, such as the press release or the White Paper, the blog permits innovative and strategic stylistic variation, which can be seen as a tool in the politicians’ process of constructing credible images of themselves in the eyes of the public. In the comments, the multiplicity of participants, together with the interaction afforded by the medium, results in varied styles, ranging from that resembling a casual encounter to the formality entailed by the power asymmetry of the participants.


We have described the nature and degree of variation in linguistic formality in French politicians’ blogs through the quantitative analysis of a set of pre-selected features in the areas of syntax, vocabulary, and prosody, and through the analysis of representative sequences which emerged from the quantitative study. This method enabled a detailed description of the degree of formality and the integration of CMC-specific features into the study of linguistic (in)formality. It also allowed comparison of the results with those obtained from other corpora. The approach, however, has certain limitations, as the results are dependent on the choice of features.

Our analysis suggests many possibilities for future research. The quantitative results could be compared, for example, with those for a more recent corpus of French politicians’ blogs. The comparison might yield useful information about the evolution of generic style conventions: Is language use more informal, more formal, or more varied now than it was in 2007? In addition to such a diachronic approach, a synchronic comparison of the results of the present study with the language, for example, of celebrity blogs, would illuminate the differences between blog genres and the impact of the identity of the blog owner on the language use. Finally, in order to highlight the influence of technological affordances on the styles politicians use in CMC, our results could be compared to the language politicians use in other modes, such as the microblogging tool Twitter.


We thank Eija Suomela-Salmi, Tuija Virtanen, Maarit Mutta, and Milla Luodonpää-Manni for their comments and Filip Ginter and Samuel Kohonen for technical assistance.


  1. Studies of CMC in other languages have already combined these two types of features; see, e.g., Cho (2010) for English and Herring and Zelenkauskaite (2009) for Italian.

  2. With the χ² test, χ² = 29.6486, p = 5.179e-08, N = 6414.

  3. With the χ² test, χ² = 12.963, p = 0.00032, N = 1599.

  4. We deleted from Armstrong’s original list (1998, pp. 489-495) two items referring to the former French currency: balle (‘franc’) and brique (‘10,000 francs’).

  5. We do not provide translations for the colloquialisms, since English equivalents corresponding to the exact degree of formality may not exist. Instead, translations of the standard equivalents are provided in the Appendix.

  6. The colloquialisms en goguette and bogosse show that a quantitative analysis of lexical variation is problematic, because these items are not included in the list provided by Armstrong (1998). They are likewise not included, for instance, in the Internaute dictionary of colloquialisms ( http://www.linternaute.com/dictionnaire/fr/usage/familier/1/ ). The frequency of items indicated by a predetermined list can thus merely suggest tendencies in a data sample; it is by no means an absolute indicator of the degree of formality of the lexicon used.
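
The p-values reported in Notes 2 and 3 can be checked from the χ² statistics alone. The sketch below assumes one degree of freedom (as for a 2×2 table of sub-genre by variant), which the notes do not state; under that assumption the upper-tail probability of the χ² distribution reduces to the complementary error function, so only the Python standard library is needed. The function name is hypothetical.

```python
import math

def chi2_p_value_df1(chi2):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom.

    For df = 1, P(X >= x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2))

# Statistics as reported in Notes 2 and 3; df = 1 is an assumption.
for note, chi2 in [(2, 29.6486), (3, 12.963)]:
    print(f"Note {note}: chi2 = {chi2}, p = {chi2_p_value_df1(chi2):.3e}")
```

Under the one-degree-of-freedom assumption, both results come out close to the p-values given in the notes.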


Appendix. Some colloquialisms found in the data (list adapted from Armstrong, 1998)

Biographical Notes

Lotta Lehti [lotta.lehti@utu.fi] is a postdoctoral researcher at the Department of French, University of Turku, Finland. Her research interests lie in the areas of discourse analysis, rhetoric, and computer-mediated communication.

Veronika Laippala [veronika.laippala@utu.fi] is a postdoctoral researcher at the Department of French, University of Turku, Finland. Her research interests include text linguistics and the use and development of natural language processing methods in the study of language.


