Problems and Promises in the Study of Virtual Communities: A Case Study

Susan C. KINNEVY
University of Pennsylvania School of Social Work


Background and significance

Virtual communities are constructed through communication and interaction that is multi-directional, multidimensional and constantly changing (Kang & Choi, 1999). Communication may occur instantaneously on several levels and through several dimensions: e-mail, listservs, web pages and hyperlinks. Communication may be purposely delayed without loss of connection or credibility (Wellman et al., 1996). Face-to-face communities, on the other hand, depend on linear communication, non-simultaneous responses, and physically shared space. The scientific study of virtual communities, while holding great promise for researchers to understand the ramifications of electronic technology on social interaction, also faces problems not previously encountered in the study of face-to-face communities.

Research on Internet communication and community-building ranges from theoretical discussions on the nature of artificial vs. human intelligence through policy discussions on legal issues of ownership and appropriate content to analytic discussions of e-mail exchanges, web pages and usenet groups. However, research that moves from theory to analysis of existing virtual communities is rare. As with any other study of communication, analysis of virtual communities consists of freezing a section of time to explore what is essentially part of an ongoing process. Nevertheless, the fluidity of the medium necessitates a rethinking of our approach to research. As social scientists, we are accustomed to doing research from a linear perspective, but Internet technology and cyberspace are based on an associative model. Cyber communities that are linked associatively rather than linearly both in time and in space are rapidly creating a new ecological niche that requires creativity in applying traditional research methodologies (Postmes, Spears & Lea, 1998; Taylor, 1999).

Most published research on virtual communities has used qualitative approaches, drawing mostly from ethnography, conversation analysis (CA) or discursive approaches (e.g., Coe, 1998; Downey, Dumit & Williams, 1995; Escobar, 1994; Fox & Roberts, 1999; Rice & Love, 1987). The use of discursive and CA approaches is particularly useful in capturing the flow of communication and interaction between members of a virtual community, since such communities are by their nature discursive communities. Discursive approaches enable the researcher to reconstruct the interpersonal interactions and the processes by which meanings are constructed. As interesting as these qualitative approaches are, there is a paucity of quantitatively based research on virtual communities and a lack of integrated (quantitative and qualitative) approaches. In order to examine a multiplicity of variables, quantitative content analysis is a first and necessary step. Nevertheless, content analysis can be performed only after an ethnographic observation of or involvement with the cyber community (Downey, Dumit & Williams, 1995; Escobar, 1994). Without such immersion in the processes and life of the community, the researcher runs the risk of imposing a barren and preconceived frame of analysis that has little to do with the specific field of study. This type of immersion is common in ethnographic studies of face-to-face communities and is both easier and harder to accomplish in virtual communities.

Ethnographic case study

Virtual communities, which are exclusively voluntary associations, deal with at least some of the same issues as material communities, e.g., gatekeeping and normative value construction. Electronic communities, however, often rely on a moderator to set and maintain boundaries, reduce conflict and control the flow of discourse. In this non-participant observation study of 246 listserv messages in a peace activist virtual community, a naturally occurring change from a monitored to an unmonitored list prompted three research questions: 1) What is the impact of list monitoring on the content, type and scope of community participation? 2) What is the impact of list monitoring on the conflictual tone and personal content of the messages? 3) What is the impact of list monitoring on the emergence of individual voices?



This study was ethnographic, consisting of non-participant observation of e-mail messages from a virtual peace activist community over a four-month period from mid-December 1998 through mid-April 1999. The first author informed the list owner that she would be observing the list for research purposes. A major change in the community occurred at the beginning of April when the list moved from being monitored by one individual to being a non-monitored, direct-post listserv. This change afforded the authors an opportunity to evaluate a naturally occurring pre-post situation.


A total of 246 messages were observed. Duplicate messages were removed, resulting in a final population of 241 messages (139 messages from the monitored list period and 102 messages from the non-monitored list period).

Coding framework

Messages were initially coded into a quantitative dataset that tracked the messages by topic (50), type of communication (14), geographic range or scope (4), origin (30), personal content (high, low, and neutral), and conflictual tone (yes, no, and neutral).

To streamline the analysis, the messages were recoded, resulting in an analytic framework of 10 Topics (Iraq, Kosovo, Political prisoners and trials, Bill Clinton, Media coverage, Disarmament and defense spending, Generic protests, Information about the list itself, Democracy, and Miscellaneous); 10 Types of messages (Analysis, Announcement, News article, Personal comment, Press release, Media event, Protest action alert, Other protest information, Response to other messages, and Miscellaneous); 4 geographic Scopes (international, national, regional, and local); and 10 Origins (LO, the List Owner before the listserv opened up; LO2, the original List Owner after the listserv opened up; DSC, the Director of the State Chapter of the peace activist group; NP, the National Parent of the peace activist group; IA, another International Activist group; CM1, CM2, and CM3, Community Members with regular and numerous contributions to the list; CM4, a new Community Member after the listserv opened up; and Others, individual community members whose contributions were sporadic and low in number). The codes for Personal Content and Conflictual Tone were changed to reflect either their presence or absence, with the neutral categories collapsed into the absent category. Research questions were answered using the following categorical analysis: 1) Content of communication was analyzed through the categories of topic and presence or absence of personal content and conflictual tone; and 2) Form of communication was analyzed through the categories of type and scope.
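
The recoding described above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not taken from the study. The choice of "regional" as the omitted scope category is an assumption, inferred only from the fact that it is the one scope without a row in Tables 2 and 3. Only the conflictual-tone recode is shown, because the text does not spell out how the high and low personal-content codes map onto presence and absence.

```python
# Hypothetical sketch of the recoding step; names are illustrative only.

def collapse_conflict(code):
    """Map the original conflictual-tone code (yes/no/neutral) to 1 or 0."""
    # "neutral" is folded into the absent (0) category, as the text describes.
    return 1 if code == "yes" else 0


def dummy_code(value, levels, reference):
    """Return one 0/1 indicator per level, omitting the reference level."""
    return {level: int(value == level) for level in levels if level != reference}


scopes = ["international", "national", "regional", "local"]

print(collapse_conflict("neutral"))
# Assumption: "regional" is the omitted (reference) scope category.
print(dummy_code("national", scopes, "regional"))
```

A message with national scope then carries a 1 on the national indicator and 0 on the others, while a regional message carries 0 on all three.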


As Table 1 indicates, the study covered a period of 127 days, only 20 of which were unmonitored. There were 47 days with no messages posted, but 45 of those were during the monitored list period. In other words, 42% of the total monitored days carried no messages, as opposed to 10% of the unmonitored days. Similarly, the longest period of consecutive days without messages was 7 (7%) in the monitored list period, as opposed to 1 day (5%) during the non-monitored period. Although the total number of messages (monitored = 142, unmonitored = 104) was 246, there were 5 duplicate messages, bringing the sample total to 241 (monitored = 139, unmonitored = 102). The highest number of messages posted on a given day went from 7 during the monitored phase to 18 during the non-monitored phase; the average number of messages per day went from 1.3 to 5.2. As indicated in Chart 1, the topic that garnered the largest share of messages (n=73) over the course of the study was the war in Kosovo. Subsequent analysis revealed that 58.9% of the Kosovo messages carried a conflictual tone, as opposed to only 5.4% of the messages regarding the bombing of Iraq (n=37). Kosovo messages also differed from the combined Iraq/Political Prisoners messages with regard to personal content. Over 50% of the Kosovo messages registered high on personal content, as opposed to 33.7% of the Iraq/Political Prisoners messages (n=76).
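
The percentages and daily averages quoted from Table 1 can be reproduced with a short Python sketch. The variable and function names are hypothetical. The averages in Table 1 appear to be based on total messages posted (before duplicates were removed), which is the assumption used here.

```python
# Counts taken from Table 1; names are illustrative only.
days = {"monitored": 107, "unmonitored": 20}
silent_days = {"monitored": 45, "unmonitored": 2}
posted = {"monitored": 142, "unmonitored": 104}  # before duplicate removal


def summarize(period):
    """Return (percent of days with no messages, average messages per day)."""
    share_silent = 100 * silent_days[period] / days[period]
    per_day = posted[period] / days[period]
    return round(share_silent), round(per_day, 2)


for period in days:
    print(period, summarize(period))
```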

Regarding changes in the type of message pre- and post-list change, Chart 2 reveals that the unmonitored list decreased the number of analyses and announcements posted, but increased the level of personal comment and responses to other messages. Chart 3 shows the scope of the messages changing from a very local and national focus to a regional and international focus. Chart 4 indicates that as messages from the list owner (LO) disappeared after the change to an unmonitored list, the individual member who was the original list owner (LO2) emerged just as strongly after the change, although in a member rather than owner capacity. DSC, who participated moderately before the change, ranked as the highest individual participant after the change. The role of the national parent organization (NP) decreased dramatically after the change, along with the role of the international organization (IA). Since the international focus increased after the change, it seems odd that the role of the international organization decreased, leaving the reason for the changes open for speculation. While the role of one individual member (CM1) decreased after the change, two other individual members (CM2 and CM3) increased their participation and a new member (CM4) also became vocal after the change. Regarding the content of the messages, Chart 5 displays an increase in messages with a conflictual tone, accompanied by a decrease in messages with personal content.

After examining the changes in the distribution of categories that resulted from a change in the monitoring status of the list, the authors estimated logistic regression models, examining the role of personal voices in the community, and the impact of the topic, type, and scope of message and the impact of list monitoring on those personal voices. All models used the dummy variable coding with absence (0) as the reference (omitted) category. Two dependent variables were used -- "personal" and "conflict" representing the probability of including personal experiences/content and conflictual-tone in a message to the list. The results are presented in Tables 2 and 3 respectively. A methodological note is necessary here. As discussed previously, the population the authors examine is the entire message population for the time period in question. Significance levels for the regression coefficients are meaningful only when dealing with a representative sample. From a technical perspective those significance levels should be ignored. However, they can be regarded as indicating the relative explanatory power of certain variables, while controlling for all other variables in the model. Accordingly, the presentation of results will show the results for all the variables in each model. However, due to space considerations, the discussion of the relative impact of each variable will focus only on those that show some level of "significance" (at least p<0.1 level).
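
For reference, the models described above have the standard form of a binary logistic regression with dummy-coded predictors. The notation below is an assumption introduced for illustration and does not appear in the original analysis: y_i is the dependent variable for message i ("personal" or "conflict"), x_ik are the 0/1 dummy variables, and B_k are the coefficients whose exponents are reported as odds ratios.

```latex
\ln\!\left(\frac{P(y_i = 1)}{1 - P(y_i = 1)}\right) = B_0 + \sum_{k} B_k \, x_{ik},
\qquad
\mathrm{OR}_k = e^{B_k}
```

Under this form, an odds ratio above 1.0 for a dummy variable indicates higher odds than in the omitted reference category, and a value below 1.0 indicates lower odds.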

Table 2, which examines the impact of the variables on personal content in the messages, displays three nested models. Model A compares the odds of a message containing personal content among the major contributors to the list. Model B adds to this analysis the odds of various topics, types, scopes, and conflictual tone impacting on the personal content of the messages. Model C controls for the monitoring status of the list (pre- and post-list change). Model B represents a significant improvement in explanatory power when compared to Model A (Chi-square difference between the -2LL values of the models is 27.8, with 10 df, p<0.01). Model C does not exhibit more explanatory power than Model B (Chi-square difference is 1.00, 1 df). This lack of significance indicates that monitoring status did not impact the inclusion of personal content in the messages. Focusing on Model A, two participants (NP, the national parent organization, and IA, an international peace organization) emerge as negatively related to personal content. Model B, in which the topics, types of messages, scope and conflictual content were added to the analysis, shows that the topic of "political prisoners" has a significant impact on personal content, as does the use of conflictual tone in the message. More specifically, the odds for a message including personal content coming from NP are about 1 to 5, and for IA are about 1 to 25. However, some other categories seem to impact on the inclusion of personal content. Of the topics, the issue of prisoners had an odds ratio of 2.6 for personal content; messages of national scope tended to lower the odds-ratio; and the use of conflictual tone raised the odds for inclusion of personal content to more than 5 to 1.
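
The comparison of nested models can be sketched in Python as a likelihood-ratio test on the -2LL values reported in Table 2. This is a minimal sketch, not the authors' procedure: the function names are hypothetical, and the chi-square tail probability uses closed forms that hold only for 1 df or an even number of df, which is sufficient for the comparisons here.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate (closed forms only)."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df % 2 == 0:
        half = x / 2
        terms = sum(half**i / math.factorial(i) for i in range(df // 2))
        return math.exp(-half) * terms
    raise ValueError("this sketch handles df == 1 or even df only")


def lr_test(neg2ll_smaller_model, neg2ll_larger_model, df_difference):
    """Return (chi-square difference, p-value) for two nested models."""
    diff = neg2ll_smaller_model - neg2ll_larger_model
    return diff, chi2_sf(diff, df_difference)


# -2LL values and degrees of freedom as reported in Table 2.
a_vs_b = lr_test(287.982, 260.192, 19 - 9)
b_vs_c = lr_test(260.192, 259.188, 20 - 19)

print("Model A vs B:", a_vs_b)
print("Model B vs C:", b_vs_c)
```

The first comparison yields a difference of about 27.8 on 10 df with a p-value below 0.01; the second yields a difference of about 1.0 on 1 df, which is far from conventional significance thresholds.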

Table 3 examines the impact of the same set of variables on the odds for having a conflictual tone in the message. Comparing the three models, we can see that each is a significant improvement from the previous model. The difference in -2LL between Model A and Model B is 81.4 with 10 df, and between Model B and Model C is about 19, with 1 df. The increased significance in the shift from Model B to Model C is related to the change in the monitoring of the list. This change may also be deduced from the high level of significance registered for the "Monitoring Status" coefficient, which serves as a control variable and shows that the odds for having a conflictual tone in a message increased by a factor of about 12.5 in the shift from a monitored to a non-monitored list. Model A shows three voices as significantly carrying conflictual messages: LO2, DSC and CM4. However, the level of significance drops when we include in Model C the other variables, and the relative conflictual tone of IA emerges as substantially strong. Controlling for all other variables, the odds-ratio for a conflictual message from this organization is about 15 to 1. The topic of Kosovo was directly related to conflictual content with an odds-ratio of almost 3; the messages including news articles had an odds-ratio for conflictual content of almost 5; and messages including personal content had an odds-ratio for also having a conflictual tone of more than 6 to 1.
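
The 12.5 figure follows from inverting the odds ratio reported for monitored messages in Table 3 (0.08), since switching the reference category of a dummy variable inverts its odds ratio. The Python sketch below illustrates this, along with the "N to 1" phrasing used in the text; the function names are hypothetical.

```python
def invert_odds_ratio(odds_ratio):
    """Odds ratio with the reference category switched."""
    return 1 / odds_ratio


def as_odds(odds_ratio):
    """Express an odds ratio in the 'a to b' phrasing used in the text."""
    if odds_ratio >= 1:
        return f"about {odds_ratio:.1f} to 1"
    return f"about 1 to {1 / odds_ratio:.1f}"


# Monitored vs. non-monitored, Model C of Table 3.
print(invert_odds_ratio(0.08))
# IA in Model C of Table 3, and IA in Model C of Table 2.
print(as_odds(15.24))
print(as_odds(0.04))
```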


The frequency findings regarding conflictual tone in the Kosovo messages seem to indicate that Kosovo was inherently a more conflictual topic than Iraq and one that elicited more of a personal emotional response. In addition, the U.S. intervention in Kosovo coincided almost precisely with the opening up of the listserv, making it difficult to separate the impact of the list change from the impact of Kosovo as a topic. However, the regression analysis also demonstrates a strong relationship between the topic of Kosovo and the presence of a conflictual tone. The analysis further shows the impact of five other variables on conflictual tone while controlling for the impact of Kosovo as a topic. The stronger impact here seems to be the effect of the change in monitoring status of the list. Thus, though the Kosovo issue contributed significantly to the emergence of conflictual tones in messages, it does seem that monitoring the list suppressed inputs with conflictual tones.

Regardless of the cause of the change in conflictual tone before and after the list change, it is important to note the kind of conflict that emerged in discussions of Kosovo. Whereas Iraq occasioned no real questioning of community pacifist norms, Kosovo elicited active debate about pacifist ideology and the limits of non-violence as a means of protest. For the first time, community members created a public forum that resembled an electronic town hall meeting. The result, while certainly not a complete normative change in values, is still notable in that a subtle shift in norms seemed to make allowances for gray areas on the ideological front.

The change in type of message from analyses and announcements to personal comment and responses to other messages may reflect a reaction to the topic of Kosovo dominating the unmonitored list. The reasons for the change from a local and national scope to a regional and international scope are difficult to analyze. Since there are actually fewer messages from either the national parent organization or the international organization after the list change, it is possible to speculate that the increase in international scope stems from the fact that more individuals from other countries began to contribute after the change. The increase in regional messages almost surely reflects the increase in participation of DSC, an individual member who lives in a different region than LO2, the original gatekeeper.

The disappearance of owner-originated messages from LO after the list change was expected, but the interesting development was the emergence of LO2 as an individual voice. On a qualitative level, the authors noted a change in the tone and content of LO2's messages after he lost his official gatekeeping role. He seemed to become more strident, even somewhat petulant, leading the authors to believe he was having a hard time giving up his ownership of the list. On the other hand, DSC became quite eloquent in the tone of his messages, particularly with regard to his spiritual beliefs and their relationship to his ideological commitment to pacifism.

In his original capacity as gatekeeper, LO spoke in an authoritative and businesslike voice; he seemed to view his duties primarily as functional, concentrating on the transmission of information. Because DSC became so outspoken and passionate in his communications after the list change, his voice began to dominate the list and seemed to liberate other voices, as indicated by the increase in messages from two other members and the participation of one new member. DSC became the emotional center of the list and its de facto gatekeeper, even though LO2 continued to participate heavily. The change in gatekeeping also reflected a change toward more fluid norms.

This study is exploratory and many avenues remain unexamined. The methodological design, however, is strong and may prove capable of producing more rigorous results in subsequent applications. At the moment, it is possible to say that this study affirms the existence of electronic communities that operate in their own way to emulate gatekeeping functions and normative value constructions of face-to-face communities. Electronic communities, due to the ethereal and fluid nature of their communication base, may be more sensitive to changes in tone and content of the communication; their lack of visual context may heighten their verbal attention in these areas. Although analysis of such a malleable medium is daunting at first, more exploration of methodological approaches is encouraged because virtual villages are seemingly the wave of the future.


Table 1: Pre- and Post-List Change Comparison

                                         Monitored   Unmonitored
Number of days in the study              107         20
Number of days with no messages          45 (42%)    2 (10%)
Consecutive days with no messages        7 (7%)      1 (5%)
Total messages posted                    142         104
Number of duplicate messages             3           2
Total messages in sample                 139         102
Highest number of messages in one day    7           18
Average number of messages in one day    1.33        5.2

Table 2: Logistic Regression -- Personal

Variable category     Variable          Model A (OR) [1]  Model B (OR)  Model C (OR)
Origin (participant)  LO2               1.10              0.46          0.53
                      DSC               1.84              1.4           1.5
                      NP                0.23**            0.22**        0.20**
                      IA                0.08***           0.04***       0.04***
                      CM1               2.75              3.47          3.17
                      CM2               1.1               1.00          1.09
                      CM4               2.2               1.52          1.96
                      CM3               1.1               0.62          0.60
                      LO (owner)        0.72              0.75          0.68
Topic                 Iraq              --                1.27          1.13
                      Kosovo            --                1.22          1.36
                      Prisoners         --                2.66*         2.60*
Type                  News article      --                0.48v         0.49v
                      Alert             --                0.71          0.76
                      Response          --                1.08          1.20
Scope                 International     --                0.88          0.88
                      National          --                0.38v         0.36v
                      Local             --                0.54          0.52
Conflict              Conflictual tone  --                5.20***       5.90***
Monitoring status     Monitored         --                --            1.58
                      Constant (B)      0.5960            0.7960        0.4896
                      -2LL (df)         287.982 (9)       260.192 (19)  259.188 (20)

v p < 0.1; * p < 0.05; ** p < 0.01; *** p < 0.001

[1] OR (odds ratio) is the exponent coefficient [exp(B)], and may be interpreted as the increase in the odds-ratio for the inclusion of personal content in a message for each unit increase in the independent variable, controlling for all other variables in the models. Values lower than 1.0 lower the odds-ratio.

Table 3: Logistic Regression: Conflictual Tone of Messages

Variable category     Variable          Model A (OR) [1]  Model B (OR)  Model C (OR)
Origin (participant)  LO2               8.68***           6.64**        3.86v
                      DSC               2.68*             1.81          1.80
                      NP                0.60              1.38          2.79
                      IA                1.58              4.88          15.24*
                      CM1               0.32              0.19          0.37
                      CM2               0.43              0.17          0.16
                      CM4               5.20v             2.22          1.09
                      CM3               3.47              5.22          6.35
                      LO (owner)        0.00              0.00          0.00
Topic                 Iraq              --                0.41          1.06
                      Kosovo            --                4.89**        2.78*
                      Prisoners         --                0.00          0.00
Type                  News article      --                4.23**        4.97*
                      Alert             --                1.04          0.98
                      Response          --                7.23**        4.74*
Scope                 International     --                0.28v         0.26
                      National          --                0.42          0.69
                      Local             --                0.39          0.44
Personal              Personal content  --                5.32***       6.52***
Monitoring status     Monitored         --                --            0.08***
                      Constant (B)      -1.2443           -2.5467       -1.7132
                      -2LL (df)         235.836 (9)       154.400 (19)  135.336 (20)

v p < 0.1; * p < 0.05; ** p < 0.01; *** p < 0.001

[1] OR (odds ratio) is the exponent coefficient [exp(B)], and may be interpreted as the increase in the odds-ratio for the presence of a conflictual tone in a message for each unit increase in the independent variable, controlling for all other variables in the models. Values lower than 1.0 lower the odds-ratio.