RANDOMNESS AND CLUSTERING OF RESPONSES IN ONLINE LEARNING NETWORKS

Proceedings of the Communication, Internet, and Information Technology 2007, Banff, July 2-4, 2007

Reuven Aviv, Tel Hai Academic College and Open University of Israel, 108 Ravutski Street, P.O. Box 808, Raanana 43107, Israel, [email protected]

Zippy Erlich, The Open University of Israel, [email protected]

Gilad Ravid, Annenberg Center for Communication, USC, [email protected]

ABSTRACT

The goal of this research is to explore the nature of the response relation in online learning networks. We ask whether actors choose their response partners at random or whether certain special mechanisms are at work. We compare the observed values of clustering of responses in 95 online learning networks with the predictions of two suitable Random Graph models. All the online distance learning networks exhibit clustering that is incompatible with random choices of response partners when no constraints are placed on the response capacities of actors. The observed clustering in all networks is, however, compatible with random choices of response partners subject to constraints on their total response capacities. We provide a possible explanation for this behavior, based on the nature and goal of the online learning networks, and discuss the practical implications for online distance education.

KEY WORDS

Distance Education, Online Learning Networks, responsiveness, clustering

1. Introduction

Distance learning communities typically use online communication networks for their interactions. Some characteristics of these networks were recently studied by analyzing their topological structure: power of influence [1], correlation of power distributions [2], evolution of cohesion [3], the relations between cohesion and roles and knowledge construction [4], time variation and media dependence of communication patterns [5], and reciprocity [6].

Crucial for collaboration in the learning community is the development of responsiveness among the participants [7]. Responsiveness is the cohesive glue between the actors in the learning community. What is the nature of this relation? Is it random? Are there any hidden mechanisms that direct actors to choose their response partners? Recent studies [6, 8] reveal that response partners are not chosen at random: at least some kind of reciprocity consideration mechanism is required to explain the observed mutuality of responses in online learning networks. The question is then whether that is enough, or whether we need to consider other mechanisms when modeling the interaction patterns between actors.

In this study, we consider the clustering of responses. Clustering is a common attribute of social and organizational networks [9]. It also occurs in transcriptional gene networks and neuronal connectivity networks [10]. In these contexts, the terms clustering and feed-forward loop are used, respectively, to point to the underlying mechanisms. (These mechanisms are briefly described in section 5.) Do we have clustering in online learning networks? Of course, some amount of clustering could be created "by chance." We wish to determine whether the observed clustering is beyond that level. To be specific, we will evaluate the amount of clustering in observed networks beyond the level created by a combination of random selection of response partners and the reciprocity mechanism. We will do so by statistical comparative analysis of the clustering levels of a large set of online distance learning networks in relation to suitable random graph models.

The precise formulation of the research hypotheses is described in section 2. In section 3, we describe the database used in this study and the method of analysis. In section 4, we present the results, and their significance is discussed in section 5.

2. Research Questions

We consider an online distance learning network to be one that employs text-based communication. It is a broadcast network: any posted message is readable by all members (called “actors” in this study). The major expectation of actors in such networks is that their messages will be responded to [11, 12], but this does not always happen. Thus, response links may or may not develop. A response link is defined as realized from actor i to actor j if the number of messages posted by i and responded to by j is above a threshold, defined in this study as 1. We refer to the collection of the actors and the response links as the response graph of the online distance learning network. The response graph is akin to a particular relation in a social network.

The state of a single actor in the graph is characterized by its actor configuration. This is a pair of integers (d_out, d_in): d_out, the out-degree, counts the number of actors with whom that actor created response links; d_in, the in-degree, counts the number of actors who created response links with that actor. The degrees can be thought of as the response capacities of the actors. The set of all actor configurations is called the degree sequence of the graph.

The state of a pair of actors is characterized by its dyad configuration, which is mutual, asymmetric, or null. If the actors have a bidirectional link (e.g., in an online distance learning network they responded to each other at least once), then the pair has realized a mutual configuration. If the actors have only a one-directional link, they have realized the asymmetric configuration. Otherwise there is no link between them, so they have realized the null configuration. The numbers of mutual, asymmetric and null configurations, denoted by M, A, and N respectively, are called the dyad census of the graph.

A triplet of actors can have several configurations [9]. For the purpose of this research, we are interested in only one. A triplet (i, j, k) realizes a transitive (or feed-forward-loop) configuration if it is connected by the three response links ij, jk, and ik. In the response graph, actor i responded to j, j responded to k, and i also responded to k. Note that these responses need not be sequential. Following [10], we use the number of these configurations in a graph as a measure of its level of clustering.

We wish to test whether the observed clustering in online distance learning networks can be explained by random choices of response partners, conditioned on a fixed level of reciprocity, or whether some extra constraint or mechanism is required. To do that, we compare the levels of clustering observed in online distance learning networks with the predictions of two Random Graph models [9]. Due to the random component of the models, each of them creates an ensemble of graphs. The creation is controlled by the probability distribution function of the model, i.e., the probability assigned to the creation of each of the graphs in the ensemble. The observed graph is tested for the likelihood of occurrence of its level of clustering in the ensemble. We use two suitable classic models:

The Holland-Leinhardt model [13]: The probability distribution function is uniform, conditioned on the dyad census. This function is usually denoted by U|(M,A,N).

The Snijders model [14]: The probability distribution function is uniform, conditioned on the dyad census, as well as the in-degree and the out-degree sequences. This function is usually denoted by U|(M, {Xi+}, {X+i}).
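The quantities defined in this section can be illustrated with a short Python sketch. It is not the software used in the study; the function names are hypothetical, the graph is assumed to be given as a binary adjacency matrix (a list of lists with a zero diagonal), and transitive configurations are counted as ordered triplets (i, j, k), which is one possible counting convention.

```python
from itertools import permutations


def degree_sequences(adj):
    """Return (out-degrees, in-degrees) of a binary adjacency matrix."""
    n = len(adj)
    out_deg = [sum(adj[i][j] for j in range(n) if j != i) for i in range(n)]
    in_deg = [sum(adj[j][i] for j in range(n) if j != i) for i in range(n)]
    return out_deg, in_deg


def dyad_census(adj):
    """Return (M, A, N): counts of mutual, asymmetric and null dyads."""
    n = len(adj)
    mutual = asymmetric = null = 0
    for i in range(n):
        for j in range(i + 1, n):
            links = adj[i][j] + adj[j][i]
            if links == 2:
                mutual += 1
            elif links == 1:
                asymmetric += 1
            else:
                null += 1
    return mutual, asymmetric, null


def transitive_triples(adj):
    """Count ordered triplets (i, j, k) with links i->j, j->k and i->k."""
    n = len(adj)
    return sum(
        1
        for i, j, k in permutations(range(n), 3)
        if adj[i][j] and adj[j][k] and adj[i][k]
    )


# Toy response graph (assumed data): 0->1, 0->2, 1->2, 2->0; actor 3 is isolated.
example = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
print(degree_sequences(example))    # ([2, 1, 1, 0], [1, 1, 2, 0])
print(dyad_census(example))         # (1, 2, 3)
print(transitive_triples(example))  # 1
```

In the toy graph, the only transitive configuration is the triplet (0, 1, 2), and the single mutual dyad is the pair (0, 2).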

Formally, we will test the following two hypotheses for each of the observed online distance learning networks:

H1: The observed clustering can be explained by the Holland-Leinhardt model.

H2: The observed clustering can be explained by the Snijders model.

These models impose increasing levels of constraints on the otherwise completely random behavior of the actors (conditioned on a given level of reciprocity). The Holland-Leinhardt model imposes one constraint: fixed numbers of the asymmetric (or, equivalently, the null) dyad configurations; it does not impose any constraint on the response capacities of the actors. The Snijders model adds the restriction that the response capacities of all actors are held fixed (note that they are not necessarily equal).
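To make the first model concrete, the following Python sketch draws one graph uniformly at random from the set of directed graphs with a given dyad census, i.e., from U|(M,A,N). It is an illustrative sketch only, not the simulation program used in the study; the function name `sample_u_man` and its parameters are assumptions. The idea is to assign dyad types to a random permutation of the actor pairs and to orient each asymmetric dyad by a fair coin flip, so that every graph with the requested census is equally likely.

```python
import random
from itertools import combinations


def sample_u_man(n_actors, n_mutual, n_asymmetric, rng=None):
    """Draw one adjacency matrix uniformly from U|(M,A,N).

    The number of null dyads is implied: all remaining pairs are null.
    """
    rng = rng or random.Random()
    pairs = list(combinations(range(n_actors), 2))
    if n_mutual + n_asymmetric > len(pairs):
        raise ValueError("M + A exceeds the number of actor pairs")

    rng.shuffle(pairs)  # random assignment of dyad types to pairs
    adj = [[0] * n_actors for _ in range(n_actors)]

    # The first M pairs become mutual dyads.
    for i, j in pairs[:n_mutual]:
        adj[i][j] = adj[j][i] = 1

    # The next A pairs become asymmetric dyads with a random direction.
    for i, j in pairs[n_mutual:n_mutual + n_asymmetric]:
        if rng.random() < 0.5:
            adj[i][j] = 1
        else:
            adj[j][i] = 1

    return adj


# Usage: one random graph with 6 actors, M = 2 and A = 5 (so N = 8).
graph = sample_u_man(6, 2, 5, random.Random(42))
for row in graph:
    print(row)
```

Sampling from the Snijders model is considerably harder, because the in-degree and out-degree sequences must be held fixed together with the number of mutual dyads; that case is not covered by this sketch.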

3. Method

The Open University of Israel is a distance learning institution, based heavily on intensive use of its learning technologies environment, with optional face-to-face tutorials. Each course utilizes at least one online distance learning network, usually over one semester (16 weeks). The objectives of the networks vary, from collaborative knowledge construction [4] to social, pedagogical or technical support. Objectives are not mutually exclusive. Accordingly, size, response links and participation patterns vary from course to course and from semester to semester. Numbers of participants in the network vary from 10 to 150, but most have about 50 students.

In this study, we selected for analysis a sample of 95 of about 500 online distance learning networks in the Open University of Israel. Networks included in the sample were selected at random, omitting 5 networks in which the number of active participants (those who post at least one message during the semester) was below an arbitrarily selected threshold of 10. For each of these networks, we created its (observed) response graph. The global characteristics of the observed response graphs are presented in Figures 1 and 2.

[Figure 1: Characteristics of online distance learning networks used in this study (number of actors and links). Horizontal axis: Actors; vertical axis: Response Links.]

[Figure 2: Densities of online distance learning networks analyzed in this study. Horizontal axis: Actors; vertical axis: Density.]

In these figures, as in Figure 3, each point represents one graph (or network). Figure 1 shows that the number of nodes (actors), N, ranges from 13 to 140, and the number of response links, L, ranges from 30 to 350. Increasing the number of participants increased the number of posted messages, which, in turn, increased the number of responses to posted messages. Figure 2 shows that the density, L/[N(N-1)], ranges from 0.02 to 0.45. Most networks are sparse.

Each response graph is represented by a binary adjacency matrix: rows and columns are labeled by the nodes; an entry (i, j) is 1 if a response link exists from i to j; otherwise it is 0. The adjacency matrices of the observed response graphs were constructed by a computer program, opus2sna, developed in-house, that scanned the recorded transcripts of the messages sent in the networks and identified posters and responders. Once the adjacency matrix was known, we calculated the clustering level following [9].

4. Results

The expected values of clustering in the two models were estimated by the network simulation program NetMiner [15]. For each of the observed graphs and for each hypothesis, we ran the random graph model referred to in the hypothesis to simulate an ensemble of 100,000 graphs. With this number of simulations, the results were stable. The constraints (M, A, N and the degree sequences) were taken as observed values. The program provided the expected values and the fractions of ensemble graphs with clustering above and/or below the observed value. If either of these fractions was smaller than the significance level of 0.01, the hypothesis was rejected for that observed response graph. Otherwise, the hypothesis was accepted.

Figure 3 presents, for each of the 95 online distance learning networks, the observed and predicted values of clustering according to the Holland-Leinhardt and Snijders models. In all these cases, the observed values are substantially higher than the values expected by the Holland-Leinhardt model, at a level of p < 0.01. On the other hand, the observed clustering values agree with the values expected by the Snijders model; the probability that this model will generate networks with clustering equal to or larger than the observed values is larger than p = 10%.
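The simulation-based test described in this section can be sketched in Python as follows. This is a schematic illustration under stated assumptions, not the NetMiner procedure: the function `monte_carlo_test` and the callable `simulate_statistic` are hypothetical names, the fractions are computed with non-strict inequalities (at or above, at or below), and the toy statistic in the usage example stands in for "clustering of one graph drawn from the random graph model".

```python
import random


def monte_carlo_test(observed, simulate_statistic, n_sims=1000, alpha=0.01, rng=None):
    """Compare an observed statistic with a simulated ensemble.

    simulate_statistic(rng) must return the statistic (e.g. the clustering
    level) of one graph drawn from the random graph model under test.
    """
    rng = rng or random.Random()
    values = [simulate_statistic(rng) for _ in range(n_sims)]

    expected = sum(values) / n_sims
    frac_at_or_above = sum(v >= observed for v in values) / n_sims
    frac_at_or_below = sum(v <= observed for v in values) / n_sims

    # Reject the model if the observed value lies in either extreme tail.
    rejected = min(frac_at_or_above, frac_at_or_below) < alpha
    return {
        "expected": expected,
        "frac_at_or_above": frac_at_or_above,
        "frac_at_or_below": frac_at_or_below,
        "rejected": rejected,
    }


# Usage with a toy stand-in statistic (assumed, for illustration only).
def toy_statistic(rng):
    return rng.randint(0, 10)


print(monte_carlo_test(100, toy_statistic, rng=random.Random(1))["rejected"])  # True
print(monte_carlo_test(5, toy_statistic, rng=random.Random(1))["rejected"])    # False
```

In the study itself, the ensemble size was 100,000 graphs per network and per hypothesis; the default of 1,000 here is chosen only to keep the sketch fast.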

[Figure 3: Observed and expected clustering in the online distance learning networks. Horizontal axis: Number of Response Links; vertical axis: Clustering; series: Observed, Holland-Leinhardt Model, Snijders Model.]

5. Discussion

The predictions of the Holland-Leinhardt model do not fit the observed values of clustering in any of the tested networks. This model assumes that actors have no restriction on their response capacities: they can respond to an arbitrary number of other actors (subject to the fixed total reciprocity). Thus, response links, mutual or not, are created independently. There are no correlations between actors in choosing their response links. However, creating a substantial number of clustered triads and, in general, higher-level sub-structures requires correlations between actors’ link creation processes. Indeed, links and dyads in nature are typically correlated. This observation led researchers to adopt the generalized random graphs, specifically the Snijders model (with or without conditioning on the dyad census), as the basis for analyzing the structures and the dynamics of naturally occurring and artificial networks [16]. Sub-structures occurring at rates that are significantly larger than the predictions of this model are considered “motifs” that require specific explanatory mechanisms [10].

The results of this research indicate that all the online distance learning networks exhibit clustering that is compatible with the expectation of the Snijders model; transitive (or feed-forward-loop) configurations are not motifs in these networks. Thus, clustering can be explained by actors choosing their response partners at random, subject to their inherent total response capacities. There is no need for an additional mechanism to explain clustering in online distance learning networks. It is instructive to contrast this behavior with the behavior of people in social networks and genes in gene networks.

The underlying mechanism in social networks is “cognition balance,” which aims to overcome dissonance and inconsistency in cognition among actors [17-19]. For example, when dissonances arise between people, they attempt to reduce them by persuading other people, who will persuade more people, and so on; people are clustered via multipath, transitive loops.

In gene networks, each node is a transcription factor (transferring genetic information from DNA to RNA). A node regulates some other nodes, that is, determines the rate of transcription of the other nodes. Nodes are triggered by external, inducing stimuli. In a feed-forward loop, a node i regulates a second node j, and both are needed to regulate a third node k. This structure functions as a persistence detector: only a persistent stimulus to i can activate i and then j, which leads to expression of k. On the other hand, even a temporary removal of the stimulus on i will lead to rapid turn-off of the expression of k [10].

Online learning networks are different: they are broadcast networks, in which each actor sees all postings and responses. Thus, the need to settle conceptual inconsistencies or dissonances in perceptions decreases. Similarly, there is no need for mediators of the type used in the gene networks.
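The degree-constrained randomization referred to in this section can be illustrated with a simple edge-swapping sketch in Python. It is an assumption-laden illustration, not the Snijders procedure used in the study: the function name `degree_preserving_rewire` is hypothetical, and the swaps preserve every actor's in-degree and out-degree but do not hold the number of mutual dyads fixed, so this corresponds to the variant without conditioning on the dyad census.

```python
import random


def degree_preserving_rewire(adj, n_swaps, rng=None):
    """Return a randomized copy of adj with unchanged in- and out-degrees.

    Each accepted step replaces links a->b and c->d with a->d and c->b.
    Mutual dyad counts are NOT preserved by this simplified sketch.
    """
    rng = rng or random.Random()
    n = len(adj)
    new = [row[:] for row in adj]  # work on a copy, leave the input intact
    edges = [(i, j) for i in range(n) for j in range(n) if new[i][j]]
    if len(edges) < 2:
        return new

    for _ in range(n_swaps):
        x = rng.randrange(len(edges))
        y = rng.randrange(len(edges))
        a, b = edges[x]
        c, d = edges[y]
        # Skip swaps that would create a self-link or a duplicate link.
        if a == d or c == b or new[a][d] or new[c][b]:
            continue
        new[a][b] = new[c][d] = 0
        new[a][d] = new[c][b] = 1
        edges[x] = (a, d)
        edges[y] = (c, b)

    return new


# Usage on a small assumed response graph with 5 actors.
observed = [
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 0, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0],
]
randomized = degree_preserving_rewire(observed, 200, random.Random(7))
print([sum(row) for row in randomized])        # out-degrees, same as observed
print([sum(col) for col in zip(*randomized)])  # in-degrees, same as observed
```

Counting transitive configurations in many such randomized graphs and comparing the counts with the observed value gives a rough picture of whether clustering exceeds what the response capacities alone would produce.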

6. Conclusion

This research indicates that in typical online distance learning networks, and unlike social or biochemical networks, clustering can be explained by actors choosing their response partners at random, subject to their inherent total response capacities. There is no need for an additional mechanism. The mechanisms underlying social and biochemical networks are not necessary in broadcast networks.

It should be noted that clustered response triads are the building blocks of more complex cohesive structures such as response-cliques, which facilitate constructing knowledge by consensus [20]. If the goal of the online distance learning network is indeed knowledge construction, then suitable collaborative features (positive interdependence [21, 22]) should be designed in order to initiate the cognition balance mechanism. Indeed, in a comparative case study analysis, Aviv, Erlich, and Ravid [23] found significant clustering in one special online distance learning network that was designed and monitored to work as a team to reach a consensual goal (submission of a joint proposal); no such clustering was found in the more common Q&A-type network. It seems that the goal-directed design of the team network forced its participants to reach consensus, which led to the cognition balance mechanism. This intriguing idea should be further explored on a much larger scale.

This research focused on clustered triads. This is only one characteristic of networks. There are many others: higher-order clustering, degree and power distribution, and cliquishness, to name a few. Some of these features affect the behavior of various types of networks. For example, certain degree distributions (the so-called “scale free” distributions [24]) characterize a large number of extremely large networks [25, 26]. “Small-World” topology (a regular connectivity pattern with few shortcuts) supports synchronization between nodes [27]. These network effects are all interdependent, and can be incorporated into a more general analysis using parametric models, such as p* [28, 29], or discriminative classifiers [30]. Such comparative global analyses are very useful in revealing the nature and mechanisms underlying online distance learning networks. These analyses will be the subject of future research.

Acknowledgement We thank Gila Haimovic for critical review of the manuscript.

References

[1] A. Martinez, Y. Dimitriadis, B. Rubia, E. Gomez, L. Garrachon, and J. A. Marcos, "Studying Social Aspects of Computer-Supported Collaboration with a Mixed Evaluation Approach," in Proceedings of Computer Support for Collaborative Learning (CSCL 2002) Conference, G. Stahl, Ed. Mahwah, NJ: Lawrence Erlbaum, 2002, pp. 631-632.
[2] H. Cho, M. Stefanone, and G. Gay, "Social Network Analysis of Information Sharing Networks in a CSCL Community," in Proceedings of Computer Support for Collaborative Learning (CSCL) 2002 Conference, G. Stahl, Ed. Mahwah, NJ: Lawrence Erlbaum, 2002, pp. 43-50.
[3] C. Reffay and T. Chanier, "Social Network Analysis Used for Modeling Collaboration in Distance Learning Groups," in Lecture Notes in Computer Science (LNCS), S. A. Cerri, G. Guarderes, and F. Paraguaco, Eds., 2002, pp. 31-40.
[4] R. Aviv, Z. Erlich, G. Ravid, and A. Geva, "Network Analysis of Knowledge Construction in Asynchronous Learning Networks," Journal of Asynchronous Learning Networks, vol. 7, pp. 1-23, 2003.
[5] C. Haythornthwaite, M. Kazmer, J. Robins, and S. Shoemaker, "Community Development Among Distance Learners: Temporal and Technological Dimensions," Journal of Computer Mediated Communication, vol. 6, 2000.
[6] R. Aviv, Z. Erlich, and G. Ravid, "Reciprocity Analysis of Online Learning Networks," in Elements of Quality Online Education: Engaging Communities, Wisdom from the Sloan Consortium, vol. 2, Wisdom Series, J. C. Moore, Ed. Needham, MA: SLOAN-C, 2005.
[7] S. Rafaeli, F. Sudweeks, E. Mabry, and J. Konstan, "ProjectH: A Collaborative Qualitative Study of Computer-Mediated Communication," in Network and Netplay: Virtual Groups on the Internet, F. Sudweeks, M. L. McLaughlin, and S. Rafaeli, Eds. Menlo Park, CA: AAAI/MIT Press, 1998, pp. 265-282.
[8] R. Aviv, Z. Erlich, and G. Ravid, "Analysis of Reciprocity and Transitivity in Online Collaboration Networks," Submitted to Connections, 2005.
[9] S. Wasserman and K. Faust, Social Network Analysis: Methods and Applications. Cambridge, UK: Cambridge University Press, 1994.
[10] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii, and U. Alon, "Network Motifs: Simple Building Blocks of Complex Networks," Science, vol. 298, pp. 824-827, 2002.
[11] S. Rafaeli, "From New Media to Communication," in Sage Annual Review of Communication Research: Advancing Communication Science. Beverly Hills, CA: Sage, 1988, pp. 110-134.
[12] S. Rafaeli and R. J. LaRose, "Electronic Bulletin Boards and "Public Goods" Explanations of Collaborative Mass Media," Communication Research, vol. 20, pp. 277-297, 1993.
[13] P. W. Holland and S. Leinhardt, "The Statistical Analysis of Local Structure in Social Networks," in Sociological Methodology, D. R. Heise, Ed. San Francisco, CA: Jossey Bass, 1976, pp. 1-45.
[14] T. A. B. Snijders, "Enumeration and Simulation Methods for 0-1 Matrices with Given Marginals," Psychometrika, vol. 56, pp. 397-417, 1991.
[15] NetMiner CYRAM Co. Ltd, "NetMiner 2.5," 2004.
[16] M. E. J. Newman, S. H. Strogatz, and D. J. Watts, "Random Graphs with Arbitrary Degree Distributions and Their Applications," Phys. Rev. E, vol. 64, pp. 026118, 2001.
[17] F. Heider, The Psychology of Interpersonal Relations. New York, NY: John Wiley, 1958.
[18] L. Festinger, A Theory of Cognitive Dissonance. Evanston, IL: Row, Peterson & Co., 1957.
[19] D. Cartwright and F. Harary, "Structural Balance: A Generalization of Heider's Theory," Psychological Review, vol. 63, pp. 277-293, 1956.
[20] R. Aviv, Z. Erlich, and G. Ravid, "Mechanisms and architectures of online learning communities," presented at The 4th IEEE International Conference on Advanced Learning Technologies (ICALT), Aug. 30 - Sept. 1, 2004, Joensuu, Finland, 2004.
[21] D. W. Johnson and R. T. Johnson, Positive Interdependence: The Heart of Cooperative Learning. Edina, MN: Interaction, 1992.
[22] D. W. Johnson and R. T. Johnson, Learning Together and Alone: Cooperative, Competitive and Individualistic Learning. Needham Heights, MA: Allyn and Bacon, 1999.
[23] R. Aviv, Z. Erlich, and G. Ravid, "Response Neighborhoods in Online Learning Networks: A Quantitative Analysis," Educational Technology & Society, vol. 8, 2005.
[24] A. L. Barabasi and R. Albert, "Emergence of Scaling in Random Networks," Science, vol. 286, pp. 509-512, 1999.
[25] M. E. J. Newman, A. L. Barabasi, and D. J. Watts, The Structure and Dynamics of Networks. Princeton, NJ: Princeton University Press, 2003.
[26] S. N. Dorogovtsev and J. F. F. Mendes, Evolution of Networks: From Biological Nets to the Internet and WWW. Oxford: Oxford University Press, 2004.
[27] M. Barahona and L. M. Pecora, "Synchronization in Small-World Systems," Physical Review Letters, vol. 89, 2002.
[28] C. J. Anderson, S. Wasserman, and B. Crouch, "A p* Primer: Logit Model for Social Networks," Social Networks, vol. 21, pp. 37-66, 1999.
[29] S. Wasserman and P. E. Pattison, "Logit Models and Logistic Regression for Social Networks. Part I. An Introduction to Markov Graphs and p*," Psychometrika, vol. 60, pp. 401-426, 1996.
[30] M. Middendorf, E. Ziv, C. Adams, J. Hom, R. Koytcheff, C. Levovitz, G. Woods, L. Chen, and C. Wiggins, "Discriminative Topological Features Reveal Biological Network Mechanisms," BMC Bioinformatics, vol. 5, pp. 181, 2004.
