A GUIDE FOR THE VISUALLY PERPLEXED: VISUALLY REPRESENTING SOCIAL NETWORKS
Sean F. Everton Stanford University
© Stanford University January 2004 A-1 Version .30
INTRODUCTION

Network analysts have long used sociograms (network diagrams) to visualize the networks they are analyzing. A common technique that analysts use to draft a sociogram is to construct it around the circumference of a circle. The circle helps organize the data, but the order in which analysts place the points is determined only by their attempt to keep the number of lines connecting the various points to a minimum. Typically, researchers using this technique engage in a trial-and-error drafting process until they reach an aesthetically pleasing result (Scott 2000). While such a process can make the structure of relations clearer, the relations between the sociogram’s points reflect no specific mathematical properties. The points are arranged arbitrarily and the distances between them are meaningless.

Not surprisingly, how social network data are spatially arranged in graphs influences how viewers perceive a social network’s structural characteristics (McGrath, Blythe, and Krackhardt 1997). Thus, if we wish to infer “something about the actual sociometric properties of a network, then the physical distance between points should correspond as closely as possible to the graph theoretical distances between them” (Scott 2000:148). To this end, researchers have in recent years developed a number of techniques (e.g., metric and non-metric multidimensional scaling, correspondence analysis, and spring-embedded algorithms) that mathematically represent the points in space.

This guide provides an overview of how to use these various techniques to visually represent one- and two-mode networks. It begins by examining how to enter, manipulate and prepare social network data using Microsoft’s Access and Excel programs (Chapter 1). It then demonstrates how to perform initial network analysis in Ucinet (Borgatti and Everett 1997),1 which is a network analysis software program.
Once the data are prepared, the guide looks at how to visually represent one-mode (Chapter 2) and two-mode (Chapter 3) networks using two visualization packages, Mage and Pajek. Mage was developed as a device to be used in molecular modeling (Richardson and Richardson 1992). It produces elegant three-dimensional illustrations that appear as interactive computer displays. Researchers can rotate Mage images, turn parts of the displays on or off, use the mouse to select and identify various points of the network, and animate changes between different arrangements of objects.2 Appendix A provides guidance for editing Mage files (kinemage) in order to take advantage of these features.

Pajek, which is Slovenian for “Spider,” is a network analysis and graph drawing program that has specifically been designed to handle extremely large data sets. It is still in its development stage and can be downloaded for noncommercial use free of charge from the Pajek web site.3 An advantage of Pajek is that its developers are continually updating it, including more and more features that social network analysts use to explore social networks.4

1 UCINET can be purchased from Analytic Technologies (104 Pond Street; Natick, MA 01760) either by phone (508-647-1903) or directly from their web page www.analytictech.com.
2 For more information on Mage, see the article by Freeman, Webster, and Kirke (1998), and visit the following URL: http://www.faseb.org/protein/kinemages/kinpage.html where the program can be downloaded free.
3 Pajek’s latest iteration can be downloaded free for noncommercial use at: http://vlado.fmf.unilj/pub/networks/pajek.
i Version .42
After exploring how to visualize simple one- and two-mode social networks, the manual then turns to more complex visualization issues. Chapter 4 explores how to visualize social networks over time, while Chapter 5 (forthcoming) looks at various block-modeling techniques available in Ucinet and Pajek.
Note: Version .42 of the manual corrects typographical errors and incorrect references to various figures throughout the manual. It also includes an updated glossary.
4 For example, Pajek .73 included, for the first time, a block modeling option that creates block models based on structural or regular equivalence.
1. GATHERING AND PREPARING SOCIAL NETWORK DATA

We can gather and prepare social network data in a variety of ways. Here we use Microsoft Access 97 and Excel 97 in order to demonstrate how to gather and prepare the data of one- and two-mode networks.

1.1 Gathering and preparing one-mode social network data

One-mode networks consist of a single set of actors. They differ from two-mode networks in that two-mode networks consist of two sets of actors or one set of actors and one set of events. Actors can be people, groups, organizations, corporations, nation-states, etc. The connections (i.e., relations) between such actors can be friendship or kinship ties, material transactions such as business transactions, the import or export of goods, communication networks involving the sending or receiving of messages, etc.

An example of a one-mode network, one that we will use throughout this manual, is Padgett’s Florentine Families Network (Breiger and Pattison 1986; Padgett and Ansell 1993). Padgett and Ansell collected data on the marriage and business ties (i.e., relations) between 16 prominent Florentine families in 15th century Florence. Both sets of ties were nondirectional and dichotomous. A marital tie was determined to exist if a member of one family married a member of another family while a business tie was determined to exist if a member of one family granted credits, made a loan, or entered into a joint partnership with a member of another family (Wasserman and Faust 1994). For our purposes here we will use the marital tie data.

1.1.1 Gathering and manipulating one-mode social network data

Because of the interchangeability of Microsoft programs we can use either Access or Excel to enter social network data. Excel includes an “autocomplete” feature that compares the text you are typing into a cell with text already entered into the same column. If the same word has been used before, it then completes typing the entry for you.
This feature increases accuracy (e.g., consistently spelling the same name the same way each time) and reduces input time, so we recommend, when possible, that you enter social network data initially into Microsoft Excel. You can later import the Excel data into Access. Because we use relatively small networks as examples, it would actually be quicker to enter them directly into Access. We use Excel here, however, in order to demonstrate the steps you will want to take with much larger datasets.

We begin by entering the Padgett data into Excel.5 To do so we enter the data into two columns. As can be seen in Figure 1.1, the first column lists the 16 families while the second lists the families with which they have marital ties. Obviously, families with more than one marital tie will be listed more than once in the first column. For example, the Albizzi family has marital ties with the Ginori, Guadagni and Medici families, so it appears three times in the first column. If you look down the first column to the Guadagni family, you will note that it lists a marital tie with the Albizzi family. This is as it should be since the marital ties between the families are reciprocal.

5 The Padgett data are available in matrix form in Appendix B of Wasserman and Faust (1994:744) and in Figure 2.1 in Chapter 2 of this manual.
In this dataset, the Pucci family has no marital ties with any of the other families. To record this so that we ultimately end up with a square matrix, we first list the Pucci family in column A with a blank cell next to it in column B. Then we list the Pucci family in column B with a blank cell next to it in column A.

Figure 1.1: Padgett Data Entered into Microsoft Excel 97 Worksheet
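The two-column layout described above can be sketched in a few lines of Python. This is only an illustration of the data shape, not part of the Excel procedure: the function name `to_rows` is hypothetical, and only the Albizzi and Pucci rows mentioned in the text are included.

```python
# Hypothetical stand-in for the two-column worksheet: each pair is one row,
# (column A, column B). Only ties named in the text are listed here.
ties = [("Albizzi", "Ginori"), ("Albizzi", "Guadagni"), ("Albizzi", "Medici")]


def to_rows(ties, isolates):
    """Return worksheet rows with every tie listed in both directions."""
    rows = []
    for a, b in ties:
        rows.append((a, b))
        rows.append((b, a))  # marital ties are reciprocal
    for name in isolates:
        rows.append((name, ""))  # isolate in column A, blank cell in column B
        rows.append(("", name))  # isolate in column B, blank cell in column A
    return sorted(rows)


rows = to_rows(ties, isolates=["Pucci"])
for a, b in rows:
    print(f"{a:10} {b}")
```

Listing the isolate once in each column is what later guarantees that it gets both a row and a column in the square matrix.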
After you finish entering the data, you will, of course, want to save it and exit Excel, so that you can move to the next step of importing it into Access.

1.2 Gathering and preparing two-mode social network data

Two-mode networks differ from one-mode networks in that rather than consisting of a single set of actors, they either consist of two sets of actors, or one set of actors and one set of events. Typically, researchers refer to them as affiliation networks, but they have also been referred to as membership networks, dual networks and hypernetworks (Faust 1997; Wasserman and Faust 1994). Affiliation networks are “non-dyadic because the affiliation relation relates each actor to a subset of events, and relates each event to a subset of actors” (Faust 1997:158).
An example of a two-mode network is Davis’s Southern Club Women (Breiger 1974; Davis, Gardner, and Gardner 1941). Davis and his colleagues recorded the observed attendance of 18 Southern women at 14 social events.

1.2.1 Gathering and manipulating two-mode social network data

As we did with the Padgett data, we enter the data into two columns.6 However, in this case the form of the data differs in that the first column lists the women while the second lists the number of the event that they attended.

Figure 1.2: Southern Women Data Entered Into Microsoft Excel 97 Worksheet
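As a rough illustration of this two-column layout, the sketch below stores one (woman, event) pair per attendance and turns the pairs into a women-by-events matrix. Laura's seven events are taken from this manual; "Woman B", her events, and the function name `affiliation_matrix` are hypothetical and exist only to make the example runnable.

```python
# One (woman, event) pair per attendance, as in the two-column worksheet.
# Laura's events come from the manual; "Woman B" is a hypothetical second actor.
attendance = [("Laura", e) for e in (1, 2, 3, 5, 6, 7, 8)]
attendance += [("Woman B", e) for e in (2, 9)]


def affiliation_matrix(pairs, n_events):
    """Build a women-by-events 0/1 matrix from (woman, event) pairs."""
    women = sorted({w for w, _ in pairs})
    matrix = {w: [0] * n_events for w in women}
    for w, e in pairs:
        matrix[w][e - 1] = 1  # events are numbered from 1
    return matrix


matrix = affiliation_matrix(attendance, n_events=14)
for woman, row in matrix.items():
    print(f"{woman:8}", row)
```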
It is important to note that each woman is listed separately for every event she attended. Thus, Laura is listed seven times (with the corresponding event number) because she attended seven different events (1, 2, 3, 5, 6, 7 & 8). After we finish entering the data, we need to save it, so that we can then import it into Access. Because we import, manipulate, export and read two-mode data in the same way we do one-mode data, in what follows we illustrate the process with only one-mode data, but there is no reason why the same techniques cannot be applied to two-mode data.

6 The Southern Women data is available in matrix form in Figure 3.1 in Chapter 3 of this manual.

1.3 Importing social network data into Access 97

The next step in the process is importing this data into Microsoft Access 97. When you first open Access you will see a dialog box that looks like the one in Figure 1.3. Because we are creating a new database, we will choose between the “Blank Database” and “Database Wizard” options. The former, as its name implies, opens up a blank database while the latter initiates a “wizard” that is quite helpful in setting up databases. It provides users with a series of “readymade” databases that can be readily adapted for other purposes. Our purpose here, however, is not to provide an introduction to Access but simply to show how we can import and manipulate network data using Access. Thus, we will choose the “Blank Database” option. For those who are interested in learning more about Access, we suggest you consult the book Sams Teach Yourself Access 97 in 21 Days (Eddy, Cassel, Goodling, and Stewart 1998). Once you have created a database, you will instead choose the “Open an Existing Database” option on later visits; the database should appear in the list of files just below this option.

Figure 1.3: Access’s Opening Dialog Box
After choosing the “Blank Database” option, you will see a screen that looks similar (but probably not identical) to the one that appears in Figure 1.4.
Figure 1.4: Access’s New Database Dialog Box
Figure 1.5: Database Window for Visualization Database
At this point you will want to give your file a name and then select the “Create” button. (Here we have given it the name “Visualization.”) Selecting this opens a new database window similar to the one shown in Figure 1.5. Under the “File” menu select “Get External Data.” This provides you with two choices: either to “Import” data or to “Link Files.” Select “Import.” This will bring up a dialog box (Figure 1.6) that allows you to first find the Excel spreadsheet you created earlier and then import it. Note that the box provides a number of criteria by which to locate your files. It even provides a “Find” function if you are unsure as to where you saved your Excel file. The important thing here, though, is that in the “Files of Type” box you have selected “Microsoft Excel.”

Figure 1.6: Access’s Import Dialog Box
Click on the “Import” button, and Access will bring up its Import Spreadsheet Wizard (see Figure 1.7). As you can see, this wizard initially asks which Excel worksheet you want to import. Currently, we are only interested in the Padgett data, which in this case is the default that Access has selected. Click on the “Next” button, which takes you to the next dialog box (see Figure 1.8). This box asks whether the first row of the data contains column headings. In this case it does not, so we do not check the box and move on to the following dialog box by clicking on the “Next” button.
Figure 1.7: Access’s Import Spreadsheet Wizard – Worksheet Options
Figure 1.8: Access’s Import Spreadsheet Wizard – Column Heading Options
This next dialog box (Figure 1.9) asks where we want to store the data: in an existing table or in a new one. Here, we select the new table option.
Figure 1.9: Access’s Import Spreadsheet Wizard – Data Storage Options
The next dialog box (Figure 1.10) provides users with the opportunity to assign names to fields. Here, we assign Field 1 the name “Family” and Field 2 the name “Marital Tie.”

Figure 1.10: Access’s Import Spreadsheet Wizard – Field Options
The next dialog box (Figure 1.11) asks whether you want Access to add a primary key to the table. In this case, we will say yes, although whether you do will largely depend on the data being imported and whether it already contains a field you wish to designate as the primary key. For more information on primary keys see Eddy et al. (1998). The final dialog box (not shown) asks you to assign a name to the table you are creating. In this case we use the name “Padgett.”

Figure 1.11: Access’s Import Spreadsheet Wizard – Primary Key Options
Once the import process is complete Access will return to the standard database window displayed in Figure 1.5 except now it will contain a new table. Clicking on the “Open” button opens a table similar to the one displayed in Figure 1.12.
Figure 1.12: Opened Padgett Table in Access
1.4 Creating social network matrices in Access 97

The next step in the process is to create a crosstabulation of the Padgett data so that we can export it as a matrix to Excel and ultimately to Ucinet. At the database window (see Figure 1.5) select the “Queries” tab. Click on the “New” query button, and this will bring up a dialog box similar to the one displayed in Figure 1.13. Select the “Crosstab Query Wizard” option and click “OK.” This will bring up the Crosstab Query Wizard, which guides us through the process of creating a crosstabulation.
Figure 1.13: Access’s Query Dialog Box
The wizard first asks (see Figure 1.14) which tables or queries will be used to create the crosstab. Since Access is a relational database, it allows us to use multiple tables in creating our queries. What is extremely helpful is that if, after we have created a crosstab (or other query), we make changes to the table(s) on which it is based, Access automatically updates the crosstab.

Figure 1.14: Access’s Crosstab Query Wizard
In this case we only have one table to select (Padgett) so we highlight it and click on the “Next” button. The wizard then asks (Figure 1.15) which field’s values we want as the row heading. Here we select “Family,” move it (using the arrow button) from the “Available Fields” to the “Selected Fields” box and then click on the “Next” button.

Figure 1.15: Access Crosstab Query Wizard – Row Heading Options
Next, the wizard (Figure 1.16) asks which field’s values we want as the column heading. Here we select “Marital Tie” and again click on the “Next” button. Finally, Access asks what number we want calculated for each column and row intersection (Figure 1.17). Access provides a number of options. In this instance we select “ID” in the field box and “count” in the function box. Access also asks whether we want to summarize each row. This can be a helpful statistic, so select this box as well.
Figure 1.16: Access Crosstab Query Wizard – Column Heading Options
Figure 1.17: Access Crosstab Query Wizard – Calculation Options
The final dialog box (not shown) asks what we wish to name the crosstab (it does provide a default name). Type in a name and click on the “Finish” button. This will open a crosstab similar to the one that appears in Figure 1.18. Figure 1.18: Access 97 Crosstabulation Query of Padgett Data
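For readers who want to see what the crosstab computes, the following is a minimal sketch in Python of the same idea: count the rows for each (row heading, column heading) pair and add a row total. It is not how Access implements the query; the function name `crosstab` and the small set of rows are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical rows standing in for the imported table: (Family, Marital Tie).
# The blank strings play the role of the empty cells entered for Pucci.
rows = [
    ("Albizzi", "Ginori"), ("Albizzi", "Guadagni"), ("Albizzi", "Medici"),
    ("Ginori", "Albizzi"), ("Guadagni", "Albizzi"), ("Medici", "Albizzi"),
    ("Pucci", ""), ("", "Pucci"),
]


def crosstab(rows):
    """Count rows per (row heading, column heading) pair, plus row totals."""
    counts = Counter(rows)
    row_labels = sorted({r for r, _ in rows})
    col_labels = sorted({c for _, c in rows})
    table = {r: {c: counts[(r, c)] for c in col_labels} for r in row_labels}
    totals = {r: sum(table[r].values()) for r in row_labels}
    return table, totals


table, totals = crosstab(rows)
print(table["Albizzi"], totals["Albizzi"])
```

Because the isolate was entered once in each column, it appears among both the row labels and the column labels, which keeps the resulting table square.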
Notice that the names of the families appear both down the left side (rows) and across the top (columns) as you would find in a typical matrix. The query includes a “Total of ID” column that tabulates (in this case) the number of marital ties that each family has with other families. It also includes a “<>” column that indicates, at least in this case, families that have no ties, as is the case with the Pucci family. The blank row indicates that none of the families have a marital tie with the Pucci family. A quick comparison of this data with Wasserman and Faust (1994:744) indicates that we have indeed imported and manipulated the data correctly.

1.5 Preparing data for Ucinet

The next step in the process is to prepare the data for analysis in Ucinet. To do this we first export the data from Access to Excel, and then copy the data from Excel into Ucinet. With the query open that you want to export to Excel, click on the “Tools” menu, select “Office Links,” and click on “Analyze It with MS Excel.” This opens the Excel program and exports the data into Excel (Figure 1.19) in a format that looks almost identical to the Access crosstabulation. First, delete the second row (blank) and the second (“Total of ID”) and third (“<>”) columns since these will not be part of our final matrix.7

Next, open Ucinet. Along the top of the screen you will find four buttons. The second opens the “Ucinet Spreadsheet.” In principle, the Ucinet spreadsheet should allow us to import Excel data directly into Ucinet. Unfortunately, it does not always work properly. If it does not, simply copy and paste the data from Excel to Ucinet. Once pasted, the data should look something like what you see in Figure 1.20.

7 Access creates these rows and columns as part of the crosstab query. The totals are useful for initially checking the data, but they are not needed for the matrix.
Figure 1.19: Exported Access Data into Excel Spreadsheet
Figure 1.20: The Ucinet Spreadsheet
Before we can analyze the data we need to fill the empty cells with zeroes. Ucinet has a feature that will perform this task for us, so all we need to do is go to the cell in the lower right-hand corner of the matrix. Next, click on the “Fill” icon that can be found on the toolbar. This should fill all the empty cells with zeroes. Next, we need to save the data. The “Save” function can be found under the “File” menu or can be activated by clicking on the “Floppy Disk” icon on the toolbar. Once you have saved the data, click on the “OK” button and you will exit the Ucinet Spreadsheet feature.8

8 We should do one last thing before analyzing the data. Whenever data is pasted and saved into Ucinet as we have done here, Ucinet’s “Display” function, for some reason, does not display the data completely. This is especially true for large datasets. Thus, it is worth reopening Ucinet’s Spreadsheet feature, opening the file and resaving it. Repeating this procedure seems to take care of the problem.
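The effect of the “Fill” step can be sketched as follows. This is an illustrative Python equivalent, not Ucinet's own routine: the small matrix, the labels, and the function names `fill_zeroes` and `is_symmetric` are assumptions. The symmetry check is added as a quick sanity test, since reciprocal marital ties should produce a symmetric matrix.

```python
# Hypothetical pasted matrix: None marks a cell left empty by the crosstab.
labels = ["Albizzi", "Ginori", "Medici", "Pucci"]
pasted = [
    [None, 1, 1, None],
    [1, None, None, None],
    [1, None, None, None],
    [None, None, None, None],
]


def fill_zeroes(matrix):
    """Return a copy of the matrix with every empty cell replaced by 0."""
    return [[0 if cell is None else cell for cell in row] for row in matrix]


def is_symmetric(matrix):
    """True when x[i][j] == x[j][i] for every pair of indices."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i] for i in range(n) for j in range(n))


filled = fill_zeroes(pasted)
for label, row in zip(labels, filled):
    print(f"{label:8}", row)
print("symmetric:", is_symmetric(filled))
```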
2. VISUAL REPRESENTATIONS OF ONE-MODE NETWORKS

As noted earlier, one-mode networks consist of a single set of actors and differ from two-mode networks in that the latter consist of two sets of actors or one set of actors and one set of events. We begin by visualizing symmetric one-mode matrices because, at least when it comes to using multidimensional scaling techniques, they are simpler to represent visually than are asymmetric one-mode matrices. For this, we use the marital ties of Padgett’s Florentine Families (discussed in Chapter 1). We first explore how to visually represent this social network using Mage and then repeat the process using Pajek. Next, we explore the somewhat more complicated task of visually representing asymmetric one-mode matrices. For this task, we use the “advice network” of Krackhardt’s (1987) High Technology Managers (discussed in more detail below).

2.1 Visualizing Symmetric One-Mode Matrices using Mage

Figure 2.1 presents the Padgett marriage data in matrix form. Note that the rows and columns are identical (i.e., the names of the various Florentine families) and x_ij = x_ji for all i and j.9

Figure 2.1: Adjacency Matrix of Padgett’s Florentine Families
9 This is as it should be since marital ties are, by definition, reciprocal.

The first task is to use this matrix to calculate a set of related coordinates. We then export both the matrix and its related coordinate files in a form readable by Mage.

2.1.1 Calculating coordinate files

As noted earlier, network analysts have long used sociograms to visualize social networks. A technique that was commonly used was to construct the data around the circumference of a circle. Unfortunately, while such a process can make the structure of relations clearer, the relations between the sociogram’s points reflect no specific mathematical properties. The points
are arranged arbitrarily and the distances between them are meaningless, which, depending on how they are arranged, can lead to varying interpretations of the data (McGrath, Blythe, and Krackhardt 1997). In recent years analysts have begun using a series of mathematical techniques to locate the points of a network in such a way that the distances between them are meaningful. Multidimensional scaling (MDS) is one such technique. It is a mathematical approach that uses the concepts of space and distance to represent a network’s internal structure, which, in turn, can help reveal, among other things, what actors are “close” to one another or potential cleavages between sets of actors (Wasserman and Faust 1994). The typical input to MDS is a one-mode symmetric matrix consisting of measures of similarity or dissimilarity between pairs of actors. Output generally consists of a set of estimated distances among pairs of actors that can be then represented in one-, two-, three- or higher-dimensional space (Kruskal and Wish 1978; Wasserman and Faust 1994). Using Ucinet we will compute the coordinates of the Padgett data using three-dimensional multidimensional scaling that, in turn, will then be used to place points representing the various families in 3-dimensional space. Ucinet provides users with a choice between metric and non-metric MDS. Metric MDS takes a given matrix of proximities that measure the similarities or dissimilarities among a set of actors and calculates a set of points in k-dimensional space, such that the distances between them correspond as closely as possible to the input proximities (Borgatti, Everett, and Freeman 1999).10 Metric distance differs from distance in graph theory. In graph theory, the distance between two points is measured in terms of the number of lines in the path that connects the two points. In MDS the distance between two points is the most direct route between them. 
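The contrast between the two notions of distance can be made concrete with a small sketch. The four-point chain and its coordinates below are invented for illustration, and `geodesic_distance` is a hypothetical helper that counts lines along the shortest path using breadth-first search.

```python
import math
from collections import deque


def geodesic_distance(adjacency, start, goal):
    """Number of lines on the shortest path from start to goal (BFS)."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return seen[node]
        for neighbor in adjacency[node]:
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    return math.inf  # no path: the two points are disconnected


# Hypothetical four-point chain A - B - C - D with made-up 2-D coordinates.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
coords = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (1.0, 1.0), "D": (0.0, 1.0)}

graph_d = geodesic_distance(adjacency, "A", "D")   # path A-B-C-D
euclid_d = math.dist(coords["A"], coords["D"])     # straight line from A to D
print(graph_d, euclid_d)
```

In this toy layout A and D are three lines apart in the graph but sit right next to each other on the page, which is exactly the kind of mismatch that arbitrary point placement can produce.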
As Scott puts it, “It is a distance that follows a route ‘as the crow flies’, and that may be across ‘open space’ and need not – indeed, it normally will not – follow a graph theoretical path” (Scott 2000:148-149).

There are some limitations to using metric MDS for visualizing social networks. Many relational data sets, such as the Padgett data, are binary in form. That is, they simply indicate either the presence or absence of a tie, and thus we cannot directly use such data to measure proximities. We first need to convert it into other measures, such as correlation coefficients, before calculating its metric properties. However, data conversion such as this may lead researchers to draw unjustifiable conclusions about the data. Even when the data are valued, metric assumptions may be inappropriate. For example, a family with four marital ties may not be twice as central as one with only two. While it may be legitimate to consider the former as being more central than the latter, it is difficult to be certain about how much more central it might be (Scott 2000:157).

Non-metric MDS procedures, like metric MDS procedures, use symmetrical adjacency matrices in which the cells show the similarities or dissimilarities among actors. However, unlike metric MDS procedures, they do not convert these values directly into Euclidean distances. Instead, they consider only rank order. They treat the data, in other words, as ordinal. Non-metric MDS procedures “seek a solution in which the rank ordering of the distances is the same as the rank ordering of the original values” (Scott 2000:157).

10 The Padgett data proximities represent similarities between the families. That is, a “1” in a matrix cell means that the two families represented by that cell share a marital tie.

Non-metric MDS is often preferred because it
tends to provide a better “goodness-of-fit” (stress) statistic. The lower the stress (0 = perfect fit), the better. Generally, stress levels below .1 are considered excellent while levels above .2 are considered unacceptable (Borgatti, Everett, and Freeman 1999). To illustrate the differences between the two methods we will employ both metric and non-metric MDS procedures, beginning with metric MDS and following with non-metric MDS.

2.1.1.1 Metric multidimensional scaling

Under the “Tools” menu, first select the “MDS” submenu, which provides a choice between “metric” and “non-metric” MDS. Choose “metric.” This brings up the following dialog box (Figure 2.2):

Figure 2.2: Metric MDS Dialog Box
The parameters of the Metric MDS option are as follows:

Input dataset: Name of file containing the adjacency matrix. Data type: Square symmetric matrix.

Number of dimensions (Default = 2): This represents the number of dimensions to use in representing items in Euclidean space. Change the default setting to 3.

Similarities or Dissimilarities? (Default = Similarities): This choice determines whether the data will represent similarities or dissimilarities between the nodes. If similarities, large values of X(i,j) will draw i and j close together on the MDS map. If dissimilarities, large values will push i and j apart on the map.

Starting Configuration (Default = Classic): This parameter tells Ucinet how to generate the initial location of points in k-dimensional space. It is important to realize that MDS solutions are not unique and are subject to convergence to local minima. The first point means that two or more sets of coordinates can be equally good (i.e., having the same stress level) but place points in radically different locations. The second point means that it is possible for the algorithm to fail to find the configuration with the least stress. If you suspect this has happened, it is advisable to run the program several times using random starting configurations (Borgatti, Everett, and Freeman 1999). The choices Ucinet provides are:

    Classic - Selecting this option performs Gower's “classical” metric ordination procedure.
    File - Reads starting coordinates from a UCINET dataset. If this option is chosen then the user must complete the parameter.
    Random - This option locates points randomly in space. As noted above, MDS procedures often yield lower stress levels when using a random starting configuration.

Adjust data to nearest Euclidean (Default = Yes): This procedure iteratively adjusts the data so that it obeys the triangle inequality.

Output dataset (Default = 'MetricMdsCoord'): This file will contain the Euclidean coordinates. Rather than using the default name, choose one that is related to the file you are working with. Here I named the Padgett MDS file “PadgMDS.”

Running this procedure produces both a scatterplot, which we do not need, and an output file that lists the MDS coordinates:

Figure 2.3:
Metric Multidimensional Scaling of Padgett Florentine Families

                      1        2        3
                  -----    -----    -----
 1  ACCIAIUOL     1.579   -0.278    0.237
 2  ALBIZZI       1.215    0.992    0.621
 3  BARBADORI     0.007   -1.030    0.566
 4  BISCHERI     -0.754    0.973    0.117
 5  CASTELLAN    -0.735   -0.452    0.745
 6  GINORI        1.056    1.141    1.337
 7  GUADAGNI      0.428    1.268   -0.091
 8  LAMBERTES     0.178    1.783    0.399
 9  MEDICI        0.896   -0.291    0.174
10  PAZZI         0.455   -0.164    1.846
11  PERUZZI      -0.983    0.354    0.674
12  PUCCI        -0.323    0.932    1.803
13  RIDOLFI       0.175   -0.190   -0.643
14  SALVIATI      0.873   -0.423    1.410
15  STROZZI      -0.790    0.109    0.022
16  TORNABUON     0.676    0.411   -0.647
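To give a feel for what the stress statistic mentioned earlier measures, here is a minimal sketch of a Kruskal-style stress-1 calculation. The manual does not give the exact formula Ucinet uses, so this formula is an assumption, and the three-point data and the function name `stress` are hypothetical. The idea is only that stress grows as map distances drift away from the target dissimilarities.

```python
import math
from itertools import combinations


def stress(dissimilarities, coords):
    """Kruskal-style stress-1 between target dissimilarities and map distances."""
    num = 0.0
    den = 0.0
    for i, j in combinations(range(len(coords)), 2):
        d = math.dist(coords[i], coords[j])        # distance on the map
        num += (d - dissimilarities[i][j]) ** 2    # squared misfit for this pair
        den += d ** 2
    return math.sqrt(num / den)


# Hypothetical example: the target dissimilarities form a 3-4-5 triangle.
target = [[0, 3, 4], [3, 0, 5], [4, 5, 0]]
perfect = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]    # reproduces the targets exactly
distorted = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]  # one point pulled inward

print(stress(target, perfect))
print(stress(target, distorted))
```

The first configuration reproduces every target distance and so has zero stress; the second does not, and its stress is correspondingly higher.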
2.1.1.2 Non-metric multidimensional scaling

Under the “Tools” menu, first select the “MDS” submenu, which provides a choice between “metric” and “non-metric” MDS. Choose “non-metric.” This brings up a dialog box that asks researchers to set a series of parameters (Figure 2.4).

Figure 2.4: Non-metric MDS Scaling Dialog Box
The parameters of the Non-Metric MDS procedure are defined as follows:

Input dataset: Name of file containing the adjacency matrix. Data type: Square symmetric matrix.

Number of dimensions (Default = 2): This represents the number of dimensions to use in representing items in Euclidean space. Change the default setting to 3.

Similarities or Dissimilarities? (Default = Similarities): This choice determines whether the data will represent similarities or dissimilarities between the nodes. If similarities, large values of X(i,j) will draw i and j close together on the MDS map. If dissimilarities, large values will push i and j apart on the map.

Starting Configuration (Default = Torsca): This parameter tells Ucinet how to generate the initial location of points in space. As we noted above, it is important to know that MDS solutions are not unique and are subject to convergence to local minima. The first point means that two or more sets of coordinates can be equally good (i.e., having the same stress level) but place points in radically different locations. The second point means that it is possible for the algorithm to fail to find the configuration with the least stress. If you suspect this has happened, it is advisable to run the program several times using random starting configurations (Borgatti, Everett, and Freeman 1999). The choices Ucinet provides are:

    Classic - Performs Gower's classical metric ordination procedure.
    Torsca - Uses principal components of rank-order data.
    File - Reads starting coordinates from a UCINET dataset. If this option is chosen then the user must complete the parameter.
    Random - This option locates points randomly in space. This procedure often yields lower stress levels and, surprisingly, better images because the coordinates do not end up as closely “bunched” together as they do with the Torsca starting configuration.

Print Diagnostics (Default = No): If Yes is selected, then dyads with large discrepancies between the proximity data and the plot distances will be printed.

Output dataset (Default = NonMetricMdsCoord): This file will contain the Euclidean coordinates. Rather than using the default name, choose one that is related to the file you are working with. For example, here I named the Padgett non-metric MDS file “PadgNMDS.”

Running this procedure produces both a scatterplot and an output file that lists the non-metric MDS coordinates:

Figure 2.5:
Non-Metric Multidimensional Scaling of Padgett Florentine Families Marital Ties (Ucinet output listing coordinates on three dimensions for the 16 families: ACCIAIUOL, ALBIZZI, BARBADORI, BISCHERI, CASTELLAN, GINORI, GUADAGNI, LAMBERTES, MEDICI, PAZZI, PERUZZI, PUCCI, RIDOLFI, SALVIATI, STROZZI, TORNABUON)
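The point that MDS solutions are not unique can be illustrated with a small sketch. It is only an approximation of what Ucinet does: the `stress` function below compares map distances directly with the input dissimilarities (a stress-1 style formula), whereas non-metric MDS would first replace the dissimilarities with monotonically fitted values. The four-point data set and all function names are hypothetical.

```python
import math

def pairwise_distances(coords):
    """Euclidean distance for every pair of points (i < j)."""
    n = len(coords)
    return {(i, j): math.dist(coords[i], coords[j])
            for i in range(n) for j in range(i + 1, n)}

def stress(dissimilarities, coords):
    """Stress-1 style badness of fit: 0 means the map reproduces the data exactly."""
    dist = pairwise_distances(coords)
    numerator = sum((dist[pair] - dissimilarities[pair]) ** 2 for pair in dist)
    denominator = sum(d ** 2 for d in dist.values())
    return math.sqrt(numerator / denominator)

# Hypothetical dissimilarities among four points (not the Padgett data).
target = {(0, 1): 1.0, (0, 2): 2.0, (0, 3): 2.2,
          (1, 2): 1.0, (1, 3): 1.5, (2, 3): 1.0}

config_a = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0)]
# The same map reflected and rotated: most points sit somewhere else.
config_b = [(-y, -x) for x, y in config_a]

stress_a = stress(target, config_a)
stress_b = stress(target, config_b)
print(round(stress_a, 4), round(stress_b, 4))
```

The two configurations place the points in different locations yet have the same stress, because reflecting or rotating a map leaves every inter-point distance unchanged.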
One can see that these coordinates differ from those displayed in Figure 2.3. Later, when we actually visualize these two sets of coordinates, we will be able to see whether the differences yield substantially different images.

2.1.2 Exporting adjacency matrices and related coordinate files in kinemage (Mage) format

Recent versions of Ucinet have (thankfully) simplified the task of preparing Ucinet files for visualization in Mage. At one time, analysts had to create kinemage files using a DOS program (uci2kin) that combined adjacency matrices with their related coordinate files. Ucinet has now incorporated this process into the program itself. To visualize the Padgett data using the coordinates calculated with metric MDS, under the “Tools” menu, select “Export” and then “Mage.” This brings up a dialog box (Figure 2.6) with the following parameters:
(Input) Network dataset: Name of file containing data to be exported. Data type: Adjacency matrix. In this example, PadgettM.##h.
(Input) Coordinate dataset: Name of file containing the coordinates of points for the layout of the data (e.g., coordinate output of metric or non-metric MDS). In this example, PadgMDS.##h.
Node attributes (if any): Name of file containing actor attributes, given as a vector of shared attributes so that (1,2,3,1,2,2) means that actors 1 and 4 share the same attribute, actors 2, 5, and 6 share the same attribute, and actor 3 has a different attribute from all the others. I do not use this in the example.
Ball Size: Use default; easily changed in kinemage file (see Appendix A).
Line Thickness: Use default; easily changed in kinemage file (see Appendix A).
Arrow Size: Use default.
Arrow Angle: Use default.
Font Size: Use default.
Output data file: Name of file to be created. Here, I used PadgettM.kin (the default).
Launch Mage on exit?: Feature in Ucinet that, in theory, allows researchers to launch Mage from within Ucinet. Unfortunately, it does not always work.
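To make the node attribute vector described above concrete, here is a minimal sketch that groups actors by their attribute code. The function name is hypothetical; only the example vector (1,2,3,1,2,2) comes from the parameter description.

```python
from collections import defaultdict

def group_by_attribute(attribute_vector):
    """Map each attribute code to the 1-based actor numbers that carry it."""
    groups = defaultdict(list)
    for actor, code in enumerate(attribute_vector, start=1):
        groups[code].append(actor)
    return dict(groups)

groups = group_by_attribute((1, 2, 3, 1, 2, 2))
print(groups)  # {1: [1, 4], 2: [2, 5, 6], 3: [3]}
```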
Figure 2.6:
Export Adjacency Matrices and Coordinate Files to Mage Dialog Box
After running the above procedure, Ucinet calls up another dialog box (if you chose “Yes” to the final parameter): Figure 2.7:
Launch Mage Dialog Box
Simply tell Ucinet where the Mage program is located, and it should open the Mage program for you. If it does not, open Mage manually.

2.1.3 Using Mage to visualize kinemage files

Upon opening Mage you are provided with an option to either proceed with or abort the program. Since we are interested in using it, select the “Proceed” button. This brings up three windows: a text window, a caption window and a graphics window. For now we are only interested in the graphics window, so double click on the blue title bar at the top of the screen. This should bring the graphics window to the front and hide the text and caption windows.[11]

[11] In order to save “ink” while printing, the background of the graphics window has been changed to white in Figure 2.7. When Mage opens, however, the graphics window begins with a black background.
Under the “File” menu, select “Open New File.” This brings up a dialog box from which you can select the kinemage file you wish to view. In this case we are interested in viewing the visual representations of Padgett’s Florentine Families marital data, so we first select the visual representation using metric multidimensional scaling (Figure 2.8).

Note that on the side of the display there are three control bars: “ZOOM,” “ZSLAB” and “ZTRAN.” Not surprisingly, the “ZOOM” bar allows users to “move” the object closer or farther away. The “ZSLAB” bar controls contrast while the “ZTRAN” bar controls brightness. Also along the right side of the screen is a series of “switches” that allow users to turn particular features of the image (e.g., nodes, labels, ties) off or on and thereby call attention to various structural properties. Later, we will see how we can control and define these switches.

Mage also permits users to rotate the image. Such rotation can uncover structural regularities that may not be readily observable at first glance. The colors of the nodes, ties and labels can be changed as well (see Appendix A).
Figure 2.8: Visual Representation of Padgett’s Florentine Families Using Metric Multidimensional Scaling
Figure 2.9 presents an image of Padgett’s Florentine Families using non-metric multidimensional scaling. While it differs from Figure 2.8, the difference is not substantial, and there is no clear visual advantage here to using non-metric, as opposed to metric, multidimensional scaling. This probably reflects the small size of the network. The differences between metric and non-metric MDS of large networks are often substantial. Moreover, metric MDS of large networks typically yields high stress levels.
Figure 2.9:
Visual Representation of Padgett’s Florentine Families Using Non-Metric Multidimensional Scaling
2.2 Visualizing Symmetric One-Mode Matrices using Pajek

Pajek does not use MDS to arrange a network’s nodes in visual space, but rather provides spring-embedding algorithms that place nodes in either two- or three-dimensional space in ways similar to MDS. It can also handle extremely large datasets and create kinemage files that can be visualized by Mage. Matrices have to be prepared in such a way that Pajek can read them. Again, the Padgett marriage data are used.

2.2.1 Exporting the adjacency matrix

The first step is to export the adjacency matrix from Ucinet. Under the “Data” menu, select “Export,” which provides us with a choice of exporting the data in a number of formats: DL, Krackplot, Mage, Pajek, Metis, Raw, Ucinet 3.0, and Excel. Under “Pajek,” choose “Network,” which brings up the following dialog box:
Figure 2.10
Ucinet Export to Pajek Dialog Box
The parameters are defined as follows:

Input dataset: Name of matrix file containing data to be exported. As before, simply select the name of the matrix you plan to export.
Dichotomize vals > than: Allows you to transform valued matrices into dichotomized matrices. Default = null.
Delete isolates: Allows you to delete isolated nodes.
[Input] - Coordinate dataset: Allows you to use coordinates calculated in Ucinet (e.g., MDS) for Pajek visualizations.
[Input] - Attribute dataset: Allows you to create attribute files for visualization with Pajek.
Output dataset: Provide the name of the file to be created.
Launch Pajek on exit?: Allows you to launch Pajek from within Ucinet once the data are exported.

After running this program, the following dialog box will appear if you chose to launch Pajek upon “exit”:
Figure 2.11: Launch Pajek Program Dialog Box
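Two of the export parameters listed above, “Dichotomize vals > than” and “Delete isolates,” are simple matrix transformations. The sketch below shows what they amount to, under the assumption that an isolate is a node with no ties in either direction; the function names and the four-actor matrix are hypothetical and do not reproduce Ucinet's own code.

```python
def dichotomize(matrix, cutoff):
    """Return a 0/1 matrix: 1 where the tie value is greater than cutoff."""
    return [[1 if value > cutoff else 0 for value in row] for row in matrix]

def drop_isolates(labels, matrix):
    """Remove nodes that have no ties in either direction."""
    n = len(matrix)
    keep = [i for i in range(n)
            if any(matrix[i][j] or matrix[j][i] for j in range(n) if j != i)]
    kept_labels = [labels[i] for i in keep]
    kept_matrix = [[matrix[i][j] for j in keep] for i in keep]
    return kept_labels, kept_matrix

# Hypothetical valued network of four actors; D has no ties at all.
labels = ["A", "B", "C", "D"]
valued = [[0, 3, 1, 0],
          [3, 0, 0, 0],
          [1, 0, 0, 0],
          [0, 0, 0, 0]]

binary = dichotomize(valued, cutoff=1)        # keep only ties stronger than 1
labels2, binary2 = drop_isolates(labels, binary)
print(labels2, binary2)  # ['A', 'B'] [[0, 1], [1, 0]]
```

Note that dichotomizing can itself create isolates: actor C's only tie has value 1, so it disappears at a cutoff of 1 and C is then dropped along with D.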
If all goes well (and this seems to work from time to time), Ucinet launches Pajek when you click the “OK” button. If not, open Pajek manually.

2.2.2 Visualizing with Pajek

When you open Pajek you will initially see that it presents a number of menu options. A casual “stroll” through these immediately conveys the sense that Pajek allows users to perform a number of network operations, from basic analyses of networks to creating and analyzing partitions, permutations, clusters, etc. In this manual we merely scratch the surface of Pajek’s capabilities.

After opening Pajek, we need to first import the data prepared and exported by Ucinet. Under the “File” menu, select “Network” and then “Read,” as is illustrated in Figure 2.12 below. Alternatively, you can click on the “open file” icon to the left of the Network dialog box in Pajek’s Main Screen. Either way, Pajek automatically looks for files with a “.net” extension. Click on the “.net” file you exported from Ucinet. In this case it is “PadgettM.net.” Pajek’s report box will appear indicating that it has successfully read the data. In this case the report box tells us that Pajek read 56 lines (see Figure 2.13).
Figure 2.12: Opening Network Data in Pajek
Figure 2.13
Pajek’s Report Box
Close the report box by clicking on the “X” box in the upper right hand corner, and you will return to Pajek’s main screen, except that the name of the data file we just read into Pajek now appears in the “Network” drop list (Figure 2.14).
Figure 2.14
Pajek’s Main Screen after Reading Padgett Marriage Network Data
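The “.net” file read in the steps above is plain text. As a rough sketch of its layout, assuming the usual Pajek convention of a “*Vertices” header, numbered vertex labels, and an “*Arcs” (directed) or “*Edges” (undirected) list of ties, the function below renders a small adjacency matrix as such a file. The function name and the three-actor network are hypothetical, and the file Ucinet actually writes may differ in detail.

```python
def to_pajek_net(labels, matrix, directed=True):
    """Render an adjacency matrix as the text of a Pajek network file."""
    lines = [f"*Vertices {len(labels)}"]
    lines += [f'{i} "{label}"' for i, label in enumerate(labels, start=1)]
    lines.append("*Arcs" if directed else "*Edges")
    n = len(matrix)
    for i in range(n):
        # For undirected output, list each pair once (upper triangle only).
        start = 0 if directed else i + 1
        for j in range(start, n):
            if i != j and matrix[i][j]:
                lines.append(f"{i + 1} {j + 1} {matrix[i][j]}")
    return "\n".join(lines)

# Hypothetical symmetric network: A and B are tied, C is an isolate.
labels = ["A", "B", "C"]
matrix = [[0, 1, 0],
          [1, 0, 0],
          [0, 0, 0]]

arcs_text = to_pajek_net(labels, matrix, directed=True)
edges_text = to_pajek_net(labels, matrix, directed=False)
print(arcs_text)
```

Written as arcs, the single symmetric tie between A and B appears twice (“1 2 1” and “2 1 1”); written as edges, it appears once.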
Next, under the “Draw” menu, select “Draw” (i.e., not “Draw-Partition,” “Draw-PartitionVector,” “Draw-Vector,” or “Draw-Select All”; we will return to some of these options later, but for now we stay with a relatively simple case, primarily because we are dealing with one-mode data that do not lend themselves to these other forms of analysis). After you select “Draw,” Pajek brings up the “Draw” screen where the image will appear. The data’s initial appearance depends on which of Pajek’s starting layout options has been chosen or on any coordinate data exported from Ucinet. The “Draw” screen also brings up a new set of menu selections from which we will next choose one of two drawing programs to graphically represent the Padgett marriage data.

Before drawing the network data we first have to tell Pajek whether the values assigned to the lines connecting the vertices represent similarities or dissimilarities between the vertices. In the case of the Padgett data, a value of “1” indicates the presence of a tie while a value of “0” indicates the absence of one, so the values are indicators of similarity between the various families. To tell Pajek that the Padgett data values represent similarities, under the “Options” menu, select “Value of Lines” and then “Similarities.”

Pajek uses two “spring-embedded” algorithms for visualizing network data: Kamada-Kawai and Fruchterman Reingold. Both algorithms think of the points as pushing and pulling on one
another and seek to find an optimum solution where there is a minimum amount of stress on the springs connecting the whole set of points (Freeman 2000).

2.2.2.1 The Kamada-Kawai Spring Embedded Algorithm

The Kamada-Kawai (1989) algorithm is based on an assumed attraction between adjacent points and an assumed repulsion between non-adjacent points and allocates points in two-dimensional space. To use this algorithm, under the “Layout” menu, select “Kamada-Kawai.” You are next given the option of allowing the algorithm to “freely” distribute the various nodes and their respective edges in visual space, fixing the first and last nodes, or identifying a node you would like to appear in the middle of the drawing (e.g., the most central actor). Using the “Free” option you should get a graphical representation of the Padgett marriage data that is similar (but not identical) to the one illustrated in Figure 2.15.[12]

The Kamada-Kawai algorithm has several options worth noting. One is that it allows analysts to fix the position of certain vertices (e.g., a specific class), and then optimize the position of all other vertices with the “Fix selected vertices” command. Pajek also allows you to fix the first and last vertices in a network (using the “Fix first and last vertices” command), or place a selected vertex in the middle of the drawing (using the “Fix one in the middle” command).

[12] It is important to note that there is no unique “solution” for either of these algorithms, so that every time we use them, Pajek will draw the network differently. In spite of this, repeated drawings of the same network data tend to resemble one another. It is generally a good idea to visualize the data using the energy commands more than once. Results do depend on the starting position of vertices, so different starting positions may (and often do) yield different results. The results are generally similar, but it seems logical that using an energy command a second time will yield a more accurate drawing of the data, since it will begin with starting positions that are not random and that reflect, to a certain extent, the correct relationship between the various nodes.
Figure 2.15: Visual Representation of Padgett’s Marriage Data Using Kamada-Kawai
Note that in Pajek, unlike in Mage, the lines connecting the various nodes (edges) are represented as arrows. This is because Pajek read the Padgett data exported from Ucinet as “arcs” rather than as “edges,” which they technically are. This is generally not a problem if the social network you are visualizing consists entirely of arcs or entirely of edges. However, if a social network consists of both arcs and edges, then you may need to edit the data if you want the arrows in your graphs to be properly represented. See Appendix B on the editing and printing of Pajek images.

The image itself captures some of the dynamics of this social network. The Medici family, which history and a variety of centrality measures have told us was the most central family, clearly appears to be one of the most central families, if not the most central, while the Pazzi, Acciaiuol, Lambertes, and Ginori families fall along the periphery. It is interesting to note, however, that the Pucci family, which has no marital ties to any of the other families in the network, is located more centrally than are some of the other families. This is nonsensical and points to a limitation of the Kamada-Kawai algorithm. Because in this algorithm unconnected points neither attract nor repel other points in the network, it randomly places unconnected points in social space, such that they occasionally are placed nonsensically. Repeated use of this algorithm to visualize these data seems to confirm this suspicion.

2.2.2.2 The Fruchterman Reingold Spring Embedded Algorithm
The Fruchterman Reingold (1991) algorithm is similar to the Kamada-Kawai algorithm, but rather than assuming attraction between adjacent points and repulsion between non-adjacent points, it attempts to simulate a system of mass particles where the vertices simulate mass points repelling each other while the edges simulate springs with attracting forces. It then tries to minimize the “energy” of this physical system. It also differs from the Kamada-Kawai algorithm in that it is able to distribute points in both two-dimensional and three-dimensional space.

To use the Fruchterman Reingold algorithm to graphically represent the Padgett marriage data in two-dimensional space, under “Layout” first select “Fruchterman Reingold” and then “2D.” This will produce an image similar to the one displayed in Figure 2.16. Here, as in Figure 2.15, the Medici family falls in the center of the graph while other families such as the Pazzi, Acciaiuol, Lambertes, and Ginori fall along the periphery. In this drawing, however, the Pucci family is clearly an outlier while in Figure 2.15 it was not. Repeated implementation of this algorithm yields essentially the same representation.

To produce a three-dimensional graph of these data using the Fruchterman Reingold algorithm, under “Energy” first select “Fruchterman Reingold” and then “3D.” This will produce a three-dimensional image similar to the one displayed in Figure 2.17. Here we see patterns similar to the ones seen in Figures 2.15 and 2.16. The Medici family falls at the center of the graph, while the Pazzi, Acciaiuol, Lambertes, and Ginori families fall along the periphery, and the Pucci family is clearly an outlier. Where this figure differs from the previous one, however, is in the size of the vertices. Some are smaller than the others. For example, the Castellan and Pucci vertices are noticeably smaller than the Pazzi and Ginori vertices. This is because the former vertices are “farther away” than are the latter ones.
You can, however, tell Pajek to keep the vertices the same size by turning off the “perspectives” option located under the “Spin” menu before having Pajek draw the data. Nevertheless, users need to be somewhat careful when using three-dimensional representations because it is possible for a vertex to appear, at first glance, to be quite central but, upon closer inspection, prove to be quite far from the center. This is because in these three-dimensional representations, distance is not only measured “left-to-right” and “top-to-bottom,” but also “front-to-back.”
Figure 2.16: Two-Dimensional Drawing of Padgett’s Marriage Data Using Fruchterman Reingold
Figure 2.17: Three-Dimensional Drawing of Padgett’s Marriage Data Using Fruchterman Reingold
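The idea behind the Fruchterman Reingold layout described above, in which every pair of vertices repels and every edge acts as a spring, can be sketched in a few lines. This is a toy two-dimensional version written for illustration only; it is not Pajek's implementation, and the function name, parameter values, and five-vertex network are all hypothetical.

```python
import math
import random

def spring_layout(nodes, edges, iterations=200, size=1.0, seed=0):
    """Toy 2D force-directed layout: all pairs repel, linked pairs attract."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, size), rng.uniform(0, size)] for v in nodes}
    k = size / math.sqrt(len(nodes))   # ideal distance between vertices
    temperature = size / 10            # caps how far a vertex may move per step
    cooling = temperature / iterations

    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}

        # Repulsion between every pair of vertices.
        for a in nodes:
            for b in nodes:
                if a == b:
                    continue
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = max(math.hypot(dx, dy), 1e-9)
                push = k * k / d
                disp[a][0] += dx / d * push
                disp[a][1] += dy / d * push

        # Attraction along each edge (the "springs").
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = max(math.hypot(dx, dy), 1e-9)
            pull = d * d / k
            disp[a][0] -= dx / d * pull
            disp[a][1] -= dy / d * pull
            disp[b][0] += dx / d * pull
            disp[b][1] += dy / d * pull

        # Move each vertex, limited by the temperature, and keep it in the frame.
        for v in nodes:
            dx, dy = disp[v]
            d = max(math.hypot(dx, dy), 1e-9)
            step = min(d, temperature)
            pos[v][0] = min(size, max(0.0, pos[v][0] + dx / d * step))
            pos[v][1] = min(size, max(0.0, pos[v][1] + dy / d * step))

        temperature = max(temperature - cooling, 0.0)

    return {v: tuple(p) for v, p in pos.items()}

# Hypothetical network: a triangle with a pendant vertex D and an isolate E.
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
layout = spring_layout(nodes, edges)
for name, (x, y) in layout.items():
    print(name, round(x, 2), round(y, 2))
```

In this sketch an isolate such as E is repelled by every other vertex and attracted by none, which is the mechanism that pushes unconnected points away from the rest of the network. Changing `seed` changes the starting positions and, as with Pajek, may change the final drawing.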
2.2.3 Layering Images in Pajek

Pajek also allows users to “layer” their images based on how, if at all, the data are partitioned. The first step requires that you partition the data, which is generally what you need to do when you are working with one-mode data. Here we will partition the data based on degree, but Pajek allows you to partition data based on a number of different schemes, including “influence domain,” “core,” “valued core,” “depth” and “p-Cliques.” You can also partition data based on the labels or shapes assigned to various vertices.

To partition the Padgett data based on degree, return to Pajek’s main screen by clicking on the “x” box in the upper right hand corner of the “Draw” screen. Next, under the “Net” menu, first select “Partitions,” then “Degree,” and then either “Input” or “Output” (Figure 2.18). Do not select “All” because that command will count the lines between two families twice. This is true even if you transform arcs in Pajek to edges. When you run this procedure, Pajek will create a
partition based on degree and a vector that represents the normalized degree distribution of the network’s vertices.[13] You can also calculate the average degree of the network by selecting “Make Vector” under the “Partition” menu; you can see the results by first highlighting the newly created vector in the vector drop list, and then selecting “Vector” under the “Info” menu. The results will appear in Pajek’s report window (not shown). In this case, Pajek reports that the average degree equals 2.5, which indicates that Padgett’s Florentine families averaged two and a half marriages between them.

Figure 2.18: Partitioning Data Based on Degree
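The warning about the “All” option can be checked with a small sketch. When a symmetric network is stored as arcs in both directions, each tie contributes once to a vertex's input degree and once to its output degree, so adding the two doubles the count. The five-actor matrix and the function name are hypothetical.

```python
def degrees(matrix):
    """Return (in_degree, out_degree) lists for a 0/1 adjacency matrix."""
    n = len(matrix)
    out_degree = [sum(1 for j in range(n) if matrix[i][j]) for i in range(n)]
    in_degree = [sum(1 for i in range(n) if matrix[i][j]) for j in range(n)]
    return in_degree, out_degree

# Hypothetical symmetric network of five actors, stored as arcs in both directions.
matrix = [[0, 1, 1, 0, 0],
          [1, 0, 1, 0, 0],
          [1, 1, 0, 1, 0],
          [0, 0, 1, 0, 0],
          [0, 0, 0, 0, 0]]

in_deg, out_deg = degrees(matrix)
all_deg = [i + o for i, o in zip(in_deg, out_deg)]   # counts every tie twice

# Partition: degree value -> list of 1-based actor numbers with that degree.
partition = {}
for actor, d in enumerate(out_deg, start=1):
    partition.setdefault(d, []).append(actor)

average_degree = sum(out_deg) / len(out_deg)
print(out_deg, all_deg, partition, average_degree)
```

Here the output degrees are 2, 2, 3, 1 and 0, while the “all” counts are exactly twice as large.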
Next, under the “Draw” menu, select “Draw-Partition.” This brings up the same image as before, except now the vertices are assigned different colors based on their output degree. Notice that a new menu item has appeared on the Draw screen: “Layers.” This only appears when you have used the “Draw-Partition” option. Under “Layers,” select “Type of Layout,” and then “3D” since this is a three-dimensional drawing.

[13] In Pajek, partitions represent discrete values of networks, while vectors represent continuous values. Together these two features allow analysts to draw a network where the vertices vary in color according to a partition (e.g., countries classified by continent) and vary in size according to a vector (e.g., country GDP).
Next, under “Layers,” select “in z direction.” What this option does is draw the vertices in layers (based on degree, in this case) toward the “z” coordinate, while leaving the “x” and “y” coordinates as they are. What this accomplishes becomes clearer after rotating the image around the “x” axis. To do this, hold down the “Shift” key and then press on the “X” key. Continue to rotate the image until vertices of the same color horizontally “line up” with one another. Once you reach this point the vertex with the highest “degree” will be at the top of the image. If you rotate the image too far, you can rotate it in the other direction by not holding down the “Shift” key while pressing the “X” key. Figure 2.19
Layering 3D Pajek Image
Looking at Figure 2.20 you can see that, not surprisingly, the Medici family is at the top of the image. Next in line are the Strozzi and Guadagni families, then the Peruzzi, Castellan, Ridolfi, Tornabuon and Albizzi families, then the Barbadori and Salviati families, then the Lambertes, Pazzi, Acciaiuol, and Ginori families, and finally the outlier of the group, the Pucci family. Pajek also allows users to rotate images around the “y” and “z” axes by simply holding down the “Y” and “Z” keys, respectively.
While layering is not necessarily something that you would want to use every time you visualize social networks, it clearly can highlight some of the structural aspects of social network data.
Figure 2.20
Rotated Pajek Image, Layered Based on Degree
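The layering and rotation steps above can be expressed as a short sketch: the “z” coordinate is set from the partition (here, degree) while “x” and “y” are left alone, and rotating about the “x” axis then turns depth into height on the screen. The function names, coordinates, and degree values are hypothetical, and Pajek's own sign and direction conventions may differ.

```python
import math

def layer_in_z(xy, partition):
    """Keep x and y, and set z to each vertex's partition class (e.g., degree)."""
    return {v: (x, y, float(partition[v])) for v, (x, y) in xy.items()}

def rotate_about_x(point, angle_degrees):
    """Rotate a 3D point about the x axis by the given angle."""
    x, y, z = point
    t = math.radians(angle_degrees)
    return (x,
            y * math.cos(t) - z * math.sin(t),
            y * math.sin(t) + z * math.cos(t))

# Hypothetical 2D layout and degree partition for three vertices.
xy = {"A": (0.2, 0.5), "B": (0.8, 0.5), "C": (0.5, 0.1)}
degree = {"A": 2, "B": 2, "C": 1}

layered = layer_in_z(xy, degree)
rotated = {v: rotate_about_x(p, -90) for v, p in layered.items()}
for v, (x, y, z) in rotated.items():
    print(v, round(x, 2), round(y, 2), round(z, 2))
```

After a quarter turn the vertical coordinate of each vertex equals its degree, so vertices with the same degree line up horizontally, with the highest degree at the top.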
2.3 Visualizing Asymmetric One-Mode Matrices using Mage

Visualizing asymmetric (directional) one-mode matrices in Mage is not as straightforward as visualizing symmetric one-mode matrices, for the simple reason that multidimensional scaling techniques require symmetric matrices. Thus, the first step involves calculating an equivalence matrix, based either on the distances (e.g., Euclidean) or the correlations between the nodes of the directed matrix. We then submit the equivalence matrix, which is symmetric, to multidimensional scaling techniques.

As mentioned earlier, for this purpose we will use the advice network of Krackhardt’s High-Tech Managers (1987). Krackhardt collected data from the managers of a high-tech company that manufactured high-tech equipment on the West Coast of the United States. At the time the company had just over 100 employees with 21 managers. He asked each manager to whom he or she went for advice and whom they considered their friends. He gathered data concerning to whom they reported from company documents.

The advice network is displayed in Figure 2.21. The matrix is clearly asymmetrical. For example, while manager #1 goes to managers 2, 4, 8, 16, 18 and 21 for advice, manager #2 goes to managers 6, 7 and 21. Manager #15 seeks advice from all of the other managers, while only managers 10, 18, 19, and 20 seek advice from him or her. In fact, manager #15, along with managers 9, 13, and 19, is sought out for advice less than any of
the other managers are. By contrast, manager #2 is sought out for advice more (18) than are any of the other managers. Figure 2.21: Krackhardt’s High Technology Managers’ Advice Network
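The asymmetry described above is easy to check mechanically: in a directed advice matrix, row sums count how many colleagues a manager consults and column sums count how many colleagues consult that manager. The sketch below uses a hypothetical four-manager matrix, not Krackhardt's data, and the function names are assumptions.

```python
def is_symmetric(matrix):
    """True if every tie i -> j is matched by a tie j -> i of the same value."""
    n = len(matrix)
    return all(matrix[i][j] == matrix[j][i]
               for i in range(n) for j in range(i + 1, n))

def advice_counts(matrix):
    """Return (seeks, sought): row sums and column sums of the matrix."""
    seeks = [sum(row) for row in matrix]
    sought = [sum(col) for col in zip(*matrix)]
    return seeks, sought

# Hypothetical advice network: a 1 in row i, column j means i asks j for advice.
advice = [[0, 1, 0, 1],
          [0, 0, 0, 1],
          [1, 1, 0, 1],
          [0, 0, 0, 0]]

seeks, sought = advice_counts(advice)
print(is_symmetric(advice), seeks, sought)  # False [2, 1, 3, 0] [1, 2, 0, 3]
```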
2.3.1 Calculating equivalence matrices from asymmetric one-mode data

To calculate an equivalence matrix, under the “Network” menu, choose “Roles & Positions,” then “Structural,” and then “Profile,” as illustrated in Figure 2.22:

Figure 2.22
Menu Options for Calculating Equivalence Matrices
This brings up Ucinet’s profile similarity dialog box (see Figure 2.23) with the following parameters:
Figure 2.23 Profile Similarity Dialog Box
Input dataset: Name of file containing network to be analyzed. Data type: Multirelational, which means that it is capable of calculating the structural equivalence of actors (nodes) for both asymmetric and symmetric matrices.
Measure of profile similarity/distance (Default = Euclidean Distance): Choices are:
  Euclidean Distance - This is the distance between the vectors in n-dimensional space, that is, the square root of the sum of squared differences. We use this method of computing distance here, but we could just as easily have chosen to measure similarity using the Pearson product-moment correlation coefficient.
  Correlation - This is the Pearson product-moment correlation coefficient of every pair of profiles.
  Matches - This is the proportion of exact matches between all pairs of profiles.
  Positive Matches - This is the proportion of exact matches in which at least one element is positive, between all pairs of profiles.
Method of handling diagonal values (Default = Reciprocal): Choices are:
  Reciprocal - In considering adjacency matrix X and comparing the profile of actor i with the profile of actor j, Ucinet replaces the comparison of elements xii with xji and xij with xjj by the comparisons xii with xjj and xij with xji.
  Ignore - Ucinet treats the diagonals as missing values so that the comparisons of xii with xji and xij with xjj are dropped. We will use this option in this case.
  Retain - Profile vectors are compared directly element by element, including the xii and xjj elements.
Include transpose in calculations? (Default = Yes): Including transposes in the calculations means that profiles comprise both rows and columns. This is not necessary for symmetric data but we use it here for asymmetric data.
For binary data: convert to geodesic distances? (Default = No): Converts binary data to geodesic data before performing an analysis. In this case, we stay with the default and choose “No.”
Diagram Type (Default = 'Dendrogram'): The clustering diagram can either be a Tree Diagram or a Dendrogram. We are not analyzing dendrograms or tree diagrams here, so take your pick.
(Output) Equivalence matrix (Default = 'SE'): Name of data file containing actor by actor equivalence matrix. Choose a file name that relates to your input file.
(Output) Partition dataset (Default = 'SEPart'): This is the name of the data file containing partition indicator matrices derived from single link hierarchical clustering.

After you select the “OK” button, Ucinet first produces either a dendrogram or a tree diagram, depending on what type of diagram you chose above. Since for our purposes here we are not analyzing either of these diagrams, close the output box. Next, you will see a structural equivalence matrix that looks similar to the one that is presented in Figure 2.24. Figure 2.24 does not display all of Ucinet’s output. Also included in the output is a hierarchical clustering diagram (similar to a dendrogram) based on the equivalence matrix.[14]

The next step in the process is to submit the structural equivalence matrix to the multidimensional scaling techniques discussed earlier. However, in this case the larger the number, the greater the distance of one actor from another. So, when we instructed Ucinet to perform multidimensional scaling on the structural equivalence matrix, we chose the “Dissimilarities” option rather than the “Similarities” option (see Figure 2.25). We ended up using metric MDS, which yielded a stress level of .124.

[14] See the discussion of these data with regard to calculating structural equivalence in Wasserman and Faust (1994:366-393).
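The profile calculation described above can be sketched as follows. This is a minimal illustration of Euclidean profile distance with the “Ignore” treatment of the diagonal and an optional transpose, written from the parameter descriptions rather than from Ucinet's code; the function names and the four-actor matrix are hypothetical.

```python
import math

def profile_distance(matrix, i, j, include_transpose=True):
    """Euclidean distance between the tie profiles of actors i and j.

    Comparisons that involve k == i or k == j are dropped, which mirrors
    the "Ignore" option for diagonal values.
    """
    total = 0.0
    for k in range(len(matrix)):
        if k in (i, j):
            continue
        total += (matrix[i][k] - matrix[j][k]) ** 2        # ties sent (rows)
        if include_transpose:
            total += (matrix[k][i] - matrix[k][j]) ** 2    # ties received (columns)
    return math.sqrt(total)

def equivalence_matrix(matrix, include_transpose=True):
    """Actor-by-actor matrix of profile distances (larger = less equivalent)."""
    n = len(matrix)
    return [[profile_distance(matrix, i, j, include_transpose) for j in range(n)]
            for i in range(n)]

# Hypothetical directed network: actors 0 and 1 send identical ties,
# but only actor 0 receives a tie from actor 3.
ties = [[0, 0, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 0, 1],
        [1, 0, 0, 0]]

rows_only = profile_distance(ties, 0, 1, include_transpose=False)
rows_and_columns = profile_distance(ties, 0, 1, include_transpose=True)
se = equivalence_matrix(ties)
print(rows_only, rows_and_columns)  # 0.0 1.0
```

The example shows why the transpose matters for asymmetric data: judged by rows alone, actors 0 and 1 look perfectly equivalent, but their incoming ties differ. The resulting matrix is symmetric with zeros on the diagonal, and because its entries are distances it would be submitted to MDS as dissimilarities.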
Figure 2.24
Structural Equivalence Matrix
Figure 2.25: Ucinet Metric MDS Dialog Box
Next, we exported both the coordinates calculated from the equivalence matrix and the adjacency (not the equivalence) matrix, following the procedures outlined earlier. We then combined these into a file readable by Mage, which produced the following image:

Figure 2.26
Metric MDS of Krackhardt’s High-Tech Managers Advice Network
The image suggests that the advice network is split into two different groups, with a few actors bridging the two. The blue node in the upper left corner of the graph is Manager #2. Note that he or she is somewhat distant from the other managers, which undoubtedly reflects the fact that, in terms of advice sought, he or she is indeed an outlier.

2.4 Visualizing Asymmetric One-Mode Matrices using Pajek

Visualizing asymmetric one-mode matrices in Pajek is not as easy as one would expect; at least, I have not yet discovered a quick way to do so. As such, this section is still under construction…
3. NETWORK VISUAL REPRESENTATIONS OF TWO-MODE NETWORKS

As we noted earlier, two-mode networks consist of either two sets of actors, or one set of actors and one set of events. They differ from one-mode networks in that one-mode networks involve only a single set of actors.

3.1 Visualizing Two-Mode Matrices using Mage

The example used here is Davis’s Southern Club Women (Breiger 1974; Davis, Gardner, and Gardner 1941) discussed in Chapter 1. Davis and his colleagues collected these data in the 1930s; they represent the observed attendance at 14 social events by 18 Southern women. The result is a person-by-event matrix such that xij is 1 if person i attended social event j, and 0 otherwise:

Figure 3.1: Davis’s Southern Women Network in Matrix Form
The rows represent the eighteen women who attended the various events while the columns represent the events themselves. As you can see, actor #1 (Evelyn) attended events 1, 2, 3, 4, 5, 6, 8, and 9 while actor #18 (Flora) attended only events 9 and 11.

Depending on how we manipulate the data, Mage can visualize two-mode networks in a variety of ways. We begin with the most common method of visualizing two-mode networks, namely converting two-mode data sets to one-mode (actors or events) data sets. Typically this involves constructing a matrix that is the product of a matrix and its transpose. With regard to Davis’s Southern Women data, cell xij of this product gives the number of events that both women i and j attended.[15] Researchers tend to interpret this value as the strength of the social proximity of the two women (Borgatti and Everett 1997).

[15] If researchers, rather than multiplying the matrix by its transpose, choose instead to multiply the transpose by the matrix, they will create a square matrix where cell xij gives the number of women who attended both events i and j. Both types of matrices are computed below.
Next, we turn to two visualization methods that retain both modes (actors and events) of the data. The first uses Ucinet to create a bipartite graph, whose points we then locate in space using correspondence analysis. The second computes the geodesic distances between all the pairs of nodes in the bipartite graph and then submits the resulting geodesic distance matrix to multidimensional scaling.

3.1.1 Deriving one-mode matrices from two-mode data

Rather than requiring users to first create a matrix’s transpose and then multiply the two together, Ucinet has provided an “Affiliations” option under its “Data” menu that simplifies the process. Selecting this option brings up the following dialog box.

Figure 3.2
Ucinet Affiliations Dialog Box
The parameters of this process are as follows:

Input dataset: This is the name of the file containing the 2-mode dataset. In this case, “Davis.”
Which mode (Default = Row): Choices are:
  Row: Represents row by row matrix of overlaps, i.e., forms AA'.
  Column: Represents column by column matrix of overlaps, i.e., forms A'A.
Output dataset (Default = 'Affiliations'): This will be the name of the new matrix. The default output name is “Affiliations,” but we recommend providing it with a name that you will easily associate with the original matrix.

Choosing the “row” option yields the following 18 by 18 co-membership matrix:
Figure 3.3:
Co-membership matrix of Davis’s Southern Women
Both the rows and the columns represent actors (i.e., the women) and the numbers in the cells of the matrix represent the number of ties (i.e., the number of common events attended by the women) between the two actors. Thus, Laura (actor #2) attended six of the same events that Theresa (actor #3) did, and Flora (actor #18) attended only one event that Dorothy (actor #16) also attended. Furthermore, the values on the diagonal tell us the total number of events attended by each actor. Checking the diagonal we can see that Evelyn and Theresa attended the most events (8) while Olivia and Flora attended the fewest (2). Choosing columns instead of rows yields the following 14 by 14 event overlap matrix: Figure 3.4:
Event Overlap Matrix of Davis’s Southern Women
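The two overlap matrices produced by the “Affiliations” option are simply the products AA' and A'A. The sketch below computes both for a small hypothetical attendance matrix (three women by four events, not the Davis data); the helper names are assumptions.

```python
def transpose(matrix):
    """Swap rows and columns."""
    return [list(column) for column in zip(*matrix)]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, column)) for column in zip(*b)]
            for row in a]

# Hypothetical attendance matrix: 3 women (rows) by 4 events (columns).
A = [[1, 1, 0, 1],
     [1, 1, 1, 0],
     [0, 0, 1, 1]]

co_membership = matmul(A, transpose(A))   # AA': woman-by-woman overlaps
event_overlap = matmul(transpose(A), A)   # A'A: event-by-event overlaps

print(co_membership)  # [[3, 2, 1], [2, 3, 1], [1, 1, 2]]
print(event_overlap)  # [[2, 2, 1, 1], [2, 2, 1, 1], [1, 1, 2, 1], [1, 1, 1, 2]]
```

The diagonal of AA' gives the number of events each woman attended, and the diagonal of A'A gives the number of women at each event, which is how the diagonals of the two Davis matrices are read in the text.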
In the event overlap matrix, the rows and the columns represent events and the numbers in the cells of the matrix represent the number of ties between any two events. Thus, two women who attended event #1 also attended event #2, and none of the women who attended event #1 attended event #10. The values on the diagonal tell us the total number of actors attracted by each event. Thus, Event #8 attracted the most women (14), while Events #1 and #2 attracted the fewest (3).

We are now ready to export either one of these adjacency matrices and its related coordinate data. To do this we follow the same procedures as we do for one-mode networks, so there is no need to repeat them.

3.1.2 Visualizing in Mage

As in our discussion of visualizing one-mode matrices, we used Ucinet to calculate both metric and non-metric MDS coordinates in order to compare the images they produce. Figure 3.5 illustrates the data using metric MDS:

Figure 3.5: Visual Representation of Davis’s Southern Women Using Metric Multidimensional Scaling
Figure 3.6 illustrates the same data using non-metric MDS. Here, clear differences exist between the visualizations using metric and non-metric MDS. Interestingly, each provides insights in a different way. Figure 3.5 emphasizes two clusters of women and a handful of women less connected than the others. Figure 3.6, on the other hand, emphasizes the isolation of two women in the group (the two balls in the upper right-hand corner of the image).
Figure 3.6:
Visual Representation of Davis’s Southern Women Using Non-Metric Multidimensional Scaling
3.1.3 Using correspondence analysis to visually represent two-mode data
Researchers have long used correspondence analysis to measure the distance between nodes. The first step for using correspondence analysis to visually represent two-mode data is to create a bipartite graph. “Any 2-mode incidence matrix can be thought of as a bipartite graph. If the 2 modes are actors and events then the bipartite graph consists of the union of the actors and events as vertices with the edges only connecting actors with events (i.e., no connections between actors or between events). This routine takes a 2-mode incidence matrix and converts it to a 1-mode adjacency matrix of a bipartite graph. If the incidence matrix had n rows and m columns then the resultant adjacency matrix would be a square matrix of dimension m+n” (Borgatti, Everett, and Freeman 1999).
3.1.3.1 Creating a bipartite graph in Ucinet
Under the “Transform” menu, choose “Bipartite.” This brings up the following dialog box:
Figure 3.7
Ucinet Bipartite Dialog Box
The parameters are defined as follows:
Input 2-mode dataset: This refers to the name of the file containing the incidence matrix. In this case it will be Davis.
Value to fill within-mode ties (Default = 0.0): The incidence matrix specifies the values of ties from actors to events; the values of the (non-existent) ties of actors to actors and events to events are not given. Users can override the default value of zero by specifying their own within-mode value.
Make result symmetric? (Default = No): Users can choose to make the resulting matrix symmetric. For our purposes we will select “No.”
Output dataset (Default = bi): This refers to the name of the file containing the adjacency matrix of the bipartite graph. We will change this to “Davisbi.”
Performing this procedure yields the following matrix:
Figure 3.8
One-mode Bipartite Matrix from Davis Southern Women Two-mode Matrix
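As a rough sketch of the conversion described above, the following Python function builds the (n+m)-by-(n+m) adjacency matrix of a bipartite graph from an n-by-m incidence matrix. The function name, its arguments, and the handling of the event-to-actor block when the result is not symmetric (left at zero here) are assumptions made for illustration; they mirror the dialog box options but are not Ucinet's actual implementation.

```python
def bipartite_adjacency(incidence, within_mode=0, symmetric=False):
    """Convert an n x m incidence matrix into an (n+m) x (n+m) adjacency matrix.

    Actors occupy positions 0..n-1 and events positions n..n+m-1.
    Assumption: when symmetric is False, only actor-to-event ties are
    recorded and the event-to-actor block stays at zero.
    """
    n = len(incidence)
    m = len(incidence[0])
    size = n + m
    adj = [[0] * size for _ in range(size)]

    # Fill the two within-mode blocks (actor-actor and event-event).
    for i in range(n):
        for j in range(n):
            adj[i][j] = within_mode
    for i in range(m):
        for j in range(m):
            adj[n + i][n + j] = within_mode

    # Copy the incidence matrix into the actor-to-event block.
    for i in range(n):
        for j in range(m):
            adj[i][n + j] = incidence[i][j]
            if symmetric:
                adj[n + j][i] = incidence[i][j]
    return adj


# Hypothetical toy data: 3 actors by 2 events.
toy = [[1, 0], [1, 1], [0, 1]]
directed = bipartite_adjacency(toy)
undirected = bipartite_adjacency(toy, symmetric=True)
```

With the symmetric option, every actor-to-event tie is mirrored by an event-to-actor tie, which is the form needed later when computing geodesic distances.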
Export this matrix following the same procedures used earlier for one-mode data.
3.1.3.2 Correspondence analysis in Ucinet
Correspondence analysis in Ucinet is straightforward. Under the “Tools” menu choose “2-mode scaling,” under which select “Correspondence.” This brings up a dialog box (Figure 3.9) with the following parameters:
Input dataset: This is the name of the file containing the matrix to be analyzed. It must have at least as many rows as columns (otherwise transpose the matrix and resubmit). Here we select “Davis,” not “Davisbi,” because if we select the bipartite matrix of the Davis Southern Women data, the correspondence analysis will create an output file of combined row and column scores (see below) with 64 lines, whereas we want and need one with only 32 lines.
How to scale row and column scores (Default = Coordinates): This parameter tells Ucinet how to scale the row and column scores. The choices that Ucinet provides are (we will use Ucinet’s default):
Coordinates - Scores for each point on each dimension are adjusted both for point marginals and dimension weights (eigenvalues).
CGS - According to Carroll-Green-Schaffer, this transformation makes distance between a row and a column just as interpretable as distance between a row and a row or a column and a column.
Optimal - Scores for each point are corrected for point marginals, but not dimension weights.
Axes - No rescaling is performed.
Number of factors to save (Default = 3): Maximum value of r, the number of eigenvectors used to decompose the matrix. Keep the default.
Reconstruct matrix from factors (Default = No): If Yes, the row and column scores are combined to approximate the data matrix with r eigenvectors (see Number of factors to save, above). The result is the best possible approximation of X using matrices of rank r based on a least squares criterion. Keep the default.
Keep the trivial first factor (Default = No): The normalization step prior to singular value decomposition causes the first eigenvector to be constant. If users choose “Yes,” this factor is retained and eigenvalue percentages include it. If they choose “No,” the factor is dropped and eigenvalue percentages do not include it. Keep the default.
(Output) File to contain row scores (Default = CorrespondenceRScores): This will be the name of the dataset to contain the coordinates of the row points. For our purposes we will use DavisRS.
(Output) File to contain column scores (Default = CorrespondenceCScores): This will be the name of the dataset to contain the coordinates of the column points. For our purposes we will use DavisCS.
(Output) File to contain singular values (Default = CorrespondenceEigen): This will be the name of the dataset to contain the eigenvalue of each dimension. For our purposes we will use DavisEigen.
(Output) File to contain reconstructed matrix (Default = CorrespondenceRecon): This will be the name of the dataset to contain the approximated data matrix (if any). For our purposes we will use DavisRecon.
(Output) File to contain combined row/column scores (Default = CorrespondenceRCScores): This will be the name of the dataset to contain the concatenated row and column scores, which produce a single (m+n)-by-r matrix (useful for plotting row and column scores on the same map). For our purposes we will use DavisRCS.
Figure 3.9
Ucinet Correspondence Analysis Dialog Box
This initially provides us with a two-dimensional scatterplot. We will not use this, but it can be printed or inserted into a Word document. We are interested, however, in the combined row and column scores, which we need to export following the procedures outlined earlier. Figure 3.10 clearly illustrates that some of the women and events are more central than are others. Interestingly, at the upper right portion of the image there are two balls, one representing event 11, the other representing both Flora and Olivia. Flora and Olivia are represented by one ball because their coordinates, as calculated by correspondence analysis, are identical.
Figure 3.10
Visual Representation of Davis’s Southern Women Using Correspondence Analysis
3.1.4 Using geodesic distance to visualize two-mode data
Borgatti and Everett (1997:247) argue that there are three problems related to correspondence analysis representations of two-mode data (see, however, Roberts 2000). One problem is that the distances in correspondence analysis are not Euclidean, yet researchers using this technique find it difficult to interpret the results in any other way. Borgatti and Everett therefore suggest a variety of different approaches for visually representing two-mode data. One method they recommend is to first compute the geodesic distances between all pairs of nodes in the bipartite graph and then submit the resulting matrix to non-metric MDS.
3.1.4.1 Computing geodesic distance
Before computing the geodesic distance between various nodes, we first have to construct a symmetrical bipartite graph. To do this simply follow the procedures outlined above for creating a bipartite graph, and in the dialog box where Ucinet asks whether you want a symmetrical bipartite graph, select “Yes.” For the Davis data we saved the resulting matrix as Davisbi2.
Geodesic distance is the length of the shortest path between two nodes. Ucinet makes this calculation quite simple. Under the “Network” menu choose “Cohesion,” then “Distance,” which brings up the following dialog box: Figure 3.11
Ucinet Graph-Theoretic Distance (Geodesic) Dialog Box
The parameters are defined as follows:
Input dataset: This is the name of the file containing the dataset to be analyzed. In this case we use “Davisbi2.”
Type of Data (Default = Adjacency): Ucinet provides numerous choices for computing distance. While we will use the default, the choices include:
Adjacency - Standard binary data; distance corresponds to the graph theoretic geodesic.
Strengths - Values indicate strengths or capacities of links between nodes. Optimum is the strongest path.
Costs - Values indicate costs or lengths of links between nodes. Optimum is the cheapest path.
Probabilities - Values indicate the probability of a link and are restricted to [0,1]. Optimum is the most probable path.
Nearness transformation (Default = None): This converts the distance matrix to a nearness matrix by a variety of methods. These are:
None - No transformation is applied and raw distances are given as output.
Multiplicative - The distances between nodes are divided into the largest possible distance. New values are given by Yij = (N-1)/Dij.
Additive - The distances between nodes are subtracted from the total number of nodes. New values are given by Yij = N - Dij.
Linear - The distances between nodes are transformed linearly into [0,1]. New values are given by Yij = 1 - (Dij - 1)/(N-1).
Exponential - The distances between nodes are transformed using exponential decay. New values are given by Yij = b^Dij. The attenuating factor b is selected by the user and should satisfy 0 < b < 1.
Freq Decay - Uses Burt's 1976 frequency decay function. The nearness of i and j is one minus the proportion of actors that are as close to i as j is.
Attenuation Factor (Default = 0.5): Value of the attenuation factor b when exponential is chosen. Larger values give slower decay.
Output dataset (Default = GeodesicDistance): This refers to the name of the data file containing the distance matrix. Here we change it to “DavisGeo.”
Running this procedure produces the following matrix:
Figure 3.12
Geodesic Distances Among Nodes in Davis’s Southern Women Matrix
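For readers who want to see what the distance routine computes, the sketch below finds geodesic distances in a binary, symmetric adjacency matrix using breadth-first search and then applies the nearness formulas listed above. It is a minimal illustration under stated assumptions: the toy graph, the function names, and the use of None for unreachable pairs are all introduced here and are not taken from Ucinet.

```python
from collections import deque


def geodesic_distances(adj):
    """Shortest path lengths between all node pairs (None if unreachable)."""
    n = len(adj)
    dist = [[None] * n for _ in range(n)]
    for source in range(n):
        dist[source][source] = 0
        queue = deque([source])
        while queue:  # breadth-first search outward from the source
            u = queue.popleft()
            for v in range(n):
                if adj[u][v] and dist[source][v] is None:
                    dist[source][v] = dist[source][u] + 1
                    queue.append(v)
    return dist


# Nearness transformations, written directly from the formulas in the text.
def multiplicative(d, n):
    return (n - 1) / d


def additive(d, n):
    return n - d


def linear(d, n):
    return 1 - (d - 1) / (n - 1)


def exponential(d, b=0.5):
    return b ** d


# Hypothetical symmetric bipartite graph: nodes 0-2 are actors, 3-4 are events.
# Ties: actor 0 - event 3, actor 1 - events 3 and 4, actor 2 - event 4.
adj = [
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0],
]
dist = geodesic_distances(adj)
for row in dist:
    print(row)
```

In this toy graph every actor-to-actor and event-to-event distance is even, because any path between two nodes of the same mode must pass back and forth between the modes.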
Because neither the women nor the events are directly connected to one another, the geodesic distance between any two women or between any two events cannot be less than two, nor can it be odd-valued (Borgatti and Everett 1997:249; Faust 1997). Women are only connected to one another through events and events are only connected to one another through women. The next step is submitting this matrix to multidimensional scaling. Following Borgatti and Everett we will use non-metric MDS. There is no need to repeat these procedures since we outlined them earlier. After completing this task we then export both the MDS coordinates and the symmetric bipartite matrix (not the geodesic distance matrix) in kinemage format. These procedures yield the following representation:
Figure 3.13
Visual Representation of Davis’s Southern Women Data Using MDS of Geodesic Distances
While Borgatti and Everett find this method more appealing than correspondence analysis, in this case the visual representation of the data is less than helpful.
3.2 Visualizing Two-Mode Matrices using Pajek
Pajek offers certain advantages over Mage when it comes to visualizing two-mode networks. While Mage is essentially limited to visualizing one-mode networks (or two-mode networks that have been multiplied by their transpose), Pajek is capable of visualizing two-mode networks in their duality. In the following discussion we illustrate how to do this in Pajek. Pajek is also capable of exporting its visualizations in kinemage format, so that they can then be viewed in Mage, where we can capitalize on the advantages Mage offers. As we shall see, Pajek allows users to derive one-mode data from two-mode data very simply, so users do not need to use Ucinet to create transposes of matrices or multiply matrices by their transposes. As we did with Mage, we use Davis’ Southern Women data as an example.
3.2.1 Preparing and reading two-mode data into Pajek
The steps involved in preparing and reading in two-mode data for use in Pajek do not differ from those for preparing and reading in one-mode data, so there is no need to repeat them here. However, when you read two-mode data into Pajek, Pajek’s main screen looks different than it does when you read in one-mode data. As you can see in Figure 3.14, after we read Davis’s Southern Women data into Pajek, not only does information concerning the data appear in the Network dialog box, but additional information appears in the Partition dialog box.
Figure 3.14
Pajek’s Main Screen after Reading Davis’s Southern Women Data
Specifically, the information included in the Partition dialog box informs us that we are dealing with an affiliation network containing 18 actors and 14 events respectively. The Network dialog box also tells us that we have read in two-mode data.
3.2.2 Visualizing one-mode data derived from two-mode data with Pajek
Pajek offers a simple way to derive one-mode data from two-mode matrices. For example, if we wanted to derive an affiliation matrix from the Southern Women data, we simply select “Transform” under Pajek’s “Net” menu, then “2-Mode to 1-Mode” and then “Rows.” We select “Rows” if we want Pajek to create a one-mode matrix based on the nodes represented by rows (in this case, the women) or we select “Columns” if we want Pajek to create a one-mode matrix based on the nodes represented by columns (in this case, the events). Figure 3.15 demonstrates how to do this, while Figure 3.16 draws a picture of the one-mode co-membership (i.e., women) matrix created.
Figure 3.15
Transforming Two-Mode Matrix to One-Mode Matrix Using Rows
Figure 3.16
Pajek Drawing of Southern Women Co-Membership Matrix
This picture somewhat resembles the image visualized in Mage using the same data and non-metric multidimensional scaling (Figure 3.6).
3.2.3 Visualizing two-mode data with Pajek
Using Pajek to visualize two-mode data is as simple as using Pajek to visualize one-mode data, although we do have additional options. To begin with, under the “Draw” menu, select “Draw,” which (as before) brings up the “Draw” screen where the image initially appears as a single point in space. As before, it also brings up a new set of menu selections from which we will next choose one of two drawing programs to graphically represent Davis’ Southern Women data. Rather than exploring all the drawing algorithms, this time we use only the Fruchterman Reingold algorithm to visually represent the data. Under “Layout” first select “Fruchterman Reingold” and then “2D.” This will produce an image similar to the one displayed in Figure 3.17:
Figure 3.17
Drawing of Davis’s Southern Women Data using 2-D Fruchterman Reingold
Certain patterns are apparent from this initial visualization. The women appear to be clustered into two groups: Dorothy, Helen, Nora, Katherine, Sylvia and Verne belong to one cluster while Ruth, Eleanor, Laura, Evelyn, Theresa, Brenda, Frances and Charlotte belong to the other. Olivia and Flora do not appear to belong to either of the two groups. Not only are the women clustered into groups, so are the events. Events 10, 12, 13 and 14 are clustered together and are associated with the first cluster of women, while events 1, 3, 4 and 5 are clustered together and are associated with the second cluster of women. Event 11 is the outlier event and is primarily associated with Olivia and Flora. Interestingly, events 7, 8 and 9 are quite central in this visualization, which suggests that they served as “bridge” events in that women from both clusters attended them. Now return to Pajek’s main screen and under the “Draw” menu, select “Draw-Partition,” and you will see an image similar to the one that appears in Figure 3.17 (see Figure 3.18):
Figure 3.18
Drawing of Davis’s Southern Women using 2-D Fruchterman Reingold Algorithm and “Draw-Partition” Option
While the layout is the same, the vertices representing the events and actors are assigned different colors. In this case the actors are colored yellow while the events are colored green. Using different colors to visually represent the different modes helps make distinguishing between the two modes somewhat easier. It is even easier to distinguish between the two if the vertices of the two modes are different shapes as they are in Figure 3.19. Here the vertices representing the women are still colored yellow and remain in the shape of an ellipse, but the events are now colored blue and are in the shape of a triangle.
Figure 3.19
Drawing of Davis’s Southern Women using 2-D Fruchterman Reingold Algorithm, Defining Shapes and Colors of Vertices with Input File
In order to change the shapes and colors of particular vertices, they need to be defined in the Pajek file itself. See Appendix B, Section B.5.1.
4. SOCIAL NETWORKS OVER TIME
A once common criticism of social network analysis was that it conveyed a static, rather than dynamic, understanding of a social structure (i.e., it did not incorporate change), especially when it focused on ties that had become routinized over time (Marsden 1990; Nadel 1957). In recent years, however, researchers have demonstrated that such a bias is not inherent in social network analysis and have offered ways of modeling temporal changes in social networks (see e.g., Giuffre 1999). Nevertheless, the visual presentation of social networks over time is still in its infancy. Analysts have made great strides in using visualization techniques to explore social networks (see e.g., Borgatti and Everett 1997; Castilla, Hwang, Granovetter, and Granovetter 2000; Freeman 1999, 2000; Freeman, Webster, and Kirke 1998), but they have just begun to extend these techniques to the examination of social networks over time. When visualizing a social network over time, analysts often present it as a “movie.” That is, they portray it as a series of snapshots at different points in time in the life of the social network (Assimakopoulos, Everton, and Tsutsui 2003). Movies such as these are most effective when readers have access to them either on-line or as a file so that they can run through them on their own. As valuable as this approach is, it does not always lend itself to publication in standard academic journals. Thus, there remains the need for presenting changes over time in a single “snapshot.” Here, I demonstrate both ways of picturing social networks over time, first using data from Sampson’s (1968) study of a Roman Catholic monastery (described below), and then with data on Silicon Valley semiconductor companies (Assimakopoulos, Everton, and Tsutsui 2003; Castilla, Hwang, Granovetter, and Granovetter 2000). I present each social network both in “movie” form and as a single snapshot.
4.1 The Sampson Monastery data
Samuel Sampson (1968) conducted his study of a Roman Catholic monastery in the late 1960s, which was a unique time in the life of the Roman Catholic Church. Between October 1962 and December 1965 all of the Roman Catholic Church’s bishops and cardinals met for the Second Vatican Council (Vatican II) and introduced a number of changes in the way that male and female religious orders lived and worshipped together. Some welcomed these changes. Others did not. Almost immediately after the council drew its meeting to a close, there was a steep decline in the number of women and men entering religious orders and a sharp rise in the number who left their respective orders (Stark 2001; Stark and Finke 2000). Sampson sensed (correctly) that it might be worthwhile to examine how one monastery responded to the changes put forth by Vatican II. During his stay, a “crisis in the cloister” occurred that resulted in the expulsion of four monks and the voluntary departure of several others. In the end, only four of the eighteen monks remained. Sampson recorded the social interactions among a group of eighteen monks and collected sociometric data along four dimensions: esteem (SAMPES) and disesteem (SAMPDES), liking (SAMPLK) and disliking (SAMPDLK), positive (SAMPIN) and negative influence (SAMPNIN), and praise (SAMPPR) and blame (SAMPNPR). He had each monk rank only his top three choices where “3” indicated the highest or first choice and “1” the last. Some of the monks offered tied ranks for their top four choices. Sampson gathered most of the data after the breakup occurred. The only exception to this was that he gathered “liking” data at three different points in time (SAMPLK1, SAMPLK2, and SAMPLK3). These are the time data we intend to use, under the assumption that they reflect changes of in-group sentiment over time. Figures 4.1 through 4.3 present Sampson’s “liking” matrices.
Figure 4.1:
Sampson “Liking” Matrix at Time “1”
Figure 4.2:
Sampson Liking Matrix at Time “2”
Figure 4.3:
Sampson Liking Matrix at Time “3”
4.1.1 Creating a movie of the Sampson Monastery data
The first step in creating a movie of the Sampson monastery data is to prepare the data so that you can export the three “liking” data sets from Ucinet (see footnote 16) and then read them separately into Pajek. Once we have read all three matrices into Pajek, we go to Pajek’s “Draw” window in order to set up Pajek’s drawing functions. First, under the “Options” menu, select “Value of Lines” and then “Similarities,” since a higher number indicates a stronger attraction between the sender and receiver. Because Pajek allows us to draw a series of networks using the “Previous” and “Next” buttons, we need to tell Pajek how we want it to draw the networks that are loaded into memory. Under the “Options” menu, select “Previous/Next,” then “Optimize Layouts,” and then the drawing algorithm (i.e., Kamada-Kawai, 2-dimensional Fruchterman Reingold, 3-dimensional Fruchterman Reingold) we wish to use to draw our layout. In this example, I chose the 3-dimensional Fruchterman Reingold algorithm (see Figure 4.4). We also need to tell Pajek which object (Network, Partition, Vector) will change when we select the Previous/Next option. Since the only objects loaded into memory are the three Sampson matrices, I selected the “Network” option (see Figure 4.5). If we wanted to draw a series of partitions, then we would select “Partition.”
16 The Sampson data comes with the Ucinet software package, so most people should not have to create the data from scratch based on the matrices presented in Figures 4.1 through 4.3.
Figure 4.4
Selecting Layout Optimization for Pajek’s Previous/Next Drawing Options
Figure 4.5
Telling Pajek which Object (Network, Partition, Vector) to Apply When Previous/Next Option is Selected.
The next step is simply to draw the social network data at time “1,” and then click on the “Next” button to see Pajek’s drawing of the social network data at time “2” and then again to see it at time “3.” We can also watch the movie in “reverse” by clicking repeatedly on the “Previous” button. Figures 4.6 through 4.8 show the Sampson liking data over the three points in time as drawn by Pajek. What is somewhat clear from these drawings is that the social network appears to split apart from time “1” to time “3.” At time “1” the social network is relatively undifferentiated. Romuald (10) and Bonaventure (5) lie at the center of the social network and appear, in many ways, to hold it together. By time “3,” however, neither of them appears at the center of the network, and in fact the network appears to be splintering by that point.
Figure 4.6
Pajek Drawing of Sampson Liking Data at Time “1”
Figure 4.7
Pajek Drawing of Sampson Liking Data at Time “2”
Figure 4.8
Pajek Drawing of Sampson Liking Data at Time “3”
4.1.2 Creating a single snapshot of the Sampson Monastery data over time
Creating a single snapshot of social network data over time is a somewhat more complicated task. The first step involves the construction of a super matrix that combines all three of the “liking” matrices. Because one of the goals here is to present the matrices at different times, we cannot simply “stack” the matrices on top of one another as illustrated in Figure 4.9. As will become clear later, if we stacked the matrices like this, the three points in time would collapse onto one another and provide a meaningless picture of their relationship over time.
Figure 4.9
Stacked “Super Matrix”
[ Sampson Liking Matrix at Time 1 ]
[ Sampson Liking Matrix at Time 2 ]
[ Sampson Liking Matrix at Time 3 ]
Instead, we need to create a 3 x 3 super matrix in which the individual submatrices appear along the diagonal and each block row and block column represents a separate time period, as illustrated in the 3 x 3 “Super Matrix” of Figure 4.9. Furthermore, in order to connect each monk with himself across time periods, we also need to include identity matrices connecting the monks at time “1” to themselves at time “2” and the monks at time “2” to themselves at time “3.” This also keeps the networks distinct from one another when Pajek draws them. The same figure illustrates this as well. The remaining matrices included in the super matrix contain all zeros.
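Figure 4.9 3 x 3 “Super Matrix”
Diagonal blocks: Sampson Liking Matrix at Time 1, Sampson Liking Matrix at Time 2, Sampson Liking Matrix at Time 3
Blocks linking adjacent time periods (Time 1 with Time 2, Time 2 with Time 3): identity matrices (1s on the diagonal)
All remaining blocks: zeros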
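The block structure of the super matrix can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `build_super_matrix`, the tiny 2 x 2 stand-in matrices, and the `link_both_ways` switch (which places the identity blocks both above and below the diagonal) are introduced here, since the text does not say on which side of the diagonal the identity blocks sit.

```python
def build_super_matrix(slices, link_both_ways=True):
    """Place each time slice on the block diagonal and link adjacent slices.

    slices: list of equally sized square matrices, one per time period.
    link_both_ways: assumption made for this sketch; if True the identity
    blocks appear both above and below the block diagonal.
    """
    n = len(slices[0])       # number of actors
    t = len(slices)          # number of time periods
    size = n * t
    sup = [[0] * size for _ in range(size)]  # all blocks start as zeros

    # Diagonal blocks: the network at each time period.
    for k, mat in enumerate(slices):
        offset = k * n
        for i in range(n):
            for j in range(n):
                sup[offset + i][offset + j] = mat[i][j]

    # Identity blocks: tie each actor to himself in the next time period.
    for k in range(t - 1):
        for i in range(n):
            sup[k * n + i][(k + 1) * n + i] = 1
            if link_both_ways:
                sup[(k + 1) * n + i][k * n + i] = 1
    return sup


# Hypothetical 2-actor stand-ins for the three 18 x 18 liking matrices.
time1 = [[0, 3], [2, 0]]
time2 = [[0, 1], [3, 0]]
time3 = [[0, 0], [1, 0]]
super_matrix = build_super_matrix([time1, time2, time3])
for row in super_matrix:
    print(row)
```

With the real data the same call would take the three 18 x 18 liking matrices and return a 54 x 54 matrix, in which Time 1 and Time 3 are connected only indirectly through Time 2.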
4.1.2.1 Creating zero and identity matrices
Ucinet V provides a simple procedure for combining matrices. Before turning to that step, however, we first need to create the “zero” and identity matrices that we will combine with the three “liking” matrices. Creating a zero matrix from an existing matrix is straightforward in Ucinet. We first need to select Ucinet’s “Recode” option, which is found under the “Transform” submenu. Selecting this brings up a dialog box with the following parameters (see Figure 4.10):
Input dataset: Name of dataset to be recoded. Data type: Matrix. Because I want to create a matrix full of zeros with the same actors and dimensions as Sampson’s 18 x 18 monastery network, I use one of Sampson’s matrices (it could be any of them) as the base matrix (“Sampson Like 1”).
Rows to recode: (Default = All). You specify the rows you want to recode with a list. You can list each row number separated by a comma or space and/or connect them with the keywords TO, FIRST and LAST. Thus, FIRST 3, 5 TO 7, 10, 12 would give row numbers 1, 2, 3, 5, 6, 7, 10 and 12. ALL gives all possible rows. You can also use lists kept in a UCINET dataset. Enter the filename followed by ROW (or COLUMN) and a number to specify which row or column of the file to use. The list must be specified using a binary vector where a 1 in position k indicates that vertex k is a member of the list and a zero indicates that k is not a member. Because I want all rows (and columns) recoded to zero, I use Ucinet’s default settings.
Cols to recode: (Default = All). Same as the rows to recode command (see above).
Mats (levels) to recode: (Default = All). Ucinet also allows you to recode matrices in much the same way as recoding rows and columns. The command, FIRST 3, 5 TO 7, 10, 12, would give matrix numbers 1, 2, 3, 5, 6, 7, 10 and 12. As with rows and columns you can use lists kept in a UCINET dataset. Enter the filename followed by ROW (or COLUMN) and a number to specify which row or column of the file to use. The list must be specified using a binary vector where a “1” in position k indicates that vertex k is a member of the list and a zero indicates that k is not a member.
Include diagonal values: (Default = No). “Yes” means that diagonal values are recoded. “No” ignores the diagonal in the recoding. Here, I choose “Yes.”
Recode boxes: Five boxes of the form “values ___ to ___ are recoded as ___” are used to perform the actual recodes. Thus, if the values x, y and z are entered so that the completed line reads “values x to y are recoded as z”, then all values of the matrix in the range from x to y inclusive are changed to the value z. To change a single value set both x and y to that value. In this case, I tell Ucinet to recode all values between 0 and 99 to equal 0. Since there are no values greater than 3, this will ensure that all of the cells in the matrix will equal zero.
Output dataset: (Default = 'Recode'). Name of the file that contains the recoded matrix. Here, I name the file “Sampson Like 0.”
Figure 4.10
Ucinet’s Recode Dialog Box
The next step is to create an identity matrix, which is easily accomplished using Ucinet’s “Diagonal” option, which you will also find under the “Transform” submenu. Selecting this option brings up a dialog box with the following parameters (see Figure 4.11):
Input database: This is the name of the matrix on which to perform the transformations. Data type: square matrix. Here, I use the recently created “Sampson Like 0” matrix.
New diagonal value(s): (Default = 0). A single value will set all diagonal elements to that value. A list will set the diagonal to the values in the list, which are separated by a space or comma. Since I want to create an identity matrix, I set the diagonal value to “1.”
(Output) Diagonal Dataset: (Default = 'DiagonalSaveDiag'). This is the name of the file that contains a square matrix with the diagonal of the input dataset as its diagonal and zeros elsewhere. This file is not displayed in the Log File, and is not a concern of ours in this case.
(Output) Changed Matrix: (Default = 'DiagonalNewMat'). This is the name of the file that contains the matrix with the new diagonal values.
Figure 4.11
Ucinet’s “Diagonal” Dialog Box
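The two transformations just described (recoding a range of values, then overwriting the diagonal) amount to very little computation, as the following sketch suggests. The function names `recode` and `set_diagonal` and the 3 x 3 stand-in matrix are assumptions made for illustration; they imitate the dialog box options rather than reproduce Ucinet's code.

```python
def recode(matrix, low, high, new_value, include_diagonal=True):
    """Return a copy with every value in [low, high] changed to new_value."""
    result = [row[:] for row in matrix]
    for i, row in enumerate(result):
        for j, value in enumerate(row):
            if i == j and not include_diagonal:
                continue  # leave the diagonal untouched
            if low <= value <= high:
                row[j] = new_value
    return result


def set_diagonal(matrix, value):
    """Return a copy of a square matrix with every diagonal cell set to value."""
    result = [row[:] for row in matrix]
    for i in range(len(result)):
        result[i][i] = value
    return result


# Hypothetical 3 x 3 stand-in for one of the 18 x 18 liking matrices.
like = [
    [0, 3, 1],
    [2, 0, 3],
    [0, 1, 0],
]

zero_matrix = recode(like, 0, 99, 0)             # analogous to "Sampson Like 0"
identity_matrix = set_diagonal(zero_matrix, 1)   # 1s on the diagonal, 0s elsewhere
```

Starting from an existing matrix rather than typing in a new one has the practical advantage, in Ucinet, that the zero and identity matrices keep the same actors and dimensions as the original.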
4.1.2.2 Joining matrices in Ucinet
After creating the zero and identity matrices, the next step is to combine the matrices into a super matrix like the one illustrated in Figure 4.9. To do this, we can use Ucinet’s “Join” option, which is found under the “Data” submenu. Selecting this option brings up a dialog box with the following parameters (see Figure 4.12):
Files selected: These are the names of datasets each containing one or more matrices. You should enter them in the order required in the merged data set. To enter a file, highlight one or more files in the Possible Files and click on the “ > ” button, and they will be moved across. Clicking on “ < ” moves the files back. You can move all possible files across by clicking on “ >> ” or “