EVOLVING ELECTRONIC COMMUNICATION NETWORKS: AN EMPIRICAL ASSESSMENT

J. D. Eveland T. K. Bikson

The Rand Corporation

higher level descriptions identify various kinds of patterns, or test hypotheses about those patterns, in a set of relationships. These patterns will be based upon the way individuals and objects interrelate in a network, and, to some extent, upon the measurement tools and methods used."

Understanding electronic communication and the patterns that characterize its development is critical to realizing full benefits from computer-supported work (Hiltz and Turoff, 1978; Olsen and Lucas, 1982; Kiesler, Siegel and McGuire, 1984). Cooperative work depends on effective communication and on the ability of organizations to manage the technology of communication appropriately (Rogers and Agarwala-Rogers, 1976; Farace, Monge and Russell, 1977). Organizations that do not understand the political and social dimensions of their communications systems will inevitably fail to achieve their purposes (Hawes, 1974; Benson, 1975). The capacity of computers to integrate data processing, text processing, and communication within a single user-accessible framework is one of the most fundamental changes to affect the world of work since the first Industrial Revolution (Bair and Mancuso, 1985; Bikson and Eveland, 1986), and the rules and practices governing the use of such tools are still evolving rapidly (Anderson and Shapiro, 1985). The degree to which these capacities are used to increase the cooperative effectiveness of managerial and production processes depends on understanding how such tools are and are not like other more familiar tools, and how these new capacities can mesh with organizational priorities and outcomes. Since electronic tools are not simply linear extensions of the familiar, new systematic research exploring their use is essential.

Network analysis comprises a rather disaggregated and uncoordinated collection of empirical procedures for understanding how people share information with each other. The techniques tend to be as much a matter of art as science in their application; nonetheless they have a rich theoretical and empirical tradition (Farace and Mabee, 1980; Knoke and Kuklinski, 1982; Burt and Minor, 1983). Despite their relatively long tradition and great utility, however, network analysis methods and other kinds of empirical communication analyses have been slow to find widespread application in the study of organizational interactions (Rogers and Kincaid, 1983). In particular, these analytical methods are only beginning to be applied to the study of computer-mediated communication. Most analyses have focused on issues such as computer conferencing (Rice, 1980), teleconferencing (Johansen, 1984), and videotex (Rice and Paisley, 1982). With a few significant exceptions (e.g., Rice and Case, 1983), the empirical analysis of computer-based direct messaging systems in day-to-day use for cooperative work is largely unexplored territory.

A type of social science research method eminently applicable to these issue areas is communication network analysis. As Rice and Richards (1985:106) phrase it, "The goal of network analysis is to obtain from low-level or raw relational data higher levels of description of a system. The

This paper presents the results of an analysis of the communication patterns that characterize The Rand Corporation's use of RandMail, its electronic messaging system, in the early stages of implementation. The goal of this research has been to explore and assess the development of electronic messaging networks in


other Rand associates on the basis of regular names of addressees as they appear in the Rand telephone directory (or name fragments, abbreviations, nicknames), very much as interoffice memos are addressed. Further, RandMail users do not need to know about host machines or even whether the intended recipients use computers--hardcopies are automatically printed and sent via the internal paper mail distribution system to addressees without computer accounts. The system can also provide links to other electronic mail systems through ARPANET, BITNET, and more recently MCI-MAIL connections.

the context of expanding use. A variety of quantitative properties of these communication networks are presented, including both sociometric structures and network metrics; implications are developed for both the planning and assessment of electronic communications systems in other contexts. While the methods employed in this analysis are neither new nor highly sophisticated in terms of the state of the art of communication research, they shed useful light on a phenomenon that is beginning to attract consistently larger shares of research attention and corporate resources.

BACKGROUND

RandMail was introduced on an incremental basis to small groups of trial users (including Rand's executive management). After initial trials proved successful, RandMail was installed on all nonclassified computers and made available for general use by anyone with a computer account. For approximately the first 18 months of its operation, mail system data were automatically logged and archived. These data were to provide a base for understanding and correcting system problems that might arise under conditions of broad use and to permit an evaluation of the pilot effort and its effects. The research reported here relies on a subset of archived data: messages sent by or received by account holders on two host machines where electronic mail programs had not previously been in common use. Thus usage logs in the main represent individuals who were not new to computer use, but who were new to electronic messaging.

RandMail is a UNIX-based electronic communication system implemented on minicomputer hosts. It was developed in 1983 by The Rand Corporation and introduced for general use there in 1984 (Shapiro et al., 1984). A nonproprietary system, it is also in use in various versions in a number of other settings (e.g., Rose and Romine, 1985). Long before the initiation of RandMail, both internal and external electronic mail programs had been available to Rand computer users (Borden, Gaines and Shapiro, 1979). However, these communication capabilities were in fact employed only by a small proportion of computer users -- typically employees in Rand's Information Science Department or Computer Services Department with accounts on the host machine serving Rand's ARPANET node. RandMail began as a pilot project whose goal was to extend the usability of the electronic mail system and broaden the user community. The new system design reflected our belief that efforts to provide computer support for information work often fail by trying to automate very specific job functions instead of trying to augment established communication skills and general organizational procedures. Along with this central thesis we also assumed:

RESEARCH SETTING AND SUBJECTS

The Rand Corporation is a nonproprietary policy research organization of about 1000 members. All but about 100 of them are employed by the corporation's Santa Monica CA office; the remainder are located in its Washington DC office three time zones away. Rand is formally structured for matrix management, partitioning responsibilities between "departments" and "programs." Departments (such as political science, information science, and the like) are responsible for hiring and promotion and are organized along disciplinary lines. In contrast, programs (such as health, education, labor and population) have a domain focus, serving to organize and oversee specific projects in their topic area. They have a common executive management and share many staff support units (e.g., public information office, reference library).

that message exchange, a critical component of generic information work, is the right starting point for developing an interactive environment for collaborative work; that confident person-to-person messaging and data exchange can be effected without special access knowledge such as login names, host computers, and the like; the technology has to support user teams (since that is how work is carried out) and to align itself with existing organization structures (since that is how information flows);

Individuals may be members of only one department; however they often work on more than one concurrent project and thus may participate in multiple programs of research. Rand research is expected to be interdisciplinary, drawing together researchers with different kinds of expertise. In this research setting, "professionals" are typically researchers; however, there are a number of staff professionals as well. By "manager" we mean anyone who is a department head or program head, plus the president and the 8 vice-presidents. Additionally, there are employees who



system design and implementation, so construed, poses both technical and behavioral problems that cannot be separately resolved. A team of behavioral and computer scientists developed and introduced a messaging system that attempted to satisfy these tenets. For example, RandMail allows computer users to address mail to


and copied) of recipients of a message was also recorded. From the form of the header, it was possible to make a reasonable inference as to whether the message was a reply to a previous message or an initiating message, although it was not possible to link replies in turn to the initiating messages. As archived, headers did not include the "Subject" line or any of the body of the message, so no information about the content of the messages was available for research.

perform standard secretarial, clerical and administrative functions; for convenience, we refer to them in the discussion that follows as "clerical" staff. Finally, a very small proportion of employees carry out skilled craft and technical functions. Within this organization, we construe communicating units or "nodes" to be individuals, aggregate groups analytically derived from them (such as all the individuals in a department or program), or organizational entities that have been given a mail "alias" and thus can send or receive messages (e.g., the library's "LIBORDER" unit). The communication parameters we describe reflect overall patterns of relationships, not specific behaviors at specific times, and are constructed from aggregations of more "primitive" relationships such as existence and frequency of node-to-node contacts. Individual contacts are by themselves not the subject matter for analysis.

A secondary data file was also assembled, consisting of characteristics of each established node in the message file. It included, in addition to a unique node identifier, organizational status (manager, professional, technical, clerical), department, program(s), function (research vs. other), physical office location category, and date of first use of the RandMail system. We also identified the "host" machine serving particular individuals, although there were a number of individuals who sent messages through more than one host at different times. Overall, there were 1016 nodes identified, of whom about 200 were consultant or guest account nodes for whom much of the organizational data were unavailable. (The number of nodes is approximately equivalent to total organizational size because all individuals in the Rand telephone directory are potential RandMail recipients whether or not they ever send electronic messages.)

RESEARCH QUESTIONS AND METHODS

We have been interested in both the overall structure of the communications system and in changes in that structure and in the behavior of individual nodes within it. Key dimensions of communications structures include parameters such as density (the ratio of communication paths actually used to those potentially available in the network); centralization (the degree to which communication is concentrated among a relatively few members rather than broadly dispersed); reciprocity (the degree to which communication in given channels is two-way rather than one-way); and linkage distance (the length of communication chains within the network, usually measured as the number of intervening people a message must pass through to move from one individual to another). We also consider the effects of phenomena such as separation in time and space and organizational, professional, and functional differentiation on messaging behavior. Our consistent emphasis here is on the utility of the findings, rather than on the underlying science of communications analysis; the analysis of these data is not finished.
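As a rough illustration of the four parameters just defined, the following Python sketch computes each of them for a small directed message network. The toy edge list, the node names, and the particular centralization index (share of all messages sent by the busiest sender) are assumptions made for illustration; they are not the data or the exact measures used in this study.

```python
from collections import deque

def density(nodes, edges):
    """Ratio of directed paths actually used to those potentially available."""
    n = len(nodes)
    possible = n * (n - 1)
    return len(set(edges)) / possible if possible else 0.0

def reciprocity(edges):
    """Share of used directed links whose reverse link is also used."""
    used = set(edges)
    if not used:
        return 0.0
    return sum((b, a) in used for a, b in used) / len(used)

def top_sender_share(messages):
    """Crude centralization index (an assumption, not the paper's measure):
    fraction of all messages sent by the single busiest sender."""
    counts = {}
    for sender, _ in messages:
        counts[sender] = counts.get(sender, 0) + 1
    return max(counts.values()) / len(messages) if messages else 0.0

def linkage_distance(edges, source, target):
    """Number of intervening people on the shortest directed chain
    from source to target, or None if no chain exists."""
    adjacency = {}
    for a, b in set(edges):
        adjacency.setdefault(a, []).append(b)
    seen, queue = {source}, deque([(source, 0)])
    while queue:
        node, hops = queue.popleft()
        if node == target:
            return max(hops - 1, 0)
        for nxt in adjacency.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None

# Hypothetical toy network: one (sender, recipient) pair per message.
nodes = ["a", "b", "c", "d"]
messages = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("a", "b")]
```

Here density and reciprocity count each directed link once regardless of how many messages travelled over it, while the centralization index counts every message; other conventions are equally defensible.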

In order to ensure the privacy of communication behavior, identifiers for individuals were rendered anonymous to the researchers. This was accomplished by replacing all logon IDs with randomly assigned numbers; the link file associating random numbers with IDs was made inaccessible to researchers and was destroyed after a clean, merged dataset was available. Thus the research team knew something about the organizational and message-behavior characteristics of individual nodes, but had no way to determine who a particular node-number actually represented. This procedure allows aggregated conclusions to be drawn without compromising the identities of individuals.
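The replacement of logon IDs by randomly assigned numbers can be sketched as follows. This is a minimal illustration, not the procedure actually run at Rand: the record layout (sender ID, recipient ID, then any other fields), the function name, and the sample IDs are all hypothetical, and the standard-library shuffle used here is not cryptographically strong.

```python
import random

def anonymize(records, seed=None):
    """Replace each login ID with a randomly assigned number.

    records: iterable of tuples (sender_id, recipient_id, ...other fields).
    Returns the sanitized records and the link table (ID -> number).
    """
    records = list(records)
    ids = sorted({r[0] for r in records} | {r[1] for r in records})
    numbers = list(range(1, len(ids) + 1))
    random.Random(seed).shuffle(numbers)  # random assignment of node numbers
    link = dict(zip(ids, numbers))
    sanitized = [(link[s], link[r]) + tuple(rest) for s, r, *rest in records]
    return sanitized, link

# Hypothetical log entries.
log = [("alice", "bob", "1983-06-01"), ("bob", "carol", "1983-06-02")]
sanitized, link_table = anonymize(log, seed=0)
# In the study, the equivalent of link_table was kept from the researchers
# and destroyed once a clean, merged dataset was available.
```

The same ID always maps to the same number, so node-level patterns survive while identities do not.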

As noted earlier, the raw material for this study was the file of message "headers" automatically logged during the first 18 months of general RandMail use (April 1983-July 1984). Headers as completed by the sender include the following sort of information (a "from" line is automatically added):

To: Tora Bikson
Cc: Norm Shapiro, C. Stasz
Subject: MEETING TO DISCUSS EMAIL PAPER

As read by the logging system, they are supplemented with the login ID of the message sender, similar IDs for message recipients, and dates and times of each message. The first approximately 8000 messages logged by the system (over about six weeks) lacked date/time information.
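Because the analysis treats each from/to combination as a separate communication, a header of this kind expands into one record per addressee. The sketch below shows one way to do that expansion; the function name, the `kind` label, and the placeholder sender ID are hypothetical, and the real logging system worked from login IDs rather than display names.

```python
def expand_header(sender, header_text):
    """Return one (sender, recipient, kind) tuple per addressee
    found on the To: and Cc: lines; other lines are ignored."""
    pairs = []
    for line in header_text.splitlines():
        field, _, value = line.partition(":")
        kind = field.strip().lower()
        if kind in ("to", "cc"):
            for name in value.split(","):
                name = name.strip()
                if name:
                    pairs.append((sender, name, kind))
    return pairs

header = """To: Tora Bikson
Cc: Norm Shapiro, C. Stasz
Subject: MEETING TO DISCUSS EMAIL PAPER"""

# "sender-login" stands in for the automatically added "from" information.
pairs = expand_header("sender-login", header)
```

For the sample header this yields three records, one primary and two copied.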

Approximately 69,000 appropriately sanitized message headers were retained for this research. They include data from messages sent to anyone at Rand from a computer account holder on one of the two logged host machines; messages received include those to anyone on the logged computers, other machines, or even exclusively hardcopy recipients. Thus while the dataset represents active messaging by computer users on the two host machines of interest (i.e., minicomputers serving users who were not previously part of the electronic mail culture), it systematically underrepresents the total level of computer-based communication at Rand as well as the patterns of communication across the total set of host computers. These conditions should be kept in mind in interpreting the findings that follow.

For purposes of this analysis, each "from/to" combination was treated as a separate communication, although the total number (primary

Five subperiods were selected from the overall data set for more intensive review to look at changes across time. These were 4-week periods


spaced about evenly across the data (June 1983, September 1983, January 1984, April 1984, and July 1984). Numbers of messages sent during these periods varied from about 2500 messages in Period 1 to about 4000 in Period 5; the number of network participants in each period was about 500 of the overall 1000-member pool of communicators. The sample of nodes examined in this research, then, constitutes about half the membership of Rand.

The analyses that follow were carried out on a personal computer, using a variety of general and special-purpose routines and applications (Lotus 1-2-3 graphics in particular may be readily identifiable to the aficionado). Analytic techniques parallel those previously used to study communicative interactions in NSF-supported Industry-University Collaborative Research Centers (Eveland, 1985). The germane point is that for many years one of the major obstacles to the wider use of network analysis methods has been the perception that large-scale mainframe analysis methods were required. In recent years, there has been a proliferation of PC-based analysis routines, so that the techniques can be readily applied in almost any setting; for example, the increasingly popular UCINET series available from UC-Irvine. The groundwork thus exists for a much wider utilization of communications analysis for understanding the effects of computer-mediated work.

TABLE I

                         RECEIVED
SENT         0     1-5    6-13   14-20     >20   TOTAL
...................................................
0            2     112      51      22      80     267
1-5          6      36      30      11      44     127
6-13         1      15      19      10      27      72
14-20        2       5      10       7      19      43
>20          2      20      25      20     228     295
...................................................
TOTAL       13     188     135      70     398     804

messages over the period studied. On the other hand, over 500 nodes sent at least one message. Since the actual number of participants with accounts on the logged hosts was about 500, it appears that most people who went to the trouble to get an account at least logged in occasionally. 343 recipients received hardcopy only; of these, 133 sent messages, while the rest must be considered only peripherally part of the electronic network at all. (The non-sending recipients in Table I were presumably attached to a non-logged host.) Table II shows the cumulative distributions of the message traffic across all messages, displaying a not unexpected Pareto "80/20"

[Table II: cumulative percent of messages. Plot not recoverable.]

RESULTS

The following findings are presented less as the final word on the communication network structure of the Rand Corporation than as illustrations of the kinds of outputs that can be generated by relatively straightforward analyses of logged messaging data. As noted, there are systematic exclusions in the logged data that may complicate complete interpretation of the data in Rand's organizational context. Nevertheless, the data illustrate a number of interesting points about the organization and its use of electronic communication, and suggest avenues for further research in this area.

We first describe the overall structure of the communications network in terms of levels of communication, density of interaction, and volume of message traffic. We then discuss the relationships of messaging behavior to occupational status, function, and organizational position, and the effects of time and physical distance -- two phenomena all organizations must cope with. We conclude with some observations on the communication network properties of the Rand group.

pattern. This distribution was more or less constant across the whole logged time period, with the proportions in each cell of Table I approximately equivalent at each of the five sampled subperiods. Overall message traffic did increase somewhat across the whole period (Table III), partly as a result of adding more participants, partly as a result of increasing traffic per capita. The table shows considerable random variation from week to week, in addition to some seasonal patterns (Week 36, for example, was the Christmas/New Years holiday).
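A cumulative distribution of the kind that reveals an "80/20" pattern can be computed directly from per-node message counts. The sketch below is illustrative only: the function names are hypothetical, the counts are invented, and it assumes a non-empty list with at least one message sent.

```python
def cumulative_share(sent_counts):
    """Rank senders from heaviest to lightest and return a list of
    (cumulative fraction of senders, cumulative fraction of messages)."""
    ranked = sorted(sent_counts, reverse=True)
    total = sum(ranked)
    points, running = [], 0
    for i, count in enumerate(ranked, start=1):
        running += count
        points.append((i / len(ranked), running / total))
    return points

def share_of_top(sent_counts, fraction=0.2):
    """Fraction of all messages sent by the top `fraction` of senders."""
    ranked = sorted(sent_counts, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)

# Hypothetical messages-sent counts for ten nodes (not study data).
counts = [400, 250, 100, 80, 60, 40, 30, 20, 10, 10]
top_fifth = share_of_top(counts, 0.2)
```

Plotting the points from `cumulative_share` gives the familiar concave curve; a value of `share_of_top` near 0.8 at `fraction=0.2` would correspond to the classic Pareto split.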

NETWORK DESCRIPTION. Overall, there were 69,219 messages sent from the relevant host machines during the approximately 18 months covered by these data (Table I). As the table shows, a relatively small proportion of senders accounted for a relatively large amount of the traffic. Of about 800 network participants (those who received softcopy messages and who had computer accounts on one of the two logged hosts), only 228 nodes both sent and received more than 20

In order to investigate the patterns evident as people came into the system, a subsample of 100 senders was selected, all of whom became electronic mail users after date/time logging began and retained an account on the system for at least 36 weeks. The message traffic for each week of the individual's participation was computed, so that in each case Week 1 represented the first

[Table III: RandMail usage, numbers of messages by week. Plot not recoverable.]

[Table V: average messages sent per period, N=100, all 18 periods. Plot not recoverable.]

week the individual participated (the calendar weeks do not necessarily coincide -- one person's Week 1 might be in August 1983, another's in December). This group approximately paralleled the overall population in general patterns of participation. The group was divided into two subgroups -- those who sent an average of more than 5 messages per week across the 36 weeks (the "heavy" group) and those who sent less than 5 (the "light" group) -- each group numbering 50 members.
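The alignment of each sender's traffic to a personal Week 1 and the heavy/light split can be sketched as below. Several details are assumptions rather than the study's procedure: Week 1 is taken to start on the date of the sender's first logged message, a sender averaging exactly 5 per week is placed in the light group, and the function names and dates are hypothetical.

```python
from datetime import date

def weekly_counts(send_dates, weeks=36):
    """Count messages in each week of participation.
    Index 0 is the sender's own Week 1 (assumed to begin at the first message)."""
    start = min(send_dates)
    counts = [0] * weeks
    for d in send_dates:
        week = (d - start).days // 7
        if week < weeks:
            counts[week] += 1
    return counts

def split_heavy_light(senders, threshold=5.0, weeks=36):
    """Label each sender 'heavy' or 'light' by average messages per week
    over the window. An average of exactly `threshold` counts as light
    (an assumption; the text does not say)."""
    groups = {"heavy": [], "light": []}
    for name, dates in senders.items():
        average = sum(weekly_counts(dates, weeks)) / weeks
        groups["heavy" if average > threshold else "light"].append(name)
    return groups

# Hypothetical sender whose Week 1 begins 1 August 1983.
example = [date(1983, 8, 1), date(1983, 8, 3), date(1983, 8, 9)]
print(weekly_counts(example, weeks=2))  # [2, 1]
```

Because each sender's clock starts at that sender's own first message, two people who joined months apart are compared week for week.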

TABLE VI
DISTRIBUTION OF PROFESSIONAL GROUPS BY DEPARTMENT AND PROGRAM

                    PROFESSIONAL GROUPS
DEPARTMENTS     NA    MGRS    PROF    TECH    CLER   TOTAL
.......................................................................
NA             189       0       0       0       0     189
CSD              9       8      54       5       7      83
ISD              5       3      30       0       7      45
MIS              0       2       5       1       0       8

EASD R&S

10 9

2

4

0 0

14 20

SSD BSD PSD DAG

29

3

54 65 71

0

80 98 12&

13

3 3 4

49 27 5

0

21 18

10 0 0 5 2 6 2

WSP

[Table IV: total users at each period, heavy users and light users. Plot not recoverable.]

ORGANIZATIONAL CHARACTERISTICS. Basic organizational descriptors were available for 736 of the individual nodes in the total network. Table VI shows the distribution of the six

As Tables IV and V show, there are notable differences between these groups. Table IV shows the percent of members of each group who actually


VRS FRES PUB FIN

"H Ea9"* " = 8 = A'.,E RAG E3

45 40

3

2

14

13 21

I

43

0

0

49

3 7 13

0 12 7

0 7 1

3

15

8 47 42

LIB PER

0 2 2

6 4 9

4 2 1

8

SEC

RGS FACIL

0

0

I

0

9

11

4

19

1

0

7

0

3

3 1 0 0 1 1 0 PIO 0 1 2 0 ....................................................................... TOTAL 300 83 436 24


1

2

14 5 5

171

1016

different professional groups across the 21 departments and 19 major programs of the organization (as noted earlier, Rand is best viewed as a department-by-program matrix within a common management and administrative framework). Because of their small numbers, "service" and "craft" employees were generally excluded from the analyses that follow. Table VII shows the


19 15 15

1 3 2

TELEC

t

83 51

0 0 2

18


TABLE VII

LEVEL OF USE BY GROUP

sent a message in each of the 18 two-week periods of this data set. After Period 2, the percentage of the "heavy" user group signed on each period averaged between 20% and 35%, and shows a slow upward trend. By contrast, the light user group averaged no more than about 10% of its members sending messages in any period. Table V shows the average number of messages sent by a participant in each week he/she participated; the "heavy" group averaged 8 to 9, again with an increasing trend, while the light group averaged less than three, with no trend. These data suggest that users differentiate early, with those who will go on to be heavier users of the system showing heavier early usage and a tendency to increase, while those who make light use of the system early in their exposure to it tend to continue that pattern.
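The two quantities reported for each group, the percent of members sending in a period and the average number of messages among those who did send, can be sketched as follows. The input layout, the function names, and the numbers are hypothetical, and counting only periods with at least one message is one reading of "each week he/she participated", not a confirmed detail of the original computation.

```python
def percent_active(period_counts_by_member):
    """For each period, percent of group members who sent at least one message.
    Input: {member: [messages sent in period 1, period 2, ...]}."""
    members = list(period_counts_by_member.values())
    n_periods = len(members[0])
    return [
        100.0 * sum(1 for m in members if m[p] > 0) / len(members)
        for p in range(n_periods)
    ]

def average_when_active(period_counts_by_member):
    """Average messages per member-period, counting only periods
    in which the member sent at least one message."""
    active = [c for counts in period_counts_by_member.values()
              for c in counts if c > 0]
    return sum(active) / len(active) if active else 0.0

# Hypothetical two-member group observed over three periods.
heavy = {"h1": [3, 0, 8], "h2": [0, 0, 10]}
activity = percent_active(heavy)
average = average_when_active(heavy)
```

Computed per group and per period, these two series correspond in spirit to the curves summarized for the heavy and light groups.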


percents of each group in different sending categories; while managers have a slightly higher percentage of high senders than do the others, neither department, program, nor professional group is a significant predictor of messaging behavior, according to analyses of variance. In fact, in these data there are NO statistically significant ways of differentiating levels of use of the system by the formal organizational characteristics of network participants.
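For readers unfamiliar with the test mentioned here, a one-way analysis of variance compares variation between group means with variation within groups. The sketch below computes only the F ratio and its degrees of freedom on invented numbers; it is not a reproduction of the analyses run on the RandMail data, and judging significance would additionally require the F distribution (for example via `scipy.stats.f_oneway`, which also returns a p-value).

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way analysis of variance.
    groups: list of lists of numeric observations, one list per group."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean.
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical messages-sent figures for three groups (not study data).
sample = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_stat, df_b, df_w = one_way_anova_f(sample)
```

A small F ratio relative to its degrees of freedom, as the text reports for department, program, and professional group, means the group means differ by no more than within-group scatter would lead one to expect.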

[Chart of contacts by sender-recipient group pair (M>P, M>C, P>M, P>P, P>C, C>M, C>P, C>C). Plot not recoverable.]

Table XI shows the average number of recipients for messages among professional groups. With respect to range of contacts, manager-to-manager contacts average the fewest recipients at 2.3, while professional-to-professional contacts


                          RECIPIENT
           MGR     PROF     TECH     CLER     CRFT     SERV
...........................................................
MGR       2.38     2.52     3.38     2.47     2.38     9.00
PROF      2.95     5.24     8.71     5.92    23.71      .00
TECH      5.16     2.51     5.63     1.00     1.00     0.00
CLER    [?].28     7.94    21.24     5.96    12.94     4.00
CRFT      0.00     0.00     0.00     0.00     0.00     0.00
SERV      0.00     0.00     0.00     0.00     0.00     0.00

average 5.2 recipients. Table XII shows the percentage of messages in each category that we estimated to be replies to previous messages. Managerial and professional messages tend to be about 13%-15% replies, while clerical messages average less than 5% replies. It is reasonable to assume that clerical personnel tend to be on the receiving end of more order-type messages that do not call for replies.


shows the overall frequencies of contacts among the different professional groups. In keeping with Rand's matrix structure, this contact is frequently across departmental lines in all cases except that of clerical-to-manager communications (Table IX). However, the pattern is different if
