Global Differences in Attributes of Email Usage
John C. Tang, Tara Matthews, Julian Cerruti, Stephen Dill, Eric Wilcox, Jerald Schoudt, Hernan Badenes
IBM Research
650 Harry Road, San Jose, CA 95120 USA
[email protected],
[email protected],
[email protected],
[email protected]
ABSTRACT
We analyzed email usage data from users in a large enterprise by country and geographical region to explore differences in usage. The data cover 13,877 employees from 29 countries in a global technology company. We found statistically significant differences in several attributes of email usage. Users in the U.S. tend to retain larger numbers of email messages, while users in Latin American countries keep fewer messages. Users in European countries tend to file more of their email into folders, and users in Asian countries tend to file less. These differences in filing behavior are not correlated with Hofstede’s Uncertainty Avoidance Index. This research adds another dimension to studies of email usage, which previously have not reported the geographical source of their data.
Author Keywords
Email usage, international differences, email study, email folders, user interface metaphor.
ACM Classification Keywords
H5.3 Group and Organization Interfaces: Asynchronous interaction, Web-based interaction
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. IWIC’09, February 20–21, 2009, Palo Alto, California, USA. Copyright 2009 ACM 978-1-60558-198-9/09/02...$5.00.
GLOBAL USAGE OF EMAIL
Email has become one of the most pervasively used computer tools around the world. While this pervasiveness may be hard to measure, a recent review of several different reports points to a 2007 survey that estimated the email user population to be over 1.2 billion people [4] and growing. Email, internet search, and other digital media for communicating and sharing information have become so popular that we might think that the usage of these tools and the information that they deliver are becoming globally shared reference points. This ubiquity may lead us to assume that virtually everyone has access to email and regards communicating via email in the same way.
But might there be cultural or other international differences that lead to different usage patterns for a tool as common and pervasive as email? If so, could those differences have design implications for these globally used tools? We are not aware of any previous studies that have analyzed email usage according to geographic region to explore these questions.
A few different efforts have tried to characterize differences between cultures along measurable dimensions [12, 13]. For example, Hofstede’s cultural dimensions [13] include a measure of uncertainty avoidance for cultures. Could this high-level cultural dimension predict how people use features such as email folders, which provide more organization and hierarchical structure to a user’s email? We set out to explore global differences in attributes of email usage by analyzing a set of usage data of an email research prototype by employees at a multinational corporation.
STUDYING EMAIL USAGE
Even though email usage has been studied for over twenty years, none of the previous studies have explored differences of email usage based on cultural or geographical factors. While current practices in email usage are often studied to identify design implications for improving email [7], studies to date have not accounted for potential differences among international users. Studies of email use have consistently found that people use email for more than just interpersonal communication and develop personal and diverse habits around using email. Mackay’s [15] early study of email usage found that it was also used for time and task management. She found that the use of email was amazingly diverse, suggesting that email designers should offer flexible “primitives” that users can adopt and customize to their personal usage preferences. Whittaker and Sidner [19] described “email overload”: how email was also used for task management and personal archiving. They characterized three common strategies for managing and organizing email:
• No filers: No use of email folders, relying on full-text search to find information.
• Frequent filers: Actively minimize the number of messages in their email inbox by frequently filing into a large number of folders.
• Spring cleaners: Intermittently (typically every 1-3 months) clean out inbox into a large number of folders.
Studying these organizing strategies for email has become a common theme in email research. With the advent of commonplace and effective search mechanisms, one might expect that more users would increasingly adopt a no filer strategy. Yet, Teevan et al. [18] observed that even a perfect search engine could not fully satisfy users’ needs for managing their information. They identified how users progressively navigate toward information that they seek, effectively creating a structure through which they can orienteer to find information. Boardman and Sasse [2] studied email (among other information management tools), and grouped users according to four email management strategies:
• No filers: Do not file any messages.
• Partial filers: File only a few (< 5) messages per day.
• Extensive filers: File many messages every day.
• Frequent filers: File or delete most incoming messages every day.
However, they found that users did not exclusively fall into one category, but employed a combination of strategies over time. Dabbish et al. [6] collected data on general email practice as part of a study to predict how users act on email messages. They also found that it was not easy to exclusively categorize users into the original email strategies, and reported the following common behaviors:
• “I try to keep my inbox size small”: Frequent filers.
• “I file my messages into folders as soon as I have read them”: Frequent filers.
• “I leave messages in the inbox after I have read them”: Spring cleaners and no filers.
Fisher et al. [11] updated the Whittaker and Sidner paper and added a fourth email management strategy of users who kept their inboxes small by filing into just a small number of folders. They also concluded that email users do not tend to distinctly follow only one strategy, but try a variety of strategies at different times.
These studies present a consistent finding about the diverse and idiosyncratic ways in which people use email. They also identify some common strategies for organizing and managing email that are reflected in how heavily they use folders in email. However, the studies say very little about the characteristics of the user populations that were studied, especially in regard to the geographical background of the participants. Looking at the settings in which the studies were conducted, we infer that many of the participants were drawn from research institution populations and, with the exception of the Boardman & Sasse [2] study, most probably drew data from North American users. None of them explore patterns according to the geographical setting of their users.
Extending the study of email to users from different countries adds another potential source of variation. Some studies of globally distributed teams have identified differences in how users from different countries interact using email (cf. [1, 5, 14]). Differences in the expected response time and level of commitment conveyed through email can cause misunderstandings among globally distributed team members. Our study identifies ways that differences in how individuals use various attributes of email reveal patterns when aggregated according to country and geographical region.
In this study, we examine email usage data collected from users in an enterprise setting from around the world to explore if there are correlations between different usage patterns and the geographical region of the users. The next sections describe how we collected our email usage data, the relationships we found, and reflections on the design implications and limitations of our data.
METHOD FOR COLLECTING EMAIL USAGE DATA
We collected email usage data through a research prototype called bluemail [17]. Bluemail was deployed for internal use within IBM, a large, multi-national company that develops computer hardware, software, and services. Bluemail is a web-based interface to email that was designed to be compatible with users’ existing email system (Lotus Notes). Thus, even though the users were using an email prototype that was new to them, it was collecting data that reflected their email practice using their everyday email tool.
Bluemail was instrumented to collect data on users’ current email practice (e.g., number of email messages, percentage of messages stored in folders – see Table 2 for a partial list of data collected). Usage data were collected from anyone who used bluemail (users were informed of this data collection in the bluemail login page). Every time users logged in to bluemail, we attempted to collect a snapshot of data that described their email usage. For users from whom we collected multiple days of data, we averaged their data into a single number for that user. These usage data were stored in a MySQL database for analysis. Analysis of the data is only presented in the aggregate and without identifying the users. No private data, such as folder names or email message contents, were collected.
Bluemail was deployed within our company for over eleven months. We were able to collect email data from 13,877 users in 29 countries (counting only countries with data from more than 40 people). The distribution of bluemail users around the world is depicted in the bubble graph in Figure 1.
Figure 1. Bubble graph showing the distribution of bluemail users around the world
Users were drawn from a wide range of business groups around the company: Global Consulting Services (37%), Software Development (21%), and Sales & Distribution (15%) composed the three largest proportions of our user population. Notably, bluemail users from the Research Division made up less than 2% of our user population, demonstrating the diversity of our user population compared to prior studies of email.
We note that people who tried the bluemail prototype are a self-selecting sample of users who are willing to try new technologies (i.e., somewhat early adopters). Furthermore, people who tried using bluemail were likely to have some email pain for which web-based access to email was more convenient than the full Notes client (e.g., frequently working from home or another site). Even if a user only tried bluemail once, we were able to collect data that reflected their overall usage of email.
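The per-user averaging step described in this section (collapsing multiple login snapshots into a single number per user) can be sketched roughly as follows. This is only an illustration, not the bluemail analysis code: the record layout, the field names (`user_id`, `inbox_messages`, `total_messages`), and the function `average_snapshots` are assumptions, and the sample values are invented.

```python
from collections import defaultdict
from statistics import mean


def average_snapshots(snapshots):
    """Collapse several per-login snapshots into one averaged record per user.

    `snapshots` is a list of dicts; every dict carries a hypothetical
    "user_id" key plus numeric usage metrics.
    """
    by_user = defaultdict(list)
    for snap in snapshots:
        by_user[snap["user_id"]].append(snap)

    averaged = {}
    for user_id, snaps in by_user.items():
        # Average every metric (every key except the user identifier).
        metrics = [key for key in snaps[0] if key != "user_id"]
        averaged[user_id] = {m: mean(s[m] for s in snaps) for m in metrics}
    return averaged


# Invented sample data: two snapshots for one user, one for another.
snapshots = [
    {"user_id": "u1", "inbox_messages": 100, "total_messages": 1000},
    {"user_id": "u1", "inbox_messages": 140, "total_messages": 1100},
    {"user_id": "u2", "inbox_messages": 30, "total_messages": 500},
]
per_user = average_snapshots(snapshots)
print(per_user["u1"]["inbox_messages"])  # 120
```

Averaging per user first means that a frequent bluemail user contributes one data point to a country's statistics, the same as someone who logged in only once.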
ANALYZING FOR GLOBAL USAGE PATTERNS
We analyzed data only from countries with more than 40 users. We considered this to be a minimum amount of data to represent a country. While we collected data from 13,877 total users in 29 countries, there are a number of reasons why we do not have data for all users for every metric. The sample size for each statistical analysis can be inferred using the degrees of freedom stated in the analysis of variance. The population sample per country also varied widely from 42 (Norway) to 4719 (U.S.), with an average of 479 users per country.

Region         Countries
North America  Canada, U.S.
Latin America  Argentina, Brazil, Mexico
Europe         Austria, Belgium, Czech Republic, Denmark, Finland, France, Germany, Hungary, Ireland, Italy, The Netherlands, Norway, Slovakia, Spain, Sweden, Switzerland, U.K.
Asia           China, India, Japan, Singapore
Oceania        Australia
Middle East    Egypt
Africa         South Africa

Table 1. Grouping of countries into regions
Country       Region         # Users  # Inbox Messages  # Total Messages  # Folders  % Messages Foldered  % Inbox Flagged  % Manager  Hofstede UAI
USA           North America     4719              1059              3576         76                 48.6              1.0       14.2            46
India         Asia              2612               262               606         17                 35.1              1.8        8.5            40
Germany       Europe             924               277              1761         83                 58.7              1.0       10.1            65
UK            Europe             662               489              2465        102                 56.9              1.2       12.3            35
Canada        North America      645               731              2841         66                 52.9              1.1       12.3            48
Brazil        Latin America      515               308              1041         46                 49.6              0.8        5.8            76
Australia     Oceania            467               595              1973         77                 52.5              1.2       11.9            51
Argentina     Latin America      327               259               696         31                 40.0              0.7        2.2            86
Italy         Europe             306               466              1512         74                 43.1              0.6        5.9            75
Japan         Asia               297               628              1791         30                 46.9              1.9        6.6            92
Netherlands   Europe             289               280              1765         69                 58.0              1.4        8.4            53
France        Europe             285               396              1662         97                 55.3              0.9        8.9            86
China         Asia               223               734              1489         24                 32.8              1.1       10.0            30
Ireland       Europe             221               492              1241         49                 35.5              0.8       11.6            35
Switzerland   Europe             192               288              1752         92                 59.2              0.9       11.2            58
Spain         Europe             149               498              1500         72                 44.0              2.0       14.6            86
Czech Rep.    Europe             140               263              1044         24                 58.0              1.5       10.5            74
Singapore     Asia               117               685              1785         43                 39.9              0.9       17.5             8
Belgium       Europe             102               278              1794        105                 60.9              1.3       12.3            94
Sweden        Europe             100               619              2359         69                 55.3              0.7        4.4            29
Denmark       Europe              96               517              2987         87                 56.8              0.8        9.7            23
Mexico        Latin America       91               252               841         29                 32.0              2.2        8.1            82
Austria       Europe              79               271              2363        117                 70.8              2.6       13.9            70
Hungary       Europe              65               296              1160         69                 60.3              2.0       15.0            82
South Africa  Africa              62               345              1035         54                 39.9              1.1        5.7            49
Finland       Europe              58               768              1997         76                 43.1              0.3       14.3            59
Slovakia      Europe              47               160              1389         35                 68.2              1.8       22.0            51
Egypt         Middle East         45              1203              3089         64                 41.0              0.8        7.5            68
Norway        Europe              42               361              2060         72                 54.6              0.8        0.0            50

Table 2. Email usage statistics according to countries with more than 40 data samples, in decreasing order of number of users
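As a consistency check on the sample sizes, the short sketch below sums the # Users column of Table 2 and recomputes the per-country average and range quoted in the text. The user counts are copied from Table 2; the variable names are illustrative only.

```python
# "# Users" column of Table 2, keyed by country.
users_per_country = {
    "USA": 4719, "India": 2612, "Germany": 924, "UK": 662, "Canada": 645,
    "Brazil": 515, "Australia": 467, "Argentina": 327, "Italy": 306,
    "Japan": 297, "Netherlands": 289, "France": 285, "China": 223,
    "Ireland": 221, "Switzerland": 192, "Spain": 149, "Czech Rep.": 140,
    "Singapore": 117, "Belgium": 102, "Sweden": 100, "Denmark": 96,
    "Mexico": 91, "Austria": 79, "Hungary": 65, "South Africa": 62,
    "Finland": 58, "Slovakia": 47, "Egypt": 45, "Norway": 42,
}

total_users = sum(users_per_country.values())
average_users = total_users / len(users_per_country)
smallest = min(users_per_country.items(), key=lambda item: item[1])
largest = max(users_per_country.items(), key=lambda item: item[1])

print(total_users)           # 13877
print(round(average_users))  # 479
print(smallest, largest)     # ('Norway', 42) ('USA', 4719)
```

The column total reproduces the 13,877 users reported for the study, and the rounded mean matches the stated average of 479 users per country.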
To streamline some of the statistical analyses, we also grouped countries according to geographical regions, as listed in Table 1. For all statistical analyses based on region, we excluded regions that only consisted of one country (Oceania, Middle East, and Africa). While these regions were intended to group countries that exhibit similar patterns together, the statistical analyses were also conducted on individual countries. Table 2 summarizes our data by country, in decreasing order of number of participants. For each country it lists the number of bluemail users, the average number of messages in their inbox and total messages stored, the average number of folders created, the average percentage of received email filed into folders, the average percentage of the inbox that is flagged, the average percentage of bluemail users in that country who are managers, and Hofstede’s Uncertainty Avoidance Index (UAI).
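The region-level filtering described above can be sketched as follows. The country lists are taken from Table 1; the dictionary `REGIONS`, the function `regions_for_analysis`, and its `min_countries` parameter are hypothetical names introduced for this illustration and are not part of the original analysis.

```python
# Grouping of countries into regions, as listed in Table 1.
REGIONS = {
    "North America": ["Canada", "U.S."],
    "Latin America": ["Argentina", "Brazil", "Mexico"],
    "Europe": [
        "Austria", "Belgium", "Czech Republic", "Denmark", "Finland",
        "France", "Germany", "Hungary", "Ireland", "Italy",
        "The Netherlands", "Norway", "Slovakia", "Spain", "Sweden",
        "Switzerland", "U.K.",
    ],
    "Asia": ["China", "India", "Japan", "Singapore"],
    "Oceania": ["Australia"],
    "Middle East": ["Egypt"],
    "Africa": ["South Africa"],
}


def regions_for_analysis(regions, min_countries=2):
    """Keep only regions with at least `min_countries` countries."""
    return {
        name: countries
        for name, countries in regions.items()
        if len(countries) >= min_countries
    }


included = regions_for_analysis(REGIONS)
print(sorted(included))
# ['Asia', 'Europe', 'Latin America', 'North America']
```

Under this filter, Oceania, the Middle East, and Africa drop out of the region-level tests, while their countries (Australia, Egypt, South Africa) still appear in the country-level analyses.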
While bluemail users constitute less than 4% of the employee population in most countries, there are a few where the proportion of bluemail users is notably higher (Egypt) and lower (Japan, China, Denmark). It is very difficult to uniformly code for job role across a dataset of this size. We were able to code for the percentage of managers among bluemail users per country, which is included in Table 2. Otherwise, a wide range of job roles were represented in the data, with the majority involving information technology development or consultancy roles in the Global Consulting Services division, technical development roles in the Software Development division, or manager roles across all of the divisions.
Number of Email Messages Stored
Figure 2. Total number of email messages in increasing order by country along with number of messages in inbox (y-axis: # messages; series: # Total Messages, # Inbox Messages)
Figure 2 shows a chart of the average total number of stored messages (including all folders) in increasing order by country. It also shows the average number of messages in the inbox. This chart shows that there are differences in the total amount of email kept based on country. An analysis of variance (ANOVA) revealed a significant effect of country on the total amount of email stored, F(28, 9490)=71.3, p